DARPA Is Worried About How Well Open-Source Code Can Be Trusted

An anonymous reader quotes a report from MIT Technology Review: “People are realizing now: wait a minute, literally everything we do is underpinned by Linux,” says Dave Aitel, a cybersecurity researcher and former NSA computer security scientist. “This is a core technology to our society. Not understanding kernel security means we can’t secure critical infrastructure.” Now DARPA, the US military’s research arm, wants to understand the collision of code and community that makes these open-source projects work, in order to better understand the risks they face. The goal is to be able to effectively recognize malicious actors and prevent them from disrupting or corrupting crucially important open-source code before it’s too late. DARPA’s “SocialCyber” program is an 18-month-long, multimillion-dollar project that will combine sociology with recent technological advances in artificial intelligence to map, understand, and protect these massive open-source communities and the code they create. It’s different from most previous research because it combines automated analysis of both the code and the social dimensions of open-source software.

Here’s how the SocialCyber program works. DARPA has contracted with multiple teams of what it calls “performers,” including small, boutique cybersecurity research shops with deep technical chops. One such performer is New York — based Margin Research, which has put together a team of well-respected researchers for the task. Margin Research is focused on the Linux kernel in part because it’s so big and critical that succeeding here, at this scale, means you can make it anywhere else. The plan is to analyze both the code and the community in order to visualize and finally understand the whole ecosystem.

Margin’s work maps out who is working on what specific parts of open-source projects. For example, Huawei is currently the biggest contributor to the Linux kernel. Another contributor works for Positive Technologies, a Russian cybersecurity firm that — like Huawei — has been sanctioned by the US government, says Aitel. Margin has also mapped code written by NSA employees, many of whom participate in different open-source projects. “This subject kills me,” says d’Antoine of the quest to better understand the open-source movement, “because, honestly, even the most simple things seem so novel to so many important people. The government is only just realizing that our critical infrastructure is running code that could be literally being written by sanctioned entities. Right now.” This kind of research also aims to find underinvestment — that is critical software run entirely by one or two volunteers. It’s more common than you might think — so common that one common way software projects currently measure risk is the “bus factor”: Does this whole project fall apart if just one person gets hit by a bus?

SocialCyber will also tackle other open-source projects too, such as Python which is “used in a huge number of artificial-intelligence and machine-learning projects,” notes the report. “The hope is that greater understanding will make it easier to prevent a future disaster, whether it’s caused by malicious activity or not.”