
Nine out of 10 reported cyber security incidents are the result of an error in software code. Hackers particularly prize so-called zero-day attacks, where a previously unknown flaw allows them to infiltrate a computer system. Stuxnet, the computer worm that targeted computers in Iran’s uranium enrichment programme, is a famous example: it exploited several zero-day flaws.

With billions of lines of code written every year, however, catching and correcting every error is difficult. Researchers in both the US and China believe artificial intelligence could offer a solution.

Human efforts have so far failed to keep pace. If anything, the problem is getting worse: data collected in last year’s Coverity Scan report, which analyses open-source software, suggest that the number of faults is mounting.

“What’s concerning and challenging is that the bugs in software are not decreasing,” says Sandeep Neema, a program manager at the Defense Advanced Research Projects Agency, the US military’s research agency, which has spent millions of dollars funding the development of AI systems that can detect software flaws.

Current software checking technology works a little like the spellcheck in a word processor, recognising typographical or syntax errors. Developers also commonly review each other’s code and run tests before putting new software into action.
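To make the spellcheck comparison concrete, here is a minimal sketch in Python of a check that catches syntax errors and nothing more. It relies only on the standard library's `ast` module; the function name `find_syntax_error` and the three sample snippets are invented for illustration and do not come from any tool mentioned in this article.

```python
import ast


def find_syntax_error(source):
    """Return (line, message) for the first syntax error, or None if the code parses."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# Hypothetical snippets, made up for this illustration.
GOOD = "total = price * 2\n"
BAD = "total = (price * 2\n"  # the bracket is never closed
LOGIC_BUG = "def average(a, b):\n    return a + b / 2\n"  # parses, but computes the wrong value

if __name__ == "__main__":
    for name, snippet in [("GOOD", GOOD), ("BAD", BAD), ("LOGIC_BUG", LOGIC_BUG)]:
        print(name, find_syntax_error(snippet))
```

The third snippet passes the check even though it does not compute an average, which is the kind of mistake a purely syntactic check cannot see.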

“The state of the art in terms of how we improve the quality of software is still testing-driven,” says Mr Neema. The problem with these methods, he says, is that many errors are not caught — even though 50 to 75 per cent of development time is typically spent on testing. AI bug detectors promise to make vetting code more accurate and less labour intensive for developers.

“It’s really reducing the amount of time they have to spend to find those high-priority vulnerabilities,” says Rebecca Russell, a machine learning scientist, describing the AI system she helped design at the Draper Laboratory with funding from Darpa. Draper’s system, which scans software to identify which parts of a program contain vulnerabilities, outperformed three tools that use static analysis, one of the best software vetting methods available, according to a research paper published by Ms Russell and her colleagues this summer.

Marc McConley, technical director for the project, says the lab is now working with various elements of the US Department of Defense to find applications for the technology. “Their main concerns are going to be things like protecting their large software systems from cyber attacks,” he says.

Draper is also working on an AI that can automatically repair software faults, although this research is at an earlier stage. Fan Long, an assistant professor of computer science at the University of Toronto who also works on automatic software repair, says commercially viable tools to automatically fix routine errors will probably be available in the next few years. “Fixing many of these errors is not very creative. People tend to make similar mistakes on similar systems,” says Prof Long.

Chinese state agencies have also funded research to produce an AI bug detection system. When tested on four “very widely used” commercial software products, this system spotted 10 as-yet-undetected vulnerabilities, says Shouhuai Xu, a computer science professor at the University of Texas at San Antonio who developed the system with a group of academics in China. Given that such flaws can be used for zero-day attacks, Prof Xu declined to give further details about the vulnerabilities his group found.

Tools that zero in on security flaws are still under development, but some companies are already using AI for general software scanning. Ubisoft, the video game maker, for example, announced a tool in March that uses AI to flag potentially faulty code before it is implemented. Yves Jacquier, head of the company’s Montreal-based research lab, says the tool reduced development time by 20 per cent during testing and that the company is planning a “significant” rollout by the end of this year.

Darpa’s work on bug detection is part of a program called Muse, which also promotes AI research in a broader category known as “big code”. This field is based on roughly the same principles as “big data”, examining vast repositories of code to generate insights and learn how to write better code. It seeks to address software problems from the other side, by creating code that has fewer flaws in the first place.
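As a rough illustration of the "big code" idea, the toy sketch below learns which pairs of adjacent tokens are common in a small body of code and then flags pairs it has never seen in a new snippet. It is not a description of how Muse or any system named here works. The three-snippet corpus, the function names and the bigram approach are all assumptions chosen to keep the example short; research systems train on vastly more code with far richer models.

```python
import io
import tokenize
from collections import Counter

# Token types worth comparing; layout tokens (newlines, indents) are skipped.
KEEP = {tokenize.NAME, tokenize.OP, tokenize.NUMBER, tokenize.STRING}


def token_bigrams(source):
    """Split source into tokens and return each pair of adjacent tokens."""
    reader = io.StringIO(source).readline
    tokens = [tok.string for tok in tokenize.generate_tokens(reader) if tok.type in KEEP]
    return list(zip(tokens, tokens[1:]))


def learn(corpus):
    """Count how often each token pair appears across the corpus."""
    counts = Counter()
    for source in corpus:
        counts.update(token_bigrams(source))
    return counts


def unseen_bigrams(counts, source):
    """Return the token pairs in source that never occur in the corpus."""
    return [pair for pair in token_bigrams(source) if counts[pair] == 0]


# Hypothetical three-snippet "repository", made up for this illustration.
CORPUS = [
    "if x == 1:\n    y = 2\n",
    "if a == b:\n    c = a\n",
    "while n == 0:\n    n = 1\n",
]
MODEL = learn(CORPUS)
SUSPECT = "if x = 1:\n    y = 2\n"  # assignment where a comparison was meant

if __name__ == "__main__":
    print(unseen_bigrams(MODEL, SUSPECT))
```

With so little training data the sketch would also flag plenty of perfectly good code; the point is only that unusual patterns stand out once enough examples have been counted.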

Copyright The Financial Times Limited 2024. All rights reserved.