Microsoft claims to have developed a system that correctly distinguishes between security and non-security software bugs 99% of the time, and that accurately identifies critical, high-priority security bugs on average 97% of the time. In the coming months, it plans to open-source the methodology on GitHub, along with example models and other resources.
The work suggests that such a system, which was trained on a data set of 13 million work items and bugs from 47,000 developers at Microsoft stored across Azure DevOps and GitHub repositories, could be used to support human experts. Coralogix estimates that developers create 70 bugs per 1,000 lines of code and that fixing a bug takes 30 times longer than writing a line of code; in the U.S., $113 billion is spent annually on identifying and fixing product defects.
In designing the model, Microsoft says that security experts approved the training data and that statistical sampling was used to give those experts a manageable amount of data to review. The data was then encoded into representations called feature vectors, and Microsoft researchers built the system using a two-step process. First, the model learned to classify security and non-security bugs, and then it learned to apply severity labels (critical, important, or low-impact) to the security bugs.
Microsoft’s model leverages two techniques to make its bug predictions. The first is term frequency-inverse document frequency (TF-IDF), an information retrieval technique that weights a word by how often it appears in a document and discounts it by how common it is across the whole collection, in this case a collection of bug titles. (Microsoft says that its bug titles are generally very short, containing around 10 words.) The second technique, logistic regression, uses a logistic function to model the probability that an item belongs to a given class.
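To make the two techniques concrete, here is a minimal sketch of how TF-IDF features and logistic regression could be chained into the two-step process described above. It is not Microsoft's code. The bug titles, labels, function names, and training settings are all invented for illustration, and the severity step is simplified to a critical/not-critical decision rather than the three labels Microsoft uses.

```python
import math
from collections import Counter

# Hypothetical toy data: titles and labels are invented for illustration.
TITLES = [
    "sql injection in login form",
    "buffer overflow in image parser",
    "xss vulnerability in comment field",
    "password stored in plain text log",
    "button misaligned on settings page",
    "typo on welcome screen",
    "slow page load on dashboard",
    "crash when resizing window",
]
IS_SECURITY = [1, 1, 1, 1, 0, 0, 0, 0]
# Severity for the four security titles only (1 = critical, 0 = not critical).
IS_CRITICAL = [1, 1, 0, 0]


def tokenize(title):
    return title.lower().split()


def fit_idf(titles):
    """Smoothed inverse document frequency for every word seen in the titles."""
    n = len(titles)
    doc_freq = Counter(w for t in titles for w in set(tokenize(t)))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}


def tfidf(title, idf):
    """Sparse, L2-normalized TF-IDF vector; words unseen in training are dropped."""
    counts = Counter(w for w in tokenize(title) if w in idf)
    vec = {w: c * idf[w] for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {w: v / norm for w, v in vec.items()}


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(vec, weights, bias):
    """Probability of the positive class for one sparse feature vector."""
    z = bias + sum(weights.get(w, 0.0) * v for w, v in vec.items())
    return sigmoid(z)


def train_logreg(vectors, labels, epochs=500, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = predict_proba(vec, weights, bias) - y
            bias -= lr * err
            for w, v in vec.items():
                weights[w] = weights.get(w, 0.0) - lr * err * v
    return weights, bias


idf = fit_idf(TITLES)
vectors = [tfidf(t, idf) for t in TITLES]

# Step 1: security vs. non-security, trained on every title.
sec_w, sec_b = train_logreg(vectors, IS_SECURITY)

# Step 2: severity, trained only on the titles labeled as security bugs.
sec_vectors = [v for v, y in zip(vectors, IS_SECURITY) if y == 1]
crit_w, crit_b = train_logreg(sec_vectors, IS_CRITICAL)


def triage(title):
    """Run both steps: the severity model is consulted only for security bugs."""
    vec = tfidf(title, idf)
    if predict_proba(vec, sec_w, sec_b) < 0.5:
        return "non-security"
    if predict_proba(vec, crit_w, crit_b) >= 0.5:
        return "security: critical"
    return "security: not critical"


if __name__ == "__main__":
    for t in ("sql injection in login form", "typo on welcome screen"):
        print(t, "->", triage(t))
```

Training the severity model only on security bugs mirrors the ordering Microsoft describes, in which severity labels are applied after the security/non-security decision. A production system would more likely rely on an established library than on hand-written training loops like these.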
Microsoft says that the model is deployed in production internally, and that it is continually retrained with data approved by security experts who monitor the number of bugs generated in software development.
“Every day, software developers stare down a long list of features and bugs that need to be addressed. Security professionals try to help by using automated tools to prioritize security bugs, but too often, engineers waste time on false positives or miss a critical security vulnerability that has been misclassified,” wrote Microsoft senior security program manager Scott Christiansen and Microsoft data and applied scientist Mayana Pereira in a blog post. “We discovered that by pairing machine learning models with security experts, we can significantly improve the identification and classification of security bugs.”
Microsoft isn’t the only tech giant using AI to weed out software bugs. Amazon’s CodeGuru service, which was partly trained on code reviews and apps developed internally at Amazon, spots issues including resource leaks and wasted CPU cycles. As for Facebook, it developed a tool called SapFix that generates fixes for bugs before sending them to human engineers for approval, and another tool called Zoncolan that maps the behavior and functions of codebases and looks for potential problems in individual branches as well as in the interactions of various paths through the program.