viralamo

GitHub now uses AI to recommend open issues in project repositories

22/01/2020

Large open source projects on GitHub have intimidatingly long lists of problems that require addressing. To make it easier to spot the most pressing among them, GitHub recently introduced the “good first issues” feature, which matches contributors with project issues that are likely to fit their interests. The initial version, which launched in May 2019, surfaced recommendations based on labels applied to issues by project maintainers. But an updated release shipped last month incorporates an AI algorithm that GitHub says surfaces issues in about 70% of the repositories recommended to users.

GitHub says the feature is the first deep-learning-enabled product to launch on GitHub.com.

According to senior machine learning engineer Tiferet Gazit, GitHub last year used analysis and manual curation to create a list of 300 label names used by popular open source repositories. (All were synonyms for either “good first issue” or “documentation,” like “beginner friendly,” “easy bug fix,” and “low-hanging-fruit.”) But relying on these labels meant that issues could be surfaced in only about 40% of the recommended repositories. It also left project maintainers with the burden of triaging and labeling issues themselves.
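As a rough illustration of the label-based approach, the sketch below matches an issue's labels against a small curated list. The five label names come from the examples above; the mapping of each synonym to a category, the normalization rule, and the names `CURATED_LABELS`, `normalize`, and `label_category` are assumptions made for this example, not GitHub's actual list or code.

```python
import re

# Hypothetical excerpt of a curated label list (the real one has about 300 names).
# Which category each synonym maps to is an assumption for illustration.
CURATED_LABELS = {
    "good first issue": "good first issue",
    "beginner friendly": "good first issue",
    "easy bug fix": "good first issue",
    "low hanging fruit": "good first issue",
    "documentation": "documentation",
}


def normalize(label: str) -> str:
    """Lowercase a label and collapse hyphens, underscores, and spaces."""
    return re.sub(r"[-_\s]+", " ", label.strip().lower())


def label_category(issue_labels):
    """Return the curated category of the first matching label, or None."""
    for label in issue_labels:
        category = CURATED_LABELS.get(normalize(label))
        if category is not None:
            return category
    return None


print(label_category(["bug", "Low-Hanging-Fruit"]))
print(label_category(["enhancement"]))
```

An issue with no curated label returns `None`, which is the gap the article describes: a repository whose maintainers never apply such labels has nothing to surface.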

By contrast, the new AI recommender system is largely automatic. But building it required crafting an annotated training set of hundreds of thousands of samples.


GitHub began with issues that had any of the roughly 300 labels in its curated list, which the company supplemented with a few sets of issues that were also likely to be beginner-friendly. (These included issues that were closed by a user who had never previously contributed to the repository, as well as issues closed by changes that touched only a few lines of code in a single file.) After near-duplicate issues were detected and removed, the training, validation, and test sets were separated across repositories to prevent data leakage from similar content. GitHub trained the AI system using only preprocessed and denoised issue titles and bodies, so that it can detect good issues as soon as they're opened.
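Separating the sets across repositories means every issue from a given repository lands in the same split, so near-identical text from one project cannot appear in both training and test data. A minimal sketch of one way to do this follows; the hash-based assignment, the 10% validation and test shares, and the `repo` field name are assumptions for illustration, not details from GitHub.

```python
import hashlib


def split_for_repo(repo: str, val_pct: int = 10, test_pct: int = 10) -> str:
    """Deterministically assign a whole repository to one split."""
    digest = hashlib.sha256(repo.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


def split_issues(issues):
    """Group issues into splits so that no repository spans two splits."""
    splits = {"train": [], "validation": [], "test": []}
    for issue in issues:
        splits[split_for_repo(issue["repo"])].append(issue)
    return splits


# Hypothetical sample data: 50 issues spread over 7 repositories.
issues = [{"repo": f"org/repo{i % 7}", "title": f"issue {i}"} for i in range(50)]
splits = split_issues(issues)
for name, items in splits.items():
    print(name, sorted({issue["repo"] for issue in items}))
```

Because the assignment depends only on the repository name, rerunning the pipeline on new data keeps each repository in the split it was given before.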

In production, each issue for which the recommender predicts a probability above the required threshold is slated for recommendation, with a confidence score equal to its predicted probability. Open issues from non-archived public repositories that have at least one of the labels from the curated label list are given a confidence score based on the relevance of their labels, with synonyms of “good first issue” given higher confidence than synonyms of “documentation.” Within each repository, all detected issues are then ranked primarily based on their confidence score (though label-based detections are generally given higher confidence than ML-based detections), along with a penalty on issue age.
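The ranking described above can be sketched roughly as follows. The article gives no numbers, so the threshold, the per-label confidence values, the linear age penalty, and all names here (`Issue`, `confidence`, `rank`) are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Optional

# All constants below are assumed values; the article does not state them.
ML_THRESHOLD = 0.8
LABEL_CONFIDENCE = {"good first issue": 1.0, "documentation": 0.9}
AGE_PENALTY_PER_DAY = 0.001


@dataclass
class Issue:
    title: str
    age_days: int
    label_category: Optional[str] = None    # set when a curated label matched
    ml_probability: Optional[float] = None  # set when the model scored the issue


def confidence(issue: Issue) -> Optional[float]:
    """Return a confidence score, or None if the issue is not recommended."""
    if issue.label_category in LABEL_CONFIDENCE:
        return LABEL_CONFIDENCE[issue.label_category]
    if issue.ml_probability is not None and issue.ml_probability > ML_THRESHOLD:
        return issue.ml_probability
    return None


def rank(issues):
    """Order recommended issues by confidence minus an age penalty."""
    scored = []
    for issue in issues:
        score = confidence(issue)
        if score is not None:
            scored.append((score - AGE_PENALTY_PER_DAY * issue.age_days, issue))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [issue for _, issue in scored]


repo_issues = [
    Issue("Model-detected issue", age_days=10, ml_probability=0.85),
    Issue("Fix README typo", age_days=5, label_category="documentation"),
    Issue("Add missing null check", age_days=30, label_category="good first issue"),
    Issue("Redesign scheduler", age_days=1, ml_probability=0.5),
]
for issue in rank(repo_issues):
    print(issue.title)
```

With these assumed values, label-based detections outrank the model-detected issue, and the issue scored below the threshold is dropped entirely.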

Data acquisition, training, and inference pipelines run daily, according to Gazit, using scheduled workflows to ensure the results remain “fresh” and “relevant.” In the future, GitHub plans to add better signals to its repository recommendations and a mechanism for maintainers and triagers to approve or remove AI-based recommendations in their repositories. It also plans to extend issue recommendations to offer personalized suggestions on which issues to tackle next for anyone who has already made contributions to a project.
