MIT and IBM develop AI that recommends documents based on topic

20/12/2019

Even the best text-parsing recommendation algorithms can be stymied by sufficiently large data sets. In an effort to deliver faster, better classification performance than the bulk of existing methods, a team at the MIT-IBM Watson AI Lab and MIT’s Geometric Data Processing Group devised a technique that combines popular AI tools, including embeddings and optimal transport. They say their approach can scan millions of possibilities given only the historical preferences of a person or of a group of people.

“There’s a ton of text on the internet,” said Justin Solomon, an MIT assistant professor and lead author on the research, in a statement. “Anything to help cut through all that material is extremely useful.”

To this end, Solomon and colleagues’ algorithm summarizes a collection of texts into topics based on commonly used words in the collection. Next, it divides each text into its five to 15 most important topics, with a ranking indicating each topic’s importance to the text overall. Embeddings (numerical representations of data, in this case words) help make evident the similarity among words, while optimal transport helps to calculate the most efficient way of moving objects (or data points) among multiple destinations.
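To make the two ingredients in the paragraph above concrete, here is a minimal Python sketch. It is not the researchers' code: the topic names and weights, the two-dimensional vectors, and the helper names `top_topics` and `cosine_similarity` are all made up for illustration, and rescaling the kept weights to sum to 1 is an assumption rather than something the article states. Real word embeddings have far more dimensions.

```python
import math

def top_topics(weights, k):
    """Keep the k heaviest topics, in ranked order, and rescale them to sum to 1."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(weight for _, weight in ranked)
    return {topic: weight / total for topic, weight in ranked}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Hypothetical topic weights for a single document (illustrative numbers only).
TOPIC_WEIGHTS = {"sea": 0.5, "law": 0.3, "food": 0.15, "war": 0.05}
kept = top_topics(TOPIC_WEIGHTS, k=2)
print(kept)  # the two highest-ranked topics with rescaled weights

# Toy 2-D word embeddings (assumed values, chosen so related words point the same way).
EMBEDDINGS = {"ship": (0.9, 0.1), "boat": (0.8, 0.2), "tax": (0.1, 0.9)}
print(cosine_similarity(EMBEDDINGS["ship"], EMBEDDINGS["boat"]))
print(cosine_similarity(EMBEDDINGS["ship"], EMBEDDINGS["tax"]))
```

In this toy setup "ship" scores as far more similar to "boat" than to "tax", which is the kind of word-level closeness the embeddings are said to expose.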

The embeddings make it possible to leverage optimal transport twice: first to compare topics within the collection, and then to measure how closely common themes overlap. This works especially well when scanning large collections of books and documents, according to the researchers. In an evaluation involving 1,720 pairs of titles in the Project Gutenberg data set, the algorithm compared all of them in one second, more than 800 times faster than the next-best method.
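The two-stage use of optimal transport described above can be sketched roughly as follows. This is a toy under stated assumptions, not the researchers' implementation: it swaps in Sinkhorn iterations (an entropy-regularized approximation to optimal transport) because they fit in a few lines of plain Python, and the embeddings, topics, documents, and function names are all hypothetical. Because of the regularization, the distance from a distribution to itself comes out small but not exactly zero.

```python
import math

def sinkhorn_cost(a, b, cost, reg=0.05, iters=500):
    """Approximate optimal transport cost between histograms a and b (Sinkhorn)."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns so the plan matches both marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan entry (i, j) is u[i] * kernel[i][j] * v[j].
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def topic_distance(t1, t2, emb):
    """Stage 1: transport one topic's word weights onto another's, using embedding distances."""
    w1, w2 = list(t1), list(t2)
    cost = [[euclidean(emb[x], emb[y]) for y in w2] for x in w1]
    return sinkhorn_cost([t1[x] for x in w1], [t2[y] for y in w2], cost)

def document_distance(d1, d2, topics, emb):
    """Stage 2: transport one document's topic weights onto another's, using topic distances."""
    n1, n2 = list(d1), list(d2)
    cost = [[topic_distance(topics[s], topics[t], emb) for t in n2] for s in n1]
    return sinkhorn_cost([d1[s] for s in n1], [d2[t] for t in n2], cost)

# Toy 2-D embeddings, topics (word distributions), and documents (topic distributions).
EMB = {"ship": (1.0, 0.0), "sea": (0.9, 0.1), "court": (0.0, 1.0), "law": (0.1, 0.9)}
TOPICS = {
    "sailing": {"ship": 0.6, "sea": 0.4},
    "voyage": {"sea": 0.7, "ship": 0.3},
    "legal": {"court": 0.5, "law": 0.5},
}
doc_a = {"sailing": 0.8, "legal": 0.2}
doc_b = {"voyage": 0.9, "legal": 0.1}
doc_c = {"legal": 1.0}

d_ab = document_distance(doc_a, doc_b, TOPICS, EMB)
d_ac = document_distance(doc_a, doc_c, TOPICS, EMB)
print(d_ab, d_ac)  # doc_a should land closer to doc_b than to doc_c
```

The point of the design is that the expensive word-level comparisons happen only between topics, of which there are few, while each pair of documents is compared over a short list of topic weights.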

Moreover, the algorithm does a better job of sorting documents than rival methods, for example grouping books in the Project Gutenberg data set by author and product reviews on Amazon by department. It’s also more explainable in that it provides lists of topics, enabling users to better understand why it’s recommending a given document.

The researchers leave to future work the development of an end-to-end training technique that optimizes the embeddings, topic models, and optimal transport jointly rather than separately, as in the current implementation. They also hope to apply their approach to larger data sets, and to investigate applications to the modeling of images or three-dimensional data.

“[Our algorithm] appears to capture differences in the same way a person asked to compare two documents would: by breaking down each document into easy to understand concepts, and then comparing the concepts,” wrote Solomon and coauthors in a paper summarizing their work. “[W]ord embeddings provide global semantic language information, while … topic models provide corpus-specific topics and topic distributions. Empirically these combine to give superior performance on various metric-based tasks.”
