Google’s AI lets users search language-agnostic knowledge bases in their native tongue

11/11/2020

Entity linking fulfills a key role in grounded language understanding. Given a text mention of an entity (e.g., the word “helpful”), an algorithm identifies the entity’s corresponding entry in a knowledge base (such as a Wikipedia article). To extend its usefulness, researchers at Google propose a new technique where language-specific mentions resolve to a language-agnostic knowledge base. They describe a single entity retrieval model that covers over 100 languages and 20 million entities while ostensibly outperforming results from more limited cross-lingual tasks.

Multilingual entity linking means resolving a text snippet, in its context, to the corresponding entity in a language-agnostic knowledge base. Knowledge bases are essentially databases of information about entities: people, places, and things. In 2012, Google launched a knowledge base, the Knowledge Graph, to enhance search results with hundreds of billions of facts gathered from sources including Wikipedia, Wikidata, and the CIA World Factbook. Microsoft provides a knowledge base with over 150,000 articles created by support professionals who have resolved issues for its customers.

Knowledge bases in multilingual entity linking may include textual information like names and descriptions about each entity in one or more languages. But they make no prior assumption about the relationship between these knowledge base languages and the mention-side language.

The Google researchers used what are called enhanced dual encoder retrieval models and chose Wikidata, which covers a large and diverse set of entities, as their knowledge base. Wikidata contains names and short descriptions, but through its close integration with all Wikipedia editions, it also connects entities to rich descriptions (and other features) drawn from the corresponding language-specific Wikipedia pages.
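To make the dual encoder idea concrete, here is a minimal sketch of retrieval with two encoders that map mentions and entities into one shared vector space and rank entities by dot product. It is not the researchers' model: the character-trigram "encoder" is a toy stand-in for a trained neural network, and the entity IDs, descriptions, and function names (`encode_mention`, `encode_entity`, `build_index`, `retrieve`) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str    # language-agnostic ID (placeholder, not a real Wikidata ID)
    description: str  # descriptive text, which may be in any language


def _embed(text: str, n: int = 3) -> dict[str, float]:
    """Toy encoder: L2-normalised sparse vector of character n-gram counts."""
    padded = f" {text.lower()} "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {gram: c / norm for gram, c in counts.items()}


def encode_mention(mention: str, context: str) -> dict[str, float]:
    """Mention-side encoder: sees the mention together with its context."""
    return _embed(f"{mention} {context}")


def encode_entity(entity: Entity) -> dict[str, float]:
    """Entity-side encoder: sees only the knowledge base description."""
    return _embed(entity.description)


def _dot(a: dict[str, float], b: dict[str, float]) -> float:
    return sum(weight * b.get(gram, 0.0) for gram, weight in a.items())


def build_index(entities: list[Entity]) -> dict[str, dict[str, float]]:
    """Encode every entity once, ahead of query time."""
    return {e.entity_id: encode_entity(e) for e in entities}


def retrieve(mention: str, context: str,
             index: dict[str, dict[str, float]], k: int = 1) -> list[str]:
    """Return the IDs of the k entities closest to the encoded mention."""
    query = encode_mention(mention, context)
    scored = sorted(((_dot(query, vec), eid) for eid, vec in index.items()),
                    reverse=True)
    return [eid for _, eid in scored[:k]]


# Hypothetical two-entity knowledge base with English descriptions.
ENTITIES = [
    Entity("ENT_BERLIN", "Berlin, capital and largest city of Germany"),
    Entity("ENT_PARIS", "Paris, capital and largest city of France"),
]

if __name__ == "__main__":
    index = build_index(ENTITIES)
    # A German-language mention resolved against English descriptions.
    print(retrieve("Berlin",
                   "Angela Merkel sprach gestern in Berlin über Europa.",
                   index))
```

The toy encoder only rewards shared character sequences, so it works here because "Berlin" is spelled the same in both languages. A trained dual encoder is meant to place a mention and its entity close together even when their surface forms differ, which is what lets one index serve many mention languages.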

Google entity model

The researchers extracted a large-scale dataset of 684 million mentions in 104 languages linked to Wikidata entities, which they say is at least six times larger than datasets used in prior English-only linking work. In addition, the coauthors created a matching dataset, Mewsli-9, that spans a diverse set of languages and entities, including 289,087 entity mentions appearing in 58,717 news articles from Wikinews. (Only 11% of the 82,162 distinct target entities in Mewsli-9 lack English Wikipedia pages, which places an upper bound on the performance of systems restricted to English Wikipedia entities.)
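The Mewsli-9 figures above imply a few derived numbers, worked out in the short sketch below. It assumes the reported 11% is exact, although it is presumably rounded, so the results are approximate. The ceiling it computes applies to distinct target entities, not to individual mentions.

```python
# Figures as reported in the article; variable names are illustrative.
TOTAL_MENTIONS = 289_087
TOTAL_ARTICLES = 58_717
DISTINCT_ENTITIES = 82_162
SHARE_WITHOUT_ENGLISH_PAGE = 0.11  # assumed exact; likely a rounded value

# Approximate number of target entities with no English Wikipedia page.
entities_without_english = round(DISTINCT_ENTITIES * SHARE_WITHOUT_ENGLISH_PAGE)

# Largest share of distinct targets that a system limited to
# English Wikipedia entities could possibly resolve.
english_only_ceiling = 1.0 - SHARE_WITHOUT_ENGLISH_PAGE

# Average number of entity mentions per news article.
mentions_per_article = TOTAL_MENTIONS / TOTAL_ARTICLES

if __name__ == "__main__":
    print(entities_without_english)
    print(f"{english_only_ceiling:.0%}")
    print(f"{mentions_per_article:.2f}")
```

Under that assumption, roughly 9,000 target entities are out of reach for an English-only system, which could cover at most about 89% of the distinct targets, and each article carries about five linked mentions on average.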

The researchers say the results show that entity linking can better reflect the real-world challenges of rare entities and/or low resource languages. “Operationalized through Wikipedia and WikiData, our experiments using enhanced dual encoder retrieval models and frequency-based evaluation provide compelling evidence that it is feasible to perform this task with a single model covering over a 100 languages,” they wrote. “Our automatically extracted Mewsli-9 dataset serves as a starting point for evaluating entity linking beyond the entrenched English benchmarks and under the expanded multilingual setting.”

It’s unclear whether the researchers’ models exhibit demographic bias, however. In a paper published earlier this year, Twitter researchers claimed to have found evidence of prejudice in popular named entity recognition models, particularly with respect to Black and other “non-white” names. But the Google coauthors leave the door open to using non-expert human raters to improve the quality of the training dataset and incorporate relational knowledge.

