viralamo


Google’s AI lets users search language-agnostic knowledge bases in their native tongue

11/11/2020

Entity linking plays a key role in grounded language understanding. Given a text mention of an entity (e.g., the word “helpful”), an algorithm identifies the entity’s corresponding entry in a knowledge base (such as a Wikipedia article). To extend its usefulness, researchers at Google propose a new technique in which language-specific mentions resolve to a language-agnostic knowledge base. They describe a single entity retrieval model that covers over 100 languages and 20 million entities while reportedly outperforming results from more limited cross-lingual tasks.

Multilingual entity linking involves linking a text snippet in some context to the corresponding entity in a language-agnostic knowledge base. Knowledge bases are essentially databases comprising information about entities: people, places, and things. In 2012, Google launched a knowledge base, the Knowledge Graph, to enhance search results with hundreds of billions of facts gathered from sources including Wikipedia, Wikidata, and the CIA World Factbook. Microsoft provides a knowledge base with over 150,000 articles created by support professionals who have resolved issues for its customers.

Knowledge bases in multilingual entity linking may include textual information like names and descriptions about each entity in one or more languages. But they make no prior assumption about the relationship between these knowledge base languages and the mention-side language.
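To make the idea concrete, here is a minimal sketch of what a language-agnostic knowledge base entry might look like. The `Entity` schema, the IDs `E1` and `E2`, and the exact-name lookup are all assumptions made for illustration; they are not the researchers' data format or method. Note that the second entry has a description only in German, reflecting that nothing guarantees the knowledge base covers the language a mention is written in.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entity:
    """One language-agnostic knowledge base entry (hypothetical schema)."""
    entity_id: str  # made-up ID for illustration, not a real WikiData ID
    names: Dict[str, str] = field(default_factory=dict)         # language code -> name
    descriptions: Dict[str, str] = field(default_factory=dict)  # language code -> description


def build_alias_index(entities):
    """Map every lowercased name, in any language, to the IDs that use it."""
    index = {}
    for entity in entities:
        for name in entity.names.values():
            index.setdefault(name.lower(), set()).add(entity.entity_id)
    return index


KB = [
    Entity("E1", {"en": "Germany", "de": "Deutschland", "es": "Alemania"},
           {"en": "country in Central Europe"}),
    Entity("E2", {"en": "Munich", "de": "München"},
           {"de": "Stadt in Bayern"}),  # no English description available
]
ALIASES = build_alias_index(KB)


def lookup(mention):
    """Return candidate entity IDs for a mention, whatever its language."""
    return sorted(ALIASES.get(mention.lower(), set()))


print(lookup("Deutschland"))
print(lookup("Alemania"))
```

Both calls resolve to the same ID because the entry, not the language, is the unit of identity. Exact name matching like this fails on unseen spellings and ambiguous names, which is one reason learned retrieval models are used instead.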

The Google researchers used what are called enhanced dual encoder retrieval models, with WikiData as their knowledge base, which covers a large set of diverse entities. WikiData contains names and short descriptions, but through its close integration with all Wikipedia editions, it also connects entities to rich descriptions (and other features) drawn from the corresponding language-specific Wikipedia pages.
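In general terms, a dual encoder maps mentions and entities into the same vector space, precomputes the entity vectors, and retrieves the entities whose vectors score highest against the mention vector. The sketch below illustrates only that retrieval pattern. The `encode` function is a toy stand-in based on character trigram counts, not a learned model, and the entity IDs and texts are invented for the example; none of it reflects the researchers' actual architecture or training.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a learned encoder: a sparse, unit-length vector of
    character trigram counts. A real dual encoder would use two trained
    neural networks, one for mentions and one for entities."""
    padded = f"  {text.lower()}  "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def score(mention_vec, entity_vec):
    """Dot product of two unit vectors, i.e. cosine similarity."""
    return sum(w * entity_vec.get(gram, 0.0) for gram, w in mention_vec.items())


# Entity side: encoded once, ahead of time (IDs and texts are hypothetical).
ENTITY_TEXTS = {
    "E1": "Germany Deutschland Alemania",
    "E2": "Munich München",
}
ENTITY_VECTORS = {eid: encode(text) for eid, text in ENTITY_TEXTS.items()}


def retrieve(mention, k=1):
    """Return the k entity IDs whose vectors score highest for the mention."""
    query = encode(mention)
    ranked = sorted(
        ENTITY_VECTORS,
        key=lambda eid: score(query, ENTITY_VECTORS[eid]),
        reverse=True,
    )
    return ranked[:k]


print(retrieve("Deutschland"))
print(retrieve("München"))
```

Because the entity vectors are computed independently of any mention, they can be indexed once and searched with nearest-neighbor lookup, which is what makes this pattern practical for knowledge bases with millions of entries.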

Google entity model

The researchers extracted a large-scale dataset of 684 million mentions in 104 languages linked to WikiData entities, which they say is at least six times larger than datasets used in prior English-only linking work. In addition, the coauthors created a matching dataset — Mewsli-9 — that spans a diverse set of languages and entities, including 289,087 entity mentions appearing in 58,717 news articles from WikiNews. (Only 11% of the 82,162 distinct target entities in Mewsli-9 don’t have English Wikipedia pages, setting an upper bound on systems focused on English Wikipedia entities.)
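The Mewsli-9 figures above can be unpacked with a few lines of arithmetic. This is a rough sketch: the 11% share is a rounded figure as reported, so the derived entity count is approximate, and the ceiling applies to distinct entities rather than to individual mentions.

```python
mentions = 289_087
articles = 58_717
entities = 82_162
share_without_english_page = 0.11  # "11%" as reported; a rounded figure

# Average density of entity mentions in the news articles.
mentions_per_article = mentions / articles

# Approximate number of target entities with no English Wikipedia page.
entities_without_english_page = round(entities * share_without_english_page)

# Best possible entity coverage for a system limited to English Wikipedia.
english_only_entity_ceiling = 1 - share_without_english_page

print(f"{mentions_per_article:.1f} mentions per article")               # 4.9
print(f"about {entities_without_english_page:,} entities lack a page")  # about 9,038
print(f"{english_only_entity_ceiling:.0%} entity coverage ceiling")     # 89%
```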

The researchers say the results show that entity linking can better reflect the real-world challenges of rare entities and/or low resource languages. “Operationalized through Wikipedia and WikiData, our experiments using enhanced dual encoder retrieval models and frequency-based evaluation provide compelling evidence that it is feasible to perform this task with a single model covering over a 100 languages,” they wrote. “Our automatically extracted Mewsli-9 dataset serves as a starting point for evaluating entity linking beyond the entrenched English benchmarks and under the expanded multilingual setting.”

It’s unclear whether the researchers’ models exhibit demographic bias, however. In a paper published earlier this year, Twitter researchers claimed to have found evidence of prejudice in popular named entity recognition models, particularly with respect to Black and other “non-white” names. But the Google coauthors leave the door open to using non-expert human raters to improve the quality of the training dataset and incorporate relational knowledge.

