viralamo

Researchers say we need better benchmarks to build more useful AI assistants

05/08/2020

The promise of conversational AI is that, unlike virtually any other form of technology, all you have to do is talk. Natural language is the most natural and democratic form of communication. After all, humans are born capable of learning how to speak, but some never learn to read or use a graphical user interface. That’s why AI researchers from Element AI, Stanford University, and CIFAR recommend that academic researchers take steps to create more useful forms of AI that speak with people to get things done, including moving away from incremental work on existing benchmarks.

“As many current [language user interface] benchmarks suffer from low ecological validity, we recommend researchers not to initiate incremental research projects on them. Benchmark-specific advances are less meaningful when it is unclear if they transfer to real LUI use cases. Instead, we suggest the community to focus on conceptual research ideas that can generalize well beyond the current datasets,” the paper reads.

The ideal way to create language user interfaces (LUIs), they say, is to identify a group of people who would benefit from their use, collect conversations and the corresponding programs or actions, train a model, and then ask users for feedback.

The paper, titled “Towards Ecologically Valid Research on Language User Interfaces,” was published last week on the preprint repository arXiv and promotes the creation of practical language models that can help people in their professional or personal lives. It identifies common shortcomings in popular existing benchmarks like SQuAD, which does not focus on working with target users, and CLEVR, which uses synthetic language.

Examples of speech interface challenges that academic researchers could pursue instead, the authors say, include AI assistants that can talk with citizens about government data or benchmarks for popular games like Minecraft. Facebook AI Research released data and code to encourage the development of a Minecraft assistant last year.

Some governments have explored the use of conversational AI to help guide citizens through important moments in life or navigate government services. The Computing Community Consortium (CCC) recommends the development of lifelong intelligent assistants to do things like help people through their daily tasks or help them adapt to big changes like a new job or hobby.

The paper’s authors focus on language user interfaces such as an AI that can act as a personal assistant or a speech interface for interacting with a home robot, but they draw a distinction between LUIs and AI models made for specific events like the Alexa Prize challenge, which rewards bots capable of holding a conversation with a human for 20 minutes.

The researchers identified a number of problematic characteristics among LUI benchmarks, such as the use of synthetic language and of artificial tasks that take place in environments not directly associated with the language model’s use case.

Some refer to the use of Amazon Mechanical Turk workers, a source of human labor AI researchers increasingly seem to rely on, as “ghost work.” The authors criticize it as a bad practice because the workers are not considered potential users of LUIs.

One example of failure to work with a target population mentioned in the paper comes from the visual question-answering (VQA) task, which trains an AI system to recognize objects and words. The VQA data set is made up of questions humans think may stump a home robot. It gathers questions from Mechanical Turk workers but does not include questions from people who are blind or visually impaired, even though the data set was made in part to assist the visually impaired. The researchers conclude, “the population that would actually benefit from the language user interface rarely participates in the data collection effort.”

The VizWiz VQA project found that people with visual impairments may ask questions differently, often asking questions that begin with “What” or that require the ability to read text. LUIs differ from conversational AI interfaces made for typed SMS or chat exchanges because people can word things differently when they speak than when they type. Scripted exchanges can also lead to the phenomenon in which the human learns the exact words a speech interface or AI assistant needs to hear in order to operate rather than using their own natural language, which defeats the purpose of creating natural language models in the first place.

Some benchmarks lack multi-turn dialogue, which the authors also criticize. Multiple studies have found that people using AI to accomplish concrete tasks respond best to multi-turn dialogue, meaning the ability to ask multiple questions or engage in dialogue instead of issuing a series of single, separate commands.

In other recent language model news, Microsoft Research said this week that it created advanced NLP for health care professionals, and last month researchers developed a method for identifying bugs in cloud AI offerings from major companies like Amazon, Apple, and Google.
