Researchers find machine learning models still struggle to detect hate speech

06/01/2021

Detecting hate speech is a task even state-of-the-art machine learning models struggle with. That’s because harmful speech comes in many different forms, and models must learn to differentiate each one from innocuous turns of phrase. Historically, hate speech detection models have been evaluated by measuring their performance on held-out data using metrics like accuracy. But this approach makes it hard to identify a model’s specific weak points, and it risks overestimating a model’s quality because of gaps and biases in hate speech datasets.
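The problem with a single aggregate score can be seen with a toy example. In the sketch below (all categories, labels, and predictions are invented for illustration), a hypothetical model reaches a respectable overall accuracy while failing completely on counter-speech and negation:

```python
# Toy illustration: one aggregate accuracy score can hide systematic
# failures on specific kinds of input. Labels: 1 = hateful, 0 = non-hateful.
examples = [
    # (category, true_label, model_prediction)
    ("explicit_slur",  1, 1),
    ("explicit_slur",  1, 1),
    ("threat",         1, 1),
    ("threat",         1, 1),
    ("counter_speech", 0, 1),  # denouncing hate gets flagged anyway
    ("counter_speech", 0, 1),
    ("negation",       0, 1),  # "No Muslim deserves to die" gets flagged
    ("neutral",        0, 0),
    ("neutral",        0, 0),
    ("neutral",        0, 0),
]

# Aggregate accuracy over the whole set.
overall = sum(t == p for _, t, p in examples) / len(examples)

# Per-category breakdown: (correct, total) for each input type.
by_category = {}
for cat, t, p in examples:
    hits, total = by_category.get(cat, (0, 0))
    by_category[cat] = (hits + (t == p), total + 1)

print(f"overall accuracy: {overall:.0%}")  # 70% — looks acceptable
for cat, (hits, total) in by_category.items():
    print(f"  {cat}: {hits}/{total}")      # counter_speech: 0/2, negation: 0/1
```

A per-functionality breakdown like this is essentially what HateCheck provides in place of one headline number.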

In search of a better solution, researchers at the University of Oxford, the Alan Turing Institute, Utrecht University, and the University of Sheffield developed HateCheck, an English-language benchmark for hate speech detection models created by reviewing previous research and conducting interviews with 16 British, German, and American nongovernmental organizations (NGOs) whose work relates to online hate. Testing HateCheck on near-state-of-the-art detection models — as well as Jigsaw’s Perspective tool — revealed “critical weaknesses” in these models, according to the team, illustrating the benchmark’s utility.

HateCheck comprises 29 functional tests designed to be difficult for models relying on simplistic rules, including derogatory hate speech, threatening language, and hate expressed using profanity. Eighteen of the tests cover distinct expressions of hate (e.g., statements like “I hate Muslims,” “Typical of a woman to be that stupid,” “Black people are scum”), while the remaining 11 tests cover what the researchers call contrastive non-hate, or content that shares linguistic features with hateful expressions (e.g., “I absolutely adore women,” which contrasts with “I absolutely loathe women”).
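HateCheck generates its test cases from templates with placeholders for target groups. The sketch below imitates that construction; the templates, groups, and labels are illustrative only, not the benchmark's actual contents:

```python
# A minimal sketch of HateCheck-style templated test cases. The real suite
# is far larger; these templates and groups are invented for illustration.
HATEFUL_TEMPLATES = [
    "I hate {group}.",
    "{group} are scum.",
]
CONTRASTIVE_NON_HATE_TEMPLATES = [
    "I absolutely adore {group}.",             # positive, shared surface form
    "Saying 'I hate {group}' is disgusting.",  # quoted hate inside a denouncement
]
GROUPS = ["women", "Muslims", "Black people"]

def expand(templates, groups, label):
    """Fill each template with each target group, attaching the gold label."""
    return [(t.format(group=g), label) for t in templates for g in groups]

cases = (expand(HATEFUL_TEMPLATES, GROUPS, "hateful")
         + expand(CONTRASTIVE_NON_HATE_TEMPLATES, GROUPS, "non-hateful"))
print(len(cases))  # 12 labelled test cases from 4 templates x 3 groups
```

Because each case carries both a gold label and a known functionality, failures can be traced to a specific linguistic phenomenon rather than averaged away.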

In experiments, the researchers analyzed two DistilBERT models that achieved strong performance on public hate speech datasets and the “identity attack” model from Perspective, an API released in 2017 for content moderation. Perspective is maintained by Google’s Counter Abuse Technology team and Jigsaw, the organization working under Google parent company Alphabet to tackle cyberbullying and disinformation, and it’s used by media organizations including the New York Times and Vox Media.

The researchers found that as of December 2020, all of the models appear to be overly sensitive to specific keywords — mainly slurs and profanity — and often misclassify non-hateful contrasts (like negation and counter-speech) around hateful phrases.
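The failure mode is easy to reproduce with a deliberately naive keyword classifier, which is not any of the tested models but illustrates the keyword-based decision rules the researchers describe:

```python
# A deliberately naive keyword classifier: it flags counter-speech and
# negation because they reuse the same surface vocabulary as hate.
TRIGGER_WORDS = {"hate", "scum", "die"}

def keyword_classifier(text: str) -> str:
    """Label text 'hateful' if any trigger word appears, ignoring context."""
    words = {w.strip(".,!?'\"").lower() for w in text.split()}
    return "hateful" if words & TRIGGER_WORDS else "non-hateful"

print(keyword_classifier("I hate Muslims"))             # hateful (correct)
print(keyword_classifier("No Muslim deserves to die"))  # hateful (wrong: negation)
print(keyword_classifier("Telling women you hate them is vile"))
                                                        # hateful (wrong: counter-speech)
```

Any classifier that has, in effect, internalized rules like these will score well on datasets dominated by explicit slurs while penalizing exactly the negations and denouncements HateCheck isolates.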

[Image] Above: Examples of hate speech in HateCheck, along with the accuracy of each model the researchers tested.

The Perspective model particularly struggles with denouncements of hate that quote the hate speech or make direct reference to it, classifying only 15.6% to 18.4% of these correctly. The model recognizes just 66% of hate speech that uses a slur and 62.9% of abuse targeted at “non-protected” groups like “artists” and “capitalists” (in statements like “artists are parasites to our society” and “death to all capitalists”), and only 54% of “reclaimed” slurs like “queer.” Moreover, the Perspective API can fail to catch spelling variations like missing characters (74.3% accuracy), added spaces between characters (74%), and spellings with numbers in place of letters (68.2%).
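The three spelling variations tested are simple string perturbations. A sketch of how such probes can be generated (the exact perturbation rules here are illustrative, not HateCheck's implementation):

```python
# Sketches of the spelling perturbations used to probe robustness:
# missing characters, spaces between characters, and number substitutions.

def drop_char(word: str, i: int = 1) -> str:
    """Remove the character at index i (a 'missing characters' variant)."""
    return word[:i] + word[i + 1:]

def space_out(word: str) -> str:
    """Insert a space between every character ('added spaces' variant)."""
    return " ".join(word)

def leetspeak(word: str) -> str:
    """Substitute digits for look-alike letters ('numbers for letters')."""
    table = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0"})
    return word.translate(table)

word = "hateful"
print(drop_char(word))  # 'hteful'
print(space_out(word))  # 'h a t e f u l'
print(leetspeak(word))  # 'h4t3ful'
```

Applying such perturbations to existing labelled examples yields robustness test cases, and the same transformations can also serve as training-time data augmentation.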

As for the DistilBERT models, they exhibit bias in their classifications across certain gender, ethnic, racial, and sexual groups, misclassifying more content directed at some groups than at others, according to the researchers. One of the models was only 30.9% accurate in identifying hate speech against women and 25.4% accurate in identifying speech against disabled people. The other was 39.4% accurate for hate speech against immigrants and 46.8% accurate for speech against Black people.

“It appears that all models to some extent encode simple keyword-based decision rules (e.g. ‘slurs are hateful’ or ‘slurs are non-hateful’) rather than capturing the relevant linguistic phenomena (e.g., ‘slurs can have non-hateful reclaimed uses’). They [also] appear to not sufficiently register linguistic signals that reframe hateful phrases into clearly non-hateful ones (e.g. ‘No Muslim deserves to die’),” the researchers wrote in a preprint paper describing their work.

The researchers suggest targeted data augmentation — training models on additional datasets containing examples of the hate speech they failed to detect — as one accuracy-improving technique. But examples like Facebook’s uneven campaign against hate speech show that significant technological challenges remain. Facebook claims to have invested substantially in AI content-filtering technologies, proactively detecting as much as 94.7% of the hate speech it ultimately removes. But the company still fails to stem the spread of problematic posts, and a recent NBC investigation revealed that on Instagram in the U.S. last year, Black users were about 50% more likely to have their accounts disabled by automated moderation systems than users whose activity indicated they were white.

“For practical applications such as content moderation, these are critical weaknesses,” the researchers continued. “Models that misclassify reclaimed slurs penalize the very communities that are commonly targeted by hate speech. Models that misclassify counter-speech undermine positive efforts to fight hate speech. Models that are biased in their target coverage are likely to create and entrench biases in the protections afforded to different groups.”

