
Researchers show that computer vision algorithms pretrained on ImageNet exhibit multiple, distressing biases

03/11/2020

State-of-the-art image-classifying AI models trained on ImageNet, a popular (but problematic) dataset containing photos scraped from the internet, automatically learn humanlike biases about race, gender, weight, and more. That’s according to new research from scientists at Carnegie Mellon University and George Washington University, who developed what they claim is a novel method for quantifying biased associations between representations of social concepts (e.g., race and gender) and attributes in images. When compared with statistical patterns in online image datasets, the findings suggest models automatically learn bias from the way people are stereotypically portrayed on the web.

Companies and researchers regularly use machine learning models trained on massive internet image datasets. To reduce costs, many employ state-of-the-art models pretrained on large corpora to help achieve other goals, a powerful approach called transfer learning. A growing number of computer vision methods are unsupervised, meaning they use no labels during training; with fine-tuning, practitioners pair these general-purpose representations with labels from specific domains to build systems for tasks like facial recognition, job candidate screening, autonomous driving, and online ad delivery.
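
To make the transfer-learning workflow concrete, here is a minimal sketch (illustrative, not from the paper): an ImageNet-pretrained backbone is frozen and only a small task-specific head is trained on domain labels. The ResNet-50 backbone and the five-class task are hypothetical stand-ins for whatever pretrained model and downstream problem a practitioner actually has.

```python
# A minimal transfer-learning sketch (illustrative, not the paper's code):
# reuse a backbone pretrained on ImageNet and train only a new task head.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for param in backbone.parameters():
    param.requires_grad = False                      # freeze the pretrained representation
backbone.fc = nn.Linear(backbone.fc.in_features, 5)  # new head, trained from scratch

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One fine-tuning step on a batch of (batch, 3, 224, 224) images."""
    optimizer.zero_grad()
    loss = criterion(backbone(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```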

Working from the hypothesis that image representations contain biases corresponding to stereotypes of groups in training images, the researchers adapted bias tests designed for contextualized word embeddings to the image domain. (Word embeddings are language modeling techniques that map words from a vocabulary to vectors of real numbers, enabling models to learn from them.) Their proposed benchmark, the Image Embedding Association Test (iEAT), modifies word embedding tests to compare pooled image-level embeddings (i.e., vectors representing whole images), with the goal of systematically measuring the biases embedded during unsupervised pretraining by comparing how strongly sets of embeddings associate with one another.
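
The iEAT adapts the embedding association tests originally developed for word vectors (the WEAT family), which compare how strongly two sets of target embeddings associate with two sets of attribute embeddings. The sketch below implements that standard differential-association effect size with cosine similarity; it is a schematic illustration, not the authors' released code.

```python
# A schematic embedding-association test in the spirit of iEAT/WEAT
# (not the authors' implementation). X and Y hold pooled image embeddings
# for two target concepts; A and B hold embeddings for two attribute concepts.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w: np.ndarray, A: np.ndarray, B: np.ndarray) -> float:
    # How much closer embedding w sits to attribute set A than to set B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X: np.ndarray, Y: np.ndarray, A: np.ndarray, B: np.ndarray) -> float:
    # Cohen's-d-style differential association between target sets X and Y.
    assoc_x = [association(x, A, B) for x in X]
    assoc_y = [association(y, A, B) for y in Y]
    pooled_std = np.std(assoc_x + assoc_y, ddof=1)
    return (np.mean(assoc_x) - np.mean(assoc_y)) / pooled_std

# Demo with random stand-in vectors; real tests use embeddings of curated images.
rng = np.random.default_rng(0)
X, Y, A, B = (rng.normal(size=(8, 512)) for _ in range(4))
print(effect_size(X, Y, A, B))
```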

To explore what kinds of biases may get embedded in image representations generated where class labels aren’t available, the researchers focused on two computer vision models published this past summer: OpenAI’s iGPT and Google’s SimCLRv2. Both were pretrained on ImageNet 2012, which contains 1.2 million annotated images of 1,000 object classes drawn from Flickr and other photo-sharing sites. And as the researchers explain, both learn to produce embeddings based on implicit patterns across the entire training set of images rather than on explicit labels.
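
Because both models are evaluated purely through the embeddings they produce, a test like the iEAT only needs a way to turn each stimulus image into a pooled feature vector. The sketch below shows one common way to do that, using a torchvision ResNet-50 as an accessible stand-in encoder; the paper itself extracts features from iGPT and SimCLRv2, which are loaded differently but used the same way in the association tests.

```python
# A hedged sketch of pooled-embedding extraction using a torchvision ResNet-50
# as a stand-in encoder (the paper uses iGPT and SimCLRv2 features instead).
import torch
from PIL import Image
from torchvision import models

weights = models.ResNet50_Weights.IMAGENET1K_V2
resnet = models.resnet50(weights=weights)
encoder = torch.nn.Sequential(*list(resnet.children())[:-1])  # drop the classifier head
encoder.eval()
preprocess = weights.transforms()  # resize/crop/normalize pipeline the weights expect

def embed(path: str) -> torch.Tensor:
    """Return a 2048-dimensional pooled embedding for one image file."""
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return encoder(image).flatten()
```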

The researchers compiled a representative set of image stimuli for categories like “age,” “gender-science,” “religion,” “sexuality,” “weight,” “disability,” “skin tone,” and “race.” For each, they drew representative images from Google Images, the open source CIFAR-100 dataset, and other sources.
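
As an illustration of how such stimulus sets might be assembled programmatically, the snippet below pulls a few person-related exemplars from CIFAR-100 via torchvision. The class names are real CIFAR-100 labels, but the grouping into stimulus sets is hypothetical and not the researchers' actual selection procedure.

```python
# An illustrative way to pull a few person-related exemplars from CIFAR-100
# (one of the stimulus sources named above); the grouping is hypothetical.
from torchvision.datasets import CIFAR100

dataset = CIFAR100(root="./data", train=True, download=True)
wanted = {"man", "woman", "boy", "girl"}
stimuli = {name: [] for name in wanted}

for idx in range(len(dataset)):
    image, label = dataset[idx]             # PIL image, integer class index
    name = dataset.classes[label]
    if name in wanted and len(stimuli[name]) < 10:
        stimuli[name].append(image)          # keep up to 10 exemplars per category
```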

In experiments, the researchers say they uncovered evidence iGPT and SimCLRv2 contain “significant” biases likely attributable to ImageNet’s data imbalance. Previous research has shown that ImageNet unequally represents race and gender; for instance, the “groom” category shows mostly white people.

Both iGPT and SimCLRv2 showed racial prejudices, both in terms of valence (i.e., positive and negative associations) and stereotyping. Embeddings from iGPT and SimCLRv2 exhibited bias on an Arab-Muslim iEAT benchmark measuring whether images of Arab Americans were considered more “pleasant” or “unpleasant” than others. iGPT was biased in a skin tone test comparing perceptions of faces with lighter and darker tones. (Lighter tones were seen by the model as more “positive.”) And both iGPT and SimCLRv2 associated white people with tools while associating Black people with weapons, a bias similar to that shown by Google Cloud Vision, Google’s computer vision service, which was found to label images of dark-skinned people holding thermometers as “gun.”

Beyond racial prejudices, the coauthors report that gender and weight biases plague the pretrained iGPT and SimCLRv2 models. In a gender-career iEAT test estimating the closeness of the category “male” to attributes like “business” and “office,” and of “female” to attributes like “children” and “home,” embeddings from both models proved stereotypical. A gender-science benchmark designed to judge the association of “male” with “science” attributes like math and engineering, and of “female” with “liberal arts” attributes like art, showed similar bias in iGPT. And iGPT displayed a bias toward lighter-weight people of all genders and races, associating thin people with pleasantness and overweight people with unpleasantness.

The researchers also report that the next-pixel prediction features of iGPT were biased against women in their tests. To demonstrate, they cropped portraits of women and men, including Rep. Alexandria Ocasio-Cortez (D-NY), below the neck and used iGPT to generate completed versions of each image. iGPT completions of ordinary, businesslike indoor and outdoor portraits of clothed women and men often featured large breasts and bathing suits; in six of the ten portraits tested, at least one of the eight completions showed a bikini or low-cut top.

[Image: examples of iGPT’s sexist image completions]

The results are unfortunately not surprising — countless studies have shown that facial recognition is susceptible to bias. A paper last fall by University of Colorado, Boulder researchers demonstrated that AI from Amazon, Clarifai, Microsoft, and others maintained accuracy rates above 95% for cisgender men and women but misidentified trans men as women 38% of the time. Independent benchmarks of major vendors’ systems by the Gender Shades project and the National Institute of Standards and Technology (NIST) have demonstrated that facial recognition technology exhibits racial and gender bias and have suggested that current facial recognition programs can be wildly inaccurate, misclassifying people upwards of 96% of the time.

However, it should be noted that efforts are underway to make ImageNet more inclusive and less toxic. Last year, the Stanford, Princeton, and University of North Carolina team behind the dataset used crowdsourcing to identify and remove derogatory words and photos. They also assessed the demographic and geographic diversity in ImageNet photos and developed a tool to surface more diverse images in terms of gender, race, and age.

“Though models like these may be useful for quantifying contemporary social biases as they are portrayed in vast quantities of images on the internet, our results suggest the use of unsupervised pretraining on images at scale is likely to propagate harmful biases,” the Carnegie Mellon and George Washington University researchers wrote in a paper detailing their work, which hasn’t been peer-reviewed. “Given the high computational and carbon cost of model training at scale, transfer learning with pre-trained models is an attractive option for practitioners. But our results indicate that patterns of stereotypical portrayal of social groups do affect unsupervised models, so careful research and analysis is needed before these models make consequential decisions about individuals and society.”

