A super-fast machine learning model for finding user search intent

01/12/2019

In April 2019, Benjamin Burkholder (who is awesome, by the way) published a Medium article showing off a script he wrote that uses SERP result features to infer a user’s search intent. The script uses the SerpAPI.com API for its data and labels search queries in the following way:

  • Informational — The person is looking for more information on a topic. This is indicated by whether an answer box or PAA (people also ask) boxes are present.
  • Navigational — The person is searching for a specific website. This is indicated by whether a knowledge graph is present or if site links are present.
  • Transactional — The person is aiming to purchase something. This is indicated by whether shopping ads are present.
  • Commercial Investigation — The person is aiming to make a purchase soon but is still researching. This is indicated by whether paid ads are present, an answer box is present, PAAs are present, or if there are ads present at the bottom of the SERP.
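The four rules above can be sketched as a small function. This is a minimal illustration, not Burkholder’s script: the feature keys (`answer_box`, `people_also_ask`, and so on) are hypothetical names for "is this SERP feature present", not SerpAPI response fields. Because the rules overlap (an answer box counts toward both Informational and Commercial Investigation) and the description above does not say how ties are resolved, the sketch returns every label whose rule fires.

```python
from typing import Dict, Set


def label_intent(features: Dict[str, bool]) -> Set[str]:
    """Return every intent label whose rule fires for one SERP.

    `features` maps hypothetical feature names to True/False.
    Missing keys are treated as "feature not present".
    """
    has = lambda name: bool(features.get(name, False))
    labels: Set[str] = set()

    # Informational: answer box or "people also ask" boxes present.
    if has("answer_box") or has("people_also_ask"):
        labels.add("Informational")

    # Navigational: knowledge graph or site links present.
    if has("knowledge_graph") or has("site_links"):
        labels.add("Navigational")

    # Transactional: shopping ads present.
    if has("shopping_ads"):
        labels.add("Transactional")

    # Commercial Investigation: paid ads, answer box, PAAs, or bottom ads.
    if (has("paid_ads") or has("answer_box")
            or has("people_also_ask") or has("bottom_ads")):
        labels.add("Commercial Investigation")

    return labels


if __name__ == "__main__":
    print(label_intent({"shopping_ads": True, "paid_ads": True}))
```

A query whose SERP shows none of these features gets an empty set, which a real labeler would need to handle explicitly.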

This is one of the coolest ways to estimate search intent, because it uses Google’s understanding of search intent (as expressed by the SERP features shown for that search).

The one problem with Burkholder’s approach is its reliance on SerpAPI. If you have a large set of search queries you want to find intent for, you need to pass each query through the API, which runs the search and returns the SERP features that Burkholder’s script then classifies. On a large set of search queries, this is time-consuming and prohibitively expensive.

SerpAPI charges ~$0.01 per keyword, so analyzing 5,000 keywords will cost you $50. Running those 5,000 keywords through Burkholder’s labeler script also takes 3 to 5 hours.
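To see how those figures grow with list size, here is a rough estimator. It assumes both cost and run time scale linearly with the number of keywords, which the figures above suggest but do not guarantee. The function name and defaults are illustrative only.

```python
from typing import Tuple


def estimate_labeling_cost(
    num_keywords: int,
    price_per_keyword: float = 0.01,                  # ~$0.01 per keyword, as quoted above
    hours_per_5000: Tuple[float, float] = (3.0, 5.0)  # 3 to 5 hours per 5,000 keywords
) -> Tuple[float, float, float]:
    """Return (dollars, min_hours, max_hours), assuming linear scaling."""
    dollars = num_keywords * price_per_keyword
    scale = num_keywords / 5000
    return dollars, hours_per_5000[0] * scale, hours_per_5000[1] * scale


if __name__ == "__main__":
    for n in (5_000, 100_000):
        dollars, low, high = estimate_labeling_cost(n)
        print(f"{n:>7} keywords: ${dollars:,.2f}, {low:.0f} to {high:.0f} hours")
```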

So I got to thinking: What if I adapted Burkholder’s approach so that, rather than use it to classify intent directly, I could use it to train a machine learning model that I would then use to classify intent? In other words, I’d incur one-time costs to produce my Burkholder-labeled training set, and, assuming it was accurate enough, I could then use that training set for all further classification, cost free.

With an accurate training set, anyone could label huge numbers of keywords super quickly, without spending a dime.

Finding a model

Hamlet Batista has written a few stellar posts about how to leverage Natural Language models like BERT for labeling intent.

In his posts, he uses an existing intent labeling model that returns categories from Kaggle’s Question Answering Dataset. While these labels can be useful, they are not really “intent categories” in the sense of a typical search intent taxonomy. Instead, they are labels such as Description, Entity, Human, Numeric, and Location.

He achieved excellent results by training a BERT encoder, getting near 90% accuracy in predicting labels for new/unlabeled search keywords.

The big question for me was, could I leverage the same tech (Uber’s Ludwig BERT encoder) to create an accurate model using the search intent labels I’d get from Burkholder’s code?

It turns out the answer is yes!

How to do it

Here’s how the process works:

1. Gather your list of keywords. If you’re planning on training your own model, I recommend doing so within a specific category/niche. Training on clothing-related keywords and then using that model to label finance-related keywords will likely be significantly less accurate than training on clothing-related keywords and then using that model to label other, unlabeled clothing-related keywords. That said, I did try using a model trained on one category/niche to label another, and the results still seemed quite good to me.

2. Run Burkholder’s script over your list of keywords from Step 1. This will require signing up for SerpAPI.com and buying credits. I recommend getting labels for at least 10,000 search queries with this script to use for training. The more training data, the more accurate your model will likely be.

3. Use the labeled data from the previous step as your training data for the BERT model. Batista’s code to do this is very straightforward, and this article will guide you through the process. I was able to get about 72% accuracy using about 10,000 labels of training data.

4. Use your model from Step 3 to label unlabeled search data, and then take a look at your results!
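Before Step 3, the labeled keywords from Step 2 need to be in a tabular file, with part of the data held back so accuracy can be measured on keywords the model has not seen. The sketch below shows one way to do that using only the Python standard library. It is not Batista’s code and does not call Ludwig. The column names (`keyword`, `intent`), the 20% hold-out, and the helper names are all assumptions made for illustration.

```python
import csv
import random
from collections import defaultdict
from typing import List, Sequence, Tuple

Row = Tuple[str, str]  # (keyword, intent label)


def split_by_label(rows: Sequence[Row], test_fraction: float = 0.2,
                   seed: int = 42) -> Tuple[List[Row], List[Row]]:
    """Hold out `test_fraction` of each label's rows for evaluation."""
    by_label = defaultdict(list)
    for keyword, label in rows:
        by_label[label].append((keyword, label))

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train: List[Row] = []
    test: List[Row] = []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = int(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def write_csv(path: str, rows: Sequence[Row]) -> None:
    """Write rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["keyword", "intent"])  # assumed column names
        writer.writerows(rows)


if __name__ == "__main__":
    labeled = [("red summer dress", "Transactional"),
               ("how to hem jeans", "Informational")] * 10
    train_rows, test_rows = split_by_label(labeled)
    write_csv("train.csv", train_rows)
    write_csv("test.csv", test_rows)
```

Splitting per label keeps rare intent categories represented in both files, which matters when one label dominates the keyword list.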

The results

I ran through this process using a huge list (13,000 keywords) of clothing/fashion-related search terms from SEMrush as my training data. My resulting model gets just about 80% accuracy.

It seems likely that training the model with more data will continue to improve its accuracy up to a point. If any of you attempt it and improve on 80% accuracy, I would love to hear about it. I think with 20,000+ labeled searches, we could see up to maybe 85-90% accuracy.

This means when you ask this model to predict the intent of unlabeled search queries, 8 times out of 10 it will give you the same label that Burkholder’s SerpAPI rules-based classifier would have returned. It can also do this for free, in large volumes and incredibly fast.
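Accuracy in this sense is simply the share of keywords where the model’s label matches the rules-based label. A minimal sketch of that calculation, with a hypothetical function name and made-up example labels:

```python
from typing import Sequence


def agreement_rate(model_labels: Sequence[str],
                   reference_labels: Sequence[str]) -> float:
    """Fraction of positions where the two label lists agree."""
    if len(model_labels) != len(reference_labels):
        raise ValueError("Both label lists must have the same length.")
    if not reference_labels:
        raise ValueError("Cannot compute agreement on empty lists.")
    matches = sum(m == r for m, r in zip(model_labels, reference_labels))
    return matches / len(reference_labels)


if __name__ == "__main__":
    # Made-up labels: the model agrees with the rules on 8 of 10 keywords.
    rules = ["Informational"] * 5 + ["Transactional"] * 5
    model = ["Informational"] * 4 + ["Navigational"] + ["Transactional"] * 4 + ["Informational"]
    print(agreement_rate(model, rules))
```

Note that this measures agreement with the rules-based labels, not with a human judgment of intent, so any mistakes in the training labels carry over to the model.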

So something that would have taken a few thousand dollars and days of scraping can now be done for free in just minutes.

To test the model, I labeled keywords from a related domain (makeup) instead of clothing keywords, and overall I think it did a pretty good job. Labeling 5,000 search queries took under two minutes with the BERT model. Here’s what my results looked like:

[Results image not reproduced here.]

The implications

For SEO tools to be useful, they need to be scalable. Keyword research, content strategy, PPC strategy, and SEO strategy usually rely on being able to do analysis across entire niches/themes/topics/websites.

In many industries, the long tail of keywords can extend into the millions. So a faster, more affordable alternative to Burkholder’s solution can make a lot of difference.

I foresee AI and machine learning tools being used more and more in our industry, giving SEOs, paid search specialists, and content marketers superpowers that weren’t possible before these new AI breakthroughs.

Happy analyzing!

Kristin Tynski is a founder and the SVP of Creative at Fractl, a boutique growth agency based in Delray Beach, FL.
