Optimizing your website for keywords can have a huge impact on your SEO and search traffic. However, finding the right keywords to target takes research. This guide will teach you how to leverage web scraping to unlock keyword insights and data to boost your SEO rankings.
What are SEO Keywords?
SEO keywords are search terms that users enter into search engines. There are 3 main types of keywords:
- Short-tail keywords – Broad, popular terms like “web scraping”
- Long-tail keywords – More specific, less popular terms like “web scraping Google rankings”
- LSI keywords – Semantically related words like “web scraper”
Search engines use keywords to rank pages in search results. Websites optimized with target keywords tend to rank higher and get more traffic. So properly optimizing and researching keywords is crucial for SEO and driving organic search traffic.
Why Keyword Research Matters for SEO
Picking the right keywords is the foundation of SEO success:
- Targeting keywords your content can rank for helps increase organic traffic from search engines.
- Optimizing pages for specific keywords improves your chances of ranking on page 1.
- Focusing on low-competition long-tail keywords makes it easier to rank against competitors.
But how do you know which keywords to target? That's where keyword research comes in. Keyword research gives you data to identify the best keywords to focus your SEO efforts on.
Some key metrics to look at when researching keywords:
- Search volume – How often a keyword is searched monthly. High-volume keywords have more traffic potential.
- Keyword difficulty – How hard it is to rank for a keyword based on competition. Most tools score difficulty on a 0-100 scale, and roughly 20-60 is a realistic target range.
- CPC – The cost advertisers pay per click in AdWords for a keyword. Higher CPC indicates high commercial intent.
- Trends – Identifying rising and falling keyword trends over time.
- Competition – Analyzing what sites currently rank for the keyword.
While there are paid keyword research tools out there, web scraping provides a free way to get keyword data straight from Google itself. Next, let's cover the basics of how to scrape Google search results…
Scraping Google Search Results
Google search results pages (SERPs) are packed with useful SEO data. However, accessing this data requires scraping the HTML of the search result pages. Here are the key steps to scrape Google SERPs:
1. Search for a keyword
First, search for a keyword on Google. For example:
https://www.google.com/search?q=web+scraping+with+python
2. Extract HTML
Use an HTTP client library like requests to download the page HTML:
```python
import requests

# Pass the query via params so requests handles URL encoding
search_term = "web scraping with python"
response = requests.get("https://www.google.com/search", params={"q": search_term})
html = response.text
```
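Note that Google often blocks requests, or serves a consent page, when it sees the default requests user agent. A common workaround, shown here with an illustrative header string, is to send a browser-like User-Agent along with the request:

```python
# Browser-like User-Agent; the exact string here is illustrative
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
response = requests.get(
    "https://www.google.com/search",
    params={"q": search_term},
    headers=headers,
)
```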
3. Parse HTML
Use a parser like BeautifulSoup or parsel to analyze the HTML and extract data:
```python
from parsel import Selector

selector = Selector(text=html)
```
4. Extract data
Use CSS selectors or XPath to extract specific elements from the HTML:
```python
# Each organic result sits in a div with class "g" (Google's markup changes periodically)
results = selector.css("div.g")
for result in results:
    title = result.css("h3::text").get()
    link = result.css("a::attr(href)").get()
```
That's the gist of how to scrape SERPs. Now let's look at what types of data we can extract.
Scrape Keyword Rankings
One of the most useful applications of SERP scraping is determining keyword rankings. This shows you which pages rank in what positions for a target keyword – invaluable data for SEO. To extract keyword rankings, we grab individual search result blocks from the HTML, then record the:
- Title
- URL
- Domain
- Ranking position
For example:
```python
import requests
from urllib.parse import urlencode, urlparse
from parsel import Selector

def scrape_rankings(term, pages=1):
    rank = 0
    for page in range(pages):
        # Google paginates results 10 per page via the "start" parameter
        params = urlencode({"q": term, "start": page * 10})
        url = f"https://www.google.com/search?{params}"
        response = requests.get(url)
        selector = Selector(text=response.text)
        for result in selector.css("div.g"):
            rank += 1
            title = result.css("h3::text").get()
            link = result.css("a::attr(href)").get()
            domain = urlparse(link).netloc
            print(f"{rank}. {title} - {domain}")

scrape_rankings("python web scraper", pages=3)
```
This would output the title, domain, and ranking for the top 30 results like:
```
1. How To Scrape Websites with Python - scrapfly.io
2. Web Scraping with Python - Real Python
3. Build a Web Scraper from Scratch in Python - freeCodeCamp
...
```
Analyzing ranking data over time helps you track your own site's rankings as you optimize for keywords. You can also gauge how competitive a keyword is by looking at which domains occupy the top positions.
Next, let's discuss scraping suggested keywords and related searches…
Extract Keyword Suggestions
In addition to rankings, Google provides tons of keyword suggestions. These suggest new long-tail keywords and search terms that can help expand your targeting options. To extract these, we scrape elements like:
- Related searches
- People also ask
- Searches related to [term]
For example:
```python
# "brs" was the container id for Google's related-searches block; it may change over time
related_kws = []
for item in selector.xpath("//div[@id='brs']//li"):
    kw = item.xpath(".//text()[1]").get()
    related_kws.append(kw)

print(related_kws)
```
This gives you a list of additional keywords to research and add to your content. Scraping suggestions is useful for:
- Finding closely related keywords around a topic to target.
- Identifying questions people are asking to address in your content (see the “People also ask” sketch after this list).
- Inspiration for writing new pages and articles.
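The “People also ask” box can be extracted in a similar way. Here's a minimal sketch that assumes the questions live in elements with a related-question-pair class; Google changes this markup often, so verify the selector against the live HTML first:

```python
# Hypothetical selector for "People also ask" questions; Google's markup changes often
questions = selector.css("div.related-question-pair span::text").getall()
for question in questions:
    print(question)
```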
Scrape Keyword Metrics
In addition to rankings and suggestions, we can also scrape keyword metrics directly from the SERPs. Google displays keyword stats at the top of some search results.
We can extract stats like:
- Monthly search volume
- Competition level
- Average CPC
For example:
```python
# Assumes label/value text-node pairs; the exact layout of these stats varies
stats = {}
for item in selector.xpath("//div[@id='result-stats']"):
    key = item.xpath(".//text()[1]").get()
    val = item.xpath(".//text()[2]").get()
    stats[key] = val

print(stats)
```
{ "Approx. Monthly searches": "10,000", "High competition": "0.9", "Avg. CPC": "$1.20", }
These metrics help assess keyword difficulty and commercial value at scale. Next, we'll discuss how to expand the scale of your scrapes using proxies…
Scale Keyword Scraping with Proxies
When scraping Google aggressively, you risk getting your IP address blocked. To scrape SERPs at scale, we can use proxy services to rotate through different IP addresses. Proxies make requests appear to come from many different locations and users, which helps avoid tripping Google's anti-scraping defenses.
Here's an example using BrightData proxies:
```python
import requests

# Placeholder BrightData credentials and endpoint; substitute your own zone details
PROXY_URL = "http://USERNAME:PASSWORD@brd.superproxy.io:22225"

def scrape_with_proxy(url):
    headers = {"User-Agent": "Mozilla/5.0..."}
    # Route the request through the proxy for both HTTP and HTTPS
    proxies = {"http": PROXY_URL, "https": PROXY_URL}
    response = requests.get(url, headers=headers, proxies=proxies)
    # Rest of scraping script...
    return response
```
BrightData provides access to a pool of millions of residential proxies from around the world. This makes it easy to scale up keyword scraping and avoid blocks.
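If you manage your own pool of proxies instead, a simple pattern is to rotate through them round-robin, one per request. A minimal sketch with placeholder proxy URLs:

```python
from itertools import cycle

import requests

# Placeholder proxy URLs; replace with your own pool
PROXY_POOL = cycle([
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
])

def get_with_rotation(url):
    # Take the next proxy in round-robin order
    proxy = next(PROXY_POOL)
    return requests.get(url, proxies={"http": proxy, "https": proxy})
```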
Track Keyword Rankings Over Time
One useful technique is tracking your keyword rankings over an extended period. This helps you:
- Gauge the effectiveness of your optimization efforts
- Monitor competitor ranking changes
- Identify new entrants targeting your keywords
- Spot positive or negative keyword ranking trends
For example, you could run a script to scrape rankings daily, weekly, or monthly and store the results in a database.
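A simple table to hold those results might be created like this (a minimal sketch; the columns match the insert statement below):

```python
import psycopg2

conn = psycopg2.connect(host="localhost", database="keywords",
                        user="postgres", password="123")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS rankings (
        keyword  TEXT,
        date     DATE,
        domain   TEXT,
        url      TEXT,
        position INTEGER
    )
""")
conn.commit()
conn.close()
```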
```python
from datetime import date

import psycopg2

DB_CONFIG = {
    "host": "localhost",
    "database": "keywords",
    "user": "postgres",
    "password": "123",
}

conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor()

# Assumes scrape_rankings() has been adapted to return a list of
# {"title", "url", "domain", "position"} dicts instead of printing
keyword = "web scraping with python"
results = scrape_rankings(keyword)

# Insert scraped data into the database
sql = ("INSERT INTO rankings (keyword, date, domain, url, position) "
       "VALUES (%s, %s, %s, %s, %s)")
for row in results:
    cur.execute(sql, (keyword, date.today(), row["domain"], row["url"], row["position"]))

conn.commit()
cur.close()
conn.close()
```
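To pull a domain's position history back out for analysis, query the same table (example.com is a placeholder domain):

```python
conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor()
cur.execute(
    "SELECT date, position FROM rankings "
    "WHERE keyword = %s AND domain = %s ORDER BY date",
    ("web scraping with python", "example.com"),
)
for row_date, position in cur.fetchall():
    print(row_date, position)
conn.close()
```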
You now have historical ranking data to analyze and identify trends over time. Next, let's discuss performing competitor analysis using keywords…
Analyze Competitors by Keyword
Another useful application of SERP scraping is competitor analysis. This helps you answer questions like:
- What keywords do my competitors rank for?
- What new keywords are competitors targeting?
- How have competitor keyword rankings changed over time?
You can scrape SERPs to build a database of competitors, keywords, and ranking positions.
For example:
```python
# Assumes scrape_rankings() returns a list of dicts with "keyword" and "position" keys
competitors = ["domain1.com", "domain2.com"]

for competitor in competitors:
    # A site: query restricts results to pages from that domain
    rankings = scrape_rankings(f"site:{competitor}")
    for result in rankings:
        keyword = result["keyword"]
        position = result["position"]
        print(f"{competitor} ranks #{position} for {keyword}")
```
This data enables benchmarking against competitors and identifying opportunities to target keywords they rank for. You can also combine this with tracking changes over time.
Overall, scraping can provide powerful competitive intelligence to inform your SEO strategy.
Leverage Keyword Research Tools
In addition to scraping Google, there are also specialized tools for keyword research:
- Google Keyword Planner – Provides keyword volumes and forecasts based on Google's internal data. Limited to 100 queries/day without a paid AdWords account.
- SEMrush – Feature-packed paid tool with volumes, CPC data, rankings, and more. A 7-day free trial is available.
- Ahrefs – Also a paid tool, but it offers a 7-day $7 trial for access to volumes, difficulty, CPCs, and other metrics.
- AnswerThePublic – Neat free tool that visualizes keyword suggestions in a wheel graph. Helpful for brainstorming semantic keyword permutations.
- Ubersuggest – Free alternative to SEMrush to fetch volumes, CPCs, related keywords, and other data.
- Keywords Everywhere – Browser extension that shows search volume estimates right on Google.
The best approach is combining multiple tools to get a complete picture for your keywords.
Tips for Keyword Selection
When researching keywords, keep these tips in mind:
- Target a mix of short-tail and long-tail keywords. Long tails are easier to rank for.
- Consider mid-tail keywords as well, which balance volume and difficulty.
- Leverage low-competition keywords with 100-1k monthly searches; this traffic can add up (see the filtering sketch after this list).
- Focus on keywords relevant to your business to maximize conversion potential.
- Prioritize keywords your content can reasonably rank for in the top 10.
- Avoid over-optimization by sticking to about 10-20 target keywords per page.
- Use related keywords to link between content and interlink internal pages.
- Revisit keyword optimization every 3-6 months to capture trends.
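To apply cutoffs like these programmatically, here's a minimal sketch that filters candidate keywords by volume and difficulty. The data and field names are illustrative; in practice they would come from your scraped metrics:

```python
# Illustrative candidates; real data would come from your scraped metrics
candidates = [
    {"keyword": "web scraping", "volume": 50_000, "difficulty": 75},
    {"keyword": "web scraping google rankings", "volume": 400, "difficulty": 25},
]

# Keep low-competition keywords with 100-1k monthly searches and moderate difficulty
targets = [
    kw for kw in candidates
    if 100 <= kw["volume"] <= 1_000 and 20 <= kw["difficulty"] <= 60
]
print(targets)  # [{'keyword': 'web scraping google rankings', ...}]
```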
Smart keyword selection is crucial for any effective SEO strategy. Now let's wrap up with some key takeaways…
Conclusion
The web scraping and Python code examples above should give you a template to build your own keyword scraper. Just remember not to overload Google with too many requests at once, or you risk getting blacklisted. I hope this guide provides a useful introduction to leveraging web scraping for SEO keyword research!