Scrape Bing Related Questions using Python
Contents: intro, imports, what will be scraped, process, code, links, outro.
This blog post continues the Bing web scraping series. It shows how to scrape Related Questions from Bing search results using Python.
from bs4 import BeautifulSoup
import requests
import lxml
from serpapi import GoogleSearch
import os # for creating environment variable
The CSS selectors used below were picked with the SelectorGadget Chrome extension.
from bs4 import BeautifulSoup
import requests, lxml

# Send a browser-like User-Agent so the request looks like a regular browser visit
headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

html = requests.get('https://www.bing.com/search?q=lion king&hl=en', headers=headers)
soup = BeautifulSoup(html.content, 'lxml')

# Each matched element is one related question block
for related_question in soup.select('#relatedQnAListDisplay .df_topAlAs'):
    question = related_question.select_one('.b_1linetrunc').text
    snippet = related_question.select_one('.rwrl_padref').text
    title = related_question.select_one('#relatedQnAListDisplay .b_algo p').text
    link = related_question.select_one('#relatedQnAListDisplay .b_algo a')['href']
    displayed_link = related_question.select_one('#relatedQnAListDisplay cite').text
    print(f'{question}\n{snippet}\n{title}\n{link}\n{displayed_link}\n')

# part of the output:
'''
What kind of game is The Lion King?
Jump on top of giraffe’s head and eat bugs in this awesome classic platformer game. The Lion King is a classic 1994 platformer video game based on the multi-award winning animated film of the same name. The game takes place after the death of Simba’s father where Simba was told a lie and forced to hide.
The Lion King - Play Game Online - ArcadeSpot.com
https://arcadespot.com/game/the-lion-king/
arcadespot.com/game/the-lion-king/
'''
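One thing to watch out for: `select_one()` returns `None` when a selector matches nothing, so calling `.text` on the result raises `AttributeError` if Bing omits a field or changes its markup. Below is a minimal sketch of a more defensive version of the loop above. The helper names `text_or_none` and `parse_related_questions` and the `sample_html` markup are made up for illustration (real Bing HTML is more complex). The selectors are the ones from the code above with the repeated `#relatedQnAListDisplay` prefix dropped, and the built-in `html.parser` is used so the sketch runs without `lxml`.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, selector):
    """Return the stripped text of the first match under parent, or None."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_related_questions(html):
    """Collect related questions from an HTML string into a list of dicts."""
    soup = BeautifulSoup(html, 'html.parser')  # 'lxml' works here as well
    questions = []
    for block in soup.select('#relatedQnAListDisplay .df_topAlAs'):
        link_node = block.select_one('.b_algo a')
        questions.append({
            'question': text_or_none(block, '.b_1linetrunc'),
            'snippet': text_or_none(block, '.rwrl_padref'),
            'title': text_or_none(block, '.b_algo p'),
            'link': link_node.get('href') if link_node else None,
            'displayed_link': text_or_none(block, 'cite'),
        })
    return questions


# Hypothetical markup for illustration only; it has no <cite> element on purpose.
sample_html = '''
<div id="relatedQnAListDisplay">
  <div class="df_topAlAs">
    <div class="b_1linetrunc">What kind of game is The Lion King?</div>
    <div class="rwrl_padref">A classic 1994 platformer.</div>
    <div class="b_algo">
      <a href="https://arcadespot.com/game/the-lion-king/">open</a>
      <p>The Lion King - Play Game Online</p>
    </div>
  </div>
</div>
'''

for item in parse_related_questions(sample_html):
    print(item)
```

Missing fields come back as `None` instead of stopping the whole loop, so one incomplete block does not cost you the rest of the results.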
SerpApi is a paid API with a free trial of 5,000 searches.
from serpapi import GoogleSearch

# Search parameters: API key, search engine and query
params = {
    "api_key": "YOUR_API_KEY",
    "engine": "bing",
    "q": "lion king"
}

search = GoogleSearch(params)
results = search.get_dict()

# Each item in 'related_questions' is a dict with the fields below
for result in results['related_questions']:
    question = result['question']
    snippet = result['snippet']
    title = result['title']
    link = result['link']
    displayed_link = result['displayed_link']
    print(f'{question}\n{title}\n{link}\n{displayed_link}\n{snippet}\n')

# part of the output:
'''
Is the Lion King a circle of life?
Disney THE LION KING | Award-Winning Best Musical
https://www.lionking.com/
www.lionking.com/
Circle of Life in 360 - Experience THE LION KING like never before - WATCH IT NOW Quite Simply, Stunning. -TimeOut New York A Deeply Felt Celebration of Life.
'''
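The loop above assumes that `related_questions` and every field inside it are always present. If a query returns no related questions, `results['related_questions']` raises `KeyError`. Here is a small sketch of a more forgiving extraction that works on a plain dictionary, so it runs without an API call. The function name `extract_related_questions`, the environment variable name `API_KEY` and the `sample_results` data are assumptions made for illustration; the sample is hand-written to mirror the output shown above, not a real API response.

```python
import os


def extract_related_questions(results):
    """Return a list of related-question dicts; empty if the key is absent."""
    extracted = []
    for result in results.get('related_questions', []):
        extracted.append({
            'question': result.get('question'),
            'snippet': result.get('snippet'),
            'title': result.get('title'),
            'link': result.get('link'),
            'displayed_link': result.get('displayed_link'),
        })
    return extracted


# API_KEY is an assumed variable name; falls back to the placeholder if unset.
api_key = os.getenv('API_KEY', 'YOUR_API_KEY')

# Hand-written sample shaped like the output above (no snippet on purpose).
sample_results = {
    'related_questions': [
        {
            'question': 'Is the Lion King a circle of life?',
            'title': 'Disney THE LION KING | Award-Winning Best Musical',
            'link': 'https://www.lionking.com/',
            'displayed_link': 'www.lionking.com/',
        }
    ]
}

for item in extract_related_questions(sample_results):
    print(item)

print(extract_related_questions({}))  # prints []
```

With real data you would pass `search.get_dict()` to the function and put `api_key` into `params` instead of hardcoding the key in the script.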
If you have any questions, if something isn't working correctly, or if you'd like to see something else covered, feel free to drop a comment in the comment section or reach out via Twitter at @serp_api.
Yours,
Dimitry, and the rest of the SerpApi Team.