Create a wordcloud of news headlines in python!
If you haven't read this tutorial explaining how to scrape news headlines in python, make sure you do.
In summary, here's the code for scraping news headlines in python:
import requests
from bs4 import BeautifulSoup
url='https://www.bbc.com/news'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
headlines = soup.find('body').find_all('h3')
for x in headlines:
    print(x.text.strip())
To create a wordcloud out of these news headlines, first import two more libraries alongside the ones needed to scrape our news source:
import requests
from bs4 import BeautifulSoup
from wordcloud import WordCloud #add wordcloud
import matplotlib.pyplot as plt #add pyplot from matplotlib
Next, replace
for x in headlines:
    print(x.text.strip())
with
h3text = ''
for x in headlines:
    h3text = h3text + ' ' + x.text.strip()
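The loop above builds one long string by repeated concatenation. As a rough alternative sketch, the same string can be built with `str.join`. The helper name `join_headlines` and the sample headlines are made up for illustration and are not part of the original tutorial:

```python
def join_headlines(headlines):
    """Join headline strings into one space-separated string.

    Hypothetical helper: strips each headline and skips empty ones.
    """
    cleaned = (h.strip() for h in headlines)
    return " ".join(h for h in cleaned if h)


# Made-up sample headlines standing in for the scraped <h3> text
sample = ["  Markets rally as rates hold ", "", "Storm warning issued for coast"]
h3text = join_headlines(sample)
print(h3text)
```

With the scraped data, you would pass the text of each element instead, for example `join_headlines(x.text for x in headlines)`.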
Before we make the wordcloud, you can check the news headlines by using
print(h3text)
Now generate and display the wordcloud:
wordcloud = WordCloud(width=500, height=500, margin=0).generate(h3text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()
Let me explain...
interpolation='bilinear' smooths the rendered image, which just makes the words in the wordcloud easier to read. plt.axis("off") and plt.margins(x=0, y=0) hide the axes and remove the padding, so our wordcloud isn't displayed as a graph.
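A wordcloud draws frequent words larger, so it can help to look at the word counts yourself. Here is a minimal sketch using only the standard library. The function name `top_words`, the `min_length` filter and the sample text are assumptions for illustration. The wordcloud library does its own tokenizing and stopword filtering, so its counts may differ from these:

```python
import re
from collections import Counter


def top_words(text, n=5, min_length=4):
    """Return the n most common words in text as (word, count) pairs.

    Hypothetical helper: lowercases the text and ignores words shorter
    than min_length as a crude stand-in for stopword removal.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(n)


# Made-up text standing in for the joined headlines
sample_text = (
    "Election results delayed Election officials respond "
    "Markets react to election"
)
print(top_words(sample_text, n=3))
```

With the real data, you would call `top_words(h3text)` after building the headline string.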
If you're a beginner who likes discovering new things about python, try my weekly python newsletter

Byeeeee 👋