Analyzing the r/wallstreetbets hivemind — August 2021

The activity in the Reddit r/wallstreetbets community is staggering. Each day, there are around 800 posts and 50,000 comments debating approximately 280 different stocks. But by just browsing Reddit, between the memes and degenerate gamblers, it can be hard to understand the full nature of the discussion.

In this post, I’ve turned to a bit of SQL and Python to explore what’s happening in the wallstreetbets hivemind. I’ve analyzed stock popularity, sophisticated but overlooked discussions, and community influencers.

If you’re interested, here’s the raw Reddit data, my data pipeline, the derived data, and my Jupyter notebook. I’m using Beneath, an open data platform I’m building, to stream and save the data.

Btw, this isn’t investment advice… DYOR.

The meme stock rankings

Let’s start with the basics. What are the most discussed stocks and how have they changed over time?

The stocks on wallstreetbets can be broadly bucketed into two categories: long-standing wallstreetbets interests and stocks related to current events. We can see these two categories by inspecting a line graph of mentions over time (select stocks):

The long-standing wallstreetbets interests jump out: these are the lines that occupy a significant percentage of mentions over the whole time period. These include staples like GameStop (GME) and AMC, but the community has also long been tracking Clover Health (CLOV) and AMD, the semiconductor manufacturer. It doesn’t look like wallstreetbets will lose interest in these anytime soon.

On the other hand, we see stocks that spike suddenly due to specific events, such as Robinhood (HOOD) and Microvast (MVST), a lithium-ion battery manufacturer. Both of these stocks went public at the end of July and received bursts of attention from the community, but the interest hasn’t lasted. As of September 1st, both stocks now have a near-zero percent share of daily mentions.
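For anyone who wants to reproduce the "share of daily mentions" metric, here is a minimal sketch in plain Python. It is not my actual pipeline: the `mentions` records and the `daily_mention_share` helper are made up for illustration, and it assumes you have already extracted one (date, ticker) pair per mention.

```python
from collections import Counter, defaultdict

# Hypothetical (date, ticker) mention records; not the real dataset.
mentions = [
    ("2021-08-02", "GME"),
    ("2021-08-02", "GME"),
    ("2021-08-02", "HOOD"),
    ("2021-08-02", "AMC"),
    ("2021-08-03", "GME"),
    ("2021-08-03", "AMC"),
]


def daily_mention_share(records):
    """Return {date: {ticker: fraction of that day's mentions}}."""
    counts_by_day = defaultdict(Counter)
    for day, ticker in records:
        counts_by_day[day][ticker] += 1

    shares = {}
    for day, counts in counts_by_day.items():
        total = sum(counts.values())
        shares[day] = {ticker: n / total for ticker, n in counts.items()}
    return shares


shares = daily_mention_share(mentions)
for day in sorted(shares):
    print(day, shares[day])
```

Normalizing by each day's total keeps busy days and quiet days comparable, which is what lets an event-driven spike like HOOD stand out and then visibly fade.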

In this next chart, we zoom in on the most discussed stocks in the month of August.

GME and AMC have long been community favorites, and even in August, they remain the most mentioned stocks. I’ve been collecting data from wallstreetbets since March, and the two companies have been the most discussed for 4 of these 6 months. But it’s also clear that it’s not a power law distribution, and contending stocks get significant discussion, too.

Discussions of the next big thing

The NASDAQ includes over 4000 public equities, and the NYSE over 3000, so how does the community come to rally around certain stocks? One of my hypotheses is that some initial post triggers a deep and unique discussion that ultimately leads to community-wide attention. So, let’s try to find some interesting conversations.

I’ve tried to uncover some under-appreciated discussions by filtering for posts with at least 15 comments and 25 upvotes, and sorting those posts by highest average words per comment. Here are the top 10 for the month of August:

If you’d like to read the discussions, the interactive chart includes a link to each post’s page on reddit.com. Note that the numbers might not reflect what you see on reddit.com because comments can be edited and deleted after the fact, and scores are continually changing.
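The filter-and-sort behind that ranking can be sketched roughly as follows. The thresholds (15 comments, 25 upvotes, top 10) come from the text above; the post dictionaries, their field names, and the `rank_deep_discussions` helper are assumptions for illustration, and counting words by splitting on whitespace is just one simple choice.

```python
from statistics import mean

# Hypothetical posts; the field names are assumptions, not the real schema.
posts = [
    {"title": "Deep dive A", "score": 40, "comments": ["long comment " * 30] * 15},
    {"title": "Meme thread B", "score": 500, "comments": ["to the moon"] * 20},
    {"title": "Low score C", "score": 10, "comments": ["word " * 100] * 30},
    {"title": "Few comments D", "score": 90, "comments": ["word " * 200] * 5},
]


def avg_words_per_comment(post):
    """Average number of whitespace-separated words across a post's comments."""
    return mean(len(comment.split()) for comment in post["comments"])


def rank_deep_discussions(posts, min_comments=15, min_score=25, top_n=10):
    """Keep posts that clear both thresholds, then sort by average comment length."""
    eligible = [
        p for p in posts
        if len(p["comments"]) >= min_comments and p["score"] >= min_score
    ]
    return sorted(eligible, key=avg_words_per_comment, reverse=True)[:top_n]


ranked = rank_deep_discussions(posts)
for post in ranked:
    print(post["title"], avg_words_per_comment(post))
```

The two thresholds do the real work here: without them, a single essay-length comment on an ignored post would top the list.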

The posts above reveal fairly educated discussions about storylines that, for the most part, haven’t yet hit the wallstreetbets front page. Stocks like Lordstown Motors (RIDE), Ford (F), and Proterra (PTRA) haven’t yet garnered much attention, but, in light of these deep discussions, they could be worth keeping an eye on.

Another hypothesis I wanted to test is that the share of rocket emojis in a discussion could signal a stock’s momentum within the community. Here’s a ranking of the posts from August that had the highest percentage of commenters include a rocket emoji (filtered for posts with at least 25 comments):

Unsurprisingly, these posts reveal a number of meme stocks that have already made it to the front page, like CLOV and WISH. But there are also companies like Hut 8 Mining (HUT) and Bitfarms (BITF), both bitcoin miners, that haven’t made it to the front page (yet?). They’re definitely worth watching.
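Here is one way the rocket-emoji metric could be computed, as a rough sketch. I'm assuming "percentage of commenters" means distinct authors who used a rocket at least once, so a single user spamming rockets only counts once. The sample data and helper names are hypothetical, and the tiny example lowers the 25-comment threshold so it has something to rank.

```python
ROCKET = "\U0001F680"  # the rocket emoji

# Hypothetical posts mapped to (author, comment body) pairs.
posts = {
    "CLOV": [
        ("a", f"CLOV {ROCKET}{ROCKET}"),
        ("b", "holding"),
        ("a", f"still {ROCKET}"),
        ("c", ROCKET),
    ],
    "XYZ": [("d", "no"), ("e", "hmm"), ("f", ROCKET), ("g", "ok")],
    "tiny": [("h", ROCKET)],
}


def rocket_commenter_share(comments):
    """Fraction of distinct commenters who used a rocket emoji at least once."""
    authors = {author for author, _ in comments}
    rocket_authors = {author for author, body in comments if ROCKET in body}
    return len(rocket_authors) / len(authors) if authors else 0.0


def rank_by_rockets(posts, min_comments=25):
    """Rank posts with enough comments by their rocket-commenter share."""
    eligible = {
        title: comments
        for title, comments in posts.items()
        if len(comments) >= min_comments
    }
    scored = [(title, rocket_commenter_share(c)) for title, c in eligible.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Threshold lowered to 3 only because the sample data is tiny.
ranked = rank_by_rockets(posts, min_comments=3)
for title, share in ranked:
    print(title, round(share, 2))
```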

Behind every post and comment is a member of the wallstreetbets community. Let’s find out which authors are leading the discussion.

The influencers of wallstreetbets

To identify influencers, I wanted to find the active authors who get the most upvotes on substantial, forward-looking posts. To that end, I’ve applied a couple of criteria. First, I’ve excluded posts labeled as a “Meme,” “Gain,” or “Loss,” which are mostly retrospective. Second, I’ve filtered for authors who have posted at least once since July 1st. One of the most popular Redditors of all time was u/DeepF***ingValue, but his last post was on April 15th, and I want this analysis to be current.
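As a rough sketch, those two criteria translate into something like the code below. The sample posts, field names, and author names are invented, and two details are assumptions on my part: that the "posted since July 1st" check applies to an author's non-excluded posts, and that authors are ranked by average upvotes per post.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

EXCLUDED_FLAIRS = {"Meme", "Gain", "Loss"}  # mostly retrospective posts
ACTIVE_SINCE = date(2021, 7, 1)

# Hypothetical posts; authors and scores are made up.
posts = [
    {"author": "analyst_a", "flair": "DD", "score": 12000, "created": date(2021, 8, 10)},
    {"author": "analyst_a", "flair": "News", "score": 8000, "created": date(2021, 5, 2)},
    {"author": "inactive_b", "flair": "DD", "score": 50000, "created": date(2021, 4, 15)},
    {"author": "memer_c", "flair": "Meme", "score": 90000, "created": date(2021, 8, 20)},
    {"author": "hype_d", "flair": "YOLO", "score": 6000, "created": date(2021, 7, 15)},
]


def rank_influencers(posts):
    """Return (author, average score, post count) tuples, best average first."""
    by_author = defaultdict(list)
    for post in posts:
        if post["flair"] not in EXCLUDED_FLAIRS:
            by_author[post["author"]].append(post)

    ranking = []
    for author, items in by_author.items():
        # Keep only authors whose most recent qualifying post is recent enough.
        if max(p["created"] for p in items) >= ACTIVE_SINCE:
            avg_score = mean(p["score"] for p in items)
            ranking.append((author, avg_score, len(items)))
    return sorted(ranking, key=lambda row: row[1], reverse=True)


ranking = rank_influencers(posts)
for author, avg_score, n_posts in ranking:
    print(author, avg_score, n_posts)
```

In this toy data, `inactive_b` drops out despite a huge score because of the recency filter, which is the same reason u/DeepF***ingValue doesn't appear in my results.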

Here are the top authors since I started collecting data in March:

The influencers that I found can be split into two categories: the analysts and the hype men.

The analysts, like the two top authors u/quantkim and u/nobjos, contribute breaking news, technical analysis, and quantitative reports. For example, u/quantkim shares articles about GameStop’s corporate turnaround, like this one, and has averaged 11,332 upvotes over 15 posts.

Conversely, the hype men typically talk up their big positions in popular stocks. Here’s one from u/dumbledoreRothIRA about a $600k position in $CLOV, and one from u/lookshee laying out his intention to buy the entirety of the GameStop company.

All the authors above clearly influence the community, so, to jump ahead of the crowd, it’d be smart to set up notifications for whenever they post.

What’s next

My analyses in this post really just scratch the surface of what you can infer from wallstreetbets data, and there’s much more to do. To extend this work, I’m currently considering factoring in price movements, doing sentiment analysis, and creating a bot that mines for insights in real time.

Last week, a Wall Street Journal article reported that forward-thinking hedge funds are diving into the r/wallstreetbets data. By making this data public and queryable on Beneath, I hope I’ve made it more accessible to the everyday person!

If you’re interested in any of this, come hang out in the Beneath Discord community, follow me on Twitter @ericpgreen2, or jump right into the data yourself 🚀🚀🚀
