I Got Attacked in the YouTube Comments Section (So I Tagged and Analyzed the Data)
Stefani Bachetti, Director of The Studio, starred in an episode of Ask This Old House. The YouTube video posted afterwards amassed a half million views and over a thousand (mostly negative) comments.
So she did what researchers do best: analyzed the data to better understand the sources, nature, and spread of negativity.
I knew better than to read the comments—but once you start scrolling, it’s hard to look away.
When you think “the comments section,” you don’t exactly think “civility.” And as the responses to my video climbed past a thousand, I grew amazed by the creativity (and severity) of the insults, snide remarks, and cut-downs.
But let’s back up for a second.
A couple years ago I was on an episode of the PBS program Ask This Old House (AskTOH), which was a staple in my home as a kid. In the episode, they helped build my garage into a woodshop. It was an unexpected opportunity and a really great experience for me.
After the taping, friends and family eagerly awaited the broadcast. Everyone was excited. Several months later, I learned the episode had aired. PBS uploaded the clip to two places: their website and their YouTube channel. I came across the YouTube post first. I watched, and then I scrolled down.
It was what everyone would expect: a hodge-podge of quick congratulations and compliments…and then a lot of ugly.
Lucky for me, I don’t offend easily. In fact, I found the variety of ways people were able to be terrible…oddly interesting? The thoughts that came to their minds, and the things they felt they could say, hit on areas I would not have predicted. I began checking the video weekly as the comment count continued to creep up.
In my cursory glances at the content, I noticed I seemed to be a big focus for everyone.
A really big focus.
I began to wonder: Was I just biased? Was I just picking up on the comments about me more than anything else? Or was I, as a blonde female guest on the show, really more “in the spotlight” than the subject of the clip?
Because I lead a team of design researchers at dscout, I decided to do what I’m trained to do: synthesize disparate and free-form data, like comments. I got on GitHub and found a web client that would scrape all the comments (thanks philbot9) and got started. I chose four parameters to analyze: category, sentiment, number of likes, and number of replies. I tagged each comment appropriately and then took a look at the frequencies to identify trends across each. (To get a sense of how I approached the analysis, head to the end of this write-up).
What I found
The main takeaway shouldn’t surprise you: most commenters were there to cut people down. Negative comments got the most attention, in the form of both replies and likes from others. Piling onto someone else’s negativity was even more appealing than initiating a negative comment to begin with.
So what were people so negative about? To get a sense of where the negativity was distributed, and whether it was concentrated on something specific, I looked at the sentiment of (the feeling behind) each comment by category.
A breakdown of negativity by theme
Turns out, the negativity was pretty universal. The exclamation category was made up of general statements expressing desire or opinions not directed at anything specific to the video. It was the only category with more positive comments than negative.
Comments on location were overwhelmingly negative. Chicago was billed as some kind of wretched, crime-infested city where gang members and meth-heads rule the streets. Living by the train apparently meant I never slept at night, and no one believed a car would fit in my garage.
The show itself and the episode received intense scrutiny as well, though this was tempered by those tried-and-true fans of PBS and Ask This Old House. Die-hard fans commended the execution of the shop and what AskTOH does, but a much larger group was dissatisfied with what they saw as blatant product flaunting due to sponsorship. Their assumptions about the cost frustrated them, because it felt like something they couldn’t afford themselves. Some people took issue with the production, going so far as to read the intro as reminiscent of pornography, blaming poor scripting and too many takes.
Meanwhile, the views on the actors highlighted the biggest polarization between commenters. The split between positive and critical comments was the closest here. People loved Tom (the show’s host) and they loved me, but they also kind of disliked Tom, and really, really hated me.
Maybe I know something about tools, maybe I don’t. Either way, neither the show, nor my outfit did me any favors. And an unsettling amount of people claimed, “Tom Silva definitely plowed her in between takes.”
Finally, there was a lot of turmoil about the execution of building out the shop. Here discontent really shone: viewers took pride in catching tiny technique errors, and spoke to larger issues they fundamentally disagreed with. They went so far as to scour the video and include timestamps to support their catch. Viewers felt the shop was poorly thought-out, that the fold-down bench would collapse on my feet, that the miter saw was way too big for a shop this size, and that you just can’t drill into brick or mortar.
But back to my original question: what topic was commented on more?
The shop and actors categories each accounted for roughly one-third of the total comments. But as a bottom-up analysis junkie, I tagged in tiers.
In this case, that meant every comment tagged shop also received a more specific tag within that category: fifteen granular tags made up the shop category, four the actors category, and so on. Looking at these two categories on a more granular level gave a slightly different understanding of what was talked about most.
About one-third of the shop comments were about drilling into brick, 70 total. Sometimes people disagreed with what a Tapcon screw is capable of; sometimes they didn’t know what Tapcons were at all. Either way, not using an anchor to mount into masonry was blasphemy to the internet.
As far as the actor comments, three-quarters of those were about me, a total of 143. The number of comments about me was twice as high as the ones about mounting something into brick — strange, for a PBS television show on home improvement. It turned out that I did, in fact, come out “on top” as the most talked-about topic in the video. Here’s a sample of the discussion:
My hair was wrong.
My shoes were wrong.
And while some people were on my side…
…(which got creepy at some points)…
…the YouTube community generally agreed that women just shouldn’t be in the shop.
Unless, of course, they think it’s a turn-on.
So what do we take away from this?
That people on the internet are terrible, and being a woman on the internet makes things twice as bad? That we should never read the comments, because they’re rife with putdowns, assumptions, and objectification?
But also maybe: that there’s value in approaching online ugliness with curiosity. That this kind of behavior can at least be harnessed for learning.
What I found was that no matter how horribly people are behaving, it’s always interesting to gain a deeper understanding of what’s going on, to dig deeper and understand the trends. As with any good research, you’ll leave with more questions than answers. Some might never be answered. But there might be some context gained, and some new insight worth exploring.
Analysis + methodology
I started with a data set just shy of 1,000 total comments and replies. Reading through the content, though, it quickly became clear that the replies were not where the time needed to be spent. I scrapped all the devolving back-and-forths and focused on the 605 main comments, which were usually (but not always) coherent.
I chose four key parameters to assess: category, sentiment, likes, and replies.
Category was developed bottom-up and comprises seven buckets: the shop, actors, location, the show/episode, general exclamations, meta commentary, and other.
Sentiment covers positive, critical, neutral, and mixed (both positive and critical) groupings.
In assessing likes and replies, I looked at whether or not a comment had one or the other and, if so, how many. The content of those replies, as mentioned, was disregarded.
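To make the filtering step concrete, here is a minimal sketch of reducing a scraped export to top-level comments with their like and reply counts. The field names (`text`, `likes`, `replies`) are illustrative assumptions; the actual export from the GitHub scraper may use different keys.

```python
import json

# Hypothetical sample of the scraper's export format (field names are
# illustrative, not necessarily what the real export uses).
raw = json.loads("""
[
  {"text": "Nice shop!", "likes": 3, "replies": [{"text": "agreed"}]},
  {"text": "You can't drill into brick.", "likes": 12, "replies": []},
  {"text": "first", "likes": 0, "replies": [{"text": "..."}, {"text": "..."}]}
]
""")

# Keep only the top-level comments; record whether each has likes or
# replies and how many, discarding the reply content itself.
comments = [
    {
        "text": c["text"],
        "likes": c.get("likes", 0),
        "has_likes": c.get("likes", 0) > 0,
        "n_replies": len(c.get("replies", [])),
    }
    for c in raw
]
```

From here, each record is ready to receive hand-applied category and sentiment tags.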
Category and sentiment were hand-tagged, so each was a judgment call based on the context of the comment. The category tags are pretty descriptive; not a lot of interpretation is needed there.
The granular tags that fell into each category are as follows:
SHOP: general, painting, wrenches, brick drilling, drill/driver, drillbit, bench, bench:hinges, miter saw, pegboard, pocket holes, router, sander, ventilation, tablesaw
ACTORS: general, tom, stefani, norm
LOCATION: condo, chicago, train, garage space, other
SHOW: general, toh, scripting/staging, cost, sponsorship/brands, other
EXCLAMATION: general, desire, how you should do it, my situation, self promo
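Written out as a plain mapping, the scheme above doubles as a validity check during hand-tagging. This is just a sketch mirroring the lists in the write-up, not the author's actual tooling; the helper function is hypothetical.

```python
# Category -> granular tags, transcribed from the scheme above.
TAXONOMY = {
    "shop": ["general", "painting", "wrenches", "brick drilling",
             "drill/driver", "drillbit", "bench", "bench:hinges",
             "miter saw", "pegboard", "pocket holes", "router",
             "sander", "ventilation", "tablesaw"],
    "actors": ["general", "tom", "stefani", "norm"],
    "location": ["condo", "chicago", "train", "garage space", "other"],
    "show": ["general", "toh", "scripting/staging", "cost",
             "sponsorship/brands", "other"],
    "exclamation": ["general", "desire", "how you should do it",
                    "my situation", "self promo"],
}

def valid_tag(category, granular):
    """True when a granular tag belongs to its parent category."""
    return granular in TAXONOMY.get(category, [])
```

A check like this catches tagging slips early, such as a `shop` tag accidentally applied under `actors`.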
Sentiment got a little hairier. Rather than force-fit ambiguous comments into one of those buckets, I gave them an “unclear” tag; I needed to reach a certain level of confidence before assigning a sentiment at all.
Whether or not a comment had a like or reply, and the number of each, was included in the export that was produced by the web client I found on GitHub.
When I was done, I ran basic frequencies, set up a few pivot tables, and got a friend to spit out some crosstabs in Tableau to start taking a look at how the various parameters compared.
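The basic-frequency and crosstab step can be sketched with nothing more than the standard library. The tagged records below are made up for illustration; the real data set had 605 hand-tagged main comments.

```python
from collections import Counter

# Hypothetical (category, sentiment) pairs standing in for the real
# hand-tagged data set.
tagged = [
    ("actors", "critical"), ("actors", "critical"), ("actors", "positive"),
    ("shop", "critical"), ("shop", "neutral"),
    ("exclamation", "positive"), ("exclamation", "positive"),
]

# Basic frequencies per parameter...
by_category = Counter(cat for cat, _ in tagged)

# ...and a category x sentiment crosstab, the same shape of comparison
# a pivot table or Tableau crosstab produces.
crosstab = Counter(tagged)

print(by_category["actors"])             # 3
print(crosstab[("actors", "critical")])  # 2
```

The same two lines scale to any pair of tagged parameters, e.g. sentiment by reply count.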