Blaming Git Blame

In this article I make a case for renaming the feature git blame to a neutral equivalent (e.g. git author or git inspect). I argue that the use of git blame primes workplaces and teams for blame culture and contributes to unhelpful feelings of shame. The language we use shapes our cognition, feelings and behaviours in small but systematic ways, and it is for this reason that I argue renaming the otherwise useful feature git blame is long overdue. I also investigate the source code of git in an attempt to understand how the feature got introduced, and connect with its author to find that he, too, wishes the feature was named differently.

What is git?

Git is an open source version control tool with widespread adoption that allows individuals to track changes to files over time and revert to previous versions at any time. It is extremely useful for collaboration and has become the de facto standard for software engineering teams. While there are alternative version control systems, many companies consider a basic command of git a prerequisite for employment.

What is git blame? 😳

Git blame is a git command which reveals what revision and author last modified each line of a file.
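To make the idea concrete, here is a minimal Python sketch of line-by-line attribution over a simple, linear history. It is a toy under stated assumptions, not how git implements the feature: the function name toy_blame, the history format and the authors "alice" and "bob" are all made up for illustration, and real git walks a commit graph rather than a flat list of revisions.

```python
from difflib import SequenceMatcher


def toy_blame(history):
    """Credit each line of the newest revision to the author who last changed it.

    `history` is a list of (author, lines) pairs, oldest first. This is a
    hypothetical, simplified model: real git walks a commit graph instead.
    """
    credited = []   # author credited for each line of the previous revision
    previous = []   # lines of the previous revision
    for author, lines in history:
        # Assume every line is new, then hand unchanged lines back to
        # whoever was already credited for them.
        now_credited = [author] * len(lines)
        matcher = SequenceMatcher(a=previous, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                now_credited[j1:j2] = credited[i1:i2]
        credited, previous = now_credited, lines
    return list(zip(credited, previous))


# Made-up two-revision history of a small file.
history = [
    ("alice", ["import os", "print('hi')", "exit()"]),
    ("bob", ["import os", "print('hello')", "exit()", "# done"]),
]
for author, line in toy_blame(history):
    print(f"{author:<6} {line}")
```

In this made-up history, the lines bob touched are credited to bob and the untouched ones stay with alice, which is roughly the information the real command puts next to each line.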

When I was first shown the feature, I chuckled nervously at the discovery. Being new to tech, I generously allow for the possibility that my code might not yet be, objectively, what one would call "the shit". Naturally, I immediately imagined my colleagues repeatedly invoking my name using git blame, subconsciously adding the tiniest bit of additional frustration to an ever-growing stinking pile of frustrations upon each discovery of me as the culprit. In 21st century language, I'd summarise my feelings about learning of git blame as follows: 👀.

Whilst my instinctual reaction was that of pre-emptive shame creeping in, it was quickly countered by some good old rational thinking. I feel safe in my workplace, knowing that my colleagues aren't on the hunt to assign blame for mistakes, and that my organisation values collaboration over rockstar developing.

I was told that the main utility of the feature is not enabling team members to blame each other for their poor code contributions, but enabling individuals to find out who would be the best person to ask about a specific part of a codebase. The ability to quickly identify the author of a given part of the codebase for follow-up questions is handy, and saves time.

If you're asking yourself, why would the feature be named blame instead of author, when it primarily gets used to identify the code's author, then you and I are asking ourselves the same question.

Git blame is harmful 🤕

Let's assume we agree that git blame shouldn't get used with the intent of assigning blame within teams. A big assumption to make, that I bet some teams with archaic or authoritarian work philosophies would disagree with. But philosophical differences regarding effective collaboration are beyond the scope of this article.

Okay, so we're assuming we all agree - let's not blame each other and instead primarily use git blame to find the code's author in a blame-free context. Great. But even so, 2 issues remain:

Why should the onus be on us developers to do mental gymnastics and internally rebrand git blame as git author, when in fact, the feature itself could just be named git author in the first place?
Even though we can agree to see beyond its literal meaning, the naming of the feature still continues to affect us in systematic ways.

This is because language subconsciously shapes the way we think, feel and behave. Repeated exposure to the word blame in a work context is harmful. It feels implausible that in today's climate, where we rightfully pay attention to our language more than ever, a feature like git blame should be allowed to peacefully continue existing and priming workplaces for blame culture.

Language measurably shapes our cognition 🧠

The language we use demonstrably influences how and what we think. For example, existing cross-cultural differences in how we perceive time, distance and size can be traced back to how different languages construct these abstract ideas. Also, speakers of languages which use grammatical genders such as Russian or German are heavily influenced by the arbitrary grammatical gender of nouns. These speakers' visual representations of such nouns and their instinctive associations depend heavily on the noun's otherwise random gender. Even visual perception is not immune to the effect of language. For example, Russian speakers are faster than English speakers at telling light and dark shades of blue apart, because Russian has separate basic words for the two where English just has "blue". You can read more about all of this here.

Language has a priming effect 🗣️

Priming is a phenomenon whereby exposure to one stimulus influences a response to a subsequent stimulus, without conscious guidance or intention. For example, exposing someone to the word "yellow" will evoke a faster response to the word "banana" than it will to an unrelated word like "sofa". Priming takes place outside of our conscious awareness, but it plays an important role in our daily lives. From influencing how we interpret information to our behaviour, priming can play a part in our perceptions, emotions, and actions. Marketers, advertisers and even social media engineers have sought to exploit this effect.

It's reasonable to assume that seeing the word blame repeatedly primes us for negative affective responses. If you are committing new code, thinking of the word "blame" is likely to introduce just the slightest bit of fear, shame or guilt. On the flip side, if you are the one investigating changes and see your colleague's name next to the word "blame", you will be more likely to actually assign blame to them and associate their name with a negative affective state.

Now let me pre-empt some questions a reader might have at this stage.

If it bothers you so much, why don't you just alias it? ❄️

I'm aware that git has a neat alias feature that allows you to define your own name for any command. Technically, nothing's stopping me from aliasing git blame to git candyfloss. But the feature is ubiquitous and difficult to avoid, so creating an alias on my local machine would barely reduce my exposure to it.

image of VSCode blame extension

There are useful git extensions that display git blame, and even GitHub itself has a UI version of git blame that shows you the author and the name of the commit associated with each part of the codebase. According to the 2020 Stack Overflow Annual Developer Survey, more than 82% of us use GitHub, so the word blame really is near impossible to escape.

image of GitHub's blame interface (yes, this is what GitHub looks like in light mode)

If it bothers you so much, why don't you just use git annotate? ❄️

git annotate is another git command with nearly the same functionality as git blame. Switching to it doesn't solve the issue, though, for the exact same reasons described above: the word blame would still surround me everywhere outside my own terminal.

Having justified why git blame is worth discussing, let's now delve into its history.

What's to blame for git blame? 🔎

Git was created in 2005 by none other than Linus Torvalds, the creator of Linux, who was at the time working on his operating system with a dispersed team of volunteer developers. Working remotely before it was cool, he was using the proprietary version control software BitKeeper, which was made available to his team for free. That lasted until a controversy in which an engineer was accused of reverse-engineering the software, after which BitKeeper's founder Larry McVoy withdrew the agreement.

Torvalds soon announced he would go on a week-long holiday, and in true vacation spirit, created the first version of the version control tool that 82% of today's nearly 27 million developers use, probably all whilst reclining on a beach chair and sipping on a cold one.

actual footage of Torvalds implementing git (probably)

Git's success was a surprise to him. It's safe to say that naming projects sensitively, especially those he never expected to take off in such spectacular fashion, was neither Torvalds' priority nor his inclination.

“The in-joke was that I name all my projects after myself, and this one was named ‘Git’. Git is British slang for ‘stupid person’,” Torvalds tells us.

That's right. Git is actually a British swear word. If you've previously noticed this and chuckled at the fact that git shares its name with the swear word popularised by The Beatles and Ronald Weasley, the joke's on you - git's creator knew this full well. The README of Git elaborates even further:

"Git", the "stupid content tracker", can mean anything, depending on your mood

Random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
Stupid. Contemptible and despicable. Simple. Take your pick from the dictionary of slang.
"Global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
"Goddamn idiotic truckload of sh*t": when it breaks.

Open source

Git's maintenance continued to be led by one of the most active early contributors, Junio Hamano, who is said to remain somewhat of a benevolent dictator to the git community up to this day. It's important to note that git is by no means one man's toy. It is the result of a decade and a half of collaboration and hard work by an active and caring community.

Their Code of Conduct speaks of a community that is respectful and kind. But the extent to which this is true is unclear to me because I am neither a contributor myself nor am I impervious to stories about the patriarch himself, Linus, occasionally telling contributors they should probably be retroactively aborted. A man who enjoys jokingly calling out stupid people who write stupid patches for being so stupid that it's a surprise they've managed to find a tit to suck on as stupid babies is not a man I associate with creating welcoming communities.

Torvalds letting us all know what he really thinks of our code contributions (probably)

Whilst I must admit that I personally enjoy his particular brand of dark humour, I'd advocate for being more mindful of the power dynamics we are all part of, because whether we like it or not, our words can have unintended harmful consequences.

A theory started forming in my head: perhaps git blame was the result of the same strain of dark humour? Maybe a seemingly innocent but really quite insensitive joke that feels so on brand for the Linux creator's past? I say past here, because the man is apparently evolved now.

I'd like to think that, had its author known about the many millions of developers who'd come into contact with the concept of blame on a daily basis as a result of a mildly amusing joke, the naming of git blame would have been reconsidered. I'd say, it's about time we retroactively abort it.

Getting to the bottom of git blame 🕵🏼‍♀️

Spoiler alert: as you might have already guessed, astute reader, my naive assumption that git blame started as a joke is, in fact, completely wrong. But let's dig into how I arrived at this sobering conclusion.

Thankfully, git is so spectacularly useful that I didn't have to stop at simply theorising about my burning question of how, when and why git blame was introduced. The beauty of git itself being an open source project made with git is that anyone can check who authored which changes, and when.

In search of some answers, I put on my detective hat, raised a monocle to my eye, and downloaded Git's source code repo. To search for the earliest introduction of the string "blame" into the codebase, I used git log --all -p --reverse --source -S 'blame'.

The first mention of "blame" I found was in the documentation intended for users migrating from CVS, a competing version control tool. The changed file, in classic Linus fashion, begins with the following sentence: "OK, so you're a CVS user. That's ok, it's a treatable condition, and the first step to recovery is admitting you have a problem". Had programming not worked out for the man, standup comedy would have been a solid second choice.

Here's the bit that features the word "blame".

Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed Jun 8 13:19:31 2005 -0700

So, something has gone wrong, and you don't know whom to blame, and
you're an ex-CVS user and used to do "cvs annotate" to see who caused
the breakage. You're looking for the "git annotate", and it's just
claiming not to find such a script. You're annoyed.

Yes, that's right. Core git doesn't do "annotate", although it's
technically possible, and there are at least two specialized scripts out
there that can be used to get equivalent information...

The text suggests ex-CVS users were predicted to expect the command git annotate to produce what we now know as the git blame output, based on CVS itself supporting an annotate command. Neither of the commands existed in git at the time (although now, they both do).

At this point, it seemed to me like the jury was still out on whether the naming of git blame was the result of a joke or an intentional, serious, naming decision. But from reading through the documentation, it started seeming more and more like git blame was the result of a work ideology that starkly departs from mine, and embraces a culture where assigning blame is the norm and tolerance for mistakes is scant.

Still hungry for answers, I dug deeper and uncovered git's public mailing list archives.

The real origin of git blame 💫

In May 2005, a member of the git open source community sent an email, wondering how to replicate the CVS annotate feature. Linus replied saying that whilst he knows how to do it (obviously 💅), he's hoping someone else will, because he generally doesn't care about the feature enough. So much for my Torvalds-joke hypothesis.

Junio Hamano came forward and implemented a slow yet working algorithm in Perl, which became the base of future git blame implementations. Then, two people independently created their own versions - one of which became git annotate, named consistently with SVN, and the other became git blame.

Here's the "guilty" commit.

Author: Fredrik Kuivinen <frekui@gmail.com>
Date: Tue Feb 21 00:40:54 2006 +0100

Add git-blame, a tool for assigning blame.

I have also been working on a blame program. The algorithm is pretty
much the one described by Junio in his blame.perl. My variant doesn't
handle renames, but it shouldn't be too hard to add that. The output
is minimal, just the line number followed by the commit SHA1...

So here we go, git blame turns out to be described by its creator as a tool for assigning blame and was originally unequivocally intended to do exactly what it says on the tin - assign blame.

Admittedly, the finding is no sensation - my detective work has led me to the outcome most would have rightfully assumed to be true by default, without feeling the need to go off and do hours of research on the topic in the hope of discovering that, surely, this was all down to a simple misunderstanding or innocent joke gone awry.

However, Fredrik was not the first one to have implemented a "blame" algorithm. As he says in his commit message, he was working off of Junio's blame.perl, which makes copious use of the words blame and guilt itself, as shown below, in an excerpt from the blame algorithm's description.

How does this work, and what do we do about merges?

The algorithm considers that the first parent is our main line of development and treats it somewhat special than other parents. So we pass on the blame to the first parent if a line has not changed from it. For lines that have changed from the first parent, we must have either inherited that change from some other parent, or it could have been merge conflict resolution edit we did on our own.

The following picture illustrates how we pass on and assign blames.

In the sample, the original O was forked into A and B and then merged into M. Line 1, 2, and 4 did not change. Line 3 and 5 are changed in A, and Line 5 and 6 are changed in B. M made its own decision to resolve merge conflicts at Line 5 to something different from A and B:

                A: 1 2 T 4 T 6
               /               \
O: 1 2 3 4 5 6                  M: 1 2 T 4 M S
               \               /
                B: 1 2 3 4 S S

In the following picture, each line is annotated with a blame letter.
A lowercase blame (e.g. "a" for "1") means that commit or its ancestor is the guilty party but we do not know which particular ancestor is responsible for the change yet. An uppercase blame means that we know that commit is the guilty party

Even I, self-appointed Language Police Officer™, can see nothing wrong with the above description. In the world of algorithms, removed from humans, the words "guilt" and "blame" are simply the most succinct descriptions of the variables' functionality. In this context, the drawbacks of their negative connotations don't outweigh their utility.
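To see the first-parent rule from the excerpt in action, here is a minimal Python sketch that replays the O, A, B, M sample. It is a toy, not Junio's Perl: it assumes every commit has the same number of lines and compares them position by position (the real algorithm works from diffs), and the names blame_merge and commits are made up for illustration.

```python
def blame_merge(commits, tip):
    """Return, for each line of `tip`, the commit that ends up holding the blame.

    `commits` maps a commit name to (parents, lines), with the first parent
    listed first. Lines are compared by position, which is a simplification.
    """
    result = []
    for pos in range(len(commits[tip][1])):
        current = tip
        while True:
            parents, lines = commits[current]
            # Pass the blame to the first parent that has the same line;
            # the first parent is tried before any other parent.
            heir = next(
                (p for p in parents if commits[p][1][pos] == lines[pos]),
                None,
            )
            if heir is None:
                break  # no parent has this line, so `current` is the guilty party
            current = heir
        result.append(current)
    return result


# The sample from the excerpt: O forks into A and B, which merge into M.
commits = {
    "O": ([], "1 2 3 4 5 6".split()),
    "A": (["O"], "1 2 T 4 T 6".split()),
    "B": (["O"], "1 2 3 4 S S".split()),
    "M": (["A", "B"], "1 2 T 4 M S".split()),
}
print(blame_merge(commits, "M"))  # ['O', 'O', 'A', 'O', 'M', 'B']
```

Lines 1, 2 and 4 travel all the way back to O, line 3 stops at A, line 6 is inherited from B, and line 5, the merge conflict resolution, stays with M.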

But the rabbit hole goes even deeper than Junio's blame.perl. Turns out, another competing source control tool called Subversion had already implemented a blame feature all the way back in 2003!

The usage of the term blame is likely the result of 2 things: the word's variable naming utility and the collective mindset at the time.

Contacting The Author 🗯️

When I contacted the git feature's author, Fredrik Kuivinen, he told me that he mainly called it blame because it had been called blame elsewhere already. He also added the following:

"In hindsight, I should have taken the opportunity to come up with a better name, e.g., "praise" which has been suggested by others."

I felt vindicated. My temporary career change to detective had been worth it - after all, the author himself appears to agree a naming change would be beneficial!

But at the time Fredrik was implementing the feature, coming up with a better name would have required an implausible amount of foresight on his part. Firstly, he would have had to anticipate the success of git and the future ubiquity of the feature. Secondly, he would have had to anticipate a cultural shift in tech.

Moreover, the same command already existed elsewhere, and going with the default produces less friction than changing the status quo. And judging by mail exchanges, other contributors also preferred the naming of "blame" over "annotate".

Therefore, we will be assigning NO blame to Fredrik for implementing the feature in this article.

What have we learned? 🎓

The term blame existed in a version-control context prior to git, because it is a succinct descriptor and a reflection of the prevalent collective mindset at the time, which no single individual is responsible for.
Using words with negative affect in internal variables might in some cases well be the most elegant naming solution. But naming user-facing commands in the same way in 2021 demonstrates either a lack of empathy for the end user or a lack of awareness of the significant impact language has on us all.
Open source is magnificent. More work than I could have previously imagined went into git. I'm grateful for the hard work that continues to set the foundations of all technology that surrounds us.

This doesn't seem like such a big deal, why get hung up on it? 💁🏼

I agree that on an individual level, the existence of git blame feels easy to brush off. Once the initial twinge of surprise at discovering the feature subsides, it's easy to go back to work and never question it again. But as I've explained above, the language we are exposed to continues to affect us in subtle but real ways.

So whilst the effect of git blame on any one person might be negligible, its cumulative effect on an entire industry that encounters the feature every single day is significant. Over time, tiny amounts of shame and hesitation encountered by millions add up to a mountain of innovations forgone.

Moreover, many clearly still find it amusing that git blame exists. To those who do, I'd say: get with the times, folks. Being a dick is 2000 and late.

And although I have no science to back this up, I have the suspicion that some are more affected by the feature than others. Those who engage in more introspection, those who feel they have more to lose if they commit the wrong code, those who experience a wider range of emotions overall. As we move towards creating a diverse workforce open to individuals from any background, we would do well to also adjust our language to be more inclusive, thus allowing anyone to feel welcome and contribute with confidence.

So now, only one question remains... who is confident in C and wants to submit a revolutionary patch to git? 🤩