Generating poetry using gcc diagnostics

Our legions are brim-full, our cause is ripe.
invalid application of ‘sizeof’ to a void type

Thy worthie, manly heart, be yet unbroken,
expected primary-expression before ‘<’ token

Did utter forth a voice. Yes, thou must die:
two or more data types in declaration of ‘i’

Introducing... gado!

After years of being tired of cold, apathetic error messages that couldn't look more computer-generated, I've set aside some time to find a more human way to display compiler diagnostics.

That's how gado (gcc awesome diagnostics orchestrator) was born! Built in Python, it uses gcc internally (g++ is supported too) to compile a file, and displays a pair of rhyming lines for every compiler message.

How does it work?

Initial processing

The essential problem to solve was: given a compiler message, how do we find a verse that rhymes with it?

Initially, we have to find a place to take the verses from. For this project, I'm using a database of all of Shakespeare's works.

We also take the output from gcc when compiling a program. As compiler messages contain a lot of unusual symbols such as @ % & ; }, we simply remove them to make the matching easier, so we are left with just letters.
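Here is a minimal sketch of that cleanup step. It is not gado's actual code: the function names and the regular expression for the "file:line:col: error: message" layout are my own assumptions about what gcc's output usually looks like.

```python
import re

# Assumed layout of a gcc diagnostic line: "file:line:col: severity: message".
# "note:" lines and context lines such as "In function ..." are skipped.
DIAGNOSTIC = re.compile(
    r"^[^:\n]+:\d+:(?:\d+:)?\s*(?:fatal error|error|warning):\s*(?P<msg>.+)$"
)

def extract_messages(stderr_text):
    """Return the message part of every error/warning line in gcc's output."""
    messages = []
    for line in stderr_text.splitlines():
        match = DIAGNOSTIC.match(line)
        if match:
            messages.append(match.group("msg"))
    return messages

def letters_only(message):
    """Turn every run of non-letters into one space, then trim the ends."""
    return " ".join(re.sub(r"[^A-Za-z]+", " ", message).split())

# Hypothetical compiler output, using messages quoted at the top of this post.
sample = (
    "demo.c: In function ‘main’:\n"
    "demo.c:3:12: error: invalid application of ‘sizeof’ to a void type\n"
    "demo.c:5:9: error: two or more data types in declaration of ‘i’\n"
)

for msg in extract_messages(sample):
    print(letters_only(msg))
```

Replacing symbols with a space (instead of deleting them outright) keeps neighbouring words from being glued together when a symbol sits between them.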

How rhymes (don't) work

The next step is to get a verse from that database that rhymes with an error. And for two sentences to rhyme, we need the last word of each to rhyme.

But how do we know if two words rhyme? In many cases, you can tell just by checking if two strings end in the same way (like potatoes and tomatoes).

... But things aren't always that easy.

There are a lot of words that share a common ending but don't rhyme. For example, none of the words "through, tough, thorough, trough, though" rhyme with each other. On the other hand, there are words that don't have a single letter in common but rhyme anyway, like "eye" and "high".

Teaching computers to rhyme

To address those problems, I used a Python library called pronouncing. It internally uses the CMU Pronouncing Dictionary to get rhymes for words.
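To give an idea of what a pronunciation-based check looks like, here is a small sketch. It does not call the pronouncing library, so it runs without installing anything. Instead it uses a tiny hand-written table in the dictionary's ARPAbet style (the entries are illustrative and worth checking against the real dictionary), and the helper names are my own.

```python
# Tiny stand-in for the CMU Pronouncing Dictionary. Each word maps to its
# phones; the digit at the end of a vowel is its stress level (1 = primary).
PHONES = {
    "eye": "AY1",
    "high": "HH AY1",
    "die": "D AY1",
    "though": "DH OW1",
    "tough": "T AH1 F",
    "token": "T OW1 K AH0 N",
    "unbroken": "AH0 N B R OW1 K AH0 N",
}

def rhyming_part(phones):
    """Everything from the last stressed vowel up to the end of the word."""
    parts = phones.split()
    for i in range(len(parts) - 1, -1, -1):
        if parts[i][-1] in "12":  # primary or secondary stress marker
            return " ".join(parts[i:])
    return phones  # no stressed vowel found: compare the whole word

def rhymes(a, b):
    """True when both words are in the table and share a rhyming part."""
    if a == b or a not in PHONES or b not in PHONES:
        return False
    return rhyming_part(PHONES[a]) == rhyming_part(PHONES[b])

print(rhymes("eye", "high"))        # no shared letters, same sound
print(rhymes("though", "tough"))    # shared ending, different sound
print(rhymes("token", "unbroken"))
```

Comparing sounds instead of letters is what handles both tricky cases: "eye/high" match because their stressed vowels are identical, and "though/tough" don't because they are not.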

But after some testing, another problem emerged. Compiler messages have a lot of words that don't exist in English dictionaries, like int, fpermissive or arith (from pointer arithmetic).

How do we get these to match with normal words, taken from English poems written centuries before our first Hello World!?

For these, I have implemented a simpler matching - just looking for a common ending in strings, as initially stated.
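A common-ending check can be sketched in a few lines. The function names and the two-letter minimum are assumptions made for this example, not necessarily what gado does.

```python
def common_suffix_length(a, b):
    """Count how many trailing characters two words share, ignoring case."""
    count = 0
    for x, y in zip(reversed(a.lower()), reversed(b.lower())):
        if x != y:
            break
        count += 1
    return count

def suffix_rhymes(a, b, min_len=2):
    """Treat two different words as rhyming if they share min_len+ final letters."""
    if a.lower() == b.lower():
        return False
    return common_suffix_length(a, b) >= min_len

print(suffix_rhymes("int", "faint"))     # made-up word, still matched
print(suffix_rhymes("arith", "smith"))
print(suffix_rhymes("though", "tough"))  # false positive: same letters, other sound
print(suffix_rhymes("eye", "high"))      # missed: same sound, no shared letters
```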

I know what you are thinking by now! This won't feature interesting rhymes like "pause/paws" and can even display incorrect ones like "though/tough", but now we can match every compiler message to a line of our database, even with nonexistent words!

To make it more fun, the rhymes are chosen at random, so it will generate two different poems even if you compile the same program twice. Nondeterminism FTW!
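The random pick could look roughly like the sketch below. Everything in it is hypothetical: the helper names, the crude three-letter rhyme test used to keep the example short, and the last line in the verse list, which is a placeholder rather than a real quote.

```python
import random

def last_word(line):
    """Last word of a line, lowercased and without surrounding punctuation."""
    words = [w.strip(".,;:!?'\"").lower() for w in line.split()]
    words = [w for w in words if w]
    return words[-1] if words else ""

def ends_alike(a, b):
    """Crude stand-in for a real rhyme test: same last three letters."""
    return a != b and a[-3:] == b[-3:]

def pick_verse(message, verses, rhymes=ends_alike, rng=random):
    """Pick one verse at random among those whose last word rhymes."""
    target = last_word(message)
    candidates = [v for v in verses if rhymes(last_word(v), target)]
    return rng.choice(candidates) if candidates else None

verses = [
    "Our legions are brim-full, our cause is ripe.",
    "Thy worthie, manly heart, be yet unbroken,",
    "Did utter forth a voice. Yes, thou must die:",
    "A placeholder line whose final word is spoken",  # not a real quote
]

message = "expected primary expression before token"
print(pick_verse(message, verses))
print(message)
```

Passing the random source in as a parameter (`rng`) keeps the fun nondeterminism by default while still allowing a seeded `random.Random` when you want reproducible output, for example in tests.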

More examples & Source Code & Contribution

I made a website (gado.dikson.xyz) featuring some of the rhymes generated by the program, so you can see some fun verses without having to install it.

If you want to install gado or take a look at the source code, you are more than welcome to visit (and contribute to) its GitHub repository!
