Some Experiments with GitHub Copilot

I recently got my hands on the GitHub Copilot extension for VS Code and it's amazing (borderline scary).

The examples below are in Python

Function to get Pokemon Data

Was able to auto-complete my comment as well. Initially, I never intended to save the data to a JSON file.

Added some in-line code comments

Used an external library (requests) to make the request. Used json to save the data.

Chose a decent file name on its own

Was able to find a data source, surprisingly someone's GitHub repo
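For reference, a function along these lines might look roughly like the sketch below. It is not the code Copilot generated. The version described above used requests; this sketch swaps in urllib from the standard library so it runs without extra installs. The function name, the default file name, and the URL you pass in are all placeholders of mine.

```python
import json
from urllib.request import urlopen


def get_pokemon_data(url, filename="pokemon.json"):
    """Download JSON from `url`, save it to `filename`, and return it.

    `url` is whatever data source you trust; the one Copilot found
    is not reproduced here.
    """
    # Fetch the response body and parse it as JSON
    with urlopen(url) as response:
        data = json.load(response)

    # Save the parsed data to a JSON file
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)

    return data
```

Calling get_pokemon_data with a URL that serves JSON writes the file and hands back the parsed data, so you can inspect it right away.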

Functions to zip and unzip a file

Was able to import all the necessary libraries, although it also imported shutil and never used it.

Used zipfile library to unzip/zip

For the second function, it didn't import the zipfile library.

Was able to use the right parameters
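A minimal sketch of the two functions, assuming a single file goes into the archive. The function and parameter names are mine, not Copilot's. Note that zipfile is imported once at the top, and shutil is not needed at all.

```python
import zipfile
from pathlib import Path


def zip_file(file_path, zip_path):
    """Compress a single file into a new zip archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname stores only the file name, not the full path
        zf.write(file_path, arcname=Path(file_path).name)


def unzip_file(zip_path, extract_dir):
    """Extract every member of a zip archive into extract_dir."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        zf.extractall(extract_dir)
```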

Building a tictactoe game

Generated 64 lines of code

Was able to write functions for various purposes

Knew the winning combinations of a tictactoe board

Added Error Handling

Added print statements and ability to take input from user

Logic to check the result of the game

Although it wrote all the sub-functions, it never invoked them to actually build a playable game
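To show what was missing, here is a condensed sketch of my own (not the 64 lines Copilot produced) with the same kinds of helpers plus a play() function that actually calls them. All names are hypothetical.

```python
# Every row, column and diagonal that wins the game, as board indices 0-8
WINNING_COMBINATIONS = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def check_result(board):
    """Return 'X' or 'O' for a win, 'draw' for a full board, else None."""
    for a, b, c in WINNING_COMBINATIONS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None


def print_board(board):
    """Print the board as three rows of three cells."""
    for row in range(0, 9, 3):
        print(" | ".join(board[row:row + 3]))


def get_move(board, player):
    """Ask until the player enters a free cell number from 1 to 9."""
    while True:
        try:
            cell = int(input(f"Player {player}, pick a cell (1-9): ")) - 1
        except ValueError:
            print("Please enter a number.")
            continue
        if 0 <= cell < 9 and board[cell] == " ":
            return cell
        print("That cell is not available.")


def play():
    """Tie the helpers together into a playable game."""
    board = [" "] * 9
    player = "X"
    while check_result(board) is None:
        print_board(board)
        board[get_move(board, player)] = player
        player = "O" if player == "X" else "X"
    print_board(board)
    result = check_result(board)
    print("It's a draw!" if result == "draw" else f"Player {result} wins!")
```

Calling play() starts a two-player game in the terminal. That one extra function is the step the generated code never got to.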

Crypto Price

Added a parameter

Used a crypto api to get the data

Was able to return the correct column for the price
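A rough sketch of the same idea. I am assuming CoinGecko's public "simple price" endpoint here purely for illustration; it is not necessarily the API Copilot picked, and it may be rate-limited or require a key. The function names are mine, and urllib stands in for whatever HTTP library the generated code used.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: CoinGecko's "simple price" endpoint, used only as an example
API_URL = "https://api.coingecko.com/api/v3/simple/price"


def build_price_url(coin, currency="usd"):
    """Build the request URL for one coin priced in one currency."""
    query = urlencode({"ids": coin, "vs_currencies": currency})
    return f"{API_URL}?{query}"


def extract_price(payload, coin, currency="usd"):
    """Pull the price out of a response shaped like {"bitcoin": {"usd": 1.0}}."""
    return payload[coin][currency]


def get_crypto_price(coin, currency="usd"):
    """Fetch the current price of `coin`; needs network access."""
    with urlopen(build_price_url(coin, currency)) as response:
        return extract_price(json.load(response), coin, currency)
```

The coin name is the parameter, so get_crypto_price("bitcoin") and get_crypto_price("ethereum") go through the same code path.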

Build a streamlit app to display GitHub Repos

For this, I had to write multiple comments and it actually felt like I was pair-programming with the Copilot. However, most of the code was generated by Copilot.

Was able to get data from the GitHub API

Since I mentioned popular, it sorted the repos based on 'stars'. It is mind-boggling how it was able to relate 'stars' to popularity.

It was able to use an external library streamlit (streamlit is used to build web apps)

It also added a title and text to be displayed in the web app

It re-used the previously created function

For most parts, it was auto-completing my comments as well

General Observations

The variable and function names are fairly self-explanatory

Added relevant in-line code comments

Was able to use external libraries

Was able to get data from various data sources

The format of the code was clean with proper indentation and line breaks

It took me quite a few tries (with different comments) to get it to actually use streamlit and build a simple app. In the end, I imported the library myself, and it then started generating code that used it. At other times, though, it picked the right library on its own.

When trying to get the Pokemon/crypto data, it often made suggestions that used Beautiful Soup to scrape the data. Web scraping is not always the best option, and in some cases you might even end up breaking some laws.

Some weirdness

Sometimes it acted weird. For instance, at times the generated code contained local file paths belonging to other users, e.g. "Users/Projects/......."

I tried getting suggestions for a variable named api_key and it actually suggested a string that looked like a random key. Of course, it may well have been random, but it was still weird.

At times it generated repetitive code. E.g., when I was trying to generate code for streamlit, it generated the same two lines over and over again.

It suggests a bunch of unnecessary imports at times

For some reason, it kept on generating code that used Dash although I specifically mentioned streamlit

My Views and a few questions

My views are my own

It is surely going to improve a software developer's productivity. However, I don't see it replacing a software developer. Copilot often generated nonsensical and repetitive code, and at times it didn't import the necessary libraries. It is basically like Kite or TabNine on steroids.

A good analogy I can think of is Google Translate. It's been there for years but it has not replaced the need for an actual translator. You could translate an article from English to Japanese in a few seconds. However, you would still need somebody fluent in both languages to ensure that the translation is grammatically correct and it delivers the same message as the original article.

Another issue I can think of: who would be liable for the code? If I used GitHub Copilot to generate some code and later got sued by someone for some reason, could I put the blame on GitHub? GitHub will most likely make users agree to terms and conditions that protect it from being sued. So we would still need somebody experienced in software to ensure that the generated code is safe to use.

Although GitHub Copilot is good for new projects, I am not sure if it would be as useful when working with an existing codebase. In an existing codebase, it would have to follow the existing coding style and be able to re-use already written code. I have not tried working with Copilot in an already existing project so I can't comment much

Assuming it is constantly learning as more users use it, how would it distinguish 'good' code from 'bad' or 'spaghetti' code?

Conclusion

One thing GitHub Copilot can guarantee is that it is going to make coding, and software in general, more accessible 💯

What are your thoughts on Copilot? Did you try out anything cool with it? How do you think it is going to affect the Software/Data Science industry? Let me know in the comments.