Ever since I started uploading code to github, I had the intention of allowing anyone copy and use it. After all, it’s the Internet, and I would be dumb to think no one would ever copy and paste my code. If I wanted to prevent people from using it, I would keep it private. I know that once something is on the Internet, it stays on the Internet. To clarify my intention, I put the following statement in my about page:
If you are currently a data science student, you have my permission to reproduce any part of my projects, including file organization, code, or explanations, as long as you give me due credit.
In hindsight, even if people checked out one of my github repositories, it’s highly unlikely that they then traveled over to my blog to read that blurb on the about page. Also, these instructions are so vague. Are you only supposed to reproduce the code if you are a “data science student”? Does “give me due credit” mean putting my name in a code comment? Would it matter if you never share your project that includes my code?
Here’s what Github has to say about this:
If you find software that doesn’t have a license, that generally means you have no permission from the creators of the software to use, modify, or share the software. Although a code host such as GitHub may allow you to view and fork the code, this does not imply that you are permitted to use, modify, or share the software for any purpose.
In other words, even though I had that statement in my github blog, theoretically no one has permission to use my code! This is absolutely NOT what I intended. To clarify my intentions, I had to add licenses to my projects, past and present.
Licensing My Own Code
Github has a great site called Choose a License, which briefly explains what licenses are available and why you should choose each one. On a deeper level, Catherine Raymond has a Licensing HOWTO document that answers most of my basic questions pertaining to copyright, licensing, and open source.
The licenses that interested me the most were the MIT License and the GPLv3. Most of my coding projects I wanted to release under the MIT License. The code for my climbing dashboard I wanted to release under the GPLv3 license, because as unlikely as it is, if anyone creates a derivative work, I’d like for it to also be open-source and hopefully link their project back to me so I can learn from them.
Adding licenses to projects that I wrote myself was straightforward. For the MIT License, all I had to do was follow the instructions here, and updating the README is optional. For GPLv3, there was the additional step of adding some text to a License section in the README.
Crediting Others
In building my blog, I borrowed most of the code from mediator by Dirk Fabish and some from Lagrange by Paul Le. Both of these themes are licensed under the MIT License, and Dirk Fabish even goes one step further to add this to his README:
Licensing
MIT with no added caveats, so feel free to use this on your site without linking back to me or using a disclaimer or anything silly like that.
In the spirit of open-source, I also want to release my code under the MIT License. What is the best way to accomplish this while still giving proper attribution to the original authors?
I found this Quora thread, in which Gil Yehuda mentions how Yahoo does the attributions using a LICENSE file and a CREDITS.md file. I then found the code for Yahoo’s COVID-19 dashboard, which gave me an example of how they actually structured their LICENSE and CREDITS.md file. I figured that if a big company like Yahoo can do it this way, it’s probably good enough for me too.
Summary
Licensing my coding projects makes it abundantly clear that my intention is to allow anyone to copy and use my code. By following the precedent set by Yahoo, I have also provided a template for how to credit past contributors if anyone were to clone my blog template and use it for themselves. Lastly, if you have any questions pertaining to any of my projects, please feel free to email me.