First, let me say this: open-source software has largely conquered the world. We’re a long way from when Microsoft called Linux “a cancer”.
Nevertheless, in some areas, source is closely guarded. So when a question came up on our C++ Algotrading Telegram group (email us if you want to join) suggesting there weren’t any open-source projects for algo-trading, I had to do some investigation.
In the process I’ve learned a little about some of the actual stats of open-source: it’s mind-boggling.
TL;DR - I’ll also provide the top OS repos for Quants and Traders in various categories - see below.
It’s ironic that it is Microsoft, especially after their purchase of Github, that has become the greatest supporter of open-source. I have no love for Microsoft. I believe their historical anti-competitive practices set back the software industry a decade. Further, I think git (that Github is based on) has serious design issues - leading to the need for sites like ohshitgit.com.
But Microsoft have now taken over the open-source narrative.
Microsoft bought Github for $7.5B. I thought at the time “there’s proof Microsoft has more money than they know what to do with”. Now I realize how astute they were. At the same time they’ve somehow turned what was a simple vim-like editor - VSCode - into a ferocious developer behemoth. Remarkable.
Github has now cemented its place as the open-source repository venue. There’s little point looking anywhere else for code. For new code only the politically extreme even start anywhere else. Old code is being migrated or mirrored there. Github has won.
Finding the Needles in the Haystack
Unquestionably the average quality in Github is low. There is no restriction on what you can submit. So how can I choose a set of top repos? The answer is the Github feature of “Starring” - you Star the repos you want to remember (presumably because you like them).
Thankfully, Github has an API (and actually a GraphQL API too) and there is an open-source Python library for it - both, of course, hosted on Github. One use is listing - for a project - the set of users who Starred it, and - for a user - the set of projects they’ve Starred. We used this to get our lists.
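To make those two listings concrete, here is a minimal sketch that calls the Github REST API directly with the Python standard library. It is not the code we ran and it does not use the Python library mentioned above. The helper names (`stargazers_url`, `starred_url`, `fetch_all`, `repo_stargazers`, `user_starred`) are made up for this illustration, and the "stop on a short page" paging rule is a simplifying assumption.

```python
import json
import urllib.request

API = "https://api.github.com"


def stargazers_url(owner, repo, page=1, per_page=100):
    # REST endpoint listing the users who starred a repository.
    return f"{API}/repos/{owner}/{repo}/stargazers?per_page={per_page}&page={page}"


def starred_url(user, page=1, per_page=100):
    # REST endpoint listing the repositories a user has starred.
    return f"{API}/users/{user}/starred?per_page={per_page}&page={page}"


def http_get_json(url, token=None):
    # Plain GET returning decoded JSON. A token is optional, but
    # unauthenticated requests are heavily rate limited.
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "stars-sketch",  # hypothetical client name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all(url_for_page, get=http_get_json, per_page=100):
    # Assumption: a page shorter than per_page means we reached the end.
    items = []
    page = 1
    while True:
        batch = get(url_for_page(page))
        items.extend(batch)
        if len(batch) < per_page:
            return items
        page += 1


def repo_stargazers(owner, repo, get=http_get_json):
    # For a project: the logins of the users who starred it.
    users = fetch_all(lambda page: stargazers_url(owner, repo, page), get)
    return [user["login"] for user in users]


def user_starred(user, get=http_get_json):
    # For a user: the "owner/name" of every repo they starred.
    repos = fetch_all(lambda page: starred_url(user, page), get)
    return [repo["full_name"] for repo in repos]
```

A call such as `repo_stargazers("lballabio", "QuantLib")` would return a list of logins, and feeding each login to `user_starred` gives the repos those users like. The `get` parameter is there so the paging logic can be exercised with a fake fetcher, without touching the network.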
What was striking, for me, in this process is the sheer number of repos - all significant enough for people to acknowledge with a Star.
From the repos we used as “seeds” we found hundreds of thousands of starred repos.
Using Starring as an indicator of repo popularity, the process is quite straightforward:
- Find a few repos popular with algo traders and quants.
- Use Stars to work out what other repos are popular.
- List by combined popularity.
- Order the top entries by relative popularity: Stars from the seed repo starrers versus Stars from everyone else.
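The steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not our actual code. `rank_repos` and its parameters are hypothetical names, and "relative popularity" is read here as the share of a repo's total Stars that come from seed starrers.

```python
from collections import Counter


def rank_repos(starred_by_user, total_stars, top_n=50):
    """Rank repos starred by the seed repos' starrers.

    starred_by_user: {login: iterable of repo names} for seed starrers.
    total_stars:     {repo name: overall star count on Github}.
    Returns a list of (repo, stars_from_seed_starrers) pairs.
    """
    # Step 2: count how many seed starrers starred each repo.
    seed_counts = Counter()
    for repos in starred_by_user.values():
        seed_counts.update(set(repos))  # one vote per user per repo

    # Step 3: shortlist by combined popularity among seed starrers.
    shortlist = seed_counts.most_common(top_n)

    # Step 4: order the shortlist by the fraction of a repo's stars
    # that come from seed starrers (assumed meaning of "relative").
    def relative(item):
        repo, seed_stars = item
        overall = max(total_stars.get(repo, seed_stars), seed_stars)
        return seed_stars / overall

    return sorted(shortlist, key=relative, reverse=True)
```

The point of the last step is that a hugely popular general-purpose repo gets pushed down, while a niche quant repo that is starred mostly by this crowd rises to the top.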
For simplicity we used Sqlite to store the data. Similarly JupyterLab with Python made it straightforward to run the code. Of course we made our code open-source with Popular Gits - on Github.
I had put forward QuantLib on our Telegram group. I threw in CCXT because it’s a popular algotrading library. This was in response to a question on how to get a toe in the door as a professional Quant Dev. Contributing to QuantLib or CCXT is a great way to learn these areas. Therefore they were the “seed” projects for this effort. 3,300 users have Starred QuantLib and 25,000 have Starred CCXT. The total number of projects Starred by those users is over 1 million. Over 1,000 of those projects are Starred by 100 or more users.
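For a feel of how the Sqlite storage and the "Starred by 100 or more users" count could work, here is a rough sketch using Python's built-in `sqlite3` module. The table layout and the function names (`open_db`, `add_stars`, `popular`) are assumptions for illustration, not the schema used in Popular Gits.

```python
import sqlite3


def open_db(path=":memory:"):
    # One row per (login, repo) pair; the primary key removes duplicates.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS stars ("
        " login TEXT NOT NULL,"
        " repo TEXT NOT NULL,"
        " PRIMARY KEY (login, repo))"
    )
    return db


def add_stars(db, login, repos):
    # Record every repo a user has starred, skipping pairs already stored.
    db.executemany(
        "INSERT OR IGNORE INTO stars (login, repo) VALUES (?, ?)",
        [(login, repo) for repo in repos],
    )
    db.commit()


def popular(db, min_stars=100):
    # Repos starred by at least min_stars of the stored users,
    # most starred first, ties broken by name.
    rows = db.execute(
        "SELECT repo, COUNT(*) FROM stars"
        " GROUP BY repo HAVING COUNT(*) >= ?"
        " ORDER BY COUNT(*) DESC, repo",
        (min_stars,),
    )
    return rows.fetchall()
```

With the seed starrers loaded, `len(popular(db, 100))` is the kind of query that yields a count like the one quoted above.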
For the lists below I did a little further curation to get a reasonable result. I took out:
- repos unrelated to the subjects (like Diem and Bitcoinbook)
- those without an English translation. There were many in Chinese: Kung Fu, Abu, RiceQuant and QUANTAXIS for example. This is very interesting for me (I lived in China) but not being Chinese literate I can’t make a judgment on these repos. VN.PY is translated so I’ve included that. If anyone would like to be part of a translation effort, please contact me.
- aggregators. These repos are useful - like EliteQuant - but listing aggregators in an aggregate list seems wrong-headed.
Finally here it is - and some of the results are interesting!
I’ve made a long list. I’ve cut it off fairly arbitrarily - there are of course many more good projects. These are the ones picked out via our algorithm by QuantLib and CCXT users. Those two are of course at the top of the list!
| TA Lib (Python) | TA | Cython |
| Machine Learning for Trading | ML | Python |