As part of my ongoing series on learning data science and reviewing the latest tools, I ended up needing to work on data analysis with people in different countries. While big companies have their own internal tools for sharing code among teams, fewer options were available for students and freelancers. Fortunately, two such tools, Google Colab and CoCalc, are emerging to help data scientists collaborate online. (Disclosure: I am a contractor with the tech policy nonprofit Tech4America.)
Google’s Colaboratory (Colab, for short) began as a research project with the makers of the popular online programming notebook, Jupyter. Its features have expanded recently, as machine learning and other data science needs have become more commonplace.
Both Colab and CoCalc operate like Jupyter notebooks. If you aren’t familiar with them, they look like a standard text editor but, among other things, let users execute code in discrete blocks, with the output shown directly below each block.
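To make the "discrete blocks" idea concrete, here is a minimal sketch in plain Python. It is not how Jupyter, Colab, or CoCalc are implemented; the `run_cells` helper is a hypothetical name used only for illustration. It runs each block in turn, keeps variables alive between blocks, and collects the printed output of each block separately.

```python
import io
from contextlib import redirect_stdout


def run_cells(cells):
    """Run code strings in order in one shared namespace.

    Returns a list with the printed output of each cell, so every
    block of code can be shown with its own output beneath it.
    """
    namespace = {}  # shared state, so later cells see earlier variables
    outputs = []
    for source in cells:
        buffer = io.StringIO()
        with redirect_stdout(buffer):
            exec(source, namespace)
        outputs.append(buffer.getvalue())
    return outputs


cells = [
    "x = 2 + 3",      # first block defines a variable and prints nothing
    "print(x * 10)",  # second block reuses it
]

for source, output in zip(cells, run_cells(cells)):
    print("In: ", source)
    print("Out:", output.strip())
```

The second block can use `x` because both blocks share one namespace, which is also why a notebook that has been reset needs its earlier blocks run again.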
But, other than this similarity, there are notable differences. I’ll break down each platform by pros and cons.
Colab has a smooth desktop interface: scrolling over long notebooks is a relatively seamless experience. It also has a table-of-contents-like sidebar where you can collapse and expand sections, which makes organizing work slick. This may sound like a minor feature, but it makes a big difference when you’re constantly going up and down a notebook searching for code in different sections.
It’s also easy to load datasets from other Google products, such as its data warehouse service BigQuery or Google Sheets.
Sharing a notebook is as simple as sharing a Google Drive document: just click a button in the upper-right corner and enter an email address to give someone access to the notebook.
Colab excels at being super-efficient in sharing, scrolling, and organizing code.
Though I have not used it, Colab also integrates well with Google’s own machine learning project, TensorFlow, which is especially nice for compute-heavy machine learning projects.
Colab is still a relatively closed environment. At one point, I wanted to add a Python package and couldn’t. I was told on a support forum that there was no way to run the code because Colab had not added the package. This makes Colab good for common tools but not for specialized work.
The other con is that there’s no live editing. Two people can’t write code at the same time, which means lots of back-and-forth sharing for teams.
Finally, the mobile experience is pretty lackluster. If I needed to review a teammate’s code, I could sometimes open it up on my phone and run the code, but it was clunky and there was no guarantee that I’d be able to execute it.
CoCalc excels in many of the places where Colab misses the mark.
CoCalc allows live editing, which is perfect for teams that want to work on different parts of the notebook simultaneously or work collaboratively on the same part without having to wait and refresh the page.
CoCalc’s mobile experience is decent, especially since it plugs in with a mobile app, Juno. The mobile experience is clunky and limited enough that I don’t like to write large sections of code purely on my phone, but I rarely run into issues reviewing or editing code written by a teammate on mobile, which makes it much easier to be productive on the go.
Finally, CoCalc has a terminal backend, which means that it can perform almost any of the functions that I can do on my local machine, including installing obscure Python packages.
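As a rough illustration of what that terminal access makes possible, the sketch below installs a package only when it is missing, using the standard `python -m pip install` command. This is a general Python pattern rather than anything CoCalc-specific, and it assumes the environment permits pip installs. The helper names (`is_installed`, `pip_install_command`, `ensure_package`) are hypothetical.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module can be imported here."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package):
    """Build the pip command for the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_package(module_name, package=None):
    """Install a package only if its module is missing.

    Returns True if an install was attempted, False if the module
    was already available. Assumes pip installs are allowed.
    """
    if is_installed(module_name):
        return False
    subprocess.check_call(pip_install_command(package or module_name))
    return True


# The standard library's json module is always present, so nothing is installed.
print(ensure_package("json"))
```

Using `sys.executable` instead of a bare `pip` ensures the package lands in the same Python environment that the notebook itself is running.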
CoCalc’s interface is not as slick as Colab’s. It’s harder to scroll through large notebooks, and it seems choppier on desktop.
Unlike Colab, there’s no way to collapse and organize sections within a notebook, so I end up spending time going back and forth between different sections.
Finally, to be complete, both Colab and CoCalc have a ‘con’ compared to using a Jupyter notebook on a local machine. They need to be refreshed after a few hours or a day away from the notebook. Having to rerun code from the beginning (or re-authenticate to connect to a database) is an inconvenience that can make the experience frustrating.
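One general way to soften the cost of rerunning everything is to save the result of a slow step to a file and reload it on the next run. The sketch below is a minimal version of that idea with a hypothetical `cached` helper. It is not a feature of either platform, and it only helps if the file storage survives the refresh, which is an assumption here.

```python
import os
import pickle
import tempfile


def cached(path, compute):
    """Load a saved result from path, or compute and save it.

    compute is a zero-argument function for the slow step. It is
    only called when no saved result exists at path.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


def slow_step():
    # Stand-in for an expensive query or computation.
    return sum(range(1000))


# A temporary directory keeps the demo self-contained; a real notebook
# would need a location that persists between sessions.
checkpoint = os.path.join(tempfile.mkdtemp(), "slow_step.pkl")
first = cached(checkpoint, slow_step)   # computes and saves
second = cached(checkpoint, slow_step)  # loads from the file
print(first == second)
```

Pickle files should only be loaded when they come from a source you trust, such as your own earlier session.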
Overall, I like both Colab and CoCalc. As of late, I’ve been working more in CoCalc than Colab, because I like the live-editing feature, but it’s still too early to make a determination between the two.
Credit: Google News