The Lord Governor rose from his seat and gazed around the packed arena with pride.
“Unleash the Datum Sorcerers!” he thundered, raising his arms skywards. The spectators in their thousands roared in response.
The Warlord, standing at the Governor’s side, turned towards him and saluted. “Are we certain, my lord?” he asked with a slight tremble in his voice.
“Indeed! The arena has been prepared! The people have gathered! The datum stones have been retrieved! Even the Emperor has granted his approval! It is time!”
“Very well, my lord. Just a little matter regarding the sorcerers… they, um,” the Warlord hesitated before continuing, “the last of the sorcerers died a week ago.”
“What? All of my sorcerers are dead? How can that be! We had so many of them!” the Lord Governor was aghast.
“Yes, yes. They, err, exited our world gradually, one by one, out of starvation. The Human Resource Priests had diagnosed that the sorcerers were emaciated for lack of datum.”
“Bah! I always knew one could not rely on this hocus-pocus of sorcery,” asserted the Governor, and continued, “Very well, summon the Mystic Legion Elves instead. I am sure they have learned to automate the datum gems without needing the sorcerers.”
“Very well, my lord. Just a little matter regarding the elves… they died out even before the sorcerers. They suffered from a strange unfamiliar affliction. Some of our Human Resource Priests gave it a new name — Ennui.”
The Governor stared up at the sky, with shock enveloping him. Thousands of impatient eyes from every corner of the arena were upon him. His stomach churned with the nauseating weight of their restless expectations.
“What about the Datum Alchemists? I never understood why sorcery is different from alchemy. I don’t see why the Alchemists cannot create the automaton. Can you call for the Alchemists, or are they all dead too?” the Governor glared at the Warlord.
“Very well, my lord. Just the little matter…” the Warlord faltered.
“Is EVERYONE dead?!” the Governor screamed, “Who remains among my Automaton Inquisition Troops?”
“There is, uh, one Datum Sorcery Manager, one Principal Lord of the Elves, and one Datum Alchemy Director.”
As the Governor sank back into his throne in resignation, he could hear the distant sounds of the Emperor’s army marching towards the arena…
You might have read the paper “Hidden Technical Debt in Machine Learning Systems” (NeurIPS 2015). If not, I highly recommend it. It is an excellent discussion of the technical risks and challenges associated with building and maintaining machine learning (ML) systems. Recently, going back to this paper led me to ponder an orthogonal set of challenges hindering the efforts to build, maintain, iterate on and improve ML products: organizing the ML practitioners in a company. Instead of continuing to ponder in private, I felt a series of posts (starting with this one) could help me, and possibly others, unravel these organizational aspects. For the sake of a fancy-sounding title, let’s call this series “Unhidden Organizational Debt of Machine Learning Teams”.
Organizational debt isn’t a new concept. I am sure it is as old as organizations themselves. Organizational debt can, and does, accumulate in any type of company. Several articles, and probably books, have been written about it. “It’s time to start tackling organizational debt” is one such recent article, which also links to several others (I chose to cite it because it too is a LinkedIn post with no peer review, very much like this one!). Tech companies are not immune to organizational debt, be they a funding-starved startup or an aging behemoth. Despite their penchant for regular restructuring, the dynamic nature of the tech industry means that the organizational structure of these companies is always playing catch-up with the changing day-to-day reality.
In this series of articles, however, you and I are going to zoom in on organizational debt specifically in terms of the people behind the development of ML systems and products. To further narrow down our scope, we will only think in terms of medium-sized organizations that either have a significant software branch or are full-fledged software companies (that leaves out startups, which don’t have rigid org structures, and very large companies, which can be considered conglomerates of multiple smaller ones).
Tackling these issues in the form of Why, Who, Where, What, When, and How seemed like an interesting design choice to me. This post will focus on the Why and the Who.
Since machine learning is a relatively new frontier for most tech organizations, they have only recently started forming their ML teams, often from scratch. Organizational leaders such as VPs or executives may not have much prior experience of building and managing ML teams. For many of them, it might be the first time that they have an ML team in their department or organization.
While at a high level ML development appears deceptively similar to traditional software development, anyone using ML in the industry will tell you that it differs significantly in most aspects, ranging from ideation to deployment. However, with a lack of ML experience at the leadership level, there can be a gap in the understanding of the nuances of ML development, along with unreasonable expectations of quick turnarounds or ‘magical’ solutions. The ML teams and team leaders may then resort to quick and dirty solutions, leading to a pile-up of mainly technical, but also organizational, debt (of course, this is also applicable to software at large, but lack of awareness and the inherent uncertainty can make it harder with respect to ML).
Coming back to the organizational aspects, as a company slowly increases its investment in data and ML, the evolution of the ML team(s) often tends to be ad-hoc, reactive and ill-designed. The skills and responsibilities of the ML teams turn out to be varied or inconsistent, and often misunderstood. All this leads to gradual accumulation of organizational debt.
On top of all this, there are no well-established best practices on organizing or managing ML teams. In fact, even the skills and tasks associated with the same job title vary significantly across companies. To underscore this, and to ensure that we are all on the same page for the remainder of the discussion, let us look at the common job titles, their multiple interpretations and the underlying skillsets. This brings us to the Who!
DS aka Data Scientist (aka Datum Sorcerer): As DS became the most sought-after magical ability a few years ago with both job-seekers and job-givers coveting it, it led to a diversification of what DS implies. However, I think the responsibilities have coalesced into a few major categories. When a company thinks of a “data scientist” creature, it may be imagining any of the below:
- DS-α: responsible for understanding a business problem and building data-driven solutions that will go into production, i.e., typically training ML models, including the processes of data munging, dataset creation, and so on. Sometimes also referred to as Machine Learning Scientist, Research Scientist, and other prefixes (*-Scientist).
- DS-γ: responsible for munging, mining and analysing data, gathering insights, and answering business questions with data, i.e., product analytics, behavior analytics, A/B testing design & analysis, causal relationships, etc. Often called Data Analyst, or the fancier Decision Scientist.
- DS-ƛ: responsible for both building/training ML models and deploying them into production, i.e., model serving which includes constructing the data ingestion pipelines, postprocessing and delivery of predictions, MLOps, etc. Sometimes also referred to as Applied Scientist, Full-stack Data Scientist, and more commonly, Machine Learning Engineer.
MLE aka Machine Learning Engineer (aka Mystic Legion Elf): Some time after DS had blasted onto the scene, a new creature started making an appearance. It was named MLE. While in the initial days MLE might have meant bringing the engineering chops needed to utilize the data science models, it ended up morphing into two major species:
- MLE-β: responsible for taking trained ML models and deploying them into production environments, i.e., model serving which includes MLOps, constructing the data ingestion pipelines, postprocessing and delivery of predictions. Sometimes, they go by an even bigger mouthful of a title: Software Engineer, Machine Learning.
- MLE-ƛ: responsible not just for productionizing ML models, but also for training them in the first place by understanding the business, the problems and the data. As evident from above, sometimes also called Applied Scientist, Full-stack Data Scientist, or plain old Data Scientist.
DA aka Data Analyst (aka Datum Alchemist): These are some of the ancient creatures who have traversed both tech and non-tech companies since time immemorial. They also took on different names with different permutations of the spell-binding words like business, data, intelligence, analysis, analytics, research, etc. In the context of tech-ish companies with a focus on data science and/or machine learning, they may frequently be grouped into two varieties:
- DA-γ: responsible for munging and analysing data and generating insights, i.e., product analytics, behavior analytics, A/B testing design & analysis, causal relationships, etc. If this sounds familiar, it is because they sometimes go by the supposedly fancier title: Data Scientist.
- DA-γα: responsible mainly for data analysis and gathering insights, but on rare occasions resorts to training ML models to generate predictions. Often seen in companies that are beginning to test the potions of machine learning. They are likely to eventually transform their title to Data Scientist.
SWE-ML aka Software Engineer, Machine Learning (aka regular Elf with a dose of mysticism): While the two halves of the name are essential components of this whole discussion, their combination as a title is a relatively new phenomenon. Of course, previous decades had many pioneering creatures who just went by the title of Software Engineer but who not only used ML but also laid the foundation for the emergence of the field of Data Science and Machine Learning. In a sense, they were the progenitors of several of the above species. In current times, however, creatures adopting this name are mainly of the below varieties:
- SWE-ML-β: responsible for taking trained ML models and deploying them into production environments, i.e., model serving which includes the MLOps, constructing the data ingestion pipelines, postprocessing and delivery of predictions. Sometimes goes by the shorter title, Machine Learning Engineer.
- SWE-ML-ε: responsible for software engineering and working alongside MLEs or DSs in the ML teams. Has enough understanding of ML to steer clear of it but occasionally helps out with some MLOps. Usually goes by the adornment-free title, Software Engineer.
DE aka Data Engineer (aka Datum Extractor): This is among the oldest and the least ambiguous of the creatures (perhaps due to the marked history). However, one does occasionally come across a couple of variants:
- DE-δ: responsible for extracting, receiving, processing, transforming, curating, storing, supplying and accessorizing data. In other words, if they don’t supply the datum gems, all the sorcerers, alchemists, and mystic elves will be nothing but idle and bored creatures.
- DE-δβ: responsible for all the data engineering, but also has to step in to productionize ML models by building the data pipelines and other surrounding infrastructure for model serving. May eventually refashion their title to Machine Learning Engineer.
As we see, the same title implies different skills and responsibilities in different organizations. No matter what variants and combinations of titles are present in a single organization, it is critical that, for magic to happen, all the necessary skills are covered sufficiently:
1. α = building ML models
2. β = productionizing ML models
3. γ = data analysis and insights
4. δ = data processing and storing
5. ε = software engineering
and finally, although it is an uncommon skill, ƛ can be pretty useful:
6. ƛ = both α & β together = building and productionizing ML models.
If your organization lacks people exercising δ skills, there will be no datum. All the rest of the creatures will end up hunting for data, or most likely, better jobs! If γ is missing, then those creatures busy doing α will be interrupted by others and pestered for data analysis and insights. If you lack β, then α is only useful as a fun hobby, or for winning hackathons or Kaggle competitions. If you are deprived of α, you will end up doing black, and certainly bad, magic, probably burning everything down. Finally, if your organization does not have ε, then you have been wasting your time reading this article!
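For readers who prefer spells in executable form, the coverage rule above can be sketched as a small set check. This is only an illustration: the skill names come from the list above, while the function name, the variable names and the example roster are hypothetical.

```python
# Required skills, named as in the list above.
REQUIRED_SKILLS = {
    "building ML models",
    "productionizing ML models",
    "data analysis and insights",
    "data processing and storing",
    "software engineering",
}


def missing_skills(team):
    """Return the required skills that nobody on the team covers.

    `team` maps a (hypothetical) role name to the set of skills it exercises.
    """
    covered = set()
    for skills in team.values():
        covered.update(skills)
    return REQUIRED_SKILLS - covered


# Hypothetical roster: one creature who both builds and productionizes
# models, one data engineer, and one software engineer.
team = {
    "full-stack data scientist": {
        "building ML models",
        "productionizing ML models",
    },
    "data engineer": {"data processing and storing"},
    "software engineer": {"software engineering"},
}

print(sorted(missing_skills(team)))  # ['data analysis and insights']
```

In this made-up roster nobody covers data analysis and insights, so the model builders are the ones who will get pestered for it.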
This concludes the Why and the Who. The next posts will focus on the remaining questions: Where, How, What and When.
PS — credits: a quick shoutout to Nikhil Bojja who is the antithesis of reviewer 2.