In the midst of the seismic shifts that cloud providers have brought to the analytics landscape over the past week, Alteryx, a company that has always been hard to pigeonhole, just convened its annual Inspire conference. Beyond corny references to the host city, Nashville, there were surprisingly few headlines given that the company recently closed a major acquisition: ClearStory Data. The top story, the unveiling of an “assisted” modeling experience, provides a good example of why the company has been hard to pin down.
Alteryx is releasing a beta of an assisted ML capability on the desktop that will provide a documented AutoML experience. That is, the system will suggest potential models and support side-by-side tests of three to five of them at a time, then explain the hows and whys of model performance and the rationale behind the system’s choice of the best model of the bunch. It will also suggest what to do with imperfect data sets, such as those with missing values. At beta launch, Alteryx will support only a handful of models (that’s pretty consistent with what other AutoML services have done) and will gradually add more over time.
This is part of a broader industry initiative to open the black box of artificial intelligence and make models more explainable, one that has drawn efforts from providers ranging from IBM to H2O. Alteryx is only at the beginning of this journey. For instance, its side-by-side testing capability pales compared to what specialized data science tool providers such as DataRobot already support. So, beyond adding support for more models, it would make sense to get the new assisted modeling feature running on the server, where there would be more scale to test more models at once across larger data sets.
It’s not surprising that, compared to offerings from providers targeting hard-core data scientists, Alteryx would start with the training wheels. It all goes back to understanding where the company’s sweet spot is: business analysts and developers. Over the years, the spotlight has shifted back and forth between data preparation and data science. Alteryx provides a drag-and-drop environment that automates the workflow, from data discovery through preparation, modeling, and presentation, so it’s any and all of the above. But by far, most of the customers we spoke with at the conference started with, and in many cases continue to focus solely on, data preparation.
As such, Alteryx overlaps with data prep providers, data science tools, and visualization tools. Specialized data prep tools have enjoyed the advantage of having ML capabilities to assist the process, but unlike Alteryx, they are standalone. By comparison, Alteryx automates the analytics lifecycle with drag-and-drop workflows, backed by a palette of roughly 250 tools for operations ranging from preparation to joins, parsing, transforms, data investigation, predictive analytics, and so on. The association with data science comes from its support for R and Python and the ability to code in, or import programs from, Jupyter notebooks.
So, while Alteryx offers a charting function, nobody would confuse it with Tableau, and likewise nobody would confuse Tableau’s data prep with Alteryx. Through the years, Tableau has been one of Alteryx’s closest partners: it’s commonplace for analysts and developers to prep data in Alteryx and then visualize and chart it in Tableau.
As for data science, it’s clear from conference sessions such as Data Science 101 that the demographic points more to would-be data scientists: the analysts and developers who want to spice up their resumes with data science experience. Sure, data scientists can work with their Jupyter notebooks, but their interaction with the Alteryx world will likely be more about importing their models into workflows, or checking the R or Python code that Alteryx functions generate. It’s not surprising that Alteryx’s messaging was directed at citizen data scientists.
Of course, the other shoe to drop is Alteryx’s plans for ClearStory Data. With the ink still drying on the acquisition, Alteryx will leverage ClearStory in a couple of ways. First, it will apply a much-needed infusion of machine learning to its data prep capabilities, especially in data matching and blending. Second, it will add a second, Spark-based execution engine to the mix for handling larger data manipulation runs. By scaling processing and data volume with in-memory capability, that could only enhance Alteryx’s ability to crunch ML models. Going forward, Alteryx plans to expand the palette with more such execution engines.
The events of the past few weeks have changed the landscape in which the company competes. For the short term, Salesforce’s announced plan to buy Tableau shouldn’t markedly change the dynamics of the Alteryx relationship; the biggest impact might come from large customers who hesitate to sign enterprise deals with Tableau until Salesforce formally announces its plans. But as with MuleSoft, Salesforce plans to operate Tableau as a separate company. Again, theoretically, that shouldn’t change matters for Alteryx, but what if Salesforce were to pump more resources into fleshing out Tableau Prep and, in turn, incorporated the MuleSoft connectors? Theoretically, that would compete directly in Alteryx’s backyard. Likewise, Google’s acquisition of Looker could tread on similar ground, especially if Google integrates its AutoML capabilities.
And the 16-ton gorilla sitting in the corner is the issue of cloud transformation: Alteryx has been built for Windows desktops and is just getting the wheels rolling on adding a Linux server and a browser-based UI. But then again, on the UI front, Tableau is in the same situation. And this is before we even start talking about the containerization that could ready these tools for cloud-based PaaS offerings. We’re getting ahead of ourselves here, but the emergence of cloud-based AutoML services will be the tail that wags this dog.
Given the newness of the Google and Salesforce acquisitions, real results will be at least six to 12 months away. While Alteryx’s community is dwarfed by that of Tableau, that self-service community will be its strongest asset as the playing field changes.