In his keynote opening this year’s Ignite conference, Microsoft CEO Satya Nadella reiterated the theme of tech intensity, which he defined as the product of tech adoption, tech capability, and trust. Given the flurry of announcements that streamed out last week, which literally filled an entire 80+ page book, one could be forgiven for conflating intensity with pure onslaught. When the portfolio gets so rich and varied, overlaps are inevitable, and with them comes the need for Microsoft to provide a clear story that helps its customers understand where to start and which service(s) to choose.
By the way, Microsoft is hardly alone in this dilemma. When we last counted, about a year ago, AWS had five categories, 16 instance families, and 44 instance types of infrastructure, and that doesn’t begin to count all the managed services that it offers.
These pages provided a good sampling of the array of announcements, from Microsoft’s Hybrid 2.0 strategy to changes in its Power Tools, glass archival storage, and quantum computing partnerships, along with a slew of data and analytics announcements including the GA release of SQL Server 2019, Power BI data protection features, the extension of Azure Analytics, and a range of updated Azure data services. We’ll focus our review on the two announcements that significantly expanded the portfolio: Azure Synapse Analytics and Azure Arc.
Azure Synapse Analytics – extending the data warehouse
The release of Azure Synapse Analytics, covered in detail by Big on Data bro Andrew Brust, represents a complete revamping of Azure SQL Data Warehouse that includes capabilities from Azure Data Factory and access to multiple analytic runtimes and languages. Jobs are organized within “workspaces” that lend themselves to collaboration. The goal, according to Microsoft, was to bridge data warehousing and big data analytics. Traditionally, that process involved running a Hadoop cluster (or in the cloud, a Hadoop service) and using separate tools and skills to analyze data; then, if the analytics were to be productionalized, the data would be moved to a data warehouse with its own set of tooling. Microsoft bridged the two by collapsing an entire toolchain into a unified offering, as shown in the “before” and “after” slides below.
The extension of Azure SQL Data Warehouse into a broader offering spanning much of the analytic data lifecycle is an example of how cloud-native architectures, with their use of microservices, containers, and Kubernetes orchestration, can dissolve the silos in the toolchain.
Microsoft is not the first to get there, and we suspect that it won’t be the last. In fact, barely a month ago, SAP announced SAP Data Warehouse Cloud, which also extended the data warehouse. But the two are mirror images, in that each offering extended the data warehouse in the opposite direction.
Microsoft addressed the front part of the analytics lifecycle with data integration and integration with an array of runtimes, targeting data engineers and query developers. Conversely, SAP targeted the last mile. It focused on the business analyst, building in self-service by incorporating its Analytics Cloud offering (which includes a business semantics layer). Our takeaway is that the extended data warehouse will become the rule, not the exception, and we expect that AWS, Google Cloud, and Oracle will respond in short order.
How to choose the right analytics path?
Azure Synapse Analytics adds to Microsoft’s analytics data platform portfolio. There are several other options available – so how to choose between them? As Andrew pointed out, for Spark processing, Microsoft also offers Azure Databricks – and if we want to be fully comprehensive, there’s also HDInsight (the Azure Hadoop service). For us, it’s clear that Azure Databricks would be the tool of choice for data scientists who prefer a programmatic approach over SQL queries and who want a more optimized Spark engine. And then there’s SQL Server 2019 Big Data Clusters, which adds the ability to scale analytics on on-premises Hadoop clusters by deploying a SQL engine on every HDFS data node. By the way, its in-database R and Python processing are features that Synapse could use.
The good news is that Microsoft is not trying to force-fit analytic solutions on its customers. By providing choice, it is offering flexibility. But with that flexibility comes the need to better articulate which platforms and tools are better choices for specific scenarios or user types. For instance, while Azure Synapse and Azure Databricks overlap in Spark support, the latter is better geared toward data scientists and engineers who just want to be set loose on raw data and work from notebooks, and who don’t need a database.
Azure Arc extends the hybrid portfolio
Azure Arc adds to Microsoft Azure’s options for hybrid cloud deployment. It’s an answer to a growing demand that we’re hearing from customers: how can they get the benefits of cloud on premises, and how can they keep their options open on where to deploy?
At first glance, it would be easy to misconstrue Arc as a replacement for Azure Stack, when in reality, it is a complement. Mary Jo Foley previously provided the blow-by-blow breakdown on Arc and the expansion of the Stack portfolio. Since then, we’ve learned that Arc is anything but a pure software implementation of Azure Stack. While Azure Stack is intended to support a private or hybrid cloud deployment, Azure Arc is a software agent that extends the Azure cloud control plane over infrastructure deployed on-premises, or conceivably, in other public clouds. That could mean a number of things – hold that thought.
Extending cloud management or making cloud PaaS services portable?
Microsoft’s positioning and explanation of Azure Arc are still a bit unclear. On what was supposed to be the explanatory “What is Azure Arc” slide, we saw at the top that it was the extension of Azure PaaS services to anywhere, while at the bottom of the slide, we saw the message that it was the vehicle to provide Azure control and management. In actuality, the control plane is the common thread, as it’s required regardless of whether the customer actually uses Azure PaaS services or instead uses Arc to tame their existing IT server environments. And the two could be completely different use cases.
At launch, Microsoft is making available two PaaS services on Arc, both of which happen to be data services: Azure SQL Database Managed Instance and Azure Database for PostgreSQL Hyperscale. But the fact that the first services out of the gate are Azure databases was just serendipitous; on the horizon, we expect that other Azure PaaS services, such as developer tools, AI and machine learning services, and serverless functions, will trickle out over time.
But until the portfolio of services hits critical mass, we believe that the initial “killer app” for Arc will be for enterprises to use Azure management to tame their legacy server environments, which could be bare metal or VMs. Given that the Kubernetes skills base is still thin, the good news is that this use case for Arc will be useful for enterprises that haven’t yet mastered the art of setting up Kubernetes clusters (they will need to know Kubernetes if they choose to deploy one of the Azure database services, however).
Given that Arc is available only in private preview, there’s plenty of opportunity for Microsoft to sharpen the narrative on the problems that Arc is designed to solve.