In SAP HANA’s tenth year, and not coincidentally during the week of SAP’s digitally reconvened SAPPHIRE 2020 conference, SAP is marking out the roadmap for its data platform, which, in a nutshell, converges with the cloud. SAP announced the planned June 26 release of SPS05, the new long-term support release of the HANA 2.0 data platform. For most of the announcements, general availability will come in phases over the next two quarters.
For on-premises, SAP HANA 2.0 SPS05 is the definitive release
With promised support through 2025, HANA 2.0 SPS05 has been designated the definitive on-premises migration target for current HANA 1.0 customers, although SAP will continue its current practice of issuing additional short-term upgrades as well. That is standard practice for on-premises database platforms, as customers cannot be expected to upgrade at the continuous-delivery pace that is customary in the cloud. Oracle has adopted a similar strategy.
A highlight of the HANA 2.0 SPS05 release is hybrid cloud support with SAP HANA Cloud, which is SAP’s fully-managed, multi-cloud, and consumption-based database as a service (DBaaS). A new cloud gateway for secure connection from on-premises SAP HANA databases to the SAP HANA Cloud provides the link for data and query federation. SAP HANA Cloud connects to local and remote data sources, providing a single source for data that reduces data duplication. SAP HANA Cloud can scale with multiple data tiers, and with independent storage and compute, provides elasticity.
SAP HANA 2.0 SPS05 can query SAP HANA Cloud and return result sets, taking advantage of capabilities such as machine learning, spatial and graph processing, and federated query to a data lake in cloud storage. Its data virtualization capabilities extend out to SAP HANA Cloud, which is treated as a remote data source. Furthermore, customers have a choice of accessing data through federated query or through replication, and the choice can be toggled in real time. There’s little question where SAP stands in the debate over multi-model vs. fit-for-purpose databases. That puts SAP in sync with Oracle, which also promotes multi-model support, but sets it apart from the cloud pure-plays.
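The federation-vs.-replication choice can be pictured as a table reference whose access path is switched without changing the consuming query. The following is a conceptual sketch in plain Python, not SAP’s actual API; the class and method names are hypothetical, and the "remote source" is simulated by a callable.

```python
# Conceptual sketch only (hypothetical names, not SAP HANA's API):
# a remote table that can be read via federated query or via a local
# replica, with the access path toggled at runtime.

class RemoteTable:
    def __init__(self, name, remote_fetch):
        self.name = name
        self._remote_fetch = remote_fetch  # callable standing in for the remote source
        self._replica = None               # local copy, populated when replicating
        self.mode = "federated"

    def enable_replication(self):
        # Snapshot the remote data locally; subsequent reads hit the replica.
        self._replica = list(self._remote_fetch())
        self.mode = "replicated"

    def disable_replication(self):
        # Drop the replica; reads go back to federated query.
        self._replica = None
        self.mode = "federated"

    def query(self):
        # The consumer's query is unchanged regardless of access path.
        if self.mode == "replicated":
            return self._replica
        return list(self._remote_fetch())


source_rows = [{"id": 1, "region": "EMEA"}, {"id": 2, "region": "APJ"}]
t = RemoteTable("sales", lambda: source_rows)
federated_result = t.query()    # fetched from the "remote" on each call
t.enable_replication()
replicated_result = t.query()   # served from the local copy
```

The point of the design is that toggling is transparent: callers of `query()` never know, or care, which path served the rows.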
SAP Data Warehouse Cloud grows its footprint
SAP’s Data Warehouse Cloud, which was unveiled last year, gets a major expansion of its footprint with this release, extending into modeling and integrating data and business semantics.
To recap, SAP Data Warehouse Cloud is intended as an end-to-end analytics cloud service that goes beyond the database. It reengineered the HANA data platform into a cloud-native architecture deployed with containers and microservices, with compute separated from storage; with the initial announcement, it integrated HANA with capabilities from SAP Analytics Cloud, providing a data warehouse with built-in self-service analytics and visualization.
It’s more than coincidental that SAP was not the only provider introducing an end-to-end data warehousing service last fall. Microsoft reengineered Azure SQL Data Warehouse into Synapse Analytics.
It’s a reflection of the newness of end-to-end cloud analytic services that SAP’s and Microsoft’s stabs at it had different mixes of capabilities. Whereas SAP initially concentrated on integrating self-service capabilities for business analysts, Microsoft’s focus was more on the back end, pairing its data warehousing service with Azure Data Factory for building data transformation pipelines (Azure Synapse also extended to the data lake through support for ADLS cloud storage and Spark processing). Azure Synapse did integrate, through single clicks, with analytic services including Power BI and Azure Machine Learning, but they remained packaged as separate cloud services.
With the new release of SAP Data Warehouse Cloud, the footprint is moving to the data integration side as well. While this does not entail repackaging SAP Data Intelligence, which is the umbrella for data integration in the SAP Cloud, there are a number of data flow (transformation) and data virtualization features that are being incorporated into the broadened Data Warehouse Cloud. That addresses data engineers and DBAs, but for business analysts, there’s also another data integration path via business modeling.
The data flow side is about building data transformation pipelines. It provides a choice of visual transformations, using drag and drop with prebuilt operators such as projection, aggregation, join, filter, and union, or for those preferring a more programmatic approach, there is also a scripting editor for building transforms using Python 3. Once the transforms are developed, there are capabilities for harmonizing columns and filtering data.
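To make the operator vocabulary concrete, here is an illustrative sketch of what those prebuilt operators (projection, aggregation, join, filter, union) do, written over plain Python lists of dicts. This is not SAP’s operator API; the function names and the sample data are invented for illustration, in the spirit of the Python 3 scripting path the release offers.

```python
# Illustrative only: the relational operators a data-flow pipeline composes,
# sketched over plain Python lists of dicts (not SAP's actual operator API).
from collections import defaultdict

def project(rows, cols):
    # Projection: keep only the named columns.
    return [{c: r[c] for c in cols} for r in rows]

def filter_rows(rows, pred):
    # Filter: keep rows satisfying a predicate.
    return [r for r in rows if pred(r)]

def join(left, right, key):
    # Inner join on a shared key column.
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index[l[key]]]

def aggregate(rows, key, col):
    # Aggregation: sum a numeric column grouped by a key.
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r[col]
    return [{key: k, col: v} for k, v in totals.items()]

def union(a, b):
    # Union (all): concatenate two row sets with the same shape.
    return a + b


orders = [{"cust": "A", "amount": 10.0}, {"cust": "A", "amount": 5.0},
          {"cust": "B", "amount": 7.0}]
custs = [{"cust": "A", "region": "EMEA"}, {"cust": "B", "region": "APJ"}]

# Compose a pipeline: aggregate orders per customer, join in the
# customer dimension, then project the columns of interest.
pipeline = project(
    join(aggregate(orders, "cust", "amount"), custs, "cust"),
    ["region", "amount"],
)
```

In the actual product these operators are wired up visually by drag and drop, with the Python scripting editor available for transforms that don’t fit the prebuilt set.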
For business analysts, there is another path, where data is mapped into business entities such as customer, product, product hierarchy, sales pipeline, time, and so on. Operating independently from the underlying data layer, business users designate the objects, and then those objects are mapped to data models. A glossary sets the parameters for the business models, including organizational entity (for which relationships with other entities are defined), fact model, consumption use cases, and authorization scenarios dictating data access by user or role. In turn, the models populate a business catalog, which is the starting point for defining, discovering, and reusing models, and for governing their lifecycles.
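The shape of that arrangement, business entities defined independently of the data layer, mapped to underlying models, and registered in a catalog, can be sketched as follows. All names here are hypothetical illustrations, not SAP Data Warehouse Cloud’s actual objects or API.

```python
# Hedged sketch (hypothetical names, not SAP's API): a business entity
# defined independently of the data layer, mapped to an underlying fact
# model, carrying relationships and an authorization scenario, and
# registered in a business catalog for discovery and reuse.
from dataclasses import dataclass, field

@dataclass
class BusinessEntity:
    name: str                                           # e.g. "Customer"
    fact_model: str                                     # underlying data model it maps to
    relationships: list = field(default_factory=list)   # related business entities
    allowed_roles: set = field(default_factory=set)     # who may access the data

catalog = {}   # the business catalog: the starting point for discovery and reuse

def register(entity):
    catalog[entity.name] = entity

customer = BusinessEntity(
    name="Customer",
    fact_model="SALES_FACTS",            # invented model name
    relationships=["SalesPipeline"],
    allowed_roles={"analyst"},
)
register(customer)
```

The separation matters: analysts work with `Customer` and `SalesPipeline`, while the mapping to `SALES_FACTS` can change underneath without disturbing the business layer.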
As we’ve previously stated, the cloud gives data platform providers the opportunity to break down the silos in the toolchain: refactoring monolithic systems into microservices deployed in containers lets data and analytics providers rethink their tooling. With Microsoft having already taken the bait, and Oracle likely to follow, for now it appears that delivering an end-to-end experience is how they are differentiating from AWS and GCP.