Amazon Web Services’ (AWS’) re:Invent conference in Las Vegas, though it focuses broadly on the AWS cloud, has become a coming-out party for the Seattle-based provider’s data, analytics and AI-related services. But beyond that, re:Invent is a data industry event, which many AWS partners in the data ecosystem use as their own vehicle for launches and announcements.
While I am not attending the event this year, I did receive bulletins from a range of data- and analytics-focused companies, each with its own announcements around offerings on the AWS cloud. Most of these announcements were made yesterday, making for a good “roll-up” opportunity today, so I’ll summarize them here.
Lakefront product announcements
First up, data management specialist Okera announced its new Intelligent Schema Management for Amazon S3 Data Lakes. The product creates an all-up schema-based catalog of data sets stored in Amazon’s Simple Storage Service (S3). Many AWS data services, including Elastic MapReduce (EMR) and its Hive, Spark and Presto components, can use S3, as can various third-party BI platforms. As such, the Okera component, which is part of the Okera Active Data Access Platform (ODAP), can essentially create a unified directory of data generated and consumed by all of them, and help assert uniform access controls over that data.
And Okera isn’t the only vendor announcing solutions around S3-based data lakes. Attunity, a well-established change data capture (CDC)-based data pipeline player, announced on Monday its Streaming Data Pipeline Solution for Data Lakes on Amazon Web Services. The solution leverages Apache Spark to ingest and transform data as it enters S3 data lakes from “major enterprise databases, mainframes, and applications such as SAP,” according to the company.
Speaking of enterprise databases, the next announcement comes from EnterpriseDB, a company heavily focused on EDB Postgres Advanced Server, an enterprise-grade version of the open source PostgreSQL database. Specifically, EnterpriseDB has announced Oracle compatibility in the product, and the availability of that feature on its AWS-based Cloud Database Service.
The service is available in the AWS US-East-1 and US-West-1 regions for now, and EnterpriseDB says it plans to expand to other regions worldwide. The Oracle compatibility feature joins an already impressive list of value-added features for EDB Postgres Advanced Server, including point-in-time recovery, load balancing and auto-scaling.
Another database player, this time in the graph database space, is TigerGraph, a vendor I first ran across at September’s Strata Data Conference in New York. At re:Invent yesterday, the company announced it’s making its product available in Database as a Service (DBaaS) form on AWS, as TigerGraph Cloud.
The company says that in addition to the service itself, TigerGraph Cloud includes “out-of-the-box starter kits for quicker application development – for use cases such as Anti-Fraud, Anti-Money Laundering (AML), Customer 360, Enterprise Graph analytics and more.” The combination of those starter kits and TigerGraph’s SQL-like query language leads the company to promote TigerGraph Cloud as “the industry’s first Graph Database-as-a-Service available to everyone and anyone.” Such messaging does raise the question of who is adopting Amazon’s own Neptune managed graph database service.
Beyond databases and lakes
Moving beyond the world of vendors focused on data lakes and databases, video and AI vendor Nvidia announced that Amazon now offers a special P3dn virtual machine type on its Elastic Compute Cloud (EC2) that features the Nvidia V100 32GB GPU, a chip geared towards heavy data science workloads. Also, the Marketplace for Containers announced by Amazon at re:Invent is debuting with six different Nvidia AI containers.
Finally, as good as all these data lake, database and AI wares are, what about some actual data? Never fear, as Bloomberg announced that its flagship real-time market data feed, B-PIPE, is now available on AWS. This is the very same normalized market data obtained through the ubiquitous Bloomberg Terminal, but now it’s available in the cloud, and presumably ready to mash up with all the other products we’ve already discussed here.
Get on the bus
The battle for the cloud is largely a battle for where customers store, process and analyze their data. And given the dominance of AWS in cloud computing overall, it’s no wonder that re:Invent has become a major event for the data industry. Whether it be schema management, data processing pipelines, relational databases, graph databases, AI or major data providers, the AWS cloud now constitutes a sort of data super-hub, and we can expect many more announcements from companies that want to connect to it and add value.