Data.world connects data and people. That was the gist of our coverage of data.world’s release of its enterprise platform about a year ago. Data.world has been doing that with great success, and today it’s bringing in Capsenta to take things to the next level.
In a nutshell, data.world uses the superpowers of Knowledge Graph (aka Linked Data, aka Semantic Web) technology to integrate datasets and provide a data management and collaboration platform for the enterprise. Capsenta has patented technology that helps integrate data sources (predominantly relational) and make them accessible as knowledge graphs, be it on premises or in the cloud.
The two companies have been working together for a while, and they match not just technologically but also culturally. ZDNet spoke with data.world CEO and co-founder Brett Hurt and with Capsenta founder Juan Sequeda, who will serve as Principal Scientist at data.world.
Integrating data via virtual knowledge graphs
Hurt mentioned that the recent wave of acquisitions is a testament to the fact that data and analytics are becoming a core element for enterprises. He went on to add that data.world’s offering has been really popular with enterprises, including customers such as a Big Four professional services firm and a Top-10 American investment bank.
Some of these customers are in highly regulated industries, such as healthcare or finance, and sometimes this poses a challenge. Data.world runs a cloud service which can ingest and catalog data, metadata, or both. In regulated industries, moving data to the cloud may not be an option. This is where Capsenta’s Ultrawrap™ data integration software comes in.
Sequeda, who has been working on knowledge graph virtualization for over a decade, knows very well both the strengths and the weaknesses of this technology. Knowledge graphs, and the SPARQL query language, are ideal for data integration and cataloging scenarios. The problem, as Sequeda put it, is that in some ways SPARQL tried to reinvent the wheel:
“We have relational technology with over 30 years of accumulated experience, why not reuse this?” Capsenta’s technology therefore acts as a bridge between SPARQL and relational data sources. Queries are formulated in SPARQL, which can span many different data sources, and are then translated into SQL and executed as SQL wherever the underlying sources are relational.
This way, data stored in relational databases on premises can stay where it is, while ingested metadata makes it part of a knowledge graph spanning many data sources, both on premises and in the cloud. Sequeda noted that Capsenta’s solution adds negligible overhead, effectively making the execution time of SPARQL in the cloud equal to that of SQL on premises.
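To make the idea of translating SPARQL into SQL more concrete, here is a minimal sketch in Python. It is not Capsenta’s patented method or Ultrawrap’s API. The `MAPPING` table, the `ex:` predicates, the `person` table and the `translate` function are all hypothetical names invented for illustration, and the sketch handles only triple patterns that share one subject and map to one table.

```python
import sqlite3

# Hypothetical mapping (an assumption for this sketch): each predicate maps
# to a (table, column) pair. Every subject is assumed to be a row keyed by "id".
MAPPING = {
    "ex:name": ("person", "name"),
    "ex:city": ("person", "city"),
}


def translate(select_vars, patterns):
    """Translate triple patterns that share one subject into a SQL query.

    Returns (sql, params). Joins across tables are out of scope here.
    """
    tables = {MAPPING[predicate][0] for _, predicate, _ in patterns}
    subjects = {subject for subject, _, _ in patterns}
    if len(tables) != 1 or len(subjects) != 1:
        raise ValueError("this sketch handles one subject over one table only")
    table = tables.pop()
    subject = subjects.pop()

    var_columns = {subject: "id"}  # SPARQL variable -> SQL column
    conditions, params = [], []
    for _, predicate, obj in patterns:
        column = MAPPING[predicate][1]
        if obj.startswith("?"):
            var_columns[obj] = column          # variable: project the column
        else:
            conditions.append(f"{column} = ?")  # constant: filter on it
            params.append(obj)

    columns = ", ".join(f"{var_columns[v]} AS {v[1:]}" for v in select_vars)
    sql = f"SELECT {columns} FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


def run_example():
    # The relational data stays in its own database (in-memory SQLite here).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO person (name, city) VALUES (?, ?)",
        [("Ada", "Austin"), ("Bo", "Boston")],
    )
    # Roughly: SELECT ?name WHERE { ?p ex:name ?name . ?p ex:city "Austin" }
    sql, params = translate(
        ["?name"],
        [("?p", "ex:name", "?name"), ("?p", "ex:city", "Austin")],
    )
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return sql, rows
```

Calling `run_example()` returns the generated SQL string together with the matching rows. The point of the design is that the graph-side query never copies the data. Only the mapping (the metadata) is needed to rewrite the query, and the relational engine does the actual work.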
As Hurt mentioned, data.world has been working with Capsenta for about a year and a half already. Initially Capsenta was a data.world partner, but as enterprise customers multiplied, so did the use cases for Capsenta’s technology. This, in turn, made joining forces the next step, and as both Hurt and Sequeda said, “it feels like we should have been working together all along.”
Building knowledge graphs via consumer grade UI
This means that there’s not much integration left to be done, because for the most part it’s already there. But virtual knowledge graphs are not the only thing Capsenta brings to the table. Data.world is vocal about its technology stack, but at the same time wants to make it invisible to end users. What it calls a consumer-grade user interface (UI) is a key part of this, and Capsenta’s Gra.fo is a perfect match.
Gra.fo is a visual knowledge graph editor that Capsenta started working on in stealth mode about three years ago. Building knowledge graphs is not simple, partly because of the complexity of the underlying data models. Although there are a few visual tools for building knowledge graphs on the market, Sequeda felt that something was missing.
What was missing was a tool that would lower the bar for building knowledge graphs, opening the task up from experts to business users. Another aspect that Sequeda felt was missing was collaboration, and as he went on to add, this is something that is central to data.world as well:
“People can comment, follow others, share workspaces. The people aspect of data is very important. Most tools are very technical, not suitable for end users. We want to be people-centric. Nobody knew about Gra.fo, and when people saw it, it blew their minds.” Gra.fo is not 100 percent production ready at this point, but it’s coming along quite well.
Speaking of being people-centric, there is a community take on the data.world-Capsenta story too. Both companies are based in Austin, Texas, and share investors and ties to the University of Texas at Austin, which is where Sequeda’s research started.
Terms of the deal were not disclosed, and we don’t expect this one to draw as much attention as the recent mega-deals did. However, it’s not that often that we see acquisitions where the alignment is evident on so many levels.
It seems data.world has found a way to deal with the shortcomings of the technology it’s building on, which happens to be in the spotlight lately. The acquisition of Capsenta further enhances data.world’s stack and adds to its talent.