Informatica’s Cloud Data Governance and Catalog is the latest addition to the company’s data management as-a-service portfolio. It’s a relaunch of an older set of services that have been rebuilt from the ground up as a unified, cloud-native offering. It knits together data catalog and data governance capabilities, which were formerly integrated but offered separately, into a single converged offering built on a modern microservices architecture.
For the most part, the Cloud Data Governance and Catalog service refactors capabilities that Informatica already offered under its Axon and Enterprise Data Catalog tools, and adds some key new capabilities on top. For instance, Informatica has long had a business glossary that began life as a standalone tool, but it is now incorporated into business metadata.
As noted above, aside from the refactoring from monolithic software to microservices, the highlight is that business and technical metadata are better integrated: they are accessed through the same pane of glass, and linkages between them are established where appropriate. Functionally, data discovery and data governance are now offered under the same umbrella. So, you can search for a business term such as “customer” and then click through to view the associated data model and lineage information. You can then assign governance workflows to it and check how the data is being consumed. This was possible previously, but only by using separate applications on-premises, or separate services from the Informatica Cloud.
Underneath the hood, business/organizational metadata is enriched with a knowledge graph connecting business metadata and organizational ownership, while technical metadata is automatically classified, profiled, and assessed for data quality. Machine learning is used for enriching data lineage, such as making inferences where necessary about specific data sources, which is useful for establishing data provenance.
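To make the idea of a metadata knowledge graph concrete, here is a minimal sketch of how a business term could be linked to its owners, its technical assets, and their upstream lineage. It is an illustration under our own assumptions, not Informatica’s implementation or API: the node names, the edge labels (`described_by`, `owned_by`, `derived_from`), and the functions are all hypothetical.

```python
from collections import deque

# Hypothetical metadata graph. Every node name and edge label is made up
# for illustration; keys are (node, edge label), values are target nodes.
EDGES = {
    ("term:customer", "described_by"): ["table:crm.customers"],
    ("term:customer", "owned_by"): ["team:sales-ops"],
    ("table:crm.customers", "derived_from"): ["table:staging.customers_raw"],
    ("table:staging.customers_raw", "derived_from"): ["file:s3/customers.csv"],
}


def neighbors(node, label):
    """Return the nodes reachable from `node` over edges with this label."""
    return EDGES.get((node, label), [])


def upstream_lineage(asset):
    """Breadth-first walk over derived_from edges, in visit order."""
    seen, order, queue = {asset}, [], deque([asset])
    while queue:
        current = queue.popleft()
        for parent in neighbors(current, "derived_from"):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def describe_term(term):
    """Gather owners, technical assets, and lineage for one business term."""
    assets = neighbors(term, "described_by")
    return {
        "term": term,
        "owners": neighbors(term, "owned_by"),
        "assets": assets,
        "lineage": {asset: upstream_lineage(asset) for asset in assets},
    }
```

Calling `describe_term("term:customer")` on this toy graph returns the owning team, the one table that describes the term, and the two upstream sources that table is derived from, which is roughly the click-through path from business term to lineage described above.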
Another example of the convergence of business and technical metadata is the search capability: previously, there were separate search screens for each. A new feature in this release is what Informatica bills as “natural language-like” querying; beyond basic word-based searches, it can also build queries that are formed in the context of business terms and enable searches for related assets.
Informatica is stretching outside its lane by expanding the scope of governance beyond data to AI models. The rationale is that governance is not only about how data is managed but also about how it is used, thereby closing the loop on whether the data used to feed AI models is the right data. Big on Data bro Andrew Brust provides the deeper dive on Informatica’s AI model management foray.
As Informatica is typically used by data architects, data engineers, and data stewards, Cloud Data Governance customers are likely to face blowback from data scientists, who tend to favor their own tools for model governance, not to mention the fact that a growing number of cloud AI services are expanding their capabilities for model bias detection. Informatica is realistic that it’s not about to replace the tools used by data scientists, and is positioning this as a common meeting place where the stakeholders can collaborate. It does address a widening gap in data governance as use of ML models grows. Nonetheless, our take is that for Informatica, adding governance of AI models will be a lift that is both aspirational and heavy.
With compliance being a key driver of data governance, Informatica’s Cloud Data Governance also enables customers to build standard workflows that can be used for identifying, tagging and classifying, isolating, and where necessary, deprecating or concealing private or sensitive data.
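As a rough illustration of what an identify, tag, and conceal workflow involves, the sketch below classifies columns with pattern rules and masks any column that receives a tag. It is a simplification under our own assumptions and not how Informatica’s service works: the rule set, the 80% match threshold, and the function names are all hypothetical, and production classifiers rely on far richer detection than two regular expressions.

```python
import re

# Hypothetical classification rules, for illustration only.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(values, threshold=0.8):
    """Return tags whose pattern matches at least `threshold` of non-empty values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return []
    tags = []
    for tag, pattern in RULES.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            tags.append(tag)
    return tags


def mask(value):
    """Conceal all but the last four characters of a value."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def apply_policy(table):
    """Tag every column, then mask the values of any column that got a tag."""
    tags = {name: classify_column(vals) for name, vals in table.items()}
    masked = {
        name: [mask(v) for v in vals] if tags[name] else list(vals)
        for name, vals in table.items()
    }
    return tags, masked
```

Here `table` is assumed to be a plain dictionary mapping column names to lists of string values; columns that match no rule pass through unchanged.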
A few months back, we reviewed Informatica’s journey to the cloud. It’s been an evolutionary one that started with, essentially, hosting existing on-premises software with the traditional monolithic architecture for offerings such as PowerCenter or Master Data Management.
Fast forward to the present, and the portfolio is delivered as microservices, billed by consumption. So, you don’t have to fire up Axon or Enterprise Data Catalog separately, and you pay only when you actually ingest and/or autogenerate metadata, store it, and build a governance workflow. And, while Informatica’s billing units are not the easiest to decipher (as units for various services are weighted differently), customers can use those credits to mix and match services from across its cloud SaaS portfolio, such as Cloud Data Integration, Application Integration, API Management, Data Quality, B2B Gateway, and so on.
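The arithmetic behind weighted consumption units can be sketched in a few lines. The weights and service keys below are invented purely for illustration and do not reflect Informatica’s actual rates; the point is only that one pool of credits is drawn down at a different rate by each service.

```python
# Hypothetical weights: credits charged per unit of usage for each service.
# These numbers are made up and are not Informatica's actual rates.
WEIGHTS = {
    "metadata_scan": 0.5,
    "data_quality": 0.25,
    "cloud_data_integration": 1.0,
}


def units_consumed(usage):
    """Sum weighted usage across services; raises KeyError for an unknown service."""
    return sum(WEIGHTS[service] * amount for service, amount in usage.items())


def remaining(credits, usage):
    """Credits left in the shared pool after the given usage is charged."""
    return credits - units_consumed(usage)
```

With these made-up weights, 10 units of metadata scanning, 8 of data quality, and 3 of data integration consume 5 + 2 + 3 = 10 credits, leaving 90 of a 100-credit pool.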
Informatica Cloud Data Governance and Catalog will be generally available at the end of July 2021.