Machine Learning Data Catalogs, Q2 2018
In our 29-criteria evaluation of machine learning data catalog (MLDC) providers, we identified the 12 most significant ones (Alation, Cambridge Semantics, Cloudera, Collibra, Hortonworks, IBM, Infogix, Informatica, Oracle, Reltio, Unifi Software, and Waterline Data) and researched, analyzed, and scored them. This report shows how each provider measures up and helps enterprise architecture (EA) professionals make the right choice.
MLDCs Are The Stepping Stone For The Intelligent Business
The four V’s of big data (i.e., volume, variety, velocity, and veracity) may be a cliché, but firms are still struggling under the weight of their data: 36% to 38% of global data and analytics decision makers reported that their structured, semistructured, and unstructured data each totaled 1,000 TB or more in 2017, up from only 10% to 14% in 2016. And the growth of data is outpacing organizations’ ability to get value from it. The two biggest challenges our respondents reported in using systems of insight were 1) merging the existing business processes used to source data, analyze it, and implement insights and 2) sourcing, gathering, managing, and governing the data as it grows.
For EA professionals, relying on people and manual processes to provision, manage, and govern data simply does not scale. Enterprises are waking up to this fact and turning to data catalogs to democratize access to data, draw on tribal data knowledge to curate information, apply data policies, and activate all data for business value quickly. Data catalog investment links to: