ETL Engine

The ETL Engine handles data ingestion and supports complex data enrichment and KPI creation while the data is being loaded.

The ETL Engine can write data to multiple destinations simultaneously from a single read. Writing to more than one location makes processed data immediately available to operational functions, without having to reprocess it from Hadoop into a more suitable storage type (such as MongoDB).
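The "single read" fan-out pattern described above can be sketched as follows. This is a minimal illustration, not the product's actual API: the `ListSink` class is an in-memory stand-in for real destinations such as HDFS or MongoDB, and all names are assumptions for the sake of the example.

```python
# Minimal sketch of "single read, multiple writes": each record is
# read from the source exactly once and fanned out to every sink.

class ListSink:
    """In-memory stand-in for a real destination (e.g. HDFS, MongoDB)."""
    def __init__(self, name):
        self.name = name
        self.rows = []

    def write(self, record):
        self.rows.append(record)

def single_read_fanout(source, sinks):
    """Iterate the source once, writing each record to all sinks."""
    for record in source:
        for sink in sinks:
            sink.write(record)

# Illustrative destinations: an analytical store and an operational store.
hadoop = ListSink("hdfs")
mongo = ListSink("mongodb")
single_read_fanout([{"id": 1}, {"id": 2}], [hadoop, mongo])
```

Because the source is consumed only once, adding an extra operational sink costs one more write per record rather than a second full read of the data.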

Cardinality offers its ETL technology as a standalone product: for customers who have already begun their Big Data journey, it can be added to an existing data solution for maximum benefit, or it can form part of a new greenfield development.

Key Features

  • Standardises data feed implementation, allowing speedy delivery of new data sets into the Hadoop cluster.
  • Ingests massive amounts of data from multiple sources. Whether incoming data has explicit or implicit structure, it can be rapidly loaded into Hadoop, where it is available for downstream analytic processes.
  • Offloads transformation of raw data through parallel processing at scale.
  • Performs traditional ETL tasks of cleansing, normalising, aligning, and aggregating data for your Enterprise Data Warehouse.
  • Built-in data monitoring and quality algorithms manage enormous volumes of data in real time.
  • A high-performance ETL engine already running at scale, capable of processing more than 40 billion rows per day in real time.
  • Provides a solution for managing and scaling data pipelines, which can otherwise be a daunting task.
  • Capable of enriching data in real time using massive cache lookups, leading to faster and easier data mining and reporting.
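The real-time cache-lookup enrichment mentioned in the last feature can be sketched as below. This is an illustrative example under assumed field names (`country`, `region`), not the engine's real implementation: a reference data set is pre-loaded into an in-memory cache, and each streaming row is joined against it without a per-row database round trip.

```python
# Hedged sketch of enrichment via cache lookup: reference attributes
# are pre-loaded into memory, then merged into each row as it streams
# through. Keys, fields, and cache contents are illustrative only.

REFERENCE_CACHE = {
    "GB": {"region": "EMEA"},
    "US": {"region": "AMER"},
}

def enrich(rows, cache):
    """Yield each row merged with its cached reference attributes."""
    for row in rows:
        extra = cache.get(row.get("country"), {})  # O(1) lookup, no DB call
        yield {**row, **extra}

enriched = list(enrich([{"id": 1, "country": "GB"},
                        {"id": 2, "country": "US"}], REFERENCE_CACHE))
```

At scale the dictionary would be replaced by a distributed cache, but the shape of the operation is the same: enriched rows arrive in the warehouse already joined, which is what makes downstream mining and reporting faster.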