Optimal data testing coverage for a data lake implementation
A multinational financial services corporation operating globally, primarily in banking, cards, and payments, with 50,000+ employees.
Key Challenges
Comprehensive ingestion testing
Exhaustive data distribution testing
Data lineage and metadata validation
The Impact
40% effort reduction in end-to-end validation through automation
15% cost savings by using an existing automation tool
The Solution
Infosys Big Data Testing approach reporting
A data lake ingestion and distribution test strategy designed to ensure comprehensive test coverage, complete requirement coverage, and optimal test data coverage
Abinitio Test Automation Framework for Big Data Testing
Connects to various sources, Data Lake and Data Marts
Supports Data Acquisition Testing and Data Transformation Testing
Can be used when the volume of data is in excess of 5 million
Custom utilities built using JCL, Excel macros, and HiveQL
Deploy accelerators using Extreme automation and custom-made macros
Evaluate techniques for handling unstructured data
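As an illustration of what a data acquisition (ingestion) check can look like, the following is a minimal Python sketch that reconciles source records against records landed in the data lake. It is not the Abinitio framework or the custom utilities described above; the function names, the `txn_id` key, and the sample rows are hypothetical and chosen only to show the idea of comparing keys and per-row fingerprints.

```python
import hashlib
from typing import Dict, Iterable, List

Record = Dict[str, object]


def row_fingerprint(record: Record, columns: List[str]) -> str:
    """Hash the selected columns of one record into a stable fingerprint."""
    joined = "|".join(str(record.get(col)) for col in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(
    source: Iterable[Record],
    target: Iterable[Record],
    key: str,
    columns: List[str],
) -> Dict[str, List[object]]:
    """Compare source and target by key and report the differences."""
    src = {r[key]: row_fingerprint(r, columns) for r in source}
    tgt = {r[key]: row_fingerprint(r, columns) for r in target}
    return {
        # Keys present in the source but never ingested.
        "missing_in_target": sorted(k for k in src if k not in tgt),
        # Keys in the target that have no source counterpart.
        "unexpected_in_target": sorted(k for k in tgt if k not in src),
        # Keys present on both sides whose column values differ.
        "mismatched": sorted(k for k in src if k in tgt and src[k] != tgt[k]),
    }


# Hypothetical sample data, for illustration only.
source_rows = [
    {"txn_id": 1, "amount": "10.00", "currency": "USD"},
    {"txn_id": 2, "amount": "25.50", "currency": "EUR"},
    {"txn_id": 3, "amount": "7.25", "currency": "GBP"},
]
lake_rows = [
    {"txn_id": 1, "amount": "10.00", "currency": "USD"},
    {"txn_id": 2, "amount": "25.05", "currency": "EUR"},
    {"txn_id": 4, "amount": "99.00", "currency": "USD"},
]

report = reconcile(source_rows, lake_rows, key="txn_id", columns=["amount", "currency"])
print(report)
```

At the volumes mentioned above, a comparison like this would normally be pushed down to the platform (for example as HiveQL aggregates) rather than run in memory; the sketch only shows the shape of the check.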
Automation via Abinitio Test Framework & Custom utilities
Automation using ETL Tool
Abinitio, which is used as an ETL tool, was also leveraged for validating the data transformations. This was done by creating parameterized Abinitio graphs.
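The idea of parameterization can be sketched outside Abinitio as well. The Python example below is a stand-in, not an Abinitio graph: one generic validation routine is driven by a list of check definitions, so a new transformation rule means a new parameter entry and no new code. The `TransformationCheck` class, column names, and sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class TransformationCheck:
    """One parameter set: which columns to compare and the expected rule."""
    name: str
    source_column: str
    target_column: str
    rule: Callable[[object], object]


def run_checks(
    source_rows: List[Record],
    target_rows: List[Record],
    key: str,
    checks: List[TransformationCheck],
) -> List[str]:
    """Recompute each rule from the source and compare with the target."""
    target_by_key = {row[key]: row for row in target_rows}
    failures: List[str] = []
    for row in source_rows:
        target = target_by_key.get(row[key])
        if target is None:
            failures.append(f"{row[key]}: missing in target")
            continue
        for check in checks:
            expected = check.rule(row[check.source_column])
            actual = target[check.target_column]
            if expected != actual:
                failures.append(
                    f"{row[key]}: {check.name} expected {expected!r}, got {actual!r}"
                )
    return failures


# Hypothetical rules and data, for illustration only.
checks = [
    TransformationCheck("uppercase currency", "currency", "currency_code", str.upper),
    TransformationCheck(
        "amount to cents", "amount", "amount_cents", lambda v: round(float(v) * 100)
    ),
]
source_rows = [
    {"txn_id": 1, "currency": "usd", "amount": "10.00"},
    {"txn_id": 2, "currency": "eur", "amount": "25.50"},
]
mart_rows = [
    {"txn_id": 1, "currency_code": "USD", "amount_cents": 1000},
    {"txn_id": 2, "currency_code": "EUR", "amount_cents": 2505},
]

failures = run_checks(source_rows, mart_rows, key="txn_id", checks=checks)
for failure in failures:
    print(failure)
```

Keeping the rules in data and the comparison logic in one place is what lets the same routine be reused across feeds, which is the benefit the parameterized approach is aiming for.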