Wednesday, April 24, 2019

ETL : Testing Basics

ETL : Testing Basics 

 SQL types:
  • DML (Data Manuipulation) - Insert , update , Delete , Merge
  • DDL (Data Definition)-Create , alter,Drop,Truncate
  • DRL (Data Retrival )- Select
  • TCL (Transaction Lan)- Commit Rollback
  • DCL (Data Control)- grant ,revoke
Data WareHouse :
  1. It is a DataBase
  2. Contains data for Analysis
OLTP (Online Transaction Processing System
  •  Used in Day to Day Transaction
  • Ex : ATM , Net Banking 
  • Data is Captured into either DB , Flat File (xml) etc.,
DataMart:
  • It is a smaller part of DataWarehouse
  • It is specific to a department eg: Finance , HR etc.,
  •  It is used by middle level management to retrieve data wrt to their deptartment.
ETL ( Extract , Transform ,Load):
  • Process of read data from OLTP and  inserting into Data Warehouse
  • The Data from Data Warehouse is used in Analytics report
Ex: Company A has following departments Eg: Sales , Services department. 
Assume Sales departments has records of customer transaction in DB and Services has transaction in an csv file or flat file ( service number , service mechnaic name ,amount paid , spares replaced ,Date etc.,)
If management wants to find out the total transaction of a customer both sales and services how will they do ?
So they need to extract only relavent information from both data sources and load it into a common Destination .This is called ETL.
Hetrogenous Data : Data coming from different Data Sources like Flat file , Orcale DB,  Mongo, My Sql  ,txt,json etc.,
Mapping Document ( http://dwbitips.blogspot.com/2012/12/what-is-etl-mapping-document-real-time.html ) :
  1. Requirement Document
  2. Contains src , dest table and Column Details
  3.  Conatins transformation query or forumula applied for each column

ETL testing = Data Integration Testing + BI testing (Report)
Generic Data Integration Testing : 
  1. Scheme match wrt requirement
  2. Count match with src and destination
  3. Duplicate records (loaded twice), 
  4. Null Validation
  5. Untransformed Data should match source and destination data (1:1).
  6. Mapping is correct ie., data is going to correct cols as per requirment
  7. Old data already residing is not affected

Late Arriving Dimension or Early Arriving Fact

  • Happens when you get fact data before the dimension data arrives 
  • example:  New employee just onboarded and had an accident before insurance forms could be completed and processed by the insurance company. The hospital will create a medical claim record to be paid by the insurance company, but the insurance company does not yet have a person to associate the claim.
  •  -1 = UNKNOWN, -2 = N/A, -3 = Not Provided etc., 



 Ref:
https://www.softwaretestinghelp.com/etl-testing-data-warehouse-testing/
https://www.guru99.com/utlimate-guide-etl-datawarehouse-testing.html
https://www.tutorialspoint.com/etl_testing/index.htm

No comments:

Post a Comment