Date post: 03-Jun-2018
Upload: shravan-kumar
8/11/2019 Auto Test Py
Automated ETL Testing with Py.Test
Kevin A. Smith
Senior QA Automation Engineer
Cambia Health Solutions
Agenda
Overview
Testing Data Quality
Design for Automation & Testability
Python and Py.Test
Examples
Database Applications
Taking data out of an OnLine Transaction Processing (OLTP) system and putting it into an OnLine Analytical Processing (OLAP) system involves Extracting the data, Transforming the data and then Loading that data into another database (ETL).
When Testing an ETL Application:
Extract
Transform
Compare
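The extract, transform, compare steps above can be sketched as a single test. This is a minimal illustration under stated assumptions, not the presenter's code: the row layout, the `expected_transform` rule (cents to dollars) and the sample data are all hypothetical.

```python
# Hypothetical source (OLTP) rows: (claim_id, amount_in_cents)
source_rows = [(1, 1250), (2, 300), (3, 99)]

# Hypothetical target (OLAP) rows as loaded by the ETL job:
# (claim_id, amount_in_dollars)
target_rows = [(1, 12.50), (2, 3.00), (3, 0.99)]


def expected_transform(row):
    """Apply the business rule the ETL job is assumed to implement."""
    claim_id, cents = row
    return (claim_id, round(cents / 100, 2))


def test_extract_transform_compare():
    # Extract: in a real test these rows would come from database queries.
    extracted = source_rows
    # Transform: compute what the target should contain.
    expected = sorted(expected_transform(row) for row in extracted)
    # Compare: the loaded target must match the expectation.
    assert expected == sorted(target_rows)
```

The test never calls the ETL application's own transformation code; it recomputes the expected result independently, so a bug in the application cannot hide behind the same bug in the test.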
Cambia EVR Application
Testing Data Quality
Data Completeness
Ensures that all expected data is loaded.
Data Integrity
Ensures that the ETL application rejects, substitutes default values or corrects and reports invalid data.
Data Transformation
Ensures that all data is transformed correctly according to business rules and/or design specifications.
Testing Techniques
Stare & Compare
Validate data transformations manually.
This step is usually required to bootstrap an ETL test automation project.
Testing Techniques
Golden Files
Use well-known test data and golden file comparison as a testing oracle.
This technique is very powerful for automated testing of printed output.
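A golden file comparison can be sketched with the standard library alone. The helper names `compare_to_golden` and `check_report` are hypothetical, and the sketch assumes the output under test is available as text; it is one possible shape for such a check, not the presenter's implementation.

```python
import difflib
from pathlib import Path


def compare_to_golden(actual_text, golden_path):
    """Return a list of unified-diff lines; an empty list means a match."""
    golden_text = Path(golden_path).read_text()
    diff = difflib.unified_diff(
        golden_text.splitlines(),
        actual_text.splitlines(),
        fromfile="golden",
        tofile="actual",
        lineterm="",
    )
    return list(diff)


def check_report(actual_text, golden_path):
    """Assert-style wrapper suitable for calling from a pytest test."""
    diff = compare_to_golden(actual_text, golden_path)
    assert not diff, "Output differs from golden file:\n" + "\n".join(diff)
```

Putting the diff into the assertion message means a failing run shows exactly which lines changed, which matters when the golden file is a long report.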
Testing Techniques
Self-Verifying
Oracle in Test Scripts model.
Necessary if you want to test any aspects of the ETL application running in production.
Design for Automation
Control
How well the application can be controlled from the test tools.
Visibility
How well intermediate data and results are visible to the test tools.
Design for Testability - Visibility
Test Tools - Rules of Thumb
1. Do not re-invent the wheel.
2. No test tool will do everything you need - customize.
3. No one test tool will solve all of your test problems - use a tool box.
4. Do not expect your business experts or developers to be able to create great tests, even with tools.
5. Do not use one-off technology for testing.
6. Do not use the built-in test module of your ETL development tool.
Tool Requirements
Support Customization
Support Source to Target Data Mapping
Support Complex Logical Calculations
Support database connections
Support CSV and XML
Existing Tool
Customizable
Leverage Existing Knowledge
Multi-OS (AIX, Windows)
Python and Py.Test
Support Oracle and Sybase databases with 3rd-party libraries: PyODBC, cx_Oracle
Native support for CSV files and XML
Strong support for containers (Tuple, List, Dict)
Easy learning curve for non-programmers
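As a small illustration of the built-in XML support and the container types listed above, the sketch below parses an XML extract into a dict of tuples. The XML layout, element names and the `load_claims` helper are hypothetical; only `xml.etree.ElementTree` from the standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML extract; element and attribute names are made up.
xml_text = """
<claims>
  <claim id="1001" line="1"><amount>12.50</amount></claim>
  <claim id="1001" line="2"><amount>3.00</amount></claim>
</claims>
"""


def load_claims(text):
    """Parse the XML into a dict keyed by 'id_line'."""
    rows = {}
    for claim in ET.fromstring(text.strip()).iter("claim"):
        key = claim.get("id") + "_" + claim.get("line")
        # Store each row as a tuple, similar to a database cursor row.
        rows[key] = (claim.get("id"), claim.get("line"),
                     claim.findtext("amount"))
    return rows


claims = load_claims(xml_text)
```

All values come back as strings, so a real comparison against database rows would need to convert types on one side or the other.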
What is Py.Test?
Searches Disk for Tests
Sequences and Executes Tests
Captures Output
Captures Exceptions
Reports Results
Interfaces to Extend/Customize Behavior
Command Line Processing
Test Search/Sequencing/Selecting
Test Handling (Fixtures)
Reporting
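To make the search and execute steps concrete, here is a minimal test module. It assumes pytest's default discovery rules (files named `test_*.py` or `*_test.py`, functions prefixed `test_`); the file name, helper and data are hypothetical.

```python
# Contents of a hypothetical file named test_row_counts.py.
# With default settings, running `pytest` in this directory collects every
# function whose name starts with `test_` and reports each one separately.


def count_rows(rows):
    """Helper under test; not collected because it lacks the test_ prefix."""
    return len(rows)


def test_row_count_matches():
    source = [("a", 1), ("b", 2)]
    target = [("a", 1), ("b", 2)]
    # A plain assert is enough; pytest reports the compared values on failure.
    assert count_rows(source) == count_rows(target)


def test_no_empty_load():
    target = [("a", 1)]
    assert count_rows(target) > 0
```

No test classes or special assertion methods are required, which helps keep the learning curve low for team members who are not programmers.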
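Database Support
import cx_Oracle

conn = cx_Oracle.connect(user_name, password, server_name)
crsr = conn.cursor()
query_string = ...  # query text is not shown on the slide
crsr.execute(query_string)
results = {}
for row in crsr.fetchall():
    # build a composite key from the first two columns
    key = str(row[0]) + '_' + str(row[1])
    # assume the target row is missing until it is loaded
    results[key] = {'source': row,
                    'target': ('Missing',)}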
CSV File Support
import csv
csv_data = csv.reader(open('data.csv', newline=''),
                      delimiter='|')
results = {}
# iterate over the reader directly; it is not callable
for row in csv_data:
    key = str(row[0]) + '_' + str(row[1])
    results[key] = {'source': row,
                    'target': ('Missing',)}
Row Comparison
for value in results.values():
    assert value['source'] == value['target']
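The loop above stops at the first failing row. A variation that collects every mismatch before asserting can make a failing run easier to diagnose. The function name `find_mismatches` and the sample data are hypothetical.

```python
def find_mismatches(results):
    """Return {key: (source_row, target_row)} for every pair that differs."""
    return {
        key: (value['source'], value['target'])
        for key, value in results.items()
        if value['source'] != value['target']
    }


# Hypothetical in-memory results, shaped like the dict built on the slides.
results = {
    '1001_1': {'source': ('1001', '1', 'A'), 'target': ('1001', '1', 'A')},
    '1001_2': {'source': ('1001', '2', 'B'), 'target': ('Missing',)},
}

mismatches = find_mismatches(results)
for key, (source_row, target_row) in sorted(mismatches.items()):
    print(key, 'source:', source_row, 'target:', target_row)

# In a test this would end with:
#     assert not mismatches, f'{len(mismatches)} rows differ'
```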
Test Patterns
Database Schema
Row Counting
Simple Source to Target Mapping
Complex Source to Target Mapping
Database Schema
table_names = ('OUTPUT_CD_TRNSLTN', 'OUTPUT_DRAG_DT',
               'OUTPUT_NTWK', 'OUTPUT_PH_NUM')

def test_dev_schema():
    """
    Test the development database.
    """
    schemas = []
    crsr = Database.get_cursor('DEV')
    for table in table_names:
        schemas.append(get_table_dict(crsr, 'dev', table, out_dir, base_dir))
    crsr.close()
    generic_schema_compare(schemas, 'Development')
Database Schema (cont'd)
def generic_schema_compare(results, title):
    """
    Generic table comparison test.
    """
    test_rslt = True
    for schema in results:
        if schema['source'] != schema['target']:
            # report every differing table before failing the test
            schema['source'].show_diffs(schema['target'])
            test_rslt = False
    assert test_rslt, title + ' schema differences'
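The helpers `get_table_dict` and `show_diffs` belong to the presenter's project and their code is not shown. As a rough stand-in, the sketch below compares column definitions using SQLite from the standard library instead of Oracle; `get_columns` and `schema_diffs` are hypothetical names, and `PRAGMA table_info` is SQLite-specific.

```python
import sqlite3


def get_columns(conn, table):
    """Return {column_name: declared_type} for one table (SQLite-specific)."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated directly, so only pass trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {name: col_type for _cid, name, col_type, *_rest in rows}


def schema_diffs(source_cols, target_cols):
    """List human-readable differences between two column dicts."""
    diffs = []
    for name in sorted(set(source_cols) | set(target_cols)):
        if source_cols.get(name) != target_cols.get(name):
            diffs.append(f"{name}: source={source_cols.get(name)} "
                         f"target={target_cols.get(name)}")
    return diffs


# Two in-memory databases stand in for the baseline and the database under test.
baseline = sqlite3.connect(":memory:")
baseline.execute("CREATE TABLE OUTPUT_NTWK (ID INTEGER, NAME TEXT)")
under_test = sqlite3.connect(":memory:")
under_test.execute(
    "CREATE TABLE OUTPUT_NTWK (ID INTEGER, NAME TEXT, ADDED_DT TEXT)")

diffs = schema_diffs(get_columns(baseline, "OUTPUT_NTWK"),
                     get_columns(under_test, "OUTPUT_NTWK"))
```

A test would then assert that `diffs` is empty and include the list in the failure message.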
Row Counting
crsr.execute("""
    Select count(*)
    From FEP_PMT.FEP_CLM
    Where FDS_BAT_ID = :arg_1 and
          DISP_CD in ('1','2','9') and
          AMT_PAID < 0""",
    arg_1 = fds_bat_id)
# count(*) returns exactly one row
actual = crsr.fetchone()[0]
assert actual == 0, 'Negative claims found, invalid incoming data'
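The same row counting check can be tried without an Oracle connection. The sketch below swaps in an in-memory SQLite database; the simplified `FEP_CLM` table, the sample rows and the helper name `count_negative_claims` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FEP_CLM (FDS_BAT_ID INTEGER, DISP_CD TEXT, AMT_PAID REAL)")
conn.executemany(
    "INSERT INTO FEP_CLM VALUES (?, ?, ?)",
    [(42, '1', 10.0), (42, '2', 0.0), (42, '9', 5.5), (7, '1', -3.0)],
)


def count_negative_claims(conn, fds_bat_id):
    """Count claims in one batch with a negative paid amount."""
    crsr = conn.execute(
        """
        SELECT count(*)
        FROM FEP_CLM
        WHERE FDS_BAT_ID = :arg_1
          AND DISP_CD IN ('1', '2', '9')
          AND AMT_PAID < 0
        """,
        {"arg_1": fds_bat_id},
    )
    # count(*) always returns exactly one row, so fetchone() is sufficient.
    return crsr.fetchone()[0]


def test_no_negative_claims():
    assert count_negative_claims(conn, 42) == 0, \
        'Negative claims found, invalid incoming data'
```

Batch 7 in the sample data contains one negative claim, so the same check run against that batch would fail.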
Complex Source to Target
test_result = True
for key, val in get_claim_lines.items():
    expected_contract_adj_amt = 0
    # calculate the expected contractual adjustment amount
    # walk the fields by field name
    for i in range(1, 6):
        # calculate the base name of this hag "row"
        hag_base_name = 'HAG' + str(i) + '_ADJ_'
        if val[hag_base_name + 'CDE'] == 'CO':
            if (val[hag_base_name + 'RSN1'] != '23' and
                    val[hag_base_name + 'RSN1'] != '171'):
                expected_contract_adj_amt += val[hag_base_name + 'AMT1']
            if (val[hag_base_name + 'RSN2'] != '23' and
                    val[hag_base_name + 'RSN2'] != '171'):
                expected_contract_adj_amt += val[hag_base_name + 'AMT2']
    # now compare the calculation to the amount retrieved from the table
    if round(val['CNTRCTL_ADJSTMT_AMT'], 4) != round(expected_contract_adj_amt, 4):
        print('Claim_trans_disp_line: ' + key + ' did not calculate correctly.')
        print('Actual:', round(val['CNTRCTL_ADJSTMT_AMT'], 4),
              'Expected:', round(expected_contract_adj_amt, 4))
        print()
        test_result = False
assert test_result, 'Incorrect contractual adjustment calculations'
In-memory Data Representation
key = str(row[0]) + '_' + str(row[1])
results = {}  # create a dict to hold in-memory
              # tables of source and target data
results = {'key 1': {'source': row,
                     'target': row},
           'key 2': {'source': row,
                     'target': row,
                     'source 1': row,
                     'source 2': row},
           'key 3': {'source': row,
                     'target': ('Missing',)},
           'key 4': {'source': ('Added',),
                     'target': row}}
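One way to populate this structure from two lists of rows is sketched below. The function names and sample rows are hypothetical; the `('Missing',)` and `('Added',)` markers follow the representation shown above.

```python
def make_key(row):
    """Build the composite key from the first two columns."""
    return str(row[0]) + '_' + str(row[1])


def build_results(source_rows, target_rows):
    """Pair source and target rows by key, marking rows absent on either side."""
    results = {}
    for row in source_rows:
        # Assume the target row is missing until one with the same key is seen.
        results[make_key(row)] = {'source': row, 'target': ('Missing',)}
    for row in target_rows:
        key = make_key(row)
        if key in results:
            results[key]['target'] = row
        else:
            # Present in the target only: the ETL job added a row.
            results[key] = {'source': ('Added',), 'target': row}
    return results


# Hypothetical sample rows: (claim_id, line_number, amount)
source_rows = [(1001, 1, 12.5), (1001, 2, 3.0)]
target_rows = [(1001, 1, 12.5), (1002, 1, 7.0)]
results = build_results(source_rows, target_rows)
```

Because the markers are one-element tuples, they never compare equal to a real row, so a simple equality check flags both missing and added rows.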
Customized Test Output
Customizations
Shared Database Connection Pool
Database connection parameters, including obfuscated login information
INI-file Processing
File directories for XML, CSV, baseline and output logging files.
Default values for command line options, such as logical database name mapping
Command Line Option Processing
Batch ID
Database Names
Standard Test Routines
Source to Target Mapping
Database Schema Testing
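The INI-file and command line customizations listed above might be wired together roughly as follows. This is a sketch under stated assumptions: the INI section and option names, and the `--batch-id` and `--database` option names, are invented for illustration. `pytest_addoption` and `parser.addoption` are standard pytest hooks, normally placed in a `conftest.py` file.

```python
import configparser

# Hypothetical INI content; section and option names are illustrative only.
INI_TEXT = """
[directories]
csv_dir = data/csv
baseline_dir = data/baseline

[defaults]
database = DEV
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)


def pytest_addoption(parser):
    """pytest hook that registers custom command line options."""
    parser.addoption("--batch-id", action="store", default=None,
                     help="batch ID to test")
    # The INI file supplies the default when the option is not given.
    parser.addoption("--database", action="store",
                     default=config.get("defaults", "database"),
                     help="logical database name")
```

Inside a test or fixture, the chosen values can then be read with `request.config.getoption("--batch-id")`.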
Team
James Bass UTi
William Buse Cambia Health Solutions
Matthew Pierce Cambia Health Solutions
Venkatesh Marada Cambia Health Solutions
Kanthi Kondreddi Cambia Health Solutions
Bhargavi Kanakamedala Cambia Health Solutions
Tim Rilling Cambia Health Solutions
Gordon Krenn Cambia Health Solutions
Tim Peterson Cambia Health Solutions
Upcoming Work
Detailed XML File Tests
Test Results Load Directly to Rally.
Golden-file Comparison with Definable Filtering
Golden File Comparison for PostScript
References
Python
http://www.python.org/
http://en.wikipedia.org/wiki/Python_(programming_language)
Py.Test
http://www.pytest.org/
Oracle Python Library
http://cx-oracle.sourceforge.net/html/
Python ODBC Library
https://code.google.com/p/pyodbc/
Companion paper
http://tinyurl.com/kofo3rv/