8/22/2019 Data Warehouse overview.ppt
1/6
Data Warehouse Overview
DW Data Sources
Existing Reports
Ad Hoc Reporting
Tools Demo:
Web Report Studio
Enterprise Guide
ETL Studio
Data Warehouse Data Sources
Data Source                         Years             Comments
Good Records                        2000 to 2004      2005 is currently being loaded and will be available
                                                      by the end of October.
WI W2 (electronically filed only)   2000 to 2004      2005 is currently being loaded and will be available
                                                      by the end of October.
IMF / IRTF                          2000 to 2004      2005 is currently being loaded and will be available
                                                      by the end of October.
IRMF                                2000 to 2003      2004 is currently being loaded and will be available
                                                      by the end of October.
DOT Drivers License Info            Quarterly         Scheduled to begin November/December 2006. Each feed
                                                      is a resend of DOT's whole database. We load only new
                                                      and changed records.
DOT Vehicle Title File              Monthly           Scheduled to begin November/December 2006. The monthly
                                                      feed includes only new auto sales for the given month.
DOT Plate File                      Quarterly         Scheduled to begin November/December 2006. This file
                                                      contains plate and address for the whole DOT database.
                                                      We will load only new and changed records.
Next Phase                          Business Tax DW   Integration is scheduled to bring the individual and
                                                      corporate tax DW together in one environment.
                                                      Development is scheduled to begin January 2007.
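Loading only new and changed records from a full resend can be sketched as below. This is a minimal illustration, not the actual ETL Studio job: the field names (license_no, name, address), the sample rows, and the fingerprint approach are all assumptions made for the example.

```python
import hashlib


def row_fingerprint(row):
    """Stable hash of a record's contents, independent of key order."""
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def select_new_and_changed(resend, loaded_fingerprints, key_field):
    """Split a full resend into (new, changed) records.

    loaded_fingerprints maps each business key to the fingerprint of the
    version already in the warehouse. Unchanged records are skipped.
    """
    new, changed = [], []
    for row in resend:
        key = row[key_field]
        if key not in loaded_fingerprints:
            new.append(row)
        elif loaded_fingerprints[key] != row_fingerprint(row):
            changed.append(row)
    return new, changed


# Hypothetical records already loaded from the previous quarterly feed.
previous = [
    {"license_no": "A100", "name": "J SMITH", "address": "1 MAIN ST"},
    {"license_no": "B200", "name": "K JONES", "address": "9 OAK AVE"},
]
loaded = {r["license_no"]: row_fingerprint(r) for r in previous}

# Hypothetical full resend for the current quarter.
resend = [
    {"license_no": "A100", "name": "J SMITH", "address": "1 MAIN ST"},  # unchanged
    {"license_no": "B200", "name": "K JONES", "address": "22 ELM RD"},  # changed
    {"license_no": "C300", "name": "L BROWN", "address": "5 PINE CT"},  # new
]

new_rows, changed_rows = select_new_and_changed(resend, loaded, "license_no")
print([r["license_no"] for r in new_rows])      # ['C300']
print([r["license_no"] for r in changed_rows])  # ['B200']
```

Comparing fingerprints rather than every column keeps the check cheap when the resend covers the whole source database.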
Original DW Request List
Source Description                                  Audit Matching              Compliance      DW Phase
                                                    Probability/Profitability   Profitability
DOR - WI Tax Returns                                High                        High            Phase 1
US Gov - STAX Information (IMF/IRTF)                High                        High            Phase 1
DOR - All W2's Electronically Filed                 High                        High            Phase 1
US Gov - STAX Information (IRMF)                    High                        High            Phase 2
DOT - Driver License Records                        High                        Medium          Phase 3
DOT - Motor Vehicle Title Information               High                        High            Phase 3
DOT - Motor Vehicle Registrations/Renewals          High                        High            Phase 3
County/Cities/Villages Real Property Tax Records    High                        High            tbd
DOR - Real Estate Transfers                         Medium                      Medium          tbd
DOR - Sales Tax Registrants -- Sales Data           High                        High            tbd
DRL - Credentials                                   High                        Medium          tbd
DOA - Vendor List                                   Medium                      Medium          tbd
DOR - Lottery Retailers                             Low                         Medium          tbd
DHFS - Cigarette Registrants                        Low                         Medium          tbd
DOR - Floor Tax Registrants                         Low                         Low             tbd
DOR - Motor Fuel                                    Low                         Low             tbd
DNR - Registrations Boat, ATV, Etc.                 Medium                      High            tbd
US Gov - US Customs Information                     Medium                      Low             tbd
DOR - Withholding Tax Registrants                   Medium                      High            tbd
DOR - Local Liquor Licenses                         Low                         High            tbd
DNR - Hunting & Fishing Licensees                   Medium                      Medium          tbd
CCAP - Court Records                                Low                         Medium          tbd
Wisconsin Bar Association Listing                   High                        Medium          tbd
DWD - Child Care Providers                          Medium                      High            tbd
DWD - Sole-Proprietor Employers                     High                        Medium          tbd
Existing Corporate Tax Data Warehouse                                                           tbd
DW High Level Flow
ETL Data Process
External System Process
Source Data: External/Internal Databases
Source Data from External Data Sources
Data Quality - SAS DF PowerStudio
Additional Data Transformation in ETL Tool - SAS ETL Studio
Data Warehouse (SQL Server Database)
Business Intelligence Tools
End User Query and Reporting Tools (Ex: SAS Enterprise Guide, SAS Web Report Studio, SAS Info Portal)
SAS OLAP (Online Analytical Processing): Creation of Cubes and querying of cubes
Data Mining (SAS Enterprise Miner - Statistical Program, Research, Trending Analysis)
Data Warehouse Infrastructure
W2, Good Records, IRTF, IMF
The Data Store/Staging Area will store all elements of the data files in
cleansed format, with links to the Customer and Form dimensions where possible,
and will include a time conversion. All data fields will be preserved and
possibly augmented with additional information.
ODS_DOR_W2 (Complete, Standardized File) + Customer ID by Year
ODS_DOR_GR (Complete, Standardized File) + Customer ID by Year
ODS_DOR_IMF (Complete, Standardized File) + Customer ID by Year
ODS_DOR_IRTF (Complete, Standardized File) + Customer ID by Year
DM_IRS_FACT (Contains IMF and IRTF Data)
FTP Raw Data Files to Processing Server
The Star Schema / Fact Tables Area will be a location where selected
data elements are loaded as a subset for reporting purposes. The
data subsets will be defined by business rules for given
deliverables, by tax year. Some deliverables may share a star
schema model and others may need their own. (For immediate
deliverables we did not find it necessary to have a time dimension;
each fact table instead carries a 4-digit year field.)
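A fact table keyed to the customer dimension and carrying a 4-digit year field instead of a time dimension might look like the sketch below. The table names DM_CUSTOMER_DIM, DM_W2_FACT and the key Customer_SK come from the slide; the columns customer_name, tax_year and wages, the sample rows, and the use of in-memory SQLite in place of the SQL Server warehouse are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite stands in for the SQL Server warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DM_CUSTOMER_DIM (
    Customer_SK   INTEGER PRIMARY KEY,  -- surrogate key shown on the slide
    customer_name TEXT                  -- hypothetical attribute
);
CREATE TABLE DM_W2_FACT (
    Customer_SK INTEGER REFERENCES DM_CUSTOMER_DIM (Customer_SK),
    tax_year    INTEGER,                -- 4-digit year, no time dimension
    wages       REAL                    -- hypothetical measure
);
""")

conn.executemany(
    "INSERT INTO DM_CUSTOMER_DIM VALUES (?, ?)",
    [(1, "J SMITH"), (2, "K JONES")],
)
conn.executemany(
    "INSERT INTO DM_W2_FACT VALUES (?, ?, ?)",
    [(1, 2003, 41000.0), (1, 2004, 43500.0), (2, 2004, 38000.0)],
)

# Report for one tax year: filter on the year field, join to the dimension.
rows = conn.execute("""
    SELECT c.customer_name, f.wages
    FROM DM_W2_FACT AS f
    JOIN DM_CUSTOMER_DIM AS c ON c.Customer_SK = f.Customer_SK
    WHERE f.tax_year = 2004
    ORDER BY c.customer_name
""").fetchall()
print(rows)  # [('J SMITH', 43500.0), ('K JONES', 38000.0)]
```

Because the year sits directly on the fact table, a report for a single tax year needs only a filter, not an extra join to a time dimension.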
DM_CUSTOMER_DIM: Customer_SK (PK)
DM_IRS_FORM_TYPE_DIMENSION
DM_GEOGRAPHIC_FACT
**The shaded tables represent conformed dimensions that
may be shared across various fact tables.
ODS_DOR_W2_EXCEPTION (Name/Address Standardized, but unmatched)
ODS_DOR_GR_EXCEPTION (Name/Address Standardized, but unmatched)
ODS_DOR_IMF_EXCEPTION (Name/Address Standardized, but unmatched)
ODS_DOR_IRTF_EXCEPTION
The Exception Data Area stores information that was not
successfully matched to a customer and was therefore
not loaded into the final cleansed staging area.
**Some manual review may be required to identify where these kicked-out
records should go.
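The split between the cleansed staging tables and the exception tables can be sketched as follows. This is an assumed, simplified matching rule (exact match on standardized name and address). The real standardization and matching run in the data quality tool, and every field name and record here is hypothetical.

```python
def standardize(text):
    """Rough stand-in for name/address standardization:
    upper-case and collapse repeated whitespace."""
    return " ".join(text.upper().split())


def route_records(records, customer_index):
    """Return (matched, exceptions).

    Matched records gain a customer_id and belong in the cleansed staging
    table. Unmatched records are standardized but kept aside for manual review.
    """
    matched, exceptions = [], []
    for rec in records:
        clean = dict(
            rec,
            name=standardize(rec["name"]),
            address=standardize(rec["address"]),
        )
        customer_id = customer_index.get((clean["name"], clean["address"]))
        if customer_id is None:
            exceptions.append(clean)
        else:
            matched.append(dict(clean, customer_id=customer_id))
    return matched, exceptions


# Hypothetical customer lookup: (standardized name, address) -> customer ID.
customer_index = {("J SMITH", "1 MAIN ST"): 1}

records = [
    {"name": "j  smith", "address": "1 Main St", "tax_year": 2004},
    {"name": "Q Public", "address": "77 Lake Dr", "tax_year": 2004},
]

matched, exceptions = route_records(records, customer_index)
print(matched)     # one record, with customer_id 1
print(exceptions)  # one standardized but unmatched record
```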
DM_GR_FACT
DM_WI_FORM_TYPE_DIMENSION
DM_W2_FACT
DW Portal and Report Access
http://srv156.revenue.wi.gov/Portal/displayLogon.do
rvpc3872