Introduction to the Institutional Research Database (IRDB) and Discoverer



OFFICE OF INSTITUTIONAL RESEARCH AND ASSESSMENT

APRIL 27, 2011

Introduction to the Institutional Research Database (IRDB) and Discoverer

Today’s Agenda

Introductions
Housekeeping
Data Sources
IRDB Structures
    Tables/Fields
    Facts/Dimensions
Discoverer
    Folders/Items
    Rows, Columns, Page Items
    Calculations
Documentation

How are data moved from operational systems into a data warehouse?

Step 1. Snapshots are extracted from operational systems.

Step 2. Extracted files are reformatted and cleaned.

Step 3. Pre-processed files are loaded into staging tables and metadata are loaded into lookup tables in an Oracle relational database.

Step 4. Data in the staging tables are migrated to normalized tables.

Step 5. Summary tables and other high performance query structures are created from the normalized tables and lookup tables.

Step 6. Semester-based fact and dimension tables are created from the normalized tables and lookup tables.

Step 7. Longitudinal fact tables are created from the semester-based fact and dimension tables.
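
Steps 3 through 5 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the Oracle database, and every table name, column name, and data value below is hypothetical rather than taken from the IRDB itself.

```python
import sqlite3

# Hypothetical cleaned snapshot rows (output of Step 2):
# student, term, college, credits attempted.
clean_rows = [
    ("S001", "2010-09-01", "SI", 12.0),
    ("S002", "2010-09-01", "SI", 6.0),
    ("S003", "2010-09-01", "BK", 15.0),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Step 3: a staging table plus a lookup (metadata) table.
    CREATE TABLE show_stage (student_id TEXT, term_enrolled_date TEXT,
                             college_id TEXT, credits REAL);
    CREATE TABLE college_lookup (college_id TEXT PRIMARY KEY,
                                 college_name TEXT);
    INSERT INTO college_lookup VALUES ('SI', 'Staten Island'),
                                      ('BK', 'Brooklyn');
    -- Step 4: a normalized table keyed by student, term, and college.
    CREATE TABLE enrollment (
        student_id TEXT, term_enrolled_date TEXT,
        college_id TEXT REFERENCES college_lookup(college_id),
        credits REAL,
        PRIMARY KEY (student_id, term_enrolled_date, college_id));
""")

# Step 3 (load): bulk-insert the cleaned rows into staging.
conn.executemany("INSERT INTO show_stage VALUES (?, ?, ?, ?)", clean_rows)

# Step 4 (migrate): move staged rows into the normalized table.
conn.execute("INSERT INTO enrollment SELECT DISTINCT * FROM show_stage")

# Step 5 (summarize): build an aggregate table from the normalized
# table and the lookup table.
conn.execute("""
    CREATE TABLE enrollment_summary AS
    SELECT e.term_enrolled_date, l.college_name,
           COUNT(*) AS headcount, SUM(e.credits) AS total_credits
    FROM enrollment e JOIN college_lookup l USING (college_id)
    GROUP BY e.term_enrolled_date, l.college_name
""")

summary = conn.execute("""
    SELECT college_name, headcount, total_credits
    FROM enrollment_summary ORDER BY college_name
""").fetchall()
print(summary)
```

In the real pipeline the load is done with SQL*Loader and the migration with PL/SQL; the sketch only shows how the data change shape between the steps.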

Steps 1–3: Why are snapshots used to populate the IRDB?

[Diagram: snapshot files pass through three stages: Extraction (COBOL), Transformation (SPSS), and Load (SQL*Loader).]

Sources and inputs shown: SIMS, SKAT, NCS Pearson, Academic Program Inventory, Post Graduate Surveys (SPSS survey data with SSNs)

Extracted files: Show-Registration File, Performance File, Graduation File, Skills Tests Results

Cleaned files: Clean Show-Reg File, Clean Performance File, Clean Graduation File

Tables loaded: SHOW_FILE, PERF_FILE, GRAD_FILE, SKAT_FILE_02, VTEA_SURVEY_FILE_02, NYSED_PPROGRAM_LOOKUP (database link, database view)

CUNY IRDB Data Flow Diagram

[Diagram: data flow from extract files through the Oracle environment to end users.]

Extract files: CAS (freshman admissions), ASTA (transfer admissions), SHOW (enrollment), SKAT (skills tests), PERF (grades), GRAD (degrees), NCES (job survey), SFA (financial aid), Clearinghouse (transfers to non-CUNY colleges)

Processing steps: Reformat and Clean Input Files (SPSS); Migrate Data into Oracle 9i Environment (SQL*Loader); Normalize Data (PL/SQL); Create Fact and Dimension Tables (SQL); Joins from Multiple Tables across Multiple Terms, Group by Selected Columns (SQL)

Data stores: Staging Tables; Lookup Tables (metadata); Operational Data Store (normalized student-level data); Summary Tables (denormalized aggregate-level data); Data Warehouse (denormalized student-level data); Longitudinal Cohorts (denormalized student-level data)

Access tools and outputs: Oracle Discoverer Tables, Oracle Discoverer Crosstabs, Ad-Hoc Queries, Special Reports, Standardized Files, SPSS for Windows, Crystal Reports and Oracle Portal, CUNY Data Book on Institutional Research Web Site

Users: Institutional Researchers, University Administrators, Public Users

Other labeled elements: Flash Enrollment, PC Files, Spreadsheets, Oracle Forms, Code Descriptions from File Layouts, Type or Cut and Paste

What are fact and dimension tables and how are they related?

A fact table is composed of numerical measures of business performance. Examples of facts are headcount, FTEs, and cumulative credits earned.

Dimension tables contain items that describe or categorize the items in the fact table. Examples of dimensions would be gender, full-time/part-time status, and college of attendance.

The fact table also contains foreign keys that can be used to join it with the primary keys of the dimension tables. For example, “Student ID”, “Term Enrolled Date”, and “College ID” are used to join the table “History Facts” with the table “History Major 1 Dim”.

A central fact table with multiple dimension tables radiating out from it is called a star schema.
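
The join described above can be sketched as follows. This is a minimal illustration using Python's sqlite3 module in place of Oracle; the three join keys come from the slide, while the column names `fte`, `cum_credits_earned`, and `major_1`, and all data values, are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Fact table: numeric measures plus the foreign keys to the dimension.
    CREATE TABLE history_facts (
        student_id TEXT, term_enrolled_date TEXT, college_id TEXT,
        headcount INTEGER, fte REAL, cum_credits_earned REAL);
    -- Dimension table: descriptive attribute keyed by the same three columns.
    CREATE TABLE history_major_1_dim (
        student_id TEXT, term_enrolled_date TEXT, college_id TEXT,
        major_1 TEXT,
        PRIMARY KEY (student_id, term_enrolled_date, college_id));
    INSERT INTO history_facts VALUES
        ('S001', '2002-09-01', 'SI', 1, 1.0, 30.0),
        ('S002', '2002-09-01', 'SI', 1, 0.5, 12.0),
        ('S003', '2002-09-01', 'SI', 1, 1.0, 45.0);
    INSERT INTO history_major_1_dim VALUES
        ('S001', '2002-09-01', 'SI', 'Biology'),
        ('S002', '2002-09-01', 'SI', 'Biology'),
        ('S003', '2002-09-01', 'SI', 'History');
""")

# Join fact to dimension on the three shared keys, then aggregate the
# facts by the dimension attribute.
rows = conn.execute("""
    SELECT d.major_1, SUM(f.headcount) AS headcount, SUM(f.fte) AS fte
    FROM history_facts f
    JOIN history_major_1_dim d
      ON  f.student_id = d.student_id
      AND f.term_enrolled_date = d.term_enrolled_date
      AND f.college_id = d.college_id
    GROUP BY d.major_1
    ORDER BY d.major_1
""").fetchall()
print(rows)
```

Each additional dimension table would be joined to the fact table in the same way, which is what gives the schema its star shape.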

What are the advantages of using a star schema?

Creates a database design that improves performance.

Parallels, in the database design, how end users usually think about and use the data.

Provides versatile and robust ad-hoc query capabilities.

Provides an extensible design which supports changing business requirements.

Can be used with point-and-click tools such as Oracle Discoverer 9iAS.

History Facts and Dimensions

The Joins between the Fact Table “History Facts” and its Dimension Tables Are Defined by an OIRA Administrator in the Discoverer End-User Layer

How is a campus limited to viewing only the data of its own students?

IR.USERID_LOOKUP: userid, college_id, table_name, table_grant

IR.HISTORY_FACTS: student_id, term_enrolled_date, college_id

IR.SEC_COLLEGE_07_MV: sec_student_id

IRASI (Institutional Research, Staten Island)

IRASI.HISTORY_FACTS: student_id, term_enrolled_date, college_id
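
The slide names the objects involved but not the exact mechanism. One plausible reading is that each campus schema exposes a filtered view of the central table, restricted through the lookup table. The sketch below illustrates that idea only; it is an assumption, not the documented IRDB implementation. SQLite has no schemas, so `IR.` and `IRASI.` become name prefixes, and the college code `'SI'` and grant value `'SELECT'` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Central table holding every campus's rows (stands in for IR.HISTORY_FACTS).
    CREATE TABLE ir_history_facts (
        student_id TEXT, term_enrolled_date TEXT, college_id TEXT);
    -- Lookup of which user may see which college (stands in for IR.USERID_LOOKUP).
    CREATE TABLE ir_userid_lookup (
        userid TEXT, college_id TEXT, table_name TEXT, table_grant TEXT);

    INSERT INTO ir_history_facts VALUES
        ('S001', '2002-09-01', 'SI'),
        ('S002', '2002-09-01', 'BK'),
        ('S003', '2003-02-01', 'SI');
    INSERT INTO ir_userid_lookup VALUES
        ('IRASI', 'SI', 'HISTORY_FACTS', 'SELECT');

    -- Campus-level view (stands in for IRASI.HISTORY_FACTS): only rows
    -- whose college_id is mapped to the IRASI user are visible.
    CREATE VIEW irasi_history_facts AS
    SELECT f.*
    FROM ir_history_facts f
    WHERE f.college_id IN (
        SELECT u.college_id
        FROM ir_userid_lookup u
        WHERE u.userid = 'IRASI' AND u.table_name = 'HISTORY_FACTS');
""")

visible = conn.execute(
    "SELECT student_id FROM irasi_history_facts ORDER BY student_id"
).fetchall()
print(visible)
```

The campus user queries only the view, so rows belonging to other colleges never appear in its results.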

Users Select “Items” from a “Folder” with a Mouse Rather than Writing and Executing SQL Code

Discoverer

IRDB End-User Query Tool
Currently accessed via Citrix
    Requires user id/password: domain log in (managed by CIS)
    Discoverer (account required; managed by OIRA)
Set of Business Areas (linked fact and dimension tables)
    History Facts: Historical Enrollment Records
    Degree Facts: Historical Degree Records (through most recent complete academic year)
    Cohort Facts: Integration of Enrollment and Degree data in a longitudinal structure for tracking cohorts over time
    Special Business Area: mostly stand-alone tables for specific analyses (e.g., PMP)

Users Arrange the Items as the Page-Breaks, Columns, and Rows for a Desired Report

Accessing the IRDB Through Discoverer

Navigate your web browser to https://ez.cuny.edu

Log in with your LAN user id and password

Click on the Discoverer icon in the list of available applications via Citrix

Install Java code as prompted upon first use of a given computer (you may need an IT technician to install programs on your computer)

After Java installation, you will be prompted to log in to Discoverer (user id and initial password established by OIRA)

Documentation available

Creating a New Workbook as a Crosstabs Report with Discoverer 9iAS

The Derived Fact “Headcount” Reflects the Business Rules for Excluding Some Students from Official Enrollment Statistics
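
A derived fact of this kind can be sketched as a conditional sum. The slide does not list the actual business rules, so the `exclusion_flag` column below is purely hypothetical and stands in for whatever conditions OIRA applies; the example uses Python's sqlite3 module rather than Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- exclusion_flag is a hypothetical column: 1 marks a record that the
    -- (unspecified) business rules leave out of official statistics.
    CREATE TABLE history_facts (
        student_id TEXT, college_id TEXT, exclusion_flag INTEGER);
    INSERT INTO history_facts VALUES
        ('S001', 'SI', 0),
        ('S002', 'SI', 1),
        ('S003', 'SI', 0);
""")

# Headcount counts 1 for each included record and 0 for each excluded one,
# so it can be smaller than the raw number of records.
result = conn.execute("""
    SELECT college_id,
           SUM(CASE WHEN exclusion_flag = 1 THEN 0 ELSE 1 END) AS headcount,
           COUNT(*) AS records
    FROM history_facts
    GROUP BY college_id
""").fetchall()
print(result)
```

Building the rule into the fact means every report that sums “Headcount” applies the same exclusions without the user restating them.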

Creating a Layout for the Crosstabs Report

Discoverer Estimates the Time Needed to Run a Query

A Crosstab Built from “History Facts” and Two Related Dimension Tables

Creating Totals and Subtotals
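
What Discoverer does behind these two screens can be approximated in a few lines. The sketch below joins a fact table to two dimension tables, pivots the grouped result into a crosstab, and adds totals. Gender and full-time/part-time status are used because the deck cites them as example dimensions; the table names, the single-column join key, and the data are simplifying assumptions.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified: one key column instead of the three-part key.
    CREATE TABLE history_facts (student_id TEXT PRIMARY KEY, headcount INTEGER);
    CREATE TABLE gender_dim (student_id TEXT PRIMARY KEY, gender TEXT);
    CREATE TABLE ftpt_dim (student_id TEXT PRIMARY KEY, ftpt_status TEXT);
    INSERT INTO history_facts VALUES
        ('S001', 1), ('S002', 1), ('S003', 1), ('S004', 1);
    INSERT INTO gender_dim VALUES
        ('S001', 'F'), ('S002', 'F'), ('S003', 'M'), ('S004', 'M');
    INSERT INTO ftpt_dim VALUES
        ('S001', 'Full-time'), ('S002', 'Part-time'),
        ('S003', 'Full-time'), ('S004', 'Full-time');
""")

# One grouped row per (gender, status) cell of the crosstab.
cells = conn.execute("""
    SELECT g.gender, p.ftpt_status, SUM(f.headcount)
    FROM history_facts f
    JOIN gender_dim g ON g.student_id = f.student_id
    JOIN ftpt_dim p ON p.student_id = f.student_id
    GROUP BY g.gender, p.ftpt_status
""").fetchall()

# Pivot: gender down the rows, full-time/part-time status across the columns.
crosstab = defaultdict(dict)
for gender, status, headcount in cells:
    crosstab[gender][status] = headcount

# Totals: per row, per column, and overall.
row_totals = {g: sum(cols.values()) for g, cols in crosstab.items()}
column_totals = defaultdict(int)
for cols in crosstab.values():
    for status, headcount in cols.items():
        column_totals[status] += headcount
grand_total = sum(row_totals.values())

print(dict(crosstab), row_totals, dict(column_totals), grand_total)
```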

Creating a New Workbook as a Table Report (or Extract) using Discoverer

Creating a Layout for the Table Report

An Example of an Implicit Condition

Sorting the Rows Retrieved

A Table Report of Fall 2002 Graduates with the Original Dimension “Birth Date”, the Computed Fact “Age”, and the Computed Dimension “Age Group 1”
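
The two calculations in this report can be sketched in Python. The slide does not give the reference date used for “Age” or the bands used for “Age Group 1”, so the `AS_OF` date and the band boundaries below are hypothetical, as are the sample graduates.

```python
from datetime import date


def age_on(birth_date: date, as_of: date) -> int:
    """Computed fact: completed years between birth_date and as_of."""
    years = as_of.year - birth_date.year
    # Subtract one if the birthday has not yet occurred in the as_of year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def age_group(age: int) -> str:
    """Computed dimension: band an age into a category (hypothetical bands)."""
    if age < 20:
        return "Under 20"
    if age < 25:
        return "20-24"
    if age < 30:
        return "25-29"
    return "30 and over"


AS_OF = date(2002, 9, 1)  # hypothetical reference date for Fall 2002

# Hypothetical graduates: (student id, birth date).
graduates = [("S001", date(1980, 3, 15)), ("S002", date(1975, 11, 2))]

report = [
    (sid, born.isoformat(), age_on(born, AS_OF), age_group(age_on(born, AS_OF)))
    for sid, born in graduates
]
print(report)
```

“Age” is a fact because it is numeric and can be averaged; “Age Group 1” is a dimension because it categorizes rows and can serve as a row, column, or page item.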

With Discoverer, Table Reports Can Be Exported in a Variety of Formats

Tracking Student Progress: How Should Many-to-Many Relationships between Fact Tables Be Resolved?

[Diagram: History Facts and Degree Facts]

Answer: Create an Intersection Entity that Has Many-to-One Relationships with Both Tables

[Diagram: History Facts, Intersection Entity, Degree Facts]
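
The general technique can be sketched with a generic junction table. This illustrates the pattern only and is not the actual IRDB design: the surrogate keys `history_id` and `degree_id`, the table name `enrollment_degree_link`, and the data are assumptions. Each link row points to exactly one enrollment record and one degree record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A student can have many enrollment records and many degrees, so the
    -- two fact tables are related many-to-many through the student.
    CREATE TABLE history_facts (
        history_id INTEGER PRIMARY KEY, student_id TEXT,
        term_enrolled_date TEXT);
    CREATE TABLE degree_facts (
        degree_id INTEGER PRIMARY KEY, student_id TEXT, degree_date TEXT);

    -- Intersection entity: many link rows per enrollment record and many
    -- per degree record, but each link row references exactly one of each.
    CREATE TABLE enrollment_degree_link (
        history_id INTEGER REFERENCES history_facts(history_id),
        degree_id INTEGER REFERENCES degree_facts(degree_id),
        PRIMARY KEY (history_id, degree_id));

    INSERT INTO history_facts VALUES
        (1, 'S001', '1990-09-01'),
        (2, 'S001', '1991-09-01'),
        (3, 'S002', '1990-09-01');
    INSERT INTO degree_facts VALUES
        (10, 'S001', '1994-06-01'),
        (11, 'S001', '1996-06-01');

    -- Populate the link by pairing each student's enrollments and degrees.
    INSERT INTO enrollment_degree_link
    SELECT h.history_id, d.degree_id
    FROM history_facts h
    JOIN degree_facts d ON d.student_id = h.student_id;
""")

pairs = conn.execute("""
    SELECT history_id, degree_id
    FROM enrollment_degree_link
    ORDER BY history_id, degree_id
""").fetchall()
print(pairs)
```

Student S002 has an enrollment record but no degree, so no link row is created for that record.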

Selecting Three Different “Headcount” Facts from the Table “Cohort Facts”

“Headcount” of Undergraduates Who Entered in Fall 1990

“Headcount” of Fall 1990 Entrants Who Returned in Fall 1991

“Headcount” of Fall 1990 Entrants Who Graduated by Summer 1996

Fall-to-Fall Retention of Fall 1990 Undergraduate Entrants

The “Headcount” Facts in the Table “Cohort Facts” and the Foreign Keys that Join It with the Table “Degree Facts” Can Be Used to Create a Graduation Rate Item

Six-Year Graduation Rates of Fall 1990 Undergraduate Entrants
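
The retention and graduation rates in the closing slides are ratios of the three “Headcount” facts. A minimal sketch, assuming a hypothetical `cohort_facts` layout with one 0/1 column per headcount and made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: one row per student in the cohort, with a 0/1
    -- headcount column for each milestone.
    CREATE TABLE cohort_facts (
        student_id TEXT PRIMARY KEY,
        entered_fall_1990 INTEGER,
        returned_fall_1991 INTEGER,
        graduated_by_summer_1996 INTEGER);
    INSERT INTO cohort_facts VALUES
        ('S001', 1, 1, 1),
        ('S002', 1, 1, 0),
        ('S003', 1, 0, 0),
        ('S004', 1, 1, 1);
""")

# Sum each headcount over the cohort.
entered, returned, graduated = conn.execute("""
    SELECT SUM(entered_fall_1990),
           SUM(returned_fall_1991),
           SUM(graduated_by_summer_1996)
    FROM cohort_facts
""").fetchone()

# Each rate is a milestone headcount divided by the entering headcount.
retention_rate = returned / entered
graduation_rate = graduated / entered

print(f"Fall-to-fall retention: {retention_rate:.0%}")
print(f"Six-year graduation rate: {graduation_rate:.0%}")
```

In Discoverer the same ratios would be defined as calculations on the “Headcount” items rather than computed outside the tool.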