
CITEE 2012 Yogyakarta, 12 July 2012 ISSN: 2088-6578


Data Warehouse for Study Program Evaluation Reporting Based on Self Evaluation (EPSBED) using EPSBED Data Warehouse Model: Case Study Budi Luhur University

Indra, Yudho Giri Sucahyo, Windarto

Faculty of Information Technology of Universitas Budi Luhur; Faculty of Computer Science of Universitas Indonesia; Faculty of Information Technology of Universitas Budi Luhur

Jl. Ciledug Raya, Post Code 12260, Jakarta, Indonesia


Abstract- Each study program at a university in Indonesia is required to report the results of its academic activities each semester to the Directorate General of Higher Education, Ministry of National Education of the Republic of Indonesia (DIKTI), through the Coordinator of Private Higher Education (KOPERTIS). The reports are used to measure the performance of the study programs of each university in Indonesia. This reporting process is known as Study Program Evaluation Based on Self Evaluation (EPSBED).

Until now, the data produced by EPSBED processing has not been fully used by the executives of Budi Luhur University as a reference for academic decisions. For this reason, analysis of an EPSBED data warehouse can become an important component of decision-making by the executives of Budi Luhur University. Moreover, with the EPSBED data warehouse, report generation takes only minutes because the process is automated and scheduled.

The methodology of this research consists of several stages. The first stage analyzes the information required by the executives of Budi Luhur University. The second stage collects data to meet those information needs. The third stage designs the data warehouse using the star schema technique, with the community (open source) edition of Pentaho as the tool. The last stage implements Online Analytical Processing (OLAP) on top of the data warehouse.

Keywords-components: EPSBED, data warehouse, star schema, study program, college

I. INTRODUCTION

A. Background

Since 2002, every college in Indonesia has been obliged to report the performance of its study programs using a particular database structure. The database structure has been formalized and packaged in a reporting system called Study Program Evaluation Based on Self Evaluation (EPSBED), which is used by all universities in Indonesia in accordance with Director General of Higher Education Decree No.34/DIKTI/Kep/2001.

For EPSBED reporting, Budi Luhur University (UBL) must run many queries to retrieve the required data from its own database, because the UBL database has a different structure from the EPSBED database. Moreover, the data must be cleansed because many records are incomplete.

The results generated for EPSBED reports have been accumulating in the UBL database for many years. To date, there has been no data warehouse that presents these historical EPSBED results, and the EPSBED data has not been fully used by the executives as input for decision-making.

B. Problem Formulation

From these core issues, the fundamental question is: “How can a data warehouse be designed to facilitate and accelerate the reporting process, and how can the EPSBED data warehouse be implemented so that it serves as input for decision-making by the executives of Budi Luhur University?”

C. Research Objectives

The objectives of this research are to:

1) Describe the design and development of a data warehouse that facilitates and accelerates the EPSBED reporting process.

2) Build a cube (fact table) and OLAP (Online Analytical Processing) views that show EPSBED reports in detail using the roll up and drill down features.

II. LITERATURE REVIEW

A. EPSBED

1) EPSBED Definition

EPSBED is a reporting medium through which the study programs of each college report to the Directorate General of Higher Education, Ministry of National Education of the Republic of Indonesia (DIKTI). Under the applicable regulations, each study program must report its ongoing academic activities every semester. Since the 2002-2003 academic year, study program activities have been reported as electronic data, and the reporting covers institutional aspects, curriculum, lecturers, students, and the infrastructure accessed by the study program (Ilah, http://evaluasi.dikti.go.id/).

2) The Legal Basis of EPSBED

The implementation of EPSBED is based on the Decree of the Director General of Higher Education Number 08/DIKTI/Kep/2002 on Technical Guidelines for National Education Decree Number 184/U/2001 on Guidelines for the Monitoring, Control, and Development of Diploma, Bachelor, and Master Programs in Higher Education (including the provision of certificates and transcripts). These decrees are among the legal bases for the implementation of EPSBED at each university in Indonesia.

3) EPSBED Workflow (Figure 2.1)

The EPSBED workflow is the sequence of data migration steps from each college's internal database to the DIKTI EPSBED database. Following the Higher Education Development Data Base (PDPT) reference, the EPSBED workflow is shown in Figure 2.1.

Figure 2.1 EPSBED Workflow [1]

B. Data Warehouse

1) Data Warehouse Definition

A data warehouse is a collection of data used for management decision making that is subject oriented, integrated, time variant, and nonvolatile (Inmon, 2005). Turban, Sharda and Delen (2011) explain that a data warehouse also serves as a central repository of past and current data that are potentially useful to an organization's managers.

2) Data Warehouse Modeling Techniques (Figure 2.2)

This research uses a multidimensional model, in which a data warehouse has two kinds of tables: fact tables and dimension tables. A fact table generally holds foreign keys and measurements. A measurement (measure) is a field with a numeric value, while each foreign key refers to the primary key of the corresponding dimension table. The modeling technique used is the star join approach, so called because it resembles a star: the fact table sits in the center and the dimension tables surround it. This approach can be seen in Figure 2.2.

Figure 2.2 Star Join Approach Multidimensional Data Model [3]
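The star join idea can be illustrated with a minimal sketch. It assumes an in-memory SQLite database in place of the Oracle database used in this research, and the table names, column names, and sample values (`dim_program`, `fact_grades`, `gpa`) are hypothetical, chosen only to show a fact table whose foreign key points at a dimension table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: one row per study program (hypothetical columns).
    CREATE TABLE dim_program (
        program_id   INTEGER PRIMARY KEY,
        program_name TEXT NOT NULL
    );
    -- Fact table: a foreign key to the dimension plus a numeric measurement.
    CREATE TABLE fact_grades (
        program_id INTEGER NOT NULL REFERENCES dim_program(program_id),
        gpa        REAL NOT NULL
    );
    INSERT INTO dim_program VALUES
        (1, 'Information Systems'), (2, 'Information Management');
    INSERT INTO fact_grades VALUES (1, 3.0), (1, 3.5), (2, 2.5);
""")

# Star join: the fact table in the center is joined to its dimension through
# the foreign key, and the measurement is aggregated per dimension member.
rows = conn.execute("""
    SELECT d.program_name, COUNT(*) AS n_rows, AVG(f.gpa) AS avg_gpa
    FROM fact_grades AS f
    JOIN dim_program AS d ON d.program_id = f.program_id
    GROUP BY d.program_name
    ORDER BY d.program_name
""").fetchall()

for name, n_rows, avg_gpa in rows:
    print(name, n_rows, avg_gpa)
```

With more dimensions, each one adds another foreign key column to the fact table and another join to the query, which is what gives the schema its star shape.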

C. Data Warehouse Architecture

The data warehouse architecture is shown in Figure 2.3:

Figure 2.3 Data Warehouse Architecture [2]

Figure 2.3 shows that the data warehouse is divided into four parts:

1) Source Data System

The data sources are the transactions and outputs of the organization's day-to-day operational applications. Transactional data is still raw data.

2) Data Staging Area

Before entering the warehouse, the data is first extracted and placed in the staging area. There the data is cleansed, reconciled, matched, and standardized so that it is free of defects; this process is commonly known as transform.

3) Data & Metadata Storage

Once cleansed, the data is loaded into the data warehouse. Data in the data warehouse can be used by the executives as decision support for determining policy on a variety of issues.

4) End User Presentation Tools

This final part builds on the existing data warehouse. One example is using the data warehouse for business intelligence.

D. Data Warehouse Tools

Pentaho Schema Workbench is used to design the data warehouse. Online Analytical Processing (OLAP) is implemented with JPivot, which is already integrated with the Pentaho BI Server. Both tools can be downloaded from http://www.sourceforge.net.

III. RESEARCH METHODOLOGY

1) Information Requirement Analysis

At this stage, the research conducts an in-depth analysis of the information required by the executives. These information needs are the basis for data collection at the next stage.

2) Data Collection Techniques

At this stage, data is collected through observation, literature study, and interviews with relevant parties. Interviews were conducted with the Head of the Information and Technology Bureau and the Chairmen of the Information Techniques, Information Systems, and Information Management Diploma 3 study programs. The result of this stage is the transactional (OLTP) data that will be retrieved for designing the data warehouse.

3) Designing The Data Warehouse

At this stage, data is extracted from the transactional database and then cleansed to eliminate empty or redundant data. After cleansing, the data is transformed with a view to defining the tables in the relational data source. After transformation, the data is loaded into the data warehouse. The data warehouse is designed with a star schema model. The design yields a star schema fact table that is expected to support EPSBED reporting.

4) Results of Data Warehouse Processing Analysis

At this stage, the results of data warehouse processing are developed so that the executives can use them as analysis material for decision-making. The results are presented in OLAP (Online Analytical Processing) form, which offers more detailed and dynamic views through the roll up and drill down features.

The flow of the four stages above is shown in Figure 3.1:

Figure 3.1 Research Methodology Stages

IV. DATA WAREHOUSE ARCHITECTURE

1. Logical Data Warehouse Architecture (Figure 4.1)

Figure 4.1 shows the logical architecture of the data warehouse for UBL's EPSBED reporting needs.

Figure 4.1 Logical Architecture of EPSBED UBL Data Warehouse

The data source covers all academic transaction processes at UBL and runs on licensed Oracle 9i software. In the first stage, the tables needed to design the data warehouse are selected according to the planned dimension tables and fact tables; this is known as the selection process. The selected tables are then extracted, and the data from each table is mapped to the data that needs to be inserted into the data warehouse; this process is known as extraction.

2. Physical Architecture of EPSBED UBL Data Warehouse

Figure 4.2 Physical Architecture of EPSBED UBL Data Warehouse

The UBL operational database uses Oracle 9i with SID: SYSTEM, while the data warehouse database uses Oracle 9i with SID: SIDIKTI. The ETL process uses Pentaho Data Integration (Kettle) as its tool, and the cube uses Pentaho Schema Workbench and Pentaho Analysis Services OLAP.

V. DESIGNING THE DATA WAREHOUSE

The EPSBED data warehouse contains multiple fact tables, including the fact table of students' academic activities.

1) Design of Student’s Academic Activities Fact Table (FACT_TRAKM)

Figure 5.1 Design of Student’s Academic Activities Fact Table (FACT_TRAKM)

The FACT_TRAKM fact table is used to generate reports on the distribution of GPA, the distribution of IPS (semester GPA), and the number of credits students have taken during their study in each subject, as well as student status reports. FACT_TRAKM contains measurements and foreign keys. A measurement is a numeric field used as a measure in the fact table. Each foreign key refers to the primary key of the corresponding dimension table. The measurements and foreign keys of FACT_TRAKM are described in Figure 5.1.
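As an illustration of what a GPA distribution report computes, the sketch below counts students per GPA band. The band boundaries and labels are assumptions made for this example; the paper does not list the actual categories stored in the GPA condition dimension.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical band boundaries and labels (not taken from the paper).
BOUNDS = [2.0, 2.75, 3.5]
LABELS = ["< 2.00", "2.00-2.74", "2.75-3.49", ">= 3.50"]


def gpa_band(gpa):
    """Return the label of the band that a GPA value falls into."""
    return LABELS[bisect_right(BOUNDS, gpa)]


def gpa_distribution(gpas):
    """Count how many GPA values fall into each band, in band order."""
    counts = Counter(gpa_band(g) for g in gpas)
    return {label: counts.get(label, 0) for label in LABELS}


distribution = gpa_distribution([1.8, 2.0, 2.9, 3.2, 3.5, 3.9])
for label, count in distribution.items():
    print(label, count)
```

In a warehouse, the band would typically be resolved once during loading and stored as a foreign key to the condition dimension, so that the report reduces to a count per dimension member.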

The dimensions related to the academic activities fact table (FACT_TRAKM) are student (DIM_MSMHS), college (DIM_MSPTI), GPA condition (DIM_KONDISI_IPK), IPS condition (DIM_KONDISI_IPS), academic year (DIM_TAHUNAJARAN), study level (DIM_JENJANG), study program (DIM_MSPST), and student status (DIM_STATUS_MHS).
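A minimal sketch of how FACT_TRAKM and its eight dimensions could be declared is given below. The table names come from the text above, but every column name (for example `student_key`, `gpa`, `ips`, `credits`) is a hypothetical placeholder, and SQLite stands in for the Oracle 9i database actually used.

```python
import sqlite3

# Dimension tables named in the paper; the key column names are hypothetical.
DIMENSIONS = {
    "DIM_MSMHS": "student_key",
    "DIM_MSPTI": "college_key",
    "DIM_KONDISI_IPK": "gpa_condition_key",
    "DIM_KONDISI_IPS": "ips_condition_key",
    "DIM_TAHUNAJARAN": "academic_year_key",
    "DIM_JENJANG": "study_level_key",
    "DIM_MSPST": "study_program_key",
    "DIM_STATUS_MHS": "student_status_key",
}

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# One table per dimension, each with a primary key and a descriptive label.
for table, key in DIMENSIONS.items():
    conn.execute(f"CREATE TABLE {table} ({key} INTEGER PRIMARY KEY, label TEXT)")

# The fact table carries one foreign key per dimension plus the measurements.
fk_columns = ",\n    ".join(
    f"{key} INTEGER NOT NULL REFERENCES {table}({key})"
    for table, key in DIMENSIONS.items()
)
conn.execute(f"""
CREATE TABLE FACT_TRAKM (
    {fk_columns},
    gpa REAL,          -- cumulative GPA (hypothetical measure column)
    ips REAL,          -- semester GPA (hypothetical measure column)
    credits INTEGER    -- credits taken (hypothetical measure column)
)
""")

# List the dimension tables that FACT_TRAKM references.
referenced = {row[2] for row in conn.execute("PRAGMA foreign_key_list(FACT_TRAKM)")}
print(sorted(referenced))
```

Declaring the foreign keys explicitly means a fact row cannot be loaded before its dimension rows exist, which is one reason dimension tables are normally loaded first.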

VI. RESEARCH DISCUSSION

A. Staging Process of Extract, Transform, and Loading

After designing the fact tables and dimension tables, the next stage is extraction, transformation, and loading (ETL) to obtain valid data to store in the data warehouse. The ETL processes can be described as follows:

1) Extraction Process

The extraction process takes data, as fields or tables, from the transactional database as required by the EPSBED data warehouse. It is done with two methods: a manual method and the Kettle method. The manual method is used where fewer than 20 records are taken, and it simply retrieves the data with a manually written query. The Kettle method extracts from the data source by selecting fields or tables with the Kettle tool. The result of the extraction process can be seen in Figure 6.1.

Figure 6.1 Extract Scheme using Kettle

2) Transformation Process

The transformation process adjusts the field names of the data source to the attributes or fields of the dimension and fact tables, in accordance with the requirements of the EPSBED data warehouse. The adjustment is needed because the database structure of the data source differs from the data warehouse structure. The result of the transformation process can be seen in Figure 6.2.

Figure 6.2 Transform Process Scheme

3) Loading Process

The loading process is the final step of the data warehouse development stages: after passing through the extraction, transformation, and cleansing phases, the data is inserted into the data warehouse. Loading uses the Pentaho Data Integration (Kettle) tool. The complete scheme of the ETL process is shown in Figure 6.3.

Figure 6.3 ETL Scheme using Kettle

After the ETL process, the data inserted into the data warehouse is subject-oriented, time-variant, and integrated. The results of this data warehouse process can be used by the executives as input for decision-making.
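The ETL flow can be sketched in a few lines of code. This is an illustration only, not the Kettle transformation built in this research: SQLite in-memory databases stand in for the operational database and the warehouse, the source table `akademik`, its field names, and all sample values are made up, and only the target names `nimhstrakm` and `kdpsttrakm` appear in the paper. The steps follow the order given in the methodology (extract, cleanse, transform, load).

```python
import sqlite3

# Hypothetical mapping from source field names to warehouse field names.
# Only nimhstrakm and kdpsttrakm appear in the paper; the rest is assumed.
FIELD_MAP = {"nim": "nimhstrakm", "kd_prodi": "kdpsttrakm", "ipk": "ipk"}

source = sqlite3.connect(":memory:")  # stands in for the operational database
source.executescript("""
    CREATE TABLE akademik (nim TEXT, kd_prodi TEXT, ipk REAL);
    INSERT INTO akademik VALUES
        ('1001', '57201', 3.10),
        ('1001', '57201', 3.10),   -- redundant record
        ('1002', NULL,    2.75),   -- incomplete record
        ('1003', '57401', 2.90);
""")


def extract(conn):
    """Extract: pull the needed fields from the transactional table."""
    cur = conn.execute("SELECT nim, kd_prodi, ipk FROM akademik")
    names = [col[0] for col in cur.description]
    return [dict(zip(names, row)) for row in cur]


def cleanse(records):
    """Cleanse: drop incomplete records and exact duplicates, keeping order."""
    seen, clean = set(), []
    for rec in records:
        key = tuple(rec.values())
        if None in key or key in seen:
            continue
        seen.add(key)
        clean.append(rec)
    return clean


def transform(records):
    """Transform: rename source fields to the warehouse field names."""
    return [{FIELD_MAP[name]: value for name, value in rec.items()} for rec in records]


def load(conn, records):
    """Load: insert the transformed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_trakm "
        "(nimhstrakm TEXT, kdpsttrakm TEXT, ipk REAL)"
    )
    conn.executemany(
        "INSERT INTO fact_trakm VALUES (:nimhstrakm, :kdpsttrakm, :ipk)", records
    )
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stands in for the data warehouse
load(warehouse, transform(cleanse(extract(source))))
loaded = warehouse.execute("SELECT * FROM fact_trakm ORDER BY nimhstrakm").fetchall()
print(loaded)
```

Keeping the field mapping in one dictionary mirrors the role of the transformation step described above: the structural difference between source and warehouse is expressed in a single place.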

B. TRAKM Cube Schema

After the ETL stages are completed, the dimension tables and fact tables contain the valid data required for designing OLAP on the data warehouse. Each dimension is linked to a fact table to form the star schema used in the data warehouse implementation. The star schema is created with Pentaho Schema Workbench. The schema contains a single fact table (cube) with its relevant dimension tables. The attributes and measurements of the fact table are shown in Figure 6.4:

Figure 6.4 Cube Scheme in Workbench

As Figure 6.4 shows, the cube (fact table) named c_trakm contains the attributes nimhstrakm, kdptitrakm, kdjentrakm, and kdpsttrakm, and its measurements are the averages of GPA and IPS. The dimension tables associated with the c_trakm cube are dim_mahasiswa (student), dim Perguruan Tinggi Indonesia (dim_PTI), the study level dimension, and the study program (prodi) dimension. Once the cube is complete, the next step is to publish the schema to the Pentaho BI server to generate the OLAP views needed for EPSBED data warehouse analysis.

C. Application Results of The EPSBED Data Warehouse Model

This part explains the implementation results of the EPSBED data warehouse model. The results visualize the EPSBED reporting process and are displayed in Online Analytical Processing (OLAP) form. They include the Cumulative Grade Point Average, the Semester Grade Point Average, the distribution of Cumulative Grade Point Average, and the distribution of Semester Grade Point Average. Figure 6.5 shows the OLAP visualization of the Cumulative Grade Point Average and Semester Grade Point Average.

Figure 6.5 Cumulative Grade Point Average and Semester Grade Point Average

Figure 6.5 shows a drill down into the Cumulative Grade Point Average and Semester Grade Point Average for the odd semester of the 2010/2011 academic year for the Information Management study program (Diploma 3) at UBL. At this level, the Cumulative Grade Point Average is 2.935 and the Semester Grade Point Average is 2.979. Drill down makes it possible to inspect the Cumulative and Semester Grade Point Averages of this or any other study program in a more dynamic and detailed way.

Besides drilling down, Mondrian can also roll up. Figure 6.6 shows the roll up of the Cumulative Grade Point Average and the Semester Grade Point Average.

Figure 6.6 Roll Up of the Cumulative Grade Point Average and the Semester Grade Point Average

Figure 6.6 shows the result of rolling up the Cumulative Grade Point Average and the Semester Grade Point Average of the Information Management study program for the 2010/2011 academic year. For that academic year, the Cumulative Grade Point Average is 2.91 and the average Semester Grade Point Average is 2.792. These values aggregate the grade points of all students in the Information Management study program in the 2010/2011 academic year.
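Roll up and drill down can be thought of as the same aggregation evaluated at coarser or finer levels of a hierarchy. The sketch below illustrates this with plain SQL grouping rather than Mondrian; the table `fact`, its columns, and the sample values are invented for the example and are not the figures reported above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact (year TEXT, semester TEXT, program TEXT, gpa REAL);
    INSERT INTO fact VALUES
        ('2010/2011', 'odd',  'Information Management', 3.0),
        ('2010/2011', 'odd',  'Information Management', 2.0),
        ('2010/2011', 'even', 'Information Management', 3.5),
        ('2010/2011', 'even', 'Information Management', 3.5),
        ('2010/2011', 'odd',  'Information Systems',    3.25);
""")

ALLOWED_LEVELS = ("year", "program", "semester")


def average_gpa(levels):
    """Average GPA grouped by the given hierarchy levels (coarse to fine)."""
    if not levels or any(level not in ALLOWED_LEVELS for level in levels):
        raise ValueError(f"levels must be drawn from {ALLOWED_LEVELS}")
    cols = ", ".join(levels)
    sql = f"SELECT {cols}, AVG(gpa) FROM fact GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


# Roll up: one average per study program for the whole academic year.
rolled_up = average_gpa(["year", "program"])
# Drill down: the same measure split further by semester.
drilled_down = average_gpa(["year", "program", "semester"])

print(rolled_up)
print(drilled_down)
```

Note that the rolled-up value is the average over all underlying rows, so it weights each semester by its number of records rather than averaging the two semester averages.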

D. The EPSBED Data Warehouse Facilitates and Accelerates The Reporting Process of EPSBED

Once the EPSBED data warehouse has been built, it contains the historical data required for EPSBED reporting. The EPSBED report for each semester (the academic data of each study program) is simply taken from the existing EPSBED data warehouse. Formerly, EPSBED reporting relied on manual queries to retrieve data from tables scattered across the academic transactional database, which usually took a long time, about five days.

With the EPSBED data warehouse, EPSBED reporting becomes faster and is completed in minutes, because the required data have already been prepared in the dimension tables and fact tables. The data of previous semesters are likewise kept in the EPSBED data warehouse.

VII. CONCLUSION

Based on the research, the following conclusions can be drawn:

1) Implementing the data warehouse at UBL helps complete EPSBED reporting quickly. Before the data warehouse, the data collection process (extract, transform, and load) for EPSBED reporting was done with queries, and completing all the reports usually took a month. With the data warehouse, all EPSBED data to be reported to DIKTI has passed through the extract, transform, and load stages using Kettle. The data in the EPSBED application is therefore presented more quickly and is valid, and completing the EPSBED report takes only two hours.

2) EPSBED reporting is done automatically and can be scheduled using the job components of Kettle, which simplifies and speeds up the work of the EPSBED reporting team.

3) The results of EPSBED data warehouse processing can be used by the executives as input for determining policies. The information is presented as reports on the distribution of Cumulative Grade Point Average, the distribution of Semester Grade Point Average, student status and graduation rates, the number of active tenured lecturers, and the number of tenured lecturers by latest education in each study program.

VIII. REFERENCES

[1] DIKTI. (2010). Pengembangan Pangkalan Data Pendidikan Tinggi. May 12, 2011. Direktorat Jenderal Pendidikan Tinggi, Kementerian Pendidikan Nasional Republik Indonesia. http://bapsi.ub.ac.id/documents/Paparan_PDPT_Dikti_Hery.ppt

[2] Turban, E. et al. (2007). Decision Support and Business Intelligence Systems. Pearson.

[3] Inmon, W.H. (2005). Building the Data Warehouse. New York: John Wiley and Sons, Inc.

[4] Ilah. (2010). Evaluasi Program Studi Berdasarkan Evaluasi Diri (EPSBED). May 10, 2011. Direktorat Jenderal Pendidikan Tinggi, Kementerian Pendidikan Nasional Republik Indonesia. http://evaluasi.dikti.go.id
