STUDENT PART TIME JOB AS TUTOR SYSTEM

USING K-MEANS ALGORITHM

NUR ZARITH AKILLA BINTI AMBOAKA

BACHELOR OF COMPUTER SCIENCE

(INTERNET COMPUTING)

UNIVERSITI SULTAN ZAINAL ABIDIN

2018

STUDENT PART TIME JOB AS TUTOR SYSTEM USING K-MEANS

ALGORITHM

NUR ZARITH AKILLA BINTI AMBOAKA

Bachelor of Computer Science (Internet Computing)

Faculty of Informatics and Computing

Universiti Sultan Zainal Abidin, Terengganu, Malaysia

MAY 2018

DECLARATION

I hereby declare that this report is based on my original work except for quotations

and citations, which have been duly acknowledged. I also declare that it has not been

previously or concurrently submitted for any other degree at Universiti Sultan Zainal

Abidin or other institutions.

________________________________

Name : Nur Zarith Akilla Binti Amboaka

Date : ..................................................

CONFIRMATION

This is to confirm that:

The research conducted and the writing of this report was under my supervision.

________________________________

Name : ..................................................

Date : ..................................................

DEDICATION

In the name of Allah, the Most Gracious and the Most Merciful, all praise is for Him alone that the documentation and the system for the subject CSB 35102, Projek Ilmiah 2018/2019, were finished on time. I would like to take this opportunity to thank my kind supervisor, Dr. Suhailan Bin Dato’ Safei, for the valuable ideas, time, support, advice and guidance given throughout the development of this research up to the completion of phase one of the project. Besides that, I dedicate my appreciation to my beloved family, who supported and motivated me while I finished this project. I would also like to thank the friends who were willing to lend a hand in finishing the project. Lastly, thank you to everyone who was directly or indirectly involved in making the system and its documentation.

ABSTRACT

Nowadays, many students need extra pocket money to support their life at university. One way to earn it is to work as a part-time tutor, either for their friends at university or for school students outside the university. Being a part-time tutor helps them build their self-esteem and gain experience for their future career. However, some of them hesitate to teach because they do not know how to assess their ability in a specific subject. Moreover, they need to prove to their future clients or students that they are capable of teaching the subject. Therefore, this project was built to classify their ability to teach a subject based on their achievement in the courses they take at university. This project is important for convincing other students who need a tutor in a specific subject. To realize this project, a clustering technique is applied using a centroid-based clustering algorithm, K-means. K-means is an unsupervised learning method: the data carry no prescribed labels, and no class values denoting an a priori grouping of the data instances are given.

ABSTRAK

Nowadays there are students who need extra pocket money to support their life at university. One way to earn extra pocket money is to become a part-time tutor, either among their friends at university or among school students outside the university. Working as a part-time tutor is very good for raising their self-confidence and for gaining experience for their future careers. However, some of them are still unsure about teaching because they do not know how to assess their ability in a particular subject. Moreover, they need to prove to their future clients or students that they are able to teach the subject. Therefore, this project was built to classify their ability to teach a subject based on their achievement in the courses they take at university. This project is important for convincing other students who need a tutor in a particular subject. To realize this project, a clustering technique is used with a centroid-based clustering algorithm, K-means. K-means is often called unsupervised learning, because no labels are prescribed in the data and no class values indicating an a priori grouping of the data instances are given.

CONTENTS

DECLARATION

CONFIRMATION

DEDICATION

ABSTRACT

ABSTRAK

CONTENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

CHAPTER I INTRODUCTION

1.1 Background

1.2 Problem statement

1.3 Objectives

1.4 Scopes

1.4.1 Scope Admin

1.4.2 Scope Student

1.5 Limitation of Work

1.6 Expected Outcome

1.7 Report Structure

CHAPTER II LITERATURE REVIEW

2.1 Introduction

2.2 Similar System

2.3 K-Means Clustering Algorithm

2.3.1 What is Clustering Technique

2.3.2 Introduction to K-Means Clustering

2.3.3 K-Means Clustering Algorithm

CHAPTER III METHODOLOGY

3.1 Introduction

3.2 Iterative Model

3.2.1 Requirement Phase

3.3 Analysis and System Design

3.3.1 Framework Design

3.3.2 System Design

3.3.3 Data Model

3.3.4 Technique

3.3.5 User Interface Design

REFERENCES

LIST OF TABLES

TABLE TITLE

3.1 List of Software

3.2 List of Hardware

3.3 Admin Data Model

3.4 Student Data Model

3.5 Subject Data Model

3.6 Subject Mark Data Model

LIST OF FIGURES

FIGURE TITLE

2.1 Nearest cluster assignment formula

2.2 Centroids update formula

3.1 Iterative Model

3.2 System Framework

3.3 Context Diagram

3.4 Data Flow Diagram Level-0 [Admin]

3.5 Data Flow Diagram Level-0 [Student]

3.6 Entity Relationship Diagram

3.7 Main Page

3.8 Dashboard Page

3.9 Application Page

LIST OF ABBREVIATIONS / TERMS / SYMBOLS

CD Context Diagram

DFD Data Flow Diagram

ERD Entity Relationship Diagram

FYP Final year project

CHAPTER I

INTRODUCTION

1.1 Background

Student Part Time Job as Tutor System using K-Means Algorithm is a web-based application. The system evaluates students by their academic achievement in a subject. Its purpose is to help students who want to work as part-time tutors teach subjects that fit their skills. The problem is how to choose a tutor based on academic achievement in a particular subject. For example, a student who wants to tutor the Data Structure subject must have good results in the basic programming and object-oriented programming subjects. The system measures how good the students are in these subjects and clusters them. To realize the system, the K-Means clustering algorithm is used. To apply for a tutor job, students fill in their subject grades, and the grades are compared against the cluster centroids to determine whether they belong to the right tutor group.
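
As a minimal sketch of that last step, the snippet below assigns an applicant's marks to the nearest centroid using Euclidean distance. The centroid values, the group labels and the function name `nearest_group` are hypothetical and chosen only for illustration; in the system, the centroids would come from clustering real student marks.

```python
import math

# Hypothetical centroids: (basic programming mark, OOP mark) per tutor group.
# These numbers are illustrative assumptions, not results from the system.
CENTROIDS = {
    "strong": (85.0, 82.0),
    "moderate": (65.0, 63.0),
    "weak": (45.0, 40.0),
}


def nearest_group(marks, centroids=CENTROIDS):
    """Return the label of the centroid closest to `marks` (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(marks, centroids[label]))


# Example: an applicant with 80 in basic programming and 78 in OOP.
print(nearest_group((80, 78)))
```

An applicant is placed in whichever group's centroid lies closest to their marks, so the quality of the grouping depends entirely on how well the centroids were learned.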

1.2 Problem Statement

How to correctly classify tutors among students according to their achievement in certain subjects.

1.3 Objectives

There are three main objectives in developing this system:

1.3.1 To analyze the current problems in Student Part Time Job as Tutors.

1.3.2 To design a proposed Student Part Time Job as Tutors system based on subject grades using the K-Means technique.

1.3.3 To develop the Student Part Time Job as Tutors system based on subject grades using the K-Means technique.

1.4 Scope

There are two scopes in this system:

1.4.1 Scope Admin

1.4.1.1 Admin can log in to the system.

1.4.1.2 Admin can manage the profiles of the student part-time tutors.

1.4.1.3 Admin can create, update, and delete user profiles.

1.4.2 Scope Student

1.4.2.1 Student can register with the system.

1.4.2.2 Student can add, update and delete their details in the system.

1.4.2.3 Student needs to fill in the profile form and the educational form in the system.

1.4.2.4 Student can view the recommended subjects to teach in the system.

1.5 Limitation of Work

1.5.1 The subject marks are entered manually by the students. It is up to the management to validate the data.

1.5.2 The system can only cluster the results and give recommendations to the part-time tutors.

1.6 Expected Outcome

This system is expected to group part-time tutors based on similar course achievement and to assign each of them suitable subjects to teach that match their skills. Finally, students will be given a list of recommended subjects suited to their group.

1.7 Report Structure

This report has six (6) chapters. Chapter 1 consists of the project background, the problem statement, the objectives and the system scope. Chapter 2 is the literature review, which examines previous systems. Chapter 3 describes the research methodology; this research uses the iterative model. Chapter 4 explains the system’s framework and design. Chapter 5 covers implementation, testing and results. Lastly, Chapter 6 concludes the whole project.

CHAPTER II

LITERATURE REVIEW

2.1 Introduction

This chapter reviews the literature on the technique used to develop the Student Part Time Job as Tutor System based on students’ subject achievement, namely the K-Means clustering algorithm.

2.2 Similar System

The Student Part Time Job as Tutor system is a project built to help an organization choose the best tutors among students. The system chooses a tutor based on the subjects the student is good at; that is, tutors are chosen by their achievement in a particular subject, calculated from their grades in that subject. This matters because not every student is good at every subject they take. Some have a deep understanding of, and good achievement in, a particular subject. These are the students the system looks for, so that they can teach others who are not good at the subject. Nowadays, the normal procedure for selecting a tutor, lecturer or teacher is based on CGPA and an interview session. This method does not fully guarantee that the selected tutor is good at the job scope given. There is a lack of selection based on achievement in specific subjects.

2.3 K-Means Clustering Algorithm

2.3.1 What is clustering technique

Clustering is a technique for finding groups of similar items in data, called clusters. It attempts to group individuals in a population together by similarity, without being driven by a specific purpose. Clustering is often called unsupervised learning, as there are no prescribed labels in the data and no class values denoting an a priori grouping of the data instances are given (Manu Jeevan, 2017). K-Means clustering was proposed by J.B. MacQueen (Zhang Yufang, 2003).

2.3.2 Introduction to K-Means Clustering Algorithm

K-Means is a method of clustering observations into a specific number of disjoint clusters. The ‘K’ refers to the number of clusters specified. Various distance measures exist to determine which observation is assigned to which cluster. The algorithm aims to minimize the distance between each observation and the centroid of its cluster by iteratively reassigning observations to clusters, and it terminates when the lowest distance measure is achieved.
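
Assuming the squared Euclidean distance, which is the most common choice, the quantity being minimized can be written as the within-cluster sum of squares. The symbols are notational assumptions made for illustration: x_i is an observation, S_k is the set of observations in cluster k, and m_k is the centroid of cluster k.

```latex
J = \sum_{k} \sum_{x_i \in S_k} \lVert x_i - m_k \rVert^2
```

Each assignment or centroid update in K-Means can only keep J the same or lower it, which is why the iteration eventually stops.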

2.3.3 K-Means Clustering Algorithm

K-Means defines a prototype in terms of a centroid, which is usually the mean of a group of points, and is typically applied to objects in a continuous n-dimensional space. The K-Means clustering technique is simple, and we begin with a description of the basic algorithm.

2.3.3.1 Initial Centroids Selection

We first choose K initial centroids. The centroid of cluster k is the cluster centre, represented by the feature values of the group of nearby objects assigned to it. It is also used as the reference point when assigning objects to a cluster, based on their distance to the nearest centroid. At the beginning of the assignment process, a set of K initial centroids needs to be determined so that the objects can be assigned accordingly. In basic K-Means, these initial centroids are randomly selected from among the objects.
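
The random selection described above can be sketched in a few lines. The function name `choose_initial_centroids`, the `seed` parameter and the sample marks are assumptions made for illustration; the system may select its initial centroids differently.

```python
import random


def choose_initial_centroids(points, k, seed=None):
    """Pick k distinct objects at random to serve as the initial centroids."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of objects")
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    return rng.sample(points, k)


# Hypothetical subject marks for five students (two subjects each).
marks = [(90, 85), (40, 35), (70, 65), (55, 60), (88, 92)]
initial = choose_initial_centroids(marks, k=2, seed=0)
print(initial)
```

Because the starting centroids are random, two runs without a fixed seed may end in different clusterings.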

2.3.3.2 Nearest Cluster Assignment

Each point is then assigned to the closest centroid, and each collection of points assigned to a centroid forms a cluster. The clustering process begins by measuring the distance from each object to each centroid (mk).

Figure 2.1 Nearest cluster assignment formula

where Sik is the set of objects in cluster k, k = 0 to K, and d is a feature. Each object is assigned to the cluster whose centroid is closest to it. The distance is measured using the Euclidean distance, the typical nearest-object measure in K-Means.
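
Under the standard K-Means assignment rule with Euclidean distance, the step described above can be written as follows. The notation is an assumption made for illustration and may differ from the formula in Figure 2.1: x_i is an object, x_{id} is its value for feature d, m_k is the centroid of cluster k, and S_k is the set of objects assigned to cluster k.

```latex
S_k = \left\{ x_i : \lVert x_i - m_k \rVert \le \lVert x_i - m_j \rVert \ \text{for every cluster } j \right\},
\qquad
\lVert x_i - m_k \rVert = \sqrt{\sum_{d} \left( x_{id} - m_{kd} \right)^2}
```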

2.3.3.3 Centroids Update

Then, the centroid of each cluster is updated based on the points assigned to the cluster. We repeat the assignment and update steps until no point changes clusters or, equivalently, until the centroids remain the same. This is the final step of each iteration: once the objects have been re-assigned, the centroid of each cluster needs to be re-calculated.

Figure 2.2 Centroids update formula

where M is the total number of objects in cluster k, k = 0 to K and d = 0 to D. This step ensures that every object currently assigned to a cluster really belongs to that cluster (i.e. it is nearest to its newly computed centroid) and is far away from the other clusters. If an object turns out to be nearer to another centroid, it needs to be reassigned to that nearest cluster. Thus, the whole cycle from the nearest cluster assignment step (2.3.3.2) to the centroids update step (2.3.3.3) is repeated until there are no changes to the centroids in any cluster.
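
Assuming the centroid is the per-feature mean of the objects in its cluster, which is the standard K-Means update, the step can be written as follows. The notation is an assumption made for illustration and may differ from the formula in Figure 2.2: m_{kd} is the value of feature d in the centroid of cluster k, S_k is the set of objects in cluster k, and M is the number of objects in S_k.

```latex
m_{kd} = \frac{1}{M} \sum_{x_i \in S_k} x_{id}
```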

2.3.3.4 Basic K-Means Algorithm

1. Select K points as initial centroids.
2. repeat
3.     Form K clusters by assigning each point to its closest centroid.
4.     Recompute the centroid of each cluster.
5. until centroids do not change.
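
The basic algorithm above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the function name `k_means`, the `seed` and `max_iter` parameters, the handling of empty clusters and the sample marks are all hypothetical.

```python
import math
import random


def k_means(points, k, seed=None, max_iter=100):
    """Basic K-Means on a list of numeric tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Select K points as initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Form K clusters by assigning each point to its closest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Recompute the centroid of each cluster as the mean of its points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Assumption: keep the old centroid if a cluster becomes empty.
                new_centroids.append(centroids[j])
        # Stop when the centroids do not change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical marks in two subjects for six students.
marks = [(90, 85), (88, 92), (85, 80), (40, 35), (45, 50), (38, 42)]
centroids, labels = k_means(marks, k=2, seed=1)
print(centroids, labels)
```

The `max_iter` cap is only a safeguard; on small, well-separated data such as this the loop stops after a few iterations because the centroids stop moving.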

CHAPTER III

METHODOLOGY

3.1 Introduction

This chapter discusses the methodology used to develop the system from the beginning until the system is completed. The methodology is very important in developing the system because it describes, step by step, how the system is developed, and it also serves as a reference for future developers who will extend or study the system. In addition, a methodology is a formalized approach to implementing the Software Development Life Cycle (SDLC). Various models have been defined and designed for the software development process. The SDLC model chosen to develop this system is the iterative model. Every phase involved in developing this system is explained in this chapter.

3.2 Iterative Model

Figure 3.1 Iterative Model

In this model, the process starts from the requirements, which are enhanced iteratively until the final software is implemented. Development begins by specifying and implementing just part of the software, which can then be reviewed in order to identify further requirements. This process is then repeated, producing a new version of the software in each cycle of the model. The model works in four phases: the requirement phase, the design phase, the implementation phase and the evaluation phase. This model was chosen because it allows better testing at each iteration. In addition, it does not involve a high level of complexity, and feedback is generated quickly. However, it requires planning at the technical level, and it is not easy to understand.

3.2.1 Requirement Phase

In this phase, the requirements for the software are gathered and analyzed. Iteration should eventually result in a requirements phase that produces a complete and final specification of requirements.

3.2.1.1 Software Requirement

The following software is used to develop the Student Part Time Job as Tutor System.

Table 3.1 List of Software

3.2.1.2 Hardware Requirement

The following hardware is used to develop the Student Part Time Job as Tutor System.

Hardware    Description
Laptop      HP 15-r236TX
            Processor: Intel® Core™ i3-4005U CPU @ 1.7 GHz
            RAM: 8.00 GB
            OS: Windows 10
            GPU: NVIDIA GeForce 820M

Table 3.2 List of Hardware

3.3 Analysis and Design Phase

In this phase, the software solution that meets the requirements is designed. The system framework diagram, Context Diagram (CD), Data Flow Diagram (DFD) and Entity Relationship Diagram (ERD) are built to clarify the actual system.

3.3.1 Framework Design

Figure 3.2 System Framework

The figure above shows an overview of the system. Both admin and student register and log in to the system. The admin updates the available tutor subjects in the system, and students can view and apply for as many subjects as they want. When applying for a subject, they enter the marks of the required subjects, and the marks are processed by the K-Means technique in the system. Once the calculation is done, the result is given to the admin for evaluation, and the admin updates the student on whether the application is successful.

3.3.2 System Design

3.3.2.1 Context Diagram

A system context diagram (CD) is a diagram that defines the boundary

between the system, or part of a system, and its environment, showing the

entities that interact with it. This diagram is a high-level view of a system.

Figure 3.3 Context Diagram

The figure above shows the overall flow of the whole system, which includes two entities: Student and Admin.

3.3.2.2 Data Flow Diagram

A data flow diagram (DFD) is a graphical representation of the “flow” of data

through an information system, modeling its process aspects. A DFD is often

used as a preliminary step to create an overview of the system without going

into great detail, which can later be elaborated.

3.3.2.2.1 Data Flow Diagram Level – 0

Figure 3.4 Data Flow Diagram Level-0 [Admin]

The figure above shows the DFD Level-0 for Admin, which includes 6 processes.

Figure 3.5 Data Flow Diagram Level-0 [Student]

The figure above shows the DFD Level-0 for Student, which includes 6 processes.

3.3.2.3 Entity Relationship Diagram

Entity relationship diagram (ERD) is a graphical representation of entities and

their relationships to each other, typically used in computing in regard to the

organization of data within databases or information systems.

Figure 3.6 Entity Relationship Diagram

The figure above shows the ERD of the system, which includes 5 entities and 6 relationships.

3.3.3 Data Model

A data model (or datamodel) is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities.

3.3.3.1 Admin

Table 3.3 Admin Data Model

Table above shows the details of admin data.

3.3.3.2 Student

Table 3.4 Student Data Model

Table above shows the details of student data.

3.3.3.3 Subject

Table 3.5 Subject Data Model

Table above shows the details of subject data.

3.3.3.4 Subject Mark

Table 3.6 Subject Mark Data Model

Table above shows the details of subject mark data.

3.3.4 Technique

3.3.4.1 K-Means Clustering

K-Means clustering is one of the simplest unsupervised learning techniques for solving the clustering problem. The procedure follows a simple way to classify a given data set into a certain number of clusters (assume k clusters) fixed a priori.

1. Define k centroids, one for each cluster. These centroids should be placed carefully, because different locations cause different results. It is therefore better to place them as far away from each other as possible.

2. Take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, this first pass is done.

3. Recalculate k new centroids as the centers of the clusters resulting from the previous step.

4. With these k new centroids, bind the same data points again to their nearest new centroid.

5. Repeat steps 3 and 4. In this loop the k centroids change their location step by step until no more changes occur; in other words, the centroids do not move any more.
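
One way to follow the advice of placing the initial centroids far away from each other is farthest-first selection, sketched below. This heuristic is swapped in here for illustration only and is not necessarily the initialization used in the system; the function name `spread_out_centroids` and the sample marks are hypothetical.

```python
import math


def spread_out_centroids(points, k):
    """Pick k initial centroids that are far apart (farthest-first heuristic)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: start from the first point; any starting point would do.
    centroids = [points[0]]
    while len(centroids) < k:
        # Choose the point whose distance to its nearest chosen centroid is largest.
        farthest = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centroids),
        )
        centroids.append(farthest)
    return centroids


# Hypothetical marks in two subjects for five students.
marks = [(90, 85), (88, 92), (60, 62), (40, 35), (45, 50)]
print(spread_out_centroids(marks, k=3))
```

The chosen points can then be used as the starting centroids for the assignment and update loop described in the steps above.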

3.3.5 Interface Design

In this phase, the software is coded, integrated and tested for prototyping purposes.

3.3.5.1 Student Interface Prototype

Figure 3.7 Main Page

The figure above shows the main page of the system, where students need to log in or register.

Figure 3.8 Dashboard Page

The figure above shows the student dashboard page, where students can view the subjects available to teach and apply for them.

Figure 3.9 Application Page

The figure above shows the application page, where students need to enter the required subject marks themselves.

REFERENCES

Ju, C., & Xu, C. (2013). A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm, 2013.

Kodinariya, T. M., & Makwana, P. R. (2013). Review on determining number of Cluster in K-Means Clustering. International Journal of Advance Research in Computer Science and Management Studies, 1(6), 2321–7782.

Li, C. S. (2011). Cluster center initialization method for K-means algorithm over data sets with two clusters. Procedia Engineering, 24, 324–328. https://doi.org/10.1016/j.proeng.2011.11.2650

Li, Y., & Wu, H. (2012). A Clustering Method Based on K-Means Algorithm. Physics Procedia, 25, 1104–1109. https://doi.org/10.1016/j.phpro.2012.03.206

Yadav, S., Bharadwaj, B., & Pal, S. (2012). Data mining applications: A comparative study for predicting student’s performance. International Journal of Innovative Technology & Creative Engineering, 1(12), 13–19. Retrieved from http://arxiv.org/abs/1202.4815