

Date post: 19-Oct-2020
STUDENT PART TIME JOB AS TUTOR SYSTEM USING K-MEANS ALGORITHM NUR ZARITH AKILLA BINTI AMBOAKA BACHELOR OF COMPUTER SCIENCE (INTERNET COMPUTING) UNIVERSITI SULTAN ZAINAL ABIDIN 2018
Transcript
  • STUDENT PART TIME JOB AS TUTOR SYSTEM

    USING K-MEANS ALGORITHM

    NUR ZARITH AKILLA BINTI AMBOAKA

    BACHELOR OF COMPUTER SCIENCE

    (INTERNET COMPUTING)

    UNIVERSITI SULTAN ZAINAL ABIDIN

    2018

  • STUDENT PART TIME JOB AS TUTOR SYSTEM USING K-MEANS

    ALGORITHM

    NUR ZARITH AKILLA BINTI AMBOAKA

    Bachelor of Computer Science (Internet Computing)

    Faculty of Informatics and Computing

    Universiti Sultan Zainal Abidin, Terengganu, Malaysia

    MAY 2018

  • i

    DECLARATION

    I hereby declare that this report is based on my original work except for quotations

    and citations, which have been duly acknowledged. I also declare that it has not been

    previously or concurrently submitted for any other degree at Universiti Sultan Zainal

    Abidin or other institutions.

    ________________________________

    Name : Nur Zarith Akilla Binti Amboaka

    Date : ..................................................

  • ii

    CONFIRMATION

    This is to confirm that:

    The research conducted and the writing of this report was under my supervision.

    ________________________________

    Name : ..................................................

    Date : ..................................................

  • iii

    DEDICATION

    In the name of Allah, the Most Gracious and the Most Merciful, all praise is only for
    Him. The documentation and the system for the subject CSB 35102, Projek Ilmiah
    2018/2019, were completed in due time. I would like to take this opportunity to give a
    big thank-you to my kind supervisor, Dr. Suhailan Bin Dato’ Safei, for the valuable
    ideas, time, support, advice and guidance given throughout the development of this
    research until phase one of the project was completed. Besides that, I also want to
    express my appreciation to my beloved family, who supported and motivated me while I
    was finishing this project. I would also like to thank the friends who were willing to
    lend a hand in finishing the project. Lastly, thank you to everyone who was directly or
    indirectly involved in the process of making the system and documentation.

  • iv

    ABSTRACT

    Nowadays, there are students who need extra pocket money to support their life at
    university. One way to earn it is to work as a part-time tutor, either for their friends
    at university or for school students outside the university. Being a part-time tutor
    helps them build their self-esteem and gain experience for their future career.
    However, some of them are still hesitant to teach because they do not really know how
    to assess their abilities in a specific subject. Moreover, they need to prove to their
    future clients or students that they are capable of teaching the subject. Therefore,
    this project was built to classify their ability to teach a subject based on their
    achievement in the courses that they take at university. This project is important for
    convincing other students who need a tutor in a specific subject. To realize this
    project, a clustering technique will be applied using a centroid-based clustering
    algorithm, K-means. K-means is often called unsupervised learning, as we do not have
    prescribed labels in the data and no class values denoting an a priori grouping of the
    data instances are given.

  • v

    ABSTRAK

    Nowadays, there are students who need extra pocket money to support their life at
    university. One way to earn extra pocket money is to become a part-time tutor, either
    among their friends at university or among school students outside the university.
    Working as a part-time tutor is very good for raising their level of self-confidence
    and for gaining experience for their future career. However, some of them are still
    hesitant to teach because they do not know how to assess their abilities in a
    particular subject. Moreover, they need to prove to their future clients or students
    that they are able to teach the subject. Therefore, this project was built to classify
    their ability to teach a subject based on their achievement in the courses that they
    take at university. This project is important for convincing other students who need a
    tutor in a particular subject. To realize this project, a clustering technique will be
    applied using a centroid-based clustering algorithm, K-means. K-means is often called
    unsupervised learning, because we do not prescribe labels in the data and no class
    values denoting an a priori grouping of the data instances are given.

  • vi

    CONTENTS

    PAGE

    DECLARATION i
    CONFIRMATION ii
    DEDICATION iii
    ABSTRACT iv
    ABSTRAK v
    CONTENTS vi
    LIST OF TABLES viii
    LIST OF FIGURES ix
    LIST OF ABBREVIATIONS x

    CHAPTER I INTRODUCTION
    1.1 Background 1
    1.2 Problem statement 1
    1.3 Objectives 1
    1.4 Scopes 2
        1.4.1 Scope Admin
        1.4.2 Scope Student
    1.5 Limitation of Work 2
    1.6 Expected Outcome 2
    1.7 Report Structure 3

    CHAPTER II LITERATURE REVIEW
    2.1 Introduction 4
    2.2 Similar System 4
    2.3 K-Means Clustering Algorithm 4
        2.3.1 What is Clustering Technique
        2.3.2 Introduction to K-Means Clustering
        2.3.3 K-Means Clustering Algorithm

  • vii

    CHAPTER III METHODOLOGY
    3.1 Introduction 7
    3.2 Iterative Model 7
        3.2.1 Requirement Phase 8
    3.3 Analysis and System Design 9
        3.3.1 Framework Design 9
        3.3.2 System Design 10
        3.3.3 Data Model 11
        3.3.4 Technique 15
        3.3.5 User Interface Design 16

    REFERENCES 18

  • viii

    LIST OF TABLES

    TABLE TITLE PAGE

    3.1 List of Software 8
    3.2 List of Hardware 9
    3.3 Admin Data Model 13
    3.4 Student Data Model 14
    3.5 Subject Data Model 14
    3.6 Subject Mark Data Model 14

  • ix

    LIST OF FIGURES

    FIGURE TITLE PAGE

    2.1 Nearest cluster assignment formula 6
    2.2 Centroids update formula 6
    3.1 Iterative Model 7
    3.2 System Framework 9
    3.3 Context Diagram 10
    3.4 Data Flow Diagram Level-0 [Admin] 11
    3.5 Data Flow Diagram Level-0 [Student] 12
    3.6 Entity Relationship Diagram 13
    3.7 Main Page 15
    3.8 Dashboard Page 16
    3.9 Application Page 16

  • x

    LIST OF ABBREVIATIONS / TERMS / SYMBOLS

    CD Context Diagram

    DFD Data Flow Diagram

    ERD Entity Relationship Diagram

    FYP Final year project

  • xi

    LIST OF APPENDICES

    APPENDIX TITLE PAGE

    A Appendix 1 80

    B Appendix 2 81

    C Appendix 3 82

    D Appendix 4 83

  • 1

    CHAPTER I

    INTRODUCTION

    1.1 Background

    Student Part Time Job as Tutor System using K-Means Algorithm is a web-based
    application. The system is built around academic achievement in a subject: it helps
    students who want to work as part-time tutors to teach subjects that fit their
    skills. The problem is how to choose a tutor based on academic achievement in a
    particular subject. For example, a student who wants to tutor the Data Structure
    subject must have good results in the basic programming and object-oriented
    programming subjects. The system measures how good the students are in these
    subjects and clusters them accordingly. To realize the system, the K-Means
    clustering algorithm will be used. To apply for a tutor job, students fill in their
    subject grades, and the grades are compared against the cluster centroids to
    determine whether they belong to the right tutor group.
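    As a rough illustration of that last step, the sketch below compares a student's
    marks in the two prerequisite subjects with a set of cluster centroids and reports
    the nearest one. The centroid values, the group labels and the function name
    `nearest_group` are assumptions made for this example only; they are not taken from
    the system.

```python
import math

# Hypothetical centroids: (basic programming mark, OOP mark) for each group.
CENTROIDS = {
    "recommended": (85.0, 82.0),
    "borderline": (65.0, 63.0),
    "not recommended": (45.0, 48.0),
}

def nearest_group(marks, centroids=CENTROIDS):
    """Return the label of the centroid closest to `marks` (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(marks, centroids[label]))

print(nearest_group((88, 79)))  # recommended
print(nearest_group((50, 52)))  # not recommended
```

    In a real deployment the centroids would come from running K-Means on the marks of
    many applicants rather than being written by hand.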

    1.2 Problem Statement

    How can tutors be classified correctly among students according to their
    achievement in certain subjects?

    1.3 Objectives

    There are three main objectives in developing this system:

    1.3.1 To analyze the current problem in Student Part Time Job as Tutor.

    1.3.2 To design the proposed Student Part Time Job as Tutor system
    based on subject grades using the K-Means technique.

  • 2

    1.3.3 To develop the Student Part Time Job as Tutor system based on
    subject grades using the K-Means technique.

    1.4 Scope

    There are two scopes in this system:

    1.4.1 Scope Admin

    1.4.1.1 Admin can log in to the system.

    1.4.1.2 Admin can manage profiles, namely the student part-time
    tutor profiles.

    1.4.1.3 Admin can create, update and delete user profiles.

    1.4.2 Scope Student

    1.4.2.1 Student can register in the system.

    1.4.2.2 Student can add, update and delete their details in the
    system.

    1.4.2.3 Student needs to fill in the profile form and the educational
    form in the system.

    1.4.2.4 Student can view the subjects recommended for them to teach
    in the system.

    1.5 Limitation of Work

    1.5.1 The subject marks are entered manually by the students. It is up
    to the management to validate the data.

    1.5.2 This system can only cluster the results and give
    recommendations to the part-time tutor.

    1.6 Expected Outcome

    This system is expected to group part-time tutors based on similar course
    achievement and to assign them a subject to teach that suits their skills.
    Finally, students will be given a list of recommended subjects that suit the
    group they fall into.

  • 3

    1.7 Report Structure

    This report has six (6) chapters. Chapter 1 consists of the project background,
    the problem statement, the objectives and the system scope. Chapter 2 is the
    literature review, which reviews previous systems. Chapter 3 describes the
    research methodology; this research uses the iterative model. Chapter 4
    explains the system’s framework and design. Chapter 5 covers implementation,
    testing and results. Lastly, Chapter 6 concludes the whole project.

  • 4

    CHAPTER II

    LITERATURE REVIEW

    2.1 Introduction

    This chapter reviews the literature on the technique used to develop the
    Student Part Time Job as Tutor System, which is based on students’ subject
    achievement and uses the K-Means clustering algorithm.

    2.2 Similar System

    The Student Part Time Job as Tutor system is a project built to help an
    organization choose the best tutors among its students. The system selects a
    tutor for a subject that the student is good at: candidates are chosen based on
    their achievement in that particular subject, calculated from their grades in
    it. This matters because not every student is good at every subject they take;
    some have a strong understanding and good achievement only in particular
    subjects. These are the students we want, so that they can teach others who are
    not good at the subject. Nowadays, the normal procedure for selecting a tutor,
    lecturer or teacher is based on CGPA and an interview session. This method does
    not fully guarantee that the selected tutor is good at the given job scope.
    Selection based on achievement in specific subjects is lacking.

    2.3 K-Means Clustering Algorithm

    2.3.1 What is clustering technique

    Clustering is a technique for finding similarity groups in data, called clusters.
    It attempts to group individuals in a population together by similarity, but is
    not driven by a specific purpose. Clustering is often called unsupervised
    learning, as you do not have prescribed labels in the data and no class values

  • 5

    denoting a priori grouping of the data instances are given (Manu Jeevan, 2017).
    K-Means clustering was proposed by J.B. MacQueen (Zhang Yufang, 2003).

    2.3.2 Introduction to K-Means Clustering Algorithm

    K-Means is a method of clustering observations into a specific number of
    disjoint clusters. The ‘K’ refers to the number of clusters specified. Various
    distance measures exist to determine which observation is assigned to which
    cluster. The algorithm aims to minimize the distance between each observation
    and the centroid of its cluster by iteratively reassigning observations to
    clusters, and it terminates when the assignments no longer lower that distance.
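    When the distance measure is the squared Euclidean distance, this aim is
    commonly written as minimizing the within-cluster sum of squares. The notation
    below (x for an observation, S_k for the set of observations in cluster k, m_k
    for the centroid of cluster k) is an assumption chosen to resemble the symbols
    used later in this chapter; it is not copied from the report's figures.

```latex
J = \sum_{k=1}^{K} \sum_{x \in S_k} \lVert x - m_k \rVert^{2},
\qquad
m_k = \frac{1}{\lvert S_k \rvert} \sum_{x \in S_k} x
```

    Each assignment step and each centroid update can only keep J the same or lower
    it, which is why the iteration eventually stops changing.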

    2.3.3 K-Means Clustering Algorithm

    K-Means defines a prototype in terms of a centroid, which is usually the mean
    of a group of points, and is typically applied to objects in a continuous
    n-dimensional space. The K-Means clustering technique is simple, and we begin
    with a description of the basic algorithm.

    2.3.3.1 Initial Centroids Selection

    We first choose K initial centroids. A centroid (k) is a cluster centre,
    represented by the feature values of the group of nearby objects assigned to
    it. It also serves as the reference point for assigning objects to a cluster
    based on their nearest distance to the centroid. At the beginning of the
    assignment process, a set of K initial centroids needs to be predetermined so
    that the objects can be assigned accordingly. In basic K-Means, these initial
    centroids are randomly selected from among the objects.
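    A minimal sketch of this random selection is shown below. The function name
    `choose_initial_centroids`, the `seed` parameter and the sample marks are
    illustrative assumptions, not part of the report.

```python
import random

def choose_initial_centroids(objects, k, seed=None):
    """Pick k of the objects at random (without replacement) as initial centroids."""
    if not 1 <= k <= len(objects):
        raise ValueError("k must be between 1 and the number of objects")
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    return rng.sample(objects, k)

# Hypothetical (subject 1 mark, subject 2 mark) pairs.
marks = [(90, 85), (40, 55), (75, 70), (60, 62), (30, 45)]
centroids = choose_initial_centroids(marks, k=2, seed=0)
print(centroids)
```

    Because the starting centroids are random, two runs without a fixed seed may
    end in different clusterings.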

    2.3.3.2 Nearest Cluster Assignment

    Each point is then assigned to the closest centroid, and each collection of
    points assigned to a centroid forms a cluster. The clustering process begins
    by measuring the distance from each object to each centroid (mk).

  • 6

    Figure 2.1 Nearest cluster assignment formula

    where Sik is the set of objects in cluster-k, k = 0 to K, and d is a feature.
    Each object is assigned to the cluster whose centroid is closest to it. The
    distance is measured using the Euclidean distance, the typical nearest-object
    measurement in K-Means.
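    The formula of Figure 2.1 is not reproduced in this transcript. As a hedged
    sketch of the rule described in the text, the code below assigns each object
    to the centroid with the smallest Euclidean distance. The function name
    `assign_to_nearest` and the sample values are assumptions for illustration.

```python
import math

def assign_to_nearest(objects, centroids):
    """Return, for each object, the index of its nearest centroid."""
    assignments = []
    for obj in objects:
        # Euclidean distance from this object to every centroid m_k.
        distances = [math.dist(obj, m_k) for m_k in centroids]
        assignments.append(distances.index(min(distances)))
    return assignments

marks = [(90, 85), (40, 55), (75, 70), (30, 45)]
centroids = [(80, 80), (35, 50)]
print(assign_to_nearest(marks, centroids))  # [0, 1, 0, 1]
```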

    2.3.3.3 Centroids Update

    Then, the centroid of each cluster is updated based on the points assigned to
    the cluster: once the objects have been re-assigned, the centroid of each
    cluster needs to be re-calculated. We repeat the assignment and update steps
    until no point changes clusters or, equivalently, until the centroids remain
    the same.

    Figure 2.2 Centroids update formula

    where M is the total number of objects in cluster-k, k = 0 to K and d = 0 to D.
    This step ensures that all objects currently assigned to a cluster really
    belong to that cluster (i.e. they are nearest to its newly computed centroid)
    and are far from the other clusters. If an object turns out to be nearer to
    another centroid, it needs to be reassigned to that nearest cluster. Thus,
    iteratively, the cycle of nearest cluster assignment (Section 2.3.3.2) and
    centroid update (Section 2.3.3.3) is repeated until the centroids of all
    clusters no longer change.
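    The formula of Figure 2.2 is likewise missing from this transcript. The sketch
    below shows the usual update, in which each centroid becomes the per-feature
    mean of the M objects assigned to it. The function name `update_centroids` and
    the handling of empty clusters are assumptions, since the report does not say
    what happens when a cluster has no objects.

```python
def update_centroids(objects, assignments, old_centroids):
    """Recompute each centroid as the per-feature mean of its assigned objects."""
    new_centroids = []
    for k, old in enumerate(old_centroids):
        members = [obj for obj, a in zip(objects, assignments) if a == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(tuple(old))
            continue
        m = len(members)  # M: number of objects in cluster k
        new_centroids.append(tuple(sum(col) / m for col in zip(*members)))
    return new_centroids

marks = [(90, 85), (40, 55), (75, 70), (30, 45)]
print(update_centroids(marks, [0, 1, 0, 1], [(80, 80), (35, 50)]))
# [(82.5, 77.5), (35.0, 50.0)]
```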

    2.3.3.4 Basic K-Means Algorithm

    1: Select K points as initial centroids.
    2: repeat
    3:     Form K clusters by assigning each point to its closest centroid.
    4:     Recompute the centroid of each cluster.
    5: until Centroids do not change.
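    The five steps above can be sketched in Python as follows. This is a minimal
    illustration, not the implementation used in the system: the function name
    `k_means`, the `seed` and `max_iter` parameters, the empty-cluster rule and the
    sample marks are all assumptions.

```python
import math
import random

def k_means(points, k, seed=None, max_iter=100):
    """Basic K-Means on a list of numeric tuples; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # 1: select K initial centroids
    assignments = []
    for _ in range(max_iter):                      # 2: repeat
        # 3: form K clusters by assigning each point to its closest centroid
        assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # 4: recompute the centroid of each cluster
        new_centroids = []
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # assumption: keep an empty cluster's centroid
        if new_centroids == centroids:             # 5: until centroids do not change
            break
        centroids = new_centroids
    return centroids, assignments

# Hypothetical marks in two subjects for six students.
marks = [(90, 85), (88, 92), (85, 80), (40, 55), (35, 42), (30, 45)]
centroids, groups = k_means(marks, k=2, seed=1)
print(centroids, groups)
```

    The cluster numbers themselves are arbitrary; only the grouping is meaningful,
    so the same students may appear as cluster 0 in one run and cluster 1 in another.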

  • 7

    CHAPTER III

    METHODOLOGY

    3.1 Introduction

    This chapter discusses the methodology used to develop the system from the
    beginning until the system is completed. The methodology is very important in
    developing the system because it describes, step by step, how the system is
    developed, and it also serves as a reference for those who will later extend or
    study the system. In addition, a methodology is a formalized approach to
    implementing the Software Development Life Cycle (SDLC). Various models have
    been defined and designed for the software development process. The SDLC model
    chosen to develop this system is the iterative model. Every phase involved in
    developing this system is explained in this chapter.

    3.2 Iterative Model

    Figure 3.1 Iterative Model

  • 8

    In this model, the process starts from the requirements and iteratively enhances
    them until the final software is implemented. Development begins by specifying
    and implementing just part of the software, which can then be reviewed in order
    to identify further requirements. This process is then repeated, producing a
    new version of the software in each cycle of the model. The model works in four
    phases: the requirement phase, design phase, implementation phase and
    evaluation phase. This model was chosen because it allows better testing at
    each iteration. In addition, it does not involve a high degree of complexity,
    and feedback is generated quickly. However, this model requires planning at a
    technical level and is not easy to understand.

    3.2.1 Requirement Phase

    In this phase, the requirements for the software are gathered and analyzed.
    Iteration should eventually result in a requirements phase that produces a
    complete and final specification of requirements.

    3.2.1.1 Software Requirement

    Software used to develop the Student Part Time as Tutor System.

    Table 3.1 List of Software

  • 9

    3.2.1.2 Hardware Requirement

    Hardware used to develop the Student Part Time Job as Tutor System.

    Hardware  Description
    Laptop    HP 15-r236TX
              Processor: Intel® Core™ i3-4005U CPU @ 1.7 GHz
              RAM: 8.00 GB
              OS: Windows 10
              GPU: NVIDIA GeForce FT 820M

    Table 3.2 List of Hardware

    3.3 Analysis and Design Phase

    In this phase, the software solution that meets the requirements is designed.
    The system framework diagram, Context Diagram (CD), Data Flow Diagram (DFD) and
    Entity Relationship Diagram (ERD) are built to clarify the actual system.

    3.3.1 Framework Design

    Figure 3.2 System Framework

    The figure above shows an overview of the system. Both admin and student
    register and log in to the system. The admin updates the tutor subjects
    available in the system, and students can view and apply for as many subjects as they

  • 10

    want. When applying for a subject, they enter their marks for the required
    subjects, and the marks are processed using the K-Means technique in the
    system. Once the calculation is done, the result is given to the admin for
    evaluation, and the admin updates the student on whether the application was
    successful or not.
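    One practical detail of this flow is that the cluster numbers produced by
    K-Means carry no meaning by themselves, so something has to decide which
    cluster counts as the strong group before the admin evaluates it. The sketch
    below assumes one possible rule, taking the cluster whose centroid has the
    highest mean mark. The rule, the function name `strongest_cluster_members` and
    the applicant names are hypothetical and are not described in the report.

```python
def strongest_cluster_members(names, assignments, centroids):
    """Return the names in the cluster whose centroid has the highest mean mark."""
    best = max(
        range(len(centroids)),
        key=lambda k: sum(centroids[k]) / len(centroids[k]),
    )
    return [name for name, a in zip(names, assignments) if a == best]

# Hypothetical applicants and a clustering result for two required subjects.
names = ["Aina", "Badrul", "Chong", "Devi"]
assignments = [1, 0, 1, 0]
centroids = [(42.5, 50.0), (86.0, 81.5)]
print(strongest_cluster_members(names, assignments, centroids))  # ['Aina', 'Chong']
```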

    3.3.2 System Design

    3.3.2.1 Context Diagram

    A system context diagram (CD) is a diagram that defines the boundary

    between the system, or part of a system, and its environment, showing the

    entities that interact with it. This diagram is a high-level view of a system.

    Figure 3.3 Context Diagram

    The figure above shows the overall flow of the whole system, which involves two
    entities: Student and Admin.

    3.3.2.2 Data Flow Diagram

    A data flow diagram (DFD) is a graphical representation of the “flow” of data

    through an information system, modeling its process aspects. A DFD is often

    used as a preliminary step to create an overview of the system without going

    into great detail, which can later be elaborated.

  • 11

    3.3.2.2.1 Data Flow Diagram Level – 0

    Figure 3.4 Data Flow Diagram Level-0 [Admin]

    The figure above shows the DFD Level-0 for Admin, where six processes are
    included in the Admin process.

  • 12

    Figure 3.5 Data Flow Diagram Level-0 [Student]

    The figure above shows the DFD Level-0 for Student, where six processes are
    included in the Student process.

    3.3.2.3 Entity Relationship Diagram

    Entity relationship diagram (ERD) is a graphical representation of entities and

    their relationships to each other, typically used in computing in regard to the

    organization of data within databases or information systems.

  • 13

    Figure 3.6 Entity Relationship Diagram

    The figure above shows the ERD of the system, which contains five entities and
    six relationships.

    3.3.3 Data Model

    A data model (or datamodel) is an abstract model that organizes elements of
    data and standardizes how they relate to one another and to the properties of
    real-world entities.

    3.3.3.1 Admin

    Table 3.3 Admin Data Model

    The table above shows the details of the admin data.

  • 14

    3.3.3.2 Student

    Table 3.4 Student Data Model

    The table above shows the details of the student data.

    3.3.3.3 Subject

    Table 3.5 Subject Data Model

    The table above shows the details of the subject data.

    3.3.3.4 Subject Mark

    Table 3.6 Subject Mark Data Model

    The table above shows the details of the subject mark data.

  • 15

    3.3.4 Technique

    3.3.4.1 K-Means Clustering

    K-Means clustering is one of the simplest unsupervised learning techniques for
    solving the clustering problem. The procedure follows a simple way to classify
    a given data set into a certain number of clusters (assume k clusters) fixed a
    priori.

    Define k centroids, one for each cluster.

    These centroids should be placed carefully, because different locations cause
    different results. It is therefore better to place them as far away from each
    other as possible.

    Take each point belonging to the given data set and associate it with the
    nearest centroid.

    When no point is pending, this first pass is done. At this point, k new
    centroids need to be recalculated as the centres of the clusters resulting
    from the previous step.

    With these k new centroids, a new binding has to be done between the same data
    points and their nearest new centroid.

    A loop has thus been generated. As it runs, the k centroids change their
    location step by step until no more changes occur. In the simplest words, the
    centroids do not move any more.
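    The advice to place the initial centroids far apart can be approximated with a
    farthest-first selection, sketched below. This is one possible heuristic
    chosen for illustration; the report does not state which placement method the
    system uses, and the function name `farthest_first_centroids`, the choice of
    the first point as the starting centroid and the sample marks are assumptions.

```python
import math

def farthest_first_centroids(points, k):
    """Pick k initial centroids that are spread out (farthest-first traversal)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [points[0]]  # assumption: start from the first point
    while len(centroids) < k:
        # Choose the point whose distance to its nearest chosen centroid is largest.
        next_point = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centroids),
        )
        centroids.append(next_point)
    return centroids

marks = [(90, 85), (88, 92), (60, 62), (35, 42), (30, 45)]
print(farthest_first_centroids(marks, 3))
# [(90, 85), (30, 45), (60, 62)]
```

    Compared with purely random selection, this makes the starting point
    deterministic, at the cost of being sensitive to outliers, which tend to be
    picked as centroids.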

    3.3.5 Interface Design

    This phase is when the software is coded, integrated and tested for prototyping
    purposes.

  • 16

    3.3.5.1 Student Interface Prototype

    Figure 3.7 Main Page

    The figure above shows the Main page of the system, where students need to log
    in or register.

    Figure 3.8 Dashboard Page

    The figure above shows the student Dashboard page, where students can view the
    subjects available to teach and apply for them.

  • 17

    Figure 3.9 Application Page

    The figure above shows the Application page, where students need to enter
    their marks for the required subjects themselves.

  • 18

    REFERENCES

    Ju, C., & Xu, C. (2013). A New Collaborative Recommendation Approach Based on
    Users Clustering Using Artificial Bee Colony Algorithm, 2013.

    Kodinariya, T. M., & Makwana, P. R. (2013). Review on determining number of
    Cluster in K-Means Clustering. International Journal of Advance Research in
    Computer Science and Management Studies, 1(6), 2321–7782.

    Li, C. S. (2011). Cluster center initialization method for K-means algorithm
    over data sets with two clusters. Procedia Engineering, 24, 324–328.
    https://doi.org/10.1016/j.proeng.2011.11.2650

    Li, Y., & Wu, H. (2012). A Clustering Method Based on K-Means Algorithm.
    Physics Procedia, 25, 1104–1109. https://doi.org/10.1016/j.phpro.2012.03.206

    Yadav, S., Bharadwaj, B., & Pal, S. (2012). Data mining applications: A
    comparative study for predicting student’s performance. International Journal
    of Innovative Technology & Creative Engineering, 1(12), 13–19. Retrieved from
    http://arxiv.org/abs/1202.4815

