National Ribat University

Faculty of Graduate Studies & Scientific Research

Database Hiding in Tag Web Using Steganography by Genetic Algorithm

Thesis Submitted in Fulfillment of the Requirements for the Ph.D. in Computer Science

By: Fatma Abdalla Mabrouk

Supervisor: Prof. Mudawi Mukhtar Elmusharaf

1438-2017

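OPENING

In the name of Allah, the Most Gracious, the Most Merciful

"They said: Glory be to You! We have no knowledge except what You have taught us. Indeed, You are the All-Knowing, the All-Wise."

Allah Almighty has spoken the truth

Surat Al-Baqarah, verse 32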

DEDICATION

To my big family and my small family, for their continuous support and encouragement.

ACKNOWLEDGEMENTS

First of all, I thank Almighty God for blessing me with more than I deserve and for granting me the strength and perseverance to complete this research and present it in a satisfactory manner.

I would like to express my sincere gratitude to my supervisor, Prof. Dr. Mudawi Mukhtar Elmusharaf, first for accepting the supervision of this thesis and second for his patience, continuous support, motivation and immense knowledge. His guidance helped me greatly during the research and the writing of this thesis.

The completion of this project would also not have been possible without the support of my friends Muna Ahmed Alsadig and Imtesal Ali Yasin.

I would like to thank my family for their love and encouragement: my mother, whose continuous prayers have always supported me and given me the strength to keep on struggling; my brothers, especially my brother Osama, for his spiritual support throughout this thesis and in my life as a whole; and my sisters, especially Amal, the first person who supported me in completing my proposal. Thank you for giving me all this time. I also thank my uncle Salah Mabrouk, without whom this thesis could not have seen the light.

Special thanks go to my dear husband Amir Osman, whose patience and support helped me to complete my work.

Last but not least, my heartfelt thanks go to the soul of my dear father, who raised in me the love of science and always supported me, in his life and after his death.

ABSTRACT

The main goal of this research is to study steganography using a genetic algorithm (GA) and to design a new system called SteganoTag. It is a new steganographic method that hides a database within saved web pages using a genetic algorithm, without changing the page size, in order to increase the reliability and confidentiality of the database system. A sample database designed in XML was selected.

The main scope of this research is the application of the genetic algorithm as a data security method; the algorithm belongs to the field of evolutionary programming in artificial intelligence. An experimental method is used to analyse different types of home pages. The proposed technique uses HTML tags and their attributes to hide an illogical database using the genetic algorithm, and it considers tags as genes and attributes as chromosomes. The architecture of the proposed information hiding system is then detailed, the system is implemented in C#, and it is simulated using different scenarios and a variety of data. The advantages of using steganography with a genetic algorithm are clarified.

Finally, the most important finding of this research is that combining the genetic algorithm with steganography raises the efficiency of hiding data in a web page without changing its parameters, and that the encryption algorithm increases the difficulty of unauthorized attempts to remove the hidden data. The genetic algorithm can achieve a significant improvement in data security. Following the same methodology, this combination can be extended to the development of other security systems in order to obtain safer and more reliable database hiding systems. The high flexibility of HTML can be exploited in many other techniques; other, less common languages and the Internet protocols can also be used for database hiding; and the method can be developed further by introducing evolutionary algorithms to increase the efficiency of data hiding. The hiding process can also be extended to e-mail data.

ABSTRACT (ARABIC VERSION)

The aim of this research is to study information hiding using the genetic algorithm and then to design the (SteganoTag) system as one of the new methods of information hiding, by hiding a database inside a saved web page using the genetic algorithm without changing the page size, in order to increase the reliability and confidentiality of the database system. A sample database designed in XML is used.

The main scope of this research is the application of the genetic algorithm as a data security method; it is used in the field of evolutionary programming in artificial intelligence. The experimental method is used to analyse different types of home pages. The proposed technique uses HTML tags and their attributes to hide an illogical database using the genetic algorithm. The proposed technique considers any tag to represent a gene and any attribute to represent a chromosome.

The architecture was designed and the proposed information hiding system was implemented in C#, and the system was simulated using different scenarios and a variety of datasets. The good side of using information hiding with the genetic algorithm is clarified.

Finally, among the most important findings of this research is that combining steganography with the genetic algorithm improves the process of hiding data in a web page without changing its features, and that the encryption algorithm increased the complexity of unauthorized attempts to remove the hidden data. The genetic algorithm can achieve a significant improvement in data security; following the same methodology, this combination can be extended to the development of other security systems to obtain safer and more reliable systems for hiding databases. The high flexibility of HTML can be exploited in many other techniques, and other uncommon languages can be used for database hiding. Internet protocols can be exploited for database hiding, and the method can be developed by introducing evolutionary algorithms to increase the efficiency of data hiding. Hiding can also be developed for e-mail data.

CONTENT

TITLE PAGE
OPENING
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRACT (ARABIC VERSION)
CONTENT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF SYMBOLS

CHAPTER 1: INTRODUCTION
1.1 Problem Background
1.2 Motivation
1.3 Research Problem
1.4 Research Objectives
1.5 Research Questions
1.6 Research Scope
1.7 Research Methodology and Activities
1.8 Glossary of Research Terms
1.9 Structure of Thesis
1.10 Chapter Summary

CHAPTER 2: LITERATURE REVIEW AND PREVIOUS STUDIES
2.1 Introduction
2.2 Literature Review
    2.2.1 The Security On the Internet Environment
    2.2.2 Security and Steganography
    2.2.3 Data Hiding Techniques
    2.2.4 Comparative Between Cryptography and Steganography
    2.2.5 The Fundamental Requirements When Hiding Data in Data
    2.2.6 Method of Hiding Data
    2.2.7 Steganography in Digital Age
    2.2.8 The Steganography Approaches
    2.2.9 Types of Steganography
    2.2.10 Different Types of Media That Hide Data in Steganographic Techniques
    2.2.11 Techniques of Steganography On Network
    2.2.12 Network Stego Techniques
    2.2.13 HTML Characteristics
    2.2.14 Data Steganography Techniques On HTML Document
    2.2.15 Genetic Algorithm Are Part of Evolutionary Algorithms
    2.2.16 History of Genetic Algorithm
    2.2.17 Genetic Algorithm's Techniques
    2.2.18 Generating Random Population
2.3 Previous Studies
    2.3.1 Previous Analysis of Methods in Steganography
    2.3.2 Previous Survey of Genetic Algorithm Applications
    2.3.3 Previous Review to HTML Web Page's Steganography
    2.3.4 Comparative Studies with Other Related Methods
2.4 Chapter Summary

CHAPTER 3: RESEARCH PROCEDURES
3.1 Introduction
3.2 Hiding Data in The Stored Web Page
3.3 Security Definition On the Project
3.4 The Proposed Technique
3.5 The Philosophy of Genetic Algorithm Application
3.6 Creation Database and Generation of an Encryption Key
    3.6.1 Creating Database
    3.6.2 Generating an Encryption Key
    3.6.3 Hiding The Data
    3.6.4 Extracting The Data
3.7 The Architecture Design of the Proposed System
3.8 The Implementation of the Proposed System
3.9 The Good Side of Using Genetic Algorithm in The Proposed System
3.10 The Limitations of Using Genetic Algorithm in The Proposed System
3.11 Chapter Summary

CHAPTER 4: ANALYSIS AND DISCUSSION
4.1 Introduction
4.2 System Examination
4.3 Comparative Studies
4.4 Performance Evaluation of Genetic Algorithm
    4.4.1 Fitness Value When Select Mutation
    4.4.2 Fitness Value When Cancel Mutation
4.5 CPU Time Usage
4.6 Chapter Summary

CHAPTER 5: CONCLUSION AND FUTURE WORK
5.1 Conclusions
5.2 Results
5.3 Future work
5.4 Recommendation

REFERENCES

APPENDICES
A. The Screens of the System
B. Source Code

LIST OF TABLES

Table 2.1: Comparison between cryptography and steganography
Table 2.2: Summary of studies on steganography
Table 2.3(A): Summary of studies on Genetic Algorithm
Table 2.3(B): Summary of studies on Genetic Algorithm
Table 2.4: Compare between some methods used web page
Table 4.1: Results of efficiency on the companies' pages
Table 4.2: Results of efficiency on the news' pages
Table 4.3: Results of efficiency on universities' pages
Table 4.4: Results of efficiency on Social Media' pages
Table 4.5: Simulation Tests Results for pages' size after hiding data
Table 4.6: Comparison between pages Capacity on pages share with other method
Table 4.7: Training dataset to test a select mutation
Table 4.8: Training dataset to test a cancel mutation
Table 4.9: Training dataset to test fitness function when cancel mutation
Table 4.10: Results of CPU time when steganography on dell page

LIST OF FIGURES

Figure 1.1: The Flow of the Research Activities
Figure 2.1: The core standards for Computer and network security
Figure 2.2: Fundamental objectives of Information security
Figure 2.3: Types of security system
Figure 2.4: Types of Steganography
Figure 2.5: Steganographic media Techniques
Figure 2.6: Techniques of Steganography on Network
Figure 2.7: The block situation of natural search techniques
Figure 2.8: A Genetic Algorithm's Techniques
Figure 2.9: A diagram for the steps of general Genetic Algorithm
Figure 3.1: The criteria to protection database on internet environment
Figure 3.2: The four security techniques
Figure 3.3: Tag characteristic on proposed technique
Figure 3.4: Flow chart to hide database on HTML document
Figure 3.5: Flow chart to extract database from HTML document
Figure 3.6: The architecture of (SteganoTag) system
Figure 3.7: The main steps in steganography proposal
Figure 4.1: Simulation Tests Results Increase rate after hide data
Figure 4.2: Comparison results between the two methods
Figure 4.3: Experimental dataset when select mutation
Figure 4.4: The evaluation simulation result when select mutation
Figure 4.5: Simulation result of GP modelling of best fitness simulation when select mutation
Figure 4.6: The results of select mutation to training dataset
Figure 4.7: The result of GP model fitness evolution of the program
Figure 4.8: The results of training dataset when select mutation
Figure 4.9: Simulation results of test data when select mutation
Figure 4.10: The evaluation result when cancel mutation
Figure 4.11: The results of modelling of best fitness simulation when cancel mutation
Figure 4.12: The results of training dataset when cancel mutation
Figure 4.13: Model fitness evolution of the program when cancel mutation
Figure 4.14: The result of training dataset when cancel mutation
Figure 4.15: Simulation results for time line when steganography on BBC news page
Figure 4.16: Simulation results for method grid when steganography on BBC news page
Figure 4.17: The results of CPU time when steganography on dell page

LIST OF ABBREVIATIONS

Abbreviation Description

ADO ActiveX Data Objects

AI Artificial Intelligence

BMP BitMap Picture

BOSS Break Our Steganographic System

BPCS Bit-Plane Complexity Segmentation

CPU Central Processing Unit

DWT Discrete Wavelet Transform

EAs Evolutionary Algorithms

EOF End-Of-File

EP Evolutionary Programming

FTP File Transfer Protocol

GA Genetic Algorithm

GIF Graphics Interchange Format

GP Genetic Programming

HTML Hyper Text Markup Language

HTTP HyperText Transfer Protocol

HUGO Highly Undetectable steGO

HVS Human Visual System

ICMP Internet Control Message Protocol

IP Internet Protocol

IS Information Security

IT Information Technology

JPEG Joint Photographic Experts Group

LEC Largest Embedding Capacity

LSB Least Significant Bit

LTE Long Term Evolution

MSE Mean Square Error

OPAP Optimal Pixel Adjustment Process

PSNR Peak signal-to-noise ratio

PVD Pixel Value Differencing

RS Retention of Secrecy

TCP Transmission Control Protocol

UDP User Datagram Protocol

URL Uniform Resource Locator

WWW World Wide Web

XML eXtensible Markup Language

LIST OF SYMBOLS

Symbol Definition

(a1, a2) A couple of attributes

(c1, c2) A couple of chromosomes

(G1, …, Gn) A set of genes

Gn Tags

H HTML

On The more crossover order of the couples of attributes in a particular tag

Pn Pair of chromosomes

Rn,i The sum of for all pairs of chromosomes

nr The number of the attributes

pa The capacity of each attribute on page

CHAPTER 1

INTRODUCTION

1.1 Problem Background

The free flow of information has opened the door for anyone to intervene, whether beneficially or harmfully, leaving information susceptible to damage, sabotage, espionage, theft and other forms of attack. In addition, the ease of use of advanced software and the evolution of networks, which provide a greater flow of information, have created more opportunities for unauthorized users to access and attack sites, files and information sources. Sometimes the service provider itself may be the intruder and violate the secrecy and privacy of its users, even when the content of the data is encrypted.

This is why database privacy has become one of the main challenges of the Information Technology (IT) era and a source of great anxiety for users. Information technology providers have started to treat these customer concerns as a top priority in their present and future products and development activities. Information can be protected to some degree by securing its environment through what is termed information security; one such technique is steganography, the science of hiding information.

The main goal of steganography is to hide data from a third party. It differs from cryptography, which makes data unreadable to a third party.

“There are a large number of steganographic methods that most of us are familiar with (especially if you watch a lot of spy movies!), ranging from invisible ink and microdots to secreting a hidden message in the second letter of each word of a large body of text and spread spectrum radio communication. With computers and networks, there are many other ways of hiding information”. (Gary C. Kessler, 2001)

Steganography can use many different cover file formats, but digital images have become the most popular because of their widespread use on the internet.

Steganography today is considerably more sophisticated, allowing a user to hide large amounts of information within image and audio files. These forms of steganography are often used in conjunction with cryptography so that the information is doubly protected: it is first encrypted and then hidden, so that an adversary has to first find the information and then decrypt it.

A further goal of steganography is to communicate securely in a completely undetectable manner and to avoid drawing suspicion to the transmission of hidden data. The aim is not to keep others from knowing the hidden information, but to keep others from thinking that the information even exists. If a steganography method causes someone to suspect the carrier medium, the method has failed. Research is still under way to develop this technology and use it to protect information in all fields.

This study proposes using a genetic algorithm (GA) to hide data because a GA can increase the hiding capacity compared with other systems. According to the experimental results of this work, the genetic algorithm provides a larger embedding capacity than similar existing methods without causing noticeable distortion of the cover medium.

GA is a population-based metaheuristic algorithm that uses genetics-inspired operators to sample the solution space. This means that the algorithm applies genetic operators to a population of individuals in order to evolve them over the generations.

GA is a general-purpose combinatorial optimization method based on Darwin's theory of evolution. It searches for an optimal or near-optimal value of a complex objective function by simulating the natural evolutionary process.

GA has been used successfully in a wide variety of problem domains. It consists of three basic operators: selection, crossover and mutation. The algorithm starts with a set of candidate solutions to the problem; these solutions are represented by chromosomes, and the set is called the population.
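The three operators can be illustrated with a minimal sketch. It is not the SteganoTag implementation: the bit-string encoding, the toy fitness function (`one_max`, which counts 1-bits), tournament selection, single-point crossover, elitism and all parameter values are assumptions chosen only to keep the example short and reproducible.

```python
import random

def one_max(chromosome):
    """Toy fitness (an assumption for illustration): number of 1-bits."""
    return sum(chromosome)

def tournament_select(population, fitness, rng, k=3):
    """Selection: return the fittest of k randomly drawn chromosomes."""
    return max(rng.sample(population, k), key=fitness)

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: prefix of one parent, suffix of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng, rate=0.02):
    """Mutation: flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]

def run_ga(fitness=one_max, length=20, pop_size=30, generations=40, seed=1):
    """Evolve a random population and return its fittest chromosome."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism (an assumption): the current best survives unchanged.
        children = [max(population, key=fitness)]
        while len(children) < pop_size:
            a = tournament_select(population, fitness, rng)
            b = tournament_select(population, fitness, rng)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return max(population, key=fitness)
```

Calling `run_ga()` returns the fittest chromosome of the final generation. Because the best individual is carried over unchanged, the best fitness never decreases from one generation to the next.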

Data hiding, whose many methods are described in the literature review in chapter two, is widely used in information security. In data hiding applications, optimization techniques are used to improve the success of the algorithms. The genetic algorithm is one of the most widely used heuristic optimization techniques in these applications.

Current information system security cannot keep up with the growth and increasing complexity of computer systems and their security needs. Because of this deficiency, the genetic algorithm has been applied successfully to information security problems such as steganography systems.

In addition, the proposed genes, which are primary numbers processed by the genetic algorithm, reduce the time needed to hide data through the primary tag found in the relation table. Suitable chromosomes are selected by a fitness function that assesses the chromosomes of the current generation in order to select the offspring.

The proposed technique uses HTML tags and their attributes to hide a database illogically using a genetic algorithm. It is based on the fact that the ordering of the attributes in an HTML tag has no impact on the appearance of the document. This ordering can be used to hide the data efficiently. The proposed technique considers any tag to represent a gene and any attribute to represent a chromosome.
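The attribute-ordering idea can be illustrated with a short sketch. It leaves out the genetic algorithm and the encryption step and shows only why reordering attributes can carry data without changing the page size. The convention used here (alphabetical order of the first two attribute names encodes 0, reverse order encodes 1) and all function names are assumptions made for this example, not the thesis's scheme.

```python
def embed_bit(attrs, bit):
    """Order the first two attributes so that their order encodes `bit`.

    Assumed convention: alphabetical order of the two attribute names
    means 0, reverse order means 1. Attribute names must be distinct.
    """
    first, second = sorted(attrs[:2], key=lambda kv: kv[0])
    pair = [first, second] if bit == 0 else [second, first]
    return pair + attrs[2:]

def extract_bit(attrs):
    """Recover the bit from the order of the first two attribute names."""
    return 0 if attrs[0][0] < attrs[1][0] else 1

def render(tag, attrs):
    """Serialize a start tag; reordering attributes keeps its length."""
    return "<" + tag + "".join(f' {k}="{v}"' for k, v in attrs) + ">"

def embed_bits(tags, bits):
    """Hide one bit in each (tag, attrs) entry with at least two attributes."""
    out, pending = [], list(bits)
    for tag, attrs in tags:
        if len(attrs) >= 2 and pending:
            attrs = embed_bit(attrs, pending.pop(0))
        out.append((tag, attrs))
    if pending:
        raise ValueError("cover page has too few multi-attribute tags")
    return out

def extract_bits(tags, count):
    """Read back the first `count` hidden bits."""
    bits = [extract_bit(attrs) for _, attrs in tags if len(attrs) >= 2]
    return bits[:count]

# Hypothetical cover tags, represented as (name, [(attribute, value), ...]).
cover = [
    ("img", [("src", "a.png"), ("alt", "logo"), ("width", "10")]),
    ("br", []),
    ("a", [("href", "#"), ("class", "nav")]),
]
stego = embed_bits(cover, [1, 0])
```

Here `extract_bits(stego, 2)` recovers the two hidden bits, and the rendered tags of `stego` have the same total length as those of `cover`, since only the order of the attributes differs.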

Two techniques are integrated in this study. The first is the genetic algorithm, which is inspired by natural evolution; its main function is to generate useful solutions to optimization and search problems.

The second is steganography, which is derived from the science of hiding; its main function is to embed data in a transmission medium. The ultimate goal of this integration is to increase the efficiency of hiding data with a large capacity.

1.2 Motivation

The main motivations for this work are:

• Protecting a database by hiding it in a carrier (mediator) without changing the carrier's features (size and specifications), and scattering the database across certain parts of the carrier. The ability to separate the concealment program from the extraction program is protected by a password.

• The increased need of digital content owners to protect their intellectual property rights.

• A biologically inspired technique such as a genetic algorithm, coupled with a steganography system, can be used efficiently to design future generations of intelligent information security systems.

• Steganography avoids drawing attention to the transmission of hidden information.

The goal is to optimize the steganography function and increase information security.

1.3 Research Problem

The internet contains a huge number of web pages, and their number is growing rapidly. Clearly, the internet's huge size, combined with the lack of effective control, gives an opportunity to smuggle content into ordinary web pages.

“Obviously the internet huge size, combined with lack of an effective control, gives one an opportunity to “smuggle” some undesirable content into the ordinary web pages. Furthermore, a number of methods exist that allow to hide such a content or hidden message without changing the web page look. They are taking advantage of steganography”. (L. Polak, Z. Kotulski, 2010)

As the technology for transmitting information over networks is insecure, the importance of information security has come to be widely recognized. This research simulates an information security system and applies it to hide a database in a web page that is saved locally.

Steganography techniques have lately attracted considerable attention as a good solution to information security and copyright problems and as a method for protecting the privacy of communication.

The goal of the project is to hide a database within a web page using steganography with a genetic algorithm, to compare these algorithms in terms of the speed and quality of concealment, and to describe their role in data security.

One of the problems facing data hiding is the limited size of the file in which the information must be embedded.

1.4 Research Objectives

1. Study the hiding of a database inside a specific file (a multimedia file) without changing the size of the file, using a genetic algorithm to increase the reliability and confidentiality of the database system.

2. Hide information by covering it with other information, integrating the new information with the existing information so that the hidden information does not show and the cover information remains visible as it was before. In this way the system can hide a database of about 100 bits or more within a web page.

3. Hide the database within a carrier (a web page) without changing the carrier's features (size and specifications) and scatter it across certain parts of the carrier. The ability to separate the concealment program from the extraction program is protected by an agreed password.

4. Simulate the steganography system, test the efficiency and accuracy of hiding the database through the genetic algorithm using C#.NET, and compare it with existing steganography systems.

5. Provide greater confidentiality, integrity and authentication of confidential data while keeping the database easy to access and store.

6. Combine steganography and encryption to ensure the confidentiality of all data and to double the protection of the database.

7. Use steganography to hide secret data so well that no one can tell that the two parties are communicating in secret.

1.5 Research Questions

This thesis attempts to answer the following questions:

1. How do genetic algorithms produce a safe information hiding tool?

2. Is steganography in HTML documents a suitable tool for building an integrated information security system?

3. Does choosing an illogical database have an impact on the proposed steganography system?

1.6 Research Scope

-Thematic scope: The main scope of this research is to apply the genetic algorithm as a security technique. The genetic algorithm is used in the field of Evolutionary Programming (EP) in Artificial Intelligence (AI), specifically in the branch of search and problem-solving.

The genetic algorithm can find the best solutions to problems and improve problem optimization by relying on random statistical search, which is used here to hide the database.

-Time scope of the research: The period from 2012 to 2017

1.7 Research Methodology and Activities

This thesis uses an experimental methodology, analysing different types of home pages such as company pages, news portals, social media pages and university pages. These pages contain many items that must be described: HTML tags and their attributes, which increase the capacity for steganography and allow the performance of the genetic algorithm in steganography to be evaluated.

The proposed technique is based on the fact that the ordering of the attributes in the HTML tags has no impact on the appearance of the document. This ordering can be used to hide the data efficiently. The proposed technique considers any tag to represent a gene and any attribute to represent a chromosome.

Home pages of different sizes are checked based on the number of attributes in their tags. The performance of the system is tested over many runs, which include examining different web pages and categories and testing the effect of varying multiple parameters when simulating the algorithm that hides data in a web page.
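Checking a page by the number of attributes in its tags can be sketched as follows. The sketch uses Python's standard `html.parser` module; treating a tag as usable when it has at least two attributes is an assumption of this example rather than the criterion used by the proposed system, and `page_capacity` is a hypothetical helper name.

```python
from html.parser import HTMLParser

class AttributeCounter(HTMLParser):
    """Collects the number of attributes of every start tag in a page."""

    def __init__(self):
        super().__init__()
        self.attr_counts = []

    def handle_starttag(self, tag, attrs):
        self.attr_counts.append(len(attrs))

def page_capacity(html_text):
    """Return (tags_seen, usable_tags) for one page.

    Assumption of this sketch: a tag is usable when it has at least two
    attributes, because only then can an ordering carry information.
    """
    counter = AttributeCounter()
    counter.feed(html_text)
    counter.close()
    usable = sum(1 for n in counter.attr_counts if n >= 2)
    return len(counter.attr_counts), usable

# Hypothetical sample page used only to exercise the function.
sample = ('<html><body><img src="a.png" alt="x">'
          '<a href="#" class="n" id="k">t</a><p>hi</p></body></html>')
tags_seen, usable_tags = page_capacity(sample)
```

For the sample page, five start tags are seen and two of them (`img` and `a`) have at least two attributes. Running the same function over saved home pages of different categories gives a rough way to compare how much each page could carry.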

The main activities of this research are the following:

1- Analyze a web page and then determine the number of attributes that must be taken:

• Follow the steps of the genetic algorithm according to their activities.

2- Create and verify the theoretical design:

• Build a theoretical flow of the agent design and verify it.

• Verify the logical and computational models.

3- Create novel mathematical and computational models:

• Build the mathematical and computational model.

4- Simulate the proposed system using C#:

• Simulate the system using different scenarios and diverse datasets.

• Evaluate the reliability of the model using traditional and novel metrics and compare it with contemporary models.

This research will perform the following:

• Focus on hiding a database in a web page.

• Hide information using a genetic algorithm.

• Study the structural features of HTML tag attributes after the hidden data has been embedded and identify their relation to the attributes of the web page.

Figure 1.1: The Flow of the Research Activities

1.8 Glossary of Research Terms

Allele:

One of two or more alternative forms of a gene that arise by mutation and are found at the same place on a chromosome; here, 0 or 1.

Cryptography:

The science of keeping information secure. It is the basis of modern security technologies used to protect information and resources on networks.

Evolutionary Algorithms:

(EAs) A term used to describe computer-based problem-solving systems that use computational models of evolutionary processes as essential elements in their design and implementation.

Genetic Algorithm:

(GA) A search heuristic that simulates the process of natural selection. It is used to generate useful solutions to optimization and search problems, using methods inspired by natural evolution.

Plain text:

Any message that is not encrypted; also called clear text.

Steganalysis:

The study of discovering messages that have been hidden using steganography.

Steganographic:

Adjective describing secret data hidden within ordinary visible information by means of steganography.

Steganography:

The art and science of hiding information within written text, pictures or sound. It can be used together with cryptography to increase data protection.

(Stego) Object:

The object that is actually seen in the open: the text, picture or sound used to carry the message right under everyone's nose. It is the result of combining the cover text and the embedded message.

1.9 Structure of Thesis

This thesis is composed of the following chapters:

• The first chapter introduces the research as a whole. It gives a brief idea of the main concepts involved in this work, the motivation for the novel approach, the problem statement, research questions, objectives, scope of research, research methodology, relevant research activities, and a glossary of research terms.

• The second chapter provides a literature review of steganography and the genetic algorithm. It gives an overview of steganography approaches and security, and defines the genetic algorithm with its different types and techniques. It also gives a brief account of previous studies on steganography and genetic algorithm applications and compares these studies with other related methods.

• The third chapter analyses the problem statement, examines in detail the theoretical aspects of the proposed system, and discusses its architecture and implementation. It also demonstrates the advantages of using steganography with a genetic algorithm.

• The fourth chapter discusses the evaluation and measurement of the system, together with the experiments and datasets used to verify the model and the results.

• Finally, the fifth chapter gives the conclusions and recommendations for future work.

1.10 Chapter Summary

This chapter presented the research problem, objectives, motivation, scope, methodology and activities. The main problems are information security, the protection of information from hackers, and how to hide a database with integrated security.

CHAPTER 2

LITERATURE REVIEW AND PREVIOUS STUDIES

2.1 Introduction

As the internet has become the fundamental tool for communication services,
information delivery and financial transactions, and as e-governments all over the world
have become heavily dependent on it, data security on the internet has become the most
important factor to be considered. Security and safety requirements that protect against
unauthorized access have therefore become mandatory for most online applications.
Governments, large companies, and the publishing and broadcasting industries urgently
need techniques that can effectively secure and protect their confidential data, and this
has motivated innovators to develop different security methods such as steganography,
genetic algorithms and cryptography.

Genetic algorithms have been used as an effective technique for information hiding
and for improving the performance of information hiding systems. A genetic algorithm
is based mainly on the mechanisms of natural genetics and the theory of evolution.

2.2 Literature Review

2.2.1 Security in the Internet Environment

Different secrecy measures are used on the internet to prevent the disclosure of
information to unauthorized people. In electronic commerce, for example, sending credit
card details from the buyer to the merchant to complete the transaction process may
expose the buyer's secret information.

The protection system applies secrecy by encrypting the card number during transmission,
by limiting access to storage areas, or by hiding the serial number of the card on printed
receipts and records. Unfortunately, all these measures are not enough for data protection.

Breaches of secrecy may take different forms: for example, spying on a personal computer
screen to steal a password, exposing a personal database without the owner's knowledge,
or hacking governmental computers or computers that keep highly sensitive information,
which leads to a violation of high confidentiality.

2.2.2 Security and Steganography

Computer and network security have certain core standards that any secret
communication method should address. Although no single method addresses all security
requirements, steganography satisfies several of them, sometimes in conjunction with
other technologies such as cryptography (Donovan Artz, 2001).

1. Confidentiality

Confidentiality is a basic aspect of network security: it ensures that no unauthorized
person can gain access to or read information on the network. Confidentiality is at the
heart of what steganography does. Steganography, though, accomplishes confidentiality in
a slightly different manner than cryptography.

With cryptography, an unauthorized person can see that protected information exists but
cannot read it. Because they can tell that information is being protected, the
unauthorized person may try to break the encryption. With steganography, because the
data is hidden, an unauthorized party does not even know that sensitive data is there.
From a confidentiality standpoint, steganography keeps the information protected at a
higher level.

2. Survivability

The main activity of communication is that one party transmits information and
the other party receives it. The completion of this cycle represents the feature of
survivability. Even when data is hidden in a message, the sender has to be sure that
whatever processing of the data takes place between sender and receiver does not
destroy the information. The information must not only be received by the recipient but
also be extracted so that the message can be read. When using steganography, it is
critical to understand the processing a message will go through and to determine whether
the hidden message has a high chance of surviving across a network.

3. No Detection

It makes no sense to perform data hiding if someone can figure out how or
where the information is hidden. If someone can easily detect where you hide your
information and find your message, it defeats the purpose of using steganography. To
make the hidden data hard to find, steganography is usually performed in such a way that
there is little change to the properties of the host file.

Therefore, the algorithm that is used must be robust enough that, even if

someone knows how the technique works, they cannot easily find out that you have

hidden data in a given file. A robust algorithm is one where the insertion method is hard

to detect and hard to destroy.

4. Visibility

Hidden data must be undetectable, so people must not be able to see any visible changes
to the host file in which the data is hidden. If a secret message hidden in an image
distorts the image in such a way that someone can tell it has been modified,
steganography has been unsuccessful (Eric Cole, 2003).

Figure 2.1 The core standards for Computer and network security

2.2.3 Data Hiding Techniques

The saying “do not believe everything you see or hear” is confirmed by concepts of
computer science in the field of data hiding: much of the content on a computer may
contain hidden information without the user's knowledge. Data hiding techniques, with
all their advantages and disadvantages, have therefore become a subject that deserves
close interest and in-depth study.

Some facts concerning data hiding can be found in (Eric Cole, 2003), as well as in
(Eiji Kawaguchi, Eason, 2007), (B.B. Zaidan, A.A. Zaidan, A.K. Al-Frajat, H.A. Jalab,
2010), (Matthew Walker, 2001) and other books and papers.

Information Security (IS) is one of the most misunderstood topics within the
Information Technology (IT) world right now (Robert H. Williams III, 2007), so it is
necessary to discuss these techniques briefly before a thorough review is provided.

There are three fundamental objectives of computer security: confidentiality,
integrity and authentication, as shown in Figure 2.2.

Figure 2.2: Fundamental objectives of information security

A. Confidentiality: Preserving authorized restrictions on information access and

disclosure, including means for protecting personal privacy and secure

information.

B. Integrity: Guarding against improper information modification or destruction,
including ensuring information non-repudiation and authenticity.

C. Authentication: Assuring that the source of the message is an authorized party, and
detecting any unauthorized access to or use of information.

An important aspect of information security is recognizing the value of
information and the attacks expected on it from unauthorized parties, and then
defining appropriate procedures and protection requirements for the information. Not all
information is equal, so not all information requires the same degree of protection.

This requires information to be assigned a security classification. Top-secret data
needs highly secure software and procedures, and authorized parties are assigned
different levels of access: some parties are authorized only to disclose the data, while
others are also able to change it (B.B. Zaidan, A.A. Zaidan, A.K. Al-Frajat, H.A. Jalab,
2010). Protection systems can be classified more specifically into information
encryption (cryptography) and information hiding (steganography).

Figure 2.3: Types of security system [36]

2.2.4 Comparison Between Cryptography and Steganography

With the advent of computers there has been a vast dissemination of information,
some of which needs to be kept private and some of which does not.

Information may be hidden in two basic ways: cryptography and steganography. The
methods of cryptography do not conceal the presence of secret information but render it
unintelligible to outsiders by various transformations of the information that is to be
put into secret form, while the methods of steganography conceal the very existence of
the secret information.

The main goal of cryptography is to keep data secure from unauthorized attackers. The
reverse of data encryption is data decryption.

The main goal of steganography is to communicate securely in a completely
undetectable manner and to avoid drawing suspicion to the transmission of hidden
data. The aim is not only to keep others from knowing the hidden information, but to
keep others from thinking that the information even exists. If a steganography method
causes someone to suspect the carrier medium, then the method has failed.

Information hiding gives rise to two techniques. The first is digital watermarking, the
process of embedding information into a digital signal in a way that is difficult to
remove. The signal may be audio, pictures, video or text files. Watermarking is mostly
used to demonstrate intellectual property rights, for example by adding a copyright logo
or text (an author signature) to multimedia files.

The second is steganography, the art and science of writing hidden messages in such a
way that no one, apart from the sender and intended recipient, suspects the existence of
the message. Since the main use of steganography is to send secure messages between
parties, its aim is to prevent the message from being detected by any other party (Eiji
Kawaguchi and Eason, 2007).

Table 2.1: Comparison between cryptography and steganography

| No. | Cryptography | Steganography |
|-----|--------------|---------------|
| 1 | The encrypted message can be seen by anyone, but cryptography makes the message unintelligible. | Steganography hides the message in another medium so that nobody notices the message. |
| 2 | The end result of cryptography is the ciphertext. | The end result of information hiding is the stego-media. |
| 3 | The goal of a secure cryptographic method is to prevent an interceptor from gaining any information about the plaintext from the intercepted ciphertext. | The goal of secure steganographic methods is to prevent an observant intermediary from even obtaining knowledge of the mere presence of the secret data. |
| 4 | Anyone is able to detect and modify the encrypted message. | The hidden message is imperceptible to anyone. |
| 5 | Steganography cannot be used to adapt the robustness of a cryptographic system. | Steganography can be used in conjunction with cryptography by hiding an encrypted message. |

2.2.5 The Fundamental Requirements When Hiding Data in Data

The requirements of any data hiding system can be categorized into security,
capacity and robustness (Ingemar J Cox et al., 1996). These factors trade off against
one another, so improving one tends to reduce the others, which creates the so-called
data hiding dilemma (Arup Kumar Bhaumik, June 2009).

2.2.6 Methods of Hiding Data

There are essentially three ways to hide data: injection, substitution, and
generation.

1. Injection

Finds areas in a file that will be ignored and puts the covert message in those
areas. For example, most files contain an end-of-file (EOF) marker. When playing an
audio file, the application stops playing when it reaches the EOF marker because it
treats it as the end of the file, so data placed after the marker is ignored.

2. Substitution

Finds insignificant information in the host file and replaces it with the covert
data. For example, in sound files each unit of sound that is heard is composed of several
bytes. Modifying the Least Significant Bit (LSB) changes the sound, but so slightly
that the human ear cannot tell the difference.
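The substitution idea can be illustrated with a minimal Python sketch. It assumes the cover is simply a sequence of bytes standing in for sound samples; the function names `embed_lsb` and `extract_lsb` are hypothetical and are not taken from the cited sources.

```python
def embed_lsb(cover: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the least significant bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the secret")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the secret bit
    return bytes(stego)


def extract_lsb(stego: bytes, secret_len: int) -> bytes:
    """Read `secret_len` bytes back from the LSBs of `stego`."""
    out = bytearray()
    for i in range(secret_len):
        value = 0
        for byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(100, 164))   # 64 stand-in sample bytes (an assumption)
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))     # b'hi'
```

Each cover byte changes by at most one unit, which is why the modification is hard to perceive. The receiver must know the length of the secret, which this sketch passes explicitly.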

3. Generation

Creates a new overt file based on the information that is contained in the covert
message. For example, one generation technique takes a covert file and produces a
picture that resembles a modern painting. This is done by substituting a patch of green
for every 0 and a patch of yellow for every 1. The picture is created solely from the
bit sequence of the covert file (Eric Cole, 2003).
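The green and yellow patch example can be sketched as follows. This is only an illustration under stated assumptions: the RGB values chosen for the patches and the names `generate_patches` and `recover_bytes` are hypothetical, and a real tool would also lay the patches out as an image file.

```python
GREEN = (0, 128, 0)      # patch colour for bit 0 (assumed RGB value)
YELLOW = (255, 255, 0)   # patch colour for bit 1 (assumed RGB value)


def generate_patches(covert: bytes) -> list:
    """Create one colour patch per bit of the covert data, most significant bit first."""
    patches = []
    for byte in covert:
        for shift in range(7, -1, -1):
            patches.append(YELLOW if (byte >> shift) & 1 else GREEN)
    return patches


def recover_bytes(patches: list) -> bytes:
    """Rebuild the covert bytes from the sequence of patches."""
    out = bytearray()
    for i in range(0, len(patches), 8):
        value = 0
        for patch in patches[i:i + 8]:
            value = (value << 1) | (1 if patch == YELLOW else 0)
        out.append(value)
    return bytes(out)


picture = generate_patches(b"H")   # 'H' is 0x48, i.e. bits 01001000
print(picture[:2])                 # [(0, 128, 0), (255, 255, 0)]
print(recover_bytes(picture))      # b'H'
```

No host file is modified here: the overt object is produced entirely from the covert bit sequence.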

2.2.7 Steganography in Digital Age

Steganography is the art and science of invisible communication. This is

accomplished through hiding information in other information, thus hiding the existence

of the communicated information. The word steganography is derived from the Greek

words “stegos” meaning “cover” and “grafia” meaning “writing”, defining it as

“covered writing”. In image steganography the information is hidden exclusively in

images.

The idea and practice of hiding information has a long history. In the Histories, the
Greek historian Herodotus writes of a nobleman, Histiaeus, who needed to communicate
with his son-in-law in Greece. He shaved the head of one of his most trusted slaves and
tattooed the message onto the slave’s scalp. When the slave’s hair grew back, the slave
was dispatched with the hidden message.

In the Second World War the microdot technique was developed by the Germans.
Information, especially photographs, was reduced in size until it was the size of a typed
period. A normal cover message was then sent over an insecure channel with one of the
periods on the paper containing the hidden information, which made it extremely
difficult to detect.

“Today steganography has come into its own on the Internet. Used for transmitting data as well
as for hiding trademarks in images and music (called digital watermarking), electronic steganography
may ironically be one of the last bastions of information privacy in our world today” (Eric Cole,
2003).

Steganography has traditionally been used by the military and criminal classes.
One intriguing trend today is the increase in the use of steganography by all sectors.
Research is still underway to develop this technology and to use it to protect
information in all fields.

2.2.8 The Steganography Approaches

When an encrypted message is hidden using steganography, the resulting stego-image can
be transmitted without revealing that secret information is being exchanged. Furthermore,
even if an attacker were to defeat the steganographic technique and detect the message
in the stego-object, he would still require the cryptographic decoding key to decipher
the encrypted message (Zaidan, Zaidan, 2009). On this basis, steganography approaches
can be divided into three types:

1. Pure steganography

2. Secret key steganography

3. Public key steganography

1. Pure Steganography

This technique uses the steganography approach alone, without combining it with
other methods. It hides information within a cover carrier.

2. Secret Key Steganography

Secret key steganography combines secret key cryptography with the steganography
approach. The idea is to encrypt the secret message or data with a secret key and then
hide the encrypted data within a cover carrier.

3. Public Key Steganography

The last type combines public key cryptography with the steganography approach. The
idea is to encrypt the secret data using the public key approach and then hide the
encrypted data within a cover carrier. A further direction is to keep the encrypted data
small so that it can be hidden within a multimedia cover.

2.2.9 Types of Steganography

Over the years, people have categorized steganography techniques in different
ways. An important classification scheme breaks steganography down into the
following three groups:

1. Insertion-based

2. Algorithmic-based

3. Grammar-based

This scheme focuses on how data is hidden. Note that some newly developed techniques
do not map clearly into this scheme.

Figure 2.4: Types of Steganography

1. Insertion-Based

Insertion-based steganography techniques work by inserting blocks of data into a

host file. Using an insertion-based technique, data is inserted at the same point in every

file. This type of technique works by finding places in a file that can be changed,

without having any significant effect on the host file.

Once these redundant areas are identified, the data to be hidden can be broken

up and inserted in them and will be fairly hard to detect. Depending on the file format,

this data can be hidden between headers, in color tables, in image data, or in several

other fields.

A very common way to hide data is to insert it into the Least Significant Bits

(LSB) of an 8-bit or 16-bit file—for example, a 16-bit sound file. With sound files, one

can change the first and second LSB of each 16-bit group without having a large impact

on the quality of the sound. Because data is always being inserted at the same point for

each file, this can be categorized as an insertion steganography technique.

2. Algorithmic-Based

Algorithmic-based steganography techniques use some sort of computer

algorithm to designate where in a file data should be hidden. Because this category of

technique doesn’t always insert data in the same spot in each file, it is possible that the

process will degrade the quality of the file. If someone compared the original file to the

one where data is hidden, that person might be able to see or hear a change in the file.

This category of techniques has to be examined carefully to ascertain whether a
technique is detectable. Remember that one of the goals of stego is to make sure nobody
can detect that data is hidden in a file. If the algorithm and seed number do not place
the data in nonessential locations, the hidden data could completely obliterate the
original image file or result in an image that looks very unusual. For example, to hide
data in an image file you must provide a number to seed the steganographic technique.
This number could be either a random number or the first five bytes of the file. The
algorithmic technique takes the seed value and uses it to determine where to place the
secret data throughout the file.

The algorithm could be very complex or as simple as this: If the first digit is 1,

insert the first bit at location x; if the first digit is 2, insert the first bit at location y; and

so on. If careful thought is not given to the algorithm that is used, it could result in a

disastrous output file.
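A minimal sketch of the seed-driven placement described above is given here. It assumes that the seed is a number shared by sender and receiver and that a pseudo-random generator chooses which cover bytes receive the secret bits; the names `embed_seeded` and `extract_seeded` are hypothetical, and restricting placement to nonessential locations is left out for brevity.

```python
import random


def _positions(seed: int, cover_len: int, count: int) -> list:
    """Derive `count` distinct byte positions from the seed (same seed, same positions)."""
    return random.Random(seed).sample(range(cover_len), count)


def embed_seeded(cover: bytes, secret: bytes, seed: int) -> bytes:
    """Write each secret bit into the LSB of a seed-selected cover byte."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    stego = bytearray(cover)
    # sample() raises ValueError if the cover has fewer bytes than there are bits
    for pos, bit in zip(_positions(seed, len(cover), len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)


def extract_seeded(stego: bytes, secret_len: int, seed: int) -> bytes:
    """Regenerate the positions from the seed and read the bits back."""
    bits = [stego[pos] & 1 for pos in _positions(seed, len(stego), secret_len * 8)]
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )


cover = bytes(200)                        # stand-in cover data (an assumption)
stego = embed_seeded(cover, b"GA", seed=2017)
print(extract_seeded(stego, 2, seed=2017))  # b'GA'
```

Because the positions depend on the seed, the data is not inserted at the same point in every file, which is the property that separates this category from insertion-based techniques.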

3. Grammar-Based

Both the insertion and algorithmic techniques take the secret message and
somehow embed it in a host file. Grammar-based steganography techniques require no
host file in which to hide a message because they generate their own host file.

This class of technique uses hidden data to generate an output file based on a

predefined grammar. In fact, the output file produced looks just like the predefined

grammar.

This approach could be used to hide data from automatic scanning programs that

use statistical patterns to identify data. These programs scan data looking for anything

unusual. Such a program can scan for English type text, and anything that fits the profile

would not be flagged by the scanning program. (Eric Cole, 2003)
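The grammar-based idea can be sketched with a toy grammar. The three word lists below are an invented example, not a grammar from the cited sources: every sentence has the form subject, verb, object, and each slot offers four words, so each chosen word carries two secret bits.

```python
# Hypothetical grammar: sentence -> subject verb object, four choices per slot.
SLOTS = [
    ["Alice", "Bob", "Carol", "Dave"],
    ["likes", "sees", "calls", "helps"],
    ["cats", "dogs", "birds", "fish"],
]


def encode_grammar(secret: bytes) -> str:
    """Generate English-like sentences whose word choices spell out the secret bits."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    bits += [0] * (-len(bits) % 6)            # pad to whole sentences (6 bits each)
    sentences = []
    for i in range(0, len(bits), 6):
        words = [slot[bits[j] * 2 + bits[j + 1]]
                 for slot, j in zip(SLOTS, range(i, i + 6, 2))]
        sentences.append(" ".join(words) + ".")
    return " ".join(sentences)


def decode_grammar(text: str, secret_len: int) -> bytes:
    """Map each word back to its index in the grammar and rebuild the bytes."""
    bits = []
    for sentence in text.split("."):
        for slot, word in zip(SLOTS, sentence.split()):
            index = slot.index(word)
            bits += [index >> 1, index & 1]
    bits = bits[:secret_len * 8]              # drop the padding bits
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )


print(encode_grammar(b"\x00"))                    # Alice likes cats. Alice likes cats.
print(decode_grammar(encode_grammar(b"key"), 3))  # b'key'
```

The output is ordinary-looking text generated from the secret itself, so there is no original host file to compare it against.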

2.2.10 Different Types of Media That Hide Data in Steganographic Techniques

Steganography uses different kinds of media to hide the data.

1) Text Steganography:

This technique hides the data within a text file. It is a difficult technique, because
hiding data within a message relies on redundant data, and such redundancy is scarce in
text files (Neha Rani, July 2013).

2) Image Steganography:

It is one of the most commonly used techniques because of the limitations of the
Human Visual System (HVS). The human eye cannot distinguish between colors across a
vast range, so it is not able to notice an insignificant change in the quality of an
image that results from steganography.

3) Audio Steganography:

This technique transmits hidden data within an audio signal. It is a difficult form
of steganography, because it is very hard to embed secret data within digital sound.

4) Video Steganography:

This technique hides secret data inside a video file. The addition of secret data to a
video file is not recognizable by the human eye, because the change of a pixel color is
not easily detectable.

Figure 2.5: Steganographic media techniques

2.2.11 Techniques of Steganography on Network

A network security system depends on layers of protection and consists of
multiple components such as network monitoring, security software and computer
hardware. These components work together to increase the security and integrity of the
computer network. On a computer network, stego techniques can be used to hide files in
traffic (Rupali Gawade, 2014).

[Figure 2.5 labels: Steganography; Image, Video, Text, Audio; Plain text, Webpage text; JavaScript, CSS, HTML, XML]

When the hidden data is sent over a connection that imitates ordinary port 80 traffic,
that is, the Hypertext Transfer Protocol (HTTP), the message may pass without raising
anyone’s suspicion.

2.2.12 Network Stego Techniques

There are four techniques used on the network, each of which has a different level
of sophistication and a different approach to hiding data (Eric Cole, 2003).

1. Hiding in an attachment

2. Hiding in network headers

3. Hiding in an overt protocol

4. Hiding in a transmission

Figure 2.6: Techniques of Steganography on Network

1. Hiding in an Attachment

This is the simplest way of using a network to transport a stego file from one party to
another. A secret message is hidden in a file, and the file containing the hidden data is
then attached to some other form of network traffic. There are three common ways to send
such secret messages: by email, by a file transfer protocol such as FTP, or by posting a
file on a web site.

2. Hiding Data in Network Headers

Understanding this technique requires knowledge of networking and of the Transmission
Control Protocol/Internet Protocol (TCP/IP) suite. TCP/IP actually contains four major
communication protocols: IP, TCP, User Datagram Protocol (UDP) and Internet Control
Message Protocol (ICMP). These protocols run on both the sending computer and the
receiving computer to standardize communications. The TCP protocol on the sending
computer communicates with the TCP protocol on the receiving computer, and the IP
protocol on the sending computer communicates with the IP protocol on the receiving
computer. This makes protocol-header-based stego possible. Every packet that goes across
the Internet must contain these headers, so data can easily be embedded in the unneeded
portions of a header and the hidden data transmitted with, for example, FTP traffic.

3. Hiding in a Transmission

This technique builds on hiding stego in an attachment, where one program is used to
hide the data and another program is used to transfer the file. For example, S-Tools can
be used to hide a secret message in a photo, and a separate e-mail program is then used
to attach the photo and send it.

4. Hiding in an Overt Protocol

This technique is called data camouflaging, because it makes data look like
something else. It takes data, puts it in normal network traffic, and modifies the data
in such a way that it looks like the overt protocol. Most networks carry large amounts
of HTTP or web traffic, so data can be sent over port 80 and look like web traffic. The
problem with this approach is that, if someone examined the payload, it would not look
like normal web traffic, which usually contains HTML. If, on the other hand, symbols
such as < > </> are added to the data, the traffic would look like web traffic and would
probably slip past the casual observer.
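As a hedged illustration of data camouflaging, the sketch below wraps a payload in HTML markup so that it resembles an ordinary web page. The use of Base64 text inside an HTML comment, the label `build:` and the names `camouflage` and `reveal` are assumptions made for this example only; Base64 is an encoding and provides no secrecy on its own.

```python
import base64
import re


def camouflage(data: bytes) -> str:
    """Wrap the payload in HTML so the traffic resembles a normal web page."""
    token = base64.b64encode(data).decode("ascii")
    return (
        "<html><body>\n"
        "<p>Welcome to our site.</p>\n"
        f"<!-- build:{token} -->\n"      # the payload hides in a comment
        "</body></html>"
    )


def reveal(page: str) -> bytes:
    """Locate the comment and decode the payload."""
    match = re.search(r"<!-- build:([A-Za-z0-9+/=]+) -->", page)
    if match is None:
        raise ValueError("no hidden payload found")
    return base64.b64decode(match.group(1))


page = camouflage(b"secret report")
print(reveal(page))   # b'secret report'
```

A page like this could be served over port 80 like any other HTML response, which is what lets it pass a casual inspection of the payload.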

2.2.13 HTML Characteristics

HTML is the language most widely used on the Internet. Its wide presence gives it the
ability to act as a cover for all kinds of data without affecting the displayed content.
In one scheme, the letters in the opening and closing tags are switched to all upper case
or all lower case according to whether the secret message bit is ‘0’ or ‘1’ (K. F. Rafat,
M. Sher, December 2012).

Several factors promote the use of text steganography in HTML documents: web pages are
present in huge numbers, so detecting which one contains the hidden information is next
to impossible, and the order of the tags used for formatting the appearance of a web page
does not matter.

HTML tags are enclosed in angle brackets and generally appear in pairs, referred to as
the start tag and the end tag. HTML tags are case insensitive and can be reordered in a
variety of ways. This property of HTML tags is exploited for hiding bits of secret data.
For example, the HTML tag “<caption> some text </caption>” may represent the secret
bit ‘0’, whereas the tag “<CAPTION> some text </CAPTION>” may represent the secret bit
‘1’.
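The tag-case idea can be sketched in a few lines of Python. This is a simplified illustration, not the method of the cited papers: it assumes one bit per tag occurrence (lower case for 0, upper case for 1), treats opening and closing tags as separate carriers, and uses the hypothetical names `embed_tag_case` and `extract_tag_case`.

```python
import re

# Matches an opening or closing tag: optional "/", the tag name, then any attributes.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")


def embed_tag_case(html: str, bits: list) -> str:
    """Set each tag name to upper case for bit 1 and lower case for bit 0."""
    if len(bits) > len(TAG.findall(html)):
        raise ValueError("not enough tags to carry the bits")
    remaining = iter(bits)

    def recase(match):
        bit = next(remaining, None)
        if bit is None:
            return match.group(0)        # no bits left: leave the tag untouched
        name = match.group(2).upper() if bit else match.group(2).lower()
        return f"<{match.group(1)}{name}{match.group(3)}>"

    return TAG.sub(recase, html)


def extract_tag_case(html: str, count: int) -> list:
    """Read the first `count` bits back from the case of the tag names."""
    names = [match.group(2) for match in TAG.finditer(html)]
    return [1 if name.isupper() else 0 for name in names[:count]]


page = "<table><caption>Sales</caption><tr><td>1</td></tr></table>"
stego = embed_tag_case(page, [1, 0, 1, 1])
print(stego)   # <TABLE><caption>Sales</CAPTION><TR><td>1</td></tr></table>
print(extract_tag_case(stego, 4))   # [1, 0, 1, 1]
```

Because tag names are case insensitive, a browser renders the cover page and the stego page identically.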

2.2.14 Data Steganography Techniques on HTML Document

To hide data in an HTML document, one of the following techniques can be chosen
(Neha Rani, 2013).

1) Selectively hiding:

This technique requires a large amount of plain text, in which the characters of the
secret message are hidden at specific locations within the words. The secret text is
extracted by concatenating these characters.

<caption> <center> <cite>

2) HTML web pages:

Because the names of HTML tags and attributes are case insensitive, this technique uses
the letter case of these names to hide the text. The original text can then be retrieved
by reading the case of the same characters (Sandipan Dey, 2010).

3) Hiding a character using Whitespace application:

This technique encodes bits in the number of whitespace characters between words: a
smaller number of whitespaces represents 0 and a larger number of whitespaces
represents 1.
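The whitespace technique can be sketched as follows, under the assumption that one space between two words stands for 0 and two spaces stand for 1. The function names `embed_spaces` and `extract_spaces` are hypothetical.

```python
import re


def embed_spaces(cover_text: str, bits: list) -> str:
    """Encode each bit in a word gap: one space for 0, two spaces for 1."""
    words = cover_text.split()
    if len(bits) > len(words) - 1:
        raise ValueError("not enough word gaps to carry the bits")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        gap = "  " if i < len(bits) and bits[i] else " "
        out.append(gap + word)
    return "".join(out)


def extract_spaces(stego_text: str, count: int) -> list:
    """Read the first `count` bits back from the widths of the gaps."""
    gaps = re.findall(r" +", stego_text)
    return [1 if len(gap) == 2 else 0 for gap in gaps[:count]]


stego = embed_spaces("the quick brown fox jumps", [1, 0, 1, 1])
print(repr(stego))                # 'the  quick brown  fox  jumps'
print(extract_spaces(stego, 4))   # [1, 0, 1, 1]
```

In a rendered HTML page, runs of spaces in normal text are collapsed to a single space, so the extra spaces are visible only in the page source.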

2.2.15 Genetic Algorithms as Part of Evolutionary Algorithms

In the 1950s and the 1960s, several computer scientists independently studied

evolutionary systems with the idea that evolution could be used as an optimization tool

for engineering problems. The idea in all these systems was to evolve a population of

candidate solutions to a given problem, using operators inspired by natural genetic

variation and natural selection.

The field of evolution strategies has remained an active area of research,
mostly developing independently from the field of genetic algorithms. Another approach
was "evolutionary programming," a technique in which candidate solutions to given tasks
were represented as finite-state machines, which were evolved by randomly mutating
their state-transition diagrams and selecting the fittest. Evolutionary computing is a
rapidly growing area of artificial intelligence.

Evolutionary Algorithms (EAs) are population-based metaheuristic optimization
algorithms that use biology-inspired mechanisms and the theory of survival of the
fittest in order to refine a set of solutions iteratively. Genetic Algorithms (GAs) are
a subclass of EAs in which the elements of the search space are binary strings or arrays
of other element types.

GAs are computer-based search techniques patterned after the genetic mechanisms of
biological organisms that have adapted and flourished in changing, highly competitive
environments.

The last decade has witnessed many exciting advances in the use of genetic
algorithms to solve optimization problems in process control systems. GAs can solve hard
optimization problems quickly, reliably and accurately.

As the complexity of real-time controllers increases, GA applications have grown in
more than equal measure. Figure 2.7 outlines the position of natural techniques among
other well-known search procedures (S.N. Sivanandam, S.N. Deepa, 2007).

Figure 2.7: The block situation of natural search techniques [4]

2.2.16 History of Genetic Algorithms

Genetic algorithms (GAs) were invented by John Holland in the 1960s and were
developed by Holland and his students and colleagues at the University of Michigan in
the 1960s and the 1970s. In contrast with evolution strategies and evolutionary
programming, Holland's original goal was not to design algorithms to solve specific
problems, but rather to formally study the phenomenon of adaptation as it occurs in
nature and to develop ways in which the mechanisms of natural adaptation might be
imported into computer systems.

Holland's GA is a method for moving from one population of "chromosomes"

(e.g. strings of ones and zeros, or "bits") to a new population by using a kind of "natural

selection" together with the genetics−inspired operators of crossover, mutation, and

inversion.

Each chromosome consists of "genes" (e.g. bits), each gene being an instance of

a particular "allele" (e.g., 0 or 1). The selection operator chooses those chromosomes in

the population that will be allowed to reproduce, and on average the fitter chromosomes

produce more offspring than the less fit ones.

Crossover exchanges subparts of two chromosomes, roughly mimicking
biological recombination between two single-chromosome ("haploid") organisms;
mutation randomly changes the allele values of some locations in the chromosome; and
inversion reverses the order of a contiguous section of the chromosome, thus
rearranging the order in which genes are arrayed. Holland’s introduction of a
population-based algorithm with crossover, inversion and mutation was a major
innovation (Mitchell, Melanie, 1998).
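The three operators can be illustrated on chromosomes stored as Python lists of bits. This is a minimal sketch: the function names, the single crossover point and the per-gene mutation rate are assumptions made for the example.

```python
import random


def crossover(parent_a: list, parent_b: list, rng: random.Random) -> tuple:
    """One-point crossover: swap the tails of two equal-length chromosomes."""
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(chromosome: list, rate: float, rng: random.Random) -> list:
    """Flip each bit (allele) independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def invert(chromosome: list, rng: random.Random) -> list:
    """Reverse the order of a randomly chosen contiguous section."""
    start, end = sorted(rng.sample(range(len(chromosome) + 1), 2))
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


rng = random.Random(0)                 # fixed seed so the run is repeatable
a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [0, 1, 0, 0, 1, 1, 0, 1]
child_1, child_2 = crossover(a, b, rng)
print(child_1, child_2)
print(mutate(a, 0.1, rng))
print(invert(a, rng))
```

Crossover and inversion only rearrange existing genes, while mutation is the operator that can introduce allele values not present at a given position in either parent.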

2.2.17 Genetic Algorithm Techniques

Encoding techniques in genetic algorithms are problem specific: they transform a
solution of the problem into a chromosome. Various encoding techniques are used in
genetic algorithms, namely binary encoding, permutation encoding, value encoding and
tree encoding, as shown in Figure 2.8 (Anit Kumar, 2013). The most common form of
encoding is binary encoding; it gives many possible chromosomes.

Binary encoding is often not natural for many problems, and sometimes corrections must
be made after crossover and mutation.

Permutation encoding is best suited to ordering or sequencing problems. In permutation
encoding, each chromosome is a series of numbers in a sequence.

Value encoding is a technique in which each chromosome is a series of values; it is used
where more complex values are required. With this encoding it is often necessary to
develop new crossover and mutation operators specific to the problem.

Tree encoding is used mainly for evolving programs or expressions in genetic
programming, where crossover and mutation can be done relatively easily.

Chromosomes with binary encoding
  Chromosome A: 101100101100101011100101
  Chromosome B: 111111100000110000011111

Chromosomes with permutation encoding
  Chromosome A: 1 5 3 2 6 4 7 9 8
  Chromosome B: 8 5 6 7 2 3 1 4 9

Chromosomes with value encoding
  Chromosome A: 1.2324 5.3243 0.4556 2.3293
  Chromosome B: ABDJEIFJDHDIERJFDLDFL
  Chromosome C: (north), (south), (east), (west)

Chromosomes with tree encoding
  Chromosome A: y=(x/(9+y))
  Chromosome B: ( do_until step wall )

Figure 2.8: A Genetic Algorithm’s Techniques [34]

2.2.18 Generating Random Population

Genetic algorithms (GAs) are mainly inspired by Darwin's famous theory of survival of
the fittest. They are fundamentally based on a group of solutions, represented by
chromosomes, called a population.

A group of solutions is taken from one population and used to create a new
population, with the motivation that the new population of chromosomes can be better
than the old one. The solutions are selected according to their suitability to form
new solutions. This process is repeated until a satisfying condition is reached. The main
outline of a genetic algorithm is as follows (Mitchell Melanie, 1999):

Step 1: Generation of Random Population

A random group of solutions forms the initial population. Each solution has a unique genetic representation, and a generation of individuals is maintained at each point in the search process.

It is important that the initial population contains a wide variety of individuals, because later individuals are built from them. Diversity is obtained in the first instance through the configuration of the network and uniform random sampling. This diversity is not related to local optimization methods or assembly, and a lack of it leads to suboptimal solutions.
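Step 1 can be illustrated with a minimal sketch, assuming binary encoding as in Figure 2.8. The function name `random_population` and its parameters are hypothetical and chosen only for illustration; each gene is drawn uniformly at random so that the initial population is diverse.

```python
import random

def random_population(pop_size, chrom_length, rng=None):
    """Return pop_size binary chromosomes, each a list of chrom_length genes.

    Assumption: binary encoding; every gene is 0 or 1 with equal probability.
    """
    rng = rng or random.Random()
    return [[rng.randint(0, 1) for _ in range(chrom_length)]
            for _ in range(pop_size)]

# Example: six chromosomes of eight genes each (seeded for repeatability).
population = random_population(pop_size=6, chrom_length=8, rng=random.Random(0))
for chromosome in population:
    print("".join(map(str, chromosome)))
```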

Step 2: Evaluation of Fitness

The fitness function measures the objective that the genetic process optimizes. Each solution is evaluated to determine whether it will contribute to the solutions of the next generation. During evaluation, each chromosome is assigned a fitness value that depends on how close it comes to solving the problem. The fitness function must be designed carefully, because it determines which individuals reproduce and create the next generation of the population.
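As a hedged illustration of Step 2, the sketch below evaluates a population with a toy fitness function. The "count of ones" objective (`onemax_fitness`) is an assumption made only for this example; a real application would replace it with a problem-specific measure.

```python
def onemax_fitness(chromosome):
    """Toy objective (assumption): the more 1-genes, the fitter the chromosome."""
    return sum(chromosome)

def evaluate(population, fitness_fn):
    """Return the fitness value of every chromosome, in population order."""
    return [fitness_fn(chromosome) for chromosome in population]

scores = evaluate([[1, 0, 1], [0, 0, 0], [1, 1, 1]], onemax_fitness)
print(scores)
```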

Step 3: New Population

A new population is created through the following processes: selection, crossover, mutation, and acceptance (called elitism).

A) Selection

Two parent chromosomes are selected from the population according to their fitness: the better a parent's fitness, the greater its chance of being selected. The evaluation function therefore controls which individuals survive or reproduce in the coming generation. Chromosomes with a greater fitness value are more likely to produce offspring.
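One common way to make the chance of selection grow with fitness is roulette-wheel (fitness-proportionate) selection. The text does not name a specific selection scheme, so the sketch below is an assumption, and `roulette_select` is a hypothetical helper.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness.

    Assumption: fitness values are non-negative.
    """
    total = sum(fitnesses)
    if total == 0:
        # All individuals are equally (un)fit: fall back to a uniform choice.
        return rng.choice(population)
    pick = rng.random() * total          # a point on the wheel in [0, total)
    running = 0.0
    for chromosome, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return chromosome
    return population[-1]                # guard against rounding at the edge

rng = random.Random(1)
parents = [roulette_select(["A", "B", "C"], [1, 5, 2], rng) for _ in range(2)]
print(parents)
```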

B) Crossover

Offspring are the product of parent chromosomes: crossover forms a new chromosome by combining portions of two good parent chromosomes. The offspring therefore consists of a combination of the parents' genes.

A crossover point in the parent chromosomes is chosen at random. Then, the portions of the two chromosomes on either side of this point are swapped to form two new chromosomes. There are many types of crossover; the typical types in binary representation are one-point and two-point crossover. In almost all types of crossover, two strings are picked at random from the mating pool and some portions of the strings are exchanged between them to create two new strings.
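The one-point and two-point operators described above can be sketched as follows for chromosomes stored as lists. The function names are hypothetical, and the two-point version assumes chromosomes of at least three genes.

```python
import random

def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length parents after one random point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def two_point_crossover(a, b, rng):
    """Swap the middle segment between two distinct random points."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

# The binary chromosomes A and B of Figure 2.8 as parents.
parent_a = [int(g) for g in "101100101100101011100101"]
parent_b = [int(g) for g in "111111100000110000011111"]
child_1, child_2 = one_point_crossover(parent_a, parent_b, random.Random(3))
print("".join(map(str, child_1)))
print("".join(map(str, child_2)))
```

Both operators only exchange genes between the parents, so at every position the two children together carry exactly the two parental genes.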

C) Mutation

Mutation is a mandatory operation in a GA; it is the most fundamental way to modify a solution for the next generation. Mutation makes a slight random change to the value of an allele. It improves the general performance of the chromosomes and protects the search process from premature convergence. It also maintains diversity in the population.

The mutation point is chosen at random and the allele at that point is changed. Not every allele is mutated; whether an allele changes depends on the mutation probability. The mutation operation alters the strings in the hope of creating a better string. Because the operation is stochastic, an improvement is not guaranteed.
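For binary chromosomes, this corresponds to bit-flip mutation, sketched below under the assumption that each allele is flipped independently with the mutation probability. The name `bit_flip_mutation` is hypothetical.

```python
import random

def bit_flip_mutation(chromosome, mutation_prob, rng):
    """Flip each binary allele independently with probability mutation_prob."""
    return [1 - gene if rng.random() < mutation_prob else gene
            for gene in chromosome]

original = [1, 0, 1, 1, 0, 0, 1, 0]
mutated = bit_flip_mutation(original, mutation_prob=0.1, rng=random.Random(7))
print(original)
print(mutated)
```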

D) Elitism

Elitism is the mechanism that preserves a number of the best solutions in a GA. It can be implemented in many different ways. In a simple steady-state scheme, the genetic operators create two offspring, which are then compared with their parents, and the best two of the four solutions are kept for the next generation. Elitism can also be applied globally, in the generational sense: once the offspring population is formed, it is combined with the current population and the next generation is selected from the best solutions.
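The generational form of elitism can be sketched as below: the current and offspring populations are merged and the fittest individuals are kept. The helper name `elitist_replacement` and the use of the gene sum as fitness are assumptions made for the example.

```python
def elitist_replacement(current, offspring, fitness_fn):
    """Merge both populations and keep the best len(current) individuals."""
    combined = sorted(current + offspring, key=fitness_fn, reverse=True)
    return combined[:len(current)]

current = [[0, 0], [1, 0]]
offspring = [[1, 1], [0, 0]]
print(elitist_replacement(current, offspring, fitness_fn=sum))
```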

Step 4: Replacement

The new population replaces the old one and is used for the next run of the Genetic Algorithm.

Step 5: Testing and Termination

The solutions are tested against the termination condition. If it is satisfied, that is, the target fitness value of the best solution is met or the maximum number of generations is reached, the process terminates and returns the best solution in the current population.

Step 6: Looping of GA

The performance of a genetic algorithm is affected by the crossover and mutation operators. If the new generation produces an output that is equal or close to the required answer, the problem is solved. If not, the same process is repeated for the next generation, as it was for the parents, until a solution is found.

Figure 2.9: A diagram of the steps of a general Genetic Algorithm
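Steps 1 to 6 can be put together in a compact sketch. It assumes binary chromosomes, fitness-proportionate selection, one-point crossover, bit-flip mutation, and a single elite individual; the function `run_ga`, its parameter names, and its default values are illustrative choices rather than settings taken from the cited sources.

```python
import random

def run_ga(fitness_fn, chrom_length, pop_size=20, crossover_prob=0.9,
           mutation_prob=0.02, max_generations=200, target=None, seed=0):
    """Return (best chromosome, generations used) for a binary-coded GA."""
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [[rng.randint(0, 1) for _ in range(chrom_length)]
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        # Step 2: evaluate fitness and find the current best individual.
        best = max(population, key=fitness_fn)
        # Step 5: stop when the target fitness is reached.
        if target is not None and fitness_fn(best) >= target:
            return best, generation
        # Step 3: build the new population; the elite is copied unchanged.
        weights = [fitness_fn(c) + 1e-9 for c in population]  # avoid all-zero weights
        offspring = [best[:]]
        while len(offspring) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)  # selection
            if rng.random() < crossover_prob:                       # crossover
                point = rng.randint(1, chrom_length - 1)
                child = p1[:point] + p2[point:]
            else:
                child = p1[:]
            child = [1 - g if rng.random() < mutation_prob else g   # mutation
                     for g in child]
            offspring.append(child)
        # Step 4: replacement; Step 6: loop with the new population.
        population = offspring
    return max(population, key=fitness_fn), max_generations

best, generations = run_ga(sum, chrom_length=12, target=12)
print(best, generations)
```

Because the elite individual is always carried over, the best fitness in the population never decreases from one generation to the next.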


2.3 Previous Studies

2.3.1 Previous Analysis of Methods in Steganography

Many methods have been proposed. One of them is the generalization of Pixel Value Differencing (PVD) by Jen-Chang Liu and Ming-Hong Shih (2008), a method for data hiding in gray-level images. Their work also compares steganography with cryptography.

Jen-Chang Liu [18] increased the capacity by proposing two extensions of the PVD method: a block-based approach and a Haar-based approach. In the first, the cover image is divided into square blocks of n pixels and non-overlapping horizontal blocks. In the second, the cover image is decomposed by applying the 2-D integer Haar wavelet.

It was found that the proposed method significantly increased the data hiding capacity, but at the expense of the quality of the stego-image. The Regular-Singular (RS) diagram and the difference histogram confirmed the security of the proposed methods.

Souvik Bhattacharyya et al. [53] observed that concerns about the confidentiality of information on the internet have recently increased because of illegal access to information and because the general environment of the internet has become more hostile. Steganography has therefore become a wide field of research that tries to provide more protection for hidden data.

They presented a new method of hiding information in text by inserting an extra blank space between words of odd or even size according to the embedding sequence. In some cases, the blank spaces between the words of the original cover text may also be used to map each two bits of the embedding sequence.

In the proposed system, the secret message is first encoded using the proposed encryption algorithm. The encrypted message is then embedded in the cover text using the proposed embedding algorithm to form the stego text.


At the receiver side, the extraction process starts by extracting the encoded form of the message. After extraction, the encrypted message goes through the decryption process, and finally the authenticity of the message is checked through an integrity check.

The results show that the message can be transferred more securely than with earlier techniques, with the addition of authenticity checking of the secret information.

Christian Grothoff et al. [8] studied the systems that steganographically embed

information in the “noise” created by automatic translation of natural language

documents. They focus on two problems: the generation of plausible steganographic texts

and avoiding transmission of the original source for stego objects.

The key idea behind translation-based steganography is to hide information in

the noise that invariably occurs in natural language translation.

They propose a protocol for covert message transfer in natural language text, for which a proof-of-concept implementation exists.

The steganographic protocol assumes that the sender and receiver have previously agreed on a shared secret key. In order to send a message, the sender first needs to obtain an original text in some source language.

The experimental results revealed that different configurations of the system produce translations of varying quality, but even the quality degradation is not predictable. An adversary's task is made more difficult by the fact that the translation is transmitted with no reference to the source text.

It was demonstrated that the variations produced by the steganographic encoding are similar to those of various unmodified machine translation systems, showing that it would be impractical for an adversary to establish the existence of a hidden message.

So far, the highest bitrate that their prototype has achieved with this steganographic method is about 0.33%.


Jessica Fridrich et al. [19] made a theoretical study analyzing a newly proposed algorithm called Highly Undetectable steGO (HUGO) as part of the Break Our Steganographic System (BOSS) challenge. The aims were to identify the features that can detect the payload hidden by such schemes and to get a better picture of the benefit of adaptive steganography with general selection channels.

The work is mainly meant to improve the detection of adaptive steganography such as HUGO, which makes its hidden changes in hard-to-model areas of the cover. HUGO is characterized by the preservation of a high-dimensional feature vector, and so it takes into account a large number of complex dependencies among neighboring pixels.

HUGO can be applied to different domains and media, although it was designed for images in raster format.

In summary, it was not possible to exploit the fact that, for HUGO, the probability of a hidden change at each pixel can be estimated, so giving the Warden probabilistic information about the chosen channel does not seem to be a weakness. In addition, steganalysis needs to apply high-dimensional features and scalable machine learning as the level of sophistication of steganographic schemes increases.

Most of the features in this high-dimensional feature vector are uninformative, and preserving them weakens the algorithm. Adding more diverse features, in turn, increases the dimensionality.

Eiji Kawaguchi and Richard O. Eason [12] proposed a new steganographic technique that uses an image as the vessel data and embeds the secret information in the bit-planes of the vessel. They also compared watermarking and Bit-Plane Complexity Segmentation (BPCS) steganography in two fundamental ways.

Their experiments with BMP images have shown capacities exceeding 50% of

the original image size.

T. Morkel et al. [31] published a paper offering an overview of the different algorithms used for image steganography to illustrate the security potential of steganography for business and personal use. The comparison is based on a set of criteria that have been identified for image steganography. They found that there is a large selection of approaches to hiding information in images.

All the major image file formats have different methods of hiding messages,

with different strong and weak points respectively. Where one technique lacks in

payload capacity, the other lacks in robustness.

Least Significant Bit (LSB) in both BMP and GIF makes up for this, but both

approaches result in suspicious files that increase the probability of detection when in

the presence of a warden.

Thus, to decide which steganographic algorithm to use, an agent has to decide on the type of application the algorithm is intended for and whether to compromise on some features to ensure the security of others.

Hedieh Sajedi and Mansour Jamzad [41] proposed an adaptive steganography technique that can effectively defend against the best-known steganalysis algorithms. Their idea is to embed the secret data in contourlet coefficients through an iterative embedding procedure that reduces the distortion of the stego image.

Embedding is done by changing the coefficient values in proportion to the regions in which the coefficients reside, and the hidden data can be retrieved with a zero bit error rate.

The results showed that state-of-the-art steganalysis methods cannot reliably discriminate between clean images and the stego images produced by this method. Experiments also illustrated that cover selection measures improve the undetectability of stego images by the steganalyzer.

Image complexity criteria are very fast measures for selecting a proper cover image from the database, but they are not very precise. In contrast, exact measures are slow but yield the best cover image with respect to the confidential data. The results also indicated that the amount of changes and visual quality measures are reliable criteria for cover selection.


Finally, the results demonstrated that, by using cover selection, one can embed many more bits in a suitable cover image.

Babloo Saha and Shuchi Sharma [37] made a theoretical study giving a thorough understanding of, and tracing the evolution of, the existing digital image steganography techniques for data hiding in the spatial, transform, and compression domains.

The study covered and integrated recent research work without going into much detail on steganalysis, which is the art and science of defeating steganography.

They presented recent research in the field of steganography deployed in the spatial, transform, and compression domains of digital images. Transform domain techniques change the frequency coefficients instead of manipulating the image pixels directly, so distortion is kept to a minimum; this is why they are preferred over spatial domain techniques.

They found that hiding more data results directly in more distortion of the image, so the steganography technique deployed depends on the type of application it is designed for. They also found that steganography can be misused like other technologies.

For instance, terrorists may use this technique for confidential, secure communication, or anti-virus systems can be fooled if viruses are transmitted in this way. Nevertheless, steganography has numerous useful applications and will remain a point of attraction for researchers.

Cheng-Hsing Yang et al. [38] proposed a novel and efficient steganographic method for hiding data in gray-level images; it embeds a large amount of data and takes human vision into consideration.

The experimental results showed that the method not only has a larger capacity and can pass detection programs, but that the embedded data are also imperceptible to the human eye, as indicated by the higher Peak Signal to Noise Ratio (PSNR) values measured in the experiments.


Table 2.2: Summary of studies on steganography

| Author | Applications studied | Result |
|---|---|---|
| Jen-Chang Liu and Ming-Hong Shih (2008) | Method for data hiding in gray-level images, proposing two extensions of the PVD method. | The data hiding capacity was significantly increased, but at the expense of the quality of the stego-image. |
| Souvik Bhattacharyya et al. (2010) | Hiding in a text by inserting extra blank space between the words of odd or even size according to the embedding sequence. | More secure transfer of the message than earlier techniques, with the addition of authenticity checking of the secret information. |
| Christian Grothoff et al. (2009) | Studied the systems that steganographically embed information in the “noise” created by automatic translation of natural language documents. | Different configurations of the system produce translations of varying quality, but even the quality degradation is not predictable. |
| Jessica Fridrich et al. (2011) | Analyzing a newly proposed algorithm called HUGO. | It was not possible to exploit the fact that, for HUGO, the probability of a hidden change at each pixel can be estimated, so giving the Warden probabilistic information about the chosen channel does not seem to be a weakness. |
| Eiji Kawaguchi and Richard O. Eason (2007) | Proposing a new steganographic technique that uses an image as the vessel data and embeds secret information in the bit-planes of the vessel. | The experiments with BMP images have shown capacities exceeding 50% of the original image size. |
| T. Morkel et al. (2005) | Offering an overview of the different algorithms used for image steganography to illustrate the security potential of steganography for business and personal use. | The choice of algorithm depends on the type of application and on whether the user is willing to compromise on some features to ensure the security of others. |
| Hedieh Sajedi and Mansour Jamzad (2010) | Proposing an adaptive steganography technique that can effectively defend against the best-known steganalysis algorithms. | By using cover selection, one can embed many more bits in a suitable cover image. |
| Babloo Saha and Shuchi Sharma (2012) | Giving a thorough understanding of, and tracing the evolution of, the existing digital image steganography techniques for data hiding in the spatial, transform, and compression domains. | Hiding more data results directly in more distortion of the image. Steganography can be misused like other technologies. |
| Cheng-Hsing Yang et al. (2011) | Proposing a novel and efficient steganographic method. | The method has a larger capacity and can pass detection programs; the embedded data are imperceptible to the human eye, as indicated by the higher PSNR values measured in the experiments. |


2.3.2 Previous Survey of Genetic Algorithm Applications

The idea of the Genetic Algorithm (GA) dates from 1975, when it was introduced by Holland. A GA is basically an algorithm of mathematical expressions and logic, based on the concept of natural genetics.

Here are some of the theoretical and practical applications that use the idea of a genetic algorithm in research.

Komal R. Hole and Prof. Vijay S et al. [43] presented a theoretical study that gives a brief overview of the canonical genetic algorithm and reviews the tasks of image pre-processing. The main task is to enhance image quality in order to obtain the required image perception. They introduced various approaches based on genetic algorithms to obtain images with good and natural contrast.

The study includes the definitions of image enhancement and image segmentation, the need for image enhancement, and how an image can be enhanced and segmented using the Genetic Algorithm.

Rajesh Kumar et al. [50] compared conventional image fusion techniques with techniques based on genetic algorithms. The GA-based techniques gave much better results than the traditional techniques. The experimental results showed that GA-based image fusion schemes perform better than existing schemes.

Raj Kumar Mohanta [49] reviewed the applications of genetic algorithms to image segmentation. Segmentation is a difficult task in image processing, and subsequent tasks, including object detection, feature extraction and processing, face identification, and classification, depend on the quality of the segmentation process.

The results indicate that genetic algorithms have many advantages in obtaining the optimal solution and have been shown to be a robust search method over a large search space. The optimal result depends on the chromosome encoding and the genetic operators involved, as well as on the fitness function.


P. Surekha and S. Sumathi [47] proposed a new method to improve digital images in the Discrete Wavelet Transform (DWT) domain. The trade-off between transparency and robustness is treated as an optimization problem and solved by applying genetic algorithms.

A series of experiments was performed by varying several GA parameters, such as the number of generations, population size, crossover probability, and mutation probability.

The experimental results show that this approach is secure and robust against attacks such as filtering, additive noise, rotation, scaling, cropping, and Joint Photographic Experts Group (JPEG) compression. Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE), and computational time were evaluated for a set of images.

Mantas Paulinas and Andrius Ušinskas [23] surveyed the use of GAs, which are constantly gaining popularity in image processing. Various tasks, from basic image contrast and level-of-detail enhancement to complex filter model parameters, are solved using this paradigm. The algorithm makes it possible to perform a robust search without becoming trapped in local extrema.

Different authors adopt GAs to solve a very big variety of simple and difficult

tasks. Every approach is unique, with different information encoding types,

reproduction and selection schemes. The success of optimization strongly depends on

the chosen chromosome encoding scheme, crossover and mutation strategies as well as

a fitness function. For each problem, careful analysis must be done and the correct

approach chosen.

K. F. Man et al. [20] presented a theoretical study of the prospects of GAs in computing. They argued that genetic algorithms are a powerful unbiased optimization technique for sampling a large solution space. Because of this unbiased stochastic sampling, there are large classes of problems that appear to be more amenable to solution by GAs than by any other available optimization technique, such as real-time applications and systems on the Internet. Perhaps the most promising areas of application are emerging hybrid AI systems that combine GAs with Neural Networks (NN) and fuzzy logic.

T. R. Gopalakrishnan Nair, Suma V, and Manas S [32] proposed a steganography method that uses a genetic algorithm to protect against Regular-Singular (RS) attacks in color images. Implementing the natural evolution of the stego image with a genetic algorithm makes it possible to achieve optimized security and image quality.

RS analysis is one of the strongest steganalysis methods; it detects the secret message by statistical analysis of the pixel values. The objective of the proposal is to establish a secure steganographic model that is highly resistant to RS analysis. In this method, the pixel values of the stego image are modified by the genetic algorithm so that they retain their statistical characteristics.

As a result, it is difficult to detect the existence of the secret message by RS analysis, and the approach also enhances the visual quality of the stego image. Nevertheless, as the length of the secret message increases, the probability that RS analysis detects the secret message also increases.

The study also surveys steganographic methods such as Least Significant Bit (LSB) replacement, in which the least significant bit of each pixel value is replaced with a bit of the message.
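LSB replacement can be sketched in a few lines. The example assumes the cover is a flat list of 8-bit gray-level pixel values and the message is a list of bits; the function names `embed_lsb` and `extract_lsb` are hypothetical, and no genetic optimization is applied here.

```python
def embed_lsb(pixels, bits):
    """Replace the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the message bit
    return stego

def extract_lsb(pixels, n_bits):
    """Read the message back from the least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]

cover = [100, 101, 102, 103]
message = [1, 1, 0, 0]
stego = embed_lsb(cover, message)
print(stego)                      # each pixel changes by at most 1
print(extract_lsb(stego, 4))
```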

Shen Wang, Bian Yang and Xiamu Niu [52] presented a new steganography method based on a genetic algorithm. The secret message is embedded in the Least Significant Bit (LSB) of the cover image, and the pixel values of the stego image are then modified by the genetic algorithm to preserve their statistical characteristics.

Thus, the existence of the secret message is hard to detect by Regular-Singular (RS) analysis. Meanwhile, better visual quality can be achieved.

The experimental results demonstrate the effectiveness of the proposed algorithm in resisting steganalysis with better visual quality. The embedding capacity is 90%.


Amrita Khamrui [5] authenticated the image to ensure security against Regular-Singular (RS) analysis, which is the most notable steganalysis algorithm. RS analysis detects the stego message by statistical analysis of the pixel values.

The cover image can be either grayscale or color. The cover and authenticating images are both benchmark images. The proposed techniques obtained a high PSNR along with better image fidelity for various images. The payload may be increased based on the requirement.

A comparative study of the Peak Signal to Noise Ratio (PSNR) was made between various techniques. The PSNR was found to lie between approximately 35 and 50, which is satisfactory, since a higher PSNR indicates better image quality. It was also noticed that the PSNR gradually increases, which indicates an improvement across the different techniques.
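For reference, the PSNR used in such comparisons is computed from the Mean Square Error (MSE) between the cover and the stego image as PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit images (MAX = 255) stored as flat lists of equal length; the function names are hypothetical.

```python
import math

def mse(cover, stego):
    """Mean Square Error between two equally sized pixel lists."""
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)

def psnr(cover, stego, max_value=255):
    """Peak Signal to Noise Ratio; infinite when the images are identical."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)

cover = [100, 101, 102, 103]
stego = [101, 101, 102, 102]
print(round(psnr(cover, stego), 2))
```

A smaller embedding distortion gives a smaller MSE and therefore a higher PSNR.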

Elham Ghasemi et al. [13] proposed a method that embeds the message in Discrete Wavelet Transform (DWT) coefficients using a GA and then applies the Optimal Pixel Adjustment Process (OPAP) algorithm to the resulting embedded image. It introduces a novel steganography technique to increase the capacity and the imperceptibility of the image after embedding.

The GA is employed to obtain an optimal mapping function that lessens the error difference between the cover and the stego image, and the block mapping method is used to preserve the local image properties.

The OPAP is also applied to increase the hiding capacity of the algorithm in comparison with other systems. However, the computational complexity of the new algorithm is high. The simulation results showed that the capacity and the imperceptibility of the image increased simultaneously.

Santi P. Maity, and Malay K. Kundu [29] investigated the scope of usage of GA

for optimality of data hiding in digital images and proposed two algorithms. The first

algorithm proposes data hiding method with improved payload capacity intended for

covert communication. Decoding reliability improves as the number of GA iterations increases when the set of parameter values is fixed. The algorithm is shown to be secure against steganalysis tests based on higher-order statistics. The second method

proposes an invisible image-in-image communication through a noisy channel where

linear, power-law and parabolic functions are used to modulate the auxiliary messages.

The experimental results show that parabolic function offers higher visual and

statistical invisibility and reasonably good robustness, whereas the linear function offers

higher robustness with reasonably good invisibility.

Christine K. Mulunda [9] proposed a secure text steganography algorithm based on the genetic method. A genetic algorithm technique is not prone to visual attacks because of its use of numbers.

This is not the case for format-based techniques, which modify existing text in order to hide the steganographic text by resizing fonts, inserting spaces or non-displayed characters, and distributing deliberate misspellings throughout the text, among others. The experimental results showed that this approach works, achieving effective optimization, security, and robustness.

Table 2.3(A): Summary of studies on Genetic Algorithm

| Author | Applications studied | Result |
|---|---|---|
| Komal R. Hole and Prof. Vijay S et al. (2013) | Giving a brief overview of the canonical genetic algorithm and reviewing the tasks of image pre-processing. | The need for image enhancement; the image can be enhanced using the Genetic Algorithm. |
| Rajesh Kumar et al. (2011) | Comparing conventional image fusion techniques with techniques based on genetic algorithms. | GA-based image fusion schemes perform better than existing schemes. |
| Raj Kumar Mohanta (2012) | Reviewing the applications of genetic algorithms to image segmentation. | Genetic algorithms have many advantages in obtaining the optimal solution. The optimal result depends on the chromosome encoding and the genetic operators involved. |
| P. Surekha and S. Sumathi (2011) [47] | Proposing a new method to improve digital images in the Discrete Wavelet Transform (DWT) domain. | Secure and robust against attacks such as filtering, additive noise, rotation, scaling, cropping, and JPEG compression. Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE), and computational time were evaluated for a set of images. |


Table 2.3(B): Summary of studies on Genetic Algorithm

| Author | Applications studied | Result |
|---|---|---|
| T. R. Gopalakrishnan Nair, Suma V, and Manas S (2012) | Proposing a steganography method that uses a genetic algorithm to protect against Regular-Singular (RS) attacks in color images. | It is difficult to detect the existence of the secret message by RS analysis, and the approach enhances the visual quality of the stego image. As the length of the secret message increases, the probability that RS analysis detects it also increases. |
| Shen Wang, Bian Yang and Xiamu Niu (2010) | Presenting a new steganography method based on a genetic algorithm. | Demonstrated the effectiveness of the proposed algorithm in resisting steganalysis with better visual quality. The embedding capacity is 90%. |
| Amrita Khamrui (2014) | Authenticating the image to ensure security against Regular-Singular (RS) analysis. | A comparative study of PSNR was made between various techniques. The PSNR lies between approximately 35 and 50, which is satisfactory, since a higher PSNR indicates better image quality. The PSNR gradually increases, which indicates an improvement across the different techniques. |
| Elham Ghasemi et al. (2012) | Proposing a method that embeds the message in DWT coefficients using a GA and then applies the Optimal Pixel Adjustment Process (OPAP) algorithm to the resulting embedded image. | The simulation results showed that the capacity and the imperceptibility of the image increased simultaneously. |
| Santi P. Maity and Malay K. Kundu (2008) | Proposing two algorithms to investigate the scope of usage of GA for optimality of data hiding in digital images. | The parabolic function offers higher visual and statistical invisibility and reasonably good robustness; the linear function offers higher robustness with reasonably good invisibility. |
| Christine K. Mulunda (2013) | Proposing a secure text steganography algorithm based on the genetic method. | Achieving effective optimization, security, and robustness. |

2.3.3 Previous Review of HTML Web Page Steganography

There has been extensive research in the field of HTML web page steganography. Some of the works on data steganography are listed below.

L. Polak [22] proposed a method of detecting and removing hidden content that could be transmitted through the HTML code of World Wide Web (WWW) pages. The method provides a procedure for monitoring changes in the structure of web pages and introduces a measure that describes how the attributes of tags are ordered in Hyper Text Markup Language / EXtensible Markup Language (HTML/XML) documents. It can be applied to monitor web pages and to ensure that nobody can exploit them as stego channels (L. Polak, 2010).

Chintan Dhanani [6] proposed hiding hexadecimal data in the HTML document to overcome the problems of a limited number of hiding places and an increase in document size, so that one hiding place can hide 4 bits. The paper also surveys the classification of steganography techniques and the techniques that have already been implemented to hide information in web documents.

The author observed that data hidden in a web document is less suspicious than data hidden in other carriers, because HTML web pages are now a routine part of everyone’s life and an HTML document contains a considerable number of tags, attributes, and other elements in which data can be hidden.

Mohit et al. [45] proposed a technique that uses HTML tags and their attributes to hide the secret message. In this method, messages are hidden by changing the order of attributes, since the ordering of attributes has no impact on the appearance of HTML documents. The key file is essentially a collection of key combinations stored in the form of rows and columns.

These combinations are generated by thoroughly scanning the HTML documents. The attribute combinations used in the HTML tags are utilized to generate the key file.
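The idea of carrying information in attribute order can be illustrated with a deliberately simplified sketch: one tag with two attributes hides one bit, where alphabetical order stands for 0 and reverse order for 1. This convention and the helper names are assumptions made for illustration; the sketch does not reproduce the key-file scheme of [45].

```python
def embed_bit(attrs, bit):
    """Order two (name, value) attribute pairs so that their order encodes bit."""
    first, second = sorted(attrs)
    return [first, second] if bit == 0 else [second, first]

def extract_bit(attrs):
    """Alphabetical attribute order means 0, reverse order means 1."""
    return 0 if attrs[0][0] < attrs[1][0] else 1

def render(tag, attrs):
    """Serialize the tag; a browser displays it identically in either order."""
    return "<" + tag + "".join(f' {name}="{value}"' for name, value in attrs) + ">"

attrs = [("width", "10"), ("height", "20")]
print(render("img", embed_bit(attrs, 0)))
print(render("img", embed_bit(attrs, 1)))
```

With one bit per two-attribute tag, the capacity is limited by the number of suitable tags in the page, which is consistent with the small payload reported for this family of methods.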

2.3.4 Comparative Studies with Other Related Methods

In recent years, many good steganography methods have been proposed. HTML steganography is a branch of data hiding that uses an HTML web document as a cover.

Using HTML has some benefits: a large number of pages is available for hiding data, and decoding that data is very hard for an unauthorized user. HTML steganography methods include the use of null spaces, attributes on tags, attribute value enclosures and the case of characters. The performance of such algorithms is evaluated by the Largest Embedding Capacity (LEC) and the level of security (Dhammjyoti V. Dhawase, 2014).

Here is a comparison between some methods that use a web page as a cover. The comparison is made according to experimental results such as strong anti-testing capability, strong security, good imperceptibility and larger embedding capacity.

Mohit Garg et al. [45] proposed a novel approach to text steganography that uses HTML tags and attributes to hide secret messages. An HTML tag can contain numerous attributes, and the order of the attributes in the tag does not affect the output of the web page, so data can be hidden using the attribute order.

Advantage: This method offers high security for hiding messages.

Disadvantage: This method can be used only when a small amount of data needs to be concealed.

Shingo Inoue et al. [30] proposed several methods for hiding data in XML documents using null spaces, white space or invisible characters.

Advantage: These methods can be applied to existing XML documents easily.

Disadvantage: This method has weak security.

There are some conditions for applying this method:

1. The application does not depend on the order of the elements.

2. The elements are not reordered before the secret data is extracted.

Xin-Guang [33] gives the basic idea of modifying the written state (case) of letters. This method hides secret information in hypertext by modifying the case of the mark-up letters.

Advantage: This system provides an efficient method for hiding data from hackers and sending the hidden data to the destination safely, without changing the size of the file even after encoding.

Disadvantage: The resistance to detection and the security of this system are very weak, as hidden data can easily be detected and attacked.


Mohammad Shirali Shahreza [44]: the main idea of this method is to hide coded data in the ID attribute of the HTML document tags, replacing the colour code or tag id with hexadecimal data.

Advantage: This method is effective, secure and robust; it works to increase the embedding capacity and to provide strong data security.

Disadvantage: The embedding capacity is only medium and the file size changes.

Chintan Dhanani [6] uses relative links and multi web page embedment technology to move from one web page to another. The method divides the message into several parts and embeds them in different pages.

Advantage: This approach provides a high embedding capacity and strong data security.

Disadvantage: It is applicable to text files only.

The comparison shows that most of the methods have a low embedding capacity, although imperceptibility is good in all of them, as summarized in Table 2.4.

Table 2.4: Comparison between some methods that use a web page

Method | Imperceptibility | Embedding Capacity | Change Size | Detection | Security
Using null space, white space or invisible characters | Good | Low | Yes | Weak | Weak
Modifying the written state (case) of letters | Good | Low | No | Weak | Weak
Changing the order of attributes | Good | Low | No | Yes | Strong
Tag displacement | Good | Low | Yes | Yes | Strong
Colour code or tag id replacement with hexadecimal data | Good | Medium | Yes | Yes | Average
Using relative links & multi web page embedment technology | Good | High | No | Yes | Strong

2.4 Chapter Summary

The steganography approaches and the fundamental mechanisms for hiding data have been studied. In this chapter the fundamental objectives of computer security and steganography have been discussed, because they are the most important aspects of information security.

The main goal of steganography has been studied, and a comparison has been made between cryptography and steganography. The types of steganography approaches have also been discussed, and then the main steps of genetic algorithms have been studied as part of evolutionary algorithms.

The performance analysis of methods in the steganography literature has been discussed. The literature on genetic algorithm applications has been reviewed, and some of the theoretical and practical applications that use the idea of a genetic algorithm in research, together with their experimental results, have been discussed. Finally, they have been compared with other studies related to other security methods.


CHAPTER 3

RESEARCH PROCEDURES

3.1 Introduction

This chapter presents the computational model of the HTML stego system architecture (StegnoTag). This computational model fulfils the abstract specification and design criteria that have been set.

For each agent an algorithm has been proposed. This algorithm should be implemented in order to achieve the desired design features and the integration of GA into the steganography system as proposed in this work.

The details of the most important stages of solving the problem are presented practically. They are given in different sections, structured to follow the abstract model gradually. Each of the subsequent sections describes different algorithms that have been built upon the basic mechanisms of the proposed system.

3.2 Hiding Data in The Stored Web Page

The Internet has developed and expanded rapidly in recent years, with an increasing rate of information exchange and turnover. Handling such a huge amount of information has raised the need for security and confidentiality, especially for applications that deal with database systems, such as email, e-commerce, medical records and billing information. The database is considered the main store for Internet components and information, and it may also represent the infrastructure for websites.

Because of the value and sensitivity of the information held in databases, they are vulnerable to hacking attacks, so protecting this data has become a must.

Security consideration must be given to the database itself as well as to the surrounding environment, including the underlying applications and the Internet system.

Figure 3.1: The criteria to protection database on internet environment

3.3 Security Definition on The Project

For the system that hides data in the stored web page to function properly, it needs some elements for accurate information processing, such as the security controls applied for data protection.

Hwee Hwa Pang [16] published a paper on steganographic schemes that introduces StegFD, a steganographic file driver that securely hides user-selected files in a file system so that, without the corresponding access keys, an attacker would not be able to deduce their existence (HweeHwa Pang, June 2004).

The proposed system has four security techniques. The first creates the database in XML, a language that does not need a local server to store a database. The second encrypts the data with a key to increase the level of protection. The third hides the database in HTML documents by using a genetic algorithm. The last extracts the database from the cover HTML document.

Hiding a database is one of the most difficult techniques, because the database must be updated dynamically and it has different attributes.

[Figure 3.1 labels: Database, Security, Survivability, Visibility, Attacker]

Figure 3.2: The four security techniques

3.4 The Proposed Technique



The proposed technique relies on the fact that the ordering of attributes in the HTML tags has no impact on the appearance of the document. This ordering can be used to hide the data efficiently. The proposed technique treats each tag as a gene and each attribute as a chromosome.

Figure 3.3: Tag characteristics in the proposed technique

The larger the page and the more attributes it contains, the more bits can be hidden. In the proposal a relation table has been created; it consists of primary data in two columns, and each row consists of two chromosomes that represent a gene.

The gene contains a set of properties. Hexadecimal encoding has been chosen for the attribute conversion, to overcome the problems of capacity and of the limited amount of data that can be hidden.

[Figure 3.2 stages: Create database (XML), Encryption database, Hide database, Extract database]

[Figure 3.3 labels: Tag (Gene), containing Attribute1 (Chromosome1) and Attribute2 (Chromosome2)]

In this encoding, chromosomes are represented using hexadecimal digits (0-9, A-F), so in this project a gene can hide 8 bits: each of the two chromosomes in a row is supposed to hide 4 bits.
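The 4 + 4 bit split described above can be illustrated with a minimal sketch. The class and method names (`HexNibbles`, `Split`, `Join`) are hypothetical and are not taken from the StegnoTag source; the sketch only shows how one byte maps to two hexadecimal digits, one per chromosome.

```csharp
using System;

public static class HexNibbles
{
    // Splits one byte into its high and low 4-bit halves
    // (one hexadecimal digit for each chromosome of a gene).
    public static (int High, int Low) Split(byte value)
    {
        return (value >> 4, value & 0x0F);
    }

    // Rebuilds the byte from the two 4-bit halves.
    public static byte Join(int high, int low)
    {
        return (byte)(((high & 0x0F) << 4) | (low & 0x0F));
    }
}
```

For example, `HexNibbles.Split(0x2F)` yields the digits 2 and F, and `HexNibbles.Join(2, 15)` restores the byte 0x2F.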

The relation table stores the primary tag, from which the search is supposed to start before continuing randomly. The algorithm examines each chromosome (attribute) of each HTML tag to check whether the chromosome exists in the primary field of the key file.

If the chromosome exists in the primary field, the algorithm searches for its corresponding secondary chromosome in the same HTML tag. If the secondary chromosome is found, this combination of chromosomes is used to hide the bit; if not, the algorithm skips this chromosome.

The hidden bit is determined by the order of the attributes in the attribute combination. If the primary attribute is followed by the secondary attribute, the pair hides bit ‘1’ of the hexadecimal number; otherwise it hides bit ‘0’. A genetic algorithm can be applied in this step.

Data is extracted from the cover page by identifying the chromosome combinations that hide a bit and then finding the bits that correspond to the order of those attributes. If the primary chromosome is located before the secondary chromosome, the algorithm reads bit ‘1’ of the hexadecimal number; otherwise it reads bit ‘0’.
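The ordering rule can be sketched as follows. This is a simplified illustration, not the StegnoTag code: it assumes a tag is reduced to a list of attribute names, and the names `AttributeOrderCodec`, `EmbedBit` and `ExtractBit` are hypothetical. A real implementation would move the attribute values together with the names.

```csharp
using System;
using System.Collections.Generic;

public static class AttributeOrderCodec
{
    // Orders the pair so that primary-before-secondary encodes 1 and the reverse encodes 0.
    // Returns false (list untouched) when the tag lacks either attribute, so the tag is skipped.
    public static bool EmbedBit(List<string> attributes, string primary, string secondary, int bit)
    {
        int p = attributes.IndexOf(primary);
        int s = attributes.IndexOf(secondary);
        if (p < 0 || s < 0) return false;
        bool primaryFirst = p < s;
        if (primaryFirst != (bit == 1))
        {
            // Swap the two attribute names in place; other attributes keep their positions.
            attributes[p] = secondary;
            attributes[s] = primary;
        }
        return true;
    }

    // Reads the bit back; null means this tag carries no bit for the given pair.
    public static int? ExtractBit(IList<string> attributes, string primary, string secondary)
    {
        int p = attributes.IndexOf(primary);
        int s = attributes.IndexOf(secondary);
        if (p < 0 || s < 0) return null;
        return p < s ? 1 : 0;
    }
}
```

With the attribute list `align, width, border` and the key pair (`width`, `align`), embedding bit 1 reorders the list to `width, align, border`, and extraction with the same key pair returns 1.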

3.5 The Philosophy of Genetic Algorithm Application

A genetic algorithm is based on generating a group of candidate solutions by random search inspired by nature, evaluating each candidate (here, how good the hiding is) and determining the best solution by its fitness value, so there must be a way to specify how good a solution is. Chromosomes (attributes) represent the solutions within the genetic algorithm. The two basic components of a chromosome are the coded solution and its fitness value.

The population is a set of chromosomes (attributes). During the hiding process, the genetic algorithm selects a chromosome from the tags that form the population, computes the fitness value of the chromosome and produces new chromosomes called offspring. The fitness indicates how good the solution is and how close the chromosome is to the optimal solution.

The aim of this proposal is to analyze a web page and then find or create the number of attributes that must be used, so that the data can be hidden without exceeding the appropriate number of created fractures, which is identified by the fitness function. The proposal also aims to reduce the number of genes handled by the genetic algorithm, and therefore the time needed to hide the data, by means of the primary tag (relation table) and the fitness function.

The offspring form the new population and replace some of the chromosomes in the existing population. The population tracks the worst and the best chromosomes and stores additional statistical information to determine the stopping criterion (all data hidden). The following are the proposed steps of the genetic algorithm:

1. Selection:

Pairs of individuals are selected by roulette wheel selection based on the fitness values of the individuals. Selection starts with a primary tag (relation table), which is hidden within the first field of the table, and then becomes more accurate in selecting the second attribute. After that it changes from sequential properties to random properties, which does not affect the page.

2. Fitness Function:

This function assesses the chromosomes of the current generation in order to select the best chromosomes for producing the offspring. Several factors affect the fitness function, for example the number of attributes.

3. Generation Evaluation:

The generation as a whole is assessed on the basis of fitness, since the development of future generations is judged by the value of this function. The fitness function is evaluated for each chromosome, a fitness percentage is assigned to each chromosome, and the parents of the next generation are chosen by roulette wheel and random selection. When the fitness function is calculated, the number and size of the attributes must not exceed the capacity of the page.

nr = the number of the attributes.

pa = the capacity of the attributes on the page.

Given a space P of candidate solutions to a problem, the fitness function f(p) measures the quality of a solution p in P. The quality of a solution may not vary smoothly as the genes comprising it vary, because of genetic operators such as crossover and mutation.

$f(p) = nr \cdot pa$

So $f(p) = \sum_i (nr \cdot pa) < \text{capacity of the page from tags}$

4. Crossover:

Two attributes (chromosomes) are selected randomly and a crossing point between them is chosen for the crossover. Finally, the codes that follow the crossing point are swapped between the two parents. Either a single point or several uniform points can be selected.

5. Mutation:

Mutation prevents the program from ending prematurely. It is done either by flipping a bit of the chromosome from 1 to 0 and vice versa, or by choosing random positions at which 0 is changed to 1 and vice versa.

6. Replacement:

When two new attributes have been formed from two parent chromosomes, the four are not all kept in the population; the parents are exchanged for the offspring so that the size of the generation is maintained.

7. Stop:

The algorithm stops when the process of hiding the database is complete.
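The steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the Geneticalgorithm.dll implementation: chromosomes are modelled as equal-length bit strings, the capacity check takes per-tag arrays for nr and pa, and all names (`GaOperators`, `FitsPage`, `RouletteSelect`, `Crossover`, `Mutate`) are hypothetical.

```csharp
using System;
using System.Linq;

public static class GaOperators
{
    // Capacity constraint from step 3: the sum of nr[i] * pa[i] must stay below the page capacity.
    public static bool FitsPage(int[] nr, int[] pa, int pageCapacity)
    {
        int total = 0;
        for (int i = 0; i < nr.Length; i++) total += nr[i] * pa[i];
        return total < pageCapacity;
    }

    // Roulette wheel selection: index i is chosen with probability fitness[i] / sum(fitness).
    public static int RouletteSelect(double[] fitness, Random rng)
    {
        double spin = rng.NextDouble() * fitness.Sum();
        double cumulative = 0.0;
        for (int i = 0; i < fitness.Length; i++)
        {
            cumulative += fitness[i];
            if (spin < cumulative) return i;
        }
        return fitness.Length - 1; // guard against floating-point rounding
    }

    // Single-point crossover: the codes after the crossing point are swapped between the parents.
    public static (string First, string Second) Crossover(string a, string b, int point)
    {
        return (a.Substring(0, point) + b.Substring(point),
                b.Substring(0, point) + a.Substring(point));
    }

    // Mutation: flips the bit at the given position (0 becomes 1 and vice versa).
    public static string Mutate(string bits, int position)
    {
        char[] chars = bits.ToCharArray();
        chars[position] = chars[position] == '0' ? '1' : '0';
        return new string(chars);
    }
}
```

For instance, crossing `1111` with `0000` at point 1 gives the offspring `1000` and `0111`, and mutating `1010` at position 0 gives `0010`. The thesis additionally requires that crossover and mutation leave the position of the embedded target bit unchanged, which this sketch does not enforce.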

3.6 Creation Database and Generation of An Encryption Key

The proposed technique has been implemented in the C#.NET language. It consists of:

1. Creating the database

2. Generating an encryption key

3. Hiding the data

4. Extracting the data

3.6.1 Creating Database

XML is a very simple data store. It occupies very little memory, almost like a text file, and its features make the data easy to manage. XML reduces the programming burden by its simplicity (David Hunter et al., 2007).

XML does not require installing and maintaining a database engine. XML data provides simple access through the ActiveX Data Objects (ADO.NET) DataSet.

It can be shared across the Web. XML documents can be stored without schemas because they contain metadata. Any XML tag can carry an unlimited number of attributes, such as name or password.
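As a minimal sketch of such an XML store, the following builds a one-record document with the standard `System.Xml.Linq` API. The element names (`users`, `user`), the sample values and the class name `XmlStore` are assumptions for illustration; the thesis does not show its actual schema.

```csharp
using System;
using System.Xml.Linq;

public static class XmlStore
{
    // Builds a tiny XML "database": one <user> element per record, with the fields as attributes.
    public static XDocument Create()
    {
        return new XDocument(
            new XElement("users",
                new XElement("user",
                    new XAttribute("name", "alice"),
                    new XAttribute("password", "secret"))));
    }
}
```

Calling `XmlStore.Create().Save("database.xml")` would write the document to a file (the file name is illustrative), with no database engine involved.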

3.6.2 Generating Encryption Key:

Encryption is a security method that turns any kind of information into unreadable cipher text by applying a set of algorithms. These algorithms transform the data into streams or blocks of seemingly random alphanumeric characters (Nigel Smart, 1997).

An encryption key might encrypt, decrypt, or perform both functions, depending on the type of encryption software being used.

There are many types of encryption schemes, but not all are secure. Simple algorithms can easily be broken using modern computing power, and another point of weakness lies in the decryption method: even the most secure algorithms will decrypt for anyone who holds the password or key.

Creating and managing keys is an important part of the cryptographic process. The key must be kept secret from anyone who should not decrypt the data.

A simple approach in C# has been used, based on a dedicated library (using System.Security.Cryptography). The secret database is encrypted using a cipher encryption mechanism and the data is converted into binary format.

textBoxEncrypted.Text = creatdatxml.Encrypt.EncryptString (textBox4.Text, textBoxPassword.Text);

When a given password is encrypted, that password always generates an array of 24 bytes.
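The body of `EncryptString` is not shown in the thesis, so the following is not its implementation. It is only a hedged sketch of one common way in which a password of any length yields a fixed 24-byte array, namely password-based key derivation (PBKDF2) from `System.Security.Cryptography`. The salt, the iteration count and the class name `KeyDerivation` are illustrative assumptions, and the static `Pbkdf2` method requires .NET 6 or later.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyDerivation
{
    // Derives a fixed-length 24-byte key from a password with PBKDF2.
    // The salt and iteration count are illustrative values only.
    public static byte[] DeriveKey(string password)
    {
        byte[] salt = Encoding.UTF8.GetBytes("example-salt-16b");
        return Rfc2898DeriveBytes.Pbkdf2(password, salt, 100000, HashAlgorithmName.SHA256, 24);
    }
}
```

With a fixed salt, the same password always produces the same 24 bytes, which is the property the text describes. A production system would normally store a random salt per database.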

3.6.3 Hiding the Data:

To hide data in an HTML document, the first step is to convert the data into binary form, bit by bit. The HTML document is then scanned for attribute combinations that can be used to hide a bit, as shown in Figure 3.4. Each attribute is converted to a chromosome by using the library Geneticalgorithm.dll.

If the first attribute exists in the primary attribute field of the key file, the corresponding secondary attribute is searched for in the same HTML tag. If that attribute is found, this couple of attributes is used to hide a bit by applying crossover to the name and value. The search then continues randomly by roulette wheel selection.

In the genetic algorithm, each couple of chromosomes is counted at every occurrence: if a chromosome appears more than once, each occurrence contributes its own hidden bits. For example, if <hr /> appears 4 times, it accounts for 16 bits (4 bits per occurrence, consistent with the hexadecimal encoding).

if (!ICrossover.Handled)
{
    // The attribute is not a primary key attribute. Is it a secondary key attribute?
    bool copyAttribute = false;
    rows = keyTable.Select(String.Format("secondAttribute = '{0:x2}'", ICrossover.QueryFormattedName));
    if (rows.Length > 0)
    {
        // If the corresponding first attribute does not exist in this tag,
        // this attribute will not be used and must be copied.
        HtmlAttribute firstAttribute = FindAttribute(rows[0]["firstAttribute"].ToString(), tag.Attributes);

Suppose H ≡ HTML code contains a set of genes $(G_1, G_2, G_3, \ldots, G_n)$, where each gene (tag) $G_n$ has a couple of chromosomes $(c_1, c_2)$, that is, attributes $(a_1, a_2)$. That $a_1$ precedes $a_2$ in a particular tag occurrence is denoted by $G_n(c_1, c_2) = (c_1, c_2)$, which means they form an ordered pair of attributes ≡ chromosomes.

$P_n$ ≡ pair of chromosomes. In that case $P_{n,i}(c_1, c_2) = 1$ if $G_{n,i}(c_1, c_2) = (c_1, c_2)$, otherwise $P_{n,i}(c_2, c_1) = 0$, where $i$ identifies the specific tag occurrence.

$O_n$ ≡ the predominant (more frequent) order of the couple of attributes in a particular tag, so $O_n(c_x, c_y) = (c_x, c_y)$ if $\sum_i P_{n,i}(c_x, c_y) \geq \sum_i P_{n,i}(c_y, c_x)$.

$R_{n,i}$ describes whether, for a specific tag occurrence, the order of chromosomes is compliant with the previously determined predominant order, so $R_{n,i}(c_x, c_y) = 1$ if $G_{n,i}(c_x, c_y) = O_n(c_x, c_y)$, otherwise $R_{n,i}(c_y, c_x) = 0$.

The sum of $R_{n,i}$ for all pairs of chromosomes determines the number of their occurrences in the predominant order.

Figure 3.4: Flow chart to hide database on HTML document


3.6.4 Extracting the Database

The extractor method extracts the data from the cover page. First the encryption key is entered; if the key is correct, the data is shown. The stego page (H) document code contains a set of genes, $H \equiv \text{HTML}\{G_1, G_2, G_3, \ldots, G_n\}$. Each gene (tag) $G_n$ has a couple of chromosomes $(c_1, c_2)$, as shown in Figure 3.5.

Figure 3.5: Flow chart to extract database from HTML document


Each chromosome of each gene of the stego page is analyzed by crossover selection, and every two consecutive elements of the chromosome G are considered as a couple. If this chromosome (attribute) is found in the primary chromosome (attribute) field of the key file, the algorithm checks whether its corresponding secondary chromosome (attribute) is present in the gene currently being processed.

If it is, this pair of chromosomes hides a bit.

if (!ICrossover.Handled)
{
    // The attribute has not been used yet.
    // Find the key row for this attribute.
    rows = keyTable.Select(String.Format("firstAttribute = '{0:x2}'", ICrossover.QueryFormattedName));
    if (rows.Length > 0)
    {
        // Find the corresponding attribute.
        secondAttribute = FindAttribute(rows[0]["secondAttribute"].ToString(), tag.Attributes);
        if (secondAttribute != null)
        {
            attributePosition = htmlDocument.IndexOf(ICrossover.Name, tag.BeginPosition);
            secondAttributePosition = htmlDocument.IndexOf(secondAttribute.Name, tag.BeginPosition);
            // Compare the attributes' positions.
            messageByte = ExtractBit(attributePosition, secondAttributePosition, messageByte, bitIndex, message);

If the primary chromosome (attribute) is followed by the secondary chromosome (attribute), a bit 1 is recorded; otherwise a bit 0 is recorded. The chromosomes are marked as processed after the bit has been retrieved.

$P_{n,i}(c_1, c_2) = 1$, $P_{n,i}(c_2, c_1) = 0$

If the chromosome is not found in the primary attribute field of the key file, this chromosome is skipped and the algorithm moves on to another chromosome (attribute). At the end, the bit stream obtained after the extraction is complete is converted into a stream of characters.
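The final conversion from a bit stream to characters can be sketched as follows. The bit order (most significant bit first) and the UTF-8 text encoding are assumptions, since the thesis does not state them, and the names `BitStream` and `ToText` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BitStream
{
    // Packs bits (most significant bit first) into bytes and decodes the bytes as UTF-8 text.
    // Trailing bits that do not fill a whole byte are ignored.
    public static string ToText(IReadOnlyList<int> bits)
    {
        var bytes = new byte[bits.Count / 8];
        for (int i = 0; i < bytes.Length * 8; i++)
        {
            bytes[i / 8] = (byte)((bytes[i / 8] << 1) | (bits[i] & 1));
        }
        return Encoding.UTF8.GetString(bytes);
    }
}
```

The sixteen bits 01001000 01101001 decode to the two characters "Hi" (0x48 and 0x69).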

3.7 The Architecture Design of the Proposed System

Many researchers and developers of steganography systems have used different architectures and methodologies to establish a safe hiding system for many types of applications.

The construction of security systems based on steganography varies among developers of intrusion detection and prevention systems. Different developers use different requirements, architectures and methods.

The architecture of the proposed system depends mainly on the genetic algorithm, as shown in Figure 3.6, which also shows how the genetic algorithm class is linked to the other classes. The algorithm code contains five main classes. After the GeneticAlgorithm class has been created and provided with an implementation of the IGenomeFactory class, it can be used to create the population of custom Genomes needed in the GA search.

Figure 3.6: The architecture of (StegnoTag) system

During the creation of the RealGenomeFactory class, the minimum and maximum values of each genome are identified. The genetic algorithm is then provided with the GenomeFactory class, which is responsible for every aspect of Genome construction, so that each genome is ready for evaluation and for crossover with other genomes.

The IGenomeFactory class collects the methods used to create Genomes. The HtmlAttribute class, which is considered the GeneticAlgorithm's parent, stores HTML tags and their attributes. Attributes do not have many properties; they only have a name and a value.

Each attribute in a tag can be used for only one bit, so the program has to mark it as already handled. An HtmlTag class has a name and a number of attributes. Its constructor searches the tag's text for attributes and their values.

private HtmlAttributeCollection ICrossover;

public HtmlAttributeCollection Attributes {
    get { return ICrossover; }
}

public HtmlTag(String text, int beginPosition, int endPosition) {
    this.beginPosition = beginPosition;
    this.endPosition = endPosition;
    this.ICrossover = new HtmlAttributeCollection();
    // The constructor then searches the tag's text for attributes and their values.
}

It also has a fitness method that calculates a fitness value by comparison. The HtmlUtility class calculates the key attribute couples in an HTML document with the crossover method, according to the rule that the selection operator chooses individuals with a probability that corresponds to their relative fitness.

Chromosomes with a high fitness value have a great chance of being selected to generate children for the next generation; the two chosen individuals are called the parents. The class also holds the path and the name of the HTML document, the data table with the key attributes, and the number of bytes that can be hidden in the specified document.

// Orders attributes by ascending fitness.
public int CompareTo(object obj)
{
    HtmlAttribute compared = (HtmlAttribute)obj;
    if (this.Fitness < compared.Fitness)
        return -1;
    else if (this.Fitness > compared.Fitness)
        return 1;
    else
        return 0;
}
#endregion

internal static void Sort()
{
    throw new NotImplementedException();
}

3.8 The Implementation of The Proposed System

Genetic operations are the basic steps of a genetic algorithm. The steps themselves are fixed, although their formulation varies, and they are related to each other. The algorithm cannot be applied to a problem unless specific conditions are met; for example, the components of the problem must be presented in the form of genomes. If these conditions are not met, the genetic algorithm loses its value and usefulness in finding the best solution.

The genetic algorithm is implemented in the proposed system in four main steps, defined as follows:

1. Alteration

In the first step, the database is encoded by substituting the target coding of the samples. Hexadecimal coding supplies the target bits, which increases the capacity of the cover page; the bits are represented in the form of genomes.

2. Modulation

This is the most important step and an essential part of the algorithm: all the expected results and achievements depend on it. Active and smart algorithms are useful here.

In this stage the genetic algorithm tries to decrease the number of faults and improve the transparency. To perform this step, two different methods are used.

One method is simpler and similar to the ordinary techniques; the other is a validation method for better modification of the bits of the samples. It is based simply on the difference between the original page and the modified page; here more bits and samples are modified and adjusted than in some previous algorithms.

If the difference between the bits can be decreased, the transparency is improved. The following are two examples of modulation for the expected smart genetic algorithm.

In the first sample, the bits are 00101111 = 47 (hexadecimal 2F), the target layer is 5 and the data bit is 1. Without modification the result is 00111111 = 63 (hexadecimal 3F), a difference of 16; after modification the result is 00110000 = 48 (hexadecimal 30), so the difference is 1 for one embedded bit.

In the second sample, the bits are 00100111 = 39 (hexadecimal 27), the target layers are 4 and 5, and the data bits are 11. Without modification the result is 00111111 = 63 (hexadecimal 3F), a difference of 24; after modification the result is 00011111 = 31 (hexadecimal 1F), so the difference is 8 for two embedded bits.
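The "after modifying" values in the two samples are the 8-bit values closest to the original sample that still carry the data bits in the target layers. A minimal sketch of that search follows. It assumes that layers are counted from 1 at the least significant bit, so layer 5 corresponds to the mask 0x10 and layers 4 and 5 to the mask 0x18; the names `Modulation` and `Embed` are hypothetical.

```csharp
using System;

public static class Modulation
{
    // Returns the 8-bit value closest to 'sample' whose bits under 'mask' equal those of 'target'.
    // When two candidates are equally close, the smaller one is returned.
    public static int Embed(int sample, int mask, int target)
    {
        int best = -1;
        for (int candidate = 0; candidate <= 255; candidate++)
        {
            if ((candidate & mask) != (target & mask)) continue;
            if (best < 0 || Math.Abs(candidate - sample) < Math.Abs(best - sample))
            {
                best = candidate;
            }
        }
        return best;
    }
}
```

Under these assumptions, `Modulation.Embed(47, 0x10, 0x10)` returns 48 and `Modulation.Embed(39, 0x18, 0x18)` returns 31, which agree with the two samples. The exhaustive scan stands in for the genetic search purely for illustration.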

The sample in the proposed system (StegnoTag) is a chromosome (attribute), and each bit of the sample represents a gene (tag). The first generation, or first parents, consists of the original page and the altered samples.

Fitness may be determined by a function that calculates the error. The most transparent sample pattern should be considered the fittest.

It must be ensured that crossover and mutation do not change the place of the target bit.

3. Verification

This step controls quality: the function of the algorithm is checked and the result must be verified. If the difference between the new page and the original page is acceptable and reasonable, the new page is accepted; otherwise it is rejected and the original page is used instead in reconstructing the new page.

4. Reconstruction

In the last step a new page is created. This is done by testing the samples. There are two possible states at the input of this step: either the modified sample, or the original sample, which is the same as in the host page file.

This is why it can be claimed that the algorithm does not alter all samples or predictable samples. Figure 3.7 shows the main steps in the steganography proposal.

Figure 3.7: The main steps in steganography proposal

3.9 The Good Side of Using Genetic Algorithm in The Proposal

One of the benefits of the genetic algorithm is that it can solve most optimization problems that can be described with chromosome coding, and it solves problems by developing multiple solutions. Moreover, it is easy to understand and practical, does not demand complicated mathematics, and can easily be transferred to existing simulations and models. With respect to this proposal (StegnoTag), it has been shown that the genetic algorithm has an efficient hiding ability, so that the hidden information cannot be detected through visual, semantic or statistical attacks; this part is described in Chapter Four.

In contrast, steganography techniques that do not use a genetic algorithm are susceptible to visual, semantic and statistical attacks, because they use random numbers. The proposal is based on the fact that the order of attributes in HTML tags has no effect on the appearance of the document; this feature can be used to hide the random data effectively. In the proposed technique, tags are represented in the form of genes and attributes in the form of chromosomes.

In this proposal a relational table has been created. It consists of primary data distributed in two columns; each row consists of two chromosomes representing a gene that contains a set of properties. Hexadecimal encoding is selected for the attribute conversion to overcome the problems of capacity and of the limited amount of hidden data. In this encoding the chromosome is represented by hexadecimal digits (0-9, A-F), which increases the capacity of the cover page, so that a gene can hide 8 bits in this project: the first chromosome in a row is represented by 4 bits and the second is also represented by 4 bits. In the process named crossover, the genetic algorithm randomly combines the left-hand side of one chromosome with the right-hand side of another chromosome to form a new chromosome. The new chromosome must then be modified by replacing repeated genes with other genes, so that all genes within each chromosome are different.

In mutation, the processor randomly chooses a chromosome and exchanges any two genes to form a new chromosome. In this operation the place of the embedded target bit is not changed. The main job of chromosomal fitness is to maximize or minimize the value of the chromosome through many alterations until it reaches a suitable, predefined cut-off value; here the minimum optimal value of the chromosome should be selected.

With regard to the capacity of the hidden data, existing techniques can embed a large database, but in some cases the meaning of the cover page changes completely until no sense can be made of it. To analyse each chromosome of each gene of the stego page, every two consecutive elements of the chromosome G are considered as a couple. If this chromosome (attribute) is found in the primary chromosome field of the key file, the presence of its corresponding secondary chromosome in the currently processed gene is examined. If it is present, this pair of chromosomes hides a bit. If the primary chromosome (attribute) is followed by the secondary chromosome (attribute), bit 1 is recorded, otherwise bit 0 is recorded; after the bit has been retrieved the chromosomes are marked as processed.

A method that is applied to hide a huge database is vulnerable to visual attack, because it uses an illogical type of database. The main question to be answered in this research is how to implement a genetic algorithm to produce an effective tool for hiding data. An encryption method is used to add an extra layer of security for better protection of confidential data.

3.10 Limitations of Using Genetic Algorithm in The Proposal

The limitations of this work are primarily due to the conditions of the simulation test:

• The test has been implemented and simulated consistently with a standard group of data that had already been specified.

• Due to limited resources, the tests also used samples of random data. A percentage of web pages and data was selected so that the researcher could understand the behaviour of data hiding.

• Since the testing and simulation were done under static conditions, steganography methods that need to be applied in real time could not be implemented. For the same reason, the system could not be measured in terms of latency and simultaneous connections.

3.11 Chapter Summary

This chapter gives an idea of the theoretical aspects of the proposed system, which uses a genetic algorithm to hide the database in an HTML document (the protection of the database), and determines the type of media that hides data in the proposed techniques. The chapter also discusses in detail the processes and stages of the algorithm and shows the steps in flow charts.

The chapter also discusses the architecture and implementation of the proposed steganography system (StegnoTag) and demonstrates the good side of using steganography with a genetic algorithm, where real-world problems can be solved practically. The limitations of using the genetic algorithm are discussed as well.

CHAPTER 4

ANALYSIS AND DISCUSSION

4.1 Introduction

The proposed model for steganography by genetic algorithm is validated by experimental research methods, in which a series of tests is performed to demonstrate its capabilities. For this purpose, an application was developed that consists of an XML database, a genetic algorithm library, and steganography with encryption.

The application is developed in the C#.NET programming language. It contains classes that are designed to represent the logical concepts of the system.

The genetic algorithm needs to be more reliable than those of previous related studies. The results achieved from the simulation tests are discussed using new metrics for detection capabilities.

4.2 System Examination

To determine the fitness of the simulation, the behavior of the application was examined to assess the efficiency of the system. Three aspects were considered: the capacity of the cover file, which determines the size of the database that can be sent secretly using the steganography application; the change in the size of the cover page after steganography; and the security of the web page, that is, how safe and imperceptible the hidden data are. The results were compared with those of L. Polak and Z. Kotulski [22].

The home pages were divided into different categories to organize training and testing, and the algorithm was validated across a range of parameter settings. While the classification algorithm works efficiently for real-time applications, simulating the algorithm allows the system performance to be evaluated in a short time over many runs, such as examining different web page categories and testing the effects of variations in multiple parameters.

The tests used home pages of different kinds: company pages, news portals, social media pages, and university pages. These pages contain many items that must be described by HTML tags with attributes, which increases their capacity for steganography. The results of the analysis are detailed in the tables below.

Table 4.1: Results of efficiency on the companies’ pages

Web Page          File Size (KB)   Stegano Capacity (B)   Characters number   Efficiency (B/kB) %
dell.com          143              134                    67                  93.71%
microsoft.com     62               55                     27                  88.71%
toshiba.com       88               44                     22                  50%
asus.com          85               37                     18                  43.51%
sony.com          145              52                     26                  35.86%
samsung.com       81               14                     14                  17.28%

Table 4.2: Results of efficiency on the news’ pages

Web Page          File Size (KB)   Stegano Capacity (B)   Characters number   Efficiency (B/kB) %
alarabiya.net     276              350                    175                 126.81%
news.google.com   750              721                    360                 96.13%
aljazeera.net     190              121                    60                  63.68%
bbc.com           430              249                    124                 57.91%
cnn.com           893              480                    208                 53.75%
news.yahoo.com    1,013            224                    112                 22.11%

For the simulation, C# software was used and the simulation was run on an ordinary PC. Each web page was trained and tested individually with different parameter settings.

Table 4.3: Results of efficiency on universities’ pages

Web Page           File Size (KB)   Stegano Capacity (B)   Characters number   Efficiency (B/kB) %
fbsu.edu.sa        107              235                    117                 219.61%
ksu.edu.sa/en      71               74                     37                  104.23%
manchester.ac.uk   62               72                     57                  116.13%
cam.ac.uk          61               53                     26                  86.89%
ox.ac.uk           100              83                     42                  83%
hanover.edu        91               11                     5                   12.09%

Table 4.4: Results of efficiency on Social Media pages

Web Page          File Size (KB)   Stegano Capacity (B)   Characters number   Efficiency (B/kB) %
youtube.com       483              304                    152                 62.94%
plus.google.com   1,062            420                    210                 39.55%
facebook.com      856              217                    108                 25.35%
twitter.com       302              51                     25                  16.89%
linkedin.com      329              37                     18                  11.25%

The results show that university pages (Table 4.3) and company pages (Table 4.1) offer a smaller cover-file capacity, and therefore a smaller database that can be sent secretly using the steganography application, because their home pages are usually kept small to simplify navigation and help customers find specific information.

Social media pages (Table 4.4) generally do not allow much information to be hidden relative to their size, because attributes do not appear frequently in their source code; a user must log in from the main page to reach the inner pages, which contain many attributes.

The best results are those of the news pages (Table 4.2), which show the best capacity efficiency. News pages are large because they contain many attributes, so they can hide a lot of data relative to their size and constitute a decent steganography environment.

In some pages the efficiency exceeds 100%. The fbsu.edu.sa page (Fahad Bin Sultan University) reaches 219.61%; the hiding was very good and the file size did not change. These results show that this type of processing can greatly increase the maximum size of hidden data that can be sent in HTML code with the tag steganography algorithm. In other cases, such as manchester.ac.uk (University of Manchester) with an efficiency of 116.13%, the hiding was poor because of a defect in integrating the hidden data.
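The efficiency column of Tables 4.1 to 4.4 can be reproduced with the short sketch below. It assumes, from the column header "(B/kB) %" and the tabulated values, that efficiency is the stego capacity in bytes divided by the cover file size in kilobytes, expressed as a percentage; the function name `efficiency_percent` is hypothetical.

```python
def efficiency_percent(capacity_bytes, file_size_kb):
    """Stego capacity per kilobyte of cover file, expressed as a percentage."""
    return round(capacity_bytes / file_size_kb * 100, 2)


print(efficiency_percent(134, 143))  # dell.com, Table 4.1: 93.71
print(efficiency_percent(350, 276))  # alarabiya.net, Table 4.2: 126.81
```

Under this reading, 100% corresponds to one hidden byte per kilobyte of cover file, so values above 100% mean more than one hidden byte per kilobyte rather than a payload larger than the page.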

Table (4.5) shows the efficiency of the genetic algorithm: for most pages, the file size does not change when the secret data are hidden in the HTML document (Figure 4.1). Where the size does grow, the increase is between 1 and 3 KB, except for social media pages, where it is 11 to 14 KB. For example, the youtube.com page grew from 483 KB to 497 KB, an increase of 2.9%.

(Table 4.5): Simulation Tests Results for pages’ size after hiding data

Web Page           File Size before hiding (KB)   File Size after hiding (KB)   Increase (KB)   Increase rate
dell.com           143                            143                           0               0
microsoft.com      62                             62                            0               0
alarabiya.net      276                            278                           2               0.7%
news.google.com    750                            753                           3               0.4%
aljazeera.net      190                            191                           1               0.5%
bbc.com            430                            430                           0               0
cnn.com            893                            893                           0               0
fbsu.edu.sa        107                            107                           0               0
ksu.edu.sa/en      71                             72                            1               1.4%
manchester.ac.uk   62                             62                            0               0
cam.ac.uk          61                             61                            0               0
ox.ac.uk           100                            101                           1               1.0%
hanover.edu        91                             91                            0               0
youtube.com        483                            497                           14              2.9%
plus.google.com    1,062                          1,076                         14              1.3%
facebook.com       856                            867                           11              1.3%

Figure 4.1: Simulation Tests Results Increase rate after hiding data

Figure (4.1) plots the increase rate, the feature used here to judge accuracy. An increase rate between 0% and 0.02% was obtained for most of the news pages. These values indicate the strength of the genetic algorithm in hiding data without increasing the size of the page.
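The increase rate reported in Table (4.5) follows from the file sizes before and after hiding. The sketch below assumes the rate is the size increase divided by the original size, rounded to one decimal place as in the table; the function name `increase_rate_percent` is hypothetical.

```python
def increase_rate_percent(size_before_kb, size_after_kb):
    """Relative growth of the cover page after hiding, as a percentage."""
    increase_kb = size_after_kb - size_before_kb
    return round(increase_kb / size_before_kb * 100, 1)


print(increase_rate_percent(483, 497))  # youtube.com: 2.9
print(increase_rate_percent(143, 143))  # dell.com: 0.0
```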

4.3 Comparative Studies

To test the proposed algorithm, a table with three columns was prepared: the first column lists the web pages, the second gives the capacity of the proposed method, and the third gives the capacity of the method of L. Polak and Z. Kotulski. In this way the efficiency of the steganography algorithm was examined.

(Table 4.6): Comparison between pages’ capacity on pages shared with the other method

Web Page         Capacity of proposal method   Capacity of L.Polak method
microsoft.com    55                            126
sony.com         52                            100
youtube.com      304                           255
cnn.com          480                           153
news.yahoo.com   224                           133

Table (4.6) lists the websites tested with both methods. Each was tested for the maximum amount of data that can be hidden in the page, and the table shows the largest embedded capacity. Comparing the two methods, the experimental results in Figure (4.2) show that the proposed method has a larger capacity than the other method on youtube.com, cnn.com, and news.yahoo.com, and a smaller capacity on microsoft.com and sony.com. The comparison of the pages after hiding data (Table 4.6) thus supports the efficiency of the genetic algorithm.

Figure (4.2): Comparison results between the two methods

4.4 Performance Evaluation of Genetic Algorithm

There are three main components to be designed in GAs. The first is the coding, which maps the problem onto the GA paradigm and represents possible solutions, for example with binary or hexadecimal coding [48].

The second is a fitness function, which determines the quality of solutions and allows good solutions to be distinguished from bad ones. The last is a set of parameters, including the population size, the population structure, the sequence of genetic operators, the operators' parameters, and the termination condition [21]. Two models of performance analysis were measured in this application:

1. Fitness value when mutation is selected.

2. Fitness value when mutation is canceled.

The GPdotNET v3 software was used for the experimental analysis [54]. It is a free artificial intelligence tool for applying genetic programming and genetic algorithms to the modeling and optimization of various engineering problems. The application was developed with the .NET (Mono) framework and the C# programming language and can run on Windows and Linux operating systems.

In this project the fitness function was

where nr = the number of attributes,

and pa = the capacity of attributes on the page.

Suppose x is a candidate solution to the problem; the fitness function for x measures the quality of the solution x. The goodness of a solution x may differ as the genes that make up x vary under genetic operators such as crossover and mutation. The equation takes into account the number of genes in the attribute string, the population size, the capacity of each attribute on the page, the mutation probability, the crossover probability, a random number generator, and the number of generations to run the simulation.

The roulette wheel selection procedure is employed for the reproduction process. Because this procedure requires positive fitness values, a compensation technique is used to scale the fitness whenever the population contains a negative fitness value.
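Roulette wheel selection with a compensation step for negative fitness can be sketched as follows. The text does not specify which compensation technique was used, so the shift applied here (adding an offset so that every weight becomes strictly positive) is an assumption, and the function name `roulette_select` is hypothetical.

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to shifted fitness."""
    lowest = min(fitnesses)
    # Assumed compensation: shift all values so every weight is positive.
    offset = -lowest + 1e-9 if lowest <= 0 else 0.0
    weights = [f + offset for f in fitnesses]
    pick = rng.uniform(0, sum(weights))
    running = 0.0
    for individual, weight in zip(population, weights):
        running += weight
        if pick <= running:
            return individual
    return population[-1]  # guard against floating-point round-off


rng = random.Random(0)
picks = [roulette_select(["a", "b", "c"], [-1.0, 0.0, 9.0], rng) for _ in range(1000)]
print(picks.count("c") > picks.count("a"))  # True
```

After the shift the weights are roughly 0, 1, and 10, so the fittest individual is chosen most often while the others still have a small chance of being chosen.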

For the crossover operation, one-point crossover was chosen with a set probability. Mutation is applied as random changes to the binary bit string: each bit of the string mutates with a specific probability. If the operations produce a useless solution, the string is discarded and a new one is created from scratch.
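The two operators described above can be sketched in Python on lists of bits. This is a generic illustration of one-point crossover and bit-flip mutation, not the code used by GPdotNET or by the thesis application; the function names are hypothetical.

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length bit strings at a random cut point."""
    point = rng.randint(1, len(parent_a) - 1)  # requires at least two genes
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def bit_flip_mutation(bits, probability, rng=random):
    """Flip each bit independently with the given probability."""
    return [1 - b if rng.random() < probability else b for b in bits]


rng = random.Random(1)
child_a, child_b = one_point_crossover([0] * 8, [1] * 8, rng)
print(child_a, child_b)
print(bit_flip_mutation(child_a, 0.05, rng))
```

Because the cut point is at least 1 and at most one less than the string length, each child always inherits genes from both parents.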

4.4.1 Fitness Value When Select Mutation Model

In the first run, with mutation selected, two chromosomes were randomly chosen to create new offspring by crossing their genetic string patterns at a random point with a given probability. 500 values were generated with 500 genes and a population size of 70. The probabilities of the GP operations were 0.05 for mutation, 0.3 for reproduction, and 0.7 for crossover, with an elitism of 1 for selection.

There is no rule for determining these values; any values may be chosen without affecting the results of the algorithm. To calculate the fitness function, the number of attributes must not exceed the attribute capacity of the page's tags.

The implementation was responsible for generating an initial generation of genes, calculating their fitness, mating them and creating offspring, and then analyzing the next generation.
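The generational cycle described above (initial population, fitness calculation, mating, next generation) can be sketched as a small loop. The default parameters mirror the values reported for the first run (500 genes, population 70, 500 generations, crossover 0.7, mutation 0.05, elitism of 1). The fitness function used in the example call, a simple count of 1-bits, is a stand-in assumption because the thesis fitness equation is not reproduced here; `run_ga` is a hypothetical name.

```python
import random


def run_ga(fitness, gene_count=500, population_size=70, generations=500,
           crossover_prob=0.7, mutation_prob=0.05, seed=None):
    """Return the best fitness found in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(gene_count)]
                  for _ in range(population_size)]
    best_history = []
    for _ in range(generations):
        scores = [fitness(individual) for individual in population]
        best = max(range(population_size), key=scores.__getitem__)
        best_history.append(scores[best])
        next_generation = [population[best][:]]  # elitism of 1
        lowest = min(scores)
        weights = [s - lowest + 1e-9 for s in scores]  # roulette weights
        while len(next_generation) < population_size:
            parent_a, parent_b = rng.choices(population, weights=weights, k=2)
            if rng.random() < crossover_prob:
                point = rng.randint(1, gene_count - 1)
                child = parent_a[:point] + parent_b[point:]
            else:
                child = parent_a[:]  # reproduction: copy the parent
            child = [1 - g if rng.random() < mutation_prob else g for g in child]
            next_generation.append(child)
        population = next_generation
    return best_history


# Small settings keep the example fast; sum() counts the 1-bits.
history = run_ga(sum, gene_count=32, population_size=20, generations=30, seed=1)
print(history[0], history[-1])
```

Because the best individual is copied unchanged into the next generation, the best fitness in this sketch never decreases from one generation to the next. Calling `run_ga` with `mutation_prob=0.0` gives a mutation-free run for comparison.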

Experimental data were generated by the GPdotNET program, Figure (4.3). The entire dataset is randomly divided into a training set of 24 compounds and a test set of 10 compounds, represented by 10 selected molecular descriptors.

Figure (4.3): Experimental dataset when select mutation

The results show that the best solution is found in 500 generations, with a best fitness value of 137.76, as shown in Figure (4.4). The figure also shows, for each generation, the most fit individual and the maximum number of correct bits in it.

Figure (4.4): The evaluation simulation result when select mutation

Figure (4.5): Simulation result of GP modelling of best fitness simulation when select

mutation

When analyzing average fitness, the biggest jump occurs from the first generation to the second, because this is the first trained generation to work with the fitness function. Each time the algorithm selects random chromosomes, it selects the more fit of the two candidates surrounding a randomly generated number. When this is performed for an entire generation, substantial growth results.

Table (4.7): Training dataset to test a select mutation

Nr   X1   X2    X3    X4        R1        R2        R3        R4        Y
1    0    0     0     0.04345   3.83745   2.44014   7.15454   0.60323   0.448
2    0    0     0.5   0.33859   3.83745   2.44014   7.15454   0.60323   0.368
3    0    0     1     0.26253   3.83745   2.44014   7.15454   0.60323   0.336
4    0    0.5   0     0.06103   3.83745   2.44014   7.15454   0.60323   0.576
5    0    0.5   0.5   0.57484   3.83745   2.44014   7.15454   0.60323   0.552
6    0    0.5   1     0.58944   3.83745   2.44014   7.15454   0.60323   0.408
7    0    1     0     0.22962   3.83745   2.44014   7.15454   0.60323   1
8    0    1     0.5   0.67536   3.83745   2.44014   7.15454   0.60323   0.88
9    0    1     1     0.70682   3.83745   2.44014   7.15454   0.60323   0.776

When mutation was selected, the fitness value increased each time, and the generation at which the last change occurred was lower.

This led to a high fitness value for the most fit individual, and the ability to mutate randomly proved highly valuable to the genes. Figure (4.6) shows the results.

Figure (4.6): The results of select mutation to training dataset

Figure (4.7): The result of GP model fitness evolution of the program when select mutation

A linear upward slope is not necessary for reaching higher fitness, because the crossover function may destroy the offspring of a most fit individual in the early generations, as shown in Figure (4.7).

Figure (4.8): The results of training dataset when select mutation

Figure (4.9): Simulation results of test data when select mutation

Figure (4.9) shows that the average fitness increases constantly, so that despite some variance in the most fit gene, the population tends to improve consistently over time.

4.4.2 Fitness Value When Cancel Mutation

In the second run, with mutation canceled, two chromosomes were randomly chosen to create new offspring by crossing their genetic string patterns at a random point with a given probability. 500 values were generated with 500 genes and a population size of 70. The probabilities of the GP operations were 0.00 for mutation, 0.3 for reproduction, and 0.7 for crossover, with an elitism of 1 for selection. As in the selected-mutation model, the entire dataset is randomly divided into a training set of 24 compounds and a test set of 10 compounds, represented by 10 selected molecular descriptors.

The best fitness values increased more slowly and did not reach as high a total value, although the most fit individual tended to be more consistent. The results show that the best solution is found in 500 generations, with a best fitness value of 53.23 and the last change at generation 84, as shown in Figure (4.10). This result is lower than the one obtained with mutation selected.

Figure (4.10): The evaluation result when cancel mutation

Figure (4.11): The results of modelling of best fitness simulation when cancel mutation

Figure (4.11) shows that the best fitness is 53.23, together with the most fit individual and the maximum number of correct bits in it for each generation. When analyzing average fitness, each time the algorithm selects random chromosomes, it selects the more fit of the two candidates surrounding a randomly generated number. When this is performed for an entire generation, substantial growth results.

Table (4.8): Training dataset to test a cancel mutation

Nr   X1         X2         X3         X4         X5         X6         X7         Y
1    -0.45239   0.509759   1.606766   2.828297   4.163454   5.600927   7.129132   13.91386
2    0.509759   1.606766   2.828297   4.163454   5.600927   7.129132   8.73632    15.71945
3    1.606766   2.828297   4.163454   5.600927   7.129132   8.73632    10.41068   17.54589
4    2.828297   4.163454   5.600927   7.129132   8.73632    10.41068   12.14042   19.38214
5    4.163454   5.600927   7.129132   8.73632    10.41068   12.14042   13.91386   21.21744
6    5.600927   7.129132   8.73632    10.41068   12.14042   13.91386   15.71945   23.0414
7    7.129132   8.73632    10.41068   12.14042   13.91386   15.71945   17.54589   24.84399
8    8.73632    10.41068   12.14042   13.91386   15.71945   17.54589   19.38214   26.61558
9    10.41068   12.14042   13.91386   15.71945   17.54589   19.38214   21.21744   28.34696
10   12.14042   13.91386   15.71945   17.54589   19.38214   21.21744   23.0414    30.02934

The simulation produced the results detailed in Table (4.8). The maximum best fitness was 53.23, and the result shows the maximum number of correct bits in the most fit individual for each generation. When mutation was canceled, the best fitness values increased slowly toward their highest total value, while the most fit individual tended to be more consistent; the results are shown in Figure (4.12).

Figure (4.12): The results of training dataset when cancel mutation

The second simulation test of the algorithm was applied for higher fitness on a dataset with different bits, shown in Table (4.9).

Table (4.9): Training dataset to test fitness function when cancel mutation

Nr   X1         X2         X3         X4         X5         X6         X7         Y
1    6.433781   3.815089   2.54016    2.798055   3.632377   3.527516   2.96315    -0.5019
2    3.815089   2.54016    2.798055   3.632377   3.527516   2.96315    2.165226   -1.27008
3    2.54016    2.798055   3.632377   3.527516   2.96315    2.165226   1.26764    -1.91571
4    2.798055   3.632377   3.527516   2.96315    2.165226   1.26764    0.358334   -2.41751
5    3.632377   3.527516   2.96315    2.165226   1.26764    0.358334   -0.5019    -2.76119
6    3.527516   2.96315    2.165226   1.26764    0.358334   -0.5019    -1.27008   -2.93796
7    2.96315    2.165226   1.26764    0.358334   -0.5019    -1.27008   -1.91571   -2.94334
8    2.165226   1.26764    0.358334   -0.5019    -1.27008   -1.91571   -2.41751   -2.77633
9    1.26764    0.358334   -0.5019    -1.27008   -1.91571   -2.41751   -2.76119   -2.43874
10   0.358334   -0.5019    -1.27008   -1.91571   -2.41751   -2.76119   -2.93796   -1.93468

The results show that a linear upward slope is not necessary for reaching higher fitness, because the crossover function may destroy the offspring of a most fit individual in the early generations, as shown in Figure (4.13) and Figure (4.14). The values of true positive and true negative are also demonstrated. However, the behavior differs between selected and canceled mutation: the model fitness evolution of the program shows more meanders when mutation is canceled, Figure (4.13).

Figure (4.13): Model fitness evolution of the program when cancel mutation

Figure (4.14): The result of training dataset when cancel mutation

The performance analysis of the fitness value reveals how, when a genetic algorithm is used in steganography, the fitness value relates to mutation selection, crossover, population size, the length of the random genetic string, and the number of generations.

With mutation canceled, only a few improvement jumps were observed after the first dozens of generations, and the performance was very poor. These simulation results show that a GA without mutation performs poorly on this problem, which means that mutation plays a key role in this optimization problem.

The analysis showed that mutation may decrease the average fitness but increases the fitness value of the most fit individual. It also showed that crossover helped the average fitness while lowering the fitness value of the most fit individual, which reduced the negative fitness effect.

4.5 CPU Time Usage

A change in the simulation time affects CPU time. The project simulates steganography and depends on saving the different web pages on the system, which affects CPU time.

Buffer levels, computers, and operating systems vary widely in how they keep track of CPU time, and the simulation time increases as more production processes take place in the program. The performance of big database applications depends critically on the packet buffer size of the data center.

In this application the database has an illogical design; it is built with the XML language, which does not require a large amount of memory to store it.

The type of computer, like the fitness function, determines a set of decision variables with respect to CPU time. Buffer levels do not have a significant effect on simulation time except under some extreme conditions, so they are not considered in the performance analysis [25].

The CPU time results converge when data hiding is executed on different home pages. The tests used home pages of different kinds: company pages, news portals, social media pages, and university pages. For steganography on the BBC news home page, Figure (4.15) and Figure (4.16) present the CPU time results obtained with the .NET Reflector 9.0 program [55].

Figure (4.15): Simulation results for time line when steganography on BBC news page

Figure (4.16): Simulation results for method grid when steganography on BBC news page

Further analysis was carried out by processing many different web pages, and statistical inference about the results was obtained by comparing the performances using percentage tests. The behavior of the steganography performance measures was examined on the BBC news home page, the Dell company page, the Fahad Bin Sultan University page, and the YouTube home page. Different kinds of web pages gave the same CPU time results under steganography.

Figure 4.17: The results of CPU time when steganography on dell page

The results in Figure (4.17) show that, at the beginning of the program, there is a hit count point in the encryption method. After this point CPU time increases in the fitness method (.ctor), which has a large hit count.

Population size and maximum generation number affect CPU time: increasing the population size and the generation number enlarges the search space, so CPU time increases.

The crossover probability is insignificant with respect to CPU time, as it is for the fitness response. Whether crossover is applied or not makes no difference, because in each case the children produced have similar patterns. The curve in Figure (4.17) shows that increments in the fitness function produce significant reductions in the CPU time consumed.

Table (4.10): Results of CPU time when steganography on dell page

Method name               Hit count   CPU %    Average CPU %
Main                      1           99.964   0
btnEncryptDataSet_Click   1           2.65     0.002
EncryptString             1           1.434    0.009
btnSrcFileName_Click      2           41.938   0.005
GetSourceFileName         2           41.874   0.004
btnAnalyse_Click          1           5.442    0.038
GetCapacity               2           6.303    0.108
FindTags                  4           3.218    0.05
.ctor                     6           2.783    0
btnDstFileName_Click      1           12.317   0.01
GetDestinationFileName    1           12.275   0.003
btnHide_Click             1           24.349   0.037
Hide                      1           21.382   0.349
btnExtract_Click          1           3.189    0.026
Extract                   1           2.966    0.108
button6_Click             1           1.929    0.01
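The hit count and CPU share columns of Table (4.10) were produced by an external profiler. As a rough illustration of what such columns measure, the Python sketch below counts calls and accumulates CPU time per function using the standard `time.process_time` clock. The decorator `profiled` and the stand-in function `find_tags` are hypothetical and do not reproduce the profiler or the application's FindTags method.

```python
import time
from collections import defaultdict
from functools import wraps

hit_count = defaultdict(int)
cpu_seconds = defaultdict(float)


def profiled(func):
    """Record how often func is called and how much CPU time it uses."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.process_time()
        try:
            return func(*args, **kwargs)
        finally:
            hit_count[func.__name__] += 1
            cpu_seconds[func.__name__] += time.process_time() - start
    return wrapper


@profiled
def find_tags(html):
    """Hypothetical stand-in: count opening angle brackets in a page."""
    return html.count("<")


for _ in range(4):
    find_tags("<a href='x'><img src='y'>")

total = sum(cpu_seconds.values()) or 1.0  # avoid dividing by zero
for name, hits in hit_count.items():
    print(name, hits, f"{100 * cpu_seconds[name] / total:.3f}%")
```

The time recorded for a function here includes the time spent in any profiled functions it calls, so per-method shares measured this way overlap and need not add up to 100%.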

The results in Table (4.10) show that the simulation time of the solutions generated by the GA constitutes 99.964% of CPU time (the Main method), with an average CPU % of 0. The factors that affect CPU time also change the simulation time. For example, the simulation time depends on the load of the system, and the type of operating system can increase it as more production processes enter the project. The Hide method accounts for 21.382% of CPU time, with an average CPU % of 0.349.

After the encryption method there is an increase in CPU time, which occurs in the fitness method. Mutation has a negative impact on CPU time, so it is better to set it at its high level. The crossover probability and the initial population type do not have a significant effect on CPU time.

4.6 Chapter Summary

This chapter examined the important traditional metrics needed to check the simulation test of the system algorithms and clarified the limitations and assumptions. The data used in the simulation test were identified and discussed. The mathematical equation and the definitions of the required metrics were determined, and the results that support the algorithm were analyzed and reported.

The behavior of the application was tested to assess the fitness of the simulation and the efficiency of the system by calculating the capacity of the cover file, which determines the database that can be sent secretly using the steganography application.

Two models of performance analysis were measured: the first is the fitness value with mutation selected, and the second is the fitness value with mutation canceled. The change in simulation time was observed through its effect on CPU time in the steganography simulation project.

CHAPTER 5

CONCLUSIONS AND FUTURE WORK

5.1 Conclusions

The main objective of this research is to study the hiding of a database within a specific file (a web page) without changing the size of the file, using a genetic algorithm to increase the reliability and confidentiality of the database. The approach is inspired by natural evolution as described in Darwin's theory. The mechanism is based mainly on the way chromosomes produce new generations through specific mechanisms, such as cloning and mutation, through which data may be hidden within the page. In each repetition of the generation process, the concealment process is performed, including random selection, re-mixing, and re-encryption.

The results showed that combining steganography with a genetic algorithm improves the quality of database hiding within a specific file by up to 90% without changing the general features of the file, which are distributed in specific locations within it. Access to the hiding program is protected with a password.

The proposed program concentrates on hiding the database within a specific web page and on applying the genetic algorithm to database concealment. That is why the general features of HTML construction were studied, especially the tag, along with placing the hidden database inside it and the relation between the features and the hidden data. The steganography process was completed successfully according to the required criteria.

The encryption algorithm is designed to strengthen the concealment and further complicate illegal attempts to remove it. The main features of the system were designed and implemented successfully.

5.2 Results

The conceptual framework is derived from a hybrid of conceptual frameworks for hiding and a framework applied in genetic algorithm engineering. The new framework is one of the contributions of this research, and it can be applied in any system inspired by biological diversity.

The integration of the genetic algorithm system and its adaptation to the steganography system are essential for the designed specifications of the system to fulfil the main objectives, through verification of the system selection and analysis of the results by applying traditional methods of calculation. This research has introduced new reference measurements through specific programs.

5.3 Future Work

For future work, researchers need deep knowledge of natural mechanisms and genetic algorithms for further inspiration and to develop new proposals. This work has shown that integrating steganography with a genetic algorithm can achieve a significant improvement in data security. Following the same methodology, this combination can be extended to the development of other security systems to obtain safer and more reliable systems for database hiding.

5.4 Recommendation

1. The system needs a source of inspiration to be validated in real time, to overcome any shortages that may arise during real-time implementation. The tests need to be more reliable and use a sufficient volume of data. Implementation testing and simulation must be done using other data to solve problems in other sectors. Other algorithms must be improved to increase the amount of hidden data, and logical databases should be applied to deal with the memory of the computer.

2. The high flexibility of HTML must be exploited so that the places of attributes can be manipulated without any change in the appearance of the web page, which is imperceptible to the browser. Other, less common languages can also be used in the process of hiding databases. Although the proposed method is simple, it is not common and requires knowledge and experience to be discovered.

3. The Internet protocols can also be exploited through steganography. This method can be developed by introducing evolutionary algorithms to increase the efficiency of data hiding. The emergence of the fifth generation in communications will also advance data hiding, since it provides greater data download speeds than other generations through what is known as Long Term Evolution (LTE). Steganography in e-mail data and its white spaces and characters can also be developed.

REFERENCES

First: books

[1] David Hunter, et al. - Beginning XML - (Indianapolis, Indiana: Wiley Publishing - 4th ed. - no date).

[2] Mitchell Melanie - An Introduction to Genetic Algorithms - (London, England: Massachusetts Institute of Technology - Fifth printing, 1999). Available at: http://www.boente.eti.br/fuzzy/ebook-fuzzy-mitchell.pdf (Accessed on: 24/08/2014).

[3] Nigel Smart - Cryptography: An Introduction - (3rd Edition) - McGraw-Hill - (no date).

[4] S.N. Sivanandam, S.N. Deepa - Introduction to Genetic Algorithms - (Berlin Heidelberg: Springer-Verlag, 2008).

Second: Scientific papers and researches

[5] Amrita Khamrui, Enrolled Scholar - A Report on Genetic Algorithm based Steganography for Image Authentication - (2014). Available at: http://jkmandal.com/pdf/amrita_report2.pdf (Accessed on: 09/10/2014).

[6] Chintan Dhanani, et al. - HTML Steganography using Relative links & Multi web-page Embedment - (2014). Available at: https://www.ijedr.org/papers/IJEDR1402108.pdf (Accessed on: 16/01/2016).

[7] Chintan Dhanani, et al. - Steganography using web documents as a carrier: A Survey - (2013). Available at: https://www.ijedr.org/papers/IJEDR1303036.pdf (Accessed on: 05/08/2015).

[8] Christian Grothoff, et al. - Transection Based Steganography - (2009). Available at: http://grothoff.org/christian/stego.pdf (Accessed on: 12/08/2012).

[9] Christine K. Mulunda, et al. - Genetic Algorithm Based Model in Text Steganography - (10-1-2013). Available at: http://dl.acm.org/citation.cfm?id=2636522 (Accessed on: 15/10/2014).

[10] David E. Goldberg - Genetic Algorithms and the Variance of Fitness - (Complex Systems 5 - 1991) - pp. 265-278. Available at: http://www.complex-systems.com/pdf/05-3-1.pdf (Accessed on: 21/05/2011).

[11] Donovan Artz - Digital Steganography: Hiding Data within Data - (IEEE Internet Computing, May/June 2001) - pp. 75-80. Available at: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=935180&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F4236%2F20242%2F00935180 (Accessed on: 22/09/2012).

[12] Eiji Kawaguchi and Richard O. Eason - Principle and applications of BPCS-Steganography - (2007). Available at: http://datahide.org/BPCSe/QtechHV-program-e.html (Accessed on: 14/07/2011).

[13] Elham Ghasemi, et al. - High Capacity Image Steganography Based on Genetic Algorithm and Wavelet Transform - (2012). Available at: http://www.iaeng.org/publication/IMECS2011/IMECS2011_pp495-498.pdf (Accessed on: 27/10/2014).

[14] Eric Cole, Ronald D. Krutz - Hiding in Plain Sight: Steganography and the Art of Covert Communication - (Indianapolis, Indiana: Wiley Publishing - 2003). Available at: http://www.amazon.com/Hiding-Plain-Sight-Steganography-Communication/dp/0471444499 (Accessed on: 11/04/2014).

[15] Gary C. Kessler-Steganography: Hiding Data Within Data-(2001).

Available at:http://www.garykessler.net/library/steganography.html-(Accessed

on:16/09/2012).

[16] HweeHwa Pang, et al- Steganographic Schemes for File System and B-

Tree- (IEEE Transactions on Knowledge & Data Engineering- vol.16- no. 6- June

2004) pp. 701-713-(Accessed on:16/09/2012).

[17] Ingemar J Cox et al- Information Transmission and Steganography –

(1996). Available at

:http://www.cs.ucl.ac.uk/staff/I.Cox/Content/papers/2005/iwdw2005.pdf-(Accessed

on:12/12/2013).

[18] Jen-Chang Liu, Ming-Hong Shih -Generalizations of Pixel-Value

Differencing Steganography for Data Hiding in Images-(2008).

[19] Jessica Fridrich, et al -Steganalysis of Content-Adaptive Steganography in

Spatial Domain-(2011). Available at

:http://dde.binghamton.edu/kodovsky/pdf/Fri11BOSS.pdf-(Accessed on:22/10/2012).

[20] K.F. Man, K.S. Tang, S. Kwong-Genetic Algorithms: Concepts and

Applications-(IEEE Transactions on Industrial Electronics -Vol.43-No.5-October

1996)-pp.519-534. Available at

:http://www.dca.fee.unicamp.br/~gomide/courses/EA072/artigos/Genetic_Algorithm

s_Concepts_Applications_Kwong_1996.pdf-(Accessed on:22/02/2016).

[21] Kazuo Sugihara-Measures for Performance Evaluation of Genetic

Algorithms-(2007). Available at

:http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49.8611&rep=rep1&type

=pdf-(Accessed on:20/04/2016).

[22] L. Polak, Z. Kotulski -Sending Hidden Data Through WWW

Pages: Detection and Prevention-( Trans- Polish Academy of Sciences - Institute of

Fundamental Technological Research- 2010)-pp.75–89-(Accessed on:10/08/2015).


[23] Mantas Paulinas, Andrius Ušinskas -A Survey of Genetic Algorithms

Applications for Image Enhancement and Segmentation- (Information Technology

and Control- Vol.36- No.3-2007)-Pp. 278- 284. Available at

:http://itc.ktu.lt/itc363/Paulinas363.pdf-(Accessed on:25/11/2011).

[24] Matthew Walker -Introduction to Genetic Programming-( October 7,

2001). Available at

:https://www.cs.montana.edu/~bwall/cs580/introduction_to_gp.pdf-(Accessed

on:16/09/2012).

[25] Onur Boyabatli -Parameter Selection in Genetic Algorithms-(Systemics,

Cybernetics and Informatics- Vol. 2 – No. 4-2004)-pp.78-83. Available at:

http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1840&context=lkcsb_rese

arch-(Accessed on:17/02/2016).

[26] Robert H. Williams III- Introduction to Information Security Concepts-(

2007). Available at

:http://www.worldcolleges.info/sites/default/files/enggnotes/introduction_to_informa

tion_security_concepts.pdf-(Accessed on:6/10/2012).

[27] Robert H. Williams III- Introduction to Information Security Concepts-

(2007). Available at

:http://www.worldcolleges.info/sites/default/files/enggnotes/introduction_to_informa

tion_security_concepts.pdf-(Accessed on:21/10/2012).

[28] Sandipan Dey -Embedding Secret Data in Html Web Page-(2010).

Available at:http://arxiv.org/pdf/1004.0459v1.pdf-(Accessed on:16/01/2016).

[29] Santi P. Maity, Malay K. Kundu-Genetic Algorithms for Optimality of

Data Hiding in Digital Images- (2008). Available at

:http://www.isical.ac.in/~malay/Papers/IJSoft%20comp_09.pdf-(Accessed

on:16/10/2014).


[30] Shingo Inoue ,et al -A Proposal on Information Hiding Methods using

XML-(2002). Available at:http://takizawa.ne.jp/nlp_xml.pdf-(Accessed

on:11/02/2016).

[31] T. Morkel , et al -An Overview Of Image Steganography-(2005). Available

at:http://repository.root-me.org/St%C3%A9ganographie/EN%20-

%20Image%20Steganography%20Overview.pdf-(Accessed on:27/08/2012).

[32] T. R. Gopala krishnan Nair, et al -Genetic Algorithm to Make Persistent

Security and Quality of Image in Steganography from RS Analysis-

(2012).Available at:https://arxiv.org/ftp/arxiv/papers/1204/1204.2616.pdf-(Accessed

on:19/10/2014).

[33] Xin-Guang Sui, Hui Luo-A New Steganography method based on

Hypertext- IEEE-2004.

Third: Journals and periodicals

[34] Anit Kumar -Encoding Schemes In Genetic Algorithm – (International

Journal of Advanced Research in IT and Engineering, Vol. 2 - No. 3 - March 2013)-

PP.1-7. Available at: http://www.garph.co.uk/ijarie/mar2013/1.pdf-(Accessed

on:25/04/2014).

[35] Arup Kumar Bhaumik, et al -Data Hiding in Video- (International Journal

of Database Theory and Application -Vol. 2- No. 2- June 2009)-pp.9-16. Available

at:http://www.sersc.org/journals/IJDTA/vol2_no2/2.pdf-(Accessed on:27/10/2012).

[36] B.B. Zaidan, et al- StegoMos: A secure novel approach of high rate data

hidden using mosaic image and ANN-BMP cryptosystem -(International Journal of

the Physical Sciences- Vol. 5(11)- 18 September, 2010 )- pp. 1796-1806. Available

at:http://www.academicjournals.org/journal/IJPS/article-full-text-

pdf/328BB1031824-(Accessed on:18/06/2011).

[37] Babloo Saha , Shuchi Sharma-Steganographic Techniques of Data Hiding

using Digital Images –(Defence Science Journal-Vol. 62- No. 1- January 2012)- pp.


11-18. Available at

:http://publications.drdo.gov.in/ojs/index.php/dsj/article/viewFile/1436/601-

(Accessed on :03/09/2012).

[38] Cheng-Hsing Yang, et al -A data hiding scheme using the varieties of

pixel-value differencing in multimedia images-(The Journal of Systems and

Software- 2011)-pp. 669–678. Available at:

https://lms.ctl.cyut.edu.tw/sysdata/8/21108/doc/d7b84166985e286b/attach/905867.p

df-(Accessed on:02/03/2015).

[39] Dhammjyoti V. Dhawase , Sachin Chavan- Webpage Information Hiding

Using Page Contents- (International Journal of Advanced Research in Computer

Engineering & Technology (IJARCET)Volume 3- Issue 1-January 2014)-pp. 182-

186. Available at:http://ijarcet.org/wp-content/uploads/IJARCET-VOL-3-ISSUE-1-

182-186.pdf-(Accessed on:15/01/2016).

[40] Hamid.A.Jalab, et al -New Design for Information Hiding with in

Steganography Using Distortion Techniques-( IACSIT International Journal of

Engineering and Technology- Vol. 2-No.1- February 2010)-pp.72-77. Available at

:http://www.ijetch.org/papers/103-T463.pdf-(Accessed on:08/08/2015).

[41] Hedieh Sajedi · Mansour Jamzad -Using contourlet transform and cover

selection for secure steganography-( International Journal of Electrical and

Computer Engineering (IJECE)Vol.2- No.5- October 2012)- pp. 699-708. -(Accessed

on:27/10/2015). Available at

http://www.ijcta.com/documents/volumes/vol3issue2/ijcta2012030233.pdf-(Accessed

on:04/10/2014).

[42] K. F. Rafat and M. Sher -StegRithm:Steganographic Algorithm for Digital

ASCII Text Documents-( IACSIT International Journal of Engineering and

Technology- Vol. 4- No. 6- December 2012)-pp.765-769. Available at

:http://www.ijetch.org/papers/480-B210.pdf-(Accessed on:07/12/2015).


[43] Komal R. Hole, et al -Application of Genetic Algorithm for Image

Enhancement and Segmentation-(International Journal of Advanced Research in

Computer Engineering & Technology (IJARCET)-Volume 2-Issue 4- April 2013)- pp.

1342-1346. Available at:http://ijarcet.org/wp-content/uploads/IJARCET-VOL-2-

ISSUE-4-1342-1346.pdf-(Accessed on:08/10/2014).

[44] Mohammad Shirali Shahreza-Arabic/Persian Text Steganography Utilizing

Similar Letters with Different Codes-(The Arabian Journal for Science and

Engineering- Volume 35-Number 1B-2006)-pp.213-222. Available at

:https://ajse.kfupm.edu.sa/articles/351b-p.14.pdf-(Accessed on:25/02/2015).

[45] Mohit Garg -A Novel Text Steganography Technique Based on Html

Documents-(International Journal of Advanced Science and Technology -Vol. 35-

October 2011)- pp. 129-138. Available at

:http://www.sersc.org/journals/IJAST/vol35/11.pdf-(Accessed on:23/08/2015).

[46] Neha Rani, et al - Text Steganography Techniques-(International Journal

of Engineering Trends and Technology (IJETT) – Vol.4 Issue 7- July 2013)-pp.

3013- 3015. Available at:http://www.ijettjournal.org/volume-4/issue-7/IJETT-

V4I7P186.pdf-(Accessed on:05/10/2015).

[47] P. Surekha, S. Sumathi-Implementation of Genetic Algorithm for a DWT

Based Image Watermarking Scheme-(ICTACT Journal on Soft Computing: Special

Issue on Fuzzy in Industrial and Process Automation- July 2011- Volume 02-

Issue 01)-pp.244-252. Available at:

http://ictactjournals.in/paper/IJSC_Vol2_Iss1_244_252.pdf-(Accessed

on:20/02/2013).

[48] Priyanka Sharma, Rajesh Gargi-Performance Analysis of Different

Selection Techniques in Genetic Algorithm-(International Journal of Science and

Research (IJSR)-2012)-pp. 2042-2046. Available at

:http://www.ijsr.net/archive/v3i8/MDIwMTU1NDc%3D.pdf-(Accessed

on:25/01/2015).

[49] Raj Kumar Mohanta, Binapani Sethi-A Review of Genetic Algorithm

application for Image Segmentation – (Int. J. Computer




Technology & Applications-Vol 3 )-pp. 720-723. Available at

:http://www.ijcta.com/documents/volumes/vol3issue2/ijcta2012030233.pdf

[50] Rajesh Kumar et al. - Genetic Divergence Studies in Pigeonpea –

(American Journal of Plant Sciences-2013)-pp. 2126-2130 . Available at

:http://dx.doi.org/10.4236/ajps.2013.411264-(Accessed on:02/06/2016).

[51] Rupali Gawade, et al -Data Hiding Using Steganography for Network

Security-( International Journal of Advanced Research in Computer and

Communication Engineering-Vol. 3- Issue 2- February 2014)-pp. 5740- 5743.

Available at

:http://www.ijarcce.com/upload/2014/february/IJARCEE9H_S_priyanka_shetya_Da

ta.pdf-(Accessed on:22/06/2016).

[52] Shen Wang, et al -A Secure Steganography Method based on Genetic

Algorithm- (Journal of Information Hiding and Multimedia Signal Processing-Vol.1-

No. 1- January 2010)-pp.28-35. Available at

:http://vanilla47.com/PDFs/Cryptography/Steganography/A%20Secure%20Stegano

graphy%20Method%20based%20on%20Genetic%20Algorithm.pdf-(Accessed

on:09/10/2014).

[53] Souvik Bhattacharyya, et al- Data Hiding Through Multi Level

Steganography and SSCE-(Journal of Global Research in Computer Science,

Volume 2, No. 2, February 2011)-pp.38-47.

Available at:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.468.8944&rep=rep1&typ

e=pdf-(Accessed on:05/01/2013).

Fourth: Web sites

[54] GPdotNET v3,(https://gpdotnet.codeplex.com/) -(Accessed on:12/03/2016).

[55] Reflector 9.0 program,

(https://documentation.redgate.com/display/REF9/.NET+Reflector+9+documentatio

n). -(Accessed on:22/04/2016).


APPENDICES

APPENDIX A

The Screens of the System

Creating and encrypting the database

Hiding the database in a web page

Extracting and decrypting the database

APPENDIX B

Source Code

%%%%%%%%%%%%%%%%%%%%%%%%%%%% class Encrypt %%%%%%%%%%%%%%%%%%

using System;
using System.Linq;
using System.Text;
using System.Security.Cryptography;
using System.IO;

namespace creatdatxml
{
    public static class Encrypt
    {
        // The IV must be as long as the cipher's block size, not its key size.
        // RijndaelManaged uses a 128-bit block by default, so the IV must be 16 bytes;
        // a 16-character ASCII string gives exactly 16 bytes when converted to a byte array.
        private const string initVector = "pemgail9uzpgzl88";

        // This constant is used to determine the key size (in bits) of the encryption algorithm.
        private const int keysize = 256;

        // Encrypt: returns the cipher text as a Base64 string.
        public static string EncryptString(string plainText, string passPhrase)
        {
            byte[] initVectorBytes = Encoding.ASCII.GetBytes(initVector);
            // UTF-8 is used here so that it matches the decoding in DecryptString.
            byte[] plainTextBytes = Encoding.UTF8.GetBytes(plainText);
            PasswordDeriveBytes password = new PasswordDeriveBytes(passPhrase, null);
            byte[] keyBytes = password.GetBytes(keysize / 8);
            RijndaelManaged symmetricKey = new RijndaelManaged();
            symmetricKey.Mode = CipherMode.CBC;
            ICryptoTransform encryptor = symmetricKey.CreateEncryptor(keyBytes, initVectorBytes);
            MemoryStream memoryStream = new MemoryStream();
            CryptoStream cryptoStream = new CryptoStream(memoryStream, encryptor, CryptoStreamMode.Write);
            cryptoStream.Write(plainTextBytes, 0, plainTextBytes.Length);
            cryptoStream.FlushFinalBlock();
            byte[] cipherTextBytes = memoryStream.ToArray();
            memoryStream.Close();
            cryptoStream.Close();
            return Convert.ToBase64String(cipherTextBytes);
        }

        // Decrypt: takes the Base64 cipher text produced by EncryptString.
        public static string DecryptString(string cipherText, string passPhrase)
        {
            byte[] initVectorBytes = Encoding.ASCII.GetBytes(initVector);
            byte[] cipherTextBytes = Convert.FromBase64String(cipherText);
            PasswordDeriveBytes password = new PasswordDeriveBytes(passPhrase, null);
            byte[] keyBytes = password.GetBytes(keysize / 8);
            RijndaelManaged symmetricKey = new RijndaelManaged();
            symmetricKey.Mode = CipherMode.CBC;
            ICryptoTransform decryptor = symmetricKey.CreateDecryptor(keyBytes, initVectorBytes);
            MemoryStream memoryStream = new MemoryStream(cipherTextBytes);
            CryptoStream cryptoStream = new CryptoStream(memoryStream, decryptor, CryptoStreamMode.Read);
            // The plain text is never longer than the cipher text.
            byte[] plainTextBytes = new byte[cipherTextBytes.Length];
            // A single Read call may return fewer bytes than requested, so read until the stream ends.
            int decryptedByteCount = 0;
            int bytesRead;
            while ((bytesRead = cryptoStream.Read(plainTextBytes, decryptedByteCount,
                                                  plainTextBytes.Length - decryptedByteCount)) > 0)
            {
                decryptedByteCount += bytesRead;
            }
            memoryStream.Close();
            cryptoStream.Close();
            return Encoding.UTF8.GetString(plainTextBytes, 0, decryptedByteCount);
        }
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%end class%%%%%%%%%%%%%%%%%%%%%%%%%%%%
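
The initialization vector used in CBC mode must match the block size of the cipher rather than its key size. The following is a minimal sketch of that relationship, not part of the system itself. It assumes the `Aes` class in place of `RijndaelManaged` so that it compiles on current .NET runtimes; the class name `IvLengthCheck` and the helper `RequiredIvBytes` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class IvLengthCheck
{
    // Hypothetical helper: the IV length in bytes for a cipher with the given block size in bits.
    public static int RequiredIvBytes(int blockSizeBits)
    {
        return blockSizeBits / 8;
    }

    public static void Main()
    {
        using (Aes aes = Aes.Create())
        {
            aes.KeySize = 256; // same key size as the listing; the block size is unaffected
            byte[] iv = Encoding.ASCII.GetBytes("pemgail9uzpgzl88");
            Console.WriteLine("Block size (bits): " + aes.BlockSize);
            Console.WriteLine("Required IV bytes: " + RequiredIvBytes(aes.BlockSize));
            Console.WriteLine("Supplied IV bytes: " + iv.Length);
        }
    }
}
```

A fixed IV, as in the listing, makes equal plain texts produce equal cipher texts under the same pass phrase; generating a fresh random IV per message and storing it alongside the cipher text is a common alternative.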

%%%%%%%%%%%%%%%%%%%%%%%%%%% class HtmlUtility %%%%%%%%%%%%%%%%%%

#region Using directives
using System;
using System.IO;
using System.Data;
using System.Text;
using System.Collections;
using System.Collections.Specialized;
using System.Runtime.InteropServices;
using GeneticAlgorithms;
#endregion

namespace stegeneticweb
{
    public class HtmlUtility : GeneticAlgorithms.GenomeCollection
    {
        public HtmlUtility() { }

        /// <summary>Counts the key attribute couples in an HTML document</summary>
        /// <param name="sourceFileName">Path and name of the HTML document</param>
        /// <param name="keyTable">DataTable with the key attributes</param>
        /// <returns>Count of carrier attribute couples in the specified document; each couple can hide one bit</returns>
        public int GetCapacity(String sourceFileName, DataTable keyTable)
        {
            int countCarrierCouples = 0;
            StreamReader reader = new StreamReader(sourceFileName, Encoding.Default);
            String htmlDocument = reader.ReadToEnd();
            reader.Close();

            HtmlTagCollection tags = FindTags(htmlDocument);
            StringBuilder insertTextBuilder = new StringBuilder();
            DataRow[] rows;
            HtmlAttribute secondAttribute;

            foreach (HtmlTag tag in tags)
            {
                foreach (HtmlAttribute attribute in tag.Attributes)
                {
                    if (!attribute.Handled)
                    {
                        rows = keyTable.Select("firstAttribute = '" + attribute.Name.Replace("'", "''") + "'");
                        if (rows.Length > 0)
                        {
                            secondAttribute = FindAttribute(rows[0]["secondAttribute"].ToString(), tag.Attributes);
                            if (secondAttribute != null)
                            {
                                countCarrierCouples++;
                            }
                        }
                    }
                }
            }
            return countCarrierCouples;
        }

        /// <summary>Encode one bit as a combination of attributes, add the resulting text to a StringBuilder</summary>
        /// <param name="messageByte">Current byte</param>
        /// <param name="bitIndex">Current position in [messageByte]</param>
        /// <param name="firstAttribute">Key attribute</param>
        /// <param name="secondAttribute">Corresponding attribute</param>
        /// <param name="insertTextBuilder">Receives the new HTML text</param>
        private void HideBit(int messageByte, int bitIndex, HtmlAttribute firstAttribute, HtmlAttribute secondAttribute, StringBuilder insertTextBuilder)
        {
            String firstAttributeText, secondAttributeText;

            if (firstAttribute.Value.Length > 0)
            {
                firstAttributeText = String.Format("{0:x2}={1:x2}", firstAttribute.Name, firstAttribute.Value);
            }
            else
            {
                firstAttributeText = firstAttribute.Name;
            }

            if (secondAttribute.Value.Length > 0)
            {
                secondAttributeText = String.Format("{0:x2}={1:x2}", secondAttribute.Name, secondAttribute.Value);
            }
            else
            {
                secondAttributeText = secondAttribute.Name;
            }

            if (GetBit(messageByte, bitIndex))
            {
                //bit is true: key attribute first
                insertTextBuilder.AppendFormat(@" {0:x2} {1:x2}", firstAttributeText, secondAttributeText);
            }
            else
            {
                //bit is false: key attribute second
                insertTextBuilder.AppendFormat(@" {0:x2} {1:x2}", secondAttributeText, firstAttributeText);
            }
        }

        /// <summary>Hide a message in an HTML document</summary>
        /// <param name="sourceFileName">Path and name of the HTML document</param>
        /// <param name="destinationFileName">Path and name to save the resulting HTML document</param>
        /// <param name="message">The message to hide</param>
        /// <param name="keyTable">DataTable with the key attributes</param>
        public void Hide(String sourceFileName, String destinationFileName, Stream message, DataTable keyTable)
        {
            //read the carrier document
            StreamReader reader = new StreamReader(sourceFileName, Encoding.Default);
            String htmlDocument = reader.ReadToEnd();
            reader.Close();
            message.Position = 0;

            //list the HTML tags
            HtmlTagCollection tags = FindTags(htmlDocument);
            StringBuilder insertTextBuilder = new StringBuilder();
            DataRow[] rows;
            HtmlAttribute secondAttribute;
            int offset = 0;
            int bitIndex = 7;
            int messageByte = 0;

            foreach (HtmlTag tag in tags)
            {
                insertTextBuilder.Remove(0, insertTextBuilder.Length);
                insertTextBuilder.AppendFormat("<{0:x2}", tag.Name);

                foreach (HtmlAttribute ICrossover in tag.Attributes)
                {
                    if (!ICrossover.Handled)
                    {
                        //attribute has not been used, yet
                        //find key row for this attribute
                        rows = keyTable.Select(String.Format("firstAttribute = '{0:x2}'", ICrossover.QueryFormattedName));
                        if (rows.Length > 0)
                        {
                            //find corresponding attribute
                            secondAttribute = FindAttribute(rows[0]["secondAttribute"].ToString(), tag.Attributes);
                            if (secondAttribute != null)

                            {
                                if (bitIndex == 7)
                                {
                                    //get next message byte
                                    bitIndex = 0;
                                    messageByte = message.ReadByte();
                                }
                                else
                                {
                                    //next bit
                                    bitIndex++;
                                }

                                //change the attributes' order
                                HideBit(messageByte, bitIndex, ICrossover, secondAttribute, insertTextBuilder);

                                //mark both attributes as handled
                                ICrossover.Handled = true;
                                secondAttribute.Handled = true;
                            }
                        }

                        if (!ICrossover.Handled)
                        {
                            //the attribute is not a primary key attribute. Is it a secondary key attribute?
                            bool copyAttribute = false;
                            rows = keyTable.Select(String.Format("secondAttribute = '{0:x2}'", ICrossover.QueryFormattedName));
                            if (rows.Length > 0)
                            {
                                //if the corresponding first attribute does not exist in this tag or has already been used,
                                //this attribute will not be used and must be copied.
                                HtmlAttribute firstAttribute = FindAttribute(rows[0]["firstAttribute"].ToString(), tag.Attributes);
                                if (firstAttribute == null)
                                {
                                    copyAttribute = true;
                                }
                                else
                                {
                                    copyAttribute = firstAttribute.Handled;
                                }
                            }
                            else if (rows.Length == 0)
                            {
                                //this attribute is not part of the key and must be copied.
                                copyAttribute = true;
                            }

                            if (copyAttribute)
                            {
                                //copy unused attribute
                                insertTextBuilder.AppendFormat(@" {0:x2}={1:x2}", ICrossover.Name, ICrossover.Value);
                                ICrossover.Handled = true;
                            }
                        }
                    }
                }

                //replace old tag with new tag
                tag.BeginPosition += offset;
                tag.EndPosition += offset;
                String insertText = insertTextBuilder.ToString();
                int newLength = insertText.Length;
                if (newLength > 0)
                {
                    int oldLength = tag.EndPosition - tag.BeginPosition;
                    htmlDocument = htmlDocument.Remove(tag.BeginPosition, oldLength);
                    htmlDocument = htmlDocument.Insert(tag.BeginPosition, insertText);
                    offset += (newLength - oldLength);
                }

                if (messageByte < 0)
                {
                    break; //finished
                }
            }

            //save the new document
            StreamWriter writer = new StreamWriter(destinationFileName);
            writer.Write(htmlDocument);
            writer.Close();
        }

        /// <summary>Extract one bit, add it to a Stream</summary>
        /// <param name="firstAttributePosition">Position of the key attribute in the source document</param>
        /// <param name="secondAttributePosition">Position of the corresponding attribute in the source document</param>
        /// <param name="messageByte">Current message byte</param>
        /// <param name="bitIndex">Current bit index</param>
        /// <param name="message">Message stream</param>
        /// <returns>New message byte</returns>
        private byte ExtractBit(int firstAttributePosition, int secondAttributePosition, byte messageByte, int bitIndex, Stream message)
        {
            if (firstAttributePosition < secondAttributePosition)
            {
                messageByte = SetBit(messageByte, bitIndex, true);
            }
            else
            {
                messageByte = SetBit(messageByte, bitIndex, false);
            }

            if (bitIndex == 7)
            {
                //save to message byte

                message.WriteByte(messageByte);
                messageByte = 0;
            }
            return messageByte;
        }

        /// <summary>Extract a hidden message from an HTML document</summary>
        /// <param name="sourceFileName">Path and name of the HTML document</param>
        /// <param name="message">Empty stream for the message</param>
        /// <param name="keyTable">DataTable with the key attributes</param>
        public void Extract(String sourceFileName, Stream message, DataTable keyTable)
        {
            //read the carrier document
            StreamReader reader = new StreamReader(sourceFileName, Encoding.Default);
            String htmlDocument = reader.ReadToEnd();
            reader.Close();

            //list the HTML tags
            HtmlTagCollection tags = FindTags(htmlDocument);
            StringBuilder insertTextBuilder = new StringBuilder();
            DataRow[] rows;
            HtmlAttribute secondAttribute;
            int attributePosition, secondAttributePosition;
            int messageLength = 0;
            int bitIndex = 0;
            byte messageByte = 0;

            foreach (HtmlTag tag in tags)
            {
                foreach (HtmlAttribute ICrossover in tag.Attributes)
                {
                    if (!ICrossover.Handled)
                    {
                        //attribute has not been used, yet
                        //find key row for this attribute
                        rows = keyTable.Select(String.Format("firstAttribute = '{0:x2}'", ICrossover.QueryFormattedName));
                        if (rows.Length > 0)
                        {
                            //find corresponding attribute
                            secondAttribute = FindAttribute(rows[0]["secondAttribute"].ToString(), tag.Attributes);
                            if (secondAttribute != null)
                            {
                                attributePosition = htmlDocument.IndexOf(ICrossover.Name, tag.BeginPosition);
                                secondAttributePosition = htmlDocument.IndexOf(secondAttribute.Name, tag.BeginPosition);

                                //compare the attributes' positions
                                messageByte = ExtractBit(attributePosition, secondAttributePosition, messageByte, bitIndex, message);

                                //next bit
                                if (bitIndex == 7)
                                {
                                    bitIndex = 0;
                                    if ((message.Length == 1) && (messageLength == 0))
                                    {
                                        //read length: the first extracted byte holds the message length
                                        message.Position = 0;
                                        BinaryReader binaryReader = new BinaryReader(message);
                                        messageLength = binaryReader.ReadByte();
                                        reader = null;
                                        message.SetLength(0);
                                        message.Position = 0;
                                    }
                                    else if ((messageLength > 0) && (message.Length == messageLength))
                                    {
                                        break; //finished
                                    }
                                }
                                else
                                {
                                    bitIndex++;
                                }

                                //mark both attributes as handled
                                ICrossover.Handled = true;
                                secondAttribute.Handled = true;
                            }
                        }

                        if (!ICrossover.Handled)
                        {
                            rows = keyTable.Select(String.Format("secondAttribute = '{0:x2}'", ICrossover.QueryFormattedName));
                            if (rows.Length == 0)
                            {
                                //tag not used
                                ICrossover.Handled = true;
                            }
                        }
                    }
                }

                if ((messageLength > 0) && (message.Length == messageLength))
                {
                    break; //finished
                }
            }
        }

        /// <summary>Find the attribute with a specific name</summary>
        /// <param name="name">Name of the attribute</param>
        /// <param name="attributes">Attributes of a tag</param>
        /// <returns>The attribute found in [attributes], or null</returns>
        private HtmlAttribute FindAttribute(String name, HtmlAttributeCollection attributes)
        {
            HtmlAttribute foundAttribute = null;
            foreach (HtmlAttribute ICrossover in attributes)
            {
                if ((!ICrossover.Handled) && (ICrossover.Name == name))
                {
                    foundAttribute = ICrossover;
                    break;
                }
            }
            return foundAttribute;
        }

        /// <summary>List all HTML tags of a document</summary>
        /// <param name="htmlDocument">Text of the HTML document</param>
        /// <returns>List with the start tags found in the document</returns>
        private HtmlTagCollection FindTags(String htmlDocument)
        {
            HtmlTagCollection ICrossover = new HtmlTagCollection();
            int indexStart = 0, indexEnd = 0;
            String text;
            do
            {
                indexStart = htmlDocument.IndexOf('<', indexEnd + 1);
                if (indexStart > 0)
                {
                    indexEnd = htmlDocument.IndexOf('>', indexStart + 1);
                    if (indexEnd > 0)
                    {
                        if (htmlDocument[indexStart + 1] != '/')
                        {
                            //end of a start tag found
                            text = htmlDocument.Substring(indexStart, indexEnd - indexStart);
                            ICrossover.Add(new HtmlTag(text, indexStart, indexEnd));
                        }
                    }
                }
            } while (indexStart > 0);
            return ICrossover;
        }

        /// <summary>Get the value of a bit</summary>
        /// <param name="b">The byte value</param>
        /// <param name="position">The position (0-7) of the bit</param>
        /// <returns>The value of the bit</returns>
        private bool GetBit(int b, int position)
        {
            return ((b & (byte)(1 << position)) != 0);
        }

        #region Random Methods
        public static byte[] GetRandomBytes(int size, HtmlAttribute parent)
        {
            byte[] buffer = new byte[size];
            RandomProvider.NextBytes(buffer);
            return buffer;
        }

        private static Random _randomProvider = new Random();

        public static Random RandomProvider
        {
            get { return _randomProvider; }
            set { _randomProvider = value; }
        }
        #endregion

        /// <summary>Set a bit to [newBitValue]</summary>
        /// <param name="b">The byte value</param>
        /// <param name="position">The position (0-7) of the bit</param>
        /// <param name="newBitValue">The new value of the bit in position [position]</param>
        /// <returns>The new byte value</returns>
        private byte SetBit(byte b, int position, bool newBitValue)
        {
            byte mask = (byte)(1 << position);
            if (newBitValue)
            {
                return (byte)(b | mask);
            }
            else
            {
                return (byte)(b & ~mask);
            }
        }
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%
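
The encoding rule used by HideBit and ExtractBit can be illustrated in isolation: a pair of attributes written with the key attribute first stands for a 1 bit, and the reverse order stands for a 0 bit. The following is a minimal sketch under simplifying assumptions, not the implementation of the system. The class `AttributeOrderDemo`, the methods `EncodeBit` and `DecodeBit`, and the `img` tag with its `src` and `alt` attributes are hypothetical names chosen only for the example, and no key table or tag parser is involved.

```csharp
using System;

public static class AttributeOrderDemo
{
    // Writes a tag whose attribute order carries one bit:
    // key attribute first means 1, key attribute second means 0.
    public static string EncodeBit(string tagName, string keyAttribute, string pairedAttribute, bool bit)
    {
        return bit
            ? "<" + tagName + " " + keyAttribute + " " + pairedAttribute + ">"
            : "<" + tagName + " " + pairedAttribute + " " + keyAttribute + ">";
    }

    // Recovers the bit by comparing the positions of the two attribute names in the tag text.
    public static bool DecodeBit(string tag, string keyAttributeName, string pairedAttributeName)
    {
        int keyPosition = tag.IndexOf(keyAttributeName, StringComparison.Ordinal);
        int pairedPosition = tag.IndexOf(pairedAttributeName, StringComparison.Ordinal);
        return keyPosition < pairedPosition;
    }

    public static void Main()
    {
        string one = EncodeBit("img", "src=\"a.png\"", "alt=\"logo\"", true);
        string zero = EncodeBit("img", "src=\"a.png\"", "alt=\"logo\"", false);
        Console.WriteLine(one);                           // <img src="a.png" alt="logo">
        Console.WriteLine(zero);                          // <img alt="logo" src="a.png">
        Console.WriteLine(DecodeBit(one, "src", "alt"));  // True
        Console.WriteLine(DecodeBit(zero, "src", "alt")); // False
    }
}
```

Because a browser renders both orderings identically, the carrier page looks unchanged; the capacity is one bit per attribute pair, which is why GetCapacity counts pairs.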

%%%%%%%%%%%%%%%%%%%%%%%%%%%% class HtmlAttribute %%%%%%%%%%%%%%%%%%

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using GeneticAlgorithms;

namespace stegeneticweb
{
    public class HtmlAttribute : IComparable
    {
        protected GeneticAlgorithm Parent;

        protected HtmlAttribute(GeneticAlgorithm parent)
        {
            Parent = parent;
        }

        private String name;
        private String value;
        private bool handled;

        public String Name
        {
            get { return name; }
        }

        public String QueryFormattedName
        {
            get { return name.Replace("'", "''"); }
        }

        public String Value
        {
            get { return this.value; }
            set { this.value = value; }
        }

        public bool Handled
        {
            get { return handled; }
            set { this.handled = value; }
        }

        public HtmlAttribute(String name)
        {
            this.name = name.ToLower();
            this.value = String.Empty;
            handled = false;
        }

        private double _fitness = double.MinValue;

        public double Fitness
        {
            get
            {
                // double Fitness=Math.Pow(15 * x * y * (1 - x) * (1 - y) * Math.Sin(n * Math.PI * x) * Math.Sin(n * Math.PI * y), 2);
                return _fitness;
            }
            set { _fitness = value; }
        }

        #region IComparable Members
        public int CompareTo(object obj)
        {
            HtmlAttribute compared = (HtmlAttribute)obj;
            if (this.Fitness < compared.Fitness)
                return -1;
            else if (this.Fitness > compared.Fitness)
                return 1;
            else
                return 0;
        }
        #endregion

        internal static void Sort()
        {
            throw new NotImplementedException();
        }
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%% IGenomeFactory %%%%%%%%%%%%%%%%%%

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using GeneticAlgorithms;

namespace stegeneticweb
{
    /// <summary>
    /// Collects methods used to create Genomes
    /// </summary>
    public interface IGenomeFactory
    {
        HtmlAttribute CreateGenome(GeneticAlgorithm parent);
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%% GeneticAlgorithm %%%%%%%%%%%%%%%%%%

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace stegeneticweb
{
    /// <summary>
    /// The GeneticAlgorithm class is the cornerstone of a small framework of classes that build customizable
    /// genetic algorithm searches. At a minimum you must set up either a BinaryGenome or RealGenome
    /// that describes a solution to your problem and then implement the IEvaluateGenome interface to
    /// provide a way of 'scoring' each possible candidate in the population of possible solutions.
    ///
    /// See the provided examples for more information.
    /// </summary>
    public class GeneticAlgorithm : stegeneticweb.IGeneticAlgorithm
    {
        protected internal readonly System.Type genomeType;

        /// <summary>
        /// Creates an instance of the GeneticAlgorithm class.
        /// </summary>
        public GeneticAlgorithm() { }

        #region Methods
        /// <summary>
        /// After you have created the GeneticAlgorithm instance and provided it with an implementation of
        /// IGenomeFactory, this method can be called to actually create the population of custom Genomes for
        /// use in the GA search.
        ///
        /// In the case of the provided RealGenomeFactory you create a RealGenomeFactory instance, then
        /// set the minimum and maximum values you want each genome to have, then provide the Genetic Algorithm
        /// the factory through the GenomeFactory property.
        ///
        /// The GenomeFactory that you provide is responsible for every aspect of constructing the Genome so that it's ready
        /// to evaluate and crossover with other genomes.
        /// </summary>
        /// <param name="populationSize"></param>
        public void CreateGenomes(int populationSize)
        {
            if (GenomeFactory == null)
                throw new ApplicationException("Cannot create Genomes if the GenomeFactory is null");
            HtmlAttribute[] genomes = new HtmlAttribute[populationSize];
            for (int i = 0; i < populationSize; i++)
                _genomes.Add(GenomeFactory.CreateGenome(this));
        }

        /// <summary>
        /// Attempts to find an optimal solution to the problem.
        /// </summary>
        /// <returns></returns>
        public virtual HtmlAttribute FindOptima()
        {
            return FindOptima(ExitConditions);
        }

        /// <summary>
        /// Attempts to find an optimal solution to the problem by beginning the process of evaluation and crossover/mutation.
        ///
        /// This method behaves as follows: while the ExitConditions are not met, loop through each Genome in the population and select
        /// a mate for it (if in greedy mode, do not mate and replace the best solution with a child; keep the best solution only as
        /// a mate for other Genomes). Use the provided Selector to control the selection process (weighted random, sequential, etc.).
        /// Use the Crossover provided to recombine the selected genome and its mate, and to apply any mutation.
        /// </summary>
        /// <param name="conditions"></param>
        /// <returns></returns>
        public virtual HtmlAttribute FindOptima(ExitConditions conditions)
        {
            // If you run the debug version, the app keeps track of the time spent evaluating the Genomes,
            // recombining and mutating the Genomes (crossover) and selecting mates for the Genomes (selection).
#if DEBUG
            Counter evalTime = new Counter();
            Counter crossoverTime = new Counter();
            Counter selectorTime = new Counter();
#endif
            if (Selector == null)
                throw new ApplicationException("Cannot run FindOptima if the Selector is null");
            if (Crossover == null)
                throw new ApplicationException("Cannot run FindOptima if the Crossover is null");

            startTime = DateTime.Now;

            // this is how greediness is implemented:
            // if we are 'greedy' we keep the top genome every time, so we shorten the
            // count of genomes by one, because we want the last genome to remain
            // untouched
            int genomeCount = (IsGreedy ? _genomes.Count - 1 : _genomes.Count);

            while (conditions.DoesContinue(this))
            {
                if (NewGeneration != null)
                    NewGeneration(this, null);
#if DEBUG
                evalTime.Start();
#endif
                // first, evaluate the fitness:
                for (int g = 0; g < genomeCount; g++)
                {
                    Genomes[g].Fitness = Evaluator.Eval(Genomes[g]);
                }
#if DEBUG
                evalTime.Stop();
#endif
                // sort places genomes in order from least fit at position 0
                // to most fit at the end of the collection:
                Genomes.Sort();
                if (Genomes[Genomes.Count - 1].Fitness > _gbestFitness)
                {
                    _gbestFitness = Genomes[Genomes.Count - 1].Fitness;
                    if (NewGlobalBest != null)
                        NewGlobalBest(this, null);
                }

                for (int g = 0; g < genomeCount; g++)
                {
#if DEBUG
                    selectorTime.Start();
#endif
                    HtmlAttribute mate = Selector.Select();
                    //while(mate.Equals(Genomes[g]))
                    //mate = Selector.Select();
#if DEBUG
                    selectorTime.Stop();
#endif
#if DEBUG
                    crossoverTime.Start();
#endif
                    // Note: although the Crossover method will *often* simply modify the referenced first Genome 'in-place',
                    // it makes sense to return it explicitly so that future implementations *can* create new Genome instances
                    // if that is better or necessary for their particular algorithm
                    _genomes[g] = Crossover.Crossover(_genomes[g], mate);
#if DEBUG
                    crossoverTime.Stop();
#endif
                }
                generations++;
            }
#if DEBUG
            Console.WriteLine("Evaluation time: {0}", evalTime.Seconds);
            Console.WriteLine("Crossover time: {0}", crossoverTime.Seconds);
            Console.WriteLine("Selector time: {0}", selectorTime.Seconds);
#endif
            // sort once more so that the fittest genome is at the end of the collection
            if (!IsGreedy)
                Genomes.Sort();
            return Genomes[Genomes.Count - 1];
        }
        #endregion

        #region Properties
        protected DateTime startTime;

        /// <summary>
        /// Records the time FindOptima was last called.
        /// </summary>
        public DateTime StartTime
        {
            get { return startTime; }
        }

        protected int generations = 0;

        /// <summary>
        /// The Generation count since the last time FindOptima was called.
        /// </summary>
        public int GenerationCount

116

{ get { return generations; } } private HtmlAttributeCollection _genomes = new HtmlAttributeCollection(); public HtmlAttributeCollection Genomes { get { return _genomes; } } private IGenomeSelector _selector; /// <summary> /// The Selector is an instance of IGenomeSelector that provides the selection strategy for the GA. /// </summary> public IGenomeSelector Selector { get { return _selector; } set { _selector=value; } } private IEvaluateGenome _evaluator; /// <summary> /// The IEvaluateGenome implementation that the GA will use to determine Genomes' fitness. /// </summary> public IEvaluateGenome Evaluator { get { return _evaluator; } set { _evaluator=value; } } private ICrossover _crossover; /// <summary> /// The ICrossover used to recombine and mutate Genomes. /// </summary> public ICrossover Crossover { get { return _crossover; } set

117

{ _crossover=value; } } private IGenomeFactory _genomeFactory; /// <summary> /// The IGenomeFactory used to create new genomes as necessary. /// </summary> public IGenomeFactory GenomeFactory { get { return _genomeFactory; } set { _genomeFactory=value; } } private ExitConditions _exitConditions=new ExitConditions(); /// <summary> /// Sets/Gets the instance of ExitConditions this GA is using to determine when to quit iterating through /// generations. /// </summary> public ExitConditions ExitConditions { get { return _exitConditions; } set { _exitConditions=value; } } private bool _isGreedy=true; /// <summary> /// 'Greedyness' in this idiom of a GeneticAlgorithm means that the algorithm always holds on to the most /// optimal solution yet found between generations. /// </summary> public bool IsGreedy { get { return _isGreedy; } set { _isGreedy=value; } } #endregion public event GeneticAlgorithmEventHandler NewGeneration; public event GeneticAlgorithmEventHandler NewGlobalBest;

118

private double _gbestFitness=double.MinValue; } }
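The greedy loop described in the FindOptima comments can be illustrated in isolation. The following is a minimal, self-contained sketch and not the thesis implementation: `SketchGenome`, `GreedyGaSketch`, the uniform random mate selection, the arithmetic crossover and the 0.1 mutation rate are all assumptions made for illustration. It keeps only the behavior the listing documents: sort from least fit to most fit, and leave the last (best) genome untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a genome: one real-valued gene plus a fitness score.
public sealed class SketchGenome
{
    public double Gene;
    public double Fitness;
}

public static class GreedyGaSketch
{
    // Runs a fixed number of generations and returns the fittest genome found.
    public static SketchGenome Run(Func<double, double> evaluate, int populationSize,
                                   int maxGenerations, int seed)
    {
        var rng = new Random(seed);
        var genomes = new List<SketchGenome>();
        for (int i = 0; i < populationSize; i++)
            genomes.Add(new SketchGenome { Gene = rng.NextDouble() * 20.0 - 10.0 });

        for (int generation = 0; generation < maxGenerations; generation++)
        {
            Score(genomes, evaluate);

            // Ascending sort: least fit at index 0, most fit at the end.
            genomes.Sort((a, b) => a.Fitness.CompareTo(b.Fitness));

            // Greedy mode: stop one short so the last (best) genome is left untouched.
            for (int g = 0; g < genomes.Count - 1; g++)
            {
                SketchGenome mate = genomes[rng.Next(genomes.Count)]; // assumed: uniform selection
                double child = (genomes[g].Gene + mate.Gene) / 2.0;   // assumed: arithmetic crossover
                if (rng.NextDouble() < 0.1)                           // assumed: mutation rate
                    child += rng.NextDouble() - 0.5;
                genomes[g] = new SketchGenome { Gene = child };
            }
        }

        Score(genomes, evaluate);
        return genomes.OrderBy(x => x.Fitness).Last();
    }

    private static void Score(List<SketchGenome> genomes, Func<double, double> evaluate)
    {
        foreach (SketchGenome genome in genomes)
            genome.Fitness = evaluate(genome.Gene);
    }
}
```

Because the best genome is never overwritten, the best fitness in the population cannot decrease from one generation to the next, which is the property the IsGreedy flag is meant to provide.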

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%% HtmlAttributeCollection %%%%%%%%%%%%%%%%%%

#region Using directives
using System;
using System.Collections;
using System.Text;
using AForge.Genetic;
using GeneticAlgorithms;
#endregion

namespace stegeneticweb
{
    /// <summary>
    /// A collection that stores <see cref="HtmlAttribute"/> objects.
    /// </summary>
    [Serializable()]
    public class HtmlAttributeCollection : System.Collections.CollectionBase
    {
        /// <summary>
        /// Initializes a new, empty instance of <see cref="HtmlAttributeCollection"/>.
        /// </summary>
        public HtmlAttributeCollection()
        {
        }

        /// <summary>
        /// Initializes a new instance of <see cref="HtmlAttributeCollection"/> based on another
        /// <see cref="HtmlAttributeCollection"/>.
        /// </summary>
        /// <param name="val">The collection from which the contents are copied.</param>
        public HtmlAttributeCollection(HtmlAttributeCollection val)
        {
            this.AddRange(val);
        }

        /// <summary>
        /// Initializes a new instance of <see cref="HtmlAttributeCollection"/> containing an array of
        /// <see cref="HtmlAttribute"/> objects.
        /// </summary>
        /// <param name="val">An array of objects with which to initialize the collection.</param>
        public HtmlAttributeCollection(HtmlAttribute[] val)
        {
            this.AddRange(val);
        }

        /// <summary>
        /// Gets or sets the entry at the specified index of the collection.
        /// </summary>
        /// <param name="index">The zero-based index of the entry to locate in the collection.</param>
        /// <exception cref="System.ArgumentOutOfRangeException">
        /// <paramref name="index"/> is outside the valid range of indexes for the collection.
        /// </exception>
        public HtmlAttribute this[int index]
        {
            get { return ((HtmlAttribute)(List[index])); }
            set { List[index] = value; }
        }

        /// <summary>
        /// Adds a <see cref="HtmlAttribute"/> to the collection.
        /// </summary>
        /// <param name="val">The <see cref="HtmlAttribute"/> to add.</param>
        /// <returns>The index at which the new element was inserted.</returns>
        public int Add(HtmlAttribute val)
        {
            return List.Add(val);
        }

        /// <summary>
        /// Copies the elements of an array to the end of the collection.
        /// </summary>
        /// <param name="val">An array containing the objects to add to the collection.</param>
        public void AddRange(HtmlAttribute[] val)
        {
            for (int i = 0; i < val.Length; i++)
            {
                this.Add(val[i]);
            }
        }

        /// <summary>
        /// Adds the contents of another <see cref="HtmlAttributeCollection"/> to the end of the collection.
        /// </summary>
        /// <param name="val">A collection containing the objects to add.</param>
        public void AddRange(HtmlAttributeCollection val)
        {
            for (int i = 0; i < val.Count; i++)
            {
                this.Add(val[i]);
            }
        }

        /// <summary>
        /// Gets a value indicating whether the collection contains the specified <see cref="HtmlAttribute"/>.
        /// </summary>
        /// <param name="val">The <see cref="HtmlAttribute"/> to locate.</param>
        /// <returns>true if the item is contained in the collection; otherwise, false.</returns>
        public bool Contains(HtmlAttribute val)
        {
            return List.Contains(val);
        }

        /// <summary>
        /// Copies the collection values to a one-dimensional array, starting at the specified index.
        /// </summary>
        /// <param name="array">The one-dimensional array that is the destination of the copied values.</param>
        /// <param name="index">The index in <paramref name="array"/> where copying begins.</param>
        /// <exception cref="System.ArgumentException">
        /// <paramref name="array"/> is multidimensional, or the number of elements in the collection is
        /// greater than the available space between <paramref name="index"/> and the end of
        /// <paramref name="array"/>.
        /// </exception>
        /// <exception cref="System.ArgumentNullException"><paramref name="array"/> is null.</exception>
        /// <exception cref="System.ArgumentOutOfRangeException">
        /// <paramref name="index"/> is less than the lower bound of <paramref name="array"/>.
        /// </exception>
        public void CopyTo(HtmlAttribute[] array, int index)
        {
            List.CopyTo(array, index);
        }

        /// <summary>
        /// Returns the index of a <see cref="HtmlAttribute"/> in the collection.
        /// </summary>
        /// <param name="val">The <see cref="HtmlAttribute"/> to locate.</param>
        /// <returns>The index of <paramref name="val"/> in the collection, if found; otherwise, -1.</returns>
        public int IndexOf(HtmlAttribute val)
        {
            return List.IndexOf(val);
        }

        /// <summary>
        /// Inserts a <see cref="HtmlAttribute"/> into the collection at the specified index.
        /// </summary>
        /// <param name="index">The zero-based index where <paramref name="val"/> should be inserted.</param>
        /// <param name="val">The <see cref="HtmlAttribute"/> to insert.</param>
        public void Insert(int index, HtmlAttribute val)
        {
            List.Insert(index, val);
        }

        /// <summary>
        /// Returns an enumerator that can iterate through the collection.
        /// </summary>
        public new HtmlAttributeEnumerator GetEnumerator()
        {
            return new HtmlAttributeEnumerator(this);
        }

        /// <summary>
        /// Removes a specific <see cref="HtmlAttribute"/> from the collection.
        /// </summary>
        /// <param name="val">The <see cref="HtmlAttribute"/> to remove.</param>
        /// <exception cref="System.ArgumentException">
        /// <paramref name="val"/> is not found in the collection.
        /// </exception>
        public void Remove(HtmlAttribute val)
        {
            List.Remove(val);
        }

        public class HtmlAttributeEnumerator : IEnumerator
        {
            IEnumerator baseEnumerator;
            IEnumerable temp;

            public HtmlAttributeEnumerator(HtmlAttributeCollection mappings)
            {
                this.temp = ((IEnumerable)(mappings));
                this.baseEnumerator = temp.GetEnumerator();
            }

            public HtmlAttribute Current
            {
                get { return ((HtmlAttribute)(baseEnumerator.Current)); }
            }

            object IEnumerator.Current
            {
                get { return baseEnumerator.Current; }
            }

            public bool MoveNext()
            {
                return baseEnumerator.MoveNext();
            }

            bool IEnumerator.MoveNext()
            {
                return baseEnumerator.MoveNext();
            }

            public void Reset()
            {
                baseEnumerator.Reset();
            }

            void IEnumerator.Reset()
            {
                baseEnumerator.Reset();
            }
        }

        /// <summary>
        /// Sorts the collection in ascending order, so the fittest genome ends up last.
        /// </summary>
        internal void Sort()
        {
            // GeneticAlgorithm.FindOptima calls this method every generation, so it must not throw.
            // Assumption: HtmlAttribute (not shown in this listing) implements IComparable and
            // compares by Fitness, as HtmlTag.CompareTo does.
            InnerList.Sort();
        }
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%end class %%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%% Counter %%%%%%%%%%%%%%%%
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace stegeneticweb
{
    /// <summary>
    /// Counter is just a high-resolution stopwatch for timing operations.
    /// Slightly modified code created by the legendary Eric Gunnerson of Microsoft.
    /// </summary>
    public class Counter
    {
        long elapsedCount = 0;
        long startCount = 0;
        long lastLapCount = 0;

        public void Start()
        {
            startCount = 0;
            QueryPerformanceCounter(ref startCount);
        }

        public void Stop()
        {
            long stopCount = 0;
            QueryPerformanceCounter(ref stopCount);
            elapsedCount += (stopCount - startCount);
        }

        public void Clear()
        {
            elapsedCount = 0;
        }

        public double Seconds
        {
            get
            {
                long freq = 0;
                QueryPerformanceFrequency(ref freq);
                return ((double)elapsedCount / (double)freq);
            }
        }

        public override string ToString()
        {
            return String.Format("{0} seconds", Seconds);
        }

        public double Lap
        {
            get
            {
                long freq = 0;
                long elapsed = lastLapCount;
                QueryPerformanceFrequency(ref freq);
                QueryPerformanceCounter(ref lastLapCount);
                return ((double)(lastLapCount - elapsed) / (double)freq);
            }
        }

        public static long Frequency
        {
            get
            {
                long freq = 0;
                QueryPerformanceFrequency(ref freq);
                return freq;
            }
        }

        public static long Value
        {
            get
            {
                long count = 0;
                QueryPerformanceCounter(ref count);
                return count;
            }
        }

        [System.Runtime.InteropServices.DllImport("KERNEL32")]
        private static extern bool QueryPerformanceCounter(ref long lpPerformanceCount);

        [System.Runtime.InteropServices.DllImport("KERNEL32")]
        private static extern bool QueryPerformanceFrequency(ref long lpFrequency);
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%end class %%%%%%%%%%%%%%%%%%
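The Counter class calls into KERNEL32 directly, which ties it to Windows. As a hedged alternative, the same Start/Stop/Clear/Seconds surface can be sketched on top of the framework's `System.Diagnostics.Stopwatch`, which uses a high-resolution performance counter when one is available. The class name `StopwatchCounter` is hypothetical and is not part of the thesis code.

```csharp
using System;
using System.Diagnostics;

// Hypothetical replacement for Counter with the same Start/Stop/Clear/Seconds members,
// built on Stopwatch instead of P/Invoke calls into KERNEL32.
public class StopwatchCounter
{
    private readonly Stopwatch watch = new Stopwatch();

    // Starts (or resumes) timing; elapsed time accumulates across Start/Stop pairs.
    public void Start() { watch.Start(); }

    // Pauses timing without discarding the accumulated time.
    public void Stop() { watch.Stop(); }

    // Stops timing and resets the accumulated time to zero.
    public void Clear() { watch.Reset(); }

    // Accumulated time across all Start/Stop intervals, in seconds.
    public double Seconds { get { return watch.Elapsed.TotalSeconds; } }

    public override string ToString() { return String.Format("{0} seconds", Seconds); }
}
```

In FindOptima, `new Counter()` could then be swapped for `new StopwatchCounter()` inside the `#if DEBUG` blocks without changing the surrounding calls.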

%%%%%%%%%%%%%%%%%%%%%%%%%%%% ICrossover %%%%%%%%%%%%%%%%%%
using System;

namespace stegeneticweb
{
    /// <summary>
    /// Classes that implement this interface provide the logic to crossover (recombine) two target Genomes
    /// or Genome derived classes.
    /// </summary>
    public interface ICrossover
    {
        // HtmlAttribute Crossover(HtmlAttribute name, HtmlAttribute value);
        // double CrossoverProbability { get; set; }
        // double MutationProbability { get; set; }

        HtmlTag Crossover(HtmlTag AttributeName, HtmlTag AttributeValue, HtmlTag Space);

        double CrossoverProbability { get; set; }

        double MutationProbability { get; set; }

        HtmlAttribute Crossover(HtmlAttribute name, HtmlAttribute mate);
    }
}
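The listing does not include an implementation of ICrossover. To show how CrossoverProbability and MutationProbability are typically used, here is a minimal sketch of single-point crossover with bit-flip mutation on plain bit strings. The class `SinglePointCrossoverSketch`, the string representation and the default probabilities are assumptions for illustration; the thesis operators work on HtmlAttribute and HtmlTag objects instead.

```csharp
using System;

// Hypothetical example: recombines two equal-length bit strings such as "0101".
public class SinglePointCrossoverSketch
{
    private readonly Random rng;

    public double CrossoverProbability { get; set; }
    public double MutationProbability { get; set; }

    public SinglePointCrossoverSketch(int seed)
    {
        rng = new Random(seed);
        CrossoverProbability = 0.8;   // assumed default
        MutationProbability = 0.01;   // assumed default
    }

    public string Crossover(string genome, string mate)
    {
        if (genome.Length != mate.Length)
            throw new ArgumentException("Genomes must have the same length.");

        char[] child = genome.ToCharArray();

        // With probability CrossoverProbability, keep the head of the genome
        // and take the tail (from a random cut point onwards) from the mate.
        if (rng.NextDouble() < CrossoverProbability)
        {
            int cut = rng.Next(child.Length + 1);
            for (int i = cut; i < child.Length; i++)
                child[i] = mate[i];
        }

        // Flip each bit independently with probability MutationProbability.
        for (int i = 0; i < child.Length; i++)
        {
            if (rng.NextDouble() < MutationProbability)
                child[i] = (child[i] == '0') ? '1' : '0';
        }

        return new string(child);
    }
}
```

Setting both probabilities to 0 returns the first genome unchanged, which is a convenient way to check that the rest of a search loop behaves as expected before enabling recombination.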

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%% HtmlTag %%%%%%%%%%%%%%%%%%

#region Using directives
using System;
using System.Collections.Specialized;
using System.Text;
using AForge.Genetic;
using GeneticAlgorithms;
#endregion

namespace stegeneticweb
{
    public class HtmlTag : IComparable
    {
        protected GeneticAlgorithm Parent;

        protected HtmlTag(GeneticAlgorithm parent)
        {
            Parent = parent;
        }

        // Parser state while scanning the attribute part of a tag.
        private enum PositionInTag
        {
            AttributeName,
            AttributeValue,
            Space
        }

        public int beginPosition;
        public int endPosition;
        private String name;

        public int BeginPosition
        {
            get { return beginPosition; }
            set { beginPosition = value; }
        }

        public int EndPosition
        {
            get { return endPosition; }
            set { endPosition = value; }
        }

        public String Name
        {
            get { return name; }
        }

        private HtmlAttributeCollection attributes;

        public HtmlAttributeCollection Attributes
        {
            get { return attributes; }
        }

        public HtmlTag(String text, int beginPosition, int endPosition)
        {
            this.beginPosition = beginPosition;
            this.endPosition = endPosition;
            this.attributes = new HtmlAttributeCollection();

            // Separate the tag name from the attributes.
            int index = text.IndexOf(' ');
            if (index < 0)
            {
                // This is a tag without any attributes.
                name = text.Substring(1, text.Length - 1);
            }
            else
            {
                name = text.Substring(1, index - 1);
            }

            if (index > 0)
            {
                text = text.Substring(index);

                // Find and list all attributes in this tag.
                PositionInTag status = PositionInTag.Space;
                int startIndex = 0;
                String attributeName;
                String attributeValue;
                char attributeValueQuotation = '\'';
                HtmlAttribute attribute = null;

                for (int n = 1; n < text.Length; n++)
                {
                    if ((status == PositionInTag.Space) && ((text[n] == '\'') || (text[n] == '\"')))
                    {
                        // Begin value.
                        startIndex = n;
                        attributeValueQuotation = text[n];
                        status = PositionInTag.AttributeValue;
                    }
                    else if ((status == PositionInTag.AttributeValue) && (text[n] == attributeValueQuotation))
                    {
                        // End value (the stored value includes its quotation marks).
                        if (attribute != null)
                        {
                            attributeValue = text.Substring(startIndex, n + 1 - startIndex);
                            attribute.Value = attributeValue;
                            attribute = null;
                        }
                        status = PositionInTag.Space;
                    }
                    else if ((status == PositionInTag.Space) && (text[n] != ' '))
                    {
                        // Begin attribute.
                        status = PositionInTag.AttributeName;
                        startIndex = n;
                    }
                    else if ((status == PositionInTag.AttributeName)
                             && ((text[n] == '=') || Char.IsWhiteSpace(text[n]) || (n == text.Length - 1)))
                    {
                        // End name.
                        if (n == text.Length - 1)
                        {
                            // Correct string cursor position.
                            // This is the last character of the tag.
                            // The last attribute does not have a value.
                            n++;
                        }
                        attributeName = text.Substring(startIndex, n - startIndex);
                        attribute = new HtmlAttribute(attributeName);
                        attributes.Add(attribute);
                        status = PositionInTag.Space;
                    }
                    else if ((status != PositionInTag.AttributeValue) && (text[n] == ' '))
                    {
                        status = PositionInTag.Space;
                    }
                }
            }
        }

        private double _fitness = double.MinValue;
        double x = 0;
        double y = 0;
        // double n = 0;
        private double MaxValue = double.MaxValue;

        public double Fitness
        {
            get
            {
                // Candidate fitness expressions kept from the original listing; the getter
                // computed the second one into an unused local and returned the stored value:
                // Math.Pow(15 * x * y * (1 - x) * (1 - y) * Math.Sin(n * Math.PI * x) * Math.Sin(n * Math.PI * y), 2)
                // MaxValue / y - (x + 8)
                return _fitness;
            }
            set { _fitness = value; }
        }

        public int CompareTo(object obj)
        {
            // Compare against another HtmlTag by fitness, in ascending order.
            HtmlTag compared = (HtmlTag)obj;
            if (this.Fitness < compared.Fitness)
                return -1;
            else if (this.Fitness > compared.Fitness)
                return 1;
            else
                return 0;
        }
    }
}
%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%% HtmlTagCollection%%%%%%%%%%%%%%%%%% using System; using System.Collections; using AForge.Genetic; using GeneticAlgorithms; namespace stegeneticweb { /// <summary> /// <para> /// A collection that stores <see cref='.HtmlTag'/> objects. /// </para> /// </summary> /// <seealso cref='.HtmlTagCollection'/> [Serializable()] public class HtmlTagCollection : System.Collections.CollectionBase { /// <summary> /// <para> /// Initializes a new instance of <see cref='.HtmlTagCollection'/>. /// </para> /// </summary> public HtmlTagCollection() { } /// <summary> /// <para> /// Initializes a new instance of <see cref='.HtmlTagCollection'/> based on another <see cref='.HtmlTagCollection'/>. /// </para> /// </summary> /// <param name='value'> /// A <see cref='.HtmlTagCollection'/> from which the contents are copied /// </param> public HtmlTagCollection(HtmlTagCollection val) { this.AddRange(val); } /// <summary> /// <para> /// Initializes a new instance of <see cref='.HtmlTagCollection'/> containing any array of <see cref='.HtmlTag'/> objects. /// </para> /// </summary> /// <param name='value'> /// A array of <see cref='.HtmlTag'/> objects with which to intialize the collection /// </param> public HtmlTagCollection(HtmlTag[] val) { this.AddRange(val); } /// <summary> /// <para>Represents the entry at the specified index of the <see cref='.HtmlTag'/>.</para>

133

/// </summary> /// <param name='index'><para>The zero-based index of the entry to locate in the collection.</para></param> /// <value> /// <para> The entry at the specified index of the collection.</para> /// </value> /// <exception cref='System.ArgumentOutOfRangeException'><paramref name='index'/> is outside the valid range of indexes for the collection.</exception> public HtmlTag this[int index] { get { return ((HtmlTag)(List[index])); } set { List[index] = value; } } /// <summary> /// <para>Adds a <see cref='.HtmlTag'/> with the specified value to the /// <see cref='.HtmlTagCollection'/> .</para> /// </summary> /// <param name='value'>The <see cref='.HtmlTag'/> to add.</param> /// <returns> /// <para>The index at which the new element was inserted.</para> /// </returns> /// <seealso cref='.HtmlTagCollection.AddRange'/> public int Add(HtmlTag val) { return List.Add(val); } /// <summary> /// <para>Copies the elements of an array to the end of the <see cref='.HtmlTagCollection'/>.</para> /// </summary> /// <param name='value'> /// An array of type <see cref='.HtmlTag'/> containing the objects to add to the collection. /// </param> /// <returns> /// <para>None.</para> /// </returns> /// <seealso cref='.HtmlTagCollection.Add'/> public void AddRange(HtmlTag[] val) { for (int i = 0; i < val.Length; i++) { this.Add(val[i]); } } /// <summary> /// <para> /// Adds the contents of another <see cref='.HtmlTagCollection'/> to the end of the collection. /// </para> /// </summary> /// <param name='value'> /// A <see cref='.HtmlTagCollection'/> containing the objects to add to the collection. /// </param> /// <returns> /// <para>None.</para>

134

/// </returns> /// <seealso cref='.HtmlTagCollection.Add'/> public void AddRange(HtmlTagCollection val) { for (int i = 0; i < val.Count; i++) { this.Add(val[i]); } } /// <summary> /// <para>Gets a value indicating whether the /// <see cref='.HtmlTagCollection'/> contains the specified <see cref='.HtmlTag'/>.</para> /// </summary> /// <param name='value'>The <see cref='.HtmlTag'/> to locate.</param> /// <returns> /// <para><see langword='true'/> if the <see cref='.HtmlTag'/> is contained in the collection; /// otherwise, <see langword='false'/>.</para> /// </returns> /// <seealso cref='.HtmlTagCollection.IndexOf'/> public bool Contains(HtmlTag val) { return List.Contains(val); } /// <summary> /// <para>Copies the <see cref='.HtmlTagCollection'/> values to a one-dimensional <see cref='System.Array'/> instance at the /// specified index.</para> /// </summary> /// <param name='array'><para>The one-dimensional <see cref='System.Array'/> that is the destination of the values copied from <see cref='.HtmlTagCollection'/> .</para></param> /// <param name='index'>The index in <paramref name='array'/> where copying begins.</param> /// <returns> /// <para>None.</para> /// </returns> /// <exception cref='System.ArgumentException'><para><paramref name='array'/> is multidimensional.</para> <para>-or-</para> <para>The number of elements in the <see cref='.HtmlTagCollection'/> is greater than the available space between <paramref name='arrayIndex'/> and the end of <paramref name='array'/>.</para></exception> /// <exception cref='System.ArgumentNullException'><paramref name='array'/> is <see langword='null'/>. </exception> /// <exception cref='System.ArgumentOutOfRangeException'><paramref name='arrayIndex'/> is less than <paramref name='array'/>'s lowbound. 
</exception> /// <seealso cref='System.Array'/> public void CopyTo(HtmlTag[] array, int index) { List.CopyTo(array, index); } /// <summary> /// <para>Returns the index of a <see cref='.HtmlTag'/> in /// the <see cref='.HtmlTagCollection'/> .</para> /// </summary> /// <param name='value'>The <see cref='.HtmlTag'/> to locate.</param> /// <returns> /// <para>The index of the <see cref='.HtmlTag'/> of <paramref name='value'/> in the /// <see cref='.HtmlTagCollection'/>, if found; otherwise, -1.</para> /// </returns>

135

/// <seealso cref='.HtmlTagCollection.Contains'/> public int IndexOf(HtmlTag val) { return List.IndexOf(val); } /// <summary> /// <para>Inserts a <see cref='.HtmlTag'/> into the <see cref='.HtmlTagCollection'/> at the specified index.</para> /// </summary> /// <param name='index'>The zero-based index where <paramref name='value'/> should be inserted.</param> /// <param name=' value'>The <see cref='.HtmlTag'/> to insert.</param> /// <returns><para>None.</para></returns> /// <seealso cref='.HtmlTagCollection.Add'/> public void Insert(int index, HtmlTag val) { List.Insert(index, val); } /// <summary> /// <para>Returns an enumerator that can iterate through /// the <see cref='.HtmlTagCollection'/> .</para> /// </summary> /// <returns><para>None.</para></returns> /// <seealso cref='System.Collections.IEnumerator'/> public new HtmlTagEnumerator GetEnumerator() { return new HtmlTagEnumerator(this); } /// <summary> /// <para> Removes a specific <see cref='.HtmlTag'/> from the /// <see cref='.HtmlTagCollection'/> .</para> /// </summary> /// <param name='value'>The <see cref='.HtmlTag'/> to remove from the <see cref='.HtmlTagCollection'/> .</param> /// <returns><para>None.</para></returns> /// <exception cref='System.ArgumentException'><paramref name='value'/> is not found in the Collection. </exception> public void Remove(HtmlTag val) { List.Remove(val); } public class HtmlTagEnumerator : IEnumerator { IEnumerator baseEnumerator; IEnumerable temp; public HtmlTagEnumerator(HtmlTagCollection mappings) { this.temp = ((IEnumerable)(mappings)); this.baseEnumerator = temp.GetEnumerator(); } public HtmlTag Current { get { return ((HtmlTag)(baseEnumerator.Current)); } } object IEnumerator.Current { get { return baseEnumerator.Current; } }

            public bool MoveNext()
            {
                return baseEnumerator.MoveNext();
            }

            bool IEnumerator.MoveNext()
            {
                return baseEnumerator.MoveNext();
            }

            public void Reset()
            {
                baseEnumerator.Reset();
            }

            void IEnumerator.Reset()
            {
                baseEnumerator.Reset();
            }
        }
    }
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end class %%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%% create attribute %%%%%%%%%%%%%%%%%%

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <keyCombinations>
    <firstAttribute>width</firstAttribute>
    <secondAttribute>height</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>src</firstAttribute>
    <secondAttribute>alt</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>style</firstAttribute>
    <secondAttribute>class</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>cellspacing</firstAttribute>
    <secondAttribute>cellpadding</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>background</firstAttribute>
    <secondAttribute>valign</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>face</firstAttribute>
    <secondAttribute>size</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>name</firstAttribute>
    <secondAttribute>content</secondAttribute>
  </keyCombinations>
  <keyCombinations>
    <firstAttribute>colspan</firstAttribute>
    <secondAttribute>bgcolor</secondAttribute>
  </keyCombinations>
</NewDataSet>

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end create attribute %%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%% create database%%%%%%%%%%%%%%%%%%

<?xml version="1.0" standalone="yes"?>
<file>
  <file>
    <id>1</id>
    <name> fatima abdalla</name>
    <pasword>ffi1</pasword>
  </file>
  <file>
    <id> 2</id>
    <name> arig amir</name>
    <pasword>arig2arig</pasword>
  </file>
  <file>
    <id>4</id>
    <name>mahammad ali</name>
    <pasword>mahammed</pasword>
  </file>
  <file>
    <id>5</id>
    <name>abdalla omar</name>
    <pasword>abdabd2013</pasword>
  </file>
  <file>
    <id>6</id>
    <name>nora bshra</name>
    <pasword>nononono</pasword>
  </file>
  <file>
    <id>41</id>
    <name>ali</name>
    <pasword>asdf</pasword>
  </file>
  <file>
    <id>12</id>
    <name>amira ahmad</name>
    <pasword>amamam</pasword>
  </file>
</file>

%%%%%%%%%%%%%%%%%%%%%%%%%%%% end create database %%%%%%%%%%%%%%%%%%
