
1. SYSTEM STUDY

SKNCOE, Department of Computer Engineering, 2010-2011


CHAPTER 1

SYSTEM STUDY

1.1. BACKGROUND

Intrusion detection has been extensively studied since the seminal report written by

Anderson (1980). Traditionally, intrusion detection techniques are divided into misuse

detection and anomaly detection. Misuse detection techniques mainly focus on developing

models of known attacks, which can be described by specific patterns or sequences of events

and data. Anomaly detection techniques model systems or users’ normal behaviours, and any

deviation from the normal behaviours is considered as an intrusion. Misuse detection

techniques have low False Detection Rates (FDR), but their major weakness is that novel or

unknown attacks will go unnoticed until corresponding signatures are added to the database of

the Intrusion Detection System (IDS). Anomaly detection techniques have the potential to

detect such attacks, but quite often they tend to have a high FDR because it is difficult to

discriminate between abnormal and intrusive behaviour.

We have studied the previous work of Koza (1992) and Wong and Leung (2000) on

detecting known or novel attacks on the network using Genetic Programming (GP). GP extends the

fundamental idea of the Genetic Algorithm (GA) and evolves more complex data structures. To

do so, it uses chromosome-like data structures whose fields describe every

packet and its attributes. The main concern is simply to check the “fitness level” of the

packet, so the packet is not modified in any way. To determine the fitness level, initial

rules are established that act as a background knowledge base of known attacks and are

easily represented as simple if (...) then (...) statements. As new attacks are made on the

system, rules are created from the packet’s origin and its attributes. This gave rise to

the Intelligent Intrusion Detection System (IIDS).



Fig 1:- Structure of a simple genetic algorithm (Pohlheim, 2001)

1.2. CLASSIFICATION

The system is divided into four subparts:

1. Packet Sniffer.

2. GA Module.

3. Policy Management.

4. Administrator notification (SMS Gateway).

1.2.1. Packet Sniffer:-

Jpcap is a Java library for capturing and sending network packets. Using Jpcap, you

can develop applications to capture packets from a network interface and visualize/analyze

them in Java. You can also develop Java applications to send arbitrary packets through a

network interface. Jpcap has been tested on Microsoft Windows (98/2000/XP/Vista), Linux

(Fedora, Mandriva, Ubuntu), Mac OS X (Darwin), FreeBSD, and Solaris. Jpcap can capture

Ethernet, IPv4, IPv6, ARP/RARP, TCP, UDP, and ICMPv4 packets. Jpcap is open source,

and is licensed under GNU LGPL.

The Jpcap distribution includes both:



A tool for real time network traffic capture and analysis.

An API for developing packet capture applications in Java.

The jpcap network capture tool performs real-time decomposition and visualization of

network traffic. Screenshots of the capture console and visualization component in action:

Fig. 2:- Screenshot of jpcap capture tool console capturing and converting network packets to

Java objects



1.2.2. GA MODULE:-

Genetic Algorithms were invented to mimic some of the processes observed in natural

evolution. Many people, biologists included, are astonished that life at the level of complexity

that we observe could have evolved in the relatively short time suggested by the fossil record.

The idea with GA is to use this power of evolution to solve optimization problems. John

Holland, the father of the original Genetic Algorithm, invented it in the early 1970s.

GAs simulate the survival of the fittest among individuals over consecutive

generations to solve a problem. Each generation consists of a population of character

strings that are analogous to the chromosomes that we see in our DNA. Each individual

represents a point in a search space and a possible solution. The individuals in the population

are then made to go through a process of evolution.

GAs are based on an analogy with the genetic structure and behaviour of

chromosomes within a population of individuals using the following foundations:

Individuals in a population compete for resources and mates.

Those individuals most successful in each 'competition' will produce more

offspring than those individuals that perform poorly.

Genes from 'good' individuals propagate throughout the population so that

two good parents will sometimes produce offspring that are better than either

parent.

Thus each successive generation will become more suited to its

environment.

1.2.3. POLICY MANAGEMENT:-

The term security policy refers to numerous aspects of information systems’ security

such as Network Security Policy (SP), Access control SP, Key management SP, etc. To unify

all these views, we define a security policy as “a set of rules that determine how a particular

set of assets should be secured”. Furthermore, the SP should be split into multiple

components as it should address all the security requirements of the enterprise. We found that

all of these policy components can be specified similarly even though they use different

security techniques. Abstracting away from its context, an SP representation should contain



the protected assets, the operations modeling their interaction, and the security properties that

must be followed. According to this view, the assets of the protected infrastructure are

categorized into sorts. For example, to establish a connection between two machines, we need

three sorts (i.e. host, port, protocol).

To assess fully how suspicious a connection is, we need to examine all fields related

to that network connection. For simplicity, we only consider some obvious attributes

for each connection. The definition of rules (for TCP/IP protocols) is shown in Table 1.

The corresponding rule for the “Example Value” attribute in Table 1 could be translated as:

if {the connection has following information: source IP address 209.11.??.??; destination IP

address:130.18.176+?.??; source port number: 42335; destination port number: 80;

connection time: 482 seconds; the connection is stopped by the originator; the protocol used

is TCP; the originator sent 7320 bytes of data; and the responder sent 38891 bytes of data }

then {stop the connection}



We can convert the above example into chromosome form, as shown in Fig 3 below.

Fig 3: Chromosome encoding

The actual validity of this rule will be examined by matching the historical data set

comprised of connections marked as either anomalous or normal. If the rule is able to find an

anomalous behavior, a bonus will be given to the current chromosome. If the rule matches a

normal connection, a penalty will be applied to the chromosome. Clearly no single rule can

be used to separate all anomalous connections from normal connections. The population

needs evolving to find the optimal rule set.
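The bonus and penalty scheme described above can be sketched as follows. This is only a minimal illustration: the weights `bonus` and `penalty`, the field names of a connection record, and the tiny `history` data set are all assumptions, since the report does not specify them. A rule is modelled here simply as a function that says whether a connection matches its "if" part.

```python
from typing import Callable, Dict, List, Tuple

Connection = Dict[str, object]        # one audited connection record
Rule = Callable[[Connection], bool]   # the "if" part of a rule


def fitness(rule: Rule,
            history: List[Tuple[Connection, bool]],
            bonus: float = 1.0,
            penalty: float = 1.0) -> float:
    """Score a rule against labelled history.

    history holds (connection, is_anomalous) pairs. The bonus and penalty
    weights are assumptions; the report does not give their values.
    """
    score = 0.0
    for connection, is_anomalous in history:
        if rule(connection):
            # A hit on an anomalous connection is rewarded,
            # a hit on a normal connection is punished.
            score += bonus if is_anomalous else -penalty
    return score


# Hypothetical history: two anomalous connections and one normal one.
history = [
    ({"dst_port": 80, "protocol": "tcp", "duration": 482}, True),
    ({"dst_port": 80, "protocol": "tcp", "duration": 3}, False),
    ({"dst_port": 23, "protocol": "tcp", "duration": 900}, True),
]

long_http = lambda c: c["dst_port"] == 80 and c["duration"] > 60
any_tcp = lambda c: c["protocol"] == "tcp"

print(fitness(long_http, history))  # 1.0: one anomalous match, no normal match
print(fitness(any_tcp, history))    # 1.0: two bonuses minus one penalty
```

The broad rule `any_tcp` scores no better than the narrow one because its false match on the normal connection cancels one of its bonuses, which is the pressure that steers the population away from rules that match everything.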

In the example shown in Table 1, some wild cards (the ‘*’ character and the ‘?’

character) are used and the corresponding genes within the chromosome are shown as –1.

These wild cards are used to represent an appropriate range of specific values (Crosbie and

Spafford, 1995). They are useful for representing a network block (a range of IP addresses or

port numbers) in a rule. Once the spatial information is included in the rules, the capability of

the IDS can be greatly improved as an intrusion may initiate from many different locations.

The inclusion of the duration time of a network connection in the chromosome ensures

incorporation of temporal information for network connections. The maximum value of

duration time is 99999999 seconds, which is more than three years. This is helpful for identifying

intrusions because complex intrusions may span hours, days, or even months.
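To make the wildcard encoding concrete, here is a minimal sketch of a chromosome in which the gene value -1 stands for a wildcard. The gene order, the one-gene-per-octet layout, and the names `FIELDS`, `encode`, `flatten` and `matches` are our own assumptions for illustration and are not the layout of Table 1. The sketch handles only whole-field wildcards, not partial ones such as "176+?".

```python
from typing import Dict, List

WILDCARD = -1  # gene value standing in for '*' or '?'

# Hypothetical gene order (an assumption): one gene per IP octet, then
# ports, duration in seconds, IP protocol number and byte counts.
FIELDS: List[str] = (
    [f"src_ip_{i}" for i in range(4)]
    + [f"dst_ip_{i}" for i in range(4)]
    + ["src_port", "dst_port", "duration", "protocol",
       "bytes_from_originator", "bytes_from_responder"]
)


def encode(rule: Dict[str, int]) -> List[int]:
    """Build a fixed-length chromosome; fields the rule omits become wildcards."""
    return [rule.get(name, WILDCARD) for name in FIELDS]


def flatten(src: str, dst: str, **rest: int) -> Dict[str, int]:
    """Split dotted IPv4 addresses into per-octet fields of a connection record."""
    record = dict(rest)
    for prefix, address in (("src_ip", src), ("dst_ip", dst)):
        for i, octet in enumerate(address.split(".")):
            record[f"{prefix}_{i}"] = int(octet)
    return record


def matches(chromosome: List[int], connection: Dict[str, int]) -> bool:
    """A wildcard gene matches anything; any other gene must equal the field."""
    return all(gene == WILDCARD or connection.get(name) == gene
               for name, gene in zip(FIELDS, chromosome))


# Rule for the network block 209.11.*.* talking TCP (protocol 6) to port 80.
rule = encode({"src_ip_0": 209, "src_ip_1": 11, "dst_port": 80, "protocol": 6})
inside = flatten("209.11.5.7", "130.18.176.3", src_port=42335, dst_port=80,
                 duration=482, protocol=6)
outside = flatten("10.0.0.1", "130.18.176.3", src_port=42335, dst_port=80,
                  duration=482, protocol=6)
print(rule)
print(matches(rule, inside), matches(rule, outside))
```

Because the two trailing source octets are wildcards, a single chromosome covers the whole 209.11.x.x block, which is how spatial information enters the rule.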

The genetic algorithm starts with a population that has randomly selected rules. The

population can evolve by using the crossover and mutation operators. Due to the

effectiveness of the evaluation function, the succeeding populations are biased toward rules

that match intrusive connections. When the algorithm stops, rules are selected and

added into the IDS rule base.

Traditional policy-based management systems rely on static authoring of

“if [condition] then [action]” rules, which cannot adapt once conditions change.

Utility functions, goal policies, data mining, and reinforcement learning have

emerged as new approaches, but these approaches fail to respond to new and ambiguous

situations in a systematic manner. Our contribution is in the form of providing solutions



to these problems using a genetic algorithm based policy framework. This policy system

has the four basic components described in the IETF policy framework:

the policy repository, policy management tool, policy decision point, and policy

enforcement point.

Features of the project

Auto Rule Base Generation

Auto restore

Negligible Administrator Presence required

More Security

Less resources required

1.2.4. Administrator notification (SMS Gateway):-

An SMS gateway is a way of sending a text message with or without using a mobile

(cell) phone. Specifically, it is a device or service offering SMS transit by either transforming

messages to mobile network traffic from other media or by allowing transmission or receipt

of SMS messages with or without the use of a mobile phone. Typical use of a gateway would

be to forward simple email to a mobile phone recipient. It can also be useful in developing

web applications that we can interact with via SMS (Short Messaging Service).

Figure 4:- An SMS text messaging application connects to SMSCs through an SMS gateway.




1.3 SYSTEM OVERVIEW

The importance of network security increases with the number and variety of

attacks, which makes it a natural application for a genetic algorithm.

Here the genetic algorithm’s basic goals are to maintain the system’s integrity and security. There are

compromises to be made with respect to space and time complexity. When mutation and

crossover are applied to the incoming packets, the offspring (new packets generated as a result of

crossover and mutation) must be constrained so that they do not overload

the system itself. Hence specific threshold values are provided for this purpose.

A GA maintains a population of individuals within the search space, each

representing a possible solution to a given problem. Each individual is coded as a finite

length vector of components, or variables, in terms of some alphabet, usually

the binary alphabet {0,1}. To continue the genetic analogy these individuals are likened to

chromosomes and the variables are analogous to genes. Thus a chromosome (solution) is

composed of several genes (variables). A fitness score is assigned to each solution

representing the ability of an individual to 'compete'. The individual with the optimal (or

generally near optimal) fitness score is sought. The GA aims to use selective 'breeding' of the

solutions to produce 'offspring' better than the parents by combining information from the

chromosomes.

Figure 5 :- Chromosome Generation

The GA maintains a population of n chromosomes (solutions) with associated fitness

values. Parents are selected to mate, on the basis of their fitness, producing offspring via a

reproductive plan. Consequently highly fit solutions are given more opportunities to

reproduce, so that offspring inherit characteristics from each parent. As parents mate and

produce offspring, room must be made for the new arrivals since the population is kept at a

static size. Individuals in the population die and are replaced by the new solutions, eventually


creating a new generation once all mating opportunities in the old population have been

exhausted. In this way it is hoped that over successive generations better solutions will thrive

while the least fit solutions die out.

New generations of solutions are produced containing, on average, more good genes

than a typical solution in a previous generation. Each successive generation will contain more

good 'partial solutions' than previous generations. Eventually, once the population has

converged and is not producing offspring noticeably different from those in previous

generations, the algorithm itself is said to have converged to a set of solutions to the problem

at hand.
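The fixed-size population and replacement scheme described above can be sketched as a short loop. This is an illustrative toy, not the project's implementation: the tournament-of-two parent choice, the number of offspring per generation, and the bit-counting objective used in the usage lines are all assumptions chosen to keep the example small.

```python
import random
from typing import Callable, List

Chromosome = List[int]


def evolve(population: List[Chromosome],
           fitness: Callable[[Chromosome], float],
           offspring_per_generation: int,
           generations: int,
           rng: random.Random) -> List[Chromosome]:
    """Evolve a fixed-size population; each generation the least fit are replaced."""
    size = len(population)
    assert 0 < offspring_per_generation < size

    for _ in range(generations):
        def pick() -> Chromosome:
            # Fitter individuals get more chances to reproduce
            # (a tournament of two, which is an assumption of this sketch).
            a, b = rng.choice(population), rng.choice(population)
            return a if fitness(a) >= fitness(b) else b

        children = []
        for _ in range(offspring_per_generation):
            mother, father = pick(), pick()
            cut = rng.randrange(1, len(mother))  # genes inherited from each parent
            children.append(mother[:cut] + father[cut:])

        # Make room for the new arrivals: the least fit die out and the
        # population size stays constant.
        survivors = sorted(population, key=fitness, reverse=True)
        population = survivors[:size - offspring_per_generation] + children
    return population


def has_converged(population: List[Chromosome]) -> bool:
    """Treat the population as converged once every chromosome is identical."""
    return len({tuple(c) for c in population}) == 1


# Toy objective (an assumption): the more 1-genes, the fitter the chromosome.
rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(10)]
evolved = evolve(population, fitness=sum, offspring_per_generation=4,
                 generations=20, rng=rng)
print(len(evolved), max(sum(c) for c in evolved), has_converged(evolved))
```

Because only the least fit individuals are displaced, the best chromosome of one generation always survives into the next, so the best fitness in this sketch never decreases.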

1.4 SYSTEM BEHAVIOR

When packets pass through the packet sniffer module, the sniffer copies their

attributes and passes them to the GA operations unit, where they are compared

with the gene pool to identify the most FIT packet (here the most fit packet is the one

most fit for harming the system). The only ambiguous case arises

when an unidentified packet from a new source enters the system: the system cannot

decide whether to treat it as FIT or UNFIT.

This is resolved by creating a RESTORE POINT and a

FEEDBACK CONSOLE. These units behave like the restore point on a personal

computer. When a packet of unknown origin and attributes enters, the system allows

it to perform its task. If the feedback console later discovers that the system has

behaved abnormally, the system is restored to its previous state. Simultaneously,

the packet that was responsible for this is added into the gene pool as an unfit packet and

further such packets are blocked forever.
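The restore point and feedback console behaviour can be pictured with the following minimal sketch. Everything in it is hypothetical: the class name `RestoreGuard`, the packet signature of (source address, destination port), and the dictionary standing in for system state are assumptions made only to show the control flow. It also tracks just one unknown packet at a time.

```python
import copy
from typing import Dict, Optional, Set, Tuple

Signature = Tuple[str, int]  # hypothetical: (source address, destination port)


class RestoreGuard:
    """Admit unknown packets behind a restore point and learn from feedback."""

    def __init__(self, state: Dict[str, str]) -> None:
        self.state = state
        self.known: Set[Signature] = set()   # packets that proved harmless
        self.unfit: Set[Signature] = set()   # gene pool of blocked packets
        self._restore_point: Optional[Dict[str, str]] = None
        self._suspect: Optional[Signature] = None

    def admit(self, packet: Signature) -> bool:
        if packet in self.unfit:
            return False                      # blocked forever
        if packet not in self.known:
            # Unknown origin: snapshot the system first, then let it through.
            self._restore_point = copy.deepcopy(self.state)
            self._suspect = packet
        return True

    def feedback(self, abnormal: bool) -> None:
        """Called by the feedback console after observing the system."""
        if self._suspect is None:
            return
        if abnormal:
            self.state = self._restore_point  # roll back to the snapshot
            self.unfit.add(self._suspect)
        else:
            self.known.add(self._suspect)
        self._suspect = None


guard = RestoreGuard({"firewall": "default"})
packet = ("203.0.113.9", 8080)
first = guard.admit(packet)            # unknown, so it is allowed through
guard.state["firewall"] = "disabled"   # the packet damages the system
guard.feedback(abnormal=True)          # console reports abnormal behaviour
second = guard.admit(packet)           # the same packet is now refused
print(first, guard.state, second)
```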

1.4.1 Basic Implementation Details

Based on Natural Selection

After an initial population is randomly generated, the algorithm evolves the population through three

operators:

1. selection which equates to survival of the fittest;

2. crossover which represents mating between individuals;



3. mutation which introduces random modifications.

1. Selection Operator

Key idea: give preference to better individuals, allowing them to pass on their

genes to the next generation.

The goodness of each individual depends on its fitness.

Fitness may be determined by an objective function or by a subjective

judgement.
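One common way to give preference to better individuals is fitness-proportionate (roulette-wheel) selection. The report does not say which selection scheme is used, so the sketch below is an assumption; it also assumes fitness values are non-negative with at least one positive value. The bit-counting fitness is a toy objective.

```python
import random
from typing import Callable, List, Sequence


def select(population: Sequence[str],
           fitness: Callable[[str], float],
           k: int,
           rng: random.Random) -> List[str]:
    """Draw k parents, each with probability proportional to its fitness."""
    weights = [fitness(individual) for individual in population]
    return rng.choices(population, weights=weights, k=k)


count_ones = lambda bits: bits.count("1")  # toy objective function

rng = random.Random(1)
parents = select(["000000", "000000", "111111"], count_ones, k=4, rng=rng)
print(parents)  # ['111111', '111111', '111111', '111111']
```

Individuals with zero fitness are never chosen here, while in a mixed population weaker individuals are still picked occasionally, which keeps some diversity among the parents.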

2. Crossover Operator

The prime factor distinguishing GA from other optimization techniques

Two individuals are chosen from the population using the selection operator

A crossover site along the bit strings is randomly chosen

The values of the two strings are exchanged up to this point

If S1=000000 and S2=111111 and the crossover point is 2, then S1'=110000

and S2'=001111

The two new offspring created from this mating are put into the next

generation of the population

By recombining portions of good individuals, this process is likely to create

even better individuals

Figure 6:- Chromosome Crossover
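The S1/S2 example above can be reproduced with a few lines of code. This is a minimal sketch of single-point crossover on bit strings; the function names `crossover` and `random_crossover` are our own.

```python
import random
from typing import Tuple


def crossover(s1: str, s2: str, point: int) -> Tuple[str, str]:
    """Exchange the first `point` bits of two equal-length bit strings."""
    if len(s1) != len(s2):
        raise ValueError("parents must have the same length")
    return s2[:point] + s1[point:], s1[:point] + s2[point:]


def random_crossover(s1: str, s2: str, rng: random.Random) -> Tuple[str, str]:
    """Pick the crossover site at random, strictly inside the string."""
    return crossover(s1, s2, rng.randrange(1, len(s1)))


print(crossover("000000", "111111", 2))  # ('110000', '001111')
```

Crossover only rearranges bits that already exist in the parents: the total number of 1-bits across the two offspring equals that of the two parents, so new gene values can only come from mutation.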

3. Mutation Operator

With some low probability, a portion of the new individuals will have some of

their bits flipped.

Its purpose is to maintain diversity within the population and inhibit premature

convergence.



Mutation alone induces a random walk through the search space

Mutation and selection (without crossover) create a parallel, noise-tolerant, hill-climbing algorithm

Figure 7:- Chromosome Mutation
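Bit-flip mutation as described above can be sketched in a few lines. The per-bit mutation probability `rate` is an assumption, since the report only says it is low.

```python
import random


def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in bits
    )


rng = random.Random(7)
print(mutate("000000", 0.05, rng))  # usually unchanged at such a low rate
```

With `rate` at 0 the string is returned unchanged, and with `rate` at 1 every bit is flipped; useful values lie close to 0 so that mutation maintains diversity without destroying good individuals.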

1.4.2 Effects of Genetic Operators

Using selection alone will tend to fill the population with copies of the best

individual from the population

Using selection and crossover operators will tend to cause the algorithm to

converge on a good but sub-optimal solution

Using mutation alone induces a random walk through the search space.

Using selection and mutation creates a parallel, noise-tolerant, hill-climbing

algorithm

1.5. FEASIBILITY STUDY

The feasibility study comprises an initial investigation into what personnel will be

required. A feasibility study helps in making informed and transparent decisions at crucial

points during the development process.

1.5.1. Market Feasibility:-

To date, similar systems have provided the same services at a larger scale but had no restore

point, which made them more vulnerable. We instead propose a smaller-scale but

reliable system.

1.5.2. Resource Feasibility:-

We can strongly say that it is technically feasible, since there will not be much

difficulty in getting required resources for the development and maintaining the system as

well. All resources needed for the development of the software as well as its maintenance

are available. Here we are utilizing resources that are already available.



1.5.3 Legal Feasibility:-

The software used does not violate any privacy act. The packet sniffer extracts only

the attributes of the packet and not the content. Hence no legal issues are created. Also the

final authority is given to the system, which helps it avoid a possible cause of failure, as

whatever result is produced is that of authorised personnel.

1.5.4 Economic Feasibility:-

Economic feasibility is an evaluation of development cost against the ultimate income or benefits derived from

the developed system. Economic justification includes the costs and benefits for which the

project is to be developed and implemented. Development of this application is highly

economically feasible. We need not spend much money for the accomplishment of the project

since the resources needed for the development of the system are already available. The only

thing to be done is making an environment for the development with an effective supervision.

If we are doing so, we can attain the maximum usability of the corresponding resources.

Therefore the system is economically feasible.

1.5.5 Operational Feasibility:-

The system is fully automated, so it does not require constant monitoring. The system keeps

track of the number of packets reaching the registered users, and a log of incoming packets is

maintained. So, the system is operationally feasible. It will even remind the user about the end of its

validity.

1.6 SYSTEM REQUIREMENTS:-

SOFTWARE REQUIREMENTS:

Windows XP/Vista/Windows 7

IDE: Eclipse

Language: Java

JPCAP as Packet sniffer

HARDWARE REQUIREMENTS:

Dual Core 2.0 GHz or above

1 GB RAM or above and 20 GB hard disk space

LAN Card

Multiple nodes in same network



2. LITERATURE SURVEY



CHAPTER 2

LITERATURE SURVEY

2.1. NETWORK SECURITY:-

In the field of networking, the area of network security consists of the provisions and

policies adopted by the network administrator to prevent and monitor unauthorized access,

misuse, modification, or denial of the computer network and network-accessible resources.

Network Security is the authorization of access to data in a network, which is controlled by

the network administrator. Users are assigned an ID and password that allows them access to

information and programs within their authority. Network security covers a variety of

computer networks, both public and private that are used in everyday jobs conducting

transactions and communications among businesses, government agencies and individuals.

Networks can be private, such as within a company, and others which might be open to

public access. Network security is involved in organizations, enterprises, and all other types of

institutions. It does as its title explains: it secures the network. Network security starts from

authenticating the user, commonly with a username and a password. Since this requires just

one thing besides the user name, i.e. the password which is something you 'know', this is

sometimes termed one factor authentication. With two factor authentication something you

'have' is also used (e.g. a security token or 'dongle', an ATM card, or your mobile phone), or

with three factor authentication something you 'are' is also used (e.g. a fingerprint or retinal

scan). Once authenticated, a firewall enforces access policies such as what services are

allowed to be accessed by the network users. Though effective to prevent unauthorized

access, this component may fail to check potentially harmful content such as computer

worms or Trojans being transmitted over the network. Anti-virus software or an intrusion

prevention system (IPS) help detect and inhibit the action of such malware. An anomaly-based

intrusion detection system may also monitor the network and traffic for unexpected (i.e.

suspicious) content or behavior and other anomalies to protect resources, e.g. from denial of

service attacks or an employee accessing files at strange times. Individual events occurring on

the network may be logged for audit purposes and for later high level analysis.



Features of IDS (INTRUSION DETECTION SYSTEM):-

An intrusion detection system (IDS) is a device or software application that

monitors network and/or system activities for malicious activities or policy violations and

produces reports to a Management Station. Intrusion prevention is the process of performing

intrusion detection and attempting to stop detected possible incidents. Intrusion detection and

prevention systems (IDPS) are primarily focused on identifying possible incidents, logging

information about them, attempting to stop them, and reporting them to security

administrators. In addition, organizations use IDPSs for other purposes, such as identifying

problems with security policies, documenting existing threats, and deterring individuals from

violating security policies. IDPSs have become a necessary addition to the security

infrastructure of nearly every organization.

IDPSs typically record information related to observed events, notify security

administrators of important observed events, and produce reports. Many IDPSs can also

respond to a detected threat by attempting to prevent it from succeeding. They use several

response techniques, which involve the IDPS stopping the attack itself, changing the security

environment (e.g., reconfiguring a firewall), or changing the attack’s content.

As an alternate solution for protecting computers from malicious users, a model-based

Intrusion Detection System (IDS) may be used. Instead of using a fingerprinting method of

user classification, an IDS compares learned user characteristics from an empirical behavioral

model to all users of a system. User behavior is generally defined as the set of objective

characteristics of a connection between a client (e.g., a user’s computer) and a server. Using a

generalized behavioral model is theoretically more accurate, efficient, and easier to maintain

than a fingerprinting system. This method of detection eliminates the need for an attack to be

previously known to be detected because malicious behavior is different from normal

behavior by nature (Sinclair et al, 1999). Also, a model based system uses a constant amount

of computer resources per user, drastically reducing the possibility of depleting available

resources. Furthermore, while actual attack types by malicious users may vary widely, a

model-based IDS does not require the constant updates typical of fingerprint-based systems

because the characteristics of any attack against a system will not significantly change

throughout the lifetime of the system because attacks are inherently different from normal

behavior (Eskin et al, 2001; Lee et al, 2001; Sinclair et al, 1999). In previous research, the



options for model generation have been to base it on normal users or to base it on malicious

users (Eskin et al, 2001). Models based on normal users, known as Anomaly Detection

models, use an empirical behavioral model of a normal user and classify any computer

activity that does not fit this model as malicious. Models based on malicious users are known

as Misuse Detection models. These models look for a pattern of malicious behavior, and

behavior that fits this model is classified as malicious (Eskin et al, 2001). In this research,

neither model was explicitly specified, allowing the genetic algorithm to generate the best

model. An Intrusion Detection System must first be able to detect malicious user connections,

for which it must have a generalized model of user behavior for comparison to users of a

system. The most efficient method for generating a user model is to apply a data analysis

algorithm to given “training data,” which is representative of real world data (Stolfo et al,

2000), and then generate an empirical model of either type of user based on this training data.

Previous research into empirical model generation has used data analysis algorithms such as

generalized data mining techniques (Lee et al, 1998, 2001), sparse Markov transducers (Eskin et al, 2001), and genetic algorithms

(Cedex, 1993; Crosbie & Spafford, 1995). Moreover, previous research using genetic

algorithms as a method for intrusion detection has either been theoretical (Cedex, 1993) or

become obsolete and is no longer applicable to current intrusion detection research (Crosbie

& Spafford, 1995). The experiment presented in this paper seeks to test the viability of

genetic algorithms as a method for generating empirical user behavioral models.

A genetic algorithm is a method of data analysis that works analogously to Darwinian

evolution (Koza, 1992). Within a computer simulation, a population of many individuals is

created, each individual representing a possible mathematical model. Each individual has one

or more chromosomes that function as basic instructions to the individual in a cause (e.g.,

input data) and effect (e.g., user classification) manner. An individual is measured by the

aggregate performance of its chromosomes. An initial population is created by complete

randomization of the chromosomes, and individuals of subsequent generations go through

mutations, which are also randomized (Moriarty et al, 1999). As in Darwinism, a population

that goes through many generations eliminates poor performing individuals and allows better

performing individuals to replicate and mutate themselves during each generation. This

genetic algorithm was designed so that each individual represented a possible behavioral

model.



2.2. SURVEY OF EXISTING SYSTEM:-

In recent years, Intrusion Detection System (IDS) has become one of the hottest

research areas in Computer Security. It is an important detection technology and is used as a

countermeasure to preserve data integrity and system availability during an intrusion. When

an intruder attempts to break into an information system or performs an action not legally

allowed, we refer to this activity as an intrusion (Graham, 2002; see also Jones and Sielken,

2000). Intruders can be divided into two groups, external and internal. The former refers to

those who do not have authorized access to the system and who attack by using various

penetration techniques. The latter refers to those with access permission who wish to perform

unauthorized activities. Intrusion techniques may include exploiting software bugs and

system misconfigurations, password cracking, sniffing unsecured traffic, or exploiting the

design flaw of specific protocols (Graham, 2002). An Intrusion Detection System is a system

for detecting intrusions and reporting them accurately to the proper authority. Intrusion

Detection Systems are usually specific to the operating system that they operate in and are an

important tool in the overall implementation of an organization’s information security policy

(Jones and Sielken, 2000), which reflects an organization's statement by defining the rules

and practices to provide security, handle intrusions, and recover from damage caused by

security breaches. There are two generally accepted categories of intrusion detection

techniques: misuse detection and anomaly detection. Misuse detection refers to techniques

that characterize known methods to penetrate a system. These penetrations are characterized

as a ‘pattern’ or a ‘signature’ that the IDS looks for. The pattern/signature might be a static

string or a set sequence of actions. System responses are based on identified penetrations.

Anomaly detection refers to techniques that define and characterize normal or acceptable

behaviors of the system (e.g., CPU usage, job execution time, system calls). Behaviors that

deviate from the expected normal behavior are considered intrusions (Bezroukov, 2002; see

also McHugh, 2001). IDSs can also be divided into two groups depending on where they look

for intrusive behavior: Network-based IDS (NIDS) and Host-based IDS. The former refers to

systems that identify intrusions by monitoring traffic through network devices (e.g. Network

Interface Card, NIC). A host-based IDS monitors file and process activities related to a

software environment associated with a specific host. Some host-based IDSs also listen to

network traffic to identify attacks against a host (Bezroukov, 2002; see also McHugh, 2001).

There are other emerging techniques. One example is known as a blocking IDS, which



combines a host-based IDS with the ability to modify firewall rules (Miller and Shaw, 1996).

Another is called a Honeypot, which appears to be a ‘target’ to an intruder, but is specifically

designed to trap an intruder in order to trace down the intruder’s location and respond to

attack (Bezroukov, 2002).

The Intelligent Intrusion Detection System (IIDS) is an ongoing project at the Center

for Computer Security Research (CCSR) at Mississippi State University. The architecture

combines a number of different approaches to the IDS problem, and includes different AI

techniques to help identify intrusive behavior (Bridges and Vaughn, 2001). It uses both

anomaly detection and misuse detection techniques and is both a network-based and host-

based system. Within the overall architecture of the IIDS, some open-source intrusion

detection software tools are integrated for use as security sensors (Li, 2002), such as Bro

(Paxson, 1998) and Snort (Roesch, 1999). Techniques proposed in this paper are part of the

IIDS research efforts.

Genetic Algorithm (GA) has been used in different ways in IDSs. The Applied

Research Laboratories of the University of Texas at Austin (Sinclair, Pierce, and Matzner

1999) uses different machine learning techniques, such as finite state machine, decision tree,

and GA, to generate artificial intelligence rules for IDS. One network connection and its

related behavior can be translated to represent a rule to judge whether or not a real-time

connection is considered an intrusion. These rules can be modeled as chromosomes inside the

population. The population evolves until the evaluation criteria are met. The generated rule

set can be used as knowledge inside the IDS for judging whether the network connection and

related behaviors are potential intrusions (Sinclair, Pierce, and Matzner 1999). The COAST

Laboratory in Purdue University (Crosbie and Spafford, 1995) implemented an IDS using

autonomous agents (security sensors) and applied AI techniques to evolve genetic algorithms.

Agents are modeled as chromosomes and an internal evaluator is used inside every agent

(Crosbie and Spafford, 1995). In the approaches described above, the IDS can be viewed as a

rule-based system (RBS) and GA can be viewed as a tool to help generate knowledge for the

RBS. These approaches have some disadvantages. In order to detect intrusive behaviors for a

local network, network connections should be used to define normal and anomalous

behaviors. Sometimes an attack can be as simple as scanning for available ports in a server or

a password-guessing scheme. But typically they are complex and are generated by automated

tools that are freely available from the Internet. An example can be a Trojan horse or a

backdoor that can run for a period of time, or can be initiated from different locations. In


order to detect such intrusions, both temporal and spatial information of network traffic

should be included in the rule set. The current GA applications do not address these issues

extensively. This paper shows how network connection information can be modeled as

chromosomes and how the parameters in genetic algorithm can be defined in this respect.

2.3 NEW CONCEPT:-

Genetic algorithms are a family of computational models based on principles of

evolution and natural selection. These algorithms convert the problem in a specific domain

into a model by using a chromosome-like data structure and evolve the chromosomes using

selection, recombination, and mutation operators. The range of applications that can make

use of genetic algorithms is quite broad (Sinclair, Pierce, and Matzner 1999; see also Whitley,

1994). In computer security applications, they are mainly used for finding optimal solutions to a

specific problem. The process of a genetic algorithm usually begins with a randomly selected

population of chromosomes. These chromosomes are representations of the problem to be

solved. According to the attributes of the problem, different positions of each chromosome

are encoded as bits, characters, or numbers. These positions are sometimes referred to as

genes and are changed randomly within a range during evolution. The set of chromosomes

during a stage of evolution is called a population. An evaluation function is used to

calculate the “goodness” of each chromosome. During evolution, two basic operators,

crossover and mutation, are used to simulate the natural reproduction and mutation of

species. The selection of chromosomes for survival and combination is biased towards the

fittest chromosomes.
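The cycle described above (random initial population, evaluation, selection biased towards the fittest, crossover and mutation) can be sketched as follows. This is only a minimal illustration under our own assumptions, not the IIDS implementation: the class name SimpleGa, the constants, the toy evaluation function (counting set genes), the use of tournament selection and the elitism step are all hypothetical choices made for the example.

```java
import java.util.Random;

class SimpleGa {
    static final int GENES = 16;               // hypothetical chromosome length
    static final int POPULATION = 20;          // hypothetical population size
    static final double MUTATION_RATE = 0.05;  // hypothetical per-gene mutation probability

    // Toy evaluation function: the "goodness" is the number of genes that are set.
    static int fitness(boolean[] chromosome) {
        int score = 0;
        for (boolean gene : chromosome) {
            if (gene) score++;
        }
        return score;
    }

    // Tournament selection: the fitter of two random individuals is chosen,
    // which biases survival towards the fittest chromosomes.
    static boolean[] select(boolean[][] pop, Random rng) {
        boolean[] a = pop[rng.nextInt(pop.length)];
        boolean[] b = pop[rng.nextInt(pop.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    // Single-point crossover: genes before the cut come from one parent,
    // the remaining genes from the other.
    static boolean[] crossover(boolean[] p1, boolean[] p2, Random rng) {
        int cut = 1 + rng.nextInt(GENES - 1);
        boolean[] child = new boolean[GENES];
        for (int i = 0; i < GENES; i++) {
            child[i] = i < cut ? p1[i] : p2[i];
        }
        return child;
    }

    // Mutation: each gene is flipped with a small probability.
    static void mutate(boolean[] chromosome, Random rng) {
        for (int i = 0; i < GENES; i++) {
            if (rng.nextDouble() < MUTATION_RATE) chromosome[i] = !chromosome[i];
        }
    }

    static boolean[] best(boolean[][] pop) {
        boolean[] top = pop[0];
        for (boolean[] c : pop) {
            if (fitness(c) > fitness(top)) top = c;
        }
        return top;
    }

    // Runs the GA for a fixed number of generations and returns the best chromosome.
    static boolean[] evolve(int generations, long seed) {
        Random rng = new Random(seed);
        boolean[][] pop = new boolean[POPULATION][GENES];
        for (boolean[] c : pop) {
            for (int i = 0; i < GENES; i++) c[i] = rng.nextBoolean();
        }
        for (int g = 0; g < generations; g++) {
            boolean[][] next = new boolean[POPULATION][];
            next[0] = best(pop).clone();  // elitism (our assumption): keep the current best
            for (int i = 1; i < POPULATION; i++) {
                boolean[] child = crossover(select(pop, rng), select(pop, rng), rng);
                mutate(child, rng);
                next[i] = child;
            }
            pop = next;
        }
        return best(pop);
    }

    public static void main(String[] args) {
        boolean[] winner = evolve(50, 42L);
        System.out.println("best fitness = " + fitness(winner) + " of " + GENES);
    }
}
```

In an IDS setting the boolean genes would be replaced by connection attributes and the toy evaluation function by one that scores a rule against recorded traffic; the loop itself would stay the same.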

Genetic algorithms can be used to evolve simple rules for network traffic (Sinclair,

Pierce, and Matzner 1999). These rules are used to differentiate normal network connections

from anomalous connections. Anomalous connections are events that may indicate

an intrusion. The rules stored in the rule base are usually in the following form (Sinclair,

Pierce, and Matzner 1999):

if { condition } then { act }


For the problems we presented above, the condition usually refers to a match between

the current network connection and the rules in the IDS, such as source and destination IP addresses

and port numbers (used in TCP/IP network protocols), duration of the connection, protocol

used, etc., indicating the probability of an intrusion. The act field usually refers to an action

defined by the security policies within an organization, such as reporting an alert to the

system administrator, stopping the connection, logging a message into system audit files, or

all of the above. For example, a rule can be defined as:

if { the connection has the following information: source IP address: 124.12.5.18; destination IP address: 130.18.206.55; destination port number: 21; connection time: 10.1 seconds }

then { stop the connection }

This rule can be explained as follows: if there exists a network connection request

with the source IP address 124.12.5.18, destination IP address 130.18.206.55, destination port

number 21, and connection time 10.1 seconds, then stop this connection establishment. This

is because the IP address 124.12.5.18 is recognized by the IDS as one of the blacklisted IP

addresses; therefore, any service request initiated from it is rejected.
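The if { condition } then { act } rule above can be sketched in code as follows. This is a minimal illustration rather than the IDS implementation: the names RuleBaseSketch, Connection, Rule, Action and decide are hypothetical, treating a null field as a wildcard is our own assumption, and the code assumes Java 16 or later because it uses records.

```java
import java.util.List;
import java.util.Optional;

class RuleBaseSketch {
    // Actions a security policy might define (hypothetical names).
    enum Action { STOP_CONNECTION, ALERT_ADMIN, LOG_ONLY }

    // Hypothetical summary of one network connection.
    record Connection(String srcIp, String dstIp, int dstPort, double seconds) {}

    // One rule: the condition fields plus the action to take.
    // A null condition field acts as a wildcard (our assumption).
    record Rule(String srcIp, String dstIp, Integer dstPort, Double seconds, Action act) {
        boolean matches(Connection c) {
            return (srcIp == null || srcIp.equals(c.srcIp()))
                && (dstIp == null || dstIp.equals(c.dstIp()))
                && (dstPort == null || dstPort == c.dstPort())
                && (seconds == null || Math.abs(seconds - c.seconds()) < 1e-9);
        }
    }

    // Returns the action of the first rule whose condition matches, if any.
    static Optional<Action> decide(List<Rule> rules, Connection c) {
        for (Rule r : rules) {
            if (r.matches(c)) return Optional.of(r.act());
        }
        return Optional.empty();
    }

    // The example rule from the text.
    static List<Rule> exampleRules() {
        return List.of(new Rule("124.12.5.18", "130.18.206.55", 21, 10.1, Action.STOP_CONNECTION));
    }

    public static void main(String[] args) {
        Connection c = new Connection("124.12.5.18", "130.18.206.55", 21, 10.1);
        System.out.println(decide(exampleRules(), c));
    }
}
```

With the wildcard convention, a blacklist entry that rejects every request from 124.12.5.18 could be written as a rule in which only the source IP address is set and the other condition fields are null.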

“Thus, this new system of ours gives rise to a new Intelligent IDS.”


3. REQUIREMENTS

GATHERING


CHAPTER 3

REQUIREMENTS GATHERING

3.1 INTRODUCTION:-

The term requirements gathering encompasses those tasks that go into determining the

needs or conditions to meet for a new or altered product, taking account of the possibly

conflicting requirements of the various stakeholders, such as beneficiaries or users. The term

can also be applied specifically to the analysis proper, as opposed to elicitation or documentation

of the requirements.

FIG.8: Requirement Gathering

3.1.1 Purpose:-

Genetic Algorithm (GA) has been used in different ways in IDSs. One

network connection and its related behavior can be translated to represent a rule to judge


whether or not a real-time connection is considered an intrusion. These rules can be modeled

as chromosomes inside the population. The population evolves until the evaluation criteria

are met. The generated rule set can be used as knowledge inside the IDS for judging whether

the network connection and related behaviours are potential intrusions.

3.1.2 Document conventions:-

The format is simple. The bold headings are used for showing the points. The points

are numbered in order to make reading of the SRS simple.

Abbreviations used are:-

GA: - Genetic Algorithm

GAM: - Genetic Algorithm Module

PR: - Policy Repository

PMT: - Policy Management Tool

PDP: - Policy Decision Point

PEP: - Policy Enforcement Point

FPC: - False Positive Count

FNC: - False Negative Count

3.1.3 Intended Audience and Reading Suggestions :-

The intended audience is the users of the system, the administrator, the operator, the database

designer and the database administrator.

3.1.4 Scope of the project:-

In Scope:

Allow system to detect any network event

Gene Construction using packet sniffer and packet analyzer

Gene Pool storage

Use of FSM to reduce the time complexity while applying GA

Gene Fitness Evaluation Function


Policy Management

Policy enforcement

Maintenance of Initial Data Set

Out of Scope:

1. Compatibility issues related with OS other than Windows.

2. Issues caused by limited hardware resources such as disk space.

3.2 OVERALL DESCRIPTION:-

This GA feedback based network security policy framework can be installed on any

system and can be used for policy-based management and to monitor and manage the

behaviour of the network.

3.2.1 Product perspective:-

This system is aimed at developing an evolutionary network security policy framework

based on a genetic feedback algorithm. Using a genetic algorithm, we can generate a rule base

from historical security events. When a new network event arrives, the analyzer

judges whether the event is secure according to the rule base, and the policy system

may give a policy decision too. These two results may differ, so the policies

can be automatically adjusted with reference to the results calculated by the genetic algorithm.
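The comparison between the rule-base verdict and the policy decision can be sketched as below. This is a hedged illustration only: the class FeedbackSketch, its methods and the tolerance parameter are hypothetical, and we assume that the rule-base verdict is taken as the reference, so that a disagreement is counted as a false positive or a false negative of the policy (the FPC and FNC of the abbreviation list).

```java
class FeedbackSketch {
    int falsePositiveCount;  // FPC: policy blocked an event the rule base judged secure
    int falseNegativeCount;  // FNC: policy allowed an event the rule base judged insecure

    // Records one network event for which both verdicts are available.
    void record(boolean ruleBaseSaysSecure, boolean policyAllows) {
        if (ruleBaseSaysSecure && !policyAllows) {
            falsePositiveCount++;
        } else if (!ruleBaseSaysSecure && policyAllows) {
            falseNegativeCount++;
        }
        // When both verdicts agree, nothing needs to be adjusted.
    }

    // Hypothetical trigger: adjust the policies once the number of
    // disagreements exceeds a tolerance chosen by the administrator.
    boolean needsAdjustment(int tolerance) {
        return falsePositiveCount + falseNegativeCount > tolerance;
    }

    public static void main(String[] args) {
        FeedbackSketch feedback = new FeedbackSketch();
        feedback.record(true, false);   // policy too strict
        feedback.record(false, true);   // policy too permissive
        feedback.record(true, true);    // agreement
        System.out.println("FPC=" + feedback.falsePositiveCount
                + " FNC=" + feedback.falseNegativeCount
                + " adjust=" + feedback.needsAdjustment(1));
    }
}
```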

3.2.1.1. Jpcap: -

Jpcap is a Java library for capturing and sending network packets. Using Jpcap, you

can develop applications to capture packets from a network interface and visualize/analyze

them in Java. Jpcap isn't a pure Java solution; it depends on the use of native libraries. On

either Windows or UNIX, you must have the required third-party library, WinPcap or

libpcap, respectively.


3.2.2. Product Features:-

Auto Rule Base Generation

Auto restore damaged files

Negligible Administrator Presence required

More Security

Less resources required

3.2.3. Operating Environment:-

Software Requirements:-

Operating System: Microsoft Windows 2000/NT, XP or higher version.

Other Software: Microsoft Access, Jpcap

Hardware Requirements:-

Compatible with any brand, with the following minimum configuration:

Processor: - Intel Pentium IV, Dual Core

Ram:- 1GB Onwards

Hard disk: - 20.0GB

Monitor:-SVGA colour monitor, VGA Monochrome, LCD monitor

Keyboard:- 105-key standard

Pointing Device:- Logitech Mouse or other Compatible

3.2.4. Design constraints:-

Our platform provides an easy-to-understand design. The screens follow all the rules

and regulations of GUI testing.

In software engineering, graphical user interface testing is the process of testing a

product's graphical user interface to ensure it meets its written specifications. This is

normally done through the use of a variety of test cases.

3.2.4.1 Security:-

Security is essential for network events and transactions,

so it should be provided properly. For policy definition and

administrative functions, a secure SSL layer is used for transfer.


3.2.4.2 Fault Tolerance:-

Data must not be corrupted in case of a system crash or power failure. An

active backup program keeps taking backups of the data at intervals

that depend on the load on the system.

3.2.4.3 Multi-tenant architecture:-

It allows network applications to be developed in any environment and takes care of

various issues such as concurrency management, scalability, failover and security. The

architecture enables defining the "trust relationship" between users in security, access,

distribution of source code, navigation history, admin (people and device) profiles,

interaction history, and application usage.

3.2.4.4 Utility-grade instrumentation:-

It offers developers insight into the inner workings of their applications, and the

behaviour of their users.

3.2.4.5 Assumptions and dependencies:-

It is assumed that all administrator rights have been assigned, so that backed-up

policies can be restored and new policies enforced successfully.

3.3 SYSTEM FEATURES:-

3.3.1 Deploying Application:-

It allows users of the application to use the services anywhere within the network area of

the service provider.

3.3.2 Common platform:-

This platform provides a common environment for developers to create robust,

easy, and secure network events and transactions within an organization.

3.3.3 GUI:-

In software engineering, graphical user interface testing is the process of testing a

product's graphical user interface to ensure it meets its written specifications. This is

normally done through the use of a variety of test cases. The sections of test cases that we

follow are:


Section 1 - Windows Compliance Standards

1.1. Application icon.

1.2. For Each Window TITLE in the Application

1.3. Text Boxes

1.4. Option (Radio Buttons)

1.5. Check Boxes

1.6. Command Buttons

1.7. Drop down List Boxes

1.8. Combo Boxes

1.9. List Boxes

Section 2 - Tester's Screen Validation Checklist

2.1. Aesthetic Conditions

2.2. Validation Conditions

2.3. Navigation Conditions

2.4. Usability Conditions

2.5. Data Integrity Conditions

2.6. Modes (Editable Read-only) Conditions

2.7. General Conditions

2.8. Specific Field Tests

2.8.1. Date Field Checks

2.8.2. Numeric Fields

2.8.3. Alpha Field Checks

Section 3 - Validation Testing - Standard Actions

3.1. On every Screen

3.2. Shortcut keys / Hot Keys

3.3. Control Shortcut Keys

3.4 EXTERNAL INTERFACE REQUIREMENTS:-

The system has no other external software and hardware interface requirements.

3.4.1 User Interfaces:-


The system will have a powerful user interface which will enable developers to create

various applications and users to use those applications effectively.

3.4.2 Hardware interfaces: -

The system has no hardware interface requirements.

3.4.3 Software interfaces:-

The system will require a database server; Microsoft Access is used for development.

3.4.4 Communications Interfaces:-

To manage and monitor network events, and to manage and enforce policies in response to

network events on the Local Area Network/Internet, we will be using the UDP and

TCP/IP protocols.

3.5 OTHER NON FUNCTIONAL REQUIREMENTS:-

Backup and Restore facility to monitor automatic backup on a timely basis.

3.5.1 Performance Requirements:-

The system should be able to back up and restore policies effectively without loss or corruption of data.

The system should require minimal RAM usage.

3.5.2 Safety Requirements:-

Policy Repository data and Rule Base data must be secured from unauthorized access,

so all applications and users are registered.

3.5.3 Security Requirements:-

User authentication is done at the time of logging in.

3.5.4 Software Quality Attributes:-

The attributes taken into consideration are reliability, flexibility, operability, platform

independence.


4. SYSTEM DESIGN


CHAPTER 4

SYSTEM DESIGN

4.1 SELECTION OF LIFE CYCLE MODEL:-

The Basic idea:-

Process models define a distinct set of activities, actions, tasks, milestones and work

products that are required to engineer high quality software. These process models are not

perfect, but they do provide a useful roadmap for software engineering work. We have used

the waterfall model as the project development life cycle. The waterfall model suggests a

systematic sequence for software development that begins with customer specification of

requirements and progresses through planning, modeling, construction and deployment.

Fig. 9.DEVELOPMENT LIFE CYCLE


Deliverable                           Form                Phase

Stage one: Communication
Project Concept Overview              Document            Project Initiation
Project Plan                          Document            Project Initiation
Initial Estimate                      Document            Project Initiation

Stage two: Requirement and Planning
User Requirements                     Document            Pre-Design (Analysis)
Technical Requirements                Document            Pre-Design (Analysis)
Paper Prototype                       Document            Pre-Design (Analysis)
Requirement and Planning Completion   Document            Pre-Design (Analysis)

Stage three: Design
Infrastructure Design                 Document            Design
Systems Design                        Document            Design
Application Design                    Document            Design
Time & Cost Quotation                 Document            Design
Design Completion                     Document            Design

Stage four: Construction & Testing
Infrastructure Installation           Hardware/Software   Development
Systems Installation                  Hardware/Software   Development
Application Development               Software            Development
Development Beta Test Report          Document            Testing
Application Testing                   Document            Testing
Development Completion                Software/Document   Development

Stage five: Deployment
Live System Delivery                  Hardware/Software   Deployment
Infrastructure Specification          Document            Deployment
System Specification                  Document            Deployment
Application Technical Specification   Document            Deployment
User Documentation                    Document            Deployment


Fig.10 Waterfall Model

The waterfall approach was the first process model to be introduced and widely followed in

software engineering to ensure the success of a project. In the waterfall approach, the

whole process of software development is divided into separate process phases.

The phases in the waterfall model are: Requirement Specifications, Software

Design, Implementation and Testing, and Maintenance. These phases are cascaded so that

the next phase starts only when the defined set of goals for the previous

phase has been achieved and signed off, hence the name "Waterfall Model". All the methods and processes

undertaken in the Waterfall Model are more visible.


4.2 PROJECT PLAN:-

As the project was to be done over the course of two semesters, with our university

examination falling in between, we divided the project into two phases of three months each. The first

phase ran from August 2010 to October 2010 and the second phase from January 2011 to April

2011. The detailed week-wise project schedule with the achieved milestones is shown below.

PROJECT PLAN

TASK                                  COMPLETION DATE

Searching for the Project             12-Aug-2010
Deciding the Project                  20-Aug-2010
Searching information about Project   5-Sep-2010
Deciding the Components               25-Sep-2010
Working on Project Design             3-Oct-2010
Finalizing the Project Design         15-Oct-2010
Configuration of Server               25-Jan-2011
Configuration of Database             5-Feb-2011
Checking the Database Connectivity    12-Feb-2011
Creating Web Applications             20-Feb-2011
Performing Various Tests              25-Mar-2011
Documentation of Project              30-Mar-2011
System Delivery And Installation      10-Apr-2011

Systems design is the process or art of defining the architecture, components,

modules, interfaces, and data for a system to satisfy specified requirements. One could see it

as the application of systems theory to product development.

4.3 DATA FLOW DIAGRAM:-

A data flow diagram (DFD) is a graphical representation of the "flow" of data through

an information system. DFDs can also be used for the visualization of data processing

(structured design).

On a DFD, data items flow from an external data source or an internal data store to an

internal data store or an external data sink, via an internal process.

A DFD provides no information about the timing of processes, or about whether

processes will operate in sequence or in parallel. It is therefore quite different from a

flowchart, which shows the flow of control through an algorithm, allowing a reader to

determine what operations will be performed, in what order, and under what circumstances,

but not what kinds of data will be input to and output from the system, nor where the data

will come from and go to, nor where the data will be stored (all of which are shown on a

DFD).

FIG 11. LEVEL 0 DFD


FIG 12. Level 1 DFD

4.4 USE CASE DIAGRAMS:-

Use case diagrams are basically used to model the dynamic aspects of systems. These

diagrams are central to modeling the behavior of the system, a subsystem, or a class. Each

one shows a set of use cases and actors and their relationships. Use case diagrams are

important for visualizing, specifying, and documenting the behavior of an element. They

make systems, subsystems and classes approachable and understandable by presenting an

outside view of how those elements may be used in context. The main purpose of a use case

diagram is to show what system functions are performed for which actor. Roles of the actors

in the system can be depicted.


FIG 13: USECASE DIAGRAM

4.5 CLASS DIAGRAM:-

In software engineering, a class diagram in the Unified Modelling Language (UML)

is a type of static structure diagram that describes the structure of a system by showing the

system's classes, their attributes, and the relationships between the classes. This diagram

shows various classes or main entities involved in the system and also their relationship with

each other. It depicts the attributes and operations each class can carry out, individually and

with help of other classes in the system designed.


FIG 14.CLASS DIAGRAM


4.6 STATE CHART DIAGRAM:-

A state diagram is a type of diagram used in computer science and related fields to

describe the behaviour of systems. State diagrams require that the system described is

composed of a finite number of states; sometimes, this is indeed the case, while at other times

this is a reasonable abstraction.

Fig 15- State-Chart Diagram


4.7 ACTIVITY DIAGRAMS:-

Activity diagrams are graphical representations of workflows of stepwise activities

and actions with support for choice, iteration and concurrency. In the Unified Modelling

Language, activity diagrams can be used to describe the business and operational step-by-step

workflows of components in a system. An activity diagram shows the overall flow of control.

FIG 16. ACTIVITY DIAGRAM


4.8 SEQUENCE DIAGRAMS:-

A sequence diagram in Unified Modelling Language (UML) is a kind of interaction

diagram that shows how processes operate with one another and in what order. It is a

construct of a Message Sequence Chart.

Fig 17- Sequence Diagram



4.9 DEPLOYMENT DIAGRAM:-

FIG 18: DEPLOYMENT DIAGRAM


4.10 PACKAGE DIAGRAM:-

Fig 19 PACKAGE DIAGRAM


5. IMPLEMENTATION DETAILS


CHAPTER 5

IMPLEMENTATION DETAILS

5.1 TECHNOLOGY DETAILS:-

5.1.1 Java:-

Java is a programming language originally developed by James Gosling at Sun Microsystems

(which is now a subsidiary of Oracle Corporation) and released in 1995 as a core component of Sun

Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a

simpler object model and fewer low-level facilities. Java applications are typically compiled to

bytecode (class file) that can run on any Java Virtual Machine (JVM) regardless of computer

architecture. Java is a general-purpose, concurrent, class-based, object-oriented language that is

specifically designed to have as few implementation dependencies as possible. It is intended to let

application developers "write once, run anywhere". Java is currently one of the most popular

programming languages in use, and is widely used from application software to web applications.

Features of Java:-

● Java Virtual Machine (JVM)

-An imaginary machine that is implemented by emulating software on a real machine

-Provides the hardware platform specifications to which you compile all Java technology code

● Bytecode

-A special machine language that can be understood by the Java Virtual Machine (JVM)

-Independent of any particular computer hardware, so any computer with a Java interpreter can

execute the compiled Java program, no matter what type of computer the program was compiled on

● Garbage collection thread

- Responsible for freeing any memory that can be freed. This happens automatically during the

lifetime of the Java program.

- Programmer is freed from the burden of having to deallocate that memory themselves


● Code security

- Is attained in Java through the implementation of its Java Runtime Environment (JRE).

● JRE

- Runs code compiled for a JVM and performs class loading (through the class loader), code

verification (through the bytecode verifier) and finally code execution.

● Class Loader

– Responsible for loading all classes needed for the Java program

– Adds security by separating the namespaces for the classes of the local file system from those that

are imported from network sources

– After loading all the classes, the memory layout of the executable is then determined. This adds

protection against unauthorized access to restricted areas of the code since the memory layout is

determined during runtime

● Bytecode verifier

– tests the format of the code fragments and checks the code fragments for illegal code that can

violate access rights to objects

● Platform Independence

-The Write-Once-Run-Anywhere ideal has not been achieved (tuning for different platforms usually

required), but closer than with other languages.

● Object Oriented

-Object oriented throughout - no coding outside of class definitions, including main().

-An extensive class library available in the core language packages.

● Compiler/Interpreter Combo

-Code is compiled to bytecodes that are interpreted by a Java virtual machine (JVM).

-This provides portability to any machine for which a virtual machine has been written.

-The two steps of compilation and interpretation allow for extensive code checking and improved

security.


● Robust

-Exception handling built-in, strong type checking (that is, all data must be declared an explicit type),

local variables must be initialized.

● Several dangerous features of C & C++ eliminated:

-No memory pointers

-No preprocessor

-Array index limit checking

● Good Performance

-Interpretation of bytecodes slowed performance in early versions, but advanced virtual machines

with adaptive and just-in-time compilation and other techniques now typically provide performance

of 50% to 100% of the speed of C++ programs.

● Threading

-Lightweight processes, called threads, can easily be spun off to perform multiprocessing.

-Can take advantage of multiprocessors where available

-Great for multimedia displays.

● Built-in Networking

-Java was designed with networking in mind and comes with many classes to develop sophisticated

Internet communications.

5.1.2 Jpcap:-

Jpcap is an open source library for capturing and sending network packets from Java

applications. It provides facilities to:

● capture raw packets live from the wire.

● save captured packets to an offline file, and read captured packets from an offline file.


● automatically identify packet types and generate corresponding Java objects (for Ethernet,

IPv4, IPv6, ARP/RARP, TCP, UDP, and ICMPv4 packets).

● filter the packets according to user-specified rules before dispatching them to the

application.

● send raw packets to the network

● Jpcap is based on libpcap/winpcap, and is implemented in C and Java.

● Jpcap has been tested on Microsoft Windows (98/2000/XP/Vista), Linux (Fedora, Ubuntu),

Mac OS X (Darwin), FreeBSD, and Solaris.

Jpcap can be used to develop many kinds of network applications, including (but not limited

to):

- network and protocol analyzers

- network monitors

- traffic loggers

- traffic generators

- user-level bridges and routers

- network intrusion detection systems (NIDS)

- network scanners

- security tools

Jpcap captures and sends packets independently from the host protocols (e.g., TCP/IP). This

means that Jpcap does not (cannot) block, filter or manipulate the traffic generated by other programs

on the same machine: it simply "sniffs" the packets that transit on the wire. Therefore, it does not

provide the appropriate support for applications like traffic shapers, QoS schedulers and personal

firewalls.

When you want to capture packets from a network, the first thing you have to do is to obtain

the list of network interfaces on your machine. To do so, Jpcap provides the JpcapCaptor.getDeviceList()

method. It returns an array of NetworkInterface objects.

A NetworkInterface object contains some information about the corresponding network

interface, such as its name, description, IP and MAC addresses, and data link name and description.


Fig 20 Jpcap

5.2 MODULAR DETAILS:-

The various modules involved are:

1. The packet sniffer, which converts incoming packets into

chromosome-like data structures. These chromosome-like data structures are

used by the GA module to check their fitness value.

2. With the help of sub-modules such as the GD and GC, the chromosome is

checked against the existing gene pool. The important tasks of crossover

and mutation are also carried out by the GA module.

3. Once the Fitness Calculator decides the fitness value of the packet, the information is

passed to the Event Report Generator. The Policy Management Point

(Admin) then comes into the picture. The administrator's role is vital, because the administrator

decides whether to block or allow the anomalous packet.

4. If the Policy Management Point validates all the policies and

none is violated, the packet is allowed. If there is an inconsistency, the SMS

Gateway is invoked and the administrator is notified accordingly.
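The flow through these modules can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's code: the names PipelineSketch, PacketInfo, toChromosome, fitness and THRESHOLD are hypothetical, PacketInfo stands in for the fields a packet sniffer would extract, addresses are assumed to be well-formed IPv4, and we assume that the fitness of a packet is the largest fraction of genes it shares with any chromosome in the gene pool of known attacks. Requires Java 16 or later for records.

```java
import java.util.List;

class PipelineSketch {
    // Hypothetical packet summary produced by the packet sniffer.
    record PacketInfo(String srcIp, String dstIp, int srcPort, int dstPort, String protocol) {}

    static final double THRESHOLD = 0.8;  // hypothetical cut-off for "anomalous"

    // Module 1: convert a packet into a chromosome of 11 integer genes:
    // 4 source octets, 4 destination octets, 2 ports, 1 protocol number.
    static int[] toChromosome(PacketInfo p) {
        int[] genes = new int[11];
        int i = 0;
        for (String octet : p.srcIp().split("\\.")) genes[i++] = Integer.parseInt(octet);
        for (String octet : p.dstIp().split("\\.")) genes[i++] = Integer.parseInt(octet);
        genes[i++] = p.srcPort();
        genes[i++] = p.dstPort();
        // IANA protocol numbers: TCP = 6, UDP = 17; anything else is encoded as 0 here.
        genes[i] = p.protocol().equals("TCP") ? 6 : p.protocol().equals("UDP") ? 17 : 0;
        return genes;
    }

    // Modules 2 and 3: check the chromosome against the gene pool and
    // return the best fraction of matching genes (0.0 to 1.0).
    static double fitness(int[] chromosome, List<int[]> genePool) {
        double best = 0.0;
        for (int[] known : genePool) {
            int same = 0;
            for (int g = 0; g < chromosome.length; g++) {
                if (chromosome[g] == known[g]) same++;
            }
            best = Math.max(best, (double) same / chromosome.length);
        }
        return best;
    }

    // Module 4 (decision only): a packet at or above the threshold is reported
    // as anomalous; notifying the administrator is left out of this sketch.
    static boolean isAnomalous(PacketInfo p, List<int[]> genePool) {
        return fitness(toChromosome(p), genePool) >= THRESHOLD;
    }

    public static void main(String[] args) {
        PacketInfo attack = new PacketInfo("124.12.5.18", "130.18.206.55", 4000, 21, "TCP");
        List<int[]> pool = List.of(toChromosome(attack));
        PacketInfo probe = new PacketInfo("124.12.5.18", "130.18.206.55", 4001, 21, "TCP");
        System.out.println("anomalous: " + isAnomalous(probe, pool));
    }
}
```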

5.3 DATABASE DETAILS:-

5.3.1 Microsoft Office Access 2007:-

Microsoft Office Access, previously known as Microsoft Access, is a relational

database management system from Microsoft that combines the relational Microsoft Jet

Database Engine with a graphical user interface and software-development tools. Software

developers and data architects can use Microsoft Access to develop application software, and

"power users" can use it to build simple applications. Like other Office applications, Access

is supported by Visual Basic for Applications, an object-oriented programming language that

can reference a variety of objects including DAO (Data Access Objects), ActiveX Data

Objects, and many other ActiveX components. Visual objects used in forms and reports

expose their methods and properties in the VBA programming environment, and VBA code

modules may declare and call Windows operating-system functions.

Microsoft Access is used to make databases. When reviewing Microsoft Access in the

real world, it should be understood how it is used with other products. An all-Access solution

may have Microsoft Access Forms and Reports managing Microsoft Access tables. However,

Microsoft Access may be used only as the 'front-end', using another product for the 'back-end'


tables, such as Microsoft SQL Server and non-Microsoft products such as Oracle and Sybase.

Similarly, some applications will only use the Microsoft Access tables and use another

product as a front-end, such as Visual Basic or ASP.NET. Microsoft Access may be only part

of the solution in more complex applications, where it may be integrated with other

technologies such as Microsoft Excel, Microsoft Outlook or ActiveX Data Objects.

Access tables support a variety of standard field types, indices, and referential

integrity. Access also includes a query interface, forms to display and enter data, and reports

for printing. The underlying Jet database, which contains these objects, is multiuser-aware

and handles record-locking and referential integrity including cascading updates and deletes.

Users can create tables, queries, forms and reports, and connect them together with

macros. Advanced users can use VBA to write rich solutions with advanced data

manipulation and user control. The original concept of Access was for end users to be able to

"access" data from any source. Other uses include: the import and export of data to many

formats including Excel, Outlook, ASCII, dBase, Paradox, FoxPro, SQL Server, Oracle,

ODBC, etc. It also has the ability to link to data in its existing location and use it for viewing,

querying, editing, and reporting. This allows the existing data to change and the Access

platform to always use the latest data. It can perform heterogeneous joins between data sets

stored across different platforms. Access is often used by people downloading data from

enterprise level databases for manipulation, analysis, and reporting locally.

There is also the Jet Database format (MDB or ACCDB in Access 2007) which can

contain the application and data in one file. This makes it very convenient to distribute the

entire application to another user, who can run it in disconnected environments.

One of the benefits of Access from a programmer's perspective is its relative

compatibility with SQL (structured query language) — queries can be viewed graphically or

edited as SQL statements, and SQL statements can be used directly in Macros and VBA

Modules to manipulate Access tables. Users can mix both VBA and "Macros" for

programming forms and logic, which also offers object-oriented possibilities. VBA can also be

included in queries.

Fig 21 Microsoft Office Access 2007

5.4 SNAPSHOTS:


Main Window


Log record

No packet case


6. TESTING


CHAPTER 6

TESTING

6.1 TESTING STRATEGIES:-

The test strategy consists of a series of different tests that will fully exercise the GA based

network security system. The primary purpose of these tests is to uncover the system's limitations and

measure its full capabilities. A list of the various planned tests and a brief explanation follows below.

1. UI testing:-

The admin interacts with the system through a graphical user interface (GUI), which needs to be user friendly. The important features of the security services are highlighted to the administrator. The interface should switch properly between different screens, and operations should execute correctly. The controls available to the administrator, such as buttons, textboxes and scrollbars, need to comply with industry standards. The efficiency and functionality of these features need to be tested.

2. Functional Testing:-

Functional testing checks the functionality of the application, here exercised through the GUI. All features and functions of the system (software, hardware, etc.) are tested to ensure that requirements and specifications are met. The testing is conducted on the complete, integrated system to evaluate its compliance with the specified requirements. Functional testing falls within the scope of black-box testing and, as such, should require no knowledge of the inner design of the code or logic. The basic functional requirements of the system should also be fulfilled and tested.

3. Stability Testing:-

The admin has to control the application for long periods. The application must have a scheduling functionality whereby its components run in the background on the proxy servers, policy server and database server. When invoked, the server comes to the foreground and responds to the admin's request. The stability of the system in such scenarios should be tested.

6.1.1 Test Goals:-

The software is intended to provide a very user-friendly GUI to the system administrators. Most of the testing is aimed at ensuring that this requirement is fulfilled. The test goals are:

1. To ensure that the admin receives the correct reply from the policy server and proxy server.

2. To make sure that changes and updates made by the admin are correctly reflected on the policy and proxy servers.

3. To ensure that data is stored in and fetched from the database server without any problem.

4. To ensure that the policy repository and rule base are always consistent, integrated and durable during the life cycle of the system.

5. To ensure that the values of False Positive Count [FPC] and False Negative Count [FNC] are always below the danger level.

6. To make sure that the fitness value is always high: a higher fitness value indicates a fitter population, that is, a population with more frequently occurring and more accurate individuals overall.
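Goal 5 can be checked mechanically once every test packet carries a known label (intrusive or normal). The following is only a minimal Java sketch under stated assumptions, not the project's actual code: the class name DetectionTally, its methods and the limits passed to withinLimits are hypothetical, and the "danger level" is assumed to be an upper bound on each count chosen by the tester.

```java
// Hypothetical tally of detector verdicts against known labels of test packets.
final class DetectionTally {
    int truePositives;
    int falsePositives;   // FPC: normal packets that were flagged
    int trueNegatives;
    int falseNegatives;   // FNC: intrusive packets that were missed

    // flagged: verdict of the rule base; intrusive: known label of the packet.
    void record(boolean flagged, boolean intrusive) {
        if (flagged && intrusive) {
            truePositives++;
        } else if (flagged) {
            falsePositives++;
        } else if (intrusive) {
            falseNegatives++;
        } else {
            trueNegatives++;
        }
    }

    // Goal 5: both counts must stay strictly below the assumed limits.
    boolean withinLimits(int maxFpc, int maxFnc) {
        return falsePositives < maxFpc && falseNegatives < maxFnc;
    }

    public static void main(String[] args) {
        DetectionTally tally = new DetectionTally();
        // Each pair is {flagged, intrusive}; the values are made up for illustration.
        boolean[][] runs = {
            {true, true}, {true, false}, {false, true}, {false, false}, {true, true}
        };
        for (boolean[] run : runs) {
            tally.record(run[0], run[1]);
        }
        System.out.println("FPC=" + tally.falsePositives + " FNC=" + tally.falseNegatives);
        System.out.println("Within limits: " + tally.withinLimits(2, 2));
    }
}
```

With the five illustrative verdicts above, the tally records one false positive and one false negative, so a limit of 2 on each count is satisfied.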

Project Title: GENETIC-FEEDBACK ALGORITHM BASED NETWORK SECURITY POLICY FRAMEWORK.

Developer Requirements:-

1. Processor : 1 GHz and above

2. Primary memory : 1 GB of RAM

3. Operating System : Windows XP

Software Resources:-

1. Microsoft Access

2. Java Virtual Machine

3. Jpcap network packet analyzer

To run application:-

1. Software : Microsoft Access, Java

2. Hardware : PCs (for Proxy Server, Admin, Policy Repository, Rule Base)

3. Primary memory : 1 GB RAM

6.1.3 Features to be tested:-


Sr. No. | Acceptance Tests | Result

1. The system must provide maximum data recovery.

2. A user can access the system only if he or she is authenticated.

3. There should be an administrator console offering all administrator functions behind a password-authenticated login.

4. The system should not proceed if the administrator has not selected an appropriate option.

5. Updating takes place only if the administrator enters or modifies the text.

6. The time required for retrieving data must be as short as possible, with maximum efficiency.

7. The system should have an option for exiting the system.

6.1.4 Test Team:-

1. ROHAN KULKARNI

2. VIRAL PATEL

3. SAGAR ROTHAWAN

4. MAYURESH SHIVADE

6.2. TEST DELIVERABLES:-

1. Acceptance test plan

2. System/Integration test plan

3. Unit test plans/turnover documentation

4. Screen prototypes

5. Report mock-ups

6. Defect/Incident reports and summaries

7. Test logs and turnover report


6.3. REMAINING TEST TASKS:-

TASK | ASSIGNED TO
Create Acceptance Test Plan | ROHAN, SAGAR, VIRAL
Create System/Integration Test Plan | ROHAN, MAYURESH
Define Unit Test rules and Procedures | SAGAR, VIRAL
Define Turnover procedures for each level | SAGAR, ROHAN
Verify prototypes of Screens | VIRAL, MAYURESH
Verify prototypes of Reports | ROHAN, MAYURESH

6.4. STAFFING AND TRAINING NEEDS:-

There are four people allocated for the completion of the project. The individual skill set of each member is given in the table:

Skill | ROHAN | SAGAR | VIRAL | MAYURESH
Programming languages | Java | Java | Java | Java
Operating systems | Windows / Linux | Windows / Linux | Windows / Linux | Windows / Linux
Tools | jpcap | jpcap | jpcap | jpcap

6.5. SCHEDULE:-

The test plan includes various types of testing, viz. manual testing, performance testing and automated testing. These tests should be well planned and executed accordingly. The test schedule is shown in the table:

# | Test Type | Start Date | End Date
1 | Manual Testing | 25/3/2011 | 27/3/2011
2 | Run Performance Tests | 27/3/2011 | 30/3/2011
3 | Finalize Testing | 30/3/2011 | 31/3/2011

6.6. RESPONSIBILITIES:-

Responsibility | Rohan Kulkarni | Sagar Rothawan | Viral Patel | Mayuresh Shivade
Acceptance test documentation & execution | | | |
System/Integration test documentation & execution | | | |
Unit test documentation & execution | | | |
System design review | | | |
Detail design reviews | | | |
Test procedures and rules | | | |
Change control and regression testing | | | |

6.7. TEST ITEMS (FUNCTIONS):-

Test Case ID | 01
Project Name | Genetic-Feedback Algorithm Based Network Security Policy Framework
Test Case Name | Main Page – Client Side
Test Case Description | To accept registration data.

Step No. | Step Description | Input Data | Expected Result | Actual Result
1 | Enter alphabet in mobile number textbox | Alphabets | Error message showing “Enter numbers only” | Error message.
2 | Enter mobile number < 10 digits | < 10 digits | Error message showing “Incorrect number” | Error message.
3 | Enter mobile and password correctly | Mobile number and password | Successful login | Execute successfully.
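The input checks exercised by Test Case 01 could look roughly like the Java sketch below. This is an illustration under assumptions, not the project's code: the class LoginInputCheck and the method validateMobile are hypothetical names, only the two error messages are taken from the test case, and the treatment of empty input is an assumption. Numbers longer than 10 digits are not covered by the test case and are therefore not checked here.

```java
// Hypothetical helper illustrating the checks exercised by Test Case 01.
final class LoginInputCheck {

    // Returns the error message to display, or null when the value passes both checks.
    static String validateMobile(String input) {
        // Assumption: a missing or empty value is treated like non-numeric input.
        if (input == null || !input.matches("[0-9]+")) {
            return "Enter numbers only";
        }
        // Step 2 of the test case: fewer than 10 digits is rejected.
        if (input.length() < 10) {
            return "Incorrect number";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(validateMobile("abcde"));      // prints: Enter numbers only
        System.out.println(validateMobile("12345"));      // prints: Incorrect number
        System.out.println(validateMobile("9876543210")); // prints: null
    }
}
```

Returning the message instead of showing a dialog keeps the check independent of the GUI, so the same steps can be run as an automated unit test.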

Test Case ID | 02
Project Name | Genetic-Feedback Algorithm Based Network Security Policy Framework
Test Case Name | Main Page – Server Side
Test Case Description | To login

Step No. | Step Description | Input Data | Expected Result | Actual Result
1 | Enter username and password correctly | Username and password | Successful login | Execute successfully and login.

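Test Case ID | 03
Project Name | Genetic-Feedback Algorithm Based Network Security Policy Framework
Test Case Name | IMPORT TEMPLATES
Test Case Description | To import templates.

Step No. | Step Description | Input Data | Expected Result | Actual Result
1 | Do not import any template and press OK | No selection | Error message “please import test file and model file”. | Error message.
2 | Import template and press OK | Import file. | Predict successfully and gives result. | Execute successfully.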

6.8. SOFTWARE RISK ISSUES:-

An effective strategy to deal with risk must consider three issues:

1: Risk Avoidance

2: Risk Monitoring

3: Risk Management and Contingency Planning

The risks listed in the risk table for the given project can be handled in the following ways:

A: Larger number of network events than planned:-

In case of heavy network traffic, the system recovery strategy can be applied. Concurrent access to the database and concurrent updating of tables are also managed.


B: Project is not completed by the delivery date:-

This is a business risk and can be eliminated by forming a sturdy project plan. The plan must be strictly followed. This has been checked throughout the life of the project.

C: End user resists system:-

The administrator may dislike the system, or the user may not feel comfortable with it, if its presentation is too complex. Providing an attractive, easy and extremely user-friendly interface can eliminate this risk. System crash conditions must also be eliminated for the convenience of the user.

D: Lack of trained staff:-

This risk is not very difficult to handle, since the user-friendly graphical interface itself will guide the user through the system. No special skill set is required. It is assumed that naïve users will operate this system.

E: Lack of training on tools:-

This risk is associated with the developers' inability to use the tools and can be handled by employing developers with the skills required for the project.

F: Loss of funding:-

The funding can be lost in the event of administrator dissatisfaction. The prototype of the system provided to the administrator must keep him engrossed and waiting for the product.

G: Required resources not available on the host:-

This risk arises from overestimating the client's setup. The client infrastructure and the state of the machines must be checked.

H: Customer may change requirements:-

This becomes a critical risk, and is aggravated further if the system is near completion. Every stage of the SDLC requires a complete understanding of the requirements. It is best that the system be built only when the requirements are frozen.

I: Less reuse than expected:-

Flexibility must be incorporated in the project to enable further improvements as well as reuse for another client.


7. APPLICATION AND FUTURE ENHANCEMENT


CHAPTER 7

APPLICATION AND FUTURE ENHANCEMENT

7.1 APPLICATIONS:-

At present there are no such systems in the market that follow a self-learning algorithm such as a Genetic Algorithm. Existing systems all need updates about current threats in order to counter them. This system will be independent and will not require any outside support to counter new risks. Its applications are:

1. Network Security.

2. Network Intrusion Detection.

3. Unauthorized Network Access.

4. Organization Network Security and Control.

5. Packet analyzer and sniffer.

6. College intranet security system.

7.2. FUTURE ENHANCEMENT:-

1. Improved rule base generation techniques.

2. More efficient packet sniffers and gene checking algorithms.

3. To improve FPC and FNC, the target network must first be studied thoroughly to identify which of FPC and FNC has the major impact. If both have a simultaneous effect, the shortcoming could be overcome by a suitable combination of rules generated while considering FPC and FNC separately; this is left as future work.

4. A detailed specification of the parameters to consider for the genetic algorithm should be determined during the experiments.

5. Combining knowledge from different security sensors into a standard rule base is

another promising area in this work.


8. CONCLUSION


CHAPTER 8

CONCLUSION

Through this project we have introduced a new and improved model for a genetic-feedback-algorithm-based network security policy framework. The fitness function and the parameters affecting it are also taken into consideration. This new model is much more simplified and implementable.

The simulation results show that new rules generated by GA have better potential to detect attacks. However, this technique is not sufficient to improve both FPC and FNC simultaneously. Before the technique is deployed, the target network must first be studied thoroughly to identify which of FPC and FNC has the major impact. If both have a simultaneous effect, the shortcoming could be overcome by a suitable combination of rules generated while considering FPC and FNC separately; this is left as future work.


9. BIBLIOGRAPHY


CHAPTER 9

BIBLIOGRAPHY

9.1 BOOKS:-

1. Research on Policy-Based Security Management by W.E. Walsh.

2. An AI perspective on autonomic computing policies by Crosbie, Mark and Spafford.

3. Genetic and Evolutionary Algorithms: Principles, Methods and Algorithms.

4. Using Genetic Algorithm for Network Intrusion Detection by Wei Li.

5. Framework for Policy-based Admission Control by Crosbie, Mark and Spafford.

9.2 WEBSITES :-

1. http://www.geatbx.com/docu/algindex.html

2. http://www-dse.doc.ic.ac.uk/policies/

3. http://www.Security.cse.msstate.edu/

9.3 RESEARCH PAPERS:-

[1] R. Shirey, “Internet Security Glossary”, RFC 2828, May 2000.

[2] Hartmut Pohlheim, “Genetic and Evolutionary Algorithms: Principles, Methods and Algorithms”, Genetic and Evolutionary Algorithm Toolbox, http://www.geatbx.com/docu/algindex.html, 30 Oct. 2003.

[3] Stuart Russell and Peter Norvig, “Artificial Intelligence: A Modern Approach”, second edition, Pearson Education, 2004.

[4] Abu Sayed Md. Mostafizur Rahaman, Akram Hossain, Md. Abdur Rahman, Abeda Sultana, Jesmin Akhter, “Genetic Programming: Novel Network Attacks Detection”, 8th International Conference on Computer and Information Technology, 2005.

[5] “Policies for Network and Distributed Systems Management”, http://www-dse.doc.ic.ac.uk/policies/, Imperial College.

[6] P. Flegkas, P. Trimintzios, G. Pavlou, I. Andrikopoulos, C.F. Cavalcanti, “On Policy-based Extensible Hierarchical Network Management in QoS-enabled IP Networks”, Policies for Distributed Systems and Networks: International Workshop, POLICY 2001, Bristol, UK, January 2001.

[7] Wei Li, “Using Genetic Algorithm for Network Intrusion Detection”, Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS 39762, http://www.Security.cse.msstate.edu, 2004.

[8] Adhitya Chittur, “Model Generation for an Intrusion Detection System Using Genetic Algorithms”, Ossining High School, Ossining, NY, November 27, 2001.

[9] John R. Koza, “Genetic Programming: On the Programming of Computers by Means of Natural Selection”, MIT Press, Cambridge, MA, 1992.

[10] Mark Crosbie and Gene Spafford, “Applying Genetic Programming Techniques to Intrusion Detection”, in Proceedings of the AAAI 1995 Fall Symposium, November 1995.

[11] Bob Adolf, “New Paradigms for Intrusion Detection Using Genetic Programming”, January 7, 2004.

