1. SYSTEM STUDY
SKNCOE, Department of Computer Engineering, 2010-2011
CHAPTER 1
SYSTEM STUDY
1.1. BACKGROUND
Intrusion detection has been extensively studied since the seminal report written by
Anderson (1980). Traditionally, intrusion detection techniques are divided into misuse
detection and anomaly detection. Misuse detection techniques mainly focus on developing
models of known attacks, which can be described by specific patterns or sequences of events
and data. Anomaly detection techniques model the normal behaviour of systems or users, and any
deviation from that normal behaviour is considered an intrusion. Misuse detection
techniques have low False Detection Rates (FDR), but their major weakness is that novel or
unknown attacks go unnoticed until corresponding signatures are added to the database of
the Intrusion Detection System (IDS). Anomaly detection techniques have the potential to
detect such attacks, but quite often they tend to have a high FDR because it is difficult to
discriminate between abnormal and intrusive behaviour.
We have studied the previous work of Koza (1992) and Wong and Leung (2000) on
detecting known or novel attacks on the network using Genetic Programming (GP). GP extends the
fundamental idea of the Genetic Algorithm (GA) and evolves more complex data structures. To
do so, it uses chromosome-like data structures whose fields describe each
packet and its attributes. The main task is simply to check the “Fitness Level” of the
packet; the packet itself is not modified in any way. To determine the Fitness Level, initial
rules are established which act as a background knowledge database for known attacks and are
easily represented as simple if(....) then(....) statements. As new attacks are made on the
system, new rules are created from the packet’s origin and its attributes. This gave rise to
the Intelligent Intrusion Detection System (IIDS).
Fig 1:- Structure of a simple genetic algorithm (Pohlheim, 2001)
1.2. CLASSIFICATION
The system is divided into four subparts:
1. Packet Sniffer.
2. Genetic Algorithm (GA).
3. Policy Management.
4. Administrator notification (SMS Gateway).
1.2.1. Packet Sniffer:-
Jpcap is a Java library for capturing and sending network packets. Using Jpcap, you
can develop applications to capture packets from a network interface and visualize/analyze
them in Java. You can also develop Java applications to send arbitrary packets through a
network interface. Jpcap has been tested on Microsoft Windows (98/2000/XP/Vista), Linux
(Fedora, Mandriva, Ubuntu), Mac OS X (Darwin), FreeBSD, and Solaris. Jpcap can capture
Ethernet, IPv4, IPv6, ARP/RARP, TCP, UDP, and ICMPv4 packets. Jpcap is open source,
and is licensed under GNU LGPL.
The Jpcap distribution includes both:
A tool for real-time network traffic capture and analysis.
An API for developing packet capture applications in Java.
The jpcap network capture tool performs real-time decomposition and visualization of
network traffic. Screenshots of the capture console and visualization component in action:
Fig. 2:- Screenshot of jpcap capture tool console capturing and converting network packets to
Java objects
1.2.2. GA MODULE:-
Genetic Algorithms were invented to mimic some of the processes observed in natural
evolution. Many people, biologists included, are astonished that life at the level of complexity
that we observe could have evolved in the relatively short time suggested by the fossil record.
The idea with GA is to use this power of evolution to solve optimization problems. The father
of the original Genetic Algorithm was John Holland, who invented it in the early 1970s.
GAs simulate the survival of the fittest among individuals over consecutive
generations for solving a problem. Each generation consists of a population of character
strings that are analogous to the chromosomes that we see in our DNA. Each individual
represents a point in a search space and a possible solution. The individuals in the population
are then made to go through a process of evolution.
GAs are based on an analogy with the genetic structure and behaviour of
chromosomes within a population of individuals using the following foundations:
Individuals in a population compete for resources and mates.
Those individuals most successful in each 'competition' will produce more
offspring than those individuals that perform poorly.
Genes from `good' individuals propagate throughout the population so that
two good parents will sometimes produce offspring that are better than either
parent.
Thus each successive generation will become more suited to its
environment.
1.2.3. POLICY MANAGEMENT:-
The term security policy refers to numerous aspects of information systems’ security
such as Network Security Policy (SP), Access control SP, Key management SP, etc. To unify
all these views, we define a security policy as “a set of rules that determine how a particular
set of assets should be secured”. Furthermore, the SP should be split into multiple
components as it should address all the security requirements of the enterprise. We found that
all of these policy components can be specified similarly even though they use distinct
security techniques. Abstracting away from its context, an SP representation should contain
the protected asset, the operations modeling their interaction, and the security properties that
must be followed. According to this view, the assets of the protected infrastructure are
categorized into sorts. For example, to establish a connection between two machines, we need
three sorts (i.e. host, port, protocol).
In order to fully assess how suspicious a connection is, we would need to examine all fields
related to that network connection. For simplicity, we only consider some obvious attributes
for each connection. The definition of rules (for TCP/IP protocols) is shown in Table 1.
The corresponding rule for the “Example Value” attribute in Table 1 could be translated as:
if {the connection has the following information: source IP address 209.11.??.??; destination IP
address:130.18.176+?.??; source port number: 42335; destination port number: 80;
connection time: 482 seconds; the connection is stopped by the originator; the protocol used
is TCP; the originator sent 7320 bytes of data; and the responder sent 38891 bytes of data }
then {stop the connection}
We can convert the above example into chromosome form, as shown in Fig 3
below.
Fig 3: Chromosome encoding
The actual validity of this rule is examined by matching it against a historical data set
of connections, each marked as either anomalous or normal. If the rule matches an
anomalous connection, a bonus is given to the current chromosome. If the rule matches a
normal connection, a penalty is applied to the chromosome. Clearly no single rule can
separate all anomalous connections from normal connections, so the population
needs to evolve to find the optimal rule set.
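The bonus and penalty scheme described above can be sketched as follows. This is a minimal illustration, not the project's actual evaluation function: the class `RuleFitness`, the `LabeledConnection` type, and the equal weights `BONUS` and `PENALTY` are all assumptions, since the source does not state how large the bonus or penalty is.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper: scores one rule against labelled historical connections. */
final class RuleFitness {

    /** One historical connection: its attribute values and its known label. */
    static final class LabeledConnection {
        final int[] attributes;
        final boolean anomalous;

        LabeledConnection(int[] attributes, boolean anomalous) {
            this.attributes = attributes;
            this.anomalous = anomalous;
        }
    }

    // Assumed weights; the source does not give the size of the bonus or penalty.
    static final double BONUS = 1.0;
    static final double PENALTY = 1.0;

    /** Adds BONUS per matched anomalous connection, subtracts PENALTY per matched normal one. */
    static double score(Predicate<int[]> rule, List<LabeledConnection> history) {
        double fitness = 0.0;
        for (LabeledConnection c : history) {
            if (!rule.test(c.attributes)) {
                continue; // connections the rule does not match leave the score unchanged
            }
            fitness += c.anomalous ? BONUS : -PENALTY;
        }
        return fitness;
    }
}
```

A rule that matches two anomalous connections and one normal connection would score 1.0 under these assumed weights, so rules that mostly fire on intrusive traffic rise in the ranking.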
In the example shown in Table 1, some wild cards (the ‘*’ character and the ‘?’
character) are used, and the corresponding genes within the chromosome are shown as –1.
These wild cards represent an appropriate range of specific values (Crosbie and
Spafford, 1995). They are useful when representing a network block (a range of IP addresses or
port numbers) in a rule. Once this spatial information is included in the rules, the capability of
the IDS can be greatly improved, as an intrusion may be initiated from many different locations.
Including the duration of a network connection in the chromosome incorporates
temporal information for network connections. The maximum value of the
duration is 99999999 seconds, which is a little over three years. This is helpful for identifying
intrusions because complex intrusions may span hours, days, or even months.
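Wild-card matching with –1 genes can be sketched as below. The layout is an assumption made only for illustration (one integer gene per IP octet followed by a port); the actual encoding in Fig 3 and Table 1 may differ, and the class name `WildcardRule` is hypothetical.

```java
/** Hypothetical sketch: a chromosome is an int array in which -1 means "any value". */
final class WildcardRule {

    static final int WILDCARD = -1;

    /** Returns true if every non-wildcard gene equals the corresponding connection field. */
    static boolean matches(int[] genes, int[] connection) {
        if (genes.length != connection.length) {
            throw new IllegalArgumentException("rule and connection must have the same length");
        }
        for (int i = 0; i < genes.length; i++) {
            if (genes[i] != WILDCARD && genes[i] != connection[i]) {
                return false; // a fixed gene disagrees with the observed value
            }
        }
        return true;
    }
}
```

Under this assumed layout, a rule for source address 209.11.??.?? and destination port 80 would be written as `{209, 11, -1, -1, 80}`, and it matches any connection from that network block to port 80.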
The genetic algorithm starts with a population of randomly selected rules. The
population evolves through the crossover and mutation operators. Due to the
effectiveness of the evaluation function, succeeding populations are biased toward rules
that match intrusive connections. When the algorithm stops, rules are selected and
added to the IDS rule base.
Traditional policy-based management systems rely on static authoring of
“if [condition] then [action]” rules, an approach that becomes inadequate.
Utility functions, goal policies, data mining, and reinforcement learning have
emerged as new approaches. However, these approaches fail to respond in a systematic
manner to new and ambiguous situations. Our contribution is to provide solutions
to these problems using a genetic algorithm based policy framework. This policy system
has the four basic components described in the IETF policy framework: the
policy repository, the policy management tool, the policy decision point, and the policy
enforcement point.
Features of the project
Auto Rule Base Generation
Auto restore
Negligible Administrator Presence required
More Security
Less resources required
1.2.4. Administrator notification (SMS Gateway):-
An SMS gateway is a way of sending a text message with or without using a mobile
(cell) phone. Specifically, it is a device or service offering SMS transit by either transforming
messages to mobile network traffic from other media or by allowing transmission or receipt
of SMS messages with or without the use of a mobile phone. Typical use of a gateway would
be to forward simple email to a mobile phone recipient. It can also be useful in developing
web applications that users can interact with via SMS (Short Message Service).
Figure 4:- An SMS text messaging application connects to SMSCs through an SMS gateway.
1.3 SYSTEM OVERVIEW
The importance of network security grows with the number and variety of attacks,
which makes it a natural application for a genetic algorithm.
Here the basic goal of the genetic algorithm is to maintain the system’s integrity and security. There are
compromises to be made with respect to space and time complexity. When mutation and
crossover are applied to incoming packets, the offspring (new packets generated as a result of
crossover and mutation) must be constrained so that they do not overload
the system itself. Hence specific threshold values are provided for this purpose.
A population of individuals is maintained within search space for a GA, each
representing a possible solution to a given problem. Each individual is coded as a finite
length vector of components, or variables, in terms of some alphabet, usually
the binary alphabet {0,1}. To continue the genetic analogy these individuals are likened to
chromosomes and the variables are analogous to genes. Thus a chromosome (solution) is
composed of several genes (variables). A fitness score is assigned to each solution
representing the abilities of an individual to `compete'. The individual with the optimal (or
generally near optimal) fitness score is sought. The GA aims to use selective `breeding' of the
solutions to produce `offspring' better than the parents by combining information from the
chromosomes.
Figure 5 :- Chromosome Generation
The GA maintains a population of n chromosomes (solutions) with associated fitness
values. Parents are selected to mate, on the basis of their fitness, producing offspring via a
reproductive plan. Consequently highly fit solutions are given more opportunities to
reproduce, so that offspring inherit characteristics from each parent. As parents mate and
produce offspring, room must be made for the new arrivals since the population is kept at a
static size. Individuals in the population die and are replaced by the new solutions, eventually
creating a new generation once all mating opportunities in the old population have been
exhausted. In this way it is hoped that over successive generations better solutions will thrive
while the least fit solutions die out.
New generations of solutions are produced containing, on average, more good genes
than a typical solution in a previous generation. Each successive generation will contain more
good `partial solutions' than previous generations. Eventually, once the population has
converged and is not producing offspring noticeably different from those in previous
generations, the algorithm itself is said to have converged to a set of solutions to the problem
at hand.
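The generational cycle described above (a fixed-size population, fitness-based choice of parents, offspring replacing the old generation) can be sketched as below. This is an illustrative toy rather than the project's implementation: the fitness function simply counts 1 bits, and the binary tournament, the single cut point, the 1% mutation rate, and the retention of the single best individual are all assumptions chosen to keep the sketch short.

```java
import java.util.Random;

/** Minimal generational GA sketch on bit strings; all parameters are illustrative. */
final class SimpleGa {

    static final double MUTATION_RATE = 0.01; // assumed value

    /** Toy objective (an assumption): the number of genes set to true. */
    static int fitness(boolean[] chromosome) {
        int ones = 0;
        for (boolean gene : chromosome) {
            if (gene) ones++;
        }
        return ones;
    }

    /** Binary tournament: draw two individuals at random and keep the fitter one. */
    static boolean[] selectParent(boolean[][] population, Random rng) {
        boolean[] a = population[rng.nextInt(population.length)];
        boolean[] b = population[rng.nextInt(population.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    static boolean[] fittest(boolean[][] population) {
        boolean[] top = population[0];
        for (boolean[] c : population) {
            if (fitness(c) > fitness(top)) top = c;
        }
        return top;
    }

    /** Runs the GA and returns the fittest chromosome of the final generation. */
    static boolean[] evolve(int populationSize, int length, int generations, long seed) {
        Random rng = new Random(seed);
        boolean[][] population = new boolean[populationSize][length];
        for (boolean[] c : population) {
            for (int i = 0; i < length; i++) c[i] = rng.nextBoolean();
        }
        for (int g = 0; g < generations; g++) {
            // The new generation has the same size as the old one, which it fully replaces.
            boolean[][] next = new boolean[populationSize][];
            next[0] = fittest(population).clone(); // keep the current best (assumption)
            for (int k = 1; k < populationSize; k++) {
                boolean[] mother = selectParent(population, rng);
                boolean[] father = selectParent(population, rng);
                int cut = rng.nextInt(length + 1);
                boolean[] child = new boolean[length];
                for (int i = 0; i < length; i++) {
                    child[i] = i < cut ? mother[i] : father[i];       // crossover
                    if (rng.nextDouble() < MUTATION_RATE) child[i] = !child[i]; // mutation
                }
                next[k] = child;
            }
            population = next;
        }
        return fittest(population);
    }
}
```

Because the best individual is carried over unchanged in this sketch, the best fitness can never decrease from one generation to the next; a run such as `SimpleGa.evolve(20, 16, 30, 42L)` is reproducible for a given seed.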
1.4 SYSTEM BEHAVIOR
When packets pass through the packet sniffer module, the attributes of the
packets are copied by the sniffer and passed to the GA operations unit, where they are compared
with the gene pool to identify the most FIT packet (here, the most fit packet is the one
most fit for harming the system). The only anomalous behaviour of the system occurs
when an unidentified packet from a new source enters the system. The system then cannot
decide whether to treat it as FIT or UNFIT.
This problem is addressed by providing a RESTORE POINT and a
FEEDBACK CONSOLE. These units behave like the restore point on a personal
computer. When an incoming packet is of unknown origin and attributes, the system allows
it to perform its task. If the feedback console later discovers that the system has
behaved abnormally, the system is restored to its previous state. At the same time,
the packet responsible is added to the gene pool as an unfit packet, and
further such packets are blocked permanently.
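The restore-point behaviour can be sketched as a small state holder. Everything here is hypothetical and simplified for illustration: the class `RestoreGuard`, the use of a plain string as a packet signature, and the use of a string to stand in for the system state are assumptions, not the project's design.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the RESTORE POINT / FEEDBACK CONSOLE idea. */
final class RestoreGuard {

    private final Set<String> blockedGenePool = new HashSet<>();
    private String systemState;
    private String restorePoint;

    RestoreGuard(String initialState) {
        this.systemState = initialState;
        this.restorePoint = initialState;
    }

    /** Rejects packets already marked unfit; otherwise records a restore point and admits. */
    boolean admit(String packetSignature) {
        if (blockedGenePool.contains(packetSignature)) {
            return false;
        }
        restorePoint = systemState; // snapshot taken before the unknown packet acts
        return true;
    }

    /** Stand-in for whatever an admitted packet does to the system. */
    void applyChange(String newState) {
        systemState = newState;
    }

    /** Called by the feedback console when abnormal behaviour is traced to a packet. */
    void reportAbnormal(String packetSignature) {
        systemState = restorePoint;          // roll back to the previous state
        blockedGenePool.add(packetSignature); // block further packets of this kind
    }

    String state() {
        return systemState;
    }
}
```

In use, an unknown packet is admitted once; if the feedback console reports it, the state returns to the snapshot and later calls to `admit` with the same signature return false.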
1.4.1 Basic Implementation Details
Based on Natural Selection
After an initial population is randomly generated, the algorithm evolves the population through three
operators:
1. selection which equates to survival of the fittest;
2. crossover which represents mating between individuals;
3. mutation which introduces random modifications.
1. Selection Operator
Key idea: give preference to better individuals, allowing them to pass on their
genes to the next generation.
The goodness of each individual depends on its fitness.
Fitness may be determined by an objective function or by a subjective
judgement.
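One common way to give preference to fitter individuals is fitness-proportionate (roulette-wheel) selection, sketched below. The source does not say which selection scheme the project uses, so this technique and the class name `RouletteSelection` are assumptions for illustration only.

```java
import java.util.Random;

/** Hypothetical sketch of fitness-proportionate (roulette-wheel) selection. */
final class RouletteSelection {

    /**
     * Returns the index whose cumulative fitness interval contains the spin.
     *
     * @param fitness non-negative fitness values with a positive sum
     * @param spin    a value in [0, sum of fitness)
     */
    static int select(double[] fitness, double spin) {
        double cumulative = 0.0;
        for (int i = 0; i < fitness.length; i++) {
            cumulative += fitness[i];
            if (spin < cumulative) {
                return i;
            }
        }
        return fitness.length - 1; // fallback for floating-point rounding
    }

    /** Draws a random spin so that each individual is chosen in proportion to its fitness. */
    static int select(double[] fitness, Random rng) {
        double total = 0.0;
        for (double f : fitness) {
            total += f;
        }
        return select(fitness, rng.nextDouble() * total);
    }
}
```

With fitness values {1, 3, 6}, the third individual owns six tenths of the wheel and is therefore the most likely parent, while the weakest individual still has a small chance of being chosen.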
2. Crossover Operator
Prime factor distinguishing GA from other optimization techniques
Two individuals are chosen from the population using the selection operator
A crossover site along the bit strings is randomly chosen
The values of the two strings are exchanged up to this point
If S1=000000 and S2=111111 and the crossover point is 2 then S1'=110000
and S2'=001111
The two new offspring created from this mating are put into the next
generation of the population
By recombining portions of good individuals, this process is likely to create
even better individuals
Figure 6:- Chromosome Crossover
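The single-point crossover worked through above can be sketched on bit strings as follows. The class and method names are hypothetical; the behaviour follows the text, where the two parents exchange their values up to the crossover point.

```java
/** Hypothetical sketch of single-point crossover on equal-length bit strings. */
final class SinglePointCrossover {

    /** Swaps the first {@code point} characters of the two parents and returns both offspring. */
    static String[] cross(String s1, String s2, int point) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException("parents must have equal length");
        }
        if (point < 0 || point > s1.length()) {
            throw new IllegalArgumentException("crossover point out of range");
        }
        String child1 = s2.substring(0, point) + s1.substring(point);
        String child2 = s1.substring(0, point) + s2.substring(point);
        return new String[] {child1, child2};
    }
}
```

Calling `SinglePointCrossover.cross("000000", "111111", 2)` yields "110000" and "001111", which agrees with the example in the text.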
3. Mutation Operator
With some low probability, a portion of the new individuals will have some of
their bits flipped.
Its purpose is to maintain diversity within the population and inhibit premature
convergence.
Mutation alone induces a random walk through the search space
Mutation and selection (without crossover) create a parallel, noise-tolerant,
hill-climbing algorithm
Figure 7:- Chromosome Mutation
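Bit-flip mutation can be sketched as below. The per-bit mutation rate is passed in as a parameter because the source only says the probability is low; the class name `BitFlipMutation` is hypothetical.

```java
import java.util.Random;

/** Hypothetical sketch of bit-flip mutation on a bit string. */
final class BitFlipMutation {

    /** Flips each bit independently with probability {@code rate}. */
    static String mutate(String bits, double rate, Random rng) {
        StringBuilder out = new StringBuilder(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            char bit = bits.charAt(i);
            if (rng.nextDouble() < rate) {
                bit = (bit == '0') ? '1' : '0'; // flip this gene
            }
            out.append(bit);
        }
        return out.toString();
    }
}
```

A rate of 0.0 leaves the string unchanged and a rate of 1.0 flips every bit; in practice a small rate would be used so that mutation adds diversity without destroying good individuals.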
1.4.2 Effects of Genetic Operators
Using selection alone will tend to fill the population with copies of the best
individual from the population
Using selection and crossover operators will tend to cause the algorithm to
converge on a good but sub-optimal solution
Using mutation alone induces a random walk through the search space.
Using selection and mutation creates a parallel, noise-tolerant, hill climbing
algorithm
1.5. FEASIBILITY STUDY
The feasibility study comprises an initial investigation into whether the project is viable
and what personnel will be required. A feasibility study helps in making informed and
transparent decisions at crucial points during the development process.
1.5.1. Market Feasibility:-
To date, similar systems have provided the same services at a larger scale but without a restore
point, which made them more vulnerable. We instead propose a smaller-scale but more
reliable system.
1.5.2. Resource Feasibility:-
We can say with confidence that the system is technically feasible, since there will not be much
difficulty in obtaining the resources required to develop and maintain the system.
All resources needed for the development of the software, as well as for its maintenance,
are available. Here we are utilizing resources that are already available.
1.5.3 Legal Feasibility:-
The software used does not violate any privacy act. The packet sniffer extracts only
the attributes of a packet and not its content, so no legal issues are created. Also, the
final authority is given to the system, which helps it avoid a possible cause of failure, as
whatever decision results comes from authorised personnel.
1.5.4 Economic Feasibility:-
This is an evaluation of development cost against the ultimate income or benefits derived from
the developed system. Economic justification includes the costs and benefits for which the
project is to be developed and implemented. Development of this application is highly
economically feasible. We need not spend much money to accomplish the project,
since the resources needed for the development of the system are already available. The only
thing to be done is to set up a development environment with effective supervision.
If we do so, we can attain the maximum usability of the corresponding resources.
Therefore the system is economically feasible.
1.5.5 Operational Feasibility:-
The system is fully automated, so it does not require constant monitoring. The system keeps
track of the number of packets reaching the registered users, and a log of incoming packets is
maintained. So, the system is operationally feasible. It will also remind the user when its
validity is about to end.
1.6 SYSTEM REQUIREMENTS:-
SOFTWARE REQUIREMENTS:
Windows XP/Vista/Windows 7
IDE: Eclipse
Language: Java
JPCAP as Packet sniffer
HARDWARE REQUIREMENTS:
Dual Core 2.0 GHz or above
1 GB RAM or above and 20 GB hard disk space
LAN Card
Multiple nodes in same network
2. LITERATURE SURVEY
CHAPTER 2
LITERATURE SURVEY
2.1. NETWORK SECURITY:-
In the field of networking, the area of network security consists of the provisions and
policies adopted by the network administrator to prevent and monitor unauthorized access,
misuse, modification, or denial of the computer network and network-accessible resources.
Network Security is the authorization of access to data in a network, which is controlled by
the network administrator. Users are assigned an ID and password that allows them access to
information and programs within their authority. Network Security covers a variety of
computer networks, both public and private, that are used in everyday jobs conducting
transactions and communications among businesses, government agencies and individuals.
Networks can be private, such as within a company, or open to
public access. Network Security is involved in organizations, enterprises, and all other types of
institutions. It does as its title explains: it secures the network. Network security starts from
authenticating the user, commonly with a username and a password. Since this requires just
one thing besides the user name, i.e. the password which is something you 'know', this is
sometimes termed one factor authentication. With two factor authentication something you
'have' is also used (e.g. a security token or 'dongle', an ATM card, or your mobile phone), or
with three factor authentication something you 'are' is also used (e.g. a fingerprint or retinal
scan). Once authenticated, a firewall enforces access policies such as what services are
allowed to be accessed by the network users. Though effective to prevent unauthorized
access, this component may fail to check potentially harmful content such as computer
worms or Trojans being transmitted over the network. Anti-virus software or an intrusion
prevention system (IPS) helps detect and inhibit the action of such malware. An anomaly-based
intrusion detection system may also monitor the network and traffic for unexpected (i.e.
suspicious) content or behavior and other anomalies to protect resources, e.g. from denial of
service attacks or an employee accessing files at strange times. Individual events occurring on
the network may be logged for audit purposes and for later high level analysis.
Features of IDS (INTRUSION DETECTION SYSTEM):-
An intrusion detection system (IDS) is a device or software application that
monitors network and/or system activities for malicious activities or policy violations and
produces reports to a Management Station. Intrusion prevention is the process of performing
intrusion detection and attempting to stop detected possible incidents. Intrusion detection and
prevention systems (IDPS) are primarily focused on identifying possible incidents, logging
information about them, attempting to stop them, and reporting them to security
administrators. In addition, organizations use IDPSs for other purposes, such as identifying
problems with security policies, documenting existing threats, and deterring individuals from
violating security policies. IDPSs have become a necessary addition to the security
infrastructure of nearly every organization.
IDPSs typically record information related to observed events, notify security
administrators of important observed events, and produce reports. Many IDPSs can also
respond to a detected threat by attempting to prevent it from succeeding. They use several
response techniques, which involve the IDPS stopping the attack itself, changing the security
environment (e.g., reconfiguring a firewall), or changing the attack’s content.
As an alternate solution for protecting computers from malicious users, a model-based
Intrusion Detection System (IDS) may be used. Instead of using a fingerprinting method of
user classification, an IDS compares learned user characteristics from an empirical behavioral
model to all users of a system. User behavior is generally defined as the set of objective
characteristics of a connection between a client (e.g., a user’s computer) and a server. Using a
generalized behavioral model is theoretically more accurate, efficient, and easier to maintain
than a fingerprinting system. This method of detection eliminates the need for an attack to be
previously known to be detected because malicious behavior is different from normal
behavior by nature (Sinclair et al, 1999). Also, a model based system uses a constant amount
of computer resources per user, drastically reducing the possibility of depleting available
resources. Furthermore, while actual attack types by malicious users may vary widely, a
model-based IDS does not require the constant updates typical of fingerprint-based systems
because the characteristics of any attack against a system will not significantly change
throughout the lifetime of the system because attacks are inherently different from normal
behavior (Eskin et al, 2001; Lee et al, 2001; Sinclair et al, 1999). In previous research, the
options for model generation have been to base it on normal users or to base it on malicious
users (Eskin et al, 2001). Models based on normal users, known as Anomaly Detection
models, use an empirical behavioral model of a normal user and classify any computer
activity that does not fit this model as malicious. Models based on malicious users are known
as Misuse Detection models. These models look for a pattern of malicious behavior, and
behavior that fits this model is classified as malicious (Eskin et al, 2001). In this research,
neither model was explicitly specified, allowing the genetic algorithm to generate the best
model. An Intrusion Detection System must first be able to detect malicious user connections,
for which it must have a generalized model of user behavior for comparison to users of a
system. The most efficient method for generating a user model is to apply a data analysis
algorithm to given “training data,” which is representative of real world data (Stolfo et al,
2000), and then generate an empirical model of either type of user based on this training data.
Previous research into empirical model generation has used data analysis algorithms such as
generalized data mining techniques (Lee et al, 1998, 2001), sparse Markov transducers
(Eskin et al, 2001), and genetic algorithms
(Cedex, 1993; Crosbie & Spafford, 1995). Moreover, previous research using genetic
algorithms as a method for intrusion detection has either been theoretical (Cedex, 1993) or
become obsolete and is no longer applicable to current intrusion detection research (Crosbie
& Spafford, 1995). The experiment presented in this paper seeks to test the viability of
genetic algorithms as a method for generating empirical user behavioral models.
A genetic algorithm is a method of data analysis that works analogously to Darwinian
evolution (Koza, 1992). Within a computer simulation, a population of many individuals is
created, each individual representing a possible mathematical model. Each individual has one
or more chromosomes that function as basic instructions to the individual in a cause (e.g.,
input data) and effect (e.g., user classification) manner. An individual is measured by the
aggregate performance of its chromosomes. An initial population is created by complete
randomization of the chromosomes, and individuals of subsequent generations go through
mutations, which are also randomized (Moriarty et al, 1999). As in Darwinism, a population
that goes through many generations eliminates poor performing individuals and allows better
performing individuals to replicate and mutate themselves during each generation. This
genetic algorithm was designed so that each individual represented a possible behavioral
model.
2.2. SURVEY OF EXISTING SYSTEM:-
In recent years, Intrusion Detection System (IDS) has become one of the hottest
research areas in Computer Security. It is an important detection technology and is used as a
countermeasure to preserve data integrity and system availability during an intrusion. When
an intruder attempts to break into an information system or performs an action not legally
allowed, we refer to this activity as an intrusion (Graham, 2002; see also Jones and Sielken,
2000). Intruders can be divided into two groups, external and internal. The former refers to
those who do not have authorized access to the system and who attack by using various
penetration techniques. The latter refers to those with access permission who wish to perform
unauthorized activities. Intrusion techniques may include exploiting software bugs and
system misconfigurations, password cracking, sniffing unsecured traffic, or exploiting the
design flaw of specific protocols (Graham, 2002). An Intrusion Detection System is a system
for detecting intrusions and reporting them accurately to the proper authority. Intrusion
Detection Systems are usually specific to the operating system that they operate in and are an
important tool in the overall implementation of an organization’s information security policy
(Jones and Sielken, 2000), which reflects an organization's statement by defining the rules
and practices to provide security, handle intrusions, and recover from damage caused by
security breaches. There are two generally accepted categories of intrusion detection
techniques: misuse detection and anomaly detection. Misuse detection refers to techniques
that characterize known methods to penetrate a system. These penetrations are characterized
as a ‘pattern’ or a ‘signature’ that the IDS looks for. The pattern/signature might be a static
string or a set sequence of actions. System responses are based on identified penetrations.
Anomaly detection refers to techniques that define and characterize normal or acceptable
behaviors of the system (e.g., CPU usage, job execution time, system calls). Behaviors that
deviate from the expected normal behavior are considered intrusions (Bezroukov, 2002; see
also McHugh, 2001). IDSs can also be divided into two groups depending on where they look
for intrusive behavior: Network-based IDS (NIDS) and Host-based IDS. The former refers to
systems that identify intrusions by monitoring traffic through network devices (e.g. Network
Interface Card, NIC). A host-based IDS monitors file and process activities related to a
software environment associated with a specific host. Some host-based IDSs also listen to
network traffic to identify attacks against a host (Bezroukov, 2002; see also McHugh, 2001).
There are other emerging techniques. One example is known as a blocking IDS, which
combines a host-based IDS with the ability to modify firewall rules (Miller and Shaw, 1996).
Another is called a Honeypot, which appears to be a ‘target’ to an intruder, but is specifically
designed to trap an intruder in order to trace down the intruder’s location and respond to
attack (Bezroukov, 2002).
The Intelligent Intrusion Detection System (IIDS) is an ongoing project at the Center
for Computer Security Research (CCSR) in Mississippi State University. The architecture
combines a number of different approaches to the IDS problem, and includes different AI
techniques to help identify intrusive behavior (Bridges and Vaughn, 2001). It uses both
anomaly detection and misuse detection techniques and is both a network-based and host-
based system. Within the overall architecture of the IIDS, some open-source intrusion
detection software tools are integrated for use as security sensors (Li, 2002), such as Bro
(Paxson, 1998) and Snort (Roesch, 1999). Techniques proposed in this paper are part of the
IIDS research efforts.
Genetic Algorithm (GA) has been used in different ways in IDSs. The Applied
Research Laboratories of the University of Texas at Austin (Sinclair, Pierce, and Matzner
1999) uses different machine learning techniques, such as finite state machine, decision tree,
and GA, to generate artificial intelligence rules for IDS. One network connection and its
related behavior can be translated to represent a rule to judge whether or not a real-time
connection is considered an intrusion. These rules can be modeled as chromosomes inside the
population. The population evolves until the evaluation criteria are met. The generated rule
set can be used as knowledge inside the IDS for judging whether the network connection and
related behaviors are potential intrusions (Sinclair, Pierce, and Matzner 1999). The COAST
Laboratory in Purdue University (Crosbie and Spafford, 1995) implemented an IDS using
autonomous agents (security sensors) and applied AI techniques to evolve genetic algorithms.
Agents are modeled as chromosomes and an internal evaluator is used inside every agent
(Crosbie and Spafford, 1995). In the approaches described above, the IDS can be viewed as a
rule-based system (RBS) and GA can be viewed as a tool to help generate knowledge for the
RBS. These approaches have some disadvantages. In order to detect intrusive behaviors for a
local network, network connections should be used to define normal and anomalous
behaviors. Sometimes an attack can be as simple as scanning for available ports in a server or
a password-guessing scheme. But typically they are complex and are generated by automated
tools that are freely available from the Internet. An example can be a Trojan horse or a
backdoor that can run for a period of time, or can be initiated from different locations. In
order to detect such intrusions, both temporal and spatial information of network traffic
should be included in the rule set. The current GA applications do not address these issues
extensively. This paper shows how network connection information can be modeled as
chromosomes and how the parameters in genetic algorithm can be defined in this respect.
2.3 NEW CONCEPT:-
Genetic algorithms are a family of computational models based on principles of
evolution and natural selection. These algorithms convert the problem in a specific domain
into a model by using a chromosome-like data structure and evolve the chromosomes using
selection, recombination, and mutation operators. The range of the applications that can make
use of genetic algorithm is quite broad (Sinclair, Pierce, and Matzner 1999; see also Whitley,
1994). In computer security applications, it is mainly used for finding optimal solutions to a
specific problem. The process of a genetic algorithm usually begins with a randomly selected
population of chromosomes. These chromosomes are representations of the problem to be
solved. According to the attributes of the problem, different positions of each chromosome
are encoded as bits, characters, or numbers. These positions are sometimes referred to as
genes and are changed randomly within a range during evolution. The set of chromosomes
during a stage of evolution is called a population. An evaluation function is used to
calculate the “goodness” of each chromosome. During evolution, two basic operators,
crossover and mutation, are used to simulate the natural reproduction and mutation of
species. The selection of chromosomes for survival and combination is biased towards the
fittest chromosomes.
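The process described above can be sketched in a few lines of Java. This is only a minimal illustration, not the project's implementation: the class name SimpleGaSketch, the gene range 0..255, the mutation rate, the toy evaluation function (number of genes equal to a target pattern), the binary tournament selection and the elitism step are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Random;

/** Minimal GA loop; every name and parameter here is illustrative. */
class SimpleGaSketch {
    static final int GENE_MAX = 255;           // assumed gene range: 0..255
    static final double MUTATION_RATE = 0.05;  // assumed per-gene mutation probability

    /** Toy evaluation function: number of genes equal to the target's genes. */
    static int fitness(int[] chromosome, int[] target) {
        int score = 0;
        for (int i = 0; i < chromosome.length; i++) {
            if (chromosome[i] == target[i]) score++;
        }
        return score;
    }

    /** Selection biased towards the fittest: binary tournament. */
    static int[] select(int[][] population, int[] target, Random rng) {
        int[] a = population[rng.nextInt(population.length)];
        int[] b = population[rng.nextInt(population.length)];
        return fitness(a, target) >= fitness(b, target) ? a : b;
    }

    /** Single-point crossover: head from p1, tail from p2 (needs length >= 2). */
    static int[] crossover(int[] p1, int[] p2, Random rng) {
        int point = 1 + rng.nextInt(p1.length - 1);
        int[] child = Arrays.copyOf(p1, p1.length);
        System.arraycopy(p2, point, child, point, p1.length - point);
        return child;
    }

    /** Mutation: each gene is replaced by a random in-range value with small probability. */
    static void mutate(int[] chromosome, Random rng) {
        for (int i = 0; i < chromosome.length; i++) {
            if (rng.nextDouble() < MUTATION_RATE) chromosome[i] = rng.nextInt(GENE_MAX + 1);
        }
    }

    /** Returns the chromosome with the highest fitness in the population. */
    static int[] fittest(int[][] population, int[] target) {
        int[] best = population[0];
        for (int[] c : population) {
            if (fitness(c, target) > fitness(best, target)) best = c;
        }
        return best;
    }

    /** Starts from a random population and evolves it for a fixed number of generations. */
    static int[] evolve(int[] target, int populationSize, int generations, long seed) {
        Random rng = new Random(seed);
        int[][] population = new int[populationSize][target.length];
        for (int[] c : population) {
            for (int i = 0; i < c.length; i++) c[i] = rng.nextInt(GENE_MAX + 1);
        }
        for (int g = 0; g < generations; g++) {
            int[][] next = new int[populationSize][];
            next[0] = fittest(population, target).clone();  // elitism: keep the best unchanged
            for (int i = 1; i < populationSize; i++) {
                int[] child = crossover(select(population, target, rng),
                                        select(population, target, rng), rng);
                mutate(child, rng);
                next[i] = child;
            }
            population = next;
        }
        return fittest(population, target);
    }

    public static void main(String[] args) {
        int[] target = {124, 12, 5, 18};
        int[] best = evolve(target, 30, 200, 42L);
        System.out.println(Arrays.toString(best) + " fitness=" + fitness(best, target));
    }
}
```

Because the best chromosome of each generation is copied unchanged into the next one, the best fitness in this sketch can never decrease from one generation to the next.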
Genetic algorithms can be used to evolve simple rules for network traffic (Sinclair,
Pierce, and Matzner 1999). These rules are used to differentiate normal network connections
from anomalous connections. These anomalous connections refer to events that have some
probability of being intrusions. The rules stored in the rule base are usually in the following form (Sinclair,
Pierce, and Matzner 1999):
if { condition } then { act }
For the problems we presented above, the condition usually refers to a match between
current network connection and the rules in IDS, such as source and destination IP addresses
and port numbers (used in TCP/IP network protocols), duration of the connection, protocol
used, etc., indicating the probability of an intrusion. The act field usually refers to an action
defined by the security policies within an organization, such as reporting an alert to the
system administrator, stopping the connection, logging a message into system audit files, or
all of the above. For example, a rule can be defined as:
if {the connection has the following information: source IP address: 124.12.5.18; destination IP address: 130.18.206.55; destination port number: 21; connection time: 10.1 seconds}
then {stop the connection}
This rule can be explained as follows: if there exists a network connection request
with the source IP address 124.12.5.18, destination IP address 130.18.206.55, destination port
number 21, and connection time 10.1 seconds, then stop this connection establishment. This
is because the IP address 124.12.5.18 is recognized by the IDS as one of the blacklisted IP
addresses; therefore, any service request initiated from it is rejected.
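As a rough illustration of the if {condition} then {act} form, the example rule can be written as a small Java data structure. This is a sketch under assumptions rather than the project's code: the names RuleBaseSketch, Connection, Rule, Action and decide are hypothetical, a null field in a rule is taken to mean "any value", and the connection time is left out of the match for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical rule base for rules of the form: if {condition} then {act}. */
class RuleBaseSketch {
    /** Actions named in the text; the enum itself is an assumption. */
    enum Action { STOP_CONNECTION, ALERT_ADMIN, LOG_MESSAGE }

    /** Attributes of one observed connection (a subset of those listed in the text). */
    static final class Connection {
        final String srcIp;
        final String dstIp;
        final int dstPort;

        Connection(String srcIp, String dstIp, int dstPort) {
            this.srcIp = srcIp;
            this.dstIp = dstIp;
            this.dstPort = dstPort;
        }
    }

    /** One rule; a null condition field matches any value. */
    static final class Rule {
        final String srcIp;
        final String dstIp;
        final Integer dstPort;
        final Action act;

        Rule(String srcIp, String dstIp, Integer dstPort, Action act) {
            this.srcIp = srcIp;
            this.dstIp = dstIp;
            this.dstPort = dstPort;
            this.act = act;
        }

        boolean matches(Connection c) {
            return (srcIp == null || srcIp.equals(c.srcIp))
                && (dstIp == null || dstIp.equals(c.dstIp))
                && (dstPort == null || dstPort.intValue() == c.dstPort);
        }
    }

    /** Returns the action of the first matching rule, or null if no rule matches. */
    static Action decide(List<Rule> rules, Connection c) {
        for (Rule r : rules) {
            if (r.matches(c)) return r.act;
        }
        return null;
    }

    public static void main(String[] args) {
        List<Rule> rules = new ArrayList<>();
        // The example rule from the text: blacklisted source, FTP control port.
        rules.add(new Rule("124.12.5.18", "130.18.206.55", 21, Action.STOP_CONNECTION));
        Connection c = new Connection("124.12.5.18", "130.18.206.55", 21);
        System.out.println(decide(rules, c));
    }
}
```

Since the explanation above treats 124.12.5.18 as blacklisted for any service, the same behaviour could also be expressed in this sketch by a rule whose destination address and port are left as null.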
“Thus, this new system of ours has given rise to a new Intelligent IDS.”
3. REQUIREMENTS
GATHERING
CHAPTER 3
REQUIREMENTS GATHERING
3.1 INTRODUCTION:-
The term requirements gathering encompasses those tasks that go into determining the
needs or conditions to meet for a new or altered product, taking account of the possibly
conflicting requirements of the various stakeholders, such as beneficiaries or users. Also it
can be applied specifically to the analysis proper, as opposed to elicitation or documentation
of the requirements.
FIG.8: Requirement Gathering
3.1.1 Purpose:-
Genetic Algorithm (GA) has been used in different ways in IDSs. One
network connection and its related behavior can be translated to represent a rule to judge
whether or not a real-time connection is considered an intrusion. These rules can be modeled
as chromosomes inside the population. The population evolves until the evaluation criteria
are met. The generated rule set can be used as knowledge inside the IDS for judging whether
the network connection and related behaviours are potential intrusions.
3.1.2 Document conventions:-
The format is simple. The bold headings are used for showing the points. The points
are numbered in order to make reading of the SRS simple.
Abbreviations used are:-
GA: - Genetic Algorithm
GAM: - Genetic Algorithm Module
PR: - Policy Repository
PMT: - Policy Management Tool
PDP: - Policy Decision Point
PEP: - Policy Enforcement Point
FPC: - False Positive Count
FNC: - False Negative Count
3.1.3 Intended Audience and Reading Suggestions :-
The intended audience comprises users of the system, administrators, operators, database
designers and database administrators.
3.1.4 Scope of the project:-
In Scope:
Allow system to detect any network event
Gene Construction using packet sniffer and packet analyzer
Gene Pool storage
Use of FSM to reduce the time complexity while applying GA
Gene Fitness Evaluation Function
Policy Management
Policy enforcement
Maintenance of Initial Data Set
Out of Scope:
1. Compatibility issues related with OS other than Windows.
2. Issues caused by limited hardware resources such as disk space.
3.2 OVERALL DESCRIPTION:-
This GA feedback-based network security policy framework can be installed on any
system and can be used for policy-based management and to monitor and manage the
behaviour of the network.
3.2.1 Product perspective:-
This system is aimed at developing an evolutionary network security policy framework
based on a genetic-feedback algorithm. Based on historical security events, a genetic
algorithm is used to generate a rule base. When a new network event arrives, the analyzer
judges whether the event is secure or not according to the rule base, and the policy system
may give a policy decision too. These two results may differ, so the policies
can be automatically adjusted by referring to the results calculated by the genetic algorithm.
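The feedback idea can be illustrated with a small Java sketch that compares the two verdicts for each event. It is only an assumption-laden illustration: the class PolicyFeedbackSketch, its methods and the adjustment threshold are hypothetical, and the reading of FPC and FNC used here (the rule-base verdict is taken as the reference against which the policy decision is judged) is an interpretation, not something stated in the text.

```java
/** Hypothetical bookkeeping for disagreements between the rule base and the policy system. */
class PolicyFeedbackSketch {
    // Assumed meaning: the policy blocked an event that the rule base judged secure.
    private int falsePositiveCount;
    // Assumed meaning: the policy allowed an event that the rule base judged insecure.
    private int falseNegativeCount;

    /** Records one network event together with both verdicts. */
    void record(boolean ruleBaseSaysSecure, boolean policyAllows) {
        if (ruleBaseSaysSecure && !policyAllows) {
            falsePositiveCount++;
        } else if (!ruleBaseSaysSecure && policyAllows) {
            falseNegativeCount++;
        }
        // When both verdicts agree, nothing needs to be counted.
    }

    int getFalsePositiveCount() { return falsePositiveCount; }

    int getFalseNegativeCount() { return falseNegativeCount; }

    /** Signals that the policies should be reviewed once disagreements reach a threshold. */
    boolean needsAdjustment(int threshold) {
        return falsePositiveCount + falseNegativeCount >= threshold;
    }

    public static void main(String[] args) {
        PolicyFeedbackSketch feedback = new PolicyFeedbackSketch();
        feedback.record(true, false);   // policy too strict for this event
        feedback.record(false, true);   // policy too lenient for this event
        System.out.println("adjust policies: " + feedback.needsAdjustment(2));
    }
}
```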
3.2.1.1. Jpcap: -
Jpcap is a Java library for capturing and sending network packets. Using Jpcap, you
can develop applications to capture packets from a network interface and visualize/analyze
them in Java. Jpcap isn't a pure Java solution; it depends on the use of native libraries. On
either Windows or UNIX, you must have the required third-party library, WinPcap or
libpcap, respectively.
3.2.2. Product Features:-
Auto Rule Base Generation
Auto restore damaged files
Negligible Administrator Presence required
More Security
Less resources required
3.2.3. Operating Environment:-
Software Requirements:-
Operating System: Microsoft Windows 2000/NT, XP or higher version.
Other Software: Microsoft Access, Jpcap
Hardware Requirements:-
Compatible to any brand with the minimum configuration as:
Processor: - Intel Pentium IV, Dual Core
RAM: - 1 GB onwards
Hard disk: - 20.0 GB
Monitor: - SVGA colour monitor, VGA monochrome, LCD monitor
Keyboard: - 105-key standard
Pointing Device: - Logitech mouse or other compatible
3.2.4. Design constraints:-
Our platform provides an easy-to-understand design. The screens follow all the rules
and regulations of GUI testing.
In software engineering, graphical user interface testing is the process of testing a
product's graphical user interface to ensure it meets its written specifications. This is
normally done through the use of a variety of test cases.
3.2.4.1 Security:-
Security is important for network events and transactions, so it must be provided
properly. For all policy definition and administrative functions, a secure SSL layer
is used for data transfer.
3.2.4.2 Fault Tolerance:-
Data will not get corrupted in case of a system crash or power failure. An
active backup program runs continuously and backs up the data at time intervals
that depend on the load on the system.
3.2.4.3 Multi-tenant architecture:-
It allows network applications to be developed in any environment and also takes care of
various issues such as concurrency management, scalability, failover and security. The
architecture enables defining the "trust relationship" between users in security, access,
distribution of source code, navigation history, admin (people and device) profiles,
interaction history, and application usage.
3.2.4.4 Utility-grade instrumentation:-
It offers developers insight into the inner workings of their applications, and the
behaviour of their users.
3.2.4.5 Assumptions and dependencies:-
It is assumed that all administrator rights have been assigned, so that backed-up
policies can be restored and new policies enforced successfully.
3.3 SYSTEM FEATURES:-
3.3.1 Deploying Application:-
It allows users of the application to use its services anywhere within the network area of
the service provider.
3.3.2 Common platform:-
This platform provides a common environment to the developers for creating robust,
easy, and secure network events and transactions within an organization.
3.3.3 GUI:-
In software engineering, graphical user interface testing is the process of testing a
product's graphical user interface to ensure it meets its written specifications. This is
normally done through the use of a variety of test cases. The sections of test cases that we
follow are:
Section 1 - Windows Compliance Standards
1.1. Application icon.
1.2. For Each Window TITLE in the Application
1.3. Text Boxes
1.4. Option (Radio Buttons)
1.5. Check Boxes
1.6. Command Buttons
1.7. Drop down List Boxes
1.8. Combo Boxes
1.9. List Boxes
Section 2 - Tester's Screen Validation Checklist
2.1. Aesthetic Conditions
2.2. Validation Conditions
2.3. Navigation Conditions
2.4. Usability Conditions
2.5. Data Integrity Conditions
2.6. Modes (Editable Read-only) Conditions
2.7. General Conditions
2.8. Specific Field Tests
2.8.1. Date Field Checks
2.8.2. Numeric Fields
2.8.3. Alpha Field Checks
Section 3 - Validation Testing - Standard Actions
3.1. On every Screen
3.2. Shortcut keys / Hot Keys
3.3. Control Shortcut Keys
3.4 EXTERNAL INTERFACE REQUIREMENTS:-
The system has no other external software and hardware interface requirements.
3.4.1 User Interfaces:-
The system will have a powerful user interface which will enable developers to create
various applications and users to use those applications effectively.
3.4.2 Hardware interfaces: -
The system has no hardware interface requirements
3.4.3 Software interfaces:-
The system will require a database server; Microsoft Access is used by the developers.
3.4.4 Communications Interfaces:-
To manage and monitor network events, and to manage and enforce policies in response to
those events, on systems connected over a Local Area Network or the Internet, we will use the UDP and
TCP/IP protocols.
3.5 OTHER NON FUNCTIONAL REQUIREMENTS:-
Backup and Restore facility to monitor automatic backup on a timely basis.
3.5.1 Performance Requirements:-
The system should be able to back up and restore policies effectively without loss or corruption of data. The system should keep RAM usage to a minimum.
3.5.2 Safety Requirements:-
Policy Repository data and Rule Base data must be secured from unauthorized access,
so all applications and users are registered.
3.5.3 Security Requirements:-
User authentication is done at the time of logging in.
3.5.4 Software Quality Attributes:-
The attributes taken into consideration are reliability, flexibility, operability, platform
independence.
4. SYSTEM DESIGN
CHAPTER 4
SYSTEM DESIGN
4.1 SELECTION OF LIFE CYCLE MODEL:-
The Basic idea:-
Process models define a distinct set of activities, actions, tasks, milestones and work
products that are required to engineer high-quality software. These process models are not
perfect, but they do provide a useful roadmap for software engineering work. We have used
the waterfall model as the project development life cycle. The waterfall model suggests a
systematic sequence for software development that begins with customer specification of
requirements and progresses through planning, modeling, construction and deployment.
Fig. 9.DEVELOPMENT LIFE CYCLE
Deliverable                           Form                Phase
Stage one: Communication
Project Concept Overview              Document            Project Initiation
Project Plan                          Document            Project Initiation
Initial Estimate                      Document            Project Initiation
Stage two: Requirement and Planning
User Requirements                     Document            Pre-Design (Analysis)
Technical Requirements                Document            Pre-Design (Analysis)
Paper Prototype                       Document            Pre-Design (Analysis)
Requirement and Planning Completion   Document            Pre-Design (Analysis)
Stage three: Design
Infrastructure Design                 Document            Design
Systems Design                        Document            Design
Application Design                    Document            Design
Time & Cost Quotation                 Document            Design
Design Completion                     Document            Design
Stage four: Construction & Testing
Infrastructure Installation           Hardware/Software   Development
Systems Installation                  Hardware/Software   Development
Application Development               Software            Development
Development Beta Test Report          Document            Testing
Application Testing                   Document            Testing
Development Completion                Software/Document   Development
Stage five: Deployment
Live System Delivery                  Hardware/Software   Deployment
Infrastructure Specification          Document            Deployment
System Specification                  Document            Deployment
Application Technical Specification   Document            Deployment
User Documentation                    Document            Deployment
Fig.10 Waterfall Model
The waterfall approach was the first process model to be introduced and followed widely in
software engineering to ensure the success of a project. In the waterfall approach, the
whole process of software development is divided into separate process phases.
The phases in the waterfall model are: Requirement Specification, Software
Design, Implementation, and Testing & Maintenance. These phases are cascaded so that
the second phase starts only when the defined set of goals for the first phase has been
achieved and signed off, hence the name "Waterfall Model". All the methods and processes
undertaken in the waterfall model are more visible.
4.2 PROJECT PLAN:-
As the project was to be done over the course of two semesters, with our university
examination falling in between, we divided the project into two phases of three months each. The first
phase was from August 2010 to October 2010 and the second phase was from January 2011 to April
2011. The detailed project schedule with the achieved milestones is shown below.
PROJECT PLAN
TASK                                  COMPLETION DATE
Searching for the Project             12-Aug-2010
Deciding the Project                  20-Aug-2010
Searching information about Project   5-Sep-2010
Deciding the Components               25-Sep-2010
Working on Project Design             3-Oct-2010
Finalizing the Project Design         15-Oct-2010
Configuration of Server               25-Jan-2011
Configuration of Database             5-Feb-2011
Checking the Database Connectivity    12-Feb-2011
Creating Web Applications             20-Feb-2011
Performing Various Tests              25-Mar-2011
Documentation of Project              30-Mar-2011
System Delivery And Installation      10-Apr-2011
Systems design is the process or art of defining the architecture, components,
modules, interfaces, and data for a system to satisfy specified requirements. One could see it
as the application of systems theory to product development.
4.3 DATA FLOW DIAGRAM:-
A data flow diagram (DFD) is a graphical representation of the "flow" of data through
an information system. DFDs can also be used for the visualization of data processing
(structured design).
On a DFD, data items flow from an external data source or an internal data store to an
internal data store or an external data sink, via an internal process.
A DFD provides no information about the timing of processes, or about whether
processes will operate in sequence or in parallel. It is therefore quite different from a
flowchart, which shows the flow of control through an algorithm, allowing a reader to
determine what operations will be performed, in what order, and under what circumstances,
but not what kinds of data will be input to and output from the system, nor where the data
will come from and go to, nor where the data will be stored (all of which are shown on a
DFD).
FIG 11. LEVEL 0 DFD
FIG 12. Level 1 DFD
4.4 USE CASE DIAGRAMS:-
Use case diagrams are basically used to model the dynamic aspects of systems. These
diagrams are central to modeling the behavior of the system, a subsystem, or a class. Each
one shows a set of use cases and actors and their relationships. Use case diagrams are
important for visualizing, specifying, and documenting the behavior of an element. They
make systems, subsystems and classes approachable and understandable by presenting an
outside view of how those elements may be used in context. The main purpose of a use case
diagram is to show what system functions are performed for which actor. Roles of the actors
in the system can be depicted.
FIG 13: USECASE DIAGRAM
4.5 CLASS DIAGRAM:-
In software engineering, a class diagram in the Unified Modelling Language (UML)
is a type of static structure diagram that describes the structure of a system by showing the
system's classes, their attributes, and the relationships between the classes. This diagram
shows various classes or main entities involved in the system and also their relationship with
each other. It depicts the attributes and operations each class can carry out, individually and
with help of other classes in the system designed.
FIG 14.CLASS DIAGRAM
4.6 STATE CHART DIAGRAM:-
A state diagram is a type of diagram used in computer science and related fields to
describe the behaviour of systems. State diagrams require that the system described is
composed of a finite number of states; sometimes, this is indeed the case, while at other times
this is a reasonable abstraction.
Fig 15- State-Chart Diagram
4.7 ACTIVITY DIAGRAMS:-
Activity diagrams are graphical representations of workflows of stepwise activities
and actions with support for choice, iteration and concurrency. In the Unified Modelling
Language, activity diagrams can be used to describe the business and operational step-by-step
workflows of components in a system. An activity diagram shows the overall flow of control.
FIG 16. ACTIVITY DIAGRAM
4.8 SEQUENCE DIAGRAMS:-
A sequence diagram in Unified Modelling Language (UML) is a kind of interaction
diagram that shows how processes operate with one another and in what order. It is a
construct of a Message Sequence Chart.
Fig 17- Sequence Diagram
4.9 DEPLOYMENT DIAGRAM:-
FIG 18: DEPLOYMENT DIAGRAM
4.10 PACKAGE DIAGRAM:-
Fig 19 PACKAGE DIAGRAM
5. IMPLEMENTATION DETAILS
CHAPTER 5
IMPLEMENTATION DETAILS
5.1 TECHNOLOGY DETAILS:-
5.1.1 Java:-
Java is a programming language originally developed by James Gosling at Sun Microsystems
(which is now a subsidiary of Oracle Corporation) and released in 1995 as a core component of Sun
Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a
simpler object model and fewer low-level facilities. Java applications are typically compiled to
bytecode (class file) that can run on any Java Virtual Machine (JVM) regardless of computer
architecture. Java is a general-purpose, concurrent, class-based, object-oriented language that is
specifically designed to have as few implementation dependencies as possible. It is intended to let
application developers "write once, run anywhere". Java is currently one of the most popular
programming languages in use, and is widely used from application software to web applications.
Features of Java:-
● Java Virtual Machine (JVM)
-An imaginary machine that is implemented by emulating software on a real machine
-Provides the hardware platform specifications to which you compile all Java technology code
● Bytecode
-A special machine language that can be understood by the Java Virtual Machine (JVM)
-Independent of any particular computer hardware, so any computer with a Java interpreter can
execute the compiled Java program, no matter what type of computer the program was compiled on
● Garbage collection thread
- Responsible for freeing any memory that can be freed. This happens automatically during the
lifetime of the Java program.
- Programmer is freed from the burden of having to deallocate that memory themselves
● Code security
- Is attained in Java through the implementation of its Java Runtime Environment (JRE).
● JRE
- Runs code compiled for a JVM and performs class loading (through the class loader), code
verification (through the bytecode verifier) and finally code execution.
● Class Loader
– Responsible for loading all classes needed for the Java program
– Adds security by separating the namespaces for the classes of the local file system from those that
are imported from network sources
– After loading all the classes, the memory layout of the executable is then determined. This adds
protection against unauthorized access to restricted areas of the code since the memory layout is
determined during runtime
● Bytecode verifier
– tests the format of the code fragments and checks the code fragments for illegal code that can
violate access rights to objects
● Platform Independence
-The Write-Once-Run-Anywhere ideal has not been achieved (tuning for different platforms usually
required), but closer than with other languages.
● Object Oriented
-Object oriented throughout - no coding outside of class definitions, including main().
-An extensive class library available in the core language packages.
● Compiler/Interpreter Combo
-Code is compiled to bytecodes that are interpreted by a Java virtual machine (JVM).
-This provides portability to any machine for which a virtual machine has been written.
-The two steps of compilation and interpretation allow for extensive code checking and improved
security.
● Robust
-Exception handling built-in, strong type checking (that is, all data must be declared an explicit type),
local variables must be initialized.
● Several dangerous features of C & C++ eliminated:
-No memory pointers
-No preprocessor
-Array index limit checking
● Good Performance
-Interpretation of bytecodes slowed performance in early versions, but advanced virtual machines
with adaptive and just-in-time compilation and other techniques now typically provide performance
at 50% to 100% of the speed of C++ programs.
● Threading
-Lightweight processes, called threads, can easily be spun off to perform multiprocessing.
-Can take advantage of multiprocessors where available
-Great for multimedia displays.
● Built-in Networking
-Java was designed with networking in mind and comes with many classes to develop sophisticated
Internet communications.
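Two of the features listed above, array index limit checking and threading, can be seen in a short Java sketch. The class and method names (JavaFeaturesDemo, boundsAreChecked, sumInTwoThreads) are made up for this illustration and are not part of the project.

```java
/** Small demonstration of runtime bounds checking and of threads; names are illustrative. */
class JavaFeaturesDemo {

    /** Returns true if writing past the end of an array raises the runtime bounds-check exception. */
    static boolean boundsAreChecked() {
        int[] data = new int[3];
        try {
            int index = data.length;   // one past the last valid index
            data[index] = 1;           // rejected at run time instead of corrupting memory
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    /** Sums an array using two threads, each responsible for one half. */
    static int sumInTwoThreads(final int[] values) {
        final int mid = values.length / 2;
        final int[] partial = new int[2];  // each thread writes only its own slot
        Thread left = new Thread(() -> {
            for (int i = 0; i < mid; i++) partial[0] += values[i];
        });
        Thread right = new Thread(() -> {
            for (int i = mid; i < values.length; i++) partial[1] += values[i];
        });
        left.start();
        right.start();
        try {
            left.join();   // wait for both workers before reading their results
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return partial[0] + partial[1];
    }

    public static void main(String[] args) {
        System.out.println("bounds checked: " + boundsAreChecked());
        System.out.println("sum: " + sumInTwoThreads(new int[]{1, 2, 3, 4, 5}));
    }
}
```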
5.1.2 Jpcap:-
Jpcap is an open source library for capturing and sending network packets from Java
applications. It provides facilities to:
● capture raw packets live from the wire.
● save captured packets to an offline file, and read captured packets from an offline file.
● automatically identify packet types and generate corresponding Java objects (for Ethernet,
IPv4, IPv6, ARP/RARP, TCP, UDP, and ICMPv4 packets).
● filter the packets according to user-specified rules before dispatching them to the
application.
● send raw packets to the network
● Jpcap is based on libpcap/winpcap, and is implemented in C and Java.
● Jpcap has been tested on Microsoft Windows (98/2000/XP/Vista), Linux (Fedora, Ubuntu),
Mac OS X (Darwin), FreeBSD, and Solaris.
Jpcap can be used to develop many kinds of network applications, including (but not limited
to):
- network and protocol analyzers
- network monitors
- traffic loggers
- traffic generators
- user-level bridges and routers
- network intrusion detection systems (NIDS)
- network scanners
- security tools
Jpcap captures and sends packets independently from the host protocols (e.g., TCP/IP). This
means that Jpcap does not (cannot) block, filter or manipulate the traffic generated by other programs
on the same machine: it simply "sniffs" the packets that transit on the wire. Therefore, it does not
provide the appropriate support for applications like traffic shapers, QoS schedulers and personal
firewalls.
When you want to capture packets from a network, the first thing you have to do is to obtain
the list of network interfaces on your machine. To do so, Jpcap provides JpcapCaptor.getDeviceList()
method. It returns an array of NetworkInterface objects.
A NetworkInterface object contains some information about the corresponding network
interface, such as its name, description, IP and MAC addresses, and data link name and description.
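Running Jpcap requires its native library to be installed, so the sketch below uses the JDK's own java.net.NetworkInterface class as a stand-in to show the same kind of per-interface information (name, description, MAC and IP addresses). It is not Jpcap's API; in a Jpcap program the starting point would be the JpcapCaptor.getDeviceList() call described above. The class InterfaceLister and its methods are hypothetical names chosen for the example.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

/** Lists local network interfaces using the standard library (stand-in for Jpcap). */
class InterfaceLister {

    /** Formats a hardware address as colon-separated hex, or "(none)" if absent. */
    static String formatMac(byte[] mac) {
        if (mac == null) return "(none)";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < mac.length; i++) {
            if (i > 0) sb.append(':');
            sb.append(String.format("%02x", mac[i] & 0xff));
        }
        return sb.toString();
    }

    /** Returns one descriptive line per interface found on this machine. */
    static List<String> describeInterfaces() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all == null) return lines;  // some platforms report no interfaces this way
            for (NetworkInterface nif : Collections.list(all)) {
                StringBuilder sb = new StringBuilder();
                sb.append(nif.getName()).append(" (").append(nif.getDisplayName()).append(")");
                sb.append(" mac=").append(formatMac(nif.getHardwareAddress()));
                for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                    sb.append(" ip=").append(addr.getHostAddress());
                }
                lines.add(sb.toString());
            }
        } catch (SocketException e) {
            lines.add("could not query interfaces: " + e.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeInterfaces()) {
            System.out.println(line);
        }
    }
}
```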
Fig 20 Jpcap
5.2 MODULAR DETAILS:-
The modules involved are:
1. The packet sniffer, which converts incoming packets into
chromosome-like data structures. These chromosome-like data structures are
used by the GA module to check each packet's fitness value.
2. With the help of sub-modules such as the GD and GC, the chromosome is
checked against the existing gene pool. The important tasks of crossover
and mutation are also carried out by the GA module.
3. Once the Fitness Calculator decides the fitness value of the packet, the information is
passed to the Event Report Generator. At this point the Policy Management Point
(Admin) comes into the picture. The administrator's role is vital, because he or she
decides whether to block the anomalous packet or to allow it.
4. The policy management point validates the packet against all the policies; if
none is violated, the packet is allowed. If there is an inconsistency, the SMS
Gateway is invoked and the administrator is notified accordingly.
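The first three steps above can be pictured with a small Java sketch. Everything in it is an assumption made for illustration: the gene layout (four octets of each IPv4 address, then destination port and protocol number), the wildcard value, the idea that fitness is the best similarity to an entry in a pool of known-attack genes, and the names ChromosomeSketch, encode, similarity, fitness and isSuspicious. The report does not specify the actual encoding or fitness function.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical packet-to-chromosome encoding and gene-pool fitness check. */
class ChromosomeSketch {
    static final int WILDCARD = -1;   // assumed marker: pool gene that matches any value

    /** Encodes selected header fields as a fixed-length gene array (layout is an assumption). */
    static int[] encode(String srcIp, String dstIp, int dstPort, int protocol) {
        int[] genes = new int[10];
        fillOctets(srcIp, genes, 0);   // genes 0..3: source address
        fillOctets(dstIp, genes, 4);   // genes 4..7: destination address
        genes[8] = dstPort;
        genes[9] = protocol;
        return genes;
    }

    private static void fillOctets(String ip, int[] genes, int offset) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not a dotted IPv4 address: " + ip);
        }
        for (int i = 0; i < 4; i++) {
            genes[offset + i] = Integer.parseInt(parts[i]);
        }
    }

    /** Fraction of genes that match a pool entry; wildcard genes always match. */
    static double similarity(int[] chromosome, int[] poolEntry) {
        int matches = 0;
        for (int i = 0; i < chromosome.length; i++) {
            if (poolEntry[i] == WILDCARD || poolEntry[i] == chromosome[i]) matches++;
        }
        return (double) matches / chromosome.length;
    }

    /** Fitness taken as the best similarity to any entry in the gene pool (0.0 for an empty pool). */
    static double fitness(int[] chromosome, List<int[]> genePool) {
        double best = 0.0;
        for (int[] entry : genePool) {
            best = Math.max(best, similarity(chromosome, entry));
        }
        return best;
    }

    /** A packet is reported when its fitness reaches the given threshold. */
    static boolean isSuspicious(int[] chromosome, List<int[]> genePool, double threshold) {
        return fitness(chromosome, genePool) >= threshold;
    }

    public static void main(String[] args) {
        List<int[]> pool = new ArrayList<>();
        // Known-attack pattern: blacklisted source 124.12.5.18 contacting port 21.
        pool.add(new int[]{124, 12, 5, 18, WILDCARD, WILDCARD, WILDCARD, WILDCARD, 21, WILDCARD});
        int[] packet = encode("124.12.5.18", "130.18.206.55", 21, 6);
        System.out.println("report to admin: " + isSuspicious(packet, pool, 0.9));
    }
}
```

In this sketch a reported packet would then be handed to the event report step, where the administrator decides whether to block or allow it.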
5.3 DATABASE DETAILS:-
5.3.1 Microsoft Office Access 2007:-
Microsoft Office Access, previously known as Microsoft Access, is a relational
database management system from Microsoft that combines the relational Microsoft Jet
Database Engine with a graphical user interface and software-development tools. Software
developers and data architects can use Microsoft Access to develop application software, and
"power users" can use it to build simple applications. Like other Office applications, Access
is supported by Visual Basic for Applications, an object-oriented programming language that
can reference a variety of objects including DAO (Data Access Objects), ActiveX Data
Objects, and many other ActiveX components. Visual objects used in forms and reports
expose their methods and properties in the VBA programming environment, and VBA code
modules may declare and call Windows operating-system functions.
Microsoft Access is used to make databases. When reviewing Microsoft Access in the
real world, it should be understood how it is used with other products. An all-Access solution
may have Microsoft Access Forms and Reports managing Microsoft Access tables. However,
Microsoft Access may be used only as the 'front-end', using another product for the 'back-end'
tables, such as Microsoft SQL Server and non-Microsoft products such as Oracle and Sybase.
Similarly, some applications will only use the Microsoft Access tables and use another
product as a front-end, such as Visual Basic or ASP.NET. Microsoft Access may be only part
of the solution in more complex applications, where it may be integrated with other
technologies such as Microsoft Excel, Microsoft Outlook or ActiveX Data Objects.
Access tables support a variety of standard field types, indices, and referential
integrity. Access also includes a query interface, forms to display and enter data, and reports
for printing. The underlying Jet database, which contains these objects, is multiuser-aware
and handles record-locking and referential integrity including cascading updates and deletes.
Users can create tables, queries, forms and reports, and connect them together with
macros. Advanced users can use VBA to write rich solutions with advanced data
manipulation and user control. The original concept of Access was for end users to be able to
"access" data from any source. Other uses include: the import and export of data to many
formats including Excel, Outlook, ASCII, dBase, Paradox, FoxPro, SQL Server, Oracle,
ODBC, etc. It also has the ability to link to data in its existing location and use it for viewing,
querying, editing, and reporting. This allows the existing data to change and the Access
platform to always use the latest data. It can perform heterogeneous joins between data sets
stored across different platforms. Access is often used by people downloading data from
enterprise level databases for manipulation, analysis, and reporting locally.
There is also the Jet Database format (MDB or ACCDB in Access 2007) which can
contain the application and data in one file. This makes it very convenient to distribute the
entire application to another user, who can run it in disconnected environments.
One of the benefits of Access from a programmer's perspective is its relative
compatibility with SQL (structured query language) — queries can be viewed graphically or
edited as SQL statements, and SQL statements can be used directly in Macros and VBA
Modules to manipulate Access tables. Users can mix both VBA and "Macros" for
programming forms and logic, and Access offers object-oriented possibilities. VBA can also be
included in queries.
Fig 21:- Microsoft Office Access 2007
5.4 SNAPSHOTS:
Main Window
Log record
No packet case
6. TESTING
CHAPTER 6
TESTING
6.1 TESTING STRATEGIES:-
The test strategy consists of a series of different tests that will fully exercise the GA based
network security system. The primary purpose of these tests is to uncover the system's limitations and
measure its full capabilities. A list of the planned tests, each with a brief explanation, follows below.
1. UI testing:-
The admin's interaction with the system needs to be user friendly. The admin works through a
graphical user interface (GUI) that highlights the important features of the security services. The
interface should switch properly between different screens, and operations should execute correctly.
The interface controls, such as buttons, textboxes and scrollbars, need to comply with industry
standards. The efficiency and functionality of these features need to be tested.
2. Functional Testing:-
Functional testing checks the functionality of the application through its GUI. All features and
functions of the system software, hardware, etc. are tested to ensure that requirements and
specifications are met. The testing is conducted on a complete, integrated system to evaluate the
system's compliance with its specified requirements. Functional testing falls within the scope of
black box testing and, as such, requires no knowledge of the inner design of the code or logic.
The basic functional requirements of the system should also be fulfilled and tested.
3. Stability Testing:-
The admin has to keep the application running for a long time. The application must have a
scheduling functionality wherein its services run in the background on the proxy servers, policy
server and database server. When invoked, the server comes to the foreground and responds to the
admin’s request. The stability of the system in such scenarios should be tested.
6.1.1 Test Goals:-
The software is intended to provide a very user friendly GUI to the system administrators. Most of the
testing is aimed at ensuring that this requirement is fulfilled.
1. To ensure that the admin receives the correct reply from the policy server and proxy server.
2. To make sure that the changes and updates made by the admin are correctly reflected on the policy
and proxy servers.
3. To ensure that the data is stored in and fetched from the database server without any problem.
4. To ensure that the policy repository and rule base are always consistent, integrated and durable
during the life cycle of the system.
5. To ensure that the values of False Positive Count [FPC] and False Negative Count [FNC] are
always below the danger level.
6. A higher fitness value indicates a fitter population, that is, a population with more frequently
occurring and more accurate individuals overall. The testing goal is therefore to make sure that the
fitness value is always high.
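Test goal 5 can be checked mechanically once a labelled sample of packets is available. The following Java sketch shows one minimal way to count FPC and FNC. The class DetectionCounts, the Observation type and the dangerLevel threshold are hypothetical names introduced for illustration; they are not taken from the project code, and the actual danger level is not specified in this report.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: counts false positives (FPC) and false negatives (FNC)
// for a labelled packet sample. All names and thresholds are illustrative only.
final class DetectionCounts {

    // One labelled test packet: the rule base decision versus the known truth.
    static final class Observation {
        final boolean flaggedAsAttack; // what the rule base decided
        final boolean actuallyAttack;  // known label of the test packet

        Observation(boolean flaggedAsAttack, boolean actuallyAttack) {
            this.flaggedAsAttack = flaggedAsAttack;
            this.actuallyAttack = actuallyAttack;
        }
    }

    // FPC: normal packets that were wrongly flagged as attacks.
    static int falsePositiveCount(List<Observation> sample) {
        int count = 0;
        for (Observation o : sample) {
            if (o.flaggedAsAttack && !o.actuallyAttack) {
                count++;
            }
        }
        return count;
    }

    // FNC: attack packets that were wrongly allowed through.
    static int falseNegativeCount(List<Observation> sample) {
        int count = 0;
        for (Observation o : sample) {
            if (!o.flaggedAsAttack && o.actuallyAttack) {
                count++;
            }
        }
        return count;
    }

    // True when both counts stay strictly below the assumed danger level.
    static boolean belowDangerLevel(List<Observation> sample, int dangerLevel) {
        return falsePositiveCount(sample) < dangerLevel
                && falseNegativeCount(sample) < dangerLevel;
    }

    public static void main(String[] args) {
        List<Observation> sample = Arrays.asList(
                new Observation(true, true),    // attack, detected
                new Observation(true, false),   // normal, wrongly flagged
                new Observation(false, true),   // attack, missed
                new Observation(false, false)); // normal, allowed
        System.out.println("FPC = " + falsePositiveCount(sample));
        System.out.println("FNC = " + falseNegativeCount(sample));
        System.out.println("Below danger level 2: " + belowDangerLevel(sample, 2));
    }
}
```

Keeping the two counts separate, rather than merging them into one error figure, matches the report's observation that FPC and FNC may need to be studied and improved independently.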
Project Title: GENETIC-FEEDBACK ALGORITHM BASED NETWORK SECURITY POLICY FRAMEWORK.
Developer Requirements:-
1. Processor : 1 GHz and above
2. Primary memory : 1 GB of RAM
3. Operating System : Windows XP
Software Resources:-
1. Microsoft Access
2. Java Virtual Machine
3. Jpcap network packet analyzer
To run application:-
1. Software : Microsoft Access, Java
2. Hardware : PCs (for Proxy server, Admin, Policy Repository,
Rule Base)
3. Primary memory : 1 GB RAM
6.1.3 Features to be tested:-
Sr.No.  Acceptance Tests                                                          Result
1.      The system must have maximum data recovery.
2.      One can access the system only if he is authenticated.
3.      There should be an administrator console having all administrator
        functions with a password authentication login.
4.      The system should not proceed if the administrator has not selected
        an appropriate option.
5.      The process of updating will take place only if the administrator
        enters or modifies the text.
6.      The time required for retrieving data must be as low as possible,
        with maximum efficiency.
7.      The system should have an option for exiting the system.
6.1.4 Test Team:-
1. ROHAN KULKARNI
2. VIRAL PATEL
3. SAGAR ROTHAWAN
4. MAYURESH SHIVADE
6.2. TEST DELIVERABLES:-
1. Acceptance test plan
2. System/Integration test plan
3. Unit test plans/turnover documentation
4. Screen prototypes
5. Report mock-ups
6. Defect/Incident reports and summaries
7. Test logs and turnover report
6.3. REMAINING TEST TASKS:-
TASK                                         ASSIGNED TO
Create Acceptance Test Plan                  ROHAN, SAGAR, VIRAL
Create System/Integration Test Plan          ROHAN, MAYURESH
Define Unit Test rules and Procedures        SAGAR, VIRAL
Define Turnover procedures for each level    SAGAR, ROHAN
Verify prototypes of Screens                 VIRAL, MAYURESH
Verify prototypes of Reports                 ROHAN, MAYURESH
6.4. STAFFING AND TRAINING NEEDS:-
There are four people allocated for the completion of the project. The individual skill set of the
members is given in the table:
                        ROHAN            SAGAR            VIRAL            MAYURESH
Programming Languages   JAVA             JAVA             JAVA             JAVA
Operating systems       Windows / Linux  Windows / Linux  Windows / Linux  Windows / Linux
Tools                   jpcap            jpcap            jpcap            jpcap
6.5. SCHEDULE:-
The test plan includes various types of testing, viz. manual testing, performance testing and
automated testing. These tests should be well planned and executed accordingly. The schedule is
shown in the table below (dates in DD/MM/YYYY format).
# Test Type Start Date End Date
1 Manual Testing 25/3/2011 27/3/2011
2 Run performance Tests 27/3/2011 30/3/2011
3 Finalize Testing 30/3/2011 31/3/2011
6.6. RESPONSIBILITIES:-
Task                                            Rohan      Sagar      Viral   Mayuresh
                                                Kulkarni   Rothawan   Patel   Shivade
Acceptance test Documentation & Execution
System/Integration test Documentation & Exec.
Unit test documentation & execution
System design review
Detail Design Reviews
Test procedures and rules
Change Control and regression testing
6.7. TEST ITEMS (FUNCTIONS):-
Test Case ID            01
Project Name            Genetic-Feedback Algorithm Based Network Security Policy Framework
Test Case Name          Main Page – Client Side
Test Case Description   To accept registration data.
Step No.  Step Description                         Input Data                  Expected Result                             Actual Result
1         Enter alphabet in mobile number textbox  alphabets                   Error message showing “Enter numbers only”  Error message.
2         Enter mobile number < 10 digits          < 10 digits                 Error message showing “Incorrect number”    Error message.
3         Enter mobile and password correctly      Mobile Number and password  Successful login                            Execute successfully.
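The input checks exercised by steps 1 and 2 of Test Case 01 can be sketched as below. This is a minimal illustration, assuming that a valid mobile number consists of exactly 10 digits; the class name, method name and the "OK" return value are hypothetical, and only the two error messages come from the test case itself.

```java
// Hypothetical sketch of the mobile number checks from Test Case 01.
// Assumption: a valid mobile number is exactly 10 ASCII digits.
final class MobileNumberValidator {

    static final int MOBILE_DIGITS = 10;

    // Returns the error message to display, or "OK" when the input is accepted.
    static String validate(String input) {
        // Step 1: anything other than digits is rejected first.
        if (input == null || !input.matches("[0-9]+")) {
            return "Enter numbers only";
        }
        // Step 2: a number of the wrong length is rejected next.
        if (input.length() != MOBILE_DIGITS) {
            return "Incorrect number";
        }
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(validate("98abc12345")); // alphabets in the textbox
        System.out.println(validate("12345"));      // fewer than 10 digits
        System.out.println(validate("9876543210")); // well-formed number
    }
}
```

The character check runs before the length check so that an input containing alphabets always produces the first message, which is the order the test steps expect.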
Test Case ID            02
Project Name            Genetic-Feedback Algorithm Based Network Security Policy Framework
Test Case Name          Main Page – Server Side
Test Case Description   To login
Step No.  Step Description                       Input Data             Expected Result   Actual Result
1         Enter username and password correctly  Username and Password  Successful login  Execute successfully and login.
Test Case ID            03
Project Name            Genetic-Feedback Algorithm Based Network Security Policy Framework
Test Case Name          IMPORT TEMPLATES
Test Case Description   To import templates.
Step No.  Step Description                         Input Data    Expected Result                                          Actual Result
1         Do not import any template and press OK  No selection  Error message “please import test file and model file”.  Error message.
2         Import template and press ok.            Import file.  Predict successfully and gives result.                   Execute successfully.
6.8. SOFTWARE RISK ISSUES:-
An effective strategy to deal with risk must consider three issues:
1: Risk Avoidance
2: Risk Monitoring
3: Risk Management and Contingency Planning
The risks mentioned in the risk table for the given project can be handled in the following
ways:
A: Larger number of network events than planned:-
In case of heavy network traffic, the system recovery strategy can be applied. Concurrent
access to the database and updating of tables are also managed.
B: Project does not complete by the delivery date:-
This is a business risk and can be eliminated by forming a sturdy project plan. The plan
must be strictly followed. This has been checked throughout the life cycle of the project.
C: End user resists system:-
The administrator may dislike the system, or the user may not feel comfortable with the
system if its presentation is too complex. Providing an attractive, easy and extremely
user-friendly interface can eliminate this risk. System crash conditions must also be
eliminated for the convenience of the user.
D: Lack of trained staff:-
This risk is not very difficult to handle, since the user friendly graphical interface
itself will guide the user through the system. No special skill set is required.
It is assumed that naïve users will operate this system.
E: Lack of training of tools:-
This risk is associated with the developer’s inability and can be handled by employing
developers with the skills required for the project.
F: Loss of funding:-
The funding can be lost in the event of administrator dissatisfaction. The prototype of the system
provided to the administrator must keep him engrossed and waiting for the product.
G: Required resources not available on the host:-
This risk arises due to overestimation of the client’s setup. The client infrastructure and
the state of the machines must be checked.
H: Customer may change requirement:-
This becomes a critical risk, and is further aggravated if the system is in its
completion stage. Every stage of the SDLC requires a complete understanding of the
requirements. It is best that the system is built only when the requirements are frozen.
I: Less reuse than expected:-
Flexibility must be incorporated in the project to enable further improvements as
well as reuse for another client.
7. APPLICATION AND FUTURE ENHANCEMENT
CHAPTER 7
APPLICATION AND FUTURE ENHANCEMENT
7.1 APPLICATIONS:-
At present there are no systems in the market that follow a self-learning algorithm such as the Genetic Algorithm. In the present scenario, every system needs an update about current threats in order to avoid them. This system will be independent and will not require any outside support to counter new risks. Its application areas are:
1. Network Security.
2. Network Intrusion Detection.
3. Unauthorized Network Access.
4. Organization Network Security and Control.
5. Packet analyzer and sniffer.
6. College intranet security system.
7.2. FUTURE ENHANCEMENT:-
1. Improved rule base generation techniques.
2. More efficient packet sniffers and gene checking algorithms.
3. To improve FPC and FNC, the applied network must first be studied thoroughly to
identify which of FPC and FNC has the major impact. If both have a simultaneous
effect, the shortcoming could be overcome by a suitable combination of the rules
generated by considering FPC and FNC separately; this is left as future work.
4. Detailed specification of parameters to consider for genetic algorithm should be
determined during the experiments.
5. Combining knowledge from different security sensors into a standard rule base is
another promising area in this work.
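As an illustration of the kind of parameters item 4 refers to, the sketch below shows single-point crossover and mutation on a chromosome held as an integer array. The class name, the integer gene encoding, and the crossover point, mutation rate and gene range parameters are all assumptions made for this example; they are not the project's actual operators, and suitable values would have to be determined during the experiments.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of two genetic operators whose parameters would need tuning.
// Assumption: a rule chromosome is an int array, each gene in [0, geneRange).
final class GaOperators {

    // Single-point crossover: genes before 'point' come from parent a, the rest from b.
    static int[] crossover(int[] a, int[] b, int point) {
        if (a.length != b.length || point < 0 || point > a.length) {
            throw new IllegalArgumentException("parents must match and point must be in range");
        }
        int[] child = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            child[i] = (i < point) ? a[i] : b[i];
        }
        return child;
    }

    // Mutation: each gene is replaced by a random value with probability 'rate'.
    static int[] mutate(int[] genes, double rate, int geneRange, Random rng) {
        int[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < rate) {
                out[i] = rng.nextInt(geneRange);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] parentA = {1, 2, 3, 4};
        int[] parentB = {5, 6, 7, 8};
        int[] child = crossover(parentA, parentB, 2);
        System.out.println("Child:   " + Arrays.toString(child));
        // Illustrative mutation rate of 0.05 and gene range of 10.
        System.out.println("Mutated: " + Arrays.toString(mutate(child, 0.05, 10, new Random())));
    }
}
```

Passing the rate, the gene range and the random source as arguments keeps the operators easy to rerun with different settings when comparing experimental results.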
8. CONCLUSION
CHAPTER 8
CONCLUSION
Through this project we have introduced a new and improved model for a genetic
feedback algorithm based network security policy framework. The fitness function and the
parameters affecting it are also taken into consideration. This new model is
much simpler and easier to implement.
The simulation results show that the new rules generated by the GA have better potential
to detect attacks. However, this technique is not sufficient to improve both FPC and
FNC simultaneously. To deploy the technique, the applied network must first
be studied thoroughly to identify which of FPC and FNC has the major impact. If both have
a simultaneous effect, the shortcoming could be overcome by a suitable combination of the
rules generated by considering FPC and FNC separately; this is left as future work.
9. BIBLIOGRAPHY
CHAPTER 9
BIBLIOGRAPHY
9.1 BOOKS:-
1. Research on Policy-Based Security Management by W.E. Walsh.
2. An AI perspective on autonomic computing policies by Crosbie, Mark and Spafford.
3. Genetic and Evolutionary Algorithms: Principles, Methods and Algorithms.
4. Using Genetic Algorithm for Network Intrusion Detection by Wei Li.
5. Framework for Policy-based Admission Control by Crosbie, Mark and Spafford.
9.2 WEBSITES:-
1. http://www.geatbx.com/docu/algindex.html
2. http://www-dse.doc.ic.ac.uk/policies/
3. http://www.Security.cse.msstate.edu/
9.3 RESEARCH PAPERS:-
[1] R. Shirey, “Internet Security Glossary”, RFC 2828, May 2000.
[2] Pohlheim, Hartmut, “Genetic and Evolutionary Algorithms: Principles, Methods and
Algorithms”, Genetic and Evolutionary Algorithm Toolbox,
http://www.geatbx.com/docu/algindex.html, 30 Oct. 2003.
[3] Stuart Russell and Peter Norvig, “Artificial Intelligence: A Modern Approach”, second
edition, Pearson Education, 2004.
[4] Abu Sayed Md. Mostafizur Rahaman, Akram Hossain, Md. Abdur Rahman, Abeda
Sultana, Jesmin Akhter, “Genetic Programming: Novel Network Attacks Detection”, 8th
International Conference on Computer and Information Technology, 2005.
[5] “Policies for Network and Distributed Systems Management”,
http://www-dse.doc.ic.ac.uk/policies/, Imperial College.
[6]. P. Flegkas, P. Trimintzios, G. Pavlou, I. Andrikopoulos, C.F. Cavalcanti, “On
Policy-based Extensible Hierarchical Network Management in QoS-enabled IP Networks”,
Policies for Distributed Systems and Networks: International Workshop, POLICY 2001,
Bristol, UK, January 2001
[7] Wei Li, “Using Genetic Algorithm for Network Intrusion Detection”, Department of
Computer Science and Engineering, Mississippi State University, Mississippi State, MS 39762,
http://www.Security.cse.msstate.edu, 2004.
[8] Adhitya Chittur, “Model Generation for an Intrusion Detection System Using Genetic
Algorithms”, Ossining High School, Ossining, NY, November 27, 2001.
[9] Koza, John R., “Genetic Programming: On the Programming of Computers by Means
of Natural Selection”, MIT Press, Cambridge, MA, 1992.
[10] Crosbie, Mark and Spafford, Gene, “Applying Genetic Programming Techniques to
Intrusion Detection”, In Proceedings of the AAAI 1995 Fall Symposium, November 1995.
[11] Bob Adolf, “New Paradigms for Intrusion Detection Using Genetic Programming”,
January 7, 2004.