N-RMP ALGORITHM FOR MINING BEHAVIORAL PATTERNS IN WIRELESS SENSOR NETWORKS (WSN)
INTRODUCTION

Sensor association rules are very useful for determining frequent patterns in wireless sensor networks (WSN). Mining algorithms generate patterns from the sensor responses and match them against an existing database of frequent patterns. If an anomaly or delay is detected, the sensors that have stopped working are identified and the necessary actions are taken.
These mining algorithms help in two important respects:
1. Memory usage is optimized.
2. Execution time is reduced.
We propose mining algorithms to overcome two essential challenges:
1. Applying mining to online datasets.
2. Increasing efficiency for resource-constrained systems.
LITERATURE REVIEW

Reference | Technique used | Performance
[1], [2] | Apriori Algorithm | • Database was scanned multiple times • Execution time was high • Memory usage was high
This project | N-RMP Algorithm | • Database was scanned only once • Execution time was very low • Memory usage was optimum
SYSTEM REQUIREMENTS

Hardware requirements:
Processor : Any processor above 500 MHz.
RAM : 128 MB.
Hard disk : 10 GB.
Compact disk : 650 MB.
Input device : Standard keyboard and mouse.
Output device : VGA and high-resolution monitor.

Software requirements:
Operating system : Windows family.
Language : JDK 1.5
SYSTEM ANALYSIS

EXISTING SYSTEM
• Existing systems run on static, not dynamic, datasets.
• Algorithms that run on static datasets are bound to fail to provide correct responses for dynamic data.
• These systems run algorithms with a higher degree of time complexity.
• The existing systems therefore have the following disadvantages:
• They use more memory.
• Execution time is high.
PROPOSED SYSTEM
• Systems connected to the internet that can run on a continuously updating database.
• Systems where a sliding-window protocol can be applied.
• The degree of time complexity is reduced by the newer algorithms applied in these systems.
• The systems will have the following advantages:
• They can run on dynamic datasets.
• They use less memory, so efficiency is increased.
• Execution time is reduced manifold.
SYSTEM DESIGN : ARCHITECTURE
ALGORITHMS APPLIED ON THE DATABASE COLLECTED FROM THE SINK NODE:
Apriori algorithm
N-RMP algorithm
IMPLEMENTATION
INTRODUCTION TO THE APRIORI ALGORITHM

The Apriori algorithm is an influential algorithm for mining frequent item-sets for boolean association rules. Some key points of the Apriori algorithm:
• It mines frequent item-sets from a traditional database for boolean association rules.
• Every subset of a frequent item-set must also be a frequent item-set. For example, if {l1, l2} is a frequent item-set, then {l1} and {l2} must also be frequent item-sets.
• It finds frequent item-sets iteratively.
• It uses the frequent item-sets to generate association rules.
CONCEPTS
• A set of all items in a store.
• A set of all transactions (transactional database T); each transaction is a set of items.
• Each transaction has a transaction ID (TID).
Initial frequent set
Candidate generation
Candidate pruning
Support calculation
CONCEPTS
• Apriori uses a level-wise search, where frequent k item-sets are used to explore (k+1) item-sets.
• Frequent subsets are extended one item at a time; this is known as the candidate-generation process.
• Groups of candidates are tested against the data.
• The algorithm identifies the frequent individual items in the database and extends them to larger and larger item-sets, as long as those item-sets appear sufficiently often in the database.
• The Apriori algorithm determines frequent item-sets in order to derive association rules.
• Any item-set that has an infrequent subset is itself infrequent and can be pruned.
APRIORI ALGORITHM – THE PSEUDO CODE

Join step: Ck+1 is generated by joining Lk with itself.
Prune step: Any k item-set that is not frequent cannot be a subset of a frequent (k+1) item-set.

Pseudo-code:
Ck : candidate item-sets of size k
Lk : frequent item-sets of size k

L1 = {frequent 1-item-sets};
for (k = 1; Lk != empty; k++) do begin
    Ck+1 = candidates generated from Lk;
    for each transaction t in the database do
        increment the count of all candidates in Ck+1 that are contained in t;
    Lk+1 = candidates in Ck+1 with support >= min_support;
end
return the union of all Lk;
HOW THE ALGORITHM WORKS
• Build the candidate list of k item-sets and extract the frequent list of k item-sets using the support count.
• Use the frequent list of k item-sets to determine the candidate and frequent lists of (k+1) item-sets, pruning any candidate that has an infrequent subset.
• Repeat until the candidate or frequent list of k item-sets is empty.
• Return the frequent item-sets found up to size k-1.
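The candidate-generation (join) and pruning steps described above can be sketched in Java, the implementation language listed in the system requirements. This is a minimal illustrative sketch, not the project code; the class and method names are our own.

```java
import java.util.*;

// A minimal sketch of Apriori's join and prune steps.
public class AprioriSteps {

    // Join step: form (k+1)-item-sets by uniting pairs of frequent
    // k-item-sets that differ in exactly one item.
    static Set<Set<Integer>> joinStep(Set<Set<Integer>> frequentK) {
        Set<Set<Integer>> candidates = new HashSet<>();
        for (Set<Integer> a : frequentK) {
            for (Set<Integer> b : frequentK) {
                Set<Integer> union = new TreeSet<>(a);
                union.addAll(b);
                if (union.size() == a.size() + 1) {
                    candidates.add(union);
                }
            }
        }
        return candidates;
    }

    // Prune step: drop any candidate with an infrequent k-subset,
    // since every subset of a frequent item-set must itself be frequent.
    static Set<Set<Integer>> pruneStep(Set<Set<Integer>> candidates,
                                       Set<Set<Integer>> frequentK) {
        Set<Set<Integer>> kept = new HashSet<>();
        for (Set<Integer> c : candidates) {
            boolean allSubsetsFrequent = true;
            for (Integer item : c) {
                Set<Integer> subset = new TreeSet<>(c);
                subset.remove(item);
                if (!frequentK.contains(subset)) {
                    allSubsetsFrequent = false;
                    break;
                }
            }
            if (allSubsetsFrequent) kept.add(c);
        }
        return kept;
    }
}
```

Joining {1,3}, {1,5}, {2,3}, {2,5}, {3,5} yields the four 3-item candidates {1,2,3}, {1,2,5}, {1,3,5}, {2,3,5}; pruning then removes the first two because {1,2} is not frequent.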
EXAMPLE OF THE APRIORI ALGORITHM

Consider the following database:

TID Items
T1 1 3 4
T2 2 3 5
T3 1 2 3 5
T4 2 5
T5 1 3 5
Step 1 : Minimum support count = 2

Candidate item-set 1:
Item-set Support
{1} 3
{2} 3
{3} 4
{4} 1
{5} 4

Prune ({4} falls below the minimum support count)

Frequent item-set 1:
Item-set Support
{1} 3
{2} 3
{3} 4
{5} 4
STEP 2 :

Candidate item-set 2:
Item-sets Support
{1,2} 1
{1,3} 3
{1,5} 2
{2,3} 2
{2,5} 3
{3,5} 3

Prune ({1,2} falls below the minimum support count)

Frequent item-set 2:
Item-sets Support
{1,3} 3
{1,5} 2
{2,3} 2
{2,5} 3
{3,5} 3
STEP 3 :

Frequent item-set 2:
Item-sets Support
{1,3} 3
{1,5} 2
{2,3} 2
{2,5} 3
{3,5} 3

Checking the 2-item subsets of each 3-item candidate:
Item-sets | Subsets | All in FI2?
{1,2,3} | {1,2} {1,3} {2,3} | NO
{1,2,5} | {1,2} {1,5} {2,5} | NO
{1,3,5} | {1,3} {1,5} {3,5} | YES
{2,3,5} | {2,3} {2,5} {3,5} | YES

{1,2,3} and {1,2,5} are discarded. Reason: every subset of a frequent item-set must also be a frequent item-set.

Candidate item-set 3 (= Frequent item-set 3, since both meet the minimum support count):
Item-sets Support
{1,3,5} 2
{2,3,5} 2
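The support counts in the tables above can be checked mechanically. Below is a small Java sketch that counts support over the example transactions; the printed counts are consistent with T1 containing items {1, 3, 4} (the classic Apriori textbook example), which this sketch assumes. All names are illustrative.

```java
import java.util.*;

// Verifying the worked example: count the support of an item-set
// over the five example transactions.
public class SupportCount {

    // Number of transactions that contain every item in the item-set.
    static int support(List<Set<Integer>> transactions, Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> t : transactions) {
            if (t.containsAll(itemset)) count++;
        }
        return count;
    }

    static List<Set<Integer>> exampleDatabase() {
        List<Set<Integer>> db = new ArrayList<>();
        db.add(new HashSet<>(Arrays.asList(1, 3, 4)));    // T1
        db.add(new HashSet<>(Arrays.asList(2, 3, 5)));    // T2
        db.add(new HashSet<>(Arrays.asList(1, 2, 3, 5))); // T3
        db.add(new HashSet<>(Arrays.asList(2, 5)));       // T4
        db.add(new HashSet<>(Arrays.asList(1, 3, 5)));    // T5
        return db;
    }
}
```

With this database, {1,3,5} and {2,3,5} each reach the minimum support count of 2, while {1,2} and {4} fall below it, as in the tables.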
INTRODUCTION TO THE N-RMP ALGORITHM

The N-RMP (Non-Redundant Mining Process) algorithm is a three-step mining process:
Step 1 : Scanning of the dataset
Step 2 : Processing the non-redundant data and discarding the redundant data
Step 3 : Generation of the frequent item-sets

N-RMP is able to capture the information with one scan over the stream of sensor data and store it in a memory-efficient, highly compact manner similar to an FP-tree.
CONCEPTS
• The SP-tree is a frequency-descending compact tree structure.
• Each epoch in the sensor database DS is inserted into the SP-tree in lexicographic order, and a header list, the S-list, is also built at this stage.
• Once all epochs of DS are inserted into the tree, it is reorganized into a frequency-descending tree based on the frequencies recorded in the S-list.
• SP-tree construction therefore consists of two phases:
• Insertion phase: epochs from DS are inserted into a lexicographic tree and the header list (S-list) is built.
• Reorganization phase: the S-list is rearranged in frequency-descending order and the SP-tree is restructured accordingly.
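The insertion phase described above can be sketched as a prefix-tree insert that increments counts along shared prefixes. This is a hypothetical sketch of the idea, not the N-RMP implementation; the class and field names are our own, and the reorganization phase is omitted.

```java
import java.util.*;

// A minimal sketch of the SP-tree insertion phase: each epoch's sensor
// list is inserted into a prefix tree, incrementing counts on shared
// prefixes. Children are kept sorted, matching the lexicographic tree.
public class SPTree {
    final Map<String, SPTree> children = new TreeMap<>();
    int count = 0;

    // Insert one epoch, given as a lexicographically ordered sensor list.
    void insert(List<String> epoch) {
        SPTree node = this;
        for (String sensor : epoch) {
            node = node.children.computeIfAbsent(sensor, s -> new SPTree());
            node.count++;
        }
    }

    // Total nodes in the tree, root excluded: a rough compactness measure.
    int size() {
        int n = 0;
        for (SPTree child : children.values()) n += 1 + child.size();
        return n;
    }
}
```

Inserting the six example epochs used later in this presentation gives s1:5 on the root's first branch and s2:3 beneath it, matching the lexicographic tree.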
N-RMP ALGORITHM – PSEUDO CODE

1) FG := {}; // global list of frequent generators
2) fill C1 with 1-item-sets and count their supports;
3) copy frequent item-sets from C1 to F1;
4) mark item-sets in F1 as "closed";
5) mark item-sets in F1 as "key" if their support < |O|; // |O| is the number of objects in the input dataset
6) if there is a full column in the input dataset, then FG := {};
7) i := 1;
8) loop
9) {
10) Ci+1 := NRMP-Gen(Fi);
11) if Ci+1 is empty then break from loop;
12) count the support of "key" item-sets in Ci+1;
13) if Ci+1 has an item-set whose support = pred_supp, then mark it as "not key";
14) copy frequent item-sets to Fi+1;
15) if an item-set in Fi+1 has a subset in Fi with the same support, then mark the subset as "not closed";
16) copy "closed" item-sets from Fi to Zi;
17) Find-Generators(Zi);
18) i := i + 1;
19) }
20) copy item-sets from Fi to Zi;
21) Find-Generators(Zi);
EXAMPLE OF N-RMP :

Consider the following dataset:

TID Epoch
T100 s1 s2 s3 s4 s7 s8
T200 s1 s5 s6
T300 s2 s5 s6 s7 s8
T400 s1 s2 s4 s7
T500 s1 s2 s4 s5
T600 s1 s3 s4 s7
STEP 1.0 : BUILDING OF LEXICOGRAPHIC TREE AND CORRESPONDING SP-TREE
S-list
S1 = 5
S2 = 4
S3 = 2
S4 = 4
S5 = 3
S6 = 2
S7 = 4
S8 = 2
Table : S-list from the given data
Tree : Interpretation of S-list (epochs inserted in lexicographic order)

{ }
├── s1:5
│   ├── s2:3
│   │   ├── s3:1 ── s4:1 ── s7:1 ── s8:1
│   │   └── s4:2
│   │       ├── s5:1
│   │       └── s7:1
│   ├── s3:1 ── s4:1 ── s7:1
│   └── s5:1 ── s6:1
└── s2:1 ── s5:1 ── s6:1 ── s7:1 ── s8:1
STEP 2.0 : FREQUENCY DESCENDING SP-TREE
S-list
S1 = 5
S2 = 4
S4 = 4
S7 = 4
S5 = 3
S3 = 2
S6 = 2
S8 = 2
Table : Sorted list (Ssort) from the S-list

[Figure] Tree : SP-tree traversed and restructured according to descending frequency
STEP 3 : COMPRESSING THE TREE

[Figure] Tree : single-child chains of the SP-tree merged into compressed nodes, e.g. the branch s2, s5, s6, s7, s8 : 1
CONCLUSION
• Apriori
• Apriori has a higher runtime.
• It consumes more memory than N-RMP.
• It uses large item-sets.
• It assumes the transaction database is memory-resident.
• It requires multiple database scans.
• N-RMP
• The N-RMP algorithm is more efficient than Apriori.
• Lower runtime.
• Consumes less memory.
• The transaction database is discarded to save more memory.
• Requires just a single database scan.
Graphical representation of the analysis:

[Chart] Execution time comparison between N-RMP & Apriori (x-axis: Min_support (%); y-axis: Time (ms))
[Chart] Memory usage comparison between Apriori and N-RMP
FURTHER ENHANCEMENTS
• The algorithm is designed for dynamic datasets.
• The algorithm should adapt to the rate of data flow: if the rate is high, it selects a smaller time window to run over; if the rate is low, it selects a bigger time window, making it a more efficient algorithm.
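The adaptive behaviour proposed above can be sketched as a simple mapping from the observed data rate to a window length. The thresholds and window sizes below are illustrative assumptions only, not values from the project.

```java
// A hypothetical sketch of the adaptive-window idea: a smaller time
// window when the data-flow rate is high, a larger one when it is low.
public class AdaptiveWindow {
    static final int HIGH_RATE = 100; // epochs per second, assumed threshold
    static final int LOW_RATE = 10;   // epochs per second, assumed threshold

    // Returns the mining window length (seconds) for the observed rate.
    static int windowSeconds(int epochsPerSecond) {
        if (epochsPerSecond >= HIGH_RATE) return 5;  // fast stream: small window
        if (epochsPerSecond <= LOW_RATE) return 60;  // slow stream: large window
        return 30;                                   // moderate stream
    }
}
```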
REFERENCES
[1] P.-N. Tan, "Knowledge discovery from sensor data," Sensors, pp. 14-19, 2006.
[2] A. Boukerche and S.A. Samarah, "A novel algorithm for mining association rules in wireless ad hoc sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 7, pp. 865-877, 2008.
[3] S.K. Tanbeer, C.F. Ahmed, and B.S. Jeong, "An efficient single-pass algorithm for mining association rules from wireless sensor networks," IETE Technical Review, vol. 26, issue 4, 2009.
THANK YOU.