Radhika Thesis
Page 1: Radhika Thesis

1

Context-Aware Middleware for Activity Recognition

Master's Thesis Defense
Radhika Dharurkar

Advisor: Dr. Tim Finin
Committee: Dr. Anupam Joshi, Dr. Yelena Yesha, Dr. Laura Zavala

Page 2: Radhika Thesis

2

Overview
• Motivation
• Problem Statement
• Related Work
• Approach
• Implementation
• Experiments and Results
• Contribution
• Limitations
• Future Work
• Conclusion

Page 3: Radhika Thesis

3

Mobile Market

• 5.3 billion mobile subscribers (77% of the world's population)

• Smartphone market: predicted 30% growth per year

• 85% of mobile handsets can access the mobile web

Pictures Courtesy: Mobile Youth

Page 4: Radhika Thesis

4

Motivation
• Enhance user experience
  o Richer notion of context that includes functional and social aspects
    • Co-located social organizations
    • Nearby devices and people
    • Typical and inferred activities
    • Roles of the people
• Device understanding of "Geo-Social Location" and perhaps activity
• System support for service providers and administrators
  o Collaboration
  o Privacy
  o Trust

Page 5: Radhika Thesis

5

Motivation: Platys Project

Conceptual Place

• Tasks:
  o Semantic context modeling
  o Mobility tracking
  o Collaborative localization
  o Privacy and information sharing
  o Context representation, reasoning, and inference
  o Activity recognition

Page 6: Radhika Thesis

6

Problem
• Predict the activity of the user using a smart phone
• Capture data from the sensors present in the smart phone (atmospheric, transitional, temporal, etc.)
• Capture information about surrounding devices
• Capture statistics about phone usage (e.g. battery usage, call list)
• Capture information from other sources (e.g. calendar)
• Developed a prototype system that can predict about 10 activities with good precision

Page 7: Radhika Thesis

7

Platys Ontology

Page 8: Radhika Thesis

8

Activity Hierarchy

Page 9: Radhika Thesis

9

Related Work
• Roy Want, Veronica Falcao, Jon Gibbons. "The Active Badge Location System" (1992)
• Guanling Chen, David Kotz. "A survey of context-aware mobile computing research" (2000)
• Gregory D. Abowd, Anind K. Dey, Peter J. Brown, Nigel Davies, Mark Smith, and Pete Steggles. "Towards a better understanding of context and context-awareness" (1999)
• Stefano Mizzaro, Elena Nazzi, and Luca Vassena. "Retrieval of context-aware applications on mobile devices: how to evaluate?" (2008)

Page 10: Radhika Thesis

10

Related Work
• Nicholas D. Lane, Emiliano Miluzzo, Hong Lu, Daniel Peebles, Tanzeem Choudhury, and Andrew T. Campbell. "A Survey of Mobile Phone Sensing" (2010)
• Hong Lu, Jun Yang, Zhigang Liu, Nicholas D. Lane, Tanzeem Choudhury, Andrew T. Campbell. "The Jigsaw Continuous Sensing Engine for Mobile Phone Applications" (2010)
• Nathan Eagle, Alex (Sandy) Pentland, and David Lazer. "Inferring friendship network structure by using mobile phone data" (2009)
• Locale
• William G. Griswold, Patricia Shanahan, Steven W. Brown, Robert T. Boyer. "ActiveCampus", UCSD (2003)
• Harry Chen, Tim Finin, Anupam Joshi. CoBrA context broker (2002)

Page 11: Radhika Thesis

11

Background: Context

Pictures courtesy: 1) Mobile Youth 2) Zimmermann, A., Lorenz, A., Oppermann, R.: An operational definition of context.

Page 12: Radhika Thesis

12

Approach
o Automatically extract data from various data sources with the help of a smart phone
o Provide context modeling
  • Representation of context as ontologies
  • Represent the contextual information in a database
o Learning and reasoning
  • Supervised learning approach
  • Identify the feature set
  • Predict the activity of the user
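The learning-and-reasoning step above can be sketched as a minimal categorical Naïve Bayes classifier. The thesis used Weka's implementations; this pure-Python version, with invented feature names like place and time, is only an illustrative sketch of the same idea:

```python
from collections import Counter, defaultdict
import math

def train_nb(instances):
    """instances: list of (feature_dict, label) pairs.
    Collects label priors and per-(feature, label) value counts."""
    priors = Counter(label for _, label in instances)
    cond = defaultdict(Counter)       # (feature, label) -> Counter of values
    values = defaultdict(set)         # feature -> set of observed values
    for feats, label in instances:
        for f, v in feats.items():
            cond[(f, label)][v] += 1
            values[f].add(v)
    return priors, cond, values, len(instances)

def predict_nb(model, feats):
    """Pick the label maximizing log P(label) + sum of log P(value | label),
    with Laplace smoothing so unseen values never zero out a class."""
    priors, cond, values, n = model
    best, best_lp = None, float("-inf")
    for label, count in priors.items():
        lp = math.log(count / n)
        for f, v in feats.items():
            c = cond[(f, label)]
            lp += math.log((c[v] + 1) / (sum(c.values()) + len(values[f]) + 1))
        if lp > best_lp:
            best, best_lp = label, lp
    return best
```

With toy labeled instances such as `({"place": "Home", "time": "night"}, "Sleeping")`, `train_nb` builds the counts and `predict_nb` returns the most probable activity for a new observation.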

Page 13: Radhika Thesis

13

Architecture

Page 14: Radhika Thesis

14

Data Collection

User Tagging

Sensor Values

Page 15: Radhika Thesis

15

Data Collection

Page 16: Radhika Thesis

16

Data Extraction and Cleanup

Page 17: Radhika Thesis

17

Extracting Features

Page 18: Radhika Thesis

18

Classification

Page 19: Radhika Thesis

19

Toy Experiment
• Data collected through a framework developed by an eBiquity member, which stored it in a MySQL DB
• We added data from Google Calendar
• Data collected for one student and one staff member
• Automated understanding of calendar data
• Manual cleanup of the data
• Labeled instances to find "Conceptual Place"
  o Student: 422 - Home, Lab, Class, Elsewhere
  o Staff member: 280 - Home vs. Office

Page 20: Radhika Thesis

20

Google Calendar

Page 21: Radhika Thesis

21

Toy Experiment
• Data collected through a framework developed by senior members (Tejas), which stored it in a MySQL DB
• Captured Google Calendar data
• Data collected for one student and one staff member
• Automated understanding of calendar data
• Manual cleanup of the data
• Labeled instances
  o Student: 422 - Home, Lab, Class, Elsewhere
  o Staff member: 280 - Home, Office

Page 22: Radhika Thesis

22

Toy Experiment

Captured Data:
1. Device Id
2. Timestamp
3. Latitude
4. Longitude
5. Wi-Fi Status
6. Wi-Fi Count
7. Wi-Fi ID
8. Battery Status
9. Light
10. Proximity
11. Power Connected
12. User Present
13. Handset Plugged
14. Calendar Data
15. Temperature

Page 23: Radhika Thesis

23

Toy Experiment

Figure: Classification accuracy (%) for the Student and Post Doc datasets across five classifiers: Naïve Bayes, J48 trees, Random Trees, Bayes Net, and Random Forest.

Page 24: Radhika Thesis

24

Analysis
• Only a few activities -> therefore good accuracy
• Data is sparse -> cannot do proper training
• Presence of noise
• Artificially high decision value assigned to the information
• Overfitting

Page 25: Radhika Thesis

25

Experiment 1 - Statistics

• Data collected through an application built for the Android phone by Dr. Laura Zavala
• Added Bluetooth device capture functionality
• Data collected every 12 min for a duration of 1 min (notification)
• The last activity is saved if the user ignores the prompt
• Collects data from different sources:
  o Sensors
  o Nearby Wi-Fi devices
  o Nearby Bluetooth devices (paired, not paired)
  o GPS coordinates, geo-location
  o Call history
  o User tagging for place and activity

Page 26: Radhika Thesis

26

Experiment 1 - Statistics

• Collected data for 2 users for 2 weeks continuously
• Captured fine-grained activities
  o 19 for the student
  o 14 for the staff member
• Parsing of raw text data
• Cleaning up the data
• Transformation of data into feature vectors
• Use of discretization techniques for continuous attributes
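The transformation of a raw capture record into a discretized feature vector can be sketched as below; the field names and bin edges are hypothetical illustrations, not the thesis's actual ones:

```python
def to_features(record):
    """Map one raw capture record to a small categorical feature vector.
    Bin edges here are illustrative placeholders."""
    hour = record["hour"]
    if 6 <= hour < 12:
        tod = "Morning"
    elif 12 <= hour < 17:
        tod = "Afternoon"
    elif 17 <= hour < 22:
        tod = "Evening"
    else:
        tod = "Night"
    return {
        "time_of_day": tod,
        # discretize the light sensor into two coarse levels
        "light": "dark" if record["light"] < 50 else "bright",
        # discretize the visible Wi-Fi count
        "wifi_count": "many" if record["wifi_count"] >= 5 else "few",
        "place": record["place"],
    }
```

Each continuous sensor reading becomes a nominal value, which the decision-tree and Naïve Bayes classifiers can split on directly.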

Page 27: Radhika Thesis

27

Experiment 1 - Accuracy

Figure: Classification accuracy (%) for the Student and Post Doc datasets across five classifiers: Naïve Bayes, J48 trees, Random Trees, Bayes Net, and Random Forest.

Page 28: Radhika Thesis

28

Experiment 1 - Analysis
• Comparing with the Toy Experiment accuracy
  o Similar accuracy for Naïve Bayes and decision trees in the Toy Experiment
  o Big drop in accuracy for decision trees here
• In the Toy Experiment:
  o Overfitting
  o Noise
  o Missing data
• In this experiment:
  o We tried to improve the cleanup
  o Discretization for sensor values
  o Still have timestamp, Wi-Fi IDs, and similar attributes each as a single feature

Page 29: Radhika Thesis

29

Confused Activities - Student Data

Total | Main Activity             | Confused with
54    | Coffee/Snacks             | Working/Studying 12, Sleeping 5
218   | Working/Studying          | Coffee/Snacks 5, Sleeping 8, Chatting 8
39    | Reading                   | Working/Studying 19, Sleeping 4
26    | Cleaning                  | Working/Studying 10, Sleeping 2
195   | Sleeping                  | Working/Studying 9
17    | Cooking                   | Working/Studying 5, Sleeping 3, Cleaning 2
49    | Chatting/Talking on Phone | Working/Studying 14, Sleeping 2, Coffee/Snacks 2
6     | Class-Listening           | Class-TakingNotes 2
3     | Talk-Listening            | Class-TakingNotes 1, Working/Studying 1
1     | Watching Movie            | Sleeping 1
3     | Dinner                    | Working/Studying 3
9     | Watching TV               | Working/Studying 3, Sleeping 6
1     | Shopping                  | Working/Studying 1

Page 30: Radhika Thesis

30

Confused Activities - Staff Data

Total | Main Activity    | Confused with
525   | Working/Studying | Other/Idle 9, Sleeping 4, Watching TV 6
9     | Lunch            | Working/Studying 3, Other/Idle 1
72    | Sleeping         | Working/Studying 19, Other/Idle 2
11    | Cooking          | Working/Studying 3, Sleeping 2
78    | Other/Idle       | Working/Studying 13, Walking 1
18    | Watching TV      | Working/Studying 7, Other/Idle 1
2     | Shopping         | Cooking 1

Page 31: Radhika Thesis

31

Experiment 2 - Statistics

• Collected data for users for a month continuously
• Finer-grained activities captured
  o 19 for the student
• Some activities were hard to distinguish -> reduced to a small set of 9 activities for prediction
• Parsing of raw text data
• Cleaned up the data
• Use of discretization techniques for continuous attributes
• Used a "Bag of Words" approach for:
  o Wi-Fi
  o Geo-location
  o Bluetooth
  o Timestamp
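The "Bag of Words" treatment of observed identifiers (Wi-Fi or Bluetooth IDs) can be sketched as binary indicator features over the vocabulary of IDs seen in training; this is a generic illustration with invented names, not the thesis's code:

```python
def bag_of_ids(instances):
    """instances: list of sets of observed IDs (e.g. Wi-Fi BSSIDs).
    Returns (vocab, vectors): one binary indicator feature per distinct ID."""
    vocab = sorted(set().union(*instances))
    vectors = [[1 if w in obs else 0 for w in vocab] for obs in instances]
    return vocab, vectors
```

This turns a variable-length list like "access points visible right now" into a fixed-width vector a standard classifier can consume, the same way word presence is encoded for text.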

Page 32: Radhika Thesis

32

Experiment 2 - Accuracy

Figure: Classification accuracy (%) for Naïve Bayes, J48 trees, Bagging + J48 trees, LibSVM, and LibLinear, compared under a 66% percentage split and 10-fold cross-validation.

Page 33: Radhika Thesis

33

Experiment 2 - Confusion Matrix

  a   b   c   d   e   f   g   h   i   j   k   <-- classified as
677   1   0   0   0   0   4   0   0   0   2 | a = Sleeping
  0 186   0   0  20   0   3   0   5   0   0 | b = Walking
  0   0  27   0   0   0   0   0   0   0   0 | c = In Meeting
  0   2   0  65   0   4   0   0   0   0   0 | d = Playing
  0  37   0   0  37   0   0   0   4   0   0 | e = Driving/Transporting
  0   0   0   2   0 146   1   0   0   2   0 | f = Class-Listening
  8   0   0   0   0   2  52   2   0   0   8 | g = Lunch
  9   0   0   0   0   0   8  11   0   0   0 | h = Cooking
  0  11   0   0   6   0   0   0  13   0   0 | i = Shopping
  0   2   0   0   0   5   0   0   0   7   0 | j = Talk-Listening
  5   0   0   0   0   0   1   0   0   0  34 | k = Watching Movie
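From a confusion matrix like this (rows = true class, columns = predicted class), overall accuracy and per-class recall follow directly; this is a generic sketch, not code from the thesis:

```python
def accuracy(cm):
    """Overall accuracy: correct predictions (the diagonal) over all instances."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def recall(cm, i):
    """Recall for class i: correct predictions over true instances of class i."""
    return cm[i][i] / sum(cm[i])
```

For example, on a small 2-class matrix `[[8, 2], [1, 9]]` the accuracy is 17/20 and the recall of class 0 is 8/10.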

Page 34: Radhika Thesis

34

Experiment 2 - Analysis
• Small set of activities analyzed
• On an individual basis
• Naïve Bayes performance reduced
  o More features included
  o Less functional independence
• Decision tree accuracy improved
  o Bag-of-words approach
  o Concept hierarchy
  o Conjunctions
• In line with prior research:
  1) "Physical Activity Monitoring" (Aminian, Robert)
  2) "Activity Recognition from User-Annotated Acceleration Data" (Bao, Intille)
• Recognition accuracy is highest for the decision tree classifier => proved best for our model

Page 35: Radhika Thesis

35

Accuracy for Models

Figure: Classification accuracy (%, 82-100 range) for six models: 11 Activities, Stationary vs. Moving, 10 Activities, In Meeting vs. In Class, Home vs. School vs. Elsewhere, and Home vs. School.

Page 36: Radhika Thesis

36

Small Subset of Activities

• These activities do not have simple characteristics and are easily confused with other activities:
  o Phone kept on the table while working, lunch, coffee
  o Driving and walking at school
• No additional sensor data to capture some activities
• Model mostly relies on features like:
  o Wi-Fi IDs
  o Geographic location
  o Bluetooth IDs
  o Time of day
• Therefore, hard to predict activities across users
  o E.g. In Class, Cooking (does not predict based on sound levels)

Page 37: Radhika Thesis

37

General Model

Page 38: Radhika Thesis

38

Classifiers Evaluating Our Data

Machine Learning Algorithm | Evaluation Notes
Naïve Bayes classifier     | Independence assumption
Support vector machines    | Sensitive to noise and missing values
Decision trees             | Robust to errors, missing values, conjunctions
Random trees               | No pruning
Ensembles of classifiers   | Reduce variance

Page 39: Radhika Thesis

39

Discretization
• Filters: unsupervised attribute filter

• Binning

• Concept hierarchy

• Division into intervals

• Smoothing the data
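Equal-width binning, the simplest form of the unsupervised discretization listed above, can be sketched in a few lines; this is a generic illustration, not the thesis's actual Weka filter configuration:

```python
def equal_width_bins(values, k):
    """Unsupervised equal-width discretization: map each value to one of k
    interval indices over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0   # guard against a constant attribute
    # the top edge value would land in bin k, so clamp it into bin k-1
    return [min(int((v - lo) / width), k - 1) for v in values]
```

Each continuous sensor attribute (light, temperature, noise level) is replaced by its bin index, which classifiers then treat as a nominal value.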

Page 40: Radhika Thesis

40

Bagging with J48
• Ensemble learning algorithm

• Averaging over bootstrap samples reduces error from variance, especially when small differences in the training set can produce a big difference between hypotheses.
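The bootstrap-and-vote idea can be sketched in pure Python; the toy base learner and data below are invented for illustration (the thesis used Weka's Bagging meta-classifier with J48 as the base learner):

```python
import random
from collections import Counter

def stump(train):
    """Toy base learner: for each feature value, predict the majority label
    seen with it; fall back to the overall majority label."""
    by_val = {}
    for x, y in train:
        by_val.setdefault(x, Counter())[y] += 1
    default = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: by_val[x].most_common(1)[0][0] if x in by_val else default

def bagged_predict(train, x, n_models=25, seed=0):
    """Bagging: fit each model on a bootstrap resample of the training
    set, then take a majority vote over the models' predictions."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]   # bootstrap sample
        votes[stump(sample)(x)] += 1
    return votes.most_common(1)[0][0]
```

Because each model sees a slightly different resample, unstable learners (like deep decision trees) disagree on their individual errors, and the vote averages those errors away.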

Page 41: Radhika Thesis

41

Example: J48 + Bagging

Place = Home: Sleeping (9.0/2.0)
Place = ITE346: In Meeting (1.0)
Place = Outdoors
|  G1 = False
|  |  Morning = True: Walking (5.0/2.0)
|  |  Morning = False: Driving/Transporting (17.0/2.0)
|  G1 = True: Walking (2.0)
Place = Home
|  Evening = False: Sleeping (20.0)
|  Evening = True
|  |  noise = '(-inf-28.19588]': Cooking (0.0)
|  |  noise = '(28.19588-32.71862]': Cooking (2.0)
|  |  noise = '(32.71862-inf)': Watching Movie (1.0)
Place = Restaurant: Lunch (5.0)
Place = Movie Theater: Watching Movie (2.0)
Place = Elsewhere: Walking (1.0)
Place = ITE325: Talk-Listening (4.0)
Place = ITE3338/ITE377: In Meeting (2.0)
Place = Groceries store: Shopping (1.0)

loc2 = '(-inf-39.17259]': Watching Movie (2.0)
loc2 = '(39.17259-39.18528]': Sleeping (0.0)
loc2 = '(39.18528-39.19797]': Lunch (4.0)
loc2 = '(39.24873-39.26142]': Walking (9.0/2.0)

Afternoon = False
|  Evening = False
|  |  Place = Outdoors: Walking (1.0)
|  |  Place = Elsewhere: Sleeping (0.0)
|  Evening = True: Walking (4.0)
Afternoon = True
|  Wifi Id8 = True: In Meeting (3.0)
|  Wifi Id8 = False
|  |  Place = Home: Lunch (0.0)
|  |  Place = Restaurant: Lunch (4.0)
|  |  Place = Movie Theater: Watching Movie (2.0)
|  |  Place = Work/School: Working (1.0)
|  |  Place = ITE346: Lunch (0.0)
|  |  Place = Outdoors: Walking (1.0)
|  |  Place = ITE3338/ITE377: Lunch (0.0)

Wifi Id8 = True: In Meeting (6.0/1.0)
Wifi Id8 = False
|  Afternoon = False
|  |  Evening = False: Sleeping (24.0/1.0)
|  |  Evening = True: Walking (5.0)
|  Afternoon = True
|  |  Place = Work/School: Working (1.0)
|  |  Place = ITE346: Lunch (0.0)
|  |  Place = Outdoors: Walking (1.0)
|  |  Place = Home: Lunch (0.0)
|  |  Place = ITE3338/ITE377: Lunch (0.0)

Page 42: Radhika Thesis

42

Contribution
• Smart phone used for mid-level activity recognition (supervised learning approach)
• High-level notion of context
• Accuracy of 88% for 9 activities for a user
• Accuracy in line with other research:
  o Home vs. Work: 100%, compared to 95% accuracy in an MIT project using HMMs
  o Mid-level detailed activity recognition (Bao and Intille, MIT)
  o Highest recognition accuracy for a decision tree classifier (Bao and Intille, MIT)
• General model

Page 43: Radhika Thesis

43

Applications

Figure: Activity distribution over a week (Mon-Sun) for 14 activities: Walking, Working, In Meeting, Driving, Other/Idle, Watching TV, Sleeping, Cooking, Talk-Listening, Lunch, Watching Movie, Reading, Shopping, Coffee/Snacks.

Page 44: Radhika Thesis

44

Applications

Figure: Weekday activity distribution over a 24-hour timeline (0:00-22:48) for 11 activities: Sleeping, Studying, Coffee/Snacks, Reading, Driving/Transporting, Walking, In Meeting, Lunch, Class-Listening, Class-Taking Notes, Chatting.

Page 45: Radhika Thesis

45

Applications

Figure: Weekend activity distribution over a 24-hour timeline (0:00-22:48) for 10 activities: Sleeping, Studying, Coffee/Snacks, Reading, Walking, Transporting, Shopping, Chatting, Playing, Other.

Page 46: Radhika Thesis

46

Applications
• Understand patterns of activities for users
• Keep a check on time spent
  o Planner
  o Study schedules
  o Program meetings
• Update phone settings according to context
• Recommendation systems
• Locate a specific service nearby
• Adjust the user's presence status
• Update a user's calendar

Page 47: Radhika Thesis

47

Limitations
• Set of experiments
  o Duration of data capture
  o Number of users for capturing data
• Information captured only through the phone
• No audio/sound processing
• Training on data from different individuals is needed for a general model

Page 48: Radhika Thesis

48

Future Work
• Robust general model
• Multiple feature sets for different kinds of predictions
• Roles management
• Rules for some ground truths or profiles
• Collaborative activity inference
• Models to incorporate sequences of activities

Page 49: Radhika Thesis

49

Thank you

Page 50: Radhika Thesis

50

ES - Decision Trees
• Each node = an attribute
• Each leaf gives a classification result
• Root node = the attribute with the most information gain (Claude Shannon). If there are equal numbers of yeses and noes, there is a great deal of entropy in that value, and information reaches a maximum.
  Info = -Σ (i = 1..m) p_i log2 p_i
• Example: an attribute value with 2 yes, 3 no: I([2,3]) = -2/5 × log 2/5 - 3/5 × log 3/5
• Average the entropies of the attribute's values (weighted by size) and subtract from I(whole) to get the gain
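The slide's entropy example can be checked directly; `info` and `gain` below implement the quoted formula using log base 2, the usual convention:

```python
import math

def info(counts):
    """Entropy in bits of a class-count list: -sum of p_i * log2(p_i)."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def gain(parent, splits):
    """Information gain: I(parent) minus the size-weighted
    average entropy of the splits on an attribute."""
    n = sum(parent)
    return info(parent) - sum(sum(s) / n * info(s) for s in splits)

# The slide's example: an attribute value with 2 yeses and 3 noes
print(round(info([2, 3]), 3))  # → 0.971
```

A perfectly mixed node (`info([2, 2]) == 1.0` bit) split into two pure children yields the maximum gain of 1.0, which is exactly the criterion J48 uses to pick the root attribute.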

Page 51: Radhika Thesis

51

Classification via Decision Trees

• Effective with nominal data
• Pruning corrects potential overfitting
• Confidence factor = 0.25
• Minimum number of objects = 2
• Error estimation = (e+1)/(N+m)
• Reduced error pruning: False
• Subtree raising: True

"Decision Tree Analysis using Weka" - Sam Drazin, Matt Montag
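The error-estimation formula on the slide can be written out as a one-liner; the interpretation of the parameters is an assumption here, following the usual Laplace-style smoothed estimate used when deciding whether to prune a leaf:

```python
def leaf_error_estimate(e, n, m):
    """Smoothed error estimate from the slide: (e + 1) / (N + m).
    Assumed meaning: e = misclassified instances at the leaf,
    N = instances reaching the leaf, m = smoothing constant
    (e.g. the number of classes). Hypothetical naming, for illustration."""
    return (e + 1) / (n + m)
```

The +1/+m smoothing keeps a leaf that happens to classify a handful of training instances perfectly from being credited with a zero error rate.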

