
Solving Data Management Problems in a Personal Health Network with Tanja

Date post: 03-Dec-2023

of the period. The beginning of the exercise period is obtained by subtracting its duration (10 minutes). Figure 4 shows the heart rate stream of an athlete during two consecutive periods of activity.

Input: An XML file corresponding to the sensor stream of one participant
Output: The end time of the working period

Start from End(Stream)
Calculate all 10-minute averages AVG(i) for T(WT) + T(Surplus)
foreach AVG(i) do
    if AVG(i) < SmallestAvg then
        SmallestAvg = AVG(i);
// locate the smallest 10-minute moving average
Return Time(SmallestAvg);

Fig. 5. The detectEffort algorithm
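The steps of Fig. 5 can be sketched in Python as follows. This is a minimal sketch under assumptions that are not in the source: the XML stream has already been parsed into a plain list of heart rate values sampled every 5 seconds, the whole list is searched rather than only T(WT) + T(Surplus), and the returned time is the offset (in seconds from the start of the stream) at which the window with the smallest average ends. The function name and its parameters are hypothetical.

```python
def detect_effort(hr_values, interval_s=5, window_s=600):
    """Return (end_time_s, smallest_avg) for the window with the smallest moving average.

    Assumption: reading i was taken i * interval_s seconds after the stream start.
    """
    n = window_s // interval_s            # readings per window (120 for 10 min at 5 s)
    if len(hr_values) < n:
        raise ValueError("stream is shorter than one window")

    # Start from the end of the stream: the last n readings form the first window.
    window_sum = sum(hr_values[-n:])
    smallest_avg = window_sum / n
    smallest_end = len(hr_values)         # exclusive end index of the best window

    # Slide the window one reading at a time towards the start of the stream.
    for end in range(len(hr_values) - 1, n - 1, -1):
        window_sum += hr_values[end - n] - hr_values[end]
        avg = window_sum / n
        if avg < smallest_avg:
            smallest_avg = avg
            smallest_end = end

    return smallest_end * interval_s, smallest_avg
```

The beginning of the period would then be the returned end time minus `window_s`, as described in the text above.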

4 Outlier Removal

The sensor network will generate outlier values, which clearly lie outside the range of values that are normally acceptable. Before applying any transformation, search or query to the sensed data, we must detect and calibrate such outliers. We present here a generic method that operates directly on the XML sensor output and can be parameterized by sports specialists. Four major steps must be performed: Set Valid Range, Identify Candidate Outliers, Identify Actual Outliers and Calibrate Outlier.

4.1 Set Valid Range, Min Value and Max Value

During heart rate (HR) monitoring, and taking into account the age group to which the athlete belongs, the following function can be used to identify the upper and lower limits of the range that contains valid values for outlier candidates, where hr is the measured heart rate:

$$
f_{validRange}(hr, HRLimit, age, variance) =
\begin{cases}
\text{FALSE} & \text{if } hr > (HRLimit - age)\,(1 + variance) \\
\text{FALSE} & \text{if } hr < (HRLimit - age)\,(1 - variance/2) \\
\text{TRUE} & \text{otherwise}
\end{cases}
\tag{1}
$$

As a rule of thumb, if we take HRLimit = 220 and the athlete in a given experiment is 25 years old, then the probable maximum HR is ProbableMax = HRLimit − age = 195. If we add a 10% variance, then any value up to MaxValue = ProbableMax + 10% ≈ 214 is a possibly valid maximum HR value, and any value above this is considered an outlier. For the minimum HR value we have MinValue = 195 − 5% ≈ 185, so any value below 185 cannot be an outlier. The latter value is an interesting result because an athlete's heart rate is affected by many variations at the beginning of an exercise period.
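The worked example above can be reproduced with a few lines of Python. This is a sketch only: the function names are hypothetical, the variance is passed as a fraction (0.10 for 10%), and the lower bound follows the worked example, that is, half the variance below ProbableMax.

```python
def valid_range_bounds(hr_limit, age, variance):
    """Return (min_value, max_value) of the range of candidate outliers."""
    probable_max = hr_limit - age
    min_value = probable_max * (1 - variance / 2)   # e.g. 195 - 5%
    max_value = probable_max * (1 + variance)       # e.g. 195 + 10%
    return min_value, max_value


def f_valid_range(hr, hr_limit=220, age=25, variance=0.10):
    """TRUE when hr lies inside the valid range, FALSE otherwise."""
    min_value, max_value = valid_range_bounds(hr_limit, age, variance)
    return min_value <= hr <= max_value


low, high = valid_range_bounds(220, 25, 0.10)
print(round(low, 2), round(high, 2))  # prints 185.25 214.5
```

The unrounded bounds are 185.25 and 214.5; the text quotes them as 185 and 214.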

4.2 Identify Candidate Outliers

The file containing the data stream of heart rates must be read from beginning to end. We examine all HR values that fall within the range defined above as valid, and each of these is considered a Candidate Outlier. In the above example, this is any value between 185 and 214.

4.3 Identify Actual Outlier

The five HR values before and the five after a candidate outlier are read. The average of these 10 values, Mean Compare, is calculated; if the candidate deviates from Mean Compare by more than 1.5% (this percentage being a configurable parameter), the candidate is declared an actual outlier.

4.4 Calibrate Outlier

Once an HR value is considered an actual outlier, it is replaced by the previously calculated Mean Compare value.
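Steps 4.2 to 4.4 can be sketched together in Python. The sketch makes several assumptions that the source does not state: the HR values are held in a plain list, candidates that have fewer than five neighbours on either side are left untouched, Mean Compare is always computed from the raw (unrevised) neighbours, and "outside the 1.5% of Mean Compare" is read as a relative deviation greater than 1.5%. The function and parameter names are hypothetical.

```python
def calibrate_outliers(hr_values, min_value, max_value, tolerance=0.015, window=5):
    """Return a copy of hr_values with actual outliers replaced by Mean Compare."""
    revised = list(hr_values)
    for i, hr in enumerate(hr_values):
        # 4.2: only values inside the range are candidate outliers.
        if not (min_value <= hr <= max_value):
            continue
        neighbours = hr_values[max(0, i - window):i] + hr_values[i + 1:i + 1 + window]
        if len(neighbours) < 2 * window:
            continue  # assumption: skip candidates too close to the stream edges
        # 4.3: compare the candidate with the mean of its 10 neighbours.
        mean_compare = sum(neighbours) / len(neighbours)
        if abs(hr - mean_compare) > tolerance * mean_compare:
            # 4.4: calibrate the actual outlier.
            revised[i] = mean_compare
    return revised
```

For instance, a single reading of 200 surrounded by readings of 150 falls inside the 185 to 214 range, differs from Mean Compare (150) by far more than 1.5%, and is therefore replaced by 150.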

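[Figure 6: diagram of the transformed XML schema. Element names recoverable from the figure: Experiment, Athlete, SensorData, Device, StartTime, Interval, Reading, Offset, Time, Key, RawValue, Outlier, RevisedValue, DataAggregation, PeakHR, MeanHR, TimeLength, AvgPerformance.]

Fig. 6. Transformed XML Schema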

5 Data Management Strategy

We now discuss the transformation processes necessary to make queries more manageable for the users of the proposed system, mainly sports scientists and specialists in the field of athletics. Figure 6 shows the XML schema needed for the structured, filtered and classified data obtained from sensors. This schema can be considered the entry point of the three transformation processes that we describe here. Each of these processes yields an extended schema whose objective is to facilitate knowledge-based queries.

5.1 Collecting Information

The objective here is the acquisition of aggregates. The XML files obtained are rich in data but dependent on their context: an isolated heart rate reading does not give us much useful information. When heart rate readings are viewed within a broader category of values, however, they provide a valuable comparative metric of an athlete's performance. When the filtering process described above is finished, we must add the State information (see footnote 2) to each sensor reading and store it in the XML file as well. We must then mine these XML files to produce aggregated information for each athlete during their physical activity. Depending on the type of activity, we can calculate several layers of aggregate information to generate reports that help sports specialists analyze and optimize the performance of their athletes. For example, the following aggregate information can be obtained during athletic activity:

– PeakHR: the maximum heart rate value of an athlete during a period of activity.

– HR as a percentage of PeakHR: the heart rate reading of an athlete expressed as a percentage of the maximum heart rate value during a period of activity.

– The average heart rate value of an athlete during a complete period of activity.
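The three aggregates listed above can be computed directly from the calibrated readings of one activity period. The following Python sketch assumes the readings are available as a list of numbers; the function name and the dictionary keys are hypothetical.

```python
def aggregate_period(hr_values):
    """Compute PeakHR, the mean HR and each reading as a percentage of PeakHR."""
    peak_hr = max(hr_values)
    mean_hr = sum(hr_values) / len(hr_values)
    pct_of_peak = [100 * hr / peak_hr for hr in hr_values]
    return {"peak_hr": peak_hr, "mean_hr": mean_hr, "pct_of_peak": pct_of_peak}


print(aggregate_period([100, 150, 200]))
# prints {'peak_hr': 200, 'mean_hr': 150.0, 'pct_of_peak': [50.0, 75.0, 100.0]}
```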

5.2 PeakHeartRate Treatment

This process computes the Peak Heart Rate from the data sensed for an athlete and calculates the length of time spent in the different percentage ranges of their PeakHR. The result is enriched using the XML format of Figure 3 and stored in the MonetDB database [MonetDB, 2008], ready for sports specialists to begin formulating queries in the XQuery language.
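As an illustration of the time-length calculation, the following Python sketch accumulates the seconds spent in each percentage range of PeakHR. The source does not specify the ranges, so 10-point bands are assumed here, each labelled by its lower bound, with readings at exactly 100% counted in the top band. A 5-second reading interval is also assumed, and all names are hypothetical.

```python
from collections import Counter


def time_in_peak_bands(hr_values, interval_s=5, band_width=10):
    """Return {band_lower_bound_pct: seconds spent in that band of PeakHR}."""
    peak_hr = max(hr_values)
    seconds = Counter()
    for hr in hr_values:
        pct = 100 * hr / peak_hr
        # Map the percentage to the lower bound of its band; 100% joins the top band.
        band = min(int(pct // band_width) * band_width, 100 - band_width)
        seconds[band] += interval_s
    return dict(seconds)


print(time_in_peak_bands([100, 140, 180, 200]))  # prints {50: 5, 70: 5, 90: 10}
```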

5.3 PeakInterval Transformation

Here a process called PeakInterval Transformation is defined, which can be understood from the following example of a database query: "the period of time an athlete has been exercising while maintaining a level of 70% or higher of their PeakHR for 10 seconds or more". The different estimated PeakInterval levels for an athlete are then calculated. With this transformation, much more complex queries about the performance of athletes can be answered.

2 A state refers to some working interval in the sporting activity.
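The example query in Section 5.3 amounts to finding runs of consecutive readings at or above a given percentage of PeakHR that last at least a minimum duration. A minimal Python sketch follows, assuming a 5-second reading interval and a list of calibrated HR values; the function name and parameters are hypothetical.

```python
def peak_intervals(hr_values, level_pct=70, min_duration_s=10, interval_s=5):
    """Return the durations (in seconds) of all runs at or above level_pct of PeakHR."""
    threshold = max(hr_values) * level_pct / 100
    runs, current = [], 0
    for hr in hr_values:
        if hr >= threshold:
            current += interval_s          # extend the current run
        else:
            if current >= min_duration_s:
                runs.append(current)       # close a run that was long enough
            current = 0
    if current >= min_duration_s:
        runs.append(current)               # run that lasts until the end of the stream
    return runs


print(peak_intervals([200, 150, 150, 100, 180, 100, 160, 160, 160]))  # prints [15, 15]
```

The number of returned runs answers "how many times", and their sum answers "for how long", for the chosen level.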

5.4 Process Evaluation

This process is responsible for storing the enriched data in the MonetDB XQuery server, where they are calibrated until they are ready to be used in user queries.

Fig. 7. Metamodel Components

– Filtering. It takes a few minutes to process all the data streams from the sensors, enrich the raw HR values with XML formatting, and produce outputs of up to 145 MB stored in MonetDB.

– Detection and elimination of outlier values. This process calibrates the outlier values and usually takes a little less time than Filtering to process all the experiment data.

– Aggregate reports. Its execution time lies between those of the other two processes, since it has to perform calculations over several ranges of values: it extracts each athlete's activity period and then calculates and prints each of the reports with the sensed data. The total size of the output of this process can be around 2 MB.

– PeakHeartRate and PeakInterval transformations. Both processes need more than 1 minute to generate results because they calculate PeakHR levels and intervals for the complete experiment.

6 Metadata Service

The transformations described in the previous section have effectively created a form of "data warehouse" in which the same data subjects are used in different interpretations of results. A more abstract description of the data sets is needed so that sports specialists can understand the contents of the transformations and program query expressions on these data sets.

The role of the Metadata Service is both to manage these changes and to provide a configuration that ensures a proper link between data sets. Moreover, a high level of generality must be permitted so that this service is applicable to sensor networks in general.

A sensor is deployed in a specific context and in association with a specific activity. A context always has one or more States (i.e., the phase of the activity during which the sensor value was generated), areas (the location from which the sensor generated the output), and timing (information on the time at which an activity began and the interval at which sensor readings are generated).

6.1 Metadata Constructs

There are four primary objects that are common to all sensor networks: Sensor, Subject, Activity and Context. In our case we include a fifth object, Template, which is used to isolate the system from new sensors that may be added later and from the way in which a particular sensor creates its output. XML template descriptions can be exploited through a generic interface to interpret any sensor device. In addition, when the data generated by these networks are processed, two more objects appear: SensorData and (data) Transform.

Mappings between the readings of low-level sensors and their transformed aggregations are captured by the Metamodel (see Figure 7):

– Sensor: The sensor produces output that must be interpreted and structurally enriched into a format suitable for queries. Sensor output is received by the system in the form of XML. Because sensors may be added or modified, a template for interpreting the sensed data is required so that nothing in the system has to change.

– Subject: Sensors are generally associated with a single subject, which may be a piece of instrumentation, a person, or some aspect of the environment. The subject provides important contextual information required to interpret sensor data.

– Context: Sensor output has little meaning unless it can be used in a specific context, which provides additional semantics to the data generated by the sensor. The metadata service allows end users to specify a number of different contextual parameters, including the States to be associated with all sensed data.

– Activity: Both Subject and Activity form the contextual input for interpreting sensor data. For example, a heart rate monitor will generate data in a uniform, easy-to-read format. However, the context in which that sensor is used is crucial to query processing.

– Template: Irrespective of the context in which a sensor may be used, the sensing device will generate its output in some prescribed format. The benefit of removing the logic that interprets sensor output from the algorithms and describing the output in a template (managed by the metadata service) is that changes to sensor output do not require a system change.

– SensorData: When structurally enriched, the sensor stream is stored in XML format with a unique ID, an XML Path, and a mapping to its position in a Schema. This is the base form of the sensor data to which all transformations are mapped. Each SensorData instance may be used in any number of transformations.

– Transform: Each transformation comprises a series of SensorData instances. They may be part of the basic schema or come from an existing transformation.
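One way to picture these constructs is as a set of plain record types. The Python sketch below is illustrative only and is not the authors' metamodel. Every field name is an assumption, chosen to mirror the properties mentioned in the text (States, areas and timing for a Context; ID, XML Path and schema position for SensorData). The timing defaults of 0 and 5 seconds are those given for the Add New Context service in Section 6.2, and the example XML path is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Template:
    name: str
    output_format: str          # description of how a device lays out its output


@dataclass
class Sensor:
    sensor_id: str
    template: Template          # used to interpret this sensor's output


@dataclass
class Subject:
    name: str                   # a person, an instrument, or part of the environment


@dataclass
class Activity:
    name: str


@dataclass
class Context:
    states: List[str]
    areas: List[str]
    timing_offset_s: int = 0    # where processing starts
    timing_s: int = 5           # interval between readings


@dataclass
class SensorData:
    data_id: str
    xml_path: str
    schema_position: str


@dataclass
class Transform:
    name: str
    inputs: List[SensorData] = field(default_factory=list)


# A transformation built on one base SensorData instance (hypothetical path).
reading = SensorData("sd-1", "/Experiment/Athlete/SensorData/Reading", "Reading")
peak = Transform("PeakHeartRate", [reading])
```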

6.2 Services

A number of services that merit discussion are defined:

– Add New Sensor. This simply provides a file template so that the system can read data from sensors. Another object called Template, which is required to properly interpret the sensor data stream, must also be created.

– Add New Context. Provides a set of States, each representing a particular physical activity performed by an athlete, TimingOffset, which is set to 0, and Timing, which is set to 5 (sec.). TimingOffset is a parameter used when we want either to start processing data again or to start processing data from a specified time onwards.

– Apply Activity. When a new sensor is included in the system, the Template file facilitates reading and processing the output. The role played by this object is to map each measurement to a particular state value.

– Add New Transformation. Three transformations have been used in this work, all of them transforming a basic schema in order to provide a correspondence between two versions of a schema.

7 Simulation Results Discussion

Our proposal is motivated by the interest in providing a standard query interface that can be used with low-level or "raw" sensor data. In addition, the proposed Metadata Service and the transformation layer make it possible to support much more complex queries while reducing the complexity of the XQuery expressions.

Each sensor recorded between 1100 and 1200 readings at intervals of 5 seconds over a period of between 95 and 102 minutes. It is not possible to execute user queries directly on the raw data streams. Without the data layer, querying a large quantity of data would require, for each particular data stream, applying some context-dependent information and then programming low-level query primitives to identify the values of interest or to collect sequences in order to compute values such as min, max, average, etc. With the proposed system, however, we can apply XQuery to meet all the requirements of users, taking into account that even the more basic queries usually require very complex XQuery expressions.

Typical user queries are listed in natural language in Table 2, and their equivalent XQuery expressions are listed in Table 3. An interpretation of the physiological value of these questions is outside the scope of this article. It is worth noting that all of them are initially based on determining the maximal heart rate, PeakHR, during each period of physical activity. The percentage of time that athletes maintain values close to their PeakHR is considered by many sports specialists an important result in determining an athlete's performance.

1 How many times was a specific athlete at 70% Max HR for more than 20 s?

2 How many times was each player at 70% Max HR for more than 20 s across all the exercises?

3 What was the average time spent in seconds below 50% of the Max HR of an athlete across all the exercises?

Table 2. Sample of possible queries to be transformed

The queries are executed on the extended schema resulting from the transformation processes described in previous sections. Without the data management layer, each query would require the development of a complex program, whereas with our proposal simple XQuery expressions suffice to answer users' questions. All this has been achieved thanks to the proposed transformations.

The execution times are shown in Table 3. These times are the average values measured in ten exercises for each user query. Looking at the results, we can see that all of them lie within acceptable response times. The most complex case, query no. 3 in Table 3, required the results of approximately 300 athletic exercises to be processed, and the result was generated in little more than 100 ms. All other queries have execution times below 100 ms. Query 1 runs about four times faster than query 2. This is due to the impact that the connection to the database, as well as the building of the whole set of results, has on the global time.

Finally, it is interesting to note that, since queries are specified as standard XQuery expressions, they can be modified with a simple editor if more queries are needed. Thus, with our system, no module would need to be altered in that case.
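To illustrate what query 1 in Table 3 selects, the following Python sketch evaluates the same path and predicates with the standard library's ElementTree on a tiny sample document. The sample document is invented for illustration: only the element names come from the query, and their nesting and values are assumptions. The predicates are applied in Python because ElementTree supports only a subset of XPath; this is not how MonetDB evaluates the expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names follow the path used in query 1.
SAMPLE = """<healthSense>
  <user>mcapel<sensorData>
      <dataAggregations>
        <peakInterval>
          <avgPerformance>70</avgPerformance>
          <timeLength>25</timeLength>
          <timesAboveAvgPerf>3</timesAboveAvgPerf>
        </peakInterval>
        <peakInterval>
          <avgPerformance>70</avgPerformance>
          <timeLength>15</timeLength>
          <timesAboveAvgPerf>7</timesAboveAvgPerf>
        </peakInterval>
      </dataAggregations>
    </sensorData>
  </user>
</healthSense>"""


def times_above(root, user_name, avg_performance=70, min_length=20):
    """Return timesAboveAvgPerf texts for peakIntervals matching query 1's predicates."""
    results = []
    for user in root.findall("user"):
        if (user.text or "").strip() != user_name:
            continue
        for interval in user.findall("sensorData/dataAggregations/peakInterval"):
            performance = float(interval.findtext("avgPerformance"))
            length = float(interval.findtext("timeLength"))
            if performance == avg_performance and length > min_length:
                results.append(interval.findtext("timesAboveAvgPerf"))
    return results


root = ET.fromstring(SAMPLE)
print(times_above(root, "mcapel"))  # prints ['3']
```

Only the first peakInterval satisfies timeLength > 20, so a single value is returned.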

8 Conclusions

Sensor networks are a very useful tool for finding solutions to the problem of measuring the efficiency and endurance of an athlete during sports training. Our system will enable sports specialists to advise athletes and coaches on how to improve athletic performance.

Query 1 (26 ms):
    doc('dataAggregation.xml')/healthSense/user[text()='mcapel']/sensorData/dataAggregations/peakInterval[avgPerformance=70 and timeLength>20]/timesAboveAvgPerf/text()

Query 2 (97 ms):
    doc('dataAggregation.xml')/healthSense/user/sensorData/dataAggregations/peakInterval[avgPerformance=70 and timeLength>20]/timesAboveAvgPerf/text()

Query 3 (127 ms):
    for $c in doc('dataAggregation.xml') return fn:count($c/healthSense[session[text()='15aside']]/user/sensorData/dataAggregations/peakLength[avgPerformance<50]/exerciseLengthTime/text())

Table 3. Equivalent XQuery expressions

With a specific design of the data management layers, we can obtain raw data from sensors and process them to the point at which the queries of sports specialists can be expressed in a standard query language. The transformation process is carried out through the deployment of a Metadata Service, which provides the appropriate level of generality to express complex queries without getting stuck in low-level details. The Metadata Service is used to associate processors with sensor data in order to perform different transformations. Additionally, this service keeps track of the changes and logs the mappings between basic values and the transformations in which these basic values occur. Future work will focus on creating a graphical interface that enables sports specialists to define queries using a graphical notation for representing schemas and data transformations.

References

[Capel, 2007] Manuel I. Capel-Tunon, Walter Mata Lopez, Zouhair A. Sadouq. Characteristic Analysis of Wireless Sensor Networks with Real-time Middleware. In Proceedings of the II Congreso Espanol de Informatica (CEDI 2007), 21-28. 12-14 September 2007, Zaragoza (Spain).

[ERCIM, 2009] ERCIM News Special theme: The Sensor Web. ERCIM News vol. 76, January 2009.

[Grust, 2002] Grust T. Accelerating XPath Location Steps. Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, pp. 109-120, ACM Press, 2002.

[Marks, 2009] Marks G. and Roantree M. Metamodel-Based Optimisation of XPath Queries. Proceedings of 26th BNCOD, LNCS vol. 5588, Springer, 2009.

[MonetDB, 2008] MonetDB - open source XML database. http://monetdb.cwi.nl/, 2008.

[Polar, 2011] http://www.polar.fi

[Sadouq, 2007] Zouhair A. Sadouq, Mohamed Essaaidi. Design Challenges of Wireless Sensor Networks Based on Real-time Middleware. Information and Communication Technologies International Symposium, ICTIS07, April 3–5, Fez, 2007.

[Sadouq, 2008] Zouhair A. Sadouq, Manuel I. Capel-Tunon, Mohammed Essaaidi. Tanja: A framework for modelling Wireless Sensor Networks. Workshop on Sensor Networks and Applications (WSeNA'08), 1–5. September 5, Gramado, Brasil, 2008.

