Research Article

Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep Learning Techniques

Huaijun Wang,1,2 Jing Zhao,1 Junhuai Li,1,2 Ling Tian,1 Pengjia Tu,1 Ting Cao,1,2 Yang An,1 Kan Wang,1,2 and Shancang Li3

1School of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China
2Shaanxi Key Laboratory for Network Computing and Security Technology, Xi'an 710048, China
3Department of Computer Science and Creative Technologies, UWE Bristol, Bristol BS16 1QY, UK

Correspondence should be addressed to Junhuai Li; [email protected]

Received 16 February 2020; Revised 8 June 2020; Accepted 6 July 2020; Published 27 July 2020

Academic Editor: Xiaolong Xu

Copyright © 2020 Huaijun Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Human activity recognition (HAR) can be exploited to great benefit in many applications, including elder care, health care, rehabilitation, entertainment, and monitoring. Many existing techniques, such as deep learning, have been developed for specific activity recognition, but few address the recognition of the transitions between activities. This work proposes a deep learning based scheme that can recognize both specific activities and the transitions between two different activities of short duration and low frequency for health care applications. In this work, we first build a deep convolutional neural network (CNN) for extracting features from the data collected by sensors. Then, a long short-term memory (LSTM) network is used to capture long-term dependencies between two actions to further improve the HAR identification rate. By combining CNN and LSTM, a wearable sensor based model is proposed that can accurately recognize activities and their transitions.
The experimental results show that the proposed approach improves the recognition rate to up to 95.87% and achieves a recognition rate for transitions higher than 80%, which is better than most existing similar models on the open HAPT dataset.

1. Introduction

Human behavior recognition (HAR) is the detection, interpretation, and recognition of human behaviors, which enables smart health care to actively assist users according to their needs. Human behavior recognition has wide application prospects, such as monitoring in smart homes, sports, game control, health care, elderly patient care, and bad habit detection and identification. It plays a significant role in in-depth studies [1] and can make our daily lives smarter, safer, and more convenient.

Currently, human behavior data can be acquired in two ways: one is based on computer vision and the other on sensors [2]. Behavior recognition based on computer vision has been studied for a long time and has a mature theoretical basis. However, vision-based approaches have many limitations in practice. For example, the use of a camera is restricted by various factors, such as light, position, angle, potential obstacles, and privacy invasion issues, which make it difficult to apply in practical settings. Although sensor-based behavior recognition has a relatively short research history, with the development and maturity of microelectronics and sensor technology there are now various types of sensors, such as accelerometers, gyroscopes, magnetometers, and barometers. These sensors can be integrated into mobile phones and wearable devices such as watches, bracelets, and clothes. Furthermore, state-of-the-art wearable sensors have solved the issue of magnetic field interference; for example, the approach in [3] can accurately estimate the current acceleration and angular velocity of motion sensors in real time in the presence of magnetic field interference.
Security and Communication Networks, Volume 2020, Article ID 2132138, 12 pages. https://doi.org/10.1155/2020/2132138

So these wearable sensors are usually small in size, highly sensitive, and strongly resistant to interference, making sensor-based identification more suitable for practical situations. Moreover, sensor-based behavior recognition is not limited by scene or time, which

can better reflect the nature of human activities. Therefore, the research and application of sensor-based human behavior recognition are increasingly valuable and significant.

Besides, HAR includes two types of actions: basic actions and transition actions. Due to the low incidence and short duration of transition movements, relatively few studies of human behavior recognition address transitions such as standing to sitting or walking to standing [4]. However, transitional movement is a very important part of human behavior recognition: to improve the behavior recognition rate, transition action recognition cannot be neglected. A transition action marks the boundary between basic actions as they alternate frequently, so accurately dividing transition actions helps segment the streaming data to a certain extent and ultimately improves the recognition rate. In addition, behavior recognition methods based on traditional pattern recognition have shortcomings, such as relying on manual feature extraction. With the application and development of deep learning in different fields, deep learning models have also shown great advantages in behavior recognition.

The main contributions of this work are summarized as follows:

(1) We present a deep learning model composed of convolutional and long short-term memory (LSTM) recurrent layers, which can automatically learn local features and model the time dependence between features.

(2) We discuss the influence of the key parameters of the deep learning model on its performance and determine the best parameters for the model.

(3) We analyze and compare the experimental results with other models that use the same common data set. The results show that the proposed method is superior to the other advanced methods.

In this work, we use both the acceleration sensor and the gyroscope of smart phones to acquire data and propose a CNN-LSTM hybrid model to recognize transition motions. A convolutional neural network (CNN) [5] is a type of deep neural network used as a feature extractor. It is characterized by local dependence, so it performs well in extracting local features. However, human activity information forms long instances composed of complex movements that change over time, so a CNN alone does not capture the relationship between time and features well. The long short-term memory (LSTM) [6] neural network is a kind of recurrent network that contains a memory to model time-dependent sequence problems. Therefore, the combination of CNN and LSTM can accurately identify the basic and transitional features of activities.

The remainder of the paper is organized as follows. Section 2 reviews the literature on human activity identification based on deep learning and the existing problems. Section 3 presents the hybrid deep learning framework proposed in this paper for these problems. Section 4 discusses and analyzes the experimental results based on the experimental data. Finally, Section 5 concludes this paper.

2. Related Works

Due to the extensive application of human-computer interaction, behavior detection, and other technologies, human behavior recognition has become a hot field [7]. Human behavior recognition can be regarded as a representative pattern recognition problem. Traditional behavior recognition research using decision trees, support vector machines (SVM), and other machine learning algorithms can obtain quite satisfactory results under controlled experimental environments with a small amount of labeled data. However, the accuracy of these methods depends on the effectiveness and comprehensiveness of manual feature extraction. In addition, these methods can only extract shallow features. Because of these limitations, behavior recognition methods based on traditional pattern recognition are limited in classification accuracy and model generalization.

In recent years, deep learning has developed rapidly and attracted many research efforts, especially in image processing, time series, natural language, logical reasoning, and other complex data processing tasks, where it has achieved unparalleled results [8]. Different from traditional behavior recognition methods, deep learning reduces the workload of feature design. In addition, higher-level and more meaningful features can be learned by an end-to-end neural network. Furthermore, the deep network structure is more suitable for unsupervised incremental learning. Moreover, deep networks created by superimposing several layers of features can model data with complex structures. In a word, deep learning is an ideal method for HAR.

Since deep learning has made outstanding achievements in image feature extraction, many researchers first tried to apply it to video-based behavior recognition. In early work, Taylor et al. [9] used a convolutional gated Boltzmann machine to identify video behavior data and extract sensitive features. Ji et al. [10] proposed a three-dimensional CNN model to capture more action information in space and time. Liu et al. [11] proposed combining CNNs with conditional random fields (CRFs) for action segmentation and recognition: the CNN automatically learns spatiotemporal characteristics, while the CRF captures the dependency between outputs. Other common deep learning methods are also widely used, such as recursive neural networks [12] and long short-term memory networks. On one hand, deep learning has been applied successfully to video behavior recognition; on the other hand, it is also widely used in sensor-based human behavior recognition.

Zeng et al. [13] proposed treating single-axis sensor data as one-dimensional image data and feeding them to a CNN for identification. Jiang and Yin [14] combined the signal sequences of the accelerometer and gyroscope into an activity image, enabling a deep convolutional neural network (DCNN) to automatically learn the optimal features


from the activity image. Chen and Xue [15] modified the CNN convolution kernel to adapt to the characteristics of triaxial acceleration signals. Ronao and Cho [16] proposed a ConvNet that realized efficient and data-adaptive human behavior recognition with smart phone sensors; ConvNets not only utilize the inherent temporally local dependence of sensor signal sequences but also provide an adaptive method for extracting robust features. Experimental results show that this method can recognize similar actions that are difficult to process with traditional machine learning. Murad and Pyun [17] and Zhou et al. [18] proposed three deep recurrent neural network structures based on LSTM to establish recognition models that capture temporal relations in input sequences and achieve more accurate recognition. Due to the superior performance of LSTM in behavior recognition applications, Guan and Plötz [19] and Qi et al. [20] improved the LSTM and proposed an ensemble model integrating different LSTM learners into one classifier. Experimental evaluation on standard data sets proved that the ensemble of LSTM learners is superior to a single LSTM network. Ignatov [21] combined manually extracted statistical features with features automatically extracted by a neural network and realized a human behavior recognition method based on user-autonomous deep learning: the CNN extracted local features, while the statistical features preserved information about the global form of the time series. Experiments on open data sets show that the model has small computational cost, short running time, and good performance. Nweke et al. [22] and Wang et al. [23], respectively, surveyed the application of deep learning methods to sensor-based behavior recognition, offering detailed views on existing work and pointing out the challenges and directions for future research.

These works demonstrated the potential of deep neural networks to learn latent features and time series features. Nevertheless, existing work on action recognition mainly focuses on basic behavior recognition, while the transitions between actions are usually ignored because a transition action has a short duration. However, it is necessary to study transition actions in depth in order to improve the robustness of the model. The precise division of transition actions can accurately segment the streaming data to a certain extent and ultimately improve the recognition rate. In this paper, a hybrid model combining CNN and LSTM is adopted to extract deep, high-level features and to describe both basic and transition actions in detail, so as to realize accurate identification.

3. Proposed Method

The overall architecture of the method proposed in this paper is shown in Figure 1, which contains three parts. The first part is the preprocessing and transformation of the original data, which combines raw data such as acceleration and gyroscope readings into an image-like two-dimensional array. The second part inputs the composite image into a three-layer CNN network that can automatically

extract the motion features from the activity image, abstract the features, and map them into feature maps. The third part inputs the feature vector into the LSTM model, establishes a relationship between time and the action sequence, and finally introduces a fully connected layer to fuse multiple features. In addition, Batch Normalization (BN) is introduced [24]; BN normalizes the data in each layer, which are finally sent to the Softmax layer for action classification.

3.1. Data Preprocessing. Due to the large amount of behavioral data collected by the sensors, it is impossible to input all the data into the deep model at once. Therefore, sliding-window segmentation is carried out before the data are input into the model. The behavior recognition method proposed in this paper recognizes both basic actions and transition actions. Because a transition action lasts only a short time, it is necessary to choose an appropriate window size: if the window is too large, important information will be lost; otherwise, the computational cost will increase. After segmentation, the behavioral data collected by the sensors are one-dimensional time series, different from image data. Therefore, before applying the deep learning model to these input data, they must be adapted: a dimension transformation is carried out on the windowed data by splicing the sensor data of all axes into a two-dimensional matrix. The advantage of this approach is that it preserves the correlation between the sensors' axes. Finally, image-like samples are formed and input into the deep learning model. Figure 2 shows the structure of the data preprocessing model.
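The windowing and dimension transformation described above can be sketched as follows (a minimal numpy illustration; the 128-sample window, 50% overlap, and six axes are assumptions made for the example, not parameters taken from the paper):

```python
import numpy as np

def to_windows(signal, window, step):
    """Split a (timesteps, axes) signal into overlapping sliding windows."""
    starts = range(0, signal.shape[0] - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

# Hypothetical recording: 1000 timesteps of 3 accelerometer + 3 gyroscope axes.
raw = np.random.randn(1000, 6)

# 50%-overlapping windows; each window is then spliced into an image-like
# 2-D matrix of shape (axes, window), preserving inter-axis correlation.
windows = to_windows(raw, window=128, step=64)
images = windows.transpose(0, 2, 1)  # (n_windows, 6, 128)
print(images.shape)                  # (14, 6, 128)
```

Each resulting axes-by-window matrix is one image-like sample fed to the CNN.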

3.2. Feature Learning Based on 1D-CNN. The original uniaxial acceleration and gyroscope data are equivalent to a two-dimensional image array after the dimensional transformation. The feature image is input into the convolutional neural network, whose structure generally consists of convolution layers and pooling layers. A convolution layer performs convolution operations on the input image through convolution kernels to obtain feature maps. A pooling layer extracts local features from the feature maps of the convolution layer through a sampling operation to reduce the size of the neurons and the number of parameters. Convolution and pooling layers are stacked to form a deep structure that can automatically extract action feature information from the original action data [5].

The CNN model structure designed in this paper is shown in Figure 3. The CNN network consists of three convolution layers and three pooling layers (each convolution layer is followed by one pooling layer) and finally outputs a number of feature maps carrying action features. Table 2 lists the parameter settings for each convolution and pooling layer. Convolution is achieved by convolving a two-dimensional kernel with images superimposed from multiple adjacent frames. The kernel counts of the three convolution layers are 18, 36, and 72, respectively. The


convolution kernel sizes are 2 × 8, 2 × 18, and 2 × 36, and the stride is 1. Since the filter may not cover the data at the image edges during convolution, to avoid losing edge data, the padding parameter is introduced and set to "SAME", which pads the edges of the input image matrix with 0. After the convolution operation, the output usually passes through a nonlinear activation function to form the output of the convolution layer. Popular activation functions include the Sigmoid, ReLU, and Tanh functions. Among them, the ReLU function sets the negative values of the features extracted by the CNN to 0 while leaving the positive values unchanged. After this nonlinear operation, the positive values

Figure 2: Structure of the data preprocessing model. [Figure shows raw data split along the time axis into sliding windows w1, w2, w3, ...]

Figure 3: CNN model architecture. [Figure shows an m × n input passing through Conv_1, Pooling_1, Conv_2, Pooling_2, Conv_3, and Pooling_3 to produce the feature map.]

Table 1: Activity labels corresponding to the original data.

Id  Exp  Label  Start  End
1   1    5      250    1232
1   1    7      1233   1392
1   1    4      1393   2194
1   1    8      2195   2359
1   1    5      2360   3374
1   1    11     3375   3662
1   1    5      3663   4538
1   1    11     4539   4735
1   1    5      4736   5667
1   1    11     5668   5859
1   1    5      5860   6786
1   1    11     6787   6977
1   1    5      6978   8078

Figure 1: Human activity recognition framework based on CNN-LSTM. [Figure shows the input passing through CNN feature extraction (Conv_1, Pooling, Conv_2, Pooling, Conv_3, Pooling, feature map), followed by LSTM feature fusion and classification (LSTM, fully connected, batch normalization, Softmax), producing activity labels such as walking upstairs, stand-to-lie, and lie-to-stand.]


greater than 0 can be expressed more clearly by the extracted features. Therefore, the ReLU activation function is used in the convolution layers of the CNN:

$$f(x) = \max(0, x) = \begin{cases} 0, & x < 0 \\ x, & x \geq 0 \end{cases} \qquad (1)$$

Further, we have

$$f'(x) = \begin{cases} 0, & x < 0 \\ 1, & x \geq 0 \end{cases} \qquad (2)$$

The pooling layer reduces the number of feature maps and parameters. Popular pooling techniques include maximum pooling and average pooling. In recent years, theoretical analyses and performance evaluations have shown the superior performance of the max-pooling strategy, which is widely used in deep learning [25, 26]. Moreover, some studies show that max pooling is well suited to sensor-based human behavior recognition [27]. Therefore, all pooling layers of the CNN in this paper use max pooling. The specific convolution and pooling parameters are set as shown in Table 2.
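As a concrete illustration of the ReLU nonlinearity of equation (1) and the max-pooling operation described above (a minimal numpy sketch with made-up feature values; the real model applies these inside the stacked convolution and pooling layers):

```python
import numpy as np

def relu(x):
    """Equation (1): negative values become 0, positive values pass through."""
    return np.maximum(0, x)

def max_pool_1d(x, size):
    """Non-overlapping 1-D max pooling along the last axis."""
    n = x.shape[-1] // size
    return x[..., :n * size].reshape(*x.shape[:-1], n, size).max(axis=-1)

feat = np.array([[-1.0, 2.0, -3.0, 4.0, 0.5, -0.5]])
act = relu(feat)              # [[0.  2.  0.  4.  0.5 0. ]]
pooled = max_pool_1d(act, 2)  # keeps the strongest response in each pair
print(pooled)                 # [[2.  4.  0.5]]
```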

3.3. Feature Fusion and Action Classification. To improve the recognition rate of transition actions, we build an LSTM after the CNN network. Let $\{f_1, f_2, \ldots, f_n\}$ be the feature sequence converted from the feature maps computed by the CNN from the images composed of the original data. This sequence is input to the LSTM, whose storage units produce an output sequence $\{m_1, m_2, \ldots, m_n\}$.

The LSTM's gating units (input gate, forget gate, and output gate) are combined with learned weights to solve the vanishing gradient problem that arises during backpropagation in an ordinary recurrent neural network. Meanwhile, the LSTM can model time-dependent actions and fully capture global features, improving recognition accuracy [28]. An LSTM cell controls the information flowing into the neuron and is composed of a forget gate, an input gate, and an output gate. Furthermore, the predicted value of the LSTM cell is obtained using the Tanh function.

Firstly, the forget gate determines how much information from the previous moment is accumulated into the current cell. As shown in equation (3), a probability value is computed to determine the amount of information that can pass through the gate:

$$\Gamma_f = \sigma\left(w_f \ast \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_f\right) \qquad (3)$$

where $w_f$ represents the weight corresponding to the input vector, $b_f$ represents the bias, $a^{\langle t-1 \rangle}$ represents the output of the neuron at the previous moment, and $x^{\langle t \rangle}$ represents the current input of the neuron.

Secondly, the input gate consists of an update gate and a Tanh layer and controls how much information flows into the current cell; the calculation is shown in equations (4)–(6). The output of the input gate and the output of the forget gate update the cell state simultaneously, discarding unwanted information. Then the predicted value of the current unit is determined by the output gate, and the output of the model is obtained as shown in equations (7) and (8):

$$\Gamma_u = \sigma\left(w_u \ast \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_u\right) \qquad (4)$$

$$\tilde{C} = \tanh\left(w_c \ast \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_c\right) \qquad (5)$$

$$C^{\langle t \rangle} = \Gamma_u \ast \tilde{C}^{\langle t \rangle} + \Gamma_f \ast C^{\langle t-1 \rangle} \qquad (6)$$

$$\Gamma_o = \sigma\left(w_o \ast \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_o\right) \qquad (7)$$

$$a^{\langle t \rangle} = \Gamma_o \ast \tanh\left(C^{\langle t \rangle}\right) \qquad (8)$$
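Equations (3)–(8) describe a single LSTM time step. A minimal numpy sketch (the weights here are randomly initialized purely for illustration; in the real model they are learned):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, x_t, c_prev, w, b):
    """One LSTM time step following equations (3)-(8)."""
    concat = np.concatenate([a_prev, x_t])       # [a<t-1>, x<t>]
    gamma_f = sigmoid(w["f"] @ concat + b["f"])  # forget gate, eq. (3)
    gamma_u = sigmoid(w["u"] @ concat + b["u"])  # update (input) gate, eq. (4)
    c_tilde = np.tanh(w["c"] @ concat + b["c"])  # candidate state, eq. (5)
    c_t = gamma_u * c_tilde + gamma_f * c_prev   # new cell state, eq. (6)
    gamma_o = sigmoid(w["o"] @ concat + b["o"])  # output gate, eq. (7)
    a_t = gamma_o * np.tanh(c_t)                 # new hidden state, eq. (8)
    return a_t, c_t

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
w = {k: rng.standard_normal((hidden, hidden + inputs)) for k in "fuco"}
b = {k: np.zeros(hidden) for k in "fuco"}
a_t, c_t = lstm_step(np.zeros(hidden), rng.standard_normal(inputs),
                     np.zeros(hidden), w, b)
print(a_t.shape, c_t.shape)  # (4,) (4,)
```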

After processing by the LSTM layer, the final output is a set of vectors containing time and action sequence correlations, which are input into the fully connected layer for the fusion of global action features. Training a neural network model becomes complicated because the statistical distribution of each layer's input changes with the parameters of the previous layer; to keep the output distribution from shifting too much, a lower learning rate must be used, which reduces the training speed. To solve this issue, this paper introduces BN to standardize the values of each layer in the LSTM (the output of the neurons at the previous moment and the input at the current moment), so that their mean and variance do not change with the distribution of the underlying parameters, effectively decoupling the parameters of each layer from the other layers. In this way, gradient vanishing or explosion can be prevented, and the training of the network can be accelerated. The BN procedure is shown in Algorithm 1.

In Algorithm 1, $\mu_\chi$ and $\varsigma^2_\chi$ are the mean and variance of $x_i$ computed over the minibatch. The mean and variance are used to normalize $x_i$ so that the samples follow a normal distribution. However, this normal distribution cannot fully reflect the characteristic distribution of the training samples, so it is necessary to introduce a scaling factor $\gamma$ and a shift factor $\beta$. As training progresses, $\gamma$ and $\beta$ are also learned by backpropagation to improve accuracy.

After the BN operation, the features are more pronounced, so they are input to the Softmax layer, which extracts the action features and classifies them in time series. In this model, the output layer uses the Softmax normalized exponential function to calculate the posterior probabilities of the different actions to

Table 2: The convolution and pooling layers of the CNN architecture.

Layers   Conv1d_1    Conv1d_2    Conv1d_3
Size     1 × 2 × 8   1 × 2 × 18  1 × 2 × 36
Stride   1 × 1 × 1   1 × 1 × 1   1 × 1 × 1
Channel  18          36          72

Layers   Pooling_1   Pooling_2   Pooling_3
Size     1 × 2 × 18  1 × 2 × 36  1 × 2 × 72
Stride   1 × 1 × 1   1 × 1 × 1   1 × 1 × 1
Channel  18          36          72


realize the classification. It maps the output values of the neurons into (0, 1); these can be regarded as the prediction probabilities of the actions, the largest of which is the classification result. The Softmax output layer then outputs a category vector such as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], indicating that the classification result is the action numbered 5.
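The Softmax mapping and the resulting category vector can be sketched as follows (the logits are made up for illustration):

```python
import numpy as np

def softmax(z):
    """Map logits to probabilities in (0, 1) that sum to 1."""
    e = np.exp(z - z.max())  # shift by the max for numerical stability
    return e / e.sum()

logits = np.zeros(12)
logits[4] = 5.0              # hypothetical: the action numbered 5 dominates
probs = softmax(logits)
one_hot = (probs == probs.max()).astype(int)
print(one_hot)               # [0 0 0 0 1 0 0 0 0 0 0 0] -> action number 5
```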

3.4. Model Implementation and Training. The neural network described here is implemented in TensorFlow [29], a lightweight library for building and training neural networks. Model training and classification run on a conventional computer with a 2.4 GHz CPU and 16 GB of memory.

The model is trained in a fully supervised manner, backpropagating the gradient from the Softmax layer to the convolution layers. The network parameters are optimized by minibatch gradient descent with the Adam optimizer, minimizing the cross-entropy loss function [13]. Adam is widely used due to its simple implementation, efficient computation, and low memory demand, and it has great advantages compared with other stochastic optimization algorithms. In this paper, to better train the model, after the training data are input into the network, the Adam optimizer and the backpropagation algorithm are used to learn and optimize the network parameters. Meanwhile, the cross-entropy loss function is used to calculate the total error, as shown in the following equation:

$$C = -\frac{1}{N} \sum_{X} \left[ y \ln a + (1 - y) \ln(1 - a) \right] \qquad (9)$$

where $y$ is the true label and $a$ is the predicted value.

To improve efficiency, small batches of data

are used during training and testing. With these configurations, the cumulative gradient of the parameters is calculated after each small batch. The weights are initialized randomly and orthogonally. As a form of regularization, we introduce a dropout operator on each dense input layer; this operator sets the activations of randomly selected units to zero during training. The dropout technique proposed by Hinton et al. [30] randomly deletes some nodes of the network while maintaining the integrity of the input and output neurons, which is equivalent to training many different networks. Different networks may overfit in different ways, but averaging their results can effectively reduce overfitting. In addition, dropout forces neurons to learn stronger features by not relying on other

specific neurons. The number of parameters to be optimized in a deep neural network varies with the types of layers it contains and greatly affects the time and computing resources required to train the network. The specific model training parameters reflect the best choices found in the experiments.
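Two pieces of the training procedure above, the cross-entropy loss of equation (9) and the (inverted) dropout operator, can be sketched in numpy as follows (the labels, predictions, dropout rate, and seed are illustrative assumptions):

```python
import numpy as np

def cross_entropy(y, a, eps=1e-12):
    """Equation (9): C = -(1/N) * sum(y*ln(a) + (1-y)*ln(1-a))."""
    a = np.clip(a, eps, 1 - eps)             # guard against log(0)
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of activations and
    rescale the survivors so the expected activation is unchanged."""
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

y = np.array([1.0, 0.0, 1.0])                # true labels
a = np.array([0.9, 0.1, 0.8])                # predicted probabilities
print(round(float(cross_entropy(y, a)), 4))  # 0.1446

rng = np.random.default_rng(42)
h = dropout(np.ones((4, 5)), rate=0.5, rng=rng)
print(sorted(set(h.ravel())))                # typically [0.0, 2.0]
```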

4. Activity Recognition

4.1. Experiment Data. In addition to common basic actions, this paper also studies transition actions. Few existing public data sets contain transition actions. Therefore, this paper adopts the international standard Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set [31, 32], abbreviated as the HAPT data set, for the experiments. The data set is an updated version of the UCI Human Activity Recognition Using Smartphones data set [8]. It provides raw data from smart phone sensors rather than preprocessed data. In addition, the action categories have been expanded to include transition actions. The HAPT data set contains twelve types of actions. Firstly, it has six basic actions: three static actions (standing, sitting, and lying) and three walking activities (walking, going downstairs, and going upstairs). Secondly, it has the six possible transitions between any two static postures: standing to sitting, sitting to standing, standing to lying, lying to sitting, sitting to lying, and lying to standing.

The HAPT data collection process is shown in Figure 4. The experiment involved 30 volunteers aged 19 to 48, each wearing a smart phone on the waist. Data were collected with the built-in acceleration sensor and gyroscope at a sampling frequency of 50 Hz. Meanwhile, video recordings of the experimental process were made to facilitate subsequent data labeling.

The collected data are saved as txt files, with the acceleration and gyroscope data stored independently in 60 groups each. Table 1 shows the label information corresponding to the original data of the experiment. The first column is the experiment ID, the second column is the experimenter number, the third column is the action label, and the fourth and fifth columns are the start and end row indices of the corresponding sensor data. The labels range from 1 to 12, representing the 12 types of actions. It can be seen from the table that the collected data contain invalid data: the first 250 rows are unlabeled and therefore invalid.
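Based on the label layout just described (experiment ID, user ID, action label, start row, end row), the invalid unlabeled rows can be masked out as sketched below. The sample label rows are taken from Table 1; with a real HAPT download they would come from the label file (assumed name: `RawData/labels.txt`, loadable via `np.loadtxt`).

```python
import numpy as np

# Each label row: experiment ID, user ID, activity label (1-12),
# start row, end row of that activity in the raw sensor file.
labels = np.array([
    [1, 1, 5, 250, 1232],    # label 5: standing
    [1, 1, 7, 1233, 1392],   # label 7: standing-to-sitting transition
    [1, 1, 4, 1393, 2194],   # label 4: sitting
])

def valid_mask(labels, num_rows):
    """True only for raw-data rows covered by some labeled interval;
    everything else (e.g. the first 250 rows) is invalid data."""
    mask = np.zeros(num_rows, dtype=bool)
    for _, _, _, start, end in labels:
        mask[start:end + 1] = True
    return mask

mask = valid_mask(labels, num_rows=2500)
# mask[:250] is all False: the unlabeled leading rows are discarded.
```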

Input: data set χ = {x₁, …, xₙ}
Output: yᵢ = BN_{γ,β}(xᵢ)
(1) Calculate the mean of the data set: μ_χ ← (1/n) Σᵢ₌₁ⁿ xᵢ
(2) Calculate the variance of the data set: σ²_χ ← (1/n) Σᵢ₌₁ⁿ (xᵢ − μ_χ)²
(3) Normalize the data: x̂ᵢ ← (xᵢ − μ_χ) / √(σ²_χ + ε)
(4) Scale and shift: yᵢ ← γx̂ᵢ + β = BN_{γ,β}(xᵢ)
(5) Return the learned parameters γ and β

ALGORITHM 1: Algorithm of batch normalization.
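Algorithm 1 translates directly into NumPy. This is a sketch of the training-time forward pass only (batch statistics; no running averages for inference):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Algorithm 1: normalize a mini-batch per feature, then apply the
    learned scale gamma and shift beta."""
    mu = x.mean(axis=0)                      # (1) batch mean
    var = x.var(axis=0)                      # (2) batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # (3) normalization
    return gamma * x_hat + beta              # (4) scale and shift

# A batch of 150 samples (the batch size chosen in Section 4.2.4),
# 8 features, deliberately offset and spread out.
x = np.random.default_rng(1).normal(5.0, 3.0, size=(150, 8))
y = batch_norm_forward(x, gamma=np.ones(8), beta=np.zeros(8))
# With gamma=1, beta=0 each feature now has ~zero mean and ~unit variance.
```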

6 Security and Communication Networks

After preliminary processing of the original data, all unlabeled data were deleted, leaving 815,614 valid samples. Because transition actions occur at low frequency and last only a short time, while basic actions occur frequently and last long, there is a considerable difference in data volume between them: the six transition actions together account for only about 8% of the total data. Table 3 lists the amount of data for each action. The original data are divided into three parts: a training set used for model training, a validation set used to tune the parameters, and a test set used to measure the quality of the final model.
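A minimal sketch of the three-way split described above. The 70/15/15 ratio and the shuffling seed are assumptions; the text does not state the exact proportions.

```python
import numpy as np

def split_dataset(samples, labels, ratios=(0.7, 0.15, 0.15), seed=42):
    """Shuffle once, then cut into train / validation / test subsets."""
    n = len(samples)
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train, val, test = np.split(idx, [n_train, n_train + n_val])
    return ((samples[train], labels[train]),
            (samples[val], labels[val]),
            (samples[test], labels[test]))

# Placeholder arrays standing in for the segmented windows and labels.
windows = np.zeros((1000, 150, 8))
acts = np.zeros(1000, dtype=int)
train, val, test = split_dataset(windows, acts)
```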

4.2. Parameters Setting. In a deep learning network, the model parameters greatly affect the recognition rate. Therefore, the number of neurons in the LSTM layer, the learning rate, the BN operation, the batch size, and other parameters are analyzed experimentally in the following sections.

4.2.1. Number of Neurons in the LSTM Layer. To verify the influence of the number of neurons in the LSTM layer on the recognition results, the experiments shown in Figure 5 were carried out. The recognition rate is lowest when each LSTM layer contains only 8 neurons: with so few neurons, the network lacks the necessary learning and information processing capacity. As the number of neurons increases, the recognition rate tends to increase, reaching 95.87% at 64 neurons. If the number of neurons is too large, however, the network structure becomes more complex and learning slows down. Therefore, considering the training time of the network, the number of neurons in each LSTM layer is set to 64.

4.2.2. The Learning Rate. Experiments were carried out at different learning rates. As shown in Table 4, the recognition rate of the model reaches a maximum of 95.87% when the learning rate is 0.002. Therefore, a learning rate of 0.002 is adopted.

4.2.3. BN Operation. To verify the improvement brought by the BN operation, a comparative experiment was carried out with and without the BN layer. The epoch count is set to 400 and the other parameters remain unchanged. The recognition rates of both configurations on the test set are shown in Table 5: adding the BN layer improves the recognition rate on the test set by about 4.24%.

4.2.4. Batch Size. The batch size refers to the number of samples per batch, whose maximum value is the total number of samples in the

Figure 4: Data collection of the physical activities.

Figure 5: Accuracy for different numbers of neurons (8 to 64) on the test set.

Table 3: The data amount of various activities in the HAPT data set.

Type          ID   Number
Walk          A1   122091
Upstairs      A2   116707
Downstairs    A3   107961
Sit down      A4   126677
Stand         A5   138105
Lie           A6   136865
Stand to sit  A7   10316
Sit to stand  A8   8029
Sit to lie    A9   12428
Lie to sit    A10  11150
Stand to lie  A11  14418
Lie to stand  A12  10867


training set. When the amount of data is small, a batch can be the whole data set, which approaches the optimum most accurately. In practical deep learning applications, however, the amount of data is large, and mini-batch processing is generally adopted: it requires relatively little memory and trains faster. Within an appropriate range, increasing the batch size makes the direction of gradient descent more accurate and reduces training oscillation. However, once the batch size grows beyond a certain value, the descent direction no longer changes, while parameter updates slow down significantly. The recognition results for different batch sizes are shown in Table 6. The maximum recognition rate of 95.87% is reached when the batch size is 150; therefore, 150 is selected as the batch size in this paper.

The parameters of the proposed CNN-LSTM model are shown in Table 7.

5. Experimental Results and Analysis

For human movement recognition, Wang and Liu [33] used the F-measure to verify the performance of a hierarchical deep LSTM network in human activity recognition. Lu et al. [34] demonstrated the superiority of their model in behavior recognition using accuracy, prediction rate, and recall rate. Therefore, to evaluate the performance of the motion recognition method proposed in this paper, we likewise use accuracy, recall rate, loss rate, and F-measure in the experiments.
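The per-class metrics used here can be computed directly from a confusion matrix laid out as in Table 8 (rows are actual classes, columns are predicted classes). The tiny two-class matrix below is illustrative, not the paper's data:

```python
import numpy as np

def per_class_metrics(cm):
    """Precision (the paper's 'accuracy'), recall, and F-measure per
    class, from a confusion matrix with actual classes as rows and
    predicted classes as columns."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)              # column sums: predicted counts
    recall = tp / cm.sum(axis=1)                 # row sums: actual counts
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Illustrative two-class matrix: 410 of 414 class-0 windows correct, etc.
cm = [[410, 4],
      [5, 388]]
p, r, f = per_class_metrics(cm)
```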

With the above parameters, the recognition confusion matrix of the 12 actions is shown in Table 8, and the accuracy curve of the CNN-LSTM model is shown in Figure 6. It can be seen from Table 9 that the overall recognition rate of CNN-LSTM is high and that CNN-LSTM recognizes transition actions particularly well.

6. Case Study

Among non-deep-learning methods, the random forest (RF) and K-nearest neighbor (KNN) classifiers perform well in action classification. Therefore, the proposed CNN-LSTM model is compared with the RF and KNN methods. First, the HAPT data set is input into RF and KNN. Then, we segment the original

Table 4: Accuracy of different learning rates on the test set.

Learning rate  Recognition rate (%)
0.001          93.57
0.0015         94.21
0.002          95.87
0.0025         92.39
0.003          93.34
0.0035         92.12
0.004          92.84
0.0045         92.01

Table 5: Accuracy on the test set with and without the BN layer.

                  Recognition rate (%)
Without BN layer  91.63
With BN layer     95.87

Table 6: Accuracy of different batch sizes on the test set.

Batch size  Recognition rate (%)
25          91.74
50          92.88
75          92.92
100         93.10
125         94.33
150         95.87
175         93.45
200         93.37
225         93.72
250         93.45
275         92.84
300         93.35
325         94.06
350         93.34
375         92.96
400         93.53

Table 7: Experimental parameters of the CNN-LSTM model.

Parameter                Value
Input vector size        150
Input channel number     8
Convolution kernel size  3
Pool size                2
Activation function      ReLU
LSTM layers              1
Number of neurons        64
Dropout                  0.5
Learning rate            0.002
Batch size               150
Epochs                   400

Table 8: Confusion matrix of the various actions (rows: actual; columns: predicted).

      A1   A2   A3   A4   A5   A6   A7   A8   A9   A10  A11  A12
A1    410  1    3    0    0    0    0    0    0    0    0    0
A2    5    388  3    0    0    0    0    0    0    0    0    0
A3    1    3    346  0    0    0    0    0    0    0    0    0
A4    1    0    0    383  32   3    1    0    1    1    0    0
A5    0    0    1    31   431  0    0    0    0    0    0    0
A6    0    0    0    1    0    457  0    0    0    0    0    0
A7    0    0    0    1    0    0    17   0    0    0    0    0
A8    0    0    0    0    0    0    0    4    0    0    1    0
A9    0    0    0    0    0    0    0    0    19   1    4    1
A10   0    0    0    0    0    0    0    0    1    14   0    2
A11   0    1    0    1    0    0    1    0    2    1    32   1
A12   0    0    0    0    0    0    0    0    0    1    1    16


sensor data and calculate the mean, variance, covariance, and other statistics, 15 features in total. Finally, the basic actions and transition actions are classified according to the clustering results. The classification results are shown in Table 10: the recognition rate of the CNN-LSTM model is higher than that of the RF and KNN methods for both basic actions and transition actions.
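For the RF/KNN baselines, the per-window statistical features mentioned above can be sketched as follows. The exact 15-feature list is not spelled out in the text; for a 6-channel window, the 15 unique channel covariances plus the per-channel means and variances are one plausible reading, used here purely for illustration.

```python
import numpy as np

def window_features(window):
    """Statistical features for one window of shape (timesteps, channels):
    per-channel mean and variance plus the pairwise channel covariances."""
    mean = window.mean(axis=0)
    var = window.var(axis=0)
    cov = np.cov(window, rowvar=False)        # channel covariance matrix
    iu = np.triu_indices_from(cov, k=1)       # unique off-diagonal pairs
    return np.concatenate([mean, var, cov[iu]])

# A 6-axis (accelerometer + gyroscope) window of 150 samples.
w = np.random.default_rng(2).normal(size=(150, 6))
feat = window_features(w)
# 6 means + 6 variances + C(6,2) = 15 covariances -> 27 features per window.
```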

In addition to the comparison with the RF and KNN classifiers, our proposed model is also compared with a single CNN, a single LSTM, CNN-GRU, and CNN-BLSTM deep

Table 9: The recognition accuracy, recall rate, and F value of the various actions.

ID   Accuracy (%)  Recall (%)  F-measure (%)
A1   99.03         98.32       98.68
A2   97.78         98.73       98.35
A3   98.86         98.02       98.44
A4   90.76         91.85       91.30
A5   93.09         93.09       93.09
A6   99.78         99.56       99.56
A7   94.44         89.47       91.89
A8   100           100         100
A9   76.00         82.61       79.17
A10  82.35         77.78       80.00
A11  82.05         86.49       84.21
A12  88.89         80.00       84.21

Table 10: Average accuracy of various actions in the CNN-LSTM, RF, and KNN models.

ID   RF (%)  KNN (%)  CNN-LSTM (%)
A1   99.90   88.10    99.03
A2   92.50   97.80    97.78
A3   90.20   99.40    98.86
A4   91.90   83.80    90.76
A5   90.80   87.50    93.09
A6   97.10   100      99.78
A7   71.30   66.70    94.44
A8   72.00   68.00    100
A9   51.30   38.60    76.00
A10  74.90   36.30    82.35
A11  59.20   33.70    82.05
A12  61.10   57.90    88.89

Figure 6: Accuracy curve of the CNN-LSTM model over 400 training iterations (training and validation).


learning models. Table 11 shows the average accuracy of the various actions for the five deep models. As can be seen from Table 11, CNN-LSTM not only recognizes basic movements slightly better than the other four models but also recognizes transition movements significantly better, especially standing to sitting, sitting to lying, and standing to lying. Table 12 shows the recognition rates of the different models on the test set. The average recognition rate of all five models is higher than 90%, but the CNN-LSTM model performs slightly better than CNN, LSTM, CNN-GRU, and CNN-BLSTM.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods on the same data set. Kuang [35] applied BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. We compared our performance with the approaches in [35, 36], with the results shown in Table 13. The proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored a recognition method based on deep learning and designed a behavior recognition model based on CNN-LSTM. The CNN learns local features from the original sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine description of basic and transition movements, and accurate identification of the two motion patterns.

The actions identified in this paper include only common basic actions and some transition actions. In future work, more kinds of actions can be collected and more complex actions, such as eating and driving, can be added. Individualized recognition could also be realized by considering the behavioral differences between users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that combining deep and shallow models can achieve better performance: deep models have strong learning ability, while shallow models are more efficient to train. Their collaboration could achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the laboratory, university, and government for their support. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

Table 11: Average accuracy of different activities with five deep learning models.

ID   CNN (%)  LSTM (%)  CNN-BLSTM (%)  CNN-GRU (%)  CNN-LSTM (%)
A1   97.50    97.70     97.41          99.75        99.03
A2   97.25    97.10     95.65          98.99        97.78
A3   95.60    97.15     100            96.57        98.86
A4   91.26    90.26     91.96          81.99        90.76
A5   90.80    90.80     84.74          92.48        93.09
A6   99.67    98.58     100            99.78        99.78
A7   76.47    64.86     44.44          77.78        94.44
A8   100      66.67     66.67          50.00        100
A9   63.83    69.39     62.07          48.00        76.00
A10  84.85    70.27     80.00          52.94        82.35
A11  72.50    69.33     65.00          71.79        82.05
A12  83.30    70.27     70.59          55.56        88.89

Table 12: Average accuracy of the five models in this paper.

Method     Average recognition rate (%)
CNN        94.29
LSTM       93.22
CNN-BLSTM  92.73
CNN-GRU    93.34
CNN-LSTM   95.87

Table 13: Average accuracy of different methods on the test set [35, 36].

Method      Average recognition rate (%)
BLSTM [35]  87.5
DBN [36]    89.6
CNN-LSTM    95.8

10 Security and Communication Networks

[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordoñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors, vol. 16, p. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-D pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science, Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.

[12] K. Cho, B. van Merrienboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734, Doha, Qatar, October 2014.

[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.

[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference (MM 2015), pp. 1307–1310, Brisbane, Australia, October 2015.

[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2015), pp. 1488–1492, Hong Kong, China, October 2016.

[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.

[17] A. Murad and J.-Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors, vol. 17, p. 11, 2017.

[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing, vol. 12, p. 99, 2019.

[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.

[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.

[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using convolutional neural networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.

[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.

[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.

[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.

[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.

[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.

[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.

[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), vol. 3, pp. 2332–2340, Lille, France, July 2015.

[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.

[30] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint, 2012.

[31] B. M. H. Abidine, L. Fergani, B. Fergani, and M. Oussalah, "The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition," Pattern Analysis and Applications, vol. 21, no. 1, pp. 119–138, 2018.

[32] G. M. Weiss, J. W. Lockhart, T. T. Pulickal et al., "A smartphone-based activity recognition system for improving health and well-being," in Proceedings of the 3rd IEEE International Conference on Data Science and Advanced Analytics (DSAA 2016), pp. 682–688, Montreal, QC, Canada, October 2016.

[33] L. Wang and R. Liu, "Human activity recognition based on wearable sensor using hierarchical deep LSTM networks," Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 837–856, 2019.

[34] W. Lu, F. Fan, J. Chu, P. Jing, and S. Yuting, "Wearable computing for Internet of Things: a discriminant approach for human activity recognition," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2749–2759, 2019.

[35] X. Kuang, Human Behavior Recognition Based on Deep Learning and Wearable Sensor, Nanjing University of Information Engineering, Nanjing, China, 2018.

[36] M. M. Hassan, M. Z. Uddin, A. Mohamed, and A. Almogren, "A robust human activity recognition system using smartphone sensors and deep learning," Future Generation Computer Systems, vol. 81, pp. 307–313, 2018.



can better reflect the nature of human activities. Therefore, research on and applications of sensor-based human behavior recognition are increasingly valuable and significant.

Besides, HAR covers two types of actions: basic actions and transition actions. Because transition movements occur rarely and last only briefly, there are relatively few studies on transitions such as standing to sitting or walking to standing in human behavior recognition research [4]. However, the study of transitional movement is an important part of human behavior recognition: to improve the behavior recognition rate, transition action recognition cannot be neglected. A transition action separates different basic actions during their frequent alternation, so accurately dividing transition actions helps segment the streaming data correctly and ultimately improves the recognition rate. In addition, behavior recognition methods based on traditional patterns suffer from shortcomings such as manual feature extraction. With the application and development of deep learning in different fields, deep learning models have also shown great advantages in behavior recognition.

The main contributions of this work are summarized as follows:

(1) We present a deep learning model composed of convolutional and Long Short-Term Memory recurrent layers, which can automatically learn local features and model the time dependence between features.

(2) We discuss the influence of the key parameters of the deep learning model on its performance and determine the best parameter values for the model.

(3) We analyze and compare the experimental results with other models on the same public data set. The results show that the proposed method is superior to the other advanced methods.

In this work, we use both the acceleration sensor and the gyroscope of a smartphone to acquire data and propose a CNN-LSTM hybrid model to recognize transition motions. A convolutional neural network (CNN) [5] is a type of deep neural network used as a feature extractor. It exploits local dependence and therefore performs well at extracting local features. However, human activity information consists of long sequences of complex movements that change over time, so a CNN alone does not capture the temporal relationships between features well. The Long Short-Term Memory (LSTM) [6] neural network is a recurrent network that contains a memory to model time-dependent sequence problems. Therefore, the CNN-LSTM combination can accurately identify both the basic and transitional characteristics of activities.
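To make the "memory" idea concrete, a single LSTM time step can be sketched in NumPy. The 64-unit size follows the configuration chosen later in the experiments, but this is an illustrative forward pass with randomly initialized weights, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [x, h_prev] to the stacked input,
    forget, cell-candidate, and output pre-activations; the memory cell
    c carries long-term state across steps."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * n:1 * n])      # input gate
    f = sigmoid(z[1 * n:2 * n])      # forget gate
    g = np.tanh(z[2 * n:3 * n])      # candidate cell state
    o = sigmoid(z[3 * n:4 * n])      # output gate
    c = f * c_prev + i * g           # updated memory cell
    h = o * np.tanh(c)               # hidden state / output
    return h, c

rng = np.random.default_rng(3)
nx, nh = 8, 64                       # feature size; 64 neurons per layer
W = rng.normal(scale=0.1, size=(4 * nh, nx + nh))
b = np.zeros(4 * nh)
h = c = np.zeros(nh)
for t in range(10):                  # unroll over a short feature sequence
    h, c = lstm_step(rng.normal(size=nx), h, c, W, b)
```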

The remainder of this paper is organized as follows. Section 2 reviews the literature on deep learning based human activity identification and the remaining problems. Section 3 presents the hybrid deep learning framework proposed in this paper. Section 4 discusses and analyzes the experimental results based on the experimental data. Finally, Section 5 concludes this paper.

2. Related Works

Due to the extensive application of human-computer interaction, behavior detection, and other technologies, human behavior recognition has become a hot field [7]. Human behavior recognition can be regarded as a representative pattern recognition problem. Traditional behavior recognition research using decision trees, support vector machines (SVM), and other machine learning algorithms can obtain satisfactory results under controlled experimental environments with a small amount of labeled data. However, the accuracy of these methods depends on the effectiveness and comprehensiveness of manual feature extraction, and they can only extract shallow features. Because of these limitations, behavior recognition methods based on traditional pattern recognition are limited in classification accuracy and model generalization.

In recent years, deep learning has developed rapidly and attracted much research attention, especially in image processing, time series, natural language, logical reasoning, and other complex data processing tasks, where it has achieved unparalleled results [8]. Unlike traditional behavior recognition methods, deep learning reduces the workload of feature design. In addition, higher-level and more meaningful features can be learned via an end-to-end neural network. Furthermore, the deep network structure is more suitable for unsupervised incremental learning, and deep networks created by stacking several feature layers can model data with complex structures. In short, deep learning is an ideal method for HAR.

Since deep learning has made outstanding achievements in image feature extraction, many researchers first tried to apply it to video-based behavior recognition. In early work, Taylor et al. [9] used a convolutional gated Boltzmann machine to identify video behavior data and extract sensitive features. Ji et al. [10] proposed a three-dimensional CNN model to capture more action information in space and time. Liu et al. [11] combined a CNN with conditional random fields (CRFs) for action segmentation and recognition: the CNN automatically learns spatio-temporal features, while the CRF captures the dependencies between outputs. Other deep learning methods, such as recurrent neural networks [12] and long short-term memory networks, are also widely used. Deep learning has been successful not only in video behavior recognition but also in sensor-based human behavior recognition.

Zeng et al. [13] treated single-axis sensor data as one-dimensional image data and fed them to a CNN for identification. Jiang and Yin [14] combined the signal sequences of the accelerometer and gyroscope into an activity image, enabling a deep convolutional neural network (DCNN) to automatically learn the optimal features from the activity image. Chen and Xue [15] modified the CNN convolution kernel to adapt to the characteristics of triaxial acceleration signals. Ronao and Cho [16] proposed a ConvNet that realizes efficient and data-adaptive human behavior recognition with smartphone sensors. ConvNets not only utilize the inherent temporally local dependence of sensor signal sequences but also provide an adaptive method for extracting robust features. Experimental results show that this method can recognize similar actions that are difficult for traditional machine learning to process. Murad and Pyun [17] and Zhou et al. [18] proposed three deep recurrent neural network structures based on LSTM to capture the temporal relations in input sequences and achieve more accurate recognition. Owing to the superior performance of LSTM in behavior recognition applications, Guan and Plötz [19] and Qi et al. [20] improved the LSTM and proposed an ensemble model that integrates different LSTM learners into one classifier. Experimental evaluation on a standard data set proved that the ensemble of LSTM learners is superior to a single LSTM network. Ignatov [21] combined manually extracted statistical features with features automatically extracted by a neural network, realizing a human behavior recognition method based on user-autonomous deep learning, in which the CNN extracts local features while the statistical features preserve information about the global form of the time series. Experiments on open data sets show that the model has low computational cost, short running time, and good performance. Nweke et al. [22] and Wang et al. [23] surveyed the application of deep learning methods in sensor-based behavior recognition, offering detailed views on existing work and pointing out the challenges and improvement directions for future research.

These works demonstrated the potential of deep neural networks to learn latent features and time series features. Nevertheless, existing work on action recognition mainly focuses on basic behavior recognition, while the transitions between actions are usually ignored because of their short duration. However, it is necessary to study transition actions in depth in order to improve the robustness of the model: the precise division of transition actions helps segment the streaming data accurately and ultimately improves the recognition rate. In this paper, a hybrid model combining CNN and LSTM is adopted to extract deep, high-level features and to describe basic and transition actions elaborately, so as to realize accurate identification.

3. Proposed Method

The overall architecture of the proposed method is shown in Figure 1 and contains three parts. The first part is the preprocessing and transformation of the original data, which combines the raw acceleration and gyroscope data into an image-like two-dimensional array. The second part inputs the composite image into a three-layer CNN network that automatically extracts the motion features from the activity image, abstracts them, and maps them into feature maps. The third part inputs the feature vector into the LSTM model, establishes the relationship between time and action sequence, and finally introduces a fully connected layer to fuse the multiple features. In addition, Batch Normalization (BN) [24] is introduced to normalize the data in each layer, and the result is finally sent to the Softmax layer for action classification.

3.1. Data Preprocessing. Because of the large amount of behavioral data collected by the sensors, it is impossible to input all the data into the deep model at once. Therefore, sliding window segmentation is carried out before the data are input into the model. The proposed method recognizes both basic and transition actions, and since transition actions last only a short time, an appropriate window size must be chosen: if the window is too large, important information will be lost; otherwise, the computational costs will increase. After segmentation, the behavioral data collected by the sensors are one-dimensional time series, different from image data, so they must be adapted before the deep learning model is applied. Dimension transformation is carried out on the windowed data by splicing the sensor data of all axes into a two-dimensional matrix. The advantage of this approach is that it preserves the correlation between the sensor axes. Finally, image-like samples are formed and input into the deep learning model. Figure 2 shows the structure of the data preprocessing model.
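The sliding-window segmentation and dimension transformation can be sketched as follows. The window width of 150 and the 8 channels match the input vector size and input channel number used later (Table 7), while the 50% overlap is an assumption the text does not specify:

```python
import numpy as np

def sliding_windows(signal, width, step):
    """Cut a (timesteps, axes) signal into overlapping windows. Each
    window keeps all axes side by side, giving an image-like 2-D sample
    of shape (width, axes) that preserves inter-axis correlation."""
    windows = [signal[start:start + width]
               for start in range(0, len(signal) - width + 1, step)]
    return np.stack(windows)

# Placeholder signal with 8 sensor channels (acceleration + gyroscope axes).
raw = np.random.default_rng(4).normal(size=(1000, 8))
samples = sliding_windows(raw, width=150, step=75)   # 50% overlap
```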

3.2. Feature Learning Based on 1D-CNN. After dimension transformation, the original uniaxial acceleration and gyroscope data are equivalent to a two-dimensional image array. The feature image is input into the convolutional neural network, whose structure generally consists of convolution layers and pooling layers. The convolution layer performs convolution operations on the input image through convolution kernels to obtain feature maps. The pooling layer extracts local features from the feature maps of the convolution layer through a sampling operation, reducing the number of neurons and parameters. Stacking convolution and pooling layers forms a deep structure that can automatically extract action feature information from the original action data [5].

e CNN model structure designed in this paper isshown in Figure 3 e CNN network model consists ofthree convolution layers and three pooling layers (eachconvolution layer is followed by one pooling layer) andfinally outputs a number of feature map images with actionfeatures Table 1 illustrates the settings of different param-eters for each convolution and pooling layer Convolution isachieved by the convolution of two-dimensional convolu-tion kernel with images superimposed by multiple adjacentframes e convolution kernel number of the three con-volution layers is 18 36 and 72 respectively e

Security and Communication Networks 3

The convolution kernel sizes are 2×8, 2×18, and 2×36, and the stride is 1. Since the filter may not cover the data at the edges during convolution, to avoid losing data at the image edge, the padding parameter is introduced and set to "SAME", so that zeros are added around the edge of the input matrix. After the convolution operation, the output usually passes through a nonlinear activation function, which forms the output of the convolution layer. Popular activation functions include the Sigmoid, ReLU, and Tanh functions. Among them, the ReLU function changes the negative values of the data extracted by the CNN into 0, while positive values remain unchanged. After this nonlinear processing operation, the positive values

Figure 2: Structure of the data preprocessing model (the raw time series is split along the time axis into windows w1, w2, w3, ...).

Figure 3: CNN model architecture (an m × n input passes through Conv_1/Pooling_1, Conv_2/Pooling_2, and Conv_3/Pooling_3 to produce the feature maps).

Table 1: Activity labels corresponding to the original data.

Id  Exp  Label  Start  End
1   1    5      250    1232
1   1    7      1233   1392
1   1    4      1393   2194
1   1    8      2195   2359
1   1    5      2360   3374
1   1    11     3375   3662
1   1    5      3663   4538
1   1    11     4539   4735
1   1    5      4736   5667
1   1    11     5668   5859
1   1    5      5860   6786
1   1    11     6787   6977
1   1    5      6978   8078

Figure 1: Human activity recognition framework based on CNN-LSTM. CNN feature extraction: Input → Conv_1 → Pooling → Conv_2 → Pooling → Conv_3 → Pooling → feature map. LSTM feature fusion and classification: LSTM → fully connected → batch normalization → Softmax, yielding activity labels such as walking upstairs, stand-to-lie, and lie-to-stand.


greater than 0 can be expressed more clearly by the extracted features. Therefore, the ReLU activation function is used in the convolution layers of the CNN:

f(x) = \max(0, x) = \begin{cases} 0, & x < 0, \\ x, & x \ge 0. \end{cases}   (1)

Further, we have

f'(x) = \begin{cases} 0, & x < 0, \\ 1, & x \ge 0. \end{cases}   (2)

The pooling layer reduces the number of feature maps and parameters. Popular pooling techniques include maximum pooling and average pooling. In recent years, theoretical analyses and performance evaluations have shown the superior performance of the maximum pooling strategy, which is widely used in deep learning [25, 26]. Moreover, some studies show that maximum pooling is well suited to sensor-based human behavior recognition [27]. Therefore, all pooling layers of the CNN in this paper use maximum pooling. The specific convolution and pooling parameters are set as shown in Table 2.
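As an illustration of one convolution-pooling stage, the following NumPy sketch applies a single 1-D kernel with "SAME" zero padding, the ReLU of equation (1), and non-overlapping max pooling. It is a toy single-channel version for clarity, not the full 18/36/72-kernel network of Table 2:

```python
import numpy as np

def conv1d_same(x, kernel):
    """1-D convolution with 'SAME' zero padding and stride 1."""
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    xp = np.pad(x, (pad_left, pad_right))   # zeros added at the edges
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

def relu(x):
    # Negative responses become 0; positive responses pass unchanged.
    return np.maximum(0.0, x)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; shrinks the feature map by `size`."""
    trimmed = x[: len(x) // size * size]
    return trimmed.reshape(-1, size).max(axis=1)

x = np.array([1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 4.0, -3.0])
feat = max_pool1d(relu(conv1d_same(x, np.array([0.5, 1.0, 0.5]))))
print(feat.shape)  # (4,)
```

Stacking three such conv + ReLU + max-pool stages, with many kernels per stage, gives the deep feature extractor of Figure 3.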

3.3. Feature Fusion and Action Classification. To improve the recognition rate of transition actions, we build an LSTM after the CNN network. Let {f1, f2, ..., fn} be the feature sequence converted from the feature maps computed by the CNN from the images composed of the original data. The sequence {f1, f2, ..., fn} is input into the LSTM, and the memory units of the LSTM produce an output sequence {m1, m2, ..., mn}.

The LSTM has different gating units: memory units such as the input gate, forget gate, and output gate are combined with learned weights to solve the problem of gradient vanishing during the backpropagation of an ordinary recurrent neural network. Meanwhile, the LSTM can model time-dependent actions and fully capture global features, so as to improve the recognition accuracy [28]. An LSTM cell controls the information flowing into the neuron and is composed of a forget gate, an input gate, and an output gate. Furthermore, the predicted value of the LSTM cell is obtained using the Tanh function.

Firstly, the forget gate determines how much information from the previous moment can be accumulated into the current cell. As shown in equation (3), a probability value is calculated to determine the amount of information that can pass through the gate:

\Gamma_f = \sigma(w_f[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f),   (3)

where w_f represents the weight corresponding to the input vector, b_f represents the bias, a^{\langle t-1 \rangle} represents the output of the neuron at the previous moment, and x^{\langle t \rangle} represents the current input of the neuron.

Secondly, the input gate consists of an update gate and a Tanh layer, which control how much information can flow into the current cell. The calculation process is shown in equations (4)–(6): the input of the input gate and the output of the forget gate update the cell at the same time, discarding unwanted information. Then, the predicted value of the current unit is determined by the output gate, and the output of the model is obtained, as shown in equations (7) and (8):

\Gamma_u = \sigma(w_u[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u),   (4)

\tilde{C}^{\langle t \rangle} = \tanh(w_c[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c),   (5)

C^{\langle t \rangle} = \Gamma_u \ast \tilde{C}^{\langle t \rangle} + \Gamma_f \ast C^{\langle t-1 \rangle},   (6)

\Gamma_o = \sigma(w_o[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o),   (7)

a^{\langle t \rangle} = \Gamma_o \ast \tanh(C^{\langle t \rangle}).   (8)

After processing by the LSTM layer, the final output is a set of vectors containing time and action-sequence correlations, which are input into the fully connected layer for the fusion of global action features. The training process of a neural network model becomes complicated because the statistical distribution of the input of each layer changes with the parameters of the previous layer. To keep the distribution of the output data from changing too much, a lower learning rate would have to be used, which reduces the training speed. To solve this issue, this paper introduces BN to standardize the values of each layer in the LSTM (the output of the neurons at the previous moment and the input at the current moment), so that their mean and variance do not change with the distribution of the underlying parameters, effectively decoupling the parameters of each layer from the other layers. In this way, gradient vanishing or explosion can be prevented, and the training of the network can be accelerated. The BN procedure is shown in Algorithm 1.

In Algorithm 1, μ_χ and ς²_χ are the mean and variance of the x_i obtained over a minibatch. The mean and variance are used to normalize the x_i so that the samples follow a standard normal distribution. However, this normal distribution cannot fully reflect the feature distribution of the training samples, and thus it is necessary to introduce the scaling factor γ and the shift factor β. As training progresses, γ and β are also learned by backpropagation to improve accuracy.
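A minimal NumPy sketch of the BN transform of Algorithm 1; here γ and β are passed in as fixed values purely for illustration, whereas in the model they are learned by backpropagation:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization following Algorithm 1.

    x: minibatch array of shape (batch, features).
    gamma: scaling factor; beta: shift factor.
    """
    mu = x.mean(axis=0)                    # step (1): minibatch mean
    var = x.var(axis=0)                    # step (2): minibatch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # step (3): normalize
    return gamma * x_hat + beta            # step (4): scale and shift

# Batch of 150 samples with 64 features, deliberately off-center
batch = np.random.randn(150, 64) * 3.0 + 7.0
out = batch_norm(batch, gamma=np.ones(64), beta=np.zeros(64))
print(round(float(out.mean()), 3), round(float(out.std()), 3))  # ~0.0 ~1.0
```

With γ = 1 and β = 0, each feature of the output has (approximately) zero mean and unit variance, which is what stabilizes the layer-input distributions during training.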

After the BN operation, the features are more salient, so they are input to the Softmax layer to extract the action features and classify them in time series. In this model, the output layer uses the Softmax normalized exponential function to calculate the posterior probabilities of the different actions to

Table 2: The convolution and pooling layers of the CNN architecture.

Layers   Conv1d_1   Conv1d_2   Conv1d_3
Size     1×2×8      1×2×18     1×2×36
Stride   1×1×1      1×1×1      1×1×1
Channel  18         36         72

Layers   Pooling_1  Pooling_2  Pooling_3
Size     1×2×18     1×2×36     1×2×72
Stride   1×1×1      1×1×1      1×1×1
Channel  18         36         72


realize classification. It maps the output values of the neurons into (0, 1), which can be regarded as the prediction probabilities of the actions; the largest one is the classification result. The Softmax output layer then outputs a category vector such as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], indicating that the classification result is the action numbered 5.
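The Softmax mapping and the resulting category vector can be sketched as follows; the logit values are made up for illustration:

```python
import numpy as np

def softmax(logits):
    """Map output-layer values to probabilities in (0, 1) that sum to 1."""
    shifted = logits - logits.max()   # subtract max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Illustrative 12-way output for the action classes A1..A12
logits = np.array([0.1, 0.3, 0.2, 0.1, 6.0, 0.2, 0.1, 0.0, 0.1, 0.2, 0.3, 0.1])
probs = softmax(logits)
pred = int(np.argmax(probs))          # index of the largest probability
one_hot = np.eye(12, dtype=int)[pred]
print(pred + 1, one_hot)  # action number 5 -> [0 0 0 0 1 0 0 0 0 0 0 0]
```

The argmax over the posterior probabilities reproduces the category vector described in the text.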

3.4. Model Implementation and Training. The neural network described here is implemented in TensorFlow [29], a lightweight library for building and training neural networks. Model training and classification run on a conventional computer with a 2.4 GHz CPU and 16 GB of memory.

The model is trained in a fully supervised manner, backpropagating the gradient from the Softmax layer to the convolution layers. The network parameters are optimized with minibatch gradient descent and the Adam optimizer by minimizing the cross-entropy loss function [13]. Adam is widely used because of its simple implementation, efficient computation, and low memory demand, and it has great advantages over other stochastic optimization algorithms. In this paper, after the training data are input into the network, the Adam optimizer and the backpropagation algorithm are used to learn and optimize the network parameters. Meanwhile, the cross-entropy loss function is used to calculate the total error, as shown in the following equation:

C = -\frac{1}{N} \sum_{x} [y \ln a + (1 - y) \ln(1 - a)],   (9)

where y is the true label and a is the predicted value. To improve efficiency, the data are segmented into small batches during training and testing. With these configurations, the cumulative gradient of the parameters is calculated after each minibatch. The weights are randomly and orthogonally initialized. As a form of regularization, we introduce a dropout operator on the input of each dense layer. This operator sets the activations of randomly selected units to zero during training. The dropout technique, proposed by Hinton et al. [30], is based on randomly dropping some nodes of the network while maintaining the integrity of the input and output neurons, which is equivalent to training many different networks. Different networks may overfit in different ways, but their averaged results can effectively reduce overfitting. In addition, dropout allows neurons to learn stronger features by not relying on other

specific neurons. The number of parameters to be optimized in a deep neural network depends on the types of layers it contains and has a great impact on the time and computational resources required to train the network. The specific model training parameters reflect the best choices found in the experiments.
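The cross-entropy loss of equation (9) can be computed directly in NumPy, as sketched below; in the actual model, TensorFlow evaluates this loss and the Adam optimizer minimizes it:

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy of equation (9): C = -(1/N) sum[y ln a + (1-y) ln(1-a)]."""
    a = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(y_true * np.log(a) + (1.0 - y_true) * np.log(1.0 - a)))

# Toy labels y and two candidate predictions a
y = np.array([1.0, 0.0, 0.0, 1.0])
good = cross_entropy(y, np.array([0.9, 0.1, 0.2, 0.8]))
bad = cross_entropy(y, np.array([0.4, 0.6, 0.7, 0.3]))
print(good < bad)  # True: better predictions give a lower loss
```

Minimizing this quantity over minibatches drives the predicted probabilities a toward the true labels y.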

4. Activity Recognition

4.1. Experiment Data. In addition to common basic actions, this paper also studies transition actions. In fact, few existing public data sets contain transition actions. Therefore, this paper adopts the international standard Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set [31, 32], abbreviated as the HAPT Data Set, to conduct the experiments. This data set is an updated version of the UCI Human Activity Recognition Using Smartphones Data Set [8]. It provides raw data from smartphone sensors rather than preprocessed data. In addition, the action categories have been expanded to include transition actions. The HAPT Data Set contains twelve types of actions. Firstly, it has six basic actions, comprising three static actions (standing, sitting, and lying) and three walking activities (walking, going downstairs, and going upstairs). Secondly, it has the six possible transitions between any two static postures: standing to sitting, sitting to standing, standing to lying, lying to sitting, sitting to lying, and lying to standing.

The HAPT data collection process is shown in Figure 4. The experiment involved 30 volunteers aged from 19 to 48, each wearing a smartphone on the waist. Data collection was carried out with the built-in acceleration sensor and gyroscope at a sampling frequency of 50 Hz. Meanwhile, video recordings of the experimental process were made for the convenience of subsequent data labeling.

The collected data are saved as .txt files, with the acceleration and gyroscope data stored independently in 60 groups each. Table 1 shows the label information corresponding to the original data of the experiment. The first column is the experiment ID, the second column is the experimenter number, the third column is the action label, and the fourth and fifth columns are the start and end row indices of the corresponding sensor data. The labels range from 1 to 12, representing the 12 types of actions. It can be seen from the table that the collected data contain invalid data: the first 250 rows are unlabeled and therefore invalid.

Algorithm 1: Batch normalization.

Input: data set χ = {x_1, ..., x_n}
Output: {y_i = BN_{γ,β}(x_i)}
(1) Calculate the mean of the data set: μ_χ ← (1/n) Σ_{i=1}^{n} x_i
(2) Calculate the variance of the data set: ς²_χ ← (1/n) Σ_{i=1}^{n} (x_i − μ_χ)²
(3) Normalize the data: x̂_i ← (x_i − μ_χ) / √(ς²_χ + ε)
(4) Scale and shift: y_i ← γ x̂_i + β = BN_{γ,β}(x_i)
(5) Return the learned parameters γ and β


After preliminary processing of the original data, all unlabeled data were deleted, leaving 815,614 valid samples. Because of the low frequency and short duration of transition actions, as well as the high frequency and long duration of basic actions, there is a considerable difference in data volume between them: the six transition actions account for only about 8% of the total data. Table 3 lists the amount of data for each action. The data are divided into three parts, a training set, a validation set, and a test set, in which the training set is used for model training, the validation set is used to tune parameters, and the test set is used to measure the quality of the final model.
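Slicing the raw recordings with the label rows of Table 1 can be sketched as follows; the toy random signal stands in for a real HAPT recording, and rows outside any labeled interval (such as the first 250) are simply never extracted:

```python
import numpy as np

def extract_labeled_segments(signal, label_rows):
    """Slice raw sensor data using label rows of the form
    (exp_id, user_id, activity, start_row, end_row), as in Table 1.
    Unlabeled rows are dropped as invalid data.
    """
    segments = []
    for _, _, activity, start, end in label_rows:
        segments.append((signal[start:end + 1], activity))
    return segments

# Toy stand-in for one experiment's 6-axis recording (not real HAPT data)
signal = np.random.randn(9000, 6)
labels = [(1, 1, 5, 250, 1232), (1, 1, 7, 1233, 1392), (1, 1, 4, 1393, 2194)]
segments = extract_labeled_segments(signal, labels)
print([(seg.shape[0], act) for seg, act in segments])
# [(983, 5), (160, 7), (802, 4)]
```

The labeled segments are then windowed as in Section 3.1 before being split into training, validation, and test sets.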

4.2. Parameter Settings. In a deep learning network, the model parameters greatly affect the recognition rate. Therefore, experimental analyses of the number of neurons in the LSTM layer, the learning rate, the BN operation, the batch size, and other parameters are conducted in the following sections.

4.2.1. Number of Neurons in the LSTM Layer. To verify the influence of the number of neurons in the LSTM layer on the recognition results, experiments were carried out as shown in Figure 5. The recognition rate is lowest when each LSTM layer contains only 8 neurons: with so few neurons, the network lacks the necessary learning and information-processing capacity. As

the number of neurons increases, the recognition rate tends to increase; when the number of neurons is 64, it reaches 95.87%. If the number of neurons is too large, the complexity of the network structure increases and learning slows down. Therefore, considering the training time of the network, the number of neurons per LSTM layer is set to 64.

4.2.2. The Learning Rate. Experiments were carried out at different learning rates. As shown in Table 4, the recognition rate of the model reaches a maximum of 95.87% when the learning rate is 0.002. Therefore, a learning rate of 0.002 is adopted.

4.2.3. BN Operation. To verify the improvement brought by the BN operation, a comparative experiment was carried out with and without the BN layer. The epoch number is set to 400 and the other parameters remain unchanged. The recognition rates of both configurations on the test set are shown in Table 5: the recognition rate improves by about 4.24% after the BN layer is added.

4.2.4. Batch Size. The batch size refers to the number of samples per batch, whose maximum value is the total number of samples in the

Figure 4: Data collection of the physical activities.

Figure 5: Accuracy on the test set for different numbers of LSTM neurons (8 to 64).

Table 3: The data amount of various activities in the HAPT data set.

Type          ID   Number
Walk          A1   122091
Upstairs      A2   116707
Downstairs    A3   107961
Sit down      A4   126677
Stand         A5   138105
Lie           A6   136865
Stand to sit  A7   10316
Sit to stand  A8   8029
Sit to lie    A9   12428
Lie to sit    A10  11150
Stand to lie  A11  14418
Lie to stand  A12  10867


training set. When the amount of data is small, a batch can be the whole data set, which approaches the optimum more accurately. However, in practical applications of deep learning the amount of data is relatively large, so small-batch processing is generally adopted: small batches require relatively little memory and train faster. Within an appropriate range, increasing the batch size determines the direction of gradient descent more accurately and causes less training oscillation. However, once the batch size increases beyond a certain value, the descent direction no longer changes, and the correction of the parameters slows down significantly. The recognition results for different batch sizes are shown in Table 6: when the batch size is 150, the recognition rate reaches its maximum of 95.87%. Therefore, 150 is selected as the batch size in this paper.

The parameters of the proposed CNN-LSTM model are summarized in Table 7.

5. Experimental Results and Analysis

For human movement recognition, Wang and Liu [33] used the F-measure to verify the performance of a hierarchical deep LSTM network model in human activity recognition. Lu et al. [34] demonstrated the superiority of their model in behavior recognition by using accuracy, precision, and recall in their experiments. Therefore, to evaluate the performance of the motion recognition method proposed in this paper, we also use accuracy, recall, loss rate, and F-measure in the experiments.

With the above parameter settings, the recognition confusion matrix of the 12 different actions is shown in Table 8, and the accuracy curve of the CNN-LSTM model is shown in Figure 6. It can be seen from Table 9 that the overall recognition rate of CNN-LSTM is high and that CNN-LSTM recognizes the transition actions well.

6. Case Study

Among non-deep-learning methods, the random forest (RF) and K-nearest neighbor (KNN) classifiers perform well in action classification. Therefore, the proposed CNN-LSTM model is compared with the RF and KNN methods. First, the HAPT data set is input into RF and KNN. Then the original

Table 4: Accuracy of different learning rates on the test set.

Learning rate  Recognition rate (%)
0.001          93.57
0.0015         94.21
0.002          95.87
0.0025         92.39
0.003          93.34
0.0035         92.12
0.004          92.84
0.0045         92.01

Table 5: Accuracy on the test set with or without the BN layer.

                  Recognition rate (%)
Without BN layer  91.63
With BN layer     95.87

Table 6: Accuracy of different batch sizes on the test set.

Batch size  Recognition rate (%)
25          91.74
50          92.88
75          92.92
100         93.10
125         94.33
150         95.87
175         93.45
200         93.37
225         93.72
250         93.45
275         92.84
300         93.35
325         94.06
350         93.34
375         92.96
400         93.53

Table 7: Experimental parameters of the CNN-LSTM model.

Parameter                Value
Input vector size        150
Input channel number     8
Convolution kernel size  3
Pool size                2
Activation function      ReLU
LSTM layers              1
Number of neurons        64
Dropout                  0.5
Learning rate            0.002
Batch size               150
Epochs                   400

Table 8: Confusion matrix of various actions.

Actual\Predicted  A1   A2   A3   A4   A5   A6   A7  A8  A9  A10  A11  A12
A1                410  1    3    0    0    0    0   0   0   0    0    0
A2                5    388  3    0    0    0    0   0   0   0    0    0
A3                1    3    346  0    0    0    0   0   0   0    0    0
A4                1    0    0    383  32   3    1   0   1   1    0    0
A5                0    0    1    31   431  0    0   0   0   0    0    0
A6                0    0    0    1    0    457  0   0   0   0    0    0
A7                0    0    0    1    0    0    17  0   0   0    0    0
A8                0    0    0    0    0    0    0   4   0   0    1    0
A9                0    0    0    0    0    0    0   0   19  1    4    1
A10               0    0    0    0    0    0    0   0   1   14   0    2
A11               0    1    0    1    0    0    1   0   2   1    32   1
A12               0    0    0    0    0    0    0   0   0   1    1    16


sensor data are segmented, and the mean, variance, covariance, and other statistics (15 features in total) are calculated. Finally, the basic actions and transition actions are classified. The classification results are shown in Table 10: the recognition rate of the CNN-LSTM model is higher

than that of the RF and KNN methods for both basic actions and transition actions.
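A sketch of the handcrafted statistical features fed to the RF and KNN baselines; since the full 15-feature list is not spelled out in the text, the per-axis means, variances, and pairwise covariances shown here are an illustrative subset:

```python
import numpy as np

def window_features(window):
    """Handcrafted statistics for one (n_axes, window_size) segment:
    per-axis mean and variance plus the pairwise covariances between
    axes, concatenated into a single feature vector.
    """
    means = window.mean(axis=1)
    variances = window.var(axis=1)
    cov = np.cov(window)                         # (n_axes, n_axes)
    upper = cov[np.triu_indices_from(cov, k=1)]  # unique off-diagonal covariances
    return np.concatenate([means, variances, upper])

window = np.random.randn(3, 128)                 # 3 axes, 128 samples
feats = window_features(window)
print(feats.shape)  # (9,): 3 means + 3 variances + 3 covariances
```

Such fixed-length vectors, one per window, are what the RF and KNN classifiers consume, in contrast to the raw windows learned from by the CNN-LSTM.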

In addition to the comparison with the RF and KNN classifiers, the proposed model is also compared with a single CNN, a single LSTM, and the CNN-GRU and CNN-BLSTM deep

Table 9: The recognition accuracy, recall, and F-measure of various actions.

ID   Accuracy (%)  Recall (%)  F-measure (%)
A1   99.03         98.32       98.68
A2   97.78         98.73       98.35
A3   98.86         98.02       98.44
A4   90.76         91.85       91.30
A5   93.09         93.09       93.09
A6   99.78         99.56       99.56
A7   94.44         89.47       91.89
A8   100           100         100
A9   76.00         82.61       79.17
A10  82.35         77.78       80.00
A11  82.05         86.49       84.21
A12  88.89         80.00       84.21

Table 10: Average accuracy of various actions with the CNN-LSTM, RF, and KNN models.

ID   RF (%)  KNN (%)  CNN-LSTM (%)
A1   99.90   88.10    99.03
A2   92.50   97.80    97.78
A3   90.20   99.40    98.86
A4   91.90   83.80    90.76
A5   90.80   87.50    93.09
A6   97.10   100      99.78
A7   71.30   66.70    94.44
A8   72.00   68.00    100
A9   51.30   38.60    76.00
A10  74.90   36.30    82.35
A11  59.20   33.70    82.05
A12  61.10   57.90    88.89

Figure 6: Accuracy curves of the CNN-LSTM model on the training and validation sets over 400 iterations.


learning models. Table 11 shows the average accuracy of various actions with the five deep models. As can be seen from Table 11, CNN-LSTM not only recognizes the basic movements slightly better than the other four models but also recognizes the transition movements significantly better, especially standing to sitting, sitting to lying, and standing to lying. Table 12 shows the recognition rates of the different models on the test set. The average recognition rates of all five models are higher than 90%, but the CNN-LSTM model performs slightly better than the CNN, LSTM, CNN-GRU, and CNN-BLSTM models.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods on the same data set. Kuang [35] applied a BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. We compared their performance with the proposed approach, with the results shown in Table 13. The proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored a recognition method based on deep learning and designed a behavior recognition model based on CNN-LSTM. The CNN learns local features from the original

sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine-grained description of basic and transition movements, and accurate identification of the two motion patterns.

The actions identified in this paper include only common basic actions and individual transition actions. In future work, more kinds of actions can be collected, and more complex actions such as eating and driving can be added. Individual recognition could also be realized by considering the behavioral differences between users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that the combination of deep and shallow models can achieve better performance: a deep learning model has strong learning ability, while a shallow learning model has higher learning efficiency, and their collaboration can achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank their laboratory, university, and government for their support. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

Table 11: Average accuracy of different activities with five deep learning models.

ID   CNN (%)  LSTM (%)  CNN-BLSTM (%)  CNN-GRU (%)  CNN-LSTM (%)
A1   97.50    97.70     97.41          99.75        99.03
A2   97.25    97.10     95.65          98.99        97.78
A3   95.60    97.15     100            96.57        98.86
A4   91.26    90.26     91.96          81.99        90.76
A5   90.80    90.80     84.74          92.48        93.09
A6   99.67    98.58     100            99.78        99.78
A7   76.47    64.86     44.44          77.78        94.44
A8   100      66.67     66.67          50.00        100
A9   63.83    69.39     62.07          48.00        76.00
A10  84.85    70.27     80.00          52.94        82.35
A11  72.50    69.33     65.00          71.79        82.05
A12  83.30    70.27     70.59          55.56        88.89

Table 12: Average accuracy of the five models in this paper.

Method     Average recognition rate (%)
CNN        94.29
LSTM       93.22
CNN-BLSTM  92.73
CNN-GRU    93.34
CNN-LSTM   95.87

Table 13: Average accuracy of different methods on the test set [35, 36].

Method      Average recognition rate (%)
BLSTM [35]  87.5
DBN [36]    89.6
CNN-LSTM    95.8


[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.
[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.
[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.
[5] F. J. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors (Switzerland), vol. 16, p. 1, 2016.
[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-D pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.
[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.
[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science, Springer, Berlin, Germany, 2010.
[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.
[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.
[12] K. Cho, B. van Merriënboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734, Doha, Qatar, October 2014.
[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.
[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference, MM 2015, pp. 1307–1310, Brisbane, Australia, October 2015.
[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2015, pp. 1488–1492, Hong Kong, China, October 2016.
[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.
[17] A. Murad and J. Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors (Switzerland), vol. 17, p. 11, 2017.
[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing (TSC), vol. 12, p. 99, 2019.
[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.
[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.
[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using convolutional neural networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.
[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.
[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.
[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.
[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.
[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.
[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.
[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, vol. 3, pp. 2332–2340, Lille, France, July 2015.
[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.
[30] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, Improving Neural Networks by Preventing Co-adaptation of Feature Detectors, arXiv preprint, 2012.
[31] B. M. h. Abidine, L. Fergani, B. Fergani, and M. Oussalah, "The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition," Pattern Analysis and Applications, vol. 21, no. 1, pp. 119–138, 2018.
[32] G. M. Weiss, J. W. Lockhart, T. T. Pulickal et al., "A smartphone-based activity recognition system for improving health and well-being," in Proceedings of the 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016, pp. 682–688, Montreal, QC, Canada, October 2016.
[33] L. Wang and R. Liu, "Human activity recognition based on wearable sensor using hierarchical deep LSTM networks," Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 837–856, 2019.


[34] W. Lu, F. Fan, J. Chu, P. Jing, and S. Yuting, "Wearable computing for Internet of Things: a discriminant approach for human activity recognition," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2749–2759, 2019.
[35] X. Kuang, Human Behavior Recognition Based on Deep Learning and Wearable Sensor, Nanjing University of Information Engineering, Nanjing, China, 2018.
[36] M. M. Hassan, M. Z. Uddin, A. Mohamed, and A. Almogren, "A robust human activity recognition system using smartphone sensors and deep learning," Future Generation Computer Systems, vol. 81, pp. 307–313, 2018.


Page 3: Wearable Sensor-Based Human Activity Recognition Using ...downloads.hindawi.com/journals/scn/2020/2132138.pdf · Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep

from the active image Chen and Xue [15] modified the CNNconvolution kernel to adapt to the characteristics of triaxialacceleration signals Ronao and Cho [16] proposed a con-vNet which realized efficient and data adaptive humanbehavior recognition with smart phone sensors ConvNetsnot only utilize the inherent time-local dependence of sensorsignal sequences but also provide an adaptive method forextracting robust features Experimental results show thatthis method can recognize similar actions which are difficultto be processed by traditional machine learning Murad andPyun [17] and Zhou et al [18] proposed three deep recursiveneural network structures based on LSTM to establishrecognition models to capture time relations in input se-quences and could achieve more accurate recognition Dueto the superior performance of LSTM in behavior recog-nition application Guan and Plotz [19] and Qi et al [20]improved the LSTM and proposed an integration modelintegrating different LSTM learners into an integratedclassifier rough the experimental evaluation in thestandard data set it is proved that the integrated systemcomposed of LSTM learners is superior to a single LSTMnetwork Ignatov [21] combined the manually extractedstatistical features with the features automatically extractedby neural network and realized a human behavior recog-nition method based on user autonomous deep learningAmong them CNN extracted local features while statisticalfeatures preserved the information about the global form oftime series Experiments on open data sets show that themodel has the advantages of small computation shortrunning time and good performance Nweke et al [22] andWang et al [23] respectively summarized the application ofdeep learning method in sensor-based behavior recognitionand not only put forward detailed views on the existingwork but also pointed out the challenges and improvementdirections of future research

This work demonstrated the potential of deep neural networks to learn latent features and time-series features. Nevertheless, existing work on action recognition mainly focuses on basic behavior recognition, while the transitions between actions are usually ignored because a transition has a short duration. However, it is necessary to study transition actions in depth in order to improve the robustness of the model: a precise division of the transition actions allows the streaming data to be segmented accurately and ultimately improves the recognition rate. In this paper, a hybrid model combining CNN with LSTM is adopted to extract deep, high-level features and to describe both basic and transition actions in detail, so as to realize accurate identification.

3. Proposed Method

The overall architecture of the method proposed in this paper is shown in Figure 1 and contains three parts. The first part is the preprocessing and transformation of the original data, which combines the raw signals, such as acceleration and gyroscope readings, into an image-like two-dimensional array. The second part inputs the composite image into a three-layer CNN that automatically extracts the motion features from the activity image, abstracts them, and maps them into feature maps. The third part inputs the feature vector into the LSTM model, establishes the relationship between time and action sequence, and finally introduces a fully connected layer to fuse the multiple features. In addition, Batch Normalization (BN) is introduced [24]: BN normalizes the data in each layer before the result is finally sent to the Softmax layer for action classification.

3.1. Data Preprocessing. Due to the large amount of behavioral data collected by the sensors, it is impossible to input all the data into the deep model at one time; therefore, sliding-window segmentation is carried out before the data are input into the model. The behavior recognition method proposed in this paper recognizes both basic actions and transition actions at the same time. Since a transition action lasts only a short time, it is necessary to choose an appropriate window size: if the window is too large, important information about the short transitions is lost; if it is too small, the computational cost increases. After segmentation, the behavioral data collected by the sensors are one-dimensional time series, different from image data; therefore, before applying the deep learning model to these inputs, a dimension transformation is carried out on the windowed data. The transformation splices the sensor data of all axes into a two-dimensional matrix; the advantage of this approach is that it preserves the correlation between the sensors' axes. Finally, picture-like samples are formed and input into the deep learning model. Figure 2 shows the structure of the data preprocessing.
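As an illustration, the sliding-window segmentation and axis-splicing described above can be sketched in Python with NumPy. The 150-sample window length matches the input vector size given later in Table 7; the 50% overlap and the 6-axis input are assumptions for this sketch, since the paper does not fix the overlap here.

```python
import numpy as np

def sliding_windows(data, width=150, overlap=0.5):
    """Split a (n_samples, n_channels) sensor series into fixed-size
    windows; each window is a 2D, image-like array that preserves the
    correlation between the sensor axes (one column per axis)."""
    step = int(width * (1 - overlap))
    windows = []
    for start in range(0, data.shape[0] - width + 1, step):
        windows.append(data[start:start + width])
    return np.stack(windows)  # shape: (n_windows, width, n_channels)

# 6 axes (triaxial accelerometer + triaxial gyroscope) sampled at 50 Hz
raw = np.random.randn(1500, 6)
x = sliding_windows(raw)
```

Each row of `x` is then treated like a small grayscale image by the convolutional layers that follow.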

3.2. Feature Learning Based on 1D-CNN. After the dimensional transformation, the original uniaxial acceleration and gyroscope data are equivalent to a two-dimensional, image-like array. The feature image is input into the convolutional neural network, whose structure generally consists of convolution layers and pooling layers. The convolution layer performs a convolution operation on the input image through convolution kernels to obtain feature maps. The pooling layer extracts local features from the feature maps of the convolution layer through a sampling operation, reducing the number of neurons and parameters. Convolution and pooling layers are stacked to form a deep structure that can automatically extract action feature information from the original action data [5].

The CNN model structure designed in this paper is shown in Figure 3. The network consists of three convolution layers and three pooling layers (each convolution layer is followed by one pooling layer) and finally outputs a number of feature maps carrying action features. Table 2 lists the parameter settings of each convolution and pooling layer. Convolution is performed by two-dimensional convolution kernels over images formed from multiple adjacent frames. The numbers of kernels in the three convolution layers are 18, 36, and 72, respectively. The

Security and Communication Networks 3

convolution kernel sizes are 2×8, 2×18, and 2×36, and the stride is 1. Since the filter may not be able to cover all the data in a given direction during convolution, and to avoid losing data at the image edge, the padding parameter is introduced and set to "SAME", so that zeros are added around the edge of the input image matrix. After the convolution operation, the output usually passes through a nonlinear activation function to form the output of the convolution layer. Popular activation functions include the Sigmoid, ReLU, and Tanh functions. Among them, the ReLU function sets negative values of the features extracted by the CNN to 0 while leaving positive values unchanged. After this nonlinear operation, the positive values

Figure 2: Structure of the data preprocessing model (raw data split into windows w1, w2, w3, ... along the time axis).

Figure 3: CNN model architecture (m × n input passed through Conv_1/Pooling_1, Conv_2/Pooling_2, and Conv_3/Pooling_3 to produce the feature map).

Table 1: Activity labels corresponding to the original data.

Id  Exp  Label  Start  End
1   1    5      250    1232
1   1    7      1233   1392
1   1    4      1393   2194
1   1    8      2195   2359
1   1    5      2360   3374
1   1    11     3375   3662
1   1    5      3663   4538
1   1    11     4539   4735
1   1    5      4736   5667
1   1    11     5668   5859
1   1    5      5860   6786
1   1    11     6787   6977
1   1    5      6978   8078

Figure 1: Human activity recognition framework based on CNN-LSTM (CNN feature extraction: input, Conv_1/Pooling, Conv_2/Pooling, Conv_3/Pooling, feature map; LSTM feature fusion and classification: LSTM, fully connected, batch normalization, Softmax; outputs such as Walking, Upstairs, Stand to lie, Lie to stand).


greater than 0 can be expressed more clearly by the extracted features. Therefore, the ReLU activation function is used in the convolution layers of the CNN:

f(x) = \max(0, x) = \begin{cases} 0, & x < 0 \\ x, & x \ge 0 \end{cases}   (1)

Further, its derivative is

f'(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}   (2)

The pooling layer reduces the number of feature maps and parameters. Popular pooling techniques include maximum pooling and average pooling. In recent years, theoretical analysis and performance evaluation have shown the superior performance of the maximum pooling strategy, which is widely used in deep learning [25, 26]. Moreover, some studies show that maximum pooling is very suitable for sensor-based human behavior recognition [27]. Therefore, all pooling layers of the CNN in this paper use maximum pooling. The specific convolution and pooling parameters are listed in Table 2.
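The convolution-ReLU-pooling pipeline described above can be illustrated with a minimal single-channel sketch in NumPy. The toy kernel and signal are hypothetical; the real model stacks multi-channel layers with the kernel shapes of Table 2, and, as in most deep learning frameworks, the "convolution" is implemented as cross-correlation.

```python
import numpy as np

def conv1d_same(x, kernel):
    """'SAME' convolution of a 1D signal: zero-pad the edges so the
    output keeps the input length (the padding strategy used above)."""
    k = len(kernel)
    pad_l = (k - 1) // 2
    pad_r = k - 1 - pad_l
    xp = np.pad(x, (pad_l, pad_r))
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

def relu(x):
    # Equation (1): negatives become 0, positives pass through
    return np.maximum(0.0, x)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling, halving the feature-map length."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

signal = np.array([0.0, 1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 0.0])
feat = max_pool1d(relu(conv1d_same(signal, np.array([1.0, 0.0, -1.0]))))
```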

3.3. Feature Fusion and Action Classification. To improve the recognition rate of transition actions, we build an LSTM after the CNN network. Let {f1, f2, ..., fn} be the feature sequence converted from the feature maps computed by the CNN from the images composed of the original data. The sequence {f1, f2, ..., fn} is input into the LSTM, and the storage units of the LSTM produce an output sequence {m1, m2, ..., mn}.

LSTM introduces gating units: memory units such as the input gate, forget gate, and output gate are combined with learned weights to solve the gradient vanishing problem in the backpropagation of ordinary recurrent neural networks. Meanwhile, LSTM can model time-dependent actions and fully capture global features, so as to improve recognition accuracy [28]. An LSTM cell controls the information flowing into the neuron and is composed of a forget gate, an input gate, and an output gate; the predicted value of the cell is obtained using the Tanh function.

Firstly, the forget gate determines how much information from the previous moment is retained in the current cell. As shown in equation (3), a probability value is calculated to determine the amount of information that passes through the gate:

\Gamma_f = \sigma\left(w_f \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_f\right)   (3)

where w_f represents the weight corresponding to the input vector, b_f represents the bias, a^{\langle t-1 \rangle} is the output of the neuron at the last moment, and x^{\langle t \rangle} is the current input of the neuron.

Secondly, the input gate consists of an update gate and a Tanh layer, which control how much information flows into the current cell; the calculation is shown in equations (4)-(6). The input gate and the output of the forget gate update the cell state at the same time, discarding unwanted information. Then the predicted value of the current unit is determined by the output gate, and the output of the model is obtained as shown in equations (7) and (8):

\Gamma_u = \sigma\left(w_u \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_u\right)   (4)

\tilde{C}^{\langle t \rangle} = \tanh\left(w_c \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_c\right)   (5)

C^{\langle t \rangle} = \Gamma_u * \tilde{C}^{\langle t \rangle} + \Gamma_f * C^{\langle t-1 \rangle}   (6)

\Gamma_o = \sigma\left(w_o \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_o\right)   (7)

a^{\langle t \rangle} = \Gamma_o * \tanh\left(C^{\langle t \rangle}\right)   (8)
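A minimal NumPy sketch of one LSTM step implementing equations (3)-(8) follows; the hidden size, input size, and random weight initialization are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, p):
    """One LSTM step implementing equations (3)-(8): forget gate,
    update (input) gate with a tanh candidate, cell update, output gate."""
    z = np.concatenate([a_prev, x_t])           # [a<t-1>, x<t>]
    gamma_f = sigmoid(p['wf'] @ z + p['bf'])    # eq. (3)
    gamma_u = sigmoid(p['wu'] @ z + p['bu'])    # eq. (4)
    c_tilde = np.tanh(p['wc'] @ z + p['bc'])    # eq. (5)
    c_t = gamma_u * c_tilde + gamma_f * c_prev  # eq. (6)
    gamma_o = sigmoid(p['wo'] @ z + p['bo'])    # eq. (7)
    a_t = gamma_o * np.tanh(c_t)                # eq. (8)
    return a_t, c_t

rng = np.random.default_rng(0)
h, d = 4, 3  # hidden size and input size (illustrative)
p = {k: rng.standard_normal((h, h + d)) * 0.1 for k in ('wf', 'wu', 'wc', 'wo')}
p.update({k: np.zeros(h) for k in ('bf', 'bu', 'bc', 'bo')})
a, c = np.zeros(h), np.zeros(h)
for x_t in rng.standard_normal((5, d)):  # run a length-5 sequence
    a, c = lstm_step(a, c, x_t, p)
```

Because the output gate multiplies a tanh, the hidden state stays bounded, which is one reason the unit trains stably over long sequences.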

After the LSTM layer, the final output is a set of vectors containing the correlation between time and action sequence, which is input into the fully connected layer for the fusion of the global action features. The training of a neural network becomes complicated because the statistical distribution of each layer's input changes with the parameters of the previous layer; to keep the output distribution from changing too much, a lower learning rate would have to be used, which reduces training speed. To solve this issue, this paper introduces BN to standardize the values of each layer in the LSTM (the output of the neurons at the last moment and the input at the current moment), so that their mean and variance no longer change with the distribution of the underlying parameters, effectively decoupling the parameters of each layer from the other layers. In this way, gradient vanishing or explosion is prevented and the training of the network is accelerated. The BN procedure is shown in Algorithm 1.

In Algorithm 1, μ_χ and σ²_χ are the mean and variance of x_i computed over the minibatch. They are used to normalize x_i so that the samples follow a standard normal distribution. However, this normal distribution cannot by itself reflect the feature distribution of the training samples, so a scaling factor γ and a shift factor β are introduced. As training progresses, γ and β are also learned by backpropagation to improve accuracy.

After the BN operation, the features are more distinct, so they are input to the Softmax layer to extract the action features and classify them in time series. In this model, the output layer uses the Softmax normalized exponential function to calculate the posterior probabilities of the different actions to

Table 2: The convolution and pooling layers of the CNN architecture.

Layers   Conv1d_1  Conv1d_2  Conv1d_3
Size     1×2×8     1×2×18    1×2×36
Stride   1×1×1     1×1×1     1×1×1
Channel  18        36        72

Layers   Pooling_1  Pooling_2  Pooling_3
Size     1×2×18     1×2×36     1×2×72
Stride   1×1×1      1×1×1      1×1×1
Channel  18         36         72


realize classification. It maps the output values of the neurons into (0, 1), which can be regarded as the prediction probabilities of the actions, and the largest one gives the classification result. The Softmax output layer then outputs a category vector such as [0 0 0 0 1 0 0 0 0 0 0 0], indicating that the classification result is the action numbered 5.
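A small NumPy sketch of this output stage follows; the logit values are hypothetical, chosen so that the action numbered 5 wins, reproducing the one-hot category vector above.

```python
import numpy as np

def softmax(logits):
    """Map output-layer values into (0, 1) posterior probabilities."""
    e = np.exp(logits - logits.max())  # subtract the max for stability
    return e / e.sum()

# Hypothetical logits for the 12 action classes; index 4 = action no. 5
logits = np.array([0.1, 0.2, 0.1, 0.1, 2.5, 0.1,
                   0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
probs = softmax(logits)
one_hot = (probs == probs.max()).astype(int)  # category vector
```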

3.4. Model Implementation and Training. The neural network described here is implemented in TensorFlow [29], a lightweight library for building and training neural networks. Model training and classification run on a conventional computer with a 2.4 GHz CPU and 16 GB of memory.

The model is trained in a fully supervised manner, backpropagating the gradient from the Softmax layer to the convolution layers. The network parameters are optimized with the minibatch gradient descent method and the Adam optimizer by minimizing the cross-entropy loss function [13]. Adam is widely used due to its simple implementation, efficient computation, and low memory demand, and it has clear advantages over other stochastic optimization algorithms. In this paper, after the training data are input into the network, the Adam optimizer and the backpropagation algorithm are used to learn and optimize the network parameters. Meanwhile, the cross-entropy loss function is used to calculate the total error, as shown in the following equation:

C = -\frac{1}{N} \sum_{x} \left[ y \ln a + (1 - y) \ln(1 - a) \right]   (9)

where y is the true label and a is the predicted value.

To improve efficiency, the data are segmented into small batches during training and testing. With these configurations, the cumulative gradient of the parameters is calculated after each small batch. The weights are randomly and orthogonally initialized. As a form of regularization, we introduce a dropout operator on the input of each dense layer; this operator sets the activation of randomly selected units to zero during training. The dropout technique proposed by Hinton et al. [30] randomly deletes some nodes of the network while keeping the input and output neurons intact, which is equivalent to training many different networks. Different networks may overfit in different ways, but averaging their results can effectively reduce overfitting. In addition, dropout lets neurons learn stronger features by not relying on other

specific neurons. The number of parameters to be optimized in a deep neural network varies with the types of layers it contains and has a great impact on the time and computing resources required to train the network. The specific model training parameters reflecting the best choices are determined in the experiments below.
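A brief NumPy sketch of the cross-entropy loss of equation (9) and of (inverted) dropout follows; the 0.5 dropout rate matches Table 7, while the clipping constant and the fixed random seed are implementation assumptions.

```python
import numpy as np

def cross_entropy(y, a, eps=1e-12):
    """Equation (9): C = -(1/N) * sum(y*ln(a) + (1-y)*ln(1-a)).
    Predictions are clipped away from 0 and 1 to keep the logs finite."""
    a = np.clip(a, eps, 1 - eps)
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

def dropout(x, rate=0.5, rng=None):
    """Inverted dropout: zero a random subset of activations during
    training and rescale the rest so the expected activation is kept."""
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= rate
    return x * mask / (1 - rate)

loss = cross_entropy(np.array([1.0, 0.0]), np.array([0.5, 0.5]))
dropped = dropout(np.ones(1000))
```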

4. Activity Recognition

4.1. Experiment Data. In addition to common basic actions, this paper also studies transition actions. In fact, few existing public data sets contain transition actions; therefore, this paper adopts the international standard Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set [31, 32], abbreviated as the HAPT Data Set, to conduct the experiments. This data set is an updated version of the UCI Human Activity Recognition Using Smartphones Data Set [8]. It provides raw data from smartphone sensors rather than preprocessed data, and the action categories have been expanded to include transition actions. The HAPT data set contains twelve types of actions: six basic actions, comprising three static postures (standing, sitting, and lying) and three walking activities (walking, going downstairs, and going upstairs), plus the six possible transitions between any two static postures (standing to sitting, sitting to standing, standing to lying, lying to sitting, sitting to lying, and lying to standing).

The HAPT data collection process is shown in Figure 4. The experiment involved 30 volunteers aged from 19 to 48, each wearing a smartphone on the waist. Data were collected with the built-in acceleration sensor and gyroscope at a sampling frequency of 50 Hz. Meanwhile, video recordings of the experimental process were made for the convenience of subsequent data labeling.

The collected data are saved as txt files, with the acceleration and gyroscope data stored independently in 60 groups each. Table 1 shows the label information corresponding to the original data of the experiment: the first column is the experiment ID, the second column is the experimenter number, the third column is the action label, and the fourth and fifth columns are the start and end row indices of the corresponding sensor data. The labels range from 1 to 12, representing the 12 types of actions. It can be seen from the table that the collected data contain invalid data: the first 250 samples are unlabeled and therefore invalid.

Input: data set χ = {x_1, ..., x_n}
Output: {y_i = BN_{γ,β}(x_i)}
(1) Calculate the mean of the data set: μ_χ ← (1/n) Σ_{i=1}^{n} x_i
(2) Calculate the variance of the data set: σ²_χ ← (1/n) Σ_{i=1}^{n} (x_i − μ_χ)²
(3) Normalize the data: x̂_i ← (x_i − μ_χ) / √(σ²_χ + ε)
(4) Scale and shift: y_i ← γ x̂_i + β = BN_{γ,β}(x_i)
(5) Return the learned parameters γ and β

Algorithm 1: Batch normalization.
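Algorithm 1 can be transcribed almost line for line in NumPy; the toy batch below is hypothetical, and γ and β are fixed rather than learned in this sketch.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Algorithm 1: minibatch mean/variance, normalization, then the
    learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)                    # step (1): minibatch mean
    var = x.var(axis=0)                    # step (2): minibatch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # step (3): normalize
    return gamma * x_hat + beta            # step (4): scale and shift

batch = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0],
                  [4.0, 40.0]])
out = batch_norm(batch)
```

With γ = 1 and β = 0, each column of the output has zero mean and (approximately) unit variance, regardless of the scale of the raw features.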


After preliminary processing of the original data, all unlabeled data were deleted, leaving 815,614 valid samples. Due to the low frequency and short duration of the transition actions, as opposed to the high frequency and long duration of the basic actions, there is a considerable difference in data volume between them: the six transition actions account for only about 8% of the total data, far less than the basic actions. Table 3 lists the amount of data for each action. The original data are divided into three parts: a training set used for model training, a validation set used to tune the parameters, and a test set used to measure the quality of the final model.
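A sketch of such a three-way division in NumPy follows; the 70/15/15 proportions and the shuffling seed are assumptions, since the paper does not state its exact split ratios.

```python
import numpy as np

def train_val_test_split(x, y, val=0.15, test=0.15, seed=0):
    """Shuffle the samples, then divide them into training, validation,
    and test sets (ratios here are illustrative assumptions)."""
    idx = np.random.default_rng(seed).permutation(len(x))
    n_test = int(len(x) * test)
    n_val = int(len(x) * val)
    te = idx[:n_test]
    va = idx[n_test:n_test + n_val]
    tr = idx[n_test + n_val:]
    return (x[tr], y[tr]), (x[va], y[va]), (x[te], y[te])

x = np.arange(100).reshape(100, 1)   # 100 toy samples
y = np.arange(100) % 12              # 12 toy action labels
(xtr, ytr), (xv, yv), (xt, yt) = train_val_test_split(x, y)
```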

4.2. Parameters Setting. In a deep learning network, the model parameters greatly affect the recognition rate. Therefore, the number of neurons in the LSTM layer, the learning rate, BN, the batch size, and other parameters are analyzed experimentally in the following sections.

4.2.1. Number of Neurons in the LSTM Layer. To verify the influence of the number of neurons in the LSTM layer on the recognition results, the experiments shown in Figure 5 were carried out. The recognition rate is lowest when each LSTM layer contains only 8 neurons: with so few neurons, the network lacks the necessary learning and information-processing capacity. As

the number of neurons increases, the recognition rate tends to increase, reaching 95.87% at 64 neurons. If the number of neurons is too large, the complexity of the network structure increases and learning slows down. Therefore, considering the training time of the network, the number of neurons per LSTM layer in this paper is set to 64.

4.2.2. The Learning Rate. Experiments were carried out at different learning rates, as shown in Table 4. The recognition rate of the model reaches a maximum of 95.87% at a learning rate of 0.002; therefore, a learning rate of 0.002 is adopted.

4.2.3. BN Operation. To verify the improvement brought by the BN operation, a comparative experiment was carried out with and without the BN layer; the epoch count is set to 400 and the other parameters remain unchanged. The recognition rates of both configurations on the test set are shown in Table 5: the recognition rate improves by about 4.24% after the BN layer is added.

4.2.4. Batch Size. The batch size refers to the number of samples per batch, whose maximum value is the total number of samples in the

Figure 4: Data collection of the physical activities.

Figure 5: Accuracy on the test set for different numbers of neurons (8 to 64).

Table 3: The data amount of the various activities in the HAPT data set.

Type           ID   Number
Walk           A1   122091
Upstairs       A2   116707
Downstairs     A3   107961
Sit down       A4   126677
Stand          A5   138105
Lie            A6   136865
Stand to sit   A7   10316
Sit to stand   A8   8029
Sit to lie     A9   12428
Lie to sit     A10  11150
Stand to lie   A11  14418
Lie to stand   A12  10867


training set. When the amount of data is small, the batch can be the whole data set, which approaches the direction of the extremum more accurately. However, in practical deep learning applications the amount of data is large, and small batches are generally used: they require relatively little memory and train faster. Within an appropriate range, increasing the batch size determines the direction of gradient descent more accurately and causes less training oscillation; however, once the batch size grows beyond a certain value, the descent direction no longer changes while the parameter updates slow down significantly. The recognition results for different batch sizes are shown in Table 6: the maximum recognition rate of 95.87% is reached at a batch size of 150. Therefore, 150 is selected as the best batch size in this paper.
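The minibatch iteration discussed above can be sketched as follows; the batch size of 150 matches the value selected in this section, while the array shapes and the shuffling seed are illustrative.

```python
import numpy as np

def minibatches(x, y, batch_size=150, seed=0):
    """Yield shuffled minibatches; the last, incomplete batch is kept."""
    idx = np.random.default_rng(seed).permutation(len(x))
    for start in range(0, len(x), batch_size):
        sel = idx[start:start + batch_size]
        yield x[sel], y[sel]

# 500 toy windows of shape (150, 8), matching the input size of Table 7
x = np.zeros((500, 150, 8))
y = np.zeros(500, dtype=int)
sizes = [len(xb) for xb, _ in minibatches(x, y)]
```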

The parameters of the CNN-LSTM model proposed in this paper are summarized in Table 7.

5. Experimental Results and Analysis

For human movement recognition, Wang and Liu [33] used the F-measure to verify the performance of a deep LSTM network model in human activity recognition, and Lu et al. [34] demonstrated the superiority of their model in behavior recognition using accuracy, precision, and recall. Therefore, to evaluate the performance of the motion recognition method proposed in this paper, we likewise use accuracy, recall, loss rate, and F-measure in the experiments.

With the above parameter settings, the recognition confusion matrix of the 12 actions is shown in Table 8, and the accuracy curve of the CNN-LSTM model is shown in Figure 6. As Table 9 shows, the overall recognition rate of CNN-LSTM is high, and the model achieves a good recognition effect on the transition actions.
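The per-class accuracy (precision), recall, and F-measure reported in Table 9 can be derived from a confusion matrix such as Table 8; a small NumPy sketch with a hypothetical 3-class matrix follows.

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall, and F-measure per class from a confusion
    matrix whose rows are actual classes and columns are predictions."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # true positives on the diagonal
    precision = tp / cm.sum(axis=0)  # column sums: all predicted as class
    recall = tp / cm.sum(axis=1)     # row sums: all actually in class
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted)
cm = [[8, 1, 1],
      [2, 7, 1],
      [0, 0, 10]]
p, r, f = per_class_metrics(cm)
```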

6. Case Study

Among non-deep-learning methods, the random forest (RF) and K-nearest neighbor (KNN) classifiers perform well in action classification; therefore, the proposed CNN-LSTM model is compared with the RF and KNN methods. First, the HAPT data set is input into RF and KNN; then, we segment the original

Table 4: Accuracy at different learning rates on the test set.

Learning rate   Recognition rate (%)
0.001           93.57
0.0015          94.21
0.002           95.87
0.0025          92.39
0.003           93.34
0.0035          92.12
0.004           92.84
0.0045          92.01

Table 5: Accuracy on the test set with and without the BN layer.

                   Recognition rate (%)
Without BN layer   91.63
With BN layer      95.87

Table 6: Accuracy of different batch sizes on the test set.

Batch size   Recognition rate (%)
25           91.74
50           92.88
75           92.92
100          93.10
125          94.33
150          95.87
175          93.45
200          93.37
225          93.72
250          93.45
275          92.84
300          93.35
325          94.06
350          93.34
375          92.96
400          93.53

Table 7: Experimental parameters of the CNN-LSTM model.

Parameter                 Value
Input vector size         150
Input channel number      8
Convolution kernel size   3
Pool size                 2
Activation function       ReLU
LSTM layers               1
Number of neurons         64
Dropout                   0.5
Learning rate             0.002
Batch size                150
Epochs                    400

Table 8: Confusion matrix of the various actions (rows: actual; columns: predicted).

      A1   A2   A3   A4   A5   A6   A7   A8   A9   A10  A11  A12
A1    410  1    3    0    0    0    0    0    0    0    0    0
A2    5    388  3    0    0    0    0    0    0    0    0    0
A3    1    3    346  0    0    0    0    0    0    0    0    0
A4    1    0    0    383  32   3    1    0    1    1    0    0
A5    0    0    1    31   431  0    0    0    0    0    0    0
A6    0    0    0    1    0    457  0    0    0    0    0    0
A7    0    0    0    1    0    0    17   0    0    0    0    0
A8    0    0    0    0    0    0    0    4    0    0    1    0
A9    0    0    0    0    0    0    0    0    19   1    4    1
A10   0    0    0    0    0    0    0    0    1    14   0    2
A11   0    1    0    1    0    0    1    0    2    1    32   1
A12   0    0    0    0    0    0    0    0    0    1    1    16


sensor data and calculate the mean, variance, covariance, and 15 statistical features. Finally, the basic actions and transition actions are classified according to the clustering results. The classification results are shown in Table 10: the recognition rate of the CNN-LSTM model is higher

than that of the RF and KNN methods for both basic actions and transition actions.
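The baseline pipeline, hand-crafted window statistics followed by a nearest-neighbour vote, can be sketched as below. The mean/variance features are a reduced stand-in for the baselines' full feature set (which the paper does not enumerate), and the two synthetic "actions" are hypothetical.

```python
import numpy as np

def window_features(w):
    """Hand-crafted statistics per window, as in the RF/KNN baselines:
    mean and variance of each axis (a reduced, illustrative feature set)."""
    return np.concatenate([w.mean(axis=0), w.var(axis=0)])

def knn_predict(train_x, train_y, query, k=1):
    """Minimal K-nearest-neighbour classifier (Euclidean distance)."""
    d = np.linalg.norm(train_x - query, axis=1)
    votes = train_y[np.argsort(d)[:k]]
    return int(np.bincount(votes).argmax())

rng = np.random.default_rng(0)
# Two synthetic "actions": low-variance (static) vs. high-variance (walking)
static = [rng.normal(0.0, 0.1, (150, 6)) for _ in range(10)]
walking = [rng.normal(0.0, 1.0, (150, 6)) for _ in range(10)]
train_x = np.array([window_features(w) for w in static + walking])
train_y = np.array([0] * 10 + [1] * 10)
query = window_features(rng.normal(0.0, 1.0, (150, 6)))
pred = knn_predict(train_x, train_y, query)
```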

In addition to the comparison with the RF and KNN classifiers, our proposed model is also compared with a single CNN, a single LSTM, CNN-GRU, and CNN-BLSTM deep

Table 9: The recognition accuracy, recall, and F-measure of the various actions.

ID    Accuracy (%)  Recall (%)  F-measure (%)
A1    99.03         98.32       98.68
A2    97.78         98.73       98.35
A3    98.86         98.02       98.44
A4    90.76         91.85       91.30
A5    93.09         93.09       93.09
A6    99.78         99.56       99.56
A7    94.44         89.47       91.89
A8    100           100         100
A9    76.00         82.61       79.17
A10   82.35         77.78       80.00
A11   82.05         86.49       84.21
A12   88.89         80.00       84.21

Table 10: Average accuracy of the various actions in the CNN-LSTM, RF, and KNN models.

ID    RF (%)  KNN (%)  CNN-LSTM (%)
A1    99.90   88.10    99.03
A2    92.50   97.80    97.78
A3    90.20   99.40    98.86
A4    91.90   83.80    90.76
A5    90.80   87.50    93.09
A6    97.10   100      99.78
A7    71.30   66.70    94.44
A8    72.00   68.00    100
A9    51.30   38.60    76.00
A10   74.90   36.30    82.35
A11   59.20   33.70    82.05
A12   61.10   57.90    88.89

Figure 6: Accuracy curves (training and validation) of the CNN-LSTM model over 400 iterations.


learning models. Table 11 shows the average accuracy of the various actions in the five different deep models. As can be seen, CNN-LSTM not only recognizes basic movements slightly better than the other four models but also recognizes transition movements significantly better, especially standing to sitting, sitting to lying, and standing to lying. Table 12 shows the recognition rates of the different models on the test set: the average recognition rate of every model is higher than 90%, but the CNN-LSTM model performs slightly better than CNN, LSTM, CNN-GRU, and CNN-BLSTM.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods using the same data set: Kuang [35] applied BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. The comparison with the approaches of [35, 36] is shown in Table 13; the proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored recognition methods based on deep learning and designed a behavior recognition model based on CNN-LSTM: the CNN learns local features from the original

sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine description of basic and transition movements, and accurate identification of the two motion patterns.

The actions identified in this paper include only common basic actions and individual transition actions. In future work, more kinds of actions can be collected and more complex ones added, such as eating and driving, and individual recognition can be realized by considering the behavior differences of different users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that combining deep and shallow models can achieve better performance: the deep model has strong learning ability, while the shallow model has higher learning efficiency, and their collaboration can achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the support of the laboratory, university, and government. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

Table 11: Average accuracy of the different activities with five deep learning models.

ID    CNN (%)  LSTM (%)  CNN-BLSTM (%)  CNN-GRU (%)  CNN-LSTM (%)
A1    97.50    97.70     97.41          99.75        99.03
A2    97.25    97.10     95.65          98.99        97.78
A3    95.60    97.15     100            96.57        98.86
A4    91.26    90.26     91.96          81.99        90.76
A5    90.80    90.80     84.74          92.48        93.09
A6    99.67    98.58     100            99.78        99.78
A7    76.47    64.86     44.44          77.78        94.44
A8    100      66.67     66.67          50.00        100
A9    63.83    69.39     62.07          48.00        76.00
A10   84.85    70.27     80.00          52.94        82.35
A11   72.50    69.33     65.00          71.79        82.05
A12   83.30    70.27     70.59          55.56        88.89

Table 12: Average accuracy of the five models in this paper.

Method      Average recognition rate (%)
CNN         94.29
LSTM        93.22
CNN-BLSTM   92.73
CNN-GRU     93.34
CNN-LSTM    95.87

Table 13: Average accuracy on the test set of the methods in [35, 36] and of CNN-LSTM.

Method       Average recognition rate (%)
BLSTM [35]   87.5
DBN [36]     89.6
CNN-LSTM     95.8


[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors (Switzerland), vol. 16, p. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-d pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.

[12] K. Cho, B. van Merrienboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734, Doha, Qatar, October 2014.

[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.

[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference, MM 2015, pp. 1307–1310, Brisbane, Australia, October 2015.

[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2015, pp. 1488–1492, Hong Kong, China, October 2016.

[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.

[17] A. Murad and J. Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors (Switzerland), vol. 17, p. 11, 2017.

[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing (TSC), vol. 12, p. 99, 2019.

[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.

[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.

[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using Convolutional Neural Networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.

[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.

[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.

[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.

[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.

[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.

[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.

[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, vol. 3, pp. 2332–2340, Lille, France, July 2015.

[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.

[30] G E Hinton N Srivastava A Krizhevsky I Sutskever andR R Salakhutdinov Improving Neural Networks by Pre-venting Co-adaptation of Feature Detectors arXiv preparationGeneva Switzerland 2012

[31] B M h Abidine L Fergani B Fergani and M Oussalahldquoe joint use of sequence features combination and modifiedweighted SVM for improving daily activity recognitionrdquoPattern Analysis and Applications vol 21 no 1 pp 119ndash1382018

[32] G M Weiss J W Lockhart T T Pulickal et al ldquoAsmartphone-based activity recognition system for improvinghealth and well-beingrdquo in Proceedings of the 3rd IEEE In-ternational Conference On Data Science And Advanced An-alytics DSAA 2016 pp 682ndash688 Montreal QC CanadaOctober 2016

[33] L Wang and R Liu ldquoHuman activity recognition based onwearable sensor using hierarchical deep LSTM networksrdquoCircuits Systems and Signal Processing vol 39 no 2pp 837ndash856 2019

Security and Communication Networks 11

[34] W Lu F Fan J Chu P Jing and S Yuting ldquoWearablecomputing for internet of things a discriminant approach forhuman activity recognitionrdquo IEEE Internet of ings Journalvol 6 no 2 pp 2749ndash2759 2019

[35] X Kuang Human Behavior Recognition Based on DeepLearning and Wearable Sensor Nanjing University of In-formation Engineering Nanjing China 2018

[36] M M Hassan M Z Uddin A Mohamed and A AlmogrenldquoA robust human activity recognition system using smart-phone sensors and deep learningrdquo Future Generation Com-puter Systems vol 81 pp 307ndash313 2018

12 Security and Communication Networks

Page 4: Wearable Sensor-Based Human Activity Recognition Using ...downloads.hindawi.com/journals/scn/2020/2132138.pdf · Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep

convolution kernel sizes are 2×8, 2×18, and 2×36, and the step size is 1. Since the filter may not cover the data evenly in a given direction during convolution, to avoid losing data at the edge of the input, a padding parameter is introduced and set to "SAME", which pads the edges of the input matrix with zeros. After the convolution operation in the convolution layer, the output usually passes through a nonlinear activation function, which then forms the output of the layer. Popular activation functions include the Sigmoid, ReLU, and Tanh functions. Among them, the ReLU function changes negative values in the data extracted by the CNN to 0, while positive values remain unchanged. After this nonlinear operation, the positive value

Figure 2: Structure of the data preprocessing model (the raw data are split over time into sliding windows w1, w2, w3).

Figure 3: CNN model architecture (an m × n input passes through Conv_1/Pooling_1, Conv_2/Pooling_2, and Conv_3/Pooling_3 to produce the feature map).

Table 1: Activity labels corresponding to the original data.

Id  Exp  Label  Start  End
1   1    5      250    1232
1   1    7      1233   1392
1   1    4      1393   2194
1   1    8      2195   2359
1   1    5      2360   3374
1   1    11     3375   3662
1   1    5      3663   4538
1   1    11     4539   4735
1   1    5      4736   5667
1   1    11     5668   5859
1   1    5      5860   6786
1   1    11     6787   6977
1   1    5      6978   8078

Figure 1: Human activity recognition framework based on CNN-LSTM (input → Conv_1/Pooling → Conv_2/Pooling → Conv_3/Pooling → feature map for CNN feature extraction, then LSTM → fully connected → batch normalization → Softmax for LSTM feature fusion and classification of activities such as walking upstairs, stand-to-lie, and lie-to-stand).

4 Security and Communication Networks

greater than 0 can be expressed more clearly by the extracted features. Therefore, the ReLU activation function is used in the convolution layers of the CNN:

f(x) = \max(0, x) = \begin{cases} 0, & x < 0 \\ x, & x \geq 0 \end{cases} \quad (1)

Further, we have

f'(x) = \begin{cases} 0, & x < 0 \\ 1, & x \geq 0 \end{cases} \quad (2)

The pooling layer reduces the number of feature maps and parameters. Popular pooling techniques include maximum pooling and average pooling. In recent years, theoretical analysis and performance evaluation have shown the superior performance of the maximum pooling strategy, which is widely used in deep learning [25, 26]. Moreover, some studies show that maximum pooling is well suited to sensor-based human behavior recognition [27]. Therefore, all pooling layers of the CNN in this paper use maximum pooling. The specific convolution and pooling parameters are set as shown in Table 2.
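The ReLU activation of equation (1) and the maximum pooling described above can be sketched as follows; this is an illustrative NumPy sketch, not the paper's implementation:

```python
import numpy as np

def relu(x):
    # ReLU keeps positive values and zeroes out negatives (equation (1))
    return np.maximum(0.0, x)

def max_pool_1d(x, size=2, stride=2):
    # Max pooling over the time axis: each window keeps only its largest value
    n = (len(x) - size) // stride + 1
    return np.array([x[i * stride : i * stride + size].max() for i in range(n)])

feat = relu(np.array([-1.0, 2.0, -3.0, 4.0, 5.0, -6.0]))  # -> [0, 2, 0, 4, 5, 0]
pooled = max_pool_1d(feat)                                # -> [2, 4, 5]
```

Note how pooling halves the length of the feature sequence while keeping its strongest responses, which is why it reduces the number of parameters downstream.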

3.3. Feature Fusion and Action Classification. To improve the recognition rate of transition actions, we build an LSTM after the CNN network. {f1, f2, …, fn} is the feature sequence converted from the feature map that the CNN computes from the images composed of the original data. The sequence {f1, f2, …, fn} is input to the LSTM, whose memory cells produce an output sequence {m1, m2, …, mn}.

The LSTM has several gating units: memory units such as the input gate, forget gate, and output gate are combined with learned weights to solve the vanishing-gradient problem in the backpropagation of ordinary recurrent neural networks. Meanwhile, the LSTM can model time-dependent actions and fully capture global features, so as to improve recognition accuracy [28]. The LSTM cell controls the information flowing into the neuron and is composed of a forget gate, an input gate, and an output gate. The predicted value of the LSTM cell is obtained using the Tanh function.

Firstly, the forget gate determines how much information from the previous moment is accumulated into the current cell. As shown in equation (3), a probability value is calculated to determine the amount of information that can pass through the gate:

\Gamma_f = \sigma(w_f [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f) \quad (3)

where w_f represents the weight corresponding to the input vector, b_f represents the bias, a^{⟨t−1⟩} represents the output of the neuron at the previous moment, and x^{⟨t⟩} represents the current input of the neuron.

Secondly, the input gate consists of an update gate and a Tanh layer, which control how much information flows into the current cell. The calculation process is shown in equations (4)–(6): the input of the input gate and the output of the forget gate update the cell state at the same time, discarding unwanted information. Then the predicted value of the current unit is determined by the output gate, and the output of the model is obtained as shown in equations (7) and (8):

\Gamma_u = \sigma(w_u [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u) \quad (4)

\tilde{C} = \tanh(w_c [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c) \quad (5)

C^{\langle t \rangle} = \Gamma_u * \tilde{C}^{\langle t \rangle} + \Gamma_f * C^{\langle t-1 \rangle} \quad (6)

\Gamma_o = \sigma(w_o [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o) \quad (7)

a^{\langle t \rangle} = \Gamma_o * \tanh(C^{\langle t \rangle}) \quad (8)
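Equations (3)–(8) describe one LSTM time step. They can be sketched in NumPy as below; the weight shapes, dictionary layout, and random initialization are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, w, b):
    # One LSTM time step following equations (3)-(8); each gate reads the
    # concatenation [a<t-1>, x<t>], as in the formulas above
    z = np.concatenate([a_prev, x_t])
    gamma_f = sigmoid(w["f"] @ z + b["f"])       # forget gate, eq. (3)
    gamma_u = sigmoid(w["u"] @ z + b["u"])       # update (input) gate, eq. (4)
    c_tilde = np.tanh(w["c"] @ z + b["c"])       # candidate cell state, eq. (5)
    c_t = gamma_u * c_tilde + gamma_f * c_prev   # new cell state, eq. (6)
    gamma_o = sigmoid(w["o"] @ z + b["o"])       # output gate, eq. (7)
    a_t = gamma_o * np.tanh(c_t)                 # new hidden state, eq. (8)
    return a_t, c_t

hidden, inp = 4, 3
rng = np.random.default_rng(0)
w = {k: rng.standard_normal((hidden, hidden + inp)) * 0.1 for k in "fuco"}
b = {k: np.zeros(hidden) for k in "fuco"}
a, c = np.zeros(hidden), np.zeros(hidden)
a, c = lstm_step(a, c, rng.standard_normal(inp), w, b)
```

Because the hidden state is gated through a sigmoid and a tanh, every component of a^{⟨t⟩} stays strictly inside (−1, 1).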

After the LSTM layer, the final output is a set of vectors containing time and action-sequence correlations, which are input into the fully connected layer for the fusion of global action features. The training of a neural network model becomes complicated because the statistical distribution of each layer's input changes with the parameters of the previous layer. To keep the distribution of the output data from changing too much, a lower learning rate would have to be used, which reduces the training speed. To solve this issue, this paper introduces batch normalization (BN) to standardize the values of each layer in the LSTM (the output of the neurons at the previous moment and the input at the current moment), so that their mean and variance do not change with the distribution of the underlying parameters, effectively decoupling the parameters of each layer from the other layers. In this way, gradient vanishing or explosion can be prevented, and the training of the network can be accelerated. The BN procedure is shown in Algorithm 1.

In Algorithm 1, μ_χ and ς²_χ are the mean and variance of x_i computed over the mini-batch. The mean and variance are used to normalize x_i so that the samples follow a standard normal distribution. However, this normal distribution cannot reflect the feature distribution of the training samples, and thus it is necessary to introduce the scaling factor γ and the shift factor β. As training progresses, γ and β are also learned by backpropagation to improve accuracy.

After the BN operation, the features are more salient, so they are input to the Softmax layer to extract the action features and classify them in time series. In this model, the output layer uses the Softmax normalized exponential function to calculate the posterior probabilities of the different actions to

Table 2: The convolution and pooling layers of the CNN architecture.

Layers   Conv1d_1   Conv1d_2   Conv1d_3
Size     1×2×8      1×2×18     1×2×36
Stride   1×1×1      1×1×1      1×1×1
Channel  18         36         72

Layers   Pooling_1  Pooling_2  Pooling_3
Size     1×2×18     1×2×36     1×2×72
Stride   1×1×1      1×1×1      1×1×1
Channel  18         36         72


realize classification. It maps the output values of the neurons into (0, 1), which can be regarded as the prediction probabilities of the actions, and the largest one gives the classification result. The Softmax output layer then outputs a category vector such as [0 0 0 0 1 0 0 0 0 0 0 0], indicating that the classification result is the action numbered 5.
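The Softmax mapping described above can be sketched as follows; this is an illustrative NumPy sketch (the max-subtraction is a standard numerical-stability trick, not something stated in the paper):

```python
import numpy as np

def softmax(logits):
    # Normalized exponential: subtract the max for numerical stability,
    # then exponentiate and normalize so the outputs sum to 1
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

logits = np.array([0.2, 1.5, 0.1, 3.0, 0.3])  # one score per action class
probs = softmax(logits)                       # posterior probabilities in (0, 1)
predicted = int(np.argmax(probs))             # index of the most likely action
```

The one-hot category vector in the text is simply the indicator of `predicted` among the 12 classes.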

3.4. Model Implementation and Training. The neural network described here is implemented in TensorFlow [29], a lightweight library for building and training neural networks. Model training and classification run on a conventional computer with a 2.4 GHz CPU and 16 GB of memory.

The model is trained in a fully supervised manner, backpropagating the gradient from the Softmax layer to the convolution layers. Network parameters are optimized with the mini-batch gradient descent method and the Adam optimizer by minimizing the cross-entropy loss function [13]. Adam is widely used owing to its simple implementation, efficient computation, and low memory demand, and it compares favorably with other stochastic optimization algorithms. In this paper, to train the model well, after the training data are input into the network, the Adam optimizer and the backpropagation algorithm are used to learn and optimize the network parameters. Meanwhile, the cross-entropy loss function is used to calculate the total error, as shown in the following equation:

C = -\frac{1}{N} \sum_{x} [y \ln a + (1 - y) \ln(1 - a)] \quad (9)

where y is the true label and a is the predicted value.

To improve efficiency, the data are segmented into small batches during training and testing. With this configuration, the cumulative gradient of the parameters is calculated after each mini-batch. The weights are randomly and orthogonally initialized. As a form of regularization, we introduce a dropout operator on the input of each dense layer. This operator sets the activations of randomly selected units to zero during training. Dropout, proposed by Hinton et al. [30], randomly deletes some nodes of the network while keeping the input and output neurons intact, which is equivalent to training many different networks. Different networks may overfit in different ways, but averaging their results effectively reduces overfitting. In addition, dropout lets neurons learn stronger features, because they cannot rely on other specific neurons. The number of parameters to be optimized in a deep neural network depends on the types of layers it contains and has a great impact on the time and computing resources required for training. The specific model training parameters reflect the best choices found in the experiments.
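The cross-entropy loss of equation (9) can be sketched directly; a minimal NumPy version, where the clipping constant `eps` is an assumption added to avoid log(0):

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Binary cross-entropy averaged over N samples (equation (9));
    # eps guards against taking the log of exactly 0 or 1
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y = np.array([1.0, 0.0, 1.0, 0.0])  # true labels
a = np.array([0.9, 0.1, 0.8, 0.2])  # predicted values
loss = cross_entropy(y, a)          # small, since predictions match labels well
```

The loss shrinks toward 0 as the predictions a approach the labels y, which is exactly what the optimizer minimizes.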

4. Activity Recognition

4.1. Experiment Data. In addition to common basic actions, this paper also studies transition actions. Few existing public data sets contain transition actions. Therefore, this paper adopts the international standard Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set [31, 32], abbreviated as the HAPT Data Set, to conduct the experiments. This data set is an updated version of the UCI Human Activity Recognition Using Smartphones Data Set [8]. It provides raw data from smartphone sensors rather than preprocessed data, and its action categories have been expanded to include transition actions. The HAPT data set contains twelve types of actions. Firstly, it has six basic actions: three static postures (standing, sitting, and lying) and three walking activities (walking, walking downstairs, and walking upstairs). Secondly, it has the six possible transitions between any two static postures: stand-to-sit, sit-to-stand, stand-to-lie, lie-to-sit, sit-to-lie, and lie-to-stand.

The HAPT data collection process is shown in Figure 4. The experiment involved 30 volunteers aged 19 to 48, each wearing a smartphone on the waist. Data were collected with the built-in acceleration sensor and gyroscope at a sampling frequency of 50 Hz. Meanwhile, the experimental process was video-recorded for the convenience of subsequent data labeling.

The collected data are saved as txt files, and the acceleration and gyroscope data are stored independently in 60 groups each. Table 1 shows the label information corresponding to the original experimental data. The first column is the experiment ID, the second column is the experimenter number, the third column is the action label, and the fourth and fifth columns are the start and end row indices of the corresponding sensor data. The labels range from 1 to 12, representing the 12 types of actions. It can be seen from the table that the collected data contain invalid data: the first 250 rows are unlabeled and therefore invalid.
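The label layout of Table 1 lets the raw signal be sliced into labeled segments. A small sketch under the assumption that the row indices are 1-based and inclusive; the label rows are taken from Table 1, while the signal array is synthetic:

```python
import numpy as np

# Label rows as in Table 1: (experiment, user, activity, start, end),
# where start/end are assumed to be 1-based inclusive row indices
labels = np.array([
    [1, 1, 5, 250, 1232],
    [1, 1, 7, 1233, 1392],
    [1, 1, 4, 1393, 2194],
])

signal = np.arange(1, 2501).reshape(-1, 1)  # stand-in for raw sensor rows

# Slice the signal into (activity, segment) pairs
segments = [(act, signal[start - 1 : end]) for _, _, act, start, end in labels]
lengths = [seg.shape[0] for _, seg in segments]  # -> [983, 160, 802]
```

Rows before the first labeled range (here the first 249) are exactly the unlabeled, invalid data mentioned above and are discarded.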

Input: data set χ = {x1, …, xn}
Output: {y_i = BN_{γ,β}(x_i)}
(1) Compute the mini-batch mean: μ_χ ← (1/n) Σ_{i=1}^{n} x_i
(2) Compute the mini-batch variance: ς²_χ ← (1/n) Σ_{i=1}^{n} (x_i − μ_χ)²
(3) Normalize the data: x̂_i ← (x_i − μ_χ) / √(ς²_χ + ε)
(4) Scale and shift: y_i ← γ x̂_i + β = BN_{γ,β}(x_i)
(5) Return the learned parameters γ and β

Algorithm 1: Batch normalization.
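Steps (1)–(4) of Algorithm 1 can be sketched in NumPy as follows; this is an illustrative sketch in which γ and β are fixed arguments rather than parameters learned by backpropagation:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Steps (1)-(4) of Algorithm 1: normalize with the mini-batch mean and
    # variance, then scale by gamma and shift by beta
    mu = x.mean()                        # step (1)
    var = x.var()                        # step (2)
    x_hat = (x - mu) / np.sqrt(var + eps)  # step (3)
    return gamma * x_hat + beta          # step (4)

batch = np.array([1.0, 2.0, 3.0, 4.0])
y = batch_norm(batch)
# the normalized batch has (approximately) zero mean and unit variance
```

In training, γ and β would be updated by gradient descent so that the layer can recover whatever feature distribution the data actually require.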


After preliminary processing of the original data, all unlabeled data were deleted, leaving 815,614 valid samples. Owing to the low frequency and short duration of transition actions, and the high frequency and long duration of basic actions, there is a considerable difference in data volume between the two: the six transition actions account for only about 8% of the total data, much less than the basic actions. Table 3 lists the amount of data for each action. The data are divided into three parts: a training set used for model training, a validation set used to tune parameters, and a test set used to measure the quality of the final model.

4.2. Parameter Settings. In a deep learning network, the model parameters greatly affect the recognition rate. Therefore, the number of neurons in the LSTM layer, the learning rate, the BN operation, the batch size, and other parameters are analyzed experimentally in the following sections.

4.2.1. Number of Neurons in the LSTM Layer. To verify the influence of the number of neurons in the LSTM layer on the recognition results, the experiments shown in Figure 5 were carried out. The recognition rate is lowest when each LSTM layer contains only 8 neurons: with so few neurons, the network lacks the necessary learning and information-processing capacity. As the number of neurons increases, the recognition rate tends to increase; with 64 neurons it reaches 95.87%. If the number of neurons is too large, the complexity of the network structure increases and learning slows down. Therefore, considering the training time of the network, the number of neurons per LSTM layer is set to 64.

4.2.2. The Learning Rate. Experiments were carried out at different learning rates, as shown in Table 4. The recognition rate of the model reaches its maximum of 95.87% when the learning rate is 0.002. Therefore, a learning rate of 0.002 is adopted.

4.2.3. BN Operation. To verify the improvement that the BN operation brings to the network model, a comparative experiment was carried out with and without the BN layer. The epoch number is set to 400, and the other parameters remain unchanged. The recognition rates of both variants on the test set are shown in Table 5. The recognition rate on the test set improves by about 4.24% after the BN layer is added.

4.2.4. Batch Size. The batch size is the number of samples per batch, whose maximum value is the total number of samples in the

Figure 4: Data collection of the physical activities.

Figure 5: Accuracy on the test set for different numbers of neurons (8 to 64) in the LSTM layer.

Table 3: The data amount of the various activities in the HAPT data set.

Type          ID   Number
Walk          A1   122091
Upstairs      A2   116707
Downstairs    A3   107961
Sit down      A4   126677
Stand         A5   138105
Lie           A6   136865
Stand to sit  A7   10316
Sit to stand  A8   8029
Sit to lie    A9   12428
Lie to sit    A10  11150
Stand to lie  A11  14418
Lie to stand  A12  10867


training set. When the amount of data is small, a batch can be the whole data set, so that the direction of the extremum is approached more accurately. In practical applications, however, the amount of data used in deep learning is relatively large, and mini-batch processing is generally adopted: it requires relatively little memory and trains faster. Within an appropriate range, increasing the batch size determines the direction of gradient descent more accurately and causes less training oscillation. However, once the batch size grows beyond a certain value, the descent direction no longer changes, and parameter updates slow down significantly. The recognition results for different batch sizes are shown in Table 6. The maximum recognition rate, 95.87%, is reached when the batch size is 150. Therefore, 150 is selected as the batch size in this paper.

The parameters of the proposed CNN-LSTM model are shown in Table 7.

5. Experimental Results and Analysis

For human movement recognition, Wang and Liu [33] used the F-measure to verify the performance of a hierarchical deep LSTM network model in human activity recognition, and Lu et al. [34] demonstrated the superiority of their model in behavior recognition using accuracy, precision, and recall. Therefore, to evaluate the performance of the motion recognition method proposed in this paper, we likewise use accuracy, recall, loss, and F-measure in the experiments.

With the above parameters, the recognition confusion matrix of the 12 different actions is shown in Table 8, and the accuracy curve of the CNN-LSTM model is shown in Figure 6. It can be seen from Table 9 that the overall recognition rate of CNN-LSTM is high and that it recognizes transition actions particularly well.

6. Case Study

Among non-deep-learning methods, the random forest (RF) and K-nearest neighbor (KNN) classifiers perform well in action classification and recognition. Therefore, the proposed CNN-LSTM model is compared with the RF and KNN methods. First, the HAPT data set is input into RF and KNN. Then, segment the original

Table 4: Accuracy of different learning rates on the test set.

Learning rate  Recognition rate (%)
0.001          93.57
0.0015         94.21
0.002          95.87
0.0025         92.39
0.003          93.34
0.0035         92.12
0.004          92.84
0.0045         92.01

Table 5: Accuracy on the test set with and without the BN layer.

                  Recognition rate (%)
Without BN layer  91.63
With BN layer     95.87

Table 6: Accuracy of different batch sizes on the test set.

Batch size  Recognition rate (%)
25          91.74
50          92.88
75          92.92
100         93.10
125         94.33
150         95.87
175         93.45
200         93.37
225         93.72
250         93.45
275         92.84
300         93.35
325         94.06
350         93.34
375         92.96
400         93.53

Table 7: Experimental parameters of the CNN-LSTM model.

Parameters               Value
Input vector size        150
Input channel number     8
Convolution kernel size  3
Pool size                2
Activation function      ReLU
LSTM layers              1
Number of neurons        64
Dropout                  0.5
Learning rate            0.002
Batch size               150
Epoch                    400
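Under the settings of Table 7 (window length 150, 8 input channels, 64 LSTM units) and the layer widths of Table 2, the CNN-LSTM could be sketched in tf.keras as below. This is a hedged reconstruction, not the authors' released code; the padding choice and kernel details are assumptions:

```python
import tensorflow as tf

# Sketch of the CNN-LSTM: three Conv1D/MaxPooling1D stages (widths from
# Table 2), then LSTM -> BatchNormalization -> Dropout -> Softmax over
# the 12 activity classes, with the Table 7 hyperparameters
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(150, 8)),
    tf.keras.layers.Conv1D(18, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(36, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(72, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(12, activation="softmax"),  # 12 activity classes
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.002),
              loss="categorical_crossentropy", metrics=["accuracy"])
```

Training would then call `model.fit` with batch size 150 for 400 epochs, matching Table 7.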

Table 8: Confusion matrix of the various actions (rows: actual; columns: predicted).

       A1   A2   A3   A4   A5   A6   A7  A8  A9  A10  A11  A12
A1    410    1    3    0    0    0    0   0   0    0    0    0
A2      5  388    3    0    0    0    0   0   0    0    0    0
A3      1    3  346    0    0    0    0   0   0    0    0    0
A4      1    0    0  383   32    3    1   0   1    1    0    0
A5      0    0    1   31  431    0    0   0   0    0    0    0
A6      0    0    0    1    0  457    0   0   0    0    0    0
A7      0    0    0    1    0    0   17   0   0    0    0    0
A8      0    0    0    0    0    0    0   4   0    0    1    0
A9      0    0    0    0    0    0    0   0  19    1    4    1
A10     0    0    0    0    0    0    0   0   1   14    0    2
A11     0    1    0    1    0    0    1   0   2    1   32    1
A12     0    0    0    0    0    0    0   0   0    1    1   16


sensor data and calculate the mean, variance, covariance, and 15 features in total. Finally, the basic actions and transition actions are classified according to the clustering results. The classification results are shown in Table 10. The recognition rate of the CNN-LSTM model is higher than that of the RF and KNN methods for both basic actions and transition actions.

In addition to the comparison with the RF and KNN classifiers, the proposed model is also compared with a single CNN, a single LSTM, and the CNN-GRU and CNN-BLSTM deep

Table 9: The recognition accuracy, recall, and F-measure of the various actions.

ID   Accuracy (%)  Recall (%)  F-measure (%)
A1   99.03         98.32       98.68
A2   97.78         98.73       98.35
A3   98.86         98.02       98.44
A4   90.76         91.85       91.30
A5   93.09         93.09       93.09
A6   99.78         99.56       99.56
A7   94.44         89.47       91.89
A8   100           100         100
A9   76.00         82.61       79.17
A10  82.35         77.78       80.00
A11  82.05         86.49       84.21
A12  88.89         80.00       84.21

Table 10: Average accuracy of the various actions under the CNN-LSTM, RF, and KNN models.

ID   RF (%)  KNN (%)  CNN-LSTM (%)
A1   99.90   88.10    99.03
A2   92.50   97.80    97.78
A3   90.20   99.40    98.86
A4   91.90   83.80    90.76
A5   90.80   87.50    93.09
A6   97.10   100      99.78
A7   71.30   66.70    94.44
A8   72.00   68.00    100
A9   51.30   38.60    76.00
A10  74.90   36.30    82.35
A11  59.20   33.70    82.05
A12  61.10   57.90    88.89

Figure 6: Accuracy curves of the CNN-LSTM model on the training and validation sets over 400 iterations.


learning models. Table 11 shows the average accuracy of the various actions for the five models. As can be seen, CNN-LSTM not only recognizes basic movements slightly better than the other models but also recognizes transition movements significantly better, especially stand-to-sit, sit-to-lie, and stand-to-lie. Table 12 shows the recognition rates of the different models on the test set. The average recognition rates of all the models exceed 90%, but CNN-LSTM performs slightly better than CNN, LSTM, CNN-GRU, and CNN-BLSTM.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods on the same data set. Kuang [35] applied BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. We compared the performance with the approaches in [35, 36], with the results shown in Table 13. The proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored recognition methods based on deep learning and designed a behavior recognition model based on CNN-LSTM. The CNN learns local features from the original sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine description of basic and transition movements, and accurate identification of the two motion patterns.

The actions identified in this paper include only common basic actions and individual transition actions. In future work, more kinds of actions can be collected, and more complex actions such as eating and driving can be added. Individual recognition could also be realized by considering the behavioral differences between users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that combining deep and shallow models can achieve better performance: deep learning models have strong learning ability, while shallow models have higher learning efficiency, and their collaboration can achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the laboratory, the university, and the government for their support. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

Table 11: Average accuracy of the different activities with five deep learning models.

ID   CNN (%)  LSTM (%)  CNN-BLSTM (%)  CNN-GRU (%)  CNN-LSTM (%)
A1   97.50    97.70     97.41          99.75        99.03
A2   97.25    97.10     95.65          98.99        97.78
A3   95.60    97.15     100            96.57        98.86
A4   91.26    90.26     91.96          81.99        90.76
A5   90.80    90.80     84.74          92.48        93.09
A6   99.67    98.58     100            99.78        99.78
A7   76.47    64.86     44.44          77.78        94.44
A8   100      66.67     66.67          50.00        100
A9   63.83    69.39     62.07          48.00        76.00
A10  84.85    70.27     80.00          52.94        82.35
A11  72.50    69.33     65.00          71.79        82.05
A12  83.30    70.27     70.59          55.56        88.89

Table 12: Average accuracy of the five models in this paper.

Method     Average recognition rate (%)
CNN        94.29
LSTM       93.22
CNN-BLSTM  92.73
CNN-GRU    93.34
CNN-LSTM   95.87

Table 13: Average accuracy on the test set of the methods in [35, 36] and CNN-LSTM.

Method      Average recognition rate (%)
BLSTM [35]  87.5
DBN [36]    89.6
CNN-LSTM    95.8

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordoñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors, vol. 16, no. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-D pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, no. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science, Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.

[12] K. Cho, B. van Merriënboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pp. 1724–1734, Doha, Qatar, October 2014.

[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.

[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference (MM 2015), pp. 1307–1310, Brisbane, Australia, October 2015.

[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2015), pp. 1488–1492, Hong Kong, China, October 2016.

[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.

[17] A. Murad and J.-Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors, vol. 17, no. 11, 2017.

[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing, vol. 12, 2019.

[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.

[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.

[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using convolutional neural networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.

[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.

[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.

[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.

[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.

[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.

[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.

[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), vol. 3, pp. 2332–2340, Lille, France, July 2015.

[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.

[30] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint, 2012.

[31] B. M. h. Abidine, L. Fergani, B. Fergani, and M. Oussalah, "The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition," Pattern Analysis and Applications, vol. 21, no. 1, pp. 119–138, 2018.

[32] G M Weiss J W Lockhart T T Pulickal et al ldquoAsmartphone-based activity recognition system for improvinghealth and well-beingrdquo in Proceedings of the 3rd IEEE In-ternational Conference On Data Science And Advanced An-alytics DSAA 2016 pp 682ndash688 Montreal QC CanadaOctober 2016

[33] L Wang and R Liu ldquoHuman activity recognition based onwearable sensor using hierarchical deep LSTM networksrdquoCircuits Systems and Signal Processing vol 39 no 2pp 837ndash856 2019


[34] W. Lu, F. Fan, J. Chu, P. Jing, and S. Yuting, "Wearable computing for Internet of Things: a discriminant approach for human activity recognition," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2749–2759, 2019.

[35] X. Kuang, Human Behavior Recognition Based on Deep Learning and Wearable Sensor, Nanjing University of Information Engineering, Nanjing, China, 2018.

[36] M. M. Hassan, M. Z. Uddin, A. Mohamed, and A. Almogren, "A robust human activity recognition system using smartphone sensors and deep learning," Future Generation Computer Systems, vol. 81, pp. 307–313, 2018.



greater than 0 can be more clearly expressed by the extracted features. Therefore, the ReLU activation function is used in the convolution layers of the CNN:

f(x) = max(0, x) = { 0, x < 0; x, x ≥ 0 }  (1)

Further, we have

f′(x) = { 0, x < 0; 1, x ≥ 0 }  (2)
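Equations (1) and (2) can be checked numerically. The following NumPy sketch is illustrative only and is not code from the paper:

```python
import numpy as np

def relu(x):
    # Equation (1): f(x) = max(0, x)
    return np.maximum(0.0, x)

def relu_grad(x):
    # Equation (2): f'(x) = 0 for x < 0, 1 for x >= 0
    return np.where(x < 0, 0.0, 1.0)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))       # negative inputs are clipped to 0
print(relu_grad(x))  # gradient is 0 left of the origin, 1 elsewhere
```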

The pooling layer reduces the number of feature mappings and parameters. Popular pooling techniques include maximum pooling and average pooling. In recent years, theoretical analysis and performance evaluation have shown the superior performance of the maximum pooling strategy, which is widely used in deep learning [25, 26]. Moreover, some studies show that maximum pooling is well suited to sensor-based human behavior recognition [27]. Therefore, all pooling layers of the CNN in this paper use maximum pooling. The specific convolution and pooling parameters are set as shown in Table 2.
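As a rough illustration of the operation (not the paper's implementation), maximum pooling over a one-dimensional sensor signal with pool size 2, the value used in this paper, can be sketched as:

```python
import numpy as np

def max_pool_1d(signal, pool=2, stride=2):
    """Max pooling over a 1-D signal; pool size 2 matches the paper."""
    out = []
    for start in range(0, len(signal) - pool + 1, stride):
        # keep only the strongest response in each window
        out.append(signal[start:start + pool].max())
    return np.array(out)

acc = np.array([0.1, 0.4, 0.3, 0.9, 0.2, 0.5])  # toy accelerometer samples
print(max_pool_1d(acc))  # [0.4 0.9 0.5]
```

Each output value keeps the dominant activation of its window, halving the length of the feature map while preserving the strongest responses.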

3.3. Feature Fusion and Action Classification. To improve the recognition rate of transition actions, we build an LSTM after the CNN network. Let {f1, f2, ..., fn} be the feature sequence converted from the feature map that the CNN computes from the images composed of the original data. The sequence {f1, f2, ..., fn} is fed into the LSTM, whose storage units produce an output sequence {m1, m2, ..., mn}.

The LSTM has several gating units: memory units such as the input gate, forget gate, and output gate are combined with learned weights to solve the vanishing-gradient problem that arises during backpropagation in an ordinary recurrent neural network. Meanwhile, the LSTM can model time-dependent actions and fully capture global features, which improves recognition accuracy [28]. The LSTM cell controls the information flowing into the neurons and is composed of a forget gate, an input gate, and an output gate. The predicted value of the LSTM cell is obtained using the tanh function.

Firstly, the forget gate determines how much information from the previous moment is accumulated into the current cell. As shown in equation (3), a probability value is calculated to determine the amount of information that can pass through the gate:

Γf = σ(wf * [a⟨t−1⟩, x⟨t⟩] + bf)  (3)

where wf represents the weight corresponding to the input vector, bf represents the bias, a⟨t−1⟩ represents the output of the neuron at the last moment, and x⟨t⟩ represents the current input of the neuron.

Secondly, the input gate consists of an update gate and a tanh layer, which control how much information flows into the current cell; the calculation process is shown in equations (4)-(6). The input of the input gate and the output of the forget gate update the cell at the same time, discarding unwanted information. Then the predicted value of the current unit is determined by the output gate, and the output of the model is obtained as shown in equations (7) and (8):

Γu = σ(wu * [a⟨t−1⟩, x⟨t⟩] + bu)  (4)

C̃⟨t⟩ = tanh(wc * [a⟨t−1⟩, x⟨t⟩] + bc)  (5)

C⟨t⟩ = Γu * C̃⟨t⟩ + Γf * C⟨t−1⟩  (6)

Γo = σ(wo * [a⟨t−1⟩, x⟨t⟩] + bo)  (7)

a⟨t⟩ = Γo * tanh(C⟨t⟩)  (8)
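The gate equations (3)-(8) can be sketched as a single LSTM cell step in NumPy. This is a minimal illustration, not the paper's code; the weight shapes and the parameter dictionary layout are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, x_t, c_prev, p):
    """One LSTM cell step following equations (3)-(8).

    p holds weights w_f, w_u, w_c, w_o, each shaped (hidden, hidden + input),
    and biases b_f, b_u, b_c, b_o, each shaped (hidden,).
    """
    za = np.concatenate([a_prev, x_t])            # [a<t-1>, x<t>]
    gamma_f = sigmoid(p["w_f"] @ za + p["b_f"])   # (3) forget gate
    gamma_u = sigmoid(p["w_u"] @ za + p["b_u"])   # (4) update (input) gate
    c_tilde = np.tanh(p["w_c"] @ za + p["b_c"])   # (5) candidate memory
    c_t = gamma_u * c_tilde + gamma_f * c_prev    # (6) new cell state
    gamma_o = sigmoid(p["w_o"] @ za + p["b_o"])   # (7) output gate
    a_t = gamma_o * np.tanh(c_t)                  # (8) new hidden state
    return a_t, c_t

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
p = {k: rng.standard_normal((hidden, hidden + inputs))
     for k in ("w_f", "w_u", "w_c", "w_o")}
p.update({k: np.zeros(hidden) for k in ("b_f", "b_u", "b_c", "b_o")})

a, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.standard_normal((5, inputs)):        # run a short toy sequence
    a, c = lstm_step(a, x, c, p)
print(a.shape)  # (4,)
```

The cell state c carries long-term information forward, which is what lets the model capture the dependencies between consecutive actions described above.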

After the LSTM layer, the final output is a set of vectors containing time and action-sequence correlations, which are input into the fully connected layer for the fusion of global action features. The training process of a neural network model becomes complicated because the statistical distribution of each layer's input changes with the parameters of the previous layer. To keep the distribution of the output data from changing too much, a lower learning rate would be needed, which reduces training speed. To solve this issue, this paper introduces BN to standardize the values of each layer in the LSTM (the output of the neurons at the last moment and the input at the current moment), so that their mean and variance do not change with the distribution of the underlying parameters, effectively decoupling the parameters of each layer from the other layers. In this way, gradient vanishing or explosion can be prevented and training can be accelerated. The BN procedure is shown in Algorithm 1.

In Algorithm 1, μχ and ς²χ are the mean and variance of xi computed over the minibatch. The mean and variance are used to normalize xi so that the samples follow a standard normal distribution. However, this normal distribution cannot by itself reflect the feature distribution of the training samples, so it is necessary to introduce the scaling factor γ and the shift factor β. As training progresses, γ and β are also learned by backpropagation to improve accuracy.

After the BN operation, the features are more salient, so they are input to the Softmax layer, which extracts the action features and classifies them in time series. In this model, the output layer uses the Softmax normalized exponential function to calculate the posterior probabilities of the different actions and thus realize classification. It maps the output values of the neurons into (0, 1), which can be regarded as the prediction probabilities of the actions; the largest one gives the classification result. The Softmax output layer then outputs a category vector such as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], indicating that the classification result is the action numbered 5.

Table 2: The convolution and pooling layers of the CNN architecture.

Layers    Conv1d_1     Conv1d_2     Conv1d_3
Size      1 × 2 × 8    1 × 2 × 18   1 × 2 × 36
Stride    1 × 1 × 1    1 × 1 × 1    1 × 1 × 1
Channel   18           36           72

Layers    Pooling_1    Pooling_2    Pooling_3
Size      1 × 2 × 18   1 × 2 × 36   1 × 2 × 72
Stride    1 × 1 × 1    1 × 1 × 1    1 × 1 × 1
Channel   18           36           72
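The Softmax-to-one-hot step described above can be sketched as follows; the logit values are made up for illustration:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical output-layer activations for the 12 action classes
logits = np.array([0.1, 0.3, 0.2, 0.1, 2.5, 0.0,
                   0.1, 0.2, 0.1, 0.3, 0.2, 0.1])
probs = softmax(logits)                      # posterior probabilities in (0, 1)
one_hot = np.eye(12, dtype=int)[probs.argmax()]
print(one_hot)  # [0 0 0 0 1 0 0 0 0 0 0 0] -> action numbered 5
```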

3.4. Model Implementation and Training. The neural network described here is implemented in TensorFlow [29], a lightweight library for building and training neural networks. Model training and classification run on a conventional computer with a 2.4 GHz CPU and 16 GB of memory.

The model is trained in a fully supervised manner, backpropagating the gradient from the Softmax layer to the convolution layers. Network parameters are optimized with minibatch gradient descent and the Adam optimizer by minimizing the cross-entropy loss function [13]. Adam is widely used owing to its simple implementation, efficient computation, and low memory demand, and it compares favorably with other stochastic optimization algorithms. In this paper, after the training data are input into the network, the Adam optimizer and the backpropagation algorithm are used to learn and optimize the network parameters. Meanwhile, the cross-entropy loss function is used to calculate the total error, as shown in the following equation:

C = −(1/N) Σx [y ln a + (1 − y) ln(1 − a)]  (9)

where y is the true tag and a is the predicted value.

To improve efficiency, the data are segmented into small batches during training and testing. With this configuration, the cumulative gradient of the parameters is calculated after each small batch. The weights are randomly and orthogonally initialized. As a form of regularization, we introduce a dropout operator on the input of each dense layer. This operator sets the activations of randomly selected units to zero during training. The dropout technique, proposed by Hinton et al. [30], randomly deletes some nodes of the network while keeping the input and output neurons intact, which is equivalent to training many different networks. Different networks may overfit in different ways, but averaging their results effectively reduces overfitting. In addition, dropout lets neurons learn stronger features because they cannot rely on other specific neurons. The number of parameters to be optimized in a deep neural network varies with the types of layers it contains and has a great impact on the time and computational resources required to train the network. The specific model training parameters reflect the best choices found in the experiments.
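Equation (9) can be sketched directly in NumPy; the label and prediction vectors below are made-up toy values:

```python
import numpy as np

def cross_entropy(y, a, eps=1e-12):
    """Equation (9): C = -(1/N) * sum[y*ln(a) + (1-y)*ln(1-a)]."""
    a = np.clip(a, eps, 1.0 - eps)   # guard against log(0)
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

y = np.array([1.0, 0.0, 1.0, 0.0])   # true tags
a = np.array([0.9, 0.1, 0.8, 0.2])   # predicted values
print(round(cross_entropy(y, a), 4))  # 0.1643
```

The loss shrinks toward zero as the predictions a approach the true tags y, which is what the Adam optimizer minimizes during training.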

4. Activity Recognition

4.1. Experiment Data. In addition to common basic actions, this paper also studies transition actions. In fact, only a few existing public data sets contain transition actions. Therefore, this paper adopts the international standard Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set [31, 32], abbreviated as the HAPT Data Set, to conduct the experiments. The data set is an updated version of the UCI Human Activity Recognition Using Smartphones Data Set [8]. It provides raw data from smartphone sensors rather than preprocessed data, and the action categories have been expanded to include transition actions. The HAPT data set contains twelve types of actions. Firstly, it has six basic actions: three static postures (standing, sitting, and lying) and three walking activities (walking, going downstairs, and going upstairs). Secondly, it has the six possible transitions between any two static postures: standing to sitting, sitting to standing, standing to lying, lying to sitting, sitting to lying, and lying to standing.

The HAPT data collection process is shown in Figure 4. The experiment involved 30 volunteers aged 19 to 48, each wearing a smartphone on the waist. Data collection is carried out with the built-in acceleration sensor and gyroscope at a sampling frequency of 50 Hz. Meanwhile, video recordings of the experimental process are made for the convenience of subsequent data labeling.

The collected data are saved as txt files, with the acceleration and gyroscope data stored independently in 60 groups each. Table 1 shows the label information corresponding to the raw experimental data: the first column is the experiment ID, the second column is the experimenter number, the third column is the action label, and the fourth and fifth columns are the start and end row indices of the corresponding sensor data. The labels range from 1 to 12, representing the 12 types of actions. It can be seen that the collected data contain invalid data: the first 250 records are unlabeled and therefore invalid.
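The five-column label layout described above can be parsed as in the sketch below. The excerpt and its values are hypothetical; only the column order (experiment ID, experimenter number, action label 1-12, start row, end row) follows the description:

```python
from io import StringIO

# Hypothetical excerpt in the five-column label layout described above
labels_txt = StringIO("""\
1 1 5 250 1232
1 1 7 1233 1392
1 1 4 1393 2194
""")

records = []
for line in labels_txt:
    exp_id, user_id, activity, start, end = map(int, line.split())
    records.append((exp_id, user_id, activity, start, end))

# Rows before the first labelled range (here rows 0-249) are unlabelled
# and would be discarded as invalid data.
print(records[1])  # (1, 1, 7, 1233, 1392)
```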

Input: data set χ = {x1, ..., xn}
Output: {yi = BNγ,β(xi)}
(1) Calculate the mean of the minibatch: μχ ← (1/n) Σ(i=1..n) xi
(2) Calculate the variance of the minibatch: ς²χ ← (1/n) Σ(i=1..n) (xi − μχ)²
(3) Normalize the data: x̂i ← (xi − μχ) / √(ς²χ + ε)
(4) Scale and shift: yi ← γx̂i + β ≡ BNγ,β(xi)
(5) Return the learned parameters γ and β

Algorithm 1: Algorithm of batch normalization.
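The five steps of Algorithm 1 can be sketched in NumPy as follows; here γ and β are fixed for illustration, whereas in training they are learned by backpropagation:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Algorithm 1: batch normalization over a minibatch x."""
    mu = x.mean()                           # step (1): minibatch mean
    var = x.var()                           # step (2): minibatch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # step (3): normalize
    return gamma * x_hat + beta             # step (4): scale and shift

x = np.array([1.0, 2.0, 3.0, 4.0])
y = batch_norm(x, gamma=1.0, beta=0.0)
print(y.mean(), y.std())  # mean ~ 0, standard deviation ~ 1
```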


After preliminary processing of the raw data, all unlabeled data were deleted, leaving 815,614 valid records. Because transition actions have low frequency and short duration while basic actions have high frequency and long duration, there is a considerable difference in data volume between them: the six transition actions account for only about 8% of the total data, far less than the basic actions. Table 3 lists the amount of data for each action. The data are divided into three parts: a training set used for model training, a validation set used to tune parameters, and a test set used to measure the quality of the final model.
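Before being fed to the network, the labelled stream is cut into fixed-length windows. The sketch below is illustrative: the window length 150 matches the input vector size in Table 7, while the 50% overlap and majority-vote labelling are assumptions, not details stated in the paper:

```python
import numpy as np

def segment(signal, labels, window=150, stride=75):
    """Split a labelled sensor stream into fixed-length windows."""
    xs, ys = [], []
    for start in range(0, len(signal) - window + 1, stride):
        xs.append(signal[start:start + window])
        # label each window by majority vote over its samples
        ys.append(np.bincount(labels[start:start + window]).argmax())
    return np.array(xs), np.array(ys)

rng = np.random.default_rng(1)
stream = rng.standard_normal((600, 8))      # 8 channels, as in Table 7
stream_labels = np.repeat([5, 7, 4], 200)   # stand -> stand-to-sit -> sit
X, y = segment(stream, stream_labels)
print(X.shape, y.shape)  # (7, 150, 8) (7,)
```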

4.2. Parameters Setting. In a deep learning network, the model parameters greatly affect the recognition rate. Therefore, experimental analyses of the number of neurons in the LSTM layer, the learning rate, the BN operation, the batch size, and other parameters are conducted in the following sections.

4.2.1. Number of Neurons in LSTM Layer. To verify the influence of the number of neurons in the LSTM layer on the recognition results, the experiments shown in Figure 5 were carried out. The recognition rate is lowest when each LSTM layer contains only 8 neurons: with so few neurons, the network lacks the necessary learning and information-processing capacity. As the number of neurons increases, the recognition rate tends to increase, reaching 95.87% at 64 neurons. If the number of neurons is too large, the complexity of the network structure increases and learning slows down. Therefore, considering the training time of the network, the number of LSTM-layer neurons in this paper is set to 64.

4.2.2. The Learning Rates. Experiments are carried out at different learning rates. As shown in Table 4, the recognition rate of the model reaches a maximum of 95.87% when the learning rate is 0.002. Therefore, a learning rate of 0.002 is adopted.

4.2.3. BN Operation. To verify the improvement the BN operation brings to the network model, a comparative experiment is carried out with and without the BN layer. The epoch count is set to 400 and the other parameters remain unchanged. The recognition rates of both configurations on the test set are shown in Table 5: adding the BN layer improves the recognition rate on the test set by about 4.24%.

4.2.4. Batch Size. Batch size refers to the batch sample size, whose maximum value is the total number of samples in the

Figure 4: Data collection of the physical activities.

Figure 5: Accuracy of different numbers of neurons on test sets (test accuracy vs. number of neurons, 8 to 64).

Table 3: The data amount of various activities in the HAPT.

Type           ID    Number
Walk           A1    122091
Upstairs       A2    116707
Downstairs     A3    107961
Sit down       A4    126677
Stand          A5    138105
Lie            A6    136865
Stand to sit   A7    10316
Sit to stand   A8    8029
Sit to lie     A9    12428
Lie to sit     A10   11150
Stand to lie   A11   14418
Lie to stand   A12   10867


training set. When the amount of data is small, the batch can be the whole data set, so that the extremum direction is approached more accurately. However, in practical applications the amount of data used in deep learning is relatively large, and the principle of small-batch processing is generally adopted: small batches require relatively little memory and train faster. Within an appropriate range, increasing the batch size determines the direction of gradient descent more accurately and causes less training oscillation. However, once the batch size increases past a certain value, the determined descent direction no longer changes, while parameter updates slow down significantly. The identification results for different batch sizes are shown in Table 6: the maximum recognition rate, 95.87%, is reached when the batch size is 150. Therefore, 150 is selected as the best batch size in this paper.

The parameters of the CNN-LSTM model proposed in this paper are shown in Table 7.
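Combining Tables 2 and 7, the model can be approximated in Keras as below. This is a hedged reconstruction, not the authors' code: the exact layer shapes and the placement of the BN layer relative to the LSTM are partly ambiguous in the paper, so treat every choice here as an assumption:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative sketch of the CNN-LSTM from Tables 2 and 7 (assumed details:
# "same" padding, BN after the LSTM rather than inside it).
model = tf.keras.Sequential([
    layers.Input(shape=(150, 8)),               # 150-sample window, 8 channels
    layers.Conv1D(18, 3, activation="relu", padding="same"),
    layers.MaxPooling1D(2),
    layers.Conv1D(36, 3, activation="relu", padding="same"),
    layers.MaxPooling1D(2),
    layers.Conv1D(72, 3, activation="relu", padding="same"),
    layers.MaxPooling1D(2),
    layers.LSTM(64),                            # one LSTM layer, 64 neurons
    layers.BatchNormalization(),
    layers.Dropout(0.5),
    layers.Dense(12, activation="softmax"),     # 12 activity classes
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.002),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

Training would then call `model.fit` with batch size 150 for 400 epochs, per Table 7.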

5. Experimental Results and Analysis

For human movement recognition, Wang and Liu [33] used the F-measure to verify the performance of a hierarchical deep LSTM network model in human activity recognition, and Lu et al. [34] demonstrated the superiority of their model in behavior recognition using accuracy, precision, and recall. Therefore, to evaluate the performance of the motion recognition method proposed in this paper, we likewise use accuracy, recall, loss rate, and F-measure in the experiments.
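These per-class metrics are derived from the confusion matrix. A minimal sketch, using only the first three rows and columns of Table 8 for brevity (note that which axis counts as "actual" depends on the matrix orientation, so the two rates swap if the matrix is transposed):

```python
import numpy as np

def per_class_metrics(cm, k):
    """Precision, recall, and F-measure for class k of a confusion
    matrix cm (rows = actual, columns = predicted)."""
    tp = cm[k, k]
    precision = tp / cm[:, k].sum()   # column-wise rate
    recall = tp / cm[k, :].sum()      # row-wise rate
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# First three rows/columns of Table 8 (A1-A3), for illustration only
cm = np.array([[410,   1,   3],
               [  5, 388,   3],
               [  1,   3, 346]])
p, r, f = per_class_metrics(cm, 0)
print(f"A1: precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

With the full 12 × 12 matrix of Table 8, the A1 row-wise rate is 410/414 ≈ 99.03%, matching Table 9.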

With the above parameters, the recognition confusion matrix of the 12 different actions is shown in Table 8, and the accuracy curve of the CNN-LSTM model is shown in Figure 6. As Table 9 shows, the overall recognition rate of CNN-LSTM is high, and the model recognizes the transition actions particularly well.

6. Case Study

Among non-deep-learning methods, the random forest (RF) and K-nearest neighbor (KNN) classifiers perform well in action classification. Therefore, the proposed CNN-LSTM model is compared with the RF and KNN methods. First, the HAPT data set is input into RF and KNN; then we segment the original

Table 4: Accuracy of different learning rates on test sets.

Learning rate   Recognition rate (%)
0.001           93.57
0.0015          94.21
0.002           95.87
0.0025          92.39
0.003           93.34
0.0035          92.12
0.004           92.84
0.0045          92.01

Table 5: Accuracy and loss rate on test sets with or without BN layer.

                   Recognition rate (%)
Without BN layer   91.63
With BN layer      95.87

Table 6: Accuracy of different batch sizes on test sets.

Batch size   Recognition rate (%)
25           91.74
50           92.88
75           92.92
100          93.10
125          94.33
150          95.87
175          93.45
200          93.37
225          93.72
250          93.45
275          92.84
300          93.35
325          94.06
350          93.34
375          92.96
400          93.53

Table 7: Experimental parameters of the CNN-LSTM model.

Parameter                 Value
Input vector size         150
Input channel number      8
Convolution kernel size   3
Pool size                 2
Activation function       ReLU
LSTM layers               1
Number of neurons         64
Dropout                   0.5
Learning rate             0.002
Batch size                150
Epoch                     400

Table 8: Confusion matrix of various actions (rows: actual; columns: predicted).

       A1   A2   A3   A4   A5   A6   A7   A8   A9   A10  A11  A12
A1     410  1    3    0    0    0    0    0    0    0    0    0
A2     5    388  3    0    0    0    0    0    0    0    0    0
A3     1    3    346  0    0    0    0    0    0    0    0    0
A4     1    0    0    383  32   3    1    0    1    1    0    0
A5     0    0    1    31   431  0    0    0    0    0    0    0
A6     0    0    0    1    0    457  0    0    0    0    0    0
A7     0    0    0    1    0    0    17   0    0    0    0    0
A8     0    0    0    0    0    0    0    4    0    0    1    0
A9     0    0    0    0    0    0    0    0    19   1    4    1
A10    0    0    0    0    0    0    0    0    1    14   0    2
A11    0    1    0    1    0    0    1    0    2    1    32   1
A12    0    0    0    0    0    0    0    0    0    1    1    16


sensor data and calculate the mean, variance, covariance, and 15 features, and finally classify the basic and transition actions according to the clustering results. The classification results are shown in Table 10: the recognition rate of the CNN-LSTM model is higher than that of the RF and KNN methods for both basic actions and transition actions.

In addition to the comparison with the RF and KNN classifiers, the proposed model is also compared with a single CNN, a single LSTM, CNN-GRU, and CNN-BLSTM deep

Table 9: The recognition accuracy, recall rate, and F value of various actions.

ID    Accuracy (%)   Recall (%)   F-measure (%)
A1    99.03          98.32        98.68
A2    97.78          98.73        98.35
A3    98.86          98.02        98.44
A4    90.76          91.85        91.30
A5    93.09          93.09        93.09
A6    99.78          99.56        99.56
A7    94.44          89.47        91.89
A8    100            100          100
A9    76.00          82.61        79.17
A10   82.35          77.78        80.00
A11   82.05          86.49        84.21
A12   88.89          80.00        84.21

Table 10: Average accuracy of various actions in CNN-LSTM, RF, and KNN models.

ID    RF (%)   KNN (%)   CNN-LSTM (%)
A1    99.90    88.10     99.03
A2    92.50    97.80     97.78
A3    90.20    99.40     98.86
A4    91.90    83.80     90.76
A5    90.80    87.50     93.09
A6    97.10    100       99.78
A7    71.30    66.70     94.44
A8    72.00    68.00     100
A9    51.30    38.60     76.00
A10   74.90    36.30     82.35
A11   59.20    33.70     82.05
A12   61.10    57.90     88.89

Figure 6: Accuracy curve of the CNN-LSTM model (training and validation accuracy over 400 iterations).


learning models. Table 11 shows the average accuracy of various actions for the five different deep learning models. As can be seen from Table 11, CNN-LSTM not only recognizes basic movements slightly better than the other models but also recognizes transition movements significantly better, especially standing to sitting, sitting to lying, and standing to lying. Table 12 shows the recognition rates of the different models on the test set: the average recognition rate of all five models exceeds 90%, but the CNN-LSTM model performs slightly better than CNN, LSTM, CNN-GRU, and CNN-BLSTM.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods on the same data set. Kuang [35] applied BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. We compared their performance with our approach, with the results shown in Table 13: the proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored a recognition method based on deep learning and designed a behavior recognition model based on CNN-LSTM. The CNN learns local features from the original sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine-grained description of basic and transition movements, and accurate identification of the two motion patterns.

The actions identified in this paper include only common basic actions and individual transition actions. In future work, more kinds of actions can be collected and more complex actions, such as eating and driving, can be added, and individual-level recognition can be realized by considering the behavior differences between users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that combining deep and shallow models can achieve better performance: a deep learning model has strong learning ability, while a shallow model has higher learning efficiency, and their collaboration can achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the support of the laboratory, university, and government. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

Table 11: Average accuracy of different activities with five deep learning models.

ID    CNN (%)   LSTM (%)   CNN-BLSTM (%)   CNN-GRU (%)   CNN-LSTM (%)
A1    97.50     97.70      97.41           99.75         99.03
A2    97.25     97.10      95.65           98.99         97.78
A3    95.60     97.15      100             96.57         98.86
A4    91.26     90.26      91.96           81.99         90.76
A5    90.80     90.80      84.74           92.48         93.09
A6    99.67     98.58      100             99.78         99.78
A7    76.47     64.86      44.44           77.78         94.44
A8    100       66.67      66.67           50.00         100
A9    63.83     69.39      62.07           48.00         76.00
A10   84.85     70.27      80.00           52.94         82.35
A11   72.50     69.33      65.00           71.79         82.05
A12   83.30     70.27      70.59           55.56         88.89

Table 12: Average accuracy of the five models in this paper.

Method      Average recognition rate (%)
CNN         94.29
LSTM        93.22
CNN-BLSTM   92.73
CNN-GRU     93.34
CNN-LSTM    95.87

Table 13: Average accuracy of different methods on the test set in [35, 36].

Method       Average recognition rate (%)
BLSTM [35]   87.5
DBN [36]     89.6
CNN-LSTM     95.8

References

[1] I. H. Lopez-Nava and A. Munoz-Melendez, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordonez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors (Switzerland), vol. 16, p. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-d pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science, Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.


Page 6: Wearable Sensor-Based Human Activity Recognition Using ...downloads.hindawi.com/journals/scn/2020/2132138.pdf · Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep

realize classification. It maps the output values of the neurons into (0, 1), so that they can be regarded as the predicted probabilities of the actions, and the largest one gives the classification result. The Softmax output layer then outputs a category vector such as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], indicating that the classification result is the action numbered 5.
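The mapping described above can be sketched in NumPy; the logit values below are illustrative, not taken from the trained model:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result sums to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical logits for the 12 action classes (A1-A12).
logits = np.array([0.1, 0.3, -1.2, 0.5, 4.2, 0.0, -0.7, 0.2, 0.9, -0.3, 0.1, 0.6])
probs = softmax(logits)

# Every output lies in (0, 1) and the probabilities sum to 1.
assert np.all((probs > 0) & (probs < 1))
assert np.isclose(probs.sum(), 1.0)

# The predicted class is the one with the largest probability, reported
# as a one-hot category vector; here index 4 has the largest logit,
# i.e., the action numbered 5.
one_hot = np.eye(12, dtype=int)[np.argmax(probs)]
print(one_hot.tolist())
```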

3.4. Model Implementation and Training. The neural network described here is implemented in TensorFlow [29], a lightweight library for building and training neural networks. Model training and classification run on a conventional computer with a 2.4 GHz CPU and 16 GB of memory.

The model is trained in a fully supervised manner, backpropagating the gradient from the Softmax layer to the convolution layers. Network parameters are optimized with the minibatch gradient descent method and the Adam optimizer by minimizing the cross-entropy loss function [13]. Adam is widely used because it is simple to implement, computationally efficient, and has a low memory demand, which gives it clear advantages over other stochastic optimization algorithms. In this paper, after the training data are input into the network, the Adam optimizer and the backpropagation algorithm are used to learn and optimize the network parameters. Meanwhile, the cross-entropy loss function is used to calculate the total error, as shown in the following equation:

C = -(1/N) Σx [y ln a + (1 - y) ln(1 - a)],  (9)

where y is the true tag and a is the predicted value.

To improve efficiency, the data are segmented into small batches during training and testing. With these configurations, the cumulative gradient of the parameters is calculated after each small batch. The weights are randomly and orthogonally initialized. As a form of regularization, we introduce a dropout operator on the input of each dense layer. This operator sets the activation of a randomly selected unit to zero during training. Dropout, proposed by Hinton et al. [30], is based on the principle of randomly deleting some nodes of the network while keeping the input and output neurons intact, which is equivalent to training many different networks. Different networks may overfit in different ways, but averaging their results can effectively reduce overfitting. In addition, dropout forces neurons to learn stronger features, because they cannot rely on other specific neurons. The number of parameters to be optimized in a deep neural network varies with the types of layers it contains and has a great impact on the time and computing resources required to train the network. The specific model training parameters reflect the best choices found in the experiments.
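Equation (9) and the dropout operator can be sketched as follows; the tag and prediction vectors are illustrative:

```python
import numpy as np

# Cross-entropy loss of equation (9) over a mini-batch:
# C = -(1/N) * sum_x [ y*ln(a) + (1-y)*ln(1-a) ]
def cross_entropy(y, a, eps=1e-12):
    a = np.clip(a, eps, 1 - eps)          # guard against ln(0)
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

y = np.array([1.0, 0.0, 1.0, 0.0])        # true tags
a = np.array([0.9, 0.1, 0.8, 0.2])        # predicted values
loss = cross_entropy(y, a)                # about 0.1643 for these values

# Inverted dropout as used on the dense layers: a randomly selected
# subset of activations is zeroed during training, and the survivors
# are rescaled so the expected activation is unchanged.
def dropout(x, rate, rng):
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
h = np.ones(10)
h_drop = dropout(h, rate=0.5, rng=rng)    # entries are either 0.0 or 2.0
```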

4. Activity Recognition

4.1. Experiment Data. In addition to common basic actions, this paper also studies transition actions. Few existing public data sets contain transition actions; therefore, this paper adopts the international standard Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set [31, 32], abbreviated as the HAPT data set, for the experiments. This data set is an updated version of the UCI Human Activity Recognition Using Smartphones data set [8]. It provides raw data from smartphone sensors rather than preprocessed data, and the action categories have been expanded to include transition actions. The HAPT data set contains twelve types of actions: six basic actions, comprising three static postures (standing, sitting, and lying) and three walking activities (walking, going downstairs, and going upstairs), and the six possible transitions between any two static postures (standing to sitting, sitting to standing, standing to lying, lying to sitting, sitting to lying, and lying to standing).

The HAPT data collection process is shown in Figure 4. The experiment involved 30 volunteers aged 19 to 48, each wearing a smartphone on the waist. Data were collected with the built-in acceleration sensor and gyroscope at a sampling frequency of 50 Hz. Meanwhile, video records of the experimental process were made for the convenience of subsequent data labeling.

The collected data are saved as txt files, with the acceleration and gyroscope data stored independently in 60 groups each. Table 1 shows the label information corresponding to the raw experimental data: the first column is the experiment ID, the second column is the experimenter number, the third column is the action label, and the fourth and fifth columns are the start and end row indices of the corresponding sensor data. The labels range from 1 to 12, representing the 12 types of actions. It can be seen that the collected data contain invalid segments; for example, the first 250 records are unlabeled and therefore invalid.

Input: data set X = {x1, ..., xn}
Output: {yi = BN_γ,β(xi)}
(1) Calculate the mean of the batch: μX ← (1/n) Σ(i=1..n) xi
(2) Calculate the variance of the batch: σ²X ← (1/n) Σ(i=1..n) (xi − μX)²
(3) Normalize the data: x̂i ← (xi − μX) / sqrt(σ²X + ε)
(4) Scale and shift: yi ← γ x̂i + β = BN_γ,β(xi)
(5) Return the learned parameters γ and β

Algorithm 1: Algorithm of batch normalization.
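Algorithm 1 can be sketched in NumPy as follows; the batch and channel sizes are illustrative:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization following Algorithm 1 (per feature, over the batch)."""
    mu = x.mean(axis=0)                     # (1) batch mean
    var = x.var(axis=0)                     # (2) batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # (3) normalize
    return gamma * x_hat + beta             # (4) scale and shift

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=3.0, size=(150, 8))  # a batch of 150 samples, 8 channels
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))

# With gamma = 1 and beta = 0, each channel of the output has
# (approximately) zero mean and unit variance.
assert np.allclose(y.mean(axis=0), 0.0, atol=1e-7)
assert np.allclose(y.var(axis=0), 1.0, atol=1e-3)
```

In a full implementation, γ and β are learned during training, so the network can recover the original activation scale if that is optimal.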

6 Security and Communication Networks

After preliminary processing of the raw data, all unlabeled records were deleted, leaving 815,614 valid records. Because transition actions occur at low frequency and for short durations while basic actions occur at high frequency and for long durations, there is a considerable difference in data volume between them: the six transition actions account for only about 8% of the total data, far less than the basic actions. Table 3 lists the amount of data for each action. The data are divided into three parts: a training set used for model training, a validation set used to tune parameters, and a test set used to measure the quality of the final model.
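The segmentation and split described above can be sketched as follows. The 150-sample window (3 s at 50 Hz) matches the input vector size reported later in Table 7, but the 50% window overlap and the 70/15/15 split ratios are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def sliding_windows(data, labels, width=150, step=75):
    """Segment a labelled sensor stream into fixed-width windows.
    width=150 samples = 3 s at 50 Hz; 50% overlap is an assumption."""
    xs, ys = [], []
    for start in range(0, len(data) - width + 1, step):
        window_labels = labels[start:start + width]
        # Keep only windows that lie entirely within one labelled action.
        if np.all(window_labels == window_labels[0]):
            xs.append(data[start:start + width])
            ys.append(window_labels[0])
    return np.stack(xs), np.array(ys)

rng = np.random.default_rng(0)
stream = rng.normal(size=(1500, 8))       # 30 s of 8-channel data at 50 Hz
labels = np.repeat([1, 7, 5], 500)        # e.g., walk -> stand-to-sit -> stand
X, y = sliding_windows(stream, labels)    # windows crossing a label change are dropped

# 70/15/15 train/validation/test split (ratios are an assumption).
n = len(X)
train, val, test = np.split(np.arange(n), [int(0.7 * n), int(0.85 * n)])
```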

4.2. Parameter Settings. In a deep learning network, the model parameters greatly affect the recognition rate. Therefore, the number of neurons in the LSTM layer, the learning rate, the BN operation, the batch size, and other parameters are analyzed experimentally in the following sections.

4.2.1. Number of Neurons in the LSTM Layer. To assess the influence of the number of neurons in the LSTM layer on the recognition results, the experiments shown in Figure 5 were carried out. The recognition rate is lowest when each LSTM layer contains only 8 neurons: with too few neurons, the network lacks the necessary learning and information-processing capacity. As the number of neurons increases, the recognition rate tends to increase, reaching 95.87% at 64 neurons. If the number of neurons grows too large, the complexity of the network structure increases and learning slows down. Therefore, considering the training time of the network, the number of LSTM-layer neurons is set to 64.

4.2.2. The Learning Rate. Experiments were carried out at different learning rates. As shown in Table 4, the recognition rate of the model reaches a maximum of 95.87% when the learning rate is 0.002; therefore, a learning rate of 0.002 is adopted.

4.2.3. BN Operation. To verify the improvement brought by the BN operation, a comparative experiment was carried out with and without the BN layer, with the epoch set to 400 and the other parameters unchanged. The recognition rates of both configurations on the test set are shown in Table 5: adding the BN layer improves the test-set recognition rate by about 4.24 percentage points.

4.2.4. Batch Size. The batch size is the number of samples processed per batch; its maximum value is the total number of samples in the

Figure 4: Data collection of the physical activities.

Figure 5: Accuracy for different numbers of neurons (8 to 64) on the test set (vertical axis: accuracy, 0.920 to 0.955).

Table 3: The data amount of the various activities in the HAPT data set.

Type          ID    Number
Walk          A1    122091
Upstairs      A2    116707
Downstairs    A3    107961
Sit           A4    126677
Stand         A5    138105
Lie           A6    136865
Stand to sit  A7    10316
Sit to stand  A8    8029
Sit to lie    A9    12428
Lie to sit    A10   11150
Stand to lie  A11   14418
Lie to stand  A12   10867


training set. When the amount of data is small, the batch can be the whole data set, which approaches the direction of the extremum more accurately. In practice, however, the amount of data used in deep learning is relatively large, and small batches are generally adopted: they require relatively little memory and train faster. Within an appropriate range, increasing the batch size determines the direction of gradient descent more accurately and causes less training oscillation; however, once the batch size grows beyond a certain value, the descent direction no longer changes while parameter updates slow down significantly. The recognition results for different batch sizes are shown in Table 6. The maximum recognition rate, 95.87%, is reached when the batch size is 150; therefore, 150 is selected as the batch size in this paper.
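Mini-batch iteration with the selected batch size can be sketched as follows (the data here are placeholders; one gradient step is taken per yielded batch):

```python
import numpy as np

def minibatches(X, y, batch_size=150, rng=None):
    """Yield shuffled (X, y) mini-batches; the smaller remainder batch is kept."""
    idx = np.arange(len(X))
    if rng is not None:
        rng.shuffle(idx)
    for start in range(0, len(idx), batch_size):
        sel = idx[start:start + batch_size]
        yield X[sel], y[sel]

# 1000 hypothetical windows of shape (150, 8); batch size 150 follows Table 6.
X = np.zeros((1000, 150, 8))
y = np.zeros(1000, dtype=int)
sizes = [len(xb) for xb, yb in minibatches(X, y, batch_size=150,
                                           rng=np.random.default_rng(0))]
# 6 full batches of 150 plus one remainder batch of 100.
```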

The parameters of the proposed CNN-LSTM model are shown in Table 7.

5. Experimental Results and Analysis

For human movement recognition, Wang and Liu [33] used the F-measure to verify the performance of a hierarchical deep LSTM network model in human activity recognition, and Lu et al. [34] demonstrated the superiority of their model in behavior recognition using accuracy, precision, and recall. Therefore, to evaluate the performance of the motion recognition method proposed in this paper, we likewise use accuracy, recall, loss, and F-measure in the experiments.

With the above parameters, the recognition confusion matrix for the 12 actions is shown in Table 8, and the accuracy curve of the CNN-LSTM model is shown in Figure 6. Table 9 shows that the overall recognition rate of CNN-LSTM is high and that the model recognizes transition actions well.

6. Case Study

Among non-deep-learning methods, the random forest (RF) and K-nearest neighbor (KNN) classifiers perform well in action classification. Therefore, the proposed CNN-LSTM model is compared with the RF and KNN methods. First, the HAPT data set is input to RF and KNN. Then, segment the original

Table 4: Accuracy of different learning rates on the test set.

Learning rate    Recognition rate (%)
0.001            93.57
0.0015           94.21
0.002            95.87
0.0025           92.39
0.003            93.34
0.0035           92.12
0.004            92.84
0.0045           92.01

Table 5: Accuracy on the test set with and without the BN layer.

Configuration      Recognition rate (%)
Without BN layer   91.63
With BN layer      95.87

Table 6: Accuracy of different batch sizes on the test set.

Batch size    Recognition rate (%)
25            91.74
50            92.88
75            92.92
100           93.10
125           94.33
150           95.87
175           93.45
200           93.37
225           93.72
250           93.45
275           92.84
300           93.35
325           94.06
350           93.34
375           92.96
400           93.53

Table 7: Experimental parameters of the CNN-LSTM model.

Parameter                Value
Input vector size        150
Input channel number     8
Convolution kernel size  3
Pool size                2
Activation function      ReLU
LSTM layers              1
Number of neurons        64
Dropout                  0.5
Learning rate            0.002
Batch size               150
Epoch                    400
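Assuming "valid" convolution padding and non-overlapping pooling (the paper states neither the padding mode nor the filter count), the Table 7 settings give the following layer output lengths:

```python
# Forward-pass shape walk-through for the Table 7 hyperparameters.
window, channels = 150, 8      # input vector size and input channel number
kernel, pool = 3, 2            # convolution kernel size and pool size
lstm_units, classes = 64, 12   # LSTM neurons and action classes

conv_steps = window - kernel + 1   # 1-D "valid" convolution output length
pool_steps = conv_steps // pool    # non-overlapping max-pooling output length

assert conv_steps == 148
assert pool_steps == 74
# The pooled sequence of 74 feature vectors is consumed by the 64-unit
# LSTM, whose final state is mapped by the Softmax layer onto 12 classes.
```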

Table 8: Confusion matrix of the various actions (rows: actual; columns: predicted).

       A1   A2   A3   A4   A5   A6   A7  A8  A9  A10  A11  A12
A1     410  1    3    0    0    0    0   0   0   0    0    0
A2     5    388  3    0    0    0    0   0   0   0    0    0
A3     1    3    346  0    0    0    0   0   0   0    0    0
A4     1    0    0    383  32   3    1   0   1   1    0    0
A5     0    0    1    31   431  0    0   0   0   0    0    0
A6     0    0    0    1    0    457  0   0   0   0    0    0
A7     0    0    0    1    0    0    17  0   0   0    0    0
A8     0    0    0    0    0    0    0   4   0   0    1    0
A9     0    0    0    0    0    0    0   0   19  1    4    1
A10    0    0    0    0    0    0    0   0   1   14   0    2
A11    0    1    0    1    0    0    1   0   2   1    32   1
A12    0    0    0    0    0    0    0   0   0   1    1    16
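The Table 9 metrics can be recomputed from this confusion matrix. As extracted, a few rows may carry transcription noise (e.g., the A8 row total disagrees with Table 9's perfect scores), so only the A1 figures are checked here; for A1, the row- and column-normalized diagonal reproduce the reported 99.03%, 98.32%, and 98.68%:

```python
import numpy as np

# Confusion matrix of Table 8 (rows: actual action, columns: predicted action).
cm = np.array([
    [410,   1,   3,   0,   0,   0,  0, 0,  0,  0,  0,  0],
    [  5, 388,   3,   0,   0,   0,  0, 0,  0,  0,  0,  0],
    [  1,   3, 346,   0,   0,   0,  0, 0,  0,  0,  0,  0],
    [  1,   0,   0, 383,  32,   3,  1, 0,  1,  1,  0,  0],
    [  0,   0,   1,  31, 431,   0,  0, 0,  0,  0,  0,  0],
    [  0,   0,   0,   1,   0, 457,  0, 0,  0,  0,  0,  0],
    [  0,   0,   0,   1,   0,   0, 17, 0,  0,  0,  0,  0],
    [  0,   0,   0,   0,   0,   0,  0, 4,  0,  0,  1,  0],
    [  0,   0,   0,   0,   0,   0,  0, 0, 19,  1,  4,  1],
    [  0,   0,   0,   0,   0,   0,  0, 0,  1, 14,  0,  2],
    [  0,   1,   0,   1,   0,   0,  1, 0,  2,  1, 32,  1],
    [  0,   0,   0,   0,   0,   0,  0, 0,  0,  1,  1, 16],
], dtype=float)

tp = np.diag(cm)
row_rate = tp / cm.sum(axis=1)   # reported as "Accuracy" in Table 9
col_rate = tp / cm.sum(axis=0)   # reported as "Recall" in Table 9
f_measure = 2 * row_rate * col_rate / (row_rate + col_rate)

# A1: 410/414 = 99.03%, 410/417 = 98.32%, harmonic mean = 98.68%.
assert round(100 * row_rate[0], 2) == 99.03
assert round(100 * col_rate[0], 2) == 98.32
assert round(100 * f_measure[0], 2) == 98.68
```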

8 Security and Communication Networks

sensor data and calculate the mean, variance, covariance, and 15 features in total. Finally, the basic and transition actions are classified according to the clustering results. The classification results are shown in Table 10: the recognition rate of the CNN-LSTM model is higher than that of the RF and KNN methods for both basic actions and transition actions.
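The window statistics fed to the RF and KNN baselines can be sketched as follows; the exact 15-feature set is not specified in the text, so only the mean, variance, and covariance features it names are shown:

```python
import numpy as np

def window_features(window):
    """Hand-crafted statistics per window for the RF/KNN baselines (sketch)."""
    mean = window.mean(axis=0)                   # per-channel mean
    var = window.var(axis=0)                     # per-channel variance
    cov = np.cov(window, rowvar=False)           # channel covariance matrix
    upper = cov[np.triu_indices_from(cov, k=1)]  # unique off-diagonal terms
    return np.concatenate([mean, var, upper])

rng = np.random.default_rng(0)
window = rng.normal(size=(150, 8))   # one 3-second, 8-channel window
features = window_features(window)
# 8 means + 8 variances + C(8, 2) = 28 covariances = 44 features.
assert features.shape == (44,)
```

These fixed-length feature vectors, rather than the raw sequences, are what a random forest or KNN classifier would consume.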

In addition to the comparison with the RF and KNN classifiers, our proposed model is also compared with a single CNN, a single LSTM, CNN-GRU, and CNN-BLSTM deep

Table 9: The recognition accuracy, recall, and F-measure of the various actions.

ID    Accuracy (%)   Recall (%)   F-measure (%)
A1    99.03          98.32        98.68
A2    97.78          98.73        98.35
A3    98.86          98.02        98.44
A4    90.76          91.85        91.30
A5    93.09          93.09        93.09
A6    99.78          99.56        99.56
A7    94.44          89.47        91.89
A8    100            100          100
A9    76.00          82.61        79.17
A10   82.35          77.78        80.00
A11   82.05          86.49        84.21
A12   88.89          80.00        84.21

Table 10: Average accuracy of the various actions in the CNN-LSTM, RF, and KNN models.

ID    RF (%)   KNN (%)   CNN-LSTM (%)
A1    99.90    88.10     99.03
A2    92.50    97.80     97.78
A3    90.20    99.40     98.86
A4    91.90    83.80     90.76
A5    90.80    87.50     93.09
A6    97.10    100       99.78
A7    71.30    66.70     94.44
A8    72.00    68.00     100
A9    51.30    38.60     76.00
A10   74.90    36.30     82.35
A11   59.20    33.70     82.05
A12   61.10    57.90     88.89

Figure 6: Accuracy curves of the CNN-LSTM model on the training and validation sets over 400 iterations (vertical axis: accuracy, 0.3 to 1.0).


learning models. Table 11 shows the average accuracy of the various actions under the five deep models. As can be seen from Table 11, CNN-LSTM not only recognizes basic movements slightly better than the other four models but also recognizes transition movements significantly better, especially standing to sitting, sitting to lying, and standing to lying. Table 12 shows the recognition rates of the different models on the test set. The average recognition rate of all five models is higher than 90%, but the CNN-LSTM model performs slightly better than CNN, LSTM, CNN-GRU, and CNN-BLSTM.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods on the same data set. Kuang [35] applied BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. We compared the performance with the approaches in [35, 36], with the results shown in Table 13. The proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored an activity recognition method based on deep learning and designed a behavior recognition model based on CNN-LSTM. The CNN learns local features from the original

sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine-grained description of basic and transition movements, and accurate identification of both motion patterns.

The actions identified in this paper include only common basic actions and individual transition actions. In future work, more kinds of actions can be collected and more complex actions, such as eating and driving, can be added, and individual-level recognition can be realized by considering the behavior differences between users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that combining deep and shallow models can achieve better performance: deep models have strong learning ability, while shallow models are more efficient to train, and their collaboration can achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the laboratory, university, and government for their support. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

Table 11: Average accuracy of the different activities with five deep learning models.

ID    CNN (%)   LSTM (%)   CNN-BLSTM (%)   CNN-GRU (%)   CNN-LSTM (%)
A1    97.50     97.70      97.41           99.75         99.03
A2    97.25     97.10      95.65           98.99         97.78
A3    95.60     97.15      100             96.57         98.86
A4    91.26     90.26      91.96           81.99         90.76
A5    90.80     90.80      84.74           92.48         93.09
A6    99.67     98.58      100             99.78         99.78
A7    76.47     64.86      44.44           77.78         94.44
A8    100       66.67      66.67           50.00         100
A9    63.83     69.39      62.07           48.00         76.00
A10   84.85     70.27      80.00           52.94         82.35
A11   72.50     69.33      65.00           71.79         82.05
A12   83.30     70.27      70.59           55.56         88.89

Table 12: Average accuracy of the five models.

Method       Average recognition rate (%)
CNN          94.29
LSTM         93.22
CNN-BLSTM    92.73
CNN-GRU      93.34
CNN-LSTM     95.87

Table 13: Average accuracy of different methods on the test set [35, 36].

Method        Average recognition rate (%)
BLSTM [35]    87.5
DBN [36]      89.6
CNN-LSTM      95.8


[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors (Switzerland), vol. 16, p. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-D pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.

[12] K. Cho, B. van Merriënboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pp. 1724–1734, Doha, Qatar, October 2014.

[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.

[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference, MM 2015, pp. 1307–1310, Brisbane, Australia, October 2015.

[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2015, pp. 1488–1492, Hong Kong, China, October 2016.

[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.

[17] A. Murad and J. Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors (Switzerland), vol. 17, p. 11, 2017.

[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing (TSC), vol. 12, p. 99, 2019.

[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.

[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.

[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using convolutional neural networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.

[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.

[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.

[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.

[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.

[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.

[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.

[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, vol. 3, pp. 2332–2340, Lille, France, July 2015.

[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.

[30] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, Improving Neural Networks by Preventing Co-adaptation of Feature Detectors, arXiv preprint, 2012.

[31] B. M. h. Abidine, L. Fergani, B. Fergani, and M. Oussalah, "The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition," Pattern Analysis and Applications, vol. 21, no. 1, pp. 119–138, 2018.

[32] G. M. Weiss, J. W. Lockhart, T. T. Pulickal et al., "A smartphone-based activity recognition system for improving health and well-being," in Proceedings of the 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016, pp. 682–688, Montreal, QC, Canada, October 2016.

[33] L. Wang and R. Liu, "Human activity recognition based on wearable sensor using hierarchical deep LSTM networks," Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 837–856, 2019.

[34] W. Lu, F. Fan, J. Chu, P. Jing, and S. Yuting, "Wearable computing for Internet of Things: a discriminant approach for human activity recognition," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2749–2759, 2019.

[35] X. Kuang, Human Behavior Recognition Based on Deep Learning and Wearable Sensor, Nanjing University of Information Engineering, Nanjing, China, 2018.

[36] M. M. Hassan, M. Z. Uddin, A. Mohamed, and A. Almogren, "A robust human activity recognition system using smartphone sensors and deep learning," Future Generation Computer Systems, vol. 81, pp. 307–313, 2018.

12 Security and Communication Networks

Page 7: Wearable Sensor-Based Human Activity Recognition Using ...downloads.hindawi.com/journals/scn/2020/2132138.pdf · Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep

After preliminary processing of the original data all thedata without labels were deleted Finally 815614 valid piecesof data were obtained Due to the low frequency and shortduration of transition action as well as the high frequencyand long duration of basic action there is a considerabledifference in data volume between transition action andbasic action e data volume of the six transition actions ismuch lower than that of the other basic actions accountingfor only about 8 of the total data Table 3 lists the amount ofdata for different actions e original data is divided intothree parts training set verification set and test set in whichthe training set is used for model training and verificationset is used to adjust parameters and test set is used tomeasure the quality of the final model

42 Parameters Setting In the deep learning network themodel parameters greatly affect its recognition rateerefore the experimental analysis of the number ofneurons learning rate BN Batch size and other parametersin LSTM layer would be conducted in the following sections

421 Number of Neurons in LSTM Layer In order to verifythe influence of the number of neurons in LSTM layer on therecognition results the following experiments are carriedout in this paper as shown in Figure 5 It shows that therecognition rate is the lowest when each LSTM layer con-tains only 8 neurons is is because given less neurons thenetwork lacks the necessary learning ability and informationprocessing ability resulting in the low recognition rate As

the number of neurons increases the recognition rate tendsto increase When the number of neurons is 64 the rec-ognition rate reaches 9587 If the number of neurons is toolarge the complexity of network structure will increase andthe learning speed of network will slow down ereforeconsidering the training time of the network the number ofLSTM layer neurons in this paper is tentatively 64

422 e Learning Rates Experiments are carried out atdifferent learning rates in this paper As shown in Table 4 itcan be seen that the recognition rate of the model reaches amaximum of 9587 when the learning rate is 0002erefore the learning rate of 0002 is adopted

423 BN Operation To verify the improvement of the BNoperation on the network model a comparative experimentis carried out first with and without BN layer e epoch isset to 400 and other parameters remain unchanged erecognition rates of both methods on the test set are shownin Table 5 Obviously the recognition rate on the test set isimproved by about 424 after the BN layer is added

424 Batch Size Batch size refers to the Batch sample sizewhose maximum value is the total number of samples in the

Figure 4 Data collection of the physical activities

0955

0950

0945

0940

0935

0930

0925

0920

Acc

urac

y

8 16 24 32 40 48 56 64Number of neurons

Test accuracy

Figure 5 Accuracy of different numbers of neurons on test sets

Table 3 e data amount of various activities in the HAPT

Type ID NumberWalk A1 122091Upstairs A2 116707Downstairs A3 107961Sit down A4 126677Stand A5 138105Lie A6 136865Stand to sit A7 10316Sit to stand A8 8029Sit to lie A9 12428Lie to sit A10 11150Stand to lie A11 14418Lie to stop A12 10867

Security and Communication Networks 7

training set When the amount of data is small the batchdata is the whole data set so that it can approach the extremevalue direction more accurately However in practical ap-plications the amount of data used by deep learning isrelatively large and the principle of small batch processing isgenerally adopted Using small batch processing requiresrelatively little memory and faster training time Within anappropriate range increasing the batch size can more ac-curately determine the direction of gradient descent andcause less training shock However when the batch sizeincreases to a certain value the determined downward di-rection will not change and the correction of parameters willslow down significantly e identification results of dif-ferent batch sizes are shown in Table 6 It can be seen thatwhen the batch size is 150 the maximum identification ratereaches 9587 erefore 150 is selected as the best batchsize in this paper

e parameters of the CNN-LSTM model proposed inthis paper are shown in Table 7

5 Experimental Results and Analysis

For human movement recognition Wang and Liu [33]proposed to use the F-measure standard measurementmethod to verify the performance of the deep-rootedLSTM network model in human activity recognition Luet al [34] demonstrated the superiority of the model inbehavior recognition by using accuracy prediction rateand recall rate in the experiment erefore to evaluatethe performance of the motion recognition methodproposed in this paper we also used the measurementmethod of accuracy recall rate loss rate and F-measure inthe experiment

According to the above parameters the recognitionconfusion matrix of 12 different actions is shown in Table 8Accuracy curve of CNN-LSTM model is shown in Figure 6It can be seen from Table 9 that the overall recognition rateof CNN-LSTM is high and the CNN-LSTM has a betterrecognition effect on the transition action

6 Case Study

In the non-deep-learning method the random forest clas-sification method (RF) and K-nearest neighbor (KNN)classification perform well in action classification recogni-tion erefore the CNN-LSTM model proposed is com-pared with the RF and KNN methods First of all input theHAPTdata set into RF and KNNen segment the original

Table 4 Accuracy of different learning rates on test sets

Learning rate Recognition rate ()0001 935700015 94210002 958700025 92390003 933400035 92120004 928400045 9201

Table 5 Accuracy and loss rate on test sets with or without BNlayer

Recognition rate ()Without BN layer 9163With BN layer 9587

Table 6 Accuracy of different batch size on test sets

Batch size Recognition rate ()25 917450 928875 9292100 9310125 9433150 9587175 9345200 9337225 9372250 9345275 9284300 9335325 9406350 9334375 9296400 9353

Table 7 Experimental parameters of CNN-LSTM model

Parameters ValueInput vector size 150Input channel number 8Convolution kernel size 3Pool size 2Activation function ReLuLSTM layer 1Neurons number 64Dropout 05Learning rate 0002Batch size 150Epoch 400

Table 8: Confusion matrix of various actions (rows: actual; columns: predicted).

        A1   A2   A3   A4   A5   A6   A7   A8   A9  A10  A11  A12
A1     410    1    3    0    0    0    0    0    0    0    0    0
A2       5  388    3    0    0    0    0    0    0    0    0    0
A3       1    3  346    0    0    0    0    0    0    0    0    0
A4       1    0    0  383   32    3    1    0    1    1    0    0
A5       0    0    1   31  431    0    0    0    0    0    0    0
A6       0    0    0    1    0  457    0    0    0    0    0    0
A7       0    0    0    1    0    0   17    0    0    0    0    0
A8       0    0    0    0    0    0    0    4    0    0    1    0
A9       0    0    0    0    0    0    0    0   19    1    4    1
A10      0    0    0    0    0    0    0    0    1   14    0    2
A11      0    1    0    1    0    0    1    0    2    1   32    1
A12      0    0    0    0    0    0    0    0    0    1    1   16
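The per-action figures in Table 9 follow directly from this confusion matrix; in the tables' convention, the row-wise rate is reported as "accuracy" and the column-wise rate as "recall". A NumPy check against the A4 row (the helper code is ours):

```python
import numpy as np

# Table 8: rows = actual A1..A12, columns = predicted A1..A12
cm = np.array([
    [410,   1,   3,   0,   0,   0,  0, 0,  0,  0,  0,  0],
    [  5, 388,   3,   0,   0,   0,  0, 0,  0,  0,  0,  0],
    [  1,   3, 346,   0,   0,   0,  0, 0,  0,  0,  0,  0],
    [  1,   0,   0, 383,  32,   3,  1, 0,  1,  1,  0,  0],
    [  0,   0,   1,  31, 431,   0,  0, 0,  0,  0,  0,  0],
    [  0,   0,   0,   1,   0, 457,  0, 0,  0,  0,  0,  0],
    [  0,   0,   0,   1,   0,   0, 17, 0,  0,  0,  0,  0],
    [  0,   0,   0,   0,   0,   0,  0, 4,  0,  0,  1,  0],
    [  0,   0,   0,   0,   0,   0,  0, 0, 19,  1,  4,  1],
    [  0,   0,   0,   0,   0,   0,  0, 0,  1, 14,  0,  2],
    [  0,   1,   0,   1,   0,   0,  1, 0,  2,  1, 32,  1],
    [  0,   0,   0,   0,   0,   0,  0, 0,  0,  1,  1, 16],
], dtype=float)

diag = np.diag(cm)
acc = 100 * diag / cm.sum(axis=1)  # rate over actual samples of each class
rec = 100 * diag / cm.sum(axis=0)  # rate over predicted samples of each class
f = 2 * acc * rec / (acc + rec)    # per-class F-measure

# A4 (index 3): 383/422 = 90.76, 383/417 = 91.85, F = 91.30 -- cf. Table 9
```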

8 Security and Communication Networks

sensor data and compute 15 statistical features, including the mean, variance, and covariance. Finally, classify the basic actions and transition actions according to the clustering results. The classification results are shown in Table 10. It can be seen that the recognition rate of the CNN-LSTM model is higher

than that of the RF and KNN methods for both basic actions and transition actions.
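A sliding-window feature extractor of the kind fed to RF and KNN can be sketched as follows; the window length, step, and exact feature set here are illustrative assumptions (the section does not enumerate all 15 features):

```python
import numpy as np

def window_features(stream, win=150, step=75):
    """Slice a (samples, channels) sensor stream into overlapping windows
    and compute simple statistics per window: per-channel mean and
    variance plus the pairwise channel covariances."""
    feats = []
    for start in range(0, len(stream) - win + 1, step):
        w = stream[start:start + win]
        mean = w.mean(axis=0)
        var = w.var(axis=0)
        # upper triangle of the channel covariance matrix (off-diagonal)
        cov = np.cov(w, rowvar=False)[np.triu_indices(w.shape[1], k=1)]
        feats.append(np.concatenate([mean, var, cov]))
    return np.array(feats)

stream = np.random.default_rng(0).normal(size=(450, 3))  # 3-channel toy stream
F = window_features(stream)  # (5, 9): 5 windows x (3 means + 3 vars + 3 covs)
```

Each row of F would then be one training sample for the RF or KNN classifier.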

In addition to the comparison with the RF and KNN classifiers, our proposed model is also compared with a single CNN, a single LSTM, CNN-GRU, and CNN-BLSTM deep

Table 9: The recognition accuracy, recall, and F-measure of various actions.

ID    Accuracy (%)   Recall (%)   F-measure (%)
A1    99.03          98.32        98.68
A2    97.78          98.73        98.35
A3    98.86          98.02        98.44
A4    90.76          91.85        91.30
A5    93.09          93.09        93.09
A6    99.78          99.56        99.56
A7    94.44          89.47        91.89
A8    100            100          100
A9    76.00          82.61        79.17
A10   82.35          77.78        80.00
A11   82.05          86.49        84.21
A12   88.89          80.00        84.21

Table 10: Average accuracy of various actions in the CNN-LSTM, RF, and KNN models.

ID    RF (%)   KNN (%)   CNN-LSTM (%)
A1    99.90    88.10     99.03
A2    92.50    97.80     97.78
A3    90.20    99.40     98.86
A4    91.90    83.80     90.76
A5    90.80    87.50     93.09
A6    97.10    100       99.78
A7    71.30    66.70     94.44
A8    72.00    68.00     100
A9    51.30    38.60     76.00
A10   74.90    36.30     82.35
A11   59.20    33.70     82.05
A12   61.10    57.90     88.89

[Figure 6: Accuracy curve of the CNN-LSTM model. Training and validation accuracy (vertical axis, 0.3-1.0) plotted against training iterations (0-400).]


learning models. Table 11 shows the average accuracy of various actions for the five deep learning models. As can be seen from Table 11, CNN-LSTM not only achieves slightly higher recognition of the basic movements than the other four models but also recognizes the transition movements significantly better, especially standing-to-sitting, sitting-to-lying, and standing-to-lying. Table 12 shows the recognition rates of the different models on the test set. It can be seen from the table that the average recognition rate of all five models is higher than 90%, but the recognition effect of the CNN-LSTM model is better than that of CNN, LSTM, CNN-GRU, and CNN-BLSTM.

To prove the effectiveness of the CNN-LSTM deep learning model, it is also compared with other deep learning methods that use the same dataset. Kuang [35] applied BLSTM to construct a behavior recognition model, and Hassan et al. [36] used a deep belief network (DBN) for human behavior recognition. We compared the performance with the approaches in [35, 36], with the results shown in Table 13. It follows that the proposed CNN-LSTM achieves the highest average recognition rate.

7. Conclusion

This paper explored a recognition method based on deep learning and designed a behavior recognition model based on CNN-LSTM. The CNN learns local features from the original

sensor data, and the LSTM extracts time-dependent relationships from the local features, realizing the fusion of local and global features, a fine-grained description of basic and transition movements, and accurate identification of the two motion patterns.

The actions identified in this paper include only common basic actions and individual transition actions. In the next step, more kinds of actions can be collected, and more complex actions, such as eating and driving, can be added. Individual recognition can also be realized by considering the behavioral differences between users. Meanwhile, the deep learning model still needs to be optimized and improved. Studies show that the combination of a deep model and a shallow model can achieve better performance: the deep learning model has strong learning ability, while the shallow learning model has higher learning efficiency. Collaboration between the two can achieve more accurate and lightweight recognition.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the laboratory, university, and government for their support. This research was funded by the National Key Research and Development Plan (no. 2017YFB1402103), the National Natural Science Foundation of China (no. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

Table 11: Average accuracy of different activities with five deep learning models.

ID    CNN (%)   LSTM (%)   CNN-BLSTM (%)   CNN-GRU (%)   CNN-LSTM (%)
A1    97.50     97.70      97.41           99.75         99.03
A2    97.25     97.10      95.65           98.99         97.78
A3    95.60     97.15      100             96.57         98.86
A4    91.26     90.26      91.96           81.99         90.76
A5    90.80     90.80      84.74           92.48         93.09
A6    99.67     98.58      100             99.78         99.78
A7    76.47     64.86      44.44           77.78         94.44
A8    100       66.67      66.67           50.00         100
A9    63.83     69.39      62.07           48.00         76.00
A10   84.85     70.27      80.00           52.94         82.35
A11   72.50     69.33      65.00           71.79         82.05
A12   83.30     70.27      70.59           55.56         88.89

Table 12: Average accuracy of the five models in this paper.

Method      Average recognition rate (%)
CNN         94.29
LSTM        93.22
CNN-BLSTM   92.73
CNN-GRU     93.34
CNN-LSTM    95.87

Table 13: Average accuracy of different methods on the test set [35, 36].

Method        Average recognition rate (%)
BLSTM [35]    87.5
DBN [36]      89.6
CNN-LSTM      95.8


[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors (Switzerland), vol. 16, p. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-D pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science, Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.

[12] K. Cho, B. van Merriënboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pp. 1724–1734, Doha, Qatar, October 2014.

[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.

[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference (MM 2015), pp. 1307–1310, Brisbane, Australia, October 2015.

[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2015), pp. 1488–1492, Hong Kong, China, October 2016.

[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.

[17] A. Murad and J.-Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors (Switzerland), vol. 17, p. 11, 2017.

[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing, vol. 12, p. 99, 2019.

[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.

[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.

[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using convolutional neural networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.

[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.

[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.

[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.

[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.

[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.

[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.

[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), vol. 3, pp. 2332–2340, Lille, France, July 2015.

[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.

[30] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, Improving Neural Networks by Preventing Co-adaptation of Feature Detectors, arXiv preprint, 2012.

[31] B. M. Abidine, L. Fergani, B. Fergani, and M. Oussalah, "The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition," Pattern Analysis and Applications, vol. 21, no. 1, pp. 119–138, 2018.

[32] G. M. Weiss, J. W. Lockhart, T. T. Pulickal et al., "A smartphone-based activity recognition system for improving health and well-being," in Proceedings of the 3rd IEEE International Conference on Data Science and Advanced Analytics (DSAA 2016), pp. 682–688, Montreal, QC, Canada, October 2016.

[33] L. Wang and R. Liu, "Human activity recognition based on wearable sensor using hierarchical deep LSTM networks," Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 837–856, 2019.


[34] W. Lu, F. Fan, J. Chu, P. Jing, and S. Yuting, "Wearable computing for Internet of Things: a discriminant approach for human activity recognition," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2749–2759, 2019.

[35] X. Kuang, Human Behavior Recognition Based on Deep Learning and Wearable Sensor, Nanjing University of Information Engineering, Nanjing, China, 2018.

[36] M. M. Hassan, M. Z. Uddin, A. Mohamed, and A. Almogren, "A robust human activity recognition system using smartphone sensors and deep learning," Future Generation Computer Systems, vol. 81, pp. 307–313, 2018.



e parameters of the CNN-LSTM model proposed inthis paper are shown in Table 7

5 Experimental Results and Analysis

For human movement recognition Wang and Liu [33]proposed to use the F-measure standard measurementmethod to verify the performance of the deep-rootedLSTM network model in human activity recognition Luet al [34] demonstrated the superiority of the model inbehavior recognition by using accuracy prediction rateand recall rate in the experiment erefore to evaluatethe performance of the motion recognition methodproposed in this paper we also used the measurementmethod of accuracy recall rate loss rate and F-measure inthe experiment

According to the above parameters the recognitionconfusion matrix of 12 different actions is shown in Table 8Accuracy curve of CNN-LSTM model is shown in Figure 6It can be seen from Table 9 that the overall recognition rateof CNN-LSTM is high and the CNN-LSTM has a betterrecognition effect on the transition action

6 Case Study

In the non-deep-learning method the random forest clas-sification method (RF) and K-nearest neighbor (KNN)classification perform well in action classification recogni-tion erefore the CNN-LSTM model proposed is com-pared with the RF and KNN methods First of all input theHAPTdata set into RF and KNNen segment the original

Table 4 Accuracy of different learning rates on test sets

Learning rate Recognition rate ()0001 935700015 94210002 958700025 92390003 933400035 92120004 928400045 9201

Table 5 Accuracy and loss rate on test sets with or without BNlayer

Recognition rate ()Without BN layer 9163With BN layer 9587

Table 6 Accuracy of different batch size on test sets

Batch size Recognition rate ()25 917450 928875 9292100 9310125 9433150 9587175 9345200 9337225 9372250 9345275 9284300 9335325 9406350 9334375 9296400 9353

Table 7 Experimental parameters of CNN-LSTM model

Parameters ValueInput vector size 150Input channel number 8Convolution kernel size 3Pool size 2Activation function ReLuLSTM layer 1Neurons number 64Dropout 05Learning rate 0002Batch size 150Epoch 400

Table 8 Confusion matrix of various actions

ActualPredict

A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12A1 410 1 3 0 0 0 0 0 0 0 0 0A2 5 388 3 0 0 0 0 0 0 0 0 0A3 1 3 346 0 0 0 0 0 0 0 0 0A4 1 0 0 383 32 3 1 0 1 1 0 0A5 0 0 1 31 431 0 0 0 0 0 0 0A6 0 0 0 1 0 457 0 0 0 0 0 0A7 0 0 0 1 0 0 17 0 0 0 0 0A8 0 0 0 0 0 0 0 4 0 0 1 0A9 0 0 0 0 0 0 0 0 19 1 4 1A10 0 0 0 0 0 0 0 0 1 14 0 2A11 0 1 0 1 0 0 1 0 2 1 32 1A12 0 0 0 0 0 0 0 0 0 1 1 16

8 Security and Communication Networks

sensor data and calculate the mean value variance co-variance and 15 features Finally classify the basic actionsand transition actions according to the clustering resultse classification results are shown in Table 10 It can be seenthat the recognition rate of CNN-LSTM model is higher

than that of RF and KNNmethods for both basic actions andtransition actions

In addition to the comparison with RF and KNN clas-sifier our proposed model is also compared with a singleCNN a single LSTM CNN-GRU and CNN-BLSTM deep

Table 9 e recognition accuracy recall rate and F value of various actions

ID Accuracy () Recall () F-measure ()A1 9903 9832 9868A2 9778 9873 9835A3 9886 9802 9844A4 9076 9185 9130A5 9309 9309 9309A6 9978 9956 9956A7 9444 8947 9189A8 100 100 100A9 7600 8261 7917A10 8235 7778 8000A11 8205 8649 8421A12 8889 8000 8421

Table 10 Average accuracy of various actions in CNN-LSTM RF and KNN models

ID RF () KNN () CNN-LSTM ()A1 9990 8810 9903A2 9250 9780 9778A3 9020 9940 9886A4 9190 8380 9076A5 9080 8750 9309A6 9710 100 9978A7 7130 6670 9444A8 7200 6800 100A9 5130 3860 7600A10 7490 3630 8235A11 5920 3370 8205A12 6110 5790 8889

10

09

08

07

06

05

04

030 50 100 150 200 250 300 350 400

Iterations

TrainValidation

Figure 6 Accuracy curve of CNN-LSTM Model

Security and Communication Networks 9

learning models Table 11 shows the average accuracy ofvarious actions in five different depth models As can be seenfrom Table 11 CNN-LSTM not only has a slightly higherrecognition of basic movements than the other five modelsbut also has a significantly better recognition of transitionmovements especially standing to sitting sitting to lyingand standing to lying Table 12 shows the recognition rates ofdifferent models on the test set It can be seen from the tablethat the average recognition rate of the three models ishigher than 90 but the recognition effect of CNN-LSTMmodel is slightly better than that of CNN LSTM CNN-GRU and CNN-BLSTM

To prove the effectiveness of the CNN-LSTM deeplearning model it is also compared with other deep learningmethods using the same dataset Kuang [35] applied BLSTMto construct the behavior recognition model Hassan et al[36] used deep belief network (DBN) for human behaviorrecognition We compared the performance with the ap-proaches in [35 36] with the result shown in Table 13 Itfollows that the proposed CNN-LSTM can achieve highestaverage recognition rate

7 Conclusion

is paper explored the recognition method based on deeplearning and designed the behavior recognition model basedon CNN-LSTM CNN learns local features from the original

sensor data and LSTM extracts time-dependent relation-ships from local features and realizes the fusion of localfeatures and global features fine description of basic andtransition movements and accurate identification of the twomotion patterns

e actions identified in this paper only include commonbasic actions and individual transition actions In the nextstep more kinds of actions can be collected and morecomplex actions can be added such as eating and drivingAnd the individual recognition can be realized by consid-ering the behavior differences of different users Meanwhilethe deep learning model still needs to be optimized andimproved Studies show that the combination of depthmodel and shallow model can achieve better performanceDeep learning model has strong learning ability whileshallow learning model has higher learning efficiency ecollaboration between the two can achieve more accurateand lightweight recognition

Data Availability

No data were used to support this study

Conflicts of Interest

e authors declare no conflicts of interest

Acknowledgments

e authors would like to thank the support of the labo-ratory university and government is research wasfunded by the National Key Research and Development Plan(No 2017YFB1402103) the National Natural ScienceFoundation of China (No 61971347) Scientific ResearchProgram of Shaanxi Province (2018HJCG-05) and Projectof Xirsquoan Science and Technology Planning Foundation(201805037YD15CG214)

References

[1] I H Lopez-Nava and M M Angelica ldquoWearable inertialsensors for human motion analysis a reviewrdquo IEEE SensorsJournalvol 16 no 15 2016

Table 11 Average accuracy of different activities with five deep learning models

ID CNN () LSTM () CNN-BLSTM () CNN-GRU () CNN-LSTM ()A1 9750 9770 9741 9975 9903A2 9725 9710 9565 9899 9778A3 9560 9715 100 9657 9886A4 9126 9026 9196 8199 9076A5 9080 9080 8474 9248 9309A6 9967 9858 100 9978 9978A7 7647 6486 4444 7778 9444A8 100 6667 6667 5000 100A9 6383 6939 6207 4800 7600A10 8485 7027 8000 5294 8235A11 7250 6933 6500 7179 8205A12 8330 7027 7059 5556 8889

Table 12 Average accuracy of the five models in this paper

Method Average recognition rate ()CNN 9429LSTM 9322CNN-BLSTM 9273CNN-GRU 9334CNN-LSTM 9587

Table 13 Average accuracy of different methods on test set in thepaper [35 36]

Method Average recognition rateBLSTM [35] 875DBN [36] 896CNN-LSTM 958

10 Security and Communication Networks

[2] Y Liu L Nie L Liu and D S Rosenblum ldquoFrom action toactivity sensor-based activity recognitionrdquo Neurocomputingvol 181 pp 108ndash115 2016

[3] T Liu F Bingfei and L Qingguo ldquoe invention relates to awearable motion sensor and a method for resisting magneticfield interferencerdquo 2017

[4] O D Lara and M A Labrador ldquoA survey on human activityrecognition using wearable sensorsrdquo IEEE CommunicationsSurveys amp Tutorials vol 15 no 3 pp 1192ndash1209 2013

[5] F J Ordontildeez and D Roggen ldquoDeep convolutional and LSTMrecurrent neural networks for multimodal wearable activityrecognitionrdquo Sensors (Switzerland) vol 16 p 1 2016

[6] X Du R Vasudevan and M Johnson-Roberson ldquoBio-LSTMa biomechanically inspired recurrent neural network for 3-dpedestrian pose and gait predictionrdquo IEEE Robotics andAutomation Letters vol 4 no 2 pp 1501ndash1508 2019

[7] Y Huang C Wan and H Feng ldquoMulti-feature fusion humanbehavior recognition algorithm based on convolutionalneural network and long short termmemory neural networkrdquoLaser Optoelectron Prog vol 56 p 7 2019

[8] Y Lecun Y Bengio and G Hinton ldquoDeep learningrdquo Naturevol 521 no 7553 pp 436ndash444 2015

[9] G W Taylor R Fergus Y LeCun and C Bregler ldquoCon-volutional learning of spatio-temporal featuresrdquo in LectureNotes in Computer Science (including subseries Lecture Notesin Artificial Intelligence and Lecture Notes in Bioinformatics)Springer Berlin Germany 2010

[10] S Ji W Xu M Yang and K Yu ldquo3D Convolutional neuralnetworks for human action recognitionrdquo IEEE Transactionson Pattern Analysis and Machine Intelligence vol 35 no 1pp 221ndash231 2013

[11] C Liu J Liu Z He Y Zhai Q Hu and Y Huang ldquoCon-volutional neural random fields for action recognitionrdquoPattern Recognition vol 59 pp 213ndash224 2016

[12] K Cho M Bart van G Caglar et al ldquoLearning phraserepresentations using RNN encoder-decoder for statisticalmachine translationrdquo in Proceedings of the EMNLP 2014 -2014 Conference on Empirical Methods in Natural LanguageProcessing pp 1724ndash1734 Doha Qatar October 2014

[13] M Zeng T N Le Y Bo et al ldquoConvolutional NeuralNetworks for human activity recognition using mobile sen-sorsrdquo in Proceedings Of the 2014 6th International ConferenceOn Mobile Computing Applications And Services pp 197ndash205 Austin TX USA November 2015

[14] W Jiang and Z Yin ldquoHuman activity recognition usingwearable sensors by deep convolutional neural networksrdquo inProceedings Of the 2015 ACM Multimedia Conference MM2015 pp 1307ndash1310 Brisbane Australia October 2015

[15] Y Chen and Y Xue ldquoA deep learning approach to humanactivity recognition based on single accelerometerrdquo in Pro-ceedings of the 2015 IEEE International Conference On Sys-tems Man and Cybernetics SMC 2015 pp 1488ndash1492 HongKong China October 2016

[16] C A Ronao and S-B Cho ldquoHuman activity recognition withsmartphone sensors using deep learning neural networksrdquoExpert Systems with Applications vol 59 pp 235ndash244 2016

[17] A Murad and J Y Pyun ldquoDeep recurrent neural networks forhuman activity recognitionrdquo Sensors (Switzerland) vol 17p 11 2017

[18] J Zhou J Sun P Cong et al ldquoSecurity-critical energy-awaretask scheduling for heterogeneous real-time MPSoCs in IoTrdquoIEEE Transactions On Services Computing (TSC) vol 12 p 992019

[19] Y Guan and T Plotz ldquoEnsembles of deep LSTM learners foractivity recognition using wearablesrdquo Proceedings of the ACMon Interactive Mobile Wearable and Ubiquitous Technologiesvol 1 no 2 pp 1ndash28 2017

[20] L Qi X ZhangW Dou C Hu C Yang and J Chen ldquoA two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platformedge environmentrdquo Future Generation Computer Systemsvol 88 pp 636ndash643 2018

[21] A Ignatov ldquoReal-time human activity recognition from ac-celerometer data using Convolutional Neural NetworksrdquoApplied Soft Computing vol 62 pp 915ndash922 2018

[22] H F Nweke Y W Teh M A Al-garadi and U R Alo ldquoDeeplearning algorithms for human activity recognition usingmobile and wearable sensor networks state of the art andresearch challengesrdquo Expert Systems with Applicationsvol 105 pp 233ndash261 2018

[23] J Wang Y Chen S Hao X Peng and L Hu ldquoDeep learningfor sensor-based activity recognition a surveyrdquo PatternRecognition Letters vol 119 pp 3ndash11 2019

[24] S Wu G Li L Deng et al ldquo$L1$-norm batch normalizationfor efficient training of deep neural networksrdquo IEEE Trans-actions on Neural Networks and Learning Systems vol 30no 7 pp 2043ndash2051 2019

[25] B Almaslukh J Al Muhtadi and A M Artoli ldquoA robustconvolutional neural network for online smartphone-basedhuman activity recognitionrdquo Journal of Intelligent amp FuzzySystems vol 35 no 2 pp 1609ndash1620 2018

[26] R Yao G Lin Q Shi and D C Ranasinghe ldquoEfficient denselabelling of human activity sequences from wearables usingfully convolutional networksrdquo Pattern Recognition vol 78pp 252ndash266 2018

[27] T Kautz B H Groh J Hannink U Jensen H Strubberg andB M Eskofier ldquoActivity recognition in beach volleyball usinga deep convolutional neural networkrdquo Data Mining andKnowledge Discovery vol 31 no 6 pp 1678ndash1705 2017

[28] R Jozefowicz W Zaremba and I Sutskever ldquoAn empiricalexploration of recurrent network architecturesrdquo in Proceed-ings of the 32nd international Conference on machine learningICML 2015 vol 3 pp 2332ndash2340 Lille France July 2015

[29] S Li S Zhao P Yang P Andriotis L Xu and Q SunldquoDistributed consensus algorithm for events detection incyber-physical systemsrdquo IEEE Internet of ings Journalvol 6 no 2 pp 2299ndash2308 2019

[30] G E Hinton N Srivastava A Krizhevsky I Sutskever andR R Salakhutdinov Improving Neural Networks by Pre-venting Co-adaptation of Feature Detectors arXiv preparationGeneva Switzerland 2012

[31] B M h Abidine L Fergani B Fergani and M Oussalahldquoe joint use of sequence features combination and modifiedweighted SVM for improving daily activity recognitionrdquoPattern Analysis and Applications vol 21 no 1 pp 119ndash1382018

[32] G M Weiss J W Lockhart T T Pulickal et al ldquoAsmartphone-based activity recognition system for improvinghealth and well-beingrdquo in Proceedings of the 3rd IEEE In-ternational Conference On Data Science And Advanced An-alytics DSAA 2016 pp 682ndash688 Montreal QC CanadaOctober 2016

[33] L Wang and R Liu ldquoHuman activity recognition based onwearable sensor using hierarchical deep LSTM networksrdquoCircuits Systems and Signal Processing vol 39 no 2pp 837ndash856 2019

Security and Communication Networks 11

[34] W Lu F Fan J Chu P Jing and S Yuting ldquoWearablecomputing for internet of things a discriminant approach forhuman activity recognitionrdquo IEEE Internet of ings Journalvol 6 no 2 pp 2749ndash2759 2019

[35] X Kuang Human Behavior Recognition Based on DeepLearning and Wearable Sensor Nanjing University of In-formation Engineering Nanjing China 2018

[36] M M Hassan M Z Uddin A Mohamed and A AlmogrenldquoA robust human activity recognition system using smart-phone sensors and deep learningrdquo Future Generation Com-puter Systems vol 81 pp 307ndash313 2018

12 Security and Communication Networks

Page 9: Wearable Sensor-Based Human Activity Recognition Using ...downloads.hindawi.com/journals/scn/2020/2132138.pdf · Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep

sensor data and calculate the mean value variance co-variance and 15 features Finally classify the basic actionsand transition actions according to the clustering resultse classification results are shown in Table 10 It can be seenthat the recognition rate of CNN-LSTM model is higher

than that of RF and KNNmethods for both basic actions andtransition actions

In addition to the comparison with RF and KNN clas-sifier our proposed model is also compared with a singleCNN a single LSTM CNN-GRU and CNN-BLSTM deep

Table 9 e recognition accuracy recall rate and F value of various actions

ID Accuracy () Recall () F-measure ()A1 9903 9832 9868A2 9778 9873 9835A3 9886 9802 9844A4 9076 9185 9130A5 9309 9309 9309A6 9978 9956 9956A7 9444 8947 9189A8 100 100 100A9 7600 8261 7917A10 8235 7778 8000A11 8205 8649 8421A12 8889 8000 8421

Table 10 Average accuracy of various actions in CNN-LSTM RF and KNN models

ID RF () KNN () CNN-LSTM ()A1 9990 8810 9903A2 9250 9780 9778A3 9020 9940 9886A4 9190 8380 9076A5 9080 8750 9309A6 9710 100 9978A7 7130 6670 9444A8 7200 6800 100A9 5130 3860 7600A10 7490 3630 8235A11 5920 3370 8205A12 6110 5790 8889

10

09

08

07

06

05

04

030 50 100 150 200 250 300 350 400

Iterations

TrainValidation

Figure 6 Accuracy curve of CNN-LSTM Model

Security and Communication Networks 9

learning models Table 11 shows the average accuracy ofvarious actions in five different depth models As can be seenfrom Table 11 CNN-LSTM not only has a slightly higherrecognition of basic movements than the other five modelsbut also has a significantly better recognition of transitionmovements especially standing to sitting sitting to lyingand standing to lying Table 12 shows the recognition rates ofdifferent models on the test set It can be seen from the tablethat the average recognition rate of the three models ishigher than 90 but the recognition effect of CNN-LSTMmodel is slightly better than that of CNN LSTM CNN-GRU and CNN-BLSTM

To prove the effectiveness of the CNN-LSTM deeplearning model it is also compared with other deep learningmethods using the same dataset Kuang [35] applied BLSTMto construct the behavior recognition model Hassan et al[36] used deep belief network (DBN) for human behaviorrecognition We compared the performance with the ap-proaches in [35 36] with the result shown in Table 13 Itfollows that the proposed CNN-LSTM can achieve highestaverage recognition rate

7 Conclusion

is paper explored the recognition method based on deeplearning and designed the behavior recognition model basedon CNN-LSTM CNN learns local features from the original

sensor data and LSTM extracts time-dependent relation-ships from local features and realizes the fusion of localfeatures and global features fine description of basic andtransition movements and accurate identification of the twomotion patterns

e actions identified in this paper only include commonbasic actions and individual transition actions In the nextstep more kinds of actions can be collected and morecomplex actions can be added such as eating and drivingAnd the individual recognition can be realized by consid-ering the behavior differences of different users Meanwhilethe deep learning model still needs to be optimized andimproved Studies show that the combination of depthmodel and shallow model can achieve better performanceDeep learning model has strong learning ability whileshallow learning model has higher learning efficiency ecollaboration between the two can achieve more accurateand lightweight recognition

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors would like to thank the support of the laboratory, university, and government. This research was funded by the National Key Research and Development Plan (No. 2017YFB1402103), the National Natural Science Foundation of China (No. 61971347), the Scientific Research Program of Shaanxi Province (2018HJCG-05), and the Project of Xi'an Science and Technology Planning Foundation (201805037YD15CG214).

References

[1] I. H. Lopez-Nava and M. M. Angelica, "Wearable inertial sensors for human motion analysis: a review," IEEE Sensors Journal, vol. 16, no. 15, 2016.

Table 11: Average accuracy of different activities with five deep learning models.

ID    CNN (%)   LSTM (%)   CNN-BLSTM (%)   CNN-GRU (%)   CNN-LSTM (%)
A1    97.50     97.70      97.41           99.75         99.03
A2    97.25     97.10      95.65           98.99         97.78
A3    95.60     97.15      100             96.57         98.86
A4    91.26     90.26      91.96           81.99         90.76
A5    90.80     90.80      84.74           92.48         93.09
A6    99.67     98.58      100             99.78         99.78
A7    76.47     64.86      44.44           77.78         94.44
A8    100       66.67      66.67           50.00         100
A9    63.83     69.39      62.07           48.00         76.00
A10   84.85     70.27      80.00           52.94         82.35
A11   72.50     69.33      65.00           71.79         82.05
A12   83.30     70.27      70.59           55.56         88.89

Table 12: Average accuracy of the five models in this paper.

Method      Average recognition rate (%)
CNN         94.29
LSTM        93.22
CNN-BLSTM   92.73
CNN-GRU     93.34
CNN-LSTM    95.87

Table 13: Average accuracy of different methods on the test set [35, 36].

Method       Average recognition rate (%)
BLSTM [35]   87.5
DBN [36]     89.6
CNN-LSTM     95.8

10 Security and Communication Networks

[2] Y. Liu, L. Nie, L. Liu, and D. S. Rosenblum, "From action to activity: sensor-based activity recognition," Neurocomputing, vol. 181, pp. 108–115, 2016.

[3] T. Liu, F. Bingfei, and L. Qingguo, "The invention relates to a wearable motion sensor and a method for resisting magnetic field interference," 2017.

[4] O. D. Lara and M. A. Labrador, "A survey on human activity recognition using wearable sensors," IEEE Communications Surveys & Tutorials, vol. 15, no. 3, pp. 1192–1209, 2013.

[5] F. J. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors (Switzerland), vol. 16, p. 1, 2016.

[6] X. Du, R. Vasudevan, and M. Johnson-Roberson, "Bio-LSTM: a biomechanically inspired recurrent neural network for 3-D pedestrian pose and gait prediction," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501–1508, 2019.

[7] Y. Huang, C. Wan, and H. Feng, "Multi-feature fusion human behavior recognition algorithm based on convolutional neural network and long short term memory neural network," Laser & Optoelectronics Progress, vol. 56, p. 7, 2019.

[8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.

[9] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler, "Convolutional learning of spatio-temporal features," in Lecture Notes in Computer Science, Springer, Berlin, Germany, 2010.

[10] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[11] C. Liu, J. Liu, Z. He, Y. Zhai, Q. Hu, and Y. Huang, "Convolutional neural random fields for action recognition," Pattern Recognition, vol. 59, pp. 213–224, 2016.

[12] K. Cho, B. van Merriënboer, C. Gulcehre et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734, Doha, Qatar, October 2014.

[13] M. Zeng, T. N. Le, Y. Bo et al., "Convolutional neural networks for human activity recognition using mobile sensors," in Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services, pp. 197–205, Austin, TX, USA, November 2015.

[14] W. Jiang and Z. Yin, "Human activity recognition using wearable sensors by deep convolutional neural networks," in Proceedings of the 2015 ACM Multimedia Conference (MM 2015), pp. 1307–1310, Brisbane, Australia, October 2015.

[15] Y. Chen and Y. Xue, "A deep learning approach to human activity recognition based on single accelerometer," in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2015), pp. 1488–1492, Hong Kong, China, October 2016.

[16] C. A. Ronao and S.-B. Cho, "Human activity recognition with smartphone sensors using deep learning neural networks," Expert Systems with Applications, vol. 59, pp. 235–244, 2016.

[17] A. Murad and J.-Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors (Switzerland), vol. 17, p. 11, 2017.

[18] J. Zhou, J. Sun, P. Cong et al., "Security-critical energy-aware task scheduling for heterogeneous real-time MPSoCs in IoT," IEEE Transactions on Services Computing, vol. 12, p. 99, 2019.

[19] Y. Guan and T. Plötz, "Ensembles of deep LSTM learners for activity recognition using wearables," Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 2, pp. 1–28, 2017.

[20] L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, "A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment," Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.

[21] A. Ignatov, "Real-time human activity recognition from accelerometer data using convolutional neural networks," Applied Soft Computing, vol. 62, pp. 915–922, 2018.

[22] H. F. Nweke, Y. W. Teh, M. A. Al-garadi, and U. R. Alo, "Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: state of the art and research challenges," Expert Systems with Applications, vol. 105, pp. 233–261, 2018.

[23] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: a survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.

[24] S. Wu, G. Li, L. Deng et al., "L1-norm batch normalization for efficient training of deep neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 7, pp. 2043–2051, 2019.

[25] B. Almaslukh, J. Al Muhtadi, and A. M. Artoli, "A robust convolutional neural network for online smartphone-based human activity recognition," Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1609–1620, 2018.

[26] R. Yao, G. Lin, Q. Shi, and D. C. Ranasinghe, "Efficient dense labelling of human activity sequences from wearables using fully convolutional networks," Pattern Recognition, vol. 78, pp. 252–266, 2018.

[27] T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, "Activity recognition in beach volleyball using a deep convolutional neural network," Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.

[28] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), vol. 3, pp. 2332–2340, Lille, France, July 2015.

[29] S. Li, S. Zhao, P. Yang, P. Andriotis, L. Xu, and Q. Sun, "Distributed consensus algorithm for events detection in cyber-physical systems," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2299–2308, 2019.

[30] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, Improving Neural Networks by Preventing Co-adaptation of Feature Detectors, arXiv preprint, 2012.

[31] B. M'h. Abidine, L. Fergani, B. Fergani, and M. Oussalah, "The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition," Pattern Analysis and Applications, vol. 21, no. 1, pp. 119–138, 2018.

[32] G. M. Weiss, J. W. Lockhart, T. T. Pulickal et al., "A smartphone-based activity recognition system for improving health and well-being," in Proceedings of the 3rd IEEE International Conference on Data Science and Advanced Analytics (DSAA 2016), pp. 682–688, Montreal, QC, Canada, October 2016.

[33] L. Wang and R. Liu, "Human activity recognition based on wearable sensor using hierarchical deep LSTM networks," Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 837–856, 2019.


[34] W. Lu, F. Fan, J. Chu, P. Jing, and S. Yuting, "Wearable computing for internet of things: a discriminant approach for human activity recognition," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2749–2759, 2019.

[35] X. Kuang, Human Behavior Recognition Based on Deep Learning and Wearable Sensor, Nanjing University of Information Engineering, Nanjing, China, 2018.

[36] M. M. Hassan, M. Z. Uddin, A. Mohamed, and A. Almogren, "A robust human activity recognition system using smartphone sensors and deep learning," Future Generation Computer Systems, vol. 81, pp. 307–313, 2018.
