Priority based Adaptive Access Barring for
M2M Communications in LTE Networks using
Learning Automata
Faezeh Morvari and Abdorasoul Ghasemi
Faculty of Computer Engineering, K.N. Toosi University of Technology, Tehran,
Iran
Email: [email protected], [email protected]
February 15, 2017 DRAFT
Abstract
Supporting a huge number of Machine-to-Machine (M2M) devices with different priorities in
LTE networks is addressed in this paper. We propose a Learning Automaton (LA) based scheme for
dynamically allocating Random Access (RA) resources to different classes of M2M devices according
to their priorities and their demands in each cycle. We then use another LA based scheme to adjust
the barring factor for each class to control the possible overload. We show that, with an appropriate updating
procedure for these LAs, the system performance asymptotically converges to the optimal performance
in which the evolved Node B (eNB) knows the number of access-attempting devices from each class a
priori. Simulation results are provided to show the performance of the proposed scheme in RA resource
allocation to defined classes and adjusting the barring factor for each of them.
Index Terms
Machine-to-machine communications; Access barring; Learning automaton; Random access
I. INTRODUCTION
Machine-to-Machine (M2M) or Machine Type Communication (MTC) refers to an emerging
communication technology in which the key elements for constituting new communication
paradigms such as smart city and Internet of Things (IoT) are addressed [1]. It involves a large
number of autonomous devices that exchange information or data with each other or with the
MTC server through a wireless area network without human intervention [2]. Smart grids, city
automation, and infrastructure management are the typical examples of M2M applications which
are widely adopted in our daily life [3]. The demand for M2M communications is continuously
growing and it is expected that there will be 50 billion devices by 2020 [4].
Currently, cellular networks, and in particular the Third Generation Partnership Project (3GPP)
Long Term Evolution (LTE) network, are considered a suitable infrastructure for the deployment
of MTC Devices (MTCDs) because they offer ubiquitous and transparent communication for
MTCDs [5]. However, cellular networks are mainly designed for human type communication,
which is generally characterized by bursts of data during a limited number of active periods.
Hence, the signaling traffic required for resource management is negligible. M2M
communications, instead, involve a huge number of MTCDs that typically need to transmit a
small amount of data, most of the time [6]. That is, the signaling traffic generated by a massive
number of MTCDs is significant and may put the traditional operation of cellular networks at
risk [7]. Therefore, deploying MTC in the LTE infrastructure raises new challenges.
Specifically, when a massive number of MTCDs try to access the network within a short
interval, the Radio Access Network (RAN) becomes congested, which decreases the access
success probability and causes heavy access delay for MTCDs. Therefore, handling the massive
access requests of MTCDs is one of the main challenges for MTC in LTE [8]. So far, several
methods have been proposed to alleviate congestion in the RAN. Among them, the Access Class
Barring (ACB) scheme has attracted the most attention due to its simplicity of deployment [9].
In the ACB scheme, the access of MTCDs is barred according to a barring factor which is
broadcast by the evolved Node B (eNB).
On the other hand, since MTCDs belong to various applications with different priorities, the
network should consider the priorities of devices in access granting for connecting to the network
[10]. In this paper, we address the prioritized massive access of MTCDs in LTE networks and
how to allocate Random Access CHannel (RACH) resources to them. The Random Access (RA)
procedure is the first step for connecting to the cellular network which is done through RACH
resources.
We propose a prioritized random access scheme using Learning Automata (LA) in which
the MTCDs are classified into different classes according to the priorities of the corresponding
applications. Two LA modules are deployed. The first LA dynamically determines the amount
of RACH resources which must be allocated to each class according to its priority. The second
LA determines the transmission probability for access-attempting MTCDs of each class in order
to prevent a huge number of simultaneous RA attempts. It is shown that, by proper adjustment
of the learning parameters, the asymptotic behavior of the proposed scheme tends to the
optimal scheme in which the eNB has a priori information about the number of access-attempting
devices from each class.
The rest of this paper is organized as follows. The related works are presented in section II. In
section III, the preliminaries and system model are explained. The proposed scheme is presented
in section IV. Performance analysis and simulation results are provided in sections V and VI, and
the paper is concluded in section VII.
TABLE I
SUMMARY OF RACH OVERLOAD CONTROL TECHNIQUES.
Techniques | References | Idea
Separation of RACH resources | Split preambles [11, 12]; Split PRACH occasions [9]; Prioritized random access [13] | Split preambles between M2M and H2H users. Pre-allocates RACH resources to different MTC classes.
Slotted Access | Slotted access schemes [9, 14] | Dedicated slots for each MTCD.
Access Class Barring | Extended ACB [9] | Selectively controls the access attempts of UEs which are configured for EAB.
Access Class Barring | Dynamic ACB [15] | The ACB factor is adjusted dynamically by a heuristic algorithm in each time slot.
Access Class Barring | Cooperative ACB [16] | Controls the RAN overload by dispersing MTCDs among neighboring cells that overlap with each other.
MTC-specific backoff | Backoff tuning [17]; Backoff timer method [12, 18] | PRACH overload is controlled by properly adjusting the backoff times of MTCDs.
Other solutions | Pull based scheme [9, 19] | Allows MTCDs to access the PRACH when paged by the eNB.
Other solutions | Q-learning [20] | Uses Q-learning based RACH scheme slot assignment to MTCDs.
Other solutions | Self optimizing overload control (SOOC) [21] | A self-optimizing mechanism for configuring the RACH resources based on load condition.
II. RELATED WORKS
In 3GPP LTE release 11, i.e., the LTE-A system, several approaches are proposed to counteract
the RACH overload, such as separate RACH resource allocation for M2M and non-M2M
communications, slotted access, the ACB scheme, the MTC-specific backoff scheme, and the pull
based scheme [9]. Table I summarizes the different RACH overload control techniques.
In particular, in the ACB scheme, the eNB broadcasts the ACB or barring factor. Each device
which has an access request draws a uniformly distributed random number between 0 and 1
and compares it with the ACB factor. If this number is less than the ACB factor, the device
can participate in contention by selecting a preamble; otherwise, it is barred for a barring time.
In this scheme, the eNB controls the number of access-attempting MTCDs, i.e., the congestion
level, by adjusting the ACB factor. In addition to ACB, 3GPP also proposed the extended access
barring (EAB) scheme. In EAB, the eNB considers 16 access classes and, in the case of RAN
overload, only one or more of these classes which belong to the high priority applications are
allowed to participate in the RA procedure and the others become barred [9], [22].
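The barring test described above can be sketched in a few lines. This is only an illustration of the comparison rule; the function names and the fixed seed are hypothetical and not part of any 3GPP specification.

```python
import random

def acb_check(barring_factor, rng=random):
    """Return True if the device may contend in this cycle, False if it is barred."""
    # The device draws a uniform number in [0, 1) and passes only if the
    # draw is strictly below the broadcast ACB (barring) factor.
    return rng.random() < barring_factor

def count_admitted(num_devices, barring_factor, seed=0):
    """Count how many of num_devices pass the ACB check (hypothetical helper)."""
    rng = random.Random(seed)
    return sum(acb_check(barring_factor, rng) for _ in range(num_devices))
```

On average a fraction equal to the barring factor of the requesting devices passes the check, which is how the eNB throttles the contention level.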
The ACB factor should be adjusted according to the number and priorities of different MTCDs.
In [15], a heuristic algorithm for adaptive adjustment of ACB factor using the number of
successful and collided transmissions in the previous time slots is proposed. Also, the authors
derive an analytical model for determining the total expected access delay for MTCDs. The
proposed scheme in [23] uses available information in the eNB for accurate estimation of
the number of M2M devices using Kalman filtering and adjusts the ACB factor based on this
estimation. In order to reduce the RAN overload caused by MTCDs, the authors in [24] proposed
a scheme which jointly utilizes timing advance information and ACB. In this scheme, by
selecting the optimal value for the ACB factor, the number of MTCDs which can be served in each
time slot is maximized.
In [16], the authors proposed a cooperative ACB scheme for access load sharing among
neighboring cells that overlap with each other. The MTCDs which are located in the coverage
area of eNBs can select one of the eNBs to access such that the load is balanced among the
overlapping cells. This scheme improves the congestion delay for M2M communications. A
Q-learning based scheme is proposed in [20] to avoid collisions between M2M devices and
enhance the throughput of the RACH resources. Using this scheme, the performance loss of H2H
devices that can be caused by massive access requests of M2M devices is reduced.
In these works, the RACH overload problem caused by massive access requests of MTCDs
is discussed, but less attention has been paid to their priorities and quality of service (QoS)
requirements. Since different applications with different access priorities should be
handled in MTC scenarios, the RACH overload control solutions should take into account the
tolerable access delay of each MTC class. In order to satisfy the QoS requirements, the authors
in [13] presented a prioritized random access mechanism that pre-allocates RACH resources
to different MTC classes according to their priorities. Furthermore, this mechanism prevents a
large number of concurrent random accesses by dynamic access barring (DAB). However, the
resources are not allocated to the different priority classes in a dynamic manner, which may lead
to wasted resources. In this paper, in contrast, we propose an LA based scheme in which the
available RACH resources are dynamically allocated to the priority classes of MTCDs according
to their current demands, and the ACB factor for each class is adjusted properly in the massive
access case.
III. PRELIMINARIES AND SYSTEM MODEL
A. Random Access Procedure in LTE Networks
In LTE networks, a User Equipment (UE) can be scheduled for uplink transmission if its
uplink transmission timing is synchronized. The Random Access (RA) procedure is the first step
for connecting to the LTE networks which is done through RACH resources. Therefore, the RA
procedure plays a key role as an interface between non-synchronized UEs and the orthogonal
transmission scheduling scheme through the LTE uplink radio resources [25]. That is, the eNB can
schedule UEs for uplink transmissions provided that they have successfully passed the RA procedure.
Notice that the RA procedure can be performed in a contention-free or contention-based manner
[2]. In the contention-free RA procedure, the eNB allocates a unique RA preamble to a specific
UE and hence guarantees its access to the network. This access scheme is not typically used for
massive access of M2M applications and is deployed for time critical usages such as handover.
In contrast, the contention-based RA procedure, which is also adopted in this paper, is much more
appropriate for M2M traffic. That is, a certain number of the preamble sequences assigned to each
LTE cell is reserved for the contention-free RA procedure and the remaining ones are used in the
contention-based RA. The information about the preambles which can be used by MTCDs is
broadcast by the eNB through the downlink control channel [8]. Then each access-attempting UE
selects a preamble randomly and transmits its request to the eNB through the RA slot which
is a time-frequency radio resource of the Physical RACH (PRACH). The contention-based RA
procedure consists of four steps as follows [7], [25]:
Step 1: The MTCD transmits a randomly selected RA preamble through the next available RA
slots of the PRACH. Due to the orthogonality of the available preambles, an eNB can decode
multiple transmitted access requests by MTCDs which select different preambles in the same
RA slot.
Step 2: For each successfully detected preamble, the eNB sends a random access response
(RAR) through the Physical Downlink Shared Channel (PDSCH) which includes a random access
preamble identifier (ID), an uplink (UL) grant that will be used for transmitting the third step of
the RA procedure, a temporary cell identifier (C-RNTI), and a time alignment (TA) command.
Step 3: When the MTCD receives a RAR corresponding to the preamble it transmitted in a
specific RA slot, it sends the connection setup request message to the eNB using the UL grant
assigned in the received RAR.
Step 4: The eNB sends the contention resolution message to the MTCD provided that it
can successfully decode the third message transmitted by the device in the specific UL grant.
Otherwise, the eNB will not transmit any response, and the device assumes that the attempt has
failed and schedules a new RA procedure.
A collision occurs if one preamble is selected by two or more MTCDs in the same RA slot.
If a preamble collision goes undetected in step 1 of the RA procedure, more than one MTCD
transmits its connection setup request message or data through the same UL grant, the eNB
cannot decode the received data successfully in step 3, and a collision occurs [26]. We assume
that for a collided preamble the eNB cannot decode any of the data transmitted in the third
message of the RA procedure, and all of the devices corresponding to such preambles must retry
in subsequent cycles. Specifically, at the end of each cycle the eNB can divide the preambles into
three groups: 1) successful preambles: preambles which are selected by exactly one device,
2) idle preambles: preambles which are not selected by any device, and 3) collided preambles:
preambles which are selected by more than one device.
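The three-way classification above can be illustrated with a small simulation of one cycle. The sketch assumes that every contending device picks a preamble uniformly at random; the function name and the example sizes in the usage note are hypothetical.

```python
import random
from collections import Counter

def classify_preambles(num_devices, num_preambles, seed=0):
    """Simulate one RA cycle and return (successful, idle, collided) preamble counts."""
    rng = random.Random(seed)
    # Each contending device selects one preamble index uniformly at random.
    picks = Counter(rng.randrange(num_preambles) for _ in range(num_devices))
    successful = sum(1 for count in picks.values() if count == 1)  # chosen by exactly one device
    collided = sum(1 for count in picks.values() if count > 1)     # chosen by two or more devices
    idle = num_preambles - len(picks)                              # chosen by nobody
    return successful, idle, collided
```

For instance, `classify_preambles(30, 54)` returns three counts that always add up to the 54 preambles offered in that cycle.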
B. Learning Automata
A learning automaton (LA) is a self-operating learning model designed to operate in
environments with unknown characteristics. This learning model is useful in many applications
involving adaptive decision making. An LA is an automaton that improves its performance by
acquiring knowledge about the behavior of the random environment. It uses the acquired
knowledge for adaptive decision making in the future. The response of the environment to the
action selected by the LA is fed back to the LA as a reward or penalty and is used to update the
selection probability of the action, as shown in Fig. 1 [27].
That is, the LA interacts with the random environment in repetitive cycles so as to find,
among the set of actions, the one that maximizes the average reward received from the
environment. The environment is represented by a triple $E = \{a, b, p\}$, where
$a = \{a_1, a_2, \ldots, a_r\}$ is the environment input set, $b = \{b_1, b_2, \ldots, b_r\}$
represents the environment output set, and $p = \{p_1, p_2, \ldots, p_r\}$ represents the
probability distribution over the $r$ actions at the $t$th cycle, where $\sum_{i=1}^{r} p_i(t) = 1$.
The automaton is known as a P-model one if the environmental responses take only the values
1 and 0, representing penalty and reward, respectively [28], [29].
Fig. 1. An example of learning automata.
Assume that in cycle $t$ the selected action and the corresponding normalized environmental
response are denoted by $a_i$ and $c(t)$, respectively. The probabilities of actions
are then updated in a reinforcement manner according to (1).
$$
p_i(t+1) =
\begin{cases}
p_i(t) - (1 - c(t))\, g_i(p(t)) + c(t)\, h_i(p(t)), & \text{if } a(t) \neq a_i \\
p_i(t) + (1 - c(t)) \sum_{j \neq i} g_j(p(t)) - c(t) \sum_{j \neq i} h_j(p(t)), & \text{if } a(t) = a_i
\end{cases}
\tag{1}
$$
where the functions $g_i$ and $h_i$ are associated with reward and penalty for the selected action $a_i$.
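To make the general update (1) concrete, the following sketch applies one step for arbitrary reward and penalty functions. The particular choice of $g$ and $h$ shown here (a linear reward-inaction rule with a hypothetical rate of 0.1) is an assumption made only for illustration; the text does not prescribe it.

```python
def la_update(p, chosen, c, g, h):
    """One step of update (1).

    p: list of action probabilities, chosen: index of the selected action,
    c: environment response (0 = reward, 1 = penalty),
    g, h: callables (p, j) -> float playing the role of g_j and h_j.
    """
    others = [j for j in range(len(p)) if j != chosen]
    new_p = list(p)
    for j in others:
        # Non-selected actions: first case of (1).
        new_p[j] = p[j] - (1 - c) * g(p, j) + c * h(p, j)
    # Selected action: second case of (1), which keeps the total equal to 1.
    new_p[chosen] = (p[chosen]
                     + (1 - c) * sum(g(p, j) for j in others)
                     - c * sum(h(p, j) for j in others))
    return new_p

def g_lri(p, j, rate=0.1):
    """Assumed reward function: linear reward-inaction with a hypothetical rate."""
    return rate * p[j]

def h_lri(p, j):
    """Assumed penalty function: no change on penalty."""
    return 0.0
```

With these assumed functions, a reward moves probability mass from the other actions to the selected one, and a penalty leaves the vector unchanged.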
C. System Model
We consider a system with $N$ MTCDs corresponding to applications with different priorities
in the coverage area of an eNB in a cell of an LTE network. The MTCDs are grouped into three
priority classes, high, medium, and low, according to their QoS requirements, which are
indicated by $H$, $M$, and $L$ in this paper, respectively. The corresponding numbers of MTCDs in
each class are denoted by $N_H$, $N_M$, and $N_L$. We consider that each MTCD is activated in the
interval $[0, T_s]$ with probability $g(t)$. In [9], two different probability distributions for $g(t)$ are
proposed: the uniform and the beta distributions. In this paper, in order to consider the massive
access scenario in which a large number of MTCDs try to access the network simultaneously,
we assume that the activation of MTCDs in the interval $[0, T_s]$ follows the beta distribution with
parameters $\alpha = 3$ and $\beta = 4$, as follows:
$$
g(t) = \frac{t^{\alpha - 1} (T_s - t)^{\beta - 1}}{T_s^{\alpha + \beta - 1} B(\alpha, \beta)}
\tag{2}
$$
where $B(\alpha, \beta)$ is the beta function [30].
Fig. 2. M2M devices with different priorities in LTE networks
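As a rough illustration of the activation model in (2), the sketch below evaluates the density and draws activation times by scaling a standard beta variate to $[0, T_s]$. The helper names and the seed are hypothetical; only $\alpha = 3$ and $\beta = 4$ come from the text.

```python
import math
import random

ALPHA, BETA = 3, 4  # beta-distribution parameters stated in the text

def activation_density(t, ts):
    """Density g(t) of (2) on the interval [0, ts]."""
    beta_fn = math.gamma(ALPHA) * math.gamma(BETA) / math.gamma(ALPHA + BETA)
    return t ** (ALPHA - 1) * (ts - t) ** (BETA - 1) / (ts ** (ALPHA + BETA - 1) * beta_fn)

def sample_activation_times(num_devices, ts, seed=0):
    """Draw one activation time per device (hypothetical helper)."""
    rng = random.Random(seed)
    # A Beta(alpha, beta) variate on [0, 1] scaled by ts has density g(t) on [0, ts].
    return [ts * rng.betavariate(ALPHA, BETA) for _ in range(num_devices)]
```

The density integrates to one over $[0, T_s]$ and concentrates the activations around $T_s \alpha / (\alpha + \beta)$, which produces the bursty arrivals of the massive access scenario.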
Since most M2M applications have small-sized data to transmit, we assume each
activated device has only one small data packet to transmit in a $T_s$ interval. $T_s$ is divided
into $Z_s$ cycles, each of which consists of two parts. The first part is used for transmission of
the preambles and the second part is used for transmission of the third messages of the RA
procedure, see Fig. 2. In this paper, in order to avoid the signalling burden, we assume that the
small data packets of MTCDs are transmitted to the eNB during the RA procedure. We also
assume that the eNB only knows the average number of access requests from each priority class in
$[0, T_s]$, and that it does not know the number and the start times of traffic bursts or the access
request probabilities of MTCDs in each cycle. The total number of cycles which are required
for serving all the MTCDs corresponding to each class in the activation interval is called the Total
Service Time (TST).
Let $M$ be the number of available preambles for MTCDs in each cycle. To provide QoS for
different priority classes, one can divide the available RACH resources among them according
to their average resource requirements. However, determining a fixed amount of resources for
each priority class may cause a significant degradation in the network throughput when a class
does not utilize its allocated resources in some cycles while another class has more data to
transmit than its allocated RACH resources can carry. We use an LA based
scheme for dynamic assignment of the RACH resources to classes. As mentioned before, an LA
is a useful structure that can provide adaptation to systems operating in environments with
changing and/or unknown characteristics [28]. Here, the number of contending
MTCDs in each cycle is unknown and depends on the stochastic arrival process of random
access requests of the UEs. Furthermore, these access-attempting UEs have different priorities
and demands for uplink resources [7]. We deploy LAs to track the number of contending
MTCDs in each priority class and then adjust the ACB and RACH allocation probabilities for
them. In the proposed scheme, the following prioritization rules for allocating RACH resources
must be satisfied:
1. Each priority class can utilize a certain amount of available resources which is determined
statically based on its priority and average requirement.
2. The unused resources of each priority class should be proportionally allocated to other
priority classes which require more resources.
The initial probability of RACH resource allocation and the corresponding amount of allocated
RACH resources, i.e., the number of allocated preambles, to priority class $x \in \{H, M, L\}$ in
the $t$th cycle are denoted by $q_x(t)$ and $M_x(t)$, respectively. Also, the maximum value of $q_x(t)$ is
denoted by $C_x$. According to the priority and the average number of access-attempting devices
of class $x$ in a $T_s$ interval, the value of $C_x$ is determined statically by the eNB and broadcast
at the beginning of the $T_s$ interval. The MTCDs acquire this information by reading the
broadcast system information blocks (SIBs).
Although a certain amount of the RACH resources is dedicated to each class, the number
of access-attempting MTCDs can be much greater than the assigned resources in the massive
access scenario. Hence, we use an LA based ACB scheme for each class to control the possible
overload. The ACB parameter for priority class $x \in \{H, M, L\}$ in the $t$th cycle is denoted by
$p_x(t)$. The key mathematical symbols and their definitions are presented in Table II.
TABLE II
TABLE OF KEY MATHEMATICAL SYMBOLS
Symbol | Definition
$N$ | Total number of MTCDs in the coverage area of the eNB
$N_H$, $N_M$, $N_L$ | Number of MTCDs in the high, medium, and low priority classes
$T_s$ | Activation interval of all MTCDs
$M$ | Number of available preambles for MTCDs in each cycle
$q_x(t)$ | Probability of RACH resource allocation to priority class $x$ in the $t$th cycle
$M_x(t)$ | Number of preambles allocated to priority class $x$ in the $t$th cycle
$C_x$ | Maximum value of $q_x(t)$
$p_x(t)$ | ACB parameter for priority class $x$ in the $t$th cycle
$N_x(t)$ | Number of access-attempting devices from priority class $x$ in the $t$th cycle
$p_x^{idle}(t)$ | Probability that a preamble from class $x$ remains idle in the $t$th cycle
$p_x^{succ}(t)$ | Probability that a preamble from class $x$ becomes successful in the $t$th cycle
$p_x^{coll}(t)$ | Probability that a preamble from class $x$ becomes collided in the $t$th cycle
$r(t)$ | Feedback array in the $t$th cycle
IV. LEARNING AUTOMATA BASED RANDOM ACCESS SCHEME
For the proposed LA based scheme, two LAs are used in each MTCD. The first LA is
responsible for adjusting the value of $q_x(t)$ and the second LA is used to adjust the barring
factor, i.e., $p_x(t)$. Consider priority class $x$ and let the number of access-attempting devices
which belong to this class in the $t$th cycle be denoted by $N_x(t)$. Each MTCD from priority class $x$
participates in the RA procedure with probability $p_x(t)$ and randomly selects a preamble from the
available $M_x(t)$ preambles with probability $\frac{1}{M_x(t)}$. Hence, the probability that a certain preamble
is selected by an MTCD from priority class $x$ is given by $\frac{p_x(t)}{M_x(t)}$.
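Therefore, the probabilities that this preamble remains idle, is successfully used by exactly one
device, or encounters a collision are given by (3), (4), and (5), respectively.
$$
p_x^{idle}(t) = \left(1 - \frac{p_x(t)}{M_x(t)}\right)^{N_x(t)}. \tag{3}
$$
$$
p_x^{succ}(t) = \binom{N_x(t)}{1} \frac{p_x(t)}{M_x(t)} \left(1 - \frac{p_x(t)}{M_x(t)}\right)^{N_x(t) - 1}. \tag{4}
$$
$$
p_x^{coll}(t) = 1 - \frac{N_x(t)\, p_x(t)}{M_x(t)} \left(1 - \frac{p_x(t)}{M_x(t)}\right)^{N_x(t) - 1}
- \left(1 - \frac{p_x(t)}{M_x(t)}\right)^{N_x(t)}. \tag{5}
$$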
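The three probabilities can be evaluated numerically as in the following sketch. The function name and the guard for an empty class are assumptions added for illustration.

```python
import math

def preamble_state_probs(n, m, p):
    """Return (p_idle, p_succ, p_coll) from (3)-(5).

    n: number of access-attempting devices, m: allocated preambles, p: ACB factor.
    """
    if n == 0:
        return 1.0, 0.0, 0.0  # assumed convention: with no devices every preamble is idle
    pick = p / m  # probability that one device selects a given preamble
    p_idle = (1 - pick) ** n
    p_succ = n * pick * (1 - pick) ** (n - 1)
    p_coll = 1 - p_succ - p_idle
    return p_idle, p_succ, p_coll
```

Choosing the barring factor as $p = m / n$ with a large $n$, for example `preamble_state_probs(1000, 20, 20 / 1000)`, gives idle and success probabilities close to $e^{-1}$ and a collision probability close to $1 - 2e^{-1}$.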
The objective of the proposed scheme is to adjust the values of $q_x(t)$ and $p_x(t)$ such that the
optimal performance is achieved.
If the number of access-attempting devices from priority class $x$ is less than the maximum
allowable RACH resources for this class, the optimal performance is achieved when the number
of preambles allocated to this class is equal to $N_x(t)$, i.e., $M_x(t) = N_x(t)$. On the other hand,
if the number of access-attempting devices from priority class $x$ is greater than the maximum
allowable RACH resources for this class, the optimal performance is achieved when the maximum
allowable RACH resources are allocated to class $x$ and the ACB factor is adjusted such that
$p_x(t) = \frac{M_x(t)}{N_x(t)}$.
The eNB does not know the number of access-attempting devices of the MTC classes in each cycle.
The information available to the eNB is the number of successful, collided, and idle preambles
at the end of each cycle, whose fractions are denoted by $p_x^{succ}(t)$, $p_x^{coll}(t)$, and
$p_x^{idle}(t)$, respectively. Notice that by optimally adjusting the values of $q_x(t)$ and $p_x(t)$ in a
massive access scenario, the probabilities that the eNB finds each preamble in the successful,
idle, and collision states converge to $e^{-1}$, $e^{-1}$, and $1 - 2e^{-1}$, respectively.
In the proposed scheme, we use $p_x^{coll}$ as an indicator to determine the feedback
for each class. This feedback, which is received by the LAs of all devices, is denoted by the array
$r(t) = (r_H(t), r_M(t), r_L(t))$. For each class, $r_x(t)$ takes a binary value as reward or penalty.
At the end of each cycle, the eNB monitors the value of $p_x^{coll}$ for class $x$ and generates $r_x(t)$ by
comparing it with the optimal expected value $v = 1 - 2e^{-1}$. That is:
$$
r_x(t) =
\begin{cases}
0 & \text{if } p_x^{coll}(t) < v \\
1 & \text{if } p_x^{coll}(t) \geq v
\end{cases}
\tag{6}
$$
The eNB broadcasts the generated feedback array $r(t)$ at the end of each cycle through the
downlink broadcast channel.
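A minimal sketch of the feedback rule (6) is given below. It assumes that the eNB estimates $p_x^{coll}(t)$ as the fraction of collided preambles among those allocated to the class; the function names, the dictionary layout, and the handling of an empty allocation are hypothetical.

```python
import math

V = 1 - 2 * math.exp(-1)  # target collision probability v from the text

def class_feedback(num_collided, num_allocated):
    """r_x(t) per (6): 1 if the observed collision fraction is at least v, else 0."""
    if num_allocated == 0:
        return 0  # assumption: with no allocated preambles no collision can be observed
    return 1 if num_collided / num_allocated >= V else 0

def feedback_array(collided, allocated):
    """Build r(t) = (r_H, r_M, r_L) from per-class preamble counts."""
    return tuple(class_feedback(collided[x], allocated[x]) for x in ("H", "M", "L"))
```

For example, with 10 of 20 preambles collided in class H, 2 of 20 in class M, and none in class L, the broadcast array is (1, 0, 0).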
A. Dynamic RACH Resource Allocation
Assume that each MTCD is empowered with a P-model LA. The LA must update $q_x(t)$ after
receiving the feedback array. $r_x(t) = 1$ occurs when the percentage of collisions in class $x$
is greater than or equal to the optimal value. It means that the RACH resources allocated to this
class are less than the optimal value; therefore, $q_x(t)$ should be increased. On the other hand,
$r_x(t) = 0$ indicates that $q_x(t)$ should be decreased. Note that, in order to simplify the analysis
of the proposed scheme, we assume $p_x^{coll} = 1 - 2e^{-1}$ at the optimum. The general updating
procedure of $q_x(t)$ which is used by the LAs in the proposed scheme is given by:
$$
q_x(t+1) =
\begin{cases}
q_x(t) + \Delta_1 & \text{if } r_x(t) = 1 \\
q_x(t) - \Delta_2 & \text{if } r_x(t) = 0
\end{cases}
\tag{7}
$$
where $0 < \Delta_1 < C_x - q_x(t)$ and $0 < \Delta_2 < q_x(t) - a_1$. Here $a_1$ takes a small value and is
used to ensure that the percentage of resources allocated to each class remains greater than zero
when that class has no access request. In the proposed scheme, the LA starts with the maximum
probability of allocating RACH resources to each class, i.e., $q_x(t) = C_x$. After updating the
values of $q_x(t)$ at the end of the $t$th cycle, the values of $q_x(t)$ are normalized by each LA
according to (8).
$$
\sigma_x(t) = \frac{q_x(t)}{q_H(t) + q_M(t) + q_L(t)}, \quad \text{for } x \in \{H, M, L\}. \tag{8}
$$
It is clear that $\sum_{x \in \{H, M, L\}} \sigma_x(t) = 1$. The normalization of the probabilities is used in
the LA based schemes [31]-[33].
The number of preambles that priority class $x$ can use in the $t$th cycle is determined according
to the normalized probability $\sigma_x(t)$ as given in (9).
$$
M_x(t) = M \times \sigma_x(t) \tag{9}
$$
Therefore, the range of preambles which can be used by MTCDs in each priority class is
determined.
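The normalization (8) and the allocation (9) can be sketched as follows. Since $M \times \sigma_x(t)$ is generally not an integer, the sketch rounds with a largest-remainder rule; this rounding rule, like the function names, is an assumption and is not specified in the text.

```python
def normalize(q):
    """sigma_x of (8): divide each q_x by the sum over all classes."""
    total = sum(q.values())
    return {x: value / total for x, value in q.items()}

def allocate_preambles(q, num_preambles):
    """M_x of (9), rounded to integers that add up to num_preambles."""
    sigma = normalize(q)
    shares = {x: num_preambles * sigma[x] for x in q}
    alloc = {x: int(shares[x]) for x in q}  # floor of each share
    leftover = num_preambles - sum(alloc.values())
    # Assumed rounding rule: hand the remaining preambles to the largest fractional parts.
    by_remainder = sorted(q, key=lambda x: shares[x] - alloc[x], reverse=True)
    for x in by_remainder[:leftover]:
        alloc[x] += 1
    return alloc
```

Each class can then be given a contiguous block of preamble indices of length $M_x(t)$, which fixes the range of preambles its devices may draw from.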
To ensure the convergence of $q_x(t)$ to the optimal value, the values of $\Delta_1$ and $\Delta_2$ in (7) should
be selected appropriately. According to (5) and (9), $p_x^{coll}(t)$ is a function of $q_x(t)$, and the optimal
value of $p_x^{coll}(t)$ will be achieved by properly increasing or decreasing $q_x(t)$. We have
$$
\begin{aligned}
\delta q_x(t) &= E[q_x(t+1) - q_x(t)] = p_x^{coll}(t)\Delta_1 - (1 - p_x^{coll}(t))\Delta_2 \\
&= p_x^{coll}(t)\Delta_1 + p_x^{coll}(t)\Delta_2 - \Delta_2 = p_x^{coll}(t)(\Delta_2 + \Delta_1) - \Delta_2 \\
&= (\Delta_1 + \Delta_2)\left(p_x^{coll}(t) - \frac{\Delta_2}{\Delta_1 + \Delta_2}\right).
\end{aligned}
\tag{10}
$$
In fact, $q_x(t)$ increases with probability $p_x^{coll}(t)$ and decreases with probability $1 - p_x^{coll}(t)$. As
we mentioned before, when the number of access-attempting devices is less than the maximum
amount of RACH resources which can be allocated to priority class $x$, if $q_x(t)$ is adjusted to its
optimal value, we should have $p_x^{coll}(t) = v$. To asymptotically converge to the optimal case, the
allocation procedure should be updated according to the following conditions.
1. If $p_x^{coll}(t) < v$ then $\delta q_x(t) < 0$ and therefore $\delta p_x^{coll}(t) > 0$.
2. If $p_x^{coll}(t) > v$ then $\delta q_x(t) > 0$ and therefore $\delta p_x^{coll}(t) < 0$.
3. If $p_x^{coll}(t) = v$ then $\delta q_x(t) = 0$ and therefore $\delta p_x^{coll}(t) = 0$.
According to (10), these conditions are satisfied and the optimal case is achieved provided
that $\delta q_x = 0$ and $\frac{\Delta_2}{\Delta_2 + \Delta_1} = v$. Therefore,
$$
\Delta_1 = \frac{1 - v}{v}\Delta_2 = d_1 \Delta_2, \qquad d_1 = \frac{1 - v}{v} \approx 2.78. \tag{11}
$$
By considering $\Delta_2 = \Delta$, where $0 < \Delta < \frac{C_x - q_x(t)}{d_1}$ and $0 < \Delta < q_x(t) - a_1$, we adjust $\Delta$ by:
$$
\Delta = L_1 \left(C_x - q_x(t)\right)\left(q_x(t) - a_1\right), \quad \text{where } L_1 \in (0, 1). \tag{12}
$$
In sum, the RACH allocation updating procedure is given by (13).
$$
q_x(t+1) =
\begin{cases}
q_x(t) + d_1 L_1 \left(C_x - q_x(t)\right)\left(q_x(t) - a_1\right) & \text{if } r_x(t) = 1 \\
q_x(t) - L_1 \left(C_x - q_x(t)\right)\left(q_x(t) - a_1\right) & \text{if } r_x(t) = 0
\end{cases}
\tag{13}
$$
where $L_1 \in (0, 1)$ is the step size of the probability updating procedure. The convergence speed
as well as the estimation accuracy of the automaton depend on the value of $L_1$. By the updating
procedure in (13), $q_x(t)$ is changed according to the requirements of each class and takes a value
in the interval $(a_1, C_x)$. Also, note that the two prioritization rules stated in
the system model section are satisfied.
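One step of the allocation update (13) can be sketched as below. The default values of $a_1$ and $L_1$ are hypothetical placeholders chosen only to make the example runnable; $d_1$ is computed from $v = 1 - 2e^{-1}$ rather than hard-coded.

```python
import math

V = 1 - 2 * math.exp(-1)  # target collision probability v
D1 = (1 - V) / V          # ratio d1 between the increase and decrease steps

def update_q(q, r, c_max, a1=0.01, l1=0.1):
    """One step of (13) for a single class.

    q: current q_x(t), r: feedback r_x(t), c_max: C_x,
    a1, l1: hypothetical values for the floor a_1 and the step size L_1.
    """
    step = l1 * (c_max - q) * (q - a1)  # Delta of (12); it vanishes at q = a1 and q = c_max
    if r == 1:
        return q + D1 * step  # too many collisions: ask for more RACH resources
    return q - step           # few collisions: release RACH resources
```

Because the step is proportional to both $(C_x - q_x(t))$ and $(q_x(t) - a_1)$, an update started inside $(a_1, C_x)$ stays inside that interval for these parameter values.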
As a special case, consider a scenario in which each class experiences massive access by a large
number of access-attempting devices. In this case, $q_x(t)$ will converge to $C_x$ and we have
$$
\sigma_x(t) = \frac{C_x}{C_L + C_M + C_H}, \quad \text{for } x \in \{H, M, L\}. \tag{14}
$$
That is, all classes use their maximum preassigned RACH resources.
Now, consider a scenario in which class $H$ has no traffic to transmit while the other
two classes are in massive access mode. In this case, $q_x(t)$ for classes $H$, $M$, and $L$ converges to
$a_1$, $C_M$, and $C_L$, respectively, and we have:
$$
\sigma_H(t) = \frac{a_1}{C_L + C_M + a_1}, \tag{15}
$$
and
$$
\sigma_x(t) = \frac{C_x}{C_L + C_M + a_1}, \quad \text{for } x \in \{M, L\}. \tag{16}
$$
Therefore, the RACH resources unused by class $H$ are proportionally allocated to the other
two classes, as expected.
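A short numerical example of (14)-(16) is given below. The values chosen for $C_H$, $C_M$, $C_L$, and $a_1$ are hypothetical and serve only to show how the shares shift when class $H$ is silent.

```python
def shares(q):
    """Normalized shares sigma_x obtained from the limiting values of q_x."""
    total = sum(q.values())
    return {x: value / total for x, value in q.items()}

C = {"H": 0.5, "M": 0.3, "L": 0.2}  # hypothetical maximum values C_x
A1 = 0.01                           # hypothetical small floor a_1

all_massive = shares(C)                                    # limiting case (14)
h_silent = shares({"H": A1, "M": C["M"], "L": C["L"]})     # limiting cases (15) and (16)
```

With these numbers the shares of classes $M$ and $L$ grow from 0.3 and 0.2 to roughly 0.59 and 0.39, while their ratio $C_M / C_L$ is preserved.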
B. Dynamic Adjusting of ACB Factors
When the number of access-attempting devices in a class is greater than the maximum amount
of allocatable RACH resources, the ACB probability should be adjusted properly in order to
reduce the chance of collisions. We use another P-model based LA in each MTCD to adjust
the corresponding ACB factor. For this purpose, similar to the updating procedure for $q_x(t)$, at
cycle $t$ each MTCD updates $p_x(t)$ using the broadcast $r_x(t)$. Notice that if $p_x(t)$ is adjusted
appropriately, $p_x^{coll}(t)$ converges to $v = 1 - 2e^{-1}$. The updating procedure is defined as
follows.
$$
p_x(t+1) =
\begin{cases}
p_x(t) + \Delta_1 & \text{if } r_x(t) = 0 \\
p_x(t) - \Delta_2 & \text{if } r_x(t) = 1
\end{cases}
\tag{17}
$$
where $0 < \Delta_1 < 1 - p_x(t)$ and $0 < \Delta_2 < p_x(t) - a_2$. Again, $a_2$ is an appropriate small value, and
$\Delta_1$ and $\Delta_2$ should be selected such that the updating procedure for the ACB factor converges
to the optimal value asymptotically. According to (17) we have:
$$
\begin{aligned}
\delta p_x(t) &= E[p_x(t+1) - p_x(t)] = (1 - p_x^{coll}(t))\Delta_1 - p_x^{coll}(t)\Delta_2 \\
&= \Delta_1 - p_x^{coll}(t)\Delta_1 - p_x^{coll}(t)\Delta_2 = \Delta_1 - p_x^{coll}(t)(\Delta_1 + \Delta_2) \\
&= (\Delta_1 + \Delta_2)\left(-p_x^{coll}(t) + \frac{\Delta_1}{\Delta_1 + \Delta_2}\right).
\end{aligned}
\tag{18}
$$
To ensure that $p_x^{coll}(t) = v$, the ACB factor updating procedure must satisfy the following
conditions.
1. If $p_x^{coll}(t) < v$ then $\delta p_x(t) > 0$ and therefore $\delta p_x^{coll}(t) > 0$.
2. If $p_x^{coll}(t) > v$ then $\delta p_x(t) < 0$ and therefore $\delta p_x^{coll}(t) < 0$.
3. If $p_x^{coll}(t) = v$ then $\delta p_x(t) = 0$ and therefore $\delta p_x^{coll}(t) = 0$.
According to (18), these conditions are satisfied and the optimal case is achieved when
$\delta p_x(t) = 0$ and $\frac{\Delta_1}{\Delta_1 + \Delta_2} = v$. Therefore, we have:
$$
\Delta_1 = \frac{v}{1 - v}\Delta_2 = d_2 \Delta_2, \qquad d_2 = \frac{v}{1 - v} = 0.359. \tag{19}
$$
By considering $\Delta_2 = \Delta$, where $0 < \Delta < \frac{1 - p_x(t)}{d_2}$ and $0 < \Delta < p_x(t) - a_2$, we adjust $\Delta$ as follows:
$$
\Delta = L_2 \left(1 - p_x(t)\right)\left(p_x(t) - a_2\right), \quad \text{where } L_2 \in (0, 1). \tag{20}
$$
In sum, the ACB updating procedure is given by (21).
$$
p_x(t+1) =
\begin{cases}
p_x(t) + d_2 L_2 \left(1 - p_x(t)\right)\left(p_x(t) - a_2\right) & \text{if } r_x(t) = 0 \\
p_x(t) - L_2 \left(1 - p_x(t)\right)\left(p_x(t) - a_2\right) & \text{if } r_x(t) = 1
\end{cases}
\tag{21}
$$
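The role of $d_2$ can be checked numerically. The sketch below implements one step of (21) and the expected drift of (18), treating $p_x^{coll}(t)$ as the probability of receiving $r_x(t) = 1$ as the analysis does; the default values of $a_2$ and $L_2$ are hypothetical.

```python
import math

V = 1 - 2 * math.exp(-1)  # target collision probability v
D2 = V / (1 - V)          # ratio d2 from (19)

def update_p(p, r, a2=0.01, l2=0.1):
    """One step of (21); a2 and l2 are hypothetical values for a_2 and L_2."""
    step = l2 * (1 - p) * (p - a2)  # Delta of (20)
    if r == 0:
        return p + D2 * step  # few collisions: relax the barring
    return p - step           # too many collisions: bar more devices

def expected_drift_p(p, p_coll, a2=0.01, l2=0.1):
    """Expected change of p per (18), with r = 1 occurring with probability p_coll."""
    up = update_p(p, 0, a2, l2) - p
    down = update_p(p, 1, a2, l2) - p
    return (1 - p_coll) * up + p_coll * down
```

The drift is positive for $p_x^{coll}(t) < v$, negative for $p_x^{coll}(t) > v$, and vanishes at $p_x^{coll}(t) = v$, which matches the three conditions listed above.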
C. State Diagram of the LA Based Scheme
The state diagram of the proposed scheme is illustrated in Fig. 3. Consider an MTCD from
classx. According to Fig. 3, at the first step the values ofqx(t) and px(t) for this device
are initialized by the corresponding maximum values, i.e.,Cx and 1, respectively. Then, if the
received feedback is 0,px(t) remains constant andqx(t) is decreased according to (13). Therefore,
the percentage of allocated RACH resources to priority classx is decreased and may be used
for other priority classes which require more resources. Otherwise, if the received feedback is
1, the value ofqx(t) is increased until it reaches to its maximum value and then the value of
px(t) is decreased to bare the massive access of this class taking into account the maximum
allocatable RACH resources.
February 15, 2017 DRAFT
17
Fig. 3. The state transition diagram of the LA based scheme.
In this state, if the received feedback changes to 0, at the first the value ofpx(t) is increased
until it reaches to its maximum value and then the value ofqx(t) is decreased. Note that, the
value of px(t) can be decreased only whenqx(t) is adjusted by its maximum value, i.e.,Cx.
Also, the value ofqx(t) can be decreased only whenpx(t) is adjusted by its maximum value,
i.e., 1.
Put together, the probability updating procedures for $q_x(t)$ and $p_x(t)$ are given by (22) and (23).
$$q_x(t+1) = \begin{cases} q_x(t) + d_1 L_1 (C_x - q_x(t))(q_x(t) - a_1) & \text{if } r_x(t) = 1 \\ q_x(t) - L_1 (C_x - q_x(t))(q_x(t) - a_1) & \text{if } r_x(t) = 0 \text{ and } p_x(t) = 1 \end{cases} \tag{22}$$
and
$$p_x(t+1) = \begin{cases} p_x(t) + d_2 L_2 (1 - p_x(t))(p_x(t) - a_2) & \text{if } r_x(t) = 0 \\ p_x(t) - L_2 (1 - p_x(t))(p_x(t) - a_2) & \text{if } r_x(t) = 1 \text{ and } q_x(t) = C_x \end{cases} \tag{23}$$
where $L_1, L_2 \in (0, 1)$.
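The coupling between (22) and (23) can be made concrete with a short sketch of one joint update step. It is a hedged illustration, not the authors' implementation: the function name `la_step`, the tolerance `eps`, and all numeric parameter values (including the value used for $d_1$) are assumptions introduced here. The tolerance is needed because the updates approach $C_x$ and 1 only asymptotically, so the conditions $q_x(t) = C_x$ and $p_x(t) = 1$ are tested approximately.

```python
def la_step(q, p, r, Cx, L1, L2, d1, d2, a1, a2, eps=1e-6):
    """One joint update of (q_x, p_x) following (22) and (23).

    eps is an assumption of this sketch: "at its maximum value" is
    tested within eps instead of by exact floating-point equality.
    """
    q_at_max = (Cx - q) <= eps
    p_at_max = (1.0 - p) <= eps
    gq = (Cx - q) * (q - a1)          # common factor in (22)
    gp = (1.0 - p) * (p - a2)         # common factor in (23)

    q_next, p_next = q, p
    if r == 1:
        q_next = q + d1 * L1 * gq     # (22), first branch
        if q_at_max:
            p_next = p - L2 * gp      # (23), second branch
    else:
        p_next = p + d2 * L2 * gp     # (23), first branch
        if p_at_max:
            q_next = q - L1 * gq      # (22), second branch
    return q_next, p_next


# Illustrative parameters only; d1 = 2.0 is a placeholder value.
PARAMS = dict(Cx=0.5, L1=0.1, L2=0.1, d1=2.0, d2=0.359, a1=0.01, a2=0.01)

# Feedback 1 with spare RACH share: q grows, p is left untouched.
q1, p1 = la_step(0.3, 1.0, 1, **PARAMS)
# Feedback 1 with q already at C_x: p is lowered instead.
q2, p2 = la_step(0.5, 0.8, 1, **PARAMS)
```

The two guards reproduce the ordering described above: $p_x(t)$ is lowered only once $q_x(t)$ has reached $C_x$, and $q_x(t)$ is lowered only once $p_x(t)$ has returned to 1.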
V. PERFORMANCE ANALYSIS
Depending on the number of access-attempting devices in each class, two different situations can occur in the $t$th cycle:
1. If the number of access-attempting devices from class $x$ is less than the maximum amount of RACH resources which can be assigned to this class, the optimal performance is achieved when the number of preambles allocated to class $x$ is equal to $N_x(t)$, i.e., $M_x(t) = N_x(t)$. Therefore, the optimal value of $\sigma_x(t)$ is $\frac{N_x(t)}{M}$:
$$\sigma_x(t) = \frac{q_x(t)}{\sum_{i=1}^{k} q_i(t)} = \frac{N_x(t)}{M}. \tag{24}$$
Hence, the optimal value of $q_x(t)$ is given by (25).
$$q_x(t) = \frac{N_x(t)}{M}\Big(\sum_{i=1,\, i \neq x}^{k} q_i(t) + q_x(t)\Big). \tag{25}$$
According to (25), and since the maximum value of $q_x(t)$ is bounded by $C_x$, we conclude that the optimal value for $q_x(t)$ is:
$$\min\Big\{ \frac{\frac{N_x(t)}{M} \sum_{i=1,\, i \neq x}^{k} q_i(t)}{1 - \frac{N_x(t)}{M}},\; C_x \Big\}. \tag{26}$$
2. If the number of access-attempting devices from class $x$ is greater than the maximum allowable RACH resources for this class, the optimal performance is achieved when the maximum allowable RACH resources are allocated to class $x$, i.e., $q_x(t) = C_x$, and the number of participating MTCDs is limited by the optimal value of $p_x(t)$ given by (27).
$$p_x(t) = \frac{M_x(t)}{N_x(t)}. \tag{27}$$
According to (27), and since the maximum value of $p_x(t)$ is 1, we conclude that the optimal value for $p_x(t)$ is:
$$\min\Big\{ \frac{M_x(t)}{N_x(t)},\; 1 \Big\}. \tag{28}$$
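The two closed-form targets (26) and (28) can be evaluated directly. The sketch below is illustrative only: the function names and the handling of the degenerate inputs ($N_x(t) \geq M$ and $N_x(t) = 0$) are assumptions made here so that the code is well defined for every input.

```python
def optimal_q(N_x, M, q_others, C_x):
    """Optimal q_x(t) from (26); q_others is the sum of q_i(t) over i != x."""
    load = N_x / M
    if load >= 1.0:
        # Assumption: (25) has no finite positive solution, so cap at C_x.
        return C_x
    return min(load * q_others / (1.0 - load), C_x)


def optimal_p(M_x, N_x):
    """Optimal p_x(t) from (28)."""
    if N_x == 0:
        # Assumption: with no contending devices, no barring is applied.
        return 1.0
    return min(M_x / N_x, 1.0)


# 10 devices, 50 preambles, other classes holding a total weight of 0.5.
q_star = optimal_q(10, 50, 0.5, 0.5)
sigma = q_star / (q_star + 0.5)       # share of preambles, as in (24)
p_star = optimal_p(25, 1000)          # 25 preambles for 1000 devices
```

In this example the uncapped value of $q_x(t)$ yields a share $\sigma_x(t)$ equal to $N_x(t)/M$, which is the condition stated in (24).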
The asymptotic behaviors of $q_x(t)$ and $p_x(t)$ are given by Lemma 1 and Lemma 2.
Lemma 1: If the number of access-attempting devices in priority class $x$ in the $t$th cycle is $N_x(t)$, which is less than the maximum amount of RACH resources which can be assigned to this class, and the probability updating procedure (13) is used, we have:
$$\lim_{L_1 \to 0,\; a_1 \to 0,\; t \to \infty} q_x(t) = \min\Big\{ \frac{\frac{N_x(t)}{M} \sum_{i=1,\, i \neq x}^{k} q_i(t)}{1 - \frac{N_x(t)}{M}},\; C_x \Big\}. \tag{29}$$
Proof: see Appendix A.
Lemma 2: If the number of access-attempting devices in priority class $x$ in the $t$th cycle is $N_x(t)$ and this class uses the maximum allocatable resources, then using the updating procedure (21) we have:
$$\lim_{L_2 \to 0,\; a_2 \to 0,\; t \to \infty} p_x(t) = \min\Big\{ \frac{M_x(t)}{N_x(t)},\; 1 \Big\}. \tag{30}$$
Proof: see Appendix A.
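The statement of Lemma 2 can be checked numerically with a deterministic mean-drift iteration. The sketch below is not the authors' simulation: it replaces the random feedback by its expectation, assumes that $r_x(t) = 1$ is observed with probability $p^{coll}_x(t)$ as written in (50), takes $v = 1 - 2/e$ (consistent with $d_2 = 0.359$), and uses illustrative values for $L_2$, $a_2$, and the starting point.

```python
import math

V = 1.0 - 2.0 / math.e        # assumed threshold v, gives d2 = 0.359
D2 = V / (1.0 - V)


def p_coll(p, N, M_x):
    """Collision probability as written in (50)."""
    s = 1.0 - p / M_x
    return 1.0 - (N * p / M_x) * s ** (N - 1) - s ** N


def mean_acb_trajectory(N, M_x, p0=0.9, L2=0.1, a2=0.001, steps=5000):
    """Iterate p <- p + E[delta p], weighting the branches of (21)."""
    p = p0
    for _ in range(steps):
        delta = L2 * (1.0 - p) * (p - a2)
        pc = p_coll(p, N, M_x)
        # With probability 1 - pc the feedback is 0 (raise p by d2 * delta),
        # with probability pc the feedback is 1 (lower p by delta).
        p += (1.0 - pc) * D2 * delta - pc * delta
    return p


# 1000 contending devices sharing 25 preambles.
p_end = mean_acb_trajectory(N=1000, M_x=25)
print(p_end, 25 / 1000)
```

Under these assumptions the iteration settles close to $M_x(t)/N_x(t)$, the value given in (30). The starting point is chosen below 1 because $p_x(t) = 1$ is itself a fixed point of (21).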
VI. SIMULATION RESULTS
In this section, we evaluate the performance of the proposed LA based scheme and compare it against the optimal and fixed allocation schemes. In the optimal allocation, we assume that the eNB knows the number of access-attempting devices from each class in each cycle and allocates preambles to them, taking into account the maximum allocatable RACH resources of each class. Hence, the RACH allocation and ACB probabilities are assigned in the optimal manner. In the fixed allocation, a fixed number of preambles is statically pre-allocated to each class by the eNB according to the priority and the average number of access-attempting devices in that class in a $T_s$ interval, as given by (31).
$$M_x = \frac{N_x C_x M}{N_H C_H + N_M C_M + N_L C_L}, \quad \text{for } x \in \{H, M, L\}. \tag{31}$$
Fig. 4. The average access delay vs. the number of MTCDs for three priority classes with $Z_s = 200$. [Plot: x-axis, number of access requests; y-axis, average access delay (cycles); curves for the high, medium, and low priority classes.]
We assume that one RA slot occurs in each cycle and that 50 preambles are reserved in each RA slot for use by the three priority classes. The values of $C_H$, $C_M$, and $C_L$ are set to 0.5, 0.3, and 0.2, respectively.
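As a worked example of the fixed allocation rule (31) with these settings, the following sketch computes the static shares $M_x$. The function name `fixed_allocation` is hypothetical, and (31) is evaluated as a real number without rounding to whole preambles, since the rounding rule is not specified.

```python
def fixed_allocation(N, C, M):
    """Static preamble shares M_x from (31).

    N, C : dicts keyed by class label with average demand N_x and cap C_x
    M    : total number of preambles per RA slot
    """
    total = sum(N[x] * C[x] for x in N)
    return {x: N[x] * C[x] * M / total for x in N}


C = {"H": 0.5, "M": 0.3, "L": 0.2}    # caps used in the simulations

# Equal average demand in all three classes.
equal = fixed_allocation({"H": 1000, "M": 1000, "L": 1000}, C, M=50)
# Unequal demand: the shares shift toward the more heavily loaded classes.
skewed = fixed_allocation({"H": 1000, "M": 5000, "L": 10000}, C, M=50)
print(equal)
print(skewed)
```

With equal demand the shares reduce to $C_x M$, i.e., 25, 15, and 10 preambles. With demands of 1000, 5000, and 10000 devices the rule assigns 6.25, 18.75, and 25 preambles to classes $H$, $M$, and $L$, so a static rule driven by average load can leave the high priority class with the smallest share.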
In Fig. 4, the average access delay of a typical MTCD in each class is shown for different numbers of MTCDs in the three classes under the proposed LA based scheme. The number of cycles in the activation interval is $Z_s = 200$. As expected, in the massive access scenario the average access delay for each priority class depends on the percentage of resources reserved for that class. That is, each class exploits its maximum allocatable RACH resources, and the MTCDs which belong to the high priority class incur less average access delay as the number of access requests from each class is increased.
We then evaluate the number of preambles allocated to the different classes in consecutive cycles of the TST interval in the proposed scheme, as shown in Fig. 5. In this simulation, the number of MTCDs in each of the three classes is initially equal to 1000 and $Z_s = 20$. It is clear that when each class has data for transmission, the number of preambles allocated to that class is proportional to the maximum percentage of resources which can be allocated to it. However, when all of the access requests from priority class $x$ are served, the corresponding RACH resources of this class should be allocated to the other classes proportionally. For example, at cycle 535, when priority class $H$ has no more access requests, the RACH resources which can be used by it are allocated to priority classes $M$ and $L$ proportionally. Also, as expected and as can be seen in Fig. 5, in this case a small percentage of RACH resources is still allocated to priority class $H$, which corresponds to parameter $a_1$ in the LA scheme.
Fig. 5. The number of allocated preambles to three priority classes when $N_H = N_M = N_L = 1000$ and $Z_s = 20$. [Plot: x-axis, cycle; y-axis, number of allocated preambles; curves for the high, medium, and low priority classes.]
In order to compare the proposed scheme with the optimal and fixed allocation schemes, we consider a $B(3, 4)$ traffic model for each class.
We consider three traffic bursts for class $H$ in the $T_s$ interval. The first burst starts at cycle 0 and lasts for 20 cycles with 500 requesting devices. The second and third bursts of this class start at the 200th and 400th cycles, respectively, each with a duration of 20 cycles and 250 requesting devices. Also, we consider two traffic bursts for the medium priority class, starting at the 0th and 500th cycles, each with a duration of 100 cycles. The number of requesting devices in the two bursts is equal to 2500. Finally, a traffic burst is generated by 10000 low priority devices at the 0th cycle with a duration of 100 cycles in the $T_s$ interval. The numbers of allocated preambles in consecutive cycles of the TST interval for the proposed LA based scheme, the optimal allocation, and the fixed allocation scheme are depicted in Fig. 6 (a), (b), and (c) for priority classes $H$, $M$, and $L$, respectively. In this simulation $Z_s = 600$.
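One way such a burst could be generated is sketched below. This is a hedged illustration rather than the simulator used for the reported results: it assumes that each device draws its activation time within the burst from a Beta(3, 4) distribution scaled to the burst duration, and that times are mapped to cycles by truncation. The function name `burst_arrivals` and the fixed random seed are choices made here.

```python
import random


def burst_arrivals(n_devices, start, duration, rng):
    """Per-cycle counts of newly activated devices for one traffic burst.

    Each device activates at start + duration * X with X ~ Beta(3, 4).
    Mapping activation times to cycles by truncation is an assumption
    of this sketch.
    """
    counts = [0] * duration
    for _ in range(n_devices):
        offset = int(duration * rng.betavariate(3, 4))
        counts[min(offset, duration - 1)] += 1
    return {start + k: c for k, c in enumerate(counts)}


rng = random.Random(0)
# First class H burst: 500 devices, starting at cycle 0, lasting 20 cycles.
first_h_burst = burst_arrivals(500, start=0, duration=20, rng=rng)
# Second class H burst: 250 devices, starting at cycle 200.
second_h_burst = burst_arrivals(250, start=200, duration=20, rng=rng)
```

The resulting dictionaries map a cycle index to the number of devices that become active in that cycle, which is the per-cycle quantity $N_x(t)$ builds on.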
As expected, when there is no request from class $H$, the minimum allowable number of preambles is allocated to this class and the remaining preambles are allocated to the other classes proportionally. However, as the second burst of this class starts, the number of allocated preambles increases again and the MTCDs in class $H$ exploit the maximum allocatable RACH resources, i.e., $C_H$. The same trend is observed for the other two classes.
Fig. 6. (a) The number of allocated preambles to priority class $H$ in different cycles of the TST interval. (b) The number of allocated preambles to priority class $M$ in different cycles of the TST interval. (c) The number of allocated preambles to priority class $L$ in different cycles of the TST interval. [Plots: x-axis, cycle; y-axis, number of allocated preambles; curves for the proposed LA based scheme, optimal allocation, and fixed allocation.]
Also, the proposed LA based scheme follows the optimal scheme, in which we assume that the eNB knows the number of access requests in each cycle, and it performs better than the fixed allocation scheme in terms of decreasing the TST. Note that the small differences observed between the proposed LA based scheme and the optimal allocation arise because the learning process in the proposed scheme is done in two steps, namely learning the RACH allocation and learning the ACB factors. For this scenario, the corresponding variations of the ACB factors in different cycles of the TST interval for the proposed LA based scheme and the optimal allocation scheme are shown in Fig. 7(a), 7(b), and 7(c). The results show that the proposed LA based scheme can successfully follow the optimal decisions.
Fig. 7. The values of the ACB factor in different cycles of the TST interval for (a) class $H$, (b) class $M$, (c) class $L$. [Plots: x-axis, cycle; y-axis, ACB factor; curves for the proposed LA based scheme and optimal allocation.]
The average access delay versus the number of allocated preambles for M2M communications for each of the requesting devices in priority classes $H$, $M$, and $L$ is depicted in Fig. 8(a), 8(b), and 8(c), respectively. In this simulation, $N_H = 1000$, $N_M = 5000$, $N_L = 10000$, and $Z_s = 600$. The traffic bursts of each class follow the beta distribution, where the start time of each burst is uniformly distributed in the $T_s$ interval. Also, the number of access-attempting devices in each traffic burst is drawn uniformly at random between 1 and the number of access-attempting devices. The simulation is performed for 200 runs and the averages are reported. We find that the average access delay decreases as the number of allocated preambles increases. Also, the proposed scheme performs close to the optimal case and has better performance compared to the fixed allocation scheme.
In Fig. 9, we provide a sensitivity analysis of the proposed scheme with respect to variations around the proper value of the learning parameter $L$. In this figure, the cumulative distribution function (CDF) of the average access delay for the different priority classes is shown for the tuned learning parameter, i.e., the proper $L$, and for $L \pm 0.15L$. The simulation is performed for 300 runs and the averages are reported. The results of this simulation show that the proposed scheme is not very sensitive to the learning parameter and loading.
Fig. 8. Average access delay vs. the number of preambles for (a) class $H$, (b) class $M$, (c) class $L$. [Plots: x-axis, number of preambles; y-axis, average access delay (cycles); curves for the proposed LA based scheme, optimal allocation, and fixed allocation.]
VII. CONCLUSION
In this paper, we focused on supporting different priority classes of MTC devices in the resource allocation procedure. We presented an LA based scheme for allocating RACH resources and adjusting the ACB factors for classes of MTC devices. Simulation results show that the proposed scheme allocates the RACH resources and adjusts the ACB factors of each priority class properly. Also, it has better performance compared to the fixed allocation and follows the optimal scheme in which the eNB knows the number of access requests in each cycle.
VIII. APPENDIX A
We use the following theorem from [34] for proving the asymptotic behaviors of the proposed
LAs.
Fig. 9. CDF of the average access delay for (a) class $H$, (b) class $M$, (c) class $L$. [Plots: x-axis, cycle; y-axis, CDF of average access delay; curves for $L$, $L + 0.15L$, and $L - 0.15L$.]
Theorem 1 [34]: Let $\{x(t)\}_{t \geq 0}$ be a stationary Markov process dependent on a constant parameter $\theta \in [0, 1]$. Each $x(t) \in I$, where $I$ is a subset of the real line. Let $\delta x(t) = x(t+1) - x(t)$. The following are assumed to hold:
i. $I$ is compact.
ii. $E[\delta x(t) \mid x(t) = y] = \theta \omega(y) + O(\theta^2)$.
iii. $E[|\delta x(t)|^2 \mid x(t) = y] = \theta^2 b(y) + O(\theta^2)$,
where $\sup \frac{O(\theta^k)}{\theta^k} < \infty$ for $k \geq 2$ and $\sup \frac{O(\theta^2)}{\theta^2} \to 0$ as $\theta \to 0$.
iv. $\omega(y)$ is Lipschitz in $I$.
v. $b(y)$ is continuous in $I$.
If assumptions (i)-(v) hold, for small values of the parameter $\theta$, $\omega(y)$ has a unique root $y^*$ in $I$ and $\frac{d\omega}{dy}\big|_{y = y^*} < 0$.
Proof of Lemma 1: To use Theorem 1, identify $x(t)$ with $q_x(t)$, $\theta$ with $L_1$, and $I$ with $(0, 1)$. We have:
$$\begin{aligned} E[\delta q_x(t) \mid q_x(t)] &= p^{coll}_x(t)\big(d_1 L_1 (C_x - q_x(t))(q_x(t) - a_1)\big) + (1 - p^{coll}_x(t))\big(-L_1 (C_x - q_x(t))(q_x(t) - a_1)\big) \\ &= L_1 (1 + d_1)(C_x - q_x(t))(q_x(t) - a_1)(p^{coll}_x(t) - v) \\ &= L_1 \omega(q_x(t)) \end{aligned} \tag{32}$$
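and
$$\begin{aligned} E[|\delta q_x(t)|^2 \mid q_x(t)] &= p^{coll}_x(t)\big(d_1 L_1 (C_x - q_x(t))(q_x(t) - a_1)\big)^2 + (1 - p^{coll}_x(t))\big(-L_1 (C_x - q_x(t))(q_x(t) - a_1)\big)^2 \\ &= L_1^2 \big((C_x - q_x(t))(q_x(t) - a_1)\big)^2 \big(1 + p^{coll}_x(t)(d_1^2 - 1)\big) \\ &= L_1^2 b(q_x(t)) + O(L_1^2). \end{aligned} \tag{33}$$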
The functions $\omega(q_x(t))$ and $b(q_x(t))$ are defined as follows:
$$\omega(q_x(t)) = (1 + d_1)(C_x - q_x(t))(q_x(t) - a_1)(p^{coll}_x(t) - v) \tag{34}$$
$$b(q_x(t)) = \big((C_x - q_x(t))(q_x(t) - a_1)\big)^2 \big(1 + p^{coll}_x(t)(d_1^2 - 1)\big) \tag{35}$$
As can be seen from (34) and (35), $\omega(q_x(t))$ is a Lipschitz function in $(0, 1)$ and $b(q_x(t))$ is a continuous function in $(0, 1)$. Therefore, assumptions (i)-(v) are satisfied for small values of $L_1$. For the convergence of $q_x(t)$ to the optimal point, $E[\delta q_x(t) \mid q_x(t)]$ must converge to 0. Accordingly, we should have
$$\omega(q_x(t)) = 0. \tag{36}$$
Therefore,
$$(1 + d_1)(C_x - q_x(t))(q_x(t) - a_1)(p^{coll}_x(t) - v) = 0. \tag{37}$$
There are three possible roots of $\omega(q_x(t))$. The first root is $q_x(t) = C_x$, which means that the maximum percentage of allocatable resources is used for class $x$. In this case, the updating procedure in (13) does not affect $q_x(t)$ and the system is stable. The second root is $q_x(t) = a_1$, which means that there are no available resources for class $x$; hence, again, the updating procedure does not affect $q_x(t)$ and the system is stable. The third root occurs when
$$p^{coll}_x(t) = v, \tag{38}$$
where in this case the updating procedure is running. Therefore,
$$1 - \frac{N_x(t)}{M\sigma_x(t)}\Big(1 - \frac{1}{M\sigma_x(t)}\Big)^{N_x(t) - 1} - \Big(1 - \frac{1}{M\sigma_x(t)}\Big)^{N_x(t)} = v. \tag{39}$$
Consequently,
$$\sigma_x(t) = \frac{N_x(t)}{M}. \tag{40}$$
And therefore,
$$q^*_x(t) = \frac{\frac{N_x(t)}{M} \sum_{i=1,\, i \neq x}^{k} q_i(t)}{1 - \frac{N_x(t)}{M}}. \tag{41}$$
If the updating procedure (13) is used, the optimal value for $q_x(t)$ is obtained according to (41). Since the value of $q_x(t)$ cannot be greater than $C_x$, we have:
$$\lim_{L_1 \to 0,\; a_1 \to 0,\; t \to \infty} q_x(t) = \min\Big\{ \frac{\frac{N_x(t)}{M} \sum_{i=1,\, i \neq x}^{k} q_i(t)}{1 - \frac{N_x(t)}{M}},\; C_x \Big\}. \tag{42}$$
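The step from (39) to (40) can be checked numerically. The sketch below evaluates the left-hand side of (39) when the number of preambles given to the class, $M\sigma_x(t)$, equals $N_x(t)$. It assumes $v = 1 - 2/e$, a value that is consistent with $d_2 = v/(1 - v) = 0.359$ in (19); the function name `p_coll` is a choice made here.

```python
import math


def p_coll(N, m):
    """Left-hand side of (39) for N devices and m = M * sigma_x(t) preambles."""
    s = 1.0 - 1.0 / m
    return 1.0 - (N / m) * s ** (N - 1) - s ** N


v_limit = 1.0 - 2.0 / math.e      # assumed value of v

# With m = N (i.e., sigma_x(t) = N_x(t) / M) the value approaches v_limit.
for N in (10, 100, 1000):
    print(N, p_coll(N, N), v_limit)

# Fewer preambles than devices pushes the value above v, more pushes it below.
print(p_coll(1000, 500), p_coll(1000, 2000))
```

Since the left-hand side of (39) decreases as $M\sigma_x(t)$ grows, the crossing with $v$ is unique in this range, which is why (40) identifies a single operating point.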
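Proof of Lemma 2: In order to use Theorem 1, identify $x(t)$ with $p_x(t)$, $\theta$ with $L_2$, and $I$ with $(0, 1)$. We have
$$\begin{aligned} E[\delta p_x(t) \mid p_x(t)] &= p^{coll}_x(t)\big(d_2 L_2 (1 - p_x(t))(p_x(t) - a_2)\big) + (1 - p^{coll}_x(t))\big(-L_2 (1 - p_x(t))(p_x(t) - a_2)\big) \\ &= L_2 (1 + d_2)(1 - p_x(t))(p_x(t) - a_2)(p^{coll}_x(t) - v) \\ &= L_2 \omega(p_x(t)), \end{aligned} \tag{43}$$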
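and
$$\begin{aligned} E[|\delta p_x(t)|^2 \mid p_x(t)] &= p^{coll}_x(t)\big(d_2 L_2 (1 - p_x(t))(p_x(t) - a_2)\big)^2 + (1 - p^{coll}_x(t))\big(-L_2 (1 - p_x(t))(p_x(t) - a_2)\big)^2 \\ &= L_2^2 \big((1 - p_x(t))(p_x(t) - a_2)\big)^2 \big(1 + p^{coll}_x(t)(d_2^2 - 1)\big) \\ &= L_2^2 b(p_x(t)) + O(L_2^2). \end{aligned} \tag{44}$$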
The functions $\omega(p_x(t))$ and $b(p_x(t))$ are defined as follows:
$$\omega(p_x(t)) = (1 + d_2)(1 - p_x(t))(p_x(t) - a_2)(p^{coll}_x(t) - v) \tag{45}$$
$$b(p_x(t)) = \big((1 - p_x(t))(p_x(t) - a_2)\big)^2 \big(1 + p^{coll}_x(t)(d_2^2 - 1)\big). \tag{46}$$
As can be seen from (45) and (46), $\omega(p_x(t))$ is a Lipschitz function in $(0, 1)$ and $b(p_x(t))$ is a continuous function in $(0, 1)$. Therefore, assumptions (i)-(v) are satisfied for small values of $L_2$. For the convergence of $p_x(t)$ to the optimal point, $E[\delta p_x(t) \mid p_x(t)]$ must converge to 0. Accordingly, we have
$$\omega(p_x(t)) = 0. \tag{47}$$
Therefore,
$$(1 + d_2)(1 - p_x(t))(p_x(t) - a_2)(p^{coll}_x(t) - v) = 0. \tag{48}$$
There are three possible roots of $\omega(p_x(t))$. For $p_x(t) = 1$, the maximum allocatable resources are used for class $x$, the updating procedure in (21) does not affect $p_x(t)$, and the system remains stable. The second root is $p_x(t) = a_2$; as before, this value means that there is no request from class $x$, and therefore the updating procedure does not affect $p_x(t)$ and the system is stable. The third root occurs for:
$$p^{coll}_x(t) = v, \tag{49}$$
that is,
$$1 - \frac{N_x(t) p_x(t)}{M_x(t)}\Big(1 - \frac{p_x(t)}{M_x(t)}\Big)^{N_x(t) - 1} - \Big(1 - \frac{p_x(t)}{M_x(t)}\Big)^{N_x(t)} = v, \tag{50}$$
which gives
$$p_x(t) = \frac{M_x(t)}{N_x(t)}. \tag{51}$$
If the updating procedure (21) is used, the optimal value for $p_x(t)$ is obtained according to (51). Since the maximum value for $p_x(t)$ is 1, we have:
$$p^*_x(t) = \min\Big\{ \frac{M_x(t)}{N_x(t)},\; 1 \Big\}. \tag{52}$$
REFERENCES
[1] L. Atzori, A. Iera, and G. Morabito, “The Internet of Things: A survey,” Computer Networks, vol. 54, no. 15, pp. 2787-2805, Oct. 2010.
[2] A. Laya, L. Alonso, and J. Alonso-Zarate, “Is the Random Access Channel of LTE and LTE-A Suitable for M2M Communications? A Survey of Alternatives,” IEEE Commun. Surv. and Tuto., vol. 16, no. 1, pp. 4-16, first quarter, 2014.
[3] L. Ferdouse, A. Anpalagan and S. Misra, “Congestion and overload control techniques in massive M2M systems: a survey,” Transactions on Emerging Telecommunications Technologies (2015).
[4] Cisco, San Jose, CA, USA, “Cisco visual networking index: Global mobile data traffic forecast update, 2014-2019 technical report,” Feb. 2015.
[5] H. Shariatmadari, R. Ratasuk, S. Iraji, “Machine-type communications: current status and future perspectives toward 5G systems,” IEEE Com. Magazine, vol. 53, no. 9, pp. 10-17, Sep. 2015.
[6] M. Tauhidul Islam, A. M. Taha, “A Survey of Access Management Techniques in Machine Type Communications,” IEEE Communications Magazine, vol. 52, pp. 74-81, April 2014.
[7] F. Hussain, A. Anpalagan, and R. Vannithamby, “Medium access control techniques in M2M communication: survey and critical review,” European Trans. Emerging Tel. Tech., doi: 10.1002/ett.2869, 2015.
[8] A. Biral, M. Centenaro, A. Zanella, L. Vangelista and M. Zorzi, “The challenges of M2M massive access in wireless cellular networks,” Digit. Commun. Netw., vol. 1, no. 1, pp. 1-19, 2015.
[9] 3GPP TS 37.868, “Study on RAN Improvements for Machine-type Communications,” v11.0.0, Sep. 2011.
[10] A. Rajandekar and B. Sikdar, “A survey of MAC layer issues and protocols for machine-to-machine communications,” IEEE Internet of Things Journal, vol. 2, no. 2, pp. 175-186, Apr. 2015.
[11] K.-D. Lee, S. Kim, B. Yi, “Throughput Comparison of Random Access Methods for M2M Service over LTE Networks,” Proc. 2011 IEEE GLOBECOM Wksps., pp. 373-77, Dec. 2011.
[12] ZTE R2104662: “MTC Simulation Results with Specific Solutions,” Aug. 2011.
[13] T.-M. Lin, C.-H. Lee, J.-P. Cheng, “PRADA: Prioritized Random Access with Dynamic Access Barring for MTC in 3GPP LTE-A Networks,” IEEE Trans. Veh. Technology, vol. 63, no. 5, pp. 2467-2472, Jun. 2014.
[14] M. Cheng, G. Lin, H. Wei, A. Hsu, “Overload control for machine-type-communications in LTE-advanced system,” IEEE Commun. Mag., vol. 50, no. 6, pp. 38-45, Jun. 2012.
[15] S. Duan, V. Shah-Mansouri, and V.W.S. Wong, “Dynamic access class barring for M2M communications in LTE networks,” in Proc. of IEEE GLOBECOM, Atlanta, GA, Dec. 2013.
[16] S.-Y. Lien, T.-H. Liau, C.-Y. Kao and K.-C. Chen, “Cooperative access class barring for machine-to-machine communications,” IEEE Trans. Wireless Commun., vol. 11, no. 1, pp. 27-32, Jan. 2012.
[17] X. Yang, A. Fapojuwo, E. Egbogah, “Performance analysis and parameter optimization of random access backoff algorithm in LTE,” Proc. IEEE Veh. Technol. Conf. (VTC-Fall12), pp. 1-5, Sep. 2012.
[18] S.-Y. Lien, K.-C. Chen, Y. Lin, “Toward Ubiquitous Massive Accesses in 3GPP Machine-to-Machine Communications,” IEEE Commun. Mag., vol. 49, no. 4, pp. 66-74, Apr. 2011.
[19] M.-Y. Cheng, G.-Y. Lin, H.-Y. Wei, and C.-C. Hsu, “Performance evaluation of radio access network overloading from machine type communications in LTE-A networks,” in Proc. IEEE WCNC Workshops, Paris, France, Apr. 2012, pp. 248-252.
[20] M L. M. Bello, P. Mitchell, and D. Grace, “Application of Q-Learning for RACH Access to Support M2M Traffic over a Cellular Network,” in European Wireless Conference, May 2014, pp. 1-6.
[21] A. Lo, Y. W. Law, M. Jacobsson, “Enhanced LTE-Advanced Random-Access Mechanism for Massive Machine-to-Machine (M2M) Communications,” Proc. 27th Meeting of Wireless World Research Forum, Oct. 2011.
[22] 3GPP, “Service requirements for machine-type communications,” 3GPP TS 22.368 V13.0.0, 2014.
[23] M. Tavana, V. Shah-Mansouri, and V.W.S. Wong, “Congestion control for bursty M2M traffic in LTE networks,” in Proc. IEEE ICC, London, June 2015.
[24] Z. Wang and V.W.S. Wong, “Joint access class barring and timing advance model for machine-type communications,” in Proc. of IEEE ICC, Sydney, Australia, June 2014.
[25] S. Sesia, I. Toufik, and M. Baker, LTE - The UMTS Long Term Evolution. West Sussex, UK: John Wiley & Sons, Ltd, 2009.
[26] A. Ali, W. Hamouda, “Next Generation M2M Cellular Networks: Challenges and Practical Considerations,” IEEE Com. Magazine, vol. 53, no. 9, pp. 18-24, Sep. 2015.
[27] K. S. Narendra and M. A. L. Thathachar, Learning automata: An introduction. Englewood Cliffs, NJ: Prentice Hall, 1989.
[28] P. Nicopolitidis, G. I. Papadimitriou and A. S. Pomportsis, “Distributed protocols for Ad-Hoc wireless LANs: A learning-automata-based approach,” Ad Hoc Netw., vol. 2, no. 4, pp. 419-431, 2004.
[29] S. Misra, V. Tiwari and M. S. Obaidat, “LACAS: Learning Automata-based Congestion Avoidance Scheme for Healthcare Wireless Sensor Networks,” IEEE JSAC, vol. 27, no. 4, pp. 466-79, May 2009.
[30] 3GPP, “LTE: MTC LTE simulations,” 3rd Generation Partnership Project (3GPP), TSG RAN WG2 v11.0.0, Sep. 2011.
[31] P. Nicopolitidis, G. I. Papadimitriou and A. S. Pomportsis, “Learning-automata-based polling protocols for wireless LANs,” IEEE Trans. Commun., vol. 51, pp. 453-463, 2003.
[32] S. Pediaditaki, P. Arrieta and M.K. Marina, “A Learning-Based Approach for Distributed Multi-Radio Channel Allocation in Wireless Mesh Networks,” Proc. IEEE Int’l Conf. Network Protocols (ICNP), 2009.
[33] T. D. Lagkas, G. I. Papadimitriou, P. Nicopolitidis, and A. S. Pomportsis, “POAC-QG: Priority Oriented Adaptive Control with QoS Guarantee for Wireless Networks,” Proceedings of IEEE EUROCON 2005, Belgrade, Serbia & Montenegro, Nov. 2005, pp. 1858-1861.
[34] G. I. Papadimitriou and D. G. Maritsas, “Self-adaptive random-access protocols for WDM passive star networks,” IEE Proc.-Comput. Digit. Tech., vol. 142, no. 4, pp. 306-312, 1995.