Learning Approach to Link Adaptation in WirelessMAN
Avinash Prasad
Supervisor: Prof. Saran
Outline of Presentation
Introduction
Problem Definition
Proposed Solution & Learning Automaton
Requirements
About Implementation
Results
Conclusions
References
Introduction ( Link Adaptation )
Definition
Link adaptation refers to a set of techniques where modulation, coding rate and/or other signal transmission parameters are changed on the fly to better adjust to the changing channel conditions.
Introduction ( WirelessMAN )
WirelessMAN requires high data rates over channel conditions that vary from link to link, and over time.
Link adaptation on a per-link basis is the most fundamental step this BWA system uses to respond to these link-to-link and temporal variations. An elaborate message-passing mechanism exchanges channel information at the MAC layer.
Problem Definition
Link adaptation requires us to know which channel-condition changes call for a change in transmission parameters.
The most commonly identified problem of link adaptation:
How do we calculate the threshold values for the various channel estimation parameters that signal a need for a change in transmission parameters?
Problem Definition ( Current Approaches)
Current methods for threshold estimation
Model based
Requires analytical modeling. How reliable is the model? Is an appropriate model available for the wireless scenario?
Statistical methods
Hard to obtain for all channel conditions. Fixed; they do not change with time, yet even a change of season may affect the best values.
Heuristics based
Scope limited to very few scenarios.
Proposed Solution ( Aim )
Come up with a machine-learning-based method such that:
It learns the optimal threshold values as we operate over the network.
No analytical modeling is needed by the method in its operation.
It can handle noisy feedback from the environment.
It is generic enough to learn different parameters without much change to the core.
Proposed Solution ( Idea )
Use a stochastic learning automaton.
Informally, it essentially simulates animal learning: repeatedly make decisions based on your current knowledge, then refine those decisions according to the response from the environment.
Mathematically, it modifies the probability of selecting each action based on how much reward we get from the environment.
Proposed Solution ( Exp. Setup )
Experimental setup used to study the stochastic learning methods.
We learn the optimal SNR threshold values for switching among coding profiles, such that throughput is maximized.
Threshold Ti decides when to switch from profile i to profile i+1.
The possible values for Ti have been restricted to a limited set, to speed up learning by reducing the number of options.
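The threshold rule can be sketched in C++ (the function name and the ascending-threshold assumption are illustrative, not part of the implementation): the link stays on profile i until the SNR estimate crosses Ti, at which point profile i+1 is used.

```cpp
#include <cassert>
#include <vector>

// Map an SNR estimate to a coding profile (profiles numbered from 1).
// Threshold t[i-1] = T_i marks the switch from profile i to profile i+1,
// so we use the highest profile whose threshold the SNR has crossed.
// Assumes the thresholds are sorted in ascending order.
int selectProfile(double snr, const std::vector<double>& thresholds) {
    int profile = 1;              // profile 1 needs no threshold
    for (double t : thresholds) {
        if (snr >= t) ++profile;  // SNR crossed T_i: move to profile i+1
        else break;               // ascending, so no later T can be crossed
    }
    return profile;
}
```

For example, with candidate thresholds {5, 10, 15} dB, an SNR estimate of 12 dB crosses T1 and T2 but not T3, so profile 3 is selected.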
Proposed Solution ( Exp. Setup ) ( Mathematical Formulation )
For N different profiles in use, N-1 thresholds need to be determined/learned.
At any instance these N-1 thresholds Ti, i ∈ {1,…,N-1}, form the input to the environment.
In return the environment returns the reward
β( SNR estimate, <T1,…,TN-1> ) = (1 - SER) * ( K / Kmax )
K is the information block size fed to the RS encoder in the selected profile.
Kmax is the maximum possible value of K over all profiles; this keeps the reward value in the range [0,1].
Clearly, β is a measure of normalized throughput.
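The reward computation is a one-liner; this C++ sketch (function name hypothetical) simply restates the formula above:

```cpp
#include <cassert>
#include <cmath>

// Normalized-throughput reward: beta = (1 - SER) * (K / Kmax).
// K is the RS information block size of the selected profile and
// Kmax the largest K over all profiles, so beta lies in [0, 1].
double reward(double ser, int k, int kmax) {
    return (1.0 - ser) * (static_cast<double>(k) / kmax);
}
```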
Proposed Solution (Learning Automaton) ( Formal Definition )
A learning automaton is completely given by (A, B, LA, Pk):
Action set A = {α1, α2,…, αr}; we shall always assume this set to be finite in our discussion.
Set of rewards B = [0,1]
The learning algorithm LA
State information Pk = [ p1(k), p2(k),…, pr(k) ], the action probability vector
Proposed Solution (Learning Automaton) ( Why? )
Advantages:
Complete generality of the action set.
We can have an entire set of automata, each working on a different variable of a multivariable problem, and yet they arrive at a Nash equilibrium that maximizes the overall function, much faster than a single automaton.
It can handle noisy reward values from the environment, performing long-time averaging as it learns; but this requires the environment to be stationary.
Proposed Solution (Learning Automaton) ( Solution to Exp. Setup )
Each threshold is learnt by an independent automaton in the group (game) of automata that solves the problem.
For each automaton, we choose the smallest possible action set that covers all possible variations in channel conditions in the setup, i.e., we decide the possible range of threshold values.
We decide on the learning algorithm to use.
Proposed Solution (Learning Automaton) ( Solution to Exp. Setup, cont. )
With k the instance of the playoff, we do the following:
Each automaton selects an action (a threshold) based on its state Pk, the probability vector.
Based on these threshold values, we select a profile for channel transmission.
We get feedback from the channel in the form of the normalized throughput defined earlier.
We use the learning algorithm to calculate the new state, the set of probabilities Pk+1.
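The first step above, selecting an action from the state Pk, is a standard roulette-wheel draw over the probability vector; a minimal C++ sketch (RNG choice and function name are illustrative):

```cpp
#include <cassert>
#include <random>
#include <vector>

// Draw an action index from the automaton's state P_k, i.e. the
// probability vector over candidate threshold values.
int sampleAction(const std::vector<double>& p, std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double r = u(rng), acc = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        acc += p[i];                           // cumulative probability
        if (r < acc) return static_cast<int>(i);
    }
    return static_cast<int>(p.size()) - 1;     // guard against rounding
}
```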
Proposed Solution (Learning Automaton)
( Learning Algorithms )
We have explored two different algorithms.
LRI (linear reward–inaction)
Very much Markovian: Pk+1 is updated from the last action/reward pair alone.
For α(k) = αi: pi(k+1) = pi(k) + ∆·β(k)·(1 − pi(k))
Otherwise: pj(k+1) = pj(k) − ∆·β(k)·pj(k)
∆ is a rate constant.
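A minimal C++ sketch of the LRI update (function name is illustrative): the chosen action is reinforced in proportion to the reward, and the two rules together keep the probabilities summing to one.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// L_RI update for the chosen action index:
//   p_i(k+1) = p_i(k) + delta * beta * (1 - p_i(k))
//   p_j(k+1) = p_j(k) - delta * beta * p_j(k),  j != i
void lriUpdate(std::vector<double>& p, int chosen, double beta, double delta) {
    for (std::size_t j = 0; j < p.size(); ++j) {
        if (static_cast<int>(j) == chosen)
            p[j] += delta * beta * (1.0 - p[j]);  // reinforce chosen action
        else
            p[j] -= delta * beta * p[j];          // shrink the rest
    }
}
```

Note that the total increase Δ·β·(1 − pi) exactly equals the total decrease Δ·β·Σ(j≠i) pj, so Pk+1 remains a probability vector.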
Pursuit algorithm
Uses the entire history of selections and rewards to maintain average reward estimates for all actions.
Aggressively moves towards the simplex corner that puts probability 1 on the action with the highest reward estimate, say action αM:
P(k+1) = P(k) + ∆·( eM(k) − P(k) )
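One pursuit step under these definitions can be sketched in C++ (names are hypothetical; the reward estimates d are per-action running averages, and eM is the unit vector of the current best estimate):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Pursuit update: refresh the running average reward estimate d of the
// chosen action, then move P a step of size delta toward the unit
// vector e_M of the action with the highest estimate:
//   P(k+1) = P(k) + delta * (e_M(k) - P(k))
void pursuitUpdate(std::vector<double>& p, std::vector<double>& d,
                   std::vector<int>& count, int chosen, double beta,
                   double delta) {
    ++count[chosen];                              // one more sample of this action
    d[chosen] += (beta - d[chosen]) / count[chosen];
    std::size_t best = 0;                         // index M of highest estimate
    for (std::size_t i = 1; i < d.size(); ++i)
        if (d[i] > d[best]) best = i;
    for (std::size_t i = 0; i < p.size(); ++i) {
        double e = (i == best) ? 1.0 : 0.0;       // component of e_M
        p[i] += delta * (e - p[i]);
    }
}
```

Unlike LRI, this needs storage for the estimates and counts of every action, which is part of the storage difference noted below.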
Proposed Solution (Learning Automaton) ( Learning Algorithms cont. )
The two algorithms differ in:
Speed of convergence to the optimal solution.
The amount of storage each requires.
How decentralized the learning setup (game) can be.
The way they approach their convergence point: being a greedy method, the pursuit algorithm shows large deviations in the evolution phase.
Requirements
802.16 OFDM physical layer
Channel model (SUI model used)
Learning setup
About Implementation ( 802.16 OFDM Physical Layer )
Implements the OFDM physical layer from 802.16d.
Coded in Matlab™.
Complies fully with the standard; operations tested against the example pipeline given in the standard.
No antenna diversity is used, and perfect channel impulse response estimation is assumed.
About Implementation ( Channel Model )
We have implemented the complete set of SUI models for the omni-antenna case.
The complete channel model consists of one of the SUI models plus an AWGN noise model.
Coded in Matlab™, thus completing the entire channel + coding pipeline.
Results from this data transmission pipeline are presented later.
About Implementation ( Learning Setup )
We implemented both algorithms for comparison. Coded in C/C++.
A network model was constructed using the symbol error rate plots obtained from the PHY layer simulations to estimate the reward values.
Results ( PHY layer ) ( BER plots for different SUI models )
Results ( PHY layer ) ( SER plots for different SUI models )
Results ( PHY layer ) ( BER plots for different profiles at SUI-2 )
Results ( PHY layer ) ( SER plots for different profiles at SUI-2 )
Results ( PHY layer ) ( Reward metric for the learning automaton )
Results ( Learning ) ( Convergence curve; LRI, rate = 0.0015 )
Results ( Learning ) ( Convergence curve; Pursuit, rate = 0.0017 )
Results ( Learning ) ( Convergence curve; LRI, rate = 0.0032 )
Results ( Learning: 4 actions per threshold ) ( Convergence curve; LRI, rate = 0.0015 )
Conclusions
Our plots suggest the following:
Learning methods are indeed capable of arriving at the optimal parameter values under the type of channel conditions faced in WirelessMAN.
The rate of convergence depends on:
the rate factor (∆)
the size of the action set
how much the actions differ in the reward they get from the environment
the learning algorithm
Although we have worked with a relatively simple setup, assuming the SNR estimate is perfect and available, the complete generality of the action set ensures that we can work with other channel estimation parameters as well.
References
V. Erceg and K. V. S. Hari, "Channel Models for Fixed Wireless Applications," IEEE 802.16 Broadband Wireless Access Working Group, 2001.
D. S. Baum, "Simulating the SUI Channel Models," IEEE 802.16 Broadband Wireless Access Working Group, 2000.
M. A. L. Thathachar and P. S. Sastry, Networks of Learning Automata: Techniques for Online Stochastic Optimization, Kluwer Academic Publishers, 2003.
Thanks