Decoupling Prediction and Control in Modern Networks
Sandeep Chinchali, Alex Anemogiannis, Tianshu Chu, Marco Pavone, Sachin Katti
Forecaster Controller
Time-series Control is Ubiquitous
Robotic Taxi Fleets
Optimal Control
Cell Congestion
Stock Price
Video Streaming
Quantitative Finance
Challenges of Time-series Control
1. Data-driven forecasts
• What features/statistics are needed for control?
2. Many Input Variables
• Forecaster and Controller
3. Increasingly:
• Data boundaries
Cell Congestion
Controller
Network Operator
Claim: Decouple but co-design predictor and controller
Why not end-to-end learning?
Why Decouple?
1. Natural Data Boundaries
2. Modularity (Re-use forecaster)
Why Co-design?
1. Tune forecasts to control risk
2. Robust Adversarial Training
End-to-end
State Action
Control API
Reward
Forecaster Controller
Private Info Private Info
Control Reward
Forecasted Features
Action
Controller Private State Forecaster Private State
Joint (Public) State
(Control API)
General Approach
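The diagram labels above can be read as an interface: each side keeps private state, and only forecasted features cross the Control API. A minimal sketch under that reading follows. The class names, the mean/std feature pair, and the threshold rule are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ForecastedFeatures:
    """Public message that crosses the data boundary (the 'Control API')."""
    mean: float
    std: float


class MeanStdForecaster:
    """Hypothetical forecaster; its private state never leaves this class."""

    def __init__(self, private_offset: float = 0.0) -> None:
        # Stand-in for forecaster-private information (assumption).
        self._private_offset = private_offset

    def forecast(self, public_history: Sequence[float]) -> ForecastedFeatures:
        n = len(public_history)
        base = sum(public_history) / n
        var = sum((x - base) ** 2 for x in public_history) / n
        return ForecastedFeatures(mean=base + self._private_offset, std=var ** 0.5)


class ThresholdController:
    """Hypothetical controller; it sees only the forecasted features."""

    def __init__(self, private_threshold: float) -> None:
        # Stand-in for controller-private state (assumption).
        self._private_threshold = private_threshold

    def act(self, features: ForecastedFeatures) -> int:
        # Act aggressively (1) only if the pessimistic forecast clears the threshold.
        pessimistic = features.mean - features.std
        return 1 if pessimistic >= self._private_threshold else 0
```

Because the controller depends only on `ForecastedFeatures`, the same forecaster could be reused by a different controller, which is the modularity argument made for decoupling.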
Forecaster Controller
Video QoE
Private: User mobility Private: Buffer State
Past Throughputs
Bitrate
Future Throughputs (Risk-adjusted, ~30s)
Forecaster Controller
Video Streaming
Network Operator
Cloud Video Services
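The slide does not define "risk-adjusted". One common reading, assumed here purely for illustration, is to discount the mean of recent throughput by a multiple of its standard deviation. The function name and the `risk_aversion` parameter are hypothetical.

```python
import statistics
from typing import Sequence


def risk_adjusted_forecast(past_throughputs_mbps: Sequence[float],
                           risk_aversion: float = 1.0) -> float:
    """Return a pessimistic throughput estimate: mean minus k standard deviations.

    This is an assumed definition of 'risk-adjusted', not the authors' forecaster.
    """
    mu = statistics.fmean(past_throughputs_mbps)
    sigma = statistics.pstdev(past_throughputs_mbps)
    # Throughput cannot be negative, so clamp at zero.
    return max(0.0, mu - risk_aversion * sigma)
```

A larger `risk_aversion` makes the controller pick lower bitrates and trade picture quality for fewer stalls.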
Passenger Wait Time, Ride Efficiency
Future Cell Congestion, Anomalies (~hrs)
Taxi Routes
Private: User Location, Cell Demand
Private: Taxi locations, Outstanding Demand
Forecaster Controller
Robotic Taxi Fleet City-wide Cell Congestion
Network Operator
Taxi Operator
Approach: Reinforcement Learning (RL)
Forecaster Controller
Reinforcement Learning (RL)
Goal: Maximize the total reward
Agent Environment
Observe state s_t
Action a_t
Reward r_t
Adapted from Pensieve (SIGCOMM '17, Mao et al.)
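The agent-environment loop above can be sketched in a few lines. The toy environment below is hypothetical and exists only to make the loop runnable: the reward is 1 when the action equals the observed state. The quantity returned by `run_episode` is the total reward the agent tries to maximize.

```python
import random
from typing import Callable, Tuple


class ToyEnvironment:
    """Hypothetical environment: reward 1.0 when the action equals the current state."""

    def __init__(self, horizon: int = 5, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._horizon = horizon
        self._t = 0
        self._state = 0

    def reset(self) -> int:
        self._t = 0
        self._state = self._rng.randint(0, 1)
        return self._state

    def step(self, action: int) -> Tuple[int, float, bool]:
        reward = 1.0 if action == self._state else 0.0
        self._t += 1
        self._state = self._rng.randint(0, 1)  # next state s_{t+1}
        done = self._t >= self._horizon
        return self._state, reward, done


def run_episode(env: ToyEnvironment, policy: Callable[[int], int]) -> float:
    """Observe s_t, take a_t, collect r_t, and return the total reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total
```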
Controller Forecaster
RL Formulation
Adversarial
Application: Mobile Video Streaming
ABR agent
state
Neural Network
240P 480P 720P 1080P
policy πθ(s, a)
Observe state s
parameter θ
Figure from Pensieve (SIGCOMM '17, Mao et al.)
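To make the notation concrete, here is a minimal sketch of a parameterized policy πθ(s, a) that maps a state vector to a probability for each bitrate. A linear layer followed by a softmax stands in for the neural network in the figure. That substitution, and the shape of `theta`, are simplifying assumptions.

```python
import math
from typing import List, Sequence

BITRATES = ["240P", "480P", "720P", "1080P"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(theta: Sequence[Sequence[float]], state: Sequence[float]) -> List[float]:
    """Return pi_theta(s, a) for every action a; theta holds one weight row per bitrate."""
    logits = [sum(w * x for w, x in zip(row, state)) for row in theta]
    return softmax(logits)


def greedy_bitrate(theta: Sequence[Sequence[float]], state: Sequence[float]) -> str:
    """Pick the most probable bitrate under the policy."""
    probs = policy(theta, state)
    return BITRATES[probs.index(max(probs))]
```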
Can we beat Pensieve by forecasting network conditions?
Palo Alto Cell Throughput Diversity
Insight: Foresight of true network condition helps
Solution: Dynamically splice specialized controllers (metaRL)
Switch Cell Freeway
High-Level Trace
Statistics [µ, σ] (API)
VLO
LO
MID
HI
metaRL
Bitrate
metaRL controller Forecaster
VLO
LO
MID
HI
Infer network condition from forecast stats
Private: User Location
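A minimal sketch of the splicing idea: map the forecast statistics [µ, σ] to one of the four condition labels and dispatch to the controller specialized for it. The throughput thresholds and the choice to discount µ by σ are hypothetical. The slides do not say how the condition is inferred.

```python
from bisect import bisect_right
from typing import Callable, Dict

CONDITIONS = ["VLO", "LO", "MID", "HI"]
# Hypothetical bin edges in Mbps separating the four conditions (assumption).
THRESHOLDS_MBPS = [1.0, 3.0, 6.0]


def infer_condition(mu: float, sigma: float, risk_aversion: float = 1.0) -> str:
    """Classify the network condition from forecast statistics [mu, sigma]."""
    effective = mu - risk_aversion * sigma
    return CONDITIONS[bisect_right(THRESHOLDS_MBPS, effective)]


def splice(controllers: Dict[str, Callable[[], str]], mu: float, sigma: float) -> Callable[[], str]:
    """Select the specialized controller for the inferred condition."""
    return controllers[infer_condition(mu, sigma)]
```

For example, with `controllers = {"VLO": lambda: "240P", "LO": lambda: "480P", "MID": lambda: "720P", "HI": lambda: "1080P"}`, calling `splice(controllers, 4.0, 0.5)()` dispatches to the MID controller.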
QoE
Palo Alto (Our data) + FCC/Norway (Pensieve)
metaRL
Generalize to FCC/Norway data from Pensieve
metaRL
Re-analysis of Pensieve (SIGCOMM '17, Mao et al.)
Linear QoE (hi-thpt) HD QoE (vlo-thpt)
Optimize Tail
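The slide names QoE variants without defining them. As a reminder of the general shape, the sketch below assumes a Pensieve-style metric: per-chunk quality, minus a rebuffering penalty, minus a penalty for quality switches. The `quality` mapping and the `rebuffer_penalty` weight are left as parameters because the slides give no values. The default identity mapping corresponds to a linear QoE.

```python
from typing import Callable, Sequence


def qoe(bitrates_mbps: Sequence[float],
        rebuffer_s: Sequence[float],
        rebuffer_penalty: float,
        quality: Callable[[float], float] = lambda r: r) -> float:
    """Session QoE = sum of quality - penalty * stall time - sum of quality switches."""
    q = [quality(r) for r in bitrates_mbps]
    utility = sum(q)
    stall = rebuffer_penalty * sum(rebuffer_s)
    smoothness = sum(abs(b - a) for a, b in zip(q, q[1:]))
    return utility - stall - smoothness
```

Passing a different `quality` function (for example one that rewards HD bitrates disproportionately) yields other variants without changing the rest of the computation.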
Quantifying Sub-optimality Gap
With oracle knowledge of network condition
Have to learn network condition
Value/Price of Time-series Variables?
Gap
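One way to put a number on the gap, assuming it is measured as the average QoE lost per trace when the controller must learn the network condition instead of being told it by an oracle. The function name and the per-trace averaging are assumptions.

```python
from statistics import fmean
from typing import Sequence


def suboptimality_gap(qoe_oracle: Sequence[float], qoe_learned: Sequence[float]) -> float:
    """Mean per-trace QoE difference between the oracle and the learned controller."""
    if len(qoe_oracle) != len(qoe_learned):
        raise ValueError("both controllers must be evaluated on the same traces")
    return fmean(o - l for o, l in zip(qoe_oracle, qoe_learned))
```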
Future work
1. Broad vision for Time-Series Control
• Data-driven forecasts/control strategies
• Intrinsic data boundaries
2. Value/Price of Information used for Long-Term Control?
3. Privacy/Information Leakage
Questions: [email protected]