Public Review for Knowledge-Deﬁned Networking · 2020-05-28 · network topology and handles the...

Public Review for

Knowledge-Defined Networking

Albert Mestres, Alberto Rodriguez-Natal, Josep Carner, Pere Barlet-Ros, Eduard Alarcón, Marc Solé, Victor Muntés-Mulero, David Meyer, Sharon Barkai, Mike J. Hibbett, Giovani Estrada, Khaldun Ma’ruf, Florin Coras, Vina Ermagan, Hugo Latapie, Chris Cassar, John Evans, Fabio Maino, Jean Walrand, Albert Cabellos

It has been more than a decade since David Clark published his proposal for a knowledge plane (KP) for the Internet, in which he proposed integrating “intelligence” into the fabric of the Internet itself, as opposed to sequestering it at the endpoints. While KP has inspired numerous research projects and is prolifically cited, Clark’s vision has yet to be fully realized in deployment. In this paper, the authors ask why this is, and whether recent developments in networked and distributed systems give cause to revisit KP.

In short, the authors argue the answer is yes. Namely, they make the case that advances in machine learning, data analytics, and software-defined networking enable a paradigm they call “knowledge defined networking” that can achieve the goals of a KP. This paper identifies an architecture for their approach and provides several case studies demonstrating the potential benefits in deployment. The authors identify numerous challenges that must be addressed before the KDN paradigm is ready for widespread adoption, with perhaps the biggest open question being: will this be the next big step toward Clark’s KP vision?

Public review written by

David Choffnes

Northeastern University

ACM SIGCOMM Computer Communication Review Volume 47 Issue 3, July 2017

Artifacts Review for

Knowledge-Defined Networking

Albert Mestres, Alberto Rodriguez-Natal, Josep Carner, Pere Barlet-Ros, Eduard Alarcón, Marc Solé, Victor Muntés-Mulero, David Meyer, Sharon Barkai, Mike J. Hibbett, Giovani Estrada, Khaldun Ma’ruf, Florin Coras, Vina Ermagan, Hugo Latapie, Chris Cassar, John Evans, Fabio Maino, Jean Walrand, Albert Cabellos

I start by applauding the authors of Knowledge-Defined Networking for publicly sharing the datasets and scripts used in the experiments presented in the paper, at http://knowledgedefinednetworking.org. This is not only important for reproducibility, but for the particular context of Machine Learning (ML) having standardised datasets has been shown fundamental in other areas, as the authors correctly point out, so this is an excellent first step towards that goal in networking.

The website provides two main types of artefacts: datasets and neural networks software. The datasets include all data used for the two use cases discussed in the paper. For the virtual network functions they include the CPU consumption of an OVS connected to an SDN controller, of an OVS configured with firewall rules, and of SNORT. For the network characterisation use case, the authors include several delay measurements for different network topologies. The datasets have shown to be good and useful.

The authors also released neural networks software scripts, but they require a specific version of a commercial software that was not available to the reviewer. The open-source alternative could not be used to reproduce all the experiments, only a subset of them. The first version of the website did not include enough documentation to help reproducing the results, but the authors kindly provided the necessary information, which the reviewer recommends to be included in the website (alongside the precise software requirements).

Overall, I associate the Artifacts Evaluated and Functional badge to this paper.

Public review written by

Fernando Ramos

Universidade de Lisboa, Portugal



Albert Mestres¹, Alberto Rodriguez-Natal¹, Josep Carner¹, Pere Barlet-Ros¹,², Eduard Alarcón¹, Marc Solé³, Victor Muntés-Mulero³, David Meyer⁴, Sharon Barkai⁵, Mike J. Hibbett⁶, Giovani Estrada⁶, Khaldun Ma’ruf⁷, Florin Coras⁸, Vina Ermagan⁸, Hugo Latapie⁸, Chris Cassar⁸, John Evans⁸, Fabio Maino⁸, Jean Walrand⁹ and Albert Cabellos¹

¹ Universitat Politècnica de Catalunya, ² Talaia Networks, ³ CA Technologies, ⁴ Brocade Communication, ⁵ Hewlett Packard Enterprise, ⁶ Intel R&D, ⁷ NTT Communications, ⁸ Cisco Systems, ⁹ University of California, Berkeley

ABSTRACT

The research community has considered in the past the application of Artificial Intelligence (AI) techniques to control and operate networks. A notable example is the Knowledge Plane proposed by D. Clark et al. However, such techniques have not been extensively prototyped or deployed in the field yet. In this paper, we explore the reasons for the lack of adoption and posit that the rise of two recent paradigms, Software-Defined Networking (SDN) and Network Analytics (NA), will facilitate the adoption of AI techniques in the context of network operation and control. We describe a new paradigm that accommodates and exploits SDN, NA and AI, and provide use-cases that illustrate its applicability and benefits. We also present simple experimental results that support, for some relevant use-cases, its feasibility. We refer to this new paradigm as Knowledge-Defined Networking (KDN).

CCS Concepts

• Networks → Network design principles; Network services; Network performance modeling;

Keywords

Knowledge Plane, SDN, Network Analytics, Machine Learning, NFV, Knowledge-Defined Networking

1. INTRODUCTION

D. Clark et al. proposed “A Knowledge Plane for the Internet” [1], a new construct that relies on Machine Learning (ML) and cognitive techniques to operate the network. A Knowledge Plane (KP) would bring many advantages to networking, such as automation (recognize-act) and recommendation (recognize-explain-suggest), and it has the potential to represent a paradigm shift in the way we operate, optimize and troubleshoot data networks. However, at the time of this writing, we are yet to see the KP prototyped or deployed. Why?

One of the biggest challenges when applying ML for network operation and control is that networks are inherently distributed systems, where each node (i.e., switch, router) has only a partial view of, and partial control over, the complete system. Learning from nodes that can only view and act over a small portion of the system is very complex, particularly if the end goal is to exercise control beyond the local domain. The emerging trend towards logical centralization of control will ease the complexity of learning in an inherently distributed environment. In particular, the Software-Defined Networking (SDN) paradigm [2] decouples control from the data plane and provides a logically centralized control plane, i.e., a logical single point in the network with knowledge of the whole.

In addition to the “softwarization” of the network, current network data plane elements, such as routers and switches, are equipped with improved computing and storage capabilities. This has enabled a new breed of network monitoring techniques, commonly referred to as network telemetry [3]. Such techniques provide real-time packet and flow-granularity information, as well as configuration and network state monitoring data, to a centralized Network Analytics (NA) platform. In this context, telemetry and analytics technologies provide a richer view of the network compared to what was possible with conventional network management approaches.

In this paper, we advocate that the centralized control offered by SDN, combined with a rich centralized view of the network provided by network analytics, enables the deployment of the KP concept proposed in [1]. In this context, the KP can use various ML approaches, such as Deep Learning (DL) techniques, to gather knowledge about the network, and exploit that knowledge to control the network using logically centralized control capabilities provided by SDN. We refer to the paradigm resulting from combining SDN, telemetry, Network Analytics, and the Knowledge Plane as Knowledge-Defined Networking.

This paper first describes the Knowledge-Defined Networking (KDN) paradigm and how it operates. Then, it describes a set of relevant use-cases that show the applicability of such a paradigm to networking and the benefits associated with using ML. In addition, for some use-cases, we also provide early experimental results that show their feasibility. We conclude the paper by analyzing the open research challenges associated with the KDN paradigm.

2. A KNOWLEDGE PLANE FOR SDN ARCHITECTURES

This paper restates the concept of Knowledge Plane (KP) as defined by D. Clark et al. [1] in the context of SDN architectures. The addition of a KP to the traditional three planes of the SDN paradigm results in what we call Knowledge-Defined Networking.

The Data Plane is responsible for storing, forwarding and processing data packets. In SDN networks, data plane elements are typically network devices composed of line-rate programmable forwarding hardware. They operate unaware of the rest of the network and rely on the other planes to populate their forwarding tables and update their configuration.

The Control Plane exchanges operational state in order to update the data plane matching and processing rules. In an SDN network, this role is assigned to the (logically centralized) SDN controller that programs SDN data plane forwarding elements via a southbound interface. While the data plane operates at packet time scales, the control plane is slower and typically operates at flow time scales.

The Management Plane ensures the correct operation and performance of the network in the long term. It defines the network topology and handles the provision and configuration of network devices. In SDN this is usually handled by the SDN controller as well. The management plane is also responsible for monitoring the network to provide critical network analytics. To this end, it collects telemetry information from the control and data plane while keeping a historical record of the network state and events. The management plane is orthogonal to the control and data planes, and typically operates at larger time scales.

The Knowledge Plane, as originally proposed by Clark, is redefined in this paper under the terms of SDN as follows: the heart of the knowledge plane is its ability to integrate behavioral models and reasoning processes oriented to decision making into an SDN network. In the KDN paradigm, the KP takes advantage of the control and management planes to obtain a rich view of, and control over, the network. It is responsible for learning the behavior of the network and, in some cases, automatically operating the network accordingly. Fundamentally, the KP processes the network analytics collected by the management plane, either preprocessed data or raw data, transforms them into knowledge via ML, and uses that knowledge to make decisions (either automatically or through human intervention). While parsing the information and learning from it is typically a slow off-line process, using such knowledge automatically can be done at time scales close to those of the control and management planes. However, the trend is towards on-line learning for applications such as those described in Section 4.

3. KNOWLEDGE-DEFINED NETWORKING

The Knowledge-Defined Networking (KDN) paradigm operates by means of a control loop to provide automation, recommendation, optimization, validation and estimation. Conceptually, the KDN paradigm borrows many ideas from other areas, notably from black-box optimization [4], neural networks in feedback control systems [5], reinforcement learning [6] and autonomic self-* architectures [7]. In addition, recent initiatives share the same vision stated in this paper¹ [8], [9]. Fig. 1 shows the basic steps of the main KDN control loop. In what follows we describe these steps in detail.

[Figure 1 diagram. Labeled elements: forwarding elements, SDN controller, analytics platform, machine learning, knowledge, automatic decision, human decision, northbound SDN controller API.]

Figure 1: KDN operational loop

¹ Cognet project: http://www.cognet.5g-ppp.eu/

Forwarding Elements & SDN Controller → Analytics Platform.

The Analytics Platform aims to gather enough information to offer a complete view of the network. To that end, it monitors the data plane elements in real time while they forward packets in order to access fine-grained traffic information. In addition, it queries the SDN controller to obtain control and management state. The analytics platform relies on protocols, such as NETCONF (RFC 6241), NetFlow (RFC 3954) and IPFIX (RFC 7011), to obtain configuration information, operational state and traffic data from the network. The most relevant data collected by the analytics platform are summarized below.

• Packet-level and flow-level data: This includes Deep Packet Inspection (DPI) information, flow granularity data and relevant traffic features.

• Network state: This includes the physical, topological and logical configuration of the network.

• Control & management state: This includes all the information held in both the SDN controller and the management infrastructure, including policy, virtual topologies, application-related information, etc.

• Service-level telemetry: This is relevant to learn the behavior of the application or service, and its relation with the network performance, load and configuration.

• External information: This is relevant to model the impact on the network of external events, such as activity on social networks (e.g., number of people attending a sports event), weather forecasts, etc.

In order to effectively learn the network behavior, besides having a rich view of the network, it is critical to observe as many different situations as possible. As we discuss in Section 5, this includes different loads, configurations and services. To that end, the analytics platform keeps a historical record of the collected data.
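The data categories and the historical record described above can be pictured as a small data structure. The following is a minimal sketch, not part of the original paper: the class names `TelemetrySample` and `AnalyticsHistory`, and all field names, are hypothetical and only mirror the five categories in the list.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TelemetrySample:
    """One snapshot of the five data categories (illustrative field names)."""
    timestamp: float
    flow_data: Dict[str, Any] = field(default_factory=dict)          # packet- and flow-level data
    network_state: Dict[str, Any] = field(default_factory=dict)      # physical, topological, logical config
    control_state: Dict[str, Any] = field(default_factory=dict)      # controller and management state
    service_telemetry: Dict[str, Any] = field(default_factory=dict)  # application or service behavior
    external_info: Dict[str, Any] = field(default_factory=dict)      # weather, events, etc.


class AnalyticsHistory:
    """Keeps every sample so that learning can see many different situations."""

    def __init__(self) -> None:
        self._samples: List[TelemetrySample] = []

    def record(self, sample: TelemetrySample) -> None:
        self._samples.append(sample)

    def between(self, start: float, end: float) -> List[TelemetrySample]:
        # Half-open time window [start, end)
        return [s for s in self._samples if start <= s.timestamp < end]


history = AnalyticsHistory()
history.record(TelemetrySample(timestamp=0.0, flow_data={"bytes": 1200}))
history.record(TelemetrySample(timestamp=60.0, flow_data={"bytes": 5400},
                               external_info={"weather": "rain"}))
print(len(history.between(0.0, 30.0)))
```

Keeping the raw samples rather than aggregates is a design choice of this sketch: it lets later learning steps select any combination of loads, configurations and services from the record.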

Table 1: KDN applications

                 Closed Loop                 Open Loop
Supervised       Automation, Optimization    Validation, Estimation, What-if analysis
Unsupervised     Improvement                 Recommendation
Reinforcement    Automation, Optimization    N/A

Analytics Platform → Machine Learning.

ML algorithms (such as Deep Learning techniques), which are able to learn from the network behavior, are the heart of the KP. The current and historical data provided by the analytics platform are used to feed learning algorithms that learn from the network and generate knowledge (e.g., a model of the network). We consider three approaches: supervised learning, unsupervised learning and reinforcement learning.

In supervised learning, the KP learns a model that describes the behavior of the network, i.e., a function that relates relevant network variables to the operation of the network (e.g., the performance of the network as a function of the traffic load and network configuration). It requires labeled training data and feature engineering to represent network data.

Unsupervised learning is a data-driven knowledge discovery approach that can automatically infer a function that describes the structure of the analyzed data or can highlight correlations in the data that the network operator may be unaware of. As an example, the KP may be able to discover how the local weather affects the link’s utilization.
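As a toy illustration of highlighting a correlation such as the weather example, the sketch below computes a Pearson correlation coefficient between two telemetry series. It is only a minimal stand-in for knowledge discovery, not the technique used in the paper, and the rainfall and utilization values are invented for illustration.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical hourly observations: rainfall (mm) and link utilization (fraction).
rainfall = [0.0, 1.0, 2.0, 3.0, 4.0]
utilization = [0.30, 0.35, 0.41, 0.44, 0.50]
print(f"correlation = {pearson(rainfall, utilization):.3f}")
```

A coefficient close to 1 or -1 would flag the pair of series as worth showing to the operator; it does not by itself establish a causal relation.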

In the reinforcement learning approach, a software agent aims to discover which actions lead to an optimal configuration. As an example, the network administrator can set a target policy, for instance the delay of a set of flows. The agent then acts on the SDN controller by changing the configuration and, for each action, receives a reward, which increases as the in-place policy gets closer to the target policy. Ultimately, the agent will learn the set of configuration updates (actions) that result in such a target policy. Recently, deep reinforcement learning techniques have provided important breakthroughs in the AI field that are being applied in many network-related fields (e.g., [10]).
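The reward idea above can be sketched with a deliberately simplified agent. The code below is an epsilon-greedy multi-armed bandit, which is a reduced form of reinforcement learning (no state, a fixed set of actions), not the deep reinforcement learning cited in the text. The configuration names, their mean delays and the `measure_delay` function are all hypothetical; a real agent would measure delay through the SDN controller.

```python
import random
from typing import Callable, Dict, Sequence


def reward(measured_delay: float, target_delay: float) -> float:
    """Reward grows (towards 0) as the measured delay approaches the target."""
    return -abs(measured_delay - target_delay)


def run_agent(configs: Sequence[str],
              measure_delay: Callable[[str, random.Random], float],
              target_delay: float,
              steps: int = 500,
              epsilon: float = 0.1,
              seed: int = 0) -> str:
    rng = random.Random(seed)
    value: Dict[str, float] = {c: 0.0 for c in configs}  # running mean reward
    count: Dict[str, int] = {c: 0 for c in configs}
    for step in range(steps):
        if step < len(configs):
            action = configs[step]                 # try every configuration once
        elif rng.random() < epsilon:
            action = rng.choice(list(configs))     # explore
        else:
            action = max(configs, key=lambda c: value[c])  # exploit
        r = reward(measure_delay(action, rng), target_delay)
        count[action] += 1
        value[action] += (r - value[action]) / count[action]
    return max(configs, key=lambda c: value[c])


# Hypothetical environment: mean delay (ms) per traffic-split configuration.
MEAN_DELAY = {"split_50_50": 12.0, "split_80_20": 20.0, "split_20_80": 30.0}


def measure_delay(config: str, rng: random.Random) -> float:
    return MEAN_DELAY[config] + rng.gauss(0.0, 1.0)  # noisy measurement


best = run_agent(list(MEAN_DELAY), measure_delay, target_delay=10.0)
print("learned configuration:", best)
```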

Please note that learning can also happen offline and be applied online. In this context, knowledge can be learned offline by training a neural network with datasets of the behavior of a large set of networks; the resulting model can then be applied online.

Machine Learning → Northbound controller API.

The KP eases the transition between telemetry data collected by the analytics platform and control specific actions. Traditionally, a network operator had to examine the metrics collected from network measurements and make a decision on how to act on the network. In KDN, this process is partially offloaded to the KP, which is able to make (or recommend) control decisions taking advantage of ML techniques.

Depending on whether the network operator is involved or not in the decision making process, there are two different sets of applications for the KP. We next describe these potential applications and summarize them in Table 1.

Closed loop: When using supervised or reinforcement learning, the network model obtained can be used first for automation, since the KP can make decisions automatically on behalf of the network operator. Second, it can be used for optimization of the existing network configuration, given that the learned network model can be explored through common optimization techniques to find (quasi)optimal configurations. In the case of unsupervised learning, the knowledge discovered can be used to automatically improve the network via the interface offered by the SDN controller. For instance, the relation between traffic, routing, topology and the resulting delay can be modeled and then used to apply optimal routing configurations that minimize delay.

Open loop: In this case the network operator is still in charge of making the decisions, but can rely on the KP to ease this task. When using supervised learning, the model learned by ML can be used for validation (e.g., to query the model before applying tentative changes to the system). The model can also be used as a tool for performance estimation and what-if analysis, since the operator can tune the variables considered in the model and obtain an assessment of the network performance. When using unsupervised learning, the correlations found in the explored data may serve to provide recommendations that the network operator can take into consideration when making decisions.
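To make the difference between these uses concrete, the sketch below treats a learned model as a plain function from (configuration, traffic) to predicted delay and shows what-if analysis, validation and optimization on top of it. The function `toy_model` is an invented stand-in for a trained model, and exhaustive search over a few candidates is just one simple instance of the "common optimization techniques" mentioned above.

```python
from typing import Callable, Sequence

Model = Callable[[float, float], float]  # (traffic split, traffic load) -> predicted delay


def toy_model(split: float, traffic: float) -> float:
    """Stand-in for a learned model: delay grows when one link carries most traffic."""
    return 1.0 + traffic * (split ** 2 + (1.0 - split) ** 2)


def what_if(model: Model, split: float, traffic: float) -> float:
    """Open loop: estimate performance for a tentative configuration."""
    return model(split, traffic)


def validate(model: Model, split: float, traffic: float, max_delay: float) -> bool:
    """Open loop: check a tentative change against a delay bound before applying it."""
    return model(split, traffic) <= max_delay


def optimize(model: Model, traffic: float, candidates: Sequence[float]) -> float:
    """Closed loop: pick the candidate configuration with the lowest predicted delay."""
    return min(candidates, key=lambda s: model(s, traffic))


candidates = [i / 10 for i in range(11)]          # splits 0.0, 0.1, ..., 1.0
best_split = optimize(toy_model, traffic=4.0, candidates=candidates)
print("best split:", best_split)
print("what-if delay at split 0.0:", what_if(toy_model, 0.0, 4.0))
print("split 1.0 acceptable under 3.5?", validate(toy_model, 1.0, 4.0, 3.5))
```

In a closed-loop deployment the result of `optimize` would be pushed through the controller automatically, whereas in open loop the operator would inspect the `what_if` and `validate` answers and decide.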

Northbound controller API → SDN controller.

The northbound controller API offers a common interface for humans, software-based network applications and policy makers to control the network elements. The API offered by the SDN controller can be either a traditional imperative language or a declarative one [11]. In the latter case, the users of the API express their intentions towards the network, which are then translated into specific control directives.

The KP can operate on top of either imperative or declarative languages as long as it is trained accordingly. However, at the time of this writing, developing truly expressive and high-level declarative northbound APIs is an open research question. Such intent-based declarative languages provide automation and intelligence capabilities to the system. In this context, we advocate that the KP represents an opportunity to help in their development, rather than an additional level of intelligence. As a result, we envision the KP operating on top of imperative languages, while helping in the translation of the intentions stated by the policy makers into network directives.

SDN controller → Forwarding Elements.

The parsed control actions are pushed to the forwarding devices via the controller southbound protocols in order to program the data plane according to the decisions made at the KP.
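The steps of the loop described in this section can be summarized in a short skeleton. This is a sketch under stated assumptions: `FakeController` is a stand-in and not a real SDN controller API, and the `learn`, `decide` and `approve` callables are hypothetical hooks for the ML step, the decision step and the optional human decision.

```python
from typing import Any, Callable, Dict, List, Optional

Config = Dict[str, Any]


class FakeController:
    """Stand-in for an SDN controller and its northbound API (not a real API)."""

    def __init__(self, config: Config) -> None:
        self.config = dict(config)
        self.pushed: List[Config] = []

    def read_state(self) -> Dict[str, Any]:
        # Forwarding elements & SDN controller -> analytics platform
        return {"config": dict(self.config)}

    def apply(self, config: Config) -> None:
        # Northbound API -> SDN controller -> forwarding elements
        self.config = dict(config)
        self.pushed.append(dict(config))


def kdn_iteration(controller: FakeController,
                  learn: Callable[[Dict[str, Any]], Any],
                  decide: Callable[[Any, Dict[str, Any]], Config],
                  approve: Optional[Callable[[Config], bool]] = None) -> Optional[Config]:
    """One pass of the loop; `approve` is None for closed loop, a human check for open loop."""
    telemetry = controller.read_state()        # analytics platform collects data
    knowledge = learn(telemetry)               # analytics platform -> machine learning
    proposal = decide(knowledge, telemetry)    # knowledge -> decision
    if approve is not None and not approve(proposal):
        return None                            # operator rejected the recommendation
    controller.apply(proposal)                 # decision pushed towards the data plane
    return proposal


controller = FakeController({"split": 0.9})
applied = kdn_iteration(controller,
                        learn=lambda telemetry: {"best_split": 0.5},
                        decide=lambda knowledge, telemetry: {"split": knowledge["best_split"]})
print(applied, controller.config)
```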

4. USE-CASES

This section presents a set of specific use-cases that illustrate the potential applications of the KDN paradigm and the benefits a KP based on ML may bring to common networking problems. For two representative use-cases, we also provide early experimental results that show the technical feasibility of the proposed paradigm. All the datasets used in this paper, as well as code and relevant hyper-parameters, can be found at [12].

4.1 Routing in an Overlay Network

The main objective of this use-case is to show that it is possible to model the behavior of a network with the use of ML techniques. In particular, we present a simple proof-of-concept example in the context of overlay networks, where an Artificial Neural Network (ANN) is used to build a model of the delay of the (hidden) underlay network, which can later be used to improve routing in the overlay network.

Overlay networks have become a common solution for deployments where one network (overlay) has to be instantiated on top of another (underlay). This may be the case when a physically distributed system needs to behave as a whole while relying on a transit network, for instance a company with geo-distributed branches that connects them through the Internet. Another case is when a network has to send traffic through another with which it is not interoperable, for example when trying to send Ethernet frames over an IP-only network.

In such cases, an overlay network can be instantiated by means of deploying overlay-enabler nodes at the edge of the transit network and then tunneling overlay traffic using an encapsulation protocol (e.g., LISP (RFC 6830), VXLAN (RFC 7348), etc.). In many overlay deployments, the underlay network belongs to a different administrative domain and thus its details (e.g., topology, configuration) are hidden to the overlay network administrator (see Fig. 2 inset).

Typically, overlay edge nodes are connected to the underlay network via several links. Even though edge nodes have no control over the underlay routing, they can distribute the traffic among the different links they use to connect to it. Edge nodes can use overlay control plane protocols (e.g., LISP) to coordinate traffic balancing policies across links. However, a common problem is how to find the best (optimal) per-link policies at the edge such that the global performance is optimized. An efficient use of edge node links is critical since it is the only way the overlay operator can control, to a certain extent, the traffic path over the underlay network.

Overlay operators can rely on building a model of the underlay network to optimize the performance. However, building such a model poses two main challenges. First, neither the topology nor the configuration (e.g., routing policy) of the underlay network is known, and thus it is difficult to determine the path that each flow will follow. Second, mathematical or theoretical models may fall short to model such a complex scenario.

ML techniques offer a new tool to model hidden networks by analyzing the correlation of inputs and outputs in the system. In other words, ML techniques can model the hidden underlay network by means of observing how the output traffic behaves for a given input traffic (i.e., f(routing policy, traffic) = performance). For instance, if two edge node links share a transit node within the (hidden) underlay network, ML techniques can learn that the performance decreases when both of those links are used at the same time and therefore recommend traffic balancing policies that avoid using both links simultaneously.

4.1.1 Experimental Results

To assess the validity of this approach, we carried out the following simple experiment. We have simulated (Omnet++²) a network with 12 overlay nodes, 19 underlay elements and a total of 72 links. The simulation has the following traffic characteristics: overlay nodes randomly split the traffic independently of the destination node, the underlay network uses shortest path routing with constant link capacity, constant propagation delay and Poisson traffic generation. From the KP perspective, only the overlay nodes that send and receive traffic are seen, while the underlay network is hidden.

² https://omnetpp.org/

[Figure 2 plot: MSE [µs²] versus samples in the training set (0 to 10,000); curves: global view, local view. Inset: overlay network with a hidden underlay, with overlay nodes at the edges.]

Figure 2: Prediction error (Mean Square Error) as a function of the size of the training set in an overlay-underlay scenario, both for global and local view.

We train an ANN using Pylearn2–Theano 0.7 with one hidden layer and a sigmoid activation function. The ANN is trained using the following input features: the traffic volume, defined as the aggregated bytes for the simulated time among source-destination pairs of the overlay nodes, and the routing of the overlay, defined as the ratio of traffic that is sent through each edge link. The average delays among paths obtained in the simulation are used as output features. We train the network with 9,600 training samples and we use 300 separate samples to validate the results.

With this use-case, we aim to learn the function that relates the traffic and the routing configuration of the overlay network with the resulting average delay of each path. That is, we train the ANN using the dataset that has traffic and routing configuration as inputs, and the average delay as output. Thus, the resulting ANN models the average delay of the packets for any traffic and routing configuration.

Fig. 2 shows the error (the accuracy) of the model as a function of the training set size (solid line). This error represents how accurately the model predicts the delay when the routing and traffic are known, but not the topology. As shown by the figure, the relative error is roughly 1% when using 6,400 training samples, equivalent to a mean square error of 20 ms². In addition to this, Fig. 2 also shows (dashed line) the error when the model is trained using only local information. The main reason behind this experiment is that we aim to validate the main hypothesis stated in this paper: ML applied to a global view renders better results than when only local information is available. For this, each overlay node is trained only with local traffic, routing and delay and, as the results show, the accuracy in this case is strongly degraded. This is because the delay between two nodes depends on the state of the queues of the underlay network, which in turn depends on the total traffic of the network.

Similar scenarios have been addressed in the past (e.g., [13]) using network optimization techniques. Such mechanisms rely on models that represent the network built using either analytical techniques (e.g., Markov Chains) or computational models. In this paper we advocate that ML techniques represent a third pillar in network modeling, enabled by the global view and control offered by SDN and NA techniques. ML provides important advantages: it is data-driven, does not require the simplifying assumptions typically found in traditional network modeling, works well with complex systems (e.g., non-linearities and multi-dimensional dependencies) and, if trained well, can be general. Further information about this can be found in Section 5.
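The kind of model used in this experiment, a network with one hidden layer and a sigmoid activation that maps traffic and routing features to delay, can be sketched in plain Python. This is not the authors' Pylearn2/Theano setup and does not use their dataset: the `toy_delay` ground truth, the two input features, the layer size and the learning rate are all assumptions chosen only to keep the example small and self-contained.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class OneHiddenLayerNet:
    """Regression network: inputs -> sigmoid hidden layer -> one linear output."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, out

    def predict(self, x) -> float:
        return self.forward(x)[1]

    def train_step(self, x, y: float, lr: float) -> None:
        """One stochastic gradient step on the squared error 0.5 * (out - y)^2."""
        hidden, out = self.forward(x)
        err = out - y
        for j, h in enumerate(hidden):
            grad_h = err * self.w2[j] * h * (1.0 - h)  # back-propagate through the sigmoid
            self.w2[j] -= lr * err * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err


def mse(net: OneHiddenLayerNet, samples) -> float:
    return sum((net.predict(x) - y) ** 2 for x, y in samples) / len(samples)


def toy_delay(split: float, load: float) -> float:
    """Hypothetical ground truth: delay rises when traffic concentrates on one link."""
    return 1.0 + load * (split ** 2 + (1.0 - split) ** 2)


def make_samples(n: int, rng: random.Random):
    samples = []
    for _ in range(n):
        split, load = rng.random(), rng.random()
        samples.append(([split, load], toy_delay(split, load)))
    return samples


rng = random.Random(1)
train, test = make_samples(400, rng), make_samples(100, rng)
net = OneHiddenLayerNet(n_in=2, n_hidden=6)
mse_before = mse(net, test)
for _ in range(50):                      # epochs
    for x, y in train:
        net.train_step(x, y, lr=0.05)
mse_after = mse(net, test)
print(f"test MSE before training: {mse_before:.4f}, after: {mse_after:.4f}")
```

Repeating the training with growing subsets of `train` and plotting the test MSE would give a curve of the same kind as Fig. 2, although the numbers would not be comparable with the paper's results.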

4.2 Resource Management in an NFV scenario

This use-case shows how the KDN paradigm can also be useful in the context of Network Function Virtualization (NFV). NFV [14] is a networking paradigm where network functions (e.g., firewalls, load-balancers, etc.) no longer require specific hardware appliances but rather are implemented in the form of Virtual Network Functions (VNFs) that run on top of general purpose hardware.

Resource management in NFV scenarios is a complex problem since VNF placement may have an important impact on the overall system performance. The problem of optimal Virtual Machine (VM) placement has been widely studied for Data Center (DC) scenarios (see [15] and the references therein), where the network topology is mostly static. However, in NFV scenarios the placement of a VNF modifies the performance of the virtualized network. This increases the complexity of the optimal placement of VNFs in NFV deployments.

Contrary to the overlay case, in the VNF placement problem all the information is available, e.g., virtual network topology, CPU/memory usage, energy consumption, VNF implementation, traffic characteristics, current configuration, etc. In this case the challenge is not the lack of information but rather its complexity. The behavior of VNFs depends on many different factors and thus developing accurate models is challenging.

The KDN paradigm can address many of the challenges posed by the NFV resource-allocation problem. For example, the KP can characterize, via ML techniques, the behavior of a VNF as a function of the collected analytics, such as the traffic processed by the VNF or the configuration pushed by the controller. With this model, the resource requirements of a VNF can be modeled by the KP without having to modify the network. This is helpful to optimize the placement of this VNF and, therefore, to optimize the performance of the overall network.

4.2.1 Experimental results

To validate this use-case we model the CPU consumption of real-world VNFs when operating under real traffic. We have chosen two different network elements, an Open Virtual Switch (OVS v2.0.2³) and Snort (v2.9.6.0⁴). We have tested OVS with two different sets of rules and controller configurations: as an SDN-enabled firewall and as an SDN-enabled switch. In both cases, we have aimed to have a representative configuration of real-world deployments. With this we have three different VNFs: SNORT, Firewall and Switch.

To measure the CPU consumption of these VNFs, we have deployed them in VMs (Ubuntu 14.04.1) running on top of a hypervisor (VMware ESXi v5.5), which provides a virtual network to interconnect the VNFs using 1 Gbps links. Two VMs generate and receive traffic, and are connected to the VNF. The traffic used in this experiment was replayed using tcpreplay (version 3.4.4) from an on-campus DPI infrastructure. The campus network serves around 30k users; further details about the traffic traces can be found in [16]. To represent the traffic, we extract off-line a set of 86 traffic features in 20 second batches: number of packets, number of 5-tuple flows, average length, number of different IPs or ports, application layer information, among others. The complete set of features can be found in [12]. In the learning process, we use the Matlab ANN toolbox with one hidden layer, where the inputs are the 86 traffic features and the output is the measured CPU consumption. In this case, we aim to learn the function that relates the traffic features with the CPU consumption.

³ http://openvswitch.org/
⁴ https://www.snort.org/

[Figure 3 plot: CPU consumption versus Num Packets (×10⁴); panel shown: Snort.]

Figure 3: Measured points and the built model using two different features for two different VNFs (only showing the most relevant feature)

[Figure 4 plot: Probability versus Error [%]; curves: Firewall, OVS Switch, Snort.]

Figure 4: Cumulative Distribution Function of the relative error $(y_{pred} - y_{real})/y_{real}$ for the three VNFs

In this experiment, we use a dataset of 750 samples (600 for training and 150 for the test set) for the Firewall learning model and a dataset of 1,100 samples (900 for training and 200 for the test set) for the Snort and the Switch learning models. First we aim to understand if the model is complex (e.g., non-linear) and thus requires the use of ML. For this we show in Fig. 3 the model when only a single feature is used; specifically, we pick the input traffic feature that is most relevant to predict the CPU consumption of each of the VNFs, as determined by a PCA analysis. The figure plots the predicted CPU consumption (line) and the measured data (dots) as a function of the traffic feature used for prediction. As the plot shows, training with a single feature leads to poor accuracy while showing non-linear dependencies, motivating the use of neural networks. Finally, Fig. 4 shows the CDF of the relative error $(y_{pred} - y_{real})/y_{real}$ of the models when trained with all the 86 features defined previously, achieving very good accuracy.
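The error metric reported in Fig. 4 can be reproduced from any list of predicted and measured CPU values. The sketch below computes the relative error per sample and its empirical CDF. Taking the absolute value and expressing it in percent is an assumption made here to match the non-negative "Error [%]" axis of the figure; the sample values are invented.

```python
from typing import List, Sequence, Tuple


def relative_errors_percent(predicted: Sequence[float],
                            measured: Sequence[float]) -> List[float]:
    """Absolute relative error |y_pred - y_real| / |y_real|, in percent."""
    return [100.0 * abs(p - m) / abs(m) for p, m in zip(predicted, measured)]


def empirical_cdf(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Sorted (value, fraction of samples <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Hypothetical predictions and measurements of CPU consumption.
predicted = [102.0, 95.0, 110.0, 100.0]
measured = [100.0, 100.0, 100.0, 100.0]
for error, probability in empirical_cdf(relative_errors_percent(predicted, measured)):
    print(f"error <= {error:.1f}% for {probability:.0%} of samples")
```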


4.3 Knowledge extraction from network logsOperators typically equip their networks with a logging in-

frastructure where network devices report events (e.g., linkgoing down, packet losses, etc.). Such logs are extensivelyused by operators to monitor the health of the network andto troubleshoot issues. Log analysis is a well-known researchfield and, in the context of the KDN paradigm, it can alsobe used in networking [17]. By means of unsupervised learn-ing techniques, a KDN architecture can correlate log eventsand discover new knowledge. This knowledge can be usedby the network administrators for network operation usingthe open-loop approach, or to take automatic decisions ina closed-loop solution. These are some specific examples ofKnowledge Discovery using Network Logging and unsuper-vised learning:

• Node N is always congested around 8pm and Services X and Y have an above-average number of clients.

• Abnormal number of BGP UPDATE messages sent and Interface 3 is flapping.

• Fan speeds increase in node N when interface Y fails.
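One simple way to surface correlations like those listed above is to count how often pairs of events fall in the same time window and compare that with what independence would predict (the "lift"). The Python sketch below illustrates this idea only; it is not the method of [17] or of the authors, and the event labels, the window length, and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_lift(events, window_s=60):
    """Bucket (timestamp, event) pairs into fixed windows and score event pairs by lift."""
    windows = {}
    for ts, name in events:
        windows.setdefault(int(ts // window_s), set()).add(name)

    n = len(windows)
    single = Counter()
    pair = Counter()
    for names in windows.values():
        single.update(names)
        pair.update(combinations(sorted(names), 2))

    # lift > 1: the pair appears together more often than independence predicts
    return {
        (a, b): (count / n) / ((single[a] / n) * (single[b] / n))
        for (a, b), count in pair.items()
    }


# Hypothetical log entries: (seconds since start, event label)
log = [
    (5, "if3_flap"), (20, "bgp_update_burst"),
    (70, "fan_speed_up"),
    (125, "if3_flap"), (150, "bgp_update_burst"),
    (200, "link_down"),
]
lift = cooccurrence_lift(log)
print(lift)
```

In an open-loop setting, pairs with high lift could be reported to the operator as candidate correlations; a closed-loop system would additionally need some validation step before acting on them.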

4.4 5G mobile communications networks

The fifth generation of mobile communications networks will provide higher data rates and lower latencies, among other advances, together with an important update of the network [18]. The 5G network is, by design, a Wireless SDN (WSDN), which offers the flexible network architecture required by the 5G specifications. Moreover, 5G is complemented by the use of Network Function Virtualization (NFV) to increase the flexibility of the 5G network and to create virtual networks over the same physical network. Within this context, KDN can be easily applied as in a conventional SDN+NFV network.

Additionally, 5G networks require novel technical solutions in which the KDN paradigm can also be helpful. The high scalability required makes it necessary to design intelligent routing algorithms for a large number of users, especially when these users are mobile [18]. The design of reliable handoffs, as well as the design of dynamic routing algorithms, may take advantage of the collected data to predict user movement and thereby increase the performance of these algorithms. Moreover, such predictions can also be used to increase the efficiency of beam-steering techniques, which will help increase throughput at the physical layer.
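As a rough illustration of how collected data might be used to predict user movement, the Python sketch below fits a first-order Markov model over sequences of serving cells and predicts the most likely next cell. This is an assumed, deliberately simple technique chosen for the example, not one proposed in the paper; the cell identifiers and function names are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count cell-to-cell transitions observed in past user trajectories."""
    counts = defaultdict(Counter)
    for cells in trajectories:
        for cur, nxt in zip(cells, cells[1:]):
            counts[cur][nxt] += 1
    return counts


def predict_next(counts, cell):
    """Most frequently observed next cell, or None if the cell was never left."""
    if cell not in counts:
        return None
    return counts[cell].most_common(1)[0][0]


# Hypothetical handoff histories (sequences of cell identifiers)
history = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["E", "B", "C"],
]
model = fit_transitions(history)
print(predict_next(model, "B"))  # cell most often entered after B in the history
```

A prediction of this kind could, for example, let the network prepare resources in the likely target cell before the handoff occurs; a realistic model would also need to account for time of day, speed, and per-user habits.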

5. CHALLENGES AND CONCLUSIONS

The KDN paradigm brings significant advantages to networking, but at the same time it also introduces important challenges that need to be addressed. In what follows, we discuss the most relevant ones.

New ML mechanisms: Although ML techniques provide flexible tools for computer learning, their evolution is partially driven by existing ML applications (e.g., computer vision, recommendation systems, etc.). In this context, the KDN paradigm represents a new application for ML and, as such, requires either adapting existing ML mechanisms or developing new ones. Graphs are a notable example: they are used in networking to represent topologies, which determine the performance and features of a network. In this context, only preliminary attempts have been proposed in the literature to create sound ML algorithms able to model the topology of systems that can be represented through a graph [19]. Although such proposals are not tailored to network topologies, their core ideas are encouraging for the computer networks research area. In this sense, the combination of modern ML techniques, such as Q-learning, convolutional neural networks and other deep learning techniques, may be essential to make a step further in this area.

Non-deterministic networks: Typically, networks operate with deterministic protocols. In addition, common analytical models used in networking have an estimation accuracy, and are based on assumptions, that are well understood. In contrast, models produced by ML techniques do not provide such guarantees and are difficult for humans to understand. This also means that manual verification is usually impractical when using ML-derived models. Nevertheless, ML models work well when the training set is representative enough. Then, what is a representative training set in networking? This is an important research question that needs to be addressed. Basically, we need a deep understanding of the relationship between the accuracy of the ML models, the characteristics of the network, and the size of the training set. This might be challenging in this context, as the KP may not observe all possible network conditions and configurations during its normal operation. As a result, in some use cases a training phase that tests the network under various representative loads and configurations may be required. In this scenario, it is necessary to analyze the characteristics of such loads and configurations in order to address questions such as: does the normal traffic variability occurring in networks produce a representative training set? Does ML require testing the network under a set of configurations that may render it unusable?

New skill set and mindset: The move from traditional networks to the SDN paradigm has created an important shift in the expertise required of networking engineers and researchers. The KDN paradigm further exacerbates this issue, as it requires a new set of skills in ML techniques or Artificial Intelligence tools.

Standardized Datasets: In many cases, progress in ML techniques heavily depends on the availability of standardized datasets. Such datasets are used to research, develop and benchmark new AI algorithms. Some researchers argue that the cultivation of high-quality training datasets is even more important than new algorithms, since focusing on the dataset rather than on the algorithm may be a more straightforward approach. The publication of datasets is already a common practice in several popular ML applications, such as image recognition⁵. In this paper, we advocate similar initiatives for the computer network AI field. For this reason, all datasets used in this paper are public and can be found at [12]. These datasets have proven useful in routing and VNF experiments, and it is our hope that they help kick off a community contributing larger datasets.

Summary: We advocate that, in order to address such important challenges and achieve the vision shared in this paper, we require a truly inter-disciplinary effort between the research fields of Artificial Intelligence, Network Science and Computer Networks.

Acknowledgments. We would like to thank Prof. Shyam Parekh and David Kirsch for their valuable suggestions. We would also like to thank the anonymous reviewers for their comments that improved this paper. This work has been partially supported by the Spanish Ministry of Economy, Industry and Competitiveness and EU FEDER under grant TEC2014-59583-C2-2-R (SUNSET project) and by the Catalan Government (ref. 2014SGR-1427).

⁵ ImageNet database: http://www.image-net.org/

6. REFERENCES

[1] Clark, D., et al. “A knowledge plane for the Internet.” Conf. on Applications, technologies, architectures, and protocols for computer communications, ACM, 2003.

[2] Kreutz, D., et al. “Software-defined networking: A comprehensive survey.” Proceedings of the IEEE, vol. 103.1, pp. 14-76, 2015.

[3] Kim, C., et al. “In-band Network Telemetry via Programmable Dataplanes.” Industrial demo, ACM SIGCOMM, 2015.

[4] Rios, L. M., and Sahinidis, N. V. “Derivative-free optimization: a review of algorithms and comparison of software implementations.” Journal of Global Optimization, vol. 54, pp. 1247-1293, 2013.

[5] Narendra, K. S., and Kannan P. “Identification and control of dynamical systems using neural networks.” Neural Networks, IEEE Trans., vol. 1, pp. 4-27, 1990.

[6] Mnih, V., et al. “Human-level control through deep reinforcement learning.” Nature 518(7540), pp. 529-533, 2015.

[7] Derbel, H., et al. “ANEMA: Autonomic network management architecture to support self-configuration and self-optimization in IP networks.” Computer Networks, vol. 53.3, pp. 418-430, 2009.

[8] Zorzi, M., et al. “COBANETS: A new paradigm for cognitive communications systems.” International Conference on Computing, Networking and Communications (ICNC), pp. 1-7, 2016.

[9] Giordano, Danilo, et al. “YouLighter: A cognitive approach to unveil YouTube CDN and changes.” IEEE Transactions on Cognitive Communications and Networking, vol. 1.2, pp. 161-174, 2015.

[10] Mao, Hongzi, et al. “Resource Management with Deep Reinforcement Learning.” Hot Topics in Networks. ACM, 2016.

[11] Foster, N., et al. “Languages for software-defined networks.” IEEE Communications Magazine, vol. 51, no. 6, pp. 128-134, 2013.

[12] KDN database, http://knowledgedefinednetworking.org/

[13] Chen, Yan, et al. “An algebraic approach to practical and scalable overlay network monitoring.” ACM SIGCOMM Computer Communication Review, vol. 34.4, 2004.

[14] Han, B., et al. “Network function virtualization: Challenges and opportunities for innovations.” Communications Magazine, IEEE, vol. 53.2, pp. 90-97, 2015.

[15] Meng, X., et al. “Improving the scalability of data center networks with traffic-aware virtual machine placement.” INFOCOM, Proceedings IEEE, 2010.

[16] Barlet-Ros, P., et al. “Predictive resource management of multiple monitoring applications.” Transactions on Networking, IEEE/ACM, vol. 19(3), 2011.

[17] Kimura, Tatsuaki, et al. “Spatio-temporal factorization of log data for understanding network events.” IEEE INFOCOM, pp. 610-618, 2014.

[18] Akyildiz, Ian, et al. “5G roadmap: 10 key enabling technologies.” Computer Networks, vol. 106, pp. 17-48, 2016.

[19] Bruna, J., et al. “Spectral networks and locally connected networks on graphs.” arXiv preprint arXiv:1312.6203, 2013.

