DELIVERABLE
This project has received financial support from the European Union Horizon 2020 Programme under grant agreement no. 688203.
D4.4 Framework for Knowledge Extraction from
IoT Data Sources
Project Acronym: bIoTope
Project title: Building an IoT Open Innovation Ecosystem for Connected Smart Objects
Grant Agreement No. 688203
Website: www.bIoTope-project.org
Version: 1.0
Date: 2017-06-30
Responsible Partner: EPFL
Editor: Prodromos Kolyvakis
Contributing Partners: Fraunhofer, BIBA, CSIRO, AALTO:CS, eccenca and Enervent
Dissemination Level: Public
Ref. Ares(2017)3188041 - 26/06/2017
D4.4 Framework for Knowledge Extraction from IoT Data Sources
© 688203 bIoTope Project Partners 2 30 June 2017
Revision History
Revision Date Author Organization Description
0.1 15/11/2016 Prodromos Kolyvakis EPFL Initial ToC
0.2 23/02/2017 Prodromos Kolyvakis EPFL Sections 2 – 2.4 edited
0.3 27/03/2017 Prodromos Kolyvakis EPFL Knowledge Extraction & Fusion added
0.4 28/03/2017 Prodromos Kolyvakis EPFL Knowledge Definition added & 3rd section headers revised
0.5 18/04/2017 Prodromos Kolyvakis EPFL Section 2.4 extended; 3rd section's content added
0.6a 12/05/2017 Prodromos Kolyvakis EPFL 4th section content added
0.6b 16/05/2017 Prodromos Kolyvakis EPFL 4th section's content extended
0.7 25/05/2017 Prodromos Kolyvakis EPFL Sections 1 and 5 revised
0.7a 31/05/2017 Prodromos Kolyvakis EPFL ToC revised, some cross-references fixed
0.8a 09/06/2017 Jérémy ROBERT Uni.LU Section 3.2 revised
0.8b 15/06/2017 Jérémy Morel OpenDataSoft Sections 2, 3, 4 revised
0.9a 16/06/2017 Robert Hellbach BIBA Whole manuscript reviewed & revised
0.9b 20/06/2017 Prodromos Kolyvakis EPFL Changes based on the received feedback
1.0 20/06/2017 Prodromos Kolyvakis EPFL Final version
Every effort has been made to ensure that all statements and information contained herein are accurate; however, the bIoTope Project Partners accept no liability for any error or omission in the same.
Table of Contents
1. Introduction .......................................................................................................................... 7
1.1. Scope ........................................................................................................................................ 7
1.2. Audience ................................................................................................................................... 7
1.3. Content of this Document.......................................................................................................... 7
2. State-of-the-Art in Knowledge Extraction and Fusion ............................................................. 9
2.1. Sensor Networks ....................................................................................................................... 9
2.2. Knowledge Definition .............................................................................................................. 10
2.3. Multi-sensor Data fusion and the JDL/DFIG model ................................................................... 11
2.4. Knowledge Extraction.............................................................................................................. 13
2.5. Conclusion .............................................................................................................................. 17
3. Knowledge Extraction Framework ....................................................................................... 18
3.1. Knowledge Extraction Architecture in bIoTope KnaaS framework ............................................. 19
3.2. Integration Workflow with Existing bIoTope Components ........................................................ 21
3.3. Knowledge Extraction Implementation .................................................................................... 22
4. Case Studies ........................................................................................................................ 24
4.1. Knowledge Extraction and Fusion in bIoTope use cases and pilots ............................................ 24
4.1.1. Heat Wave Mitigation: Inform Citizens of Local Heat Conditions ............................................. 25
4.1.2. Heat Wave Mitigation: Help citizens to Ventilate their Home .................................................. 27
4.1.3. Heat Wave Mitigation: Establish a correlation between traffic and temperature ................... 29
5. Conclusion and Future Work ............................................................................................... 32
6. References .......................................................................................................................... 33
7. Online References ............................................................................................................... 35
8. Annex 1. bIoTope Big Picture ............................................................................................... 36
List of Tables
Table 1: Six Levels of the JDL/DFIG model. ........................................................................................................ 13
Table 2: The main problems of the Data Fusion process. ................................................................................. 14
List of Figures
Figure 1: Internet of Things Monitoring Cycle [6]. ............................................................................................ 12
Figure 2: A taxonomy of data fusion methodologies: Different data fusion algorithms can be roughly
categorized based on one of the four challenging problems of input data that are mainly tackled: data
imperfection, data correlation, data inconsistency, and disparateness of data form [21]. ...................... 15
Figure 3: Categorization of Reasoning Systems [40]. ........................................................................................ 16
Figure 4: Conceptual Architecture of the bIoTope Knowledge Framework ...................................................... 18
Figure 5: Key processing blocks required in knowledge framework and knowledge processing ..................... 19
Figure 6: KnaaS Architecture highlighting interactions between other bIoTope components and Linked Open
Data. .......................................................................................................................................................... 20
Figure 7: Example workflow for KnaaS integration with bIoTope. .................................................................... 21
Figure 8: bIoTope KnaaS example ..................................................................................................................... 23
Figure 9: Goal Realization Viewpoint of Lyon's Pilot Case #2 in ArchiMate. ..................................................... 24
Figure 10: Access Local Heat Conditions & Traffic Through KnaaS ................................................................... 25
Figure 11: Visualization of local Heat Conditions through KnaaS ..................................................................... 26
Figure 12: Data Exploration of the local Heat Conditions through KnaaS ........................................................ 26
Figure 13: Gauge Charts and Graphs produced via KnaaS to visually explore the sensor streams. .................... 27
Figure 14: Node-RED flow displaying the OpenWeatherMap information through dashboard notifications and
tweets. ............................................................................................................................................... 28
Figure 15: Message notification with the available information in the OpenWeatherMap. ............................ 28
Figure 16: Tweet message with the parsed information from OpenWeatherMap. ......................................... 28
Figure 17: Node-RED flow, which enables the Users to ventilate their homes accordingly. ............................ 28
Figure 18: Node for writing Python code in KnaaS............................................................................................ 29
Figure 19: Establish a correlation between traffic and temperature through KnaaS. ...................................... 30
Figure 20: MongoDB query that retrieves the geocoordinates of a specific street_name given a specific
timestamp. ................................................................................................................................................ 30
Figure 21: MongoDB query that finds the closest Netatmo local weather stations given specific geographic
coordinates. ............................................................................................................................................... 30
Figure 22: Dashboard with the computed Pearson correlation coefficients. ................................................... 31
Terms and Acronyms
API Application Programming Interface
AUCS Artificial Use Case Scenario
GSN Global Sensor Networks
HTML HyperText Markup Language
HTTP Hypertext Transfer Protocol
ICO Internet-Connected Objects
ICT Information and Communication Technologies
IoT Internet of Things
KnaaS Knowledge as a Service
O-MI Open Messaging Interface
O-DF Open Data Format
OWL Web Ontology Language
RDF Resource Description Framework
REST Representational State Transfer
RFID Radio Frequency Identification
SOA Service Oriented Architecture
SOAP Simple Object Access Protocol
SPARQL SPARQL Protocol and RDF Query Language
SQL Structured Query Language
URI Uniform Resource Identifier
XML eXtensible Markup Language
Executive Summary
The bIoTope project lays the foundation for creating open innovation ecosystems supporting the Internet of Things (IoT) by providing a platform that enables companies to easily create new IoT systems, to rapidly harness available information using advanced Systems-of-Systems (SoS) capabilities for connected smart objects, and to easily create innovative business processes. The main purpose of this deliverable is to propose an architecture and implement a framework that lets users extract valuable knowledge from IoT data sources, thus enabling them to gain insight into the interactions of various phenomena of everyday life. Toward this goal, a plethora of knowledge extraction and data fusion mechanisms are presented and implemented inside the Knowledge as a Service (KnaaS) framework. In order to assess the effectiveness of the proposed conceptual framework as well as its prototype implementation, various scenarios of Lyon’s Heat Wave Mitigation use case are examined and concrete implemented solutions are provided. More specifically, the studied scenarios show how the data fusion and knowledge discovery capabilities can be used to inform the citizens of the Urban Community of Lyon about the local heat conditions of their region and to help them ventilate their homes accordingly. In addition, it is demonstrated how a service can be set up through the KnaaS framework to establish the correlation between traffic and temperature data. It should be highlighted that the Lyon Heat Wave Mitigation pilot use case was chosen because it was considered the most representative with respect to the knowledge extraction process; however, the work presented in this deliverable is by no means specific to this use case.
The deliverable takes into account the use cases and interaction scenarios defined in Deliverable D2.1 as well as the Conceptual Architecture of the bIoTope Knowledge Framework introduced in Deliverable D4.2, and shows how the proposed approach supports them. From a technology perspective, the contributions of this deliverable are in accordance with the activities of WP5, and the work reported here has been closely coordinated with D5.4 (led by Fraunhofer). Last but not least, special consideration was given to ensuring that the Knowledge as a Service framework presented in this deliverable provides a concrete integration with the O-MI/O-DF RDF Integration (OORI) Server introduced in D5.2, with the aim to (a) exploit the advanced querying capabilities it offers inside the KnaaS framework and (b) facilitate the interaction of User Interaction (UI) as a Service and Knowledge as a Service towards the creation of a sustainable bIoTope ecosystem.
1. Introduction
1.1. Scope
This deliverable illustrates some of the core features of the prototype reference implementation of the Knowledge
Extraction Framework for IoT data sources, which enables efficient knowledge discovery in the bIoTope
ecosystem through a plethora of different extraction and fusion algorithms. The Knowledge Extraction
Framework constitutes the key technology enabler of KnaaS and, from a practical perspective, can be seen as
the key realization component of KnaaS. From a theoretical perspective, however, the KnaaS framework is
more general, as it also defines the key requirements to which any third-party service provider who is
willing to offer domain-specific KnaaS services and join bIoTope ecosystems should conform1.
In order to understand the key challenges of bIoTope’s Knowledge Extraction Framework as well as its
design choices, this document addresses the following questions:
How can knowledge extraction be formally defined?
What are the primary functions of a Knowledge Extraction Framework so that it can play such a role?
Can the reasoning mechanisms offered by the Knowledge Extraction Framework be exhaustive?
Which types of interfacing and interaction mechanisms should be implemented between the
Knowledge as a Service framework and the other bIoTope components?
1.2. Audience
The target audience of the deliverable includes groups within and outside the consortium, in particular:
Researchers, developers and integrators of the bIoTope consortium: This deliverable illustrates important aspects of the bIoTope context service formulation and delivery and will therefore serve as valuable input for stakeholders within the bIoTope consortium, notably stakeholders that work on the design of the bIoTope infrastructure and/or its implementation in the scope of the bIoTope open source project.
Researchers within other IERC and IoT EPI projects: The deliverable illustrates some of the core implementation concepts of bIoTope and will therefore be of interest to researchers in other IERC and IoT- European Platforms Initiative (IoT-EPI) projects, notably researchers working on projects that interact closely with bIoTope.
Researchers working on IoT: The deliverable will be also of interest to broader groups of IoT researchers, since it provides new insights into IoT open innovation ecosystem (e.g., sensors / cloud computing) integration. As a public document, the deliverable will be accessible to such groups.
Open source community: In the medium term, bIoTope intends to build an open source community based on the bIoTope IoT platform ecosystem. This deliverable may serve as a guide to some of the introductory, yet important topics and functionalities of bIoTope.
1.3. Content of this Document
This deliverable reports our work on the implementation of the core knowledge extraction and data fusion
mechanisms of the Knowledge Extraction Framework, paving the way to bIoTope’s Knowledge as a Service
(KnaaS) framework. After reviewing the state of the art of knowledge extraction in the IoT, this report presents
the key features of the knowledge extraction functionalities, while illustrating the interaction with the
1 For further information, please refer to chapter 3.
other bIoTope components. For the purpose of demonstrating the strengths of bIoTope’s KnaaS
framework and helping readers understand the proposed approach, different scenarios of the Heat Wave
Mitigation use case of the Urban Community of Lyon are examined and implemented within KnaaS.
The rest of this deliverable is structured as follows: Chapter 2 provides state-of-the-art definitions and
discussions of knowledge, knowledge reasoning, knowledge extraction and knowledge fusion, particularly
taking into account their applications in the Internet of Things. Chapter 3 deals with the core functionalities of
the Knowledge Extraction Framework as well as with its reference implementation. In chapter 4, the
application of these functionalities is illustrated by demonstrating step by step how the provided prototype
implementation can provide solutions in the Heat Wave Mitigation pilot use case of the Urban Community of
Lyon. Finally, chapter 5 summarizes the work demonstrated in this deliverable and outlines future work.
2. State-of-the-Art in Knowledge Extraction and Fusion
The essence of an IoT ecosystem is to enable the secure connection of a multitude of heterogeneous sensing
and actuating devices, having different constraints and capabilities, to the Internet. In the absence of de facto
communication standard(s), sensing and actuating devices from different vendors may subscribe to different
interaction patterns and may implement different subsets of the available communication protocols. As a result,
arguably, the value of an IoT ecosystem grows proportionally with the number and versatility of the
supported devices [1]. Current IoT solutions address the issue of interfacing heterogeneous devices
differently. Generally, interoperability with devices is ensured either by implementing a gateway that can
be expanded, e.g. with the help of plugins, to support new types of devices whenever needed, or by
mandating device vendors to use protocols from a limited set of supported ones.
Note, however, that either the heterogeneity of supported devices is limited or the use of a gateway is
necessary. In a recent IoT gap analysis [1], the authors concluded that “in order to streamline the integration
of new device types, standard object models for IoT devices” and universal messaging standards should be
widely integrated in an IoT ecosystem. For a smooth integration with sensing and actuating devices, it is
essential that IoT communities establish standardized protocols that enable the
publication, consumption and composition of heterogeneous information sources and services from across
various platforms. To this end, bIoTope takes full advantage of recent Open API standards for the IoT [2],
notably O-MI (Open Messaging Interface) [3] and O-DF (Open Data Format) [4], which can be extended with
more specific vocabularies, e.g. using semantic web and ontology technologies.
This architectural choice constitutes an innovative step towards the realization of an IoT ecosystem and
facilitates the integration of heterogeneous data sources, the fusion of these sensor data – sensor fusion – as
well as knowledge discovery and reasoning upon the available sensor data – knowledge extraction. In the
rest of this chapter, we provide the background knowledge required for the next sections and
summarize state-of-the-art discussions on knowledge extraction.
2.1. Sensor Networks
Sensor networks are a major enabler of the IoT. A sensor can be defined as a device that detects or measures
a physical phenomenon such as humidity, temperature, etc. A sensor node is a physical platform that hosts
one or more sensors. Each sensor node has the capability to sense, communicate and process data. A typical
sensor network [5] comprises two or more sensor nodes which communicate with each other using wired
or wireless means. In sensor networks, sensors can be homogeneous or heterogeneous. Multiple sensor
networks can be connected together through different mechanisms, one of which is the Internet.
Typically, sensor nodes are deployed densely around the phenomenon we want to sense
[5]. These sensor nodes are low-cost and small in size, which enables large deployments. The sensor network is not a
concept that emerged with the IoT: sensor networks and the related research existed long
before the IoT was defined, as can clearly be seen from the literature in the field. With the emergence of
the IoT, however, sensor networks have been adopted into the mainstream as a major
technology used to realise the IoT vision.
In recent times, another widely recognised source of sensor data is mobile smart devices. The
ubiquitous nature of mobile smart devices such as smartphones, tablets and smartwatches, to name a few, and the
availability of cheap embedded sensors have completely revolutionised the dimensions of smart city applications
[6].
2.2. Knowledge Definition
In the literature, there is a wide range of definitions of, and only tacit agreement on, what knowledge is. Starting
with Hintikka’s [7] pioneering contribution, the notions of knowledge and belief have been studied extensively
in the literature. Bonanno’s definitions of information and knowledge constitute one of the most widely accepted
formal definitions in the literature [8]. Apart from providing a conceptual framework for information and
knowledge, Bonanno provides a strict axiomatic system of what knowledge and information are [8], [9].
According to Bonanno, “information is modelled as possibilities consistent with signal received from the
environment. Knowledge is obtained by reasoning about the signals received as well as those that are missing”
[8].
One of the most important aspects of Bonanno’s model is that the absence of information can also be
valuable input for deriving knowledge. This is explained by the argument presented below, adapted
from Conan Doyle’s Silver Blaze mystery. “In the dead of night, someone removed the horse Silver Blaze from
the stable in which he was kept. Footprints found outside the stable match those of two individuals, who
therefore become the primary suspects. During the investigation, Scotland Yard Inspector Gregory asks
Sherlock Holmes” [8], [10]:
Gregory: ‘Is there any other point to which you should wish to draw my attention?’
Holmes: ‘To the curious incident of the dog in the night-time’.
Gregory: ‘The dog did nothing in the night-time’.
Holmes: ‘That was the curious incident’.
Holmes deduces from the fact that the dog did not bark (the absence of a signal) that the thief must have been
known to the dog, and is therefore able to eliminate one of the two suspects on these grounds [8], [9]. We
present the basic definitions of signal, information and knowledge based on Bonanno’s model.
Information can be thought of as possibilities associated (or consistent) with signals (sensor data) received from
the environment. Let Ω be a set of states and Ω_K ⊆ Ω be the set of known states. For instance,
Ω could be the set of all diseases and Ω_K the set of known diseases (e.g. the set of diseases in some
database). Let Σ be a set of signals (sensor data) and σ: Ω → 2^Σ (where 2^Σ denotes the set of subsets of Σ)
be a function that associates with every state ω the set of signals produced by ω. In the example where
states are identified with diseases, signals can be thought of as symptoms, so that σ(ω) is the set of
symptoms associated with disease ω [8].
We define a signal as anything that alters the physical environment; namely, there is an objectively
measurable difference between the situation when the signal is present and when it is absent. In
an example from the medical domain, signals can be thought of as sensor data, e.g. lung auscultation
sounds, radiographs, etc. On the other hand, according to this terminology, the absence of a signal is not itself
a signal. This allows us to think of information as signals received and to represent knowledge as inference
based on the signals that are present as well as those that are absent.
Definition: The information function I: Ω → 2^Ω is given by: I(ω) = {ω′ ∈ Ω_K : σ(ω′) ⊇ σ(ω)}.
Thus I(ω) is the set of known states that are compatible with the signals produced by the true state ω, in the
sense that those states would also have produced those signals (although they might have more signals
associated with them). For example, one can imagine a database of known diseases and their associated
symptoms, and a computer program which receives as input the patient’s symptoms and gives as output the
list of diseases in the database that manifest all those symptoms. The next step for a careful doctor would
be to research each of the reported possible diseases and eliminate, as true possibilities, all those diseases
that have extra symptoms not exhibited by the patient. This step corresponds to the notion of knowledge
derived from information [8]. Although the treatment in [8], [9] defines several additional axioms
that the information function should fulfil, such as secondary reflexivity, transitivity, etc., we have chosen to omit
them for the sake of simplicity.
Based on the above definitions, we now proceed to define knowledge. Let K: Ω → 2^Ω be the
knowledge function. K(ω) is interpreted as the set of states that, based on what the individual knows,
the individual cannot rule out when the state is ω. We emphasize that we do not impose any
additional requirements on the steps in reasoning – deduction, induction, and abduction (for further
information please see section 2.4). We impose only the following requirements [8]:
1. Knowledge should be based on information received.
2. Knowledge should reflect reasoning about the information.
3. Knowledge should be derived exclusively from the available information.
With the above definitions and requirements, we have illustrated the key concepts of
Bonanno’s model of how information and knowledge can be defined. Of course, we did not aim at a complete
presentation of Bonanno’s theory. For a strict and formal definition of information and knowledge, which is
consistent with the above statements, please refer to [8], [9].
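As a toy illustration of these definitions, the information function and the careful doctor’s elimination step can be sketched in a few lines of Python. The diseases, symptoms and the equality-based refinement used below are invented for illustration and are not part of Bonanno’s formal treatment:

```python
# Toy sketch of Bonanno's information function I(ω) [8] on the disease/symptom
# example from the text. Only the set-theoretic definitions come from the
# deliverable; the state and signal names are invented for illustration.

# σ: associates each known state (disease) with its set of signals (symptoms).
sigma = {
    "flu":       {"fever", "cough", "fatigue"},
    "cold":      {"cough", "fatigue"},
    "pneumonia": {"fever", "cough", "fatigue", "chest_pain"},
}

def information(observed_signals):
    """I(ω): known states whose signal set is a superset of the observed signals."""
    return {state for state, signals in sigma.items()
            if signals >= observed_signals}

def knowledge(observed_signals):
    """The careful doctor's refinement: additionally eliminate states whose
    extra signals were not observed (absence of a signal used as evidence)."""
    return {state for state, signals in sigma.items()
            if signals == observed_signals}

# A patient exhibits fever, cough and fatigue, and nothing else.
obs = {"fever", "cough", "fatigue"}
print(sorted(information(obs)))  # ['flu', 'pneumonia'] – compatible with the signals
print(sorted(knowledge(obs)))    # ['flu'] – pneumonia ruled out: no chest pain observed
```

The second step mirrors the Silver Blaze argument: the *absence* of the chest-pain signal is what eliminates pneumonia as a true possibility.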
2.3. Multi-sensor Data fusion and the JDL/DFIG model
“By 2020, wirelessly networked sensors in everything we own will form a new Web. But it will only be of value
if the “terabyte torrent” of data it generates can be collected, analysed and interpreted” [11]. The IoT will
produce a substantial amount of data [12] that is of limited use unless we are able to derive knowledge from
it. Data fusion is a data processing technique that associates, combines, aggregates, and integrates data
from different sources. It helps to build knowledge about certain events and environments that would not be
possible by using individual sensors separately. Data fusion also helps to build a context-awareness model that
helps to understand situational context [6]. The boundary between sensor fusion and sensor integration is
quite fuzzy, and the terms are sometimes used interchangeably. Joshi and Sanderson describe multi-sensor
fusion as part of the multi-sensor integration process [13]. This process refers to the synergistic use of multiple
sensors to improve the operation of the system as a whole and includes sensor planning and sensor architecture.
Data fusion in the IoT environment constitutes one of the most important challenges that need to be addressed
to develop innovative services. In particular, in smart city applications, when 50 to 100 billion devices start
sensing [14], it will be essential to fuse and reason about the data automatically and intelligently. A recent
work by a group of researchers from MIT [15] demonstrates the potential of fusing data from disparate data
sources in a smart city to understand a city’s attractiveness. The work focuses on cities in Spain and shows how
the fusion of big data sets can provide insights into the way people visit cities. Such a correlation of data from
a variety of data sources plays a vital role in delivering services successfully in the smart cities of the future. Fusion
is a broad term that can be interpreted in many ways. Hall and Llinas [16] have defined sensor data fusion
as a method of combining data from multiple sensors to produce more accurate, more complete, and
more dependable information than could be achieved through a single sensor. Nakamura et al.
[17] have defined data fusion based on three key operations: complementary, redundant, and cooperative.
Complementary means putting the bits and pieces of a larger picture together. A single sensor cannot say
much about the environment, as it is focused on measuring a single factor such as temperature.
However, when we have data sensed by a number of different sensors, we can understand the
environment in a much better way.
Redundant means that the same environmental factor is sensed by different sensors. This helps to
increase the accuracy of the data. For example, averaging the temperature values sensed by two
sensors located in the same physical location produces more accurate information than
a single sensor would. It also reduces the amount of data that needs to be handled, as it combines the two
data streams into one.
Cooperative operations combine sensor data together to produce new knowledge. For example,
RFID tag readings recorded in a supermarket can be used to identify events such as shoplifting.
Consider a scenario where an RFID reader in a supermarket shelf detects that an item has been
removed from the shelf. The RFID sensor at the counter does not see the object during payment. Later,
the RFID sensor at the exit door detects the item that was removed from the shelf earlier. This
sequence of actions can be simply inferred as a shoplifting event.
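The three operations can be sketched with toy data; the sensor names, readings and thresholds below are invented for illustration and do not correspond to any bIoTope API:

```python
# Minimal sketch of the three fusion operations of Nakamura et al. [17]
# on invented sensor readings.

from statistics import mean

# Complementary: different sensors each measure one factor; together they
# describe the environment more completely than any single sensor could.
readings = {"temperature_c": 31.5, "humidity_pct": 62.0, "co2_ppm": 480.0}

# Redundant: the same factor sensed by two co-located sensors is averaged,
# improving accuracy and merging two data streams into one.
temp_sensor_a, temp_sensor_b = 31.0, 32.0
fused_temperature = mean([temp_sensor_a, temp_sensor_b])   # 31.5

# Cooperative: a sequence of individually weak observations is combined
# into new knowledge (the shoplifting example from the text).
events = ["removed_from_shelf", "exit_door_detected"]       # no "paid_at_counter"
shoplifting_suspected = ("removed_from_shelf" in events
                         and "paid_at_counter" not in events
                         and "exit_door_detected" in events)
print(fused_temperature, shoplifting_suspected)             # 31.5 True
```

Note how the cooperative case infers an event that no single RFID reader could report on its own, which is precisely what distinguishes it from redundant averaging.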
The ultimate goal of sensor data fusion is to understand the environment and act accordingly. This can be
defined as a cycle, as shown in Figure 1, called the Internet of Things Monitoring Cycle [6]. It has five steps:
Collection, Collation, Evaluation, Decide, and Act. The IoT monitoring cycle has been derived by combining the
Intelligence Cycle [18] and the Boyd Control Loop [19]. The Collection step collects raw data from sensors
and other IoT data sources (social media, smart city infrastructure, mobile devices, etc.). The Collation step
analyses, compares and correlates the collected data. The Evaluation step fuses the data in order to understand
and provide a full view of the environment. The Decide step decides the actions that need to be taken. The
Act step simply applies the actions decided in the previous step; it includes actuator control as well
as sensor calibration and re-configuration [6].
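The five steps above can be sketched as a simple control pipeline; the step functions, sensor values and threshold below are placeholder assumptions for illustration, not bIoTope components:

```python
# Sketch of the five-step IoT Monitoring Cycle [6] as a simple pipeline.
# Only the Collection → Collation → Evaluation → Decide → Act ordering
# comes from the text; everything else is an invented placeholder.

def collect():
    """Collection: gather raw data from sensors and other IoT sources."""
    return [{"sensor": "temp-1", "value": 36.0}, {"sensor": "temp-2", "value": 35.4}]

def collate(raw):
    """Collation: analyse, compare and correlate the collected data."""
    return {r["sensor"]: r["value"] for r in raw}

def evaluate(collated):
    """Evaluation: fuse the data into a single view of the environment."""
    return sum(collated.values()) / len(collated)

def decide(fused_temperature, threshold=35.0):
    """Decide: choose the action to take from the fused view."""
    return "open_vents" if fused_temperature > threshold else "idle"

def act(action):
    """Act: apply the decision (actuator control, recalibration, ...)."""
    return f"executed:{action}"

result = act(decide(evaluate(collate(collect()))))
print(result)  # executed:open_vents
```

In a real deployment the Act step would feed back into Collection (e.g. recalibrated sensors), closing the cycle rather than ending the pipeline.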
In an attempt to unify the terminology associated with data and information fusion, the Data Fusion
Subpanel formed by the Joint Directors of Laboratories (which later became known as the Data Fusion Group)
developed the JDL fusion model [20]. With subsequent revisions, it is the most widely used system for
understanding data fusion processes. The goal of this model is to facilitate understanding and communication
among researchers, designers, developers, evaluators, and users of data and information fusion techniques,
permitting cost-effective system design, development, and operation [21]. The JDL/DFIG model differentiates
data fusion functions into a set of fusion levels and provides a useful distinction among data fusion processes
that relate to the refinement of objects, situations, threats, and processes at several levels. With the advent
of the World Wide Web, data fusion has thus come to include data, sensor, and information fusion [21]. The JDL/DFIG
introduced a model of data fusion divided into various processes. Currently, the six levels of the Data
Fusion Information Group (DFIG) model are:
Figure 1: Internet of Things Monitoring Cycle [6].
D4.4 Framework for Knowledge Extraction from IoT Data Sources
© 688203 bIoTope Project Partners 13 30 June 2017
Table 1: Six Levels of the JDL/DFIG model.
Levels Description
Level 0:
Source Preprocessing/
Subject Assessment
Estimation and prediction of signal/object observable states on the basis of pixel/signal level data association and characterization.
Level 1:
Object Assessment
Estimation and prediction of entity states on the basis of observation-to-track association, continuous state estimation (e.g. kinematics) and discrete state estimation (e.g. target type and ID).
Level 2:
Situation Assessment
Estimation and prediction of relations among entities, to include force structure and cross force relations, communications and perceptual influences, physical context, etc.
Level 3:
Impact Assessment
Estimation and prediction of effects on situations of planned or estimated/predicted actions by the participants; to include interactions between action plans of multiple players (e.g. assessing susceptibilities and vulnerabilities to estimated/predicted threat actions given one’s own planned actions).
Level 4:
Process Refinement
Adaptive data acquisition and processing to support mission objectives.
Level 5:
User Refinement
Adaptive determination of who queries information and who has access to
information (e.g. information operations) and adaptive data retrieved and
displayed to support cognitive decision making and actions (e.g. human-computer
interface) [22].
Although the JDL Model (Levels 1–4) is still in use today, it is often criticized for its implication that the levels
necessarily happen in order, and also for its lack of adequate representation of the potential for a
human-in-the-loop [23]. The DFIG model (Levels 0–5) explored the implications of situation awareness, user refinement,
and mission management [24]. Despite these shortcomings, the JDL/DFIG models are useful for visualizing the
data fusion process, facilitating discussion and common understanding [25], and important for systems-level
information fusion design [24].
2.4. Knowledge Extraction
Knowledge extraction – also known as Knowledge Discovery and Data Mining (KDD) – concerns the creation
of knowledge out of information coming from structured and unstructured data sources. Examples of
structured sources include relational databases, NoSQL stores, or XML-formatted data, whereas unstructured
sources include text documents and images, which are largely accessible via the Web. It is an
interdisciplinary area focusing upon methodologies for extracting useful knowledge from data. The ongoing
rapid growth of online data due to the Internet and the widespread use of databases have created an immense
need for KDD methodologies. The challenge of extracting knowledge from data draws upon research in
statistics, databases, pattern recognition, machine learning, data visualization, optimization, and high-
performance computing, to deliver advanced business intelligence and web discovery solutions. In bIoTope,
ecosystem applications will be accessed through O-MI/O-DF compatible wrappers2. Hence, this approach
facilitates the exchange of information and permits the knowledge extraction process to be performed in a
uniform manner. However, knowledge extraction remains a key challenge, as it requires a number of
different advanced techniques to be applied to the data in order to gain valuable insight into it. Data
Fusion, a very important aspect of Knowledge Extraction, is a set of data processing techniques
that permits the association, combination, aggregation, and integration of data coming from different and
heterogeneous data sources. There are a number of issues that make data fusion a challenging task. The
2 For further information, please refer to section 2.3 in Deliverable D4.2.
majority of these issues arise from the data to be fused, the imperfection and diversity of the sensor
technologies, and the nature of the application environment, as follows:
Table 2: The main problems of the Data Fusion process.
Data Fusion
Main Problems
Description
Data
Imperfection
Data provided by sensors is always affected by some level of impreciseness as well as
uncertainty in the measurements. Data fusion algorithms should be able to express such
imperfections effectively, and to exploit the data redundancy to reduce their effects.
Outliers and
spurious data
The uncertainties in sensors arise not only from the impreciseness and noise in the
measurements, but are also caused by the ambiguities and inconsistencies present in
the environment, and from the inability to distinguish between them [26]. Data fusion
algorithms should be able to exploit the redundant data to alleviate such effects.
Conflicting data
Fusion of conflicting data can be problematic, especially when the fusion system is based
on evidential belief reasoning and Dempster's rule of combination [27]. To avoid
producing counter-intuitive results, any data fusion algorithm must treat highly
conflicting data with special care.
Data modality
Sensor networks may collect qualitatively similar (homogeneous) or different
(heterogeneous) data, such as auditory, visual, and tactile measurements of a
phenomenon. Both cases must be handled by a data fusion scheme.
Data
Correlation
This issue is particularly important and common in distributed fusion settings, e.g.
wireless sensor networks, where some sensor nodes are likely to be exposed to
the same external noise biasing their measurements. If such data dependencies are not
accounted for, the fusion algorithm may suffer from over/under confidence in its results.
Data
alignment/
registration
Sensor data must be transformed from each sensor’s local frame into a common frame
before fusion occurs. Such an alignment problem is often referred to as sensor
registration and deals with the calibration error induced by individual sensor nodes. Data
registration is of critical importance to the successful deployment of fusion systems in
practice.
Data
association
Multi-target tracking problems introduce a major complexity to the fusion system
compared to the single-target tracking case [28]. One of these new difficulties is the data
association problem, which may come in two forms: measurement-to-track and
track-to-track association. The former refers to the problem of identifying from which target,
if any, each measurement originated, while the latter deals with distinguishing and
combining tracks that estimate the state of the same real-world target [29].
Processing
framework
Data fusion processing can be performed in a centralized or decentralized manner. The
latter is usually preferable in wireless sensor networks, as it allows each sensor node to
process locally collected data. This is much more efficient than the communication
burden of a centralized approach, in which all measurements
have to be sent to a central processing node for fusion.
Operational
timing
The area covered by sensors may span a vast environment composed of different
aspects varying at different rates. Also, in the case of homogeneous sensors, the
operation frequency of the sensors may differ. A well-designed data fusion
method should incorporate multiple time scales in order to deal with such timing
variations in data. In distributed fusion settings, different parts of the data may traverse
different routes before reaching the fusion center, which may cause out-of-sequence
arrival of data. This issue needs to be handled properly, especially in real-time
applications, to avoid potential performance degradation.
Static vs.
dynamic
phenomena
The phenomenon under observation may be time-invariant or varying with time. In the
latter case, it may be necessary for the data fusion algorithm to incorporate a recent
history of measurements into the fusion process [13]. In particular, data freshness, i.e.,
how quickly data sources capture changes and update accordingly, plays a vital role in
the validity of fusion results.
Data
dimensionality
The measurement data could be preprocessed, either locally at each of the sensor nodes
or globally at the fusion center to be compressed into lower dimensional data, assuming
a certain level of compression loss is allowed. This preprocessing stage is beneficial as it
saves the communication bandwidth and power required for transmitting
data, in the case of local preprocessing [30], or limits the computational load of the
central fusion node, in the case of global preprocessing [31].
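As an illustration of one of the problems listed in Table 2, the out-of-sequence arrival of data noted under Operational timing can be mitigated with a small reordering buffer at the fusion center. The sketch below (the buffer depth and interface are assumptions) trades a bounded delay for in-order processing:

```python
import heapq

# Sketch of a reordering buffer for a fusion centre: measurements arriving
# out of sequence are held in a min-heap keyed by timestamp and released in
# timestamp order once the buffer exceeds a fixed depth.

class ReorderBuffer:
    def __init__(self, depth=3):
        self.depth = depth
        self.heap = []

    def push(self, timestamp, value):
        """Insert one measurement; return any measurements now releasable."""
        heapq.heappush(self.heap, (timestamp, value))
        released = []
        while len(self.heap) > self.depth:
            released.append(heapq.heappop(self.heap))
        return released

    def flush(self):
        """Release everything remaining, in timestamp order."""
        out = []
        while self.heap:
            out.append(heapq.heappop(self.heap))
        return out

# Measurements arrive out of order; the buffer emits them in order.
buf = ReorderBuffer(depth=2)
out = []
for ts, v in [(3, "c"), (1, "a"), (2, "b"), (5, "e"), (4, "d")]:
    out.extend(buf.push(ts, v))
out.extend(buf.flush())
```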
While many of these problems have been identified and heavily investigated, no single data fusion algorithm
is capable of addressing all the aforementioned challenges. The various methods in the literature each focus
on a subset of these issues, determined by the application at hand. Our presentation of the data fusion
literature is organized according to the taxonomy shown in Figure 2. This taxonomy, presented in the work of
[32], illustrates an overview of data-related challenges that are typically tackled by data fusion algorithms.
The input data to the fusion system may be imperfect, correlated, inconsistent, and/or in disparate
forms/modalities. Each of these four main categories of challenging problems can be further subcategorized
into more specific problems, as shown in Figure 2. For each of these categories, many approaches have been
developed to tackle specific aspects. For instance, a number of mathematical theories are available to
represent data imperfection [33], such as probability theory [34], fuzzy set theory [35], [36], possibility
theory [37], rough set theory [38], and Dempster–Shafer evidence theory (DSET) [39]. It is clear that none of
these techniques can substitute for the others. Thus, the art of data fusion is more about choosing among a
set of appropriate fusion tools rather than choosing one to solve each specific task.
Figure 2: A taxonomy of data fusion methodologies: Different data fusion algorithms can be roughly categorized based on one of the
four challenging problems of input data that are mainly tackled: data imperfection, data correlation, data inconsistency, and
disparateness of data form [21].
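As a concrete illustration of the evidential option, Dempster's rule of combination mentioned above can be sketched in a few lines of Python. The two sensors' mass assignments below are invented for illustration:

```python
from itertools import product

# A minimal sketch of Dempster's rule of combination for two basic belief
# assignments over frozenset focal elements: intersecting masses are
# multiplied and accumulated, conflicting mass K is discarded, and the
# result is renormalized by 1 - K.

def dempster_combine(m1, m2):
    """Combine two basic belief assignments with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc           # mass assigned to the empty set
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

# Two sensors assessing whether a target is of type A or B (invented masses).
A, B = frozenset({"A"}), frozenset({"B"})
AB = A | B                                # the frame of discernment
m1 = {A: 0.6, AB: 0.4}
m2 = {A: 0.5, B: 0.3, AB: 0.2}
fused = dempster_combine(m1, m2)
```

The fused assignment concentrates belief on type A, since both sensors lean that way and their conflict is renormalized away.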
With regard to knowledge creation from heterogeneous data sources, the knowledge will be the result of
inference procedures. Inferences are steps in reasoning, moving from premises to conclusions. Charles
Sanders Peirce divided inference into three kinds: deduction, induction, and abduction. Another, more
refined categorization of reasoning systems is provided in [40] and depicted in Figure 3.
According to it, reasoning is further divided into First Order Logic Reasoning, Probabilistic Reasoning, Causal
Reasoning, Newtonian Mechanics, Spatial Reasoning and Non-falsifiable Reasoning. Deduction is inference
deriving logical conclusions from premises known or assumed to be true, with the laws of valid inference being
studied in logic (First Order Logic in Figure 3).
Figure 3: Categorization of Reasoning Systems [40].
Abduction is inference to the best explanation. Statistical inference uses mathematics to draw conclusions in
the presence of uncertainty. This generalizes deterministic reasoning, with the absence of uncertainty as a
special case. Statistical inference uses quantitative or qualitative (categorical) data which may be subject to
random variations. In that direction, machine learning explores the study and construction of
algorithms that can learn from and make predictions on data [41]. Machine learning is a subfield of computer
science that evolved from the study of pattern recognition and computational learning theory in artificial
intelligence. In 1959, Arthur Samuel defined machine learning as a "Field of study that gives computers the
ability to learn without being explicitly programmed" [42]. Such algorithms operate by building a model from
example inputs in order to make data-driven predictions or decisions, rather than following strictly static
program instructions.
Causal reasoning addresses a well-known expressive limitation of probabilistic reasoning. For instance, "we can
establish a correlation between the events "it is raining" and "people carry open umbrellas". This correlation
is predictive: if people carry open umbrellas, we can be pretty certain that it is raining3. But this correlation
tells us little about the consequences of an intervention: banning umbrellas will not stop the rain" [40].
Classical mechanics is an extremely successful example of a causal reasoning system and an example
of what is called Newtonian mechanics reasoning. Spatial reasoning provides answers to questions such as
"How would a visual scene change if one changes the viewpoint or if one manipulates one of the objects in
the scene?" [40]. Spatial reasoning does not require the full logic apparatus but certainly benefits from the
definition of specific algebraic constructions [43]. Lastly, history provides countless examples of reasoning
systems with questionable predictive capabilities, such as astrology, mythology, etc. Just like non-falsifiable
statistical models, non-falsifiable reasoning systems are unlikely to have useful predictive capabilities [40],
[44].
There are two ways to face such a universe of reasoning systems. One approach would be to identify a single
reasoning framework strictly more powerful than all others. Whether such a framework exists and whether it
leads to computationally feasible algorithms is unknown [40]. As in the case of data fusion, the creation of
knowledge out of information, which comes in a semantically annotated or in raw data form, may require many
different reasoning techniques to be employed on the available data. This highlights the importance of a
Knowledge as a Service that provides various different mechanisms of reasoning upon data. Last but not least,
as it is very difficult to discover a single reasoning framework more powerful than all others, special
consideration should be given to making it easily extensible and adaptable.
3 An alternative conclusion may be that the weather is sunny. Cultural and local aspects play a significant role in this example as well.
2.5. Conclusion
In this chapter, we have provided the background knowledge required for the next sections and have given
a summary of the state of the art in data fusion and knowledge extraction. We began by defining
information and knowledge to share a common understanding throughout the chapter, as these notions
correspond to the main concepts that data fusion and knowledge discovery work upon. We have explained in
detail the issues that make data fusion a challenging task, the majority of which arise from the imperfection
of the data and the diversity of the sensor technologies. In addition, we provided an
overview of the key reasoning mechanisms as well as the key difficulties encountered in identifying a single
reasoning or data fusion framework strictly more powerful than all others. This knowledge will be valuable for
understanding the key architectural and implementation choices behind both the conceptual architecture
of the Knowledge as a Service and its prototype implementation, as described in chapter 3.
3. Knowledge Extraction Framework
In deliverable D4.2 – Knowledge Representation and Inference Framework – the conceptual architecture of
‘Knowledge as a Service (KnaaS)’ was described and a recommendation architecture was provided,
depicted in Figure 4. In this deliverable we go into further detail regarding the “demystification” of the
Knowledge Extraction Framework as well as the whole Knowledge as a Service framework. Special attention
will be given to the connection with the ‘bIoTope Ecosystem’ as well as their implementation design.
It should be highlighted that within the bIoTope Ecosystem the O-MI4 and O-DF5 standards will be the ‘glue
technologies’ that make all the IoT devices and platforms interoperable. With this in mind,
communication using O-MI/O-DF is the key requirement that any system or service should conform to. From
the point of view of the ‘bIoTope Ecosystem6’, Knowledge as a Service (KnaaS) also serves as a recommendation
architecture paradigm. Other third-party service providers who are willing to provide domain-specific KnaaS
services might be able to join bIoTope ecosystems by making their platform/software systems compliant at
least with the O-MI/O-DF communication, the recommended semantic vocabularies used in the bIoTope
ecosystem, etc.
Based on the state-of-the-art discussion and the analyses well described in [45], [46], four main functions are
identified in order for any sensor data to be integrated within a knowledge framework. The functional blocks
are summarized below:
1. The capability of gathering relevant data from any kind of data sources, e.g., structured and/or
unstructured.
4 https://www2.opengroup.org/ogsys/catalog/C14B
5 https://www2.opengroup.org/ogsys/catalog/C14A
6 Please refer to “bIoTope Big Picture” in Annex I.
Figure 4: Conceptual Architecture of the bIoTope Knowledge Framework
2. The capability of annotating heterogeneous data so that they can be integrated and understood
by an inference engine, e.g., probabilistic, first order logic reasoning, etc.
3. The capacity of inferring new information from data, and managing the inferred knowledge.
4. The capability of relevantly annotating the response of knowledge services for the purpose of posting
the answer on user devices, activating operations, or republishing it.
Thus, the KnaaS will offer the possibility for an agent, either a user or a software program, to gather relevant
data, to further annotate the heterogeneous data throughout the intermediate steps, to infer new information
out of the incoming information, and to relevantly mark up the response of knowledge services. We would like
to highlight that, as O-DF will be the basic data format for exchanging information within the bIoTope
ecosystem, a lot of consideration is given to integrating semantic annotation into O-DF messages. This is why
we mentioned previously the further annotation of the heterogeneous data coming to the KnaaS7. Last but not
least, the second requirement is not restrictive as far as the reasoning techniques are concerned. Thus, the
inference capabilities of bIoTope’s KnaaS do not aim to be exhaustive. It is clear that a new platform
offering ‘Knowledge as a Service’ capabilities that enters the bIoTope ecosystem will probably provide similar
or complementary inference capabilities. This is in accordance with the overall objective of bIoTope, which is
to create a system of systems where information from cross-domain platforms, devices and other information
sources can be accessed when, and as, needed using standardized open APIs.
The above-mentioned functional building blocks are sequentially arranged in Figure 5, respecting the order of
processing.
3.1. Knowledge Extraction Architecture in bIoTope KnaaS framework
We provide a prototype implementation with knowledge fusion and inference capabilities, following the
KnaaS conceptual architecture presented in Figure 4. To be in accordance with both the ‘Everything as a
Service’ principles and bIoTope’s open APIs requirement, we have chosen a service-based approach,
i.e., a Knowledge Service Request will be sent to the KnaaS and the communication will be achieved through
O-MI/O-DF.
Figure 68 provides an overview of the components that KnaaS offers (indicated by the dashed green line) and
their interfaces. Inside the KnaaS, different knowledge extraction mechanisms are provided in order to fuse
and/or reason upon the available information. The KnaaS consists of a visual programming interface in which
different knowledge extraction algorithms can be deployed through a drag-and-drop mechanism. At the same
time, the KnaaS framework permits the direct implementation of advanced knowledge extraction methods,
7 For further information, please refer to section 2.3 in Deliverable D4.2.
8 For the creation of Figure 6, icons from www.flaticon.com and the Linking Open Data cloud diagram 2017 [54] were used.
Figure 5: Key processing blocks required in the knowledge framework and knowledge processing: data from
context, big data and other sources; semantic annotation; knowledge inferencing; response markup.
apart from those provided as black boxes in the visual programming interface. The coding of these custom
functions can be performed either in JavaScript or in Python. Advanced
knowledge extraction mechanisms are offered in the prototype implementation, such as dimensionality
reduction, clustering algorithms, classification and/or regression algorithms, SPARQL queries, etc. Last but not
least, it provides abstract interfaces in order to parse O-DF formatted data and query the O-MI nodes. Figure
6 shows the interaction of the KnaaS extraction framework with another bIoTope component (the OORI Server)
as well as with the Linked Open Data Cloud. The OORI server consists of a conversion service component, which
receives the URI of an O-MI node and the O-MI/O-DF XML structure describing the objects that should be
converted into their RDF representation. It is presented in further detail in deliverable D5.2 and constitutes
one of the main results of WP5. The KnaaS exploits the OORI server in order to perform advanced SPARQL queries
on O-MI nodes and to extract valuable information out of the new semantic representation of the O-DF messages.
The role of IoTBnB and its interactions with the KnaaS will be described in detail in the next section.
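As an illustration of the kind of custom function that can be implemented in Python inside such a flow (the function and data below are illustrative assumptions, not part of the KnaaS API), consider a plain least-squares line fit that could serve as a minimal regression node over two value streams:

```python
# Illustrative sketch of a custom knowledge-extraction function of the kind
# that can be dropped into the KnaaS visual flow: a plain least-squares line
# fit over two paired value streams. Names and data are assumptions for
# illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# The sample points lie exactly on y = 2x + 1, so the fit recovers
# slope 2 and intercept 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```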
In order to make the role of the KnaaS inside the bIoTope ecosystem clearer, we will describe an artificial use
case scenario demonstrating its capabilities as well as its usage. Later, in chapter 4, a real use case scenario
and implementation will be provided addressing major aspects of the Greater Lyon’s Heat Wave Mitigation
use case.
A user wants to extract knowledge out of available information provided through either an O-MI node
or data available on the Web, i.e., Linked Open Data, etc. For that reason, the user wants to use
bIoTope’s KnaaS. The user chooses the available knowledge fusion and knowledge reasoning black
boxes that are provided and combines them in an easy, visual and, as far as possible,
programming-free way. Next, the user provides a service request in O-DF, which is able to ‘fire’ the
execution of this specific KnaaS-provided solution. Last but not least, the markup of the answer should
be added and encoded in an O-MI message if it is to be exposed in the bIoTope ecosystem. In
the future, an interested agent, either another person or a software program, could access this
knowledge service in an automated way without programming and harvest the resulting
knowledge. This artificially conceived scenario is in full accordance with the conceptual architecture of
Figure 6: KnaaS architecture highlighting interactions with other bIoTope components and Linked Open Data.
the bIoTope Knowledge Framework (Figure 4) as well as with the four requirements provided in
Section 3.9
3.2. Integration Workflow with Existing bIoTope Components
Figure 7: Example workflow for KnaaS integration with bIoTope.
Figure 7 describes the systems involved in our proposed approach and the information flow between them.
On the left-hand side of the figure, the IoT environment of a user, exemplarily called “Jeremy”, is illustrated
by a dashed-line delimited area. It consists of an O-MI node and the user himself, interacting with another
system, the IoTBnB platform. This platform makes it possible to index and expose IoT data/service descriptions
so that IoT stakeholders can easily find IoT data/services. The systems within the dashed-line grey areas already
exist and numerous instances are deployed on the Web. Our contribution in this deliverable, Knowledge
as a Service – KnaaS – is outlined on the right-hand side of Figure 7 using the green dashed line.
A typical workflow of KnaaS integrated in the bIoTope ecosystem could look as follows:
1. Jeremy decides to expose IoT data/service through an O-MI node/server. To increase the visibility of his data/service (possibly based on financial motives), he registers his node on the IoTBnB marketplace.
2. As Jeremy wants also to publish/expose and potentially sell the results of more advanced data analysis based on his own data, he purchases — maybe for free, through the IoTBnB marketplace — the access to the KnaaS component.
3. To benefit from this service, he needs to provide some information to the KnaaS such as his O-MI node URL, the associated infoItems on which he wants a knowledge extraction, data fusion, etc.
9 This artificial scenario will be referenced in the rest of this section; for that reason, we will refer to it
as the AUCS (Artificial Use Case Scenario).
4. Based on this information, Jeremy chooses the available knowledge fusion and knowledge reasoning black boxes that are provided and combines them in an easy, visual and, as far as possible, programming-free way. When this procedure ends, the knowledge extraction process (which has just been created) can be started.
5. The KnaaS component sends the results as an O-MI/O-DF message (either a write request, if Jeremy enables the KnaaS to do that, or a simple O-MI response that Jeremy would need to handle).
3.3. Knowledge Extraction Implementation
As described in great detail in section 2.4, the art of knowledge extraction is more about choosing among
a set of appropriate fusion tools rather than choosing one to solve each specific task. For that reason, it is
crucial to provide the end user with a great variety of knowledge extraction algorithms in order to permit the
production of new information out of the already available data sources. Moreover, IoT asks for simplicity as
well as ease of use.
One of the best practices in IoT today is Node-RED. Node-RED is a visual tool for wiring the Internet of
Things, but it can also be used for other types of applications to quickly assemble flows of services. The reason
why ‘Node’ is in the name is that the tool is implemented as a Node.js application, but from a consumer point
of view that is really only an internal implementation detail. Node-RED is available as open source and has
been implemented by the IBM Emerging Technology organization. However, even though Node-RED is quite
simple and permits many different functionalities, the decision to build upon the Node.js server adds the
restriction that the wired services must be programmed in JavaScript, which has a lot of limitations
regarding the available knowledge extraction libraries. On the other hand, there is a vast increase in
knowledge extraction frameworks which offer a Python API or are written in Python, e.g., [47]–[52].
Based on the conclusions of section 2.4, special consideration should be given to making the KnaaS
easily extensible and adaptable. For that reason, there should be an easy mechanism
to further extend it and add all the functionalities that these libraries provide.
The architectural decision we made was to leverage the advantages that Node-RED offers together with those
of well-known inference libraries. Toward that direction, we have integrated Python functionality
into Node-RED, which permits the integration of many Knowledge Extraction Frameworks. This Python addition
makes it easy to extend the KnaaS with new fusion and reasoning functionalities.
Figure 8 shows an example of the visual and user-friendly interface that the KnaaS will provide. The KnaaS
provides capabilities for knowledge extraction from heterogeneous data sources. It is currently designed to
be installed as a stand-alone server on the Web, exposing its services as REST endpoints. However, if sensitive
information is to be processed, operating a public KnaaS server may raise security or privacy concerns.
Therefore, it is also possible to run the KnaaS within the premises of, e.g., the user Jeremy. In the future we
also plan to integrate the KnaaS’s functionality with the capabilities of O-MI nodes so that they can directly
deal with the data and avoid privacy, security or performance problems. In addition, the integration of O-MI
nodes has already been implemented10 and the full functionality of the KnaaS, together with examples, will be
provided in the next release. With regard to the AUCS (Artificial Use Case Scenario) introduced in section 3.1,
a user is able to (a) access data available on the Web through, e.g., an O-MI query, a SPARQL query, etc.,
(b) fuse the data and extract knowledge out of them through pre-built nodes and/or by writing his/her own
custom functions in JavaScript or Python, and finally (c) harvest the resulting knowledge. Last but not least,
the user is able to mark up the resulting knowledge and encode it in an O-DF message if it is to be exposed in
the bIoTope ecosystem.
10 https://github.com/skubler/Node-Red-OMI and https://github.com/skubler/Node-Red-ODF
Figure 8: bIoTope KnaaS example
We provide the KnaaS under an open source license on GitHub11. It is implemented as a flow-based visual Web
programming service using the Node-RED12 framework, with mongoDB13 as a NoSQL database to store the
resulting knowledge if needed. The Node-RED framework is extended with a Python node, which is able to
offer advanced knowledge extraction and fusion capabilities through the integration of state-of-the-art
libraries such as SciPy14, scikit-learn15, etc. In the future, special consideration will be given to adding a
plethora of further knowledge extraction libraries to the Node-RED environment. For an easy testing
setup, we provide instructions on how to build the server as well as a Docker16 compose file for simplified
deployment and demo data. In-depth instructions for building, deploying and using the KnaaS are available in
the project repository as well.
11 https://github.com/prokolyvakis/knaas
12 https://nodered.org/
13 https://www.mongodb.com/
14 https://www.scipy.org/
15 http://scikit-learn.org/stable/
16 https://www.docker.com/
4. Case Studies
The implementation of smart city use cases requires the exploration of multiple data sources and their
relationships, along with user demand-driven knowledge services for intelligent management and operations.
The purpose of this section is to illustrate the knowledge extraction capabilities of the KnaaS framework with
reference to one of the bIoTope use cases, namely Lyon's Heat Wave Mitigation Pilot Use Case. In particular,
we will demonstrate how the data fusion and knowledge discovery capabilities can be used to inform the
citizens of the Urban Community of Lyon about the local heat conditions of their region and help them
ventilate their homes accordingly. In addition, it will be demonstrated how a service can be set up through the
KnaaS framework to establish the correlation between traffic and temperature. Similar to the
work from MIT [15], which was briefly described in section 2.3, the established correlation can provide insights
into the way temperature may be affected by traffic. Based on this work, different actions could be
initiated for proactive environmental management. In section 4.1, we describe the scenarios of Lyon's
Heat Wave Mitigation Pilot Use Case on which we have focused. We have implemented the different scenarios
through KnaaS and we provide them under an open source license on GitHub17. The implementation of these
scenarios is accompanied by in-depth instructions for building, deploying and using the aforementioned
solutions with the current KnaaS implementation.
4.1. Knowledge Extraction and Fusion in bIoTope use cases and pilots
The Heat Wave Mitigation Pilot Use Case aims at improving citizen life during summer, particularly when
heat waves hit the city. It also aims to contribute to the metropolitan climate change policy by refining the
heat wave model and by controlling the vegetation, which influences heat waves, using a natural
resource such as rainwater. The realization goals of this pilot use case were fully described in deliverable D2.1:
Ecosystem Stakeholder Requirements Report and Pilot Definition. To demonstrate the knowledge extraction
capabilities of KnaaS, we have focused on the specific aspects shown in orange in Figure 9.
Figure 9: Goal Realization Viewpoint of Lyon's Pilot Case #2 in ArchiMate.
More specifically, we will demonstrate step by step how KnaaS can be used to:
1. Inform Citizens of local heat conditions.
2. Help citizens to ventilate their home.
17 https://github.com/prokolyvakis/knaas#application-to-lyons-heat-wave-mitigation-pilot-use-case
3. Establish a correlation between traffic and temperature.
These three aspects make it clear that a large network of traffic, temperature and humidity sensors
is required in order to reach the promised goal. For that reason, traffic data were accessed from the RESTful
APIs18 that the Urban Community of Lyon offers. Additionally, temperature and pressure data were extracted
from the Netatmo public API19. The Netatmo company designs and distributes weather stations that allow
the monitoring of atmospheric conditions inside and outside buildings.
4.1.1. Heat Wave Mitigation: Inform Citizens of Local Heat Conditions
Figure 10 shows, in a user-friendly and visual programming way, an example of how to access Netatmo local
weather stations and the Urban Community of Lyon's traffic data. In the upper part of the figure, the Netatmo
sensors are accessed and the data are cleaned, pre-processed and stored in a MongoDB database20 for further
investigation. The storage of historical data is in full accordance with the Conceptual Architecture of the
Knowledge Framework (presented in Section 3). Moreover, this functionality is especially useful when a
data analysis should be performed on the data, as will also be shown in section 4.1.2, where the correlation
between temperature and traffic data is established.
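The cleaning and pre-processing step of such a flow can be sketched in Python. The payload shape below (a list of stations with a GPS position and a temperature reading) is a hypothetical simplification of the actual Netatmo response, and only the document-building step is shown, so no running MongoDB server is assumed.

```python
from datetime import datetime, timezone

def preprocess_netatmo(payload, fetched_at):
    """Turn a raw (hypothetical) Netatmo-like payload into MongoDB-ready
    documents with a GeoJSON location, so geospatial queries work later."""
    docs = []
    for station in payload:
        lon, lat = station["location"]  # [longitude, latitude]
        docs.append({
            "layer": "netatmo",
            "loc": {"type": "Point", "coordinates": [lon, lat]},
            "temperature": float(station["temperature"]),
            "last_update": fetched_at,
        })
    return docs

raw = [{"location": [4.85, 45.75], "temperature": "25.3"}]
docs = preprocess_netatmo(raw, datetime(2017, 6, 20, tzinfo=timezone.utc))
print(docs[0]["loc"]["coordinates"], docs[0]["temperature"])
```

Storing a GeoJSON `loc` field (rather than separate latitude/longitude columns) is what allows the `$nearSphere` query used later in section 4.1.3.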
Figure 10: Access Local Heat Conditions & Traffic Through KnaaS
In the lower part of Figure 10, the same procedure is applied and the Urban Community of Lyon's
traffic data are accessed, cleaned, pre-processed and stored in the MongoDB database. The preprocessing is
achieved by writing a custom JavaScript function, which parses the JSON messages produced by the “Lyon
Netatmo” and “Lyon's Traffic Data request” nodes. Finally, in the middle part of Figure 10, the visualization is
achieved through the world map functionality that KnaaS provides. The result of this feature is shown in Figure
11. The local heat conditions of the Urban Community of Lyon are shown on OpenStreetMap21, where
citizens can be informed about local heat conditions. In addition, a rule is established which is activated when
the temperature is over 24 °C; this is shown on the OpenStreetMap with a black icon. In future work,
context-aware events could be initiated when a user's geographical position is close to temperature-critical
18 https://data.grandlyon.com/ 19 https://dev.netatmo.com 20 https://www.mongodb.com/ 21 https://www.openstreetmap.org
regions. Section 4.1.2 demonstrates how the flow shown in Figure 10 could be extended in order to
create an event through a “Switch” node.
To go one step further, and demonstrate the connection between the results presented here and the work
presented in deliverable D5.4: 2D and 3D UI Widgets Library regarding the Context-Sensitive End-user
Dashboards (UIaaS), we demonstrate an example of how simple widgets can be developed
that permit the visual data exploration of the local heat conditions through KnaaS. Figure 12
demonstrates this capability. As can be seen, we can exploit the widgets to visualize the various results
produced via the KnaaS framework and in that way investigate different aspects of the local heat conditions in
the Urban Community of Lyon.
Figure 12: Data Exploration of the local Heat Conditions through KnaaS
Specifically, Figure 13 depicts the dashboards produced based on the above flow configuration. Different
gauge charts present in a visual way the temperature and humidity sensor data streams
offered by the Netatmo local weather stations. Moreover, the temperature and humidity graphs over time are
shown, from which the variation and the mean value of the data can be investigated.
Figure 11: Visualization of local Heat Conditions through KnaaS
It is mentioned in [53] that
there is a consensus that one of the “next breakthroughs will come from integrated solutions that allow the
user to explore their data using graphical metaphors.” This will bring users into direct contact with the data
and narrow the huge gap between knowledge extraction algorithms and knowledge presentation. For
that reason, during the initial KnaaS framework design, facilitating the interaction with the work of WP5
regarding the UIaaS was of significant importance. bIoTope's initial choice of O-MI and O-DF as the
‘glue technology’ that makes all the IoT devices and bIoTope components interoperable was a
great facilitator towards the realization of this aim. Last but not least, this work permits the KnaaS framework
to be interoperable not only with the UIaaS, but with all the bIoTope components and IoT devices that
conform to O-MI and O-DF.
Figure 13: Gauge Charts and Graphs produced via KnaaS to visual explore the sensor streams.
4.1.2. Heat Wave Mitigation: Help citizens to Ventilate their Home
In this subsection, the scenario of helping citizens to ventilate their homes properly is investigated. In order
to realize it, specific weather information is needed, namely the weather forecast.
OpenWeatherMap22 is an online service that provides weather data, including current weather data,
forecasts and historical data, to the developers of web services and mobile applications. For data sources, it
utilizes meteorological broadcast services, raw data from airport weather stations, raw data from radar
stations, and raw data from other official weather stations. All data are processed by OpenWeatherMap so as
to provide accurate online weather forecast data and weather maps, such as those for
clouds or precipitation. Beyond that, the service focuses on the social aspect by involving weather station
owners in connecting to the service, thereby increasing weather data accuracy. The philosophy is inspired by
OpenStreetMap and Wikipedia, which make information free and available for everybody.
In Figure 14, a Node-RED flow is shown in which the available data from OpenWeatherMap are accessed,
processed and visualized through a message notification in the dashboard created in the previous section. The
result is shown in Figure 15. In future work, where a custom mobile application will be implemented, this
information could easily be displayed in it. The only extension to the flow shown in Figure 14 is that an
O-MI node will be added, which will transfer the O-MI/O-DF message to the application.
22 https://openweathermap.org/
Figure 14: Node-RED flow, which displays through dashboards’ notifications and tweets the OpenWeatherMap information.
Figure 15: Message notification with the available information in the OpenWeatherMap.
In Figure 16, the same message is shown through a tweet message, which has been triggered by the same
function as the one which triggered the dashboard’s notification.
Figure 16: Tweet message with the parsed information from OpenWeatherMap.
In order to further extend the above functionality and check whether the temperature exceeds a specific
threshold, only a simple modification to the flow shown in Figure 14 is needed. Specifically, with the addition
of the switch node (shown in orange in Figure 17), a rule can be added which creates a message to be displayed
only when the temperature exceeds 24 °C.
Figure 17: Node-RED flow, which enables the Users to ventilate their homes accordingly.
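The logic of the switch rule above is simple enough to sketch in Python. The 24 °C threshold is the one from the text; the message format and function name are hypothetical, mirroring only the routing behaviour of the Node-RED switch node.

```python
HEAT_THRESHOLD_C = 24.0  # threshold used in the flow of Figure 17

def switch_rule(msg):
    """Route the message onward only when the temperature exceeds the
    threshold, mirroring the behaviour of the Node-RED switch node."""
    temp = msg["payload"]["temperature"]
    if temp > HEAT_THRESHOLD_C:
        return {"payload": f"Temperature is {temp} °C: consider ventilating your home."}
    return None  # message is dropped, nothing is displayed

print(switch_rule({"payload": {"temperature": 26.0}}))
print(switch_rule({"payload": {"temperature": 20.0}}))
```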
In a future scenario, where the user has exposed, e.g., his/her Smart Ventilator (possibly an interconnection
to another pilot use case of the bIoTope ecosystem) through a private O-MI node, the switch node could adjust
the temperature accordingly simply by sending an O-MI/O-DF message to this node.
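Such an O-DF payload can be sketched with the standard library. The element names (Objects, Object, id, InfoItem, value) follow the O-DF structure, but the object id, InfoItem name and value below are illustrative, not the actual Smart Ventilator interface.

```python
import xml.etree.ElementTree as ET

def build_odf_write(object_id, infoitem, value):
    """Build a minimal O-DF Objects tree, e.g. to set a target temperature
    on a (hypothetical) SmartVentilator exposed via a private O-MI node."""
    objects = ET.Element("Objects")
    obj = ET.SubElement(objects, "Object")
    ET.SubElement(obj, "id").text = object_id
    item = ET.SubElement(obj, "InfoItem", {"name": infoitem})
    ET.SubElement(item, "value").text = str(value)
    return ET.tostring(objects, encoding="unicode")

odf = build_odf_write("SmartVentilator", "TargetTemperature", 24.0)
print(odf)
```

In practice this O-DF fragment would be wrapped in an O-MI write envelope before being sent to the private O-MI node.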
4.1.3. Heat Wave Mitigation: Establish a correlation between traffic and temperature
In statistics, the Pearson correlation coefficient, also referred to as Pearson's r or the Pearson product-moment
correlation coefficient (PPMCC), is a measure of the linear correlation between two variables X and Y. It has a
value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total
negative linear correlation. It is widely used in the sciences. It was developed by Karl Pearson from a related
idea introduced by Francis Galton in the 1880s. Pearson's correlation coefficient, when applied to a population,
is commonly represented by the Greek letter ρ (rho) and may be referred to as the population correlation
coefficient or the population Pearson correlation coefficient. The formula for ρ is:
ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)
where:
cov is the covariance, namely: cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]
σ_X is the standard deviation of X, namely: σ_X = √E[(X − μ_X)²]
σ_Y is the standard deviation of Y, namely: σ_Y = √E[(Y − μ_Y)²]
The correlation coefficient ranges from −1 to 1. A value of 1 implies that a linear equation describes the
relationship between X and Y perfectly, with all data points lying on a line for which Y increases as X increases.
A value of −1 implies that all data points lie on a line for which Y decreases as X increases. A value of 0 implies
that there is no linear correlation between the variables. More generally, note that (X_i − X̄)(Y_i − Ȳ) is
positive if and only if X_i and Y_i lie on the same side of their respective means. Thus the correlation coefficient
is positive if X_i and Y_i tend to be simultaneously greater than, or simultaneously less than, their respective
means. The correlation coefficient is negative (anti-correlation) if X_i and Y_i tend to lie on opposite sides of
their respective means. Moreover, the stronger either tendency is, the larger the absolute value of the
correlation coefficient.
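The sample version of this formula can be written directly in Python. This is a plain implementation for illustration; in KnaaS itself the computation is delegated to SciPy/pandas.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation: the sum of (x_i − x̄)(y_i − ȳ) divided
    by the product of the square roots of the squared deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sqrt(sum((xi - mx) ** 2 for xi in x)) * sqrt(sum((yi - my) ** 2 for yi in y))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfect positive correlation, ≈ 1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfect anti-correlation, ≈ −1.0
```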
KnaaS supports the execution of Python code thanks to the node shown in Figure 18. This node acts as a proxy to
Python modules and libraries which offer advanced knowledge extraction and data mining capabilities. For
instance, in deliverable D5.4 (UIaaS), this node acts as the central element for evaluating the current context,
as it uses the Python programming extension in order to execute SPARQL ASK queries against an O-MI/O-DF
Integration Server thanks to the Python RDFlib23 library (see Deliverable D5.2 for an introduction to OORI).
Figure 18: Node for writing Python code in KnaaS.
In this subsection, we will demonstrate how the node shown in Figure 18 permits us to incorporate into KnaaS
knowledge extraction functionalities provided by SciPy24, an open source Python library used for scientific
23 http://rdflib.readthedocs.io/en/stable/
24 https://www.scipy.org/
computing and technical computing, and pandas25, a Python package providing fast, flexible and expressive
data structures designed to make working with “relational” or “labeled” data both easy and intuitive.
Specifically, the python3 function node shown in Figure 19, labelled Get Netatmo & Traffic Data, accesses
the Netatmo local weather stations and the Urban Community of Lyon's traffic data every two minutes, preprocesses
the data and stores them accordingly in a MongoDB collection. In addition, the timestamps with the exact
time at which the data were accessed are stored in a Python list structure.
Figure 19: Establish a correlation between traffic and temperature through KnaaS.
In the next step, a MongoDB query, shown in Figure 20, is executed in order to access the geographic
coordinates of a road with a specific street_name. Additionally, another MongoDB query is executed in order
to find the closest local weather stations based on the geographic coordinates returned by the previous query.
This query is shown in Figure 21. Based on this result, an average temperature value and an average
pressure value are computed for each timestamp for the road with the specific street name. Next, these
data are stored in a pandas DataFrame and the correlation of the road speed with the temperature and
pressure is computed based on the Pearson correlation coefficient.
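The final DataFrame step can be sketched with pandas. The numbers below are toy per-timestamp aggregates for one street (in the pilot, these columns come from the MongoDB averages described above), so the resulting coefficients are only illustrative.

```python
import pandas as pd

# Hypothetical per-timestamp aggregates for one street: average road speed,
# plus average temperature and pressure from the closest Netatmo stations.
df = pd.DataFrame({
    "speed":       [50.0, 42.0, 30.0, 25.0, 20.0],
    "temperature": [21.0, 22.5, 24.0, 25.5, 26.0],
    "pressure":    [1013.0, 1012.5, 1012.0, 1011.0, 1010.5],
})

# Pearson correlation of road speed against the weather variables
corr = df.corr(method="pearson")["speed"]
print(corr["temperature"])  # strongly negative in this toy data
print(corr["pressure"])     # strongly positive in this toy data
```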
query = {"$and": [{"layer": "traffic"},
                  {"last_update": {"$eq": date}},
                  {"libelle": street_name}]}
Figure 20: MongoDB query that retrieves the geocoordinates of a specific street_name given a specific timestamp.
query = {"$and": [{"layer": "netatmo"},
                  {"last_update": {"$eq": date}},
                  {"loc": {"$nearSphere": {"$geometry": {"type": "Point",
                                                         "coordinates": [geo[0], geo[1]]}}}}]}
Figure 21: MongoDB query that finds the closest Netatmo local weather stations given specific geographic coordinates.
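The two queries can be wrapped as plain Python helpers. The field names (layer, libelle, loc, last_update) are taken from Figures 20 and 21; the helper names themselves are illustrative, and no MongoDB connection is shown since only the query documents are built here.

```python
def traffic_query(street_name, date):
    """Figure 20: retrieve the traffic document (with geocoordinates)
    for a given street at a given timestamp."""
    return {"$and": [{"layer": "traffic"},
                     {"last_update": {"$eq": date}},
                     {"libelle": street_name}]}

def nearest_stations_query(geo, date):
    """Figure 21: find the Netatmo stations closest to the given
    [lon, lat] point at the same timestamp ($nearSphere sorts by distance)."""
    return {"$and": [{"layer": "netatmo"},
                     {"last_update": {"$eq": date}},
                     {"loc": {"$nearSphere": {"$geometry": {
                         "type": "Point",
                         "coordinates": [geo[0], geo[1]]}}}}]}

q = nearest_stations_query([4.85, 45.75], "2017-06-20T12:00:00Z")
print(q["$and"][2]["loc"]["$nearSphere"]["$geometry"]["coordinates"])
```

Note that `$nearSphere` requires a 2dsphere index on the `loc` field, which is why the documents are stored with GeoJSON Point locations.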
In order to demonstrate the results, a simple dashboard was created, presented in Figure 22, that displays
the Pearson correlation coefficients in a table format. Similar to the work from MIT [15], which was briefly
25 http://pandas.pydata.org/pandas-docs/stable/
described in section 2.3, the established correlation can provide insights into the way temperature may be
affected by traffic. Insights gained from this step may lead, for instance, to a better understanding of critical
traffic conditions.
Figure 22: Dashboard with the computed Pearson correlation coefficients.
5. Conclusion and Future Work
In this deliverable, we demonstrated the core knowledge extraction and data fusion mechanisms of the Knowledge Extraction Framework. We provide a prototypical implementation of the Knowledge Extraction Framework in the frame of Knowledge-as-a-Service provisioning and outline how it can be integrated with existing services in the bIoTope ecosystem. To demonstrate the strengths of bIoTope's KnaaS framework and help readers understand the proposed approach, different scenarios of the Heat Wave Mitigation Pilot Use Case of the Urban Community of Lyon were examined and implemented in the frame of KnaaS.
Our immediate next steps will concentrate on further extending the reasoning and fusion mechanisms of the Knowledge Extraction Framework and on providing more advanced capabilities in a user-friendly way, based on the visual programming principles on which the KnaaS implementation is built. Additionally, special consideration will be given to the efficient handling of the semantic annotations within the O-MI/O-DF messages, based on the new O-DF messages extended with semantics achieved by the coordinated work among WP3, WP4 and WP5. Last but not least, we will work on improving the interactions with the other bIoTope components towards realizing a sustainable bIoTope ecosystem.
6. References
[1] J. Mineraud, O. Mazhelis, X. Su, and S. Tarkoma, “A gap analysis of Internet-of-Things platforms,” Comput. Commun., vol. 89, pp. 5–16, 2016.
[2] K. Framling, S. Kubler, and A. Buda, “Universal Messaging Standards for the IoT From a Lifecycle Management Perspective,” IEEE Internet Things J., vol. 1, no. 4, pp. 319–327, Aug. 2014.
[3] The Open Group, “Open Messaging Interface (O-MI), an Open Group Internet of Things (IoT) Standard - C14B.” [Online]. Available: https://www2.opengroup.org/ogsys/catalog/C14B. [Accessed: 21-Feb-2017].
[4] The Open Group, “Open Data Format (O-DF), an Open Group Internet of Things (IoT) Standard.” [Online]. Available: http://www.opengroup.org/iot/odf/index.htm. [Accessed: 21-Feb-2017].
[5] I. F. Akyildiz, Weilian Su, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” IEEE Commun. Mag., vol. 40, no. 8, pp. 102–114, Aug. 2002.
[6] M. Wang et al., “City Data Fusion,” Int. J. Distrib. Syst. Technol., vol. 7, no. 1, pp. 15–36, Jan. 2016.
[7] J. Hintikka, “Knowledge and belief: an introduction to the logic of the two notions,” 1962.
[8] G. Bonanno, “Information, knowledge and belief,” Bull. Econ. Res., 2002.
[9] P. Battigalli and G. Bonanno, “Recent results on belief, knowledge and the epistemic foundations of game theory,” Res. Econ., 1999.
[10] A. Doyle, “The Annotated Sherlock Holmes (Vol. 1),” I (New York, 1967), 1967.
[11] M. Raskino, J. Fenn, and A. Linden, “Extracting Value From the Massively Connected World of 2015,” Gartner Research, 1 April 2005.
[12] C. Perera, C. H. Liu, and S. Jayawardena, “The Emerging Internet of Things Marketplace From an Industrial Perspective: A Survey,” IEEE Trans. Emerg. Top. Comput., vol. 3, no. 4, pp. 585–598, Dec. 2015.
[13] R. Joshi and A. C. Sanderson, Multisensor Fusion: A Minimal Representation Framework. World Scientific, 1999.
[14] A. Zaslavsky, C. Perera, and D. Georgakopoulos, “Sensing as a Service and Big Data.”
[15] S. Sobolevsky et al., “Scaling of City Attractiveness for Foreign Visitors through Big Data of Human Economical and Social Media Activity,” in 2015 IEEE International Congress on Big Data, 2015, pp. 600–607.
[16] D. L. Hall and J. Llinas, “An introduction to multisensor data fusion,” Proc. IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[17] E. F. Nakamura, A. A. F. Loureiro, and A. C. Frery, “Information fusion for wireless sensor networks,” ACM Comput. Surv., vol. 39, no. 3, p. 9–es, Sep. 2007.
[18] A. N. Shulsky and G. J. Schmitt, Silent Warfare: Understanding the World of Intelligence. Brassey's, Inc., 2002.
[19] J. R. Boyd, “A discourse on winning and losing. Maxwell Air Force Base, AL: Air University,” Libr. Doc. No. MU, vol. 43947, p. 79, 1987.
[20] O. E. Drummond, “Methodologies for performance evaluation of multitarget multisensor tracking,” 1999, pp. 355–369.
[21] H. Fourati and K. Iniewski, Multisensor Data Fusion: From Algorithms and Architectural Design to Applications.
[22] E. Blasch and S. Plano, “DFIG Level 5 (User Refinement) issues supporting Situational Assessment Reasoning,” in 2005 7th International Conference on Information Fusion, 2005, pp. xxxv–xliii.
[23] E. P. Blasch and S. Plano, “JDL level 5 fusion model: user refinement issues and applications in group tracking,” 2002, pp. 270–279.
[24] E. Blasch, E. Bosse, and D. A. Lambert, High-level information fusion management and systems design. Norwood, MA: Artech House, 2012.
[25] M. E. Liggins, D. L. (David L. Hall, and J. Llinas, Handbook of multisensor data fusion : theory and practice. CRC Press, 2009.
[26] M. Kumar, D. P. Garg, and R. A. Zachery, “A generalized approach for inconsistency detection in data fusion from multiple sensors,” in 2006 American Control Conference, 2006, p. 6 pp.
[27] P. Smets, “Analyzing the combination of conflicting belief functions,” Inf. Fusion, vol. 8, no. 4, pp. 387–412, Oct. 2007.
[28] R. P. S. Mahler, “Statistics 101 for multisensor, multitarget data fusion,” IEEE Aerosp. Electron. Syst. Mag., vol. 19, no. 1, pp. 53–64, Jan. 2004.
[29] D. Smith and S. Singh, “Approaches to Multisensor Data Fusion in Target Tracking: A Survey,” IEEE Trans. Knowl. Data Eng., vol. 18, no. 12, pp. 1696–1710, Dec. 2006.
[30] Y. Zhu, E. Song, J. Zhou, and Z. You, “Optimal dimensionality reduction of sensor data in multisensor estimation fusion,” IEEE Trans. Signal Process., vol. 53, no. 5, pp. 1631–1639, May 2005.
[31] B. L. Milenova and M. M. Campos, “Mining high-dimensional data for information fusion: a database-centric approach,” in 2005 7th International Conference on Information Fusion, 2005.
[32] B. Khaleghi, A. Khamis, F. O. Karray, and S. N. Razavi, “Multisensor data fusion: A review of the state-of-the-art,” Inf. Fusion, vol. 14, no. 1, pp. 28–44, 2013.
[33] F. Sheridan, “A survey of techniques for inference under uncertainty,” Artif. Intell. Rev., 1991.
[34] H. Durrant-Whyte and T. Henderson, “Multisensor data fusion,” Springer Handb. Robot., 2016.
[35] L. Zadeh, “Fuzzy sets,” Inf. Control, 1965.
[36] F. Karray and C. De Silva, “Soft computing and intelligent systems design: theory, tools, and applications,” 2004.
[37] C. Negoita, L. Zadeh, and H. Zimmermann, “Fuzzy sets as a basis for a theory of possibility,” Fuzzy Sets Syst., 1978.
[38] Z. Pawlak, “Rough sets: Theoretical aspects of reasoning about data,” 2012.
[39] G. Shafer, “A mathematical theory of evidence,” 1976.
[40] L. Bottou, “From machine learning to machine reasoning,” Mach. Learn., vol. 94, no. 2, pp. 133–149, Feb. 2014.
[41] R. Kohavi and F. Provost, “Glossary of terms,” Mach. Learn., vol. 30, no. 2–3, pp. 271–274, 1998.
[42] A. Munoz, “Machine Learning and Optimization,” URL https://www.cims.nyu.edu/~munoz/files/ml_optimization.pdf [accessed 2016-03-02] [WebCite Cache ID 6fiLfZvnG], 2014.
[43] I. Pratt-Hartmann, “First-Order Mereotopology,” in Handbook of Spatial Logics, Dordrecht: Springer Netherlands, 2007, pp. 13–97.
[44] V. N. Vapnik, The Nature of Statistical Learning Theory. Springer, 2000.
[45] X. Wu et al., “Knowledge Engineering with Big Data,” IEEE Intell. Syst., vol. 30, no. 5, pp. 46–55, Sep. 2015.
[46] X. Wu, X. Zhu, G.-Q. Wu, and W. Ding, “Data mining with big data,” IEEE Trans. Knowl. Data Eng., vol. 26, no. 1, pp. 97–107, Jan. 2014.
[47] J. Bergstra, O. Breuleux, and F. Bastien, “Theano: A CPU and GPU math compiler in Python,” Proc. 9th Python, 2010.
[48] T. Schaul, J. Bayer, D. Wierstra, Y. Sun, and M. Felder, “PyBrain,” Mach. Learn. …, 2010.
[49] F. Pedregosa, G. Varoquaux, and A. Gramfort, “Scikit-learn: Machine learning in Python,” Mach. Learn. …, 2011.
[50] J. Demšar, B. Zupan, G. Leban, and T. Curk, “Orange: From experimental machine learning to interactive data mining,” Princ. Data Min. …, 2004.
[51] S. Sonnenburg, S. Henschel, C. Widmer, and J. Behr, “The SHOGUN machine learning toolbox,” Mach. Learn. …, 2010.
[52] D. King, “Dlib-ml: A machine learning toolkit,” J. Mach. Learn. Res., 2009.
[53] U. M. Fayyad, A. Wierse, and G. G. Grinstein, Information Visualization in Data Mining and Knowledge Discovery. Morgan Kaufmann, 2002.
[54] A. Abele, J. P. McCrae, P. Buitelaar, A. Jentzsch, and R. Cyganiak, “Linking Open Data cloud diagram 2017.” [Online]. Available: http://lod-cloud.net/.
7. Online References
DBpedia. wiki.dbpedia.org
Dublin Core vocabulary-DCMI Metadata Terms. dublincore.org/documents/dcmi-terms/
FOAF. xmlns.com/foaf/spec/
Google Knowledge Graph. www.google.com/intl/es419/insidesearch/features/search/knowledge.html
Linked Open Data. linkeddata.org
Lyon DATA platform. data.grandlyon.com
SKOS. https://www.w3.org/TR/skos-reference/
SPARQL Protocol And RDF Query Language. www.w3.org/TR/rdf-sparql-query
Pandas. http://pandas.pydata.org/pandas-docs/stable
SciPy. https://www.scipy.org/
OpenStreetMap. https://www.openstreetmap.org
Netatmo. https://dev.netatmo.com
mongoDB. https://www.mongodb.com/
Node-RED. https://nodered.org/
Docker. https://www.docker.com/