Innovative roles of data engineering and software engineering in a changing world
A Min Tjoa
Competence Center for Excellent Technologies - Vienna University of Technology
Keynote's topic
● You would be expected to deliver a keynote on “Software Engineering in the Age of Social Media”
● What is meant by “Age of Social Media”?
● Was the impact of Social Media on Software Engineering meant?
  ○ Use of Skype for teleconferencing in Open Source projects
  ○ Use of web-based project management applications, e.g. Trello
  ○ New aspects of CSCW for Software Engineering?
  ○ Etc.
● I answered that I could give a tentative title of my talk, such as: “Innovative roles of data engineering and software engineering in a changing world”
Software Engineering in the Age of Social Media
● Topics that the Age of Social Media makes possible for the benefit of society, e.g. “Crowd Sourcing”, early detection of epidemics, birdwatching, ...
● Linked Data for Disaster Management coupled with Social Media
Software Engineering in the Age of Social Media
● A lot has been achieved in recent years (in the “Age of Social Media”) in the INTEREST OF SOCIETY
Software Engineering in the Age of Social Media
● A lot has been published on “Software Engineering and Social Media” (e.g. Andrew Begel et al.):
  ○ New ways for software teams to form and work together
  ○ Individuals’ self-organization within and across institutional boundaries
● The potential of Social Media for Software Engineering: Forming, Storming, Norming, Performing, Adjourning
● Challenges: privacy, protecting reputation
University's role: Asking questions
The slogan of the 650th anniversary of the University of Vienna:
“We ask the questions. Since 1365”
What is one of the most important questions of this millennium? SUSTAINABILITY
The Sustainable Development Goals (SDGs), adopted by the United Nations General Assembly in September 2015, should/must have an impact on all fields, including SOFTWARE ENGINEERING!
Sustainability Design and Software: The Karlskrona Manifesto (IEEE/ACM ICSE 2015)
● Lack of coherent understanding of sustainability, and how it relates to software systems research and practice.
● Need for an articulation of the fundamental principles underpinning design choices that affect sustainability.
● A cross-disciplinary initiative to create a common ground and a point of reference for the global community of research and practice in software and sustainability
● Need for effective communication of the key issues, goals, values and principles of sustainability design for software-intensive systems.
State of the Art of Software Research (IEEE/ACM ICSE 2015)
SE research has, for the most part, focused on:
● Reliability, efficiency and cost-benefit relation of software products for their owners,
● Processes, methods, models and techniques to create, verify and validate software systems and keep them operational.
There is a lack of long-term thinking in software engineering research and practice
Sustainability in SE - the status quo (IEEE/ACM ICSE 2015)
There has always been a focus on some aspects of sustainability, especially MAINTENANCE, i.e.
● Efforts to increase the maintainability of software products
● Facilitating the evolution of software products, often focused on
  ○ improving architecture
  ○ decreasing lifecycle costs
● ‘Digital dark age’: elimination of analog media in preference for digital or old digital storage formats
BUT
● The larger impact of software artefacts on society and the natural environment is not routinely analyzed.
Most drastic example of neglecting the sustainability of software artefacts: The VOLKSWAGEN case
● The design of software systems comes with a special set of responsibilities to society that are much broader than those described in existing codes of ethics for computing professionals.
● Requirements Engineering was not ethical/sustainable
● Monitoring of sustainability aspects during the whole software lifecycle was non-existent
Consequences:
● Need for standards and norms for the sustainability of the software engineering process (in analogy to ISO 27001 in the area of security)
Sustainable Development Goals
1. End poverty in all its forms everywhere
2. End hunger, achieve food security and improved nutrition and promote sustainable agriculture
3. Ensure healthy lives and promote well-being for all at all ages
4. Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all
5. Achieve gender equality and empower all women and girls
6. Ensure availability and sustainable management of water and sanitation for all
7. Ensure access to affordable, reliable, sustainable and modern energy for all
8. Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all
9. Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation
10. Reduce inequality within and among countries
11. Make cities and human settlements inclusive, safe, resilient and sustainable
12. Ensure sustainable consumption and production patterns
13. Take urgent action to combat climate change and its impacts
14. Conserve and sustainably use the oceans, seas and marine resources for sustainable development
15. Protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, and halt and reverse land degradation and halt biodiversity loss
16. Promote peaceful and inclusive societies for sustainable development, provide access to justice for all and build effective, accountable and inclusive institutions at all levels
17. Strengthen the means of implementation and revitalize the global partnership for sustainable development
The Triple Bottom Line of Sustainability
MANIFESTO: The key idea is that human society is only sustainable if it can be sustained in all three dimensions:
● social,
● economic and
● environmental.
This view is incorporated into the triple bottom line approach.
SDG:
● The goals should address and incorporate in a balanced way all three dimensions of sustainable development (environment, economics, and society)
GOAL 17: Strengthen the means of implementation and revitalize the global partnership for sustainable development
● Subgoal: Data - monitoring and accountability
● 17.18 By 2020, enhance capacity-building support to developing countries, including for least developed countries and small island developing States, to increase significantly the availability of high-quality, timely and reliable data disaggregated by income, gender, age, race, ethnicity, migratory status, disability, geographic location and other characteristics relevant in national contexts
● 17.19 By 2030, build on existing initiatives to develop measurements of progress on sustainable development that complement gross domestic product, and support statistical capacity-building in developing countries
The WWW Today: The Problem
“We're drowning in information but starving for knowledge”
[John Naisbitt]
“We are in an era of data-centric scientific research, in which hypotheses are not only tested through directed data collection and analysis but also generated by combining and mining the pool of data already available. The scientific data landscape we draw upon is expanding rapidly in both scale and diversity.”
[Carole Goble]
OPEN GOVERNMENTAL DATA (OGD)
Open Governmental Data (OGD)
● The basic idea of Open Government:
  ○ establish a modern cooperation among politicians, public administration, industry and private citizens
  ○ enable more transparency, democracy, participation and collaboration
● In European countries, Open Government is often viewed as a natural companion to e-government
● The Open Government Partnership comprises more than 88 countries, e.g. the UK, Indonesia, Philippines
http://www.semantic-web.at/LOD-TheEssentials.pdf
http://www.opengovpartnership.org/
Open Governmental Data (OGD)
● Open Government Data (OGD) is emerging as a major movement in knowledge sharing
● The basic premises are to open up publicly-owned data and information from governmental institutions and to make it available in machine-readable formats for easy re-use and cross-combination by citizens, industry, media, and academia, as well as by government itself.
● The OGD movement also has the power to fuel greater transparency, to enable collaboration between stakeholders, and last but not least to spur new economic activity
Open Governmental Data (OGD)
G8 OPEN DATA CHARTER (Lough Erne, Northern Ireland, 2013)
1. The world is witnessing the growth of a global movement facilitated by technology and social media and fuelled by information, one that contains enormous potential to create more accountable, efficient, responsive, and effective governments and businesses, and to spur economic growth.
2. Open data sit at the heart of this global movement.
Open Governmental Data (OGD)
G8 OPEN DATA CHARTER (Lough Erne, Northern Ireland, 2013)
Principles:
● Open Data by Default
● Quality and Quantity
● Useable by All
● Releasing Data for Improved Governance
● Releasing Data for Innovation
Obama on Open Data
Open Data is “going to help launch more startups. It’s going to help launch more business. ... It’s going to help more entrepreneurs come up with products and services that we haven’t even imagined yet”
(US President Barack Obama, May 9th, 2013)
Vienna Open Government Data
Open Data vs Open Governmental Data
● The often-used term “Open Data” refers to data and information beyond just governmental institutions and includes those from other relevant stakeholder groups such as business/industry, science or education, citizens, NPOs and NGOs
● Open Research Data: “Whenever legally and ethically possible, research data and similar materials which are collected and/or analysed should be made openly accessible. Data underlying the published research results should either be openly accessible immediately or (if not used in publications) some (two-three?) years after the project is finished.” (Austrian Science Fund, FWF)
The preservation and archiving of research data is a prerequisite for the repeatability and verification of experiments.
Principles of OGD 1/2
● Data must be complete: All public data is made available.
● Data must be primary: Data is published as collected at the source, with the finest possible level of granularity, and not in aggregate or modified forms.
● Data must be timely: Data is made available as quickly as necessary to preserve the value of the data.
● Data must be accessible: Data is available to the widest range of users for the widest range of purposes.
● Data must be machine-processable: Data is structured so that it can be processed in an automated way.
Principles of OGD 2/2
● Access must be non-discriminatory: Data is available to anyone, with no registration requirement.
● Data formats must be non-proprietary: Data is available in a format over which no entity has exclusive control.
● Data must be license-free: Data is not subject to any copyright, patent, trademark or trade secret regulation.
● Permanence: Permanence refers to the capability of finding information over time.
● Usage costs: No costs are imposed on the public for access.
LINKED DATA
Tim Berners-Lee's data star scheme
Quality level | Description | Format / example
1 star | Available on the web with an open licence, to be Open Data | whatever format
2 stars | Available as machine-readable structured data | Excel instead of image scan of a table
3 stars | As (2), plus non-proprietary format | CSV or JSON instead of Excel
4 stars | All the above, plus: use open standards from W3C to identify things | RDF, SPARQL
5 stars | All the above, plus: link your data to other people’s data to provide context | Linked RDF
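The cumulative nature of the star scheme (each level requires all levels below it) can be sketched in a few lines of Python. The `Dataset` fields and the `star_rating` function are hypothetical names introduced only for illustration; they are not part of any standard or tool mentioned in this talk.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative fields only)."""
    open_licence: bool          # on the web under an open licence
    machine_readable: bool      # structured data, e.g. a spreadsheet rather than a scan
    non_proprietary: bool       # open format such as CSV or JSON
    uses_w3c_standards: bool    # things identified with URIs, data in RDF
    linked_to_other_data: bool  # contains links to other people's data


def star_rating(ds: Dataset) -> int:
    """Return 0 to 5 stars; a level only counts if all lower levels are met."""
    criteria = [
        ds.open_licence,
        ds.machine_readable,
        ds.non_proprietary,
        ds.uses_w3c_standards,
        ds.linked_to_other_data,
    ]
    stars = 0
    for satisfied in criteria:
        if not satisfied:
            break  # the scheme is cumulative: stop at the first unmet level
        stars += 1
    return stars


if __name__ == "__main__":
    csv_export = Dataset(True, True, True, False, False)
    print(star_rating(csv_export))
```

A dataset without an open licence gets no stars in this sketch, however well structured it is, which mirrors the first row of the table.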
Semantic Web vs. Linked Data
Aspect | Semantic Web | Linked Data
Big Data | Not necessarily | Often presented as big datasets
Open Data | Not necessarily | Should be open to earn the first star
Data structuring approach | Top-down: starts with ontologies, a conceptual and logical foundation for how information is modeled and interrelated | Bottom-up: starts simply with primitive data
Complexity | Some aspects are complex and too theoretical | Simple and effective
Evolution | After a decade, there is a handful of standards and languages; the heavy AI emphasis of the initial Semantic Web advocacy now feels dated | Very fast: in less than five years there is a huge amount of data; the community is now shifting the emphasis to linked data
Properties of the Web of Linked Data
● Anyone can publish data to the Web of Linked Data
● Entities are connected by links, creating a global data graph that spans data sources and enables the discovery of new data sources.
● Data is self-describing: if an application encounters data represented using an unfamiliar vocabulary, the application can resolve the URIs that identify vocabulary terms in order to find their RDFS or OWL definition.
● The Web of Data is open, meaning that applications can discover new data sources at run-time by following links.
LOD Datasets on the Web: May 2007
● Over 500 million RDF triples
● Around 120,000 RDF links between data sources
Example RDF Links
● RDF links from DBpedia to other data sources
@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://dbpedia.org/resource/Berlin> owl:sameAs
    <http://sws.geonames.org/2950159> .
<http://dbpedia.org/resource/Tim_Berners-Lee> owl:sameAs
    <http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007> .
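What an application can do with such links is sketched below: URIs connected by owl:sameAs, directly or through a chain, are grouped into one identity set using a small union-find structure. The triples are the two links from this slide, written as plain tuples; the function name `same_as_groups` is an assumption made for this illustration, and no RDF library is used.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# The RDF links from the slide as (subject, predicate, object) tuples.
TRIPLES = [
    ("http://dbpedia.org/resource/Berlin", OWL_SAME_AS,
     "http://sws.geonames.org/2950159"),
    ("http://dbpedia.org/resource/Tim_Berners-Lee", OWL_SAME_AS,
     "http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007"),
]


def same_as_groups(triples):
    """Group URIs that are connected, directly or transitively, by owl:sameAs."""
    parent = {}

    def find(uri):
        # Return the representative URI of the group that `uri` belongs to.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps chains short
            uri = parent[uri]
        return uri

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)  # merge the two groups

    groups = defaultdict(set)
    for uri in list(parent):
        groups[find(uri)].add(uri)
    return list(groups.values())


if __name__ == "__main__":
    for group in same_as_groups(TRIPLES):
        print(sorted(group))
```

Triples with other predicates are ignored, so the same function can be run over a mixed set of statements.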
LOD Datasets on the Web: September 2008
LOD Datasets on the Web: March 2009
LOD Datasets on the Web: July 2009
Linked Open Data 2011 : 295 Data Sets with 26 Billion Triples
LOD cloud August 2014: 1014 datasets
Google's Knowledge Graph
Example: Universities in Vienna
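PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>
# STR() replaces the endpoint-specific xsd:string(?name) cast for portability
SELECT (STR(?name) AS ?universityName)
WHERE {
  [] dbpedia-owl:city <http://dbpedia.org/resource/Vienna> ;
     rdf:type dbpedia-owl:University ;
     dbpprop:name ?name .
}
ORDER BY ?universityName
LIMIT 10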
http://live.dbpedia.org/sparql
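A minimal sketch of how a program might send such a query to the endpoint and read the answer, using only the Python standard library. It assumes the endpoint follows the SPARQL Protocol (query passed as the `query` parameter) and returns the standard SPARQL JSON results format; the query text is a portable variant with explicit prefixes, and the function names are hypothetical. Whether the endpoint is reachable today is not guaranteed, so the network call is kept in a separate function.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://live.dbpedia.org/sparql"  # endpoint named on the slide

QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>
SELECT (STR(?name) AS ?universityName)
WHERE {
  [] dbpedia-owl:city <http://dbpedia.org/resource/Vienna> ;
     rdf:type dbpedia-owl:University ;
     dbpprop:name ?name .
}
ORDER BY ?universityName
LIMIT 10
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def extract_column(results_json: str, variable: str) -> list:
    """Pull the values of one variable out of a SPARQL JSON results document."""
    doc = json.loads(results_json)
    return [
        row[variable]["value"]
        for row in doc["results"]["bindings"]
        if variable in row  # unbound variables are simply absent from a row
    ]


def fetch_names() -> list:
    """Run the query over the network (only when called explicitly)."""
    request = build_request(ENDPOINT, QUERY)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_column(response.read().decode("utf-8"), "universityName")
```

Calling `fetch_names()` performs the actual HTTP request; the two helper functions can be exercised offline.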
Concept of Linked Widget
Linked Widget
● W3C defines a “widget” as “an interactive single purpose application for displaying and/or updating local data or data on the web, packaged in a way to allow a single download and installation on a user’s machine or mobile device”
● A Linked Widget is an extension of the standard widget that is customized to consume Linked Data
● It has a semantic model that describes its inputs, outputs, and metadata such as provenance and license
Linked Widget
● We distinguish three types of widgets:
  ○ Data widgets: are used as data feeds to other widget types and generate data in a specific format, e.g., SPARQL widgets, Location widgets
  ○ Process widgets: receive a dataset as input and generate the output based on a customized process, e.g., formatters, filters, and merge widgets
  ○ Presentation widgets: generate visual output based on a given set of data at runtime, e.g., diagrams, tables, and information visualization
Linked Widget Platform - http://linkedwidgets.org
Compare the number of passengers using Bus and Metro in Vienna
(Figure: data widgets, process widget, presentation widget)
Basic Environmental Widgets 1
● Find a park with good air quality within 700 m of my office
● Used widgets
  ○ Location + Map Pointer: serves as input
  ○ Air quality: retrieves air quality values
  ○ Geo Merge: combines 2 locations based on distance
  ○ Google Map: for visualization
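A distance-based merge of this kind could look roughly as follows. The sketch assumes that locations are given as latitude/longitude pairs and uses the haversine great-circle formula; the actual Geo Merge widget may work differently. The function names and the record fields (`lat`, `lon`, `distance_m`) are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_merge(anchor, places, max_distance_m):
    """Keep the places within max_distance_m of the anchor, nearest first."""
    merged = []
    for place in places:
        d = haversine_m(anchor["lat"], anchor["lon"], place["lat"], place["lon"])
        if d <= max_distance_m:
            merged.append({**place, "distance_m": round(d)})
    return sorted(merged, key=lambda p: p["distance_m"])


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    office = {"lat": 48.2000, "lon": 16.3700}
    parks = [
        {"name": "Park A", "lat": 48.2030, "lon": 16.3700},
        {"name": "Park B", "lat": 48.2100, "lon": 16.3700},
    ]
    for park in geo_merge(office, parks, max_distance_m=700):
        print(park["name"], park["distance_m"])
```

The air-quality values would then be attached to the surviving parks in a further step before the result is handed to the map for visualization.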
Basic Environmental Widgets 2
● Requirement: I want to ride a City Bike and then go swimming. Good air quality should be provided.
● Solution: Find a City Bike location that is near a public swimming pool and has the best air quality.
Automatically generate statistical widgets
● We build a widget-generation tool
  ○ input: SPARQL endpoint using the RDF Data Cube vocabulary
  ○ output: statistical widgets; each dataset is modelled as a statistical widget
(Figure: SPARQL endpoint, statistical widgets)
Simpler query and better interface
Smart City, Social Networks and SE
Possible role of Social Networks (e.g. the example of Ireland):
● Participation services are necessary to incorporate key aspects of the urban environment and of human behaviour, and their associated business models.
● Feedback services are a key to enabling autonomous behavioural change, both in the artificial systems in cities and, more importantly, in citizens.
● Resource usage is best optimized when city stakeholders, including citizens, municipal authorities and businesses, cooperate on a massive scale as part of a city-wide feedback loop.
Smart City, Social Networks and SE
Role of Software Engineering:
● applications that involve closing feedback loops around urban environments, including air quality, road congestion, energy in buildings, water and waste management;
● solving optimizations on a massively large scale;
● enacting and facilitating behavioural change in cities by engaging with citizens to create and regulate large-scale aggregation effects;
● solving privacy, security and legal liability issues that hinder deployment of cooperative services;
● developing applications that are robust to unreliable connectivity and low social participation;
● designing key indicators that can be used to track and determine the effectiveness of smart city interventions.
Conclusion
Software Engineering researchers should treat SUSTAINABILITY as a holistic concept in all their work
All Software Processes should take SUSTAINABILITY into account
THANK YOU VERY MUCH FOR YOUR ATTENTION!