Date post: | 06-Aug-2015 |
Category: |
Technology |
Upload: | andrea-volpini |
Andrea Volpini @cyberandy
@multilingweb - Dipartimento di Informatica, Sapienza Università di Roma 6th July 2015
WordLift for Digital Publishers
This fine event is hosted by:
@multilingweb // LIDER
future of journalism opendata
@wordliftit v3 @mico_project
Hello, I am: @cyberandy
No.8 - MARK ROTHKO
This workshop is about:
Meet Your Audience
Some are humans and some …are not.
Astro Boy Comic
“Hi Stacey! Would you like me to read your favourite news?”
“ok Hound, When will the sun rise in Japan two days before Christmas in 2021?”
Friendly, helpful and intelligent: a completely new class of voice-enabled
assistants has just arrived.
Beta Testing the Apocalypse - TOM KACZYNSKI
BANKS & INVESTORS: ANTI-MONEY LAUNDERING COMPLIANCE AND INVESTMENT STRATEGIES
LAW FIRMS: CHECKING IF THERE ARE ON-GOING OR PAST LEGAL PROCESSES
POLICY MAKERS: NEWS AS VALUABLE INPUT IN THE LAW MAKING PROCESS
BUSINESS: CREATING BUSINESS VALUE AND TAKING DECISIONS BY READING NEWS
(Humans)…creating value with News
Meet Your New Colleagues
can interpret your data and turn it into meaningful, personalised content.
Associated Press announced last year that corporate earnings stories and sport stories are written automatically.
Text Generation Algorithms
Logan Ingalls / Flickr
Analysts expect higher profit for Paychex when the company reports its fourth quarter results on Tuesday, July 1, 2014. The consensus estimate is calling for profit of 40 cents a share, reflecting a rise from 38 cents per share a year ago.
Your New Colleague…the Algorithm has just written a new piece.
but remember… you still are
“Uniquely Human”
Pay a visit to http://nextdraft.com/
“If our role as journalists is to help communities better organize their knowledge and themselves, then it is apparent that we are in the service business and that we must draw on many tools, including content, and place value on the relationships we build with members of our communities, which will also take many forms. Thus we are in the relationship business.”
Jeff Jarvis
Human Factor is key!
Introducing
A Semantic Editor for WordPress for journalists and bloggers to:
MEANINGFULLY ORGANISE YOUR CONTENT
ASSIST THE WRITING PROCESS WITH CONTEXTUAL INFORMATION
ADD STRUCTURED METADATA
ENRICH CONTENT SUGGESTING IMAGES, LINKS AND WIDGETS
RECOMMEND RELEVANT CONTENT TO READERS
BUILD AN OPEN DATASET (ENTITIES + ANNOTATIONS + CONTENT)
ASSIST THE WRITING PROCESS WITH CONTEXTUAL INFORMATION
Fact-based information is derived from open datasets and is contextually relevant to the article. Editors can choose which datasets are used for the enrichment.
ENRICH CONTENT SUGGESTING IMAGES, LINKS AND WIDGETS
Relevant, free-to-use photos and illustrations from the Commons community
meaningful navigation systems for internal interlinking
Bringing to the audience an overview of all the content being written around a specific topic using the chord widget.
RECOMMEND RELEVANT CONTENT
content evolution over time
INTRODUCING THE NAVIGATOR WIDGET
type: /BlogPosting /2015/07/04/coopers-endurance-crew/
WHERE /entity/earth (schema:Place)
WHO /entity/michael-caine (schema:Person)
WHO /entity/nasa (schema:Organisation)
Creates links to entity pages and related articles by using the WHO, WHERE, WHAT and WHEN classifications.
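The linking idea can be sketched in a few lines of Python. This is only an illustration, not WordLift's actual code: the `annotations` table, the second post path and the `navigator_links` helper are hypothetical, while the first post and its three entities come from the slide above.

```python
from collections import defaultdict

# Hypothetical annotation table: post path -> list of (classification, entity path).
annotations = {
    "/2015/07/04/coopers-endurance-crew/": [
        ("WHO", "/entity/michael-caine"),
        ("WHO", "/entity/nasa"),
        ("WHERE", "/entity/earth"),
    ],
    # Hypothetical second post, added only so that one entity is shared.
    "/hypothetical/second-post/": [
        ("WHO", "/entity/nasa"),
    ],
}


def navigator_links(post):
    """Group a post's entities by classification and list other posts sharing each entity."""
    groups = defaultdict(list)
    for classification, entity in annotations[post]:
        related = sorted(
            other
            for other, other_annotations in annotations.items()
            if other != post and any(e == entity for _, e in other_annotations)
        )
        groups[classification].append({"entity": entity, "related": related})
    return dict(groups)


links = navigator_links("/2015/07/04/coopers-endurance-crew/")
```

Here `links["WHO"]` holds the two WHO entities, and only `/entity/nasa` points to a related article, because it is the only entity the two posts share.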
ADD STRUCTURED METADATA
The blog post, entities (dct:references), publishing information (schema:datePublished and schema:dateModified), the author (schema:author), and the number of comments (schema:interactionCount) are published as Linked Open Data and printed using schema.org for on-page SEO.
http://data.redlink.io/91/be2/post/Interstellar.html
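As a rough sketch of what such metadata might look like, the Python below builds a JSON-LD description of a blog post using the properties named above. The host `example.org`, the headline, the author name, the dates, the comment count and the `build_blog_posting` helper are all hypothetical, and the `UserComments:N` form of `schema:interactionCount` is an assumption about how the value is written.

```python
import json


def build_blog_posting(url, headline, author, published, modified,
                       entity_uris, comment_count):
    """Return a schema.org BlogPosting description as a JSON-LD dictionary."""
    return {
        "@context": {
            "@vocab": "http://schema.org/",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "BlogPosting",
        "@id": url,
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": published,
        "dateModified": modified,
        # Assumed text format for the number of comments.
        "interactionCount": f"UserComments:{comment_count}",
        # One reference per entity detected in the post.
        "dct:references": [{"@id": uri} for uri in entity_uris],
    }


# Illustrative values only; none of them are taken from a real post.
post = build_blog_posting(
    url="https://example.org/2015/07/04/coopers-endurance-crew/",
    headline="Hypothetical headline",
    author="Jane Doe",
    published="2015-07-04",
    modified="2015-07-05",
    entity_uris=[
        "https://example.org/entity/earth",
        "https://example.org/entity/michael-caine",
        "https://example.org/entity/nasa",
    ],
    comment_count=3,
)
post_json = json.dumps(post, indent=2)
```

The resulting string could be embedded in the page inside a `script` element of type `application/ld+json`, which is one common way of exposing schema.org markup.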
BUILD AN OPEN DATASET (ENTITIES + ANNOTATIONS + CONTENT)
Editors identify the basic 'WHO, WHAT, WHEN and WHERE' of an article and structure information around it by creating new entities in their custom vocabulary. Content, vocabulary and annotations constitute the publisher’s knowledge graph and can be queried via SPARQL.
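To make the SPARQL part concrete, here is a minimal Python sketch that builds a query for all blog posts referencing a given entity and encodes it as a GET request in the style of the SPARQL protocol (a `query` parameter). The endpoint URL, the entity URI and the helper names are hypothetical; the properties are the ones named in this deck (`dct:references`, `schema:datePublished`).

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real publisher dataset would expose its own URL.
SPARQL_ENDPOINT = "https://example.org/sparql"


def posts_referencing(entity_uri):
    """Build a SPARQL query listing blog posts that reference the given entity."""
    return f"""PREFIX dct: <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>
SELECT ?post ?published WHERE {{
  ?post a schema:BlogPosting ;
        dct:references <{entity_uri}> ;
        schema:datePublished ?published .
}}
ORDER BY DESC(?published)"""


def request_url(query):
    """Encode the query as a GET request URL for the endpoint."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query})


query = posts_referencing("https://example.org/entity/nasa")
url = request_url(query)
```

The sketch stops at building the URL; sending the request and parsing the results depends on the endpoint and is left out.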
How does a blog post look in the knowledge graph?
Special thanks to @dvcama :)
owl:sameAs connects entities detected in the blog post, such as Wormhole, with the same entity on DBpedia and Freebase.
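A small Python sketch can show how such links might be followed. The triples below are illustrative only: the `example.org` entity URIs are hypothetical, the DBpedia URIs follow DBpedia's usual resource pattern, and `same_as` is a helper written for this example.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCT_REFERENCES = "http://purl.org/dc/terms/references"

# Illustrative triples (subject, predicate, object); local URIs are hypothetical.
triples = [
    ("https://example.org/post/interstellar", DCT_REFERENCES,
     "https://example.org/entity/wormhole"),
    ("https://example.org/entity/wormhole", OWL_SAME_AS,
     "http://dbpedia.org/resource/Wormhole"),
    ("https://example.org/entity/nasa", OWL_SAME_AS,
     "http://dbpedia.org/resource/NASA"),
]


def same_as(entity, triples):
    """Return every URI linked to `entity` by owl:sameAs, in either direction."""
    links = set()
    for subject, predicate, obj in triples:
        if predicate != OWL_SAME_AS:
            continue
        if subject == entity:
            links.add(obj)
        elif obj == entity:
            links.add(subject)
    return links


wormhole_links = same_as("https://example.org/entity/wormhole", triples)
```

Because owl:sameAs is symmetric, the helper checks both the subject and the object position, so the lookup also works when starting from the DBpedia URI.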
Let’s move now to a real-world use case where ecologists, journalists and visionaries
stand to defend the natural world and to promote peace.
Starting this coming September, WordLift and the technologies of MICO (for cross-media analysis) are going to be used and validated by Greenpeace Italy
on their subscribers' magazine website (magazine.greenpeace.it).
Technology Stack
1 CONTENT ANALYSIS
2 CONTENT DISCOVERY
3 LINKED DATA PUBLISHING
Text
Legacy Data
Audio/Images
MICO is a three-year EU-funded research project (grant no. 610480) that brings to the platform:
• Cross-Media Extraction
• Cross-Media Metadata Publishing
• Cross-Media Querying
• Cross-Media Recommendation
• Enterprise Linked Data
• Content Analysis • Semantic Search • Semantic Media
Analysis and Search
Media extractors available in MICO today: Animal detection, video quality, temporal segmentation, automatic speech recognition, speech-music discrimination, face detection and audio tampering detection.
Multimedia Retrieval Cross-Media Querying: Introducing the SPARQL extension SPARQL-MM, which adds multimedia specific features to the standard query language for the Semantic Web.
How can we help Greenpeace Italy?
• Connect videos with text using cross-media recommendations
• Provide compact contextual information for media assets
• Create new discovery path for their readers and subscribers
Spatio-Temporal Object Model in SPARQL-MM
“Point me to scenes within videos where Barack Obama is standing to the left of the MD of Greenpeace while talking about whale hunting”
Find out more on the SPARQL extension SPARQL-MM by reading this presentation by Thomas Kurz
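The kind of spatio-temporal reasoning behind the example request can be sketched in plain Python. This is not SPARQL-MM syntax and not MICO code: the `Fragment` class, the `scenes` helper and all coordinates and timings are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A labelled media fragment: a time interval plus a horizontal extent."""
    label: str
    start: float        # seconds
    end: float          # seconds
    x: float = 0.0      # left edge in pixels (unused for audio fragments)
    width: float = 0.0  # horizontal size in pixels


def left_of(a, b):
    """True when fragment a lies entirely to the left of fragment b."""
    return a.x + a.width <= b.x


def scenes(fragments, left_label, right_label, topic_label):
    """Return the time spans where left_label stands left of right_label
    while a fragment labelled topic_label is present."""
    by_label = lambda label: [f for f in fragments if f.label == label]
    results = []
    for left in by_label(left_label):
        for right in by_label(right_label):
            if not left_of(left, right):
                continue
            for topic in by_label(topic_label):
                start = max(left.start, right.start, topic.start)
                end = min(left.end, right.end, topic.end)
                if start < end:  # all three fragments overlap in time
                    results.append((start, end))
    return results


# Made-up detections for a single video.
fragments = [
    Fragment("Barack Obama", start=10.0, end=20.0, x=100.0, width=50.0),
    Fragment("MD of Greenpeace", start=12.0, end=25.0, x=300.0, width=60.0),
    Fragment("whale hunting", start=15.0, end=30.0),
]
matches = scenes(fragments, "Barack Obama", "MD of Greenpeace", "whale hunting")
```

With these values the three fragments overlap between seconds 15 and 20, so that single span is returned.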
Lessons learned so far…
• The bond between data and journalism is growing stronger, and even for an independent news organisation like Greenpeace, providing context and clarity and building relationships (and knowledge graphs) is vital
• Algorithms are great and AI has entered the newsrooms, but journalists should preserve their authorship and role when crafting content: always leave control in the hands of humans
• Providing immediate added value in the UX of semantic apps like WordLift is key to engage journalists and not only marketers and management
• Tags don’t help organise content; named entities are much better
• Linked Data is a service NOT a technology: users want to see images, meaningful links, recommendations and interactive widgets - they don’t care about underlying technologies like RDF and SPARQL
• Creating datasets as a side effect of editing content helps journalists make an impact and connect with policy makers, businesses and other communities.
JOIN.WORDLIFT.IT
Thank you! “[SLIDES] Creating an open database of knowledge by tagging the WHO, WHAT,
WHERE, WHEN of your contents #journalism”
Click to share it on Twitter!
mico-project.eu wordlift.it insideout.io
CREDITS
This presentation is the result of many inspiring ideas and amazing work from media experts, journalists and technologists, and here is the list:
Wilfried Runde of Deutsche Welle, “In Praise of Robots and Humans”
Justin Kosslyn from Google Ideas, on thinking about how journalists' work gets used
Luca Rosati from News to Experience
BBC News Labs, A manifesto for structured journalism
Any idea, graphic or meme belonging to us is available for sharing, copying and re-mixing under
a Creative Commons 3.0 license.
This presentation and the work behind it were partially developed within the MICO project (Media in Context - European Commission 7th Framework Programme
grant agreement no: 610480).
FIND OUT MORE ABOUT OUR PRODUCTS
Video Hosting Platform Semantic Editor Semantic Search