
ICIC 2013 Conference Proceedings Uwe Rosemann TIB

Date post: 15-Jun-2015
Upload: dr-haxel-congress-and-event-management-gmbh
Description:
Text and Non-textual Objects: Seamless access for scientists. Uwe Rosemann (German National Library of Science and Technology (TIB), Germany). The European High Level Expert Group on Scientific Data has formulated the challenges for a scientific infrastructure to be reached by 2030: “Our vision is a scientific e-infrastructure that supports seamless access, use, re-use, and trust of data. In a sense, the physical and technical infrastructure becomes invisible and the data themselves become the infrastructure – a valuable asset, on which science, technology, the economy and society can advance”. Here, “data” is not restricted to primary data but also includes all non-textual material (graphs, spectra, videos, 3D objects, etc.). TIB has developed a concept for a national competence center for non-textual materials, which is now funded by the German federal government and the federal states. The center's task is to develop solutions and services together with the scientific community that make such data available, citable, sharable and usable, including visual search tools and enhanced content-based retrieval. With solutions such as DataCite and modular development for the extraction, indexing and visual searching of new scientific metadata, TIB will accept this challenge and make all data accessible to its users in a fast, convenient and easy-to-use way. The paper shows which special tools TIB is developing for scientific AV media, 3D objects and research data.
Transcript
Page 1: ICIC 2013 Conference Proceedings Uwe Rosemann TIB

Uwe Rosemann

ICIC 2013 Vienna

Textual and non-textual objects:

Seamless access for scientists

Page 2:

German National Library of Science and Technology (TIB)

• Specialized Library for Architecture, Chemistry, Computer Science, Mathematics, Physics, Engineering Technology

• Financed by the Federal Government and all Federal States

• Member of the Leibniz Association

• Global supplier of scientific and technical information

Page 3:

Global Network

TechLib

Page 4:

Customers

[Pie chart: customers by region. Labels: Europe, World, USA, Germany. Values: 71%, 10%, 14%, 5% (the pairing of labels and values is not preserved in the transcript)]

Page 5:

Main Services

• Provision of scientific content

  • full texts, document delivery, interlibrary loan

• Scientific retrieval

  • portal GetInfo

• Long-term preservation

• DOI service for research data

• Research and development

Page 6:

Changes in the scientific process

Jim Gray, eScience Group, Microsoft Research

Page 7:

A gap

• A widening gap in the scientific record between published research in a text document and the data that underlies it

• As a result, datasets are

  • difficult to discover

  • difficult to access

• Scientific information gets lost

Page 8:

Requirements - Politics

Knowledge is power.

Europe must manage the digital assets its researchers generate.

Page 9:

Final report of the High Level Expert Group on Scientific Data:

“Riding the wave: How Europe can gain from the rising tide of scientific data”

Page 10:

Strategy – Move beyond text

Simulation

Scientific Films

3D Objects

Text

Research Data

Software

Page 11:

Move beyond text – Consequences for TIB

• Research communities produce many types of scientific and technical information

• Each type has its own unique characteristics and life cycle

• TIB must become capable of accepting and managing new media formats

Page 12:

Competence Center for Non-textual Materials I

• Develop a clear strategy for the use and integration of non-textual materials at the TIB

• Systematically collect non-textual materials from research and teaching

• Define, integrate and establish technical infrastructure

• Define and establish workflows for indexing, cataloguing, digital preservation, DOI names and licensing

Page 13:

Competence Center for Non-textual Materials II

• Develop innovative media-specific portals, enabled e.g. by automated video analysis with scene, speech, text and image recognition

• Link non-textual materials to other research information such as full texts and research data via the specialist portal GetInfo

• Engage in communities, provide support and advice to media providers

TIB will establish its own research capacity

Page 14:

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV media

• 3D objects

• chemOCR

Page 15:

Collaboration – Research Data

• In 2005, TIB became a non-commercial DOI registration agency for research data

• In 2010, TIB became a co-founder of the international DataCite consortium, which aims to establish easier access to scientific research data on the Internet

Mission

• Citability of research data

• High visibility of the data

• Easy re-use and verification of the data sets

• Increasing quality of published papers
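
To make the idea of citable research data concrete, here is a minimal sketch of how a dataset record with a DOI name could be turned into a citation string and a resolver link. The `Dataset` class, its fields, the citation pattern (creator, year, title, publisher, identifier) and the example record are illustrative assumptions, not TIB's or DataCite's actual data model; the DOI shown is a placeholder, not a real registration. The only external fact relied on is that DOI names resolve through https://doi.org/.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class Dataset:
    """Hypothetical minimal record for a citable dataset."""
    creator: str
    year: int
    title: str
    publisher: str
    doi: str


def doi_url(doi: str) -> str:
    """Build a resolver link for a DOI name via the public doi.org proxy."""
    return "https://doi.org/" + quote(doi, safe="/")


def cite(ds: Dataset) -> str:
    """Format a citation as: Creator (Year): Title. Publisher. Identifier."""
    return f"{ds.creator} ({ds.year}): {ds.title}. {ds.publisher}. {doi_url(ds.doi)}"


# Placeholder record, invented for illustration only.
example = Dataset(
    creator="Doe, J.",
    year=2013,
    title="Hourly temperature series",
    publisher="Example Data Centre",
    doi="10.5072/example-1",
)
print(cite(example))
```

Because the identifier is part of the citation, a reader of a paper can go from the reference list directly to the data set, which is what makes re-use and verification practical.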

Page 16:

DataCite Members

Page 17:

Example: EHEC bacterium

Page 18:

Example: EHEC bacterium

Page 19:

DOI Services

• Contracts with 60 data centres

  • Research institutes

  • Universities

  • Libraries

  • Publishers

• 776,454 DOI registrations

  • 22,533 up to September 2013

Page 20:

Research data – Further developments

• KomFor

  • Centre of Expertise for Research Data from the “Earth and Environment” project

• RADAR

  • Research Data Repository

• Visual Analysis

  • VisInfo methods

Page 21:

Numerical data

Time [h]   T [°C]
 1         12
 2         13
 3         12
 4         12
 5         13
 6         35
 7         17
 8         11
 9         10
10         12
11         13
12         13
13         12
14         12
15         12
16         11
17         11
18         10
19         10
20         11
21         11
22         10
23         12
24         12

Page 22:

Visual access to research data
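
As a rough illustration of why a visual view helps, the sketch below renders the hourly temperature values from the "Numerical data" slide as a simple text bar chart and picks out the highest reading. The function names are hypothetical and the rendering is only an assumption about what "visual access" could look like in its simplest form; it is not the VisInfo method.

```python
# Hourly readings from the "Numerical data" slide: (time in h, temperature in °C).
readings = [
    (1, 12), (2, 13), (3, 12), (4, 12), (5, 13), (6, 35),
    (7, 17), (8, 11), (9, 10), (10, 12), (11, 13), (12, 13),
    (13, 12), (14, 12), (15, 12), (16, 11), (17, 11), (18, 10),
    (19, 10), (20, 11), (21, 11), (22, 10), (23, 12), (24, 12),
]


def text_chart(rows):
    """Return one line per reading, with one '#' per degree Celsius."""
    return [f"{hour:>2} h | {'#' * temp} {temp}" for hour, temp in rows]


def peak(rows):
    """Return the (hour, temperature) pair with the highest temperature."""
    return max(rows, key=lambda row: row[1])


for line in text_chart(readings):
    print(line)
print("peak:", peak(readings))
```

In the raw number list the reading at hour 6 is easy to overlook; in the chart it stands out immediately.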

Page 23:

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV media

• 3D objects

• chemOCR

Page 24:

TIB’s portal for audiovisual media

Project: Development of a portal for audiovisual media

Aim: Improve access to AV media

Time: July 2011 to December 2013

Partner: Hasso Plattner Institute for Software Systems Engineering (Hasso-Plattner-Institut für Softwaresystemtechnik GmbH)

Page 25:

TIB’s portal for audiovisual media

How do I find what I’m looking for in videos?

Today: Manual annotation of the whole video

Metadata

• Title

• Author

• Description

• Publisher

• Publication year

• Rightsholder

• …

Page 26:

TIB’s portal for audiovisual media

Future: Manual annotation plus content-based information

1. Speech

2. Visual features, e.g. Indoor, Experiment, Technology

3. Textual information, e.g. “Leibniz University Hannover”

4. Structural information: Scenes, Shots, Segments

Source: Scorupka, Sascha, Experiment der Woche, 2011

Page 27:

TIB’s portal for audiovisual media

Media analysis process

Upload

Page 28:

TIB’s portal for audiovisual media

Scene recognition

Hard cut

Automatic cut detection

→ luminance / contrast

→ colour distribution / colour histogram

→ edges

Kopf, S. Computergestützte Inhaltsanalyse von digitalen Videoarchiven, Mannheim. 2006
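
The histogram cue listed above can be sketched in a few lines. This is a minimal illustration, not the method used in the TIB portal or in the cited thesis: it assumes greyscale frames given as nested lists of pixel values in 0..255, compares the luminance histograms of consecutive frames, and reports a hard cut when the difference exceeds a threshold. The function names, the bin count and the threshold of 0.5 are arbitrary assumptions.

```python
def luminance_histogram(frame, bins=16):
    """Normalised luminance histogram of one greyscale frame (values 0..255)."""
    pixels = [p for row in frame for p in row]
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def detect_hard_cuts(frames, threshold=0.5):
    """Return indices i where a hard cut is assumed between frame i-1 and frame i."""
    cuts = []
    prev = luminance_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = luminance_histogram(frames[i])
        # Half the L1 distance between two normalised histograms lies in [0, 1].
        distance = 0.5 * sum(abs(a - b) for a, b in zip(cur, prev))
        if distance > threshold:
            cuts.append(i)
        prev = cur
    return cuts


# Synthetic example: three dark frames followed by three bright frames.
dark = [[[20] * 4 for _ in range(4)] for _ in range(3)]
bright = [[[220] * 4 for _ in range(4)] for _ in range(3)]
print(detect_hard_cuts(dark + bright))  # → [3]
```

A real detector would combine several cues (luminance, colour histogram, edges) and would also need to handle gradual transitions such as fades, which a single frame-to-frame threshold misses.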

Page 29:

TIB’s portal for audiovisual media

Automatic speech recognition

this work is copy right ed nine teen thirty six

The quality of the results depends on

• quality of the speaker

• dialects

• background noises

• voice overlaps

Harald Sack, Hasso Plattner Institute for IT Systems Engineering

Page 30:

TIB’s portal for audiovisual media

Intelligent Character Recognition (ICR)

• Character/Logo Detection

• Character Filtering

• Character Recognition

Page 31:

TIB’s portal for audiovisual media

Automated analysis: Image recognition

Method of analysis: Image recognition

Interview, experiment, animation, lecture

Extracted data is converted into text

Page 32:

TIB’s portal for audiovisual media

Machine learning using visual features

Keyframes

Annotation

Visual Concepts

Graphical: Animation

Graphical: Drawing

Graphical: Diagram

Real: Outdoor

Real: Indoor

Real: Lecture / Conference

Real: Interview

Real: Buildings ...

Page 33:

TIB’s portal for audiovisual media

Page 34:

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV media

• 3D objects

• chemOCR

Page 35:

3D Objects – an excursion to Architecture

Page 36:

Visual search tools

Content-based indexing

Visual search

Page 37:

Content-based indexing

Segmentation with form primitives

Extraction of room connectivity graphs

Page 38:

Visual search

3D sketch, attributed graph

Result visualization
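
The combination of room connectivity graphs (indexing) and an attributed graph derived from a sketch (query) can be illustrated with a small graph-matching sketch. Everything below is an assumption made for illustration: the room names, the room types, the dictionary-based graph representation and the brute-force matching are invented and are not taken from the tools presented in the talk.

```python
from itertools import permutations

# A floor plan as an attributed graph: room -> room type, plus undirected connections.
# All rooms and types are invented for illustration.
plan_rooms = {"r1": "hall", "r2": "office", "r3": "office", "r4": "kitchen"}
plan_doors = {("r1", "r2"), ("r1", "r3"), ("r1", "r4")}


def connected(doors, a, b):
    """True if rooms a and b are directly connected (in either direction)."""
    return (a, b) in doors or (b, a) in doors


def find_matches(query_rooms, query_doors, rooms, doors):
    """Brute-force search for mappings of query rooms onto distinct plan rooms
    such that room types agree and every query connection exists in the plan."""
    q_nodes = list(query_rooms)
    matches = []
    for candidate in permutations(rooms, len(q_nodes)):
        mapping = dict(zip(q_nodes, candidate))
        types_ok = all(query_rooms[q] == rooms[mapping[q]] for q in q_nodes)
        edges_ok = all(connected(doors, mapping[a], mapping[b]) for a, b in query_doors)
        if types_ok and edges_ok:
            matches.append(mapping)
    return matches


# Query sketch: a hall that is directly connected to a kitchen.
query_rooms = {"a": "hall", "b": "kitchen"}
query_doors = {("a", "b")}
print(find_matches(query_rooms, query_doors, plan_rooms, plan_doors))
```

Trying every permutation is only feasible for very small graphs; it is used here because it keeps the matching condition easy to read.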

Page 39:

Further developments

Page 40:

How have we been preparing?

• Infrastructure for research data

• Visual search tools for AV media

• 3D objects

• chemOCR

Page 41:

Information retrieval in chemistry

Search for chemical structures: how?

Chemists are used to drawing

Page 42:

Textual and non-textual chemical information

Table with reaction scheme

2a-i: Derivatives from the reaction

Chemical structure

Reaction scheme

Chemical names

Linked entities from the table

Page 43:

Non-textual data processing – chemOCR

image data, chemical structure data

CLiDE, chemOCR

Page 44:

Information retrieval in chemistry

Text AND formulas

Page 45:

Further subjects

• Open Science Lab

• Ontology

Page 46:

Conclusion

Dissemination of scientific and technical information has been a foundational mission.

The methods have completely changed, but the mission remains the same.

Page 47:

Conclusion

Ultimate Goal: Interlinking and Search Across All Types of Digital Assets.

Page 48:

GetInfo – Portal for Science and Technology

• 58 million metadata records in the internal index

• 390 million metadata records in external sources

• 900,000 PDF full texts

• Data, AV media, 3D objects

Page 49:

Development of media-specific portals

Provision

Probado 3D

Portal for audiovisual media

Page 50:

Questions?

