Page 1: What is data?

What is data?

Wietse Dol, LEI-WUR ([email protected])

13 November 2012, 9.40 – 10.25, C435 Forumgebouw

Page 2: What is data?

LEI: Agricultural Economic Research Institute

Part of Wageningen University & Research center (WUR)

Part of the Social Science Group within the WUR

We are the research part of WUR/SSG (advising the Ministry of Agriculture), based in The Hague

Consultancy (applied research): ministries, EU, local government, industry,…

Collecting data (farm data: FADN), building models, and providing agricultural content specialists

Page 3: What is data?

University vs. Research center

University: teaching, publications, new theory and technology

Research center:

●applied work/consultancy

●reusing things from the past (e.g. yearly publications)

●sharing knowledge (how to become a content specialist)/teaching for small groups

●working in groups (different disciplines)

●working in (inter)national groups with many different disciplines

Page 4: What is data?

Wietse Dol

PhD Econometrics

10 years University of Groningen (Econometrics, sampling theory)

18 years LEI (many different departments)

Data and models, i.e. use/reuse and quality, troubleshooter + statistical methods + ICT + user interfacing

Not an IT guy but a researcher (I build software because I use it myself)

Many model projects and user interfaces for models (not only LEI)

Currently: data, data quality, …

Page 5: What is data?
Page 6: What is data?

Data, lifecycle and data management

http://datalib.edina.ac.uk/mantra/researchdataexplained.html

http://www.dcc.ac.uk/resources/curation-lifecycle-model

http://www.data-archive.ac.uk/create-manage/life-cycle

Page 7: What is data?

Data is anything and everything

Research data: collected, observed, or created, for the purpose of analysis to produce and validate original research results.

Anything can become the interest of research …

Data Research

Page 8: What is data?

Primary vs. secondary data

Primary data: you collect, targeted to answer/validate your questions.

Secondary data: not yours.

• Quality of data

• Meta-information is crucial

• Growing need for secondary data (primary data is expensive and takes a lot of time to collect).

Page 9: What is data?

Production data

Meta-information: source, version, dimension, definitions, etc. Without proper meta-information you risk using the wrong data.

Is FR with or without DOM (the overseas departments)?

Is the production in tons or in euros?

Does the year start on 1-1 and end on 31-12?

What is the definition of tomato?

Product   Country   Year   Production
Tomato    NL        2005   325
Wheat     BE        1999   100
Sugar     FR        2003   450
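
The questions above can be made concrete by carrying the meta-information along with each figure. What follows is a minimal Python sketch, not part of the original slides: the field names (`unit`, `source`, `version`) and the unit and source values are hypothetical, and only the figures 325 and 100 come from the table. It refuses to add two figures whose units differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One production figure plus the meta-information needed to interpret it."""
    product: str
    country: str
    year: int
    value: float
    unit: str      # hypothetical, e.g. "1000 tons" or "million EUR"
    source: str    # hypothetical source label
    version: str   # hypothetical version label


def add(a: Observation, b: Observation) -> float:
    """Add two figures, but only if they are expressed in the same unit."""
    if a.unit != b.unit:
        raise ValueError(f"cannot add {a.unit!r} to {b.unit!r}")
    return a.value + b.value


# The values 325 and 100 come from the table; units and sources are made up.
tomato = Observation("Tomato", "NL", 2005, 325, "1000 tons", "source A", "v1")
wheat = Observation("Wheat", "BE", 1999, 100, "million EUR", "source B", "v1")

try:
    add(tomato, wheat)
    mixed_units_rejected = False
except ValueError:
    mixed_units_rejected = True
```

Without the `unit` field the two numbers would simply be summed to 425, which is exactly the kind of silent mistake that missing meta-information invites.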

Page 10: What is data?

DCC Curation Lifecycle Model

Page 11: What is data?

CREATE & MANAGE DATA: RESEARCH DATA LIFECYCLE

Page 12: What is data?

Data

How to get the data, filter it and store it

Quality checks on the data

How to make it available for others

What scientific actions are done on the data

Curate, preserve, versions, ..

Page 13: What is data?

Types of databases according to MetaBase

Statistical database

Scientific database

Meta-database

Page 14: What is data?

Statistical database

Databases provided by international organizations like EU, FAO and OECD are in general statistical databases:

●Data are stored as they are received

●Data are consistent in their own domain

●No aggregations are made when underlying data are missing

●Not much attention is paid to data checking
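
The rule that no aggregate is made when underlying data are missing can be sketched in a few lines. This is an illustrative Python sketch only, not how any particular organization implements it; the function name `aggregate` and the sample figures are hypothetical.

```python
from typing import Optional, Sequence


def aggregate(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sum the member values; return None if there are no members or any is missing."""
    if not values or any(v is None for v in values):
        return None
    return sum(values)


complete = aggregate([10.0, 20.0, 30.0])    # all members present
incomplete = aggregate([10.0, None, 30.0])  # one member missing, so no total
```

Returning `None` instead of a partial sum keeps a gap visible to the user rather than hiding it inside a total that looks complete.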

Page 15: What is data?

Scientific versus Statistical database

Problems with statistical database:

●Different definitions of territories and commodities

●Typing errors

●Missing data

●Breaks in series

Scientific database:

●Problems solved

●Transparency (original data sources and underlying assumptions are kept)

●Essential for modeling and research
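
Two of the problems listed for statistical databases, missing data and breaks in series, lend themselves to simple automated checks. The sketch below assumes a series stored as a year-to-value dictionary and a hypothetical jump threshold of 50 percent; both function names are invented for illustration. A flagged jump is only a candidate for a break or a typing error and still needs the judgement of a content specialist.

```python
from typing import Dict, List


def missing_years(series: Dict[int, float]) -> List[int]:
    """Years between the first and last observation that have no value."""
    if not series:
        return []
    return [y for y in range(min(series), max(series) + 1) if y not in series]


def suspicious_jumps(series: Dict[int, float], threshold: float = 0.5) -> List[int]:
    """Years whose value differs from the previous observed year by more than
    `threshold`, measured relative to the previous value."""
    years = sorted(series)
    flagged = []
    for prev, cur in zip(years, years[1:]):
        base = abs(series[prev])
        if base > 0 and abs(series[cur] - series[prev]) / base > threshold:
            flagged.append(cur)
    return flagged


# Made-up series: 2002 is missing and 2004 looks like a break or a typo.
series = {2000: 100.0, 2001: 104.0, 2003: 110.0, 2004: 1150.0, 2005: 1160.0}
gaps = missing_years(series)
jumps = suspicious_jumps(series)
```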

Page 16: What is data?

Structural design of a scientific database

Keywords for the structural design (HarDFACTS project, IPTS 2007, carried out by vTI/LEI)

●Transparent

●Harmonised

●Complete

●Consistent

Harmonised Database for Agricultural Commodity Time Series

Page 17: What is data?

Transparent

Original data from statistical database are stored

Complete and consistent data are stored

Original and completed data can be compared

Calculation procedures are stored and can be repeated
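
One way to picture this kind of transparency is to keep the original value, the completed value and the method side by side for every data point. The Python sketch below is an assumption for illustration, not the actual storage layout of any database mentioned here; the class name `Cell`, its fields and the figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One data point that keeps the original value next to the completed one."""
    original: Optional[float]  # value as received from the statistical source
    completed: float           # value used for modelling
    method: str                # how `completed` was obtained

    @property
    def was_changed(self) -> bool:
        return self.original != self.completed


# Made-up example: 2004 was missing at the source and filled in afterwards.
series = {
    2003: Cell(original=450.0, completed=450.0, method="original"),
    2004: Cell(original=None, completed=460.0, method="linear interpolation"),
    2005: Cell(original=470.0, completed=470.0, method="original"),
}

changed_years = [year for year, cell in series.items() if cell.was_changed]
```

Because nothing is overwritten, original and completed data can be compared at any time, and the recorded method says which calculation to repeat.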

Page 18: What is data?

Harmonised

Definition used here is to bring together the different international databases in one framework and to link the data through a unique coding system (keywords are classifications and tree structures)
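
A unique coding system of this kind can be pictured as a concordance table plus a classification tree. The following Python sketch is hypothetical throughout: the source labels, the source-specific codes, the common codes and the tree are all invented for illustration and do not reproduce any real classification.

```python
from typing import Dict, List, Tuple

# Hypothetical concordance: (data source, source-specific code) -> common code.
# All source labels and codes below are invented for illustration.
CONCORDANCE: Dict[Tuple[str, str], str] = {
    ("source_a", "A-017"): "TOMATO",
    ("source_b", "B-388"): "TOMATO",
    ("source_a", "A-001"): "WHEAT",
    ("source_b", "B-015"): "WHEAT",
}

# Hypothetical tree structure: common code -> parent in the classification.
PARENT: Dict[str, str] = {
    "TOMATO": "VEGETABLES",
    "WHEAT": "CEREALS",
    "VEGETABLES": "CROPS",
    "CEREALS": "CROPS",
}


def harmonise(source: str, code: str) -> str:
    """Translate a source-specific code into the common code."""
    return CONCORDANCE[(source, code)]


def path_to_root(code: str) -> List[str]:
    """Walk up the classification tree from a common code to its root."""
    path = [code]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


same_product = harmonise("source_a", "A-017") == harmonise("source_b", "B-388")
tomato_path = path_to_root("TOMATO")
```

Once both sources map to the same common code, their series can be compared directly, and the tree gives the levels at which they can be aggregated.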

Page 19: What is data?

Complete

The definition used in MetaBase is that econometric procedures are proposed to complete the (time) series in the database.

●Trend estimates

●Interpolation

●Correlation and regression with other variables (e.g. TRAMO: Time series Regression with ARIMA noise, Missing observations and Outliers)
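
Of the techniques listed, interpolation is the simplest to illustrate. Below is a minimal Python sketch of linear interpolation for interior gaps in an annual series, assuming a year-to-value dictionary. It is neither TRAMO nor the actual MetaBase procedure, and the function name `interpolate_gaps` and the figures are hypothetical.

```python
from typing import Dict


def interpolate_gaps(series: Dict[int, float]) -> Dict[int, float]:
    """Fill missing years between observed years by linear interpolation.

    Years before the first or after the last observation are not extrapolated.
    """
    years = sorted(series)
    filled = dict(series)
    for left, right in zip(years, years[1:]):
        step = (series[right] - series[left]) / (right - left)
        for year in range(left + 1, right):
            filled[year] = series[left] + step * (year - left)
    return filled


# Made-up series with 2001 and 2002 missing.
observed = {2000: 100.0, 2003: 130.0, 2004: 128.0}
completed = interpolate_gaps(observed)
```

The function returns a new dictionary and leaves `observed` untouched, so the original series stays available for comparison.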

Page 20: What is data?

Consistent

The definition used here is that the interrelationships of the data in the database hold across classifications (time, territories and variables).
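
For the territory dimension, one such relationship is that an aggregate should equal the sum of its members. The Python sketch below checks this for a single variable in a single year; the function name `inconsistent_totals`, the territory codes for the sub-regions and all figures are hypothetical.

```python
import math
from typing import Dict, List


def inconsistent_totals(
    values: Dict[str, float],
    members: Dict[str, List[str]],
    tolerance: float = 1e-6,
) -> List[str]:
    """Return the aggregates whose stored value differs from the sum of their members."""
    bad = []
    for total, parts in members.items():
        expected = sum(values[p] for p in parts)
        if not math.isclose(values[total], expected, abs_tol=tolerance):
            bad.append(total)
    return bad


# Made-up figures for one variable in one year.
values = {
    "BE": 100.0, "NL": 325.0, "LU": 5.0, "BENELUX": 430.0,
    "NL_NORTH": 200.0, "NL_SOUTH": 120.0,
}
members = {
    "BENELUX": ["BE", "NL", "LU"],
    "NL": ["NL_NORTH", "NL_SOUTH"],
}

problems = inconsistent_totals(values, members)
```

Here the BENELUX total matches its members, while NL is reported because its two regions add up to 320 rather than the stored 325.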

Page 21: What is data?

MetaBase

Page 22: What is data?

MetaBase

1. many different data sources (e.g. FAO, Eurostat), all in the same user interface (SDMX, NetCDF)

2. find data alternatives using meta-information

3. search data content (e.g. oilseed)

4. all content easily available in research software (R/GAMS)

5. recodings, aggregations and concordances are all implemented in GAMS

6. statistical methods in GAMS and R

Page 23: What is data?

Thank you for your attention!

Or send an email: [email protected]

