Anne Balz, Klaus Pforr, Florian Thirolf · 2019-06-10 · Stata export for metadata documentation...

Stata export for metadata documentation

Munich, 26.05.2019
Anne Balz, Klaus Pforr, Florian Thirolf

Motivation

• German Microdata Lab (GML) offers metadata for various official microdata online
• Goal: extract metadata from these datasets automatically and import them into our database
    • German Microcensus
    • European Labour Force Survey
    • EU-SILC (European Union Statistics on Income and Living Conditions)

Microdata-Informationsystem MISSY

• Online platform ("MISSY-web")
• Documentation of official microdata (European & national)
• Documentation on different levels:
    • study
    • question
    • variable

core functionality

*.dta → output.*

*.dta → dta2meta.ado → meta.dta → meta2*.ado → output.*

ado dta2md

*.dta → dta2meta.ado → meta.dta → meta2*.ado → output.*

the meta-file

All necessary (meta-)information in a table format:

• Variable level
    • Variable name and label
    • Summary statistics (min, max, mean, std)
• Value level
    • Value and value label
    • Frequencies and percentages
        • Overall
        • For groups (e.g. countries)

[Figure: layout of the meta-file. Annotated column groups: Variable Level; Value Level; User Input (Variable): Group-Variable & Computed; Technical: First Value within Variable]

ado dta2md

Syntax:

    dta2md input(filename) output(filename) ///
        freqvarlist(varlist) ///
        [group(varname) ///
        missingdef(string) smissingdef(string) ///
        replace]

Example:

    dta2md input($path/micro_file.dta) output($path/meta_file.dta) ///
        freqvarlist(var1 var2 var3) ///
        group(country) ///
        missingdef("X<0") ///
        smissingdef(`"X="invalid answer"| X="did not understand""') ///
        replace

ado dta2md

Loop over all vars
    If group specified: loop over all groups (within levels of vars)
    If computed: loop over all levels (within all vars)
        If group specified: loop over all groups
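The loop structure can be illustrated with a minimal sketch. It is not the authors' code: it uses Stata's shipped auto dataset, the locals `freqvars` and `group` merely stand in for the `freqvarlist()` and `group()` options, and it assumes that "computed" refers to variables for which value-level frequencies are requested.

```stata
* Hypothetical sketch of the nested loops, not the dta2md implementation.
sysuse auto, clear
* stand-ins for freqvarlist() and group()
local freqvars rep78
local group foreign

foreach v of varlist price mpg rep78 {
    * variable level: summary statistics over all observations
    quietly summarize `v'
    display as text "`v': min=" r(min) " max=" r(max) " mean=" r(mean) " sd=" r(sd)

    * variable level, repeated for each group
    if "`group'" != "" {
        quietly levelsof `group', local(groups)
        foreach g of local groups {
            quietly summarize `v' if `group' == `g'
            display as text "  `group'=`g': mean=" r(mean)
        }
    }

    * value level: only for variables listed in freqvars
    if `: list v in freqvars' {
        quietly levelsof `v', local(levels)
        foreach l of local levels {
            quietly count if `v' == `l'
            display as text "  `v'=`l': n=" r(N)
            * value level, repeated for each group
            if "`group'" != "" {
                foreach g of local groups {
                    quietly count if `v' == `l' & `group' == `g'
                    display as text "    `group'=`g': n=" r(N)
                }
            }
        }
    }
}
```

A real implementation would store these results as rows of the meta-file rather than display them.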

ado meta2DDI

*.dta → dta2Meta.ado → meta.dta → meta2DDI.ado → DDI2.5.xml

• Uses the 'file' command
• 'forvalues' to run through all categories
• Variables of the meta-file are used to form hierarchical output
• Example:
    • 'first' (0/1) tags the first category of a variable
    • used to generate output on variable level
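A minimal sketch of this approach, under stated assumptions: it is not the meta2DDI code, the meta-file column names (`varname`, `varlabel`, `value`, `vallabel`, `first`) and the output file name are invented for illustration, and the XML is a simplified fragment using DDI Codebook element names rather than a schema-valid DDI 2.5 document.

```stata
* Hypothetical sketch, not the authors' meta2DDI code.
* A tiny stand-in for the meta-file: one row per category of each variable.
clear
input str8 varname str20 varlabel byte value str10 vallabel byte first
"sex"    "Sex of respondent" 1 "male"   1
"sex"    "Sex of respondent" 2 "female" 0
"region" "Region"            1 "north"  1
"region" "Region"            2 "south"  0
end

tempname fh
file open `fh' using "ddi_sketch.xml", write text replace
file write `fh' "<dataDscr>" _n

forvalues i = 1/`=_N' {
    * first == 1 marks a new variable: close the previous <var>, open a new one
    if first[`i'] == 1 {
        if `i' > 1 file write `fh' "  </var>" _n
        file write `fh' "  <var name='" (varname[`i']) "'>" _n
        file write `fh' "    <labl>" (varlabel[`i']) "</labl>" _n
    }
    * every row contributes one category element
    file write `fh' "    <catgry><catValu>" (value[`i']) "</catValu>"
    file write `fh' "<labl>" (vallabel[`i']) "</labl></catgry>" _n
}

* close the last open <var> and the wrapper element
if _N > 0 file write `fh' "  </var>" _n
file write `fh' "</dataDscr>" _n
file close `fh'
type "ddi_sketch.xml"
```

The `first` flag is what turns the flat table into nested output: variable-level elements are written once per variable, category elements once per row.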

Use case MISSY

*.dta → dta2Meta.ado → meta.dta → meta2sql.ado → output.sql

Further diagram labels: getUUIDs, generateUUIDs, mapRelations, Database

meta2sql.ado

• 'file' command is used
• different frame
• 'forvalues' for each database-table
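A sketch of what one `forvalues` pass per database table could look like. This is not the meta2sql code: the table names (`variable`, `category`), their columns, the meta-file column names and the output file name are all hypothetical. UUID handling (getUUIDs, generateUUIDs, mapRelations) is left out, as is escaping of quotes inside labels.

```stata
* Hypothetical sketch, not the authors' meta2sql code.
* A tiny stand-in for the meta-file: one row per category of each variable.
clear
input str8 varname str20 varlabel byte value str10 vallabel byte first
"sex"    "Sex of respondent" 1 "male"   1
"sex"    "Sex of respondent" 2 "female" 0
"region" "Region"            1 "north"  1
"region" "Region"            2 "south"  0
end

tempname fh
file open `fh' using "output_sketch.sql", write text replace

* pass 1: table "variable", one INSERT per variable (rows with first == 1)
forvalues i = 1/`=_N' {
    if first[`i'] == 1 {
        file write `fh' "INSERT INTO variable (name, label) VALUES ('" ///
            (varname[`i']) "', '" (varlabel[`i']) "');" _n
    }
}

* pass 2: table "category", one INSERT per row of the meta-file
forvalues i = 1/`=_N' {
    file write `fh' "INSERT INTO category (variable_name, value, label) VALUES ('" ///
        (varname[`i']) "', " (value[`i']) ", '" (vallabel[`i']) "');" _n
}

file close `fh'
type "output_sketch.sql"
```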