Benefits and Challenges of Data Linking: U.S. experience ...

Benefits and Challenges of Data Linking: U.S. experience linking data on foreign-owned U.S. companies to domestic employment data Patricia Abaroa G-20 Workshop on Data Sharing February 1, 2017
Transcript


Agenda
• Project linking data on foreign-owned U.S. companies to domestic employment data
– Data
– Process
• Benefits of linking
• Challenges
• Future steps

Employment of foreign-owned U.S. companies
[Chart: Number of employees (thousands); Share of private industry employment. Source: BEA]

Data on foreign-owned U.S. companies
• Bureau of Economic Analysis (BEA) statistics on Activities of Multinational Enterprises (AMNE)
• Collected by BEA on mandatory benchmark and annual surveys
• Variables collected include:
– Sales
– Value added components
– Employment
– Exports/imports
– Capital expenditures
– Research & development
– Financial statements
– Taxes
• Enterprise level

Data on U.S. employment
• Administrative data
• Bureau of Labor Statistics (BLS)
• Quarterly Census of Employment and Wages
• U.S. establishments covered in the unemployment insurance program
• Data collected by states and compiled by BLS

Process of linking data
• Employer Identification Number (EIN)
– Tax identification number for businesses
– One company can have many EINs
• Computer match of EINs
• Manual work to link additional establishments to the enterprises
• Use outside sources to identify additional establishments
• Validating the link
– Over-matched (BLS > BEA)
– Under-matched (BLS < BEA)
– Bad matches
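The computer match of EINs listed above can be pictured as a simple join on the identifier. The sketch below is only an illustration under stated assumptions, not the BEA/BLS procedure: the record layouts, the function name `match_by_ein` and the sample values are hypothetical.

```python
from collections import defaultdict


def match_by_ein(enterprises, establishments):
    """Link establishment records to enterprises that report the same EIN.

    enterprises: dict mapping an enterprise id to the set of EINs it reports
                 (one company can have many EINs).
    establishments: list of (establishment_id, ein, employment) tuples.
    Returns (links, unmatched): links maps enterprise id -> establishment ids,
    unmatched lists the establishments left over for manual review.
    """
    # Invert the enterprise -> EINs mapping so each EIN points to its owner.
    ein_to_enterprise = {}
    for ent_id, eins in enterprises.items():
        for ein in eins:
            ein_to_enterprise[ein] = ent_id

    links = defaultdict(list)
    unmatched = []
    for est_id, ein, _employment in establishments:
        ent_id = ein_to_enterprise.get(ein)
        if ent_id is None:
            unmatched.append(est_id)  # candidate for manual linking
        else:
            links[ent_id].append(est_id)
    return dict(links), unmatched


# Hypothetical sample data (not real EINs).
enterprises = {"ENT-A": {"11-1111111", "22-2222222"}, "ENT-B": {"33-3333333"}}
establishments = [
    ("EST-1", "11-1111111", 120),
    ("EST-2", "22-2222222", 45),
    ("EST-3", "99-9999999", 10),
]
links, unmatched = match_by_ein(enterprises, establishments)
```

Establishments that end up in `unmatched` correspond to the cases the slide assigns to manual work and outside sources.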

Match quality so far
• Work still in progress
• Within 20 percent: "close enough" match

"Close enough" matches
Step in match process               Affiliates    Employment
After computerized match of EINs    44.4%         58.7%
After BLS analyst review            52.7%         84.5%
After BEA analyst review            in progress   in progress
Source: BLS
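The validation categories (over-matched, under-matched, "close enough") can be expressed as a small comparison of the two employment figures. This is a hedged sketch: the slides do not say how the 20 percent band is computed, so measuring it relative to the BEA figure is an assumption, and the function name and sample numbers are hypothetical.

```python
def classify_match(bea_employment, bls_employment, tolerance=0.20):
    """Compare linked BLS employment with the BEA survey figure for one enterprise.

    Assumption: "within 20 percent" is measured relative to the BEA figure.
    """
    if bea_employment <= 0:
        raise ValueError("BEA employment must be positive")
    ratio = bls_employment / bea_employment
    if abs(ratio - 1.0) <= tolerance:
        return "close enough"
    # Outside the band: too much BLS employment linked, or too little.
    return "over-matched" if ratio > 1.0 else "under-matched"


# Hypothetical figures for three enterprises.
results = [
    classify_match(1000, 1150),
    classify_match(1000, 1300),
    classify_match(1000, 700),
]
```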

Benefits of linking
• Expand data available for studying effects of direct investment on the U.S. economy
– More granular detail on FDI employment and wages
  – industry, geography, occupation
– Information relating enterprises to establishments, a byproduct of the link, is useful for other linking projects
• Improvement of survey data
– Linking microdata helps identify errors in survey reporting

Benefits of linking
• Greater frequency of data
– BEA data are annual; BLS data monthly and quarterly
– Once initial link is completed, subsequent links may be less labor intensive and could be available with greater frequency
• Potential to reduce respondent burden
– May be able to reduce data collected on survey if link produces information that meets standards of quality and timeliness

Challenges
• Very labor intensive
– Substantial investment of time and resources to link data initially
– Subsequent years may be less work
• Not timely
– Because of manual effort, substantial delay in producing results
– Hope that with subsequent matches, data could be more timely
• Legal requirements and limitations
– Interagency agreement for data sharing
– Not all states allow BEA to view the data

Future work
• U.S. Government effort to support and expand data linking
– Employer Data Matching Workgroup
– 14 Federal agencies represented
– Report due to be released soon
• Expand collection of identifiers
– Legal Entity Identifier
– Over 400,000 entities globally; over 100,000 in the United States
– BEA plans to collect LEI on upcoming survey
• Pursuing new legislation to expand interagency access to micro-data for statistical purposes

International Network for Exchanging Experience on Statistical Handling of Granular Data (INEXDA)
Stefan Bender (Chair of INEXDA)

INEXDA: The Granular Data Network
On 6th January 2017,
• the Banca d'Italia,
• the Banco de Portugal,
• the Bank of England,
• the Banque de France
• and the Deutsche Bundesbank
have launched INEXDA, an international cooperative project exchanging experiences, to declare their willingness to further strengthen their cooperation.

General Mission
• Acknowledging the relevance of microdata in the area of independent scientific research, policy advice (for example monetary policy or financial stability) and statistics, and their importance for international comparisons
• Promoting the G20 Data Gaps Initiative II, in particular recommendation 20, addressing the accessibility of granular data
• Acknowledging and supporting the work on data sharing of the Irving Fisher Committee on Central Bank Statistics

Subject and Scope I: Statistical Handling
INEXDA provides a basis for exchanging experiences on the statistical handling of granular data, such as
• the accessibility of data and metadata,
• techniques for statistical analysis of granular data,
• procedures for confidentiality and security of data,
• and methods of output control.

Subject and Scope II: Framework and Aim
INEXDA aims at: investigating possibilities to harmonise access procedures and metadata structures, developing comparable structures for existing data and further fostering efficiency of statistical work with granular data.
The ultimate aim of INEXDA is to facilitate the use of granular data for analytical, research and comparative purposes by users outside the participating institutions, within the limits set by the applicable confidentiality regimes.

First 2 Years: a Pilot Exercise
The five signing central banks have agreed to engage in a pilot exercise which envisages
1. an extensive stock-taking of available datasets and existing procedures, and
2. the investigation of harmonisation possibilities at different levels.
They will present their results to other interested central banks, national statistical institutes and international institutions by the end of 2018. At the same time, a webpage will be launched.
The INEXDA secretariat is provided by the Deutsche Bundesbank for the next two years and can be contacted at INEXDA.secretary@bundesbank.de.
Participation in INEXDA is open to other central banks, national statistical institutes and international organisations.

Thank you for your attention!
Contact: INEXDA.secretary@bundesbank.de

Data Integration, Linking and Sharing
Relevance of International ISO Standards
G-20 Workshop on Data Sharing
Frankfurt am Main, 16 February 2017
Asier Cornejo Pérez
External Statistics Division
Directorate General Statistics
ECB-UNRESTRICTED FINAL
www.ecb.europa.eu ©

Linking datasets
1. Background
2. ECB Experience and initiatives on data integration
3. Cross country data sharing

1. Background
Increasing data integration needs
• The financial crisis has created a growing interest in consistent, sound and timely statistics, which implies larger data volumes at higher frequency and level of granularity
• In order to monitor faster economic developments, data integration from different sources as well as linking of datasets is needed, together with the possibility to share information between institutions and countries
• Main challenges:
– Data standardisation across the different sources: using common agreed International ISO Standards like ISIN and LEI
– Confidentiality restrictions for data sharing among institutions and countries: using common International ISO Standards and enhancing publicly available information

2. ECB Experience and initiatives on data integration (1/2)
Micro and granular data systems
• ECB, jointly with the European System of Central Banks (ESCB), has promoted the creation of micro and granular data systems over the last years
• As an example, information on securities and their issuers is compiled in the Centralised Securities Database (CSDB): single information technology infrastructure operated jointly by the ESCB members
– Data are collected from various sources: national central banks, commercial data providers, the public and administrative sources
– Large data volumes are automatically compounded, reconciling inconsistent information and detecting incomplete or missing data
• Only possible due to the use of common standards between the sources: in particular, International ISO Standards like the ISIN, LEI or SNA 2008
• Together with complex algorithms to overcome the lack of perfect data
– Output: security-by-security information with complete and high quality data to the extent possible, shared without restrictions within the ESCB

2. ECB Experience and initiatives on data integration (2/2)
Other initiatives and challenges
• In addition to the CSDB, there are other systems with micro and granular data
– Securities Holdings Statistics Database (SHSDB),
– Money Market Statistical Reporting (MMSR)
– Register of Institutions and Affiliates Database (RIAD) or
– In the future, Analytical Credit Dataset (AnaCredit)
• One key and crucial element: use of commonly agreed International Standards (ISO) and publicly available identifiers like ISIN or LEI
• Challenges: coverage, general application and public availability of identifiers
• Initiatives: ECB Opinions encouraging the mandatory use of internationally agreed standards as well as the public machine-readable availability of information
– Example: final text of the updated EU Regulation on the prospectus to be published when securities are offered to the public or admitted to trading

3. Cross country data sharing
Use of International ISO Standards
• Data sharing within the ESCB is possible without major restrictions, subject to agreed procedures. Also to share information between EU institutions and countries in particular areas
• However, there are plenty of obstacles to provide information with other countries
– Technical challenges create impediments for the exchange of information between countries
  • Lack of commonly used identifiers, e.g. for securities and issuers
  • Large data volumes or lack of information for some economies
– Data sharing challenges: confidentiality or commercial data restrictions
• Possible solution: encourage the use of International Standards (ISO) in all countries and identifiers like LEI, ISIN as a public good to allow sharing of information

Thanks for your attention
Any questions?

Frankfurt, February 1st, 2017
G-20 WORKSHOP ON DATA SHARING
The Usefulness of Common Identifier
Work without system VS work with system

Benefits of Having A System
• enable to meet and surpass all data users expectations
• enable to produce the same quality results every time
• improve performance
• reduce costs
• enable to be an organized organization
• enable to solve all problems on a consistent basis
• enable to be "going global"
• enable to be available anywhere and 24/7

A System is An Enabler
Without System   Benefit                                                  With System
Maybe            Enable to meet and surpass all data users expectations   Yes
Could be         Enable to produce the same quality results every time    Yes
Possibly         Enable to improve performance                            Yes
Hopefully        Enable to reduce costs                                   Yes
Perhaps          Enable to be an organized organization                   Yes
Probably         Enable to solve all problems on a consistent basis       Yes
If lucky         Enable to be "going global"                              Yes
Should be        Enable to be available anywhere and 24/7                 Yes

A well managed system requires as follows:
• Hardware And Software
• Telecommunications
• Databases And Data Warehouses
• Human Resources
• Procedures
• Item Identifier

Identifier: Is it relevant here?
Implementation of Identifier depends on the value of the item:
• Melons: No
• Handphones: Yes
• Cars: Yes
• Very special melons that cost 11.500 USD each: Yes
so, depends on the value of the item

Reasons that make microdata valuable:
• Maximum granularity
• Involving public convenience and security
As a Valuable Item in the system, micro data package has to be identified with an identifier

A unique identifier will be assigned on every micro data package
Q: Possible?
A: With current state of IT, Very Possible
Example of massive handling by current IT system: transaction code, booking code, etc.

Key Principles Underlie the Identifier:
1. It is a global standard.
2. A single, unique identifier is assigned to each data package. Requires concept maturity and coordination among country coordinators (local operating units).
3. It is supported by high data quality.
4. Coherent with other recommended identifier (e.g. ISO 17442 LEI, learning from UPI, UTI, USI). Coherent does not mean similar. But adopt the equal standard.
5. Supported by good IT system that can cope with high speed transaction and database that can accommodate massive and highly dynamic content.
6. All involved elements are bound in a system. Each element has liabilities and benefits (non financial and financial).

Learning from LEI
“The founding principles of the Global LEI System were developed through extensive public and private sector collaboration and will continue to evolve in this spirit. At their Cannes Summit in November 2011, the G-20 leaders supported "the creation of a global legal entity identifier (LEI) which uniquely identifies parties to financial transactions." The leaders also called on the Financial Stability Board (FSB) to take the lead in helping coordinate work among the regulatory community on the governance framework of the Global LEI System, complementing efforts by the private sector to develop a technical solution, including through the International Organisation for Standardisation.”

Learning From Other Identifiers
• UPI (Unique Product Identifier): A unique code to describe a financial product for the purpose of regulatory reporting.
• UTI (Unique Transaction Identifier): With a LEI and a UPI embedded in a financial transaction, the counterparty and the product traded can be known. Now the financial transaction itself needs to be identified, as there can be many of the same transactions conducted by the same parties in the same product.
• FEI (Financial Event Identifier): identifying a financial event such as a merger, acquisition, bankruptcy, spin-off, etc.
• The integrated global identification system: Promotes U3 (Unique, Unambiguous, and Universal) Global Identification coding scheme. It conforms to all known standards to date (ISO LEI 17442:2012) and to further expected UTI and UPI requirements.

Proposed Structure As The Environment of Micro Data Package Identifier
[Diagram: International Coordinator, CC, MDP, MDD and DU, with the task lists below; the assignment of tasks to boxes is not recoverable from the transcript]
• Registering MDP
• Coordinating MDP
• Registering MDD
• Coordinating MDD
• Disseminating Data
• Broadcasting MDDs for Updates and Revisions
• Ordering pricing if any
• Registering DU
• Disseminating Data
• On behalf of MDP, signing LADU with DU
• Contacting all DU for Updates and Revisions
• Reporting Dissemination Statistics
• Querying Data
• Reporting Dissemination Statistics
• Forwarding payment if any
• Requesting Data
• Reporting Reports
• Accepting payment if any

Proposed Format of Micro Data Package Identifier
As a draft idea, the Micro Data Package Identifier consists of at least 22 digits of code, as follows:
• Country (or International Organization) code (e.g. ISO 3166-1 alpha-3) [Digit 1-4]
• Micro Data Provider (MDP) code [Digit 5-7]
• Micro Data Distributor (MDD) code [Digit 8-10]
• Dataset code [Digit 11-13]
• Unique incremental number for each MDD [Digit 14-16]
• Data User code [Digit 17-19]
• Unique incremental number for each DU request [Digit 20-22]
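The draft 22-position layout can be illustrated with a small builder and parser. This is a sketch under assumptions, not a specification from the slides: the field names, the zero left-padding (including padding a three-letter country code to four positions) and the sample codes are all hypothetical.

```python
from typing import NamedTuple

# (field name, width) pairs following the draft layout: 4+3+3+3+3+3+3 = 22.
FIELDS = (
    ("country", 4),
    ("provider", 3),         # Micro Data Provider (MDP) code
    ("distributor", 3),      # Micro Data Distributor (MDD) code
    ("dataset", 3),
    ("distributor_seq", 3),  # incremental number for each MDD
    ("user", 3),             # Data User (DU) code
    ("request_seq", 3),      # incremental number for each DU request
)


class PackageId(NamedTuple):
    country: str
    provider: str
    distributor: str
    dataset: str
    distributor_seq: str
    user: str
    request_seq: str


def build_id(**parts):
    """Concatenate the fields, left-padding each with zeros to its width."""
    chunks = []
    for name, width in FIELDS:
        value = str(parts[name])
        if len(value) > width:
            raise ValueError(f"{name} is longer than {width} characters")
        chunks.append(value.rjust(width, "0"))  # padding rule is an assumption
    return "".join(chunks)


def parse_id(identifier):
    """Split a 22-character identifier back into its fixed-width fields."""
    if len(identifier) != 22:
        raise ValueError("identifier must have exactly 22 characters")
    values, pos = [], 0
    for _name, width in FIELDS:
        values.append(identifier[pos:pos + width])
        pos += width
    return PackageId(*values)


# Hypothetical codes for illustration only.
example = build_id(country="IDN", provider="001", distributor="002",
                   dataset="010", distributor_seq=7, user="123", request_seq=1)
parsed = parse_id(example)
```

Fixed-width fields keep parsing trivial, at the cost of capping each code space (for instance, at most 999 numbered requests per data user in this layout).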

Linking Different Data Sets
It Consists of:
• Variable/Field Linking
• Coverage Linking
• Time Series Linking
Linking Different Data Sets would significantly enrich the information provided by datasets

Benefits of the Ability of Data Linking
• On Data: Improves coherency and consistency.
• On Data User: Possibility to have more comprehensive data.
• On Data Provider: Efficiency in data management and processing, Improves data quality.

Requirements for Linking
* Availability of Metadata
– Similar perception on the data
– Avoiding error of acceptance or rejection
* Eligibility Analysis of Data Linking prior to linking process

Eligibility Analysis of Data Linking
Eligibility Analysis is performed in the background. It starts to run as soon as the data package, as well as its metadata, is available for distribution.
To enable the widest possible data linking, the eligibility analysis is done by the Micro Data International Coordinator.

Challenges in Data Linking
• Variable/Field Linking: requires precision on record profiling to avoid mismatch. Beware! The more linked, the more complete information on a row. It means more possibility of identity reveal.
• Coverage Linking: duplication and undercoverage are the common mistakes. Requires also precision on record profiling and enough underlying information on the datasets involved.
• Time Series Linking: Has to be aware of changes in data structure along the period.
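The two common coverage-linking mistakes named above, duplication and undercoverage, can be checked mechanically once a reference list of units exists. The following is a minimal sketch under that assumption; the function name and the sample identifiers are hypothetical.

```python
from collections import Counter


def coverage_report(reference_ids, linked_ids):
    """Flag units linked more than once and reference units never linked."""
    counts = Counter(linked_ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    missing = sorted(set(reference_ids) - set(counts))
    return {"duplicates": duplicates, "undercoverage": missing}


# Hypothetical example: unit B was linked twice, unit C not at all.
report = coverage_report(["A", "B", "C", "D"], ["A", "B", "B", "D"])
```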

The Role of Identifier in Linking Data Sets
• Significantly speeding up the process of addressing datasets to be linked
– Locating the data
– Determining which data parts
– Determining the time reference of datasets
• Improving accuracy of data processing by ensuring row identity
• Coupled with good metadata, will enable the creation of global interlinked data
• Improving process standardization that will reduce the possibility of leaks on individual data
Thank You!

DATA SHARING: EXPERIENCE & CHALLENGES OF INDONESIA’S STATISTIC
Gantiah Wuryandani, Etika Rosanti
G-20 Workshop on Data Sharing
Frankfurt, January 31-February 1, 2017

• Introduction
• Concept, purpose and benefit of data sharing
• Current practice in Indonesia
• Obstacles in data sharing
• Major challenge/issues

• The need for data sharing has become more crucial for Indonesia, particularly since the economic crisis in 1997/98.
• Policymakers must have comprehensive information and a clear picture of issues.
• Holistic analysis supports pinpointing the right problem and its solution.
• More granular and transparent data are needed to support holistic analysis.

• Data sharing is the practice of data exchange between various organizations, people and technologies.
• Data sharing is lawful, and confidentiality should be maintained.
• Data sharing is usually accompanied by risk, either perceived or actual. Therefore, it needs risk management to avoid the potential risk of misused data.

• To have synergy with other institutions
• Improve efficiency and transparency
• To have better policy formulation and effectiveness
• More granular and transparent data are aimed at educating the public and smoothing the establishment of specific policy

• Wide-ranging information for all policymakers
• Clearer and better quality of data
• Minimized burden of redundant data collection effort, particularly reporting from the same respondent entities to different authorities
• Efficiency in data collection

• Act No. 23 of 1999 regarding Central Bank, article 14 paragraph (1), states that the central bank may conduct a survey regularly, either at macro or micro level, to support its functions in monetary, payment system, and financial stability policies
• Act No. 24 of 1999 regarding the Foreign Exchange Flows and Exchange Rate System, article 3, conveys to BI the authority to request information and data on foreign exchange transactions, assets and liabilities of any institution
• Central Bank Regulation on Reporting, particularly for banks and non-financial institutions

International
• Submission to IMF (SDDS, IFS)
• Commitment realization in DGI G-20
• Submission to BIS, ADB, OECD, World Bank, ASEAN Secretariat
National
• Integrated system for data sharing with FSA
• Joint compilation with NSO: as contributor in GDP, Flow of Fund, business survey, research in WPI
• Joint banking reporting maintenance and utilization with FSA
• Joint compilation with MoF in debt securities and foreign loan
• As the government bond custodian and settlement
• Joint export-import data sharing system with NSO, Customs, and Tax Office

Central Bank
• International Stakeholder: BIS, IMF, World Bank, ADB, DGI, ASEAN Secretariat, OECD, etc. [Commitment base]
• National Stakeholder: NSO, MoF, FSA, local government, other ministries/institutions [Legal base: MoU, special access for FSA, integrated system of data sharing]
• Internal Stakeholder: Board of Governors, other Dept. & regional office [Legal base: internal regulation, special access]
Way Forward: “ONE DATA FOR NATIONAL”

• Promote MOU with other institutions as data sources
• Develop integrated exchange system with Financial Supervisory Authority
• Improve granularity and transparency of statistical publication in order to raise institutional sectors' alertness to data as an early warning condition and to prevent instability by establishing preemptive policy
• Regular exchange of information through focus group discussion and data exchanges with other institutions
• Maintain confidentiality by regular review and assessment of access provided
• Adopt international standard best practice methodology in statistic compilation
• Participate in international statistics commitments such as IMF (IFS, SDDS), DGI-G20, BIS, OECD, ADB, World Bank, ASEAN Secretariat
• Joint compiling of some indicators such as Sectoral Accounts, flow of fund, GDP, debt securities, external loan with other institutions (NSO, MoF, FSA and local government). Way forward, establish an integrated reporting system in the banking industry (collaboration of central bank and FSA)

Purpose: Increasing the commitment, cooperation, and synergy in order to exchange data/statistic information related to monetary policy, payment system, financial system stability, real sector and other particular coverage, and also developing the human resources capabilities and research.

Target data to be shared:
• Data related to specific needs between central bank and other institutions bilaterally or multilaterally agreed on.
• Exclude individual data as this is prohibited by law for banking data, restricted and confidentiality data, unless stated to be shared.

Procedure for data sharing:
• The format, mechanism, and process of data sharing with other institution should be established.
• Data exchange beyond agreed, should be requested by special letter
• Inclusion of discussion forum for building capabilities and research in the MOU
• Confidentiality of limited data sharing


Limitation on use: The information provided should not be used for other purposes than the institution

Transferring data: Information is exchanged via media agreed by both institutions.

Protect confidentiality: stipulate ways to protect the confidentiality of information:
o Approval of data-sharing access be governed by reviewing the candidate’s position and task
o Appropriate actions to ensure non-disclosure of confidential information, including sanction
o Regulations and code of conduct for persons who have access to confidential information from misused

Mediation of disagreement: All disputes/disagreement among the institutions will be resolved by consensus based on regulation.


Legal and confidentiality constraint

Protection of confidential information in term of legal foundation (data security)

Standardize methodology and metadata

IT system and cost

Accessibility: open or restricted, code of conduct to safeguard the security

Fear of strategic management hampering

Difficulties in data collection, particularly in corporation and household sectors

Institutions reluctance to open access for others


Protocol to secure data confidentiality (sensitive/individual/personal information)

Standardized metadata and methodology as to have reliable analysis in countries comparison of cross border transactions

Data clearing and reconciliation among countries cross border statistics, and the need of statistics compilers mailing list/contact.

Integrated system in join compiling statistics

Effectiveness of data sharing utilization to map instability of cross border transaction/position to prevent crisis and safeguard stability by join intervention among countries.

Data reliability in the cross fertilization as the mirroring data for countries with missing data of outward flows

Thank You

Danke

Record Linkage Definition and German Experience

Stefan Bender (Head of the Research Data and Service Centre), Deutsche Bundesbank

Motivation: The Need for Record Linkage

• Large amounts of data are being collected.
• Increase analytical value of data
• Improve data quality
• Reduce survey burden for units (like companies, banks)
• Data are often from different sources (need for record linkage).

Page 2, 1-Feb-2017, Bender: Record Linkage

Definition of Record Linkage

• RL is finding records in different data sets that represent the same entity and link them.
• RL is also known as data matching, entity resolution, object identification, duplicate detection, identity uncertainty, merge-purge.


Main Applications of Record Linkage

1. Merging of two or more data files
2. Identifying the intersection of the two data sets
3. Updating of data files (with the data row of the other data files)
4. Impute missing data
5. Deduplicate a file (remove duplicates in one file)


Record Linkage Challenges (Christen 2012)

• No unique (clean) entity identifiers available
• Real world data are dirty (typographical errors and variations, missing and out-of-date values, different coding schemes, etc.)
• Scalability, data base size
– Naive comparison of all record pairs is quadratic
– Remove likely no-matches as efficiently as possible
• No training data in many linkage applications
– No record pairs with known true match status
• Privacy and confidentiality
– Personal information, like names and addresses, are commonly required for linking
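The scalability point can be illustrated with a small sketch. Comparing every record of one file with every record of another takes n × m comparisons, while blocking only compares records that share a cheap key. The record layout, the `blocking_key` function (first three letters of the name plus postal code) and the sample firms below are hypothetical and chosen purely for illustration; they are not taken from the presentation.

```python
from collections import defaultdict
from itertools import product


def blocking_key(record):
    """Hypothetical blocking key: first 3 letters of the name + postal code."""
    return (record["name"].strip().lower()[:3], record["zip"])


def candidate_pairs(file_a, file_b, key=blocking_key):
    """Return only the (id_a, id_b) pairs whose records share a blocking key."""
    blocks = defaultdict(list)
    for rec in file_b:
        blocks[key(rec)].append(rec)
    pairs = []
    for rec_a in file_a:
        for rec_b in blocks.get(key(rec_a), []):
            pairs.append((rec_a["id"], rec_b["id"]))
    return pairs


# Made-up example records (not from any real register).
file_a = [
    {"id": "A1", "name": "Mueller Maschinenbau GmbH", "zip": "60311"},
    {"id": "A2", "name": "Schmidt Logistik AG", "zip": "80331"},
    {"id": "A3", "name": "Nordsee Handel KG", "zip": "20095"},
]
file_b = [
    {"id": "B1", "name": "Mueller Maschinenbau", "zip": "60311"},
    {"id": "B2", "name": "Schmidt Logistic AG", "zip": "80331"},
    {"id": "B3", "name": "Alpen Bau GmbH", "zip": "80331"},
]

naive = list(product(file_a, file_b))      # every pair: 3 x 3 comparisons
blocked = candidate_pairs(file_a, file_b)  # only pairs sharing a key
print(len(naive), len(blocked), blocked)
```

The trade-off is that a dirty blocking key (for example a typo in the first letters of a name) can drop a true match before it is ever compared, so the key is usually built from the most reliable fields available.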


The extended record linkage process


There is no perfect world

• In a perfect world: True Positive (TP), True Negative (TN)
• But we do not live in a perfect world: True Positive (TP), False Negative (FN), False Positive (FP), True Negative (TN)

[Figure: two confusion matrices with axes "True" and "Predicted", each taking the values 1 and 0]

Record Linkage at Bundesbank (in the RDSC)

Background
• No common register in the Bundesbank
• Strong need to have an integrated company register
• Use of 7 different data sets (internal and external) to construct some kind of an integrated company register (for one year):
– companies from foreign direct investments (MiDi),
– balance of payment statistics (SITS),
– banking supervision data on borrowers (BAKIS / MiMiK),
– balance sheet data (USTAN) and external balance sheet data

Challenge
• No common unique firm identifier in Germany (Company business register-ID not stable)


Record Linkage at Bundesbank (in RDSC)

• Match firm data …
– … that do not have a common unique identifier / key
– … by using alternative identifiers (such as names)
• It is possible to construct a „ground-truth“-sample of match candidate pairs to train a RL model:
– Common external Ids
– Quasi identical balance sheets of firms
• Machine Learning Algorithm, which can be used for other years or company data sets.
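Matching on alternative identifiers such as names usually starts with a string comparison. The sketch below is a deliberately simple stand-in, not the Bundesbank's trained model: it normalizes firm names and scores them with the standard library's `difflib.SequenceMatcher`. The legal-form list, the 0.85 threshold and the example names are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form tokens to strip before comparing names.
LEGAL_FORMS = {"gmbh", "ag", "kg", "se", "mbh", "co"}


def normalize(name):
    """Lower-case, replace punctuation with spaces, drop legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def name_similarity(a, b):
    """Similarity in [0, 1] between two normalized firm names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_candidate_match(a, b, threshold=0.85):
    """Flag a pair as a likely match; the threshold is an assumed value."""
    return name_similarity(a, b) >= threshold


print(name_similarity("Mueller Maschinenbau GmbH", "Mueller Maschinenbau"))
print(is_candidate_match("Schmidt Logistik AG", "Schmidt Logistic AG"))
print(is_candidate_match("Schmidt Logistik AG", "Alpen Bau GmbH"))
```

In a setting like the one described on the slide, such a similarity score would be one feature among several, and the ground-truth sample would be used to learn how to weight the features rather than fixing a threshold by hand.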


Result of two Company Data Sets

            Predicted 1      Predicted 0
True 1      TP = 15,475      FN = 653
True 0      FP = 647         TN = 15,638
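The four counts reported for the two company data sets can be turned into the usual quality measures. The sketch below applies the standard definitions of precision, recall, F1 and accuracy to those counts; the function name `linkage_metrics` is a hypothetical helper, and the slide itself does not say which of these measures the authors used.

```python
def linkage_metrics(tp, fn, fp, tn):
    """Standard confusion-matrix measures for a set of classified pairs."""
    precision = tp / (tp + fp)   # share of predicted matches that are true
    recall = tp / (tp + fn)      # share of true matches that were found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Counts as reported on the slide.
metrics = linkage_metrics(tp=15475, fn=653, fp=647, tn=15638)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

All four measures come out at roughly 0.96 for these counts.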


Results of RL are very good and will be used for other linkage.

Record Linkage across Institutions: KombiFiD

In the project Combined firm data for Germany (KombiFiD) data collected by
• German Statistical Offices,
• Federal Employment Agency, and
• Deutsche Bundesbank
were for the first time linked.

With a huge effort - after 5 years (2007-2011) and a lot of persons involved - the data were linked and made available to the research community. Most efforts in the following tasks:
• Legal issues
• Cleaning
• Record Linkage
• Consistency checks


Costs

• Linking these data is costly
• High effort to clean the data (most time is spent on cleaning)
• Uncertainty of having all true matches
• Legal „uncertainties”
• Strong need for better data quality.
• Stronger need for unique identifiers like LEI


Thank you for your attention!

Website: www.bundesbank.de\fdsz

Contact: fdsz@bundesbank.de

