26 August 2011
Future of access to EU confidential data for scientific purposes
Jean-Marc Museux, Eurostat
ISI conference, Dublin, August 2011
Aims
Provide a vision for the future of access to EU microdata
Lay down fundamental principles
Identify constraints
Provide an aligned view of ongoing and planned EU projects
Current situation – Legal framework
– Access to confidential data for scientific purposes is enabled by Article 23 of Framework Regulation 223/2009 establishing EU statistics and the European Statistical System
– Rules and conditions of access are described in implementing Regulation 831/2002
• Mainly European universities and research bodies
• Access covered by contracts
• NSIs should provide their consent for each research project
• Two modes of access are enabled
– Delivery of anonymised microdata to research institutions
– Access to non-protected data in the Eurostat safe centre in Luxembourg
Current situation – Business process
– NSIs collect data according to harmonised definitions/methodologies, as the basis of their national statistics
– NSIs transmit files to Eurostat for the production of EU statistics; in some domains (mainly household statistics) microdata are transmitted
– Methods for anonymisation are agreed with the 27 Member States
– Files are prepared by Eurostat
– Eurostat handles research requests and releases data according to appropriate procedures
Technology
– Secured transmission of flat files (single entry point, FTP-like), with structural metadata and quality reports
– CD-ROM with anonymised data files
Current situation
Issues
Heavy procedures, centralised in Eurostat
Lack of flexibility to adapt to national backgrounds: limited list of bodies allowed access, limited access modes
Resources limiting the offer
– Only 6 of the 12 surveys covered by Regulation 831 are available
– A maximum of 200 accesses handled per year; the rapid increase in demand (20% each year) cannot be met sustainably
Technological developments overlooked (remote access, secure networks over the internet)
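To make the capacity issue concrete, the sketch below compounds the 20% annual growth in demand against the ceiling of 200 accesses handled per year. It assumes, purely for illustration, that demand equals that ceiling in the starting year; the function name and the starting point are not taken from the presentation.

```python
import math

CAPACITY_PER_YEAR = 200   # maximum accesses handled per year (from the slide)
GROWTH_RATE = 0.20        # annual increase in demand (from the slide)


def projected_demand(years: int, start: float = CAPACITY_PER_YEAR) -> float:
    """Demand after `years` of compound growth.

    Assumption: demand starts at the current capacity of 200 accesses.
    """
    return start * (1 + GROWTH_RATE) ** years


# Years needed for demand to double at a constant 20% growth rate.
doubling_time = math.log(2) / math.log(1 + GROWTH_RATE)

for year in range(6):
    demand = projected_demand(year)
    unmet = max(0.0, demand - CAPACITY_PER_YEAR)
    print(f"year {year}: demand {demand:.0f}, unmet {unmet:.0f}")
```

Under these assumptions demand doubles in just under four years while the handling capacity stays fixed.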
Long term target – architectural principles
Diversity of means of access as a tool to better fit various needs and to monitor costs (public use files - online query system - anonymised files - fat remote access centre)
Distribute access to as many points as possible so as to improve accessibility of data to researchers
Simplify procedures
Minimise the number of data duplicates
Integrate research use into the statistical value chain
Maximise the value added of available data through the involvement of professional partners/agents
Develop shared standards and industrialisation to pave the way for mutualisation of decision making, interoperability of systems and efficient use of scarce resources
Reuse existing infrastructure whenever possible
Take decisions according to cost-benefit analysis
Implement a risk management approach ensuring an adequate level of protection of individual information through a combination of safeguard measures along the 3 pillars: safe people, safe data, safe settings
Future Developments on European Level
Long term target – architectural principles
Constraints
Maintain adequate level of interoperability with national provisions with respect to protection of statistical confidentiality
Allow for some diversity in modalities, in IT infrastructure, in the user support as a source for emulation, innovation and adaptation to specific needs
Maintain costs within operational budgets
Ensure security and integrity of the whole system and data at all steps of the process
Long term vision - The "Schengen" approach
The strategic objective is to consider all the data collected under European legislation as a common good of the ESS
To empower any NSI to grant access to all the European data, provided that commonly agreed basic principles are met
To enable access to the whole set of European data from any accredited entry point
To set up a minimal coordination function
Long term vision - business architecture
Solution
Unique accreditation mechanism for institutions and researchers accessing EU datasets
Distributed database with local version of confidential data sets prepared by NSIs, credentials being set locally
A central directory of files and access maintained by Eurostat
Access to the network through terminal server solution (remote access technology)
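The relationship between the accreditation mechanism, the locally held copies and the central directory can be sketched as a small data model. This is only an illustration under stated assumptions: the class and method names (`CentralDirectory`, `accredit`, `register_copy`, `locate`), the dataset label and the institution name are hypothetical, accreditation is modelled at institution level only, and local credentials and the terminal server layer are not modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Researcher:
    name: str
    institution: str


@dataclass
class CentralDirectory:
    """Hypothetical stand-in for the directory of files and access kept by Eurostat."""
    accredited_institutions: set = field(default_factory=set)
    dataset_hosts: dict = field(default_factory=dict)  # dataset -> hosting NSI codes

    def accredit(self, institution):
        """Record a single accreditation valid across the whole network."""
        self.accredited_institutions.add(institution)

    def register_copy(self, dataset, nsi):
        """Record that an NSI holds a local version of a confidential data set."""
        self.dataset_hosts.setdefault(dataset, []).append(nsi)

    def locate(self, researcher, dataset):
        """Return the NSI nodes hosting `dataset` if the institution is accredited."""
        if researcher.institution not in self.accredited_institutions:
            raise PermissionError(f"{researcher.institution} is not accredited")
        if dataset not in self.dataset_hosts:
            raise KeyError(f"unknown dataset: {dataset}")
        return list(self.dataset_hosts[dataset])


directory = CentralDirectory()
directory.accredit("University A")            # made-up institution
directory.register_copy("survey-x", "DE")     # made-up dataset label
directory.register_copy("survey-x", "HU")
hosts = directory.locate(Researcher("r1", "University A"), "survey-x")
print(hosts)
```

The point of the design is that the directory only knows where data sets live and who is accredited; the confidential data themselves stay with the NSIs that prepared them.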
Solution
Barriers
Need for a paradigm shift from exclusive ownership of data by NSIs to common ownership of, and shared responsibility for, EU data – Framework Regulation 223 will probably have to be changed to make these fundamental principles explicit.
A careful cost-benefit analysis still has to be developed, and an agreed model for cost/burden sharing still has to emerge.
A smooth transition should take place step by step.
Stepwise approach – first steps
1. feasibility study (2009-2010) done by a network of NSIs (ESSnet: DE, IT, UK, NL, HU)
2. change in the implementing Regulation 831 (2010-2011)
3. new methodological solutions for protection of microdata in a distributed environment (2011-2012)
4. pilot on limited network infrastructure (2012-2014)
5. infrastructure including data archives and NSIs (FP7 research infrastructure project 2011-2016)
6. ….
1. Revision of implementing EU regulation
EU data accreditation not limited to EU universities and research bodies
Enabling new modes of access (remote access)
Enabling involvement of external partners (data archive,…)
Establishing new and cost effective procedures
Allowing for some flexibility in incorporating new standards
2. New model for micro data protection in a distributed environment
Objective criteria based on disclosure risk and data utility (information loss) - Proof of concept run by ESSnet (IT, NL, DE)
Standards ensuring data usability (documentation, code lists, format, …)
Guidelines and threshold level for protection of EU datasets
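The slides do not specify which risk and utility measures are used. As a minimal sketch of the trade-off, the example below uses a k-anonymity-style count of rare value combinations as a stand-in for disclosure risk, and the share of distinct combinations lost as a crude stand-in for information loss. The records, the banding width and all function names are made up for illustration and are not the ESSnet method.

```python
from collections import Counter

# Toy records: (age, region). Entirely made-up illustrative data.
records = [(23, "N"), (24, "N"), (37, "S"), (38, "S"), (39, "S"), (61, "N")]


def risk_share(rows, k=2):
    """Share of records whose value combination is shared by fewer than k records."""
    counts = Counter(rows)
    at_risk = sum(1 for row in rows if counts[row] < k)
    return at_risk / len(rows)


def band_age(rows, width=20):
    """Generalise age to the lower bound of a `width`-year band."""
    return [(age // width * width, region) for age, region in rows]


def information_loss(original, protected):
    """Crude utility proxy: fraction of distinct value combinations lost."""
    return 1 - len(set(protected)) / len(set(original))


protected = band_age(records)
print("risk before:", risk_share(records))
print("risk after: ", risk_share(protected))
print("information loss:", information_loss(records, protected))
```

A threshold level, as mentioned above, would then be a maximum acceptable value of the risk measure, with the protection method chosen to meet it at the lowest information loss.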
3. Pilot solution
Specific secure environment to host confidential data managed by Eurostat (SICON)
Remote access using terminal server solution from NSI data centres (UK, FR, HU, DE, PT)
Feasibility and cost-benefit of extending the network
Refining the cost model
Tuning procedures and standards for decentralisation
– Output checking
– Researcher supervision and support
Building mutual trust among partners
Conclusions
Continuous monitoring - integration of results of different projects
Cost benefit analysis at each step
Alignment with the vision
Discussing with NSI partners