Centers for Disease Control and Prevention
NCHS Research Data Center
Wilbur C. Hadden
What is the Research Data Center?
• An NCHS data access program
• An office, with staff, within NCHS
• A physical space, with security
• A research facility, with computers
Access to NCHS Data
• Public use data files
• Restricted access
• Tabulations
• Reports, publications, web-site
• Journal articles
Authority for Confidentiality
• Public Health Service Act, section 308(d)
• Informed consent of survey participants and data providers
Identified Data
• Data records are identified by unique data elements
  • Name
  • Social Security Number
  • Hospital ID number
Identifiable Data
• Data are made identifiable with data elements that can be related to unique identifiers
  • Address
  • Size – e.g. number of beds, height
  • Specialization – e.g. having a level 1 shock trauma center, or being a farm owner/manager
Statistical Disclosure
• When information on data providers can be inferred from published data
Data Accessible in RDC
• Data that are potentially identifiable
  • Physician age and sex in NAMCS
• Data linked to geographic identifiers
  • State, county, Census tract
• Data linked to provider characteristics
  • Hospital size
RDC Data Access Methods
• On-site access
• Remote access
• RDC staff assistance / collaboration
Data Preparation
• Users cannot take data into RDC
• RDC staff prepare data files for users
• Users can provide data to RDC staff
• NCHS policy is not to release user data to others
• RDC staff will link user data to NCHS data
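The linkage step can be pictured as a simple left join: each NCHS record keeps its own fields and picks up the user-supplied fields that share its geographic identifier. The following is only a minimal sketch. The field names (`respondent_id`, `county_fips`, `clinics_per_10k`), the sample values, and the function `link_by_county` are all hypothetical and do not describe the procedure RDC staff actually use.

```python
# Hypothetical records: every field name and value here is invented for illustration.
nchs_records = [
    {"respondent_id": 1, "county_fips": "24031", "visits": 3},
    {"respondent_id": 2, "county_fips": "24033", "visits": 1},
    {"respondent_id": 3, "county_fips": "11001", "visits": 5},
]
user_records = [
    {"county_fips": "24031", "clinics_per_10k": 2.1},
    {"county_fips": "11001", "clinics_per_10k": 3.4},
]


def link_by_county(nchs_rows, user_rows, key="county_fips"):
    """Left-join user-supplied rows onto NCHS rows using a shared key.

    NCHS rows with no matching user row keep their own fields and get
    None for every user-supplied field.
    """
    lookup = {row[key]: row for row in user_rows}
    user_fields = sorted({f for row in user_rows for f in row if f != key})
    linked = []
    for row in nchs_rows:
        match = lookup.get(row[key])
        merged = dict(row)  # copy so the input rows are not modified
        for field in user_fields:
            merged[field] = match.get(field) if match else None
        linked.append(merged)
    return linked


linked = link_by_county(nchs_records, user_records)
```

Every NCHS record survives the join, so the linked file has the same number of rows as the NCHS file regardless of how many user rows match.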
On-Site Access
• Controlled access facility
• Isolated computer network
• No e-mail
• No internet
• No mountable media on computers
• Users cannot take materials into computer room
• No more than 3 users per project
Output Disclosure Review
• Only printer is located behind locked door
• RDC staff person retrieves output
• Disclosure review performed before output given to users
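The slides do not say which rules the disclosure review applies. One common check in statistical disclosure control is to mask table cells that rest on very few respondents. The sketch below illustrates that general technique only: the threshold `MIN_CELL_SIZE`, the function name, and the sample counts are assumptions, not NCHS policy.

```python
MIN_CELL_SIZE = 5  # hypothetical threshold, chosen for illustration; not an NCHS rule


def suppress_small_cells(table, min_size=MIN_CELL_SIZE):
    """Return a copy of a frequency table with counts below min_size masked as None."""
    return {
        cell: (count if count >= min_size else None)
        for cell, count in table.items()
    }


# Invented counts keyed by (county, physician sex).
raw_table = {
    ("County A", "male"): 42,
    ("County A", "female"): 3,
    ("County B", "male"): 17,
}
reviewed_table = suppress_small_cells(raw_table)
```

A real review would also have to consider whether masked cells can be recovered from row and column totals, which this sketch does not attempt.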
Remote Access
• Send SAS programs to RDC by e-mail
• SAS program scanned – a few SAS commands are not allowed
• Program executed
• Automated disclosure review of program output
• Output returned by e-mail
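The scanning step can be sketched as a keyword search over the submitted program text. The slides do not list the SAS commands that are blocked, so the deny-list `DISALLOWED` below is purely hypothetical, as is the function `find_disallowed`. The matching is also deliberately naive: it does not skip SAS comments or quoted strings.

```python
import re

# Hypothetical deny-list: the source does not say which SAS commands are blocked.
DISALLOWED = ("PROC PRINT", "PUT", "FILE")


def find_disallowed(sas_source, disallowed=DISALLOWED):
    """Return (line_number, keyword) pairs for each deny-listed keyword found.

    Matching is case-insensitive and whole-word, so "output" does not
    trigger the "PUT" rule.
    """
    hits = []
    for number, line in enumerate(sas_source.splitlines(), start=1):
        for keyword in disallowed:
            # Allow any run of whitespace between the words of a keyword.
            words = [re.escape(word) for word in keyword.split()]
            pattern = r"\b" + r"\s+".join(words) + r"\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((number, keyword))
    return hits


program = """data work.subset;
  set nhis.sample;
run;
proc print data=work.subset;
run;
proc means data=work.subset;
run;"""

violations = find_disallowed(program)
```

Under this sketch, a program with an empty `violations` list would go on to execution and a program with any hits would be rejected before it runs.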
Software in RDC
• SAS (only language available on ANDRE)
• SUDAAN
• Fortran
• HLM
• Stata
• Limdep
• text editors/viewers
To Gain Access to NCHS RDC
• Visit the RDC (optional, but suggested)
• Discuss proposal with RDC staff
• Submit research proposal
• Discuss proposal with RDC staff and revise
• Pass NCHS review of proposal
• Sign “Research Affidavit of Confidentiality”
Proposal Contents
• Identification of researcher(s)
• Source of funding
• Description of proposed research
• Detailed specification of data sought
• Detailed specification of any user supplied data
• Software requirements
Proposal Evaluation Criteria
• Risk of disclosure of confidential data
• Scientific merit and technical feasibility
• Availability of resources in RDC
• Consistency with NCHS mission
Cost per Project
File construction and setup     $500 per day
On-site                         $1000 per week
Remote Access
  NSFG-CDF                      $500 per year
  NHIS-polio                    $500 per year
  Files < 130k records          $500 per month
  Files > 130k records          $1000 per month
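As a worked example of the fee schedule, the sketch below adds up setup, on-site, and monthly remote-access charges. The rates are copied from the slide; the function `project_cost` and its parameter names are illustrative. The flat annual fees for NSFG-CDF and NHIS-polio are left out, and because the slide does not say which rate applies to a file of exactly 130k records, the caller chooses the tier explicitly.

```python
# Rates transcribed from the slide, in US dollars.
SETUP_PER_DAY = 500
ONSITE_PER_WEEK = 1000
REMOTE_PER_MONTH_SMALL = 500   # files < 130k records
REMOTE_PER_MONTH_LARGE = 1000  # files > 130k records


def project_cost(setup_days=0, onsite_weeks=0, remote_months=0, large_file=False):
    """Estimate a project's total cost from the per-day, per-week, and per-month rates."""
    remote_rate = REMOTE_PER_MONTH_LARGE if large_file else REMOTE_PER_MONTH_SMALL
    return (
        setup_days * SETUP_PER_DAY
        + onsite_weeks * ONSITE_PER_WEEK
        + remote_months * remote_rate
    )


# 2 days of file setup plus 3 weeks on site: 2*500 + 3*1000 = 4000
onsite_total = project_cost(setup_days=2, onsite_weeks=3)

# 1 day of setup plus 6 months of remote access to a large file: 500 + 6*1000 = 6500
remote_total = project_cost(setup_days=1, remote_months=6, large_file=True)
```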
Number of RDC Clients
                     2000   2001   2002*
Remote Access           7      3      3
On Site                 4     10     11
Staff Programming       1      1      5
*January to July 2002
Jobs Processed at RDC
                     2000   2001   2002*
Remote Access         888    441    456
On Site               570   1006   1645
Staff Programming      81    212    760
*January to July 2002
Projects by Data System
           2000   2001   2002*
NHIS          6      7      8
NSFG          3      7      8
NHANES        1      2      1
NAMCS         1      1      4
Other        --      2      5
*January to July 2002
http://www.cdc.gov/nchs/r&d/rdc.htm