D. 2.02 – Description of Data Sources
The research leading to these results has received funding from the European Union Seventh Framework
Programme (FP7/2007-2013) under grant agreement n° 261060
Deliverable number D2.2
“Description of Data Sources: A report on the identified healthcare databases and their characteristics plus literature on their
experience with respect to paediatric studies”
GRIP
Global Research in Paediatrics
Network of Excellence
HEALTH-F5-2010-261060
Lead Beneficiary EMC
Author(s) C. Ferrajolo, Y. Li, K. Verhamme, F. Fregonese, D. Bonifazi,
O. Osokogu, S. de Bie, I. Wong, D. Weibel, J. Bonhoeffer and M. Sturkenboom
Revision date July 1 2012
Start date 01/01/2011
Duration 5 years
Project Coordinator Dr. Carlo GIAQUINTO
Azienda Ospedaliera di Padova (AOPD)
Reference WP WP2 – Integrated infrastructure for epidemiological and postmarketing studies
Reference Activity Task 2.02 – Identify healthcare databases comprising paediatric data
Dissemination Level Public (PU)
Table of Contents
1 List of authors and GRIP participants
2 Abstract
3 Receivers of the document
4 Introduction
5 Objectives of deliverable 2.2
6 Healthcare databases
6.1 Definition
7 Methods
7.1 Procedure for identification of healthcare databases
7.2 Creation of the survey
8 Results
8.1 Databases invited to participate to the survey
8.2 Response rate of survey
8.3 Assessment of the survey
8.4 Nature and characteristics of the databases
8.4.1 Drug exposure
8.4.2 Vaccine exposure
8.4.3 Clinical outcome
8.4.4 Accessibility and costs of databases
9 Discussion and Limitations
10 Conclusions / Outlook and next steps
11 References
12 APPENDIX
1 List of authors and GRIP participants
Name | Institution
C. Ferrajolo | EMC
Y. Li | BF
K. Verhamme | EMC
F. Fregonese | AOPD
D. Bonifazi | CVBF-TEDDY
S. de Bie | EMC
I. Wong | SoP
D. Weibel | EMC
J. Bonhoeffer | BF
M. Sturkenboom | EMC
O. Osokogu | EMC
F. Bartoloni | IRIDIA (CVBF-TEDDY subcontractor)
C. Giaquinto | AOPD
In GRIP the following acronyms are used for the participating institutions
Participant organisation name | Acronym | Country | Lead Scientist
Azienda Ospedaliera Padova – Dipartimento di Pediatria | AOPD | Italy | Carlo Giaquinto
National Institute of Child Health and Human Development | NICHD-NIH | USA | Steven Hirschfeld
European Medicines Agency | EMEA | UK | Agnes Saint-Raymond
Erasmus Medisch Centrum Rotterdam | EMC | The Netherlands | Miriam Sturkenboom
University of Liverpool, MCRN | ULIV-MCRN | UK | Rosalind Smyth
Ospedale Pediatrico Bambino Gesù | OPBG | Italy | Paolo Rossi
Institut National de la Santé et de la Recherche Médicale | INSERM | France | Evelyne Jacqz Aigrain
National Center for Child Health and Development | NCCHD | Japan | Hidefumi Nakamura
St George's Hospital Medical School | SGUL | UK | Mike Sharland
Consorzio per Valutazioni Biologiche e Farmacologiche | CVBF-TEDDY | Italy | Adriana Ceci
Universiteit Leiden | UL | The Netherlands | Oscar Della Pasqua
Academisch Medisch Centrum Universiteit van Amsterdam | AMC | The Netherlands | Martin Offringa
Fundacion Vasca de Innovacion e Investigacion Sanitarias | BIOEF | Spain | Adolfo Valls-i-Soler
Instytut Pomnik Centrum Zdrowia Dziecka | PCZD | Poland | Marek Migdal
World Health Organization | WHO | Switzerland | Suzanne Hill
University of Hong Kong | UHK | China | Ian Wong
Helsingin Ja Uudenmaan Sairaanhoitopiirin Kuntayhtymä | HUS | Finland | Kalle Hoppu
Brighton Collaboration Foundation | BF | Switzerland | Jan Bonhoeffer
Fondazione PENTA | | Italy | Silvia Faggion
Dutch Genetic Alliance | | The Netherlands | Coe Oosterwijk
Hospital for Sick Children – Toronto | SickKids | Canada | Shinya Ito
2 Abstract
Introduction: The available healthcare databases on infants, children, and adolescents are not
adequately utilized to conduct post-authorization drug utilization and safety studies. The lack of
a federation of healthcare databases restricts the capacity for meaningful investigations in these
vulnerable populations. Moreover, the lack of shared methodologies to specifically retrieve
paediatric information hinders access to valuable information.
Objectives: One of the aims of the Global Research in Paediatrics (GRiP) network
(http://www.grip-network.org) is to identify and describe automated population-based
healthcare databases that can provide medication and clinical information for paediatric
pharmacoepidemiological research on a global scale.
Methods: We performed a web-based survey among all databases worldwide that were identified
through manual review of pharmacoepidemiology/pharmacovigilance conference
abstracts, through the Bridge.to.Data software, and/or directly by members of the GRiP
network. The survey included questions concerning: (i) contact information for the database and
the responsible person; (ii) nature of the database (possible linkage of drug prescriptions and/or clinical
data with the population); (iii) demographic, clinical and drug/vaccine related data provided; (iv)
accessibility of the database for future collaboration in paediatric studies; and (v) validity of the
data.
Results: Ninety-nine databases were identified globally (in Europe, North and South America, the
Asia-Pacific area, and Africa) and were invited to participate in the survey. At the time this
deliverable was written, only 16 answers had been received, corresponding to a response rate of
16%. In total, 75% of the respondents (N=12) agreed to collaborate with the GRiP network in
future pharmacoepidemiology studies. The collaborating databases are located in 5 different
European countries (Germany, United Kingdom, Denmark, the Netherlands, and Italy), except for the
MediGuard database, which is available in more than one country. The data sources were set up
between 1986 and 2007 and together cover a cumulative paediatric
population (0-18 years) of around 16 million. Nine databases capture outpatient records only and 3 capture both outpatient and
inpatient data, drawn from primary care physicians and/or insurance claims. Both medication and
clinical information are described in 11 databases. Patient-level linkage between drug
prescription and clinical data is feasible for all 12 databases.
Conclusions: The databases that replied to the survey and agreed to participate offer good
potential for paediatric pharmacoepidemiological studies. Those databases that have not yet replied
will be contacted in the coming months, which will hopefully result in participation from automated
population-based healthcare databases in North and South America, the Asia-Pacific area, and
Africa. Creating an inventory of existing health care databases and their willingness to
participate in future projects is important as large databases are needed for paediatric
pharmacoepidemiology research in terms of power and long term follow-up.
3 Receivers of the document
User group
GRIP Beneficiaries
European Commission
4 Introduction
The main aim of Global Research in Paediatrics – Network of Excellence (GRiP) is to implement an infrastructure matrix to stimulate and facilitate the development and safe use of medicine in children. This implementation entails the development of a comprehensive training programme and integrated use of existing research capacity, whilst reducing the fragmentation and duplication of activities.
Implementing paediatric studies requires well-trained researchers, investigators and other experts, in numbers and with a capacity that do not currently exist (http://www.grip-network.org/). GRIP will address this problem by developing, as its main objective, a joint paediatric clinical pharmacology training programme in collaboration with international stakeholders.
In addition, GRiP promotes sharing of best practices in research, including methodologies and research tools that can be used globally. Central to these efforts are activities that evaluate methodologies and research tools according to GRIP recommendations on the needs of researchers (including industry) and patients.
Work Package 2 aims to develop an integrated electronic infrastructure for epidemiological, pharmacovigilance and post-marketing research in paediatrics. Pharmacoepidemiology has many well-established roles in paediatric drug development, and monitoring of adverse events yields valuable information on the safety of drugs and improves the planning of trials. However, available healthcare data on infants, children, and adolescents are not adequately utilized. First, the lack of a federation of healthcare databases is a missed opportunity for meaningful investigations in these vulnerable populations (1). Second, the lack of shared methodologies to specifically retrieve paediatric information hinders access to valuable information. Third, a lack of standardized methods and study designs creates an unnecessary burden for paediatric drug development. Therefore, new approaches and standardized methodologies need to be developed and evaluated (http://www.ema.europa.eu/docs/en_GB/document_library/Report/2011/05/WC500106554.pdf).
Combining data from different databases and countries is crucial in paediatric pharmacoepidemiology to increase the sample size and the heterogeneity of the population setting and to perform long-term follow-up studies (2). The targeted electronic infrastructure should allow for virtually linking existing healthcare databases across the globe to assess the occurrence of diseases in children, plus the use and effects of drugs (including vaccines), on a large scale and in a standardized manner.
Methodologies for harmonization, data exchange across national boundaries (including ethical and governance issues), data mining and comparative safety and effectiveness studies will be developed.
5 Objectives of deliverable 2.2
This report describes the approach and the results of the identification and characterization of the existing databases that will be used to develop a global integrated infrastructure. The main aims were:
1. To generate an inventory of existing data sources globally.
2. To describe the databases in terms of their potential to contribute data for observational research in children.
6 Healthcare databases
Computerized health care data have proven to be a valuable resource for
pharmacoepidemiological and health services research. The European Medicines Agency
(EMA) and the Food and Drug Administration (FDA) now recognize and recommend the use of
electronic health records when conducting post-authorization drug utilization and safety studies
(1). Proper pharmacoepidemiological studies require both numerators and
denominators, so outcome, exposure, demographic and clinical population data
are all essential. The main focus in GRIP is on population-based healthcare databases.
6.1 Definition
Population-based healthcare databases are defined in GRIP as:
Person-level population-based databases that capture:
a) routine care information on drug prescriptions/dispensing on a person level, which can
be linked to the population file by a unique identifier; and/or
b) clinical diagnoses/events/outcomes on a person level, which can be linked to the
population file by a unique identifier.
By population-based databases we mean databases that capture the follow-up period (i.e.,
the start and end dates between which data on drugs and/or outcomes are available) at the
individual person level, independent of health status (i.e., even if the person is healthy). Population-based
does not necessarily mean national. Regional databases (e.g., where care is organized
regionally), databases capturing GP populations (i.e., where patients must be registered with a GP
regardless of illness and the GP acts as gatekeeper), and claims/insurance databases are all regarded as
population-based databases.
Hospital medical records alone (such as neonatal intensive care unit, NICU, data) are not
considered population-based, since they lack an underlying registered catchment
population (i.e., a record of all persons who were not referred to a hospital, and of the time
referred persons spent before and after hospitalization).
Immunization, drug use and disease registries alone are not considered population-based
databases if they do not capture the population that is not exposed, vaccinated or diseased. If a
vaccination registry captures a defined population from birth to a specific date, comprising all
vaccinated and unvaccinated subjects in that population, it is considered population-based.
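As an illustration only, the definition above can be read as a simple eligibility rule. The sketch below is one possible encoding in Python; the class, field and function names are hypothetical and are not part of the GRIP survey, and the two example sources are invented to mirror the GP and NICU cases discussed in the text.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical summary of a data source, for illustration only."""
    name: str
    captures_drug_data: bool         # criterion (a): prescriptions/dispensing
    captures_clinical_data: bool     # criterion (b): diagnoses/events/outcomes
    person_level_linkage: bool       # unique identifier links to the population file
    has_registered_population: bool  # follow-up recorded for everyone, sick or healthy


def is_population_based(source: DataSource) -> bool:
    """Apply the definition: a registered population, person-level linkage,
    and at least one of drug data (a) or clinical data (b)."""
    return (
        source.has_registered_population
        and source.person_level_linkage
        and (source.captures_drug_data or source.captures_clinical_data)
    )


# Invented examples: a GP database with registered patients, and hospital
# (NICU) records without an underlying catchment population.
gp_database = DataSource("GP database (hypothetical)", True, True, True, True)
nicu_records = DataSource("NICU records (hypothetical)", True, True, True, False)

gp_is_eligible = is_population_based(gp_database)
nicu_is_eligible = is_population_based(nicu_records)
```

Under this reading the GP database qualifies and the NICU records do not, because the latter lack a registered catchment population.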
7 Methods
7.1 Procedure for identification of healthcare databases
The procedure employed for the identification of the global population-based automated
healthcare databases is outlined in Figure 1. Three different methods were combined to
complete the total list of database contacts which were invited to participate in the on-line
survey.
a) Retrieving data from published ICPE abstracts
A systematic review was performed of published abstracts presented at the 25th and 26th International
Conferences on Pharmacoepidemiology and Therapeutic Risk Management (ICPE) during the
years 2009-2010. At the same time, the abstract books of the Asian
meetings (ACPE) were reviewed. Duplicate entries for the same database were removed, and
the following information was retrieved:
(i) abstract number
(ii) conference year
(iii) country
(iv) name of automated healthcare database.
Subsequently, by consulting the corresponding websites, further data on contact details,
start years, type of database (e.g., claims, GP, pharmacy database, etc.) and age range
covered were collected, whenever available. The final list, named “Abstract database contacts”,
included 169 database contacts from all continents.
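The duplicate-removal step can be sketched as follows. This is a minimal illustration, assuming that two abstracts refer to the same database when country and database name match after simple normalisation; the report does not state the matching rule actually used, and the record type, function names and abstract numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractRecord:
    abstract_number: str   # item (i)
    conference_year: int   # item (ii)
    country: str           # item (iii)
    database_name: str     # item (iv)


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def deduplicate(records):
    """Keep the first record seen for each (country, database name) pair."""
    first_seen = {}
    for record in records:
        key = (normalise(record.country), normalise(record.database_name))
        first_seen.setdefault(key, record)
    return list(first_seen.values())


# Illustrative records only; the abstract numbers are invented.
records = [
    AbstractRecord("A-001", 2009, "Italy", "PEDIANET"),
    AbstractRecord("A-002", 2010, "Italy", "Pedianet "),
    AbstractRecord("A-003", 2010, "UK", "THIN"),
]
unique_records = deduplicate(records)
```

Keeping the first occurrence preserves the earliest abstract reference for each database; a real review would likely also need manual checks for databases cited under different names.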
b) Procedure for identification of the immunization databases
The contact list for the immunization databases was compiled by the Brighton Collaboration
Foundation according to the following approach:
Step 1: The Brighton Collaboration member list was screened for potential contacts in each
country with emphasis on contacts affiliated with public health authorities
Step 2: In countries where no contacts with public health background were available,
professionals from regulatory authorities or academia or clinical care agencies were
approached for recommendation of suitable contacts in their countries.
Step 3: Professionals referred to us based on Step 1 and 2 correspondence were contacted.
Step 4: Other networks or activities such as the International Paediatric Association, INDEPTH,
the Global H1N1 vaccine safety case series were utilized to identify additional contacts.
The assistance from WHO and other international organizations is still pending.
c) Retrieving from B.R.I.D.G.E. to data and meetings/conferences
“B.R.I.D.G.E. to data” is a non-profit organization that provides an online reference to
population-based healthcare databases worldwide that can be used in epidemiologic and
health outcomes research (http://www.bridgetodata.org). Access is provided upon payment of a
license fee. EMC bought an academic license and agreed with the organization that the data may be
used for GRIP. The centralized “B.R.I.D.G.E. to data” compendium contains over 170
standardized database profiles (with 75 defined data fields) representing 24 countries. It is
structured to allow efficient side-by-side comparison of databases and to
provide extensive database details (with the permission of the database managers), and it is
continuously updated. The types of database that “B.R.I.D.G.E. to data” contains include
longitudinal electronic medical records (EMR), claims databases, drug- or disease-specific
cohorts, registries, national surveys, national surveillance systems and spontaneous reporting
systems. For the purpose of this task, however, only longitudinal electronic medical records have
been considered.
To identify relevant databases, the strategy was to search for the presence of such databases
(longitudinal population-based) country by country, in alphabetical order. The following
steps were undertaken:
- Access to the website was requested and granted.
- The database “search” page was accessed.
- The following information was entered into the relevant fields on the search page or
selected from the available options: country where the database is located and
database type (in this case longitudinal population database, the same for every search
conducted). It was also specified that only databases containing information on the age of
the patient were needed.
- The following fields were not utilized (no preferences were specified) in conducting the
searches: keyword; database source; “specific period of entry into a database”;
population type (e.g., general population, inpatient, etc.); active population
size; gender data; ethnicity/race data; death record; diagnosis data; birth defect data;
cancer data; procedure data; laboratory information; drug data; cost data; validation
against original source; access to original medical records.
Lastly, results were returned based on the entered search criteria.
The results obtained from “B.R.I.D.G.E. to data” were finally compared with (and used in
updating) the information that was already available for each country in the list of databases
being compiled.
Matching information and creation of inventory
The list “Abstract database contacts”, comprising 169 database contacts, was matched to the
inventory retrieved from “B.R.I.D.G.E. to data”, comprising 74 contacts. After matching, the contact
list was updated to 214 contacts. In parallel, some members of the GRiP network established
direct contact with database owners whom they met at the ECDC conferences on “Vaccine and drug
safety in paediatrics” and at a meeting at the Public Health Agency of Canada. A parallel
inventory of 28 database contacts was set up. This latter inventory and the updated list
were matched to produce the final list of database contacts to be invited to participate in the survey
(Figure 1).
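The first matching step can be checked with simple inclusion-exclusion arithmetic: when two lists are merged and shared entries are counted once, the merged size equals the sum of the two sizes minus the overlap. The sketch below uses only the three figures reported above (169, 74 and 214); the function and variable names are illustrative.

```python
def merged_size(size_a: int, size_b: int, overlap: int) -> int:
    """Size of the union of two lists that share `overlap` entries."""
    if overlap < 0 or overlap > min(size_a, size_b):
        raise ValueError("overlap must lie between 0 and the smaller list size")
    return size_a + size_b - overlap


abstract_contacts = 169   # "Abstract database contacts"
bridge_contacts = 74      # "B.R.I.D.G.E. to data" inventory
merged_contacts = 214     # contact list after matching

# Overlap implied by the three reported figures.
implied_overlap = abstract_contacts + bridge_contacts - merged_contacts
```

The implied overlap is 29 contacts present in both lists, and `merged_size(169, 74, 29)` returns the reported 214.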
Figure 1: Flowchart on the procedure for selection of the database contacts from different
sources.
7.2 Creation of the survey
In order to conduct the survey, a questionnaire investigating the characteristics of the identified
databases was developed. Given the objectives of this project, the items included in the
questionnaire concerned the nature of the databases, the type of data collected and the
possibility for the database to contribute data to future GRIP studies.
Two previously tested questionnaires, used in surveys describing existing databases in the
European context, were used as references. Numerous items that had proven valid in the
previous surveys were adopted in the new questionnaire; other items were modified or
specifically developed to meet the needs of the GRIP survey.
The two questionnaires that served as guides were:
1. The questionnaire developed by the European Network of Centres for Pharmacoepidemiology
and Pharmacovigilance (ENCePP) to collect information on databases with pharmacological data
in EU (www.encepp.eu)
2. The questionnaire used by the Task-force in Europe for Drug Development for the Young
(TEDDY) for a survey on databases for paediatric medicine research (3).
These questionnaires were reviewed and their items compared one by one. In this way we aimed
both to adopt question formats that had already been tested and to avoid omitting items that
had proven informative.
The survey from the TEDDY project, which was specifically designed to describe paediatric
databases, provided the guide for most of the paediatric-specific questions.
Other, original items were developed specifically for the GRIP questionnaire, taking into account
the paediatric focus of future studies and the nature of the databases surveyed (longitudinal
healthcare databases). For example, specific questions were needed to address the estimation of the
paediatric catchment population and paediatric pharmacological issues (such as dosing by weight). In
addition, because the survey deals with longitudinal healthcare databases, the population currently
under follow-up (defined as the “active population”) needed to be specified alongside the total registered
population.
Furthermore, a complete section on the information available on vaccinations was specifically
developed for this survey by the GRIP partners at the Brighton Collaboration Foundation.
The final questionnaire comprised 14 main sections with a total of 55 questions (see Appendix
1). The issues included were as follows:
- Contact information for the database and the responsible person (name and address)
- Information on the nature of the database (possible linkage of drug prescriptions and/or clinical data with the population)
- Years, population and geographic areas covered by the database
- Information on data collected: type of demographic and clinical data (including data on referrals), type of data on drugs and vaccines
- Possibility of collaboration in future studies: regulations on access to the stored data, additional information that could be collected if needed, intent regarding future collaborations
- Previous publications on the data collected (with a focus on paediatrics)
All questions were developed or reshaped to minimize possible misinterpretation. Complex
questions were broken down into several simple questions and, whenever possible, multiple-choice
answers were offered. Space for open-ended comments was left at the end of the questionnaire.
Contact details of GRIP partners available for clarification were given both in the questionnaire and in
the cover letter accompanying it.
A user’s guide with instructions on most questions was developed together with the
questionnaire (see Appendix 2), to be delivered with it prior to the survey.
Each survey was emailed with a cover letter explaining the rationale and purposes of the
GRiP project and highlighting the importance of completing the questionnaire and of collaborating with the
network (see Appendix 3).
IRIDIA built an online version of the survey and sent out the invitations, each with a private key to be
used (see screenshot below).
8 Results
8.1 Databases invited to participate to the survey
A total of 238 automated population-based healthcare databases were identified through
manual review of the ICPE/ACPE abstracts, the “B.R.I.D.G.E. to data” software and personal contacts of
members of the GRiP network (Figure 2). By continent, we collected 90 databases from
European countries, of which 37 were extracted exclusively from the conference abstract review, 22 exclusively
from “B.R.I.D.G.E. to data”, 17 were matched between “B.R.I.D.G.E. to data” and the “Abstract
database contacts”, and 14 were retrieved through networking at the meetings. Similarly, 74
databases were identified from North American countries: 39 came exclusively from the
conference abstract review, 17 from “B.R.I.D.G.E. to data”, 11 were matched between these
two inventories and 7 came through networking at the meetings. Among Asia-Pacific countries we
identified 46 databases: 36 exclusively from abstract review, 6 exclusively from
“B.R.I.D.G.E. to data”, 1 matched, and 3 through networking at the meetings. Twenty-two
databases from African countries and 6 from South American countries were retrieved only
by abstract review; however, we failed to find sufficient contact information for them.
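The regional breakdown above can be tallied to confirm that the source-specific counts add up to the regional totals and to the overall figure of 238. The sketch below simply re-enters the numbers from the text; the dictionary keys are labels chosen for illustration.

```python
# Counts per region and source, as reported in the paragraph above.
counts = {
    "Europe": {"abstracts_only": 37, "bridge_only": 22, "matched": 17, "networking": 14},
    "North America": {"abstracts_only": 39, "bridge_only": 17, "matched": 11, "networking": 7},
    "Asia-Pacific": {"abstracts_only": 36, "bridge_only": 6, "matched": 1, "networking": 3},
    "Africa": {"abstracts_only": 22},
    "South America": {"abstracts_only": 6},
}

# Total number of databases per region.
regional_totals = {region: sum(by_source.values()) for region, by_source in counts.items()}

# Total across all regions.
grand_total = sum(regional_totals.values())

# Entries found in both the abstract list and "B.R.I.D.G.E. to data".
matched_total = sum(by_source.get("matched", 0) for by_source in counts.values())
```

With these inputs the regional totals are 90, 74, 46, 22 and 6, the grand total is 238, and the matched entries sum to 29.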
Figure 2: Global distribution of the database contacts extracted by different sources and
manually screened.
Sources shown: ICPE/ACPE abstracts (N=169); B.R.I.D.G.E. to data (N=74); personal contacts (N=28). Overlaps shown: N=29 and N=2.
After screening of the email address details and exclusion of all duplicates (n=3) and of records
with insufficient contact details (n=112), 125 databases were identified globally, and
99 of them were invited to participate in the on-line survey (see Appendix 4a). A reminder
was sent to non-responders and repeated when there was still no reply.
The remaining 26 databases, contacted through direct networking at meetings/conferences,
were personally invited by the lead members of the GRiP network (MS, JB, IW, HN) to complete the
questionnaire, either on-line or off-line (see Appendix 4b). The process will be followed up by
the WP2 members, who will contact the database representatives by telephone.
8.2 Response rate of survey
To date, 99 surveys have been sent out and 16 responses have been received, corresponding to
a response rate of 16%. In total, 75% of the respondents (N=12) agreed to collaborate with the
GRiP network in future pharmacoepidemiology studies. Only two respondents did not agree;
one of them expressed concerns about the clarity of the information provided on the
involvement in the project. Two other respondents answered the survey only partially.
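The percentages in this section follow directly from the counts. A short worked check, using only the figures reported above (the variable names are illustrative):

```python
invited = 99      # surveys sent out
responded = 16    # responses received
agreed = 12       # respondents who agreed to collaborate
declined = 2      # respondents who did not agree
partial = 2       # respondents who answered only partially

response_rate = 100 * responded / invited   # share of invited databases that replied
agreement_rate = 100 * agreed / responded   # share of respondents who agreed

# The three respondent groups should account for every response.
accounted_for = agreed + declined + partial
```

Here `response_rate` is about 16.2%, reported as 16%, `agreement_rate` is exactly 75%, and the three respondent groups add up to the 16 responses received.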
8.3 Assessment of the survey
Only the data sources whose respondents fully agreed to collaborate with the GRiP network
were included in the final analysis (N=12). Overall, these databases were set up between 1986
and 2007 and are located in 5 different countries: Germany, United Kingdom (UK), Denmark,
the Netherlands (NL) and Italy. The exception is the MediGuard database, which is available in the following
countries: United States (US), UK, France, Germany, Spain, Australia and Brazil. The databases are
listed below:
1. German Pharmacoepidemiological Research Database (GePaRD), Germany;
2. The Health Improvement Network database (THIN), UK;
3. Clinical Practice Research Datalink (CPRD), UK;
4. PEDIANET, Italy;
5. (ASL) Cremona, Italy;
6. (ARS) Toscana, Italy;
7. Information system policies for health and social policies (ARS) Emilia Romagna, Italy;
8. InterAction Database (IADB), The Netherlands;
9. Integrated Primary Care Information database (IPCI), The Netherlands;
10. Agis Health Database, The Netherlands;
11. Aarhus University Research Database, Denmark;
12. MediGuard.org (several countries: US, UK, France, Germany, Spain, and Australia).
A geographical map illustrates the global distribution of the databases involved in the GriP
network and assessd (Figure 3).
Of the 12 databases, 8 provided the total cumulative number of paediatric patients, accounting
for a population of around 15 million aged 0-18 years. Overall, the included databases are
primary care (general practitioner, GP, or family paediatrician, FP) and/or insurance claims
databases. Concerning drug information, nine databases capture outpatient records and three capture
both outpatient and inpatient records, whilst clinical data are described in 11 databases. Patient-level
linkage between drug exposure and clinical outcome is feasible for all 12 databases.
Based on the survey information and on the published studies using their data, the databases were
categorized with respect to their potential suitability for use in paediatric drug utilization and
drug safety studies. The information collected has been categorized as demographics, drug
exposure, clinical outcomes and data access.
Figure 3: Distribution of databases included in the GRiP network.
8.4 Nature and characteristics of the databases
Six of the databases included in this survey were set up between 2000 and 2007 and six between
1986 and 1998. A detailed overview of the databases included in the survey is given in
Tables 1a and 1b.
Six databases from three countries (THIN, CPRD, PEDIANET, Emilia-Romagna, IADB.nl, IPCI) are
longitudinal, population-based databases using electronic medical record data from general
practitioners (GPs) and family paediatricians (FPs). PEDIANET also includes claims data when
collected by the FPs. These databases were developed in countries where physicians and/or
paediatricians (in Italy) are gatekeepers for medical care and information. All of these electronic
medical record databases contain anonymous data on patient demographics, reasons for visits,
diagnoses from GPs/FPs and specialists, hospitalizations, drug prescriptions, laboratory and
other diagnostic findings for the paediatric population.
Five databases (GePaRD, ASL Cremona, ARS Toscana, Agis Health Database, Aarhus) are drug
dispensing claims databases that process all prescriptions that need to be reimbursed. Nevertheless,
patient-level linkage between drug exposure, clinical outcome and the patient population file is
feasible for all of them. GePaRD provides demographic data as well as information on hospital
admissions, outpatient physician visits and outpatient prescriptions from Statutory Health
Insurances (SHI).
MediGuard is neither a GP nor a claims database; it is a free medication monitoring service
designed specifically for patients by professionals with decades of experience in healthcare
market research, clinical drug development, and drug safety (www.mediguard.org). No further
information was found concerning the collection of data.
8.4.1 Drug exposure
All databases that participated in the survey collect information on prescription drugs, the
units dispensed or prescribed and the formulation, and most of them also record the dosage
regimen, which is particularly important for the paediatric population. In the majority of the
databases, drugs are coded according to the Anatomical Therapeutic Chemical (ATC) classification
system. Some also use other drug codes such as z-index (IADB.nl and IPCI), DPICS (Agis),
AIC (PEDIANET) and the Multilex coding system (CPRD). The drug code system used in the THIN
database is the British National Formulary (BNF). The indication for use is recorded only in CPRD,
PEDIANET and IPCI, using Read codes, ICD-9-CM codes plus free text, and ICPC codes, respectively. In
GePaRD, prescription data contain the prescribed drugs characterized by the central
pharmaceutical number (PZN), the dates of prescription and dispensation, and information on
the prescribing physician. They are available for all outpatient prescriptions that are reimbursed
by the SHIs. Prescription data are linked to a pharmaceutical reference database that adds
information on the defined daily dose (DDD), the ATC code, strength, packaging size, and the
generic and brand names.
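The linkage of prescription records to a pharmaceutical reference table can be illustrated with a minimal sketch. It is not GePaRD's actual schema or processing code: the field names (`pzn`, `packs`, `ddds_dispensed`), the reference table layout and the example PZN are assumptions made for illustration only. The ATC code N02BE01 and the DDD of 3 g for paracetamol follow the WHO ATC/DDD index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceEntry:
    """One row of a hypothetical pharmaceutical reference table."""
    atc: str
    generic_name: str
    strength_mg: float   # amount of active substance per unit
    pack_size: int       # units per package
    ddd_mg: float        # defined daily dose, in mg


# Hypothetical reference table keyed by PZN; the PZN below is invented.
REFERENCE = {
    "00000001": ReferenceEntry("N02BE01", "paracetamol", 500.0, 20, 3000.0),
}


def enrich(prescription, reference):
    """Return a copy of the prescription with reference information added."""
    enriched = dict(prescription)
    entry = reference.get(prescription["pzn"])
    if entry is None:
        # Unknown PZN: keep the record but leave the added fields empty.
        enriched.update(atc=None, generic_name=None, ddds_dispensed=None)
        return enriched
    total_mg = entry.strength_mg * entry.pack_size * prescription.get("packs", 1)
    enriched.update(
        atc=entry.atc,
        generic_name=entry.generic_name,
        ddds_dispensed=total_mg / entry.ddd_mg,  # number of DDDs in this dispensing
    )
    return enriched


# Invented example record: one pack of 20 tablets of 500 mg.
rx = {"patient_id": "p1", "pzn": "00000001", "dispensed_on": "2010-03-15", "packs": 1}
enriched_rx = enrich(rx, REFERENCE)
```

Keeping the reference information in a separate table means that ATC codes or DDDs can be updated without touching the prescription records themselves.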
8.4.2 Vaccine exposure
Immunization data are captured in six databases (GePaRD, THIN, CPRD, Aarhus, ASL
Cremona, IADB.nl); all of them include the vaccine code and date of vaccination for routine paediatric
immunization. Three databases (CPRD, ASL Cremona, IADB.nl) also include the brand name, and four
(GePaRD, THIN, Aarhus, and ASL Cremona) also include data on elective childhood immunization. When
recorded by GPs, the IPCI database provides data on vaccines (e.g. influenza); childhood
vaccinations are available through linkage with a national registry from RIVM (pilot phase).
8.4.3 Clinical outcome
Past and current medical diagnoses are recorded using Read codes (a thesaurus of coded
medical terms maintained and distributed by the United Kingdom Terminology Centre) in THIN,
Agis Database and MediGuard.org; according to the German modification of the International
Classification of Diseases (ICD-10-GM) in GePaRD, Aarhus and IADB.nl; and using both code systems in
CPRD. Symptoms and medical diagnoses are either registered as free text or coded using ICPC
(International Classification of Primary Care) in IPCI and ICD-9-CM (International Classification of
Diseases, 9th revision, Clinical Modification) in PEDIANET. All the remaining Italian databases
use the ICD-9-CM classification. Hospital data are reported in the majority of the
databases and include the dates of admission and discharge with their corresponding diagnoses,
and information on in-hospital diagnoses and procedures. Claims regarding outpatient physician
visits contain diagnoses, ambulatory diagnostic procedures and non-drug treatments.
8.4.4 Accessibility and costs of databases
All twelve databases allow data access for paediatric research. The majority of
the providers have a written policy governing data access and require evaluation by a committee.
Six of the databases may be accessed free of charge, although most of them provide special
conditions if data are used for academic research purposes.
9 Discussion and Limitations
Combining and sharing data from different databases and countries is important to increase
sample sizes and to perform long-term studies in paediatrics. To date, combining the populations
of the databases that participated in this survey yields a paediatric population of around 15
million, providing good potential for paediatric pharmacoepidemiological studies. Creating an
inventory of existing health care databases and their willingness to participate in future projects
is important, as large databases are needed for paediatric pharmacoepidemiology research in
terms of power and long-term follow-up.
Previous projects such as EU-ADR, SOS and others have shown that data from different
databases can be combined to conduct international observational studies (4). The majority of
health care databases were not created primarily to conduct research but are simply collections
of electronic patient records accessible to health care staff for monitoring patient care. The
organisation of health care is country-specific, which in part explains the heterogeneity among
the databases in terms of disease and drug coding. The development of automatic tools such as
disease and drug mapping will further facilitate the combination of data from different health
care databases according to a common study protocol.
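As an illustration of what such a mapping tool does, the sketch below maps diagnosis codes from the four coding systems used by the surveyed databases to one shared event label. It is a simplified assumption, not the mapping procedure of EU-ADR or GRiP: the table `CONCEPT_MAP`, the label `ASTHMA` and the prefix-matching rule are hypothetical, and a real mapping would be built and validated by clinical coders.

```python
# Illustrative mapping for a single event (asthma). Each key is a pair of
# (coding system, code stem); the common label "ASTHMA" is an invented name.
CONCEPT_MAP = {
    ("ICD-10", "J45"): "ASTHMA",
    ("ICD-9-CM", "493"): "ASTHMA",
    ("ICPC", "R96"): "ASTHMA",
    ("READ", "H33"): "ASTHMA",
}


def map_to_concept(system, code):
    """Map a local diagnosis code to a common concept, or None if unmapped.

    Dots and surrounding spaces are dropped, then progressively shorter
    prefixes are tried so that more specific codes (e.g. J45.0) fall back
    to their parent (J45). Case is preserved because Read codes are
    case-sensitive.
    """
    stem = code.strip().replace(".", "")
    for length in range(len(stem), 0, -1):
        concept = CONCEPT_MAP.get((system, stem[:length]))
        if concept is not None:
            return concept
    return None


# The same event is recognised regardless of the source coding system.
examples = [("ICD-10", "J45.0"), ("ICD-9-CM", "493.90"), ("ICPC", "R96"), ("READ", "H33..")]
mapped = [map_to_concept(system, code) for system, code in examples]
```

Once every database reports events under the same label, a common study protocol can refer to the label rather than to database-specific codes.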
From the survey, we learned that the databases collect information on age, drug dosing,
mother-child linkage, immunisation status, etc. Other important information, such as height and
weight, is however not collected systematically. Although we appreciate that health care
databases do not have research as their primary aim, it would be an asset if databases started to
collect crucial information that has proven important for pediatric research.
So far, responses and collaborative opportunities have come only from
Europe. No databases from North and South America, Asian-Pacific countries or Africa
responded to the survey. The non-response from Asian-Pacific countries is mainly
due to limited knowledge of the GRiP project and reluctance to share data. Ethical
issues and governance requirements may also be a concern in relation to data extraction from
national databases. In Europe, by contrast, a preceding project made it possible to build from national
to international efforts through common structures. Overall, the low response rate could also be
attributed to the fact that the survey specifically addressed the availability of pediatric data;
databases that do not collect information on the pediatric population are therefore less likely to
respond.
The absence of databases from other regions is due to the diversity of healthcare systems,
and our common challenge is to identify the methods and technical requirements to facilitate
bridging the different structures. In the coming weeks, we will continue contacting the
non-responding databases, which will enrich our inventory of pediatric databases. In the end, we
hope to create an up-to-date inventory of all existing pediatric databases, which should allow
worldwide pediatric observational research to be conducted.
10 Conclusions / Outlook and next steps
As next steps, the database contacts that have not yet replied will be contacted again in the coming months, which will hopefully result in participation from automated population-based healthcare databases in North and South America, the Asian-Pacific area and Africa. In order to explain the project in detail and to motivate people to respond to the survey, the databases in Asian-Pacific countries will be contacted personally. There is also the possibility of collecting feedback through email and other means in parallel with the web-based survey. After a second reminder, non-responders will be followed up by telephone.
In parallel, the 26 personal contacts, i.e., the contact persons directly informed about the GRiP project during conferences/meetings, will be contacted by personal email.
The anonymized data from the databases eligible to participate in the GRiP network will be combined. The analyses of the anonymized data sets that are output by Jerboa© (5) should be performed in a distributed fashion using one Remote Research Environment (RRE), which will be located at EMC. The RRE allows for loading, retrieving, extracting, and transforming of the data by different institutions/partners. Report D 2.01 describes in detail the security measures taken for the RRE to ensure the high level of protection of stored data required by article 34 of legislative decree 196/2003 and by Directive 95/46/EC for the processing of healthcare data.
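The general idea of a distributed analysis can be sketched as follows: each database collapses its patient-level records into anonymized counts locally, and only those counts are pooled in the shared environment. This is an illustration under stated assumptions and does not reproduce the Jerboa output format or the RRE workflow; the function names, the record fields (`age_band`, `exposed`, `person_years`) and the example data are all hypothetical.

```python
from collections import defaultdict


def aggregate_locally(records):
    """Run inside each database: reduce patient-level rows to counts per age band."""
    totals = defaultdict(lambda: {"users": 0, "person_years": 0.0})
    for record in records:
        band = totals[record["age_band"]]
        band["users"] += 1 if record["exposed"] else 0
        band["person_years"] += record["person_years"]
    return {age_band: dict(values) for age_band, values in totals.items()}


def pool(site_aggregates):
    """Run centrally: sum the aggregated counts and derive a pooled rate."""
    pooled = defaultdict(lambda: {"users": 0, "person_years": 0.0})
    for aggregate in site_aggregates:
        for age_band, values in aggregate.items():
            pooled[age_band]["users"] += values["users"]
            pooled[age_band]["person_years"] += values["person_years"]
    result = {}
    for age_band, values in pooled.items():
        py = values["person_years"]
        rate = 1000.0 * values["users"] / py if py > 0 else None
        result[age_band] = {**values, "users_per_1000_py": rate}
    return result


# Invented patient-level data held at two hypothetical sites.
site_a = [
    {"age_band": "0-4", "exposed": True, "person_years": 1.0},
    {"age_band": "0-4", "exposed": False, "person_years": 1.0},
]
site_b = [{"age_band": "0-4", "exposed": True, "person_years": 2.0}]

pooled_result = pool([aggregate_locally(site_a), aggregate_locally(site_b)])
```

In this arrangement no patient-level row leaves the originating database; only the per-stratum totals are shared.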
11 References
1. Blake KV, Devries CS, Arlett P, Kurz X, Fitt H. Increasing scientific standards, independence and transparency in post-authorisation studies: the role of the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance. Pharmacoepidemiol Drug Saf. 2012 Apr 23.
2. Black N, Barker M, Payne M. Cross sectional survey of multicentre clinical databases in the United Kingdom. BMJ. 2004 Jun 19;328(7454):1478.
3. Neubert A, Sturkenboom MC, Murray ML, Verhamme KM, Nicolosi A, Giaquinto C, et al. Databases for pediatric medicine research in Europe--assessment and critical appraisal. Pharmacoepidemiol Drug Saf. 2008 Dec;17(12):1155-67.
4. Trifiro G, Patadia V, Schuemie MJ, Coloma PM, Gini R, Herings R, et al. EU-ADR healthcare database network vs. spontaneous reporting system database: preliminary comparison of signal detection. Stud Health Technol Inform. 2011;166:25-30.
5. Coloma PM, Schuemie MJ, Trifiro G, Gini R, Herings R, Hippisley-Cox J, et al. Combining electronic healthcare databases in Europe to allow for large-scale drug safety monitoring: the EU-ADR Project. Pharmacoepidemiol Drug Saf. 2011 Jan;20(1):1-11.
Table 1a: Demographic and clinical characteristics of the eligible databases from Germany, UK, Denmark and several countries

| Characteristic | GePaRD (Germany) | THIN (UK) | CPRD, Clinical Practice Research Datalink (UK) | Aarhus University Research Database (Denmark) | MediGuard.org (several countries) |
|---|---|---|---|---|---|
| Start date | 2004 | 1988 | 1986 | 1998 | 2007 |
| Updated | Yearly | 3 months | Continuously | Continuously | Continuously |
| Cumulative number (0-18 years) | > 14 million overall | > 1.8 million | 12.5 million overall | 18 million overall | < 5000 |
| Demographics* | Yes | Yes + height & weight | Yes + weight | Yes | Yes |
| Mother-child linkage | Yes | Yes | Yes | Yes | No |
| Drug prescriptions^ (exposure) | Outpatients | In- and outpatients | In- and outpatients | In- and outpatients | In- and outpatients |
| Type of data source | Claim | GP | GP | Claim | n.a. |
| Code | ATC code | The Multilex Coding System - BNF | The Multilex Coding System - ATC code | ATC code | WHO Drug Database |
| Indication | No | Read code - lab data | Read code | No | No |
| Units (No.) | Yes | Yes | Yes | Yes | No |
| Prescribed dosage frequency | No | Yes | Yes | No | No |
| Prescribed duration of treatment | No | Yes | Yes | No | No |
| Formulation | Yes | Yes | Yes | Yes | No |
| Strength | Yes | Yes | Yes | N/A | No |
| Route | Yes | Yes | Yes | Yes | No |
| Clinical outcome (disease) | In- and outpatients | In- and outpatients | In- and outpatients | In- and outpatients | In- and outpatients |
| Diagnosis for accessing to db | ICD-10 | Read code - lab data | ICD-10 - lab data | ICD-10 - lab data | Free text |
| Immunization data$ | Yes (brand name n.a.) | Yes (brand name n.a.) | Yes (elective immunization n.a.) | Yes (brand name n.a.) | No |
| Referral to specialist | Yes | Yes | Yes | Yes | n.a. |
| Results on referral visits | Yes | Yes | Yes | No | n.a. |
| Emergency room admission | Yes | Yes | Yes | Yes | n.a. |
| Results of ER admission | Yes | Yes | Yes | Yes | n.a. |
| Hospital admission | Yes | Yes | Yes | Yes | n.a. |
| Hospital discharge diagnosis | ICD-10 | Read code | Read code | ICD-10 | n.a. |
| Additional information | No | Physician, patients, genetic or samples | Physician, patients, genetic or samples | Patients, genetic or samples | Physician, patients |
| Data access: written policy governing | n.a. | Yes | Yes | Yes | Yes |
| Data access: evaluation committee | Yes | Yes | Yes | Yes | No |
| Data access: charge request | Yes | Yes | Yes | Yes | Yes |
Table 1b: Demographic and clinical characteristics of the eligible databases from Italy and Netherlands.

| Characteristic | ASL Cremona (Italy) | PEDIANET (Italy) | Emilia-Romagna (Italy) | ARS Toscana (Italy) | IADB.nl (Netherlands) | Agis Health Database (Netherlands) | IPCI (Netherlands) |
|---|---|---|---|---|---|---|---|
| Start date | 2001 | 2000 | 2002 | 2003 | 1994 | 2001 | 1992 |
| Updated | Monthly | Monthly | Continuously | Yearly | Bi- or yearly | Continuously | Yearly |
| Cumulative number (0-18 years) | 97400 | 180000 | 500000 | about 930,000 | 64645 | 1.3 overall | 300,000 |
| Demographics* | Yes | Yes + height & weight | Yes | Yes | Yes | Yes | Yes + height & weight |
| Mother-child linkage | Yes | Yes | Yes | No | Yes | Yes | Yes (probabilistic) |
| Drug prescriptions^ (exposure) | Outpatients | Outpatients | Outpatients | Outpatients | Outpatients | Outpatients | Outpatients |
| Type of data source | Claim | Claim - GP | GP | Claim | GP | Claim | GP |
| Code | ATC code - AIC | ATC code - The National Drug Code (NDC) System - MINSAN | ATC code | ATC code | ATC code - z-index | ATC code - DPICS | ATC code - z-index |
| Indication | No | ICD-9 - free text | No | No | No | No | ICPC-code |
| Units (No.) | No | Yes | Yes | Yes | Yes | Yes | Yes |
| Prescribed dosage frequency | No | Yes | No | No | Yes | Yes | Yes |
| Prescribed duration of treatment | No | Yes | No | No | Yes | Yes | Yes |
| Formulation | Yes | Yes | Yes | Yes | Yes | Yes | No |
| Strength | No | Yes | Yes | Yes | Yes | Yes | Yes |
| Route | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Clinical outcome (disease) | Inpatients | Outpatients | Inpatients | Inpatients | n.a. | In- and outpatients | In- and outpatients |
| Diagnosis for accessing to db | ICD-9 | ICD-9 code - free text - lab data | ICD-9 | ICD-9 code | No | DBC in hospital | ICPC-code - free text - lab data |
| Immunization data$ | Yes | No | No | No | Yes (routine and elective immunization n.a.) | No | Yes (linkage) |
| Referral to specialist | No | Yes | Yes | Yes | No | Yes | Yes |
| Results on referral visits | No | Yes | Yes | No | No | Yes | Yes |
| Emergency room admission | Yes | Yes | Yes | No | No | Yes | Yes |
| Results of ER admission | No | Yes | Yes | No | No | Yes | Yes |
| Hospital admission | Yes | Yes | Yes | Yes | No | Yes | Yes |
| Hospital discharge diagnosis | ICD-9 | Free text | ICD-9 | ICD-9 | No | DBC | ICPC-code - free text |
| Additional information | Physician | Physician, genetic or samples | Physician | Patients, genetic or samples | Patients | Physician | Physician, patients |
| Data access: written policy governing | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Data access: evaluation committee | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Data access: charge request | No | No | Yes | No | n.a. | Yes | No |
Legend table 1a and 1b:
* demographics include age and gender
^ in all databases the drugs are indicated by name
$ immunization data include vaccine code and brand name, date, routine paediatric and elective childhood immunization
12 APPENDIX
Appendix 1: Survey on electronic healthcare databases
GRIP Survey on electronic health care databases
01 - Main Info
S001: Name of the database
S002: Database URL
S003: Contact persons
Administrative Contact person
Title
Name
Address
City, Postcode, Country, Phone number (incl. country code), Alternative phone number, Fax num.
Email address for administrative contact
Scientific Contact person
Title
Name
Address
City, Postcode, Country, Phone number (incl. country code), Alternative phone number, Fax num.
Email address for scientific contact
S004: Brief Description
02 – Nature of the database
S005: Does the database capture drug prescriptions? Yes No
If yes: does it capture drug prescriptions for Outpatients? Yes No
- through Insurance claims Yes No
- through Medical records Yes No
Does the database capture drug prescriptions for Inpatients? Yes No
- through Insurance claims Yes No
- through Medical records Yes No
S006: Does the database capture clinical data? Yes No
If yes: Outpatient clinical data Yes No
Inpatient clinical data Yes No
S007: Linkage with population data
Is patient-based linkage of clinical data to follow-up time (population file) possible? Yes No
If Yes:
- Probabilistic linkage Yes No
- Deterministic linkage (with unique identifier) Yes No
S008: Is patient-based linkage between drug prescriptions and clinical data possible? Yes No
If Yes
- Probabilistic linkage Yes No
- Deterministic linkage (with unique identifier) Yes No
03 - General Characteristics
S009: Start date of data collection
S010: Is the database updated:
Continuously Yes No
At intervals Yes No
If Yes, please specify the interval..............................................
S011: Total Cumulative number of registered subjects, including adults
S012: Total Cumulative number of registered children (0-18 years of age)
S013: Number of active (registered) children (0-18 years of age) in 2010
04 - Geographical Coverage
S014: Are the patients in the database representative for national population? (according to age and gender distribution) Yes No
S015: Names of covered regions or provinces
05 - Collected Data – Demographics
S016: Exact date of birth available as
⃝ Date, Month, Year
⃝ Month, Year
⃝ Year
⃝ None
S017: Gender Yes No
S018: Height Yes No
S019: Weight Yes No
S020: Mother-child linkage Yes No
06 - Collected Data - Clinical Data
S021: Reason for accessing care Yes No
S022: Diagnosis Yes No
If Yes, how is diagnosis collected?:
⃝ as text
⃝ as code:
⃝ ICD-10
⃝ Read code
⃝ ICPC code
⃝ ICD-9 code
⃝ Others (please specify).......................................................
S023: Measurements (laboratory/diagnostics) Yes No
07 - Collected Data – Drugs
S024: Name of drugs prescribed Yes No
S025: Identification code for each drug Yes No
If Yes, which codes are used?:
⃝ The Anatomical Therapeutic Chemical (ATC) Classification System
⃝ The Drug Products Information Coding System (DPICS)
⃝ The Multilex Coding System
⃝ The National Drug Code (NDC) System
⃝ Others (please specify)..............................................................
S026: Indication for prescription Yes No
If Yes, how is indication collected?:
⃝ as text
⃝ as code:
⃝ ICD-10
⃝ Read code
⃝ ICPC code
⃝ ICD-9 code
S027: Total number of prescribed units (tablets/ml, suppositories etc) for each drug Yes No
S028: Prescribed dosage frequency for each drug Yes No
S029: Prescribed duration of treatment for each drug Yes No
S030: Drug Formulation Yes No
S031: Drug Strength (of each unit) Yes No
S032: Route of administration Yes No
08 - Collected Data – Vaccines
S033: immunizations Yes No
If Yes:
S034: Routine paediatric immunization Yes No
if Yes: select from the list (all possible):
BCG, Cholera, Diphtheria, Haemophilus influenzae, Hepatitis A, Hepatitis B, HPV, Influenza, Japanese encephalitis, Measles, Meningococci, Mumps, Pertussis, Pneumococci, Poliomyelitis, Rabies, Rotavirus, Rubella, Tetanus, Tick-borne encephalitis, Typhoid, Varicella, Yellow fever
S035: Additional (elective) childhood immunisation Yes No
if Yes: select from the list: (as above)
S036: Date of vaccination Yes No
S037: Brand name of vaccination Yes No
09 - Collected Data – Referrals
S038: Referral to specialist Yes No
S039: Results of referral visits Yes No
S040: Emergency room admission Yes No
S041: Results of emergency room admission Yes No
S042: Hospital admission Yes No
S043: Hospital discharge diagnosis Yes No
If Yes:
S044: How is the diagnosis collected:
⃝ as text
⃝ as code:
⃝ ICD-10
⃝ Read code
⃝ ICPC code
⃝ ICD-9 code
⃝ Others (please specify).......................................................
10 – Would it in principle be possible to obtain the following additional information on the patient?
S045: Clinical information from treating physician? Yes No
S046: Data from questionnaires completed by the patient? Yes No
S047: Genetic information or samples? Yes No
11 – Data access
S048: Is there a written policy governing data access? Yes No
S049: Do you have a committee (governance/ethics) to evaluate requests for data access? Yes No
S050: Is a charge made for data access Yes No
S051: Are you allowed to provide data / do industry sponsored studies Yes No
S052: Would you allow for auditing of the data/studies by external parties?
⃝ yes to regulators
⃝ yes to companies for whom studies are done
⃝ No
12 - Please list the 5 most relevant publications using your data for the last five calendar years (please focus on paediatrics). If there is a publication explicitly reporting on assessment of data quality, please include first
13- Comments (please add comments or questions on this survey, or additional information on your database)
14-Survey completed by: …………………………
on: DD/MM/YY
Appendix 2: Guide to fill the survey
Guide for survey completion
Please find below some information to help you complete the survey. This survey includes 14 sections
and takes approximately 5-10 minutes to complete. If you have additional questions, please contact
Osemeke Osokogu ([email protected]) from the Erasmus Medical Center, Rotterdam.
Thank you for your participation.
Section 01: Contact information
Section 01 collects general information on the database and contact information for future
correspondence with the database managers.
-S001: Current database name (full name and acronym, if applicable)
-S002: Database URL: current web address of the database
-S003: Administrative contact: person in charge of administration of database; Scientific
contact: scientific advisor responsible for the database
If for your database administrative and scientific contacts are the same person, please
fill only Administrative contact and write SAME in Scientific contact.
-S004: Brief Description: Please give a brief description (1-3 lines) of the main purpose of
the database and type of data collected. For example “Database created to keep the
records of all prescriptions given by GPs in the public health system. Collects data on
drug prescriptions and demographic data on adults and children”
Section 02: Nature of database
Section 2 collects information on the nature of data collected in the database and on the
structure of the database.
- S005: Select Yes if you collect data on ‘complete’ drug prescriptions/dispensing for the
patients registered in the database. Please specify if you collect data on Outpatients (GP
or ambulatory specialist visits) and/or Inpatients (prescriptions during
hospitalizations/stay in nursing homes) - check both if applicable. Please specify as well
if data are collected through Insurance claims (payers national/private) or Medical
records (from prescribers) –check both if applicable.
- S006: Select Yes if data in your database comprise clinical data (diagnoses, reasons for
visits, laboratory assessments, hospitalizations etc.). Please then specify if data capture
clinical data for Outpatients (GP and/or ambulatory specialist visits) or Inpatients
(hospitalizations) - check both if applicable.
- S007, Linkage with population: Please select Yes if it is possible to link data on patients
in your database with data in a population file. Please specify whether this linkage is
Deterministic, meaning that you can link directly based on a unique identifier that is
the same in all files, or Probabilistic, meaning that you link by matching a series of variables
between files (e.g. date of birth, physician, sex, initials) because a unique identifier is not available.
- S008: Select Yes if it is possible in your database to link the information on drug
prescriptions and on clinical data for the same patient. Specify whether this linkage is
done in a deterministic or probabilistic manner (see above)
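The difference between the two linkage types asked about in S007 and S008 can be illustrated with a minimal sketch. The field names, the matching variables in `MATCH_KEYS` and the acceptance threshold are assumptions for illustration only. The probabilistic function uses a simple count of agreeing variables, whereas real record linkage software typically weights each variable by how informative it is.

```python
def link_deterministic(clinical, population):
    """Link on a unique identifier that is the same in both files."""
    by_id = {person["patient_id"]: person for person in population}
    return [
        (record, by_id[record["patient_id"]])
        for record in clinical
        if record.get("patient_id") in by_id
    ]


# Variables compared when no unique identifier is available (hypothetical names).
MATCH_KEYS = ("date_of_birth", "sex", "initials", "physician")


def agreement(a, b):
    """Count the matching variables on which two records agree."""
    return sum(1 for key in MATCH_KEYS if a.get(key) is not None and a.get(key) == b.get(key))


def link_probabilistic(clinical, population, min_agreement=3):
    """Link each record to its single best candidate, if it agrees on enough variables."""
    links = []
    for record in clinical:
        scores = [agreement(record, person) for person in population]
        if not scores:
            continue
        best = max(scores)
        # Reject weak matches and ties between several equally good candidates.
        if best >= min_agreement and scores.count(best) == 1:
            links.append((record, population[scores.index(best)]))
    return links


# Invented example data.
population = [
    {"patient_id": 1, "date_of_birth": "2005-02-01", "sex": "F", "initials": "AB", "physician": "dr1"},
    {"patient_id": 2, "date_of_birth": "2005-02-01", "sex": "M", "initials": "CD", "physician": "dr1"},
]
visit_with_id = {"patient_id": 2, "diagnosis": "R96"}
visit_without_id = {"date_of_birth": "2005-02-01", "sex": "F", "initials": "AB", "physician": "dr2"}

deterministic_links = link_deterministic([visit_with_id], population)
probabilistic_links = link_probabilistic([visit_without_id], population)
```

Deterministic linkage either finds the identifier or it does not, while probabilistic linkage depends on the chosen variables and threshold, which is why the survey asks databases to state which type they use.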
Section 03: general characteristics of the database
- S009: please give the date on which the first data were entered in the database
(DD/MM/YYYY or MM/YYYY or YYYY).
- S010: check Continuously if database is updated automatically at each new data entry;
check At intervals if update is done periodically (e.g. every 3 months, or annually) and
then specify the approximate average interval between updates (for example 1 month,
3 months, 1 year, etc).
- S011: total number of persons (of any age) for whom you have data in the database
over all years.
- S012: total number of subjects <19 years old for whom you have data in the database
over all years.
- S013: total number of subjects <19 years old for whom you have data in the database
and who are active (registered and not departed) in the database (and in the age range
0-18) at January 1, 2010
Section 04: Geographical coverage of your database.
- S014: Please select Yes if the population in your database has the same age AND gender
distribution as the population of your country.
-S015: List the provinces or regions for which there are data entered for at least one
subject in the database. Please specify if you are listing Provinces or Regions.
Section 05: Demographic data collected
- S016: if date of birth is recorded in your database, please specify in which format (e.g.
some databases only have month/year for privacy reasons whereas others may have exact
dates)
- S020: check Yes if children can be linked to their biological mother in the database.
Section 06: Clinical data collected
- S021: check Yes if the database records the reason why the person is
accessing care (for either inpatients or outpatients).
- S022: check Yes if the database records data on diagnoses, and specify in
which format by checking all the applicable options. If the format you use is
not in the list, please specify it.
- S023: Check Yes if in the database data are recorded on clinical or laboratory
measurements (e.g. blood pressure, temperature, blood test results, urine culture, etc.).
Section 07: Data collected on drugs
In this section we aim to collect information on the level of detail that is available in the
database on drug prescriptions. Please answer questions S024-S032 considering the
availability of original records on drugs in the database, not what could be derived through
analysis/algorithms.
-S024: check Yes if the database captures the names of the drugs prescribed,
either as commercial name (e.g. Bactroban) or generic name (e.g. mupirocin).
-S025: Please specify if the database records a unique product code (either
based on commercial product or active compound) for each drug and which one
(choose all that apply).
-S026: check Yes if the database captures the reason for prescribing the drug, as
stated by the physician, and specify in which format the reason is coded.
-S027: check Yes if database captures the total amount of units prescribed by
the physician. For example a bottle of syrup: the total amount of ml in the
bottle, for tablets the total number of tablets, for parenteral drugs: the number
of vials.
- S028: check Yes if database captures information on the frequency of dosing
e.g. twice a day, once a day, three times/week, etc.
- S029: check Yes if database captures information on the prescribed duration
of treatment (number of days).
- S030: check Yes if database captures information on the drug formulation for
each drug prescribed (e.g. syrup, tablets, capsules, suppositories)
-S031: check Yes if database captures information on the drug strength or
concentration for drug prescribed (e.g. paracetamol: 1000mg tablets or 500mg
tablets or injectable 5mg/ml).
- S032 Check Yes if database captures information on the route of
administration or the drug (intra venous, per os, etc).
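As a minimal sketch of how the answers to S024-S032 might be tabulated once surveys are returned, the Python snippet below stores one Yes/No answer per item and lists the drug data elements a database does not capture. The item labels paraphrase the instructions above; the class name, method name, and example answers are hypothetical and are not part of the GRIP survey tooling.

```python
from dataclasses import dataclass, field

# Short labels paraphrasing survey items S024-S032 (Section 07).
DRUG_ITEMS = {
    "S024": "drug name",
    "S025": "unique product code",
    "S026": "reason for prescribing",
    "S027": "total amount prescribed",
    "S028": "dosing frequency",
    "S029": "prescribed duration",
    "S030": "formulation",
    "S031": "strength or concentration",
    "S032": "route of administration",
}


@dataclass
class DrugDataAnswers:
    """Hypothetical container for one database's Section 07 answers."""
    database: str
    # Maps item code to True (Yes) or False (No); unanswered items are absent.
    answers: dict = field(default_factory=dict)

    def missing_items(self):
        """Return labels of items answered No or left unanswered."""
        return [label for code, label in DRUG_ITEMS.items()
                if not self.answers.get(code, False)]


# Example answers are invented for illustration only.
example = DrugDataAnswers(
    database="Example database",
    answers={code: code not in ("S026", "S029") for code in DRUG_ITEMS},
)
print(example.missing_items())
```

Treating an unanswered item the same as No is an assumption of this sketch; a real tabulation might prefer to track missing answers separately.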
Section 08: Vaccines
- S033: Check Yes if the database captures data on immunizations.
- S034: Check Yes if the database captures data on national programmes of routine
paediatric immunizations (e.g. DTP); if yes, specify which from the list (check
all applicable).
- S035: Check Yes if the database captures data on specific additional (non-routine
but special) immunizations for the paediatric population; if yes, specify
which from the list (check all applicable).
Section 09: Data on referrals
- S038-S043: If your database captures data on referrals to specialists, to the
emergency room, or for hospital admission, check Yes for each that applies. For
each type of referral captured, specify whether the results of the referral are also collected.
- S044: If your database captures the discharge diagnosis from hospitals, please
specify in which format.
Section 10: Possibilities to obtain further data
In this section we aim to explore possibilities for future studies which would involve additional
data on the patients whose other data are already in your database.
Please check Yes for any of S045, S046, S047, if you think it would in principle be feasible to
arrange specific studies to collect this additional information.
Section 11: Regulations and charges for data access
In this section we aim to collect data on the ethical and governance procedures that may be in
place to determine which types of projects can be conducted.
In addition, we would like to ask whether you could conduct studies paid for by industry and
whether data and analyses may be audited (under governance principles).
Section 12: Publications.
If there are more than 5 publications with data from your database in the last 5 years, please
choose the 5 most relevant, giving preference to paediatric studies. If there are no publications
yet with data from your database, please write None.
Thank you for completing the survey.
Appendix 3: Cover letter
Dear colleague,
We would like to ask for your help in completing a short survey on automated healthcare databases that you may hold or be aware of in your country. We are specifically looking for databases that could be useful for conducting post-marketing studies on the use and effects (positive and adverse) of drugs (including vaccines) in children.
This request comes from the Global Research in Paediatrics (GRIP) Network of Excellence, a project funded by the European Commission (FP7). GRIP aims to implement a global infrastructure to stimulate and facilitate the development and safe use of medicines in children. The Network of Excellence is a consortium of 21 participant organizations across the world, including WHO, NIH, and the European Medicines Agency. It is coordinated by Prof. Dr. C. Giaquinto from Padova, Italy.
Why do we ask your help now?
We are all aware that evidence about the effects of drugs in children is sparse, since few prelicensure studies are conducted in children. This has led to regulatory changes, which may improve the situation for new drugs in the long term. In GRIP we are convinced, however, that we can collect evidence on the use and effects of drugs in children much faster and on a wider scale by using existing healthcare data. Millions of children are treated every day, and their data could provide information on usage patterns as well as on safety and, eventually, effectiveness. To allow for this, GRIP aims to build a platform for global studies of drug effects in paediatrics. This survey is the first step in assessing which systems and healthcare databases are available in each country. We are specifically looking for databases containing:
- Electronic person-level drug prescription/dispensing information (e.g. pharmacy claims data, primary care prescription databases)
- Electronic person-level immunization information (e.g. vaccine registries, primary care/immunization clinical databases) for routine childhood vaccines
- Electronic person-level disease diagnoses (e.g. primary care medical record databases, claims databases, hospital databases)
You can access the survey by registering at the following link:
http://survey.grip-network.org/index.php?sid=51458&newtest=Y&lang=en
You are kindly invited to complete a separate registration for each different database you can provide information about (in case you have access to more than one database).
What may be offered in the future?
The healthcare databases in the inventory could become part of a global federation (platform) of databases for paediatric pharmacoepidemiological studies. Based on the data available, local researchers could be invited to participate in harmonization and proof-of-concept studies as third parties to the GRIP Network of Excellence. We appreciate your support and would be happy to answer any questions you may have. Please contact:
Jan Bonhoeffer /Yulin Li
Brighton Collaboration Foundation
Basel, Switzerland
Miriam Sturkenboom / Osemeke Osokogu
Erasmus University Medical Center
Rotterdam, The Netherlands
Appendix 4a: Databases invited to complete the survey (n=99)
| Continent | Country | Name of database |
|---|---|---|
| Asia-Pacific | Australia (2) | Vaccine Assessment using Linked Data (VALiD) |
| | | General Practice Research Network (GPRN) |
| | China (3) | China Health and Nutrition Survey (CHNS) |
| | | Shanghai Food and Drug Administration (FDA) Hospital Medical Record Database |
| | | immunization registry |
| | Japan (2) | IMS LifeLink Longitudinal Rx (LRx) Database: Japan |
| | | National Claims Database |
| | Laos (1) | immunization information system |
| | New Zealand (1) | New Zealand's Mortality Database (MORT) (New Zealand) |
| | Sri Lanka (1) | Birth and Immunization Register |
| Europe | Belgium (2) | IMS Lifelink: Belgium Hospital Disease Database |
| | | CSD Longitudinal Patient Database: Belgium |
| | Denmark (5) | Nation-wide Danish immunization registry |
| | | Danish National Patient Registry (NPR) |
| | | Aarhus University Prescription Database |
| | | Danish Fertility Database |
| | | Odense University Pharmacoepidemiologic Database (OPED - Denmark) |
| | Estonia (1) | Estonian Health Insurance Fund (EHIF) Prescription Database |
| | Finland (1) | THL sentinel hospital databases |
| | France (9) | Pharmacoepidemiology |
| | | Evaluation chez la Femme Enceinte des MEdicaments et de leurs RISque (EFEMERIS) |
| | | THALES |
| | | CSD Longitudinal Patient Database: France |
| | | Enqu |
| | | IMS LifeLink Electronic Medical Records (EMR) Database - France [aka LifeLink EMR-EU - France] |
| | | European Database for Multiple Sclerosis (EDMUS) |
| | | French Communicable Diseases Computer Network (FCDN - The Sentinel Network) |
| | | Claims databases from the French Rhone-Alpes Region |
| | Germany (3) | CSD Longitudinal Patient Database: Germany |
| | | IMS LifeLink Electronic Medical Records (EMR) Database - Germany [aka LifeLink EMR-EU - Germany] |
| | | The German Pharmacoepidemiological Research Database (GePaRD - BIPS database) |
| | Iceland (2) | The Pharmaceuticals Database (PDB) |
| | | National Patient Registry (NPR) |
| | Ireland (1) | Irish Health Services Executive (HSE) Primary Care Reimbursement Services (PCRS) pharmacy database |
| | Italy (12) | Lombardy database |
| | | Regional Agency of Healthcare services of Abruzzi |
| | | Regione Emilia Romagna |
| | | OSSIFF |
| | | Regione Toscana |
| | | Marche database |
| | | Friuli Venezia Giulia |
| | | CSD Longitudinal Patient Database: Italy |
| | | Gruppo Italiano di Farmacovigilanza nell'Anziano - Italian Group of Pharmacoepidemiology in the Elderly (GIFA) |
| | | PEDIANET (Italy) |
| | | Panor@mica GP database |
| | | Farmaceutica database |
| | Netherlands (8) | Dutch Foundation for Pharmaceutical Statistics (SFK) |
| | | Praeventis Immunization registry |
| | | PHARMO Record Linkage System |
| | | InterAction Database (IADB) |
| | | AGIS Health Database (AHD - Netherlands) |
| | | Integrated Primary Care Information Database (IPCI) |
| | | IMS LifeLink Longitudinal Rx (LRx) Database: Netherlands |
| | | LINH Database (The Netherlands Information Network of General Practice) (Netherlands) |
| | Norway (2) | Norwegian hospital databases |
| | | Norwegian Prescription Database (NORPD) |
| | Scotland (1) | Tayside Medicines Monitoring Unit (MEMO) (UK) |
| | Spain (2) | CSD Longitudinal Patient Database: Spain |
| | | BIFAP (Database for Pharmacoepidemiological Research in Primary Care/Base de datos para la Investigaci |
| | Sweden (4) | Swedish Prescribed Drug Register |
| | | Swedish National Patient Register (Sweden) |
| | | Dental Health Register of Sweden |
| | | Swedish Medical Birth Register |
| | UK (6) | IMS Oncology Analyzer (United Kingdom) |
| | | The Primary Care Clinical Informatics Unit (PCCIU) Database (UK) |
| | | IMS LifeLink Electronic Medical Records (EMR) Database - UK [aka LifeLink EMR-EU - UK] |
| | | QRESEARCH |
| | | General Practice Research Database (GPRD) (UK) |
| | | The Health Improvement Network (THIN) |
| North-America | Canada (7) | Quebec Pregnancy Registry |
| | | Manitoba Population Health Research Data Repository: Bone Mineral Density (BMD) Database |
| | | Canadian Cancer Registry (CCR) |
| | | IMS LifeLink Longitudinal Rx (LRx) Database: Canada |
| | | Saskatchewan Health, Multiple Linkable Population Databases |
| | | MedEcho |
| | | Longitudinal database |
| | US (23) | MEDSTAT |
| | | Health care cost and utilization project nationwide inpatient sample |
| | | the Veterans Affairs (VA) database |
| | | National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) (USA) |
| | | i3 Invision Data Mart (formerly LabRx) (USA) |
| | | Slone Epidemiology Unit Case Control Surveillance Study (USA) |
| | | Analyticare Long Term Care (LTC) Database (USA) |
| | | IMS LifeLink Longitudinal Rx (LRx) Database: USA |
| | | IMS LifeLink Health Plan Claims Database (formerly PharMetrics Patient-Centric Database) (USA) |
| | | Medi-Cal Paid Claims File (USA) |
| | | Systematic Assessment of Geriatric drug use via Epidemiology (SAGE) Database (USA) |
| | | MediGuard (formerly iGuard) (USA) |
| | | Vaccine Safety Data Link (VSD) (USA) |
| | | Premier Perspective |
| | | Rochester Epidemiology Project (REP) (Mayo Clinic) (USA) |
| | | Geisinger Health Care System (USA) |
| | | IntrinsiQ Database (USA) |
| | | Regenstrief Medical Record System (RMRS) (USA) |
| | | HealthCore Integrated Research Database (HIRD) (USA) |
| | | United States Renal Data System (USRDS) (USA) |
| | | MarketScan Commercial Claims and Encounters (USA) |
| | | Pharmaceutical Assistance Contract for the Elderly (PACE) (USA) |
| | | The Cardiovascular Health Study (CHS) (USA) |
Appendix 4b: Personal contacts invited after networking at meetings (n=26, plus 2 with the
same contact persons)
| Continent | Country | Name of automated health care database | Conference/Meeting |
|---|---|---|---|
| North-America | US | MarketScan Commercial Claims and Encounters (USA) | Mid-Year Meeting ISPE |
| | | MarketScan Medicaid Database (USA) | Mid-Year Meeting ISPE |
| | | MarketScan Medicare Supplemental and COB Database (USA) | Mid-Year Meeting ISPE |
| | | OPTUM Insight | Mid-Year Meeting ISPE |
| | Canada | Quebec Health Insurance Agency (presumably also RAMQ) | PHAC |
| | | Immunization registry (linkable) | PHAC |
| | | | PHAC |
| | | Manitoba Immunization registry | PHAC |
| | | Ottawa Hospital Research Institute | PHAC |
| | | Ontario Public Health | PHAC |
| Europe | Belgium | Insurance database | ECDC |
| | Austria | | ECDC |
| | Bulgaria | | ECDC |
| | Cyprus | | ECDC |
| | Czech Republic | | ECDC |
| | Estonia | | ECDC |
| | Hungary | | ECDC |
| | Iceland | | ECDC |
| | Ireland | | ECDC |
| | Lithuania | | ECDC |
| | Poland | | ECDC |
| | Portugal | | ECDC |
| | Slovakia | | ECDC |
| | Slovenia | | ECDC |
| | Sweden | | ECDC |
| Asia-Pacific | South Korea | Immunization data | PHAC |
| | China, Republic of (a.k.a. Taiwan) | | PHAC |
| | Thailand | Immunization data | PHAC |