ETIS+: European Transport Policy Information System - Development and Implementation of Data Collection Methodology for EU Transport Modelling
Funded by the European Commission through the 7th Framework Programme for Research
http://www.etisplus.eu/
ETISplus Final Workshop
Validation
Brussels 5-11-2012
Daniela Carvalho (TIS)
Consortium:
DEMIS
ISIS
ITC
KIT
MKMetric
NEA (Coord)
NESTEAR
NTU
NTUA
OBET
STRATA
STRATEC
TETRAPLAN
TIS
TML
TNO
TRT
UNIZA
Validation in ETISplus
Concept and reasoning
Validation dimensions
Internal Validation Exercise
Learnings and Recommendations
Concept and reasoning
• Transport modelling is essential to policy assessment and development; it requires good-quality input data to support models, evaluation methodologies and indicator frameworks
• Information integration and consistent data sources minimise the risks and problems of data interpretation arising from the heterogeneous methodologies associated with the available data sources
• ETISplus, by targeting the data needs of the European Commission’s reference transport models, aims to create a central resource of policy-oriented datasets which will be of immediate use to policy analysts
• The data gathered in this project, as well as the data to be gathered in later updates of the ETIS database, need to be checked for quality and consistency in a structured manner
• WP10 was fully dedicated to validation
Validation dimensions
• Validation in ETISplus had a twofold objective:
- to assure that the data fulfil the needs of the transport models used to support transport policy measures and actions (external perspective)
- to present consistent data, free from obvious mistakes or gaps, so that the database is reliable and ready to be used (internal perspective)
[Diagram: Validation]
External: EU Models (workshop, March 2012); National Experts (survey). Suggestions were fed back into the databases.
Internal: Supporting data; Freight; Passenger. Corrections to implement.
• Focus of External Validation with EU Transport Models:
- How ETIS+ fulfils the current needs of ongoing models and policy monitoring
- Ascertain whether the content of ETIS+ suits the data needs of major European modellers and policy makers
- Identify possible gaps or missing indicators
Models: Trans-Tools V3, ASTRA, TREMOVE, POLES, CGEurope, SASI, LOGIS, RHOMOLO, VACLAV, VIA, NEAC
Compilation and documentation of the conclusions and recommendations = inputs for database developers
• Focus of External Validation with National Experts:
- Evaluation of the outputs generated, focusing in particular on each expert's own country data: accuracy / plausibility / coherence / comprehensiveness
- Mostly through graphical visualization (ETIS-VIEW tool)
General data
1) GDP per Capita 2005
2) GDP per Capita 2010
3) Motorization Rate Freight 2005
4) Motorization Rate Freight 2010
5) Motorization Rate Passenger 2005
6) Motorization Rate Passenger 2010
7) Freight Performances in Ton-Km car and rail
8) Passenger-kilometers car and rail
Airport data
9) Air passenger accessibility 2010
10) Air freight accessibility 2010
11) Airport freight 2005
12) Airport freight 2010
13) Airport passenger 2005
14) Airport passenger 2010
Rail data
15) Rail freight assignment 2005
16) Rail Freight Net Max. Speed 2005
17) Rail Passenger Net Max. Speed 2005
18) Passenger trips by rail
Road data
19) Road freight assignment 2005
20) Passenger trips by car
Waterborne data
21) Sea Port Freight 2005
22) Sea Port Passenger 2005
23) Inland ports 2010
24) IWW assignment 2005
Internal Validation Exercise
• Data Screening => structural checks implemented throughout the development of the database software platform: input data errors, measurement errors, sampling errors & transfer errors, specification errors, aggregation errors
Screening flow:
1. Right file format? No: fail message. Yes: continue.
2. Structural check: right structure? No: fail message. Yes: continue.
3. Data check: right data format and unit? No: fail message. Yes: end of data screening.
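The screening flow above can be sketched as a chain of checks that stops at the first failure. This is only an illustration, not the ETISplus platform code: the CSV file format, the column names (`region`, `year`, `value`), the `screen` function and the rule that a value must be a non-negative number (standing in for the unit check) are all assumptions made for the example.

```python
import csv
import io

# Hypothetical expectations; the real platform's formats are not given in the slides.
EXPECTED_COLUMNS = ["region", "year", "value"]


def screen(filename: str, content: str) -> str:
    """Run the three screening steps in order and stop at the first failure."""
    # Step 1: right file format?
    if not filename.lower().endswith(".csv"):
        return "Fail: wrong file format"

    rows = list(csv.reader(io.StringIO(content)))

    # Step 2 (structural check): right header and same number of fields per row?
    if not rows or rows[0] != EXPECTED_COLUMNS:
        return "Fail: wrong structure"
    if any(len(row) != len(EXPECTED_COLUMNS) for row in rows[1:]):
        return "Fail: wrong structure"

    # Step 3 (data check): right data format and unit?
    # Assumed rule: non-empty region, positive integer year, non-negative numeric value.
    for region, year, value in rows[1:]:
        try:
            ok = bool(region) and int(year) > 0 and float(value) >= 0
        except ValueError:
            ok = False
        if not ok:
            return "Fail: wrong data format or unit"

    return "End of data screening"
```

Returning at the first failed step mirrors the "No: fail message" branches of the flow; a file only reaches the data check once its format and structure have passed.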
Internal Validation Exercise
• Data Evaluation
- 3 main issues: consistency of data, consistency over time, and data continuity check (checking of graphic attributes)
- Continuous interaction between “validators” and “developers”
Passenger Demand
• Road, rail, air transport
• Public Transport
• Coach transport
• Short distance Transport
Freight-Logistics
• International Freight Flows
• National Freight Flows
• Logistic Data: transport logistics and distribution / inventory, containers
Supporting Data
• Socio-economic
• Networks (road, rail, air, inland, maritime)
• Level of Service
• Transport External Impacts
Aggregated checks, Reasonableness Checks (NTUA, NESTEAR, TRT)
Socioeconomic datasets: Techniques used
Missing values: Missing value analysis between EU27 and other countries.
Total sum of all subsets: The sum of all subsets should reasonably be equal to the total of the set. E.g. the sum of males and females in a region should equal the total population of the region (males + females = total), and the sum of sub-classes (e.g. population by gender and age group) should reasonably equal the total population.
Identification of extreme values: Certain parameter values were analysed statistically, using descriptive statistics, to identify extreme (erroneous or suspicious) values in the associated datasets:
• Min, 5% percentile, average, 95% percentile, max values
• Number of cases in the range [0-2%)
• Number of cases exceeding the 2% threshold, etc.
Comparison with alternative sources: Parameter values are compared with the corresponding values from alternative sources.
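Two of the techniques above, the subset-sum test and the percentile-based search for extreme values, can be sketched as follows. This is an illustration under assumptions, not the procedure actually used: the record layout `(region, males, females, total)`, the 1% relative tolerance and the rule "flag anything outside the 5%-95% band" are choices made for the example.

```python
import statistics


def subset_sum_mismatches(records, tolerance=0.01):
    """Regions where males + females differs from the total by more than `tolerance` (relative)."""
    flagged = []
    for region, males, females, total in records:
        if total == 0 or abs(males + females - total) / total > tolerance:
            flagged.append(region)
    return flagged


def describe(values):
    """Min, 5% percentile, average, 95% percentile and max of a list of values."""
    # quantiles(n=20) returns 19 cut points: the first is the 5% percentile,
    # the last is the 95% percentile.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return {
        "min": min(values),
        "p05": cuts[0],
        "mean": statistics.fmean(values),
        "p95": cuts[-1],
        "max": max(values),
    }


def extreme_values(values):
    """Values outside the 5%-95% percentile band, to be reviewed as suspicious."""
    stats = describe(values)
    return [v for v in values if v < stats["p05"] or v > stats["p95"]]
```

A flagged value is only a candidate for review; whether it is erroneous or merely unusual still has to be decided against the source data.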
Socioeconomic dataset: Parameters and tests performed
Excel files: one for each parameter, containing the detailed analysis and results (all erroneous and suspicious values)
Corrections incorporated by developers
Networks: Techniques used
Test #1: Check for unconnected links of the network
Logical Rule: Identification of network links that are not connected.
Technique: ArcGIS topology rule “Must Not Have Dangles”
Test #2: Network end points in the sea
Logical Rule: Identification of the terminal points of the network that lie outside the land region, indicating whether Bing Maps shows each point on land or in the sea.
Technique: ArcGIS topology rule “Must Be Properly Inside”
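The idea behind Test #1 can be shown without a GIS: an end point that belongs to only one link is a dangle. The sketch below is not the ArcGIS implementation; it assumes that links are given as pairs of node identifiers that are already snapped, so that two links meeting at the same place share the same identifier. The function name `find_dangles` is hypothetical.

```python
from collections import Counter


def find_dangles(links):
    """Return end points that belong to exactly one link (dangling ends)."""
    degree = Counter()
    for start, end in links:
        degree[start] += 1
        degree[end] += 1
    return sorted(node for node, count in degree.items() if count == 1)
```

Legitimate network ends (for example a port or a dead-end road) are reported as well, so each result still needs a manual review.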
Networks: Techniques used
Testing the road network in terms of path selection and travel time, Greece-Germany (interviews with truck drivers)
Networks: Techniques used
Test for network continuity: minimum road paths crossing border points & sea connections through ports
Impedances: Techniques used
Air Impedances
Test: Calculation of access and flight speed
Techniques: a) check that Access Time + Flight Time = Total Time; b) frequency distribution of access and flight speed
Road Impedances
Test: Calculation of distances between regions (centroids)
1. Calculation of the difference between the distance travelling from region A to B and the distance from B to A.
2. Identification of short distances between centroids (regions with road distances of less than 20 km)
Rail Impedances
Test: Calculation of distances between regions (centroids)
1. (as Road Impedances)
2. Identification of regions that have no rail impedances although a rail network exists (regions with missing rail freight impedances, while a rail network seems to exist)
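The air and road impedance tests above reduce to simple arithmetic checks over origin-destination (OD) records. The following sketch assumes a data layout that the slides do not specify: air records as `(od, access_time, flight_time, total_time)` tuples and distances as a dictionary keyed by `(origin, destination)`. The asymmetry threshold `max_diff_km` is a hypothetical parameter; only the 20 km limit comes from the slides.

```python
def time_mismatches(records, tolerance=1e-6):
    """Air impedances: OD pairs where access time + flight time != total time."""
    return [
        od for od, access, flight, total in records
        if abs(access + flight - total) > tolerance
    ]


def asymmetric_pairs(distance, max_diff_km):
    """OD pairs whose A->B and B->A distances differ by more than `max_diff_km`."""
    flagged = []
    for (a, b), d_ab in distance.items():
        d_ba = distance.get((b, a))
        # `a < b` reports each unordered pair only once.
        if a < b and d_ba is not None and abs(d_ab - d_ba) > max_diff_km:
            flagged.append((a, b))
    return sorted(flagged)


def short_pairs(distance, min_km=20.0):
    """OD pairs between different regions that are closer than `min_km`."""
    return sorted(od for od, d in distance.items() if od[0] != od[1] and d < min_km)
```

Some asymmetry is expected on real road networks (one-way sections, ferry links), so the threshold should be chosen to flag only implausible differences.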
Validation exercise
Validation by comparing ETISplus tables with each other, checking logical consistency:
• e.g. ∑ loaded goods of all regions = ∑ unloaded goods of all regions
• tables “Air Freight by OD”, “International Road Freight”, “International Rail Freight”, “IWW Freight” and “International Pipeline Freight” were compared with table “EU Detailed Trade (Comext Transport)” at national level, differentiated by mode of transport
Validation by comparing ETISplus tables, and indicators calculated from them, with other (reference) data sources:
• UNECE Transport Statistics
• Sitram
• TRM statistics*
• CAFT
• Port Authority: statistics of “throughput annual”
• WTO statistics
* TRM: the statistics come from the French part of the European survey on road freight transport. In France, it covers transport carried out by trucks over 3.5 tons GVW registered in France and used for hire or reward or on own account
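The logical consistency rule ∑ loaded goods = ∑ unloaded goods can be checked with a few lines of code. This is a sketch under assumptions: the two inputs are dictionaries of tonnes by region (a layout chosen for the example), and the relative tolerance of 0.1% is hypothetical, added so that rounding differences are not reported as errors.

```python
def totals_balance(loaded, unloaded, rel_tolerance=0.001):
    """True if total loaded and total unloaded goods agree within a relative tolerance."""
    total_loaded = sum(loaded.values())
    total_unloaded = sum(unloaded.values())
    reference = max(total_loaded, total_unloaded)
    if reference == 0:
        return True  # nothing loaded and nothing unloaded
    return abs(total_loaded - total_unloaded) / reference <= rel_tolerance
```

The check compares overall totals only: individual regions may load more than they unload, but across all regions the two sums should match.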
Validation exercise
Checks on trip distribution by distance and by purpose
Correlation with socio-economic data, e.g.
• Correlation between total generated trips and population
• Correlation between inter/intra-zonal trips and population
• Correlation between attracted BUSINESS trips and WORKING PLACES and GDP
• Correlation between attracted COMMUTING trips and WORKING PLACES
• Correlation between attracted PRIVATE trips and POPULATION
• Correlation between attracted VACATION trips and BED PLACES
Comparison with ETIF statistics in terms of PKM
[Figure: Work places (total employment, million) vs. total attracted trips by COMMUTING purpose (million); R² = 0.928]
[Figure: Population (million) vs. intra-regional trips (million); R² = 0.9536]
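An R² value of the kind reported for these correlations can be computed as the squared Pearson correlation between the two series. The sketch below assumes a simple linear relationship with an intercept, for which R² equals the squared correlation coefficient; the slides do not state how the charted values were obtained, and the function name `r_squared` is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return (sxy * sxy) / (sxx * syy)
```

In use, `x` would hold one value per region for the socio-economic variable (e.g. work places) and `y` the corresponding number of trips.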
Learnings and Recommendations
• Importance of dedicated validation processes performed by “independent” reviewers in addition to regular validation by developers
• Relevance of both internal and external views
External (EU) »» 3 workshops per mode (road, rail, maritime), 1 workshop with Eurostat, 1 workshop with EU models
- Importance of observed data (vs estimated data); detailed metadata on the source and quality of the data collected was stated as a relevant factor
- Importance of including additional information, especially concerning prices, taxes and tariffs »» database was shaped in line with expectations
- Provision of reference indicators that could be used for validation purposes
External (MS) »» allowing country experts to visualise their own results
- Identification of “new” gaps and uncertainties but particularly provision of new data
- Inputs being updated in ETIS+
Learnings and Recommendations
Internal »» strong interaction process between validators and developers
- Inconsistencies verified (data entry mistakes / wrong data) and corrected
- Findings of validation included in metadata
• The different validation processes and consistency checks undertaken will be documented in an ETISplus report, to be used as a guide for future exercises
• Highlight the importance of applying different layers of validation processes:
- Internal routines by database developers
- Internal checks performed by “independent” validators
- External validation by users of datasets
- External validation by users (policy)
• ETIS+ dedicated large efforts towards the requirement for good quality input data to support models
Thank you for your attention
Daniela [email protected]