OLAP for heterogeneous socio-economic data – the challenge of integration, analysis and crime prevention: a Czech case study.

Jiří HORÁK, Igor IVAN, Bronislava HORÁKOVÁ
VSB-Technical University of Ostrava
Intergraph CS Ltd.
Czech Republic

European Forum for Geography and Statistics 2015 Conference, Vienna, Austria, 10 – 12 November 2015

Big Spatial Data

• Features:
  – Volume beyond the limit of usual geo-processing,
  – Velocity higher than usual processes can handle,
  – Variety, combining more diverse geodata sources than usual.
• Traditional methods of geodata collection, storing, processing, controlling, analysing, modelling, validating and visualizing fail to provide effective solutions
• How to exploit big spatial data?


Multidimensional database and OLAP

• Part of business intelligence
• On-line analytical processing (OLAP) provides effective and intuitive access to consolidated data (harmonized and aggregated) stored in multidimensional data structures.
• OLAP operations:
  – Drill-down (moving down the hierarchy, towards more detail),
  – Roll-up (moving up the hierarchy, obtaining more aggregated data),
  – Drill-across (linking several fact tables with the same granularity),
  – Slice-and-dice (selecting subsets of the data),
  – Pivot (exchanging dimensions in the displayed view).
• Multidimensional database as a data warehouse: a subject-oriented, integrated, time-variant and non-volatile collection of data
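The OLAP operations named on this slide can be illustrated on a tiny in-memory cube. The following is only a sketch in plain Python: the fact rows, the dimension names (`town`, `year`, `month`, `crime_type`) and the helper `aggregate` are hypothetical and are not taken from the presented system.

```python
from collections import defaultdict

# Hypothetical fact rows: (town, year, month, crime_type, count).
FACTS = [
    ("OV", 2014, 1, "burglary", 5),
    ("OV", 2014, 2, "burglary", 3),
    ("OV", 2014, 1, "graffiti", 2),
    ("KO", 2014, 1, "burglary", 1),
    ("KO", 2014, 2, "graffiti", 4),
]
DIMS = ("town", "year", "month", "crime_type")


def aggregate(facts, by):
    """Sum the count measure grouped by the dimensions named in `by`."""
    idx = [DIMS.index(d) for d in by]
    out = defaultdict(int)
    for row in facts:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)


# Drill-down: town by month (more detail).
by_month = aggregate(FACTS, ("town", "year", "month"))
# Roll-up: one level up the time hierarchy, town by year.
by_year = aggregate(FACTS, ("town", "year"))
# Slice: fix one dimension value (crime_type == "burglary").
burglaries = [r for r in FACTS if r[3] == "burglary"]
burglary_by_town = aggregate(burglaries, ("town",))
# Pivot: the same data with the dimensions exchanged in the view.
type_by_town = aggregate(FACTS, ("crime_type", "town"))
```

Drill-across is not shown here because it needs a second fact table sharing the same keys.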


Fact tables and dimensions

• Dimensional modelling
• Elementary items in fact tables contain aggregated data (counts, sums etc.)
• Facts are organised according to dimensions (features)
• Dimensions usually contain a hierarchical structure
• Granularity – the level of detail of the facts
• Additivity – the possibility to summarize data along dimensions

http://www.code-magazine.com/focus/Article.aspx?QuickID=1103091
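A fact table keyed by dimension tables can be sketched with Python's built-in `sqlite3` module. The table and column names (`dim_date`, `fact_crime`, `cell_id`, `crime_count`) and all rows are assumptions made for illustration; the slides do not specify the actual schema. The example shows additivity: the same measure is summed at each level of a date hierarchy.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE fact_crime (
    date_id INTEGER REFERENCES dim_date(date_id),
    cell_id TEXT,           -- grid cell, a second dimension key
    crime_count INTEGER     -- additive measure
);
INSERT INTO dim_date VALUES
    (1, '2014-01-05', '2014-01', 2014),
    (2, '2014-01-20', '2014-01', 2014),
    (3, '2014-02-11', '2014-02', 2014);
INSERT INTO fact_crime VALUES
    (1, 'A', 2), (2, 'A', 1), (2, 'B', 4), (3, 'A', 3);
""")


def crimes_by(level):
    """Sum the additive measure at one level of the date hierarchy."""
    if level not in ("day", "month", "year"):  # whitelist: level is spliced into SQL
        raise ValueError(level)
    sql = (f"SELECT d.{level}, SUM(f.crime_count) FROM fact_crime f "
           f"JOIN dim_date d USING (date_id) GROUP BY d.{level} ORDER BY d.{level}")
    return con.execute(sql).fetchall()


per_day = crimes_by("day")      # finest granularity stored in the dimension
per_month = crimes_by("month")  # rolled up one level
per_year = crimes_by("year")    # rolled up two levels
```

Because the measure is additive, the daily totals sum to the yearly total regardless of the level at which the query groups.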


DWH & OLAP for social environment (crime, human factors)

Data sources:
• population data – grid 1 km, 100 m, Census 2011 (CZSO), municipal IS
• register of land identification, addresses and properties – buildings (NMCA)
• central crime register (Police CZ) – events
• offence register (city police) – local; a central one is planned
• register of schools (Min. of Education, Youth and Sports) – contact
• register of health service providers (Min. of Health) – contact, beds
• register of unemployed (Labour office)
• register of gambling machines (Min. of Finance)
• register of companies (CZSO, or others)


ETL processes for DWH & OLAP for social environment

ETL processes:
• Data differ in quality, formats, access, legal and ethical aspects (license policy, sensitivity), and maintenance
• Control procedures – integrity constraints, checking the validity of the time range and geographical range, referential integrity etc.
• Harmonisation – reference time of an event derived from a time interval, harmonisation of addresses, classification of facilities, buildings etc.
• Geocoding for missing or bad coordinates
• Aggregation – according to multidimensional modelling
• Data anonymization – filtering, scrambling, rounding, projection
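Three of the ETL steps listed on this slide can be sketched as small Python functions. The slides do not say how the reference time is derived from an interval, so the midpoint rule below is an assumption, as are the function names, the bounding-box check and the 100 m rounding cell.

```python
from datetime import datetime


def reference_time(start, end):
    """Harmonisation: represent an event interval by its midpoint (assumed rule)."""
    if end < start:
        raise ValueError("end precedes start")
    return start + (end - start) / 2


def in_bbox(x, y, bbox):
    """Control procedure: is the coordinate inside the expected geographical range?"""
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def round_to_grid(x, y, cell=100):
    """Anonymization by rounding: snap a point to the lower-left corner of its cell."""
    return (x // cell) * cell, (y // cell) * cell


# Hypothetical event reported as "between 22:00 and 06:00 the next day".
t = reference_time(datetime(2014, 3, 1, 22, 0), datetime(2014, 3, 2, 6, 0))
# Hypothetical coordinates in a metre-based projected system.
snapped = round_to_grid(1234.5, 987.0)
```

Rounding to the cell corner also prepares the record for aggregation, since every point in a cell ends up with the same coordinates.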


Structure

Fact tables:
• CRIME
• POPULATION
• UNEMPLOYED
• HEALTH
• BUILDING
• FACILITIES

Dimensional tables:
• DATE
• SQUARE
• ADMIN_UNITS
• AGE
• SEX
• and more


Dimensions and hierarchy

• Grid – 100 x 100 m (4th level of the scale system for communes and urban districts, Bacler), 500 m, 1 km, 5 km
• Administrative units – part of municipality, municipality, MEA, LAU1, NUTS3
• Temporal dimension – one-day unit, week, month, year
• Day-cycle hours – hour unit, morning time, rush hours
• Age – 5-year basic categories, 10-year, 20-year, “30 and more”
• Crime (& offences) – standard 3-level classification system
• Facilities – purpose and the hierarchical structure
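The grid sizes on this slide (100 m, 500 m, 1 km, 5 km) nest, so counts at a fine level can be rolled up to any coarser level. The sketch below assumes a metre-based projected coordinate system and identifies cells by column/row index; the function names and the sample points are hypothetical.

```python
from collections import Counter


def cell_id(x, y, size):
    """Identify the cell of a given size containing point (x, y) by column/row index."""
    return (int(x // size), int(y // size))


def parent(cell, size, parent_size):
    """Map a cell to the coarser cell containing it; the sizes must nest."""
    if parent_size % size:
        raise ValueError("cell sizes do not nest")
    k = parent_size // size
    return (cell[0] // k, cell[1] // k)


def roll_up(counts, size, parent_size):
    """Aggregate per-cell counts to a coarser grid level."""
    out = Counter()
    for cell, n in counts.items():
        out[parent(cell, size, parent_size)] += n
    return dict(out)


# Hypothetical event coordinates in metres.
points = [(120, 80), (130, 90), (480, 20), (900, 950), (1200, 40)]
c100 = Counter(cell_id(x, y, 100) for x, y in points)
c500 = roll_up(c100, 100, 500)
c1000 = roll_up(c500, 500, 1000)
```

Floor division keeps the nesting consistent for negative coordinates as well, which matters for coordinate systems whose values are negative.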


OLAP pivoting, selections, relationships

• Pivoting: place of commitment × residence of offenders
• Scatter plot, regression analysis: gambling machines × population


Data grid view


Number of burglaries per 100 flats (2014)
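An indicator such as this one combines two fact tables that share the same spatial key (a drill-across). The sketch below assumes per-cell dictionaries with hypothetical values and a hypothetical helper name; cells without flats are skipped to avoid division by zero.

```python
def rate_per_100(numerator, denominator):
    """Combine two facts sharing the same cell keys into a rate per 100 units."""
    rates = {}
    for cell, denom in denominator.items():
        if denom > 0:  # skip cells with no flats
            rates[cell] = 100.0 * numerator.get(cell, 0) / denom
    return rates


# Hypothetical per-cell values keyed by grid cell id.
burglaries = {"A": 3, "B": 1, "C": 2}
flats = {"A": 150, "B": 40, "C": 0, "D": 80}
rates = rate_per_100(burglaries, flats)
```

Cell "C" has burglaries but no flats, so it receives no rate; how such cells should be displayed is a design choice the slides do not address.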


# burglaries to dwellings, # residential buildings (2014)

3 towns:
• CB České Budějovice
• KO Kolín
• OV Ostrava

Differences:
• density of buildings
• density of burglaries
• dependencies


Number of gambling machines per 1 km²


Number of gambling clubs per 100 inhabitants


# sprayer crimes per 1 school (2014)


Classification tree for sprayer crimes

• Dependency – secondary schools + regions; no secondary schools + gambling machines + districts
• No dependency – population, buildings, basic schools, property offences


Thank you for your attention!

[email protected]


Data is provided by the courtesy of the Czech Statistical Office, Police of the Czech Republic, Czech Office for Surveying, Mapping and Cadastre, Czech Ministry of Finance, Labour offices, Czech Ministry of Health and Municipal Police departments in Ostrava, Kolín and České Budějovice.

The research is supported by the Czech Ministry of Interior, project “Geoinformatics as a tool to support integrated activities of safety and emergency units”, No. MV-32046-58/VZ-2012.

