Creating High Performance Spatial Databases with SQL Server 2008 Alastair Aitchison.

Post on 12-Jan-2016

216 views 4 download

transcript

Creating High Performance Spatial Databases with SQL Server 2008

Alastair Aitchison

About Me

• Consultant, Trainer, Author, and Housedad

Session Plan

• Do you need geometry and geography?• Constructing a spatial index (the theory)• Filtering spatial query results (the practice)• Optimising spatial queries

The “Two-Column” Model

CREATE TABLE Customers ( Name varchar(32), Address varchar(255), Lat float, Long float);

Point-in-Polygon

SELECT * FROM CustomersWHERE Lat BETWEEN LtMin AND LtMaxAND Long BETWEEN LnMin AND LnMax

(LtMin, LnMin)

(LtMax, LnMax)

Calculating Distance

SELECT 3963.0 * ACOS( SIN(Lat1) * SIN(Lat2) + COS(Lat1) * COS(Lat2) * COS(Lon2 - Lon1))

(Lat1, Lon1)

(Lat2, Lon2)

“Two-Column” Model Limitations

• Only stores points• Calculations on flat plane or perfect sphere• Limited range of methods

SQL Server 2008

• Points, Linestrings, Polygons• Accurate calculations– Ellipsoid model (geography)– Flat plane (geometry)

• Full complement of spatial methods – Intersects, Contains, Crosses, Touches– Distance, Length, Area– DE-9IM

Do You Need geometry / geography?

• Not all “spatial” apps need spatial datatypes• Example: Store locator

“two column” geometry / geographySimple ComplexApproximate AccurateFAST! SLOW!

Session Plan

• Do you need geometry and geography?• Constructing a spatial index (the theory)• Filtering spatial query results (the practice)• Optimising spatial queries

Querying geometry and geography

• SELECT * WHERE A.STIntersects(B) = 1• Primary Filter (Based on Index)– Approximate– Fast– Superset of actual results

• Secondary Filter (Based on Table)– Refine results of primary filter– Accurate

Assigning Order to Spatial Data

• B-Tree indexing for linearly ordered data– decimal, float, money etc. – numeric order– char, varchar, nvarchar etc. – alphabetic order– datetime, date, time etc. – chronological order

• How do we assign order to spatial data?

The Multi-Level Grid

From Grid to Index

• Covered, partially covered, or touched cells• Maximise accuracy - Minimise index size• Three Rules– Covering Rule– Deepest-Cell Rule– Cells Per Object Rule

Covering Rule

“If a grid cell is completely covered by a geometry, don’t further subdivide that cell.”

Deepest-Cell Rule

“Once a cell has been subdivided, only store the intersecting cell(s) at the deepest grid level.”

Cells Per Object Rule

“If subdividing a cell would exceed the maximum allowed number of cells for each object, do not subdivide the cell.”

Session Plan

• Do you need geometry and geography?• Constructing a spatial index (the theory)• Filtering spatial query results (the practice)• Optimising spatial queries

Creating a Spatial Index I

CREATE TABLE Grid ( id char(1), shape geometry, CONSTRAINT [idxGridCluster] PRIMARY KEY CLUSTERED ( id ASC ));

Add Some Points To The Table

INSERT INTO Grid VALUES ('A', geometry::Point(0.5, 2.5, 0)),('B', geometry::Point(2.5, 1.5, 0)),('C', geometry::Point(3.25, 0.75, 0)),('D', geometry::Point(3.75, 2.75, 0));

Creating a Spatial Index II

CREATE SPATIAL INDEX idxGrid ON Grid(shape)USING GEOMETRY_GRID WITH ( BOUNDING_BOX = (0, 0, 4096, 4096), GRIDS = (

LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),

CELLS_PER_OBJECT = 16);

-- Each L1 cell is 512 x 512-- Each L2 cell is 64 x 64-- Each L3 cell is 8 x 8-- Each L4 cell is 1 x 1

Grid Level 4

A

B

C

D

Finding Intersecting Points

DECLARE @Polygon geometry = 'POLYGON ((1.5 0.5, 3.5 0.5, 3.5 2.5, 1.5

2.5, 1.5 0.5)) ';

SELECT *FROM GridWHERE shape.STIntersects(@Polygon) = 1;

Execution Plan With Spatial Index

sp_help_spatial_geometry_index

EXEC sp_help_spatial_geometry_index @tabname = Grid, @indexname = idxGrid, @verboseoutput = 1, @query_sample = 'POLYGON ((1.5 0.5, 3.5

0.5, 3.5 2.5, 1.5 2.5, 1.5 0.5))';

Number_Of_ObjectCells_In_Level4_In_Index

A

B

D

C

4

Number_Of_ObjectCells_In_Level4_For_QuerySample 9

Compare the Grid Cells

A

B

C

D

Percentage_Of_Rows_ NotSelected_By_Primary_Filter 25%

A D

B

C

Number_Of_Rows_Selected_By_Primary_Filter

A

3

D

B

C

Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter

A D

B

C

33

Number_Of_Times_Secondary_Filter_Is_Called

A D

B

C

2

Number_Of_Rows_Output

A

B

D

C

2

Primary_Filter_Efficiency

A D

66

B

C

Internal_Filter_Efficiency

A D

50

B

C

Session Plan

• Do you need geometry and geography?• Constructing a spatial index (the theory)• Filtering spatial query results (the practice)• Optimising spatial queries

Making Sure the Index Is Used

• Use a Supported Method– STIntersects()– STContains(), STWithin(), STTouches()– STDistance()– Filter()

• Syntax must be A.STIntersects(B) = 1• Upgrade to SP1• Use a HINT where necessary

Making the Index Effective

• Three possible outcomes:– Preselection (Internal Filter)– Discarding (Primary Filter)– Secondary Filter

• Adjust Index Settings to fit data in the column and typical query samples

Improving Performance

• Make bounding box as tight as possible• Grid Resolution ↑ ... Cells Per Object ↑• Multiple Indexes (may need HINT)• Use non-spatial predicates• Reduce unnecessary detail• Experiment!

Want To Know More?

Beginning Spatial with SQL Server 2008

MSDN Spatial Forum http://social.msdn.microsoft.com/ Forums/en-US/sqlspatial/threads

alastair@beginningspatial.com

A Practical Demonstration

• Geonames export• 6.9 million points• Search for those in Newport• Without Index: ~100 rows. 12,391,230 secs• With Index: ~100 rows. < 1 sec