PHP DataStructures – Beyond SPL (online version)

Date post: 02-Aug-2015
Upload: mark-baker

PHP DataStructures – Beyond SPL

A dreamscape made from random noise. Illustration: Google

DataStructures

A data structure is a particular way of organizing data in a computer so that it can be used efficiently.

Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks.

DataStructures in PHP
• Some basic DataStructures available in PHP’s SPL
  • Stack
  • Queue
  • Heap
  • Doubly-Linked List
  • Fixed Array
  • SPL Object Storage
• SPL is the Standard PHP Library
  • (Yet another recursive acronym)
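As a quick reminder of what the SPL structures listed above look like in use, here is a minimal sketch. It only uses classes that ship with PHP; the variable names and sample values are made up for illustration.

```php
<?php
// Stack: last in, first out
$stack = new SplStack();
$stack->push('first');
$stack->push('second');
$top = $stack->pop();

// Queue: first in, first out
$queue = new SplQueue();
$queue->enqueue('first');
$queue->enqueue('second');
$head = $queue->dequeue();

// Heap: SplMinHeap always extracts the smallest value
$heap = new SplMinHeap();
foreach ([5, 1, 3] as $n) {
    $heap->insert($n);
}
$smallest = $heap->extract();

// Doubly-linked list: cheap inserts and removals at both ends
$list = new SplDoublyLinkedList();
$list->push('b');
$list->unshift('a');
$firstInList = $list->shift();

// Fixed array: integer-indexed, size declared up front
$fixed = new SplFixedArray(3);
$fixed[0] = 'a';
$size = $fixed->getSize();

// Object storage: a map keyed by object identity
$storage = new SplObjectStorage();
$object = new stdClass();
$storage[$object] = 'metadata';
$hasObject = isset($storage[$object]);
```

Here `$top` is the last value pushed, `$head` the first value enqueued, and `$smallest` the minimum of the three inserted numbers.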

DataStructures
• Some additional DataStructures that don’t exist in core PHP
  • Tries
  • QuadTrees

Tries

Tries
• A Tree structure comprising a hierarchy of “indexed” nodes
• Each node can contain:
  • A series of pointers (keys) to the next node in the hierarchy
  • A bucket for data values
• This allows for multiple values with the same key
• There are three basic types of Tries:
  • Tries
  • Radix Tries
  • Suffix Tries
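The node described above (pointers to child nodes plus a bucket of values) could be modelled roughly as follows. This is a sketch only: `TrieNode` and its property names are hypothetical and not taken from any library.

```php
<?php
// Hypothetical node class, for illustration only.
final class TrieNode
{
    /** @var array<string, TrieNode> pointers to child nodes, indexed by key fragment */
    public array $children = [];

    /** @var array bucket of data values stored at this node */
    public array $values = [];
}

// Build the path for the key "ab" by hand, then store two values under it.
$root = new TrieNode();
$root->children['a'] = new TrieNode();
$root->children['a']->children['b'] = new TrieNode();

$leaf = $root->children['a']->children['b'];
$leaf->values[] = 'first value';
$leaf->values[] = 'second value';   // same key, second value, nothing overwritten
```

Because the bucket is a list, storing a second value under the same key simply appends to it.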

Tries – Purpose
• Fast lookup with a partial key
• Example implementation: https://github.com/MarkBaker/Tries

Tries – Uses
• Replacement for PHP Arrays (Hashmaps)
  • No key collisions
  • Duplicate Keys supported
  • No Hashing function required
• Partial Key Lookups
  • Predictive Text
  • Autocomplete
  • Spell-Checking
  • Hyphenation

Tries – Methods
• add($key, $value = null): adds new data to a Trie
• search($prefix): finds data in a Trie
• delete($key)
• isNode($key)
• isMember($key)

Tries – Basic Trie
• Node pointers comprise a single character or byte

Tries – Basic Trie
$trie = new \Trie();
$trie->add('cat', 'cat data');

[Diagram: C → A → T]

Tries – Basic Trie
$trie = new \Trie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');

[Diagram: C → A → (T, R)]

Tries – Basic Trie
$trie = new \Trie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
$trie->add('cart', 'cart data');

[Diagram: C → A → (T, R → T)]

Tries – Basic Trie
$trie = new \Trie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
$trie->add('cart', 'cart data');
$trie->search('car');

[Diagram: the search follows the path C → A → R, matching R and the T beneath it]

Tries – Basic Trie
• The key to a data node is inherent in the path to that node, so it is not necessary to store the key
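To make the basic Trie concrete, here is a minimal sketch of `add()` and a prefix `search()`. It is not the code from the example implementation linked earlier: `SimpleTrie` and its internals are hypothetical. Note that no key is stored anywhere; the search rebuilds each key from the path it walks.

```php
<?php
// Hypothetical sketch; class and method names are illustrative only.
final class SimpleTrie
{
    private array $root = ['children' => [], 'values' => []];

    /** Store $value under $key, creating one node per byte of the key. */
    public function add(string $key, $value = null): void
    {
        $node = &$this->root;
        for ($i = 0, $len = strlen($key); $i < $len; $i++) {
            $char = $key[$i];
            if (!isset($node['children'][$char])) {
                $node['children'][$char] = ['children' => [], 'values' => []];
            }
            $node = &$node['children'][$char];
        }
        $node['values'][] = $value;
    }

    /** Return key => values for every key that begins with $prefix. */
    public function search(string $prefix): array
    {
        $node = $this->root;
        for ($i = 0, $len = strlen($prefix); $i < $len; $i++) {
            if (!isset($node['children'][$prefix[$i]])) {
                return [];
            }
            $node = $node['children'][$prefix[$i]];
        }
        $found = [];
        $this->collect($node, $prefix, $found);
        return $found;
    }

    /** Walk the subtree, rebuilding each key from the path taken to reach it. */
    private function collect(array $node, string $path, array &$found): void
    {
        if ($node['values'] !== []) {
            $found[$path] = $node['values'];
        }
        foreach ($node['children'] as $char => $child) {
            $this->collect($child, $path . $char, $found);
        }
    }
}

$trie = new SimpleTrie();
$trie->add('cat', 'cat data');
$trie->add('car', 'car data');
$trie->add('cart', 'cart data');
$matches = $trie->search('car');   // entries for 'car' and 'cart'
```

Searching for a prefix that has no path in the Trie returns an empty array as soon as the walk fails, without visiting any other node.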

Tries – Radix Trie
• Node pointers comprise one or more characters or bytes
• This means they can be more compact and memory efficient than a basic Trie
• It can add more overhead to building the Trie
• It may be faster to search the Trie hierarchy

Tries – Radix Trie
$radixTrie = new \RadixTrie();
$radixTrie->add('cat', 'cat data');

[Diagram: CAT]

Tries – Radix Trie
$radixTrie = new \RadixTrie();
$radixTrie->add('cat', 'cat data');
$radixTrie->add('car', 'car data');

[Diagram: CA → (T, R)]

Tries – Radix Trie
$radixTrie = new \RadixTrie();
$radixTrie->add('cat', 'cat data');
$radixTrie->add('car', 'car data');
$radixTrie->add('cart', 'cart data');

[Diagram: CA → (T, R → T)]
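The extra build overhead of a Radix Trie comes from splitting a node whenever a new key shares only part of an existing label. The sketch below shows one way that insert could work; `RadixNode`, `commonPrefixLength()` and `radixAdd()` are hypothetical names, and lookup is left out to keep it short.

```php
<?php
// Hypothetical sketch; all names here are illustrative only.
final class RadixNode
{
    /** @var array<string, RadixNode> children indexed by a label of one or more bytes */
    public array $children = [];

    /** @var array bucket of data values */
    public array $values = [];
}

/** Number of leading bytes that $a and $b have in common. */
function commonPrefixLength(string $a, string $b): int
{
    $max = min(strlen($a), strlen($b));
    $i = 0;
    while ($i < $max && $a[$i] === $b[$i]) {
        $i++;
    }
    return $i;
}

function radixAdd(RadixNode $node, string $key, $value = null): void
{
    if ($key === '') {
        $node->values[] = $value;   // the whole key has been consumed
        return;
    }
    foreach ($node->children as $label => $child) {
        $label = (string) $label;   // PHP stores numeric string keys as integers
        $common = commonPrefixLength($label, $key);
        if ($common === 0) {
            continue;
        }
        if ($common < strlen($label)) {
            // Only part of the label matches: split it, e.g. "cat" becomes "ca" → "t"
            $middle = new RadixNode();
            $middle->children[substr($label, $common)] = $child;
            unset($node->children[$label]);
            $node->children[substr($label, 0, $common)] = $middle;
            $child = $middle;
        }
        radixAdd($child, substr($key, $common), $value);
        return;
    }
    // No child shares a prefix with the key: hang the whole remainder off this node
    $leaf = new RadixNode();
    $leaf->values[] = $value;
    $node->children[$key] = $leaf;
}

$root = new RadixNode();
radixAdd($root, 'cat', 'cat data');
radixAdd($root, 'car', 'car data');
radixAdd($root, 'cart', 'cart data');

$topLabels = array_keys($root->children);
$underCa = array_keys($root->children['ca']->children);
```

After the three inserts the root has a single child labelled `ca`, which in turn has children `t` and `r`, matching the shape shown on the Radix Trie slides.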

Tries – Suffix Trie
$suffixTrie = new \SuffixTrie();
$suffixTrie->add('cat', 'cat data');

[Diagram: C → A → T]

Tries – Suffix Trie
$suffixTrie = new \SuffixTrie();
$suffixTrie->add('cat', 'cat data');

[Diagram: one branch per suffix: C → A → T, A → T, and T]

Tries – Suffix Trie
$suffixTrie = new \SuffixTrie();
$suffixTrie->add('cat', 'cat data');
$suffixTrie->search('at');

[Diagram: the same three branches, with the search following A → T]

Tries – Suffix Tries
• Memory hungry
  • Up to n + (n-1) + (n-2) + … + 2 + 1 = n(n+1)/2 nodes (where n is the key length) for every key/value stored in a Suffix Trie
• Slow to populate
• Can be used to search for “contains” rather than simply “begins with”

Tries – Suffix Tries
• It is necessary to store the key with the data
• A search can return duplicate values
  • e.g. “banana” if we search for “a” or “n” or even “ana”
• Data should only be stored once for the “full word”, and subsequent sequences should only store a pointer to that data
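A sketch of those last points: every suffix of a key is indexed, the data is stored once against the full key, and each suffix path ends in a pointer back to that key, so a “contains” search cannot return the same entry twice. `SimpleSuffixTrie` is a hypothetical class, not the linked implementation.

```php
<?php
// Hypothetical sketch; class and method names are illustrative only.
final class SimpleSuffixTrie
{
    private array $root = ['children' => [], 'keys' => []];

    /** @var array data stored once per full key */
    private array $data = [];

    public function add(string $key, $value = null): void
    {
        $this->data[$key][] = $value;
        $len = strlen($key);
        // Index every suffix: "cat" gives "cat", "at" and "t"
        for ($start = 0; $start < $len; $start++) {
            $node = &$this->root;
            for ($i = $start; $i < $len; $i++) {
                $char = $key[$i];
                if (!isset($node['children'][$char])) {
                    $node['children'][$char] = ['children' => [], 'keys' => []];
                }
                $node = &$node['children'][$char];
            }
            $node['keys'][$key] = true;   // pointer to the full key, not a copy of the data
        }
    }

    /** Return key => values for every key that contains $fragment. */
    public function search(string $fragment): array
    {
        $node = $this->root;
        for ($i = 0, $len = strlen($fragment); $i < $len; $i++) {
            if (!isset($node['children'][$fragment[$i]])) {
                return [];
            }
            $node = $node['children'][$fragment[$i]];
        }
        $keys = [];
        $this->collectKeys($node, $keys);
        $result = [];
        foreach (array_keys($keys) as $key) {
            $result[$key] = $this->data[$key];
        }
        return $result;
    }

    private function collectKeys(array $node, array &$keys): void
    {
        $keys += $node['keys'];           // array union keeps each key only once
        foreach ($node['children'] as $child) {
            $this->collectKeys($child, $keys);
        }
    }
}

$suffixTrie = new SimpleSuffixTrie();
$suffixTrie->add('banana', 'banana data');
$suffixTrie->add('cat', 'cat data');
$containsAna = $suffixTrie->search('ana');
$containsA = $suffixTrie->search('a');
```

Even though “a” occurs three times in “banana”, the search for “a” returns the banana entry once, alongside the entry for “cat”.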

QuadTrees

QuadTrees

• A Tree structure that partitions a 2-Dimensional space by recursively subdividing it into quadrants (or regions)
• Each node can contain:
  • A series of pointers (keys) to the next node in the hierarchy
  • A bucket for data values
• There are different types of QuadTrees:
  • Point QuadTrees
  • Region QuadTrees
  • Edge QuadTrees
  • Polygonal Map (PM) QuadTrees

QuadTrees – Purpose
• Fast Geo-spatial or Graph lookup
• Sparse data compression
• Example implementation: https://github.com/MarkBaker/QuadTrees

QuadTrees – Uses
• Spatial Indexing
• Storing Sparse Data, e.g.
  • Spreadsheet format data
  • Pixel data in images
• Collision Detection
• Points within a field of vision

QuadTrees – Methods
• insert($xyCoordinate, $value = null): adds new data to a QuadTree
• search($boundingBox): finds data in a QuadTree

QuadTrees – Point QuadTree

• Used for Spatial Indexing

QuadTrees – Spatial Indexing
$quadTree = new \QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);

[Diagram: a single top-level node spanning -180 to 180 and -90 to 90]

QuadTrees – Spatial Indexing
$quadTree = new \QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);

[Diagram: the top-level node, spanning -180 to 180 and -90 to 90]

QuadTrees – Spatial Indexing
$quadTree = new \QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
$quadTree->add('Munich', 48.1333, 11.5667);
$quadTree->add('Dublin', 53.3478, -6.2597);
$quadTree->add('Rome', 41.9000, 12.5000);
$quadTree->add('Athens', 37.9667, 23.7167);

[Diagram: the space subdivided into nested quadrants]

QuadTrees – Spatial Indexing
$quadTree = new \QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
$quadTree->add('Munich', 48.1333, 11.5667);
$quadTree->add('Dublin', 53.3478, -6.2597);
$quadTree->add('Rome', 41.9000, 12.5000);
$quadTree->add('Athens', 37.9667, 23.7167);
$quadTree->add('Amsterdam', 52.3667, 4.9000);

[Diagram: the space subdivided into nested quadrants]

QuadTrees – Spatial Indexing
$quadTree = new \QuadTree(
    -180, 90, 180, -90, // Dimensions
    3                   // Bucket size
);
$quadTree->add('London', 51.5072, -0.1275);
$quadTree->add('New York', 40.7127, -74.0059);
$quadTree->add('Paris', 48.8567, 2.3508);
$quadTree->add('Munich', 48.1333, 11.5667);
$quadTree->add('Dublin', 53.3478, -6.2597);
$quadTree->add('Rome', 41.9000, 12.5000);
$quadTree->add('Athens', 37.9667, 23.7167);
$quadTree->add('Amsterdam', 52.3667, 4.9000);

// Search QuadTree for Northern Europe
$quadTree->find(
    -15.0, 60.0,
    25.0, 45.0
);

[Diagram: the nested quadrants of the subdivided space that the search examines]

QuadTrees – Spatial Indexing

• The top-level node need not be limited to the maximum graph space (i.e. the whole world)

QuadTrees – Spatial Indexing

• With a larger bucket size
  • The QuadTree is smaller, with fewer nodes using less memory
  • More points need checking in each node
  • Faster to insert / slower to search

• With a smaller bucket size
  • The QuadTree uses more memory
  • Fewer points in each node to check
  • Slower to insert / faster to search
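The bucket-size trade-off is easier to see with a small Point QuadTree to experiment on. The sketch below is not the linked implementation: `PointQuadTree`, its constructor arguments (min x, min y, max x, max y, bucket size) and the depth cap are assumptions made for illustration, with x as longitude and y as latitude. It assumes PHP 8 for constructor property promotion.

```php
<?php
// Hypothetical sketch; names, argument order and MAX_DEPTH are illustrative only.
final class PointQuadTree
{
    private const MAX_DEPTH = 16;   // stops endless splitting on identical points
    private array $points = [];     // bucket: list of [x, y, value]
    private array $children = [];   // empty for a leaf, otherwise four sub-quadrants

    public function __construct(
        private float $minX, private float $minY,
        private float $maxX, private float $maxY,
        private int $bucketSize, private int $depth = 0
    ) {}

    public function insert(float $x, float $y, $value = null): bool
    {
        if ($x < $this->minX || $x > $this->maxX || $y < $this->minY || $y > $this->maxY) {
            return false;           // outside this node's region
        }
        $hasRoom = count($this->points) < $this->bucketSize || $this->depth >= self::MAX_DEPTH;
        if ($this->children === [] && $hasRoom) {
            $this->points[] = [$x, $y, $value];
            return true;
        }
        if ($this->children === []) {
            $this->subdivide();
        }
        foreach ($this->children as $child) {
            if ($child->insert($x, $y, $value)) {
                return true;
            }
        }
        return false;
    }

    private function subdivide(): void
    {
        $midX = ($this->minX + $this->maxX) / 2;
        $midY = ($this->minY + $this->maxY) / 2;
        $d = $this->depth + 1;
        $this->children = [
            new self($this->minX, $midY, $midX, $this->maxY, $this->bucketSize, $d),
            new self($midX, $midY, $this->maxX, $this->maxY, $this->bucketSize, $d),
            new self($this->minX, $this->minY, $midX, $midY, $this->bucketSize, $d),
            new self($midX, $this->minY, $this->maxX, $midY, $this->bucketSize, $d),
        ];
        foreach ($this->points as [$x, $y, $value]) {
            $this->insert($x, $y, $value);   // children exist now, so points move down
        }
        $this->points = [];
    }

    /** Values of all points inside the bounding box. */
    public function search(float $minX, float $minY, float $maxX, float $maxY): array
    {
        if ($maxX < $this->minX || $minX > $this->maxX
            || $maxY < $this->minY || $minY > $this->maxY) {
            return [];              // the box does not overlap this node at all
        }
        $found = [];
        foreach ($this->points as [$x, $y, $value]) {
            if ($x >= $minX && $x <= $maxX && $y >= $minY && $y <= $maxY) {
                $found[] = $value;
            }
        }
        foreach ($this->children as $child) {
            $found = array_merge($found, $child->search($minX, $minY, $maxX, $maxY));
        }
        return $found;
    }

    public function countNodes(): int
    {
        $count = 1;
        foreach ($this->children as $child) {
            $count += $child->countNodes();
        }
        return $count;
    }
}

$cities = [
    'London' => [-0.1275, 51.5072], 'New York' => [-74.0059, 40.7127],
    'Paris' => [2.3508, 48.8567], 'Munich' => [11.5667, 48.1333],
    'Dublin' => [-6.2597, 53.3478], 'Rome' => [12.5, 41.9],
    'Athens' => [23.7167, 37.9667], 'Amsterdam' => [4.9, 52.3667],
];
$build = function (int $bucketSize) use ($cities): PointQuadTree {
    $tree = new PointQuadTree(-180.0, -90.0, 180.0, 90.0, $bucketSize);
    foreach ($cities as $name => [$x, $y]) {
        $tree->insert($x, $y, $name);
    }
    return $tree;
};
$smallBuckets = $build(3);
$largeBuckets = $build(8);
$found = $smallBuckets->search(-15.0, 45.0, 25.0, 60.0);
sort($found);
```

With these eight cities, a bucket size of 8 keeps everything in the single top-level node, while a bucket size of 3 forces two rounds of subdivision and nine nodes in total. Both trees return the same five cities for the Northern Europe bounding box.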

QuadTrees – Region QuadTree

• Used for Sparse-data Compression
• Used for Level-based Aggregations

QuadTrees – Image Compression
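A Region QuadTree compresses an image by replacing any uniform square block with a single leaf and subdividing only where pixel values differ. The sketch below assumes a square grid whose side is a power of two and whose cells are scalar values; `compressRegion()` and `countLeaves()` are hypothetical names.

```php
<?php
// Hypothetical sketch; function names are illustrative only.

/**
 * Compress a square region of $grid into either a single leaf value
 * (uniform region) or an array of four sub-quadrants [NW, NE, SW, SE].
 */
function compressRegion(array $grid, int $x, int $y, int $size)
{
    $first = $grid[$y][$x];
    $uniform = true;
    for ($row = $y; $row < $y + $size && $uniform; $row++) {
        for ($col = $x; $col < $x + $size; $col++) {
            if ($grid[$row][$col] !== $first) {
                $uniform = false;
                break;
            }
        }
    }
    if ($uniform) {
        return $first;              // one leaf stands in for size * size cells
    }
    $half = intdiv($size, 2);
    return [
        compressRegion($grid, $x, $y, $half),
        compressRegion($grid, $x + $half, $y, $half),
        compressRegion($grid, $x, $y + $half, $half),
        compressRegion($grid, $x + $half, $y + $half, $half),
    ];
}

/** Count the leaves of a tree produced by compressRegion(). */
function countLeaves($node): int
{
    if (!is_array($node)) {
        return 1;
    }
    return array_sum(array_map('countLeaves', $node));
}

$image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
];
$tree = compressRegion($image, 0, 0, 4);
$leaves = countLeaves($tree);
```

For this 4 × 4 image, three of the four top-level quadrants are uniform and collapse to a single leaf each; only the bottom-right quadrant is split further, so 16 pixels are represented by 7 leaves.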

QuadTrees
• The same principles can be applied to 3-Dimensional space using an Octree

Questions?

Who am I?

Mark Baker
Design and Development Manager
InnovEd (Innovative Solutions for Education) Learning Ltd

Coordinator and Developer of:
Open Source PHPOffice library
PHPExcel, PHPWord, PHPPowerPoint, PHPProject, PHPVisio
Minor contributor to PHP core
Other small open source libraries available on GitHub

@Mark_Baker

https://github.com/MarkBaker

http://uk.linkedin.com/pub/mark-baker/b/572/171

