Agenda
•Who We Are
•Intro to GraphDB
•Intro to Patent-Grant Data
•Graph Concepts
•Pacer::Xml
Why?
•Data Set Size
•Connectivity of Data
•Semi-structure
•Evolution of SOA and REST
The Zone of SQL Adequacy
[Figure: axes are Performance and Data complexity; curves are "SQL database" and "Requirement of application"; example applications: Salary List, ERP, CRM, Network / Cloud Management, Social, MDM, Geo]
How?
•Nodes / Vertices
•Relationships / Edges
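
As a rough illustration of the two building blocks above, here is a minimal in-memory sketch in plain Ruby. It is not Pacer or Neo4j code; `TinyGraph`, `Vertex`, `Edge`, and the sample people are hypothetical names made up for this example.

```ruby
# Minimal in-memory graph: vertices carry properties, edges carry a label.
Vertex = Struct.new(:id, :properties)
Edge   = Struct.new(:from, :to, :label)

class TinyGraph
  attr_reader :vertices, :edges

  def initialize
    @vertices = {}
    @edges = []
  end

  # Register a vertex under its id, with an optional property hash.
  def add_vertex(id, properties = {})
    @vertices[id] = Vertex.new(id, properties)
  end

  # Connect two existing vertices with a labeled, directed edge.
  def add_edge(from, to, label)
    unless @vertices.key?(from) && @vertices.key?(to)
      raise ArgumentError, "unknown vertex"
    end
    edge = Edge.new(from, to, label)
    @edges << edge
    edge
  end

  # Ids reachable from `id` over one outgoing edge with the given label.
  def out(id, label)
    @edges.select { |e| e.from == id && e.label == label }.map(&:to)
  end
end

graph = TinyGraph.new
graph.add_vertex(:alice, name: "Alice")
graph.add_vertex(:bob, name: "Bob")
graph.add_edge(:alice, :bob, :friend)
graph.out(:alice, :friend)
```

A real graph database indexes edges per vertex instead of scanning one flat list; the flat list just keeps the sketch short.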
Relational Model vs. Graph
Each of these models expresses the same thing
[Figure labels: Person*, Friend*, Person-Friend]
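
To make "the same thing" concrete, the sketch below stores one set of friendships twice in plain Ruby: once as join-table rows and once as an adjacency structure. The rows, names, and the choice to treat friendship as symmetric are assumptions made for this example.

```ruby
require "set"

# Relational form: a person table plus a person_friend join table (made-up rows).
persons = { 1 => "Ann", 2 => "Ben", 3 => "Cy" }
person_friend = [[1, 2], [2, 3]]

# Graph form: each person maps directly to the set of adjacent persons.
adjacency = Hash.new { |hash, key| hash[key] = Set.new }
person_friend.each do |a, b|
  adjacency[a] << b
  adjacency[b] << a # assumption: friendship is symmetric
end

# Relational lookup scans the join rows; graph lookup follows stored edges.
friends_via_join = person_friend.select { |a, b| a == 2 || b == 2 }
                                .map { |a, b| a == 2 ? b : a }
friends_via_graph = adjacency[2].to_a

names = friends_via_graph.sort.map { |id| persons[id] }
```

Both lookups return the same friends of person 2. The difference is that the join version touches every row, while the graph version only touches that person's edges.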
Graph db performance
Database   # persons   Query time
MySQL      1,000       2,000 ms
Neo4j      1,000       2 ms
Neo4j      1,000,000   2 ms
•a sample social graph with ~1,000 persons
•average 50 friends per person
•pathExists(a,b) limited to depth 4
•caches warmed up to eliminate disk I/O
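
The benchmark's pathExists(a,b) query can be sketched as a depth-limited breadth-first search. This is an illustration in plain Ruby over an adjacency hash, not the code used in the benchmark; `path_exists?`, its `max_depth` parameter, and the sample chain are hypothetical.

```ruby
require "set"

# Breadth-first search that gives up after max_depth hops.
def path_exists?(graph, from, to, max_depth = 4)
  return true if from == to

  visited = Set[from]
  frontier = [from]
  max_depth.times do
    next_frontier = []
    frontier.each do |node|
      graph.fetch(node, []).each do |neighbor|
        return true if neighbor == to
        # Set#add? returns nil when the neighbor was already visited.
        next_frontier << neighbor if visited.add?(neighbor)
      end
    end
    return false if next_frontier.empty?
    frontier = next_frontier
  end
  false
end

# A chain a - b - c - d - e - f, stored as an adjacency hash.
chain = {
  "a" => ["b"], "b" => ["a", "c"], "c" => ["b", "d"],
  "d" => ["c", "e"], "e" => ["d", "f"], "f" => ["e"]
}
path_exists?(chain, "a", "e") # e is 4 hops from a
path_exists?(chain, "a", "f") # f is 5 hops from a, beyond the default limit
```

The work grows with the number of vertices reachable within the depth limit rather than with the total size of the data set.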
Different Visualization
Query Languages
•Pacer - gem install pacer
•Cypher
•SPARQL - if you grok RDF already
USPTO Data
•Patent Grant Data in XML
•bi-weekly chunks
•Pacer::Xml has a handy loader as an example:
jruby-1.7.0 > g = PacerXml::Sample.load_100
Downloading a sample xml file from...
001> PacerXml
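
For a feel of what an XML-to-graph import involves, here is a small sketch that uses Ruby's REXML library rather than Pacer::Xml, whose internals are not shown here. The XML fragment, its element and attribute names, and the `vertices` / `edges` structures are all invented for illustration; real patent-grant files are far larger and use a different schema.

```ruby
require "rexml/document"

# Simplified, made-up fragment; real grant files have many more elements.
xml = <<~XML
  <patents>
    <patent number="100" title="Widget">
      <cites number="90"/>
      <cites number="95"/>
    </patent>
    <patent number="101" title="Gadget">
      <cites number="100"/>
    </patent>
  </patents>
XML

vertices = {} # patent number => properties
edges = []    # [citing number, :cites, cited number]

doc = REXML::Document.new(xml)
doc.elements.each("patents/patent") do |patent|
  number = patent.attributes["number"]
  vertices[number] ||= {}
  vertices[number][:title] = patent.attributes["title"]

  patent.elements.each("cites") do |cite|
    cited = cite.attributes["number"]
    vertices[cited] ||= {} # cited patents become vertices too
    edges << [number, :cites, cited]
  end
end
```

Each element with an identity becomes a vertex, and each reference between elements becomes a labeled edge, so citations can then be traversed like any other relationship.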
Importing XML into a graph? What do you do next?
Resources
https://github.com/xnlogic/pacer-xml
https://github.com/pangloss/pacer
http://neo4j.org/
http://tinkerpop.com/