Date post: | 14-Jul-2015 |
Category: | Technology |
Upload: | apaichon-punopas |
Change Relational DB to Graph DB with OrientDB
Speaker: Apaichon Punopas
Sponsored by
Thai Programmer Network (เครือข่ายโปรแกรมเมอร์ไทย)
โค้ดชิวๆ (Code Chill Chill)
What is a Relational DB?
It is a way of storing information in
• tables
• columns
• rows
A table can be related to other tables.
What is NoSQL?
A NoSQL (often interpreted as Not only SQL) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling, and finer control over availability.
Types of NoSQL
Relational vs NoSQL format

Relational (tables)

Employee
EmployeeId  FirstName  LastName  HiredDate   PositionId
1           Apaichon   Punopas   13/11/2013  1
2           Tony       Jar       1/1/2011    3

Position
PositionId  PositionName
1           Senior Developer
2           Developer Manager
3           Actor

NoSQL (documents)

Employee
{EmployeeId: 1, FirstName: "Apaichon", LastName: "Punopas", HiredDate: "2013-11-13", PositionId: "#16:0"}

Position
{PositionId: "#16:0", PositionName: "Senior Developer"}
Relational DB Pros and Cons
Pros
• Flexible and well-established.
• Short learning curve.
• Data access through SQL.
• Use in large development efforts and with large databases is well understood.
• The fundamental structure, i.e., a table, is easily understood, as are design and normalization.
Cons
• Performance problems with complicated data structures.
• Lack of support for complex base types, e.g., drawings.
• SQL is limited when accessing complex data.
• Knowledge of the database structure is required to create ad hoc queries.
• Locking mechanisms.
NoSQL Pros and Cons
Pros
• Mostly open source.
• Horizontal scalability.
• Support for Map/Reduce.
• No need to develop a fine-grained data model.
• Very fast for adding new data.
• No need to change code when the data structure is modified.
• Ability to store complex data types in a single item of storage.
Cons
• Immaturity.
• Possible database administration issues.
• Data relationships like SQL.
• No indexing support (some DBs).
• No ACID (some DBs).
• Absence of standardization.
Who is using NoSQL?
• All big companies are using NoSQL.
Who is using OrientDB?
THJUG
• Nobody uses Java anymore, I'm nobody
A.K.A
• Nobody uses OrientDB anymore, I'm nobody
NoSQL trends
http://news.yahoo.com/nosql-databases-eat-relational-database-191517881.html
Database trends
Job trends
Will NoSQL replace relational?
• NoSQL databases are eating into the relational database market.
• New ventures and startups mostly start with NoSQL.
• Relational databases are more widely used in the enterprise and have more mature features. It would be difficult to replace them all within 10 to 20 years. In the future, a new type of database might emerge again.
Why do joins suck?
Join steps
• Add related data to at least 2 tables
• Select the data
• Map the data
• Reduce the result set

Employee
EmployeeId  FirstName  LastName  HiredDate   PositionId
1           Apaichon   Punopas   13/11/2013  1
2           Tony       Jar       1/1/2011    3

Position
PositionId  PositionName
1           Senior Developer
2           Developer Manager
3           Actor

select e.*, p.PositionName from Employee e inner join Position p on e.PositionId = p.PositionId
How does OrientDB join?
Join steps
• Add related data to at least 2 tables.
It's already joined!

Employee
@rid   FirstName  LastName  HiredDate   PositionId
#22:0  Apaichon   Punopas   13/11/2013  #23:0

Position
@rid   PositionName
#23:0  Senior Developer

select @rid as employeeId, firstName, lastName, positionId.positionName from Employee
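The difference between the two approaches can be sketched in Python. This is only an illustration and not how OrientDB is implemented: the lists and the `records_by_rid` dictionary are hypothetical in-memory stand-ins for the Employee and Position records shown on the slides.

```python
# Hypothetical in-memory data mirroring the slides; not OrientDB internals.

# Relational style: two tables related by a PositionId value.
employees = [
    {"EmployeeId": 1, "FirstName": "Apaichon", "LastName": "Punopas", "PositionId": 1},
    {"EmployeeId": 2, "FirstName": "Tony", "LastName": "Jar", "PositionId": 3},
]
positions = [
    {"PositionId": 1, "PositionName": "Senior Developer"},
    {"PositionId": 2, "PositionName": "Developer Manager"},
    {"PositionId": 3, "PositionName": "Actor"},
]


def inner_join(left, right, key):
    """Naive nested-loop inner join: compare every left row with every right row."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[key] == right_row[key]
    ]


# Link style: the employee record stores the position's record ID directly.
records_by_rid = {
    "#22:0": {"firstName": "Apaichon", "lastName": "Punopas", "positionId": "#23:0"},
    "#23:0": {"positionName": "Senior Developer"},
}


def position_name(employee_rid):
    """Follow the stored link instead of searching the other collection."""
    employee = records_by_rid[employee_rid]
    return records_by_rid[employee["positionId"]]["positionName"]


joined = inner_join(employees, positions, "PositionId")
print([row["PositionName"] for row in joined])
print(position_name("#22:0"))
```

The join has to match rows at query time, while the link version only dereferences an ID that was stored when the record was written.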
Welcome to OrientDB
Luca Garulli, CEO and Founder
Luca Olivari, President
www.orientdb.org
What is a Graph DB?
Graph Theory
G = (V, E)
V = the set of vertices
E = the set of edges
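As a small illustration of the definition, a graph can be written in Python as a set of vertices and a set of edges. The vertex names and the `neighbours` helper are made up for this sketch.

```python
# A graph G = (V, E): a set of vertices and a set of edges between them.
V = {"Apaichon", "Tony", "Senior Developer", "Actor"}
E = {
    ("Apaichon", "Senior Developer"),
    ("Tony", "Actor"),
    ("Apaichon", "Tony"),
}


def neighbours(vertex, edges):
    """Return every vertex that shares an edge with `vertex` (undirected view)."""
    result = set()
    for a, b in edges:
        if a == vertex:
            result.add(b)
        elif b == vertex:
            result.add(a)
    return result


# Every edge must connect two vertices that belong to V.
assert all(a in V and b in V for a, b in E)
print(sorted(neighbours("Apaichon", E)))
```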
Certificate of Achievement
This certificate is awarded to: Attendee
You understand Graph Theory.
Apaichon Punopas
Thai Programmer Network (เครือข่ายโปรแกรมเมอร์ไทย)
Today OrientDB 2.0 is not only a graph DB.
It is a multi-model database.
Document Model
The data in this model is stored inside documents. A document is a set of key/value pairs (also referred to as fields or properties) where a key allows access to its value. Values can hold primitive data types, embedded documents, or arrays of other values.
{firstName: "A", lastName: "LA", friends: [{firstName: "A"}, {firstName: "B", lastName: "lB"}]}
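To make the key/value idea concrete, here is a minimal Python sketch that loads the same document as strict JSON (with quoted keys) and reads primitive values, an array, and embedded documents. It uses only the standard `json` module and does not involve OrientDB.

```python
import json

# The slide's document, written as strict JSON (keys quoted).
raw = """
{
  "firstName": "A",
  "lastName": "LA",
  "friends": [
    {"firstName": "A"},
    {"firstName": "B", "lastName": "lB"}
  ]
}
"""

document = json.loads(raw)

# A key gives access to its value; values may be primitives,
# embedded documents, or arrays of other values.
first_name = document["firstName"]
friend_names = [friend["firstName"] for friend in document["friends"]]

# Embedded documents do not have to share the same fields.
friends_with_last_name = [f for f in document["friends"] if "lastName" in f]

print(first_name, friend_names, len(friends_with_last_name))
```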
Graph Model
A graph represents a network-like structure consisting of Vertices (also known as Nodes) interconnected by Edges (also known as Arcs). OrientDB's graph model is represented by the concept of a property graph, which defines the following:
• Vertex - an entity that can be linked with other Vertices.
• Edge - an entity that links two Vertices.
Supported Types
• Popular types found in other databases, such as boolean, integer, double, string, binary, etc.
• Embedded -> JSON, such as {name: "A", friends: [{name: "B"}, {name: "C"}]}
• Link -> Record ID
Class
A Class is a concept taken from the Object Oriented paradigm. In OrientDB it defines a type of record. It's the closest concept to a Relational DBMS Table. Classes can be schema-less, schema-full, or mixed.
Schema Types
• Schema-Full: enables strict mode at the class level and sets all fields as mandatory.
• Schema-Less: creates classes with no properties. The default mode is non-strict, so records can have arbitrary fields.
• Schema-Hybrid, also called Schema-Mixed, is the most used: creates classes and defines some fields, but lets each record define its own custom fields.
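The three modes can be illustrated with a toy validator in Python. This is a sketch of the idea only: `validate`, the field definitions, and the sample record are hypothetical and are not OrientDB's schema API.

```python
def validate(record, declared_fields, strict):
    """Check a record (dict) against a hypothetical class definition.

    declared_fields maps field name -> (type, mandatory).
    strict=True rejects any field that is not declared.
    """
    errors = []
    for name, (field_type, mandatory) in declared_fields.items():
        if name not in record:
            if mandatory:
                errors.append(f"missing mandatory field: {name}")
        elif not isinstance(record[name], field_type):
            errors.append(f"wrong type for field: {name}")
    if strict:
        for name in record:
            if name not in declared_fields:
                errors.append(f"undeclared field: {name}")
    return errors


record = {"firstName": "Apaichon", "hobby": "coding"}

# Schema-full: every field declared and mandatory, strict mode on.
schema_full = ({"firstName": (str, True), "lastName": (str, True)}, True)
# Schema-less: nothing declared, strict mode off.
schema_less = ({}, False)
# Schema-hybrid: some fields declared, custom fields still allowed.
schema_hybrid = ({"firstName": (str, True)}, False)

for label, (fields, strict) in [
    ("full", schema_full),
    ("less", schema_less),
    ("hybrid", schema_hybrid),
]:
    print(label, validate(record, fields, strict))
```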
Cluster
A cluster is a place where a group of records is stored. Perhaps the best equivalent in the relational world would be a Table. By default, OrientDB will create one cluster per class. All the records of a class are stored in the same cluster, which has the same name as the class. You can create up to 32,767 (2^15-1) clusters in a database.
Record ID
In OrientDB each record has its own self-assigned unique ID within the database, called Record ID or RID. It is composed of two parts:
#<cluster-id>:<cluster-position>
• cluster-id is the id of the cluster. Each database can have a maximum of 32,767 clusters (2^15-1).
• cluster-position is the position of the record inside the cluster. Each cluster can handle up to 9,223,372,036,854,775,808 (2^63) records, namely about 9,223,372 trillion records!
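The two-part format can be sketched as a small Python parser. The names `RecordId` and `parse_rid` are invented for this example, and the sketch assumes non-negative IDs and a position that fits in a signed 64-bit integer; it is not OrientDB's own parser.

```python
import re
from typing import NamedTuple

MAX_CLUSTER_ID = 2**15 - 1        # 32,767, the cluster limit stated above
MAX_CLUSTER_POSITION = 2**63 - 1  # assumption: position fits a signed 64-bit integer

_RID_PATTERN = re.compile(r"#([0-9]+):([0-9]+)")


class RecordId(NamedTuple):
    cluster_id: int
    cluster_position: int

    def __str__(self):
        return f"#{self.cluster_id}:{self.cluster_position}"


def parse_rid(text):
    """Parse '#<cluster-id>:<cluster-position>' into a RecordId."""
    match = _RID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a record ID: {text!r}")
    rid = RecordId(int(match.group(1)), int(match.group(2)))
    if rid.cluster_id > MAX_CLUSTER_ID or rid.cluster_position > MAX_CLUSTER_POSITION:
        raise ValueError(f"record ID out of range: {text!r}")
    return rid


rid = parse_rid("#22:0")
print(rid.cluster_id, rid.cluster_position, str(rid))
```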
Inheritance
• Classes support inheritance, the same as the OOP concept.
Index
• OrientDB supports 4 kinds of indexes.
Security
• Security can be drilled down to the record level, and SSL is supported.
Caching
• OrientDB has several caching mechanisms that act at different levels.
Functions
A Function is an executable unit of code that can take parameters and return a result. Using Functions you can perform Functional programming where logic and data are all together in a central place. Functions are similar to the Stored Procedures of RDBMS.
• Can be executed via SQL, Java, REST, and Studio.
Transactions
OrientDB is an ACID compliant DBMS.
A database transaction, by definition, must be atomic, consistent, isolated and durable. Database practitioners often refer to these properties of database transactions using the acronym ACID.
Hooks (Triggers)
• A Hook works like a trigger. It lets the user application intercept internal events before and after each CRUD operation against records. You can use Hooks to write custom validation rules, to enforce security, or even to orchestrate external events like replication against a relational DBMS.
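The before/after interception idea can be sketched with a toy in-memory store in Python. `TinyStore` and its hook lists are hypothetical and exist only for this illustration; OrientDB's real hooks are written against its own Java API, which is not shown here.

```python
class TinyStore:
    """Hypothetical in-memory record store with before/after create hooks."""

    def __init__(self):
        self.records = []
        self.before_create = []  # callables that may reject a record
        self.after_create = []   # callables notified once the record is stored

    def create(self, record):
        for hook in self.before_create:
            hook(record)  # a hook vetoes the operation by raising
        self.records.append(record)
        for hook in self.after_create:
            hook(record)
        return record


def require_name(record):
    """Custom validation rule that runs before each create."""
    if not record.get("name"):
        raise ValueError("name is required")


audit_log = []
store = TinyStore()
store.before_create.append(require_name)
store.after_create.append(lambda record: audit_log.append(("created", record["name"])))

store.create({"name": "Jay"})
try:
    store.create({"surname": "Miner"})
except ValueError as error:
    print("rejected:", error)
print(audit_log)
```

The rejected record never reaches the store, and the after-hook only fires for records that passed validation.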
API
OrientDB supports 3 kinds of drivers:
• Native binary remote, which talks directly to the TCP/IP socket using the binary protocol.
• HTTP REST/JSON, which talks directly to the TCP/IP socket using the HTTP protocol.
• Java wrapped, a layer that wraps the native Java driver. This is pretty easy for languages that run on the JVM, like Scala, Groovy and JRuby.
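As a rough sketch of the HTTP REST/JSON route, the Python below only builds a query URL and a Basic authentication header; it does not contact a server. The URL shape `/query/<database>/sql/<query-text>/<limit>` is an assumption based on the OrientDB HTTP API documentation and should be checked against the version in use. The database name `demo` and the credentials are placeholders.

```python
import base64
from urllib.parse import quote


def query_url(database, sql, limit=20, host="localhost", port=2480):
    """Build a GET URL of the assumed form /query/<database>/sql/<query-text>/<limit>."""
    return (
        f"http://{host}:{port}/query/"
        f"{quote(database, safe='')}/sql/{quote(sql, safe='')}/{limit}"
    )


def basic_auth_header(user, password):
    """HTTP Basic authentication header for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# "demo" is a placeholder database name; no request is sent here.
url = query_url("demo", "select from OUser")
headers = basic_auth_header("admin", "admin")
print(url)
```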
Scalability
Programming Language Drivers
• Most popular languages are supported.
SQL
Most NoSQL products have a custom query language. OrientDB focuses on standards when it comes to query languages. Instead of inventing "Yet Another Query Language", we started from the widely used and well understood SQL.
SQL - Select
• select from OUser
• select from #10:3
• select from [#10:1, #10:3, #10:5]
• select from OUser where name like 'l%'
• select sum(salary) from Employee where age < 40 group by job
• select from Employee where any() like 'Apa%'
• select from china:Customers
SQL - Insert
• insert into Employee (name, surname, gender) values ('Jay', 'Miner', 'M')
• insert into Employee set name = 'Jay', surname = 'Miner', gender = 'M'
• insert into Employee content {name : 'Jay', surname : 'Miner', gender : 'M'}
SQL - Update
• update Employee set local = true where city = 'London'
• update Employee merge { local : true } where city = 'London'
SQL - Delete
• delete from Employee where city <> 'London'
• delete from [#24:0,#24:1,#24:2]
Sub Query
select from Document
let $temp = (
  select @rid, $depth from (
    traverse V.out, E.in from $parent.current
  )
  where @class = 'Concept'
  and (id = 'first concept' or id = 'second concept')
)
where $temp.size() > 0
Traverse
Traverse is a special command that retrieves connected records by crossing relationships. This command works not only with the graph API but also at the document level. This means you can traverse relationships between invoices and customers without needing to model the domain using the Graph API.
traverse * from #9:1
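What "crossing the relationships" means can be sketched in Python as a depth-first walk over link fields. The records, their field names, and the `traverse` helper are hypothetical, and the visiting order of this sketch is not a statement about the order OrientDB returns.

```python
# Hypothetical records keyed by record ID; link fields hold other record IDs.
records = {
    "#9:1": {"name": "invoice-1", "customer": "#10:0", "items": ["#11:0", "#11:1"]},
    "#10:0": {"name": "customer-A", "city": "#12:0"},
    "#11:0": {"name": "item-1"},
    "#11:1": {"name": "item-2"},
    "#12:0": {"name": "London"},
}


def links_of(record):
    """Yield every record ID referenced by any field of the record."""
    for value in record.values():
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if isinstance(candidate, str) and candidate in records:
                yield candidate


def traverse(start):
    """Depth-first walk over all links, visiting each record once."""
    visited, order, stack = set(), [], [start]
    while stack:
        rid = stack.pop()
        if rid in visited:
            continue
        visited.add(rid)
        order.append(rid)
        # Reverse so that links are followed in field order.
        stack.extend(reversed(list(links_of(records[rid]))))
    return order


print(traverse("#9:1"))
```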
My favourites in OrientDB
• There are many things I like in OrientDB that I have never found in other DBs.
• insert and update with JSON
• save - automatically inserts or updates when a value is passed with @rid
• validate properties with regular expressions
• median - with other DBs I got bad performance and had to develop it myself, but OrientDB includes it out of the box and it is fast.
• array and JSON hierarchy - keeping an array in one field makes it easy to use with data visualisation.
• expand - expands an array horizontally, like a table with rows and columns.
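For readers unfamiliar with the last few items, here is a plain Python sketch of what a median and an expand-like flattening compute. The sample records are invented, and this shows the concepts only, not OrientDB's functions or their exact result format.

```python
from statistics import median

# Hypothetical records; "tags" keeps an array in a single field.
employees = [
    {"name": "A", "salary": 3000, "tags": ["backend", "sql"]},
    {"name": "B", "salary": 5000, "tags": ["graph"]},
    {"name": "C", "salary": 4000, "tags": []},
]

# median: the middle value of the sorted salaries.
median_salary = median(e["salary"] for e in employees)

# expand-like flattening: one output row per array element.
expanded = [
    {"name": e["name"], "tag": tag}
    for e in employees
    for tag in e["tags"]
]

print(median_salary)
for row in expanded:
    print(row)
```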
Appendix
Prerequisite
• JVM
Installation
• Download from http://www.orientechnologies.com/download/
• Extract the file
• Go to the bin directory, then run server.sh or server.bat
Appendix
Management Studio
• Runs on port 2480 by default
• Open a browser, then go to http://localhost:2480
Appendix
Console
• Go to the bin directory, then run console.sh or console.bat
Thank you
• Enjoy using OrientDB