Date post: | 05-Jul-2015 |
Category: | Engineering |
Upload: | rahul-dhawani |
NO-SQL
It’s not "No SQL". It’s "Not Only SQL".
What is NOSQL?
•In the early days, relational databases allowed applications to
store data through a standard data model and query
language, SQL.
•Storage was expensive, and data schemas were fairly simple and
straightforward. Since the rise of the web, the volume of
data stored about users, objects, products and events has
exploded.
•Data is also accessed more frequently and processed
more intensively.
•Low-cost, commodity cloud hardware has emerged to
replace vertical scaling on highly complex and expensive
single-server deployments.
•Engineers now use agile development methods, which aim
for continuous deployment and short development cycles,
to allow a quick response to user demand for features.
What was the need?
What drove the introduction of NoSQL?
•Trend 1: Big Users
•Trend 2: Size (Big Data)
•Trend 3: Connectedness (Interconnected Data)
•Trend 4: Semi-structure (Complex Data Structure)
•Trend 5: Architecture
Trend 1: Big Users
Trend 2: Size (Big Data)
[Figure: exabytes of data stored per year. 2006: 161, 2007: 253, 2008: 397, 2009: 623, 2010: 988. Source: neotechnology]
Trend 3: Connectedness
• To handle hierarchical, nested data structures
in SQL, you would need multiple relational tables
with all kinds of keys.
• There is a relationship between performance
and data complexity: performance can degrade
in a traditional RDBMS as we store the massive
amounts of data required in social networking
applications and the semantic web.
• Individualization of content
Trend 4: Semi-structure (Complex Data Structure)
Source: couchbase.com
Trend 5: Architecture
Why is NoSQL in the spotlight?
1. Scaling.
Vertical Scaling (Relational):
•Adding more capacity means
taking the existing actors in a
system and increasing
their individual power.
•A single server has to
host the entire database
to ensure reliability and
continuous availability of
data. This gets expensive
quickly and places limits on
scale.
For example, assume you have 3 trucks that can each carry 25 felled trees per load, and it takes 1 hour to move each load down the road. Our maximum
capacity will be:
3 trucks * 25 trees/load * 1 load/hour = 75 trees processed per hour
Assuming we’ve chosen a vertical scaling capacity model, what if we
wanted to process 150 felled trees per hour?
We’d need to do one of two things:
1. either double the carrying capacity of each truck (50 trees per load), or
2. halve the time it takes for each truck to process each load (30 minutes).
3 trucks * 50 trees/load * 1 load/hour = 150 trees processed per hour
OR
3 trucks * 25 trees/load * 2 loads/hour = 150 trees processed per hour
We haven’t increased the number of actors in the system, but we have
increased the productivity of each actor to achieve the desired jump in
capacity.
Horizontal Scaling (NoSQL):
•Instead of increasing
the capacity of each
individual actor in the
system, we simply add
more actors to the
system.
•Capacity grows by adding
servers instead of
concentrating more
capacity in a single
server.
In our lumber harvesting example, this means adding more
trucks to move the lumber. So when we need to increase
our capacity from 75 trees per hour to 150 trees per hour, we
simply add 3 more trucks:
6 trucks * 25 trees/load * 1 load/hour = 150 trees processed per
hour
The productivity of each actor in the system remains the
same, but we’ve added more actors (trucks) to the system.
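The truck arithmetic from both scaling models can be checked with a short Python sketch. The function name `trees_per_hour` and its parameters are illustrative and not part of the original slides.

```python
def trees_per_hour(trucks: int, trees_per_load: int, minutes_per_load: float) -> float:
    """Throughput = trucks * trees per load * loads per hour."""
    loads_per_hour = 60 / minutes_per_load
    return trucks * trees_per_load * loads_per_hour


# Baseline from the example: 3 trucks, 25 trees per load, 1 hour per load.
baseline = trees_per_hour(trucks=3, trees_per_load=25, minutes_per_load=60)

# Vertical scaling: same number of trucks, each truck made more capable.
bigger_trucks = trees_per_hour(trucks=3, trees_per_load=50, minutes_per_load=60)
faster_trucks = trees_per_hour(trucks=3, trees_per_load=25, minutes_per_load=30)

# Horizontal scaling: same trucks as before, just more of them.
more_trucks = trees_per_hour(trucks=6, trees_per_load=25, minutes_per_load=60)

print(baseline, bigger_trucks, faster_trucks, more_trucks)
```

All three scaling options reach the same target throughput. They differ in whether each actor changes (vertical) or only the number of actors changes (horizontal).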
2. Dynamic Schemas.
Dynamic Schemas:
How a relational database does it:
•Relational databases require that schemas
be defined before you can add data.
•This fits poorly with agile development
approaches, because each time you
complete new features, the schema of your
database often needs to change.
•If the database is large, this can be a very slow
process that involves significant downtime.
Dynamic Schemas:
And how NoSQL does it:
•NoSQL databases are built to allow the insertion
of data without a predefined schema.
•That makes it easy to make significant
application changes in real-time, without
worrying about service interruptions – which
means development is faster, code integration is
more reliable, and less database administrator
time is needed.
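The contrast can be sketched in Python. The relational side uses the standard-library `sqlite3` module. The document side is only a stand-in: a plain list of dictionaries plays the role of a schema-less collection, which is an assumption made for illustration and not the API of any particular NoSQL product. The table and field names are also hypothetical.

```python
import sqlite3

# Relational side: the schema must be defined before data is added.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")

# A new feature needs an "email" field. Inserting it before changing
# the schema fails because the column does not exist yet.
try:
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (2, 'Bob', 'bob@example.com')"
    )
    relational_insert_failed = False
except sqlite3.OperationalError:
    relational_insert_failed = True

# The schema has to be migrated first, then the insert succeeds.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (2, 'Bob', 'bob@example.com')"
)
relational_rows = conn.execute(
    "SELECT id, name, email FROM users ORDER BY id"
).fetchall()

# Document side (stand-in): each record carries its own fields,
# so a record with a new field is accepted without any migration.
documents = []
documents.append({"id": 1, "name": "Ada"})
documents.append({"id": 2, "name": "Bob", "email": "bob@example.com"})

print(relational_insert_failed, relational_rows, documents)
```

In this in-memory example the migration is instant. The slow, downtime-prone case described above concerns large production databases.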
3. Sharding.
Sharding:
Sharding is the process of storing data records
across multiple machines.
How a relational database does it:
•Because SQL databases scale vertically, sharding requires
complex arrangements to make multiple machines act
as a single server.
Sharding:
And how NoSQL does it:
•NoSQL databases natively and automatically spread data
across an arbitrary number of servers, without
requiring the application to even be aware of
the composition of the server pool.
•Data and query load are automatically
balanced across servers, and when a server
goes down, it can be quickly and transparently
replaced with no application disruption
(replication).
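One common way to spread records across servers is to hash each key and use the hash to pick a shard. The sketch below illustrates that idea in Python. The `ShardedStore` class is hypothetical, each "server" is just an in-memory dictionary, and real NoSQL systems use more elaborate schemes.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across several shards."""

    def __init__(self, shard_count: int) -> None:
        # Each dict stands in for one server in the pool.
        self._shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key: str) -> int:
        # A stable hash keeps the key-to-shard mapping the same on every run.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def put(self, key: str, value) -> None:
        self._shards[self._shard_for(key)][key] = value

    def get(self, key: str):
        return self._shards[self._shard_for(key)].get(key)

    def shard_sizes(self) -> list:
        return [len(shard) for shard in self._shards]


# The caller only uses put/get and never names a shard.
store = ShardedStore(shard_count=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:42"))
print(store.shard_sizes())
```

Simple modulo hashing like this forces most keys to move when the number of shards changes, which is one reason production systems tend to use schemes such as consistent hashing instead.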
4. Replication.
Replication:
•NoSQL databases also support data
replication, storing multiple copies of data
across the cluster, and even across data
centers, to ensure high availability and
support disaster recovery.
•A properly managed NoSQL database
system should never need to be taken offline,
for any reason, supporting 24x365 continuous
operation of applications.
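The idea of keeping multiple copies so that reads survive a failed server can be sketched as follows. `ReplicatedStore` is a hypothetical toy: every write goes to all online nodes and a read is served by the first online node that has the key. It deliberately ignores consistency, conflict resolution and re-syncing a recovered node, which real systems must handle.

```python
class ReplicatedStore:
    """Toy store that keeps a full copy of the data on every node."""

    def __init__(self, node_count: int) -> None:
        self._nodes = [dict() for _ in range(node_count)]
        self._online = [True] * node_count

    def put(self, key, value) -> None:
        # Write to every node that is currently online.
        for node, up in zip(self._nodes, self._online):
            if up:
                node[key] = value

    def get(self, key):
        # Serve the read from the first online node holding the key.
        for node, up in zip(self._nodes, self._online):
            if up and key in node:
                return node[key]
        raise KeyError(key)

    def take_offline(self, index: int) -> None:
        self._online[index] = False


replicated = ReplicatedStore(node_count=3)
replicated.put("order:1", "paid")

# Simulate a server failure: the data is still readable from a replica.
replicated.take_offline(0)
value_after_failure = replicated.get("order:1")
print(value_after_failure)
```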
5. Integrated Caching.
Integrated Caching:
How a relational database does it:
•In relational technology, the caching tier is usually
a separate infrastructure tier that must be
developed against, deployed on separate servers,
and explicitly managed by the operations team.
Integrated Caching:
And how NoSQL does it:
•To reduce latency and increase sustained
data throughput, NoSQL databases
transparently cache data in system
memory.
•This behavior is transparent to the
application developer and the operations
team.
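A minimal sketch of a cache built into the store itself, assuming a hypothetical `CachedStore` class whose `backing` dictionary stands in for slower persistent storage. The caller only ever uses `get` and `put`, so there is no separate caching tier to code against.

```python
class CachedStore:
    """Toy store with a built-in in-memory cache in front of slower storage."""

    def __init__(self, backing: dict) -> None:
        self._backing = backing      # stands in for slow persistent storage
        self._cache = {}             # in-memory copies of recently used items
        self.backing_reads = 0       # counts how often slow storage is read

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # cache hit: slow storage is not touched
        self.backing_reads += 1
        value = self._backing[key]
        self._cache[key] = value     # remember the value for later reads
        return value

    def put(self, key, value) -> None:
        # Update storage and cache together so reads never see stale data.
        self._backing[key] = value
        self._cache[key] = value


cached = CachedStore(backing={"product:7": "lumber"})
first = cached.get("product:7")    # read from backing storage, then cached
second = cached.get("product:7")   # served from memory
print(first, second, cached.backing_reads)
```

The sketch has no eviction policy or memory limit. It only shows why the caching stays invisible to the application code.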