Date post: | 15-Jan-2015 |
Upload: | scaleout-software |
Copyright © 2011 by ScaleOut Software, Inc.
Webinar December, 2011
Bill Bain
The Top Six Reasons to Use a Distributed Data Grid
ScaleOut Software, Inc.
Agenda
• About ScaleOut Software
• Overview of Products
• What is a Distributed Data Grid (DDG)?
• The Top Six Reasons
• What to Look for in a DDG Product
Company
• Founded in September 2003, privately funded
• Offices in Bellevue, WA and Beaverton, OR
• Team:
– Dr. William Bain, Founder & CEO
  • Career focused on parallel computing: Bell Labs, Intel, Microsoft
  • 3 prior start-ups, last acquired by Microsoft and product now ships as Network Load Balancing in Windows Server
– David Brinker, COO
  • 20 years software business and executive management experience
  • Mentor Graphics, Cadence, Webridge
• Develops and markets Linux & Windows DDG products.
• Seven years market experience.
It’s All About Scaling Performance
• Scaling performance:
[Figure: scale-out diagram showing several servers, each with its own CPU, memory, and storage]
Scaling out:
• Has excellent scalability.
• But is challenging to implement.
What is a Distributed Data Grid?
• A new “vertical” storage tier:
– Adds missing layer to boost performance.
– Uses in-memory, out-of-process storage.
– Avoids repeated trips to backing storage.
• A new “horizontal” storage tier:
– Allows data sharing among servers.
– Scales performance & capacity.
– Adds high availability.
– Can be used independently of backing storage.
[Figure: storage tiers on each server (processor cache, L2 cache, “in-process” application memory), the “out-of-process” distributed data grid, and backing storage]
(Aka “distributed cache”, “in-memory data grid”)
Distributed Data Grids: A Closer Look
• Incorporates a client-side, in-process cache (“near cache”):
– Transparent to the application
– Holds recently accessed data.
• Boosts performance:
– Eliminates repeated network data transfers & deserialization.
– Reduces access times to near “in-process” latency.
– Is automatically updated if the distributed grid changes.
– Supports various coherency models (coherent, polled, event-driven)
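The near-cache idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not ScaleOut's implementation: `Grid`, `NearCache`, and the version counter are hypothetical stand-ins, and the version check models a polled coherency scheme in which the client re-fetches the full object only when the grid's copy has changed.

```java
import java.util.HashMap;
import java.util.Map;

public class NearCacheSketch {
    /** Hypothetical stand-in for the out-of-process grid; counts full-object fetches. */
    public static class Grid {
        private final Map<String, String> data = new HashMap<>();
        private final Map<String, Long> versions = new HashMap<>();
        public int fetches = 0;

        public void put(String key, String value) {
            data.put(key, value);
            versions.merge(key, 1L, Long::sum); // bump the version on every update
        }

        public long version(String key) {
            return versions.getOrDefault(key, 0L);
        }

        public String fetch(String key) {
            fetches++; // models the costly transfer and deserialization
            return data.get(key);
        }
    }

    /** In-process cache that re-fetches an object only when its version has changed. */
    public static class NearCache {
        private final Grid grid;
        private final Map<String, String> values = new HashMap<>();
        private final Map<String, Long> cachedVersions = new HashMap<>();

        public NearCache(Grid grid) {
            this.grid = grid;
        }

        public String get(String key) {
            long current = grid.version(key);
            Long cached = cachedVersions.get(key);
            if (cached == null || cached != current) {
                values.put(key, grid.fetch(key)); // stale or missing: fetch the full object
                cachedVersions.put(key, current);
            }
            return values.get(key);
        }
    }

    public static void main(String[] args) {
        Grid grid = new Grid();
        NearCache cache = new NearCache(grid);
        grid.put("myObj", "Hello, SOSS!");
        cache.get("myObj"); // first read fetches from the grid
        cache.get("myObj"); // second read is served from the near cache
        System.out.println("Full fetches so far: " + grid.fetches);
    }
}
```

In a real grid the version check is itself a small network request; the saving comes from skipping the full object transfer and deserialization when nothing has changed.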
[Figure: “in-process” application memory and “in-process” client-side cache in front of the “out-of-process” distributed data grid]
The Need for Memory-Based Storage
[Figure: Web server farm: the Internet and a load-balancer feed Web servers, which connect over Ethernet to app. servers and to database servers with a RAID disk array; the database tier is marked as the bottleneck, and distributed, in-memory data grids are shown alongside the server farms]
Example: Web server farm:
• Load-balancer directs incoming client requests to Web servers.
• Web and app. server farms build Web pages and run business logic.
• Database server holds all mission-critical, LOB data.
• Server farms share fast-changing data using a DDG to avoid bottlenecks and maximize scalability.
The Need for Memory-Based Storage
[Figure: a cloud application running as several app virtual servers (App VS), a distributed data grid running as several grid virtual servers (Grid VS), and cloud-based storage]
Example: Cloud Application:
• Application runs as multiple, virtual servers (VS).
• Application instances store and retrieve LOB data from cloud-based file system or database.
• Applications need fast, scalable storage for fast-changing data.
• Distributed data grid runs as multiple, virtual servers to provide “elastic,” in-memory storage.
Scalability Challenges for Applications
• “Scaled out” server applications repeatedly access two types of data:
– Repeatedly referenced database-data (e.g., stock prices) and
– Fast changing, business-logic data (e.g., session-state, workflow state)
• Database servers are not designed to meet this need:

Characteristics       Typical DBMS data   Application data
Volume                High                Low
Lifetime/turnover     Long/slow           Short/fast
Access patterns       Complex             Simple
Data preservation     Critical            Less critical
Fast access/update    Less important      More important

• Scaled-out applications create additional challenges:
– How to make shared application data quickly accessible by any server
– How to maintain fast access and avoid bottlenecks as the server farm grows
– How to keep application data highly available when a server fails
Wide Range of Applications for DDGs
Financial Services
• Portfolio risk analysis
• VaR calculations
• Monte Carlo simulations
• Algorithmic trading
• Market message caching
• Derivatives trading
• Pricing calculations
Other Applications
• Edge servers: chat, email
• Online gaming servers
• Scientific computations
• Command and control
E-commerce
• Session-state storage
• Application state storage
• Online banking
• Loan applications
• Wealth management
• Online learning
• Hotel reservations
• News story caching
• Shopping carts
• Social networking
• Service call tracking
• Online surveys
Product: ScaleOut StateServer®
Fully distributed data grid designed for storing application data on server farms, compute grids, and the cloud:
• Runs in-memory directly on a farm or grid as a distributed service.
• Automatically:
– Distributes and shares data across the farm.
– Reduces access time.
– Scales when the farm grows.
– Survives when a server fails.
• Cost-effective
• Complements & offloads DBMS.
• Portable across Windows and Linux.
[Figure: Web servers connected to the Internet and a DBMS server over Ethernet, with a SOSS service on each Web server; the DBMS is marked as a bottleneck]
Product: ScaleOut Remote Client Option
• Allows hosting ScaleOut StateServer on a separate server farm.
• Ensures highly available connectivity to SOSS store.
• Automatically load-balances access requests to minimize response times.
• Uses multiple connections to maximize throughput.
[Figure: client applications on a Web or application server farm, each with a Windows or Linux remote client, using load-balanced connections to a ScaleOut StateServer farm of Windows and Linux SOSS hosts]
Products: Grid Computing Edition
• Extends ScaleOut StateServer for use in high performance computing (HPC) applications.
• Provides advanced capabilities for parallel data analysis.
• Includes optional management tools.
• Complements SSI’s extended support plans.
[Figure: a master and compute servers with a SOSS service, and database servers marked as a data bottleneck]
Products: ScaleOut GeoServer Option
Global, Multi-Site Data Grids
• Extends SOSS across multiple sites.
• Protects against site-wide failures.
• Replicates data between SOSS farms.
• Employs scalable, hi-av connections.
• Automatically handles membership changes at remote sites.
• Can support both “push” and “pull” access models.
Reason #1: Faster Access Time
• Eliminates repeated network data transfers.
• Eliminates repeated object deserialization.
[Chart: Average Response Time in microseconds (axis 0 to 3500), DDG vs. DBMS; 10 KB objects, 20:1 read/update ratio]
Example of Faster API Read Access
• Example for direct API access:
– 10 KB objects, 20:1 read/update ratio
– 3-host ScaleOut StateServer store with 3 clients
• Results:
– Distributed cache provided >6X faster read time than database server.
Reason #2: Linearly Scalable Throughput
Tests performed in Microsoft Enterprise Engineering Center.
[Chart: Read/Write Throughput for 10 KB objects; accesses per second (axis 0 to 80,000) against number of nodes (4 to 64) and number of objects (16,000 to 256,000)]
ScaleOut StateServer automatically scales its performance to match the size and workload of a server farm or HPC compute grid.
What is Scalable Throughput?
• What it is (a perfect fit for server farms):
– Workload W takes time T on 1 server (1 W/T).
– Workload 2W takes time T on 2 servers (2 W/T).
– Workload nW takes time T on n servers (n W/T).
– Total completion time (i.e., response time) stays fixed.
• What it is not (common misperception):
– Workload W takes time T/2 on 2 servers (2 W/T).
– Workload W takes time T/n on n servers (n W/T).
• Why increase the workload with more servers?
– Adding servers adds overhead (e.g., networking).
– Increasing workload hides overheads for linear scaling.
– DDG must keep overheads low for linear scaling.
– Must not let network saturate! (Its throughput is fixed.)
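The definition above can be checked with a little arithmetic. The following Java sketch uses made-up values for W and T (both are assumptions chosen only for illustration) and shows that when the workload grows in step with the number of servers, completion time stays at T while throughput grows as n W/T.

```java
public class ScalableThroughput {
    /** Throughput is simply work completed per unit time. */
    public static double throughput(double workload, double seconds) {
        return workload / seconds;
    }

    public static void main(String[] args) {
        double w = 10_000; // hypothetical workload W handled by one server (accesses)
        double t = 2.0;    // hypothetical completion time T (seconds)

        // Scalable throughput: the workload grows to n*W, the time stays at T.
        for (int n = 1; n <= 4; n *= 2) {
            System.out.printf("n=%d  workload=%.0f  time=%.1f s  throughput=%.0f/s%n",
                    n, n * w, t, throughput(n * w, t));
        }
    }
}
```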
How SOSS Achieves Scalable Throughput
• Fully peer-to-peer architecture to eliminate bottlenecks.
• Automatically partitioned data storage with dynamic load-balancing.
• Fixed number of replicas per stored object (1 or 2) to avoid order-n overhead (storage and latency)
• Patented technique for scaling quorum updates to stored objects
• Patented, scalable heart-beating algorithm
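To see why a fixed replica count avoids order-n overhead, consider a simple placement rule. The sketch below is not ScaleOut's patented method; it assumes a plain hash-based partitioning scheme (the `hostsFor` helper and the host numbering are hypothetical) in which each object lives on one primary host plus a fixed number of neighbors, so the number of copies does not grow with the size of the farm.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicaPlacement {
    /**
     * Returns the host indexes that hold an object: one primary chosen by hashing
     * the key, followed by a fixed number of replicas on the next hosts in order.
     */
    public static List<Integer> hostsFor(String key, int hostCount, int replicas) {
        if (hostCount < 1) {
            throw new IllegalArgumentException("hostCount must be at least 1");
        }
        int copies = Math.min(1 + replicas, hostCount); // cannot exceed the farm size
        int primary = Math.floorMod(key.hashCode(), hostCount);
        List<Integer> hosts = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            hosts.add((primary + i) % hostCount);
        }
        return hosts;
    }

    public static void main(String[] args) {
        // With one replica per object, an update touches two hosts
        // whether the farm has 4 hosts or 64.
        System.out.println("4 hosts:  " + hostsFor("myObj", 4, 1));
        System.out.println("64 hosts: " + hostsFor("myObj", 64, 1));
    }
}
```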
[Figure: ScaleOut StateServer distributed cache: four Web or application servers on Ethernet, each running a cache service; an object and its replica copy reside on different servers, and heartbeats are exchanged between servers]
Integrated, Powerful Platform for Scaling
• All product features benefit from the scalable, hi-av architecture:
– Ex. Parallel object eventing:
  • All hosts handle events.
  • Event delivery is hi-av.
– Ex. Global replication:
  • All hosts replicate objects.
  • Caches automatically handle membership changes.
[Figure: client applications with client libraries connected to the cache services of a ScaleOut StateServer distributed cache, with a local farm and a remote farm]
Impact of Scalable TP on Access Latency
• Scalable, distributed data grid scales throughput and thereby maintains low latency:
– DDG scales throughput by adding servers.
– Avoids throughput barrier of a DBMS or file system.
– Maintains low latency as throughput increases.
– Network bandwidth is only throughput limit.
– Also has inherently lower latency due to:
  • Memory-based storage
  • Client-side caching
[Chart: Access Latency vs. Throughput; access latency (msec) against throughput (accesses / sec) for SOSS and DBMS]
Putting it Together: How SOSS Works
• Creating or updating an object:
– Client connects to a SOSS service instance and makes request.
– Local SOSS service load-balances request to a selected host.
– Selected host creates object and one or two remote replicas.
[Figure: a client connected to four servers, each running SOSS]
How SOSS Works
• Reading an object:
– Client connects to SOSS service and makes request.
– Local SOSS service forwards to selected host.
– Selected host returns object’s data.
– Requesting host caches object for future reads.
[Figure: a client connected to four servers, each running SOSS]
How SOSS Works
• Adding a new host:
– Neighboring hosts detect SOSS on new host.
– Hosts automatically establish new membership.
– Neighbor hosts migrate objects to new host to rebalance load.
[Figure: four servers running SOSS, joined by a fifth server running SOSS]
Reason #3: High Availability
• Recovering from a host failure:
– Host or NIC fails.
– Neighboring hosts detect heartbeat failure.
– Hosts establish new membership.
– Neighbor host creates new object replica to “self-heal”.
[Figure: four servers running SOSS, one of them stopped]
SOSS: Integrated High Availability
• Peer-to-peer architecture for maximum redundancy & scalability
• Fully integrated data replication for data redundancy, scalability, and ease of use:
– Partial replicas ensure scalable storage and throughput.
– Per-server and per-client caches ensure fast access.
• Self-discovery and self-healing for hi-av and ease of use
• Patented quorum algorithm for reliable updating with scalability
[Figure: a client application with client library retrieving an object from the ScaleOut StateServer distributed cache (four cache services), showing the object, its replica copy, and a cached copy]
Reason #4: Sharing Data Across the Farm
The first step for server farms (1998): load-balanced, stateless, Web applications:
• Without the ability to share data, we need “sticky” sessions (no hi av!):
• Or we can overload the database server:
• Or we can share data across the farm in a distributed data grid for both scalability & high av.
[Figure: Web servers connected to the Internet and a DBMS server over Ethernet, with a SOSS service on each Web server]
The Evolution in DDGs and Data Sharing
[Chart: market penetration of DDGs from 2005 to 2011; usage labels: Session-state Storage, Application Caching, Grid Computing, Data Analysis, Platform-wide Usage]
Chart annotations:
• Early adoption on Web and app. server farms for speed and hi-av
• Expansion to new verticals (e.g., financial services) for data & compute grids
• Cloud computing using industry-standard APIs
Drivers:
• Scaling data access & analysis are critical to competitiveness.
• Server farms & the cloud are now mainstream computing platforms.
• Data access is a key bottleneck.
• Short dev. cycles are mandatory.
• Standard APIs are emerging.
Data Sharing: a Closer Look
• Advantages of sharing data in a distributed data grid:
– Boosts application performance and offloads the DBMS.
– Advances & simplifies the programming model:
  • Allows “stateful” business objects
  • Keeps object/relational mapping at the data access layer
  • Examples: session & profile data, business objects, workflow state
• Requirements of a distributed data grid:
– Coherent storage so all clients see a consistent view
– Easy-to-use APIs
– Integrated object locking to enable coordinated updating
– High availability to avoid data loss if a server fails
– Advanced features to enable effective use of the grid (e.g., parallel query, map/reduce analysis)
Basic APIs for Data Access
• Are easy to use in C#, Java, or C/C++.
• Store objects in the grid as serialized blobs.
• Primarily use string or numeric keys to identify objects.
• Group objects into name spaces (“named caches”).
[Figure: an object identified by its key]
// Read and update object:
MyClass retrievedObj;
retrievedObj = cache["myObj"] as MyClass;
retrievedObj.var1 = "Hello, again!";
cache["myObj"] = retrievedObj;
Example: Named Cache Access (Java)
// Uses java.util.UUID; imports for the ScaleOut client classes are omitted.
public static void main(String[] argv)
{
    // Initialize string object to be stored:
    String s = "Test string";
    // Create a cache collection:
    SossCache cache = SossCacheFactory.getCache("MyCache");
    // Store object in ScaleOut StateServer (SOSS):
    CachedObjectId id = new CachedObjectId(UUID.randomUUID());
    cache.put(id, s);
    // Read object stored in SOSS:
    String answerJNC = (String)cache.get(id);
    // Remove object from SOSS:
    cache.remove(id);
}
Example: Named Cache Access (C#)
static void Main(string[] args)
{
    // Initialize object to be stored:
    SampleClass sampleObj = new SampleClass();
    sampleObj.var1 = "Hello, SOSS!";
    // Create a cache:
    SossCache cache = CacheFactory.GetCache("myCache");
    // Store object in the distributed cache:
    cache["myObj"] = sampleObj;
    // Read and update object stored in cache:
    SampleClass retrievedObj = null;
    retrievedObj = cache["myObj"] as SampleClass;
    retrievedObj.var1 = "Hello, again!";
    cache["myObj"] = retrievedObj;
    // Remove object from the cache:
    cache["myObj"] = null;
}
Fully Distributed Locking
• Goal: synchronize access to a stored object by multiple client threads.
• Two mechanisms: pessimistic and optimistic locking
• Pessimistic uses read-modify-write semantics:
– Can be set as default for all objects within a named cache.
– Reads to locked objects are automatically retried.
– Locks have timeouts to handle client failures.
– Simple reads and updates can bypass locks.
• Optimistic uses object’s version number to allow or inhibit an update:
– User supplies version number from read to a locking update.
– Benefit: one trip to the server if high probability of success.
string myObj = cache.Retrieve("key", true); // read and lock
...
cache.Update("key", "new value", true); // update and unlock
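The optimistic mechanism can be illustrated with a small, self-contained Java sketch. It does not use the ScaleOut API: `OptimisticStore`, `Versioned`, and the method names are hypothetical, and the store is a single in-process map rather than a distributed grid. It shows only the core rule that an update succeeds when the caller's version number matches the stored one.

```java
import java.util.HashMap;
import java.util.Map;

public class OptimisticStore {
    /** A stored value together with its version number. */
    public static final class Versioned {
        public final String value;
        public final long version;

        public Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> objects = new HashMap<>();

    public synchronized void create(String key, String value) {
        objects.put(key, new Versioned(value, 1));
    }

    /** Plain read: returns the value and the version the caller must present later. */
    public synchronized Versioned read(String key) {
        return objects.get(key);
    }

    /** Updates only if nobody else has changed the object since the caller read it. */
    public synchronized boolean update(String key, String newValue, long expectedVersion) {
        Versioned current = objects.get(key);
        if (current == null || current.version != expectedVersion) {
            return false; // stale version: caller should re-read and retry
        }
        objects.put(key, new Versioned(newValue, expectedVersion + 1));
        return true;
    }

    public static void main(String[] args) {
        OptimisticStore store = new OptimisticStore();
        store.create("key", "old value");
        Versioned seen = store.read("key");
        System.out.println("first update accepted:  " + store.update("key", "new value", seen.version));
        System.out.println("stale update accepted:  " + store.update("key", "other value", seen.version));
    }
}
```

When conflicts are rare, the update usually succeeds on the first attempt, which is the "one trip to the server" benefit noted above; a rejected update costs a re-read and a retry.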
Advanced API Features
• Object timeouts
• Distributed locking for coordinating access
• Object dependency relationships
• Asynchronous events on object changes
• Automatic access to a backing store
• Object eviction on high memory usage
• Object metadata
• Bulk insertion
• Authentication
• Custom serialization for compression & encryption
• Parallel query based on metadata or properties
Parallel Data Analysis
• The goal:
– Quickly analyze a large set of data for patterns and trends.
– Take advantage of scalable computing to shorten “time to insight.”
• Applications:
– Search
– Financial services
– Business intelligence
– Risk analysis
– Weather simulation
– Structural modeling
– Fluid-flow analysis
– Climate modeling
[Figure: NCAR Community Climate Model, http://www.vets.ucar.edu/vg/IPCC_CCSM3/index.shtml]
Reason #5: Parallel Data Analysis
• Rapid analysis of large data sets has become a top priority.
• Distributed data grids enable fast parallel analysis:
– Automatically harness the power of many servers and cores.
– Offer a simple, easy-to-use development model.
– Deliver top performance for memory-based datasets.
• Key attributes of DDG-based data analysis:
– Data is memory-based and data motion is minimized.
– Programming model is object-oriented; parallelism is automatic.
[Chart: PMI vs. Random Access Throughput Comparison for 2 MB time series objects; objects per second (axis 0 to 600) against number of nodes (4 to 32) and number of objects (512 to 4096); series: SOSS PMI, Random Access]
Parallel Query
• Goal: identify a set of objects with selected properties.
• Uses all grid servers to scale query performance.
• Uses fast, optimized lookup on each grid server.
[Figure: query steps: query the DDG in parallel, merge the keys into a list, then sequentially analyze all queried objects]
Parallel Query Example (Java)
• Mark class properties as indexes for SOSS query:
public class Stock implements Serializable {
    private String ticker;
    private int totalShares;
    private double price;
    @SossIndexAttribute
    public String getTicker() {
        return ticker;
    }
    …
}
• Define a query using these properties:
NamedCache cache = CacheFactory.getCache("Stocks", false);
Set keys = cache.queryKeys(Stock.class,
    or(equal("ticker", "GOOG"),
       equal("ticker", "ORCL")));
Parallel Query Example (C#)
• Mark class properties as indexes for SOSS query:
class Stock {
    [SossIndex]
    public string Ticker { get; set; }
    public decimal TotalShares { get; set; }
    public decimal Price { get; set; }
}
• Define a query using these properties. Objects are automatically read into memory:
NamedCache cache = CacheFactory.GetCache("Stocks");
var q = from s in cache.QueryObjects<Stock>()
        where s.Ticker == "GOOG" || s.Ticker == "ORCL"
        select s;
Console.WriteLine("{0} Stocks found", q.Count());
Parallel Method Invocation (“Map/Reduce”)
• Goal: analyze a set of objects with selected properties.
• Executes user’s code in parallel across the grid.
• Uses a parallel query to select objects for analysis.
[Figure: the in-memory distributed data grid runs map/reduce analysis: analyze data (map), then combine results (reduce)]
Example in Financial Services
Analyze trading strategies across stock histories:
Why?
• Back-testing systems help guard against risks in deploying new trading strategies.
• Performance is critical for “first to market” advantage.
• Uses significant amount of market data and computation time.
How?
• Write method E to analyze trading strategies across a single stock history.
• Write method M to merge two sets of results.
• Populate the data store with a set of stock histories.
• Run method E in parallel on all stock histories.
• Merge the results with method M to produce a report.
• Refine and repeat…
Stage the Data for Analysis
• Step 1: Populate the distributed data grid with objects each of which represents a price history for a ticker symbol:
Code the Eval and Merge Methods
• Step 2: Write a method to evaluate a stock history based on parameters:
Results EvalStockHistory(StockHistory history, Parameters params)
{
    <analyze trading strategy for this stock history>
    return results;
}
• Step 3: Write a method to merge the results of two evaluations:
Results MergeResults(Results results1, Results results2)
{
    <merge both results>
    return results;
}
• Notes:
– This code can be run as a sequential calculation on in-memory data.
– No explicit accesses to the distributed data grid are used.
Run the Analysis
• Step 4: Invoke parallel evaluation and merging of results:
Results Invoke(EvalStockHistory, MergeResults, querySpec, params);
[Figure: EvalStockHistory() and MergeResults() applied across the grid]
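The eval/merge pattern in steps 2 to 4 can be tried out on a single machine with Java parallel streams. This sketch is an analogy, not the SOSS PMI API: `StockHistory`, `Results`, the buy-and-hold "strategy", and `invoke` are all hypothetical, and the parallelism comes from the local thread pool rather than from grid servers.

```java
import java.util.Arrays;
import java.util.List;

public class EvalMergeSketch {
    /** Hypothetical price history for one ticker (oldest closing price first). */
    public static final class StockHistory {
        public final String ticker;
        public final double[] closes;

        public StockHistory(String ticker, double[] closes) {
            this.ticker = ticker;
            this.closes = closes;
        }
    }

    /** Hypothetical result: the best-performing ticker seen so far and its return. */
    public static final class Results {
        public final String bestTicker;
        public final double bestReturn;

        public Results(String bestTicker, double bestReturn) {
            this.bestTicker = bestTicker;
            this.bestReturn = bestReturn;
        }
    }

    /** Method E: evaluates one history (here, a simple buy-and-hold return). */
    public static Results eval(StockHistory h) {
        double first = h.closes[0];
        double last = h.closes[h.closes.length - 1];
        return new Results(h.ticker, (last - first) / first);
    }

    /** Method M: merges two results by keeping the higher return. */
    public static Results merge(Results a, Results b) {
        return a.bestReturn >= b.bestReturn ? a : b;
    }

    /** Runs eval on every history in parallel, then merges the results pairwise. */
    public static Results invoke(List<StockHistory> histories) {
        return histories.parallelStream()
                .map(EvalMergeSketch::eval)
                .reduce(EvalMergeSketch::merge)
                .orElseThrow(() -> new IllegalArgumentException("no histories"));
    }

    public static void main(String[] args) {
        List<StockHistory> histories = Arrays.asList(
                new StockHistory("AAA", new double[] {10, 11, 12}),
                new StockHistory("BBB", new double[] {10, 9, 15}),
                new StockHistory("CCC", new double[] {10, 10, 8}));
        Results best = invoke(histories);
        System.out.println(best.bestTicker + " " + best.bestReturn);
    }
}
```

As in the notes for steps 2 and 3, `eval` and `merge` are ordinary sequential methods on in-memory data; the merge must be associative so that results can be combined in any pairing.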
[Figure: parallel analysis flow: the client starts the parallel analysis, .eval() runs on each stock history to produce results, pairs of results are combined by .merge() in successive stages, and the final results are returned to the client]
Advantages of Using PMI
• Fast:
– Automatically scales application performance across grid servers.
– Automatically uses all server cores.
– Minimizes data motion between servers.
– API-based invocation delivers very low latency.
• Easy to Use:
– User writes simple, “in memory” code; all grid accesses are implicit.
– Matches Java/C# model of object-oriented collections.
– Requires no tuning.
[Figure: a grid service with a PMI engine using four cores]
Comparison of DDGs and File-Based M/R

                        DDG                                        File-Based M/R
Data set size           Gigabytes->terabytes                       Terabytes->petabytes
Data repository         In-memory                                  File / database
Data view               Queried object collection                  File-based key/value pairs
Development time        Low                                        High
Automatic scalability   Yes                                        Application dependent
Best use                Quick-turn analysis of memory-based data   Complex analysis of large datasets
I/O overhead            Low                                        High
Cluster mgt.            Simple                                     Complex
High availability       Memory-based                               File-based
DDG Minimizes Data Motion
• File-based map/reduce must move data to memory for analysis:
[Figure: data (D) held in a file system / database is moved into server memory on the M/R servers before evaluation (E)]
• Memory-based DDG analyzes data in place:
[Figure: data (D) held in the distributed data grid is evaluated (E) on the grid servers where it resides]
[Figure: the same parallel analysis flow (.eval() on each stock history, staged .merge() of results, results returned to the client), with file I/O steps added]
Performance Impact of Data Motion
Measured random access to DDG data to simulate file I/O:
PMI Delivers 16X Speedup Over Hadoop
[Chart: Throughput Comparison; throughput (objects/sec, axis 0 to 800) against number of servers (4, 6, 8) for SOSS PMI, Hadoop/SOSS, and Hadoop]
Reason #6: Simplify Data Migration
• DDGs enable seamless data migration across on-premise sites and the cloud:
– Automatically access remote data as needed.
– Efficiently manage WAN bandwidth.
– Enable full data synchronization across sites.
[Figure: In-Memory Distributed Data Grid]
Example: Web Farm Cloud-Bursting
• DDGs bridge on-premise and cloud-based in-memory storage of Web session state.
• DDG automatically migrates session-state objects into the cloud on demand.
• This enables seamless access to data across multiple sites.
[Figure: on-premise applications (server apps) with an on-premise distributed data grid of SOSS hosts, and a cloud application (app virtual servers) with a cloud-hosted distributed data grid of SOSS virtual servers; a Web load balancer fronts both, data is migrated automatically between them, a backing store is shown, and the two grids together form a virtual distributed data grid]
Example: Global Access to Shared Data
[Figure: a global distributed data grid linking four distributed data grids of SOSS servers, arranged as mirrored data centers and satellite data centers]
What to Look for in a DDG Product
• Performance: In direct comparison tests, SSI demonstrates faster access performance and scalability in key benchmarks.
• Architecture: SSI’s architecture integrates both scalability and high availability and uniformly applies key architectural principles, such as peer-to-peer design.
• Ease of Use: SSI's products have an unusually high level of integration and focus on automatic operation. This dramatically simplifies deployment and management of a distributed data grid.
• Portability: Seamless interoperability across Windows and Unix (Linux, Solaris, etc.) operating systems was designed into SSI’s architecture from the outset.
• Data Analysis: Advanced capabilities for "map/reduce"-style parallel data analysis open up important new applications for distributed data grids.
• Manageability: SSI’s comprehensive tools for managing distributed data grids, such as its object browser and parallel backup and restore utility, are unique in the industry.
SOSS Maximizes Ease of Use
Tree list shows:
• Store status
• Host list
• Host status
• Remote stores
• Remote client configuration
Host configuration pane: Just need to select subnet shared by all hosts.
Grid servers self-aggregate, self-heal, and automatically load-balance.
Real-time Performance Charting
SOSS Object Browser
• Simplifies development.
• Provides extremely useful visibility into grid usage.
• Allows grid objects to be analyzed and managed.
SOSS Parallel Backup and Restore
• Enables grid contents (or portions) to be backed up or restored in parallel either to:
– Separate file systems on all caching servers or
– A single network file share
• Creates backups or snapshots for later analysis.
• Makes full use of SOSS’s parallel implementation to deliver highly scalable performance and high availability.
[Figure: two diagrams of servers running SOSS connected by Ethernet, illustrating the two backup targets]
Recap: Top 6 Reasons to Use a DDG
1. Faster access time for business logic state or database data
2. Scalable throughput to match a growing workload and keep response times low
3. High availability to prevent data loss if a grid server (or network link) fails
4. Shared access to data across the server farm
5. Advanced capabilities for quickly and easily mining data using scalable, “map/reduce,” analysis
6. Transparent data migration across multiple sites and the cloud.
[Chart: Access Latency vs. Throughput; access latency (msec) against throughput (accesses / sec) for Grid and DBMS]
Distributed Data Grids for
Server Farms & High Performance Computing
www.scaleoutsoftware.com
Thank you for joining us today!