Date post: | 09-Jun-2015 |
Upload: | guy-harrison |
© 2010 Quest Software, Inc. ALL RIGHTS RESERVED
Performance by Design
Guy Harrison
Director, R&D Melbourne
www.guyharrison.net
2
Introductions
3
http://www.motivatedphotos.com/?id=17760
4
[Chart: Star Trek shirt fatality analysis. Fatality percentage (0 to 80) by shirt colour: Blue, Yellow, Red]
5
Not worrying, just wondering...
• How will Oracle respond to Hadoop?
• Will Oracle play in the NoSQL database world?
• What will happen to MySQL?
• What will happen to red-shirt TOAD?
6
Core message
• Design limits performance
• Architecture maps requirements to design
• Make sure performance requirements are specified
• Make sure architecture allows for performance
• Make sure performance requirements are realized
7
Elements of Performance by Design
Methodology
•Define requirements
•Prototype
•Measurement and instrumentation
•Benchmarking
Database Design
•Logical and Physical
•Indexing, partitioning, clustering
•Denormalization
Application Architecture
•Minimize requests
•Optimize requests
8
Methodology
Requirements analysis
•Response time
•Throughput
•Data volumes
•Hardware budget
Prototype
•Data model
•Key transactions
•Data volumes
Benchmark
•Concurrency
•Transaction rates
•Data volumes
9
High performance can mean different things
Speed: response time
10
Efficiency: power consumption
11
Power: throughput
12
Not usually easy to change architectures
13
Poorly defined requirements lead to this:
14
The fail whale
15
Twitter growth
16
“Twitter is, fundamentally, a messaging system.
Twitter was not architected as a messaging
system, however. For expediency's sake, Twitter
was built with technologies and practices that are
more appropriate to a content management
system.”
17
Patterns of database performance
[Chart: growth curves for O(1), O(log n), O(n) and O(n²)]
Hard to distinguish patterns at low levels
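A small sketch of why the patterns are hard to tell apart at low levels. The cost functions and their constants are illustrative assumptions, not measurements from the talk; the point is only how the gap between the cheapest and the most expensive pattern widens with load.

```python
import math

# Hypothetical cost models; the unit constants are illustrative only.
PATTERNS = {
    "O(1)": lambda n: 1.0,
    "O(log n)": lambda n: math.log2(n) if n > 1 else 0.0,
    "O(n)": lambda n: float(n),
    "O(n^2)": lambda n: float(n * n),
}


def costs(n):
    """Cost of each pattern at load n."""
    return {name: fn(n) for name, fn in PATTERNS.items()}


def spread(n):
    """Ratio between the most and least expensive pattern at load n."""
    values = costs(n).values()
    return max(values) / max(min(values), 1.0)


for n in (2, 10, 100):
    print(n, costs(n), "spread:", spread(n))
```

At a load of 2 the four curves sit within a factor of 4 of each other; at 100 the factor is 10,000. A benchmark that only exercises small volumes can therefore hide which pattern a design follows.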
18
19
Database Design
Logical Modelling
•Normalize (enough but no further)
•Data types
•Artificial keys
Logical to physical
•Subtypes
•Table types (clustered, nested, heap)
•Nulls
•Denormalization
Indexing and physical storage
•Index and clustering strategies
•Partitioning
20
Normalize, but not too far!
"Make everything as simple as possible, but not
simpler."
21
Other logical design thoughts
• Artificial keys
– Generally more efficient than long composite keys
• Null values
– Not a good idea if you intend to search for “unknown” or “incomplete” values
– Null should not mean something
– But beneficial as long as you don’t need to look for them
• Data types
– Constraints on precision can sometimes reduce row lengths
– Variable length strings usually better
– Carefully consider CLOBs vs long VARCHARs
22
Logical to Physical: Subtypes
“Customers are people too”
23
Indexing, clustering and weird table types
• Lots of options:
– B*-Tree index
– Bitmap index
– Hash cluster
– Index Cluster
– Nested table
– Index Organized Table
• Most often useful:
– B*-Tree (concatenated) indexes
– Bitmap indexes
– Hash Clusters
24
25
Concatenated index effectiveness
SELECT cust_id
FROM sh.customers c
WHERE cust_first_name = 'Connor'
AND cust_last_name = 'Bishop'
AND cust_year_of_birth = 1976;
Index columns                Logical IO
None                         1459
last name                    63
last+first name              6
last,first,BirthYear         4
last,first,birthyear,id      3
26
Concatenated indexing guidelines
• Create a concatenated index for columns from a table that appear together in the WHERE clause.
• If columns sometimes appear on their own in a WHERE clause, place them at the start of the index.
• The more selective a column is, the more useful it will be at the leading end of the index (better single key lookups).
• But indexes compress better when the leading columns are less selective (better scans).
• Index skip scans can make use of an index even if the leading columns are not specified, but it’s a poor second choice to a “normal” index range scan.
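The leading-column guideline can be sketched with a runnable stand-in. This example uses SQLite through Python's sqlite3 module rather than Oracle, so the optimizer, plan text and costs differ from the figures in the slides; the index name cust_name_birth_ix and the simplified customers table are assumptions made for illustration.

```python
import sqlite3

# SQLite stands in for Oracle here; table and index names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " cust_id INTEGER PRIMARY KEY,"
    " cust_first_name TEXT,"
    " cust_last_name TEXT,"
    " cust_year_of_birth INTEGER)"
)
# Concatenated index covering the columns that appear together in the WHERE clause.
conn.execute(
    "CREATE INDEX cust_name_birth_ix ON customers"
    " (cust_last_name, cust_first_name, cust_year_of_birth)"
)


def plan(sql, params):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# All three indexed columns supplied: an index search is possible.
full_plan = plan(
    "SELECT cust_id FROM customers"
    " WHERE cust_first_name = ? AND cust_last_name = ?"
    " AND cust_year_of_birth = ?",
    ("Connor", "Bishop", 1976),
)

# Leading column on its own: the same index can still be searched.
last_only_plan = plan(
    "SELECT cust_id FROM customers WHERE cust_last_name = ?", ("Bishop",)
)

# Non-leading column on its own: printed for comparison with the two above.
first_only_plan = plan(
    "SELECT cust_id FROM customers WHERE cust_first_name = ?", ("Connor",)
)

for label, text in (
    ("all columns", full_plan),
    ("last name only", last_only_plan),
    ("first name only", first_only_plan),
):
    print(label, "->", text)
```

The first two plans report a search on cust_name_birth_ix. The third query supplies only a non-leading column, which is the case where Oracle would have to fall back on a skip scan or a full scan.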
27
Bitmap indexes
28
Bitmap indexes
[Chart: elapsed time (s, log scale from 0.01 to 100) versus distinct values in table (log scale from 1 to 1,000,000) for Bitmap index, B*-Tree index and Full table scan]
29
30
Bitmap join performance
SELECT SUM (amount_sold)
FROM customers JOIN sales s USING (cust_id) WHERE
cust_email='[email protected]';
Access path          Logical IO
Bitmap Join index    68
Bitmap index         1,524
Full table scan      13,480
31
Index overhead
Number of indexes    Logical reads required
1 (PK only)          1,191
2                    6,671
3                    8,691
4                    10,719
5                    12,727
6                    14,285
7                    16,316
32
Hash Cluster
• Cluster key determines physical location on disk
• Single IO lookup by cluster key
• Misconfiguration leads to overflow or sparse tables
[Diagram: Sparse versus Overflow hash clusters]
33
Hash Cluster vs B-tree index
Access method                        Logical reads
B-tree index                         3
Hash (hashkeys=100000, size=1000)    1
Hash (hashkeys=1000, size=50)        9
34
Hash cluster table scan
Table type                           Logical reads
Heap table                           1,458
Hash (hashkeys=100000, size=1000)    3,854
Hash (hashkeys=1000, size=50)        1,716
35
Denormalization and partitioning
• Repeating groups – VARRAYS, nested tables
• Summary tables – Materialized Views, Result cache
• Horizontal partitioning – Oracle Partition Option
• In-line aggregations – Dimensions
• Derived columns – Virtual columns
• Vertical partitioning
• Replicated columns – triggers
36
Summary tables
• Aggregate queries on big tables are often the most expensive
• Pre-computing them makes a lot of sense
• Balance accuracy with overhead
[Diagram: accuracy versus efficiency trade-off across Aggregate Query, MV stale tolerated, MV on COMMIT, Manual Summary and Result set cache]
37
Vertical partitioning
38
Physical storage options
• LOB Storage
• PCTFREE
• Compression
• Block size
• Partitioning
39
40
Application Architecture and implementation
SQL Statement Management
•Reduce requests through application caching
•Reduce “hard” parsing using bind variables
Transaction design
•Minimize lock duration
•Optimistic and Pessimistic locking strategies
Network overhead
•Array fetch and Insert
•Stored procedures
41
The best SQL is no SQL
• Avoid asking for the same data twice.
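A minimal sketch of program-level caching, assuming a small lookup table that rarely changes. SQLite stands in for Oracle, and the products table, the product_name function and the stats counter are hypothetical names introduced for illustration.

```python
import sqlite3
from functools import lru_cache

# SQLite stands in for Oracle; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (prod_id INTEGER PRIMARY KEY, prod_name TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'Widget')")

# Counts how many times the database is actually queried.
stats = {"queries": 0}


@lru_cache(maxsize=1024)
def product_name(prod_id):
    """Look up a product name, hitting the database only on a cache miss."""
    stats["queries"] += 1
    row = conn.execute(
        "SELECT prod_name FROM products WHERE prod_id = ?", (prod_id,)
    ).fetchone()
    return row[0] if row else None


# A thousand requests for the same key issue a single SQL statement.
for _ in range(1000):
    product_name(1)

print("database queries issued:", stats["queries"])
```

The trade-off is staleness: a cache like this never notices a changed row, so it suits reference data, or needs an explicit invalidation step such as product_name.cache_clear().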
42
11g client side cache
• CLIENT_RESULT_CACHE_SIZE: this is the amount of memory each client program will dedicate to the cache.
• Use RESULT_CACHE hint or (11GR2) table property
• Optionally set the CLIENT_RESULT_CACHE_LAG
Caching approach     Elapsed time (ms)
11g client Cache     1,250
Program caching      1,438
NoCaching            6,265
43
Parse overhead
• It’s easy enough in most programming languages to create a unique SQL for every query:
44
Bind variables are preferred
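The contrast between literal SQL and bind variables can be sketched as follows. SQLite's "?" placeholders stand in for Oracle bind variables, and the table and function names are assumptions made for illustration; the count of distinct statement texts is a proxy for how many statements a server would have to hard parse.

```python
import sqlite3

# SQLite stands in for Oracle; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"Name{i}") for i in range(1, 101)],
)


def run_with_literals(ids):
    """Embed each value in the SQL text: every statement is unique."""
    texts = set()
    for cust_id in ids:
        sql = f"SELECT cust_last_name FROM customers WHERE cust_id = {int(cust_id)}"
        conn.execute(sql).fetchone()
        texts.add(sql)
    return len(texts)


def run_with_binds(ids):
    """Pass each value as a bind parameter: one statement text is reused."""
    texts = set()
    sql = "SELECT cust_last_name FROM customers WHERE cust_id = ?"
    for cust_id in ids:
        conn.execute(sql, (cust_id,)).fetchone()
        texts.add(sql)
    return len(texts)


literal_count = run_with_literals(range(1, 101))
bound_count = run_with_binds(range(1, 101))
print("distinct SQL texts with literals:", literal_count)
print("distinct SQL texts with binds:", bound_count)
```

With literals, 100 lookups produce 100 different SQL texts; with binds, one text serves all of them. Bind parameters also remove the need to sanitize values before building the statement.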
45
Parse overhead reduction
[Chart: elapsed time (ms, 0 to 1,400), broken down into HardParse, OtherParse and Other, for No Bind variables, Bind Variables and CURSOR_SHARING]
46
Identifying similar SQLs
See force_matching.sql at www.guyharrison.net
47
Transaction design
• Optimistic vs. Pessimistic
[Diagram: duration of lock under each strategy]
48
Using ORA_ROWSCN
• Setting ROWDEPENDENCIES will reduce false fails
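A minimal sketch of the optimistic pattern. ORA_ROWSCN is Oracle-specific, so this example substitutes an explicit version column in SQLite to play the same role: the update succeeds only if the row is unchanged since it was read. The accounts table and the function names are hypothetical.

```python
import sqlite3

# SQLite stands in for Oracle; a version column stands in for ORA_ROWSCN.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 1)")
conn.commit()


def read_account(account_id):
    """Read the row without locking it, remembering the version seen."""
    return conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()


def optimistic_update(account_id, new_balance, version_seen):
    """Apply the update only if nobody changed the row since it was read."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1"
        " WHERE id = ? AND version = ?",
        (new_balance, account_id, version_seen),
    )
    conn.commit()
    return cur.rowcount == 1


balance, version = read_account(1)
first = optimistic_update(1, balance + 50, version)   # version still matches
second = optimistic_update(1, balance - 30, version)  # version is now stale
print(first, second, read_account(1))
```

The lock is held only for the duration of the UPDATE itself. The cost is that the second writer must detect the failure and retry, which is the kind of fail that ROWDEPENDENCIES makes less likely to occur falsely in Oracle.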
49
Network – stored procedures
50
Network traffic example
[Chart: elapsed time (ms, 0 to 1,800) for Stored Procedure versus Java client, on Local Host and Remote Host; plotted values 344, 1703, 297 and 313]
51
Array processing - Fetch
52
Network overhead – Array processing
[Chart: Logical Reads and Network round trips (0 to 40,000) versus array fetch size (0 to 140)]
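A sketch of how array fetch size changes the number of fetch calls. SQLite is in-process, so there is no real network here; each fetchmany call stands in for one client-server round trip. The sales table and the fetch_all function are illustrative assumptions.

```python
import sqlite3

# SQLite stands in for Oracle; each fetchmany() call models one round trip.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount_sold REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [(i, i * 1.5) for i in range(1, 1001)]
)


def fetch_all(array_size):
    """Fetch every row in batches of array_size; return (rows, fetch calls)."""
    cur = conn.execute("SELECT sale_id, amount_sold FROM sales ORDER BY sale_id")
    cur.arraysize = array_size  # default batch size used by fetchmany()
    calls = 0
    rows = 0
    while True:
        batch = cur.fetchmany()
        calls += 1
        if not batch:
            break
        rows += len(batch)
    return rows, calls


for size in (1, 10, 100):
    rows, calls = fetch_all(size)
    print(f"array size {size}: {rows} rows in {calls} fetch calls")
```

The call count includes the final empty fetch that signals the end of the result set, so 1,000 rows take 1,001 calls at size 1 and 11 calls at size 100.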
53
Array Insert (Java)
54
Array Insert: (.NET)
55
Array Insert – PL/SQL
56
Array Insert Performance
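The same idea applies to inserts. This sketch uses Python's sqlite3 executemany as a stand-in for the Java, .NET and PL/SQL array interfaces named in the preceding slides; the table t, the batch size of 500 and the function names are assumptions, and the call count models round trips rather than measuring elapsed time.

```python
import sqlite3

ROWS = [(i, f"row {i}") for i in range(1, 5001)]


def new_table():
    """Create a fresh in-memory database with an empty target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def insert_row_by_row(conn, rows):
    """One execute call per row."""
    calls = 0
    for row in rows:
        conn.execute("INSERT INTO t VALUES (?, ?)", row)
        calls += 1
    conn.commit()
    return calls


def insert_in_arrays(conn, rows, batch_size=500):
    """One executemany call per batch of rows."""
    calls = 0
    for start in range(0, len(rows), batch_size):
        conn.executemany(
            "INSERT INTO t VALUES (?, ?)", rows[start:start + batch_size]
        )
        calls += 1
    conn.commit()
    return calls


single_calls = insert_row_by_row(new_table(), ROWS)
array_conn = new_table()
array_calls = insert_in_arrays(array_conn, ROWS)
inserted = array_conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(single_calls, array_calls, inserted)
```

Both approaches load the same 5,000 rows; the array version issues 10 calls instead of 5,000.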
너를 감사하십시요 Thank You Danke Schön
Gracias 有難う御座いました Merci
Grazie Obrigado 谢谢
58
2) Who is C. Montgomery Burns' assistant?
3) Who is Bart's teacher? Lisa's?
6) Kent ______ is the local newscaster.
7) ____-_-____ is the local convenience store.
• Answers: Brockman; Kwik-E-Mart; Ms Krabappel, Mrs. Hoover; Waylon Smithers