www.luxoft.com
APPLICATION PERFORMANCE: DATABASE-RELATED PROBLEMS
Evgeniy Khyst
26.04.2016
Application Performance: Database-related Problems
● Application performance;
● Common performance problems and their solutions;
● Database-related problems;
● Lock contention;
● Locking mechanism;
● Transaction isolation level;
● URL shortener example;
● Hi/Lo algorithm;
● Payment system example.
Application Performance
● Key performance metrics: request processing time; throughput.
● Poor performance: long time to process a single request; low number of requests processed per second.
Request Processing Time
Request processing time = 4 seconds
Throughput
Throughput = 1/4 req/sec = 15 req/min
Throughput
Throughput = 3/4 req/sec = 45 req/min
Throughput
Throughput = 10/4 req/sec = 150 req/min
Common Performance Problems and Their Solutions
● Database-related problems;
● JVM performance problems;
● Application-specific performance problems;
● Network-related problems.
Database-related Performance Problems
● Query execution time is too long;
● Too many queries per single business function;
● Database connection management problems.
Query Execution Time Is Too Long
● Missing indexes;
● Slow SQL queries (sub-queries, too many JOINs, etc.);
● Slow SQL queries generated by ORM;
● Suboptimal JDBC fetch size;
● Non-parameterized statements for queries;
● Lack of proper data caching;
● Lock contention.
Missing Indexes
To find out which indexes to create, look at the query execution plan.
In Oracle Database it is done as follows:
EXPLAIN PLAN FOR SELECT isbn FROM book;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
TABLE ACCESS FULL
● In a full table scan, each row of the table is read sequentially and the encountered columns are checked against the query condition;
● Full table scans are the slowest way of scanning a table in most cases;
● Create the missing indexes so the database can search by index instead of performing a full table scan.
Slow SQL Queries
● Slow SQL queries (sub-queries, too many JOINs, etc.). Solution: rewrite the query;
● Slow SQL queries generated by ORM: JPQL/HQL and Criteria API queries are translated to SQL. Solutions: rewrite the JPQL/HQL or Criteria API queries; replace them with a plain SQL query.
Suboptimal JDBC Fetch Size
JDBC lets you specify the number of rows fetched with each database round trip for a query; this number is referred to as the fetch size.
Solutions:
● java.sql.Statement.setFetchSize(rows)
● hibernate.jdbc.fetch_size property
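To see why the fetch size matters: streaming a result set takes ceil(totalRows / fetchSize) network round trips. A minimal sketch in plain Java (no real database is involved; the row counts are illustrative assumptions, and 10 is Oracle's JDBC default fetch size):

```java
public class FetchSizeDemo {
    // Number of network round trips needed to fetch all rows
    // when the driver pulls fetchSize rows per trip.
    static long roundTrips(long totalRows, int fetchSize) {
        return (totalRows + fetchSize - 1) / fetchSize; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 100_000;
        // With 100,000 rows, the default fetch size of 10 means 10,000
        // round trips, versus 200 with a fetch size of 500.
        System.out.println(roundTrips(rows, 10));   // prints 10000
        System.out.println(roundTrips(rows, 500));  // prints 200
    }
}
```

Raising the fetch size trades memory on the client for fewer round trips; it does not change which rows are returned.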
Non-parameterized Statements for Queries
When a database receives an SQL statement, it:
● parses the statement and looks for syntax errors,
● generates the access plan (checks which indexes can be used, etc.),
● executes the statement.
Problem: access plan generation takes CPU power.
Non-parameterized Statements for Queries
● The database caches the computed access plan;
● JEE application servers cache PreparedStatement instances.
Solution: reusing the previous access plan or PreparedStatement saves CPU power.
Non-parameterized Statements for Queries
The entire statement is the key in the cache:
SELECT * FROM tbl WHERE name='a'
and
SELECT * FROM tbl WHERE name='b'
will have different cache entries, because name='b' is different from the cached name='a'.
Non-parameterized Statements for Queries
for (String name : names) {
    Statement stmt = conn.createStatement();
    ResultSet rs = stmt.executeQuery("SELECT * FROM tbl WHERE name = '" + name + "'");
    /* … */
}
The cache won't be used; a new access plan is computed for each iteration.
PreparedStatement ps = conn.prepareStatement("SELECT * FROM tbl WHERE name = ?");
for (String name : names) {
    ps.setString(1, name);
    ResultSet rs = ps.executeQuery();
    /* … */
}
The database reuses the access plan for the statement parameterized with '?'.
Lack of Proper Data Caching
Solutions:
● Enable the ORM second-level cache;
● Enable the ORM query cache;
● Implement a custom cache.
Lock Contention
Operations wait a long time to obtain a lock due to high lock contention.
Solution: revise the application logic and implementation:
● Update asynchronously;
● Replace updates with inserts (inserts are not blocking).
Too Many Queries per Single Business Function
● Insert/update queries executed in a loop;
● The "SELECT N+1" problem;
● Reduce the number of calls hitting the database.
Insert/Update Queries Executed in a Loop
● Use JDBC batching (keep the batch size under 1000);
● hibernate.jdbc.batch_size property;
● Periodically flush changes and clear the Session/EntityManager to control the first-level cache size.
JDBC Batch Processing
PreparedStatement preparedStatement = connection.prepareStatement(
        "UPDATE book SET title=? WHERE isbn=?");

preparedStatement.setString(1, "Patterns of Enterprise Application Architecture");
preparedStatement.setString(2, "007-6092019909");
preparedStatement.addBatch();

preparedStatement.setString(1, "Enterprise Integration Patterns");
preparedStatement.setString(2, "978-0321200686");
preparedStatement.addBatch();

int[] affectedRecords = preparedStatement.executeBatch();

for (int i = 0; i < 100000; i++) {
    Book book = new Book(.....);
    session.save(book);
    if (i % 20 == 0) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
"SELECT N+1" Problem
● The first query selects the root entities only, and each associated collection is selected with an additional query;
● So the persistence provider generates N+1 SQL queries, where N is the number of root entities in the result list of the user query.
"SELECT N+1" Problem
Solutions:
● Use a different fetching strategy or an entity graph;
● Make child entities aggregate roots and use DAO methods to fetch them: replace the bidirectional one-to-many mapping with a unidirectional one;
● Enable the second-level and query caches.
Reduce the Number of Database Calls
Solutions:
● Use the Hi/Lo algorithm;
● Enable the ORM second-level cache;
● Enable the ORM query cache;
● Implement a custom cache.
Database Connection Management Problems
● The application uses too many DB connections:
The application is not closing connections after use. Solution: close all connections after use.
The DB cannot handle as many connections as the application uses. Solution: use connection pooling.
● The application waits too long to get a connection from the pool. Solution: increase the pool size.
JVM Performance Problems
Excessive JVM garbage collection slows down the application.
Solutions:
● Analyze the garbage collector logs. Send GC data to a log file and enable GC log rotation:
-Xloggc:gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M -XX:+PrintGCTimeStamps
● Tune the GC. Use the Garbage-First collector: -XX:+UseG1GC
Application-specific Performance Problems
Resource-consuming computations:
● Algorithms with complexity O(N^2), O(2^N);
● Asymmetric RSA encryption;
● Bcrypt hashing during authentication;
● Etc.
Solution: horizontal scalability. Increase the number of instances capable of processing requests and balance the load (create a cluster).
Network-related Problems
● Network latency;
● Unconfigured timeouts:
mail.smtp.connectiontimeout: socket connection timeout. Default is infinite timeout.
mail.smtp.timeout: socket read timeout. Default is infinite timeout.
Reducing Lock Contention
● Database-related problems
  • Query execution time is too long
    • Lock contention
Solutions:
● Use the Hi/Lo algorithm;
● Update asynchronously;
● Replace updates with inserts.
Locking Mechanism
Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource.
In general, multi-user databases use some form of data locking to solve the problems associated with:
● data concurrency,
● consistency,
● integrity.
Isolation Levels vs Locks
● Transaction isolation level does not affect the locks that are acquired to protect data modifications.
● A transaction always gets an exclusive lock on any data it modifies and holds that lock until the transaction completes, regardless of the isolation level set for that transaction.
● For read operations transaction isolation levels primarily define the level of protection from the effects of modifications made by other transactions.
Preventable Read Phenomena
● Dirty reads - A transaction reads data that has been written by another transaction that has not been committed yet.
● Nonrepeatable reads - A transaction rereads data it has previously read and finds that another committed transaction has modified or deleted the data.
● Phantom reads - A transaction reruns a query returning a set of rows that satisfies a search condition and finds that another committed transaction has inserted additional rows that satisfy the condition.
Standard Transaction Isolation Levels
● Read uncommitted
● Read committed
● Repeatable read
● Serializable
Isolation Levels vs Read Phenomena

                   Dirty reads    Nonrepeatable reads    Phantom reads
Read uncommitted   Possible       Possible               Possible
Read committed     Not possible   Possible               Possible
Repeatable read    Not possible   Not possible           Possible
Serializable       Not possible   Not possible           Not possible
Default Isolation Level
Read committed is the default isolation level.
Read Committed Isolation Level
In read committed, reads are not blocking.
Read Committed Isolation Level
Conflicting writes block in read committed transactions.
Pessimistic and Optimistic Locking
Pessimistic and optimistic locking are concurrency control mechanisms.
Pessimistic locking is a strategy where you lock the record when reading and then modify it:
SELECT name FROM tbl FOR UPDATE;
UPDATE tbl SET name = 'new value';
Optimistic locking is a strategy where you read the record together with a version number and then check that version when updating:
SELECT name, version FROM tbl;
UPDATE tbl SET name = 'new value', version = version + 1 WHERE version = :version;
Pessimistic and Optimistic Locking
● Pessimistic locking prevents lost updates and makes updates serial (FIFO), reducing throughput;
● Optimistic locking just prevents lost updates;
● If the version check in optimistic locking fails, the read and update queries should be re-executed;
● Optimistic locking reduces the time a lock is held and sometimes increases throughput.
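The optimistic read-check-retry loop described above can be sketched in plain Java against an in-memory record instead of a real database; the Row class and field names are assumptions for illustration:

```java
import java.util.concurrent.atomic.AtomicReference;

public class OptimisticDemo {
    // Immutable snapshot of a row: value plus version column.
    record Row(String name, long version) {}

    static final AtomicReference<Row> table = new AtomicReference<>(new Row("old", 0));

    // Mimics: UPDATE tbl SET name=?, version=version+1 WHERE version=:version
    // Returns false when the version check fails (someone updated in between).
    static boolean update(String newName, long expectedVersion) {
        Row current = table.get();
        if (current.version() != expectedVersion) {
            return false; // stale read: caller must re-read and retry
        }
        return table.compareAndSet(current, new Row(newName, expectedVersion + 1));
    }

    static void updateWithRetry(String newName) {
        while (true) {
            Row snapshot = table.get();                // SELECT name, version FROM tbl
            if (update(newName, snapshot.version())) { // conditional UPDATE
                return;
            }
            // version check failed: loop re-reads and retries
        }
    }

    public static void main(String[] args) {
        boolean accepted = update("a", 99); // wrong version: rejected
        updateWithRetry("b");               // read-then-update loop succeeds
        System.out.println(accepted + " " + table.get()); // prints: false Row[name=b, version=1]
    }
}
```

No lock is held between the read and the update; a concurrent writer simply causes one extra retry, which is the throughput trade-off the slides describe.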
URL Shortener Example
Requirements:
● Receives a URL and returns a "shortened" version;
● E.g. post "http://github.com" to "http://url-shortener/s/" and get back "http://url-shortener/s/2Bi";
● The shortened URL can be resolved to the original URL, e.g. "http://url-shortener/s/2Bi" will return "http://github.com";
● Shortened URLs that have not been accessed for longer than some specified amount of time should be deleted.
URL Shortener Example
● Each time a URL is submitted, a new record is inserted into the database;
● Insert operations do not introduce locks in the database;
● A database sequence is used for primary key generation;
● The Hi/Lo algorithm reduces the number of database hits to improve performance.
URL Shortener Example
● The original URL's primary key is converted to radix 62. The radix-62 alphabet contains digits and lower- and upper-case letters: 10000 in radix 10 = 2Bi in radix 62;
● The string identifying the original URL is converted back to radix 10 to get the primary key value, and the original URL can be found by ID.
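The radix-62 conversion can be sketched in plain Java. The alphabet order below (digits, then lower-case, then upper-case letters) is an assumption, chosen because it makes 10000 encode to "2Bi" exactly as in the slides:

```java
public class Base62 {
    // Assumed alphabet: 0-9, a-z, A-Z (62 symbols).
    static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Convert a primary key to its radix-62 string (the short URL path).
    static String encode(long id) {
        if (id == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        }
        return sb.reverse().toString();
    }

    // Convert the short string back to the primary key.
    static long decode(String s) {
        long id = 0;
        for (char c : s.toCharArray()) {
            id = id * 62 + ALPHABET.indexOf(c);
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(encode(10000)); // prints 2Bi
        System.out.println(decode("2Bi")); // prints 10000
    }
}
```

Resolving "http://url-shortener/s/2Bi" is then just decode("2Bi") followed by a primary-key lookup.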
URL Shortener Example
E.g. the URL "http://github.com/" shortened to "http://url-shortener/s/2Bi":
● Insert a new record with id 10000 for the original URL "http://github.com/", representing the "shortened" URL;
● Convert id 10000 to radix 62: 2Bi.
URL Shortener Example
● Each time a shortened URL is resolved, the last-view timestamp is updated in the database and the total-views column is incremented;
● These updates should be asynchronous so as not to reduce performance due to lock contention;
● The absence of update operations on the request path gives the application better scalability and throughput.
Update Asynchronously
● When a URL is resolved, a JMS message is sent to a queue;
● The application consumes messages from the queue and updates the records in the database;
● During URL resolution there are no update operations.
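The same decoupling can be sketched without a JMS broker, using a java.util.concurrent.BlockingQueue as a stand-in for the queue and a map as a stand-in for the database; the event shape and counter names are assumptions for illustration:

```java
import java.util.Map;
import java.util.concurrent.*;

public class AsyncStatsDemo {
    // Stand-in for the JMS queue: "URL resolved" events, keyed by URL id.
    static final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
    // Stand-in for the database: view counters per URL id.
    static final Map<Long, Integer> views = new ConcurrentHashMap<>();

    // Request path: resolving a URL only enqueues an event; no DB update here.
    static void resolve(long urlId) {
        queue.add(urlId);
    }

    // Consumer side: apply one queued event to the "database".
    static void processOne() throws InterruptedException {
        long id = queue.take();
        views.merge(id, 1, Integer::sum);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService consumer = Executors.newSingleThreadExecutor();
        // Background consumer processes three events asynchronously.
        Future<?> done = consumer.submit(() -> {
            try {
                for (int i = 0; i < 3; i++) processOne();
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        resolve(10000); resolve(10000); resolve(42);
        done.get(5, TimeUnit.SECONDS); // wait for the consumer to catch up
        consumer.shutdown();
        System.out.println(views.get(10000L) + " " + views.get(42L)); // prints 2 1
    }
}
```

The request path never touches the counters directly, so resolving stays insert-free and lock-free, matching the slide's point.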
Hi/Lo Algorithm
Using the Hi/Lo algorithm keeps different application nodes from blocking each other when generating IDs.
Hi/Lo Algorithm
JPA mapping:
@SequenceGenerator(name = "MY_SEQ", sequenceName = "MY_SEQ", allocationSize = 50)
allocationSize = N: fetch the next value from the database once every N persist calls and locally (in-memory) increment the value in between.
Sequence DDL:
CREATE SEQUENCE MY_SEQ INCREMENT BY 50 START WITH 50;
INCREMENT BY should match allocationSize.
START WITH should be greater than or equal to allocationSize.
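The allocation behavior can be sketched as a small in-memory allocator. The sequence stand-in below hands out values 50 apart, mimicking CREATE SEQUENCE ... INCREMENT BY 50 START WITH 50; treating each sequence value as the upper bound of a block is an assumption for illustration (real generators vary in how they interpret the hi value):

```java
public class HiLoDemo {
    // Stand-in for the database sequence: INCREMENT BY 50 START WITH 50.
    static long sequenceValue = 0;
    static int sequenceCalls = 0;

    static long nextSequenceValue() { // one "database hit"
        sequenceCalls++;
        return sequenceValue += 50;
    }

    static final int ALLOCATION_SIZE = 50;
    static long hi = 0;   // upper bound of the current in-memory block
    static long next = 1; // next id to hand out

    // Returns the next id, hitting the "database" only once per 50 calls.
    static long nextId() {
        if (next > hi) {
            hi = nextSequenceValue();          // fetch a new block [hi-49, hi]
            next = hi - ALLOCATION_SIZE + 1;
        }
        return next++;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 120; i++) nextId(); // 120 persist calls
        // 120 ids require only ceil(120/50) = 3 sequence round trips.
        System.out.println(sequenceCalls); // prints 3
    }
}
```

Each application node would fetch its own block, so nodes allocate IDs concurrently without blocking each other on the sequence.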
Payment System Example
Requirements:
● Users can add funds to their accounts (add funds);
● Users can pay shops with funds from their accounts (payment);
● Users and shops can withdraw money from their accounts (withdraw funds);
● The account balance must always be up to date.
Simple Solution 1
● Store the account balance in a table and update it on each operation.
● Advantage: simple.
Simple Solution 1 - Data Model
Table ACCOUNT_BALANCE
ACCOUNT_ID BALANCE
Simple Solution 1 - Queries
UPDATE ACCOUNT_BALANCE SET BALANCE = BALANCE + :amount WHERE ACCOUNT_ID = :account
SELECT ACCOUNT_ID, BALANCE FROM ACCOUNT_BALANCE WHERE ACCOUNT_ID = :account
Simple Solution 1 - Problems
● Update operations introduce locks;
● During Christmas holidays, users can make hundreds of payments simultaneously;
● Due to lock contention, payments will be slow;
● The system has low throughput.
Simple Solution 2
● Do not store the account balance at all;
● Store the details of each transaction;
● Calculate the balance dynamically based on the transaction log;
● Advantages: still simple enough; no update operations at all.
Simple Solution 2 - Data Model
Table TRANSACTION_LOG
TX_ID TX_TYPE TX_STATUS TX_DATE ACCOUNT_ID TX_AMOUNT
Simple Solution 2 - Queries
● Payment and withdrawal are two-step operations: an authorization step and a fulfillment step;
● First, the authorization step is done in a separate transaction;
● Next, the balance check and the fulfillment step are done in another transaction.
Simple Solution 2 - Queries
-- Authorization in a new transaction
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_STATUS, TX_DATE, ACCOUNT_ID, TX_AMOUNT) VALUES(:id, :type, 'AUTHORIZED', :date, :account, :amount)

-- Balance check and fulfillment in a new transaction
SELECT ACCOUNT_ID, SUM(TX_AMOUNT) AS BALANCE FROM TRANSACTION_LOG WHERE ACCOUNT_ID = :account GROUP BY ACCOUNT_ID

UPDATE TRANSACTION_LOG SET TX_STATUS = 'FULFILLED' WHERE TX_ID = :id
Simple Solution 2 - Problems
● Users can make thousands of transactions per day;
● During Christmas holidays, users can make thousands of payments per hour;
● The number of transactions grows continuously;
● The more records in the TRANSACTION_LOG table, the slower the requests.
Better Solution
● Store yesterday's balance in a table;
● Update the account balance once a day in the background;
● Store the details of each transaction;
● Calculate the balance dynamically from yesterday's balance and today's transactions in the transaction log.
Better Solution - Data Model
Table ACCOUNT_BALANCE
ACCOUNT_ID BALANCE_DATE BALANCE
Table TRANSACTION_LOG
TX_ID TX_TYPE TX_STATUS TX_DATE ACCOUNT_ID TX_AMOUNT
Better Solution - Queries
-- Authorization in a new transaction
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_STATUS, TX_DATE, ACCOUNT_ID, TX_AMOUNT) VALUES(:id, :type, 'AUTHORIZED', :date, :account, :amount)

-- Balance check and fulfillment in a new transaction
UPDATE TRANSACTION_LOG SET TX_STATUS = 'FULFILLED' WHERE TX_ID = :id

-- Executed once a day at midnight
UPDATE ACCOUNT_BALANCE SET BALANCE = BALANCE + :transactionLogSum, BALANCE_DATE = :lastTransactionLogDate WHERE ACCOUNT_ID = :account
Better Solution - Queries
SELECT ACCOUNT_ID, BALANCE_DATE, BALANCE AS CACHED_BALANCE FROM ACCOUNT_BALANCE WHERE ACCOUNT_ID = :account

SELECT ACCOUNT_ID, MAX(TX_DATE) AS LAST_TX_LOG_DATE, SUM(TX_AMOUNT) AS TX_LOG_SUM FROM TRANSACTION_LOG WHERE ACCOUNT_ID = :account AND TX_DATE > :balanceDate GROUP BY ACCOUNT_ID

-- BALANCE = CACHED_BALANCE + TX_LOG_SUM
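The balance computation above can be sketched in plain Java against an in-memory transaction log; the Tx class, the use of cents, and the sample amounts are assumptions for illustration:

```java
import java.time.LocalDate;
import java.util.List;

public class BalanceDemo {
    // One row of TRANSACTION_LOG: date and signed amount (+ add funds, - payment).
    record Tx(LocalDate date, long amountCents) {}

    // BALANCE = CACHED_BALANCE + sum of transactions made after the balance date.
    static long balance(long cachedBalanceCents, LocalDate balanceDate, List<Tx> log) {
        return cachedBalanceCents + log.stream()
                .filter(tx -> tx.date().isAfter(balanceDate))
                .mapToLong(Tx::amountCents)
                .sum();
    }

    public static void main(String[] args) {
        LocalDate yesterday = LocalDate.of(2016, 4, 25);
        List<Tx> log = List.of(
                new Tx(LocalDate.of(2016, 4, 24), 10_00),  // already in cached balance
                new Tx(LocalDate.of(2016, 4, 26), 5_00),   // today's add funds
                new Tx(LocalDate.of(2016, 4, 26), -2_00)); // today's payment
        // Cached balance 100.00 plus today's +5.00 and -2.00 = 103.00
        System.out.println(balance(100_00, yesterday, log)); // prints 10300
    }
}
```

Only transactions newer than BALANCE_DATE are summed, which is why the per-request SUM stays bounded to at most one day of rows.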
Better Solution - Advantages
● No updates during payment operations, so no locks;
● No locks, so better throughput;
● The number of rows in the query with the SUM operation is limited (1 day);
● Constant query execution time.
THANK YOU