
Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Date post: 10-May-2015
Upload: pmjones88
Solving the N+1 Problem; or, A Stitch In Time Saves Nine Dallas PHP User Group 21 Feb 2012 joind.in/event/view/894 paul-m-jones.com @pmjones
Transcript
Page 1: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Dallas PHP User Group
21 Feb 2012

joind.in/event/view/894

paul-m-jones.com
@pmjones

Page 2: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Read These

Page 3: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

About Me

• 8 years USAF Intelligence

• PHP since 1999

• Developer, Senior Developer, Team Lead, Architect, VP Engineering

• Aura project, benchmarking series, Solar framework, Savant template system, Zend_DB, Zend_View

• PEAR Group member, ZCE Education Advisory Board

Page 4: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Overview

• Performance benchmarking

• The N+1 problem

• Native solutions to the N+1 problem

• Libraries to help with the N+1 problem

Page 5: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Performance Benchmarking

Page 6: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Benchmarking Subjects

• CPU

• RAM

• Disk access

• Database access

• Network access

• Requests/second

• Programmer productivity

• Time to initial implementation

• Time to add new major feature

• Time to fix bugs

-- numeric measurement --
-- control for variables --

Page 7: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Luck and Limitations

• Do you feel lucky?

• A man’s got to know his limitations

• Hardware, OS, web server, language, framework, app

• Where in the stack to expend effort?

Page 8: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Outer Limits Of Responsiveness

• Amazon EC2 “Large” instance, 64-Bit Ubuntu 10.10 (Alestic)

• Apache 2.2 (stock install)

• PHP 5.3.3, APC, Suhosin (stock install)

• MySQL 5.1.49 (stock install)

Page 9: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Performance Measures

• Static index.html (Hello World!)

• Dynamic index.php (<?php echo 'Hello World!'; ?>)

• Database connect (mysql_* and PDO code)

• Database connect, query, and fetch (mysql_* and PDO code)

Page 10: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Baseline Performance

5 runs of 10 users for 60 seconds, averaged

          relative   average
html      1.2514     2726.35
php       1.0000     2178.63
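The "relative" column appears to be each average divided by the average for the plain PHP page. A minimal sketch that reproduces the figures under that assumption (the function name `relativeScore` is hypothetical, not from the slides):

```php
<?php
// Hypothetical helper: express an average as a multiple of the
// baseline average, as a string with four decimal places.
function relativeScore(float $average, float $baseline): string
{
    // Four decimals matches the precision shown in the tables.
    return number_format($average / $baseline, 4);
}

$phpBaseline = 2178.63; // average for the dynamic index.php page

echo relativeScore(2726.35, $phpBaseline), PHP_EOL; // static html: 1.2514
echo relativeScore(2178.63, $phpBaseline), PHP_EOL; // php baseline: 1.0000
```

The same division applied to the later "connect" and "connect, query, fetch" averages gives the relative values shown on those slides.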

Page 11: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

MySQL Connect

$host = 'localhost';
$user = 'root';
$pass = 'admin';
$dbname = 'bench';
$table = 'hello';

$conn = mysql_connect($host, $user, $pass);
mysql_select_db($dbname);
echo "Mysql Connect!";
mysql_close($conn);

Page 12: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

PDO Connect

$host = 'localhost';
$user = 'root';
$pass = 'admin';
$dbname = 'bench';
$table = 'hello';

$pdo = new PDO(
    "mysql:host=$host;dbname=$dbname",
    $user,
    $pass
);

echo "PDO Connect!";

Page 13: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Connection Performance

MySQL     relative   average
html      1.2514     2726.35
php       1.0000     2178.63
connect   0.7926     1726.81

PDO       relative   average
html      1.2514     2726.35
php       1.0000     2178.63
connect   0.8346     1818.30

Page 14: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Database Table

CREATE TABLE hello (
    id INT PRIMARY KEY AUTO_INCREMENT,
    ch VARCHAR(1)
);
INSERT INTO hello (ch) VALUES ('H');
INSERT INTO hello (ch) VALUES ('e');
INSERT INTO hello (ch) VALUES ('l');
INSERT INTO hello (ch) VALUES ('l');
INSERT INTO hello (ch) VALUES ('o');
INSERT INTO hello (ch) VALUES (' ');
INSERT INTO hello (ch) VALUES ('W');
INSERT INTO hello (ch) VALUES ('o');
INSERT INTO hello (ch) VALUES ('r');
INSERT INTO hello (ch) VALUES ('l');
INSERT INTO hello (ch) VALUES ('d');
INSERT INTO hello (ch) VALUES ('!');

Page 15: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

MySQL Query & Fetch

$conn = mysql_connect($host, $user, $pass);
mysql_select_db($dbname);

$rows = mysql_query("SELECT * FROM $table ORDER BY id");
while ($row = mysql_fetch_array($rows, MYSQL_ASSOC)) {
    echo $row['ch'];
}

mysql_free_result($rows);
mysql_close($conn);

Page 16: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

PDO Query & Fetch

$pdo = new PDO(
    "mysql:host=$host;dbname=$dbname",
    $user,
    $pass
);

$stmt = $pdo->prepare("SELECT * FROM $table ORDER BY id");
$stmt->execute();

$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
foreach ($rows as $row) {
    echo $row['ch'];
}

Page 17: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Connect, Query, Fetch Performance

MySQL                   relative   average
html                    1.2514     2726.35
php                     1.0000     2178.63
connect                 0.7926     1726.81
connect, query, fetch   0.6907     1504.76

PDO                     relative   average
html                    1.2514     2726.35
php                     1.0000     2178.63
connect                 0.8346     1818.30
connect, query, fetch   0.7397     1611.61

Page 18: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Overall Performance

[Bar chart: html, php, connect, and query/fetch results for mysql and pdo; axis from 0 to 3000]

Page 19: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

The N+1 Problem

Page 20: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Background

• Performance problems in application report

• 2m rows into 40k record objects, 3+ hours

• Reduced dataset to 2000 rows and 40 record objects

• Profiler: 201 queries

• 1 query, plus 5 additional queries per record

Page 21: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

N+1 in PHP

// 1 query to get 10 posts
$stmt = 'SELECT * FROM posts LIMIT 10';
$posts = $sql->fetchAll($stmt);

// 10 queries for comments (1 per post)
$stmt = 'SELECT * FROM comments WHERE post_id = ?';
foreach ($posts as &$post) {
    $bind = array($post['id']);
    $rows = $sql->fetchAll($stmt, $bind);
    $post['comments'] = $rows;
}

Page 22: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

$posts = array(
    0 => array(
        'id' => '1',
        'body' => 'Post text',
        'comments' => array(
            0 => array(
                'id' => '1',
                'post_id' => '1',
                'body' => 'Comment 1 text'
            ),
            // ...
            9 => array(
                'id' => '10',
                'post_id' => '1',
                'body' => 'Comment 10 text'
            ),
        ),
    ),
    // ...
    9 => array(...),
);

Page 23: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Why It’s A Problem

• Each relationship is one extra query per master row

• 5 relationships == 5 queries per master row

• 10 records means 50 added queries

• 40,000 records means 200,000 added queries

• Performance drag. Need to use fewer queries.
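The arithmetic in these bullets can be sketched as a small function. This is only an illustration; the name `nPlusOneQueryCount` is hypothetical and not from the slides.

```php
<?php
// Hypothetical helper: total queries issued by the N+1 pattern.
// One query fetches the master rows; each master row then triggers
// one extra query per relationship.
function nPlusOneQueryCount(int $records, int $relationships): int
{
    return 1 + ($records * $relationships);
}

echo nPlusOneQueryCount(10, 5), PHP_EOL;    // 51
echo nPlusOneQueryCount(40, 5), PHP_EOL;    // 201
echo nPlusOneQueryCount(40000, 5), PHP_EOL; // 200001
```

The 40-record case matches the 201 queries reported by the profiler on the reduced dataset.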

Page 24: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Why Does N+1 Happen?

Page 25: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

CRUDdy Mindset

• Create, read, update, delete

• Record-oriented focus

• ActiveRecord, RowDataGateway

• Collections are secondary

• In a hurry? Treat collection as a series of single records in a loop

Page 26: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

BREAD Instead

• Browse, read, edit, add, delete, search

• “Browse” is a first-class requirement

• TableModule, TableDataGateway

• Build collections of records right away

• Efficient collection building lends itself to efficient record building

Page 27: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Single-Query Solution

Page 28: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Single Query: Intro

• Select all results, including relationships, in a single query

• Loop through results to marshal into domain objects

Page 29: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Single Query: One-to-One

// one-to-one
$stmt = 'SELECT posts.*, stats.hit_count
         FROM posts
         LEFT JOIN stats ON stats.post_id = posts.id
         LIMIT 10';

$rows = $sql->fetchAll($stmt);
$posts = array();
foreach ($rows as $post) {
    $post['stats']['hit_count'] = $post['hit_count'];
    unset($post['hit_count']);
    $posts[] = $post;
}

Page 30: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Single Query: One-to-Many

$stmt = 'SELECT posts.*, comments.*
         FROM posts
         LEFT JOIN comments ON comments.post_id = posts.id';

$rows = $sql->fetchAll($stmt);

// posts.id  posts.author_id  posts.title  comments.id  comments.body
// 1         3                Frist Psot!  1            Initial comment
// 1         3                Frist Psot!  2            Another comment
// 1         3                Frist Psot!  3            Third comment
// 1         3                Frist Psot!  4            Oh come on
// 2         5                Second post  5            1st comment on post 2
// 2         5                Second post  6            2nd comment on post 2
// 2         5                Second post  7            3rd comment on post 2

Page 31: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

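Single Query: One-to-Many

$posts = array();
foreach ($rows as $row) {
    $post_id = $row['posts.id'];
    // Create the post entry only once; later rows for the same
    // post must not overwrite the comments collected so far.
    if (! isset($posts[$post_id])) {
        $posts[$post_id] = array(
            'id' => $row['posts.id'],
            'title' => $row['posts.title'],
            'comments' => array(),
        );
    }
    // LEFT JOIN yields NULL comment columns for posts without comments.
    if ($row['comments.id'] !== null) {
        $posts[$post_id]['comments'][] = array(
            'id' => $row['comments.id'],
            'body' => $row['comments.body'],
        );
    }
}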

Page 32: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Single Query: Review

• Loop through result set to marshal into domain objects

• Fine when you have only “to-one” relationships

• “To-many” relationships introduce complexity (esp. more than one)

• Result set is larger and more repetitive

• Less efficient to marshal

• Difficult to LIMIT/OFFSET
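The LIMIT/OFFSET difficulty comes from the limit counting joined rows rather than posts. A small sketch using plain arrays in place of a database result (the sample rows are illustrative only):

```php
<?php
// Joined rows as they might come back from posts LEFT JOIN comments.
// Illustrative data: post 1 has 4 comments, post 2 has 3.
$rows = array(
    array('post_id' => 1, 'comment_id' => 1),
    array('post_id' => 1, 'comment_id' => 2),
    array('post_id' => 1, 'comment_id' => 3),
    array('post_id' => 1, 'comment_id' => 4),
    array('post_id' => 2, 'comment_id' => 5),
    array('post_id' => 2, 'comment_id' => 6),
    array('post_id' => 2, 'comment_id' => 7),
);

// Simulate "LIMIT 3": the limit applies to joined rows, not posts.
$limited = array_slice($rows, 0, 3);

// Count the distinct posts that survive the limit.
$postIds = array_unique(array_column($limited, 'post_id'));

echo count($limited), ' rows, ', count($postIds), ' post(s)', PHP_EOL;
// 3 rows, 1 post(s)
```

Asking for three rows returns three comments of a single post, and that post's fourth comment is cut off.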

Page 33: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Query-and-Stitch Solution

Page 34: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Query-and-Stitch: Intro

• One query for the master set

• Loop through master set to key on identity field

• One query for related set, against all rows in master set

• Loop through related set and stitch into master set

Page 35: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Query-and-Stitch: Master Set

// 1 query to get 10 posts.
$stmt = 'SELECT * FROM posts LIMIT 10';
$rows = $sql->fetchAll($stmt);

// Find the ID of each post
// and key the $posts array on them.
$posts = array();
foreach ($rows as $post) {
    $id = $post['id'];
    $posts[$id] = $post;
}

Page 36: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Query-and-Stitch: Related Set

// 1 query to get all comments for all posts at once.
$stmt = 'SELECT * FROM comments WHERE post_id IN (:post_ids)';
$bind = array('post_ids' => array_keys($posts));
$rows = $sql->fetchAll($stmt, $bind);

// Stitch into posts.
foreach ($rows as $comment) {
    $id = $comment['post_id'];
    $posts[$id]['comments'][] = $comment;
}
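The `$sql` object on the slide appears to accept an array for the single `:post_ids` placeholder. Plain PDO does not expand an array bound to one placeholder, so with PDO alone one approach is to generate one positional placeholder per ID. A minimal sketch, where `inPlaceholders` is a hypothetical helper name:

```php
<?php
// Hypothetical helper: build a comma-separated list of positional
// placeholders, one per value, for use inside an IN (...) clause.
function inPlaceholders(array $values): string
{
    if (count($values) === 0) {
        // "IN ()" is invalid SQL; callers should skip the query instead.
        throw new InvalidArgumentException('No values to bind.');
    }
    return implode(', ', array_fill(0, count($values), '?'));
}

$postIds = array(1, 2, 3);
$stmt = 'SELECT * FROM comments WHERE post_id IN ('
      . inPlaceholders($postIds) . ')';

echo $stmt, PHP_EOL;
// SELECT * FROM comments WHERE post_id IN (?, ?, ?)
```

With a PDO connection, the statement would then be prepared and executed with `array_values($postIds)` as the parameter list.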

Page 37: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Query-and-Stitch: Review

• One added loop (stitching into master set) but 9 fewer queries

• Best for “to-many” relationships but works for “to-one” as well

• Easy to do LIMIT/OFFSET

• Easy to add multiple related sets

• One query to get results

• One loop to stitch into master set
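The stitching step is the same for every related set, so it can be written once and reused. A sketch under stated assumptions: the function `stitch`, the `tags` rows, and all sample data are hypothetical, and plain arrays stand in for query results.

```php
<?php
// Hypothetical helper: stitch related rows into a master set that is
// already keyed on its identity field.
function stitch(array $master, array $related, string $foreignField, string $name): array
{
    // Give every master row an empty list so rows with no matches
    // still have the key.
    foreach ($master as $id => $row) {
        $master[$id][$name] = array();
    }
    // One pass over the related set; each row lands under its parent.
    foreach ($related as $row) {
        $id = $row[$foreignField];
        if (isset($master[$id])) {
            $master[$id][$name][] = $row;
        }
    }
    return $master;
}

// Illustrative rows standing in for the results of three queries.
$posts = array(
    1 => array('id' => 1, 'title' => 'First post'),
    2 => array('id' => 2, 'title' => 'Second post'),
);
$comments = array(
    array('id' => 1, 'post_id' => 1, 'body' => 'Initial comment'),
    array('id' => 2, 'post_id' => 1, 'body' => 'Another comment'),
);
$tags = array(
    array('id' => 1, 'post_id' => 2, 'name' => 'php'),
);

// One query plus one stitching loop per related set.
$posts = stitch($posts, $comments, 'post_id', 'comments');
$posts = stitch($posts, $tags, 'post_id', 'tags');

echo count($posts[1]['comments']), ' ', count($posts[2]['tags']), PHP_EOL;
// 2 1
```

Adding another related set costs one more query and one more call, independent of the number of posts.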

Page 38: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Query-and-Stitch: Performance

• 40k records from 2m rows (5 relationships)

• From 200,001 queries to 6 (1 master, 5 related)

• From 3+ hours to ~5 minutes
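The drop in query count follows from query-and-stitch issuing one query per relationship instead of one per record per relationship. As a brief sketch (the helper name `stitchQueryCount` is hypothetical):

```php
<?php
// Hypothetical helper: queries issued by query-and-stitch.
// One query for the master set plus one per relationship,
// regardless of how many master records there are.
function stitchQueryCount(int $relationships): int
{
    return 1 + $relationships;
}

$nPlusOne = 1 + (40000 * 5);     // one master query + 5 per record
$stitched = stitchQueryCount(5); // one master query + 5 related

echo $nPlusOne, ' -> ', $stitched, PHP_EOL;
// 200001 -> 6
```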

Page 39: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Automating Query-and-Stitch

Page 40: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

ORM

• Query-and-stitch is used by many (most? all?) ORMs for eager-fetch

• ORMs are disliked by a non-trivial set of developers

• Overhead of including and learning the ORM system

• Non- or pseudo-SQL query construction, hard to hand-tune

• Opaque behavior, ineffective/unpredictable in edge cases, resource hog

• Lazy loading of individual results will reintroduce N+1
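The last bullet can be illustrated without any particular ORM. The sketch below uses a hypothetical `LazyPost` class and a fake loader that counts calls instead of querying a database; it is not the API of any real library.

```php
<?php
// Hypothetical stand-in for a lazily loaded record: related rows are
// fetched only when first requested, via a loader callback.
class LazyPost
{
    private $id;
    private $loader;
    private $comments = null;

    public function __construct(int $id, callable $loader)
    {
        $this->id = $id;
        $this->loader = $loader;
    }

    public function getComments(): array
    {
        if ($this->comments === null) {
            // First access triggers one "query" for this post only.
            $this->comments = call_user_func($this->loader, $this->id);
        }
        return $this->comments;
    }
}

$queryCount = 1; // the query that fetched the 10 posts

// Fake loader: counts calls instead of touching a database.
$loader = function (int $postId) use (&$queryCount): array {
    $queryCount++;
    return array();
};

$posts = array();
for ($id = 1; $id <= 10; $id++) {
    $posts[] = new LazyPost($id, $loader);
}

// Touching comments on each post in a loop issues one query per post.
foreach ($posts as $post) {
    $post->getComments();
}

echo $queryCount, PHP_EOL; // 11
```

Ten posts produce eleven queries, the same N+1 shape as the hand-written loop, even though no loop over queries appears in the calling code.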

Page 41: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Aura.Marshal: Intro

• The problem is not SQL

• The problem is marshaling result sets into domain objects

• Aura.Marshal handles only marshaling, not queries

• Specify types and relationship fields

• Load types with results from your own queries

• Wires up the results lazily into domain objects on fetch

Page 42: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Aura.Marshal: Types

$manager->setType('posts', array(
    'identity_field' => 'id',
    'relation_names' => array(
        'comments' => array(
            'relationship' => 'has_many',
            'native_field' => 'id',
            'foreign_field' => 'post_id'
        ),
    ),
));

$manager->setType('comments', array(
    'identity_field' => 'id',
    'relation_names' => array(
        'post' => array(
            'foreign_type' => 'posts',
            'relationship' => 'belongs_to',
            'native_field' => 'post_id',
            'foreign_field' => 'id'
        ),
    ),
));

Page 43: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Aura.Marshal: Loading

// load posts and get back IDs
$stmt = 'SELECT * FROM posts LIMIT 10';
$result = $sql->fetchAll($stmt);
$post_ids = $manager->posts->load($result);

// load comments for posts
$stmt = 'SELECT * FROM comments WHERE post_id IN (:post_ids)';
$bind = array('post_ids' => $post_ids);
$result = $sql->fetchAll($stmt, $bind);
$manager->comments->load($result);

Page 44: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Aura.Marshal: Retrieval

foreach ($manager->posts as $post) {
    echo 'Post titled ' . $post->title
       . ' has ' . count($post->comments) . ' comments.' . PHP_EOL;
}

Page 45: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Conclusion

Page 46: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Conclusion

• Performance benchmarking

• Example of N+1 in PHP

• Mindset: CRUD vs BREAD

• Solutions: single query, query-and-stitch

• Aura.Marshal package as one way of automating

Page 47: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

• Questions?

• Comments?

• Criticism?

Page 48: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Bonus Slides

Page 49: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Framework Benchmarks

[Bar chart, axis from 0 to 800, comparing: aura/dev, aura/dev-cached, cake/1.3.10, ci/1.7.3, ci/2.0.2, kohana/3.1.3.1, lithium/0.9.9, solar/1.1.1, symfony/1.4.8, symfony/2.0.4, symfony/2.0.4-fp, zend/1.10.2-minapp, zend/1.10.2-project, zend/1.11.9, zend/2.0.0beta1]

Page 50: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Framework Implications

• Dynamic dispatch cycle: speed limited by slowest component

• Regardless of framework responsiveness ...

• ... if database takes 0.10 sec, then max req/sec is 10 req/sec (per process)

• Framework speed is *an* important factor, not the *only* one

• Faster database code, or database avoidance, is paramount

• Might extend framework benchmarks to include database work
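The ceiling described above is the reciprocal of the time spent per request. A minimal sketch, assuming a single process that handles one request at a time; the function name and the 0.002 sec framework figure are illustrative assumptions, not measurements from the talk.

```php
<?php
// Hypothetical helper: upper bound on requests per second for one
// process that handles requests one at a time.
function maxRequestsPerSecond(float $secondsPerRequest): float
{
    return 1.0 / $secondsPerRequest;
}

// Database work alone takes 0.10 sec per request.
echo round(maxRequestsPerSecond(0.10)), PHP_EOL; // 10

// Illustrative assumption: the framework adds 0.002 sec on top.
echo round(maxRequestsPerSecond(0.10 + 0.002), 1), PHP_EOL; // 9.8
```

Under these assumed numbers, removing the framework overhead entirely would gain only about 0.2 requests per second, while halving the database time would roughly double throughput.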

Page 51: Solving the N+1 Problem; or, A Stitch In Time Saves Nine

Thanks!

joind.in/event/view/894

auraphp.github.com

paul-m-jones.com
@pmjones

(available for speaking and consulting)

