Date post: | 31-Oct-2014 |
Category: | Technology |
Upload: | simeon-simeonov |
Memory Issues in Rails Applications
I am @simeons
recruit amazing people
solve hard problems
ship
make users happy
repeat
Problems of Success (good problems)
Too many users
Too much traffic
Too much data
Memory Issues in Rails Applications
Common Problem of Success
Display Advertising Makes the Web Suck
User-focused optimization
Tens of millions of users
1000+% better than average
200+% better than Google
Swoop Fixes That
Mobile SDKs: iOS & Android
Web SDK: RequireJS & jQuery
Components: AngularJS
NLP, etc.: Python
Targeting: High-Perf Java
Analytics: Ruby 2.0
Internal Apps: Ruby 2.0 / Rails 3
Pub Portal: Ruby 2.0 / Rails 3
Ad Portal: Ruby 2.0 / Rails 4
Before: 1 hr @ 4 GB
When problems grow faster than you can throw hardware at them, you actually have to solve them.
Before: 1 hr @ 4 GB
After: 5 min @ 230 MB
Resolving Memory Issues in Rails Applications Using Streams
[Figure: CSV. Memory (MB, 0 to 500) plotted against Rows (0 to 100,000). Annotations added over successive slides: "You are here", "This sucks", "Start thinking here".]
Memory Leaks
class AddDomainsStep
  def call(hashes)
    hashes.map do |hash|
      transform_and_return(hash)
    end
  end
end

class AddDomainsStep
  def initialize
    @domain_config = DomainConfig.instance
  end

  def call(hashes)
    hashes.each do |hash|
      hash['domain'] =
        @domain_config.domain_for(hash['domain_id'])
    end
  end
end

require 'singleton'

class DomainConfig
  include Singleton

  def initialize
    @domains = {}
  end

  # Caches every looked-up name for the lifetime of the process.
  def domain_for(id)
    @domains[id] ||= Domain.name_for(id) || ''
  end
end
@domains[id] ||= Domain.name_for(id) || ''
Memory Leak
•Memory that will never be released by the garbage collector.
•Memory usage grows the longer the process runs.
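A minimal sketch of how the `||=` cache in `DomainConfig` behaves in a long-running process. `FakeDomain` is a hypothetical stand-in for the `Domain` model, which the slides do not show, and the `attr_reader` is added only so the cache size can be observed. The point is that a process-wide hash keeps every entry reachable, so the garbage collector can never free them.

```ruby
require 'singleton'

# Hypothetical stand-in for the Domain model (not shown in the slides).
module FakeDomain
  def self.name_for(id)
    "domain-#{id}.example"
  end
end

class DomainConfig
  include Singleton

  # Exposed only so this sketch can inspect the cache.
  attr_reader :domains

  def initialize
    @domains = {}
  end

  def domain_for(id)
    @domains[id] ||= FakeDomain.name_for(id) || ''
  end
end

config = DomainConfig.instance
before = config.domains.size

# Simulate a long-running process that keeps seeing new ids.
10_000.times { |id| config.domain_for(id) }

after = config.domains.size
puts "cached entries: #{before} -> #{after}"
```

Nothing ever removes entries from the hash, so its size tracks the number of distinct ids the process has seen.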
Avoid Global State
•Global variables
•Class variables
•Singletons
•Per-process instance state
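One possible alternative to the kinds of global state listed above, sketched under assumptions: the cache is owned by a step object that lives only as long as one run, so it becomes garbage when the run finishes. `LOOKUP`, `run_report`, and `cache_size` are hypothetical names introduced for illustration, not part of the original code.

```ruby
# Hypothetical lookup standing in for Domain.name_for.
LOOKUP = ->(id) { "domain-#{id}.example" }

class AddDomainsStep
  # The cache is injected per run instead of living in a Singleton.
  def initialize(cache = {})
    @cache = cache
  end

  def call(hashes)
    hashes.each do |hash|
      id = hash['domain_id']
      hash['domain'] = (@cache[id] ||= LOOKUP.call(id))
    end
  end

  def cache_size
    @cache.size
  end
end

def run_report(ids)
  step = AddDomainsStep.new # fresh cache, scoped to this run
  rows = ids.map { |id| { 'domain_id' => id } }
  step.call(rows)
end

rows = run_report([1, 2, 2, 3])
puts rows.map { |row| row['domain'] }.inspect
```

The trade-off is that lookups are repeated across runs; whether that matters depends on how expensive the lookup is.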
Memory Churn
hashes.map do |hash|
  hash['domain'].downcase.strip
end

vs

hashes.each do |hash|
  hash['domain'].downcase!
  hash['domain'].strip!
end
Memory Churn
•Allocating and deallocating tons of objects slows down processing
•Mutation limits allocations, but makes it easier to introduce bugs
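A rough way to see the difference between the two styles is to count allocated objects around each one. This sketch assumes Ruby 2.2 or newer, where `GC.stat(:total_allocated_objects)` is available; `allocations` and `sample_hashes` are hypothetical helpers, and the exact counts will vary by Ruby version.

```ruby
# Counts objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Made-up input: 1,000 rows with padded, mixed-case domain names.
def sample_hashes
  Array.new(1_000) { |i| { 'domain' => " Example#{i}.COM " } }
end

copying  = sample_hashes
mutating = sample_hashes

# Non-mutating style: each row allocates a downcased copy and a stripped copy.
copy_allocs = allocations do
  copying.map { |hash| hash['domain'].downcase.strip }
end

# Mutating style: the existing strings are changed in place.
mutate_allocs = allocations do
  mutating.each do |hash|
    hash['domain'].downcase!
    hash['domain'].strip!
  end
end

puts "copying:  #{copy_allocs} objects"
puts "mutating: #{mutate_allocs} objects"
```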
hashes.each do |hash|
  hash['domain'].downcase!
  hash['domain'].strip!
end
Spot the Bug!
# In shared state:
@domains[id] ||= Domain.name_for(id) || ''

# Much later:
hash['domain'].downcase!
hash['domain'].strip!
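One reading of the bug, as a sketch: the row holds the very same String object that sits in the shared cache, so the in-place cleanup silently rewrites the cached value for every later caller. The `lookup` lambda and its padded, mixed-case return value are made up for illustration; `.dup` is one possible fix, not necessarily the one the author used.

```ruby
# Hypothetical shared cache, standing in for DomainConfig's @domains hash.
cache = {}
# Hypothetical Domain.name_for; the returned value is made up.
lookup = ->(_id) { String.new(' Example.COM ') }

# In shared state: the cached String object itself is handed to the row.
row = { 'domain_id' => 1 }
row['domain'] = (cache[1] ||= lookup.call(1))

# Much later: in-place cleanup mutates that same object.
row['domain'].downcase!
row['domain'].strip!

shared = row['domain'].equal?(cache[1])
puts "same object: #{shared}, cache now holds #{cache[1].inspect}"

# One possible fix: hand out a copy, so the cache keeps its original value.
safe_cache = {}
safe_row = { 'domain' => (safe_cache[1] ||= lookup.call(1)).dup }
safe_row['domain'].downcase!
safe_row['domain'].strip!
puts "with dup, cache still holds #{safe_cache[1].inspect}"
```

Freezing the cached strings is another option: it turns the silent mutation into an immediate error at the offending line.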
Good News!
•Allocating and freeing objects is fairly fast in Ruby
•Keeping your stack frame light will limit the effects of memory churn
Memory Bloat
def to_csv
  csv = [CSV.generate_line(headers)]

  rows.each do |row|
    values = headers.map do |header|
      row[header] || defaults[header]
    end

    csv << CSV.generate_line(values)
  end

  csv.join('')
end
Memory Bloat
•Memory usage grows with data set
•Loading too much data at once
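A minimal sketch of the streaming alternative to `to_csv`: instead of collecting every line in an array and joining at the end, each line is written to an output object as soon as it is generated. `write_csv` and its parameters are hypothetical names; the sample rows are made up. Any object that responds to `<<` (a `File`, a socket, a `StringIO`) can serve as the sink.

```ruby
require 'csv'
require 'stringio'

# Writes one CSV line at a time to any object that responds to <<,
# so only the current row's line is held in memory by this method.
def write_csv(io, headers, rows, defaults = {})
  io << CSV.generate_line(headers)

  rows.each do |row|
    values = headers.map do |header|
      row[header] || defaults[header]
    end

    io << CSV.generate_line(values)
  end

  io
end

headers  = %w[domain clicks]
defaults = { 'clicks' => 0 }
rows = [
  { 'domain' => 'a.example', 'clicks' => 3 },
  { 'domain' => 'b.example' }
]

out = StringIO.new # stands in for a file or an HTTP response body
write_csv(out, headers, rows, defaults)
puts out.string
```

The `StringIO` here still accumulates the full output, which is only for demonstration; the memory benefit appears when the sink forwards data onward and `rows` is itself produced on demand.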
Laziness
rename_report_fields(
  squash(
    add_domains(
      add_properties(
        unwind_variations(
          rows
        )
      )
    )
  )
)
def duplicate(number, count)
  if count > 0
    [number] + duplicate(number, count - 1)
  else
    []
  end
end

def sum(list)
  list.inject(0) do |result, number|
    result + number
  end
end

sum(duplicate(5, 10)) # => 50
import Prelude hiding (sum)

duplicate :: Int -> Int -> [Int]
duplicate number count
  | count <= 0 = []
  | otherwise  = number : duplicate number (count - 1)

sum :: [Int] -> Int
sum []            = 0
sum (x:remaining) = x + sum remaining

> sum $ duplicate 5 10
50
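Haskell evaluates the list on demand, while a Ruby array is built in full before it is summed. A sketch of getting similar on-demand behaviour in Ruby with `Enumerator` and `Enumerator::Lazy` (available since Ruby 2.0); `repeat_forever` is a hypothetical helper, not part of the original slides.

```ruby
# An endless stream of the same number; nothing is produced until asked for.
def repeat_forever(number)
  Enumerator.new do |yielder|
    loop { yielder << number }
  end
end

def sum(list)
  list.inject(0) do |result, number|
    result + number
  end
end

# take(10) on a lazy enumerator stops the endless stream after ten values,
# and no intermediate array of those values is built.
total = sum(repeat_forever(5).lazy.take(10))
puts total
```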
Be Proactive About Being Lazy
Enumerable
class AddDomainsStep
  def initialize(source)
    @source = source
  end

  def each
    @source.each do |hash|
      hash['domain'] =
        DomainConfig.instance.domain_for(hash['domain_id'])
      yield hash
    end
  end
end
RenameReportFieldsStep.new(
  SquashStep.new(
    AddDomainsStep.new(
      AddPropertiesStep.new(
        UnwindVariationsStep.new(
          rows
        )
      )
    )
  )
)
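A runnable sketch of why wrapping steps this way streams: each step pulls one row from its source, transforms it, and yields it onward, so asking the outermost step for two rows reads only two rows from the source. The `Step` base class, the simplified step bodies, and the made-up row data are assumptions for illustration; only the class names `AddDomainsStep` and `RenameReportFieldsStep` come from the slides.

```ruby
# Hypothetical base class: stores the source and mixes in Enumerable,
# so every step gets first, map, each_slice and friends from its own each.
class Step
  include Enumerable

  def initialize(source)
    @source = source
  end
end

class AddDomainsStep < Step
  def each
    return enum_for(:each) unless block_given?

    @source.each do |hash|
      # Simplified stand-in for the DomainConfig lookup.
      yield hash.merge('domain' => "domain-#{hash['domain_id']}.example")
    end
  end
end

class RenameReportFieldsStep < Step
  def each
    return enum_for(:each) unless block_given?

    @source.each do |hash|
      yield({ 'Domain' => hash['domain'] })
    end
  end
end

# A source that generates rows on demand and counts how many it produced.
produced = 0
rows = Enumerator.new do |yielder|
  1.upto(1_000_000) do |id|
    produced += 1
    yielder << { 'domain_id' => id }
  end
end

pipeline = RenameReportFieldsStep.new(AddDomainsStep.new(rows))
first_two = pipeline.first(2)

puts first_two.map { |row| row['Domain'] }.join(', ')
puts "rows pulled from the source: #{produced}"
```

Compare this with the nested function calls earlier in the deck, where each function must finish processing the whole collection before the next one starts.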
Buffering