Date post: | 16-Jan-2017 |
Category: |
Engineering |
Upload: | minqi-pan |
How we scaled GitLab for a 30k-employee company
Minqi Pan
Hello, I’m Minqi Pan
github.com/pmq20
twitter @psvr
What’s GitLab?
GitLab
a git box installed on-premises
GitLab
HTTP 80/443
SSH 22
GitLab
Redis, MySQL, File System
What’s inside?
GitLab
NGINX, OpenSSH Server
Unicorn, gitlab-shell, GitLab Workhorse
git
gitlab_git, rails, sidekiq, rugged
libgit2
Works great for small teams
However
to make it easy to do business anywhere
Let’s scale it!
GitLab
HTTP 80/443
SSH 22
HTTP 80/443
SSH 22
unicorn unicorn
unicorn …
nginx ?
nginx
ssh2http: https://github.com/pmq20/ssh2http
unicorn unicorn
unicorn …
LVS (IPVS)
HTTP 80/443
SSH 22
Linux Virtual Server (IP Virtual Server)
• transport-layer load balancing inside kernel
• layer-4 switching, unlike nginx (layer-7)
• can: IP weighting, IP blocking, health checking
• can’t: HTTP 200 Health Checking, URL rewriting
Complications
• SSH Host Key Synchronisation: do it once
• SSH Client Key Synchronisation: do it every time
• synchronised via redis pub-sub
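The client-key fan-out above can be pictured with a small sketch. This is only an illustration: `InMemoryPubSub` is a hypothetical in-process stand-in for Redis pub-sub, and the channel name and the "add"/"remove" message format are assumptions, not details from the talk.

```python
from collections import defaultdict

CHANNEL = "ssh-client-keys"  # hypothetical channel name


class InMemoryPubSub:
    """In-process stand-in for Redis pub-sub (assumption, for illustration)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver the message to every subscriber of the channel.
        for callback in self._subscribers[channel]:
            callback(message)
        return len(self._subscribers[channel])


class SshNode:
    """One machine behind the load balancer keeping its own key list."""

    def __init__(self, name, bus):
        self.name = name
        self.authorized_keys = set()
        bus.subscribe(CHANNEL, self._on_message)

    def _on_message(self, message):
        # Assumed message format: "<action> <public key>".
        action, key = message.split(" ", 1)
        if action == "add":
            self.authorized_keys.add(key)
        elif action == "remove":
            self.authorized_keys.discard(key)


bus = InMemoryPubSub()
nodes = [SshNode(f"node-{i}", bus) for i in range(3)]
bus.publish(CHANNEL, "add ssh-rsa AAAA-example-key user@host")
```

Every key change is published once and applied on each node, which is why client keys need synchronising every time while host keys only need copying once.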
Does it scale in the backend?
IV. Backing services: Treat backing services as attached resources
🤔 Redis
🤔 MySQL
🤔 File System
GitLab
• git repositories
• user generated attachments / avatars
GitLab Geo
• introduced in GitLab 8.5 EE
• 1 Master N Slave Replication
• achieves A-P in C-A-P theorem
• no disaster recovery
• no sharing
HTTP 80/443
SSH 22
nginx
ssh2http
routing via key namespace/repo_name
GitLab shard
FS shard
GitLab shard
FS shard
GitLab shard
FS shard
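A minimal sketch of routing by the key namespace/repo_name. The slides do not say how a key is mapped to a shard, so the hash-modulo mapping here is an assumption, and `routing_key` and `pick_shard` are hypothetical names.

```python
import hashlib


def routing_key(path):
    """Extract "namespace/repo_name" from a request path such as
    "/namespace/repo_name.git/info/refs"."""
    parts = path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"no namespace/repo_name in {path!r}")
    namespace, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{namespace}/{repo}"


def pick_shard(key, shard_count):
    """Map a routing key to a shard index (assumed: stable hash modulo)."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


# Both requests belong to the same repository, so they land on the same shard.
a = pick_shard(routing_key("/namespace/repo_name.git/info/refs"), 3)
b = pick_shard(routing_key("/namespace/repo_name/tree/master"), 3)
```

Because the key is derived from the repository alone, every request for that repository reaches the GitLab shard that owns its FS shard.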
GitLab Sharding
• Introduces Sidekiq sharing as well
• Introduces many changes to the application layer as well
  - need to have super user authentication
  - need to eliminate every page with requests across shards (e.g. admin page of repo sizes)
• Tedious changes on the application level.
How to deal with FS?
• 🤔 Hardware Network-Attached Storage?
• 🤔 Software Network-Attached Storage?
• 🤔 Remote Procedure Calls to FS shards?
• 🤔 Kill it?
• Hard-NAS: Alibaba has non-IOE policies.
• Soft-NAS: Alibaba does not have it yet.
• RPC: GitRPC? Good. GitHub does that.
• Kill FS: Use the cloud. Try something new!
by “cloud” we mean…
• Amazon S3: Amazon Simple Storage Service
• Alibaba OSS: Alibaba Object Storage Service
gitlab-rails: libgit2, git, grit
grit
• used in wikis
• via gollum-lib
• via gollum-grit_adapter
• eliminate-able via gollum-rugged_adapter
libgit2
• via gitlab_git
• via rugged
• backend replace-able
git
• via gitlab-shell
• via gitlab-workhorse
• via popen
• backend hard-to-replace (FS)
Basic Idea
gitlab-workhorse, gitlab-rails, gitlab-shell
git
libgit2
Cloud Based Backend
grit
Cloud Based Backend
odb’s
• each store has a priority (hi-priority … lo-priority)
• loose OSS store
• packed OSS store
refdb
• stored via OSS
• locked via redis
OSS refdb (read)
OSS refdb (write)
loose OSS store (write)
loose OSS store (read)
packed OSS store (write)
packed OSS store (read)
• via HTTP “Range” header
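As a rough illustration of the "Range" read, the sketch below builds a request for one slice of a packfile. The URL is a made-up placeholder and `range_header` / `build_range_request` are hypothetical helpers; only the standard HTTP byte-range syntax is relied on.

```python
import urllib.request


def range_header(offset, length):
    """Return the Range header value for `length` bytes starting at `offset`.
    HTTP byte ranges are inclusive on both ends."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def build_range_request(url, offset, length):
    """Build (but do not send) a GET request for one slice of a remote object."""
    return urllib.request.Request(url, headers={"Range": range_header(offset, length)})


# Placeholder URL; offset and size taken from the 9fcf81... entry in the example.
request = build_range_request("https://oss.example.invalid/repo/objects/pack/some.pack",
                              offset=280621, length=32)
```

Sending the request with `urllib.request.urlopen` would return only the requested bytes when the object store honours the header, so a single object can be read without downloading the whole pack.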
Example: read 9fcf811e00fa469688943a9152c16d4ee90fb9a9
• First byte of the name is 0x9f
• IDX[8 + (0x9f - 1) * 4] == 0x0403 == 1027
• IDX[8 + 0x9f * 4] == 0x0405 == 1029
• Object No. 1027 ~ 1029
Example: read 9fcf811e00fa469688943a9152c16d4ee90fb9a9
• Binary search 1027 ~ 1029
• Found at 8 + 4 * 256 + 1027 * 20 == 21572
• Skip the rest total_num*(20+4) == 1628*24
Example: read 9fcf811e00fa469688943a9152c16d4ee90fb9a9
• IDX[8 + 4 * 256 + 1628*24 + 4 * 1027] => offset into the pack
• PACK[0x0004482D] == PACK[280621]
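The index lookup in these examples can be sketched as follows, assuming a version-2 pack index laid out as 8-byte header, 256-entry fan-out table, 20-byte names, 4-byte CRCs, then 4-byte offsets. `find_pack_offset` and `build_idx` are hypothetical names; `build_idx` only fabricates a tiny index for demonstration, and 8-byte offsets for very large packs are not handled.

```python
import struct

HEADER_SIZE = 8                      # 4-byte magic + 4-byte version
FANOUT_SIZE = 256 * 4                # 256 big-endian cumulative counts
NAMES_START = HEADER_SIZE + FANOUT_SIZE


def u32(buf, pos):
    return struct.unpack_from(">I", buf, pos)[0]


def fanout_range(idx, first_byte):
    """Return [lo, hi): positions of the names starting with first_byte."""
    lo = 0 if first_byte == 0 else u32(idx, HEADER_SIZE + (first_byte - 1) * 4)
    hi = u32(idx, HEADER_SIZE + first_byte * 4)
    return lo, hi


def find_pack_offset(idx, sha):
    """Binary-search the name table, then read the matching 4-byte offset."""
    total = u32(idx, HEADER_SIZE + 255 * 4)   # last fan-out entry == object count
    lo, hi = fanout_range(idx, sha[0])
    while lo < hi:
        mid = (lo + hi) // 2
        start = NAMES_START + mid * 20
        name = idx[start:start + 20]
        if name == sha:
            # Skip all names (20 bytes each) and all CRCs (4 bytes each).
            offsets_start = NAMES_START + total * (20 + 4)
            return u32(idx, offsets_start + 4 * mid)
        if name < sha:
            lo = mid + 1
        else:
            hi = mid
    return None


def build_idx(objects):
    """Hypothetical helper: build a minimal v2 index body from {sha: offset}.
    CRCs are zeroed and the trailing checksums are omitted."""
    names = sorted(objects)
    fanout = [sum(1 for n in names if n[0] <= b) for b in range(256)]
    out = b"\xfftOc" + struct.pack(">I", 2)
    out += struct.pack(">256I", *fanout)
    out += b"".join(names)
    out += b"\x00" * (4 * len(names))
    out += b"".join(struct.pack(">I", objects[n]) for n in names)
    return out


sha = bytes.fromhex("9fcf811e00fa469688943a9152c16d4ee90fb9a9")
idx = build_idx({
    sha: 280621,
    bytes.fromhex("6110c89446f2281e5db9b798a0fa020fad6e63e1"): 275049,
    bytes.fromhex("e4e56117de8b3bd0bd899701da4712caee27c7d6"): 115703,
})
pack_offset = find_pack_offset(idx, sha)
```

With a remote index, each `u32` read and each name comparison would become a small ranged read, which is why the fan-out table matters: it narrows the binary search before any name is fetched.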
Example: read 9fcf811e00fa469688943a9152c16d4ee90fb9a9
• Entry header at PACK[0x0004482D]: 3-bit type, (n-1)*7+4-bit length
• Byte 1: E3 == 11100011
  - 1_______ => MSB 1, continue
  - _110____ => type == 6 == OFS_DELTA
  - ____0011 => length == 3
• Byte 2: 01 == 00000001
  - 0_______ => MSB 0, break
  - _0000001 => length += (1 << 4)
• final length == 19
Example: read 9fcf811e00fa469688943a9152c16d4ee90fb9a9
• Base offset follows the entry header at PACK[0x0004482D]
• Byte 3: AA == 10101010
  - 1_______ => MSB 1, continue
  - _0101010 => base offset == 42
• Byte 4: 44 == 01000100
  - 0_______ => MSB 0, break
  - _1000100 => offset == ((42+1)<<7)+68 == 5572
Example: read 9fcf811e00fa469688943a9152c16d4ee90fb9a9
• offset == 5572
• push 0x0004482D into stack
• deal with (0x0004482D - 5572)
• push (0x0004482D - 5572) into stack
• …
• root base
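The byte-by-byte decoding above can be reproduced with a short sketch. The function names are hypothetical; the input is just the four bytes E3 01 AA 44 from the example.

```python
OFS_DELTA = 6


def read_entry_header(buf, pos):
    """Decode the pack entry header: 3-bit type plus a variable-length size.
    The first byte carries 4 size bits, every following byte carries 7."""
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07
    size = byte & 0x0F
    shift = 4
    while byte & 0x80:               # MSB set: another size byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def read_ofs_delta_offset(buf, pos):
    """Decode the relative base offset that follows an OFS_DELTA header."""
    byte = buf[pos]
    pos += 1
    offset = byte & 0x7F
    while byte & 0x80:               # MSB set: another offset byte follows
        byte = buf[pos]
        pos += 1
        offset = ((offset + 1) << 7) | (byte & 0x7F)
    return offset, pos


entry = bytes([0xE3, 0x01, 0xAA, 0x44])
obj_type, size, pos = read_entry_header(entry, 0)
base_distance, pos = read_ofs_delta_offset(entry, pos)
base_position = 0x0004482D - base_distance
```

The `+ 1` in the offset loop is what turns 42 and 68 into ((42+1)<<7)+68; the size field has no such adjustment.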
Example
SHA1 type size size-pack offset-pack depth base
9fcf811e00fa469688943a9152c16d4ee90fb9a9 blob 19 32 280621 4 6110c89446f2281e5db9b798a0fa020fad6e63e1
6110c89446f2281e5db9b798a0fa020fad6e63e1 blob 52 45 275049 3 3bbeff3fc22b75c1a26f4ab9b64449b33002aea5
3bbeff3fc22b75c1a26f4ab9b64449b33002aea5 blob 2935 1263 273786 2 a399208309046656ecc01f7653c5d5b8905fc16e
a399208309046656ecc01f7653c5d5b8905fc16e blob 4686 1540 272246 1 e4e56117de8b3bd0bd899701da4712caee27c7d6
e4e56117de8b3bd0bd899701da4712caee27c7d6 blob 12635 3279 115703 0 -
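A small sketch of the stack walk from a delta entry down to its root base. The relative distances are derived by subtracting the pack offsets in the listing above (only 5572 was decoded explicitly in the example), and `delta_chain` is a hypothetical name.

```python
# Pack offset of each entry -> distance back to its base (None for the root base).
RELATIVE_BASE = {
    280621: 5572,      # 9fcf81... -> 6110c8...
    275049: 1263,      # 6110c8... -> 3bbeff...
    273786: 1540,      # 3bbeff... -> a39920...
    272246: 156543,    # a39920... -> e4e561...
    115703: None,      # e4e561... is a full object
}


def delta_chain(entry_offset, relative_base):
    """Push every entry on the way to the root base onto a stack."""
    stack = []
    position = entry_offset
    while True:
        stack.append(position)
        distance = relative_base[position]
        if distance is None:         # reached the root base
            return stack
        position -= distance


chain = delta_chain(280621, RELATIVE_BASE)
depth = len(chain) - 1
```

Popping the stack from the root base upwards gives the order in which the deltas have to be applied; with a remote pack, each hop costs at least one ranged read.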
git → libgit2
git fetch / clone
• git upload-pack --advertise-refs (rewritten via libgit2)
• git upload-pack (untouched)
• git pack-objects (rewritten via libgit2 pack builder)
git push (small data)
• git receive-pack --advertise-refs (rewritten via libgit2)
• git receive-pack (untouched)
• ntohl(hdr.hdr_entries) < unpack_limit
• git unpack-objects (modified via libgit2, writing to loose OSS store)
git push (big data)
• git receive-pack --advertise-refs (rewritten via libgit2)
• git receive-pack (untouched)
• ntohl(hdr.hdr_entries) >= unpack_limit
• git index-pack (modified via libgit2, writing to packed OSS store)
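The small-push versus big-push decision can be sketched as below. The 12-byte pack header layout (signature, version, entry count, all big-endian) is standard; the limit of 100 is git's documented default for receive.unpackLimit, and the function names are hypothetical.

```python
import struct

DEFAULT_UNPACK_LIMIT = 100  # git's default when receive.unpackLimit is unset


def parse_pack_header(data):
    """Read the signature, version and object count from an incoming pack."""
    signature, version, entries = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("not a packfile")
    return version, entries


def choose_unpacker(data, unpack_limit=DEFAULT_UNPACK_LIMIT):
    """Small pushes are exploded into loose objects; big ones stay packed."""
    _, entries = parse_pack_header(data)
    if entries < unpack_limit:
        return "unpack-objects"   # written to the loose OSS store
    return "index-pack"           # written to the packed OSS store


small_push = struct.pack(">4sII", b"PACK", 2, 7)
big_push = struct.pack(">4sII", b"PACK", 2, 200000)
```

`struct.unpack(">I", ...)` plays the role of `ntohl` here: the count is stored in network byte order.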
Naked Benchmark (no cache)
Fixture
• Repository: gitlab-ce
• https://gitlab.com/gitlab-org/gitlab-ce.git
• More than 200k objects
• More than 100MB when packed
git push
• FS-based: 6.27s user 1.72s system 14% cpu 53.299 total
• Cloud-based: 6.13s user 1.29s system 13% cpu 54.697 total
git push (delta)
• FS-based: 0.09s user 0.07s system 5% cpu 3.059 total
• Cloud-based: 0.04s user 0.05s system 3% cpu 2.845 total
git clone
• FS-based: 6.89s user 8.99s system 33% cpu 47.096 total
• Cloud-based: 7.08s user 8.12s system 20% cpu 1:14.12 total
git fetch (delta)
• FS-based: 0.14s user 0.13s system 33% cpu 0.806 total
• Cloud-based: 0.09s user 0.10s system 1% cpu 16.019 total
GET /namespace/repo/tree/master
• FS-based: Executing action: show - 74.5 ms
• Cloud-based: Executing action: show - 5877.7 ms
GET /namespace/repo/tree/master/builds
• FS-based: Executing action: show - 50.0 ms
• Cloud-based: Executing action: show - 4547.0 ms
Cache
odb hamburger
• each layer has a priority (hi-priority … lo-priority)
• loose OSS store
• packed OSS store
• loose FS cache
• packed FS cache
refdb
• cached via redis
loose FS cache
• cache written when ntohl(hdr.hdr_entries) < unpack_limit in git-unpack-objects
• cache written when reading via loose OSS store
packed FS cache
• cache written when ntohl(hdr.hdr_entries) >= unpack_limit in git-index-pack
• cache written in git-pack-objects
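One way to picture the layered odb is a list of backends tried in priority order, with the loose FS cache filled whenever an object has to come from the loose OSS store. This is a sketch under assumptions: the slides do not give the exact priority order, so placing the FS caches ahead of the OSS stores is a guess, and dicts stand in for both the file system and OSS.

```python
class Backend:
    """A dict-backed stand-in for one odb layer (FS cache or OSS store)."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority
        self.objects = {}

    def read(self, oid):
        return self.objects.get(oid)

    def write(self, oid, data):
        self.objects[oid] = data


class LayeredOdb:
    def __init__(self, backends, fill_on_hit):
        # Highest priority first (assumed: caches sit above the OSS stores).
        self.backends = sorted(backends, key=lambda b: b.priority, reverse=True)
        # Maps a store name to the cache that is written when that store is hit.
        self.fill_on_hit = fill_on_hit

    def read(self, oid):
        for backend in self.backends:
            data = backend.read(oid)
            if data is not None:
                cache = self.fill_on_hit.get(backend.name)
                if cache is not None:
                    cache.write(oid, data)
                return data, backend.name
        return None, None


loose_fs = Backend("loose FS cache", priority=4)
packed_fs = Backend("packed FS cache", priority=3)
loose_oss = Backend("loose OSS store", priority=2)
packed_oss = Backend("packed OSS store", priority=1)
odb = LayeredOdb([loose_fs, packed_fs, loose_oss, packed_oss],
                 fill_on_hit={"loose OSS store": loose_fs})

loose_oss.write("9fcf811e", b"blob data")
first = odb.read("9fcf811e")    # served by the loose OSS store, fills the cache
second = odb.read("9fcf811e")   # served by the loose FS cache
```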
redis refdb cache
• cache written when read and cache-miss
• cache expired when refdb got updated, e.g. git-receive-pack
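The two refdb cache rules above amount to a read-through cache with invalidation on update. In this sketch plain dicts stand in for redis and for the OSS-backed refdb, and `CachedRefdb` is a hypothetical name.

```python
class CachedRefdb:
    def __init__(self, store):
        self.store = store        # stand-in for the OSS refdb
        self.cache = {}           # stand-in for redis
        self.store_reads = 0      # counts the slow reads, for illustration

    def read(self, ref):
        if ref in self.cache:
            return self.cache[ref]
        # Cache miss: read from the store, then write the cache.
        self.store_reads += 1
        oid = self.store.get(ref)
        if oid is not None:
            self.cache[ref] = oid
        return oid

    def update(self, ref, oid):
        # E.g. after git-receive-pack: write the store, expire the cache entry.
        self.store[ref] = oid
        self.cache.pop(ref, None)


refdb = CachedRefdb({"refs/heads/master": "9fcf811e00fa469688943a9152c16d4ee90fb9a9"})
refdb.read("refs/heads/master")   # miss, goes to the store
refdb.read("refs/heads/master")   # hit, served from the cache
refdb.update("refs/heads/master", "6110c89446f2281e5db9b798a0fa020fad6e63e1")
latest = refdb.read("refs/heads/master")
```

Expiring instead of rewriting the cache entry on update keeps the write path simple: the next reader repopulates it from the store.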
Future Work
• develop libgit2 backends for AWS S3
• gitlab: favour libgit2, eliminate direct calls to git
• gitlab: add settings to choose backends
• gollum: use rugged as the default
• libgit2: improve performance, e.g. pack builder
https://github.com/pmq20