NSI Registry Engineering & Operations Update

Date post: 16-Mar-2016
Upload: joshwa
Description:
NSI Registry Engineering & Operations Update. Ari Balogh, VP of Engineering, [email protected]. High-Level Architecture. Registrar Growth. Average Daily Transactions: Qtr to 5/31, in millions, compared to Original Plan and New Projections (peak of 27.5M). Total Transactions Summary.
Transcript
Page 1: NSI Registry Engineering & Operations Update

12-Jun-2000

NSI Registry Engineering & Operations Update

Ari Balogh, VP of Engineering

[email protected]

Page 2: NSI Registry Engineering & Operations Update

High-Level Architecture

[Diagram. Labels: Registrars; Internet Users; CSRs; Firewall; HTTP, RRP/SSL (connection labels, shown twice); RRP Proxy; App. Server; Whois Server; Regr. Tools; Regr. Reports; CSR Tools; Registration System; Network Solutions Registry (Domains, NSs, Registrars); Whois; DNS Zones; Root, gTLD Node.]

Page 3: NSI Registry Engineering & Operations Update

Registrar Growth

Welcome letter sent to Registrar candidates     31
Registrars in pre-production                    48
Registrars in production                        44
Total number of Registrars in Registry         123

ICANN accredited Registrars                    123

Total Number of Names in the Registry Database

Page 4: NSI Registry Engineering & Operations Update

Average Daily Transactions

In millions, compared to Original Plan and New Projections (peak of 27.5M)

                     4Q99   1Q00   2Q00   3Q00   4Q00
Plan of Record        1.2    1.4    1.6    1.9    2.2
1/1/00 Projection     2.8    4.5    5.2    6.0    7.0
Actual                2.8    5.6   19.4

Qtr to 5/31
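To make the gap between plan and actuals concrete, the ratios can be worked out directly from the figures on this slide. The sketch below assumes the three Actual values line up with 4Q99, 1Q00, and 2Q00 (the last presumably being the quarter to 5/31); the variable and function names are illustrative only.

```python
# Average daily transactions in millions, as given on the slide.
quarters = ["4Q99", "1Q00", "2Q00", "3Q00", "4Q00"]
plan_of_record = [1.2, 1.4, 1.6, 1.9, 2.2]
projection = [2.8, 4.5, 5.2, 6.0, 7.0]
# Assumption: the three actuals align with the first three quarters.
actual = [2.8, 5.6, 19.4]


def ratios(observed, baseline):
    """Return observed/baseline for each quarter that has an observed value."""
    return [o / b for o, b in zip(observed, baseline)]


vs_plan = ratios(actual, plan_of_record)
vs_projection = ratios(actual, projection)

for quarter, p, j in zip(quarters, vs_plan, vs_projection):
    print(f"{quarter}: {p:.1f}x plan of record, {j:.1f}x 1/1/00 projection")
```

Under that alignment, the 2Q00 actual is roughly twelve times the original plan and well above the revised projection.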

Page 5: NSI Registry Engineering & Operations Update

Total Transactions Summary

In millions

          Oct    Nov    Dec    Jan    Feb  March  April    May
Write     1.6    4.5    5.0    4.0    4.5    4.5    8.1    8.4
Query     0.4    3.3    4.4    2.6    3.8    4.2   15.4   18.0
Check    29.2   38.6   77.7  113.7  151.6  212.2  518.6  616.9
Growth           49%    88%    38%    33%    38%   145%    19%

Growth is the month-over-month change in total transactions (Write + Query + Check); the chart's percentage labels match these values.
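The percentage labels on this chart can be checked against the table. A minimal sketch, assuming the labels are month-over-month growth in the combined Write + Query + Check total (an inference from the numbers, not something the slide states):

```python
# Monthly transaction counts in millions, October through May, from the slide.
months = ["Oct", "Nov", "Dec", "Jan", "Feb", "March", "April", "May"]
write = [1.6, 4.5, 5.0, 4.0, 4.5, 4.5, 8.1, 8.4]
query = [0.4, 3.3, 4.4, 2.6, 3.8, 4.2, 15.4, 18.0]
check = [29.2, 38.6, 77.7, 113.7, 151.6, 212.2, 518.6, 616.9]

totals = [w + q + c for w, q, c in zip(write, query, check)]


def monthly_growth_pct(series):
    """Percent change from each value to the next, rounded to whole percent."""
    return [round((cur / prev - 1) * 100) for prev, cur in zip(series, series[1:])]


growth = monthly_growth_pct(totals)
for month, total, pct in zip(months[1:], totals[1:], growth):
    print(f"{month}: {total:.1f}M total, {pct:+d}% vs previous month")

# Share of May traffic that is check-domain requests.
check_share_may = check[-1] / totals[-1] * 100
print(f"Check share in May: {check_share_may:.0f}%")
```

The computed growth values reproduce the seven labels (49, 88, 38, 33, 38, 145, 19), and the last line shows how heavily check-domain requests dominate the load.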

Page 6: NSI Registry Engineering & Operations Update

Availability & Performance

• Service Level Agreement (SLA) allowances:
  – 8 hours total outage per month, 4 hours unplanned
  – 3 seconds average for check domain (excluding worst 5%)
  – 5 seconds average for add domain (excluding worst 5%)
• January observed performance:
  – 3.5 hours planned outage to implement governance issues, no unplanned
  – 600 ms per check domain, 2.5 seconds per add
• February observed performance:
  – No planned or unplanned outages
  – 700 ms per check domain, 2.6 seconds per add
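The SLA averages exclude the worst 5% of requests. One way such a figure could be computed is a one-sided trimmed mean, sketched below. The function name, the round-down rule for how many samples to drop, and the sample data are all assumptions for illustration; the deck does not say how the registry actually measured this.

```python
def sla_average(latencies_s, exclude_worst_pct=5):
    """Mean latency after dropping the slowest `exclude_worst_pct` percent of samples."""
    if not latencies_s:
        raise ValueError("no samples")
    ordered = sorted(latencies_s)
    # Assumption: the number of dropped samples is rounded down.
    drop = len(ordered) * exclude_worst_pct // 100
    kept = ordered[: len(ordered) - drop]
    return sum(kept) / len(kept)


# Hypothetical sample: 19 checks at 0.6 s and one 30-second outlier.
samples = [0.6] * 19 + [30.0]
trimmed = sla_average(samples)
plain = sum(samples) / len(samples)
print(f"trimmed mean: {trimmed:.2f} s, plain mean: {plain:.2f} s")
```

With this made-up sample the single slow request more than triples the plain mean, while the trimmed figure stays at the typical latency, which is the effect the "excluding worst 5%" clause appears intended to have.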

Page 7: NSI Registry Engineering & Operations Update

Availability & Performance

• March observed performance
  – Two 2 hour planned outages, 1.25 hour unplanned outage
  – 60 ms per check domain, 300 ms per add
• April observed performance
  – 2.5 hours planned outage, no unplanned
  – 78.5 ms per check domain, 319.5 ms per add
• May observed performance
  – 2 hours planned outage, no unplanned
  – 34.7 ms per check domain, 257.2 ms per add
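The observed figures for January through May can be compared with the SLA allowances mechanically. This is a rough sketch under two assumptions: planned and unplanned outage hours both count toward the 8-hour monthly total, and the 4-hour figure caps the unplanned portion. The class and constant names are illustrative.

```python
from dataclasses import dataclass

# SLA allowances as stated in the deck.
MAX_TOTAL_OUTAGE_H = 8.0
MAX_UNPLANNED_OUTAGE_H = 4.0
MAX_CHECK_AVG_S = 3.0
MAX_ADD_AVG_S = 5.0


@dataclass
class MonthReport:
    month: str
    planned_outage_h: float
    unplanned_outage_h: float
    check_avg_s: float
    add_avg_s: float

    def total_outage_h(self) -> float:
        return self.planned_outage_h + self.unplanned_outage_h

    def within_sla(self) -> bool:
        return (
            self.total_outage_h() <= MAX_TOTAL_OUTAGE_H
            and self.unplanned_outage_h <= MAX_UNPLANNED_OUTAGE_H
            and self.check_avg_s <= MAX_CHECK_AVG_S
            and self.add_avg_s <= MAX_ADD_AVG_S
        )


reports = [
    MonthReport("January", 3.5, 0.0, 0.600, 2.5),
    MonthReport("February", 0.0, 0.0, 0.700, 2.6),
    MonthReport("March", 4.0, 1.25, 0.060, 0.300),  # two 2-hour planned outages
    MonthReport("April", 2.5, 0.0, 0.0785, 0.3195),
    MonthReport("May", 2.0, 0.0, 0.0347, 0.2572),
]

for r in reports:
    print(f"{r.month}: {r.total_outage_h():.2f} h outage, within SLA: {r.within_sla()}")
```

Under these assumptions every month falls inside the allowances, with March the closest on outage time at 5.25 hours.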

Page 8: NSI Registry Engineering & Operations Update

A Root Performance - UDP Packets/Second

5 Minute Average

30 Minute Average

Page 9: NSI Registry Engineering & Operations Update

A Root Performance - Drops & Overflows

Drops - 5 Minute Average

Overflows - 5 Minute Average

Page 10: NSI Registry Engineering & Operations Update

J gTLD Performance - UDP Packets/Second

5 Minute Average

30 Minute Average

Page 11: NSI Registry Engineering & Operations Update

M gTLD Performance - UDP Packets/Second

5 Minute Average

30 Minute Average

Page 12: NSI Registry Engineering & Operations Update

The Infrastructure Problem

• SLA that incurs $500K/day outage and performance penalties
• Single shared database experiencing 30% - 90% per month OLTP growth
  – Heavyweight stored procedures
  – Sustained 50%-70% utilization with peaks to 100% … and no more easy software fixes
  – Increasing extract duration for zones, Whois, registrar extracts, 5 - 14 hours
• Immature or end-of-life HA options for E4500
• Sun, Veritas, EMC version and support issues
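Monthly OLTP growth of 30% to 90% compounds quickly. A short sketch of the arithmetic, assuming the growth rate holds steady from month to month (a simplification; the slide gives only the range):

```python
import math


def months_to_double(monthly_growth):
    """Months for load to double at a constant monthly growth rate (0.30 = 30%)."""
    return math.log(2) / math.log(1 + monthly_growth)


def growth_factor(monthly_growth, months):
    """Total multiplication of load after `months` of compounding."""
    return (1 + monthly_growth) ** months


for rate in (0.30, 0.90):
    print(
        f"{rate:.0%}/month: doubles in {months_to_double(rate):.1f} months, "
        f"{growth_factor(rate, 6):.1f}x after 6 months"
    )
```

At the low end of the range the load doubles in under three months; at the high end it doubles in about one, which helps explain why a database already running at 50%-70% utilization left little room for incremental fixes.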

Page 13: NSI Registry Engineering & Operations Update

DB Server Evaluation

• Evaluated top Unix machines
  – Sun E10000, HP V2500, IBM S7A/S80
• Narrowed to E10000 and S7A/S80
• Conducted three month live test of S7A/S80
  – Ported gateway and application servers to IBM Java environment
  – Created RRP path configuration
  – Demonstrated performance and availability (HA/CMP)
• Investigated impacts of E10K
  – Different administrative model
  – EMC integration issues

Page 14: NSI Registry Engineering & Operations Update

Definitive Results

• Excellent Java and C code portability
• S80 clear performance leader, benchmarks and real-world
  – Approximately 3 times the throughput per CPU vs. E10K
  – Noticeably improved Java performance (!)
• Robust HA implementation
• Complete 64-bit environment
• Native file system and volume management; excellent EMC integration
• Impressive and thorough support
  – Demonstrated appreciation for multi-vendor, mission critical computing

Page 15: NSI Registry Engineering & Operations Update

Scaling DNS

• Domain name resolutions on A Root
  – 4Q99 - 220M per day
  – 1Q00 - 430M per day
  – 2Q00 - 650M per day
  – 4Q00 - 1.5B per day, more?
• Need 64-bit machines to scale past 4GB/23M domain name wall
• Developing bind extensions for high performance gTLD
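Two back-of-the-envelope figures follow from the numbers on this slide: the memory budget per domain name implied by the "4GB/23M" wall, and the average query rate behind each daily resolution count. The sketch assumes "4GB" means a full 32-bit address space of 2^32 bytes and that load is spread evenly across the day; real peaks would be higher than these averages.

```python
# Assumption: "4GB" is the 2**32-byte limit of a 32-bit address space.
ADDRESS_SPACE_BYTES = 2 ** 32
NAMES_AT_WALL = 23_000_000
SECONDS_PER_DAY = 86_400

# Implied memory available per domain name when the wall is reached.
bytes_per_name = ADDRESS_SPACE_BYTES / NAMES_AT_WALL
print(f"about {bytes_per_name:.0f} bytes per domain name")

# A Root resolutions per day, from the slide (4Q00 is the slide's projection).
resolutions_per_day = {"4Q99": 220e6, "1Q00": 430e6, "2Q00": 650e6, "4Q00": 1.5e9}

# Average queries per second, assuming a perfectly flat daily load.
avg_qps = {q: n / SECONDS_PER_DAY for q, n in resolutions_per_day.items()}
for quarter, qps in avg_qps.items():
    print(f"{quarter}: about {qps:,.0f} queries/second on average")
```

The per-name figure is only an upper bound on what each zone entry could occupy, since the server's code and other data also live in the same address space.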

Page 16: NSI Registry Engineering & Operations Update

64-bit DNS Evaluation

• Engaged Unix vendors to aid with in-house evaluation of 64-bit mid-range Unix servers
  – HP N4000, IBM H70, Sun E3500
• E3500 eliminated early -- scale and 64-bit issues
• H70 within 15% of N4000, upcoming upgrade substantially faster
• Chose M80 as new root/gTLD platform
• Using E4500s as alternate platform and placeholder for UltraSparcIII generation

Page 17: NSI Registry Engineering & Operations Update

The Dot Problem

[Chart: Resolutions per day. A Root meltdown? (y-axis 0 to 250,000,000)]

Page 18: NSI Registry Engineering & Operations Update

Dot Diagnosis and Fix

• Too much load for existing E450
• Qualified and put into production the [evaluation] H70
  – Greater than 60% increased throughput
  – Jump from 220M resolutions per day to over 400M
• Qualified and put into production an S80 as placeholder for upcoming M80 deployment
  – Greater than factor of three improvement over previous E450
• Tweaked TCP keepalive defaults and bind select loop
• Filtered dynamic updates
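The H70 figures can be cross-checked with simple arithmetic. The sketch below treats 400M as a lower bound, since the slide says "over 400M"; the function name is illustrative.

```python
def pct_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Resolutions per day before and after the H70 went into production.
before_h70 = 220e6
after_h70 = 400e6  # lower bound: the slide says "over 400M"

increase = pct_increase(before_h70, after_h70)
print(f"at least {increase:.0f}% more resolutions per day")
```

An increase of at least about 82% in served resolutions is consistent with the "greater than 60% increased throughput" claim.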

Page 19: NSI Registry Engineering & Operations Update

The New Dot

[Chart: A Root resolutions per day with H70 (y-axis 0 to 500,000,000)]

Page 20: NSI Registry Engineering & Operations Update

Packet Drops

[Chart: Percent packets dropped, day of H70 deployment. Annotations: "Deployed 11 a.m."; "Current" time (9 a.m. day after)]

Page 21: NSI Registry Engineering & Operations Update

Upcoming access - www.dnsentral.net

