
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Service Ownership & Improve...

Date post: 23-Jan-2018
Upload: vinu-charanya
How we built a framework to improve infrastructure resource utilization at scale
Transcript
Page 1: [Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Service Ownership & Improve Infrastructure Utilization at Scale

How we built a framework to improve infrastructure resource utilization at scale

Page 2:

Hello!

I am @VinuCharanya

★ Sr. Systems Engineer @Twitter
★ Proud to be a member of @TwitterWomen, @Techwomen and @WomenWhoCode

Page 3:

Agenda

1. History & Context
2. Chargeback @Twitter
3. Kite - Service Lifecycle Manager
4. Impact & Future Work

Page 4:

History & Context

Page 5:

Thousands of MicroServices

Page 9:

INFRASTRUCTURE & DATACENTER MANAGEMENT

CORE APPLICATION SERVICES: TWEETS, USERS, SOCIAL GRAPH

PLATFORM SERVICES: SEARCH, MESSAGING & QUEUES, CACHE, MONITORING AND ALERTING, INGRESS & PROXY

FRAMEWORK/LIBRARIES: FINAGLE (RPC), SCALDING (Map Reduce in Scala), HERON (Streaming Compute), JVM

MANAGEMENT TOOLS: SELF SERVE, SERVICE DIRECTORY, CHARGEBACK, CONFIG MGMT

DATA & ANALYTICS PLATFORM: INTERACTIVE QUERY, DATA DISCOVERY, WORKFLOW MANAGEMENT

INFRASTRUCTURE SERVICES:
• STORAGE: MANHATTAN, BLOBSTORE, GRAPHSTORE, TIMESERIESDB
• COMPUTE: MESOS/AURORA, HADOOP
• DB/DW: MYSQL, VERTICA, POSTGRES

DEPLOY (Workflows)

Page 10:

[Chart: Number of Servers; MESOS/AURORA, HADOOP and MANHATTAN labelled, 67%]

Page 11:

How to get visibility into resources used by individual jobs & datasets?

Page 12:

How to attribute resource consumption to teams/organization?

Page 13:

How do you incentivize the right behavior to improve efficiency of resource usage?

Page 14:

Chargeback @Twitter

Page 15:

Chargeback @Twitter

Ability to meter allocation & utilization of resources per service, per project, per engineering team to improve visibility & enable accountability

Page 18:

Chargeback @Twitter

Features

• Supports diverse Infra Services
• Meters abstract resources at daily granularity
• Detailed Reports

Page 19:

Chargeback @Twitter

Support diverse Infrastructure and Platform Services

1. Resource Catalog: Consistent way to inventory infrastructure resources

• Resource Fluidity: Support primitive (CPU) and abstract resource (“Tweets / second”). Extend existing resource

2. Resource <> Client Identifier Ownership: Map of client identifier to an owner to enable accountability

Page 22:

RESOURCE CATALOG ENTITY MODEL

PROVIDER -(1:N)-> INFRASTRUCTURE SERVICE -(1:N)-> OFFERINGS -(1:N)-> OFFER MEASURES -(1:N)-> OFFER MEASURE COST

Examples:

• TWITTER DC/PUBLIC CLOUD -> COMPUTE -> CORE-DAYS -> $X
• TWITTER DC -> STORAGE -> GB-RAM, FILE ACCESSES, … -> $X, $Y, …
• TWITTER DC -> PROCESSING CLUSTER -> GB-RAM, FILE ACCESSES, … -> $M, $N, …

Page 27:

Chargeback @Twitter

/api/1/measures

    {
      "measures": [
        {
          "measure_id": 1,
          "measure_label": "core-days",
          "measure_unit_label": "per 1 core-day",
          "offering_id": 1,
          "offering_label": "Compute",
          "infrastructure_id": 1,
          "infrastructure_name": "Aurora"
        },
        {
          "measure_id": 2,
          "measure_label": "machine-days",
          "measure_unit_label": "per 1 machine-day",
          "offering_id": 2,
          "offering_label": "zone:tweety",
          "infrastructure_id": 8,
          "infrastructure_name": "Physical Infrastructure"
        },
        …
      ]
    }
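A minimal sketch of how a client might read the `/api/1/measures` response shown on this slide. The sample payload copies only the two entries visible on the slide; the helper name `measures_by_infrastructure` is hypothetical and is not part of the Chargeback API.

```python
import json
from collections import defaultdict

# Sample payload: the two entries visible on the slide, nothing more.
SAMPLE_RESPONSE = """
{
  "measures": [
    {"measure_id": 1, "measure_label": "core-days",
     "measure_unit_label": "per 1 core-day",
     "offering_id": 1, "offering_label": "Compute",
     "infrastructure_id": 1, "infrastructure_name": "Aurora"},
    {"measure_id": 2, "measure_label": "machine-days",
     "measure_unit_label": "per 1 machine-day",
     "offering_id": 2, "offering_label": "zone:tweety",
     "infrastructure_id": 8, "infrastructure_name": "Physical Infrastructure"}
  ]
}
"""


def measures_by_infrastructure(payload: str) -> dict:
    """Group measure labels by the infrastructure service that offers them.

    Hypothetical helper for illustration only.
    """
    grouped = defaultdict(list)
    for measure in json.loads(payload)["measures"]:
        grouped[measure["infrastructure_name"]].append(measure["measure_label"])
    return dict(grouped)


catalog = measures_by_infrastructure(SAMPLE_RESPONSE)
print(catalog)
```

Grouping by `infrastructure_name` mirrors the catalog hierarchy, where each measure hangs off an offering of one infrastructure service.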

Page 28:

So, how do you incentivize the right behavior to improve efficiency of resource usage?

Page 29:

Pricing is one way…

Page 30:

Total Cost of Ownership for Aurora: $X core-day

Cost of Physical Server ($X / day)

Total available Cores, broken down as:

• Operational Overhead
• Headroom
• Production Used Cores
• Non-Prod Used Cores
• Quota Buffer (Underutilized Quota)
• Container Size Buffer (Underutilized Reservation)

Callouts added as the slide builds:

• Total used Cores
• Excess Cores (incl. DR, Spikes, Overallocation)
• Cores used by platform for operations & maintenance
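The slide does not spell out the pricing formula, so the following is only one plausible reading, sketched under stated assumptions: the daily cost of the whole fleet is spread over the cores that can be attributed to tenants, so that headroom and operational overhead are absorbed into the unit price. The class and function names, the choice of which buckets count as billable, and all numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreBreakdown:
    """Cores per bucket, using the bucket labels from the slide."""
    production_used: float
    non_prod_used: float
    quota_buffer: float            # underutilized quota
    container_size_buffer: float   # underutilized reservation
    headroom: float
    operational_overhead: float

    def total_available(self) -> float:
        return (self.production_used + self.non_prod_used
                + self.quota_buffer + self.container_size_buffer
                + self.headroom + self.operational_overhead)

    def billable(self) -> float:
        # ASSUMPTION: only tenant-attributable buckets are billed;
        # headroom and operational overhead are not.
        return (self.production_used + self.non_prod_used
                + self.quota_buffer + self.container_size_buffer)


def price_per_core_day(fleet_cost_per_day: float, cores: CoreBreakdown) -> float:
    """Spread the fleet's daily cost over billable cores only."""
    billable = cores.billable()
    if billable <= 0:
        raise ValueError("no billable cores to spread the cost over")
    return fleet_cost_per_day / billable


# Made-up numbers: 1,000 cores in total, 700 of them billable.
cores = CoreBreakdown(500, 100, 50, 50, 200, 100)
raw_cost = 1000.0 / cores.total_available()     # cost per available core
unit_price = price_per_core_day(1000.0, cores)  # price per billable core
print(raw_cost, unit_price)
```

Under this reading the unit price is higher than the raw cost per core, which is what lets pricing act as an incentive: releasing unused quota or reservation lowers a team's bill.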

Page 34:

Chargeback @Twitter

Features

• Supports diverse Infra/Platform Services
• Meters abstract resources at daily granularity
• Detailed Reports

Page 35:

Chargeback @Twitter

Metering Pipeline (ETL Job)

[Diagram components: INFRASTRUCTURE SERVICE 1, INFRASTRUCTURE SERVICE 2, INGEST METRICS, RAW FACT, TRANSFORMER, RESOLVED FACT, RESOURCE CATALOG, IDENTIFIER OWNERSHIP MAPPING, DATA FIDELITY, REPORT]

Metrics Ingestor

• Schema (client_identifier, offering_measure, volume, metadata, timestamp)

Page 37:

Chargeback @Twitter

Metering Pipeline (ETL Job)

Transformer

1. Resolve Ownership
2. Cost Computation
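The two transformer steps, resolving ownership and computing cost, can be sketched as below. This is an illustration under assumptions, not the actual ETL job: the `RawFact` fields follow the ingestor schema from the slides, while `ResolvedFact`, `transform`, the lookup tables, the unit price and the sample volume are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RawFact:
    # Fields follow the ingestor schema shown on the slides.
    client_identifier: str
    offering_measure: str
    volume: float
    metadata: dict
    timestamp: date


@dataclass(frozen=True)
class ResolvedFact:
    # Hypothetical shape of a resolved fact.
    fact: RawFact
    owner: Optional[str]   # None when no owner is on record
    cost: Optional[float]  # None when no unit price is on record


# Hypothetical lookup tables standing in for the identifier ownership
# mapping and the resource catalog. The price is made up.
OWNERSHIP = {"tweetypie.prod.tweetypie": "TWEETYPIE"}
UNIT_PRICE = {"core-days": 2.0}


def transform(fact: RawFact, ownership: dict, unit_price: dict) -> ResolvedFact:
    """Step 1: resolve ownership. Step 2: compute cost as volume x unit price."""
    owner = ownership.get(fact.client_identifier)
    price = unit_price.get(fact.offering_measure)
    cost = None if price is None else fact.volume * price
    return ResolvedFact(fact, owner, cost)


fact = RawFact("tweetypie.prod.tweetypie", "core-days", 120.0, {}, date(2017, 12, 6))
resolved = transform(fact, OWNERSHIP, UNIT_PRICE)
print(resolved.owner, resolved.cost)
```

Keeping unresolved owners and prices as `None` rather than dropping the fact leaves something for a later data fidelity check to flag.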

Page 40:

Chargeback @Twitter

Metering Pipeline (ETL Job)

Data Fidelity & Reporting

1. Verify Data Integrity & Fidelity
2. Alert when things don’t seem the way they should be
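The slides do not describe the actual fidelity checks. As a rough sketch of what alerting on data that looks wrong could mean, the hypothetical function below compares per-identifier volumes between two daily snapshots and flags large relative changes, new identifiers and identifiers that vanished. The function name, the 50% threshold and the sample data are assumptions.

```python
def abnormal_changes(previous: dict, current: dict, threshold: float = 0.5) -> dict:
    """Flag identifiers whose volume moved by more than `threshold`.

    `previous` and `current` map client identifier -> metered volume for
    two consecutive days. Identifiers that appear for the first time are
    reported as "new"; ones that disappear show up as a -100% change.
    """
    flagged = {}
    for identifier in sorted(set(previous) | set(current)):
        before = previous.get(identifier, 0.0)
        after = current.get(identifier, 0.0)
        if before == 0.0:
            if after > 0.0:
                flagged[identifier] = "new"
            continue
        change = (after - before) / before
        if abs(change) > threshold:
            flagged[identifier] = f"{change:+.0%}"
    return flagged


# Made-up snapshots for illustration.
yesterday = {"a": 100.0, "b": 100.0, "c": 100.0}
today = {"a": 110.0, "b": 20.0, "d": 5.0}
alerts = abnormal_changes(yesterday, today)
print(alerts)
```

The same comparison can serve both purposes named in the talk: catching pipeline inconsistencies and notifying users about an abnormal increase or decrease in their usage.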

Page 43:

Chargeback @Twitter

Metering Pipeline (ETL Job)

[Diagram components: INFRASTRUCTURE SERVICE 1, INFRASTRUCTURE SERVICE 2, EXPORT METRICS, RAW FACT, TRANSFORMER, RESOLVED FACT, RESOURCE CATALOG, IDENTIFIER OWNERSHIP, DATA FIDELITY, REPORT]

Page 44:

Chargeback @Twitter

Features

• Supports diverse Infra/Platform Services
• Meters abstract resources at daily granularity
• Detailed Reports

Page 45:

Chargeback @Twitter

Customers and their Reports

• Infrastructure & Platform Operators: Overall Cluster Growth; Allocation v/s Utilization of resources by Client/Tenant
• Finance & Execs: Budget v/s Spend per Org; Infrastructure PnL; Overall Efficiency & Trends
• Service Owners & Developers: Team Bill; Per Service Allocation vs. Utilization of Resources

Page 46:

INFRASTRUCTURE PNL

Page 48:

CHARGEBACK BILL FOR A TEAM

Page 49:

CHARGEBACK DRILLDOWN FOR A TEAM

Page 50:

Chargeback @Twitter

Features

• Supports diverse Infra/Platform Services
• Meters abstract resources at daily granularity
• Detailed Reports

Page 51:

Chargeback @Twitter

Learnings

1. Invest in data fidelity

• Trust in data is most important.
• Invest in monitoring & alerting for data inconsistencies
• Leverage this for detecting abnormal increase/decrease and notify users

2. Accurate Ownership Mapping

• Static mappings go out of date quickly
• Invest in systems (e.g., Kite) for users to manage it themselves

3. Logical grouping of resources

• Identifiers were too granular and teams were too broad.
• Find a good middle ground and invest in a system (e.g., Kite) to track, understand and maintain

4. Track historical data

• Unit prices change over time
• Orgs / Teams change over time
• Resources get added / removed
• Change history is essential for consistency, which is used for capacity planning

Page 57:

[Diagram components: SERVICE IDENTITY MANAGER, RESOURCE PROVISIONING MANAGER, DASHBOARD (SINGLE PANE OF GLASS), REPORTING, INFRASTRUCTURE SERVICE (three boxes), INFRASTRUCTURE & PLATFORM SERVICE, SERVICE LIFECYCLE WORKFLOWS, METADATA, RESOURCE QUOTA MANAGEMENT, METERING & CHARGEBACK, CLIENT IDENTITY, PROVIDER APIS & ADAPTERS]

Page 58:

Kite @Twitter

• 10,000+ Client Identifiers
• 1,000+ Projects
• 100+ Teams
• 8 Infrastructure Services

Page 59:

Kite @Twitter

Client Identifier Management

Identity System: Built a consistent way to group client identifiers of different infrastructure services into a project and enabled ownership

• Capture Org Structure: Support org structure changes, project transfer workflows to ensure up-to-date ownership of identifiers

• Unify client identifier provisioning workflow: Enables single source of truth and reduces operator pain around provisioning and managing client identifiers.

Page 60:

IDENTITY ENTITY MODEL

BUSINESS OWNER -(1:N)-> TEAM -(1:N)-> PROJECT -(1:N)-> SERVICE/SYSTEM ACCOUNT -(1:N)-> <INFRA, CLIENTID>

Examples:

• INFRASTRUCTURE -> TWEETYPIE -> tweetypie -> tweetypie -> <Aurora, tweetypie.prod.tweetypie>
• REVENUE -> ADS PREDICTION -> prediction -> ads-prediction -> <Aurora, ads-prediction.prod.campaign-x>

Entities are time-varying dimensions
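One common way to treat ownership as a time-varying dimension is to store effective-dated records and look up the record that was valid on a given day. The sketch below assumes that approach; the talk does not say how Kite stores history. The record layout is flattened (business owner and service account are omitted), and the second team name, the dates and the function name are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class OwnershipRecord:
    """One validity interval for an <INFRA, CLIENTID> pair (hypothetical layout)."""
    infra: str
    client_id: str
    project: str
    team: str
    valid_from: date
    valid_to: Optional[date] = None  # exclusive end; None means still current


def owner_on(records, infra: str, client_id: str, day: date) -> Optional[OwnershipRecord]:
    """Return the record that was in effect on `day`, or None if there was none."""
    for record in records:
        if (record.infra, record.client_id) != (infra, client_id):
            continue
        started = record.valid_from <= day
        not_ended = record.valid_to is None or day < record.valid_to
        if started and not_ended:
            return record
    return None


# Made-up history: the identifier moves to a hypothetical team mid-year.
HISTORY = [
    OwnershipRecord("Aurora", "ads-prediction.prod.campaign-x", "prediction",
                    "ADS PREDICTION", date(2017, 1, 1), date(2017, 7, 1)),
    OwnershipRecord("Aurora", "ads-prediction.prod.campaign-x", "prediction",
                    "ADS PLATFORM", date(2017, 7, 1)),
]

march = owner_on(HISTORY, "Aurora", "ads-prediction.prod.campaign-x", date(2017, 3, 1))
september = owner_on(HISTORY, "Aurora", "ads-prediction.prod.campaign-x", date(2017, 9, 1))
print(march.team, september.team)
```

Resolving against the date of the usage, rather than against today's mapping, keeps old bills stable after a project transfer or reorg.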

Page 64:

Impact

Page 65:

10,000+ Client Identifiers

Page 66:

CLAIM OWNERSHIP

Page 67:

PROJECT DISCOVERY

Page 68:

TEAM OVERVIEW

Page 69:

TEAM OVERVIEW

Released unused Resources

Page 70:

TEAM OVERVIEW

Q2 unit price update

Page 71:

TEAM OVERVIEW

New project launch

Page 72:

PROJECT METADATA

Page 73:

AURORA QUOTA MANAGER

Page 74:

Future Work

Page 75:

Impact & Future Work

Future Work

1. Resource provisioning

• Extend Quota Manager and unify the experience into Kite
• Onboard Hadoop, Storage and other systems

2. Enable project deprecation

• Detect unused resources, notify users, trigger deprecation process based on policy

3. Capacity Planning

• Provide historic trends and help with forecast of capacity

Page 80:

@VinuCharanya