Date post: 23-May-2020
Constructing Open Source SDKs for Ops Teams with REST and GraphQL

Chris Wahl

Chief Technologist @ Rubrik

Author of Networking for VMware Administrators

Open Source Enabler @ Rubrik Build

ex Datanauts Podcast host

🥑 he/him

Twitter: @ChrisWahl

GitHub: chriswahl

LinkedIn: /wahlchris

Blog: Wahl Network

@ChrisWahl | #DevWeek2019 3

https://twitter.com/AxolotlCure/status/1136284938830045184

This is a story about toil

And a lot of learning through triumph and mistakes

The kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows

- Toil

Life of an operator

• At the end of the release cycle

• “Here’s a thing, make it work, keep it working”

• Myriad of systems to understand and maintain while being short staffed

I need a one-liner or script to accomplish this task so I can copy and paste it into my environment, solve my problem, and get back to putting out a hundred other fires

- Systems Administrators

Abuse from Crude Tools

Tools like AutoIt

• Script GUI actions using a DSL

• The ultimate “sad panda”

Key Ingredients

RESTful API

Operator Audience

Free Time

SDK

Initial Research

• Our audience preferred Microsoft PowerShell

• Auto generation of SDK was ugly

• Our Swagger specification was non-standard

• Decided to craft a bespoke SDK

The Mission

• Give operators a familiar tool to manage our product and remove toil

• Use my background as an operator to control the UX

• Selfishly: Learn how to build an SDK

Project Plan

• Everything in GitHub as an open source project

• MIT licensing (Legal 👍 )

• One project per repository

• Official product support for projects

• Unit tests for new features

• External CI: AppVeyor, Azure Pipelines

• Internal CI: CircleCI

• Integration of Jira and GitHub via Zapier

People use this thing?

The mysterious tale of unloved APIs

Our API’s Original Purpose

• Distributed systems to chat with each other

• Supply the GUI with an interface

This created friction

• There were no API versions

• Breaking changes were normal

• Standards for model, params, enums, etc. did not exist

• The product surface area was rapidly expanding

We Made Versions!

• Internal

• meant for testing and developing new features and for providing command and control endpoints for the software itself.

• Versioned (Vn)

• meant for public consumption with a declaration on versioning, deprecation, and when breaking changes would be introduced.

API versioning does not prevent breaking changes. It just helps control when, where, and how the break occurs. Someone must still update their code.

- Me

More Cleanup

• Placed major integrations at the parent (root) level

• Leveraged HTTP methods to simplify workflows

• Used Boolean field naming conventions

Ugly: POST to “/add_node” and “/remove_node/{id}”

Pretty: POST to “/node” and DELETE to “/node/{id}”

Start with ‘has’, ‘is’ or ‘should’ to make it clear that it is a Boolean field

Examples: ‘hasRootAccess’, ‘isAdmin’ and ‘shouldDoSomething’
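
The two cleanup rules above can be sketched in a few lines. This is only an illustration: the `ROUTES` table, `describe`, and `is_clear_boolean_name` are hypothetical names, not part of any product API shown in the talk. The point is that the path names the resource, the HTTP method carries the verb, and a Boolean field announces itself through its prefix.

```python
import re

# Hypothetical route table: one noun-based resource, with the HTTP method
# carrying the verb instead of the path.
ROUTES = {
    ("POST", "/node"): "add a node",
    ("DELETE", "/node/{id}"): "remove a node",
}

# A clear Boolean name starts with has/is/should followed by a capital letter.
BOOLEAN_PREFIX = re.compile(r"^(has|is|should)[A-Z]")


def describe(method: str, path: str) -> str:
    """Look up what a method/path pair does, or report that it is unknown."""
    return ROUTES.get((method.upper(), path), "unknown route")


def is_clear_boolean_name(field: str) -> bool:
    """Return True when a camelCase field name starts with has/is/should."""
    return bool(BOOLEAN_PREFIX.match(field))


print(describe("delete", "/node/{id}"))
print([is_clear_boolean_name(n) for n in ("hasRootAccess", "isAdmin", "admin")])
```

A check like `is_clear_boolean_name` could run as a lint step against an API specification, which is one way to keep the convention from drifting.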

The sooner you start to code, the longer the program will take.

- Roy Carlson

Internal Became the Hypnotoad

• No incentives for versioning

• Over 95% of the API resided in Internal

The Universal Solvent

Embracing our audience further

Too Much Complexity

• Each function within the SDK was a closed loop

• The community found it too difficult to contribute

• A new architecture was needed

SDK Design Goal

API File

• Gather information for each supported endpoint

• Supply the SDK with methods, params, status codes, etc.

• Version the data for backwards compatibility

Generic Functions

• Functions look at the API File to understand their purpose

• Functions can alter their state based on the target product version
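
A minimal sketch of the API File idea, assuming a layout the talk does not show: the `Endpoint` fields, the `Get-Widget` function name, and both URIs are invented for illustration. Each SDK function looks up its own entry and picks the definition that matches the target product version, so supporting a new release means editing data, not rewriting functions.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    min_version: tuple  # first product version that uses this definition
    method: str
    uri: str
    success_code: int


# Hypothetical API file: each SDK function name maps to one definition per
# product version range. None of these URIs come from the talk.
API_FILE = {
    "Get-Widget": [
        Endpoint((1, 0), "GET", "/api/v1/widget", 200),
        Endpoint((5, 0), "GET", "/api/v2/widget", 200),
    ],
}


def parse_version(text: str) -> tuple:
    """Turn '5.1' into (5, 1); a bare '5' is padded to (5, 0)."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (2 - len(parts)))


def resolve(function_name: str, product_version: str) -> Endpoint:
    """Pick the newest definition that the target product version supports."""
    target = parse_version(product_version)
    candidates = [e for e in API_FILE[function_name] if e.min_version <= target]
    if not candidates:
        raise LookupError(f"{function_name} is not available on {product_version}")
    return max(candidates, key=lambda e: e.min_version)


print(resolve("Get-Widget", "4.2").uri)
print(resolve("Get-Widget", "5.1").uri)
```

Keeping the definitions in one data structure also lowers the bar for contributors, since adding an endpoint becomes a data change that a generic function can pick up.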

Product versions 1.0+

Product versions 5.0+

Enablement and Communication

Too focused on the technology

Not enough focus on the hygiene

Lots of questions from our customers

General fear of GitHub and coding

More was needed

Choose Your Own Adventure

Educational Workshops for Operators

Communication Efforts

The rules of versioning and deprecation.

Future deprecation of endpoints / resources.

New or updated endpoints / resources.

And then GraphQL appeared

There goes the neighborhood

You haven't mastered a tool until you understand when it should not be used.

- Kelsey Hightower

Initial Research in 2017

• Dramatic speed improvements for the GUI

• As more objects are added, REST continues to fall behind

• Simple to query all objects and use cursor / pagination

• More flexibility with our returned values
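
To make the cursor / pagination point concrete, here is a rough sketch of a client loop. The talk does not show the actual schema, so this assumes the common Relay-style connection shape (`edges`, `pageInfo`, `endCursor`, `hasNextPage`). `run_query` is a hypothetical in-memory stand-in for a real GraphQL call, and its cursor is a plain offset purely for the simulation; real servers usually return opaque cursors.

```python
from typing import Iterator, Optional

# Stand-in data set; a real client would send the query over HTTP instead.
OBJECTS = [f"object-{n}" for n in range(1, 8)]


def run_query(first: int, after: Optional[str]) -> dict:
    """Simulate a GraphQL connection field that pages through OBJECTS."""
    start = 0 if after is None else int(after)
    page = OBJECTS[start:start + first]
    end = start + len(page)
    return {
        "edges": [{"node": name} for name in page],
        "pageInfo": {"endCursor": str(end), "hasNextPage": end < len(OBJECTS)},
    }


def fetch_all(page_size: int = 3) -> Iterator[str]:
    """Follow endCursor from page to page until hasNextPage is False."""
    cursor = None
    while True:
        result = run_query(first=page_size, after=cursor)
        for edge in result["edges"]:
            yield edge["node"]
        if not result["pageInfo"]["hasNextPage"]:
            break
        cursor = result["pageInfo"]["endCursor"]


print(list(fetch_all()))
```

The caller never computes offsets; it only hands back the cursor it was given, which is what keeps "query all objects" simple from the client side.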

Stress tested load times

95th percentile load times with GraphQL: 3.256 seconds

95th percentile load times with REST: 6.619 seconds

Since Then

Added GraphQL to our on-premises product.

o Reporting

o Dashboards

o Various other components

Constructed a SaaS platform with GraphQL as the standard API

o Started from scratch

o Using what we learned

o Lots of tweaking

Challenges

Schema is in flux

There are no versions

Documentation holy wars

We’re all still learning GraphQL

Graph-Que-What?

Current State

• Schema tools (Voyager, GraphiQL) for visualization

• Internal construction of new SDKs

• Existing auth methods (e.g. tokens) are valid globally

Base platform will continue with REST and GraphQL

SaaS platform will remain entirely GraphQL

Using GitHub private repos for development

SDK Development

Let use cases drive stack-ranking

Mimic a near-identical UX

Educate and enable in parallel

Invite early-adopters and give them checklists

Takeaways

A bit of navel gazing

If we could do it all over again

• Increase collaboration with engineering and support

• Create incentives to document and polish the API

• Make documentation a top priority

• Educate internal stakeholders on API usage

• Bring (more) operators into the SDK build process

Use cases, UX, testing, feedback

