Date post: | 28-Nov-2014 |
Category: |
Technology |
Upload: | patrick-mckenzie |
PRODUCTIZING TWILIO APPLICATIONS
Patrick McKenzie – Kalzumeus Software
My Business
Twilio Has The Power To Make You…
Sob softly at 3 AM in a cold, wet, dark room
How could I have avoided that?
Process: Do not push new code to production at 5 PM on Friday night.
Process: Test on staging server first. Fail the deploy if core features do not work as expected.
Tech: Switch to idempotent queues.
Tech: How about we don’t call the same person 50 times in five minutes?
Tech: Activity spike 500x historical max = Shut. Down. Everything.
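One way to read "idempotent queues" is that queuing the same job twice has the same effect as queuing it once. The following is a minimal in-memory sketch of that idea, not the system described in the talk. The class name, the `idempotency_key` parameter, and the example key format are all assumptions made for illustration; a real deployment would keep the seen-keys set in a database or the queue backend.

```python
import queue


class IdempotentCallQueue:
    """In-memory stand-in for a job queue that ignores duplicate jobs."""

    def __init__(self):
        self._seen_keys = set()
        self._jobs = queue.Queue()

    def enqueue(self, idempotency_key, phone_number, message):
        """Queue a call once per key. Returns False if the key was already queued."""
        if idempotency_key in self._seen_keys:
            return False
        self._seen_keys.add(idempotency_key)
        self._jobs.put((phone_number, message))
        return True

    def pending(self):
        """Number of jobs waiting for a worker."""
        return self._jobs.qsize()


# Hypothetical key: one reminder per appointment, however often the code re-runs.
calls = IdempotentCallQueue()
calls.enqueue("appointment-42-reminder", "+15555550100", "Your appointment is tomorrow.")
calls.enqueue("appointment-42-reminder", "+15555550100", "Your appointment is tomorrow.")
```

With a key derived from the business event rather than from the attempt, a retry loop or a double deploy cannot turn one reminder into fifty calls.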
Testing Twilio Apps
Testing Pitfalls With Twilio
Testing is dangerous
Testing trivial changes often requires manual work
Your view code (TwiML) will frequently blow up business logic
Poor separation of concerns between model, view, controller, Twilio libraries, and Twilio API
Many classes of bugs are not exercised by automated testing
Treat All Guns As Loaded
What To Test
Business logic, business logic, business logic
Scheduling calls / SMSes per business rules
Call flow
Am I calling the Twilio API the way Twilio expects?
TwiML looks OK?
Parameters for requests passed correctly?
Does stuff actually work?
Don’t Contact Twilio In Tests
Makes tests slow
Potentially dangerous
Bought numbers in unit test. Twilio.revenue += 340
Hurts reproducibility
Instead, record and play back (VCR gem, etc.)
Not Ruby? Use the Twilio API explorer, copy/paste response to mock.
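The "copy/paste response to mock" advice can be sketched in Python with the standard library alone. Everything named here is an assumption for illustration: `start_call` is a hypothetical wrapper in your own code (not a Twilio library method), and the recorded JSON is a made-up stand-in for a response you would paste in after one real request.

```python
import json
from unittest import mock

# Hypothetical recording: paste a real response here once, then replay it forever.
RECORDED_RESPONSE = json.loads('{"sid": "CA_recorded_example", "status": "queued"}')


def schedule_reminder(client, to_number, from_number, twiml_url):
    """Business logic under test: start a call and report whether it was queued."""
    call = client.start_call(to=to_number, from_=from_number, url=twiml_url)
    return call["status"] == "queued"


def test_schedule_reminder_replays_recording():
    # The fake client never touches the network, so the test is fast and safe.
    fake_client = mock.Mock()
    fake_client.start_call.return_value = RECORDED_RESPONSE

    assert schedule_reminder(
        fake_client, "+15555550100", "+15555550199", "https://example.test/twiml"
    )
    fake_client.start_call.assert_called_once_with(
        to="+15555550100", from_="+15555550199", url="https://example.test/twiml"
    )
```

The test checks both halves of the slide: the business logic behaves correctly given a known response, and the parameters handed to the client are the ones intended.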
Use localtunnel in development
Quicker than “FTP new version to site”
Won’t break stuff for real customers
Staging Servers Are Required
Staging = Production - Customers
“Same” hardware, configurations, etc., different Twilio numbers
Ban the Internet (except Twilio) from servers
Strongly recommend no real data in staging DB
Staging servers good for automated test calls
Staging Servers Protect Production
Prior to pushing to production, push to staging.
Run a script to automatically drive website and telephone, verifying that stuff actually works.
Fail deploy to production if anything goes wrong.
Adds ~5 minutes to a deploy, will save you outages, catastrophic blowups, and your sanity.
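The gate described above can be sketched as a function that runs named checks against staging and refuses the production deploy if any of them fails or raises. The check names and the lambdas are placeholders; the real checks would drive the website and place test calls, which this sketch does not attempt.

```python
def failed_checks(checks):
    """Run (name, callable) pairs and return the names that failed or raised."""
    failures = []
    for name, check in checks:
        try:
            passed = bool(check())
        except Exception:
            # A check that crashes counts as a failure, not as a pass.
            passed = False
        if not passed:
            failures.append(name)
    return failures


def deploy_allowed(checks):
    """Production deploy proceeds only when every staging check passed."""
    return len(failed_checks(checks)) == 0


# Placeholder checks standing in for scripted website and telephone tests.
staging_checks = [
    ("signup page loads", lambda: True),
    ("test call connects", lambda: False),
]
if not deploy_allowed(staging_checks):
    print("Deploy blocked:", failed_checks(staging_checks))
```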
Modeling Calls
“How Do We Do A Call Tree?”
Case Statements Considered Harmful
Easy to introduce subtle bugs
Very difficult to test
Requires manual testing (with a phone!?)
Tightly couples business logic w/ Twilio
Hard to maintain
Adding menu item => stuff breaks
Change a number => stuff breaks
Restructure flow => stuff breaks
A Better Way
You’ll Appreciate This Later
What To Use State Machines For?
Call flows
Business logic testable (in model)
Forces similar organization on model, view, controller, and vocal assets
SMS flows
Necessity for contact in the first place
Avoid easiest catastrophic failure mode with Twilio
Specifics To Modeling Calls
Each call gets a DB/model object
Model tracks call state
Set state to “processing” prior to initiating call (or at entrance to Twilio script for inbound)
Then, transition based on input, using each transition to:
trigger side-effects (updating DB, etc.)
present user with view state (voice, etc.)
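The modeling steps above might look like the following in miniature. This is a sketch under assumptions, not the talk's implementation: the state names other than "processing", the events, and the prompt texts are invented, and the `history` list stands in for whatever database update a transition would trigger.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("processing", "answered"): "greeting",
    ("greeting", "1"): "confirmed",
    ("greeting", "2"): "cancelled",
    ("greeting", "timeout"): "no_response",
}

# View state presented to the caller on entering each state (hypothetical prompts).
VIEWS = {
    "greeting": "Press 1 to confirm your appointment or 2 to cancel.",
    "confirmed": "Thanks, your appointment is confirmed.",
    "cancelled": "Your appointment has been cancelled.",
    "no_response": "We did not get a response. Goodbye.",
}


class Call:
    """One model object per call; it owns the call's state."""

    def __init__(self, call_id):
        self.call_id = call_id
        self.state = "processing"  # set before the call is initiated
        self.history = []          # stand-in for a persisted audit trail

    def handle(self, event):
        """Apply one input, record the side effect, and return what to say next."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {self.state!r} on {event!r}")
        new_state = TRANSITIONS[key]
        self.history.append((self.state, event, new_state))
        self.state = new_state
        return VIEWS[new_state]
```

Because the flow lives in a table on the model, a test can walk every branch by calling `handle` with strings, and no phone is needed.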
Answering Machines
Twilio’s IfMachine = Continue
Wait until call recipient says something
If they don’t say something, must be a machine.
If they do say something, maybe still a machine?
Error rates ~20% in my limited experience
Problems With IfMachine=Continue
“I tried a test call to myself and it never started talking. I’m concerned my customers would hang up before my message plays.”
If you don’t pick up beep correctly, first several seconds of message does not get recorded.
“My customers hit 1 and nothing happens.”
Other Options (Not Answers)
Give machines/humans the same message.
Give machines/humans the same message, but force a keypress (“1”) prior to talking. This coerces most answering machines/voicemails into starting recording, even early.
“This is an automated message from Your Company Here. Press 1 to hear your message.”
<Gather> their input. If input, play human message. If none, play answering machine message.
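The "press 1" approach can be expressed as a small TwiML document. The sketch below builds one with Python's standard XML library rather than a Twilio helper library. It assumes that a `<Gather>` which receives no digits falls through to the next verb, and that on a keypress Twilio requests the `action` URL; the URL, the timeout value, and the handler behind it (which would play the human message) are hypothetical.

```python
import xml.etree.ElementTree as ET


def press_one_twiml(company, action_url, machine_message):
    """Build TwiML that asks for a keypress and falls back to a machine message."""
    response = ET.Element("Response")

    # A human who presses a key is sent to action_url (assumed to play the human message).
    gather = ET.SubElement(
        response, "Gather", numDigits="1", action=action_url, timeout="5"
    )
    prompt = ET.SubElement(gather, "Say")
    prompt.text = (
        f"This is an automated message from {company}. "
        "Press 1 to hear your message."
    )

    # No keypress: assume an answering machine and leave the machine message.
    fallback = ET.SubElement(response, "Say")
    fallback.text = machine_message

    return ET.tostring(response, encoding="unicode")


print(press_one_twiml(
    "Your Company Here",
    "https://example.test/human-message",
    "Please call us back to confirm your appointment.",
))
```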
Be Careful With Answering Machines
Hit 5 To Confirm Your Appointment
Be Careful With Answering Machines
Message Erased
This Is A Real Problem
We are that stupid.
Security
Check Your Application For…
Application security issues
Unintended information disclosure
Catastrophic degradation during failure conditions
The 4Chan Rule
Outgoing Call Security
Educate users regarding proper use. This will require firing some of them.
Establish per-account, per-destination, and global rate caps. Review manually after triggers.
Have a global “Stop all outgoing calls” button.
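The caps and the stop button can live in one gate that every outgoing call must pass. This is a sliding-window sketch with made-up limits and a five-minute window; the class and attribute names are assumptions, and a refused call would in practice also raise a flag for the manual review the slide mentions.

```python
import time
from collections import defaultdict, deque


class OutgoingCallGate:
    """Per-account, per-destination, and global caps, plus a global stop switch."""

    def __init__(self, per_account, per_destination, global_cap, window_seconds=300):
        self.per_account = per_account
        self.per_destination = per_destination
        self.global_cap = global_cap
        self.window_seconds = window_seconds
        self.stopped = False  # the "Stop all outgoing calls" button
        self._global = deque()
        self._by_account = defaultdict(deque)
        self._by_destination = defaultdict(deque)

    def allow(self, account, destination, now=None):
        """Return True and record the call only if no cap would be exceeded."""
        if self.stopped:
            return False
        now = time.time() if now is None else now
        buckets = [
            (self._global, self.global_cap),
            (self._by_account[account], self.per_account),
            (self._by_destination[destination], self.per_destination),
        ]
        for timestamps, cap in buckets:
            # Drop calls that have aged out of the window before counting.
            while timestamps and timestamps[0] <= now - self.window_seconds:
                timestamps.popleft()
            if len(timestamps) >= cap:
                return False
        for timestamps, _ in buckets:
            timestamps.append(now)
        return True
```

Setting `gate.stopped = True` refuses every call regardless of the caps, which is the property the stop button needs.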
Most Important Part of Data Security
This call could end up over the PA at Macy’s.
Incoming Call Security
Caller IDs can be spoofed. Do not gate important stuff on them.
“Thanks for calling our automated system. Put in your task code to continue.”
Task code: random ID of 4 to 6 digits. Expires in 1 hour. If possible, flush codes if > 3 failures in a row.
Per-account call-in numbers when feasible. Increases security and cuts down on support costs.
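A task-code scheme along these lines can be sketched with the standard library. The class and method names are hypothetical, storage is an in-memory dictionary, and the failure counter is global rather than per caller, which is a simplification. The sketch also ignores the small chance that two live codes collide.

```python
import secrets
import time


class TaskCodes:
    """Short-lived numeric codes that identify a task without trusting caller ID."""

    def __init__(self, digits=6, ttl_seconds=3600, max_failures=3):
        self.digits = digits
        self.ttl_seconds = ttl_seconds
        self.max_failures = max_failures
        self._codes = {}      # code -> (task_id, expiry timestamp)
        self._failures = 0    # consecutive failed attempts

    def issue(self, task_id, now=None):
        """Create a random code for task_id that expires after ttl_seconds."""
        now = time.time() if now is None else now
        code = "".join(secrets.choice("0123456789") for _ in range(self.digits))
        self._codes[code] = (task_id, now + self.ttl_seconds)
        return code

    def redeem(self, code, now=None):
        """Return the task_id for a valid, unexpired code, else None."""
        now = time.time() if now is None else now
        entry = self._codes.get(code)
        if entry is not None and entry[1] > now:
            self._failures = 0
            return entry[0]
        self._failures += 1
        if self._failures > self.max_failures:
            # More than max_failures misses in a row: flush every outstanding code.
            self._codes.clear()
            self._failures = 0
        return None
```

`secrets` is used instead of `random` because the codes act as credentials and should not be predictable.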
Scaling
One Commodity Server Has…
6 hours per working day
3,600 seconds per hour
~25 requests per second
~3 requests per 2 minute phone call
180,000 calls/day
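The 180,000 figure follows directly from the four numbers on the slide. Worked through in Python, using only those numbers:

```python
hours_per_working_day = 6
seconds_per_hour = 3600
requests_per_second = 25
requests_per_call = 3

# Requests one server can handle in a working day.
requests_per_day = hours_per_working_day * seconds_per_hour * requests_per_second

# Each call costs about three requests, so divide to get calls.
calls_per_day = requests_per_day // requests_per_call

print(requests_per_day)  # 540000
print(calls_per_day)     # 180000
```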
People Hate Numbers So…
Do you need to call all of Little Rock?
Why Rate Limit Then?
Control costs to your business and customer.
Protect customer from crushing their offline processes which are feeding to/from the phones.
“Great that it scales. By the way, can we get an off button? To turn off calls for a few hours?”
“Why do you need an off button?”
“Our operators sometimes get called away from their desks, for meetings and whatnot.”
“Certainly. How many operators do you have?”
“Two.”
Random Advice
Random Grab Bag Of Advice
Never contact Twilio in request/response cycle. Queue requests, use worker process.
Fiverr.com for voice actresses. Find one you like, put her on retainer.
Record copious information about errors. It is very hard to get an individual answer to “What did your customer do to hear that unspecified ‘Something broke’ message?”
Fail closed: default to not making the call.
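"Fail closed" can be made concrete as a decision function whose only path to True is every precondition being positively confirmed. The two callables below, `calls_enabled` and `lookup_consent`, are hypothetical hooks into your own system; any error or ambiguous answer from them results in no call.

```python
def should_place_call(recipient, calls_enabled, lookup_consent):
    """Return True only when every precondition is explicitly confirmed."""
    try:
        if calls_enabled() is not True:
            return False
        # Anything other than an explicit True (None, "maybe", 0) means no call.
        return lookup_consent(recipient) is True
    except Exception:
        # Fail closed: an error while deciding is treated as "do not call".
        return False


def broken_lookup(recipient):
    raise RuntimeError("database unavailable")


print(should_place_call("+15555550100", lambda: True, lambda r: True))  # True
print(should_place_call("+15555550100", lambda: True, broken_lookup))   # False
```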
Thanks For Listening
http://www.kalzumeus.com
[email protected]
I’m patio11 on Twitter or HN.
I love talking about this. Feel free to get in touch.