@jtahoyle #IPEXPO mirrorweb.com Confidential and Proprietary. Copyright © by MirrorWeb Limited. All Rights Reserved.
The Power of Serverless: How we bet big on the cloud with transformational results
Jamie Hoyle, MirrorWeb
Who am I?
• VP, Product at MirrorWeb
• Responsible for facilitating product direction and developing future product strategy
• Digital transformation advocate - which is why I’m here!
• Lifelong Bury FC fan hoping I’ll have a team to support on Saturday
What do we do?
• Digital archiving and compliance specialist
• Web archiving, social media archiving, customer journey, data analytics
• Wide collection of clients - public sector, brand preservation, FS&I regulatory
Client base
Large FS&I Firms
The engineering challenge
Capture, processing, and preservation of Actual Big Data
FS&I regulatory compliance requirements
The engineering challenge
• We deal with huge amounts of data
• There are lots of ways that we have to collect data, and we have to store all that data in its original format until the end of time*
• We have to use lots of different methods to ingest data, which means we have to maintain a lot of different applications
• We have to process all that information - analytics, screening, full-text search
• Our product has to have a robust pipeline to ensure these different content types are correctly processed and made accessible
• Our data ingest and workload requirements vary on a day-by-day basis
• We need flexible capacity and have to be able to scale to any size very quickly
The engineering challenge
• We offer regulatory compliance services
• We can never, ever, ever lose data, or fail to collect any data
• We have to be always-on and always-reliable - HA clusters, no maintenance windows
• We have to be ready to provide any data in court-admissible formats in any jurisdiction at any time
• All our data has to be available all the time - no cold stores or tapes
• Our capture surface for FS&I changes frequently depending on market trends. Instagram, Snapchat, WhatsApp…
• We have to iterate fast and ship constantly to keep up with demand
“We have to be always on, all the time, our data has to be constantly warm, we need to do instant analysis on that
data, and we need to ship to production on a daily basis.”
The solution?
Serverless.
What does it do? Does it do things? Let’s find out!
What is serverless?
• A model where your cloud provider, rather than you, allocates and manages the servers
• Yes, there are still servers somewhere… you just don’t have to deal with them!
• Pay-per-request, not a fixed monthly cost
• Often manifested as a microservice architecture
Why did we turn to serverless?
• Easy to maintain
• Deployments are a dream
• Built-in fault tolerance by design
• For all of our use cases, cheaper than traditional infrastructure… and almost certainly cheaper for all of your use cases too
• Vast scale
Isn’t it just code?
• It can be…
• …but you won’t get very far.
• Deploying serverless as code doesn’t matter if your only database instance goes down for 12 hours.
• Your service is only as resilient as its weakest single point of failure.
Understanding the paradigm
• Serverless isn’t a magic bullet. You can still write bad serverless functions that will still fail.
• Don’t try to reuse monolithic codebases. Design your codebase for the architecture that you’re using.
• You can just run your Python WSGI or Node Express.js app as a single serverless function… but why would you?
• Remember: serverless platforms are designed for running lots of small things.
Understanding the paradigm
• Serverless puts us back in a land of limited compute resources
• The more efficient your code, the fewer resources you use…
• …and you’re paying for resource usage now, not having a server sit there 24/7!
• Saving a few hundred milliseconds per execution adds up to big savings. Micro-optimisations matter again!
• Profiling tools help - New Relic, Datadog etc.
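To make the "milliseconds equal money" point concrete, here is a rough sketch of the arithmetic. It assumes a billing model where you pay for execution time multiplied by allocated memory (GB-seconds) plus a small per-request fee. The function name, parameters, and prices are illustrative placeholders, not real provider rates.

```python
def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 price_per_gb_second, price_per_million_requests):
    """Estimate a monthly bill for one serverless function.

    All prices are passed in by the caller; the values used below are
    made-up placeholders, not a real price list.
    """
    # Compute usage: seconds of execution scaled by allocated memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    # Flat fee charged per request, quoted per million requests.
    request_cost = invocations / 1_000_000 * price_per_million_requests
    return gb_seconds * price_per_gb_second + request_cost


# Hypothetical workload: one million invocations a month at 1024 MB.
before = monthly_cost(1_000_000, 500, 1024, 0.00002, 0.2)
after = monthly_cost(1_000_000, 300, 1024, 0.00002, 0.2)
print(f"before: {before:.2f}, after: {after:.2f}, saved: {before - after:.2f}")
```

Under these assumed numbers, shaving 200 ms off each execution cuts the compute part of the bill by 40%, which is why profiling pays for itself.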
“Done properly, parenting [serverless] is a heroic act. Done properly.”
– Edna Mode, Incredibles 2
Building resilient serverless applications
• Your cloud service provider has vast capacity to scale, and resources to support their technology that far outstrip yours
• Think about the rest of your architecture - what interfaces with your serverless code? Is it as resilient as your serverless tech?
• Use managed services as much as you can - but make sure you know how to recover when things go wrong
Building efficient serverless applications
• Use microservices
• Split your codebases into as many different serverless functions as is reasonable
• Share core functions as libraries (or Lambda Layers on AWS)
• Choose your language wisely - AWS Lambda now supports Node.js, Golang, Python…
• Continuous improvement.
Case study: MirrorWeb Social Media
(other clouds are available)
SMA (social media archiving) - Platform Support
YOUR PLATFORM HERE
SMA - the numbers
• 1,000+ social media accounts archived
• 6,000+ posts archived a day
• 6 core platforms, new platforms added all the time
SMA - the MVP
https://blog.crisp.se/2016/01/25/henrikkniberg/making-sense-of-mvp
Launch Partner
• 450,000 tweets
• 17,000 videos
• 15% of accounts no longer available… this is why we need archives!
1. Triggering a crawl
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function
Server Count: 0
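A minimal sketch of what a scheduler function for this step could look like, assuming the scheduled event simply wakes the function up and it then fans out one asynchronous crawl job per archived account. The function name `social-crawl`, the payload shape, and the `list_accounts` helper are hypothetical, not taken from the MirrorWeb codebase. The AWS call is injected so the sketch runs without any cloud access.

```python
import json

CRAWL_FUNCTION_NAME = "social-crawl"  # hypothetical function name


def make_scheduler_handler(list_accounts, invoke_async):
    """Build a Lambda-style handler that fans out one crawl job per account.

    list_accounts: callable returning the account identifiers to crawl.
    invoke_async:  callable (function_name, payload_bytes) that starts a job.
    """

    def handler(event, context):
        scheduled = 0
        for account in list_accounts():
            payload = json.dumps({"account": account}).encode("utf-8")
            invoke_async(CRAWL_FUNCTION_NAME, payload)
            scheduled += 1
        return {"scheduled": scheduled}

    return handler


def boto3_invoke_async(function_name, payload):
    """Fire-and-forget invocation via boto3 (imported lazily, only if used)."""
    import boto3

    boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: do not wait for the crawl
        Payload=payload,
    )
```

In a deployment you would pass `boto3_invoke_async` as `invoke_async`. In a local test a stub that records calls is enough, which keeps the scheduler small and easy to verify.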
2. Getting content
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Social DynamoDB table
Server Count: 0
3. Getting media
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Media D/L Lambda Function, Social DynamoDB table, Social Media S3 Bucket
Server Count: 0
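As a sketch of the media download step, assume the crawl function hands over a post ID plus a list of media URLs, and a separate function fetches each file and writes it to the bucket. The event shape, bucket name, and key layout are assumptions for illustration. `fetch` and `put_object` are injected so the example runs without network or cloud access.

```python
import hashlib


def media_key(post_id, media_url):
    """Derive a deterministic object key so a retried job overwrites, not duplicates."""
    digest = hashlib.sha256(media_url.encode("utf-8")).hexdigest()
    return f"media/{post_id}/{digest}"


def make_media_handler(fetch, put_object, bucket="social-media-archive"):
    """Build a Lambda-style handler that copies a post's media into object storage.

    fetch:      callable (url) -> bytes
    put_object: callable (bucket, key, body_bytes) -> None
    bucket:     hypothetical bucket name
    """

    def handler(event, context):
        stored = []
        for url in event["media_urls"]:
            key = media_key(event["post_id"], url)
            put_object(bucket, key, fetch(url))
            stored.append(key)
        return {"post_id": event["post_id"], "stored": stored}

    return handler
```

Because the key depends only on the post ID and the URL, running the same job twice writes the same object, which makes retries after a partial failure safe.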
4. Adding webhook support
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Media D/L Lambda Function, Amazon AuroraDB Cluster, Social Media S3 Bucket, API Gateway HTTP request
Managed Count: 4
5. Full text social search
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Media D/L Lambda Function, Amazon AuroraDB Cluster, Amazon Aurora INS/UPD trigger, Social Post Indexing Function, ElasticSearch Service, Social Media S3 Bucket, Amazon API Gateway HTTP(S) Request
Managed Count: 8
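A possible shape for the indexing function, assuming the database trigger delivers the inserted or updated rows in the event and the function turns each row into a flat search document. The event layout, the row fields, and the `index_document` callable are all hypothetical. The search client is injected rather than named, so no specific Elasticsearch API is implied.

```python
def build_index_document(row):
    """Map a stored post row to a flat document for full-text search."""
    text = row.get("text") or ""
    return {
        "id": str(row["id"]),
        "platform": row["platform"],
        "author": row["author"],
        # Collapse runs of whitespace so search sees one clean string.
        "text": " ".join(text.split()),
        "posted_at": row["posted_at"],
    }


def make_indexing_handler(index_document):
    """Build a Lambda-style handler that indexes every row in the event.

    index_document: callable (doc_id, document_dict) -> None
    """

    def handler(event, context):
        indexed = 0
        for row in event["rows"]:
            doc = build_index_document(row)
            index_document(doc["id"], doc)
            indexed += 1
        return {"indexed": indexed}

    return handler
```

Using the row's primary key as the document ID means an update trigger replaces the existing search entry instead of creating a second one.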
6. Image tagging
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Media D/L Lambda Function, Amazon AuroraDB Cluster, Social Media S3 Bucket, Social Media S3 Bucket Trigger, AWS Rekognition (+Lambda), Social Post Indexing Function, ElasticSearch Service, Amazon API Gateway HTTP(S) Request
Managed Count: 8
Look at where we started…
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Media D/L Lambda Function, Social DynamoDB table, Social Media S3 Bucket
Server Count: 0
Look at where we are…
Diagram: Amazon CloudWatch Event, Scheduler Lambda Function, Social Crawl Lambda Function, Media D/L Lambda Function, Amazon AuroraDB Cluster, Social Media S3 Bucket, Multi-cloud S3 DR bucket, Social Post Indexing Function, ElasticSearch Service, AWS Rekognition (+Lambda), Amazon API Gateway HTTP(S) Request
Managed Count: 8
SMA - why serverless?
• Serverless is perfect for event-driven architecture - all social media posts are events
• Unpredictable requirements - needs to scale with client archiving demands on a per-hour basis
• Excellent fault tolerance - a non-core part of the service being unavailable doesn’t impact our ability to archive
• Easy to extend upon - individual features remain as separate codebases that we’ve glued together
SMA - what would have happened?
• In a world without serverless, we’d be running monolithic social media capture servers…
• …storage RAID arrays across multiple datacentres and cloud providers
• HTTP load balancers for webhooks…
• …and we probably wouldn’t have a full extract-transform-load (ETL) pipeline for data analytics.
• It’s transformed our business and won us clients.
In summary
• Serverless isn’t just code – it’s the rest of your architecture, too. Design your systems with that in mind.
• Fully commit to the paradigm. Separate out your code into microservices so you can ship fast and ship often.
• Micro-optimisations matter again. Write efficient code to see the full benefits of the architecture.
• Your cloud service provider has much greater capacity for scale than you do. Make full use of it.
Q&A
Jamie Hoyle
[email protected] @jtahoyle