About Me
● Data Engineer at Fundbox
● Technology enthusiast - especially data, ML and serverless
● Tennis (go Roger!), Football (Chelsea)
● Beer
About Fundbox
OVERVIEW
Products: Data-driven business credit solutions
Customers: >150k US-based small businesses
Founded: 2013
Locations:
• San Francisco: headquarters
• Tel Aviv: R&D
• Dallas: credit and support
• 225 employees and 12 dogs
America's most promising AI companies
Investors
THE OPPORTUNITY
Tectonic shifts in data landscape
● Expensive: ~$3,500 per underwrite
● Inaccurate: low approval rate
● Overly reliant on personal credit
Data-Native Approach
● Scalable: ~$0 cost per underwrite
● Accurate: high approval rate
● Comprehensive view of business
Data Scientist
Data Engineer
Data Analyst
Data
SQL Reports
ETL
Training
Feature engineering
Data preparation
SSH
Our goal:
Let users think less about how their jobs are run on our infrastructure, and instead focus on their core inquiry.
Principles
Extensible: multiple runtime support, runtime params, low-level API
Simple: easy to use and maintain
Monitored: metrics collection, logs collection
Simple: 3-step job creation
1. Create a folder in GitHub: the user creates a job definition folder in a dedicated GitHub repository.
2. Add job files and config: the user adds a Python or SQL file to the resource folder and creates a config.yaml for the job.
3. Get job notifications: the user gets notified on job events such as job started and job finished.
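The job creation steps imply a small folder contract: a config.yaml plus a Python or SQL file in a resource folder. A minimal sketch of a check for that layout follows. The directory name `resource`, the function name `validate_job_folder`, and the error strings are assumptions made for illustration, not BIJO's documented structure.

```python
from pathlib import Path

# Hypothetical layout (an assumption, not BIJO's documented structure):
#   <job_folder>/config.yaml
#   <job_folder>/resource/<script>.py or <script>.sql
ALLOWED_SUFFIXES = {".py", ".sql"}


def validate_job_folder(job_folder):
    """Return a list of problems found in a job definition folder."""
    root = Path(job_folder)
    problems = []
    if not (root / "config.yaml").is_file():
        problems.append("missing config.yaml")
    resource_dir = root / "resource"
    scripts = []
    if resource_dir.is_dir():
        # Only Python and SQL job files count, per the steps listed above.
        scripts = [p for p in resource_dir.iterdir()
                   if p.is_file() and p.suffix in ALLOWED_SUFFIXES]
    if not scripts:
        problems.append("no .py or .sql file in resource/")
    return problems
```

A check like this could run in CI on the job repository so that a malformed job folder is caught before it is ever scheduled.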
Monitoring
[Diagram: CloudWatch Cron Schedule, StepFunctions, ECR - Container Registry (pull docker image), CloudWatch Logs, SNS Topic, Slack Lambda, http post to #bijo_notifications ("Job A has finished")]
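The notification path in the monitoring diagram (SNS Topic, Slack Lambda, http post to #bijo_notifications) can be sketched as a small Lambda handler. This is a minimal illustration, not the speaker's implementation: it assumes a Slack incoming webhook, and the environment variable name `SLACK_WEBHOOK_URL` and the fallback subject "Job event" are hypothetical.

```python
import json
import os
import urllib.request


def build_slack_payloads(event):
    """Turn an SNS-triggered Lambda event into Slack message payloads."""
    payloads = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        # SNS subjects are optional, so fall back to a generic label.
        subject = sns.get("Subject") or "Job event"
        payloads.append({"text": f"{subject}: {sns['Message']}"})
    return payloads


def post_to_slack(webhook_url, payload):
    """Send one payload to a Slack incoming webhook with an HTTP POST."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def lambda_handler(event, context):
    # SLACK_WEBHOOK_URL is a hypothetical environment variable name.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    for payload in build_slack_payloads(event):
        post_to_slack(webhook_url, payload)
```

Keeping the payload construction separate from the HTTP call makes the formatting logic testable without network access.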
Job dependency service
[Diagram, built up over several slides: StepFunctions, SNS Topic, Sns-to-Slack Lambda (http post to #bijo_notifications), CloudWatch Logs, Job-Dependency Lambda, DynamoDB table, "Run job", S3 "New Object creation", API Gateway REST API (http request)]
Dependency Service: Job-Dependency Lambda, DynamoDB table
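One way to picture the job dependency service is as a table of events each job waits for, with a job becoming runnable once all of its events have arrived. The sketch below is an assumption-heavy illustration: an in-memory dictionary stands in for the DynamoDB table, and the class name, event names, and reset behavior are all hypothetical rather than taken from the talk.

```python
class DependencyTracker:
    """In-memory stand-in for the dependency table (DynamoDB in the diagram)."""

    def __init__(self, dependencies):
        # dependencies: job name -> collection of event names it waits for
        self._required = {job: set(events) for job, events in dependencies.items()}
        self._seen = {job: set() for job in dependencies}

    def record_event(self, event_name):
        """Record an event and return the jobs that are now ready to run."""
        ready = []
        for job, required in self._required.items():
            if event_name not in required:
                continue
            self._seen[job].add(event_name)
            if self._seen[job] == required:
                ready.append(job)
                # Reset so the job can be triggered again on the next cycle.
                self._seen[job] = set()
        return ready
```

For example, a hypothetical `report` job that waits for both a finished upstream job and a new S3 object would only be returned by `record_event` after the second of those events is recorded.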
Principles
REST API: Python Client
Custom Runtime: custom runtime support using Docker containers
Custom runtime support through Docker
Support for any runtime environment and configuration using Docker
Extensible: BIJO APIs
Manager Service
● Run jobs on demand
● Get job execution history
● Get job status
● More…
Notifications
● Send Slack notifications easily
● Plots and images support
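The manager operations listed above (run on demand, get history, get status) could be wrapped in a thin Python client over the REST API. The sketch below is purely illustrative: the class name `JobClient`, the URL paths, and the JSON shapes are invented for this example and are not BIJO's real routes.

```python
import json
import urllib.request


class JobClient:
    """Hypothetical thin client; the paths below are illustrative only."""

    def __init__(self, base_url, send=None):
        self.base_url = base_url.rstrip("/")
        # `send` can be swapped for a fake in tests; defaults to a real HTTP call.
        self._send = send or self._http_send

    @staticmethod
    def _http_send(method, url, body):
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(
            url, data=data, method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    def run_job(self, job_name, params=None):
        """Run a job on demand, optionally passing runtime params."""
        return self._send("POST", f"{self.base_url}/jobs/{job_name}/run", params or {})

    def get_status(self, job_name):
        return self._send("GET", f"{self.base_url}/jobs/{job_name}/status", None)

    def get_history(self, job_name):
        return self._send("GET", f"{self.base_url}/jobs/{job_name}/history", None)
```

Injecting the `send` callable keeps the client usable in notebooks against a real endpoint while letting tests run without any network.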
Job Manager Service
[Diagram, built up over several slides: Client, http request, API Gateway, Job-Manager Lambda, Key Value Parameter Store, "Execute job", StepFunctions, SNS Topic, Sns-to-Slack Lambda (http post to #bijo_notifications), CloudWatch Logs, Job-Dependency Lambda, DynamoDB table, S3 "New Object creation"]
[Later builds add a Bot Lambda: bot calls for "Get history" and "Execute"]
S3 object creation == 1 job
Number of ECS Fargate tasks is limited
[Diagram, built up over several slides: S3 "New Object creation", Job-Dependency Lambda, DynamoDB table, StepFunctions, SNS Topic, Job-Manager Lambda, Job queue]
Job-Manager Lambda: Any available resources?
● Yes: Run job
● No :( Job queue
● Job X done: run the next job from the Job queue
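The queueing behavior in this sequence (run a job if resources are available, otherwise queue it, then run a queued job when another job finishes) can be sketched as follows. This is a minimal in-memory model under stated assumptions: the class name, the `max_running` limit standing in for the Fargate task limit, and the FIFO ordering are all hypothetical choices, not details confirmed by the slides.

```python
from collections import deque


class JobManager:
    """Sketch of capacity-limited scheduling with a FIFO wait queue."""

    def __init__(self, max_running):
        self.max_running = max_running  # stand-in for the Fargate task limit
        self.running = set()
        self.queue = deque()

    def submit(self, job):
        """Run the job if a slot is free; otherwise queue it."""
        if len(self.running) < self.max_running:
            self.running.add(job)
            return "running"
        self.queue.append(job)
        return "queued"

    def job_done(self, job):
        """Free the slot and start the next queued job, if any."""
        self.running.discard(job)
        if self.queue and len(self.running) < self.max_running:
            next_job = self.queue.popleft()
            self.running.add(next_job)
            return next_job
        return None
```

In a serverless setting the `running` set and `queue` would need to live in durable storage rather than in memory, since a Lambda invocation does not keep state between calls.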
Principles
Extensible: multiple runtime support, runtime params, low-level API
Simple: easy to use and maintain
Monitored: metrics collection, logs collection
Rate today’s session
Session page on conference website
O’Reilly Events App
@tomer_levi