Date posted: 22-Jan-2018
Category: Data & Analytics
Upload: spark-summit
OPTIMIZING SPARK DEPLOYMENTS FOR CONTAINERS: ISOLATION, SAFETY, AND PERFORMANCE
William Benton • @willb Red Hat, Inc.
Forecast
Background and definitions
Architectural concerns
Security concerns
Performance concerns
Conclusions and takeaways
Preliminaries
What is a container?
…a lightweight VM?
…a way to totally isolate applications?
…a packaging format for a container runtime or orchestration platform?
[Diagram: a Spark worker process, started with
$SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker master:7077,
shown with its pid, root, and net settings. Across the slides its root changes from / to /tmp/foo, a container runtime appears around it, and a "SPEED LIMIT 55" sign is added.]
What is a container?
…a lightweight VM?
…a way to totally isolate applications?
…a packaging format for a container runtime or orchestration platform?
…a lightweight means to address some of the same use cases as VMs.
…a way to provide reasonable, not exhaustive application isolation.
…yes, but really just any Linux process with some special settings!
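To make "any Linux process with some special settings" concrete, here is a minimal Python sketch that lists the namespaces a process belongs to. It assumes a Linux host, where /proc/<pid>/ns holds one symlink per namespace. The function names are illustrative and are not part of Spark or of any container runtime.

```python
import os
import tempfile

def read_namespaces(ns_dir="/proc/self/ns"):
    """Map each namespace name (pid, net, mnt, ...) to its kernel identifier."""
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        path = os.path.join(ns_dir, name)
        if os.path.islink(path):
            # On Linux each entry is a symlink whose target looks like "pid:[4026531836]".
            namespaces[name] = os.readlink(path)
    return namespaces

def shared_namespaces(a, b):
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in a if name in b and a[name] == b[name])
```

Calling read_namespaces() inside and outside a container, then comparing the results with shared_namespaces, should show which of the pid, net, and mount settings the runtime changed and which are still shared with the host.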
Architectural considerations
Microservice architectures
High-level app architecture
[Diagram labels: events, databases, file/object storage; federate; transform (three stages); train models; archive; management, web and mobile, reporting, developer UI.]
Spark is a natural fit for microservice architectures, since executors are microservices!
Monolithic Spark clusters
[Diagram labels: cluster scheduler; resource manager; six Spark executors; app 1, app 2, app 3, app 4; shared FS / object store; databases.]
One cluster per application
[Diagram labels: resource manager; app 1 through app 6; shared FS / object store; databases.]
Security
[Diagram labels: systemd with three qemu processes; systemd with nginx, mongodb, and spark-class; container roots /tmp/foo, /tmp/bar, /tmp/blah.]
Use SELinux
SELinux limits your exposure to an exploit in a container or a bug in a container runtime.
Root is root
[Diagram labels: /tmp/foo, then /.]
Denials of service
Kernel panics
Keeping secrets
[Diagram labels: /tmp/foo; shared FS / object store; ACCESS_KEY=… SECRET_KEY=…]

cat <<EOF > secret.txt
ACCESS_KEY=…
SECRET_KEY=…
EOF
git add secret.txt

export ACCESS_KEY=…
export SECRET_KEY=…

kubectl create secret generic mysecrets \
  --from-file=… \
  --from-file=…
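One way an application might consume a secret like mysecrets is to have the platform mount it as a directory with one file per key, then read those files at startup instead of committing them or exporting them. The Python sketch below assumes that layout. The mount path in the usage note and both function names are hypothetical.

```python
import os
import tempfile

def load_secrets(secret_dir):
    """Read one secret per file from a mounted secret directory."""
    secrets = {}
    for name in sorted(os.listdir(secret_dir)):
        path = os.path.join(secret_dir, name)
        # Skip hidden bookkeeping entries and anything that is not a file.
        if name.startswith(".") or not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as handle:
            # Drop only trailing newlines, which editors and heredocs tend to add.
            secrets[name] = handle.read().rstrip("\n")
    return secrets

def redacted(secrets):
    """Return a copy that is safe to log: keys are kept, values are masked."""
    return {name: "***" for name in secrets}
```

A driver could call load_secrets("/etc/secrets") (an assumed mount point) and pass the values to its storage client, logging only redacted(...) so the keys never reach standard output.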
Performance
Potential performance pitfalls
Hypervisors introduce overhead. Use more lightweight isolation mechanisms to preserve performance.
Virtualized networking likely has minimal impact on overall application performance!
…but measure the performance of your I/O configuration!
SPEED LIMIT 55
Quotas mean some ubiquitous techniques can have surprising performance impact. Consider in particular parallel GC and disk buffer cache use.
Be sure you set your heap sizes based on your resource limits…or wait for OpenJDK 9!
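Setting the heap size from the resource limit can be sketched as follows in Python. The sketch assumes the container's memory limit is readable from a cgroup file (memory.limit_in_bytes on cgroup v1, memory.max on cgroup v2) and that about 75% of the limit goes to the heap. That fraction and both function names are illustrative assumptions, not recommendations from the talk.

```python
def parse_memory_limit(text):
    """Parse the contents of a cgroup memory limit file into bytes, or None if unlimited."""
    value = text.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    limit = int(value)
    # cgroup v1 reports "no limit" as a very large number close to 2**63.
    return None if limit >= 2**60 else limit

def heap_flag(limit_bytes, fraction=0.75):
    """Build an -Xmx flag that leaves headroom for non-heap JVM memory."""
    if limit_bytes is None:
        return None
    heap_mib = int(limit_bytes * fraction) // (1024 * 1024)
    return f"-Xmx{heap_mib}m"
```

For a 1 GiB limit, heap_flag(parse_memory_limit("1073741824")) returns "-Xmx768m". An entrypoint script could append that flag to the executor's JVM options before launching it.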
Conclusions and takeaways
Architectural takeaways
Spark executors are already microservices.
Consider using a single Spark cluster per application for flexible scheduling and easy deployments.
Persistent storage lives outside of containers and is probably best accessed via service interfaces rather than through filesystem interfaces.
Security takeaways
It isn’t safe to run arbitrary code just because you put it in a container.
Use SELinux to minimize your exposure to error and malice.
Don’t run as root unless you absolutely have to (and you probably don’t).
Ad hoc mechanisms for configuring secrets are likely to leak information and are almost always a bad idea.
Performance takeaways
Avoid hypervisor overhead by using different approaches to isolation.
Measure everything, but virtualized networking likely has a minimal performance impact on real applications.
Artificially throttled performance can be a real problem. Experiment with JVM settings, including serial GC, to reduce your chance of getting limited.
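As a starting point for experimenting with GC settings under a CPU quota, the Python sketch below derives a CPU budget from the CFS quota and period values (cpu.cfs_quota_us and cpu.cfs_period_us on cgroup v1) and picks GC flags to match. The policy of using serial GC at one CPU and capping parallel GC threads otherwise is an assumption to measure against, not a rule from the talk. The function names are illustrative.

```python
import math

def effective_cpus(quota_us, period_us, host_cpus):
    """CPUs a container may actually use under a CFS quota (-1 means no quota)."""
    if quota_us <= 0 or period_us <= 0:
        return host_cpus
    return min(host_cpus, max(1, math.ceil(quota_us / period_us)))

def gc_flags(cpus):
    """Pick GC flags for the CPU budget: serial GC on one CPU, capped parallel GC otherwise."""
    if cpus <= 1:
        return ["-XX:+UseSerialGC"]
    return ["-XX:+UseParallelGC", f"-XX:ParallelGCThreads={cpus}"]
```

For example, a quota of 50000 with a period of 100000 on a 16-core host gives a budget of one CPU, so gc_flags returns ["-XX:+UseSerialGC"] rather than letting the JVM size its GC thread pool for all 16 cores.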
Configuration takeaways
If you consume logs from standard output and error, consider using an alternate stack trace formatter to get exceptions in a single log record.
If you use ephemeral user IDs, set SPARK_USER or use nss_wrapper so Hadoop file libraries won’t get confused.
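The SPARK_USER suggestion can be sketched as a small launcher helper in Python: if the current user ID has no passwd entry, set SPARK_USER before starting Spark. The fallback name "spark" and the function name are hypothetical. The lookup is injectable so the behaviour can be checked without changing user IDs.

```python
import pwd

def spark_environment(uid, environ, lookup=pwd.getpwuid, fallback_user="spark"):
    """Return a copy of environ with SPARK_USER set when uid has no passwd entry."""
    env = dict(environ)
    if "SPARK_USER" in env:
        return env  # an explicit setting always wins
    try:
        lookup(uid)  # pwd.getpwuid raises KeyError for an unknown uid
    except KeyError:
        env["SPARK_USER"] = fallback_user
    return env
```

A launcher would typically call spark_environment(os.getuid(), os.environ) and pass the result as the environment of the spark-class process it starts.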
@willb • http://radanalytics.io • https://chapeau.freevariable.com