Date posted: 16-Jun-2015
Category: Technology
Uploaded by: andrew-pantyukhin
Unix in the Cloud
Ignorance, Stagnation, Obsolescence
Synopsis
▪ cloud in the broad sense of ideology
▪ not quite about running BSD on EC2
▪ limited to the skills and experience of yours humbly
Multi-core
▪ installation?
▪ configuration management?
▪ load balancing?
Multi-node
▪ installation?
▪ configuration management?
▪ load balancing?
▪ why multi-node?
Large Computing Needs
▪ Facebook, Google, ...
▪ more than any OS can provide
Happy Hardware Vendor Law
The number of nodes needed to solve a given task doubles every now and again.
OS Scalability Limit
▪ 1 node only
▪ multi-socket and stacks approaching NUMA
▪ E25K, z10, etc — fail for most purposes
Operating System?
▪ traditional definition no longer relevant
▪ the notion itself on the brink of obsolescence
▪ field heavily eroded by current distributed apps
Distributed Applications
▪ forced to be an OS unto themselves
▪ huge overlap
▪ huge opportunity for sharing and consolidation
Anti-Patterns
▪ virtualization
▪ chefs and puppets
▪ thick abstraction
Attempts
▪ z/OS
▪ Plan 9, Inferno
▪ Clustrx, E1, DYSEAC, ...
▪ OpenStack (~~)
Species Survival Plan
Freeze the bodies and leave them for future generations to fix.
Don't Panic: Incremental
▪ perfection v. done
▪ still a decade or more till a good AI
▪ no practical need for POSIX over a cloud
Mindful Approach
▪ immediate practicality
▪ long-term perspective
▪ sustained, integrally rich effect
Operating System
▪ major abstraction repository
▪ overlapping code distillery
▪ pre-production architecture research
Increments
Machine Generated Data
▪ logs, error messages, status monitors
▪ meant for humans... no more
▪ rethinking for better aggregation and analysis
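A minimal sketch of what "rethinking for better aggregation" could look like: each event is emitted as one structured record instead of a free-form message, so output from many nodes can be counted without parsing prose. The event and field names (disk_error, node, device) are hypothetical, chosen only for illustration.

```python
import json
from collections import Counter

def emit(event, **fields):
    """Serialize one event as a single JSON line (for machines, not prose for humans)."""
    record = {"event": event, **fields}
    return json.dumps(record, sort_keys=True)

def aggregate(lines, key):
    """Count records per value of `key` across output collected from many nodes."""
    return Counter(json.loads(line).get(key) for line in lines)

# Hypothetical events from two nodes.
lines = [
    emit("disk_error", node="n1", device="ada0"),
    emit("disk_error", node="n2", device="ada1"),
    emit("service_restart", node="n1", service="nginx"),
]
counts = aggregate(lines, "event")
per_node = aggregate(lines, "node")
```

The same lines can be grouped by any field, which is the part that free-form log text makes hard.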
Identity and Authentication
▪ YP, LDAP outdated and poorly supported
▪ no distributed model
▪ passwd in git as a first stab
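One thing a passwd file kept in git needs is a sanity check before a change is shared with every node. The sketch below assumes the classic seven-field passwd format and flags UIDs claimed by more than one name; it could plausibly run as a pre-commit check. The sample accounts are made up, and how the file is distributed to nodes is not covered here.

```python
from collections import defaultdict

def parse_passwd(text):
    """Parse seven-field passwd lines: name:password:uid:gid:gecos:home:shell."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _pw, uid, gid, gecos, home, shell = line.split(":")
        entries.append({"name": name, "uid": int(uid), "gid": int(gid),
                        "gecos": gecos, "home": home, "shell": shell})
    return entries

def uid_conflicts(entries):
    """Return {uid: [names]} for every UID used by more than one account."""
    by_uid = defaultdict(set)
    for entry in entries:
        by_uid[entry["uid"]].add(entry["name"])
    return {uid: sorted(names) for uid, names in by_uid.items() if len(names) > 1}

# Hypothetical file contents, as two independent edits might have left them.
sample = """\
root:*:0:0:Charlie Root:/root:/bin/sh
alice:*:1001:1001:Alice:/home/alice:/bin/sh
bob:*:1001:1002:Bob:/home/bob:/bin/sh
"""
conflicts = uid_conflicts(parse_passwd(sample))
```

Here alice and bob both claim UID 1001, so the check would reject the change before it spreads.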
Remote Procedure Call
▪ ssh losing relevance, HPN or not
▪ almighty agent daemon worse than rsh
▪ capabilities, RBAC, WoT
Hardware Failures
▪ no culture for low-level fault-tolerance
▪ watchdogd as state-of-the-art self-healing
▪ focus on self-diagnostics: disk error counters, etc
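A rough illustration of self-diagnostics built on disk error counters: compare two samples of per-disk counters and flag any disk whose counters grew. The counter names and sample values are hypothetical; how the counters are actually read from the hardware is left out.

```python
def degrading_disks(previous, current, threshold=0):
    """Return {disk: {counter: growth}} for counters that grew by more than `threshold`."""
    flagged = {}
    for disk, counters in current.items():
        before = previous.get(disk, {})
        grown = {}
        for name, value in counters.items():
            delta = value - before.get(name, 0)
            if delta > threshold:
                grown[name] = delta
        if grown:
            flagged[disk] = grown
    return flagged

# Hypothetical samples taken some time apart on one node.
previous = {"ada0": {"reallocated": 0, "read_errors": 2},
            "ada1": {"reallocated": 5, "read_errors": 0}}
current = {"ada0": {"reallocated": 0, "read_errors": 2},
           "ada1": {"reallocated": 9, "read_errors": 0}}
flagged = degrading_disks(previous, current)
```

Tracking growth rather than absolute values means a disk with old, stable errors is not flagged, while one that is actively getting worse is.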
Distributed Configuration
▪ current anti-patterns worsen the problem
▪ role-aware configuration
▪ / in git as a second stab
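One possible reading of role-aware configuration, sketched under assumptions: a node's settings are a base layer, then the layers of each role the node holds, then per-node overrides, with later layers winning. The role names and keys below are invented for the example.

```python
def resolve(base, roles, node_roles, overrides=None):
    """Merge base settings, then each role in order, then per-node overrides."""
    config = dict(base)
    for role in node_roles:
        config.update(roles.get(role, {}))
    config.update(overrides or {})
    return config

# Hypothetical layers; in a git-tracked tree these could be one file per layer.
base = {"ntp": "on", "sshd": "on", "nginx": "off"}
roles = {
    "web": {"nginx": "on"},
    "storage": {"zfs": "on"},
}
web_node = resolve(base, roles, ["web"])
mixed_node = resolve(base, roles, ["web", "storage"], {"sshd": "off"})
```

The node only declares its roles; everything else follows from the shared layers.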
Storage
▪ intra-node redundancy irrelevant
▪ no appropriate local multi-disk FS
▪ no fast path for data exchange
▪ nginx + curl + dispatcher
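The slide does not say how the dispatcher works. As one guess, it could map an object key to the nodes holding its copies, with each node serving files over HTTP (the nginx part) and clients fetching by URL (the curl part). The sketch below uses rendezvous hashing for that mapping, which is an assumption swapped in for illustration, as are the host names and the URL layout.

```python
import hashlib

def replicas(key, nodes, count=2):
    """Pick `count` nodes for `key` by rendezvous (highest-random-weight) hashing."""
    def score(node):
        return hashlib.sha256(f"{node}:{key}".encode()).hexdigest()
    return sorted(nodes, key=score, reverse=True)[:count]

def urls(key, nodes, count=2):
    """Build the HTTP URLs a client could try, in order, to fetch `key`."""
    return [f"http://{node}/{key}" for node in replicas(key, nodes, count)]

# Hypothetical node names.
nodes = ["n1.example", "n2.example", "n3.example", "n4.example"]
chosen = replicas("blob-42", nodes)
candidates = urls("blob-42", nodes)
```

Redundancy then lives between nodes rather than inside one: if the first URL fails, the client tries the next, and dropping a node that holds no copy of a key leaves that key's placement unchanged.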
Error Handling
▪ cf. Machine Generated Data and Hardware Failures
▪ software is 10x more prone to failures
▪ serious problem at scale
☺