linux.conf.au 2015 Day 3
Keynote - Bob Young
- Don't interrupt a good story with the facts (as long as they're within an order of magnitude of reality).
- Used the M word.
- "I just started a little business, you guys built a great company."
- "You give some code away and you get more code back" - give away an ethernet driver and getting back a whole OS.
- "I was selling control." Early customers didn't want Linux because it was better, but because it gave them more control.
- "The value proposition of Red Hat today is the same as in 1994."
- Free markets and capitalism solve everything!
- Bob loves the GPL.
- "Our lawyers didn't understand our commitment."
- "rms didn't understand our commitment."
- "Customer control was the only benefit I could offer to customers that Apple couldn't offer."
- "Don't waste time trying to build a better license, just use an existing license, and our preference is the GPL."
This would have been a much better talk without the unchallenged rambling about how much Bob misses the good old days of being nasty to people - and in the context of current-day arguments it's hard to avoid the conclusion that he means women, brown people, and so on. It left a rather sour taste in my mouth.
CoreOS
Why Build CoreOS?
- Inspired by a Google paper on "The Datacenter as a Computer".
- Individual servers don't matter; design for the inevitable failure of components (servers) within the datacentre.
- There are no maintenance windows.
- Use dumb hardware and smart software.
- Containers:
- Run and isolate applications.
- There's a lot of confusion, and that's not the fault of people trying to understand - it's pretty nebulous.
- Essentially virtualisation with a common kernel. Traditionally this has been a full init system and userland.
- Docker has been changing that perspective by focusing on a simple single-application container.
- Traditionally everything was provided by Linux and only the app is unique.
- Containers make some problems worse: you now have many copies of libc/openssl/etc. to keep patched.
- But they solve some problems: we don't have the agony of trying to run disparate apps on e.g. a single common python deployment.
- Important to understand that this is moving problems around, not making them vanish.
- At a kernel level, this is supported by a couple of technologies:
- namespaces to isolate pids and users. The external and internal views will be quite different.
- cgroups provide resource management: memory, network utilisation, and so on.
- The raw kernel outputs for cgroups aren't necessarily that great. Tools are being built to solve this problem (there's a sketch of the raw interface after this list).
- The kernel doesn't think in containers. Containers are a userspace abstraction.
- The docker engine has caused a shift in the ease of use: lxc has been around for ages, but docker has made it popular. Fetching images remotely and dashboards have made containers sexy.
- etcd
- signed images
- Orchestration/Scheduling
- Distributed state
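To make the cgroups point concrete, here's a minimal sketch of poking the raw kernel interface directly - this is my illustration, not from the talk, and it assumes a cgroup-v1 memory controller mounted at /sys/fs/cgroup/memory, root privileges, and a made-up group name:

```python
import os

# Hypothetical group name; assumes the cgroup-v1 memory controller
# is mounted at /sys/fs/cgroup/memory and we're running as root.
CG = "/sys/fs/cgroup/memory/demo"

os.makedirs(CG, exist_ok=True)

# Resource limits are set by writing flat files: cap the group at 256 MiB.
with open(os.path.join(CG, "memory.limit_in_bytes"), "w") as f:
    f.write(str(256 * 1024 * 1024))

# Move this process (and hence all future children) into the group.
with open(os.path.join(CG, "tasks"), "w") as f:
    f.write(str(os.getpid()))

# Reading the raw counters back is equally low-level.
with open(os.path.join(CG, "memory.usage_in_bytes")) as f:
    print("usage:", f.read().strip())
```

Everything is ad-hoc reads and writes of magic files, which is exactly why tooling is being built on top.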
seL4
- Descended from a common L4 lineage which is 20 years old.
- L4 is widely used in different guises, e.g. Qualcomm baseband OS, iOS device security processors.
- seL4 has a mathematical proof of security:
- There is an abstract specification
- The implementation is provably functionally correct, which eliminates C faults like null pointer dereferences, overflows, and other such problems.
- But this doesn't tell you anything about the generated assembly... so there's a separate mathematical proof for the translation from C to the binary, which is a world first.
- So there's a correct implementation that works as designed.
- But this doesn't do anything useful in and of itself.
- There are then proofs that the confidentiality, integrity, and availability are enforced.
- These layers of proofs are the basis for the claim that seL4 is the only secure kernel (the chain is sketched after this list).
- There are also hard realtime latency guarantees. "No one really knows what the safe upper bound is. Except us."
- "seL4 is the world's fastest microkernel."
- "We mostly use gcc but we should probably move onto LLVM."
- Debugging is via the proofs: either you fail against the proof, or you create functionality that's not covered by an existing proof, and you need to come up with another one.
- Many eyes is rubbish compared to formal proofs.
- The proofs are machine checked as well as pen-and-paper.
- The proofs are literally orders of magnitude bigger than the kernel code.
- The proofs are then themselves checked by proof checkers.
- There isn't a formalisation of the memory model yet, which means the cache behaviour isn't proven. "It's scary because we've found bugs in the cache layer, which shows anything that can't be proven will have bugs."
- "The MMU is modeled at a higher level."
- Memory and multicore aren't fully modeled yet, but it's a work in progress.
- Not an operating system - it's just a kernel.
- Linux as a VM gives you a lot of the things you need.
- IPC is tightly controlled.
- There are strong controls between userland applications.
- Differences from other L4 microkernels:
- No memory allocation in the kernel (other than initialisation).
- Memory allocation happens in userland.
- "The red team had no problem breaking into the Boeing system which is supposedly millitary grade."
Current WIP
- High performance multicore support. It's in the lab, sometime in the next few months.
- Full virtualisation support for ARM and x86 is on a similar timeline.
- 64-bit support isn't quite as far along, and is currently x86 only.
- There's more pure research: more proofs, eliminating timing channels, mixed-criticality workloads, and hardware failure resistance (e.g. locksteps across CPUs and cores to understand and eliminate hardware failures).
- Longer term: reduce costs by automation and abstraction:
- It costs about $400 per line of code.
- That's cheap relative to other high assurance systems ($1,000-$2,000/SLOC).
- There's still a factor of 2 to get down to the costs of traditional OS development.
- Mechanisms:
- Produce drivers from a formal spec rather than by hand.
- Code and proof co-generation: spec functionality by DSL, generate C code and proofs.
- Eliminate humans.
- Type and memory safe high level languages - working with ANU on this.
Kernel Development
- Stable: the proven code and the associated proofs, all of which are open sourced.
- Experimental: it's not experimental in the traditional sense, it's merely unverified.
- Private branches: people maintain private branches which are staged into the experimental release.
- Code will not be accepted into experimental unless you are committed to working through the verification process.
- You can contribute to the kernel, but it's a high barrier.
BUT
- Libraries are very rudimentary - barest POSIX support, for example.
- Platform ports welcome.
- Drivers are needed.
- Network stacks and filesystems are needed.
- Tools.
- Languages:
- Partial C++ just released.
- Haskell port in progress.
- Python would be awesome.
- Qubes on seL4 would be awesome because Linux on Linux is "a million lines of insecure code".
Why Wouldn't You Use seL4
- It's pretty spartan.
- Perhaps you like insecure systems.
- Perhaps you like the thrill of danger!
Questions
- Should we be using seL4 as the hypervisor for all the things? YES.
- Hardware is scary, because how can you trust it? UEFI, microcode errata, and so on.
collectd in Dynamic Environments
- "Cloud" - high rate of change in virtual environments.
- collectd has:
- Massive contributor base adding many plugins (e.g. almost 100 read plugins and 15 write plugins).
- "Platform independent" if my that you mean *ix.
- Agent based design, one agent per design.
- Extensible (there's a sketch of a trivial plugin at the end of this section).
- The libvirt plugin is the core of using collectd this way; it has some limitations in terms of the data gathered, but is good.
- GenericJMX can be used to measure any exported MBeans, the threads, and the memory behaviour.
- The network plugin supports multicast, creating a pubsub-type model where new servers auto-add themselves.
- Aggregates are more useful for alerting in the cloud - you don't care if you lose 1 server in 10.
- Bear in mind your recording may be storage limited.
- There is an aggregation plugin to make this happen automagically - it intercepts raw data and then sends it through as an aggregate.
- There are some limitations in how you aggregate.
- You can aggregate on the client or on the collector.
- Riemann is a filtering and aggregation processor.
- It can calculate much more sophisticated metrics than the aggregation plugin.
- Bosun is a monitoring and alerting system driven by a rich expression language.
- Based on OpenTSDB.
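To give a sense of how cheap "extensible" is in practice, here's a hedged sketch of a read plugin using collectd's python plugin - the plugin and metric names are made up, and the collectd module only exists inside collectd's embedded interpreter (loaded via a `<Plugin python>` block in collectd.conf):

```python
# Only runs inside collectd's embedded Python interpreter;
# the collectd module doesn't exist outside it.
import collectd

def read_queue_depth():
    # Made-up metric: dispatch one gauge value per read interval.
    v = collectd.Values(type="gauge",
                        plugin="queue_depth",
                        type_instance="jobs")
    v.dispatch(values=[42.0])

# collectd calls this back on every configured read interval.
collectd.register_read(read_queue_depth)
```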