LCA 2018 Day 3

Another balmy day in Sydney. This is… not my ideal climate, shall we say. But hey, I saw a big ol’ bat flying around in the wild, which was cool for me and probably completely uninteresting to the locals.

We have a lucky door prize… for someone who isn’t here. And another person. Three down. Four. But lucky number five is here!

Six Years Later or, did you ever get the source code to that thing in your heart?

Karen Sandler @o0karen0o

Ballarat was the first full-length keynote Karen had ever done - it was exciting and amazing and wonderful of the LCA team to take a chance.

As she said then, she “has a big heart”, in a very literal sense; Karen’s heart is three times larger than normal. In order to reduce her chance of dying, she has an embedded defibrillator. It turned into quite a journey - she was the first person most of her medical professionals had encountered who was interested in understanding the software on the devices. Neither had the manufacturer representatives. It was very much unknown territory, and Karen was not happy that her life literally depends on closed, potentially buggy and insecure software that she is prevented from inspecting and understanding.

(Karen would like you to try GNOME 3. You might be pleasantly surprised.)

“I was terrified. You can see me shaking in the video […] It was terrifying, but amazing. It made my career […] What I had landed on was a story that people could understand about why we should care about [free and open source software]”

(This last part is definitely true: I have shown Karen’s Ballarat talk to quite a few people to explain why this free software stuff should matter to them.)

Or, as Karen’s slide cheekily puts it: The. Best. Printer. Story.

“That poor woman with the heart condition. I hope she doesn’t die while we’re watching her! (It’s great advocacy)”

“Hey, did you ever get your source code?”

When Karen gave her talk in 2012, she felt she’d reached the end of the road. She’d offered to sign NDAs, she’d chased all the manufacturers through every avenue she knew. So she found the question really frustrating; she channeled her energy into the bigger software freedom world. At this point, the medical device had become a symbol of the bigger problem.

Becoming a cyborg is not particularly unique, if we understand that as incorporating technology in our bodies. It’s becoming more common, and will be an experience more and more people have, as we become more and more dependent on these things.

Karen’s been really really busy with Conservancy and Outreachy - but that question keeps coming back to bug her. And she’s been networking with people who are involved in security, particularly medical security. She’s helped work on getting a DMCA exemption for medical device research.

These devices are the worst of both worlds: they are closed, proprietary software that accept remote input. And the DMCA had criminalised even trying to understand what was going on - so the DMCA exemption has been a critical foot in the door. Especially as more and more examples of vulnerabilities for medical devices have come to light: with pacemakers, insulin devices, as well as defibrillators.

Pushing these issues originally created such hostility that Karen had medical specialists refuse to treat her, because they wrote her off as a conspiracy theorist; the subsequent findings in this field are vindication of her questions. Karen notes Johnson and Johnson were the first medical company to actually engage positively with a security researcher, after years of companies ignoring or trying to suppress the problems.

Now the idea has started filtering into NCIS and Homeland, popularising the idea - the pacemaker forums lit up.

(“The forums call their devices their internal bling”)

When Karen first published on this topic, back in 2010, she was scared of people knowing she had a medical condition; she didn’t mention it, and it undermined the effectiveness of her advocacy. It wasn’t until 2012 that she was prepared to share this.

As it became part of popular culture, Karen’s advocacy became dramatically more effective; people who wouldn’t give her the time of day are now interested in speaking with her; in some cases, such as the legal counsel of manufacturers, people are beginning to think that perhaps they are more liable with closed software than free software.

On a personal level, Karen thought she had it worked out: her talk in Ballarat described what her relationship with technology was and would remain. And then she got pregnant. She felt anxious one day - and got shocked. Twice. Doctors apparently get really, really anxious when a pregnant lady gets shocked.

“It’s always a good time to advocate for free software. I saw three doctors, and talked to them all about free software.”

Shockingly enough, because most people with defibrillators are over 65 and skew strongly male, these devices don’t understand pregnancy. Such as for example the fact that 25% of women experience heart palpitations. Which is normal, and shouldn’t cause a shock - but the algorithm doesn’t know that. So it triggers.

At the time, the only solution on offer was to use drugs to slow Karen’s heart rate to the point where she could barely walk. Karen is no longer pregnant, but what other circumstances of her life won’t be covered by the algorithm?

What if we’re in a different geographical region? What if the equipment is in use for longer than we expect?

Karen wants to have some say in this: why can’t she get all the pregnant women together and hire a medical professional to help tweak the algorithm so they don’t get shocked unnecessarily? But this isn’t an option.

One of the delicate balances we’ve created is one where we’ve engaged with corporate actors to create a symbiotic relationship between the community and corporations; companies are funding the community and contributing to it, on a basis where the licenses set out by the community help everyone enjoy the benefits.

We’re in danger of losing sight of that balance, though: we’ve gotten so excited about jobs and funding and people paying attention to us, that we’re losing sight of our own power, and how useful our software is. We’re giving companies too much power - picking licenses because “the companies won’t like it”. We give away our own copyrights in jobs, we don’t even ask the questions when we negotiate. And we forget that short term corporate interests are not always aligned with long term community goals, or even their own long term success.

As technologists, we have the knowledge and skills and power; we have more leverage than we think we do. We should ask questions - not just about licenses, but more generally about the long term ethical implications of our work. We don’t want to be like Volkswagen.

More Things Happened

Karen started beeping every day at 1:10 pm. It turned out that the battery was running down, after the unnecessary shocks. Karen went to her doctor with a laundry list of requirements: no remote telemetry. Data dumps. The old device itself to hack on.

Turns out that some things have gotten worse: you can’t disable remote access to the device, for example. Which means Karen had to start thinking about changing her career, because she gets death threats as a result of her work with Outreachy, and being a public figure. “Just last like time, I found myself crying in the doctor’s office.”

The doctor and nurse practitioner called literally every device manufacturer. Manufacturer after manufacturer said they couldn’t disable it - one even claimed that their device was “hackproof”. Eventually Karen’s doctor found a European company who had one model that doesn’t have radio telemetry.

Advocating for FOSS is hard - “If anyone says they aren’t using proprietary software, they probably don’t understand what they’re using, or getting someone else to do a lot of work for them.”

So that source code? Karen is hopeful: she’s working with a small manufacturer, and she has more leverage than she ever had before.

Kubernetes @ Home

Angus Lees

This is the weirdest room in the building. Everything is green - the carpets, the walls, the seats. The whole thing feels like a special effects stage.

Angus left Google for Rackspace just as Kubernetes was in its infancy, and immediately picked up a side project writing code to make k8s and OpenStack play well together. It was an abandoned side project… until people tracked him down and started filing patches.

“This isn’t really a talk about kubernetes, it’s a talk about free software.”

The history of computing started with centralised, reliable devices; moved to personal computers, and then started moving back to reliable, centralised, but commodity hardware. But no matter how reliable you make the hardware, things still go wrong. Around 2005 we saw a sea change: moving to putting the reliability into the software, and accepting unreliable hardware.

This leads to many, many machines with smart software constructs ensuring redundancy and availability.

The problem is, though, that the “home hacker” is stuck in the single-system metaphor. Our home labs and environments aren’t taking advantage of the advances we’ve made.

This divergence between the home hacker and the work hacker has many secondary effects:

  • Apache over GPL.
  • Centralised “as a service” rather than federated protocols.
  • Profit-driven technology projects.
    • Admin and workflow tools become products, not ecosystems.
    • Multi-arch suffers.

Problem:

  • Distributed systems are complex.
  • Solutions scale down poorly.
  • Most homes don’t have racks of servers.

Implication:

  • Home free software stuck at single-machine architecture.

Kubernetes

  • Active manager of applications across multiple machines.
  • “Unix Process as a Service.”
  • Based around groups of containers (pod).
  • Apache licensed, mostly golang.

Kubernetes has a simple architecture: an API server accepts interactions from clients. etcd is the only state store, and a set of controllers check into the API server and try to make sure the state from etcd is the state all the running components are in.

Your hosts then run a kubelet to manage the host.
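To make the controller idea concrete, here is a minimal sketch of a single reconciliation pass. It is not Kubernetes code: the `reconcile` function and the two dictionaries (standing in for the desired state held in etcd and the state actually running) are hypothetical names invented for illustration.

```python
def reconcile(desired: dict, running: dict) -> list:
    """One controller pass: mutate `running` towards `desired`, return the actions taken.

    Both dicts map an application name to a replica count (a made-up model).
    """
    actions = []
    for app, want in desired.items():
        have = running.get(app, 0)
        if have < want:
            actions.append(f"start {want - have} x {app}")
        elif have > want:
            actions.append(f"stop {have - want} x {app}")
        running[app] = want
    # Anything still running that is no longer desired gets torn down.
    for app in list(running):
        if app not in desired:
            actions.append(f"stop {running.pop(app)} x {app}")
    return actions
```

Running the pass a second time with nothing changed should produce no actions, which is roughly the property that lets controllers loop forever without doing harm.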

The network is also simple; each node has a subnet. Each pod has an IP allocated from the node subnet; as long as pods can communicate, k8s doesn’t care. Even static routing works.
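The addressing scheme can be sketched with Python’s standard `ipaddress` module. The cluster range, the /24 per node, and the node names below are assumptions picked for the example, not values from the talk.

```python
import ipaddress


def allocate_pod_ips(cluster_cidr: str, nodes: list, pods_per_node: int,
                     node_prefix: int = 24) -> dict:
    """Give each node its own subnet, then hand pods addresses from that subnet."""
    subnets = ipaddress.ip_network(cluster_cidr).subnets(new_prefix=node_prefix)
    layout = {}
    for node, subnet in zip(nodes, subnets):
        hosts = iter(subnet.hosts())  # usable addresses in this node's subnet
        layout[node] = {
            "subnet": str(subnet),
            "pods": [str(next(hosts)) for _ in range(pods_per_node)],
        }
    return layout
```

With one subnet per node, a static route per node (node subnet via node address) is enough for pods to reach each other, which is presumably why even static routing works.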

Host Stack

  • We’ve inverted the stack: the host used to be very important. Now it isn’t - the host is ephemeral, only the application state is important.
  • Capturing the executable as a container image lets you cleanly deploy the application independently of the host.

“I’ve got news for you. Hardware fails … it surprises me that software developers act like hardware is reliable … you need to build systems that assume that hardware failure is normal.” If you do this right, no-one gets paged because hardware failed.

Garage Cloud

  • RAID has gone from being exotic to being normal. Why don’t we do the same with compute?
  • Plan:

    • Distro providing k8s layer.
    • Modern software architecture.
    • Easy/automatic maintenance is essential.
  • Budget: $100 for the control plane; 3 x bananapis, using containOS.

  • Assume single layer 2 network.

  • Assume local DNS, DHCP, etc.

  • Optimise for least maint.

  • Broken laptops as worker nodes, running CoreOS.

Adapting for home:

  • PersistentVolumes: an NFS mount from a RAID server.
  • keepalived-vip provides the exposed service.
  • Separate external/ingress controllers.
  • Wildcard DNS entry.
  • nginx-ingress.
  • keepalived-vip.

It works!

k8s-home

XFS: Teaching an Old Dog Old Tricks

Dave Chinner

XFS Architecture

  • The original btree filesystem; all metadata is indexed into a btree (or a btree variation).
  • Split into allocation groups, which act like mini-filesystems in and of themselves.
  • Data is held in extent btrees.
  • Directory and Attribute btrees are the most complex, with many tricks for scaling.
  • Write-ahead for reliability, checkpoint based journalling.

COW Filesystems from 1000 metres

Write anywhere. But when data is written, it requires a recursive update all the way up to the tree root. This means the trees are often re-written, which gives big linear writes, which is performant. New roots are written in parallel to old ones, then swapped atomically. Unfortunately, this requires significant amounts of space, and worse yet, we don’t know the requirements ahead of time.

This is why we can do things like:

  • Sharing - a function of references via counted objects.
  • Subvolumes are a natural extension of the tree structure.
  • Snapshots are just superseded tree indexes.
  • Replication is tree index based (the tree and all its references) - the downside is that partial replication is not possible; the whole tree must be replicated back to the start of the tree.
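The recursive update to the root, and why old roots double as snapshots, can be shown with a toy in-memory tree. This is only a sketch of the general copy-on-write idea using a plain binary search tree; it is not how any real filesystem lays out its btrees, and the `Node`, `cow_insert`, and `lookup` names are invented here.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(root: Optional[Node], key: int, value: str) -> Node:
    """Return a new root; every node on the path to `key` is rewritten, none modified in place."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return root._replace(left=cow_insert(root.left, key, value))
    if key > root.key:
        return root._replace(right=cow_insert(root.right, key, value))
    return root._replace(value=value)


def lookup(root: Optional[Node], key: int) -> Optional[str]:
    """Walk down from whichever root (old or new) the caller holds."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None
```

Holding on to the old root after an update gives a consistent view of the old data, while untouched subtrees are shared between the two roots. The cost is that one leaf change rewrites every node above it.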

COW in XFS

  • Data only! COW is limited to file cloning and data de-duplication, because the btree would be modified going from a leaf to a whole tree. In a worst case, this turns into a full filesystem re-write.
    • People are using the data cloning; overlay FS and NFS use it, for example.
  • No impact on existing non-shared data.
  • Space is predictable. There is always space available for modifications, no metadata update blowouts.
  • Update atomicity via “deferred operations” - this is to get around the lack of ability to perform an equivalent of the tree swaps from the COW system.

So what can we do with Data COW?

  • Is COW metadata needed for subvolumes?
  • How much of a filesystem do we actually need for a subvolume?
  • What other implementations tell us we should avoid?
  • Is there an existing model we can replicate? Don’t re-implement, copy?

What is a subvolume? Why do we care? What does it provide?

  • A subvolume has flexible capacity. It can be grown and shrunk, up to the underlying pool limits.
  • It’s a fully functional filesystem.
  • Can be snapshotted and cloned. Dave thinks this is the key characteristic - it must function as an independent entity.

Subvols as a Namespace Construct

  • Bind mounts with directory quotas?
    • This is something that can be done in the VFS today.
  • Snapshot via ‘cp -aR --reflink=always’; if you squint hard enough, is that a snapshot?
  • Replication via tar and rsync? It’s slow, sure.
  • Overlay provides namespace construct capabilities.

It doesn’t look like a btrfs snapshot, but it provides much of the same functionality - thanks to overlay, you can provide something that looks like a subvolume with the constructs already available in the VFS.

Subvols as a Block Device Construct

  • Loopback devices using sparse image files. This is something most of us have done. You’ve got tremendous flexibility.
  • Snapshot via cloning the image file.
  • Replication via data copies.
  • Has all the capacity management and ENOSPC problems on thin devices.

Again, thinking this way demonstrates that many of the characteristics of COW-style subvolumes are partially available already.

Learning from others?

  • Subvol specification by mount option is clunky - this is present in overlayfs and btrfs.
  • Having a common root block shared between all the subvolumes causes all sorts of problems with backup, free space reporting, and so on.
  • So subvolumes need to be an independent VFS entity.
  • Object based replication is very complex and hard to get right.
  • ENOSPC is a primary concern, both in theory and in practice. ENOSPC needs to be the first problem we solve.

So what about ENOSPC?

  • Layers need to communicate. The layers developing different ideas about what space is available is where things go horribly wrong.
  • It’s a problem that’s been talked about for a long time, but there’s not been a solution.
  • PNFS file layout implementation by Christoph Hellwig is a useful model. The PNFS client/server comms, which require communication of metadata and data layouts separately from one another, is instructive.

A new type of subvolume

It looks and acts like a subvolume, but the implementation is very different, working without COW.

  • Kernel directly mounts image files, rather than block devices.
  • If we add a device space management API we sort out the ENOSPC problem.
  • Filesystems implement both sides of the device space management API. The filesystem is abstracted away from the underlying block device.
  • Subvolume does IO direct to host filesystem block device.

Subvolumes on Thin Device?

  • Block device implements the host side of space management API.
  • FS uses the subvol client side for space accounting and IO mapping.
  • Block device reports ENOSPC before FS modifies structures and issues IO.

So even for thin devices we can avoid the ENOSPC.
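The ordering is the important part: space is reserved with the host before the client touches its own structures. Here is a toy model of that handshake; `HostPool` and `Subvolume` are hypothetical classes invented for this sketch and bear no relation to the actual kernel API.

```python
import errno


class HostPool:
    """Toy host side: tracks free blocks shared by every subvolume."""

    def __init__(self, total_blocks: int):
        self.free_blocks = total_blocks

    def reserve(self, blocks: int) -> None:
        # Fail *before* the client has modified anything.
        if blocks > self.free_blocks:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.free_blocks -= blocks


class Subvolume:
    """Toy client side: asks the host for space, then records the extent."""

    def __init__(self, pool: HostPool):
        self.pool = pool
        self.extents = []

    def write(self, blocks: int) -> None:
        self.pool.reserve(blocks)     # may raise ENOSPC
        self.extents.append(blocks)   # only reached once space is guaranteed
```

Because both subvolumes ask the same pool, neither can develop its own idea of how much space is left.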

Snapshots

  • Simple with existing tools:
    • Freeze subvol.
    • Clone image file/device.
    • Already exists in XFS.
  • Fast and efficient.
  • End result is COW metadata in the subvol.

Replication

  • Extent based snapshot replication:
    • Look for differences in extent map between source and destination.
    • Use pread/pwrite/fallocate to replicate structure.
  • Simple, fast.
  • No knowledge of what is being replicated is needed.
    • ext4 could re-use the same tools!
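A rough sketch of the extent-diff idea follows. The extent map format (`{offset: (length, physical_block)}`) is made up for illustration, the maps are supplied by hand rather than read from a filesystem, and the sketch skips `fallocate` entirely. `os.pread` and `os.pwrite` are real but only available on Unix-like systems.

```python
import os
import tempfile


def changed_extents(source_map: dict, dest_map: dict) -> list:
    """Return (offset, length) for extents whose mapping differs between two snapshots."""
    return sorted(
        (offset, length)
        for offset, (length, block) in source_map.items()
        if dest_map.get(offset) != (length, block)
    )


def replicate(src_fd: int, dst_fd: int, extents: list) -> int:
    """Copy only the listed byte ranges; returns the number of bytes written."""
    copied = 0
    for offset, length in extents:
        data = os.pread(src_fd, length, offset)
        copied += os.pwrite(dst_fd, data, offset)
    return copied


def replication_demo() -> bytes:
    """Bring a stale copy up to date by copying just the one changed extent."""
    with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
        src.write(b"AAAABBBBCCCC")
        src.flush()
        dst.write(b"AAAAxxxxCCCC")
        dst.flush()
        todo = changed_extents(
            {0: (4, 100), 4: (4, 207), 8: (4, 102)},  # source: middle extent remapped
            {0: (4, 100), 4: (4, 101), 8: (4, 102)},  # destination: older mapping
        )
        replicate(src.fileno(), dst.fileno(), todo)
        return os.pread(dst.fileno(), 12, 0)
```

Note that nothing here looks inside the data being copied, which matches the point that no knowledge of what is being replicated is needed.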

Cache sharing

  • Shared data on disk, but not in memory.
    • Bad for containers using cloned images.
    • OverlayFS does this right today.
    • FS cache indexed by block number, which makes this difficult.
    • XFS already has a buffer cache.
  • Page sharing mechanism is not yet tied down.

The point being that this is a value-add over current subvolumes.

Subvolume Encryption

  • Generic VFS file encryption
    • Allows image files to be encrypted and/or
    • Files within subvolumes can be encrypted
    • On a per-image or per-file basis, with e.g. per-user home dir encryption.

Early stages!

  • Management interface not decided.
    • How to present a host volume?
    • Is everything a subvolume?
    • Abstractions for integration into other environments?
  • Discussion and review of kernel infrastructure changes.
  • Lots of code still needs writing.

The demo works! Obviously ready to ship.

No, not really. But as a proof of concept, it’s very impressive. Much of the functionality works (in what Dave tells me is only 1,000 extra lines of code on base XFS).

Some Thoughts

This is tremendously exciting to me. I worked with ZFS in production from about 2008 onward, and spent a lot of time trying to work with btrfs before concluding, after a number of years, that it was mostly good for making me improve my backups. And that was before the whole RAID56 scrub fiasco came out.

ZFS is a lot more reliable, of course, but it shows the era it was designed in: it assumes a lot of care and feeding from a relatively low ratio of servers:admins, and it’s surprisingly easy to get a zpool to a point where the only fix to ENOSPC or abysmal performance is to destroy everything and start again. For the world Sun’s developers were in at the start of the century, this made sense, but less so now (also, the clunky userland is annoying).

If the XFS team can deliver the same features I care about (and I care about good snapshots so much compared to what I get from LVM), reliably (unlike btrfs), and in an easy-to-use fashion (unlike ZFS with its “never make any mistakes or you need to start from scratch”), it will be a huge, huge thing.

When Should You Rewrite in Rust?

e dunham @qedunham

“Should” is a value judgement, and as such is based on a set of assumptions; the assumptions Emily brings are:

  • You’re working on some code.
  • People help you with the code.
  • People use your code.
  • People’s time and happiness are valuable.

So what is rewriting:

  • Rewriting is starting anew on a problem that you already have code to address.
  • Refactoring is modifying code you already have.

You can sort rewrites by the size of the rewrite, and hence the challenges you may expect to face. The smaller the rewrite, the fewer new bugs introduced and the quicker it is likely to be; as rewrites embiggen, so does the time taken.

So why might you rewrite?

  • To learn a new tool.
  • To better understand a task.
  • Ship different production code to new users.

Rewriting to Understand: Costs your time, to research, understand, and implement.

Rewriting to ship: tremendously more complicated. Many people have to learn new things, and may be affected by bugs and new issues. Your whole toolchain changes.

So why ship a rewrite?

  • Improve performance.
  • Improve maintainability.
  • Increase contributor base.

If you want to make a large change, which a rewrite is, you need to communicate! You need to start by being clear about the problem you’re trying to solve, and what you believe the root cause of the problem is.

Once you’re clear on the problem, you need to be clear who will be affected by the change; whether other programmers, users, whoever. You should learn from history: how have these kinds of efforts succeeded and failed? Be sure you understand why things have gone poorly or well. Don’t add unnecessary complexity.

Note that experts prefer small changes. Break the changes into small, incremental ones. This maximises your chances of success.

Note that risks multiply: the more chances you take at once, the less likely you are to succeed.
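A back-of-the-envelope way to see this, assuming each risky step succeeds or fails independently (my simplifying assumption, not a claim from the talk), is to multiply the individual odds together. The function name is made up for the example, and `math.prod` needs Python 3.8 or later.

```python
import math


def chance_all_succeed(step_success_probabilities: list) -> float:
    """Probability that every independent step succeeds."""
    return math.prod(step_success_probabilities)
```

Five changes that are each 90% likely to go well give 0.9 to the fifth power, or about a 59% chance that all of them do; a single such change keeps the full 90%.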

A small soapbox: if you rock into a community and say “rewrite in x” you are implicitly saying:

  • You should throw out your hard work.
  • Your work would be easy to replace.
  • You don’t understand your own engineering tradeoffs.
  • I don’t understand what you’re doing.

Don’t come across as a language troll! Respect what people are trying to do. Ask how you can help, don’t just write code a project doesn’t want.

Case studies: Rewrite to understand

  • Use a familiar task or tool.
  • Document what you learn.
  • Take lessons home.

Carol Nichols: “I know Rust and I wonder if it could make Zopfli compression faster.” She re-wrote a library and made it available in case other people would find it useful.

Rewriting to Deploy

Be honest about your problems:

  • Code is slow? Refactor the hot code, don’t rewrite.
  • Code is confusing?
    • Fix your docs.
    • Prune dead code.
    • Simplify and refactor.
    • Build and document a boundary where you would add code.
    • Otherwise you’ll just rewrite your problems.
  • Bored now.
    • Not a great reason.
    • People will get bored of the new language.

So can Rust solve your problems? Well, let’s look at Rust’s goals:

  • Memory safety.
  • C-like performance.
  • Minimal runtime.
  • Integration with other languages.

Rust offers you many choices:

  • Stable vs Nightly. But stable is not a long term support.
  • Safe vs unsafe. Unsafe loses the memory safety guarantee of Rust.
  • Dynamic and static linking.

Rust has a lot of modern, excellent tooling, particularly by systems programming standards.

  • Rustup.
  • Cargo and crates.io.
  • Rustfmt.
  • Clippy.

Industry lessons

  • Isolate a problem.
  • Establish the workflow.
  • Work with a small, expert team.
  • Work incrementally.

NPM

They built a load balancer that would send some service requests to their current stack, and some to new Rust microservices. Those Rust microservices were replacements of particular high priority services.

By breaking the work into small, discrete chunks they were able to roll out Rust with low risk, and without having to have a lot of people learn Rust all at once.
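The routing decision at the heart of that approach can be sketched in a few lines. This is an illustration of the general pattern only: the backend names, the path-prefix scheme, and the example paths are all invented, and nothing here reflects how NPM’s load balancer is actually configured.

```python
LEGACY = "legacy-stack"   # hypothetical name for the existing stack
RUST = "rust-service"     # hypothetical name for the new microservices


def make_router(migrated_prefixes: list):
    """Return a function mapping a request path to the backend that should serve it."""
    prefixes = tuple(migrated_prefixes)

    def route(path: str) -> str:
        # Only explicitly migrated services go to the new code; everything else is untouched.
        return RUST if path.startswith(prefixes) else LEGACY

    return route
```

Migrating another service is then a one-line change to the prefix list, and rolling back is just as small.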

Project Quantum

The first challenge was working out how to move Rust into the Firefox deployment pipeline, with a small, feature flagged component, and growing from there. That first deployment had Rust code that never executed for users, but it tested whether it was possible to actually have the code build with the C++ code base. Once everything was reliable, then the work of porting chunks of the all-Rust Servo project over to Firefox could begin.

FFI

Rust has an FFI interface; this allows a language to call or be called by other languages; in FFI parlance, to be host (embedding another language in a Rust code base, for example), or to be a guest (embedding Rust in Perl, say).

edunham walked us through a simple FFI example, using Perl as the host language, and Rust as the guest; it required four files (the Rust program, the Perl program, a Makefile, and the build file for Rust), and was very straightforward. She encouraged the audience to have a go at FFI, in line with the idea that the best way to use Rust, in many cases, is in small, specific cases, rather than full re-writes.
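I didn’t capture the Perl and Rust code from the talk, so as a stand-in here is what the host side of an FFI call looks like from Python, using the standard `ctypes` module to call C’s `strlen` out of libc. The substitution is mine: Python rather than Perl as host, and a C library function rather than a Rust library as guest. A Rust guest would export functions with the same C calling convention, so the host side would look much the same. `ctypes.CDLL(None)` is POSIX-only.

```python
import ctypes

# Load the symbols already linked into this process (POSIX only); this exposes libc.
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Call the guest (C) function from the host (Python) and return its result."""
    return libc.strlen(text)
```

The whole contract between host and guest is the function name plus the argument and return types, which is a large part of why small, targeted FFI components are a gentle way to introduce a new language.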