LCA 2016 Day 2

The second day of miniconfs; I spent most of the day at the sysadmin miniconf.

Keynote: “The Cavalry’s not coming… We are the Cavalry”

The Challenges of the Changing Social Significance of the Nerd

George Fong

George started out as an Admiralty lawyer in Singapore, became a legal academic, and then became interested in technology and its outcomes; in the 80s the maths department felt technology was “theirs” and needed to be kept away from “people like me”.

The Internet predates Linux, but George considers the development of Linux to be inextricably linked to the Internet. He resists the Linux Foundation’s suggestion that “life without Linux” would see no Internet; nevertheless a large chunk of what we think of as the contemporary Internet is implemented on Linux.

Fear drove the creation of the Internet - ARPANet - and it was forked to NSFNet.

There isn’t a single law or governance process imposed on the IAB and IETF, and yet they have continued to define how the Internet works. There is a source of great concern, though: the desire by governments and NGOs (such as the ITU and UN) to take control of the standards and apparatus of the Internet. George recognises that some of the concerns raised were legitimate - but was still delighted that the process was stymied.

“The difficulty and complexity is something that protects us.”

George talked about his involvement in bringing up rural Internet access in Victoria - communities were begging for access. They were unsure what the effect would be, but they knew it would change things.

An elder told him: “you can have a pair of Aboriginal talking sticks when you can make the Internet as reliable.” Twenty years on he still doesn’t have a pair.

The changes wrought by the Internet have been astonishing, and making things work (via organisations like the IAB and IETF) has remained founded on key principles of community, in particular the idea of “The Rough Consensus”.

Today “the cloud” has proliferated in Australia and caused people to become far more dependent on the Internet than ever before. People love the services “the cloud” offers.

“We must continue to talk in plain English and we must continue to speak to the people in power.” George reminisced over a golden past of trust and naivety in the 70s and 80s; over time the “technical community were left behind” from a naive environment with “no firewalls”.

(I remember pre commercial Internet and I do not remember a Utopia poisoned by “ordinary people”. Trolling, flame wars, net.terrorism and general arseholery existed before “ordinary people” arrived. I am not a big fan of the idea of an Eden ruined by the normies, to put it mildly.)

As people and companies commercialised the ‘net, anxiety set in about security, so we started adding security - two-factor authentication via parallel non-Internet channels is “an admission the Internet is broken”. RFC 1984 laid out the position of the IAB and IETF in favour of encryption, and expressed concerns about moves by governments to attack privacy and encryption; revelations by the likes of Chelsea Manning and Ed Snowden show the correctness of this concern.

In Australia the push for data retention laws is the biggest local concern in this ongoing struggle; these laws are an attack on integrity and trust.

  • The technical community does not live in a parallel universe.
  • There is a difference between what ought to be and what is - all lawyers have to deal with conflict.
  • So what do you do? You cannot retreat into building software as a way of retreating from humanity.
  • Anonymous is not a fix. Neither are social media wars.
  • This is not a new problem.
  • A seat at the table should always be your first objective: speak in plain English, educate, factor in humanity.

(While I have some quibbles with George’s comments, I found this an enjoyable keynote - the oral history of the rollout of the ‘net in Australia was fascinating, and the overall point that technical fixes can’t work around human problems is one that we don’t, in my opinion, hear enough, and needs repeating.)

Q&A

  • Is gender balance a problem? Yes, it is critical. We are throwing away half our talent. “I don’t have the answers.”
  • Do you believe the IoT will deliver benefits to people or companies? If we drove at 30 km/h we could save a lot of lives - we tend to sacrifice safety for convenience. Fundamentally IoT can be transformative, so we need to consider the human impacts of what we’re doing. How we use the technology is the problem. We need to become sociologists.

Is That a Datacentre in Your Pocket?

Steven Ellis

  • AKA “Is that a cloud in your pocket?” but everyone is over clouds.
  • Steven has to demo at LUGs and for work on a regular basis, which involves building stuff in a regular, repeatable manner.
  • “I have a scratch I need to itch” - how do I stop using my workarounds and turn them into proper solutions to problems.

Hardware & OS

This talk is all on x86 running CentOS. Everything should be usable on other *ix flavours.

  • When mass virtualising on your laptop, always use SSD storage.
  • Use USB 3, not 2, and confirm your link is coming up at full speed. hdparm is your friend, and Steven has been unpleasantly surprised over the years how often slow links get auto-negotiated.
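
As a complement to timing the disk with hdparm, the negotiated USB link speed can also be read directly. This is a minimal sketch, assuming a Linux host that exposes a `speed` attribute (in Mbit/s) for each device under `/sys/bus/usb/devices`; the function names and the 5000 Mbit/s threshold are illustrative choices, not something from the talk. Note that it will also list devices that are legitimately slow, such as keyboards.

```python
from pathlib import Path

# USB 3.0 "SuperSpeed" signalling rate in Mbit/s; USB 2.0 tops out at 480.
USB3_MIN_MBPS = 5000


def is_full_usb3_speed(speed_text: str) -> bool:
    """Return True if a sysfs 'speed' value is at least USB 3.0 speed."""
    return float(speed_text.strip()) >= USB3_MIN_MBPS


def slow_usb_devices(sysfs_root: str = "/sys/bus/usb/devices"):
    """Yield (device_name, speed) for devices that came up below USB 3 speed.

    Assumes the Linux sysfs layout; on other systems the directory simply
    will not contain any 'speed' files.
    """
    for speed_file in sorted(Path(sysfs_root).glob("*/speed")):
        text = speed_file.read_text()
        if not is_full_usb3_speed(text):
            yield speed_file.parent.name, text.strip()


if __name__ == "__main__":
    for name, speed in slow_usb_devices():
        print(f"{name}: negotiated {speed} Mbit/s")
```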

Nested Virtualisation

  • With modern hardware, nested virtualisation gives much better performance than using QEMU to do the same thing.
  • Steven strongly recommends KVM over VirtualBox.
  • There are some processor-specific options to enable and optimise this, enabled at module load time.
  • Be careful that your nested VMs come up on a different address range to the hypervisor addresses. Otherwise your second-level guests will clash with your first-level guest(s).
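
The address-range warning can be checked mechanically before bringing up a nested guest. The sketch below uses Python's standard `ipaddress` module; the function name is hypothetical and the CIDR ranges in the usage note are only examples (192.168.122.0/24 is the usual libvirt default NAT network, but check your own configuration).

```python
import ipaddress


def networks_clash(outer_cidr: str, nested_cidr: str) -> bool:
    """Return True if the nested guest network overlaps the outer one."""
    outer = ipaddress.ip_network(outer_cidr)
    nested = ipaddress.ip_network(nested_cidr)
    return outer.overlaps(nested)


if __name__ == "__main__":
    # Same default range at both levels: second-level guests will clash.
    print(networks_clash("192.168.122.0/24", "192.168.122.0/24"))
    # Moving the nested hypervisor to its own range avoids the problem.
    print(networks_clash("192.168.122.0/24", "192.168.123.0/24"))
```

Partial overlaps count as clashes too, so a /16 on the outer hypervisor would conflict with any /24 inside it.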

Thin LVM

  • Underpins a lot of the talk.
  • Use thin LVMs on SSDs to create the illusion of size.
  • Means you don’t have to agonise over the system image sizes.
  • Enable issue_discards=1 or TRIM won’t work.
  • Thin-on-LUKS doesn’t work well.
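
The “illusion of size” is overcommitment: the virtual sizes of the thin volumes add up to more than the pool physically holds, which is fine until real usage approaches the pool size. The arithmetic is sketched below; the function names, the 80% warning threshold and the sizes in the usage note are assumptions for illustration, not figures from the talk.

```python
def overcommit_ratio(virtual_sizes_gib, pool_size_gib):
    """Ratio of promised (virtual) space to real pool space."""
    return sum(virtual_sizes_gib) / pool_size_gib


def pool_needs_attention(used_gib, pool_size_gib, threshold=0.8):
    """Return True once real usage crosses the (assumed) warning threshold."""
    return used_gib / pool_size_gib >= threshold


if __name__ == "__main__":
    # Ten guests that each believe they have 40 GiB, on a 200 GiB pool.
    print(overcommit_ratio([40] * 10, 200))
    print(pool_needs_attention(used_gib=170, pool_size_gib=200))
```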

Base OS Image

  • Always use a kickstart or similar automated build.
  • Doesn’t have to be a minimal install, but should be fairly small.
  • Consider your main use case.
  • Steven rsyncs the yum cache from image to image. Consider also mrepo or Satellite/spacewalk.

All About That Thin Base

  • You can use fstrim and kpartx to convert full images to thin LVM images.
  • virt-manager doesn’t always understand thin volumes.
  • The failure mode is to delete the thin image. This is Not Good.
  • Thin LVM doesn’t auto-start.

Building Micro-Environments

  • Learn cloud-init.
  • Docker and Kubernetes on Atomic - “mydbforweb” and “webwithdb” playbooks from Red Hat are very helpful how-tos.

UEFI

UEFI on KVM is an interesting aid to getting UEFI-ready.

  • UEFI can be booted on KVM now.
  • UEFI-ready USB keys are a world of pain.
  • If you prepare it correctly you can boot on either.
  • Some hardware will aggressively try to boot legacy (BIOS) mode in preference to UEFI: e.g. Cisco, Lenovo. This is especially true of USB. Check your tools are behaving the way you think.

To-Do

  • Re-Work all demos to UEFI.
  • Kubernetes + Docker + Ansible.
  • ARM64 Servers. “ARM is coming to the datacentre.”

Revisiting Unix Principles for Automation

Martin Krafft

  • Automation via the principles of Unix we know and love (like not working with projectors).

The “SSH botnet”: run sysadmin “stuff” over a persistent topology of connections between nodes, using SSH. The focus is on providing a high-quality transport layer for other automation tools.

Martin has strong feelings (“crap”) about the transport layers of cfengine, Salt, puppet, etc - homebrewed encryption, NIH implementations, and so on.

Transport Unix style:

  • Do one thing well.
  • Allow push or pull.
  • Using socat for the demo.
  • Creates sockets for the demo, with agents listening.
  • Nodes can be listeners (waiting for the master) or pull (from the master).

There’s nothing actually there yet - no policy mechanisms, control, etc, which Martin describes as “vapourware”.

On github

My reaction: the idea of a common transport layer is a good one, but I don’t think SSH is the Right Thing to do - message queues seem like a much better concept, since the problems that are hard about this sort of thing (like guaranteed delivery of config events, for example) are well-understood and solved.

Also, I’m not sure ssh “does one thing and does it well” in a way which is compatible with long-lived streams.

A Gentle Introduction to Ceph

Tim Serong

  • It has been around for a while.
  • Stable since 2012.
  • Scale out.
  • Most popular choice for storage on OpenStack.
  • Storage Clusters.
  • Object access, block access, distributed filesystem (CephFS).

The Components

  • Disk
  • FS for storing Ceph primitives.
  • OSDs, daemons which provide the interface to Ceph clients, and also provide the clustering and replication amongst themselves.
  • Monitor nodes - provide the lookup service for clients to discover the OSDs.
  • All replica writes to OSDs must complete for a write to be considered complete, but blocks can be read from any OSD.
  • CRUSH rules control the layout of data: number of replicas, rules around how far apart they need to be, and so on. Clients understand CRUSH, so you don’t need the controller to manage interaction.
  • Pools allow you to group storage for different types of storage with access controls and layout policy.
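
The reason clients can find data without asking a central controller is that placement is computed, not looked up. The sketch below illustrates that idea with rendezvous (highest-random-weight) hashing. It is explicitly not the CRUSH algorithm, which also takes the cluster hierarchy and placement rules into account; the function and OSD names are made up for the example.

```python
import hashlib


def place(object_name, osds, replicas=3):
    """Pick `replicas` OSDs for an object using rendezvous hashing.

    Every client that knows the same OSD list computes the same answer,
    so no lookup service is needed on the data path.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # Highest scores win; the order of the input list does not matter.
    return sorted(osds, key=score, reverse=True)[:replicas]


if __name__ == "__main__":
    cluster = [f"osd.{i}" for i in range(6)]
    print(place("my-object", cluster))
```

A useful property of this family of schemes is that removing an OSD that does not hold an object leaves that object's placement unchanged, so only the affected data has to move.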

Networks

  • Fast as you can afford.
  • Separate public and cluster networks.
  • The cluster network needs twice the bandwidth of the public network.

Storage Nodes

  • 1.5 GHz of core per OSD.
  • SSDs for the OSD journal is a good idea.
  • Regular HDDs for the main storage.

Performance

Layout Choices

  • Replication: fast but wasteful.
  • Erasure encoding, analogous to RAID5/6. Efficient, but big CPU and rebuild overheads.
  • Cache tiering: one pool can act as a cache over another.
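
The “fast but wasteful” versus “efficient” trade-off comes down to simple arithmetic on how much raw disk ends up as usable space. A minimal sketch, with illustrative function names and example parameters (three copies, and a 4 data + 2 coding chunk layout) that were not given in the talk:

```python
def replication_efficiency(copies):
    """Fraction of raw capacity that is usable with N full copies."""
    return 1 / copies


def erasure_efficiency(data_chunks, coding_chunks):
    """Fraction of raw capacity that is usable with k data + m coding chunks."""
    return data_chunks / (data_chunks + coding_chunks)


def usable_tb(raw_tb, efficiency):
    """Usable capacity for a given amount of raw disk."""
    return raw_tb * efficiency


if __name__ == "__main__":
    raw = 120  # TB of raw disk, an arbitrary example figure
    print(usable_tb(raw, replication_efficiency(3)))
    print(usable_tb(raw, erasure_efficiency(4, 2)))
```

With these example numbers, both layouts survive the loss of two devices, but the erasure-coded pool exposes twice the usable space, paid for in CPU and rebuild time.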

Adding More Nodes

  • Ceph tries to keep your data balanced evenly across the cluster.
  • More nodes = more performance.
  • More disks per node = more performance, so long as you’re not controller/CPU/memory bound.

Q&A

  • Is ceph working on changing write patterns to work well with shingled drives? Tim knows people are looking at it, but not what state it’s in.
  • How does ceph go with small clusters? Don’t bother unless it’s about a quarter-rack size. SUSE’s smallest supported stack is around 4-5 nodes.
  • What are the failure scenarios like? There’s always an odd number of MONs, so the quorum will win.
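
The preference for an odd number of MONs follows from majority quorum arithmetic, sketched below. The function names are illustrative.

```python
def quorum_size(monitors):
    """Smallest number of monitors that forms a strict majority."""
    return monitors // 2 + 1


def tolerated_failures(monitors):
    """How many monitors can be lost while a majority remains."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Going from three monitors to four does not buy any extra fault tolerance, which is why even counts are generally avoided.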

Keeping Pinterest Running

Joe Gordon

Software vs Service

Moving from “supporting software” to “supporting service” is a culture shift.

  • There’s no stable branch. There’s production.
  • There’s no compatibility matrix.
  • Developers support their own service. They’re on-call.

What is SRE?

  • Fire fighting at scale.
  • Changing tyres while driving at 100 km/h.
  • You don’t stop.

Operational Maturity

Moving from “ad hoc, individual heroics” to repeatable, reliable processes (whether through documentation or automation).

Operational excellence looks like resilience, monitoring and managing to SLAs, incident management, runbooks for anything that could go wrong.

Efficiency, capacity management, speed & latency.

Visibility

  • Absolutely critical.
  • Data driven: needed for debugging, SLA enforcement, and capacity planning.
  • Time series data: system, service, latency metrics are critical.
  • Alerting.
  • ELK stack for real time log collection.

Deployment Requirements

  • Easy to do the right thing.
  • Change history needs to be a thing.
  • And users should never know.

Canary vs Staging

  • Staging hosts receive “dark traffic”, copies of the customer data, to see what’s happening without sending responses to the user.
  • Canaries are pushing new code to one member of the cluster, letting it interact with users, and then leaving it in place, so you get different behaviours for different users.

Postmortems

  • They need to be blameless.
  • They’re led by an incident manager.
  • The objective has to be making sure the problem doesn’t happen again.
  • You need to restore service, then establish the root cause.
  • Understanding the timeline of the incident is critical.

Production Readiness Review

  • Do your NFRs: SLAs, capacity requirements and so on.
  • On call rotation: you don’t get to build stuff without supporting.
  • Rate limiting: you need to prioritise service availability.
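
The talk did not say how rate limiting is implemented at Pinterest. One common technique is a token bucket, sketched here as an assumption-laden illustration: the class name, parameters and injectable clock are all hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the logic can be tested
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    accepted = sum(bucket.allow() for _ in range(20))
    print(f"accepted {accepted} of 20 back-to-back requests")
```

Rejecting excess requests early is what lets the service stay available for everyone else when one caller misbehaves.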

Public Cloud Issues

  • They use Amazon.
  • Surprising errors.
  • InsufficientInstanceCapacity is a thing. You can exhaust Amazon’s resources, especially around Christmas.
  • Everything is rate limited. RequestLimitExceeded can happen to you.
  • Noisy neighbours are a problem.
  • Rightsizing: resources cost money, you don’t just want to buy the biggest one.
  • Ownership - who’s spending your money in the cloud.
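
When everything is rate limited, callers need to retry politely. The sketch below shows exponential backoff with jitter; it is a generic pattern, not Pinterest's code, and the `RequestLimitExceeded` class here is a local stand-in for whatever throttling error your client library raises, not a real SDK type.

```python
import random
import time


class RequestLimitExceeded(Exception):
    """Stand-in for a provider throttling error (hypothetical class)."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying throttled calls with exponential backoff.

    The wait before retry N is a random fraction of
    min(max_delay, base_delay * 2**N), so callers do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RequestLimitExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * rng())
```

The `sleep` and `rng` parameters exist only so the retry schedule can be exercised without real waiting.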

I can only imagine the joy of being able to tell Amazon you’ve run them out of resources for Christmas.

Open Sourced Tools

  • mysql_utils: Pinterest is heavily dependent on mysql.
  • thrift-tools: again dependent on Apache Thrift.
  • secor: Kafka log persistence.
  • pymemcache: memcached client for Python.
  • pinrepo: an artifact repo.
  • teletraan: the deployment tooling is about to be open sourced.

Can you hear me now? Networking for containers

Jay Coles

  • Bonding - primarily for reliability; one flow can never get more than a single link worth of performance.
  • VLANs: logical versions of physical segments. Useful for segregating traffic on the same physical network; VLANs can span multiple switches via trunking.
  • Dummy interfaces: used for routers. Remain available even if the underlying physical device goes away, avoiding upsetting the applications bound to them.
  • Bridges: emulate an L2 switch in software, basis of virt/container networking.
  • VETH: a virtual cable from your VM/container into your bridge. Simplest way to get the virtual environment plumbed in.
  • MACVLAN: Avoids having to reconfigure your networking to create and route bridges; high performance, because it reduces the number of switches in and out of kernel mode. Note that it relies on hardware MAC filtering support.
  • Overlay networks: lets you have a network layout that looks different to the physical implementation.
  • VXLANs are an example - like a cut cable that somehow sends packets, or a multi-system GRE tunnel.
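
The bonding point, that one flow never exceeds a single link, follows from how traffic is spread: each flow is hashed onto one member link so that its packets stay in order. The sketch below illustrates the idea with a CRC32 over the flow's addresses and ports; this is a stand-in hash for illustration, not the one the Linux bonding driver actually uses, and the interface names are examples.

```python
import zlib


def pick_link(src, dst, sport, dport, links):
    """Map a flow to one member link; the same flow always gets the same link."""
    key = f"{src}:{sport}-{dst}:{dport}".encode()
    return links[zlib.crc32(key) % len(links)]


if __name__ == "__main__":
    bond = ["eth0", "eth1"]
    # One big transfer is pinned to a single link, however long it runs.
    print(pick_link("10.0.0.1", "10.0.0.2", 40000, 443, bond))
    # Many flows with different ports spread across the members.
    for sport in range(40001, 40005):
        print(sport, pick_link("10.0.0.1", "10.0.0.2", sport, 443, bond))
```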

Ergonomics of Automation

Jamie Wilkinson

Ergonomics is “understanding the interaction between people and systems”, not your desk setup. “As I’ve gotten older I’ve realised that automation is about safety science, which means the aviation industry has beaten us to it.”

Many mistakes were made:

  • Automation failed to handle problems.
  • Machines do err.
  • Humans forget things they need to know.
  • Not enough focus on usability.
  • Automation has not caused us to start working 20 hour weeks.

The substitution myth: MABA-MABA: Men are better at/machines are better at. Belief that strengths and weaknesses are fixed, but actually automation creates new strengths and weaknesses.

Automation changes us from doers to supervisors.

Ultimately we haven’t reduced our workload, we’ve just shifted it. And we’ve shifted it unevenly. For example, when flight computers were introduced, pilots were now being asked to read the computers and tell ATC what’s going on, and fly the plane, rather than just fly the plane.

Automation tends to try and compensate for failures, then punts to a human, often in an emergency. We need new models of feedback - warnings on problems before those resources are exhausted, for example.
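
One way to warn before a resource is exhausted is to extrapolate its recent growth and alert on the predicted time to full instead of on a fixed threshold. This is a deliberately simple linear sketch; the function names, the sample format and the six-hour horizon in the usage note are assumptions, not something from the talk.

```python
def seconds_until_full(samples, capacity):
    """Estimate seconds until `capacity` is reached.

    `samples` is a time-ordered list of (timestamp_seconds, used) pairs.
    Returns None when usage is flat or shrinking, or when the samples
    do not span any time.
    """
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    rate = (u1 - u0) / (t1 - t0)
    if rate <= 0:
        return None
    return (capacity - u1) / rate


def should_warn(samples, capacity, horizon_seconds):
    """Warn while there is still `horizon_seconds` or less left to react."""
    eta = seconds_until_full(samples, capacity)
    return eta is not None and eta <= horizon_seconds


if __name__ == "__main__":
    # Disk went from 50 to 60 units used in one hour, out of 100.
    history = [(0, 50), (3600, 60)]
    print(seconds_until_full(history, 100))
    print(should_warn(history, 100, horizon_seconds=6 * 3600))
```

The human gets the page while the system is still healthy, with an estimate of how long they have, which is closer to the observable, directable automation the talk argues for.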

Complacency sets in: mostly reliable systems become trusted to do the right thing, even when they are misbehaving.

Automation should be human-centred, instead of technology centred. The automation must become part of the team; they must be both observable and directable.

Why is this important?

  • HCI (Human-Computer Interaction) is not “UI fluff.”
  • Outages are caused by clunky automation.
  • Time to recover is slowed by clunky automation.

PDNS

Little Creatures Lights

We all went out to the Little Creatures Brewery, which was very nice - wonderful staff and a very good range of foods, including my first time eating kangaroo (tastes delicious), and trying roasted hops (interesting, and some varieties are very nommy). The mix of food to booze was excellent, as well.

Also, I played the “Bibleopoly” game, courtesy of Paul Fenwick, which was… entertaining. Not a good game, but an entertaining one. Randi Lee Harper was one of the other players, and I will note that contrary to the opinions of certain sectors of the Internet, she does not eat babies and does have a sense of humour.