linux.conf.au 2015 Day 2

Keynote: Eben Moglen

  • Patent warfare broke out as predicted - the cost has been so high that even many of the proponents of using patents have begun to rethink their strategy.
  • The courts have begun to push back on patent cases, with 3 Supreme Court wins.
  • Attempts to create a patent-backed monopoly in mobile are inexhaustible.
  • "In ten years the most important patent system in IT will be the People's Republic of China."
    • This is likely to generate many state-backed monopolies.
    • This may not affect software authors much, but it will affect business partners significantly.
  • Eben prefers to talk about "free software", because "everything from weapons to ice cream is described as open source."
  • 21st century organisations are distinguished by transparency and non-hierarchical structure, something free software has embodied.
  • Eben notes that DVCS embodies the idea of flat collaboration; the way we develop has come to embody the idea of non-hierarchical organisation.
  • If Phil Zimmermann had lost his legal battles, we'd have lost not just PGP, but SSH and other tooling. We'd be helpless in the face of despotism.
  • The questions that matter are less about copyright over APIs or patents; the real political, social, and legal action is not there. Snowden did crucial work for freedom.
  • The majority of people with Internet connectivity are now aware of Snowden, and most of them want to do something based on those revelations.
    • That means everyone wants to meet us.
    • We need to explain that saving freedom is convenient.
    • We need to put on our nice vest and explain how we can help.
    • It is hard to sustain freedom in an environment which has been designed to make it hard to sustain.
  • Snowden has proven that you can't trust what you can't read.

etcd

  • Master with a cluster of slaves.
  • CoreOS developed etcd to support a set of technologies CoreOS think are important.
    • e.g. locksmith does policy-based rollouts of changes and reboots.
    • locksmith does A/B partition updating, rebooting, testing, and then either completes the rollout or rolls back.
    • Scales up to cluster level and depends on etcd.
  • Schedulers
    • e.g. kubernetes and fleet.
    • A scheduler API gives a policy-based way of explaining what you want; e.g. "I want 100 HTTP servers."
    • The scheduler makes things happen based on that policy and then keeps state.
  • Others:
    • vulcand replaces load balancers.
    • confd for config information.
    • DNS.
    • distributed git to provide a multimaster git server.

etcd is an enabling technology for all these sorts of problem spaces.
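
The scheduler idea in the list above (declare a policy such as "I want 100 HTTP servers", let the scheduler make it so, and keep the state) can be sketched as a reconcile loop. This is a toy illustration only, not how kubernetes or fleet are implemented; the `Cluster` class and `reconcile` function are hypothetical names, and the dictionaries stand in for state that a real scheduler would keep in etcd.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for the state a scheduler would keep in etcd."""
    desired: dict = field(default_factory=dict)  # service name -> wanted count
    running: dict = field(default_factory=dict)  # service name -> actual count


def reconcile(cluster):
    """Return the actions needed to move `running` towards `desired`."""
    actions = []
    for service in sorted(set(cluster.desired) | set(cluster.running)):
        want = cluster.desired.get(service, 0)
        have = cluster.running.get(service, 0)
        if have < want:
            actions.append(("start", service, want - have))
        elif have > want:
            actions.append(("stop", service, have - want))
        # Assumption for the sketch: every action succeeds immediately.
        cluster.running[service] = want
    return actions


cluster = Cluster(desired={"http": 100}, running={"http": 97})
print(reconcile(cluster))  # [('start', 'http', 3)]
print(reconcile(cluster))  # [] - already converged
```

The second call returning nothing is the point: the policy is the source of truth, and the scheduler only acts when reality drifts from it.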

etcd guarantees consistency and ordering. The tradeoff is that this doesn't defeat the speed of light - members may lag, so you need to make sure you're getting the latest version. The API has tooling to make this easy: a request can block until it is consistent, and a quorum GET forces the member to check with the cluster quorum before answering.
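
As a rough sketch of what a quorum read looks like on the wire, assuming etcd's v2 HTTP keys API (where `quorum=true` asks for a read that goes through the cluster quorum, and `wait=true` with `waitIndex` blocks until a change at or after that index): the helper below only builds the URL and makes no request. The function name `key_url`, the address, and the key are illustrative assumptions, not something from the talk.

```python
from urllib.parse import quote, urlencode


def key_url(base, key, quorum=False, wait_index=None):
    """Build a URL for etcd's v2 keys API (no request is made)."""
    params = {}
    if quorum:
        # Ask the member to confirm the value with the cluster quorum.
        params["quorum"] = "true"
    if wait_index is not None:
        # Block until the key changes at or after this index.
        params["wait"] = "true"
        params["waitIndex"] = str(wait_index)
    url = f"{base.rstrip('/')}/v2/keys/{quote(key.lstrip('/'))}"
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical local member and key, for illustration only.
print(key_url("http://127.0.0.1:2379", "/config/db", quorum=True))
# http://127.0.0.1:2379/v2/keys/config/db?quorum=true
```

A plain GET against a lagging member may return a stale value; the quorum form trades some latency for the latest one.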

Questions

  • How does it handle network partitions? There will be a strong side of the split brain and a weak side, based on where the numbers end up, so the weak side won't elect a master (e.g. when 5 splits into 3+2, the 2 will run masterless, the 3 will elect a master, and when the partition heals things will continue).
  • You can run arbitrary numbers of instances, from one up, but you want an odd number, and 5-9 is the safe range. 5 is effectively a minimum: one planned outage, one unplanned outage, and 3 to keep a proper cluster running.
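
The arithmetic behind both answers is majority quorum: a side of a partition can elect a master only if it holds more than half of the configured members. A minimal sketch, with hypothetical helper names:

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority remains."""
    return members - quorum(members)


def can_elect(side, members):
    """True if one side of a partition still holds a majority."""
    return side >= quorum(members)


for n in (1, 3, 4, 5, 7, 9):
    print(n, quorum(n), tolerated_failures(n))
# 1 1 0
# 3 2 1
# 4 3 1
# 5 3 2
# 7 4 3
# 9 5 4

print(can_elect(3, 5), can_elect(2, 5))  # True False
```

Going from 3 to 4 members buys no extra failure tolerance, which is why odd sizes are preferred, and 5 is the smallest size that survives two members being down at once (one planned, one unplanned).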

Managing Microservices Effectively

  • Standard app advice: externalise state from microservices and so forth.
  • I love the Gartner hype cycle graph. I may deploy this in anger.
  • They use Docker for isolation/ease of management.
    • Which is around the Peak of Inflated Expectations and heading down.
  • Deployment
    • As fast as possible.
    • Automate, automate, automate. Deployment = "I hate my job".
    • LIFX use Mesos, Marathon. This allows switching from old apps to new.
    • It's awesome, but it's climbing towards the Peak of Inflated Expectations.
  • Scheduling is a pain.
    • cron is not the answer.
    • If you're using Mesos, Chronos can be a Good Thing.
    • But it's not really ready for prime time - it works "most of the time".

Corralling Logs with ELK

  • Why centralise? Democratise your logs! Don't be your boss's grep! Because other logs suck!
  • Sizing:
    • It depends.
    • Test on a single server and see what happens; project from a single index/single shard/single replica.
    • Keep your heap under 32 GB - the JVM disables pointer compression at the 32 GB boundary, which kills performance. This is something I did not know.
  • Ecosystem: there are lots of application-specific plugins and tooling now.
  • The REST API makes it easy to monitor/manage.
    • But shell utilities make life easier.
  • Filters for logstash are awesome and powerful - "you will get to know grok very well".
    • There are about 150 input and output filters supplied.
    • There are another few hundred on top of that.
    • e.g. CLF filters, Postfix filters, and so on.
  • Kibana is in fact a thing of bewdy. Why are you paying for Splunk again?
  • ElasticSearch is intended to scale horizontally - you can scale vertically to a point, but ultimately it's primarily scale-out.
  • ES has been working on preventing corruption by checksumming indices across nodes for data integrity.
  • ES tuning is sufficiently complex that auto-tuning is not really viable, unfortunately. The community is happy to help.
  • Curator is what you need to selectively age out indices. It will delete, export, or back up the data according to your policy.
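
To make "age out indices" concrete, here is a small sketch of the selection step only. It is not Curator's code or command line; it assumes Logstash's default daily index names of the form `logstash-YYYY.MM.DD`, and the function name `expired_indices` and the 7-day retention are made up for the example.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, today, keep_days, prefix="logstash-"):
    """Return daily indices whose date is older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix):
            continue  # not one of ours
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index; leave it alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logstash-2015.01.01", "logstash-2015.01.12", ".kibana"]
print(expired_indices(names, today=date(2015, 1, 13), keep_days=7))
# ['logstash-2015.01.01']
```

Because each day's logs live in their own index, ageing data out is a cheap whole-index operation rather than a delete-by-query.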