Disks Suck

2010-12-24 · 615 words · 3 minute read

Technology

postgresql · storage · disk · partitioning

I’m currently dealing with an irritating problem in a skunkworks DB I’ve been building and maintaining for the last month or two; over the past week or so it’s hit a tipping point in terms of performance; queries that used to run in seconds take minutes, COPYs into the DB that used to take minutes take hours.

This sucks.

The root cause is hardware. Because it’s a little skunkworks project it’s living on a cobbled-together system: a Core2 Quad box with 4 GB of RAM and a RAID5 array of 3 x 80 GB, 7200 RPM drives. It personifies the weakness of standard PC hardware: cheap, abundant CPU and RAM. Don’t mention the disks.

To put it in some perspective: a little over a decade ago, a CPU clocked at a couple of hundred MHz was wicked fast. Memory clocked at tens of MHz was good. And a 7200 RPM drive with 11 ms seek times was really fast.

My skunkworks box has a CPU clocked an order of magnitude higher than my decade-old high-end CPU. The memory is an even bigger gap. The vastly improved caches on modern CPUs, the multiple data-fetch per cycle capability of modern RAM, wider data paths, these all make this skunkworks system almost magically more grunty that any server I’d have bought back then.

The disks are the same speed. The seek times are maybe 10-20% better. In a decade. Order of magnitude (or more) for commodity RAM and CPUs, nothing for disks. It’s like I’m building databases in 1999.

(Arguably it’s actually worse to do DBA work, even half-arse DBA work like I’m doing; back in the day if I needed 120 GB for data it’s be spread over 10 or 20 spindles, not 3. Assuming I could find them.)

What’s interesting is the beautiful “off a cliff” performance degradation this database has exhbited. It’s the classic tipping point behaviour. A million rows performs much like a 100,000 rows. Ten million rows gets a little hinky for queries without tuning the indicies and the Postgresql config a little more carefully. More indices don’t help import performance, of course, but overall it’s still OK.

Twenty million rows? You’re fucked. Queries take forever, better indicies don’t really help. That’s OK, though, because there’s practically no way to run any new data in, anyway. Fiddle with all the memory parameters you want, son, you ain’t goin’ nowhere. You’ve hit the wall of 100 - 120 disk transactions per spindle per second, and that’s all you’re getting forever and ever amen, or until you shell out for battery-backed caching disk controllers, or a SAN with gobs of same, or SSDs.

So I have, with a little reluctance, embraced partitioning a couple of the bigger tables; monthly for the tables that takes down to a million rows a month, weekly for the ones that still have 20 - 25 million rows a table. Partitioning in Postgresql is really useful in this regard; it puts a catastrophically unusable system back into contention. Unfortunately, unlike (say) Oracle partitioning, it’s not very automatic, which is why I’ve avoided it in the past: I’m not a real DBA, I don’t have a toolkit of scripts to auto-extend my partitioning schemes and triggers and whatnot, so I have to do all that irritating shit from scratch.

But still: on balance, I sing the praises of partitions. Without them I’d be dead in the water.

(I suppose I could go get one of those trendy NoSQL databases, but most of them seem to work around the problem of slow disk by just throwing away ACID compliance; I can pipe data to /dev/null all by myself, thanks all the same.)