What are IOPS?

KB ID 0001833

My IOPS History

I was on a call this morning where IOPS (Input/Output Operations Per Second) were being discussed. I have a love/hate relationship with IOPS insofar as they are ONLY any use when you are comparing apples with apples, and more importantly (which is the bit we don’t talk about) when we have defined what an apple is. Because one man’s Golden Delicious is another man’s Bramley cooking apple (that was deep, eh?).

A few years ago, when I was back on the tools, I was installing a storage system for a client (it was a virtual storage array) and we had benchmarked it with some software at 95 thousand IOPS. The vendor that supplied the storage pulled support for it, so we were left red-faced trying to source an alternative. Everything we installed came out with a figure of less than 95 thousand IOPS. As far as the customer was concerned, we had promised him one thing and delivered another.

So What Are IOPS?

Let’s say you want to buy a car. These days, with environmental concerns and the cost of fuel, one of the things you might want to compare is the ‘Miles per Gallon’ fuel consumption. Let’s say one of your choices has an MPG figure of 96 MPG (154.5 km per gallon). Well that’s dandy, but I guarantee that figure was tested in an environment that gave the manufacturer the best possible outcome, so unless you are going to drive at 56 miles an hour constantly, with the highest rated fuel, on a rolling road, and never stop or brake, then ACTUAL RESULTS MAY VARY. And who is to say car vendor A used the same tests as car vendor B? There are also THREE DIFFERENT SIZES for a gallon, and countries that don’t use gallons will convert from litres, which can’t be done exactly without a lot of decimal places.

IOPS suffer from similar problems, e.g. Storage Vendor ‘A’ will say, “we deliver 1.2 million IOPS”, and Vendor ‘B’ will say, “we deliver 1.8 million IOPS”, so Vendor B is the better option, right? Well NO, and that’s why you need to know how the figures are derived.

The figure that gets derived relies heavily on the following factors.

  • Block size / sector size of the storage.
  • Resiliency/RAID level of the storage.
  • Actual physical storage media (e.g. Spinning disk/nearline/midline/SSD).
  • Actual physical connection fabric (e.g. SAS/Fibre Channel/iSCSI).
  • Size of data written.
  • Size of data read.
  • Sequential or random read/writes, or a blend of the two.
  • Concurrent workload (testing an array with no load is like driving an F1 race car on a closed motorway).
  • Storage QoS: if you’re in a ‘shared’ storage environment, your IOPS may be ‘capped’.
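
To see how block size alone can flip that Vendor A versus Vendor B comparison, here is a minimal Python sketch. It assumes the usual relationship of throughput = IOPS x I/O size. The 8 KiB and 4 KiB test sizes are made up for illustration (neither figure above came with one), and the function name is mine.

```python
def throughput_mib_per_s(iops: float, block_size_bytes: int) -> float:
    """Throughput (MiB/s) implied by an IOPS figure at a given I/O size."""
    return iops * block_size_bytes / (1024 * 1024)


# Hypothetical test conditions: the quoted IOPS figures did not state a block size.
vendor_a = throughput_mib_per_s(1_200_000, 8 * 1024)  # 1.2M IOPS at 8 KiB
vendor_b = throughput_mib_per_s(1_800_000, 4 * 1024)  # 1.8M IOPS at 4 KiB

print(f"Vendor A: {vendor_a:.2f} MiB/s")  # 9375.00 MiB/s
print(f"Vendor B: {vendor_b:.2f} MiB/s")  # 7031.25 MiB/s
```

Under those assumed block sizes, the vendor with the smaller IOPS number is actually moving more data per second, which is the whole problem with quoting IOPS on their own.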

What are IOPS: Throughput and Latency

THROUGHPUT is normally used in conjunction with IOPS. Throughput is a figure measured in bps (bits per second) or Bps (bytes per second). So if we know this figure AND we have an IOPS figure (and we know how it was derived), then we can make a comparison? Well no, there’s a third thing we didn’t take into consideration: LATENCY. This is the amount of time it takes to get an operation to and from the storage array.

Why is that important? Let’s say we have an ‘All SSD’ array with blistering throughput and IOPS figures, but your 10+ year old Solaris 7 servers cannot match that through their 5+ year old HBAs. Your ‘experience’ is going to be bad. OK, that’s a severe example, but put it in a real-world scenario. I work for a service provider, and we provide storage. If we say we will warrant X thousand IOPS, and a customer that just consumes storage from us connects their Solaris 7 servers to that storage and says, “we are only getting half that performance”, whose responsibility is it to investigate why?

This is why, if you look at the large hyperscalers, when they give you performance info they will give you IOPS (without telling you what those IOPS are!) and they will give you throughput (that they will cap, usually at x Mbps). Latency is not really their problem. Search their documentation and they deliberately only use the word latency to say things like “Ultra low latency SSD” or that “SSD provides lower latency than HDD”.
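
A rough way to put a number on why latency matters is Little’s Law: with a fixed number of I/Os in flight, the IOPS a host can reach is roughly that number divided by the time each operation takes to come back. The sketch below assumes a steady state. The queue depth of 32 and both latency values are hypothetical, picked purely for illustration, and the function name is mine.

```python
def achievable_iops(outstanding_ios: int, latency_ms: float) -> float:
    """Little's Law estimate: I/Os in flight divided by per-I/O latency."""
    return outstanding_ios * 1000.0 / latency_ms


# Hypothetical hosts talking to the same array with the same queue depth.
modern_host = achievable_iops(outstanding_ios=32, latency_ms=0.5)
ageing_host = achievable_iops(outstanding_ios=32, latency_ms=4.0)

print(modern_host)  # 64000.0
print(ageing_host)  # 8000.0
```

Same array, same queue depth, an eighth of the IOPS, purely because each operation takes eight times longer to make the round trip.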

So Why the Ben Affleck Meme?

Because, of the three things you need to take into consideration when looking at storage performance, IOPS is the one figure that, without any definition, means nothing. (And remember this is storage performance, not application performance: a poorly coded DB application from 1987 can be on the best hardware in the world and still be awful, and your DB consultant will blame the storage or the network because he can earn several hundred pounds a day while you bust a gut proving otherwise.)

I do like an analogy (as you’ve seen). What are IOPS? IOPS are the digital equivalent of giving 50 teenage boys some ribbon and a Sharpie, telling them all to make a tape measure and find out who is the best endowed, then deciding who wins (without seeing the tape measures) based on who came up with the biggest number.
