As organizations move toward real-time data processing, cloud computing and high-performance workloads, storage performance has become a critical part of IT infrastructure planning. Storage speed directly affects application responsiveness, system scalability and the ability to avoid performance bottlenecks. Input/output operations per second (IOPS) is one of the most widely used metrics to evaluate storage performance.
While faster storage is universally desirable, IOPS provides a concrete way to measure how effectively a storage system can handle frequent data requests. Understanding IOPS helps organizations make informed decisions about storage design, capacity planning and workload optimization.
What is IOPS?
IOPS definition: Input/output operations per second measures how many read and write operations a storage device can complete in one second. It is a standard benchmark for evaluating storage technologies such as HDDs, SSDs and storage area networks (SANs).
IOPS complements other performance metrics such as throughput and latency. Throughput measures how much data is transferred, and latency measures how long a single operation takes. IOPS focuses instead on how frequently a storage device can process requests, making it especially relevant for workloads that generate many small, rapid transactions.
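The link between IOPS and throughput at a fixed block size is simple arithmetic: throughput equals IOPS multiplied by block size. The sketch below illustrates that conversion; the function names and the example figures are illustrative assumptions, not values from any particular device.

```python
def throughput_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a fixed block size.

    Uses decimal megabytes (1 MB = 1,000,000 bytes).
    """
    return iops * block_size_bytes / 1_000_000


def iops_from_throughput(mb_per_s: float, block_size_bytes: int) -> float:
    """IOPS implied by a throughput figure at a fixed block size."""
    return mb_per_s * 1_000_000 / block_size_bytes


if __name__ == "__main__":
    # Hypothetical figures: 100,000 IOPS at a 4 KiB block size.
    print(throughput_mb_per_s(100_000, 4096))
    # Hypothetical figures: 2,000 MB/s at a 128 KiB block size.
    print(iops_from_throughput(2_000, 128 * 1024))
```

The same device can therefore report a high IOPS number with modest throughput at small block sizes, or high throughput with a modest IOPS number at large block sizes.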
IOPS is commonly measured for both random and sequential access patterns.
Achievable IOPS depends on factors such as the storage controller, the NAND flash type in SSDs and the interface used to connect storage to the host system, such as non-volatile memory express (NVMe™) or serial advanced technology attachment (SATA). High IOPS values are often essential for workloads that require rapid, small-block data transfers.
How does IOPS work?
IOPS quantifies the frequency of data transactions between a server and its storage. These measurements help IT professionals determine whether a storage solution can meet the performance demands of specific applications. Four core factors shape any IOPS measurement.
Workload type determines whether the operations are reads, writes or a combination of both. Most benchmarks use a mixed workload to better represent real-world use.
Access pattern identifies whether data is accessed randomly or sequentially. Random access is significantly harder for a drive to sustain than sequential access, so random IOPS figures are typically lower than sequential ones.
Block size defines the size of each data transfer, such as 4KB or 8KB. Smaller block sizes typically result in higher IOPS values, while larger block sizes favor higher throughput.
Queue depth (QD) represents the number of outstanding input/output (I/O) requests waiting to be processed. Higher queue depths can reveal the maximum IOPS potential of high-performance SSDs by increasing parallelism. At higher queue depths, the NVMe protocol allows storage devices to process many I/O requests simultaneously, enabling much higher IOPS than legacy interfaces.
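To show how these factors appear as parameters in a measurement, here is a minimal read-only sketch. It times synchronous reads (effectively queue depth 1) against a file, with block size and access pattern as arguments. The function names are hypothetical, it relies on `os.pread` (available on Unix-like systems), and the reads will usually be served from the operating system's page cache, so the result says little about the underlying device. Dedicated benchmarking tools that bypass the cache and control queue depth are the appropriate choice for real measurements.

```python
import os
import random
import tempfile
import time


def make_test_file(size_bytes: int) -> str:
    """Create a temporary file of random bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def measure_read_iops(path, block_size=4096, num_ops=2000,
                      random_access=True, seed=0):
    """Time synchronous reads and return (iops, mean_latency_seconds).

    One request is outstanding at a time, so this approximates queue depth 1.
    """
    num_blocks = os.path.getsize(path) // block_size
    if num_blocks == 0:
        raise ValueError("file is smaller than one block")

    rng = random.Random(seed)
    if random_access:
        # Random pattern: block-aligned offsets chosen uniformly.
        offsets = [rng.randrange(num_blocks) * block_size
                   for _ in range(num_ops)]
    else:
        # Sequential pattern: consecutive blocks, wrapping at end of file.
        offsets = [(i % num_blocks) * block_size for i in range(num_ops)]

    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for offset in offsets:
            os.pread(fd, block_size, offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)

    return num_ops / elapsed, elapsed / num_ops


if __name__ == "__main__":
    path = make_test_file(8 * 1024 * 1024)
    try:
        for pattern in (True, False):
            iops, latency = measure_read_iops(path, random_access=pattern)
            label = "random" if pattern else "sequential"
            print(f"{label}: {iops:,.0f} IOPS, {latency * 1e6:.1f} us mean latency")
    finally:
        os.remove(path)
```

Because only one request is in flight at a time, the measured IOPS here is simply the reciprocal of the mean latency; exercising higher queue depths requires asynchronous or multi-threaded I/O.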
What is the history of IOPS?
IOPS originated as a simple way to describe storage performance when mechanical drives dominated computing. As storage technology evolved, its role and impact expanded significantly.
- 1970s, mechanical limits: HDD performance was constrained by seek time and rotational latency, typically limiting drives to fewer than 200 IOPS.
- 1990s, RAID scaling: Redundant array of independent disks (RAID) architectures allowed multiple drives to operate in parallel, increasing total system IOPS by distributing workloads.
- 2000s, SSD adoption and SATA limits: Flash-based SATA SSDs eliminated mechanical delays and increased IOPS from hundreds to tens of thousands. However, the SATA interface quickly became a performance bottleneck as flash memory advanced.
- 2010s, NVMe protocol: As SSD performance increased, NVMe moved storage from legacy interfaces onto the peripheral component interconnect express (PCIe®) bus, removing interface bottlenecks and enabling massive parallelism. By bringing storage closer to the compute core and reducing protocol overhead, NVMe dramatically lowered latency and pushed IOPS into the hundreds of thousands and millions.
- 2020s and beyond, AI-scale and composable architectures: AI, machine learning and real-time analytics increased the demand for high random IOPS and low-latency storage. At the same time, emerging architectures such as memory and storage pooling and advanced interconnects like compute express link (CXL) are reshaping how I/O performance is delivered, making IOPS increasingly influenced by system-level design rather than storage devices alone.
What are the key types of IOPS?
IOPS can be categorized based on how data is accessed and how performance is sustained over time:
- Random read IOPS: Measures how quickly small data blocks can be retrieved from different locations on the storage device.
- Random write IOPS: Measures how quickly small data blocks can be written across the storage medium.
- Sequential read/write: Describes performance during large, continuous data transfers, often expressed alongside throughput in megabytes per second (MB/s).
- Steady-state IOPS: Represents sustained performance after a device has been under load for an extended period, which is especially important for enterprise SSDs.
- Burst IOPS: Indicates short-term peak performance before power, thermal or controller constraints reduce throughput.
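The difference between burst and steady-state figures can be made concrete with a simple check over a series of IOPS samples. The sketch below treats a device as steady once the most recent samples stay within a narrow band and show little trend. The function name, the window size and both thresholds are placeholder assumptions for illustration, not a standard-conforming test procedure.

```python
from statistics import mean


def is_steady_state(iops_samples, window=5,
                    max_excursion=0.20, max_slope_drift=0.10):
    """Return True if the last `window` samples look steady.

    Two simplified conditions (thresholds are illustrative assumptions):
      1. max - min of the window is within `max_excursion` of its average.
      2. The drift of a least-squares line across the window is within
         `max_slope_drift` of the average.
    """
    if window < 2 or len(iops_samples) < window:
        return False

    recent = list(iops_samples[-window:])
    avg = mean(recent)
    if avg <= 0:
        return False

    # Condition 1: samples stay inside a band around the average.
    if max(recent) - min(recent) > max_excursion * avg:
        return False

    # Condition 2: least-squares slope, expressed as total drift over the window.
    xs = range(window)
    x_mean = mean(xs)
    slope = (sum((x - x_mean) * (y - avg) for x, y in zip(xs, recent))
             / sum((x - x_mean) ** 2 for x in xs))
    drift = abs(slope) * (window - 1)
    return drift <= max_slope_drift * avg


if __name__ == "__main__":
    # Hypothetical samples: an initial burst that settles to a lower level.
    samples = [90_000, 60_000, 41_000, 40_500, 40_200, 39_900, 40_100]
    print(is_steady_state(samples[:5]))  # window still includes the burst
    print(is_steady_state(samples))      # window covers only settled samples
```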
How is IOPS used?
IOPS is a key performance indicator for workloads that depend on frequent, low-latency data access.
In database environments, high IOPS ensures fast query execution and reliable transaction processing. Databases rely heavily on small, random read and write operations, making IOPS more relevant than raw throughput.
In virtual desktop infrastructure (VDI), IOPS helps maintain performance during boot storms, when many users log in or launch applications at the same time. Storage systems with insufficient IOPS can experience slowdowns or instability during these peak events.
IOPS is also critical in AI, automotive and edge computing. Autonomous systems and AI pipelines must ingest and retrieve data in real time. High IOPS storage enables rapid sensor data processing and supports low-latency decision-making.
HDDs rely on mechanical components that must physically move to access data, which limits how many operations can be completed per second. SSDs access data electronically through flash memory, eliminating mechanical delays and enabling much higher IOPS.
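A rough way to see the mechanical ceiling is to estimate the service time of one random HDD access as average seek time plus average rotational latency (half a rotation), then take the reciprocal. The sketch below does this. It ignores transfer time, caching and command queuing, and the drive parameters in the example are illustrative assumptions rather than specifications of any real product.

```python
def hdd_random_iops(avg_seek_ms: float, rpm: float) -> float:
    """Rough upper bound on random IOPS for a single mechanical drive.

    Service time per operation is modeled as average seek time plus
    average rotational latency (the time for half a rotation).
    """
    rotation_ms = 60_000 / rpm                    # time for one full rotation
    avg_rotational_latency_ms = rotation_ms / 2   # on average, half a rotation
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms
    return 1000 / service_time_ms


if __name__ == "__main__":
    # Hypothetical drives: (average seek in ms, spindle speed in RPM).
    for seek_ms, rpm in [(8.5, 7200), (3.5, 15000)]:
        print(f"{rpm} RPM, {seek_ms} ms seek: "
              f"about {hdd_random_iops(seek_ms, rpm):.0f} IOPS")
```

Even with a fast spindle and short seek times, the estimate stays in the tens to low hundreds of IOPS, because every random operation waits on physical movement.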
Latency is the amount of time it takes to complete a single I/O operation (such as a read or write request). High IOPS combined with low latency delivers responsive storage performance. Even systems with high IOPS can feel slow if latency is too high.
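The relationship between IOPS, latency and queue depth can be approximated with Little's law: the average number of outstanding requests equals the completion rate multiplied by the average latency. The sketch below applies that identity. The function names and example numbers are illustrative assumptions, and the model assumes the queue is kept full at the stated depth.

```python
def iops_from_latency(queue_depth: float, mean_latency_s: float) -> float:
    """IOPS sustained when `queue_depth` requests are always outstanding."""
    return queue_depth / mean_latency_s


def latency_from_iops(queue_depth: float, iops: float) -> float:
    """Mean per-operation latency (seconds) implied by IOPS and queue depth."""
    return queue_depth / iops


if __name__ == "__main__":
    # Hypothetical device: 100 microsecond mean latency at queue depth 1.
    print(iops_from_latency(1, 100e-6))
    # Same IOPS-per-request efficiency with 32 requests in flight.
    print(iops_from_latency(32, 100e-6))
    # A high IOPS figure reached only at a deep queue implies higher latency.
    print(latency_from_iops(256, 320_000) * 1e3, "ms")
```

This is why a headline IOPS number quoted at a deep queue can coexist with per-request latency that feels slow to a single-threaded application.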