A cache is a small, fast memory that the microprocessor uses to reduce the average time needed to access the computer's main memory (usually random-access memory, or RAM). It is one of the upper levels of the memory hierarchy. The cache is built from a small amount of very fast memory (typically SRAM) that holds copies of frequently used data from main memory. If most requests are served by the cache, the average access latency approaches the latency of the cache itself. When the CPU needs to read or write memory, it first checks whether a copy of the data is in the cache. If it is (a cache hit), the processor performs the operation on the cache, which has much lower latency than main memory, and overall performance improves.
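The lookup described above can be illustrated with a minimal sketch. It is a simplified model, not a description of any real processor: the class name `TinyCache`, the fully associative organization with least-recently-used replacement, and the latencies of 4 and 200 cycles are all assumptions chosen for illustration.

```python
from collections import OrderedDict

# Hypothetical latencies in clock cycles, chosen only for illustration.
CACHE_LATENCY = 4
MEMORY_LATENCY = 200


class TinyCache:
    """Toy fully associative cache with LRU replacement (a simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> cached copy of the data
        self.hits = 0
        self.misses = 0
        self.cycles = 0

    def read(self, address, memory):
        # The cache is always checked first, so its latency is always paid.
        self.cycles += CACHE_LATENCY
        if address in self.lines:
            # Hit: serve the copy and mark it as most recently used.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Miss: go to main memory and keep a copy for later requests.
        self.misses += 1
        self.cycles += MEMORY_LATENCY
        value = memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used
        return value

    def average_latency(self):
        accesses = self.hits + self.misses
        return self.cycles / accesses if accesses else 0.0


# A program that keeps rereading the same four addresses.
memory = {addr: addr * 2 for addr in range(64)}
cache = TinyCache(capacity=8)
for _ in range(100):
    for addr in range(4):
        cache.read(addr, memory)

print(cache.hits, cache.misses, cache.average_latency())
```

In this run only the first read of each address goes to main memory, so the average latency ends up close to the cache latency and far from the memory latency.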

Today, most microprocessors have several caches: an instruction cache to speed up the loading of machine code, a data cache to speed up the reading and writing of data, and a translation lookaside buffer (TLB) to speed up the translation of virtual (logical) addresses to physical addresses for both data and instructions. The cache is often organized as a three-level hierarchy (L1, L2, L3).
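The effect of a three-level hierarchy on average latency can be estimated with a short calculation. The sketch below assumes that each level is checked in order and that a miss at one level adds the latency of the next. The function name `average_access_time` and all latencies and hit rates are hypothetical values, not measurements of any real processor.

```python
def average_access_time(levels, memory_latency):
    """Estimate the average access latency of a cache hierarchy.

    levels: list of (latency_cycles, local_hit_rate) pairs, ordered from
            L1 outward. The local hit rate is the share of requests that
            reach a level and are served by it.
    memory_latency: latency of main memory in cycles.
    """
    total = 0.0
    reach_probability = 1.0  # share of requests that get as far as this level
    for latency, hit_rate in levels:
        if not 0.0 <= hit_rate <= 1.0:
            raise ValueError("hit rate must be between 0 and 1")
        total += reach_probability * latency
        reach_probability *= 1.0 - hit_rate
    # Requests that missed in every level go to main memory.
    return total + reach_probability * memory_latency


# Hypothetical L1, L2, L3 parameters: (latency in cycles, local hit rate).
hierarchy = [(4, 0.9), (12, 0.8), (40, 0.5)]

with_caches = average_access_time(hierarchy, memory_latency=200)
without_caches = average_access_time([], memory_latency=200)
print(with_caches, without_caches)
```

With these assumed numbers the estimate is about 8 cycles, compared with 200 cycles when every access goes to main memory.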

In the early days of microprocessor technology, the difference between the access time of main memory and that of the processor registers was small. Since the 1980s, however, the performance gap has grown significantly. Processor speed grew faster than that of RAM, especially in terms of clock frequency, so RAM became a bottleneck for system performance. Although main memory could be built from the same type of fast memory as the cache, a more economical approach was chosen: use a large amount of relatively slow but cheap memory together with a small amount of fast cache memory to narrow the performance gap.

On modern processors, reading data from the cache usually takes more than one clock cycle. Program execution is sensitive to the latency of reads from the first-level cache, so a great deal of effort has gone into making caches faster.

