Caching Is Key, and SIEVE Is Better Than LRU (usenix.org) 24
USENIX, the long-running OS/networking research group, also publishes a magazine called ;login:. Today the magazine's editor — security consultant Rik Farrow — stopped by Slashdot to share some new research. rikfarrow writes:
Caching means using faster memory to store frequently requested data, and the most commonly used algorithm for determining which items to discard when the cache is full is Least Recently Used [or "LRU"]. These researchers have come up with a more efficient and scalable method that uses just a few lines of code to convert LRU to SIEVE.
Just like a sieve, it sifts through objects (using a pointer called a "hand") to "filter out unpopular objects and retain the popular ones," with popularity based on a single bit that tracks whether a cached object has been visited: As the "hand" moves from the tail (the oldest object) to the head (the newest object), objects that have not been visited are evicted... During the subsequent rounds of sifting, if objects that survived previous rounds remain popular, they will stay in the cache. In such a case, since most old objects are not evicted, the eviction hand quickly moves past the old popular objects to the queue positions close to the head. This allows newly inserted objects to be quickly assessed and evicted, putting greater eviction pressure on unpopular items (such as "one-hit wonders") than LRU-based eviction algorithms.
It's an example of "lazy promotion and quick demotion". Popular objects get retained with minimal effort, with quick demotion "critical because most objects are not reused before eviction."
After 1559 traces (of 247,017 million requests to 14,852 million objects), they found SIEVE reduces the miss ratio (when needed data isn't in the cache) by more than 42% on 10% of the traces with a mean of 21%, when compared to FIFO. (And it was also faster and more scalable than LRU.)
"SIEVE not only achieves better efficiency, higher throughput, and better scalability, but it is also very simple."
Just like a sieve, it sifts through objects (using a pointer called a "hand") to "filter out unpopular objects and retain the popular ones," with popularity based on a single bit that tracks whether a cached object has been visited: As the "hand" moves from the tail (the oldest object) to the head (the newest object), objects that have not been visited are evicted... During the subsequent rounds of sifting, if objects that survived previous rounds remain popular, they will stay in the cache. In such a case, since most old objects are not evicted, the eviction hand quickly moves past the old popular objects to the queue positions close to the head. This allows newly inserted objects to be quickly assessed and evicted, putting greater eviction pressure on unpopular items (such as "one-hit wonders") than LRU-based eviction algorithms.
It's an example of "lazy promotion and quick demotion". Popular objects get retained with minimal effort, with quick demotion "critical because most objects are not reused before eviction."
After 1559 traces (of 247,017 million requests to 14,852 million objects), they found SIEVE reduces the miss ratio (when needed data isn't in the cache) by more than 42% on 10% of the traces with a mean of 21%, when compared to FIFO. (And it was also faster and more scalable than LRU.)
"SIEVE not only achieves better efficiency, higher throughput, and better scalability, but it is also very simple."
Pretty Neat (Score:5, Informative)
And you can even understand it by just reading the summary.
Re:Pretty Neat (Score:5, Funny)
A detailed, explanatory summary; plus a link to an article on a site that doesn't throw up a paywall or even require registration.
Who let THIS story through?
Re: (Score:2)
> Who let THIS story through?
Probably the same person who let the dogs out
Re: (Score:1)
Love it. Pure and simple progress.
Re:Pretty Neat (Score:5, Informative)
And you can even understand it by just reading the summary.
Almost. A relevant thing left out (though possibly sussed out by some) is that the the claims were motivated by and apply to the Web CDN exclusively, not necessarily caching more generally.
The data access patterns of these web caches, specifically in key-value and CDN workloads, differ markedly from traditional page cache workloads. For instance, whereas loops and scans of address ranges are common access patterns in block cache workloads, they are very rare in web cache workloads. Instead, the objects in web cache workloads invariably exhibit skewed and long-tailed popularity distributions that follow a power-law distribution. ...
So for other uses... ymmv
Re: (Score:2)
Re: (Score:2)
Agreed. One of these times that reading slashdot feels like it's actually worth it. Both the summary and the actual article were a nice read.
MFU (Score:3, Funny)
Congratulations, you just re-invented "Most Frequently Used"
Check out the ZFS ARC, which combines both MRU and MFU, along with ghost lists for things recently evicted to learn how it predicted wrongly to help auto-tune itself.
Yeah, you're not coming up with new ideas, not even remotely. This has been around literally decades.
Re: (Score:3)
ARC was one of 19 algorithms they compared.
Re:MFU (Score:5, Informative)
ARC is mentioned on page 1. They note that ARC has 4 internal LRU lists. However,
Re:MFU (Score:4, Informative)
The ZFS ARC has patents... I don't know when they expire, or if they already have... ZFS is damned close to 20 years old now from a release date perspective. If anything I'd say they re-invented CLOCK, though they do mention how it differs.
(MRU = Most Recently Used, MFU = Most Frequently Used)
But ARC still isn't quite the same thing. Notably, once promoted to MFU, pages don't get demoted - MFU only evicts items when MRU items are promoted, bumping off the oldest MRU item. Whereas SIEVE effectively bumps MRU items back down to MFU as it touches them, though it's still just 1 queue and its status is just a flag. So a page really needs to keep showing it's wanted to avoid eviction.
ARC has two distinct queues... pages go into MRU first when loaded for the first time, and then moved to MFU if hit again before eviction and then this queue behaves like regular LRU. Thus a simple file copy might blow away the first queue, but the second should remain untouched. What makes ZFS's ARC special is those ghost lists are used to help decide the size of the two queues... cache misses for items very recently purged are actually detected as near-misses and is a hint that one of the two queues - whichever it belonged to - might benefit from being larger (at the expense of the other)
Re: (Score:3)
Alternatively, you could read the paper, and then you'd know this is discussed and is not the same thing your talking about.
Arc is discussed as an example of an overly complicated algorithm Sieve replaces.
Re: (Score:2)
Alternatively, you could read the paper, and then you'd know this is discussed and is not the same thing your talking about.
Arc is discussed as an example of an overly complicated algorithm Sieve replaces.
It won't replace ARC, it won't even replace LRU. The principle that MFU > MRU is just not remotely new. Is ARC the only way of doing that?
If your caching algorithm is still using LRU at this point... why?
Re: (Score:1)
What about Tiny-LFU ? (Score:3)
Re: (Score:2)
Across many large and small traces their policies tend to be average. That's because it's just a small adjustment to existing work, e.g.
Re: (Score:1)
LRU (Score:2)
Let's talk about LRU:
LRU (linked list and hashtable):
Get (item in cache)
* Look for key in hashtable, get reference to cache entry
* Move found item to back of the linked list
Get (item not in cache)
* Look for key in hashtable, you get nothing
* If there's no room, evict items from the front of the list until there is room, removing them from the hashtable as well.
* If there's room, add new item to back of the linked list
Changing LRU to simplify the Found case:
Get (item in cache)
* Look for key in hashtable, get
ARC is at least twenty years old (Score:3)
Older than that becase what Sun called ARC was already patented and licensed from IBM.
Our evaluation on 1559 cache traces from 7 sources shows that SIEVE achieves up to 63.2% lower miss ratio than ARC.
This is highly deceptive, "up to", in what % of the test cases was that seen, 1/1559...? Only the statistics across many workloads matter, or specific real world edge cases that are likely to occur, such as background processes walking the filesystem and wiping the cache. There are always outliers that makes one cache strategy look better or worse by large margins, because they're only guessing.
SIEVE is doing roughly the same trick as ARC, or probably anything else remotely modern, separating recently used from frequently used. It straight up prioritizes FU all the time though, where the A in ARC, adaptive, is about shifting over time to somewhere between what SIEVE does and what LRU does. The other thing is both sides in ARC are still LRU, while the two sides in SIEVE are FIFO.
If you do a server backup, or for whatever reason say Splunk decides to read the whole damn universe in, your system will be flooded with cache misses. LRU evicts the whole cache in LRU order. SIEVE would evict all things that were hit once, then all other things, in FIFO order, replacing the whole cache with recently used objects. ARC would evict from the hit once side in LRU order, leaving the hit more than once side alone, for any number of cache misses. Only if it gets ghost hits on the hit once side would it pull from the hit more than once side, in LRU order. Keeping batches of cache misses from dumping more valuable cache is a huge feature of ARC, it's almost the whole point. SIEVE can only kinda sorta do that if it gets lucky with small runs of cache misses, small relative to its total cache size. I'm sure it's good enough for web content caching, but the system page cache on a general purpose server, uuuuuhmmmm... if we're finally moving on from LRU let's do better please.
SIEVE is very, very, very simple on the other hand, and it's better than LRU, because LRU is as dumb as it gets not counting FIFO. In fact SIEVE is FIFO, it just shows that even FIFO is better than LRU when you track frequently used objects. But we already knew this. It is 2024 and we're still using LRU for many things. ARC is old, its principles are older. Your competition is not ARC. No amount of white papers with Barney style explanations will improve the situation with system engineering in the post-commercial-unix era, and we will all be using LRU on default Linux installs ten years from now. ... Unless SystemD takes over the page cache, god help us all.
Re: (Score:3)
I one of my algorithm books they propose marking objects and then randomly evicting "old" objects rather than in FIFO order. That helps break the type of situation where the cache is a bit too small and you are reading the data in sequential order (for example, running a large report several times in a row). If you use FIFO you can maximize cache misses in such cases. But if you evict randomly among the oldest items you can get much fewer cache misses.
SIEVE seems to also break this pattern which could expla
Re: (Score:3)
Unless SystemD takes over the page cache, god help us all.
Why only one comparison (Score:3)
The paper shows a somewhat convoluted comparison of the various algorithms running against one dataset relative to FIFO. Why didn't they publish comparisons with the other data sets? And what does "10% of the traces" mean?
Why don't they compare SIEVE to the fastest of the other algorithms directly?
And what does their throughput number actually mean? Is it "objects checked?"
Re: (Score:2)