lru_ng: Mapping with LRU replacement and callback

lru_ng is a Python module based on the original lru module by Amit Dev. It features better support for a threaded environment, greater control over callback execution, and other internal bug-fixes and improvements.

In most situations, drop-in compatibility with lru.LRU can be achieved by the following import statement in Python code:

from lru_ng import LRUDict as LRU

Differences from lru.LRU are described in the Compatibility section.

The code is optimized for fast insertion, deletion, and eviction. The extension class participates in Python’s cycle garbage collection.

Introduction

The module lru_ng provides a single Python type, LRUDict, that behaves like a normal Python dictionary dict by providing key-value mapping.

In addition, its capacity is bounded. When the size bound is reached, inserting a new key will cause the least-recently used key (and its associated value) to be removed (or “evicted”) from the mapping. The LRUDict can be resized, and reducing its size will remove keys in the order from the least- to the most-recent use.

Optionally, a callback can be registered for displaced items. The callback is applied to the (key, value) pair, and can be used for a variety of purposes such as freeing up resources, printing information, or queueing up the discarded items for reuse, among others.

Installation

The latest version’s source is available from PyPI, and you can install it using the pip command:

pip install lru_ng

The pip utility will download the C source and try to build a binary distribution. For this to work, you need CPython >= 3.6. The following are required for compilation:

  • A C compiler that supports C99 syntax

  • Python development headers

  • setuptools, although distutils included with Python should work, too.

First example

An LRUDict object is created by specifying its size bound. In this example we begin with a capacity of 3 items.

>>> from lru_ng import LRUDict
>>> cache = LRUDict(3)

The object cache, in many ways, will behave just like a Python dict. It can be indexed, and you can test for membership using the in keyword.

>>> cache["system"] = "programming"
>>> cache["web"] = "development"
>>> "technology" in cache
False
>>> cache["documentation"]
Traceback (most recent call last):
  ...
KeyError: 'documentation'

By incurring a KeyError exception, we have caused a cache miss, which is recorded.

>>> # For Python >= 3.8
>>> "hits: {0.hits}; misses: {0.misses}".format(cache.get_stats())
'hits: 0; misses: 1'
>>> # For Python < 3.8, use
>>> "hits: {}; misses: {}".format(*cache.get_stats())
'hits: 0; misses: 1'

Successful retrieval of a value by key increases the “hits” record by one:

>>> word = cache["system"]
>>> hits, misses = cache.get_stats()
>>> hits
1

When more keys are inserted and the capacity reached, items are displaced by starting from the least-recently used.

>>> for phrase in ("documentation writing", "software testing"):
...     topic, action = phrase.split(" ")
...     cache[topic] = action
>>> print(cache)
<LRUDict(3) object with dict {'system': 'programming', 'documentation': 'writing', 'software': 'testing'} at 0x...>
>>> len(cache)
3

The item for "web development" has been removed from cache: it was the first to be discarded because "web" was the least-recently used key.

Using a callback

A callback is a callable object with two arguments for (key, value). If it is set, it will apply to displaced (or “evicted”) key-value pairs. The callback can be accessed or set via the LRUDict.callback property.

>>> def func(key, value):
...     print(f"discarded: ({key}, {value})")
...     print("total length: ", len(key) + len(value))
>>> cache.callback = func
>>> cache["quality"] = "assurance"
discarded: (system, programming)
total length:  17

This is a rather trivial illustration. A more useful situation for the callback would be, for example, where the values are objects that should be cleaned up before reaching the end of lifecycle. If the values stored are file-like objects, closing them would be a suitable task for a callback.

The callback can also be specified during the instance initialization as an optional parameter:

>>> cache = LRUDict(size, callback=func)

Caveats with callbacks

Although the callback may execute arbitrary code, certain restrictions are imposed on its action. Most notably:

Warning

  • The return value of the callback is discarded.

  • Any exception raised in the callback cannot be handled the usual way.

The latter means that, if the callback raises an exception, the exception is passed to the “unraisable handler” sys.unraisablehook, which may be further customized. Python’s default unraisable handler prints a message with the traceback to sys.stderr.
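
For example, the hook can be customized to log such exceptions in a structured way. A minimal sketch, assuming Python >= 3.8 (the handler name log_unraisable is chosen for illustration):

import sys

def log_unraisable(unraisable):
    # Called for exceptions that cannot propagate normally, e.g. those
    # raised inside an LRUDict callback.
    print(f"suppressed: {unraisable.exc_value!r} from {unraisable.object!r}",
          file=sys.stderr)

sys.unraisablehook = log_unraisable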

Note

It is strongly suggested that a callback should only

  • do what is minimally necessary,

  • avoid lengthy computation or substantial interaction with an external system, and

  • especially avoid modifying the state of the LRUDict object itself.

Because of the loss of return value and suppression of exceptions, it has limited capability to interact with other parts of the programme.

The most significant reason why it works like this is that any method call that “adds something” to the LRUDict object may trigger the eviction-handling callback. The caller may not be well-equipped to handle any exception raised; indeed, we cannot assume that the caller ever knows or should know what, if any, is being evicted.

Also, it is rather likely that the evicted object will go beyond the reach of normal code if no other references to it are held. The callback in this way shares some similarity with the __del__() method (or “finalizer”). However, unlike __del__(), the callback is not bound to the object about to be discarded.

API Reference

Module

The module lru_ng exposes the following class, LRUDict. It can be imported as

from lru_ng import LRUDict

Exception

exception lru_ng.LRUDictBusyError

Raised by any LRUDict method when the method call cannot begin because another call has not finished processing. This is a subclass of RuntimeError.

This is typically a result of unsynchronized access in threaded code (see the section Thread safety). In single-thread code, it may arise because a key’s __hash__() or __eq__() method transfers control back into the LRUDict while it is already in the middle of a method call.

This exception will not be raised if LRUDict._detect_conflict is set to False.
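
Where contention is expected, one pattern is to catch this exception and retry. A minimal sketch (the helper put_with_retry is hypothetical):

import time
from lru_ng import LRUDictBusyError

def put_with_retry(cache, key, value, retries=3, delay=0.001):
    # Retry an insertion that may transiently fail under
    # unsynchronized concurrent access.
    for _ in range(retries):
        try:
            cache[key] = value
            return True
        except LRUDictBusyError:
            time.sleep(delay)
    return False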

The LRUDict object

class lru_ng.LRUDict(size: int, callback: Optional[Callable] = None)

Initialize an LRUDict object.

Parameters
  • size (int) – Size bound or “capacity” of the LRUDict object. Must be a positive integer that is no greater than sys.maxsize.

  • callback (callable or None) – Callback object to be applied to displaced or “evicted” key-value pairs.


Properties

property LRUDict.size

Get or set the size bound. This property can be set (i.e. assigned to) in order to resize the bound after initialization. If the size is reduced below the current length (i.e. the number of items stored), evictions will be triggered, from the least- to the most-recently used.
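
For instance (an illustrative doctest-style sketch):

>>> L = LRUDict(3)
>>> for i in range(3):
...     L[i] = str(i)
>>> L.size = 1    # shrinking evicts the two least-recently used items
>>> L.items()
[(2, '2')]
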
property LRUDict.callback

Get or set the callback. This property can be set (i.e. assigned to) in order to change the callback after initialization. The value None indicates that no callback is set, and setting it to None disables the callback upon eviction.

Raises
  • TypeError – if setting the callback to a non-callable object.

  • AttributeError – if attempting to delete the property.

Special methods for the mapping protocol

LRUDict.__getitem__(self, key, /) → Any

Access the value associated with the key. Implement the indexing syntax L[key]. On success, increment the hits counter; on failure, increment the misses counter.

Parameters

key (hashable) – Hashable object.

Returns

The value associated with key.

Raises

KeyError – if key is not found.

LRUDict.__setitem__(self, key, value, /) → None

Assign the value associated with the key. Implement the index-assignment syntax L[key] = value. Do not modify the hit/miss counter.

Parameters
  • key (hashable) – Hashable object.

  • value – Object as value to be associated with key.

Returns

None.

LRUDict.__delitem__(self, key, /) → None

Delete the key and its associated value. Implement the del L[key] syntax. Do not modify the hit/miss counter.

Parameters

key (hashable) – Hashable object.

Returns

None.

Raises

KeyError – if key is not found.

LRUDict.__contains__(self, key, /) → bool

Test key membership. Implement the key in L or key not in L syntax.

Parameters

key (hashable) – Hashable object.

Returns

True or False.

LRUDict.__len__(self, /) → int

Support for the builtin len() function.

Returns

The length of the LRUDict object (number of items stored).

Methods emulating dict

LRUDict.clear(self, /) → None

Remove all items from the LRUDict object and reset the hits/misses counter.

Returns

None.

LRUDict.get(self, key, default=None, /) → Any

If key is in the LRUDict, return the value associated with key and increment the “hits” counter. Otherwise, return the value of default and increment the “misses” counter.

LRUDict.setdefault(self, key, default=None, /) → Any

If key is in the LRUDict, return the value associated with key and increment the “hits” counter (just like the get() method). Otherwise, insert key with the value default and return default.

Note

Like Python’s dict.setdefault(), the hash function for key is evaluated only once.

LRUDict.pop(self, key[, default], /) → Any

If key is in the LRUDict, return its associated value, remove the (key, value) pair, and increment the “hits” counter. If key is not in the object, return default if it is given as an argument; if default is not given, raise KeyError. Whether or not default is given, the “misses” counter is incremented as long as key is not in the LRUDict.

Returns

The value associated with key, or default.

Raises

KeyError – if key is not stored but default is not given.

LRUDict.popitem(self, least_recent: bool = False, /) → Tuple[object, object]

Return a (key, value) pair (a two-element tuple) and remove it from the storage. If the LRUDict object is empty (hence there is no item to pop), raise KeyError.

By default, the item popped is the most-recently used (MRU) one. If the optional parameter least_recent is True, the least-recently used (LRU) one is returned instead.

This method does not modify the hits/misses counters.

Parameters

least_recent (bool) – Boolean flag indicating the order of popping. Default: False (popping the MRU)

Returns

(key, value) pair.

Note

The optional argument least_recent is specific to LRUDict and not present for dict. It bears some similarity to Python’s collections.OrderedDict.popitem(), and the default order (LIFO) is the same despite different terminologies.
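
An illustrative doctest-style sketch (note that least_recent is positional-only, per the signature above):

>>> L = LRUDict(2)
>>> L["a"] = 1
>>> L["b"] = 2
>>> L.popitem()       # by default, the MRU item
('b', 2)
>>> L.popitem(True)   # least_recent=True: the LRU item
('a', 1)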

LRUDict.update(self[, other], /, **kwargs) → None

Update self with the key-value pairs from other, a Python dict, if the argument is present. If keyword arguments are also given, further update self using them. This method may cause evictions if the update would grow the length of self beyond the size limit. The update is performed in the iteration order of other (which is the same as its key-insertion order), and then in the order of the keyword arguments as they are specified in the method call, if any.

Warning

This method may make multiple passes into the critical section in order to consume the sources, and while self is being updated the intermediate list of evicted items may grow (bounded by the difference between the source length and the size of self). It is preferable that no other thread modify other or self while self is being updated.
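
An illustrative sketch of eviction during update:

>>> L = LRUDict(2)
>>> L.update({"a": 1, "b": 2, "c": 3})   # "a" is evicted as "c" comes in
>>> L.keys()                             # MRU-to-LRU order
['c', 'b']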

LRUDict.has_key(self, key, /) → bool

Deprecated. Use key in L, i.e. __contains__(), instead.

The following methods are modelled after their counterparts of dict, but instead of returning a dynamic view object, each returns a list of shallow copies that reflects the state of the LRUDict object at the time of the call. The items are returned in MRU-to-LRU order. Accessing the items in the returned list will not modify the ordering of items in the original LRUDict object.

LRUDict.keys(self, /) → List
LRUDict.values(self, /) → List
LRUDict.items(self, /) → List[Tuple[object, object]]

Methods specific to LRUDict

LRUDict.to_dict(self, /) → Dict

Return a new dictionary, other, whose keys and values are shallow copies of self’s. other’s iteration order (i.e. key-insertion order) is the same as self’s recent-use (i.e. LRU-to-MRU) order.

The method can be loosely considered an inverse of update() when evictions are ignored. In other words, if L in the following snippet is an LRUDict object, L_dup will be its duplicate with the same recent-use order:

d = L.to_dict()
L_dup = LRUDict(len(d))
L_dup.update(d)

This can be used to dump a picklable copy if the keys and values are picklable. The pickle can be loaded into a dictionary and be used to update an empty LRUDict object with sufficient size to restore the original content and order.
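
A minimal sketch of such a pickle round-trip, assuming the keys and values are picklable:

import pickle
from lru_ng import LRUDict

L = LRUDict(100)
L["alpha"] = 1
L["beta"] = 2

blob = pickle.dumps(L.to_dict())     # snapshot in LRU-to-MRU order

restored = LRUDict(100)              # large enough to avoid evictions
restored.update(pickle.loads(blob))  # replay restores content and use-order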

Returns

New dictionary.

LRUDict.peek_first_item(self, /) → Tuple[object, object]

Return a tuple (key, value) of the most-recently used (“first”) item, or raise KeyError if the LRUDict is empty. The hits/misses counters are not modified.

Returns

(key, value) pair.

Raises

KeyError – if the LRUDict is empty.

LRUDict.peek_last_item(self, /) → Tuple[object, object]

Return a tuple (key, value) of the least-recently used (“last”) item, or raise KeyError if the LRUDict is empty. The hits/misses counters are not modified.

Returns

(key, value) pair.

Raises

KeyError – if the LRUDict is empty.

LRUDict.get_stats(self, /) → Tuple[int, int]

Return a tuple of integers, (hits, misses), reporting the LRUDict’s hit/miss counts.

Note

On CPython >= 3.8, the specific type of the return value (named LRUDictStats) is a built-in “struct sequence” (a subclass of tuple), similar to the named-tuple classes created by Python’s collections.namedtuple(). As such, it also supports accessing the fields by the attributes .hits and .misses respectively.

Warning

The numerical values are stored as C unsigned long internally and may wrap around to zero on overflow, although this seems unlikely.

Other special methods

LRUDict.__repr__(self, /) → str

Return a textual representation of the LRUDict object. The output includes a preview of the stored keys and values if the overall line is not too long, but the display order of the keys should not be presumed to be meaningful.

Less-common and experimental methods

LRUDict.purge(self, /) → int

Purge internally-buffered evicted items manually.

As an implementation detail, evicted items are temporarily buffered if a callback is set at the time the items are evicted. Purging refers to clearing the buffer (applying the callback along the way, if one is set). Normally this happens without intervention, but in threaded environments there is no absolute guarantee that the buffer is always cleared promptly. This method can be used to purge manually. As noted in Caveats with callbacks, the purge ignores the return value of the callback and suppresses exceptions raised within it.

Returns

Number of items purged, or -1 in the case of error.

property LRUDict._suspend_purge

Set to True to disable purging altogether. Set to False to re-enable it (this does not immediately trigger a purge).

property LRUDict._purge_queue_size

Peek the length of the purge buffer. The value is indicative only and may not be accurate in a threaded environment.
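
An illustrative session combining these experimental knobs (the queue length shown is indicative only):

>>> L = LRUDict(1, callback=lambda k, v: print("evicted:", k))
>>> L._suspend_purge = True
>>> L["x"] = 1
>>> L["y"] = 2            # "x" is evicted, but the callback is deferred
>>> L._purge_queue_size
1
>>> L._suspend_purge = False
>>> L.purge()             # apply the callback to the buffered item
evicted: x
1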

property LRUDict._max_pending_callbacks

The maximum number of callbacks allowed to be pending on the purge queue. Must be an integer between 1 and C USHRT_MAX (typically 65535).

property LRUDict._detect_conflict

Set to False to disable runtime conflict detection. By doing so, no method will raise LRUDictBusyError.

Obsolete methods

LRUDict.get_size(self, /) → int
LRUDict.set_size(self, size: int, /) → None

Deprecated. Use the property size instead.

LRUDict.set_callback(self, func: Callable, /) → None

Deprecated. Use the property callback instead.

Hits and misses

This is a summary of what constitutes a “hit” or a “miss”; the details are scattered throughout the API Reference.

Note

In general,

  • a hit refers to the successful retrieval of a value associated with a given key from the LRUDict object itself, and

  • a miss refers to one such attempt that fails.

An “insert”-like method typically neither hits nor misses, and neither does deleting a key-value pair (no matter whether the key exists). For instance, the index-assignment statement

L["watermelon"] = "green"

neither hits nor misses no matter what. The del-index statement

del L["lemon"]

does not affect hit/miss stats either, even if it may fail and raise KeyError.

Somewhat special is the setdefault() method, which may either “retrieve” from the LRUDict – in the case that the given key is “in”; or “insert” into it – in the case that the key is not yet in it. The former counts as a hit, because the value associated with the given key is indeed obtained from the LRUDict. However, the latter is not a miss, because it is an insert-only operation: no attempt has been made to retrieve the value associated with the given key.

Merely testing for membership using the in/not in operator neither hits nor misses.

A miss is not always associated with a KeyError. An example is the get() method. The statement

colour = L.get("apple", "red")

incurs a miss if the key "apple" is not in L, and it will not raise KeyError. The substitute value "red", though assigned to the variable colour, is missed by L.

The methods popitem(), peek_first_item(), and peek_last_item() neither hit nor miss, because they do not accept a key.

The tallying of hits/misses is orthogonal to other side effects. For example, pop() is a “destructive” way to retrieve a value. It can hit or miss, and in the case of a hit the item is removed from the LRUDict. The fact that a hit is scored does not imply that the item is “still there”.

It is safe to say that a key, if hit and extant, will always be promoted to the most-recently used one, if it is not so already. The converse is not true: assigning a value to an existing key will promote the key to the most recent, but it is not considered a hit.
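
The rules above can be condensed into a short doctest-style sketch (illustrative):

>>> L = LRUDict(2)
>>> L["k"] = "v"            # insertion: neither a hit nor a miss
>>> tuple(L.get_stats())
(0, 0)
>>> L["k"]                  # successful retrieval: a hit
'v'
>>> L.setdefault("n", 0)    # insert-only: neither a hit nor a miss
0
>>> "n" in L                # membership test: no effect on the counters
True
>>> tuple(L.get_stats())
(1, 0)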

Compatibility

Being a fork of the lru module, this module is designed with compatibility in mind. Most of the time, drop-in replacement can be achieved by the import statement

from lru_ng import LRUDict as LRU

However, this module is also intended for newer CPython applications following newer conventions and standards, which can diverge from the inherited code (hence the “ng” in the module name, for “new generation”). I attempt to provide a list of differences from lru == 1.1.7 here.

Differences from lru.LRU 1.1.7

  • The name of the class has changed to LRUDict (because the acronym “LRU” is better reserved as an adjective or a description of the particular replacement policy).

  • Python 2.7 support is removed. Instead, CPython 3.6+ is now required.

  • The get_size(), set_size(), and set_callback() methods are obsoleted. To resize or change the callback, use the properties size and callback respectively.

  • The peek_first_item() and peek_last_item() methods raise KeyError for an empty LRUDict, rather than return None.

  • The popitem() method by default pops in the order from the most recent to the least recent. This is the inverse of the old default behaviour.

  • The string returned by repr() can now be distinguished from that of a plain dict.

In addition, there are also some significant internal changes that affect behaviours in more subtle ways.

  • The methods that take a “key” parameter evaluate its hash only once, like the Python dict. This results in a performance gain.

  • Python garbage collection (reclaiming objects forming reference cycles that are otherwise unreachable) is supported (see also Memory usage).

  • The LRUDict instance is unhashable.

  • The callback is now executed outside the critical section of the code (where internal changes to the data structures are taking place). Newly evicted items are buffered, and a purge() method can optionally ensure that the buffer is cleared, although the code takes care of purging normally. This change reduces the likelihood that a misbehaving callback may crash the Python process.

  • Similarly, the __del__() of the key or value cannot be normally triggered inside the critical section.

  • The callback’s exception is suppressed (more details in Caveats with callbacks).

  • Assuming the protection of the global interpreter lock, the methods now have limited protection of the critical section. If for some exotic reason the __hash__() or __eq__() special methods of the key attempt to modify the LRUDict object inside a method call, the code will attempt to detect this and raise an LRUDictBusyError exception.

  • Overall behaviour in Python threads or other execution units (such as coroutines or greenlets) is now more predictable and safer. Even if a callback releases the GIL, the internal state remains consistent. There is limited ability to detect conflicting access at runtime; see Thread safety for more.

Comparison with Python dict or OrderedDict

LRUDict attempts to emulate Python’s built-in dict in API and behaviour (when sensible). However, the dynamic view objects returned by dict’s keys(), values(), and items() are currently not supported; the LRUDict methods of the same names return lists instead. The reason is that the internal ordering of keys is volatile, and an iterator honouring the ordering is very easily invalidated, which limits its use.

A dict maintains key-insertion order (since CPython 3.6+), which is not the same as key-use order in the LRU sense: a hit promotes the key to the highest priority but does not necessarily change the insertion order (unless it is removed and inserted again).

Python offers another mapping type, collections.OrderedDict, which bears some similarity to LRUDict. The major difference is that LRUDict is focused on bounded size and the eviction callback, while OrderedDict is a more flexible mapping type.

Comparison with Python functools.lru_cache()

Python provides a native and optimized functools.lru_cache() implementation, which is primarily used as a decorator for memoization (remembering return values for inputs that have been evaluated before).

This module, however, provides a container/mapping class, and a memoizing LRU decorator can be built on top of it (although not as fast as the native one, because the latter implements more optimizations and also avoids a lot of overhead of handling Python calls or exceptions). Instead, this module focuses on dict-compatibility and the ability to set a callback.

Also, unlike functools.lru_cache(), there is no default size limit at initialization: a value must be provided by the user. Unbounded size is not supported, either: the size bound must be explicit (although it is ultimately hard-limited by the platform’s sys.maxsize).
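
As a sketch of the kind of memoizing decorator one could build on top of LRUDict (the name lru_memoize is hypothetical, and only hashable positional arguments are handled):

import functools
from lru_ng import LRUDict

def lru_memoize(maxsize):
    def decorator(func):
        cache = LRUDict(maxsize)
        @functools.wraps(func)
        def wrapper(*args):
            try:
                return cache[args]     # a hit also promotes the key
            except KeyError:
                result = func(*args)   # a miss: compute and store
                cache[args] = result
                return result
        return wrapper
    return decorator

@lru_memoize(128)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)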

Thread safety

The module relies on the global interpreter lock (GIL) to maintain the consistency of internal data structures. Internally, the building blocks are mostly normal Python objects, although they’re accessed and modified via the C API. The GIL is expected to be held by the method caller, so that accesses are ordered as a result of the GIL’s functioning. When a thread is using a method, it can be certain that no other thread is modifying the data at the same time.

However, there are four places where the protection is not perfect. These are the callback, the key’s __hash__() and __eq__() methods (used by the internal dict), and the __del__() method of the key or value. In the notes on compatibility we have described the precautions concerning these code paths, which can in principle do anything.

Here, we should mention the qualities of these callable objects that make the programme safer overall. We will refer to them as “external code”.

First, avoid releasing the GIL in the external code. The GIL is released, for example, when input/output (I/O) is performed. Some code may drop the GIL when computing the hash by a C extension on an internal buffer for speed. Even if __hash__() may be made to execute before entering the critical section (relying on not-so-public Python C API), __eq__() currently cannot be. When a thread-switch happens in the middle of a method call, another thread may try to call into a method, too, causing contention. There’s limited built-in ability to detect this at run-time, but currently,

Warning

No additional locking is performed,

because locking tends to degrade single-thread performance and necessarily makes the code more complex. Decisions about contention-handling (such as blocked waiting, timed waiting, or other tactics) are best made by client code.

Second, avoid unwanted side effects in the external code. Modifying the LRUDict object itself is one such effect. Again, this can be detected at runtime, but it is not preferable to rely on this.

Overall, callbacks and __del__() are mostly safe. However, the timing and order of the callback execution cannot be guaranteed in general in a threaded environment.

Summary

Note

  1. As long as the key’s __hash__() and __eq__() do not drop the GIL or modify the LRUDict, usage in a threaded environment is generally safe.

  2. The callback and the __del__() method of keys or values are generally safe, but it is not preferable to modify the LRUDict in there or to clog up the purge queue with long computations.

  3. Limited ability to detect contention at runtime is provided (see LRUDictBusyError), and the user can implement their own synchronization.

  4. For single-thread programmes it’s “life as usual”, and the GIL provides adequate protection.

Example: contention

In the following Python script, we simulate a (highly contrived) situation where individual threads simply try to insert keys into a cache without synchronization. The GIL is dropped in the __hash__ and __eq__ methods of the key by taking some time to read from a file.

run_insertion.py – simulate contention with key insertions
import threading
from lru_ng import LRUDict


class Key:
    def __init__(self, value, read_blocks=128):
        self._v = value
        self.nbytes = read_blocks * 512

    def __hash__(self):
        h = hash(self._v)
        remaining = self.nbytes
        with open("/dev/urandom", "rb") as f:
            while remaining > 0:
                remaining -= len(f.read(remaining))  # Release GIL
        return h

    def __eq__(self, other):
        try:
            v = other._v
        except AttributeError:
            v = other
        remaining = self.nbytes
        with open("/dev/urandom", "rb") as f:
            while remaining > 0:
                remaining -= len(f.read(remaining))  # Release GIL
        return self._v == v


cache = LRUDict(5)


def runner(n):
    for i in range(n):
        k = Key(i)
        cache[k] = i  # Insertion without synchronization


num_threads = 10
num_keys = 1000
threads = [threading.Thread(target=runner, args=(num_keys,))
           for i in range(num_threads)]
for th in threads:
    th.start()
for th in threads:
    th.join()

The program will output error messages like this:

Traceback (most recent call last):
...
File "run_insertion.py", line 36, in runner
  cache[k] = i  # Insertion without synchronization
lru_ng.LRUDictBusyError: attempted entry into LRUDict critical section while busy

(The actual error messages from all the threads may appear out of order, too!) The threads fail because of the unhandled exception.

If the __hash__ or __eq__ methods had not released the GIL, these errors would have been prevented.

By adding a lock around the insertion operation, we can make it “work” in the sense of ensuring sequential access to the global cache. Replacing the definition of runner() with the following snippet should make the error messages disappear:

cond = threading.Condition()
def runner(n):
    for i in range(n):
        k = Key(i)
        with cond:         # Wait on the underlying lock
            cache[k] = i
            cond.notify()  # Wake up one thread waiting on the lock

Re-entrancy

“Re-entrancy” is a concept related to but distinct from “thread-safety”. In our context it means the ability to call into the code for LRUDict while another such call has not returned. The pending call’s caller is not necessarily another thread: it can be a different part of the same thread’s code, for example a coroutine that has yielded at some point.

Note

In summary,

  • the callback’s action is “mostly” re-entrant but no more re-entrant than the callback itself, while

  • for the methods, re-entering results in an exception.

Actions of the callback

The callback is an optional callable that is called automatically on each evicted item (the key-value pair removed because the capacity is reached). Although integrated into the workings of the LRUDict object, the callback is beyond its control and subject to caveats. It is generally not up to the inner workings of LRUDict to handle every situation where the callback may misbehave; this is a consequence of Rice’s Theorem.

Nevertheless, a degree of re-entrancy pertaining to the action of callbacks is supported, provided that

  1. every entrance into the body of LRUDict methods is eventually paired with an exit, and that

  2. the callback is otherwise re-entrant in and by itself.

It is up to the programmer to ensure that the callback should behave thus.

Rule 1 means that the callback is allowed to redirect the control flow back into the LRUDict object while it is being called by that object upon item eviction. This is obviously an error-prone situation. It is partially mitigated by Python’s recursion (or more precisely, call-stack) limit, which is understood by LRUDict’s internal routines and propagated whenever possible. For coroutines, if too many of them yield inside the callback before returning, there is still the hard limit of USHRT_MAX currently-pending entrances (typically 65535). If the pending-callback counter is about to saturate, the purge will simply be skipped.

As an illustration, it is very easy to write an “amplification attack” version of the callback, where for each evicted key it inserts more unique keys back in. The rate of amplification can go exponential and beyond:

from lru_ng import LRUDict


def cb(key, value):
    # Each invocation inserts more keys than the last, amplifying itself.
    for i in range(cb.version):
        cb.version += 1
        key = "callback-%d" % i
        cb.victim[key] = i   # re-enters the LRUDict, evicting again


r = LRUDict(1, callback=cb)
cb.version = 1
cb.victim = r
r[0] = 0
r[1] = 1

This is not considered a valid way to use the module.

Trade-off with multiple pending callbacks

(WIP)

The LRUDict instance has a private property, _max_pending_callbacks, that can be fine-tuned to allow for greater control over callback execution. This is normally not required, but in certain cases the default (a fairly large value, 8192) may not work well, and a much lower limit may be desirable.

One possible situation is the case where, in a multi-threaded environment, the callback performs GIL release-acquire cycles (typically by doing I/O). If a large number of callbacks are pending at the same time, each failure to acquire the GIL causes the thread to block. Since only one thread can hold the GIL, most threads already executing in the callback will still spend time waiting in the callback, and little progress can be made overall.

In this particular scenario, lowering _max_pending_callbacks helps. If the pending-callback count has already saturated, any new entrance into the “purge” section will not touch the evicted items in the queue, but instead return almost immediately (i.e. progress is made). The queue will eventually be cleared, if necessary by calling purge() explicitly.
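
A minimal sketch of this tuning, assuming an I/O-bound callback in a threaded program (write_to_disk is a hypothetical example callback):

from lru_ng import LRUDict

def write_to_disk(key, value):
    # Hypothetical I/O-bound callback; the writes release the GIL.
    with open(f"/tmp/evicted-{key}", "w") as f:
        f.write(repr(value))

cache = LRUDict(1024, callback=write_to_disk)
cache._max_pending_callbacks = 4   # keep few callbacks pending at once

# ... threaded insertions proceed elsewhere ...

cache.purge()   # optionally sweep leftover buffered items at a quiet moment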

However, there are two major downsides of lowering the _max_pending_callbacks:

  1. The queue will not be purged as aggressively, so sometimes it may be worthwhile to check if there are stuck items.

  2. If the callback behaves like the “amplification attack” example above, it will likely evade the recursion limit, because the calls that could have been indirect recursions are “consumed” when the pending-callback counter saturates.

No. 2 may sound counter-intuitive. The reason is given at the end of this page, for it is not very relevant to “normal”, well-behaved callbacks. Notice that this doesn’t mean a smaller max-callback limit always serves to curb a runaway callback at runtime; it is not difficult to contrive a counterexample.

In summary, these are the pros and cons of using large/small max-callback bounds:

Callback behaviour        | Small bound              | Large bound
------------------------- | ------------------------ | ----------------------
Single-thread, plain      | ✅ No impact             | ✅ No impact
Single-thread, coroutine  | ❌ May miss some purges  | ✅ No extra issue
Multi-thread, I/O bound   | ✅ Better concurrency    | ❌ High GIL contention
Runaway callback          | ❌ All bets are off; situation-dependent

Normal method access

Normally, the maintenance of the element’s use-order means that full re-entrancy can be costly, since every operation may cause a reordering. If, for example, the mere insertion of a key

L[key] = value

causes another call into L’s methods while the insertion is incomplete, then the new call must be able to work on sensible data that may have been partially modified by the previous (but pending) call; in turn, the first call (the key insertion) must understand the possible mutations to the state in the wake of the latter one, and decide on a meaningful way to proceed.

The complexity of supporting this kind of operation means that, currently, we detect this sort of re-entry and raise the LRUDictBusyError exception. It is up to the caller to decide what to do – perhaps retry after some timeout, abandon the operation, or perform some other action instead.

In practice, the kind of “re-entering” prevented this way is almost always caused by a key’s __hash__() or __eq__() redirecting the control flow in such a way that the LRUDict instance is accessed again while already in the process of interacting with the key. This is typically the result of a programming error, but possible causes include GIL-dropping in the implementation of these methods, which may arise from complicated interaction with C extensions.

The benefit to be gained from supporting full re-entrancy (besides raising exceptions) seems minimal. If you know how to achieve this in a cost-efficient manner, please help!

Appendix: why max-callback limit may circumvent stack limit

(Implementation details inside)

The reason is that the call tree looks like this in the “amplification attack” example:

insert
  evict
  purge
    callback
      insert
        .
         .
          .

If the bound on pending callbacks is not hit, the call stack goes ever deeper and eventually hits the stack (recursion) limit.

However, if a callback may be bypassed because too many have been pending, the call tree gets flattened, turning the unbounded recursion into an infinite loop:

insert
  evict
  purge
    callback    [1]
      insert
        evict
        purge   [callback skipped, too many pending]
      insert
      .
      .
      .
      [... inescapable loop without growing deeper ...]

Even if the callback marked as [1] eventually returns by running out of insertion operations in its body, the next call into the callback may cause an even larger number of insertions, because there is an even larger number of enqueued items to call on. Worse, a sufficiently bad callback can always be written to circumvent such restrictions. The only solution is to not use such a callback.

Performance

Speed

In general, the code should perform as fast as lru.LRU for the following operations:

  • Lookup by key (either a hit or miss), and

  • Replace a key’s associated value.

Internally there is more overhead because of some extra checking and loading in order to avoid executing foreign code in a critical section. Such overhead is insignificant, though, as shown by benchmarks, and the benefit outweighs the cost.

The following operations are faster and overall safer (where applicable):

  • Inserting a new item,

  • Deleting an existing key-value pair,

  • Evicting items under insertion load, with or without callback set, and

  • Dict-like methods.

The one scenario that could incur significant slowdown is when

  • a callback is not set, and

  • the evicted key or value’s __del__() method would have presented a risk of executing arbitrary code.

As special cases, for a handful of Python built-in types such as str or bytearray (but not for subclasses), we can be certain that their finalization/deallocation wouldn’t interfere with our normal operation, and they will not cause slowdown.

However, in general, we take extra care to defer potential deallocation despite the overhead, because the safety far outweighs the extra speed. The “slow” code can be observed in benchmarks where evictions are triggered by resizing a very large LRUDict object to a small size without a callback. The cause is the overhead of extra moves ensuring that no __del__() code can be triggered during sensitive internal operations, and that normal method calls do not fail spuriously.

Memory usage

Memory usage is higher because of the internal memoization of key hash values. The overhead per key is the same as the platform’s pointer size (4/8 bytes on 32/64-bit systems); that is, the overhead is O(n) where n is the number of keys.

Some additional memory allocations are made to keep a queue of items facing imminent destruction, but these are usually small and allocated per LRUDict instance, i.e. O(1).

The LRUDict object participates effectively in Python’s garbage collection. Reference cycles are detected by Python’s cyclic garbage collector and broken up when all external references are dropped. For example, the following code

from lru_ng import LRUDict
r = LRUDict(4)
r[0] = r
del r

will not leave a self-referencing, unusable, but “live” object permanently in memory. It will eventually be collected by Python’s garbage collector.
