1. Introduction

The module lru_ng provides a single Python type, LRUDict that behaves like a normal Python dictionary dict by providing key-value mapping.

In addition, its capacity is bounded. When the size bound is reached, inserting a new key will cause the least-recently used key (and its associated value) to be removed (or “evicted”) from the mapping. The LRUDict can be resized, and reducing its size will remove keys in the order from the least- to the most-recent use.

Optionally, a callback can be registered for displaced items. The callback is applied to the (key, value) pair, and can be used for a variety of purposes such as freeing up resources, printing information, or queueing up the discarded items for reuse, among others.

1.1. Installation

The latest version’s source is available from the PyPI, and you can install it using the pip command:

pip install lru_ng

The pip utility will download the C source and try to build a binary distribution. For this to work, you need CPython >= 3.6. The following are required for compilation:

  • A C compiler that supports C99 syntax

  • Python development headers

  • setuptools, although distutils included with Python should work, too.

1.2. First example

An LRUDict object is created by specifying its fixed size bound. In this example we begin with a size of 4 items.

>>> from lru_ng import LRUDict
>>> cache = LRUDict(3)

The object cache, in many ways, will behave just like a Python dict. It can be indexed, and you can test for membership using the in keyword.

>>> cache["system"] = "programming"
>>> cache["web"] = "development"
>>> "technology" in cache
False
>>> cache["documentation"]
Traceback (most recent call last):
  ...
KeyError: 'documentation'

By incurring a KeyError exception, we have caused a cache miss, which is recorded.

>>> # For Python >= 3.8
>>> "hits: {0.hits}; misses: {0.misses}".format(cache.get_stats())
'hits: 0; misses: 1'
>>> # For Python < 3.8, use
>>> "hits: {}; misses: {}".format(*cache.get_stats())
'hits: 0; misses: 1'

Successful retrieval of a value by key increases the “hits” record by one:

>>> word = cache["system"]
>>> hits, misses = cache.get_stats()
>>> hits
1

When more keys are inserted and the capacity reached, items are displaced by starting from the least-recently used.

>>> for phrase in ("documentation writing", "software testing"):
...     topic, action = phrase.split(" ")
...     cache[topic] = action
>>> print(cache)
<LRUDict(3) object with dict {'system': 'programming', 'documentation': 'writing', 'software': 'testing'} at 0x...>
>>> len(cache)
3

The item for "web development" has been removed from cache. The item is the first to be discarded because "web" is the least recently used key.

1.3. Using a callback

A callback is a callable object with two arguments for (key, value). If it is set, it will apply to displaced (or “evicted”) key-value pairs. The callback can be accessed or set via the LRUDict.callback property.

>>> def func(key, value):
...     print(f"discarded: ({key}, {value})")
...     print("total length: ", len(key) + len(value))
>>> cache.callback = func
>>> cache["quality"] = "assurance"
discarded: (system, programming)
total length:  17

This is a rather trivial illustration. A more useful situation for the callback would be, for example, where the values are objects that should be cleaned up before reaching the end of lifecycle. If the values stored are file-like objects, closing them would be a suitable task for a callback.

The callback can also be specified during the instance initialization as an optional parameter:

>>> cache = LRUDict(size, callback=func)

1.4. Caveats with callbacks

Although the callback may execute arbitrary code, certain restrictions are imposed on its action. Most notably:

Warning

  • The return value of the callback is discarded.

  • Any exception raised in the callback cannot be handled the usual way.

What the latter means is that, if the callback would have raised an exception, the exception is passed to the “unraisable handler” sys.unraisablehook which may be further customized. Python’s default unraisable handler will print a message with the traceback to sys.stderr.

Note

It is strongly suggested that a callback should only

  • do what is minimally necessary,

  • avoid lengthy computation or substantial interaction with an external system, and

  • especially avoid modifying the state of the LRUDict object itself.

Because of the loss of return value and suppression of exceptions, it has limited capability to interact with other parts of the programme.

The most significant reason why it works like this is that any method call that “adds something” to the LRUDict object may trigger the eviction-handling callback. The caller may not be well-equipped to handle any exception raised; indeed, we cannot assume that the caller ever knows or should know what, if any, is being evicted.

Also, it is rather likely that the evicted object will go beyond the reach of normal code if no other references to it are held. The callback in this way shares some similarity with the __del__() method (or “finalizer”). However, unlike __del__(), the callback is not bound to the object about to be discarded.