Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate deterministic memory pool #94

Open
jserv opened this issue Dec 12, 2022 · 10 comments
Open

Integrate deterministic memory pool #94

jserv opened this issue Dec 12, 2022 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

@jserv
Copy link
Contributor

jserv commented Dec 12, 2022

TLSF (Two-Level Segregated Fit) dynamic memory allocation algorithm is guaranteed to complete allocation and deallocation operations in constant time, suitable for real-time applications.

Reference:

@jserv
Copy link
Contributor Author

jserv commented Dec 12, 2022

0001-WIP-Integrate-deterministic-memory-pool.patch.gz implementation serves as proof of concept.

@jserv
Copy link
Contributor Author

jserv commented Jan 5, 2023

Related: #105

@jserv
Copy link
Contributor Author

jserv commented Mar 12, 2023

HTFH-RT-Search-Cache is a real-time search cache build on top of a Hybrid TLSF Fixed Heap allocator with DLIRS mixed-strategy caching.

See also: DLIRS: Improving Low Inter-Reference Recency Set Cache Replacement Policy with Dynamics

Inspired by the idea of dynamic cache space partitioning from another state-of-the-art policy, Adaptive Replacement Cache (ARC), we propose a new Dynamic LIRS (DLIRS) policy. The new policy uses a simple mechanism to perform an approximated online estimation on how well IRR predicts future access behaviors, and then dynamically adapts the space allocation of low IRR blocks against high IRR blocks. Experiments are performed on traces from the UMass Trace Repository as well as a synthetic trace drawn from a stack depth distribution. While sometimes LIRS outperforms ARC with a significant margin and sometimes vice versa, the new DLIRS policy consistently performs close to the winner between ARC and LIRS in all the cases.

@jserv
Copy link
Contributor Author

jserv commented Mar 13, 2023

xvmalloc ever existed in Linux kernel. However, it was removed later.

See also: War of allocators: TLSF in action

@jserv
Copy link
Contributor Author

jserv commented Aug 30, 2023

xvmalloc ever existed in Linux kernel. However, it was removed later.

Recent report on TLSF, xvMalloc, and zsmalloc. See linux2023-report.

@jserv jserv added the enhancement New feature or request label Dec 25, 2023
@jserv
Copy link
Contributor Author

jserv commented Jan 9, 2024

Fast Efficient Fixed-Size Memory Pool: No Loops and No Overhead

In this paper, we examine a ready-to-use, robust, and computationally fast fixed-size memory pool manager with no-loops and no-memory overhead that is highly suited towards time-critical systems such as games. The algorithm achieves this by exploiting the unused memory slots for bookkeeping in combination with a trouble-free indexing scheme. We explain how it works in amalgamation with straightforward step-by-step examples. Furthermore, we compare just how much faster the memory pool manager is when compared with a system allocator (e.g., malloc) over a range of allocations and sizes.

@jserv
Copy link
Contributor Author

jserv commented Mar 2, 2024

The TLSF memory allocation system consists of the following elements:

  • Memory Pool (Heap): This is a continuous memory area (the heap) where data allocated by the user is stored. TLSF also stores the properties of free and allocated slots here.
  • Linked List Head Pointer Matrix: This matrix stores the entry point of each linked list referencing the free slots in the memory pool. The linked lists group together slots of similar sizes.
  • First Level Bitmap: The first level of indexing is stored in the form of a 16-bit word. Each bit indicates the existence of at least one linked list in the second level of indexing. Classification is done by powers of 2 based on the size of the slot.
  • Second Level Bitmaps: The second level of indexing uses a 16-bit word for each first level of indexing. Each bit indicates the existence in the linked list head matrix of a linked list containing free memory slots. Classification is done linearly on the size of the slot within a range defined by the first level of indexing.

0-tlsf

When a free slot is created or deleted in the memory pool, it is necessary to update the indexing data.

Here is an example of how a slot size of 781 bytes is broken down according to the first (fl) and second level (sl) indexes:

0000001 1000 01101 = x030D = 781 octets  
_______ ____
  fl=9  sl=8
(MSB position) (value)

The first-level indexing (fl) is done by classifying the size of the slot (minimum 4) by powers of 2 in 13 levels, from fl=3 to fl=15.

The table below presents the slot sizes indexed by each first-level (fl) value.
0-tlsf-1

Second-level indexing (sl) is carried out by linearly classifying the size of the slot into 16 levels for each first level (fl).

Example: For a level fl=5, all the values indexed by the second level range from 2^5 to 2^6-1, or 32 to 63. There are 16 possible levels for sl, so 32/16=2 values for each sl level. The values are thus:

  • fl:5, sl:0 [32,33]
  • fl:5, sl:1 [34,35] ...
  • fl:5, sl:15 [62,63]

The table below represents each possible fl/sl pair. For each pair, a start of a linked list is stored. This linked list only contains free slots whose minimum size is indicated in the table (the maximum size is not shown for clarity; it corresponds to the next cell's value minus 1).
Note: Special cases where exact indexing occurs (no range of values) are shown in green.
0-tlsf-2

Reference: Arena Allocation

@jserv
Copy link
Contributor Author

jserv commented Dec 27, 2024

RT-Mimalloc: A New Look at Dynamic Memory Allocation for Real-Time Systems

Dynamic memory allocation is a pivotal feature of modern software systems but has mostly been scarcely used in
real-time systems due to the limited time-predictability offered by dynamic memory allocators (DynMAs). ... After analyzing and comparing modern DynMAs, we discuss how to modify the Mimalloc general purpose DynMA into RT-Mimalloc, so that more predictable allocation times can be obtained. All the studies and evaluations performed in this work were based on both modern state-of-the-art benchmarks for memory allocation and synthetic workload to assess specific capabilities of the tested DynMAs.

@henrybear327
Copy link
Collaborator

Related: #535

@henrybear327
Copy link
Collaborator

RT-Mimalloc: A New Look at Dynamic Memory Allocation for Real-Time Systems

Dynamic memory allocation is a pivotal feature of modern software systems but has mostly been scarcely used in
real-time systems due to the limited time-predictability offered by dynamic memory allocators (DynMAs). ... After analyzing and comparing modern DynMAs, we discuss how to modify the Mimalloc general purpose DynMA into RT-Mimalloc, so that more predictable allocation times can be obtained. All the studies and evaluations performed in this work were based on both modern state-of-the-art benchmarks for memory allocation and synthetic workload to assess specific capabilities of the tested DynMAs.

The main idea from the paper is that if we identify the slow path by checking the instructions executed per function during runtime, we can identify the slow paths. We can then try to modify the allocator in such a way that the longest observed allocation time (LOAT) is on par with TLSF.

The proposed implementations leverage profile-driven initialization and make certain slow paths constant time.

It's shown that modified mimalloc (called RT-mimalloc) can achieve a similar LOAT compared to TLSF and have a lower average allocation / de-allocation execution time, also able to support multi-threaded execution due to the lockless design, and have a lower fragmentation rate. However, RT-mimalloc exhibits higher peak memory usage compared to TLSF.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants