GitHub - nfergu/memalot: Finds memory leaks in Python programs

Memalot finds memory leaks in Python programs.

Memalot prints suspected leaks to the console by default, and also has a CLI and an MCP server for analyzing memory leaks.

For example, here is a Python program that creates a string object every half-second and stores these in a list:

from time import sleep
import memalot

memalot.start_leak_monitoring(max_object_lifetime=1.0)

def my_function():
    my_list = []
    for i in range(100000):
        my_list.append(f"Object {i}")
        sleep(0.5)

my_function()

In this example, the memalot.start_leak_monitoring(max_object_lifetime=1.0) line tells Memalot to find objects that have lived for longer than one second and identify them as potential leaks. After a short delay, Memalot prints a leak report to the console.

In that report, Memalot identifies that some string objects are leaking and prints details about the first object, including its referrers (the references to the object that are keeping it alive), its size, and its string representation.

Note: Memalot may slow down your program, so be wary of using it in a production system. Additionally, Memalot can use a lot of memory itself (but it should not leak memory!) so make sure you have plenty of RAM available.

Installation

Install using pip:

pip3 install memalot

Getting Started

Memalot can identify suspected memory leaks in one of these ways:

Time-based Leak Discovery. Identifies objects that have lived for more than a certain amount of time without being garbage collected. This is most suitable for multithreaded programs and for web servers and other programs that process short-lived requests.
Function-based Leak Discovery. Identifies objects that have been created while a specific function is being called, but have not yet been garbage collected. This is most suitable for single-threaded batch processing systems or other long-lived jobs.

Time-based Leak Discovery

To get started with time-based leak discovery, call this code after your Python program starts:

import memalot

memalot.start_leak_monitoring(max_object_lifetime=60.0)

This will periodically print out potential memory leaks to the console. An object is considered a potential leak if it lives for more than max_object_lifetime seconds (in this case, 60 seconds).

By default, Memalot has a warm-up period equal to max_object_lifetime seconds. Objects created during the warm-up period will not be identified as leaks. You can control the warm-up period using the warmup_period parameter.
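The interaction between max_object_lifetime and the warm-up period can be pictured with a small model. The sketch below is not Memalot's implementation, which is not described here. LifetimeTracker, track, suspects and Payload are hypothetical names, and timestamps (seconds since monitoring started) are passed in explicitly so that the example is deterministic.

```python
import gc
import weakref


class LifetimeTracker:
    """Toy model of time-based leak discovery; not Memalot's implementation."""

    def __init__(self, max_object_lifetime, warmup_period=None):
        self.max_object_lifetime = max_object_lifetime
        # Mirror the documented default: the warm-up period equals
        # max_object_lifetime unless another value is given.
        if warmup_period is None:
            warmup_period = max_object_lifetime
        self.warmup_period = warmup_period
        self._tracked = []  # (weak reference, creation time) pairs

    def track(self, obj, now):
        # Objects created during the warm-up period are never reported.
        if now < self.warmup_period:
            return
        self._tracked.append((weakref.ref(obj), now))

    def suspects(self, now):
        # An object is a suspect if it is still alive and has lived for
        # longer than max_object_lifetime seconds.
        found = []
        for ref, created in self._tracked:
            obj = ref()
            if obj is not None and now - created > self.max_object_lifetime:
                found.append(obj)
        return found


class Payload:
    """Plain class, used because its instances support weak references."""


tracker = LifetimeTracker(max_object_lifetime=60.0)

early = Payload()
tracker.track(early, now=10.0)    # created during warm-up: ignored

leaked = Payload()
tracker.track(leaked, now=70.0)   # stays referenced by the name `leaked`

temporary = Payload()
tracker.track(temporary, now=70.0)
del temporary                     # released before the check
gc.collect()

suspects = tracker.suspects(now=140.0)
print(len(suspects))
```

Only the object that is still alive and was created after the warm-up period is reported.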

Function-based Leak Discovery

To get started with function-based leak discovery, wrap your code in the @leak_monitor decorator:

from memalot import leak_monitor

@leak_monitor
def function_that_leaks_memory():
    # Code that leaks memory here
    pass

When the function exits, Memalot will print out potential memory leaks. That is, objects created while the function was being called, which cannot be garbage collected.

You can also ask Memalot to only consider objects that have lived for more than a certain number of calls to the function. For example:

from memalot import leak_monitor

@leak_monitor(max_object_age_calls=2)
def function_that_leaks_memory():
    # Code that leaks memory here
    pass

In this case the max_object_age_calls=2 parameter asks Memalot to only consider objects that have been created while the function was being called, and have survived two calls to the function.

Function-based leak discovery may not be accurate if other threads are creating objects outside the function while it is being called. Memalot cannot tell whether an object was created within a specific function, only that it was created while the function was being called. If this causes problems for you, use time-based leak discovery instead.

Note: you should not call memalot.start_leak_monitoring when using function-based leak discovery.

Filtering

Memalot can be used to filter the types of objects that are considered leaks. This can speed up leak discovery significantly if you know what types of objects are likely to be leaks.

To filter object types, pass the included_type_names parameter with the type names that you wish to include. For example:

memalot.start_leak_monitoring(max_object_lifetime=60.0, included_type_names={"mypackage.MyObject", "OtherObject"})

This will only include objects with mypackage.MyObject or OtherObject in their fully qualified type name. Matching is based on substrings, so mypackage.MyObject will match mypackage.MyObjectSubclass as well.

You can also exclude certain types of objects from being considered as leaks. Use the excluded_type_names option for this. For example:

memalot.start_leak_monitoring(max_object_lifetime=60.0, included_type_names={"builtins"}, excluded_type_names={"dict"})

This will include all built-in types except for dict.
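The substring matching described above can be sketched in a few lines. This is an illustration of the documented rule, not Memalot's own code: fully_qualified_name and is_included are hypothetical helpers, and building the name from __module__ and __qualname__ is an assumption.

```python
def fully_qualified_name(obj):
    """Return the module and type name of obj, for example 'builtins.dict'."""
    obj_type = type(obj)
    return f"{obj_type.__module__}.{obj_type.__qualname__}"


def is_included(obj, included_type_names=frozenset(), excluded_type_names=frozenset()):
    """Apply substring-based include and exclude filters to an object's type."""
    name = fully_qualified_name(obj)
    # An empty include set means that every type is a candidate.
    included = not included_type_names or any(
        part in name for part in included_type_names
    )
    excluded = any(part in name for part in excluded_type_names)
    return included and not excluded


filters = {"included_type_names": {"builtins"}, "excluded_type_names": {"dict"}}
list_result = is_included([1, 2], **filters)      # type name: builtins.list
dict_result = is_included({"a": 1}, **filters)    # type name: builtins.dict
print(list_result, dict_result)
```

Because matching is on substrings, a short name such as "dict" also matches any other type whose fully qualified name contains that text.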

One efficient way to use Memalot is to generate a report with check_referrers=False to see which types of objects might be leaking, and then generate further reports with check_referrers=True and included_type_names set to the types of objects that you think may be leaking. Since finding referrers is slow, this can speed up leak discovery.

Console Output

By default, Memalot prints out suspected leaks to the console. However, you can specify the output_func option to send the output to a different location. For example, to send the output to a Python logger:

import logging

import memalot

LOG = logging.getLogger(__name__)
memalot.start_leak_monitoring(max_object_lifetime=60.0, output_func=LOG.info)
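Other callables can be used in the same way as LOG.info. As a minimal sketch, assuming output_func is called with one piece of report text per call (inferred from the LOG.info example rather than confirmed), the hypothetical helper make_file_writer below appends the output to a file:

```python
import tempfile
from pathlib import Path


def make_file_writer(path):
    """Return a callable that appends each piece of report text to `path`."""

    def write(text):
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(f"{text}\n")

    return write


# A temporary directory keeps the example self-contained.
output_path = Path(tempfile.mkdtemp()) / "memalot_output.log"
writer = make_file_writer(output_path)

# With Memalot installed, the writer would be passed like this:
# memalot.start_leak_monitoring(max_object_lifetime=60.0, output_func=writer)
writer("example report text")
print(output_path.read_text(encoding="utf-8"))
```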

Saved Reports

Memalot saves leak reports to disk, which can be inspected later via the CLI or MCP server. By default reports are saved to the .memalot/reports directory in the user's home directory, but this can be changed by setting the report_directory option.

Reports can be copied between machines by copying the contents of the report_directory to the other machine (using, for example, scp or rsync). This is useful if, for example, you are running Memalot in your test environment but want to inspect reports on your local machine.

For example, to copy the report with ID rcf1-6kks from a remote machine to your local machine:

scp alice@remote_host:/home/alice/.memalot/reports/memalot_report_rcf1-6kks /home/alice/.memalot/reports/

Or to rsync all reports from a remote machine to your local machine:

rsync -avh --progress alice@remote_host:/home/alice/.memalot/reports/ /home/alice/.memalot/reports/

Report IDs are 8 alphanumeric characters, so collisions are unlikely, but they are possible when you copy reports between machines. To avoid collisions, use a different report_directory for each machine you copy reports from.

CLI

Memalot has a basic CLI that can be used to view stored reports.

To list reports, run:

memalot list

To print a specific report, run:

memalot print <report_id>

To get help, run a command with the --help flag. For example:

memalot print --help

MCP Server

Memalot has an MCP server that can be used to analyze leak reports using your favorite AI tool. The MCP server uses the stdio transport so you need to run it on the same machine as the AI tool.

MCP Server Installation

Before installing the MCP server, make sure you have installed uv on your machine.

General Configuration

To run the MCP server, you'll need to specify the following in your AI tool:

Name: Memalot
Command: uvx
Arguments: --python >=3.10 --from memalot[mcp] memalot-mcp

However, the precise way you do this varies depending on the specific tool you are using. See below for instructions for some popular tools.

JSON Configuration

For tools that support JSON configuration of MCP servers (for example, Cursor, Claude Desktop), add the following to your JSON configuration:

"Memalot": {
    "command": "uvx",
    "args": [
        "--python", ">=3.10", "--from", "memalot[mcp]", "memalot-mcp"
    ]
}

Note: you may have to specify the full path to the uvx executable in some cases, even if it is on your path. You can find this by running which uvx from the command line. Try this if you get an error like "spawn uvx ENOENT" when starting the MCP server.
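If you prefer to generate the entry rather than edit it by hand, the following sketch uses Python's standard shutil.which to look up the full path to uvx and prints the configuration as JSON. The variable names are illustrative, and the printed object is wrapped in an outer pair of braces, so copy only the "Memalot" entry into your tool's configuration.

```python
import json
import shutil

# Fall back to the bare command name if uvx is not found on the PATH.
uvx_path = shutil.which("uvx") or "uvx"

config = {
    "Memalot": {
        "command": uvx_path,
        "args": ["--python", ">=3.10", "--from", "memalot[mcp]", "memalot-mcp"],
    }
}

print(json.dumps(config, indent=4))
```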

Claude Code

Run this command:

claude mcp add Memalot -- uvx --python '>=3.10' --from 'memalot[mcp]' memalot-mcp

Codex CLI

Run this command:

codex mcp add Memalot -- uvx --python '>=3.10' --from 'memalot[mcp]' memalot-mcp

Copilot Coding Agent

Add the following JSON configuration to your repository's MCP configuration:

"Memalot": {
    "type": "local",
    "tools": ["*"],
    "command": "uvx",
    "args": [
        "--python", ">=3.10", "--from", "memalot[mcp]", "memalot-mcp"
    ]
}

Example Prompts

Before you can use the MCP server, you'll need to generate some reports if you haven't already. See the Getting Started section for more details.

Here are some things you can ask the MCP server to do:

"List memalot leak reports"
"List the most recent 10 memalot leak reports from report directory /var/memalot_reports"
"Analyse the most recent iteration of memalot report <report-id>"
"Analyse the most recent iteration of memalot report <report-id>. Filter to include MyObject objects only."
"Fix the memory leak in memalot report <report-id>"
"Analyze the referrer graph for objects of type MyObject for memalot report <report-id>"
"Create a diagram of the references to leaking objects in memalot report <report-id>"
"Create a comprehensive HTML report for memalot report <report-id>"

Tips for Using the MCP Server

If the context window is being exceeded, try the following:
- Ask the AI tool to filter on specific object type names. This is performed in the MCP server, so reduces the amount of information sent to the client.
- Set the max_object_details option to a smaller value when generating the report.
By default, only the most recent iteration of a report is returned. You can ask your AI tool to retrieve more iterations if you wish.
By default, the MCP server will look for reports in the default directory. However, you can ask your AI tool to look in a specific directory if you have saved reports elsewhere.

Referrers

Memalot uses the Referrers package (by the same author as Memalot) to show the referrers of objects. These are the references to the object that are keeping it alive. There are a number of options that can be used to control this behavior. See Referrer Tracking Options for more details.
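To make the idea of a referrer concrete, here is a small sketch that uses only the standard library's gc.get_referrers. It does not use the Referrers package, and Session and registry are hypothetical names. It shows a list acting as the referrer that keeps an object alive.

```python
import gc


class Session:
    """Stand-in for an object that is suspected of leaking."""


session = Session()
registry = [session]  # this list keeps the session alive

# gc.get_referrers returns the containers that refer directly to the
# object. The result may also include other referrers, such as the
# namespace that holds the name `session`.
referrers = gc.get_referrers(session)
held_by_registry = any(referrer is registry for referrer in referrers)
print(held_by_registry)
```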

Options

Memalot has a number of options that can be used to customize its behavior. Pass these options to start_leak_monitoring or @leak_monitor. For example:

memalot.start_leak_monitoring(max_object_lifetime=60.0, force_terminal=True, max_object_details=50)

Type Filtering

included_type_names (set of strings, default: empty set): The types of objects to include in the report. By default all types are checked, but this can be limited to a subset of types. Inclusion is based on substring matching of the fully-qualified type name (the name of the type and its module). For example, if included_type_names is set to {"numpy"}, all NumPy types will be included in the report.
excluded_type_names (set of strings, default: empty set): The types of objects to exclude from the report. By default no types are excluded. Exclusion is based on substring matching of the fully-qualified type name (the name of the type and its module). For example, if excluded_type_names is set to {"numpy"}, all NumPy types will be excluded from the report.

Leak Report Options

max_types_in_leak_summary (int, default: 500): The maximum number of types to include in the leak summary.
compute_size_in_leak_summary (bool, default: False): Computes the (shallow) size of all objects in the leak summary. Note: the shallow size of an object may not be particularly meaningful, since most objects refer to other objects, and often don't contain much data themselves.
max_object_details (int, default: 30): The maximum number of objects for which to print details. We try to check at least one object for each object type, within this limit. If the number of types exceeds this limit, then we check only the most common types. If the number of types is less than this limit, then we will check more objects for more common types.
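The caveat about shallow sizes can be seen with the standard library's sys.getsizeof. This is only an illustration of what a shallow size is. Whether Memalot computes sizes with sys.getsizeof is an assumption, not something stated here.

```python
import sys

payload = "x" * 1_000_000  # a large string
container = [payload]      # a list that only holds a reference to it

# The shallow size of the list covers the list's own storage, not the
# size of the string it refers to.
container_size = sys.getsizeof(container)
payload_size = sys.getsizeof(payload)
print(container_size, payload_size)
```

The list's shallow size is tiny, even though it keeps roughly a megabyte of string data alive.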

Referrer Tracking Options

check_referrers (bool, default: True): Whether to check for referrers of leaked objects. This option may cause a significant slow-down (but provides useful information). Try setting this to False if Memalot is taking a long time to generate object details. Then, when you have an idea of what types of objects are leaking, you can generate reports with check_referrers=True and included_type_names set to the types of objects that you think may be leaking.
referrers_max_depth (int or None, default: 50): The maximum depth to search for referrers. Specify None to search to unlimited depth (but be careful with this: it may take a long time).
referrers_search_timeout (float or None, default: 300.0): The maximum time in seconds to spend searching for referrers for an individual object. If this time is exceeded, a partial graph is displayed and the referrer graph will contain a node containing the text "Timeout of N seconds exceeded". Note that this timeout is approximate, and may not be effective if the search is blocked by a long-running operation. The default is 5 minutes (300 seconds). Setting this to None will disable the timeout.
single_object_referrer_limit (int or None, default: 100): The maximum number of referrers to include in the graph for an individual object instance. If the limit is exceeded, the referrer graph will contain a node containing the text "Referrer limit of N exceeded". Note that this limit is approximate and does not apply to all referrer types. Specifically, it only applies to object references. Additionally, this limit does not apply to immortal objects.
referrers_module_prefixes (set of strings or None, default: None): The prefixes of the modules to search for module-level variables when looking for referrers. If this is not specified, the top-level package of the calling code is used.
referrers_max_untracked_search_depth (int, default: 30): The maximum depth to search for referrers of untracked objects. This is the depth that referents will be searched from the roots (locals and globals). If you are missing referrers of untracked objects, you can increase this value.

Report Storage Options

save_reports (bool, default: True): Whether to save reports to disk. This is useful for inspecting them later. Reports are written to the report_directory, or the default directory if this is not specified.
report_directory (Path or None, default: None): The directory to write the report data to. Individual report data is written to a subdirectory of this directory. If this is None (the default), the default directory will be used. This is the .memalot/reports directory in the user's home directory. To turn off saving of reports entirely, use the save_reports option.

Output Options

str_func (callable or None, default: None): A function for outputting the string representation of an object. The first argument is the object and the second argument is the length to truncate the string to, as specified by str_max_length. If this is not supplied the object's __str__ is used.
str_max_length (int, default: 100): The maximum length of object string representations, as passed to str_func.
force_terminal (bool or None, default: None): Forces the use of terminal control codes, which enable colors and other formatting. Defaults to False, as this is normally detected automatically. Set this to True if you are missing colors or other formatting in the output, as sometimes (like when running in an IDE) the terminal is not detected correctly. This must be set to False if output_func is set and tee_console is False.
output_func (callable or None, default: None): A function that writes reports. If this is not provided reports are printed to the console. This option can be used to, for example, write reports to a log file. If this option is specified then output is not written to the console, unless tee_console is set to True.
tee_console (bool, default: False): If this is set to True, output is written to the console as well as to the function specified by output_func. If output_func is not specified (the default) then this option has no effect.
color (bool, default: True): Specifies whether colors should be printed to the console. Note: in certain consoles (like when running in an IDE), colors are not printed by default. Try setting force_terminal to True if this happens.
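As a sketch of what a custom str_func might look like, the hypothetical function short_repr below follows the documented argument order (the object, then the maximum length). It assumes that the function should return a string, which is not stated above.

```python
def short_repr(obj, max_length):
    """Describe obj as '<type name>: <repr>', using at most max_length characters."""
    text = f"{type(obj).__name__}: {obj!r}"
    if len(text) <= max_length:
        return text
    if max_length <= 3:
        return text[:max_length]
    # Leave room for the ellipsis marker within the limit.
    return text[: max_length - 3] + "..."


# With Memalot installed, the function would be passed like this:
# memalot.start_leak_monitoring(
#     max_object_lifetime=60.0, str_func=short_repr, str_max_length=40
# )
example = short_repr(list(range(100)), 40)
print(example)
```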

Other Options

max_untracked_search_depth (int, default: 3): The maximum search depth when looking for leaked objects that are not tracked by the garbage collector. Untracked objects include, for example, immutable objects and collections containing only immutable objects in CPython. This defaults to 3, which is enough to find most untracked objects. However, this may not be sufficient to find some untracked objects, like nested tuples. Increase this if you have nested collections of immutable objects (like tuples). However, note that increasing this may impact speed.
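You can check whether CPython's garbage collector tracks a particular object with the standard gc.is_tracked function. The short sketch below is independent of Memalot. It shows that integers and strings (like the strings in the first example) are untracked, while a list is tracked.

```python
import gc

# Atomic immutable objects are not tracked by CPython's garbage collector.
number_tracked = gc.is_tracked(42)
text_tracked = gc.is_tracked("Object 1")

# Mutable containers such as lists are tracked.
list_tracked = gc.is_tracked(["Object 1"])

print(number_tracked, text_tracked, list_tracked)
```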

Context Manager

Memalot can be used as a context manager. However, it is generally recommended to use the @leak_monitor decorator instead, unless this is not possible.

To use Memalot as a context manager, call create_leak_monitor once, and then use the returned object as a context manager each time you want to monitor memory leaks. For example:

from memalot import create_leak_monitor

monitor = create_leak_monitor()

with monitor:
    # Code that leaks memory here
    pass

Note: it is important to call create_leak_monitor only once and reuse the returned object each time you want to monitor memory leaks.

Definition of a Leak

Memalot defines a memory leak as an object that has lived for longer than is necessary.

However, note that Memalot cannot distinguish between objects that live for a long time when this is necessary (for example, you want to cache some objects for speed) and when this is unnecessary (for example, you forget to evict stale objects from your cache). It's up to you to make this distinction.
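The cache example can be made concrete with a short sketch. Both caches below keep objects alive for a long time, so objects from either one could show up as suspected leaks. Only the first grows without bound. BoundedCache is a hypothetical class written for this illustration and is not part of Memalot.

```python
from collections import OrderedDict


class BoundedCache:
    """A small cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict the oldest entries so the cache cannot grow without bound.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


unbounded = {}  # never evicts: grows with every new key
bounded = BoundedCache(max_entries=100)

for request_id in range(1_000):
    unbounded[request_id] = f"response {request_id}"
    bounded.put(request_id, f"response {request_id}")

print(len(unbounded), len(bounded))
```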

Known Limitations

Memalot is slow. Be wary of using it in a production system.
Memalot does not guarantee to find all leaking objects. If you have leaking objects that are created very rarely, Memalot may not detect them. Specifically:
- Memalot does not find objects that are created while the leak report is being generated. This is mostly applicable to time-based leak discovery.
- If the max_object_age_calls parameter is set to greater than 1 during function-based leak discovery, Memalot will not find objects that are created on some calls to the function.

Leaks Found by Memalot

Memalot has been used to track down leaks in TensorFlow, TensorFlow Probability, Pydantic, and more.

If you use Memalot successfully in an open source project, please let us know by tagging @nfergu on GitHub.
