Skip to content

packingcubes

PyPI Supported Python Versions

Tested with Hypothesis Ruff

Compact octree implementation for fast search.

packingcubes aims to provide a fast, minimal-memory-usage octree implementation, specialized for use in astronomical/astrophysical contexts. It's written in pure python, with Numba-based acceleration of the critical code paths.

Key Features

  • Separation between data and tree (meta)data: Unlike other implementations (e.g. SciPy's spatial.KDTree or pykdtree), packingcubes sorts your dataset using a chunked octree and doesn't associate any positional information with the nodes. This leads to a number of useful properties:

    • Low-memory: Each node contains only minimal information (20 bytes per node), meaning a PackedTree with a similar depth can be ~100x smaller than a KDTree on the same data
    • Fast tree traversal: Tree searches on big datasets are usually memory bound. Smaller trees can more easily fit in cache/memory -> much faster tree traversal
    • Easy serialization: the tree information is trivially represented with collections of numpy arrays and can easily/quickly be saved to and loaded from storage.
  • Built for loading datasets: packingcubes preferentially uses and returns index chunks/slices (i.e. (start, stop) <-> start:stop), leading to increased cache efficiency and significant speedups when working with large datasets.

  • Useability: packingcubes is designed to work with datasets that contain information associated with the position information to make spatially searching other fields easy. E.g. finding a basic quantity like the mass of all particles in a sphere is a simple "one-liner".
  • Parallelization: The cubes in packingcubes are created and searched in parallel, with the cubes creation exhibiting strong scaling.
  • Pure Python: There's no C/Rust/etc. extensions and we're available on PyPI -> we're extendable and portable!