Cubes

packingcubes.cubes

Package for creating and manipulating packing cubes

Can be used via CLI or programmatically. The CLI essentially calls make_cubes and then Cubes on the output of whatever snapshot it's pointed at, after which it saves the generated cubes structure. Use the --help argument for more information.

Cubes is intended to be the primary entry point when used programmatically, calling make_cubes under the hood as needed. It will return a ParticleCubes object.

ParticleCubes are the main workhorse; construction via make_cubes does the initial dataset sorting, while each ParticleCubes object has a number of search methods attached, like get_particles_in_sphere.

MultiCubes are essentially identical to ParticleCubes, with each method having one additional parameter, particle_types, that is used to select which particle type information to return. The output of each method differs as well. If output would be the output from a ParticleCubes method call,

{"PartType0":output_for_PartType0, "PartType1":output_for_PartType1, ...}
would be the MultiCubes output. Effectively, it acts as a dictionary-based wrapper around a collection of ParticleCubes objects, and would be used in the case when you want some or all particle types from a search, not just one.
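
The wrapper relationship described above can be pictured with a minimal sketch. `FakeSingleTypeCubes`, `FakeMultiCubes`, and `count_in_range` are hypothetical stand-ins written for this illustration and are not part of packingcubes; only the dictionary-shaped output and the `particle_types` selection mirror the description.

```python
class FakeSingleTypeCubes:
    """Hypothetical stand-in for a single-type search object."""

    def __init__(self, positions):
        self.positions = positions

    def count_in_range(self, lo, hi):
        # Toy 1D "search": count positions with lo <= x < hi
        return sum(lo <= x < hi for x in self.positions)


class FakeMultiCubes:
    """Hypothetical dictionary-based wrapper keyed by particle type."""

    def __init__(self, cubes_by_type):
        self.cubes_by_type = dict(cubes_by_type)

    @property
    def particle_types(self):
        return list(self.cubes_by_type)

    def count_in_range(self, lo, hi, *, particle_types=None):
        # None selects every type; a bare string selects a single type
        if particle_types is None:
            particle_types = self.particle_types
        elif isinstance(particle_types, str):
            particle_types = [particle_types]
        return {
            ptype: self.cubes_by_type[ptype].count_in_range(lo, hi)
            for ptype in particle_types
        }


multi = FakeMultiCubes({
    "PartType0": FakeSingleTypeCubes([0.1, 0.4, 0.9]),
    "PartType1": FakeSingleTypeCubes([0.2, 0.7]),
})
all_counts = multi.count_in_range(0.0, 0.5)
only_type0 = multi.count_in_range(0.0, 0.5, particle_types="PartType0")
```

Each per-type result is exactly what the single-type object would have returned on its own; the wrapper only adds the dictionary layer.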

Classes:

  • ParticleCubes

    Object to perform rapid, parallel searches of a dataset

  • MultiCubes

    Collection of ParticleCubes organized by particle type

Functions:

  • Cubes

    Load or create a ParticleCubes object (calls make_cubes under the hood as needed)

  • make_cubes

    The actual ParticleCubes creation

  • make_MultiCubes

    Like Cubes but explicitly creates a MultiCubes object, even if only one particle type is present

Cubes

Cubes(dataset=None, *, cubes_dict=None, particle_type=None, extras=None, **kwargs)

Create or load ParticleCubes objects from the provided data

As an alternative to a dataset, you can provide a dictionary containing cube data offsets, bounding boxes, and optionally PackedTrees as cube_indices, cube_boxes, and cube_trees. This could be useful in the case where a dataset has a natural top-level structure already, but may not yet have PackedTree subcomponents. Examples include a collection of disjoint blobs in a 3D parameter space, or a dataset that already contains an octree-like structure.

Parameters:

  • dataset (str | NDArray | MultiParticleDataset | None, default: None ) –

    Dataset containing positional data. Will be used to create a new ParticleCubes, including sorting. Must provide either this or cubes_dict, below. Assumes strings are filepaths to GadgetishHDF5Datasets.

  • cubes_dict (dict[str, NDArray | list[BoundingBox] | list[NDArray | PackedTree]] | None, default: None ) –

    Dictionary with 2-3 components:

    1. cube_indices - contains the data offsets for each cube's particles (i.e. cube 0 is from cube_indices[0]:cube_indices[1])
    2. cube_boxes - contains the BoundingBox for each cube
    3. cube_trees (optional) - contains the PackedTree for each cube
  • particle_type (str | None, default: None ) –

    The particle type to use. Unused if cubes_dict is provided. Defaults to dataset.particle_type

  • extras (Mapping[str, Any] | None, default: None ) –

    Attach additional fields to the dataset to be sorted. Unused if cubes_dict is provided. See process_extra_fields for MultiParticleDataset or GadgetishHDF5Dataset

  • **kwargs

    Extra arguments passed to InMemory/GadgetishHDF5Dataset, make_cubes, and ParticleCubes. See those for a description.

Returns:

  • ParticleCubes

    ParticleCubes object constructed from the dataset/dictionary

Raises:

  • CubesError

    If neither dataset nor cubes_dict is provided

See Also

ParticleCubes, MultiCubes
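
The offset convention for cube_indices in cubes_dict can be sketched as follows. This is an illustration only: the per-cube counts and the toy data are hypothetical, and it assumes the offsets array carries one more entry than the number of cubes so that every cube has a closing offset.

```python
from itertools import accumulate

# Hypothetical per-cube particle counts for data already sorted by cube
counts = [3, 0, 2]

# Offsets: start at 0, then a running total of the counts
cube_indices = [0, *accumulate(counts)]

# Toy sorted data; cube i owns data[cube_indices[i]:cube_indices[i + 1]]
data = ["a0", "a1", "a2", "c0", "c1"]
per_cube = [
    data[cube_indices[i]:cube_indices[i + 1]] for i in range(len(counts))
]
```

An empty cube shows up as two equal consecutive offsets, so no special marker is needed for it.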

ParticleCubes

ParticleCubes(*, cube_indices, cube_boxes, cube_trees, dataset=None, **kwargs)

The cubes for a single particle type

Methods:

Attributes:

Box

Box(box, *, dataset=None, strict=False, fields=None, extras=None, save_filepath=None, save_particle_type=None)

Construct a box-shaped subdataset

Parameters:

  • box (BoxLike) –

    The box to search in

  • dataset (Dataset | None, default: None ) –

    Dataset containing the particle positions. Defaults to self.dataset.

  • strict (bool, default: False ) –

    Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance

  • fields (Collection[str] | None, default: None ) –

    Subset of fields in dataset.extras to include. Specify "all" to include everything in dataset.extras. Defaults to the empty set.

  • extras (Mapping[str, Any] | None, default: None ) –

    Additional fields to sort, add to dataset.extras, and include in the returned subdataset. See [process_extra_fields][] for more details. Defaults to None

  • save_filepath (str | None, default: None ) –

    If provided, save this subdataset to the specified file with the specified particle type. save_particle_type can be omitted to use the default particle type.

  • save_particle_type (str | None, default: None ) –

    If provided, save this subdataset to the specified file with the specified particle type. save_particle_type can be omitted to use the default particle type.

Returns:

  • InMemory

    Subdataset with the specified bounding volume and fields

Raises:

  • ValueError

    If fields are specified that are in neither extras nor dataset.extras.

Sphere

Sphere(center, radius, *, dataset=None, strict=False, fields=None, extras=None, save_filepath=None, save_particle_type=None)

Construct a spherical subdataset

Parameters:

  • center (ArrayLike) –

    Center point of the sphere

  • radius (float) –

    Radius of the sphere

  • dataset (Dataset | None, default: None ) –

    Dataset containing the particle positions. Defaults to self.dataset.

  • strict (bool, default: False ) –

    Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance

  • fields (Collection[str] | None, default: None ) –

    Subset of fields in dataset.extras to include. Specify "all" to include everything in dataset.extras. Defaults to the empty set.

  • extras (Mapping[str, Any] | None, default: None ) –

    Additional fields to sort, add to dataset.extras, and include in the returned subdataset. See [process_extra_fields][] for more details. Defaults to None

  • save_filepath (str | None, default: None ) –

    If provided, save this subdataset to the specified file with the specified particle type. save_particle_type can be omitted to use the default particle type.

  • save_particle_type (str | None, default: None ) –

    If provided, save this subdataset to the specified file with the specified particle type. save_particle_type can be omitted to use the default particle type.

Returns:

  • InMemory

    Subdataset with the specified bounding volume and fields

Raises:

  • ValueError

    If fields are specified that are in neither extras nor dataset.extras.
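
The trade-off behind the strict flag can be sketched with a coarse grid. This is not how packingcubes selects the extra particles, which the documentation does not specify; the unit-cell grid and the helper names below are assumptions made for illustration. The loose result takes everything in cells the sphere's bounding box touches, and the strict result adds an exact distance test.

```python
import math

points = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.4, 0.5, 0.5), (2.5, 0.5, 0.5)]
center, radius = (0.5, 0.5, 0.5), 0.3


def cell_of(p):
    # Hypothetical coarse grid with unit-sized cells
    return tuple(math.floor(c) for c in p)


# Range of cells touched by the sphere's axis-aligned bounding box
lo_cell = cell_of(tuple(c - radius for c in center))
hi_cell = cell_of(tuple(c + radius for c in center))


def in_touched_cells(p):
    return all(lo <= c <= hi for lo, c, hi in zip(lo_cell, cell_of(p), hi_cell))


# Non-strict: cheap cell-level test, may include nearby outsiders
loose = [i for i, p in enumerate(points) if in_touched_cells(p)]

# Strict: exact per-particle distance test on the loose candidates
strict = [i for i in loose if math.dist(points[i], center) <= radius]
```

The loose list is a superset of the strict one, which is why a non-strict search can skip the per-particle distance computation.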

cube_boxes instance-attribute

cube_boxes = cube_boxes

The bounding boxes for each cube

cube_indices instance-attribute

cube_indices = cube_indices

Array of cube indices into the dataset

cube_trees instance-attribute

cube_trees = []

The packed trees for each cube

dataset property

dataset

Return the Dataset attached to this object, or None

get_closest_particles

get_closest_particles(*, xyz, data=None, distance_upper_bound=None, p=None, k=None, return_shuffle_indices=None, return_sorted=None)

Get the distances and indices of the k nearest particles to a point.

Parameters:

  • xyz (NDArray) –

    Coordinates of point to check

  • data (DataContainer | Dataset | None, default: None ) –

    Source of particle position data. Defaults to self.dataset.

  • distance_upper_bound (float | None, default: None ) –

    Return only neighbors from other nodes within this distance. This is used for tree pruning, so if you are doing a series of nearest-neighbor queries, it may help to supply the distance to the nearest neighbor of the most recent point.

  • p (float | None, default: None ) –

    Which Minkowski p-norm to use. 1 is the sum of absolute-values distance ("Manhattan" distance). 2 is the usual Euclidean distance. Infinity is the maximum-coordinate-difference distance. Currently, only p=2 is supported.

  • k (int | None, default: None ) –

    Number of closest particles to return. Default 1

  • return_shuffle_indices (bool | None, default: None ) –

    Flag to return the shuffle indices instead of the data indices. Default False.

  • return_sorted (bool | None, default: None ) –

    Flag to return the distances and indices in distance-sorted order. Set to False for a performance boost. Default True

Returns:

  • distances ( NDArray[float] ) –

    Distances to the k nearest neighbors. Has shape (min(N,k),), where N is the number of particles in the sphere bounded by distance_upper_bound

  • indices ( NDArray[int] ) –

    Indices in data of the k nearest neighbors. Has same shape as distances
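
The documented return shape, (min(N,k),), can be checked against a brute-force reference. The function below is a hypothetical exhaustive scan written for illustration and does not use the tree; it assumes Euclidean distance (p=2) and treats distance_upper_bound as inclusive, which the documentation does not state.

```python
import heapq
import math


def brute_force_closest(points, xyz, k=1, distance_upper_bound=math.inf):
    """Reference k-nearest search by exhaustive scan (Euclidean only)."""
    candidates = [(math.dist(p, xyz), i) for i, p in enumerate(points)]
    # Keep only the N particles within the bounding distance
    candidates = [(d, i) for d, i in candidates if d <= distance_upper_bound]
    # nsmallest returns at most k entries, sorted by distance then index
    nearest = heapq.nsmallest(k, candidates)
    distances = [d for d, _ in nearest]
    indices = [i for _, i in nearest]
    return distances, indices


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 10.0)]
origin = (0.0, 0.0, 0.0)

distances, indices = brute_force_closest(
    points, origin, k=3, distance_upper_bound=5.0
)
capped_distances, capped_indices = brute_force_closest(
    points, origin, k=3, distance_upper_bound=2.0
)
```

With the tighter bound only two particles qualify, so the result has length min(N, k) = 2 even though k is 3.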

Raises:

get_particle_index_list_in_box

get_particle_index_list_in_box(box, *, data=None, use_data_indices=True, strict=False)

Return all particle indices contained within the box

Parameters:

  • box (BoxLike) –

    The box to search in

  • data (DataContainer | Dataset | None, default: None ) –

    Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase. Defaults to self.dataset.

  • use_data_indices (bool, default: True ) –

    Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)

  • strict (bool, default: False ) –

    Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance

Returns:

  • indices ( NDArray[int] ) –

    Array of particle indices contained within the box

Raises:

  • ValueError

    If data is None and self.dataset is None

get_particle_index_list_in_sphere

get_particle_index_list_in_sphere(center, radius, *, data=None, use_data_indices=True, strict=False)

Return all particle indices contained within the sphere

Parameters:

  • center (NDArray) –

    Center point of the sphere

  • radius (float) –

    Radius of the sphere

  • data (DataContainer | Dataset | None, default: None ) –

    Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase. Defaults to self.dataset.

  • use_data_indices (bool, default: True ) –

    Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)

  • strict (bool, default: False ) –

    Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance

Returns:

  • indices ( NDArray[int] ) –

    Array of particle indices contained within the sphere

Raises:

  • ValueError

    If data is None and self.dataset is None

get_particle_indices_in_box

get_particle_indices_in_box(box)

Return all particles contained within the box

Parameters:

  • box (BoxLike) –

    Box to check

Returns:

  • indices ( Xx3 NDArray[np.int_] ) –

    Array of index information. Each row describes a chunk/slice of data in the form [start, stop, partial], where partial is a flag - (1) if the data chunk is entirely contained within box, (0) otherwise.
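
A short sketch of how the [start, stop, flag] rows might be consumed. The chunk values are hypothetical, and the sketch assumes stop is exclusive, as in a Python slice; the flag meaning follows the description above (1 for a chunk entirely contained in the shape).

```python
# Hypothetical chunk rows in the documented [start, stop, flag] form
chunks = [(0, 3, 1), (7, 9, 0), (12, 13, 1)]

# Expanding every chunk yields a superset of the particles in the shape
all_indices = [i for start, stop, _ in chunks for i in range(start, stop)]

# Chunks flagged 1 can be used as-is
contained = [(start, stop) for start, stop, flag in chunks if flag == 1]

# Chunks flagged 0 need a per-particle position test for an exact result
needs_check = [(start, stop) for start, stop, flag in chunks if flag == 0]
```

Returning slices rather than individual indices keeps the result compact when large contiguous runs of the sorted dataset fall inside the search region.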

get_particle_indices_in_sphere

get_particle_indices_in_sphere(center, radius)

Return all particles contained within the sphere defined by center and radius

Parameters:

  • center (NDArray) –

    Center point of the sphere

  • radius (float) –

    Radius of the sphere

Returns:

  • indices ( Xx3 NDArray[np.int_] ) –

    Array of index information. Each row describes a chunk/slice of data in the form [start, stop, partial], where partial is a flag - (1) if the data chunk is entirely contained within the sphere, (0) otherwise.

save

save(dataset, *, force_overwrite=False)

Save cubes information to specified file

Parameters:

  • dataset (str | Path | HDF5Dataset) –

    Location to store cubes data.

  • force_overwrite (bool, default: False ) –

    If dataset already contains cubes data, overwrite if True. Default False

Returns:

  • Path

    Path to the saved cubes information

MultiCubes

MultiCubes(*, cubes_dict, dataset=None, **kwargs)

The cubes for multiple particle types

Methods:

Attributes:

Box

Box(box, *, particle_types=None, **kwargs)

Construct a subdataset bounded by a box for each particle type

Parameters:

  • box (BoxLike) –

    Box to check

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

  • **kwargs

    See Box in ParticleCubes for other arguments. Note that save_particle_type will be ignored and the corresponding particle_type will be used.

Returns:

  • dict[str, InMemory]

    Dictionary of subdatasets with the specified bounding volume and fields, organized by particle type.

Raises:

  • ValueError

    If fields are specified that are in neither extras nor dataset.extras.

Sphere

Sphere(center, radius, *, particle_types=None, **kwargs)

Construct a subdataset bounded by a sphere for each particle type

Parameters:

  • center (NDArray) –

    Center point of the sphere

  • radius (float) –

    Radius of the sphere

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

  • **kwargs

    See Sphere in ParticleCubes for other arguments. Note that save_particle_type will be ignored and the corresponding particle_type will be used.

Returns:

  • dict[str, InMemory]

    Dictionary of subdatasets with the specified bounding volume and fields, organized by particle type.

Raises:

  • ValueError

    If fields are specified that are in neither extras nor dataset.extras.

get_closest_particles

get_closest_particles(*, data, xyz, particle_types=None, distance_upper_bound=None, p=None, k=None, return_shuffle_indices=None, return_sorted=None)

Get the distances and indices of the k nearest particles to a point

Parameters:

  • data (DataContainer | Dataset) –

    Source of particle position data

  • xyz (NDArray) –

    Coordinates of point to check

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

  • distance_upper_bound (float | None, default: None ) –

    Return only neighbors from other nodes within this distance. This is used for tree pruning, so if you are doing a series of nearest-neighbor queries, it may help to supply the distance to the nearest neighbor of the most recent point.

  • p (float | None, default: None ) –

    Which Minkowski p-norm to use. 1 is the sum of absolute-values distance ("Manhattan" distance). 2 is the usual Euclidean distance. Infinity is the maximum-coordinate-difference distance. Currently, only p=2 is supported.

  • k (int | None, default: None ) –

    Number of closest particles to return. Default 1

  • return_shuffle_indices (bool | None, default: None ) –

    Flag to return the shuffle indices instead of the data indices. Default False.

  • return_sorted (bool | None, default: None ) –

    Flag to return the distances and indices in distance-sorted order. Set to False for a performance boost. Default True

Returns:

  • distances ( NDArray[float] ) –

    Distances to the k nearest neighbors. Has shape (min(N,k),), where N is the number of particles in the sphere bounded by distance_upper_bound

  • indices ( NDArray[int] ) –

    Indices in data of the k nearest neighbors. Has same shape as distances

Raises:

get_particle_index_list_in_box

get_particle_index_list_in_box(*, data, box, particle_types=None, strict=True, use_data_indices=True)

Return all particle indices contained within the box

Parameters:

  • data (DataContainer | Dataset) –

    Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase

  • box (BoxLike) –

    Box to check

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

  • strict (bool, default: True ) –

    Flag to specify whether only particles inside the box will be returned (True, default). If False, additional nearby particles may be included for significantly increased performance

  • use_data_indices (bool, default: True ) –

    Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)

Returns:

  • indices ( NDArray[int] ) –

    List of original particle indices contained within the box

get_particle_index_list_in_sphere

get_particle_index_list_in_sphere(*, data, center, radius, particle_types=None, strict=True, use_data_indices=True)

Return all particles contained within the sphere defined by center and radius

Parameters:

  • data (DataContainer | Dataset) –

    Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase

  • center (NDArray) –

    Center point of the sphere

  • radius (float) –

    Radius of the sphere

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

  • strict (bool, default: True ) –

    Flag to specify whether only particles inside the sphere will be returned (True, default). If False, additional nearby particles may be included for significantly increased performance

  • use_data_indices (bool, default: True ) –

    Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)

Returns:

  • indices ( NDArray[int] ) –

    List of original particle indices contained within sphere

get_particle_indices_in_box

get_particle_indices_in_box(box, *, particle_types=None)

Return all particles contained within the box

Parameters:

  • box (BoxLike) –

    Box to check

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

Returns:

  • indices ( dict[str, NDArray[int]] ) –

    Dictionary of arrays of particle start-stop indices plus partiality flag contained within box, organized by particle type

See Also

ParticleCubes.get_particle_indices_in_box

get_particle_indices_in_sphere

get_particle_indices_in_sphere(center, radius, *, particle_types=None)

Return all particles contained within the sphere defined by center and radius

Parameters:

  • center (NDArray) –

    Center point of the sphere

  • radius (float) –

    Radius of the sphere

  • particle_types (str | Collection[str] | None, default: None ) –

    Particle type(s) to include. Defaults to self.particle_types

Returns:

  • indices ( dict[str, NDArray[int]] ) –

    Dictionary of arrays of particle start-stop indices plus partiality flag contained within sphere, organized by particle type

See Also

ParticleCubes.get_particle_indices_in_sphere

get_single_cubes

get_single_cubes(particle_type)

Return the ParticleCubes instance corresponding to the specified type.

particle_types property

particle_types

Return the list of particle types with cubes

save

save(dataset, *, force_overwrite=False)

Save cubes information to specified file

Parameters:

  • dataset (str | Path | HDF5Dataset) –

    Location to store cubes data.

  • force_overwrite (bool, default: False ) –

    If dataset already contains cubes data, overwrite if True. Default False

Returns:

  • Path

    Path to the saved cubes information

CubesError

Bases: Exception

Error during cubes creation or traversal

make_MultiCubes

make_MultiCubes(dataset, particle_types, **kwargs)

Make MultiCubes object from dataset even if there is only one particle type

Parameters:

  • dataset (str | NDArray | MultiParticleDataset) –

    The dataset to load or create the MultiCubes from

  • particle_types (Collection[str] | None) –

    Collection of particle types to include. Defaults to all particle types found in the dataset

  • **kwargs

    Refer to Cubes documentation for a list of all possible arguments

make_cubes

make_cubes(*, dataset, cubes_per_side=-1, cube_box=None, particle_threshold=None, particle_type=None, save_dataset=False, **kwargs)

Create a ParticleCubes from the provided dataset

Parameters:

  • dataset (MultiParticleDataset) –

    The dataset containing particle data. Will be sorted in-place, but will not save updated positional information unless save_dataset is True

  • cubes_per_side (int, default: -1 ) –

    Number of cubes on a side. Dataset will be divided into cubes_per_side**3 cubes, plus an additional cube to catch any remaining particles (if the cube_box is smaller than the actual data extents). Note: due to the PackedTree's packed format, cubes must contain fewer than ~4 billion particles. If cubes_per_side is too small to support this, a ValueError will be raised. The limit is per-particle-type.

  • cube_box (BoxLike | None, default: None ) –

    A box-like object (i.e. something that can convert to a (6,) ndarray) that delineates the region of data to be cubed. Any particles outside this region will fall into an overflow cube. Useful for zoom-in simulations or other datasets with sparse outer regions. Default is the data bounding box.

  • particle_threshold (int | None, default: None ) –

    Maximum number of particles in a tree leaf node. Default is 400

  • particle_type (str | None, default: None ) –

    Particle type to process. Default is dataset.particle_type

  • save_dataset (bool, default: False ) –

    Whether to save the sorted dataset positions out to a file using default values for the parameters. The data will be sorted in memory either way. Default False.

Returns:

Raises:

  • ValueError

    If requested particle type isn't in the dataset or if too few cubes were requested for the number of particles
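
The division into cubes_per_side**3 cubes plus one overflow cube can be sketched as below. This is an assumption-laden illustration, not the make_cubes implementation: the function name, the min/max-corner form of the box, the half-open boundary handling, and the flat index ordering are all hypothetical.

```python
def cube_index(position, box_lo, box_hi, cubes_per_side):
    """Map a 3D position to a flat cube index, or to the overflow index."""
    n = cubes_per_side
    ijk = []
    for x, lo, hi in zip(position, box_lo, box_hi):
        if not lo <= x < hi:
            # Outside the cubed region: send to the extra overflow cube
            return n ** 3
        # Fractional position along this axis, scaled to a cell number
        ijk.append(min(int((x - lo) / (hi - lo) * n), n - 1))
    i, j, k = ijk
    return (i * n + j) * n + k


box_lo, box_hi = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
inside = cube_index((0.75, 0.25, 0.25), box_lo, box_hi, 2)
outside = cube_index((1.5, 0.5, 0.5), box_lo, box_hi, 2)
total_cubes = 2 ** 3 + 1
```

Here the regular cubes take indices 0 through 7 and the overflow cube takes index 8, so a cube_box tighter than the data extents never loses particles.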