Cubes

packingcubes.cubes ¶

Package for creating and manipulating packing cubes

Can be used via CLI or programmatically. The CLI essentially calls make_cubes and then Cubes on the output of whatever snapshot it's pointed at, after which it saves the generated cubes structure. Use the --help argument for more information.

Cubes is intended to be the primary entry point when used programmatically, calling make_cubes under the hood as needed. It will return a ParticleCubes object.

ParticleCubes are the main workhorse; construction via make_cubes does the initial dataset sorting, while each ParticleCubes object has a number of search methods attached, like get_particle_indices_in_sphere.
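A minimal sketch of the programmatic route described above, using only the signatures documented on this page. The helper name load_and_search is hypothetical, and the sketch assumes the ParticleCubes returned by Cubes keeps the dataset attached, so the search can fall back to self.dataset. The import sits inside the function so the snippet can be defined without packingcubes installed.

```python
def load_and_search(snapshot_path, center, radius):
    """Build (or load) cubes for a snapshot and search a sphere.

    snapshot_path: str path, treated as a GadgetishHDF5Dataset per the docs.
    center, radius: sphere definition, passed through unchanged.
    """
    # Lazy import: only needed when the helper is actually called.
    from packingcubes.cubes import Cubes

    cubes = Cubes(snapshot_path)
    # data is omitted, so the search uses cubes.dataset (assumed attached).
    # strict=True asks for only the particles inside the sphere.
    return cubes.get_particle_index_list_in_sphere(center, radius, strict=True)
```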

MultiCubes are essentially identical to ParticleCubes, with each method having one additional parameter, particle_types, that is used to select which particle type information to return. The output of each method differs as well: if output is what a ParticleCubes method call would return, then

{"PartType0":output_for_PartType0, "PartType1":output_for_PartType1, ...}

would be the MultiCubes output. Effectively, it acts as a dictionary-based wrapper around a collection of ParticleCubes objects, and would be used in the case when you want some or all particle types from a search, not just one.
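The dictionary-wrapper behaviour can be illustrated with stand-in classes. FakeParticleCubes and FakeMultiCubes below are hypothetical and are not the library's implementation; they only show the shape of the output and how a particle_types argument of None, a single string, or a collection might be handled.

```python
class FakeParticleCubes:
    """Stand-in for a single-type search object (not the real ParticleCubes)."""

    def __init__(self, positions):
        self.positions = positions  # list of (x, y, z) tuples

    def indices_in_sphere(self, center, radius):
        r2 = radius * radius
        return [
            i
            for i, p in enumerate(self.positions)
            if sum((a - b) ** 2 for a, b in zip(p, center)) <= r2
        ]


class FakeMultiCubes:
    """Dictionary-based wrapper holding one FakeParticleCubes per type."""

    def __init__(self, cubes_by_type):
        self.cubes_by_type = dict(cubes_by_type)

    @property
    def particle_types(self):
        return list(self.cubes_by_type)

    def indices_in_sphere(self, center, radius, *, particle_types=None):
        # None means "all types"; a bare string means a single type.
        if particle_types is None:
            particle_types = self.particle_types
        elif isinstance(particle_types, str):
            particle_types = [particle_types]
        return {
            ptype: self.cubes_by_type[ptype].indices_in_sphere(center, radius)
            for ptype in particle_types
        }


multi = FakeMultiCubes(
    {
        "PartType0": FakeParticleCubes([(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]),
        "PartType1": FakeParticleCubes([(0.5, 0.0, 0.0), (0.0, 0.5, 0.0)]),
    }
)
all_types = multi.indices_in_sphere((0.0, 0.0, 0.0), 1.0)
one_type = multi.indices_in_sphere((0.0, 0.0, 0.0), 1.0, particle_types="PartType1")
```

Here all_types is keyed by every particle type, while one_type contains only the "PartType1" entry.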

Classes:

ParticleCubes –

Object to perform rapid, parallel searches of a dataset
MultiCubes –

Collection of ParticleCubes organized by particle type

Functions:

Cubes –

Load or create a ParticleCubes object (calls make_cubes under the hood as needed)
make_cubes –

The actual ParticleCubes creation
make_MultiCubes –

Like Cubes but explicitly creates a MultiCubes object, even if only one particle type is present

Cubes ¶

Cubes(dataset=None, *, cubes_dict=None, particle_type=None, extras=None, **kwargs)

Create or load ParticleCubes objects from the provided data

As an alternative to a dataset, you can provide a dictionary containing cube data offsets, bounding boxes, and optionally PackedTrees as cube_indices, cube_boxes, and cube_trees. This could be useful when a dataset already has a natural top-level structure but may not yet have PackedTree subcomponents, for example a collection of disjoint blobs in a 3D parameter space, or a dataset that already contains an octree-like structure.
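To make the offset convention for cube_indices concrete, here is a small sketch in plain Python. It assumes that cube_indices holds one more entry than there are cubes, so that cube i spans cube_indices[i]:cube_indices[i+1]; the trailing end offset and the helper cube_slice are assumptions for illustration.

```python
# Hypothetical offsets for three cubes over a sorted dataset of 10 particles.
# The middle cube is empty: its start and stop offsets are equal.
cube_indices = [0, 4, 4, 10]


def cube_slice(cube_indices, cube):
    """Return the slice of the sorted dataset that belongs to one cube."""
    return slice(cube_indices[cube], cube_indices[cube + 1])


sorted_particle_ids = list(range(100, 110))
per_cube = [
    sorted_particle_ids[cube_slice(cube_indices, c)]
    for c in range(len(cube_indices) - 1)
]
```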

Parameters:

dataset (str | NDArray | MultiParticleDataset | None, default: None ) –

Dataset containing positional data. Will be used to create a new ParticleCubes, including sorting. Must provide either this or cubes_dict, below. Assumes strings are filepaths to GadgetishHDF5Datasets.
cubes_dict (dict[str, NDArray | list[BoundingBox] | list[NDArray | PackedTree]] | None, default: None ) –
Dictionary with 2-3 components:
1. cube_indices - contains the data offsets for each cube's particles (i.e. cube 0 is from cube_indices[0]:cube_indices[1])
2. cube_boxes - contains the BoundingBox for each cube
3. cube_trees (optional) - contains the PackedTree for each cube
particle_type (str | None, default: None ) –

The particle type to use. Unused if cubes_dict is provided. Defaults to dataset.particle_type
extras (Mapping[str, Any] | None, default: None ) –

Attach additional fields to the dataset to be sorted. Unused if cubes_dict is provided. See process_extra_fields for MultiParticleDataset or GadgetishHDF5Dataset
**kwargs –

Extra arguments passed to InMemory/GadgetishHDF5Dataset, make_cubes, and ParticleCubes. See those for a description.

Returns:

ParticleCubes –

ParticleCubes object constructed from the dataset/dictionary

Raises:

CubesError –

If neither dataset nor cubes_dict is provided

ParticleCubes ¶

ParticleCubes(*, cube_indices, cube_boxes, cube_trees, dataset=None, **kwargs)

The cubes for a single particle type

Methods:

Box –

Construct a box-shaped subdataset
Sphere –

Construct a spherical subdataset
get_closest_particles –

Get the distances and indices of the k particles nearest to a point.
get_particle_index_list_in_box –

Return all particle indices contained within the box
get_particle_index_list_in_sphere –

Return all particle indices contained within the sphere
get_particle_indices_in_box –

Return all particles contained within the box
get_particle_indices_in_sphere –

Return all particles contained within the sphere defined by center and radius
save –

Save cubes information to specified file

Attributes:

cube_boxes (List[BoundingBox]) –

The bounding boxes for each cube
cube_indices (NDArray) –

Array of cube indices into the dataset
cube_trees (list[PackedTree]) –

The packed trees for each cube
dataset (Dataset | None) –

Return the Dataset attached to this object, or None

Box ¶

Box(box, *, dataset=None, strict=False, fields=None, extras=None, save_filepath=None, save_particle_type=None)

Construct a box-shaped subdataset

Parameters:

box (BoxLike) –

The box to search in
dataset (Dataset | None, default: None ) –

Dataset containing the particle positions. Defaults to self.dataset.
strict (bool, default: False ) –

Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance
fields (Collection[str] | None, default: None ) –

Subset of fields in dataset.extras to include. Specify "all" to include everything in dataset.extras. Defaults to the empty set.
extras (Mapping[str, Any] | None, default: None ) –

Additional fields to sort, add to dataset.extras, and include in the returned subdataset. See process_extra_fields for more details. Defaults to None
save_filepath (str | None, default: None ) –

If provided, save this subdataset to the specified file with the specified particle type. save_particle_type can be omitted to use the default particle type.
save_particle_type (str | None, default: None ) –

Particle type under which the subdataset is saved when save_filepath is provided. Can be omitted to use the default particle type.

Returns:

InMemory –

Subdataset with the specified bounding volume and fields

Raises:

ValueError –

If fields are specified that are in neither extras nor dataset.extras.
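The trade-off behind the strict flag can be sketched conceptually: a non-strict search accepts every particle in any leaf whose bounding box overlaps the query box, while a strict search also tests each particle. The leaf layout, helper names, and data below are assumptions for illustration, not the library's internals.

```python
def boxes_overlap(lo_a, hi_a, lo_b, hi_b):
    """True if two axis-aligned boxes intersect on every axis."""
    return all(
        la <= hb and lb <= ha for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)
    )


def point_in_box(p, lo, hi):
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))


def search_box(points, leaves, lo, hi, *, strict=False):
    """leaves: (start, stop, leaf_lo, leaf_hi) ranges over sorted points."""
    found = []
    for start, stop, leaf_lo, leaf_hi in leaves:
        if not boxes_overlap(leaf_lo, leaf_hi, lo, hi):
            continue  # whole leaf is pruned without touching its particles
        for i in range(start, stop):
            # Non-strict: accept the whole leaf. Strict: test every particle.
            if not strict or point_in_box(points[i], lo, hi):
                found.append(i)
    return found


points = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (1.5, 1.5, 1.5), (3.0, 3.0, 3.0)]
leaves = [
    (0, 2, (0, 0, 0), (1, 1, 1)),
    (2, 3, (1, 1, 1), (2, 2, 2)),
    (3, 4, (2, 2, 2), (4, 4, 4)),
]
query_lo, query_hi = (0.5, 0.5, 0.5), (1.6, 1.6, 1.6)
loose = search_box(points, leaves, query_lo, query_hi)
exact = search_box(points, leaves, query_lo, query_hi, strict=True)
```

In this toy data, loose includes particle 0, which lies outside the query box but shares a leaf with particle 1; exact drops it.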

Sphere ¶

Sphere(center, radius, *, dataset=None, strict=False, fields=None, extras=None, save_filepath=None, save_particle_type=None)

Construct a spherical subdataset

Parameters:

center (ArrayLike) –

Center point of the sphere
radius (float) –

Radius of the sphere
dataset (Dataset | None, default: None ) –

Dataset containing the particle positions. Defaults to self.dataset.
strict (bool, default: False ) –

Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance
fields (Collection[str] | None, default: None ) –

Subset of fields in dataset.extras to include. Specify "all" to include everything in dataset.extras. Defaults to the empty set.
extras (Mapping[str, Any] | None, default: None ) –

Additional fields to sort, add to dataset.extras, and include in the returned subdataset. See process_extra_fields for more details. Defaults to None
save_filepath (str | None, default: None ) –

If provided, save this subdataset to the specified file with the specified particle type. save_particle_type can be omitted to use the default particle type.
save_particle_type (str | None, default: None ) –

Particle type under which the subdataset is saved when save_filepath is provided. Can be omitted to use the default particle type.

Returns:

InMemory –

Subdataset with the specified bounding volume and fields

Raises:

ValueError –

If fields are specified that are in neither extras nor dataset.extras.

cube_boxes `instance-attribute` ¶

cube_boxes = cube_boxes

The bounding boxes for each cube

cube_indices `instance-attribute` ¶

cube_indices = cube_indices

Array of cube indices into the dataset

cube_trees `instance-attribute` ¶

cube_trees = []

The packed trees for each cube

dataset `property` ¶

dataset

Return the Dataset attached to this object, or None

get_closest_particles ¶

get_closest_particles(*, xyz, data=None, distance_upper_bound=None, p=None, k=None, return_shuffle_indices=None, return_sorted=None)

Get the distances and indices of the k particles nearest to a point.

Parameters:

xyz (NDArray) –

Coordinates of point to check
data (DataContainer | Dataset | None, default: None ) –

Source of particle position data. Defaults to self.dataset.
distance_upper_bound (float | None, default: None ) –

Return only neighbors from other nodes within this distance. This is used for tree pruning, so if you are doing a series of nearest-neighbor queries, it may help to supply the distance to the nearest neighbor of the most recent point.
p (float | None, default: None ) –

Which Minkowski p-norm to use. 1 is the sum of absolute-values distance ("Manhattan" distance). 2 is the usual Euclidean distance. Infinity is the maximum-coordinate-difference distance. Currently, only p=2 is supported.
k (int | None, default: None ) –

Number of closest particles to return. Default 1
return_shuffle_indices (bool | None, default: None ) –

Flag to return the shuffle indices instead of the data indices. Default False.
return_sorted (bool | None, default: None ) –

Flag to return the distances and indices in distance-sorted order. Set to False for a performance boost. Default True

Returns:

distances ( NDArray[float] ) –

Distances to the k nearest neighbors. Has shape (min(N,k),), where N is the number of particles in the sphere bounded by distance_upper_bound
indices ( NDArray[int] ) –

Indices in data of the k nearest neighbors. Has same shape as distances

Raises:

NotImplementedError –

If a p value greater than 2 is provided
ValueError –

If data is None and self.dataset is None
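The return shape (min(N,k),) can be illustrated with a brute-force reference in plain Python. This is not how the packed trees search; it only mirrors the documented semantics of k and distance_upper_bound for the Euclidean case. The function name closest_particles and the inclusive comparison against the bound are assumptions.

```python
import heapq
import math


def closest_particles(points, xyz, *, k=1, distance_upper_bound=math.inf):
    """Brute-force k-nearest search, sorted by increasing distance."""
    candidates = (
        (math.dist(p, xyz), i) for i, p in enumerate(points)
    )
    # Keep only the N particles inside the bounding sphere (assumed inclusive).
    in_range = [(d, i) for d, i in candidates if d <= distance_upper_bound]
    best = heapq.nsmallest(k, in_range)  # at most min(N, k) entries
    distances = [d for d, _ in best]
    indices = [i for _, i in best]
    return distances, indices


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
d3, i3 = closest_particles(points, (0.0, 0.0, 0.0), k=3, distance_upper_bound=2.5)
d2, i2 = closest_particles(points, (0.0, 0.0, 0.0), k=3, distance_upper_bound=1.5)
```

With the tighter bound only two particles are in range, so the second call returns two entries even though k is 3.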

get_particle_index_list_in_box ¶

get_particle_index_list_in_box(box, *, data=None, use_data_indices=True, strict=False)

Return all particle indices contained within the box

Parameters:

box (BoxLike) –

The box to search in
data (DataContainer | Dataset | None, default: None ) –

Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase. Defaults to self.dataset.
use_data_indices (bool, default: True ) –

Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)
strict (bool, default: False ) –

Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance

Returns:

indices ( Array[int] ) –

Array of particle indices contained within the box

Raises:

ValueError –

If data is None and self.dataset is None

get_particle_index_list_in_sphere ¶

get_particle_index_list_in_sphere(center, radius, *, data=None, use_data_indices=True, strict=False)

Return all particle indices contained within the sphere

Parameters:

center (NDArray) –

Center point of the sphere
radius (float) –

Radius of the sphere
data (DataContainer | Dataset | None, default: None ) –

Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase. Defaults to self.dataset.
use_data_indices (bool, default: True ) –

Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)
strict (bool, default: False ) –

Flag to specify whether only particles inside the shape will be returned. If False (default), additional nearby particles may be included for significantly increased performance

Returns:

indices ( NDArray[int] ) –

Array of particle indices contained within the sphere

Raises:

ValueError –

If data is None and self.dataset is None

get_particle_indices_in_box ¶

get_particle_indices_in_box(box)

Return all particles contained within the box

Parameters:

box (BoxLike) –

Box to check

Returns:

indices ( Xx3 NDArray[np.int_] ) –

Array of index information. Each row describes a chunk/slice of data in the form [start, stop, partial], where partial is a flag: 1 if the data chunk is entirely contained within the box, 0 otherwise.
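A sketch of how the [start, stop, partial] rows might be consumed, following the flag meaning stated above (1 for a chunk entirely inside the box, 0 otherwise). Chunks flagged 1 are taken whole; the rest are filtered particle by particle. The helper expand_chunks and the one-dimensional toy data are assumptions for illustration.

```python
def expand_chunks(chunks, keep):
    """Turn [start, stop, flag] rows into an explicit list of indices.

    keep(i) is only called for chunks that are not entirely contained.
    """
    out = []
    for start, stop, fully_inside in chunks:
        if fully_inside:
            out.extend(range(start, stop))  # no per-particle test needed
        else:
            out.extend(i for i in range(start, stop) if keep(i))
    return out


# Toy 1D positions; the query region is x <= 1.3.
xs = [0.2, 0.4, 0.6, 0.8, 1.2, 1.4]
chunks = [[0, 4, 1], [4, 6, 0]]
inside = expand_chunks(chunks, lambda i: xs[i] <= 1.3)
```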

get_particle_indices_in_sphere ¶

get_particle_indices_in_sphere(center, radius)

Return all particles contained within the sphere defined by center and radius

Parameters:

center (NDArray) –

Center point of the sphere
radius (float) –

Radius of the sphere

Returns:

indices ( Xx3 NDArray[np.int_] ) –

Array of index information. Each row describes a chunk/slice of data in the form [start, stop, partial], where partial is a flag: 1 if the data chunk is entirely contained within the sphere, 0 otherwise.

save ¶

save(dataset, *, force_overwrite=False)

Save cubes information to specified file

Parameters:

dataset (str | Path | HDF5Dataset) –

Location to store cubes data.
force_overwrite (bool, default: False ) –

If dataset already contains cubes data, overwrite if True. Default False

Returns:

Path –

Path to the saved cubes information

MultiCubes ¶

MultiCubes(*, cubes_dict, dataset=None, **kwargs)

The cubes for multiple particle types

Methods:

Box –

Construct a subdataset bounded by a box for each particle type
Sphere –

Construct a subdataset bounded by a sphere for each particle type
get_closest_particles –

Get the distances and indices of the k particles nearest to a point
get_particle_index_list_in_box –

Return all particle indices contained within the box
get_particle_index_list_in_sphere –

Return all particles contained within the sphere defined by center and radius
get_particle_indices_in_box –

Return all particles contained within the box
get_particle_indices_in_sphere –

Return all particles contained within the sphere defined by center and radius
get_single_cubes –

Return the ParticleCubes instance corresponding to the specified type.
save –

Save cubes information to specified file

Attributes:

particle_types –

Return the list of particle types with cubes

Box ¶

Box(box, *, particle_types=None, **kwargs)

Construct a subdataset bounded by a box for each particle type

Parameters:

box (BoxLike) –

Box to check
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types
**kwargs –

See Box in ParticleCubes for other arguments. Note that save_particle_type will be ignored and the corresponding particle_type will be used.

Returns:

dict[str, InMemory] –

Dictionary of subdatasets with the specified bounding volume and fields, organized by particle type.

Raises:

ValueError –

If fields are specified that are in neither extras nor dataset.extras.

Sphere ¶

Sphere(center, radius, *, particle_types=None, **kwargs)

Construct a subdataset bounded by a sphere for each particle type

Parameters:

center (NDArray) –

Center point of the sphere
radius (float) –

Radius of the sphere
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types
**kwargs –

See Sphere in ParticleCubes for other arguments. Note that save_particle_type will be ignored and the corresponding particle_type will be used.

Returns:

dict[str, InMemory] –

Dictionary of subdatasets with the specified bounding volume and fields, organized by particle type.

Raises:

ValueError –

If fields are specified that are in neither extras nor dataset.extras.

get_closest_particles ¶

get_closest_particles(*, data, xyz, particle_types=None, distance_upper_bound=None, p=None, k=None, return_shuffle_indices=None, return_sorted=None)

Get the distances and indices of the k particles nearest to a point

Parameters:

data (DataContainer | Dataset) –

Source of particle position data
xyz (NDArray) –

Coordinates of point to check
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types
distance_upper_bound (float | None, default: None ) –

Return only neighbors from other nodes within this distance. This is used for tree pruning, so if you are doing a series of nearest-neighbor queries, it may help to supply the distance to the nearest neighbor of the most recent point.
p (float | None, default: None ) –

Which Minkowski p-norm to use. 1 is the sum of absolute-values distance ("Manhattan" distance). 2 is the usual Euclidean distance. Infinity is the maximum-coordinate-difference distance. Currently, only p=2 is supported.
k (int | None, default: None ) –

Number of closest particles to return. Default 1
return_shuffle_indices (bool | None, default: None ) –

Flag to return the shuffle indices instead of the data indices. Default False.
return_sorted (bool | None, default: None ) –

Flag to return the distances and indices in distance-sorted order. Set to False for a performance boost. Default True

Returns:

distances ( NDArray[float] ) –

Distances to the k nearest neighbors. Has shape (min(N,k),), where N is the number of particles in the sphere bounded by distance_upper_bound
indices ( NDArray[int] ) –

Indices in data of the k nearest neighbors. Has same shape as distances

Raises:

NotImplementedError –

If a p value other than 2 is provided
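For reference, the three Minkowski p-norms named in the p parameter description can be written out directly. This standalone sketch is independent of the library, which per the text above currently supports only p=2; the function name minkowski_distance is an assumption.

```python
import math


def minkowski_distance(a, b, p=2.0):
    """Minkowski p-norm distance between two points of equal dimension."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs)  # maximum-coordinate-difference distance
    return sum(d ** p for d in diffs) ** (1.0 / p)


a, b = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
manhattan = minkowski_distance(a, b, p=1)
euclidean = minkowski_distance(a, b, p=2)
chebyshev = minkowski_distance(a, b, p=math.inf)
```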

get_particle_index_list_in_box ¶

get_particle_index_list_in_box(*, data, box, particle_types=None, strict=True, use_data_indices=True)

Return all particle indices contained within the box

Parameters:

data (DataContainer | Dataset) –

Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase
box (BoxLike) –

Box to check
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types
strict (bool, default: True ) –

Flag to specify whether only particles inside the box will be returned (True, default). If False, additional nearby particles may be included for significantly increased performance
use_data_indices (bool, default: True ) –

Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)

Returns:

indices ( NDArray[int] ) –

List of original particle indices contained within the box

get_particle_index_list_in_sphere ¶

get_particle_index_list_in_sphere(*, data, center, radius, particle_types=None, strict=True, use_data_indices=True)

Return all particles contained within the sphere defined by center and radius

Parameters:

data (DataContainer | Dataset) –

Dataset containing the particle positions. Pass a DataContainer object for a slight performance increase
center (NDArray) –

Center point of the sphere
radius (float) –

Radius of the sphere
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types
strict (bool, default: True ) –

Flag to specify whether only particles inside the sphere will be returned (True, default). If False, additional nearby particles may be included for significantly increased performance
use_data_indices (bool, default: True ) –

Flag to return indices into the sorted dataset (True, default) or into the shuffle list (False)

Returns:

indices ( NDArray[int] ) –

List of original particle indices contained within sphere

get_particle_indices_in_box ¶

get_particle_indices_in_box(box, *, particle_types=None)

Return all particles contained within the box

Parameters:

box (BoxLike) –

Box to check
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types

Returns:

indices ( dict[str, NDArray[int]] ) –

Dictionary of arrays of particle start-stop indices plus partiality flag contained within box, organized by particle type

get_particle_indices_in_sphere ¶

get_particle_indices_in_sphere(center, radius, *, particle_types=None)

Return all particles contained within the sphere defined by center and radius

Parameters:

center (NDArray) –

Center point of the sphere
radius (float) –

Radius of the sphere
particle_types (str | Collection[str] | None, default: None ) –

Particle type(s) to include. Defaults to self.particle_types

Returns:

indices ( dict[str, NDArray[int]] ) –

Dictionary of arrays of particle start-stop indices plus partiality flag contained within sphere, organized by particle type

get_single_cubes ¶

get_single_cubes(particle_type)

Return the ParticleCubes instance corresponding to the specified type.

particle_types `property` ¶

particle_types

Return the list of particle types with cubes

save ¶

save(dataset, *, force_overwrite=False)

Save cubes information to specified file

Parameters:

dataset (str | Path | HDF5Dataset) –

Location to store cubes data.
force_overwrite (bool, default: False ) –

If dataset already contains cubes data, overwrite if True. Default False

Returns:

Path –

Path to the saved cubes information

CubesError ¶

Bases: Exception

Error during cubes creation or traversal

make_MultiCubes ¶

make_MultiCubes(dataset, particle_types, **kwargs)

Make MultiCubes object from dataset even if there is only one particle type

Parameters:

dataset (str | NDArray | MultiParticleDataset) –

The dataset to load or create the MultiCubes from
particle_types (Collection[str] | None) –

Collection of particle types to include. Defaults to all particle types found in the dataset
**kwargs –

Refer to Cubes documentation for a list of all possible arguments

make_cubes ¶

make_cubes(*, dataset, cubes_per_side=-1, cube_box=None, particle_threshold=None, particle_type=None, save_dataset=False, **kwargs)

Create a ParticleCubes from the provided dataset

Parameters:

dataset (MultiParticleDataset) –

The dataset containing particle data. Will be sorted in-place, but will not save updated positional information unless save_dataset is True
cubes_per_side (int, default: -1 ) –

Number of cubes on a side. Dataset will be divided into cubes_per_side**3 cubes, plus an additional cube to catch any remaining particles (if the cube_box is smaller than the actual data extents). Note: due to the PackedTree's packed format, cubes must contain fewer than ~4 billion particles. If cubes_per_side is too small to support this, a ValueError will be raised. The limit is per-particle-type.
cube_box (BoxLike | None, default: None ) –

A box-like object (i.e. something that can convert to a (6,) ndarray) that delineates the region of data to be cubed. Any particles outside this region will fall into an overflow cube. Useful for zoom-in simulations or other datasets with sparse outer regions. Default is the data bounding box.
particle_threshold (int | None, default: None ) –

Maximum number of particles in a tree leaf node. Default is 400
particle_type (str | None, default: None ) –

Particle type to process. Default is dataset.particle_type
save_dataset (bool, default: False ) –

Whether to save the sorted dataset positions out to a file using default values for the parameters. The data will be sorted in memory either way. Default False.

Returns:

ParticleCubes –

The created ParticleCubes object

Raises:

ValueError –

If requested particle type isn't in the dataset or if too few cubes were requested for the number of particles
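The division into cubes_per_side**3 cubes plus one overflow cube can be sketched as a point-to-cube mapping. The flattening order, the treatment of points exactly on the upper boundary, and the choice of index cubes_per_side**3 for the overflow cube are assumptions for illustration; the library may order its cubes differently.

```python
def cube_index(point, box_lo, box_hi, cubes_per_side):
    """Map a point to a flat cube index, or to the overflow cube."""
    overflow = cubes_per_side ** 3  # one extra cube past the regular grid
    ijk = []
    for x, lo, hi in zip(point, box_lo, box_hi):
        if not lo <= x <= hi:
            return overflow  # outside cube_box on this axis
        cell = int((x - lo) / (hi - lo) * cubes_per_side)
        # A point exactly on the upper face belongs to the last cell.
        ijk.append(min(cell, cubes_per_side - 1))
    i, j, k = ijk
    return (i * cubes_per_side + j) * cubes_per_side + k


lo, hi = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
first = cube_index((0.1, 0.1, 0.1), lo, hi, 2)
last = cube_index((0.9, 0.9, 0.9), lo, hi, 2)
middle = cube_index((0.1, 0.9, 0.1), lo, hi, 2)
outside = cube_index((1.5, 0.5, 0.5), lo, hi, 2)
```

With cubes_per_side=2 the regular cubes are numbered 0 through 7, and any particle outside the box lands in cube 8.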
