API Reference 

Raises:

ValueError – If any required arguments are missing.

File Checking

File system validation utilities.

image_analysis_3D.file_utils.file_checking.check_number_of_files(directory, n_files, verbose=False)[source]

Check if the number of files in a directory is equal to a given number.

Parameters:

directory (pathlib.Path) – Specified directory to check file number.
n_files (int) – The expected number of files in the directory.
verbose (bool, optional) – If verbose is True, additional information will be printed.

Returns:

True if the number of files in the directory is equal to the expected number, False otherwise. If False, also returns the name of the directory.

Return type:

tuple[bool, str | None]
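
A minimal sketch of the check described above, written against the standard library only. `count_matches` is a hypothetical stand-in, not the library's implementation; it assumes that only regular files are counted and that the directory name is returned as a string when the count does not match.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def count_matches(directory: Path, n_files: int) -> Tuple[bool, Optional[str]]:
    """Hypothetical stand-in for the documented check (not the library code)."""
    # Assumption: only regular files are counted; subdirectories are ignored.
    found = sum(1 for entry in directory.iterdir() if entry.is_file())
    if found == n_files:
        return True, None
    # On a mismatch, report which directory failed the check.
    return False, directory.name


with tempfile.TemporaryDirectory() as tmp:
    well_dir = Path(tmp)
    for name in ("z0.tif", "z1.tif"):
        (well_dir / name).touch()
    ok_result = count_matches(well_dir, 2)   # counts match
    bad_result = count_matches(well_dir, 3)  # one file short
    expected_name = well_dir.name
```

Returning the directory name alongside the boolean lets a caller collect every failing directory in one pass instead of stopping at the first mismatch.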

File Reading

Helpers for reading imaging files and channel stacks.

image_analysis_3D.file_utils.file_reading.find_files_available(input_dir, image_extensions={'.tif', '.tiff'})[source]

List available image files in a directory.

Parameters:

input_dir (pathlib.Path) – Directory to scan for image files.
image_extensions (set[str], optional) – File extensions to include, by default {“.tif”, “.tiff”}.

Returns:

Sorted list of image file paths.

Return type:

List[str]
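
The listing behavior can be sketched as follows. `list_image_files` is a hypothetical name, and two details are assumptions rather than documented behavior: the scan is non-recursive, and the extension comparison is case-insensitive.

```python
import tempfile
from pathlib import Path


def list_image_files(input_dir, image_extensions=(".tif", ".tiff")):
    """Hypothetical sketch: sorted paths of files with a matching extension."""
    # Assumption: non-recursive scan, case-insensitive suffix comparison.
    return sorted(
        str(path)
        for path in input_dir.iterdir()
        if path.is_file() and path.suffix.lower() in image_extensions
    )


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("c.tiff", "a.tif", "notes.txt"):
        (folder / name).touch()
    found = list_image_files(folder)
    names = [Path(p).name for p in found]  # file names only, for display
```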

image_analysis_3D.file_utils.file_reading.read_in_channels(files, channel_dict={'brightfield': 'TRANS', 'cyto1': '488', 'cyto2': '555', 'cyto3': '640', 'nuclei': '405'}, channels_to_read=[None])[source]

Read z-stack images for each channel token.

Parameters:

files (Iterable[str]) – File paths to search for channel tokens.
channel_dict (dict[str, str], optional) – Mapping of channel name to filename token.
channels_to_read (List[str | None], optional) – Reserved for channel selection; currently unused.

Returns:

Mapping of channel name to loaded image array (or None).

Return type:

dict[str, np.ndarray | None]
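
The sketch below illustrates only the token lookup implied by `channel_dict`; it does not load any image data. `match_channel_files` is a hypothetical helper, and matching by substring on the file name is an assumption. The example file names are invented for illustration.

```python
from pathlib import Path

# Default mapping shown in the signature above.
CHANNEL_DICT = {
    "brightfield": "TRANS",
    "cyto1": "488",
    "cyto2": "555",
    "cyto3": "640",
    "nuclei": "405",
}


def match_channel_files(files, channel_dict):
    """Hypothetical sketch: map each channel name to the first matching file."""
    matched = {}
    for channel, token in channel_dict.items():
        # Assumption: a file belongs to a channel if its name contains the token.
        hits = [f for f in files if token in Path(f).name]
        matched[channel] = hits[0] if hits else None
    return matched


# Invented file names for illustration only.
example_files = ["C4-1_405.tif", "C4-1_488.tif", "C4-1_TRANS.tif"]
matched = match_channel_files(example_files, CHANNEL_DICT)
```

Channels with no matching file map to None, which mirrors the `np.ndarray | None` values in the documented return type.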

image_analysis_3D.file_utils.file_reading.read_zstack_image(file_path)[source]

Reads in a z-stack image from a given file path and returns it as a numpy array.

Parameters:: file_path (str) – The path to the z-stack image file.
Returns:: The z-stack image as a numpy array.
Return type:: np.ndarray
Raises:: ValueError – If the image has fewer than 3 dimensions.

Notebook Initialization

Notebook initialization helpers and Bandicoot path utilities.

image_analysis_3D.file_utils.notebook_init_utils.init_notebook()[source]

Initializes the notebook environment by determining the root directory of the Git repository and checking if the code is running in a Jupyter notebook.

Returns:

pathlib.Path: The root directory of the Git repository.
bool: True if running in a Jupyter notebook, False otherwise.

Return type:

Tuple[pathlib.Path, bool]

image_analysis_3D.file_utils.notebook_init_utils.bandicoot_check(bandicoot_mount_path, root_dir)[source]

This function determines if the external mount point for Bandicoot exists.

Parameters:

bandicoot_mount_path (pathlib.Path) – The path to the Bandicoot mount point.
root_dir (pathlib.Path) – The root directory of the Git repository.

Returns:

The base directory for image data.

Return type:

pathlib.Path

image_analysis_3D.file_utils.notebook_init_utils.avoid_path_crash_bandicoot(bandicoot_path)[source]

This function avoids path crashes by checking if the Bandicoot path exists and setting the raw image directory and output base directory accordingly.

Parameters:: bandicoot_path (pathlib.Path) – The path to the Bandicoot directory.
Returns:: The raw image directory and output base directory.
Return type:: Tuple[pathlib.Path, pathlib.Path]

Preprocessing Functions

Preprocessing utilities for organizing image data.

image_analysis_3D.file_utils.preprocessing_funcs.read_2D_image_for_zstacking(file_path)[source]

Read a 2D image for z-stacking from a file.

Reads in a 2D image from a given file path and returns it as a numpy array.

Parameters:: file_path (str) – The path to the 2D image file.
Returns:: The 2D image as a numpy array.
Return type:: np.ndarray
Raises:: ValueError – If the image has more than 2 dimensions.

image_analysis_3D.file_utils.preprocessing_funcs.get_well_fov_dirs(parent_dir)[source]

Retrieve all well FOV directories in a given parent directory.

Parameters:: parent_dir (pathlib.Path) – Patient parent directory.
Returns:: List of well FOV directories in parent_dir.
Return type:: List[pathlib.Path]

image_analysis_3D.file_utils.preprocessing_funcs.get_to_the_unested_dir(nested_dir, times_nested)[source]

Un-nest the directory, given the number of times the directories are nested.

Parameters:

nested_dir (pathlib.Path) – The parent directory containing the nested dirs
times_nested (int) – The number of times that a dir is nested

Returns:

Path of the least-nested parent directory, or None.

Return type:

pathlib.Path | None

image_analysis_3D.file_utils.preprocessing_funcs.check_well_dir_name_format(dir_name)[source]

Check if a well directory name matches the expected format.

Accepts formats like A1-1, A01-01, A1-1 (60X), or A12-34 with any trailing parenthetical/metadata.

Parameters:: dir_name (str) – The name of the directory to check.
Returns:: True if the directory name matches the expected format, False otherwise.
Return type:: bool
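
One way to express the accepted formats is a regular expression. The pattern below is an assumption inferred from the listed examples (one uppercase row letter, a one- or two-digit column, a hyphen, a one- or two-digit FOV, then optional whitespace-separated metadata); the library may use a different pattern.

```python
import re

# Assumed pattern, inferred from the documented examples only.
WELL_DIR_PATTERN = re.compile(r"[A-Z]\d{1,2}-\d{1,2}(?:\s.*)?")


def looks_like_well_dir(dir_name: str) -> bool:
    """Hypothetical sketch of the format check."""
    return WELL_DIR_PATTERN.fullmatch(dir_name) is not None


results = {
    name: looks_like_well_dir(name)
    for name in ("A1-1", "A01-01", "A1-1 (60X)", "A12-34", "well_1", "A1")
}
```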

Channel Mapping

Channel mapping utilities.

image_analysis_3D.file_utils.read_in_channel_mapping.retrieve_channel_mapping(toml_path)[source]

Read in channel mapping from a TOML file.

Parameters:: toml_path (str) – Path to the TOML file.
Returns:: Dictionary containing the channel mapping.
Return type:: dict

Segmentation Decoupling (File Utilities)

Utilities for decoupling and merging segmentation masks.

image_analysis_3D.file_utils.segmentation_decoupling.euclidian_2D_distance(coord_set_1, coord_set_2)[source]

This function calculates the Euclidean distance between two sets of 2D coordinates:

sqrt((x1 - x2)^2 + (y1 - y2)^2)

Parameters:

coord_set_1 (tuple) – The first set of coordinates (x, y)
coord_set_2 (tuple) – The second set of coordinates (x, y)

Returns:

The Euclidean distance between the two sets of coordinates

Return type:

image_analysis_3D.file_utils.segmentation_decoupling.check_coordinate_inside_box(coord, box)[source]

This function checks if a coordinate is inside a box

Parameters:

coord (tuple) – The coordinate to check (y, x)
box (tuple) – The box to check against [y_min, x_min, y_max, x_max]

Returns:

True if the coordinate is inside the box, False otherwise

Return type:

bool

image_analysis_3D.file_utils.segmentation_decoupling.get_larger_bbox(bbox1, bbox2)[source]

This function returns the larger of two bounding boxes

Parameters:

bbox1 (tuple) – The first bounding box [y_min, x_min, y_max, x_max]
bbox2 (tuple) – The second bounding box [y_min, x_min, y_max, x_max]

Returns:

A tuple of the larger bounding box [y_min, x_min, y_max, x_max]

Return type:

tuple
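
The reference does not say how "larger" is measured. The sketch below assumes it means larger area and that ties go to the first box; both `bbox_area` and `larger_bbox` are hypothetical names used for illustration.

```python
def bbox_area(bbox):
    """Area of a box given as (y_min, x_min, y_max, x_max)."""
    y_min, x_min, y_max, x_max = bbox
    return (y_max - y_min) * (x_max - x_min)


def larger_bbox(bbox1, bbox2):
    """Hypothetical sketch: return whichever box covers more area."""
    # Assumption: on a tie, the first box is returned.
    return bbox1 if bbox_area(bbox1) >= bbox_area(bbox2) else bbox2


chosen = larger_bbox((0, 0, 10, 10), (2, 2, 5, 5))
```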

image_analysis_3D.file_utils.segmentation_decoupling.extract_unique_masks(image_stack)[source]

This function extracts unique masks from an image stack

Parameters:: image_stack (np.ndarray) – The image stack to extract unique masks from
Returns:: The dataframe containing the unique masks
Return type:: pd.DataFrame

image_analysis_3D.file_utils.segmentation_decoupling.compare_masks_for_merged(df, index1, index2, distance_threshold=10)[source]

This function compares masks for merging

Parameters:

df (pd.DataFrame) – The dataframe containing the masks
index1 (int) – Index 1
index2 (int) – Index 2
distance_threshold (int, optional) – The distance threshold, by default 10

Returns:

The dataframe containing the masks for merging

Return type:

pd.DataFrame

image_analysis_3D.file_utils.segmentation_decoupling.get_combinations_of_indices(df, distance_threshold=10)[source]

This function gets the combinations of indices

Parameters:

df (pd.DataFrame) – The dataframe containing the masks
distance_threshold (int, optional) – The distance threshold, by default 10

Returns:

The dataframe containing the combinations of indices

Return type:

pd.DataFrame

image_analysis_3D.file_utils.segmentation_decoupling.merge_sets(list_of_sets)[source]

Merge overlapping sets in-place and count merges.

Parameters:: list_of_sets (list[set[int]]) – Sets of integer labels to merge.
Returns:: Updated list of sets and the number of merges performed.
Return type:: tuple[list[set[int]], int]
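
A minimal sketch of in-place set merging with a merge counter. `merge_overlapping` is a hypothetical name. This version keeps merging until no two sets overlap; the library function may instead perform a single pass and rely on the returned count so that callers can repeat the call until it reaches zero.

```python
def merge_overlapping(list_of_sets):
    """Hypothetical sketch: merge sets that share any element, in place."""
    merges = 0
    i = 0
    while i < len(list_of_sets):
        j = i + 1
        while j < len(list_of_sets):
            if list_of_sets[i] & list_of_sets[j]:
                # Absorb set j into set i and drop it from the list.
                list_of_sets[i] |= list_of_sets.pop(j)
                merges += 1
                # Restart the scan: the grown set may overlap sets skipped earlier.
                j = i + 1
            else:
                j += 1
        i += 1
    return list_of_sets, merges


label_sets = [{1, 2}, {3, 4}, {2, 3}, {7}]
merged, n_merges = merge_overlapping(label_sets)
```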

image_analysis_3D.file_utils.segmentation_decoupling.merge_sets_df(merged_df)[source]

This function merges the sets of masks

Parameters:: merged_df (pd.DataFrame) – The dataframe containing the masks
Returns:: The dataframe containing the merged masks
Return type:: pd.DataFrame

image_analysis_3D.file_utils.segmentation_decoupling.reassemble_each_mask(df, original_img_shape)[source]

This function reassembles the masks from the dataframe

Parameters:

df (pd.DataFrame) – The dataframe containing the masks
original_img_shape (tuple) – The shape of the original image

Returns:

The reassembled masks

Return type:

np.ndarray

image_analysis_3D.file_utils.segmentation_decoupling.get_dimensionality(image_array)[source]

This function returns the dimensionality of an image array while checking if the input is a numpy array

Parameters:: image_array (np.ndarray) – The image array to check the dimensionality of
Returns:: The dimensionality of the image array
Return type:: int
Raises:: TypeError – If the input is not a numpy array

image_analysis_3D.file_utils.segmentation_decoupling.get_number_of_unique_labels(image_array)[source]

This function returns the number of unique labels in an image array

Parameters:: image_array (np.ndarray) – The image array to check the number of unique labels
Returns:: The number of unique labels in the image array
Return type:: int

Errors

Defines a custom exception raised when the requested number of workers exceeds what the machine provides.

exception image_analysis_3D.errors.exceptions.MaxWorkerError[source]: Raised when the number of workers assigned to max_workers exceeds the number of CPU/workers on the machine.
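
A sketch of how such an exception might be raised. The class below is a local stand-in so the example runs without the package installed, and `validate_max_workers` is a hypothetical helper; using `os.cpu_count()` as the limit is an assumption.

```python
import os


class MaxWorkerError(Exception):
    """Local stand-in for image_analysis_3D.errors.exceptions.MaxWorkerError."""


def validate_max_workers(max_workers: int) -> int:
    """Hypothetical helper: reject worker counts above the CPU count."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    if max_workers > available:
        raise MaxWorkerError(
            f"max_workers={max_workers} exceeds the {available} CPUs available"
        )
    return max_workers


accepted = validate_max_workers(1)
try:
    validate_max_workers(10**6)  # far more workers than any machine offers
    rejected = False
except MaxWorkerError:
    rejected = True
```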

Segmentation Utilities

General Segmentation

image_analysis_3D.segmentation_utils.general_segmentation_utils.sliding_window_two_point_five_D(image_stack, window_size)[source]

Create 2.5D max-projection stack using a sliding window.

Parameters:

image_stack (np.ndarray) – Input 3D image stack (Z, Y, X).
window_size (int) – Number of slices to project per window.

Returns:

2.5D stack of max projections.

Return type:

np.ndarray
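
The sliding-window projection can be sketched with NumPy as follows. `sliding_max_projection` is a hypothetical name; the stride of one slice and the absence of padding (so the output has Z - window_size + 1 slices) are assumptions, since the reference does not specify edge handling.

```python
import numpy as np


def sliding_max_projection(image_stack, window_size):
    """Hypothetical sketch: max-project each run of `window_size` slices."""
    # Assumption: stride of 1 and no padding at the ends of the stack.
    n_windows = image_stack.shape[0] - window_size + 1
    if n_windows < 1:
        raise ValueError("window_size exceeds the number of z-slices")
    return np.stack(
        [image_stack[z : z + window_size].max(axis=0) for z in range(n_windows)]
    )


stack = np.arange(4 * 2 * 2).reshape(4, 2, 2)  # toy (Z, Y, X) stack
projected = sliding_max_projection(stack, window_size=2)
```

Because the toy values increase with z, each projected slice equals the deeper slice of its window.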

image_analysis_3D.segmentation_utils.general_segmentation_utils.reverse_sliding_window_max_projection(output_dict, window_size, original_z_slice_count)[source]

Reconstruct per-slice masks from sliding-window projections.

Parameters:

output_dict (dict) – Output dictionary with projected labels.
window_size (int) – Sliding window size used during projection.
original_z_slice_count (int) – Number of slices in the original stack.

Returns:

Mapping of slice index to list of reconstructed masks.

Return type:

image_analysis_3D.segmentation_utils.general_segmentation_utils.butterworth_grid_optimization(img, return_plot=False)[source]

Sweep Butterworth parameters and optionally plot results.

Parameters:

img (np.ndarray) – Image stack used for optimization.
return_plot (bool, optional) – Whether to display a parameter grid plot, by default False.

Returns:

Only displays plots when requested.

Return type:

None

image_analysis_3D.segmentation_utils.general_segmentation_utils.apply_butterworth_filter(img, cutoff_frequency_ratio=0.05, order=1, high_pass=False, squared_butterworth=True)[source]

Apply a Butterworth filter and Gaussian smoothing.

Parameters:

img (np.ndarray) – Input image stack to filter.
cutoff_frequency_ratio (float, optional) – Cutoff frequency ratio, by default 0.05.
order (int, optional) – Butterworth filter order, by default 1.
high_pass (bool, optional) – Whether to use a high-pass filter, by default False.
squared_butterworth (bool, optional) – Use squared Butterworth response, by default True.

Returns:

Filtered image stack.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.decouple_masks(reconstruction_dict, original_img_shape, distance_threshold, verbose=False)[source]

Decouple projected masks into per-slice masks.

Parameters:

reconstruction_dict (dict) – Mapping of slice index to projected masks.
original_img_shape (np.ndarray) – Shape of the original image stack.
distance_threshold (int) – Distance threshold for mask merging.
verbose (bool, optional) – Whether to print warnings, by default False.

Returns:

Mapping of slice index to reassembled masks.

Return type:

image_analysis_3D.segmentation_utils.general_segmentation_utils.generate_coordinates_for_reconstruction(image)[source]

Generate centroid and bounding-box coordinates for reconstruction.

Parameters:: image (np.ndarray) – Labeled mask stack (Z, Y, X).
Returns:: DataFrame of labels, centroids, and bounding boxes.
Return type:: pd.DataFrame

image_analysis_3D.segmentation_utils.general_segmentation_utils.generate_distance_pairs(coordinates_df, x_y_vector_radius_max_constraint)[source]

Create distance pairs for centroid matching across slices.

Parameters:

coordinates_df (pd.DataFrame) – DataFrame containing centroid coordinates.
x_y_vector_radius_max_constraint (int) – Maximum centroid distance to include.

Returns:

Pairwise centroid distances within the constraint.

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.general_segmentation_utils.calculate_mask_iou(mask1, mask2)[source]

Calculate the Intersection over Union (IoU) between two binary masks.

Parameters:

mask1 (np.ndarray) – The first binary mask.
mask2 (np.ndarray) – The second binary mask.

Returns:

True if the IoU is greater than 0.5, False otherwise.

Return type:

bool

image_analysis_3D.segmentation_utils.general_segmentation_utils.graph_creation(df)[source]

Build a graph connecting centroid pairs.

Parameters:: df (pd.DataFrame) – DataFrame of centroid pairs and distances.
Returns:: Graph with centroid nodes and distance edges.
Return type:: networkx.Graph

image_analysis_3D.segmentation_utils.general_segmentation_utils.solve_graph(G)[source]

Solve for longest shortest paths in a graph.

Parameters:: G (networkx.Graph) – Graph of centroid connections.
Returns:: Longest paths discovered in the graph.
Return type:: list[list[int]]

image_analysis_3D.segmentation_utils.general_segmentation_utils.merge_sets(list_of_sets)[source]

Merge overlapping sets of node indices.

Parameters:: list_of_sets (list) – List of set objects to merge.
Returns:: Updated list of merged sets.
Return type:: list

image_analysis_3D.segmentation_utils.general_segmentation_utils.collapse_labels(df, longest_paths)[source]

Collapse labels using graph paths.

Parameters:

df (pd.DataFrame) – DataFrame containing unique IDs.
longest_paths (list) – List of paths from graph solution.

Returns:

Updated DataFrame with collapsed labels.

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.general_segmentation_utils.reassign_labels(image, df)[source]

Reassign labels in a mask based on a mapping DataFrame.

Parameters:

image (np.ndarray) – Mask stack to relabel.
df (pd.DataFrame) – DataFrame containing label mappings by slice.

Returns:

Relabeled mask stack.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.calculate_bbox_area(bbox)[source]

Calculate the area of a bounding box.

Parameters:: bbox (Tuple[int, int, int, int]) – The bounding box coordinates in the format (x_min, y_min, x_max, y_max).
Returns:: The area of the bounding box.
Return type:: int

image_analysis_3D.segmentation_utils.general_segmentation_utils.calculate_overlap(bbox1, bbox2)[source]

Calculate overlap percentage between two bounding boxes.

Parameters:

bbox1 (Tuple[int, int, int, int]) – First bounding box.
bbox2 (Tuple[int, int, int, int]) – Second bounding box.

Returns:

Overlap percentage relative to the smaller box.

Return type:

image_analysis_3D.segmentation_utils.general_segmentation_utils.check_for_all_same_labels(object_information_df)[source]

Check if all labels in the object information DataFrame are the same.

Parameters:: object_information_df (pd.DataFrame) – The DataFrame containing object information with ‘label’ column.
Returns:: True if all labels are the same, False otherwise.
Return type:: bool

image_analysis_3D.segmentation_utils.general_segmentation_utils.missing_slice_check(object_information_df, window_min=0, window_max=2, interpolated_rows_to_add=[])[source]

Check for missing slices in the object information DataFrame and add interpolated rows if necessary.

Parameters:

object_information_df (pd.DataFrame) – The DataFrame containing object information with ‘z’ and ‘label’ columns.
window_min (int, optional) – The minimum window size for checking missing slices, by default 0
window_max (int, optional) – The maximum window size for checking missing slices, by default 2
interpolated_rows_to_add (List[int], optional) – A list to store rows to be added for interpolation, by default []

Returns:

A list of DataFrames containing rows to be added for interpolation.

Return type:

List[pd.DataFrame]

image_analysis_3D.segmentation_utils.general_segmentation_utils.add_min_max_boundry_slices(object_information_df, global_min_z, global_max_z, interpolated_rows_to_add=[])[source]

Add slices to the object information DataFrame that are one slice away from the global min and max z slices.

Parameters:

object_information_df (pd.DataFrame) – The DataFrame containing object information with ‘z’ and ‘label’ columns.
global_min_z (int) – The global minimum z slice.
global_max_z (int) – The global maximum z slice.
interpolated_rows_to_add (List[pd.DataFrame], optional) – A list to store rows to be added for interpolation, by default []

Returns:

A list of DataFrames containing rows to be added for interpolation at the min and max z slices.

Return type:

List[pd.DataFrame]

image_analysis_3D.segmentation_utils.general_segmentation_utils.add_masks_where_missing(new_mask_image, interpolated_rows_to_add_df)[source]

Add masks to the new mask image where the slices are missing based on the interpolated rows.

Parameters:

new_mask_image (np.ndarray) – The new mask image to which the slices will be added.
interpolated_rows_to_add_df (pd.DataFrame) – The DataFrame containing the rows to be added for interpolation, with columns ‘added_z’, ‘added_new_label’, and ‘zslice_to_copy’.

Returns:

The new mask image with the added slices.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.reorder_organoid_labels(label_image)[source]

Reorder the labels in the label image to ensure they are sequential starting from 1.

Parameters:: label_image (np.ndarray) – The label image where labels need to be reordered.
Returns:: The label image with reordered labels.
Return type:: np.ndarray
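
A minimal sketch of sequential relabeling. `relabel_sequential_sketch` is a hypothetical name; it assumes that 0 is background and stays 0, and that new labels follow the ascending order of the old ones.

```python
import numpy as np


def relabel_sequential_sketch(label_image):
    """Hypothetical sketch: map nonzero labels to 1..N in ascending order."""
    relabeled = np.zeros_like(label_image)
    # Assumption: label 0 is background and is left untouched.
    old_labels = [value for value in np.unique(label_image) if value != 0]
    for new_label, old_label in enumerate(old_labels, start=1):
        relabeled[label_image == old_label] = new_label
    return relabeled


labels = np.array([[0, 5, 5], [9, 0, 2]])
relabeled = relabel_sequential_sketch(labels)
```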

image_analysis_3D.segmentation_utils.general_segmentation_utils.run_post_hoc_refinement(mask_image, sliding_window_context)[source]

Refine labels by interpolating across missing slices.

Parameters:

mask_image (List[int]) – 3D labeled mask stack.
sliding_window_context (int) – Number of slices to consider in the sliding context.

Returns:

Refined mask stack.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.segment_cells_with_3D_watershed(cyto_signal, nuclei_mask)[source]

Segment cells using seeded 3D watershed.

Parameters:

cyto_signal (np.ndarray) – Cytoplasm signal image stack.
nuclei_mask (np.ndarray) – Nuclei mask used as watershed seeds.

Returns:

Cell segmentation mask.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.remove_edge_cases(mask, border=10)[source]

Remove masks that touch the image edge. Here, "edge" means the literal border of the image. This is useful for removing masks that are not fully contained within the image.

Parameters:

mask (np.ndarray) – The mask to process, should be a 3D numpy array
border (int, optional) – Width, in pixels, of the border region scanned for edge cases, by default 10

Returns:

The mask with edge cases removed

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.centroid_within_bbox_detection(centroid, bbox)[source]

Check if the centroid is within the bbox

Parameters:

centroid (tuple) – Centroid of the object, in (z, y, x) order. The order matters.
bbox (tuple) – Bounding box, in (z_min, y_min, x_min, z_max, y_max, x_max) order. The order matters.

Returns:

True if the centroid is within the bbox, False otherwise

Return type:

bool

image_analysis_3D.segmentation_utils.general_segmentation_utils.check_if_centroid_within_mask(centroid, mask, label)[source]

Check if the centroid is within the mask

Parameters:

centroid (tuple) – Centroid of the object, in (z, y, x) order. The order matters.
mask (np.ndarray) – The mask to check against
label (int)

Returns:

True if the centroid is within the mask, False otherwise

Return type:

bool

image_analysis_3D.segmentation_utils.general_segmentation_utils.get_labels_for_post_hoc_reassignment(compartment_mask, compartment_name)[source]

Collect centroid and bbox data for mask relabeling.

Parameters:

compartment_mask (np.ndarray) – Labeled mask for a compartment.
compartment_name (str) – Name of the compartment.

Returns:

DataFrame of centroids, bboxes, and labels.

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.general_segmentation_utils.mask_label_reassignment(mask_df, mask_input)[source]

Reassign the labels of the mask based on the mask_df

Parameters:

mask_df (pd.DataFrame) – DataFrame containing the labels and centroids of the mask
mask_input (np.ndarray) – The input mask to reassign the labels to

Returns:

The mask with reassigned labels

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.run_post_hoc_mask_reassignment(nuclei_mask, cell_mask, nuclei_df, cell_df, return_dataframe=False)[source]

Reassign nuclei labels based on cell containment.

Parameters:

nuclei_mask (np.ndarray) – Nuclei segmentation mask.
cell_mask (np.ndarray) – Cell segmentation mask.
nuclei_df (pd.DataFrame) – DataFrame with nuclei centroids and labels.
cell_df (pd.DataFrame) – DataFrame with cell centroids and labels.
return_dataframe (bool, optional) – Whether to return the merged DataFrame, by default False.

Returns:

Updated nuclei mask and optional merged DataFrame.

Return type:

tuple[np.ndarray, pd.DataFrame | None]

image_analysis_3D.segmentation_utils.general_segmentation_utils.create_cytoplasm_masks(nuclei_masks, cell_masks)[source]

Create cytoplasm masks by subtracting nuclei from cells.

Parameters:

nuclei_masks (np.ndarray) – Nuclei segmentation masks.
cell_masks (np.ndarray) – Cell segmentation masks.

Returns:

Cytoplasm masks.

Return type:

np.ndarray
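
The subtraction can be sketched as masking out nuclear pixels from the cell labels. This assumes both masks share the same shape and that any nonzero nuclei pixel should be removed from the cell mask; the library may apply additional label-matching logic that is not described here.

```python
import numpy as np

# Toy 2D example; the same expression works on (Z, Y, X) stacks.
cell_masks = np.array([[1, 1, 1], [2, 2, 0]])
nuclei_masks = np.array([[0, 1, 0], [0, 2, 0]])

# Assumption: every nonzero nuclei pixel is zeroed out in the cell mask.
cytoplasm_masks = np.where(nuclei_masks > 0, 0, cell_masks)
```

The remaining pixels keep their cell label, so each cytoplasm region stays associated with its parent cell.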

image_analysis_3D.segmentation_utils.general_segmentation_utils.clean_border_objects(segmentation, border_width=20)[source]

Remove objects touching the segmentation border.

Parameters:

segmentation (np.ndarray) – Labeled segmentation mask.
border_width (int, optional) – Width of the border region, by default 20.

Returns:

Cleaned segmentation mask.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.general_segmentation_utils.remove_label_id(mask_image, label_id_to_remove)[source]

Remove the label id

Parameters:

mask_image (np.ndarray) – Mask image from which to remove the label id
label_id_to_remove (int) – Label id to remove from the mask image

Returns:

Mask image with the label id removed

Return type:

np.ndarray

Nuclei Segmentation

Nuclei segmentation using Cellpose, applied slice by slice in 2D.

image_analysis_3D.segmentation_utils.nuclei_segmentation.segmentaion_on_two_D(imgs, diameter=None)[source]

Run 2D Cellpose segmentation slice-by-slice.

Parameters:

imgs (np.ndarray) – 3D image stack (Z, Y, X) to segment.
diameter (int, optional) – Approximate diameter of objects to segment (in pixels). If None, Cellpose will estimate it automatically.

Returns:

Dictionary containing slice indices, labels, and details.

Return type:

dict

image_analysis_3D.segmentation_utils.nuclei_segmentation.build_complete_bipartite_graph(input_masks, distance_threshold=None)[source]

Build a complete bipartite graph from 2D segmentation masks.

For each pair of consecutive slices, connect EVERY object in slice N to EVERY object in slice N+1, computing Euclidean distance between centroids.

Parameters:

input_masks (list) – List of 2D segmentation masks (numpy arrays).
distance_threshold (float or None, optional) – Optional maximum distance to include edges. If None, ALL pairs are connected (truly complete).

Returns:

G (networkx.Graph) – NetworkX graph.
df (pandas.DataFrame) – DataFrame with all edges.

Return type:

tuple[networkx.Graph, pandas.DataFrame]
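
The edge construction can be sketched without NetworkX or pandas. In this simplified version, `complete_bipartite_edges` is a hypothetical helper that takes precomputed centroids per slice (rather than masks) and returns plain tuples (rather than a graph and a DataFrame). Treating the threshold as inclusive is an assumption.

```python
import math
from itertools import product


def complete_bipartite_edges(centroids_per_slice, distance_threshold=None):
    """Hypothetical sketch: connect every object in slice z to every object in z + 1."""
    edges = []
    for z in range(len(centroids_per_slice) - 1):
        upper, lower = centroids_per_slice[z], centroids_per_slice[z + 1]
        for (label_a, point_a), (label_b, point_b) in product(
            upper.items(), lower.items()
        ):
            distance = math.dist(point_a, point_b)  # Euclidean centroid distance
            # Assumption: the threshold, when given, is inclusive.
            if distance_threshold is None or distance <= distance_threshold:
                edges.append(((z, label_a), (z + 1, label_b), distance))
    return edges


# One dict per slice: label -> (y, x) centroid. Values are invented.
centroids = [{1: (0.0, 0.0), 2: (10.0, 10.0)}, {1: (3.0, 4.0)}]
all_edges = complete_bipartite_edges(centroids)
close_edges = complete_bipartite_edges(centroids, distance_threshold=6)
```

Nodes are keyed by (slice index, label) so that the same label number in two slices remains two distinct nodes.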

image_analysis_3D.segmentation_utils.nuclei_segmentation.solve_graph_improved(G, max_distance=100.0, slices=None, verbose=False)[source]

Solve bipartite matching across consecutive slices using greedy matching based on edge weights (distances).

Instead of Hungarian algorithm (which can force bad matches), greedily match objects in order of smallest distance. This ensures we only connect objects that are genuinely close.

Parameters:

G (networkx.Graph) – NetworkX graph with edges between different slices.
max_distance (float, optional) – Maximum distance to accept a match.
slices (list or None, optional) – Optional list of slice numbers to process.
verbose (bool, optional) – Print debugging info.

Returns:

List of paths, where each path is a list of node IDs.

Return type:

list
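
A minimal sketch of the greedy idea for a single pair of consecutive slices. `greedy_match` is a hypothetical helper operating on (node_a, node_b, distance) tuples; the library function works on a NetworkX graph and returns full paths, which this sketch does not attempt.

```python
def greedy_match(edges, max_distance=100.0):
    """Hypothetical sketch: accept the closest pairs first, one match per node."""
    used_a, used_b, accepted = set(), set(), []
    for node_a, node_b, distance in sorted(edges, key=lambda edge: edge[2]):
        if distance > max_distance:
            break  # edges are sorted, so every remaining edge is too far
        if node_a in used_a or node_b in used_b:
            continue  # one of the two objects already has a partner
        used_a.add(node_a)
        used_b.add(node_b)
        accepted.append((node_a, node_b))
    return accepted


# Invented distances between objects in slice N (a*) and slice N+1 (b*).
edges = [
    ("a1", "b1", 2.0),
    ("a1", "b2", 3.0),
    ("a2", "b1", 2.5),
    ("a2", "b2", 150.0),
]
matches = greedy_match(edges)
```

Here `a2` is left unmatched because its only free partner is farther than `max_distance`, which is the behavior the description contrasts with a forced assignment.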

image_analysis_3D.segmentation_utils.nuclei_segmentation.split_long_trajectories(paths, max_length)[source]

Split trajectories that exceed max_length into shorter ones.

Parameters:

paths (list) – List of node paths.
max_length (int) – Maximum number of consecutive nodes per trajectory.

Returns:

List of split trajectories.

Return type:

list
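
Splitting can be sketched as chunking each path from its start. `split_paths` is a hypothetical name, and the chunking strategy is an assumption; the library may choose split points differently.

```python
def split_paths(paths, max_length):
    """Hypothetical sketch: cut each path into chunks of at most `max_length` nodes."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    return [
        path[start : start + max_length]
        for path in paths
        for start in range(0, len(path), max_length)
    ]


split = split_paths([[1, 2, 3, 4, 5], [6, 7]], max_length=2)
```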

image_analysis_3D.segmentation_utils.nuclei_segmentation.collapse_labels_from_paths(input_masks, paths)[source]

Assign unified labels based on trajectories.

Parameters:

input_masks (list) – List of 2D masks.
paths (list) – List of node paths from solve_graph_improved.

Returns:

List of 3D masks with unified labels.

Return type:

list

image_analysis_3D.segmentation_utils.nuclei_segmentation.stack_3d_segmentation(relabeled_masks)[source]

Stack 2D relabeled masks into 3D volume.

Parameters:: relabeled_masks (list[numpy.ndarray])
Return type:: numpy.ndarray

image_analysis_3D.segmentation_utils.nuclei_segmentation.remove_single_slice_objects(segmentation_mask)[source]

Remove objects that only appear in a single z-slice.

Parameters:: segmentation_mask (numpy.ndarray) – 3D segmentation mask array.
Returns:: Cleaned 3D segmentation with single-slice objects removed.
Return type:: numpy.ndarray

image_analysis_3D.segmentation_utils.nuclei_segmentation.fill_object_gaps(segmentation_mask, max_gap_size=2)[source]

Fill gaps in object trajectories (missing slices between appearances).

For example, if object ID 5 appears in slices [10, 11, 14, 15], the gap between 11 and 14 will be filled if gap_size <= max_gap_size.
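
The gap rule in the example above can be sketched on slice indices alone. `gaps_to_fill` is a hypothetical helper that only identifies which slices would be filled; how the filled slices obtain their mask content is not stated in this reference and is not modeled here.

```python
def gaps_to_fill(slices, max_gap_size=2):
    """Hypothetical sketch: slice indices missing inside small enough gaps."""
    ordered = sorted(slices)
    to_fill = []
    for previous, current in zip(ordered, ordered[1:]):
        gap_size = current - previous - 1  # number of missing slices in between
        if 0 < gap_size <= max_gap_size:
            to_fill.extend(range(previous + 1, current))
    return to_fill


filled = gaps_to_fill([10, 11, 14, 15], max_gap_size=2)
skipped = gaps_to_fill([10, 11, 14, 15], max_gap_size=1)
```

With the default `max_gap_size=2`, the two missing slices between 11 and 14 qualify; with `max_gap_size=1` the same gap is too large and nothing is filled.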

Parameters:

segmentation_mask (numpy.ndarray) – 3D segmentation mask array.
max_gap_size (int, optional) – Maximum number of consecutive missing slices to fill (default: 2, meaning fill gaps of 1-2 slices).

Returns:

Filled 3D segmentation.

Return type:

numpy.ndarray

image_analysis_3D.segmentation_utils.nuclei_segmentation.postprocess_segmentation(segmentation_mask, remove_singletons=True, fill_gaps=True, max_gap_size=2)[source]

Post-process 3D segmentation to clean up artifacts.

Parameters:

segmentation_mask (numpy.ndarray) – 3D segmentation array.
remove_singletons (bool, optional) – If True, remove objects that only appear in 1 slice.
fill_gaps (bool, optional) – If True, fill small gaps in object trajectories.
max_gap_size (int, optional) – Maximum gap size to fill (only used if fill_gaps=True).

Returns:

Cleaned 3D segmentation.

Return type:

numpy.ndarray

image_analysis_3D.segmentation_utils.nuclei_segmentation.object_stitching_and_relation(input_masks, max_match_distance=100.0, max_trajectory_length=None, verbose=False)[source]

Complete pipeline: build complete bipartite graph -> solve matching -> relabel.

Parameters:

input_masks (list) – List of 2D segmentation masks.
max_match_distance (float, optional) – Maximum distance to accept a match (in pixels).
max_trajectory_length (int or None, optional) – Optional maximum number of consecutive slices an object can span. If None, no limit. Use to prevent unrealistically tall objects (e.g., set to 10 if cells should not span more than 10 slices).
verbose (bool, optional) – Print diagnostics.

Returns:

segmentation_mask (numpy.ndarray) – 3D array with unified instance labels across slices.
diagnostics (dict) – Dict with stats about the matching.

Return type:

tuple[numpy.ndarray, dict]

Cell Segmentation

Cell segmentation in 3D.

image_analysis_3D.segmentation_utils.cell_segmentation.fill_holes_in_mask(mask, compartment=None)[source]

This function fills holes in instance segmented mask images

Parameters:

mask (np.ndarray) – 3D instance segmented mask image where each cell has a unique integer label and background is 0
compartment (str, optional) – Compartment type of the mask (e.g. “cell” or “organoid”), by default None. This is used to determine the hole filling strategy.

Raises:

ValueError – If compartment is not specified, because the hole-filling strategy depends on the compartment type.

Returns:

3D instance segmented mask image with holes filled

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.cell_segmentation.segment_cells_with_3D_watershed(cyto_signal, nuclei_mask, thresholded_signal, connectivity=1, compactness=0)[source]

Segment cells using 3D watershed algorithm.

Segments cells using a 3D watershed algorithm given cytoplasm signal (channel) and nuclei mask.

Parameters:

cyto_signal (np.ndarray) – 3D numpy array representing the cytoplasm signal.
nuclei_mask (np.ndarray) – 3D numpy array representing the nuclei mask.
thresholded_signal (np.ndarray) – 3D numpy array representing the thresholded cytoplasm signal to be used as a mask for watershed.
connectivity (int, optional) – Connectivity parameter for the watershed algorithm. Default is 1. A value of 1 means only directly adjacent pixels (6-connectivity in 3D) are considered connected, preventing over-segmentation.
compactness (float, optional) – Compactness parameter controlling watershed region shape. Default is 0. A value of 0 means no compactness enforcement, allowing irregularly shaped segments to capture true cell morphology.

Returns:

3D numpy array representing the segmented cell mask.

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.cell_segmentation.perform_morphology_dependent_segmentation(organoid_label, cyto_signal, nuclei_mask, min_size=1000, max_size=10000000)[source]

Perform morphology-dependent cell segmentation.

Performs morphology dependent segmentation based on the provided morphology label.

Parameters:

organoid_label (str) – Morphology label indicating the type of morphology.
cyto_signal (np.ndarray) – 3D numpy array representing the cytoplasm signal.
nuclei_mask (np.ndarray) – 3D numpy array representing the nuclei mask.
min_size (int, optional) – Minimum size threshold for segmented objects. Default is 1,000 voxels.
max_size (int, optional) – Maximum size threshold for segmented objects. Default is 10,000,000 voxels.

Returns:

3D numpy array representing the segmented cell mask.

Return type:

np.ndarray

Segmentation Decoupling

Utilities for decoupling and merging segmentation masks.

image_analysis_3D.segmentation_utils.segmentation_decoupling.euclidian_2D_distance(coord_set_1, coord_set_2)[source]

This function calculates the Euclidean distance between two sets of 2D coordinates:

sqrt((x1 - x2)^2 + (y1 - y2)^2)
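The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the package's implementation; the helper name euclidean_2d is hypothetical, and the (x, y) ordering is assumed from the parameter descriptions.

```python
import math


def euclidean_2d(coord_set_1, coord_set_2):
    """Distance between two (x, y) points; hypothetical helper for illustration."""
    x1, y1 = coord_set_1
    x2, y2 = coord_set_2
    # math.hypot computes sqrt(dx^2 + dy^2) without intermediate overflow
    return math.hypot(x1 - x2, y1 - y2)


distance = euclidean_2d((0, 0), (3, 4))  # a 3-4-5 right triangle
```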

Parameters:

coord_set_1 (tuple) – The first set of coordinates (x, y)
coord_set_2 (tuple) – The second set of coordinates (x, y)

Returns:

The Euclidean distance between the two sets of coordinates

Return type:

image_analysis_3D.segmentation_utils.segmentation_decoupling.check_coordinate_inside_box(coord, box)[source]

This function checks if a coordinate is inside a box

Parameters:

coord (tuple) – The coordinate to check (y, x)
box (tuple) – The box to check against [y_min, x_min, y_max, x_max]

Returns:

True if the coordinate is inside the box, False otherwise

Return type:

image_analysis_3D.segmentation_utils.segmentation_decoupling.get_larger_bbox(bbox1, bbox2)[source]

This function returns the larger of two bounding boxes

Parameters:

bbox1 (tuple) – The first bounding box [y_min, x_min, y_max, x_max]
bbox2 (tuple) – The second bounding box [y_min, x_min, y_max, x_max]

Returns:

A tuple of the larger bounding box [y_min, x_min, y_max, x_max]

Return type:

tuple

image_analysis_3D.segmentation_utils.segmentation_decoupling.extract_unique_masks(image_stack)[source]

This function extracts unique masks from an image stack

Parameters:

image_stack (np.ndarray) – The image stack to extract unique masks from

Returns:

The dataframe containing the unique masks

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.segmentation_decoupling.compare_masks_for_merged(df, index1, index2, distance_threshold=10)[source]

This function compares masks for merging

Parameters:

df (pd.DataFrame) – The dataframe containing the masks
index1 (int) – Index of the first mask in df to compare
index2 (int) – Index of the second mask in df to compare
distance_threshold (int, optional) – The distance threshold, by default 10

Returns:

The dataframe containing the masks for merging

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.segmentation_decoupling.get_combinations_of_indices(df, distance_threshold=10)[source]

This function gets the combinations of indices

Parameters:

df (pd.DataFrame) – The dataframe containing the masks
distance_threshold (int, optional) – The distance threshold, by default 10

Returns:

The dataframe containing the combinations of indices

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.segmentation_decoupling.merge_sets(list_of_sets)[source]

Merge overlapping sets in-place and count merges.

Parameters:

list_of_sets (list[set[int]]) – Sets of integer labels to merge.

Returns:

Updated list of sets and the number of merges performed.

Return type:

tuple[list[set[int]], int]
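
The merging behaviour described for merge_sets can be sketched as follows. This is an assumption-laden illustration: the helper name merge_overlapping_sets is hypothetical, it works on copies rather than in-place, and it counts one merge per pair of sets that is combined, which may differ from how the package counts.

```python
def merge_overlapping_sets(list_of_sets):
    """Merge sets that share any element; hypothetical sketch.

    Returns the merged list and the number of pairwise merges performed.
    """
    merged = [set(s) for s in list_of_sets]  # copies, so the input is untouched
    n_merges = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if merged[i] & merged[j]:  # non-empty intersection
                    merged[i] |= merged[j]
                    del merged[j]
                    n_merges += 1
                    changed = True
                    break
            if changed:
                break  # restart the scan, since indices have shifted
    return merged, n_merges


groups, n = merge_overlapping_sets([{1, 2}, {2, 3}, {4}, {5, 4}])
```

Restarting the scan after every merge keeps the logic simple at the cost of extra passes, which is usually acceptable for the small label sets involved.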

image_analysis_3D.segmentation_utils.segmentation_decoupling.merge_sets_df(merged_df)[source]

This function merges the sets of masks

Parameters:

merged_df (pd.DataFrame) – The dataframe containing the masks

Returns:

The dataframe containing the merged masks

Return type:

pd.DataFrame

image_analysis_3D.segmentation_utils.segmentation_decoupling.reassemble_each_mask(df, original_img_shape)[source]

This function reassembles the masks from the dataframe

Parameters:

df (pd.DataFrame) – The dataframe containing the masks
original_img_shape (tuple) – The shape of the original image

Returns:

The reassembled masks

Return type:

np.ndarray

image_analysis_3D.segmentation_utils.segmentation_decoupling.get_dimensionality(image_array)[source]

This function returns the dimensionality of an image array while checking if the input is a numpy array

Parameters:

image_array (np.ndarray) – The image array to check the dimensionality of

Returns:

The dimensionality of the image array

Return type:

int

Raises:

TypeError – If the input is not a numpy array

image_analysis_3D.segmentation_utils.segmentation_decoupling.get_number_of_unique_labels(image_array)[source]

This function returns the number of unique labels in an image array

Parameters:

image_array (np.ndarray) – The image array to check the number of unique labels

Returns:

The number of unique labels in the image array

Return type:

int

Image Utilities

Image Processing

image_analysis_3D.image_utils.image_utils.select_objects_from_label(label_image, object_ids)[source]

Selects objects from a label image based on the provided object IDs.

Parameters:

label_image (numpy.ndarray) – The segmented label image.
object_ids (list) – The object IDs to select.

Returns:

The label image with only the selected objects.

Return type:

image_analysis_3D.image_utils.image_utils.expand_box(min_coor, max_coord, current_min, current_max, expand_by)[source]

Expand the bounding box of an object in a 3D image.

Parameters:

min_coor (int) – The minimum coordinate of the image for any dimension.
max_coord (int) – The maximum coordinate of the image for any dimension.
current_min (int) – The current minimum coordinate of the bounding box of an object for any dimension.
current_max (int) – The current maximum coordinate of the bounding box of an object for any dimension.
expand_by (int) – The amount to expand the bounding box by.

Returns:

The new minimum and maximum coordinates of the bounding box. Raises ValueError if the expansion is not possible.

Return type:

Union[Tuple[int, int], ValueError]

image_analysis_3D.image_utils.image_utils.new_crop_border(bbox1, bbox2, image)[source]

Expand the bounding boxes of two objects in a 3D image to match their sizes.

Parameters:

bbox1 (Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]) – The bounding box of the first object.
bbox2 (Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]) – The bounding box of the second object.
image (numpy.ndarray) – The image to crop for each of the bounding boxes.

Returns:

The new bounding boxes of the two objects.

Return type:

Tuple[Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]], Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]]

Raises:

ValueError – If the expansion is not possible.

image_analysis_3D.image_utils.image_utils.crop_3D_image(image, bbox)[source]

Crop a 3D image to the bounding box of a mask.

Parameters:

image (numpy.ndarray) – The image to crop.
bbox (Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]) – The bounding box of the mask.

Returns:

The cropped image.

Return type:

image_analysis_3D.image_utils.image_utils.single_3D_image_expand_bbox(image, bbox, expand_pixels, anisotropy_factor)[source]

Expand the bbox while keeping the crop within the confines of the image volume.

Parameters:

image (numpy.ndarray) – 3D image array from which the bbox was derived
bbox (tuple[int, int, int, int, int, int]) – 3D bbox in the format (zmin, ymin, xmin, zmax, ymax, xmax)
expand_pixels (int) – Number of pixels by which to expand the bbox in each direction (z, y, x). The expansion is treated as isotropic, so it is the same across dimensions, with the anisotropy factor used to adjust the z dimension.
anisotropy_factor (int) – The ratio of “pixel” size in um between the z dimension and the x/y dimensions. This is used to adjust the expansion of the bbox in the z dimension to account for anisotropy in the image volume. For example, if the z spacing is 5um and the x/y spacing is 1um, then the anisotropy factor would be 5.

Returns:

Updated bbox in the format (zmin, ymin, xmin, zmax, ymax, xmax) after expansion and adjustment for anisotropy

Return type:

tuple[int, int, int, int, int, int]
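
A minimal sketch of an anisotropy-aware expansion is shown below. It rests on assumptions not confirmed by this reference: the z margin is taken as expand_pixels divided by anisotropy_factor (at least one slice), the result is clamped to the volume, and the helper takes the image shape instead of the image itself. The name expand_bbox_within_volume is hypothetical.

```python
def expand_bbox_within_volume(image_shape, bbox, expand_pixels, anisotropy_factor):
    """Expand a (zmin, ymin, xmin, zmax, ymax, xmax) bbox, clamped to the volume.

    Hypothetical sketch: the z expansion is divided by the anisotropy factor so
    that the physical margin is roughly equal in all three directions.
    """
    zmin, ymin, xmin, zmax, ymax, xmax = bbox
    depth, height, width = image_shape
    z_expand = max(1, round(expand_pixels / anisotropy_factor))
    return (
        max(0, zmin - z_expand),
        max(0, ymin - expand_pixels),
        max(0, xmin - expand_pixels),
        min(depth, zmax + z_expand),
        min(height, ymax + expand_pixels),
        min(width, xmax + expand_pixels),
    )


# 10 px margin in y/x, 2 slices in z for a 5:1 z-to-xy spacing ratio
new_bbox = expand_bbox_within_volume((20, 100, 100), (5, 10, 40, 8, 30, 95), 10, 5)
```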

image_analysis_3D.image_utils.image_utils.check_for_xy_squareness(bbox)[source]

This function returns the ratio between the x and y lengths of the bbox. A value of 1 indicates a square bbox.

Parameters:

bbox (tuple) – The bbox to check: (z_min, y_min, x_min, z_max, y_max, x_max), where each value is an int representing the pixel coordinate of the bbox in that dimension.

Returns:

The ratio of the y length to the x length of the bbox. A value of 1 indicates a square bbox.

Return type:

float

image_analysis_3D.image_utils.image_utils.square_off_xy_crop_bbox(bbox)[source]

Adjust the bbox to be square in the XY plane.

The function computes the new bbox from the current X/Y dimensions.

Parameters:

bbox (tuple[int, int, int, int, int, int]) –

The bbox to adjust: (z_min, y_min, x_min, z_max, y_max, x_max)

Each value is an integer pixel coordinate in that dimension.

Returns:

The adjusted bbox that is square in the XY plane: (z_min, new_y_min, new_x_min, z_max, new_y_max, new_x_max)

Each value is an integer pixel coordinate in that dimension.

Return type:

tuple[int, int, int, int, int, int]
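
One way to square a bbox in the XY plane is sketched below. It is an illustration under stated assumptions: the shorter side is grown to match the longer one, the extra pixels are split across both ends, and clamping to the image bounds is left out. The helper name square_off_xy is hypothetical.

```python
def square_off_xy(bbox):
    """Grow the shorter XY side so the bbox is square; hypothetical sketch.

    The bbox is (z_min, y_min, x_min, z_max, y_max, x_max). Clamping to the
    image bounds is not handled here.
    """
    z_min, y_min, x_min, z_max, y_max, x_max = bbox
    y_len, x_len = y_max - y_min, x_max - x_min
    diff = abs(y_len - x_len)
    before, after = diff // 2, diff - diff // 2  # split an odd difference 7/8 style
    if y_len < x_len:
        y_min, y_max = y_min - before, y_max + after
    elif x_len < y_len:
        x_min, x_max = x_min - before, x_max + after
    return (z_min, y_min, x_min, z_max, y_max, x_max)


squared = square_off_xy((0, 10, 10, 5, 30, 45))
# Side-length ratio of the result; 1 indicates a square bbox
ratio = (squared[4] - squared[1]) / (squared[5] - squared[2])
```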

Featurization Utilities

Area & Size & Shape

Area, size, and shape features for 3D objects.

image_analysis_3D.featurization_utils.area_size_shape_utils.calculate_surface_area(label_object, props, spacing)[source]

Calculate the surface area of a 3D object using the marching cubes algorithm.

Parameters:

label_object (numpy.array) – This is an array of the segmented objects of a given compartment.
props (numpy.array) – This is the output of the regionprops function, which contains information about the objects.
spacing (tuple) – This is the spacing of the image in each dimension (z, y, x).

Returns:

The surface area for the object.

Return type:

image_analysis_3D.featurization_utils.area_size_shape_utils.measure_3D_area_size_shape(image_set_loader, object_loader)[source]

This function calculates the area, size, and shape of objects in a 3D image using the regionprops function. It uses the numpy library to perform the calculations on the CPU.

Parameters:

image_set_loader (ImageSetLoader) – The image set loader object that contains the image and label image.
object_loader (ObjectLoader) – The object loader object that contains the image and label image.

Returns:

A dictionary containing the area, size, and shape of the objects in the image.

Return type:

Colocalization

Colocalization feature extraction utilities for 3D image objects.

Computes per-object colocalization metrics (Pearson correlation, Manders coefficients, overlap coefficient, K1/K2 coefficients) between pairs of fluorescence channels using the Costes automatic thresholding method.

image_analysis_3D.featurization_utils.colocalization_utils.linear_costes_threshold_calculation(first_image, second_image, scale_max=255, fast_costes='Accurate')[source]

Finds the Costes Automatic Threshold for colocalization using a linear algorithm. Candidate thresholds are gradually decreased until Pearson R falls below 0. If “Fast” mode is enabled, the steps between tested thresholds are increased when Pearson R is much greater than 0. In “Accurate” mode, the threshold always steps down by the same amount.

Parameters:

first_image (numpy.ndarray) – The first fluorescence image.
second_image (numpy.ndarray) – The second fluorescence image.
scale_max (int, optional) – The maximum value for the image scale, by default 255.
fast_costes (str, optional) – The mode for the Costes threshold calculation, by default “Accurate”.

Returns:

The calculated thresholds for the first and second images.

Return type:

Tuple[float, float]

image_analysis_3D.featurization_utils.colocalization_utils.bisection_costes_threshold_calculation(first_image, second_image, scale_max=255)[source]

Finds the Costes Automatic Threshold for colocalization using a bisection algorithm. Candidate thresholds are selected from within a window of possible intensities, and this window is narrowed based on the R value of each tested candidate. The search looks for the first point at which R reaches 0, and the R value can become highly variable at lower thresholds in some samples. Therefore, the candidate tested in each loop is 1/6th of the window size below the maximum value (as opposed to the midpoint).

Parameters:

first_image (numpy.ndarray) – The first fluorescence image.
second_image (numpy.ndarray) – The second fluorescence image.
scale_max (int, optional) – The maximum value for the image scale, by default 255.

Returns:

The calculated thresholds for the first and second images.

Return type:

Tuple[float, float]

image_analysis_3D.featurization_utils.colocalization_utils.prepare_two_images_for_colocalization(label_object1, label_object2, image_object1, image_object2, object_id1, object_id2)[source]

This function prepares two images for colocalization analysis by cropping them to the bounding boxes of the specified objects. It selects the objects from the label images, calculates their bounding boxes, and crops the images accordingly.

Parameters:

label_object1 (numpy.ndarray) – The segmented label image for the first object.
label_object2 (numpy.ndarray) – The segmented label image for the second object.
image_object1 (numpy.ndarray) – The spectral image to crop for the first object.
image_object2 (numpy.ndarray) – The spectral image to crop for the second object.
object_id1 (int) – The object index to select from the label image for the first object.
object_id2 (int) – The object index to select from the label image for the second object.

Returns:

The two cropped images for colocalization analysis.

Return type:

Tuple[numpy.ndarray, numpy.ndarray]

image_analysis_3D.featurization_utils.colocalization_utils.measure_3D_colocalization(cropped_image_1, cropped_image_2, thr=15, fast_costes='Accurate')[source]

This function calculates the colocalization coefficients between two images. It computes the correlation coefficient, Manders’ coefficients, overlap coefficient, and Costes’ coefficients. The results are returned as a dictionary.

Parameters:

cropped_image_1 (numpy.ndarray) – The first cropped image.
cropped_image_2 (numpy.ndarray) – The second cropped image.
thr (int, optional) – The threshold for the Manders’ coefficients, by default 15
fast_costes (str, optional) – The mode for Costes’ threshold calculation, by default “Accurate”. Options are “Accurate” or “Fast”. “Accurate” uses a linear algorithm, while “Fast” uses a bisection algorithm. The “Fast” mode is faster but less accurate.

Returns:

The output features for colocalization analysis.

Return type:

Dict[str, float]
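
The coefficients named above can be sketched with numpy alone. This is not the package's implementation: it assumes thr is a percentage of each image's maximum intensity, omits Costes thresholding entirely, uses hypothetical output key names, and does not guard against empty or all-zero inputs.

```python
import numpy as np


def colocalization_sketch(image_1, image_2, thr=15):
    """Pearson, Manders, overlap, and K1/K2 coefficients; illustrative only.

    Assumption: `thr` is a percentage of each image's maximum intensity.
    """
    a = np.asarray(image_1, dtype=float).ravel()
    b = np.asarray(image_2, dtype=float).ravel()
    pearson = float(np.corrcoef(a, b)[0, 1])

    above_a = a >= (thr / 100.0) * a.max()
    above_b = b >= (thr / 100.0) * b.max()
    both = above_a & above_b

    # Manders: fraction of each channel's above-threshold signal that lies
    # where both channels are above threshold.
    m1 = float(a[both].sum() / a[above_a].sum())
    m2 = float(b[both].sum() / b[above_b].sum())

    # Overlap coefficient and its K1/K2 split, on co-occurring voxels.
    cross = float((a[both] * b[both]).sum())
    sum_sq_a = float((a[both] ** 2).sum())
    sum_sq_b = float((b[both] ** 2).sum())
    overlap = float(cross / np.sqrt(sum_sq_a * sum_sq_b))
    return {"Correlation": pearson, "Manders1": m1, "Manders2": m2,
            "Overlap": overlap, "K1": cross / sum_sq_a, "K2": cross / sum_sq_b}


rng = np.random.default_rng(0)
channel = rng.integers(1, 255, size=(4, 8, 8))
features = colocalization_sketch(channel, channel)  # identical channels
```

Comparing a channel with itself is a quick sanity check, since every coefficient should then be 1 up to floating-point error.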

Granularity

Calculate the granularity spectrum of a 3D image.

image_analysis_3D.featurization_utils.granularity_utils._fix_scipy_ndimage_result(result)[source]

Convert scipy.ndimage aggregation results to a consistent array.

Equivalent to centrosome.cpmorphology.fixup_scipy_ndimage_result. scipy.ndimage.mean/sum can return a scalar when there’s one label, or a list otherwise. This ensures we always get a numpy array.

Parameters:

result (scalar, list, or numpy.ndarray) – Output from scipy.ndimage.mean or similar.

Returns:

1-D array of results.

Return type:

numpy.ndarray
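
The normalization described above can be approximated in one line of numpy. The helper name as_1d_array is hypothetical, and this sketch is not necessarily how the package or centrosome implements it.

```python
import numpy as np


def as_1d_array(result):
    """Coerce a scalar, list, or array into a 1-D numpy array (sketch)."""
    # reshape(-1) turns a 0-d array into shape (1,) and flattens anything else
    return np.asarray(result).reshape(-1)


single = as_1d_array(3.5)          # scalar, as returned for a single label
several = as_1d_array([1.0, 2.0])  # list, as returned for several labels
```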

image_analysis_3D.featurization_utils.granularity_utils._subsample_3d(data, new_shape, subsample_factor, order=1)[source]

Subsample a 3D array using map_coordinates, matching CellProfiler.

CellProfiler generates coordinates for the new shape and divides by subsample_factor to map back into the original coordinate space. The same scalar factor is used for all three axes.

Parameters:

data (numpy.ndarray) – 3D array to subsample.
new_shape (numpy.ndarray) – Target shape as a float array (coordinate grid extent).
subsample_factor (float) – The factor used to divide coordinates (same for all axes).
order (int) – Interpolation order (1 for linear, 0 for nearest-neighbor).

Returns:

Subsampled array.

Return type:

image_analysis_3D.featurization_utils.granularity_utils._upsample_3d(data, subsampled_shape, original_shape)[source]

Upsample a 3D array back to original shape using map_coordinates.

Matches CellProfiler’s approach for restoring reconstructed images to the original label resolution.

Parameters:

data (numpy.ndarray) – Subsampled 3D array to upsample.
subsampled_shape (numpy.ndarray) – Shape of the subsampled space (float array, preserves CellProfiler precision).
original_shape (tuple) – Target shape to upsample to.

Returns:

Upsampled array at original_shape resolution.

Return type:

image_analysis_3D.featurization_utils.granularity_utils.measure_3D_granularity(object_loader, radius=10, granular_spectrum_length=16, subsample_size=0.25, image_sample_size=0.25, mask_threshold=0.9, verbose=False, image_mask=None)[source]

Calculate the granularity spectrum of a 3D image.

Follows the CellProfiler MeasureGranularity algorithm exactly for 3D:

1. Subsample the image uniformly (same factor for Z, Y, X).
2. Further subsample for background tophat removal.
3. Iteratively erode with ball(1) and reconstruct, measuring signal lost at each scale as image-level and per-object values.

Parameters:

object_loader (ObjectLoader) – Loader containing the image and label arrays.
radius (int) – Radius of the structuring element for background removal. Should correspond to texture radius after subsampling.
granular_spectrum_length (int) – Number of granularity scales to measure.
subsample_size (float) – Subsampling factor for the image, in the range (0, 1]. Applied uniformly to Z/Y/X.
image_sample_size (float) – Subsampling factor for background reduction, in the range (0, 1]. Applied relative to the already-subsampled image.
mask_threshold (float) – Threshold for converting interpolated masks back to boolean.
verbose (bool) – Print diagnostic information.
image_mask (numpy.ndarray or None) – Boolean mask matching the image shape. Corresponds to CellProfiler’s im.mask. If None (default), all pixels are considered valid (all-True mask), matching the typical CellProfiler behavior for unmasked images.

Returns:

Dictionary with keys ‘object_id’, ‘feature’, ‘value’. Image-level measurements use object_id=0.

Return type:

Dict[str, list]

Intensity

Intensity feature extraction utilities for 3D image objects.

Provides functions to compute intensity statistics (mean, median, min, max, standard deviation, quartiles), edge-based measurements, center-of-mass coordinates, and mass displacement for segmented 3D objects.

image_analysis_3D.featurization_utils.intensity_utils.get_outline(mask)[source]

Get the outline of a 3D mask.

Parameters:

mask (numpy.ndarray) – The input mask.

Returns:

The outline of the mask.

Return type:

numpy.ndarray
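
One common way to obtain a 3D outline is to subtract the eroded mask from the mask itself. The numpy-only sketch below assumes 6-connectivity and treats voxels on the image border as touching background. Whether get_outline works this way is not stated here, and the helper name outline_sketch is hypothetical.

```python
import numpy as np


def outline_sketch(mask):
    """Voxels of a 3D boolean mask that touch background (6-connectivity).

    Illustrative only: the outline is taken as the mask minus its erosion.
    """
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so border voxels count as outline and roll() wraps
    # only background values around.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    eroded = padded.copy()
    for axis in range(3):
        # A voxel survives erosion only if both neighbours along the axis are set.
        eroded &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    interior = eroded[1:-1, 1:-1, 1:-1]
    return mask & ~interior


cube = np.zeros((5, 5, 5), dtype=bool)
cube[1:4, 1:4, 1:4] = True
shell = outline_sketch(cube)  # everything except the single centre voxel
```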

image_analysis_3D.featurization_utils.intensity_utils.measure_3D_intensity_CPU(object_loader)[source]

Measure the intensity of objects in a 3D image.

Parameters:

object_loader (ObjectLoader) – The object loader containing the image and label image.

Returns:

A dictionary containing the measurements for each object. The keys are the measurement names and the values are the corresponding values.

Return type:

dict
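
The kinds of statistics listed for this module can be sketched for a single object as follows. The key names and the helper intensity_sketch are hypothetical, edge-based measurements are omitted, and mass displacement is assumed to be the distance between the geometric centroid and the intensity-weighted centre of mass.

```python
import numpy as np


def intensity_sketch(image, labels, object_id):
    """Basic intensity statistics for one labelled object; illustrative only."""
    in_object = labels == object_id
    coords = np.argwhere(in_object)          # (n_voxels, 3) as z, y, x
    values = image[in_object].astype(float)  # same voxel order as coords
    centroid = coords.mean(axis=0)
    center_of_mass = (coords * values[:, None]).sum(axis=0) / values.sum()
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "min": float(values.min()),
        "max": float(values.max()),
        "std": float(values.std()),
        "lower_quartile": float(np.percentile(values, 25)),
        "upper_quartile": float(np.percentile(values, 75)),
        # Distance between geometric and intensity-weighted centres
        "mass_displacement": float(np.linalg.norm(centroid - center_of_mass)),
    }


image = np.zeros((1, 1, 3))
image[0, 0] = [1.0, 1.0, 2.0]
labels = np.ones((1, 1, 3), dtype=int)
stats = intensity_sketch(image, labels, object_id=1)
```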

Neighbors

image_analysis_3D.featurization_utils.neighbors_utils.neighbors_expand_box(min_coor, max_coord, current_min, current_max, expand_by)[source]

Expand the bounding box of the object by a specified distance in each direction.

Parameters:

min_coor (Union[int, float]) – The global minimum coordinate of the image.
max_coord (Union[int, float]) – The global maximum coordinate of the image.
current_min (Union[int, float]) – The current minimum coordinate of the object.
current_max (Union[int, float]) – The current maximum coordinate of the object.
expand_by (int) – The distance by which to expand the bounding box.

Returns:

The new minimum and maximum coordinates of the bounding box.

Return type:

Tuple[Union[int, float], Union[int, float]]
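
A clamped expansion along one dimension might look like the sketch below. The helper name expand_interval is hypothetical, and clamping to the image range is an assumption about how out-of-range expansions are handled.

```python
def expand_interval(min_coor, max_coord, current_min, current_max, expand_by):
    """Widen [current_min, current_max] by expand_by, clamped to the image range.

    Hypothetical sketch; argument names mirror the documented signature.
    """
    new_min = max(min_coor, current_min - expand_by)
    new_max = min(max_coord, current_max + expand_by)
    return new_min, new_max


z_range = expand_interval(0, 50, 5, 45, 10)  # both ends hit the image limits
```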

image_analysis_3D.featurization_utils.neighbors_utils.crop_3D_image(image, bbox)[source]

Crop the 3D image to the bounding box of the object.

Parameters:

image (numpy.ndarray) – The 3D image to be cropped.
bbox (Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]) – The bounding box of the object in the format (z1, y1, x1, z2, y2, x2).

Returns:

The cropped 3D image.

Return type:

image_analysis_3D.featurization_utils.neighbors_utils.measure_3D_number_of_neighbors(object_loader, distance_threshold=10, anisotropy_factor=10)[source]

This function calculates the number of neighbors for each object in a 3D image.

Parameters:

object_loader (ObjectLoader) – The object loader object that contains the image and label image.
distance_threshold (int, optional) – The distance threshold for counting neighbors, by default 10
anisotropy_factor (int, optional) – The anisotropy factor for the image where the anisotropy factor is the ratio of the pixel size in the z direction to the pixel size in the x and y directions, by default 10

Returns:

A dictionary containing the object ID and the number of neighbors for each object.

Return type:

Dict[str, list]
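
A bounding-box approach to neighbor counting is sketched below. It is a guess at the general idea rather than the documented algorithm: the object's bbox is expanded by distance_threshold in y/x and by distance_threshold / anisotropy_factor in z (an assumption), and other labels inside that region are counted. The helper name count_neighbors_sketch is hypothetical.

```python
import numpy as np


def count_neighbors_sketch(labels, object_id, distance_threshold=10, anisotropy_factor=10):
    """Count other labels inside an object's expanded bounding box (sketch).

    Assumption: the z margin is distance_threshold / anisotropy_factor voxels.
    """
    zs, ys, xs = np.nonzero(labels == object_id)
    z_pad = max(1, int(round(distance_threshold / anisotropy_factor)))
    pads = (z_pad, distance_threshold, distance_threshold)
    slices = []
    for idx, pad, size in zip((zs, ys, xs), pads, labels.shape):
        start = max(0, int(idx.min()) - pad)
        stop = min(size, int(idx.max()) + 1 + pad)
        slices.append(slice(start, stop))
    found = np.unique(labels[tuple(slices)])
    # Exclude background (0) and the object itself
    return int(np.sum((found != 0) & (found != object_id)))


labels = np.zeros((4, 40, 40), dtype=int)
labels[1, 5:8, 5:8] = 1      # object of interest
labels[1, 12:15, 5:8] = 2    # within 10 px in y
labels[1, 30:33, 30:33] = 3  # too far away
n_neighbors = count_neighbors_sketch(labels, object_id=1)
```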

image_analysis_3D.featurization_utils.neighbors_utils.get_coordinates(nuclei_mask, object_ids=None)[source]

Extract coordinates from a labeled mask.

Parameters:

nuclei_mask (ndarray) – 3D labeled mask where each object has a unique ID
object_ids (list) – List of object IDs to extract

Returns:

coords – DataFrame with columns: object_id, x, y, z

Return type:

pandas.DataFrame

image_analysis_3D.featurization_utils.neighbors_utils.calculate_centroid(coords)[source]

Calculate the centroid of cell coordinates.

Parameters:

coords (pandas.DataFrame)

Return type:

numpy.ndarray

image_analysis_3D.featurization_utils.neighbors_utils.euclidean_distance_from_centroid(coords, centroid)[source]

Calculate Euclidean distance from centroid for each cell.

Parameters:

coords (numpy.ndarray)
centroid (numpy.ndarray)

Return type:

image_analysis_3D.featurization_utils.neighbors_utils.mahalanobis_distance_from_centroid(coords, centroid, min_cells_threshold=50)[source]

Calculate Mahalanobis distance from centroid for each cell. This accounts for the covariance structure (shape) of the organoid.

For small sample sizes (<50 cells), uses regularization or falls back to Euclidean.

Parameters:

coords (ndarray) – Cell coordinates (n_cells, 3)
centroid (ndarray) – Centroid coordinates (3,)
min_cells_threshold (int) – Minimum cells needed for reliable Mahalanobis (default: 50)

Returns:

distances – Mahalanobis distances for each cell

Return type:

ndarray
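
The Mahalanobis calculation with a small-sample safeguard can be sketched as below. The specific fallbacks are assumptions: a ridge term is added to the covariance for fewer than min_cells_threshold cells, and Euclidean distance is used when the covariance cannot be inverted. The ridge parameter and the helper name mahalanobis_sketch are hypothetical.

```python
import numpy as np


def mahalanobis_sketch(coords, centroid, min_cells_threshold=50, ridge=1e-6):
    """Mahalanobis distance of each row of coords from centroid (sketch).

    `ridge` is a hypothetical parameter, not part of the documented signature.
    """
    coords = np.asarray(coords, dtype=float)
    diffs = coords - np.asarray(centroid, dtype=float)
    if len(coords) < 2:
        return np.linalg.norm(diffs, axis=1)  # covariance is undefined
    cov = np.cov(coords, rowvar=False)
    if len(coords) < min_cells_threshold:
        cov = cov + ridge * np.eye(cov.shape[0])  # regularize small samples
    try:
        inv_cov = np.linalg.inv(cov)
    except np.linalg.LinAlgError:
        return np.linalg.norm(diffs, axis=1)  # Euclidean fallback
    # d_i = sqrt(diff_i^T . inv_cov . diff_i), evaluated for all rows at once
    return np.sqrt(np.einsum("ij,jk,ik->i", diffs, inv_cov, diffs))


rng = np.random.default_rng(1)
cells = rng.normal(size=(200, 3)) * [1.0, 1.0, 4.0]  # elongated along the last axis
distances = mahalanobis_sketch(cells, cells.mean(axis=0))
```

Because the covariance is estimated from the same points, the squared distances average to 3 × (n − 1) / n regardless of how elongated the point cloud is, which is the property that makes this metric shape-aware.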

image_analysis_3D.featurization_utils.neighbors_utils.classify_cells_into_shells(coords, n_shells=5, method='mahalanobis', min_cells_per_shell=3, centroid=None)[source]

Classify cells into radial shells based on distance from centroid.

Automatically adjusts n_shells for small organoids to ensure meaningful statistics.

Parameters:

coords (pandas.DataFrame or dict) – Cell coordinates with columns/keys: object_id, x, y, z
n_shells (int) – Number of concentric shells to create (will be adjusted if needed)
method (str) – ‘euclidean’ or ‘mahalanobis’
min_cells_per_shell (int) – Minimum average cells per shell (default: 3)
centroid (numpy.ndarray, optional) – Pre-calculated centroid (if None, will be calculated from coords)

Returns:

results – Dictionary containing:

‘ShellAssignments’: Shell number for each cell (0 = innermost)
‘DistancesFromCenter’: Distance from centroid for each cell
‘DistancesFromExterior’: Distance from exterior for each cell
‘NormalizedDistancesFromCenter’: Normalized distances (0-1)

Return type:

image_analysis_3D.featurization_utils.neighbors_utils.create_results_dataframe(results)[source]

Create a pandas DataFrame with all cell information.

Parameters:

results (dict) – Results from classify_cells_into_shells

Returns:

df – DataFrame with cell information

Return type:

pandas.DataFrame
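
The radial shell assignment described for classify_cells_into_shells can be sketched from precomputed centroid distances. This sketch assumes equal-width shells over the normalized distance and reduces n_shells until the average cell count per shell reaches min_cells_per_shell; the actual binning rule may differ. The helper name assign_shells_sketch is hypothetical.

```python
import numpy as np


def assign_shells_sketch(distances, n_shells=5, min_cells_per_shell=3):
    """Bin centroid distances into equal-width radial shells (sketch)."""
    distances = np.asarray(distances, dtype=float)
    # Shrink the shell count for small organoids (assumed adjustment rule)
    n_shells = max(1, min(n_shells, len(distances) // min_cells_per_shell))
    max_distance = distances.max()
    if max_distance > 0:
        normalized = distances / max_distance
    else:
        normalized = np.zeros_like(distances)
    # Shell 0 is innermost; the farthest cell is clipped into the last shell.
    shells = np.minimum((normalized * n_shells).astype(int), n_shells - 1)
    return {
        "ShellAssignments": shells,
        "DistancesFromCenter": distances,
        "DistancesFromExterior": max_distance - distances,
        "NormalizedDistancesFromCenter": normalized,
        "ShellsUsed": n_shells,
    }


result = assign_shells_sketch([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0])
```

With ten cells and a minimum of three cells per shell, the requested five shells are reduced to three.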

image_analysis_3D.featurization_utils.neighbors_utils.visualize_organoid_shells(coords, classification_results, title='Organoid Shell Classification', centroid=None)[source]

Create 3D visualization of organoid with shell coloring.

Parameters:

coords (pandas.DataFrame or dict) – Cell coordinates with columns/keys: object_id, x, y, z
classification_results (dict) – Results from classify_cells_into_shells
title (str) – Plot title
centroid (numpy.ndarray)

Return type:

matplotlib.pyplot.figure

image_analysis_3D.featurization_utils.neighbors_utils.plot_distance_distributions(classification_results, n_shells=None)[source]

Plot distance distributions for each shell.

Parameters:

classification_results (dict) – Results from classify_cells_into_shells
n_shells (int, optional) – Number of shells (will use ShellsUsed from results if not provided)

Return type:

matplotlib.pyplot.figure

Texture

image_analysis_3D.featurization_utils.texture_utils.scale_image(image, num_gray_levels=256)[source]

Scale the image to a specified number of gray levels. Example: 1024 gray levels will be scaled to 256 gray levels if num_gray_levels=256. An image with a pixel value of 0 will be scaled to 0 and a pixel value of 1023 will be scaled to 255.
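
The example above can be reproduced with a short numpy sketch. It assumes non-negative intensities and scales relative to the image's own maximum, which may not match the package's exact rule; the helper name scale_gray_levels is hypothetical.

```python
import numpy as np


def scale_gray_levels(image, num_gray_levels=256):
    """Rescale intensities to integers in [0, num_gray_levels - 1] (sketch).

    Assumption: scaling is relative to the image's own maximum value.
    """
    image = np.asarray(image, dtype=float)
    max_value = image.max()
    if max_value == 0:
        return np.zeros(image.shape, dtype=np.int64)
    scaled = image / max_value * (num_gray_levels - 1)
    return np.floor(scaled).astype(np.int64)


scaled = scale_gray_levels(np.array([0, 512, 1023]))  # 10-bit values to 8-bit levels
```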

Parameters:

image (numpy.ndarray) – The input image to be scaled. Can be a ndarray of any shape.
num_gray_levels (int, optional) – The number of gray levels to scale the image to, by default 256

Returns:

The gray level scaled image of any shape.

Return type:

image_analysis_3D.featurization_utils.texture_utils.measure_3D_texture(object_loader, distance=1, grayscale=256)[source]

Calculate texture features for each object in the image using Haralick features.

The features are calculated for each object separately and the mean value is returned.

Parameters:

object_loader (ObjectLoader) – The object loader containing the image and object information.
distance (int, optional) – The distance parameter for Haralick features, by default 1
grayscale (int, optional) – The number of gray levels to scale the image to, by default 256

Returns:

A dictionary containing the object ID, texture name, and texture value, with keys:

object_id
texture_name
texture_value

Texture names include:

AngularSecondMoment
Contrast
Correlation
Variance
InverseDifferenceMoment
SumAverage
SumVariance
SumEntropy
Entropy
DifferenceVariance
DifferenceEntropy
InformationMeasureOfCorrelation1
InformationMeasureOfCorrelation2

Return type:

SAMMED3D Featurizer

SAM-Med3D Feature Extractor

Convert SAM-Med3D from a segmentation model to a featurization model.

SAM-Med3D Architecture:

3D Image Encoder (ViT-based): Extracts features from 3D volumes
3D Prompt Encoder: Processes prompts, which are supervision signals provided by the user for segmentation at inference time (neither needed nor used for featurization)
3D Mask Decoder: Generates segmentation masks (not needed for featurization)

For featurization, we extract embeddings from the 3D image encoder.

Requirements:

pip install torch torchvision monai einops timm

# For using pretrained SAM-Med3D:
pip install medim

class image_analysis_3D.featurization_utils.sammed3d_featurizer.SAMMed3DFeatureExtractor(model_path=None, device='cpu', use_medim=True, image_size=128)[source]

Extract features from 3D microscope volumes using SAM-Med3D encoder.

This class wraps the SAM-Med3D model and extracts dense or global features from the 3D image encoder for downstream tasks like classification, clustering, or retrieval.

Parameters:

model_path (Optional[str])
device (Optional[str])
use_medim (Optional[bool])
image_size (Optional[int])

extract(volume, normalize=True, feature_type=None)[source]

Extract features from a 3D volume.

Parameters:

volume (numpy.ndarray or torch.Tensor) – 3D volume (Z, Y, X) or (C, Z, Y, X) or (B, C, Z, Y, X).
normalize (bool, optional) – Whether to normalize the volume.
feature_type (str | None)

Returns:

Feature vector(s) as numpy array.

Return type:

extract_batch(volumes, batch_size=4)[source]

Extract features from multiple volumes in batches.

Parameters:

volumes (list) – List of 3D volumes.
batch_size (int, optional) – Batch size for processing.

Returns:

(N, Z) array of features.

Return type:

class image_analysis_3D.featurization_utils.sammed3d_featurizer.TransformerBlock3D(*args, **kwargs)[source]

3D Transformer block.

Parameters:

dim (int)
num_heads (int)
mlp_ratio (float)

forward(x)[source]

Apply self-attention and MLP blocks.

Parameters:

x (torch.Tensor) – Input tensor of shape (B, N, C).

Returns:

Output tensor of shape (B, N, C).

Return type:

torch.Tensor

class image_analysis_3D.featurization_utils.sammed3d_featurizer.MicroscopySAMMed3DPipeline(sammed3d_path=None, device='cpu')[source]

End-to-end pipeline for microscopy feature extraction.

Parameters:

sammed3d_path (Optional[str])
device (str)

preprocess_volume(volume)[source]

Preprocess microscopy volume.

Parameters:

volume (numpy.ndarray)

Return type:

numpy.ndarray

extract_features(volume, preprocess=True, feature_type=None)[source]

Extract features from microscopy volume.

Parameters:

volume (numpy.ndarray) – 3D numpy array (Z, Y, X).
preprocess (bool, optional) – Whether to preprocess the volume.
feature_type (str | None)

Returns:

Feature vector.

Return type:

extract_features_batch(volumes, preprocess=True, batch_size=4, feature_type=None)[source]

Extract features from multiple volumes.

Parameters:

volumes (List[numpy.ndarray])
preprocess (bool)
batch_size (int)
feature_type (str | None)

Return type:

image_analysis_3D.featurization_utils.sammed3d_featurizer.check_for_zero_objects(label_image)[source]

Check if there are any objects in the label image.

Parameters:

label_image (numpy.ndarray)

Return type:

bool

image_analysis_3D.featurization_utils.sammed3d_featurizer.call_SAMMed3D_pipeline(object_loader, SAMMed3D_model_path=None, feature_type=['global', 'patch', 'cls'], extractor=None)[source]

Call the SAMMed3D pipeline to extract features per patient, well-fov.

Here we call the SAMMed3D pipeline to extract features for each object in the label image.

Parameters:

object_loader (ObjectLoader) – Class that loads the image and label image for a given patient, well-fov, channel, compartment
SAMMed3D_model_path (Optional[str], optional) – Path to the SAMMed3D model, by default None. Ignored if extractor is provided.
feature_type (str | List, optional) – Feature types to extract, by default [“global”, “patch”, “cls”]
extractor (Optional[MicroscopySAMMed3DPipeline], optional) – Pre-loaded extractor instance. If provided, SAMMed3D_model_path is ignored. Use this to avoid reloading the model in loops. By default None.

Returns:

Dictionary of extracted features from SAMMed3D for each object with keys:

“object_id”: List of object IDs
“feature_name”: List of feature names
“channel”: List of channels
“compartment”: List of compartments
“value”: List of feature values
“feature_type”: List of feature types

Return type:

image_analysis_3D.featurization_utils.sammed3d_featurizer.call_whole_image_sammed3d_pipeline(image, SAMMed3D_model_path=None, feature_type=['global', 'patch', 'cls'], extractor=None)[source]

Call the SAMMed3D pipeline to extract features for the whole image.

This function is called per patient, well-fov and extracts features for the whole FOV volume using the SAMMed3D pipeline.

Parameters:

image (np.ndarray) – 3D numpy array of the image
SAMMed3D_model_path (Optional[str], optional) – Path to the SAMMed3D model, by default None. Ignored if extractor is provided.
feature_type (str | List, optional) – Type of features to extract, by default [“global”, “patch”, “cls”]
extractor (Optional[MicroscopySAMMed3DPipeline], optional) – Pre-loaded extractor instance. If provided, SAMMed3D_model_path is ignored. Use this to avoid reloading the model in loops. By default None.

Returns:

Dictionary of extracted features from SAMMed3D for the whole image with keys:

“feature_name”: List of feature names
“value”: List of feature values
“feature_type”: List of feature types
“compartment”: List of compartments (will be “Image” for whole image features)

Return type:

CHAMMI-75 Featurizer

This utilities module wraps the CHAMMI-75 featurization model, a self-supervised deep-learning model built on a Vision Transformer (ViT) architecture.

image_analysis_3D.featurization_utils.chammi75_featurization.get_chammi75_model(device)[source]

Load the CHAMMI-75 (MorphEm) model from Hugging Face.

Parameters:

device (str or None) – The device to load the model on ('cuda' or 'cpu'). If None, CUDA is used when available, otherwise CPU.

Returns:

The CHAMMI-75 (MorphEm) model in evaluation mode.

Return type:

torch.nn.Module

class image_analysis_3D.featurization_utils.chammi75_featurization.SaturationNoiseInjector(*args, **kwargs)[source]

Inject uniform random noise into saturated pixels of an image tensor. The image has three channels, where channels 2 and 3 are duplicates of the first; three channels are used because the ViT architecture expects three-channel input. This transformation replaces saturated pixels (value == 255) in the first channel with uniform random noise sampled from [low, high]. It is applied as a pre-processing step before the image is passed to the CHAMMI-75 model.

Parameters:

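low (int) – Lower bound of the uniform noise range [low, high].
high (int) – Upper bound of the uniform noise range [low, high].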

forward(x)[source]

Apply saturation-noise injection to the first channel.

Parameters:

x (torch.Tensor) – Image tensor of shape (C, H, W) where saturated pixels in the first channel (index 0) have value 255.

Returns:

Tensor with the same shape as x where saturated pixels in the first channel have been replaced by uniform random noise.

Return type:

torch.Tensor
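
The behaviour described above can be sketched as a small PyTorch module. This is an illustrative re-implementation under stated assumptions, not the package's source: the class name SaturationNoiseSketch is hypothetical, the bounds are passed explicitly because the documented defaults are not shown here, and the input is copied rather than modified in place.

```python
import torch
from torch import nn


class SaturationNoiseSketch(nn.Module):
    """Hypothetical stand-in that mimics the documented behaviour."""

    def __init__(self, low: int, high: int) -> None:
        super().__init__()
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Work on a float copy so the caller's tensor is left untouched.
        out = x.clone().float()
        first = out[0]  # view onto channel 0 of the (C, H, W) tensor
        saturated = first == 255
        # Draw one candidate noise value per pixel from the [low, high] range,
        # then keep only the values that land on saturated pixels.
        noise = torch.empty_like(first).uniform_(self.low, self.high)
        first[saturated] = noise[saturated]
        return out


# Usage: a fully saturated image with one unsaturated pixel per channel.
image = torch.full((3, 4, 4), 255.0)
image[:, 0, 0] = 10.0
noisy = SaturationNoiseSketch(low=200, high=250)(image)
print(noisy[0])
```

Only channel 0 changes; channels 1 and 2 are returned as they were, which matches the statement that noise is injected into the first channel only.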

class image_analysis_3D.featurization_utils.chammi75_featurization.PerImageNormalize(*args, **kwargs)[source]

Normalize each image independently using InstanceNorm2d.

Parameters:

eps (float)

forward(x)[source]

Apply per-image normalization to the input tensor.

Parameters:

x (torch.Tensor) – Input tensor of shape (N, C, H, W) where N is batch size, C is number of channels, H and W are height and width.

Returns:

Normalized tensor of the same shape as input.

Return type:

torch.Tensor

image_analysis_3D.featurization_utils.chammi75_featurization.featurize_2D_image_w_chammi75(image_tensor, model, device)[source]

Extract CHAMMI-75 CLS-token features from a multi-channel 2D image.

The function processes each channel of the input image independently (Bag-of-Channels strategy) in five steps: (1) resize the image tensor to 224×224; (2) inject random noise into saturated pixels; (3) normalize each image; (4) pass the stacked image through the Vision Transformer encoder; (5) return the x_norm_clstoken output for each channel.

Parameters:

image_tensor (torch.Tensor) – Batch of images with shape (N, C, H, W) where N is the batch size, C is the number of channels, and H, W are the spatial dimensions.
model (torch.nn.Module) – The loaded CHAMMI-75 (MorphEm) model (see get_chammi75_model()).
device (torch.device) – Device on which to run inference ('cuda' or 'cpu').

Returns:

A list of length C where each element is a (N, 384) array containing the CLS-token embedding for that channel.

Return type:

list of numpy.ndarray
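
The Bag-of-Channels flow can be sketched as follows. This is a simplified illustration under several assumptions: TinyEncoder is a hypothetical stand-in for the real ViT, the encoder is assumed to expose a forward_features() method returning a dictionary with an "x_norm_clstoken" entry, each channel is assumed to be replicated to three channels before encoding, and the noise-injection step is omitted. The sketch returns tensors rather than NumPy arrays to stay dependency-light.

```python
import torch
import torch.nn.functional as F
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the ViT encoder, only here to make the sketch runnable."""

    def __init__(self, dim: int = 384) -> None:
        super().__init__()
        self.proj = nn.Linear(3, dim)

    def forward_features(self, x: torch.Tensor) -> dict:
        pooled = x.mean(dim=(2, 3))  # (N, 3): average over the spatial dimensions
        return {"x_norm_clstoken": self.proj(pooled)}


def bag_of_channels_sketch(image_tensor: torch.Tensor, model: nn.Module) -> list:
    """Encode each channel of an (N, C, H, W) batch on its own."""
    normalize = nn.InstanceNorm2d(3)
    per_channel = []
    with torch.no_grad():
        for ch in range(image_tensor.shape[1]):
            single = image_tensor[:, ch : ch + 1].float()  # (N, 1, H, W)
            resized = F.interpolate(
                single, size=(224, 224), mode="bilinear", align_corners=False
            )
            stacked = resized.repeat(1, 3, 1, 1)  # (N, 3, 224, 224)
            normalized = normalize(stacked)  # per-image, per-channel statistics
            tokens = model.forward_features(normalized)["x_norm_clstoken"]
            per_channel.append(tokens)  # one (N, 384) tensor per input channel
    return per_channel


features = bag_of_channels_sketch(torch.rand(2, 4, 32, 32), TinyEncoder())
print(len(features), tuple(features[0].shape))
```

Treating channels independently means the number of input channels does not need to match what the encoder was trained on; the output is simply one embedding per channel.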

image_analysis_3D.featurization_utils.chammi75_featurization.call_chammi75_featurization_pipeline(cropped_image, model, device='cpu')[source]

Run the CHAMMI-75 featurization pipeline on a single cropped 2D image.

Converts the input NumPy array to a three-channel PyTorch tensor by replicating the single channel, then extracts CLS-token features. The ViT architecture expects three-channel input, but only a single fluorescence channel is supplied, so the channel is replicated three times and only the first copy’s features are returned.

Parameters:

cropped_image (numpy.ndarray) – A 2D single-channel image array of shape (H, W) containing the cropped object region.
model (torch.nn.Module) – The loaded CHAMMI-75 model (see get_chammi75_model()).
device (str | torch.device) – Device on which to run inference, by default 'cpu'.

Returns:

A (1, 384) array of CLS-token embeddings for the input image.

Return type:

Feature Writing

Functions for formatting morphology feature names in a consistent way across all morphology features.

image_analysis_3D.featurization_utils.feature_writing_utils.remove_underscores_from_string(string)[source]

Remove unwanted delimiters from a string and replace them with hyphens.

Parameters:

string (str) – The string to remove unwanted delimiters from.

Returns:

The string with unwanted delimiters removed and replaced with hyphens.

Return type:

str

image_analysis_3D.featurization_utils.feature_writing_utils.format_morphology_feature_name(compartment, channel, feature_type, measurement)[source]

Format a morphology feature name in a consistent way across all morphology features. The format follows the specification at https://github.com/WayScience/NF1_3D_organoid_profiling_pipeline/blob/main/docs/RFC-2119-Feature-Naming-Convention.md

Parameters:

compartment (str) – The compartment name.
channel (str) – The channel name.
feature_type (str) – The feature type.
measurement (str) – The measurement name.

Returns:

The formatted feature name.

Return type:

str
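
To illustrate why delimiters inside each field are replaced with hyphens, here is a minimal sketch in which underscores are reserved as the separator between fields. The function names, the set of delimiters treated as unwanted, and the field order are all assumptions made for illustration; the linked naming convention is the authoritative definition.

```python
def replace_delimiters_sketch(text: str) -> str:
    """Replace delimiters that would clash with the field separator.

    Assumption: underscores, dots, and spaces are the unwanted delimiters.
    """
    for delimiter in ("_", ".", " "):
        text = text.replace(delimiter, "-")
    return text


def format_feature_name_sketch(
    compartment: str, channel: str, feature_type: str, measurement: str
) -> str:
    """Join the cleaned fields with underscores (assumed field order)."""
    fields = (compartment, channel, feature_type, measurement)
    return "_".join(replace_delimiters_sketch(field) for field in fields)


name = format_feature_name_sketch("Nuclei", "DNA", "Intensity", "MEAN_INTENSITY")
print(name)
```

Because no field can contain an underscore after cleaning, a name built this way can be split back into its four parts with name.split("_").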

image_analysis_3D.featurization_utils.feature_writing_utils.save_features_as_parquet(parent_path, df, compartment, channel, feature_type, cpu_or_gpu)[source]

Save features as parquet files in a consistent way across all morphology features.

Parameters:

parent_path (pathlib.Path) – The parent path to save the features to.
df (pandas.DataFrame) – The dataframe containing the features to save.
compartment (str) – The compartment name.
channel (str) – The channel name.
feature_type (str) – The feature type.
cpu_or_gpu (str) – Whether the features were generated using CPU or GPU processing.

Return type:

pathlib.Path

Resource Profiling

This module provides utility functions for profiling memory and time usage during featurization runs.

image_analysis_3D.featurization_utils.resource_profiling_util.start_profiling()[source]

Start memory and time profiling.

Parameters:

None

Returns:

A (start_time, start_mem) pair where start_time is a Unix timestamp and start_mem is the current RSS in MB (kept for backward-compatibility with the legacy profiler).

Return type:

tuple[float, float]
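
The start/stop pattern can be sketched with the standard library alone. This is a simplified illustration, not the package's implementation: the function names and result keys are hypothetical, the RSS reading is omitted because it needs a third-party package, and the results are returned as a dictionary instead of being written to Parquet.

```python
import time
import tracemalloc


def start_profiling_sketch() -> float:
    """Begin tracing Python allocations and return the start timestamp."""
    tracemalloc.start()
    return time.time()


def stop_profiling_sketch(start_time: float) -> dict:
    """Stop tracing and report elapsed time and peak traced memory."""
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "time_taken_s": time.time() - start_time,
        "peak_mem_mb": peak_bytes / 1024**2,
    }


# Usage: wrap the work to be measured between the two calls.
start = start_profiling_sketch()
workload = [0] * 1_000_000  # allocation that tracemalloc will record
stats = stop_profiling_sketch(start)
print(stats)
```

Note that tracemalloc only sees memory allocated through Python's allocator, so allocations made directly by native libraries may not be reflected in the peak it reports.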

image_analysis_3D.featurization_utils.resource_profiling_util.stop_profiling(start_time, well_fov, patient_id, feature_type, channel, compartment, CPU_GPU, output_file_dir, start_mem=None)[source]

Stop profiling, report results, and save them to a parquet file.

This function stops tracemalloc, computes peak memory usage (via tracemalloc) and elapsed wall-clock time, prints a summary, and persists the statistics to output_file_dir as a Parquet file.

Parameters:

start_time (float) – Unix timestamp returned by start_profiling().
well_fov (str) – Well and field of view for the run.
patient_id (str) – Patient ID for the run.
feature_type (str) – Feature type for the run (e.g., 'intensity', 'shape').
channel (str) – Channel name for the run.
compartment (str) – Cellular compartment for the run (e.g., 'nucleus', 'cytoplasm').
CPU_GPU (str) – Processing unit used ('CPU' or 'GPU').
output_file_dir (pathlib.Path) – File path to save the run-statistics Parquet file.
start_mem (float, optional) – Starting RSS in MB (from start_profiling()). Included in the output for backward-compatibility but not used for the peak-memory calculation.

Returns:

True if the function ran successfully.

Return type: