API Reference
Function documentation is generated from in-code docstrings.
File Utilities
Argument Parsing
Argument parsing helpers for pipeline scripts.
- image_analysis_3D.file_utils.arg_parsing_utils.check_for_missing_args(**kwargs)[source]
Check if any required arguments are missing.
- Raises:
ValueError – If any required arguments are missing.
- Parameters:
kwargs (Any)
- Return type:
None
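A minimal sketch of this kind of check, assuming that a "missing" argument is one whose value is None (the exact criterion used by check_for_missing_args is not stated here). The function name below is hypothetical and is not part of the package.

```python
from typing import Any


def check_for_missing_args_sketch(**kwargs: Any) -> None:
    """Raise ValueError if any keyword argument is None (assumed meaning of 'missing')."""
    missing = [name for name, value in kwargs.items() if value is None]
    if missing:
        raise ValueError(f"Missing required arguments: {', '.join(missing)}")


# Passes silently when every argument has a value.
check_for_missing_args_sketch(well_fov="A01-1", patient="NF0014")
```

Collecting every missing name before raising lets a single error message report all of them at once.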
- image_analysis_3D.file_utils.arg_parsing_utils.parse_args()[source]
Parse command line arguments for segmentation tasks.
- Returns:
A dictionary containing the parsed arguments with keys:
'well_fov': well and field of view to process (e.g., 'A01-1')
'patient': patient ID (e.g., 'NF0014')
'window_size': window size for image processing (e.g., 3)
'clip_limit': clip limit for contrast enhancement (e.g., 0.05)
'compartment': compartment to process (e.g., 'Nuclei')
'channel': channel to process (e.g., 'DAPI')
- Return type:
dict
- Raises:
ValueError – If any required arguments are missing.
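The dictionary described above could be produced with argparse roughly as follows. This is a sketch only: the flag names are assumed to mirror the documented keys, and the function name and the argv parameter are hypothetical additions that make the example testable.

```python
import argparse
from typing import Optional, Sequence


def parse_args_sketch(argv: Optional[Sequence[str]] = None) -> dict:
    """Parse segmentation arguments into a dictionary (flag names are assumptions)."""
    parser = argparse.ArgumentParser(description="Segmentation arguments (sketch)")
    parser.add_argument("--well_fov", type=str)
    parser.add_argument("--patient", type=str)
    parser.add_argument("--window_size", type=int)
    parser.add_argument("--clip_limit", type=float)
    parser.add_argument("--compartment", type=str)
    parser.add_argument("--channel", type=str)
    # vars() turns the argparse Namespace into a plain dictionary.
    return vars(parser.parse_args(argv))


args = parse_args_sketch(
    ["--well_fov", "A01-1", "--patient", "NF0014", "--window_size", "3"]
)
```

Flags that are not supplied come back as None, which is the kind of value a missing-argument check could then reject.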
File Checking
File system validation utilities.
- image_analysis_3D.file_utils.file_checking.check_number_of_files(directory, n_files, verbose=False)[source]
Check if the number of files in a directory is equal to a given number.
- Parameters:
directory (pathlib.Path) – Specified directory to check file number.
n_files (int) – The expected number of files in the directory.
verbose (bool, optional) – If verbose is True, additional information will be printed.
- Returns:
True if the number of files in the directory is equal to the expected number, False otherwise. If False, also returns the name of the directory.
- Return type:
File Reading
Helpers for reading imaging files and channel stacks.
- image_analysis_3D.file_utils.file_reading.find_files_available(input_dir, image_extensions={'.tif', '.tiff'})[source]
List available image files in a directory.
- Parameters:
input_dir (pathlib.Path) – Directory to scan for image files.
image_extensions (set[str], optional) – File extensions to include, by default {".tif", ".tiff"}.
- Returns:
Sorted list of image file paths.
- Return type:
List[str]
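A rough sketch of how such a listing could be built with pathlib. It assumes a non-recursive scan and case-sensitive suffix matching, neither of which is confirmed for find_files_available; the function name is hypothetical.

```python
import pathlib
import tempfile


def find_files_available_sketch(input_dir, image_extensions=frozenset({".tif", ".tiff"})):
    """Return a sorted list of file paths (as strings) whose suffix is in image_extensions."""
    return sorted(
        str(path)
        for path in pathlib.Path(input_dir).iterdir()
        if path.is_file() and path.suffix in image_extensions
    )


# Demonstrate on a temporary directory holding two images and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.tif", "a.tiff", "notes.txt"):
        (pathlib.Path(tmp) / name).touch()
    found = [pathlib.Path(p).name for p in find_files_available_sketch(tmp)]
```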
- image_analysis_3D.file_utils.file_reading.read_in_channels(files, channel_dict={'brightfield': 'TRANS', 'cyto1': '488', 'cyto2': '555', 'cyto3': '640', 'nuclei': '405'}, channels_to_read=[None])[source]
Read z-stack images for each channel token.
- Parameters:
- Returns:
Mapping of channel name to loaded image array (or None).
- Return type:
- image_analysis_3D.file_utils.file_reading.read_zstack_image(file_path)[source]
Reads in a z-stack image from a given file path and returns it as a numpy array.
- Parameters:
file_path (str) – The path to the z-stack image file.
- Returns:
The z-stack image as a numpy array.
- Return type:
np.ndarray
- Raises:
ValueError – If the image has fewer than 3 dimensions.
Notebook Initialization
Notebook initialization helpers and Bandicoot path utilities.
- image_analysis_3D.file_utils.notebook_init_utils.init_notebook()[source]
Initializes the notebook environment by determining the root directory of the Git repository and checking if the code is running in a Jupyter notebook.
- Returns:
pathlib.Path: The root directory of the Git repository.
bool: True if running in a Jupyter notebook, False otherwise.
- Return type:
Tuple[pathlib.Path, bool]
- image_analysis_3D.file_utils.notebook_init_utils.bandicoot_check(bandicoot_mount_path, root_dir)[source]
This function determines if the external mount point for Bandicoot exists.
- Parameters:
bandicoot_mount_path (pathlib.Path) – The path to the Bandicoot mount point.
root_dir (pathlib.Path) – The root directory of the Git repository.
- Returns:
The base directory for image data.
- Return type:
- image_analysis_3D.file_utils.notebook_init_utils.avoid_path_crash_bandicoot(bandicoot_path)[source]
This function avoids path crashes by checking if the Bandicoot path exists and setting the raw image directory and output base directory accordingly.
- Parameters:
bandicoot_path (pathlib.Path) – The path to the Bandicoot directory.
- Returns:
The raw image directory and output base directory.
- Return type:
Tuple[pathlib.Path, pathlib.Path]
Preprocessing Functions
Preprocessing utilities for organizing image data.
- image_analysis_3D.file_utils.preprocessing_funcs.read_2D_image_for_zstacking(file_path)[source]
Read a 2D image for z-stacking from a file.
Reads in a 2D image from a given file path and returns it as a numpy array.
- Parameters:
file_path (str) – The path to the 2D image file.
- Returns:
The 2D image as a numpy array.
- Return type:
np.ndarray
- Raises:
ValueError – If the image has more than 2 dimensions.
- image_analysis_3D.file_utils.preprocessing_funcs.get_well_fov_dirs(parent_dir)[source]
Retrieve all well/FOV directories in a given parent directory.
- Parameters:
parent_dir (pathlib.Path) – Patient parent directory.
- Returns:
List of well/FOV directories in parent_dir.
- Return type:
List[pathlib.Path]
- image_analysis_3D.file_utils.preprocessing_funcs.get_to_the_unested_dir(nested_dir, times_nested)[source]
Un-nest the directory given the number of times the directories are nested.
- Parameters:
nested_dir (pathlib.Path) – The parent directory containing the nested dirs
times_nested (int) – The number of times that a dir is nested
- Returns:
The output file path of the least nested parent dir or None
- Return type:
pathlib.Path | None
Channel Mapping
Channel mapping utilities.
Segmentation Decoupling (File Utilities)
Utilities for decoupling and merging segmentation masks.
- image_analysis_3D.file_utils.segmentation_decoupling.euclidian_2D_distance(coord_set_1, coord_set_2)[source]
This function calculates the Euclidean distance between two sets of 2D coordinates:
sqrt((x1 - x2)^2 + (y1 - y2)^2)
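The formula can be written directly with the standard library. This sketch assumes each coordinate set is an (x, y) pair; the function name is hypothetical.

```python
import math


def euclidean_2d_distance_sketch(coord_set_1, coord_set_2):
    """Return sqrt((x1 - x2)^2 + (y1 - y2)^2) for two (x, y) pairs."""
    x1, y1 = coord_set_1
    x2, y2 = coord_set_2
    return math.hypot(x1 - x2, y1 - y2)


distance = euclidean_2d_distance_sketch((0, 0), (3, 4))  # 3-4-5 triangle
```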
- image_analysis_3D.file_utils.segmentation_decoupling.check_coordinate_inside_box(coord, box)[source]
This function checks if a coordinate is inside a box
- image_analysis_3D.file_utils.segmentation_decoupling.get_larger_bbox(bbox1, bbox2)[source]
This function returns the larger of two bounding boxes
- image_analysis_3D.file_utils.segmentation_decoupling.extract_unique_masks(image_stack)[source]
This function extracts unique masks from an image stack
- Parameters:
image_stack (np.ndarray) – The image stack to extract unique masks from
- Returns:
The dataframe containing the unique masks
- Return type:
pd.DataFrame
- image_analysis_3D.file_utils.segmentation_decoupling.compare_masks_for_merged(df, index1, index2, distance_threshold=10)[source]
This function compares masks for merging
- image_analysis_3D.file_utils.segmentation_decoupling.get_combinations_of_indices(df, distance_threshold=10)[source]
This function gets the combinations of indices
- Parameters:
df (pd.DataFrame) – The dataframe containing the masks
distance_threshold (int, optional) – The distance threshold, by default 10
- Returns:
The dataframe containing the combinations of indices
- Return type:
pd.DataFrame
- image_analysis_3D.file_utils.segmentation_decoupling.merge_sets(list_of_sets)[source]
Merge overlapping sets in-place and count merges.
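One way to merge overlapping sets is sketched below. Unlike the function documented above, this version returns a new list rather than working in-place and does not count merges; it only illustrates the merging idea, and the function name is hypothetical.

```python
def merge_overlapping_sets(list_of_sets):
    """Merge sets that share at least one element until all groups are disjoint."""
    merged = []
    for current in list_of_sets:
        current = set(current)
        remaining = []
        for group in merged:
            if group & current:
                # Absorb any existing group that overlaps the current set.
                current |= group
            else:
                remaining.append(group)
        remaining.append(current)
        merged = remaining
    return merged


groups = merge_overlapping_sets([{1, 2}, {2, 3}, {4}, {5, 6}, {6, 1}])
```

Here {6, 1} bridges {1, 2, 3} and {5, 6}, so those collapse into one group while {4} stays separate.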
- image_analysis_3D.file_utils.segmentation_decoupling.merge_sets_df(merged_df)[source]
This function merges the sets of masks
- Parameters:
merged_df (pd.DataFrame) – The dataframe containing the masks
- Returns:
The dataframe containing the merged masks
- Return type:
pd.DataFrame
- image_analysis_3D.file_utils.segmentation_decoupling.reassemble_each_mask(df, original_img_shape)[source]
This function reassembles the masks from the dataframe
- Parameters:
df (pd.DataFrame) – The dataframe containing the masks
original_img_shape (tuple) – The shape of the original image
- Returns:
The reassembled masks
- Return type:
np.ndarray
- image_analysis_3D.file_utils.segmentation_decoupling.get_dimensionality(image_array)[source]
This function returns the dimensionality of an image array while checking if the input is a numpy array
- image_analysis_3D.file_utils.segmentation_decoupling.get_number_of_unique_labels(image_array)[source]
This function returns the number of unique labels in an image array
- Parameters:
image_array (np.ndarray) – The image array to check the number of unique labels
- Returns:
The number of unique labels in the image array
- Return type:
Errors
This class defines a custom exception class for exceeding the max workers on a machine.
Segmentation Utilities
General Segmentation
- image_analysis_3D.segmentation_utils.general_segmentation_utils.sliding_window_two_point_five_D(image_stack, window_size)[source]
Create 2.5D max-projection stack using a sliding window.
- Parameters:
image_stack (np.ndarray) – Input 3D image stack (Z, Y, X).
window_size (int) – Number of slices to project per window.
- Returns:
2.5D stack of max projections.
- Return type:
np.ndarray
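A sliding-window max projection can be sketched with NumPy as below. It assumes windows advance one slice at a time with no padding, so the output has Z - window_size + 1 projections; how the package handles the stack edges is not stated here, and the function name is hypothetical.

```python
import numpy as np


def sliding_window_max_projection(image_stack, window_size):
    """Max-project each run of `window_size` consecutive z-slices of a (Z, Y, X) stack."""
    n_slices = image_stack.shape[0]
    if not 1 <= window_size <= n_slices:
        raise ValueError("window_size must be between 1 and the number of z-slices")
    return np.stack(
        [
            image_stack[start : start + window_size].max(axis=0)
            for start in range(n_slices - window_size + 1)
        ]
    )


# Values increase with z, so each projection equals the last slice of its window.
stack = np.arange(4 * 2 * 2).reshape(4, 2, 2)
projections = sliding_window_max_projection(stack, window_size=3)
```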
- image_analysis_3D.segmentation_utils.general_segmentation_utils.reverse_sliding_window_max_projection(output_dict, window_size, original_z_slice_count)[source]
Reconstruct per-slice masks from sliding-window projections.
- image_analysis_3D.segmentation_utils.general_segmentation_utils.butterworth_grid_optimization(img, return_plot=False)[source]
Sweep Butterworth parameters and optionally plot results.
- Parameters:
img (np.ndarray) – Image stack used for optimization.
return_plot (bool, optional) – Whether to display a parameter grid plot, by default False.
- Returns:
Only displays plots when requested.
- Return type:
None
- image_analysis_3D.segmentation_utils.general_segmentation_utils.apply_butterworth_filter(img, cutoff_frequency_ratio=0.05, order=1, high_pass=False, squared_butterworth=True)[source]
Apply a Butterworth filter and Gaussian smoothing.
- Parameters:
img (np.ndarray) – Input image stack to filter.
cutoff_frequency_ratio (float, optional) – Cutoff frequency ratio, by default 0.05.
order (int, optional) – Butterworth filter order, by default 1.
high_pass (bool, optional) – Whether to use a high-pass filter, by default False.
squared_butterworth (bool, optional) – Use squared Butterworth response, by default True.
- Returns:
Filtered image stack.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.decouple_masks(reconstruction_dict, original_img_shape, distance_threshold, verbose=False)[source]
Decouple projected masks into per-slice masks.
- Parameters:
- Returns:
Mapping of slice index to reassembled masks.
- Return type:
- image_analysis_3D.segmentation_utils.general_segmentation_utils.generate_coordinates_for_reconstruction(image)[source]
Generate centroid and bounding-box coordinates for reconstruction.
- Parameters:
image (np.ndarray) – Labeled mask stack (Z, Y, X).
- Returns:
DataFrame of labels, centroids, and bounding boxes.
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.general_segmentation_utils.generate_distance_pairs(coordinates_df, x_y_vector_radius_max_constraint)[source]
Create distance pairs for centroid matching across slices.
- Parameters:
coordinates_df (pd.DataFrame) – DataFrame containing centroid coordinates.
x_y_vector_radius_max_constraint (int) – Maximum centroid distance to include.
- Returns:
Pairwise centroid distances within the constraint.
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.general_segmentation_utils.calculate_mask_iou(mask1, mask2)[source]
Calculate the Intersection over Union (IoU) between two binary masks.
- Parameters:
mask1 (np.ndarray) – The first binary mask.
mask2 (np.ndarray) – The second binary mask.
- Returns:
True if the IoU is greater than 0.5, False otherwise.
- Return type:
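For reference, IoU is the number of pixels set in both masks divided by the number set in either. The sketch below computes that ratio and then applies the 0.5 threshold mentioned in the Returns text; treating two empty masks as IoU 0 is an assumption, and the function name is hypothetical.

```python
import numpy as np


def mask_iou(mask1, mask2):
    """Return intersection over union of two binary masks as a float."""
    mask1 = np.asarray(mask1, dtype=bool)
    mask2 = np.asarray(mask2, dtype=bool)
    union = np.logical_or(mask1, mask2).sum()
    if union == 0:
        return 0.0  # assumption: two empty masks do not overlap
    return float(np.logical_and(mask1, mask2).sum() / union)


a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
iou = mask_iou(a, b)   # intersection of 1 pixel, union of 2 pixels
overlaps = iou > 0.5   # strict threshold, as in the Returns description
```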
- image_analysis_3D.segmentation_utils.general_segmentation_utils.graph_creation(df)[source]
Build a graph connecting centroid pairs.
- Parameters:
df (pd.DataFrame) – DataFrame of centroid pairs and distances.
- Returns:
Graph with centroid nodes and distance edges.
- Return type:
networkx.Graph
- image_analysis_3D.segmentation_utils.general_segmentation_utils.solve_graph(G)[source]
Solve for longest shortest paths in a graph.
- image_analysis_3D.segmentation_utils.general_segmentation_utils.merge_sets(list_of_sets)[source]
Merge overlapping sets of node indices.
- image_analysis_3D.segmentation_utils.general_segmentation_utils.collapse_labels(df, longest_paths)[source]
Collapse labels using graph paths.
- Parameters:
df (pd.DataFrame) – DataFrame containing unique IDs.
longest_paths (list) – List of paths from graph solution.
- Returns:
Updated DataFrame with collapsed labels.
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.general_segmentation_utils.reassign_labels(image, df)[source]
Reassign labels in a mask based on a mapping DataFrame.
- Parameters:
image (np.ndarray) – Mask stack to relabel.
df (pd.DataFrame) – DataFrame containing label mappings by slice.
- Returns:
Relabeled mask stack.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.calculate_bbox_area(bbox)[source]
Calculate the area of a bounding box.
- image_analysis_3D.segmentation_utils.general_segmentation_utils.calculate_overlap(bbox1, bbox2)[source]
Calculate overlap percentage between two bounding boxes.
- image_analysis_3D.segmentation_utils.general_segmentation_utils.check_for_all_same_labels(object_information_df)[source]
Check if all labels in the object information DataFrame are the same.
- Parameters:
object_information_df (pd.DataFrame) – The DataFrame containing object information with ‘label’ column.
- Returns:
True if all labels are the same, False otherwise.
- Return type:
- image_analysis_3D.segmentation_utils.general_segmentation_utils.missing_slice_check(object_information_df, window_min=0, window_max=2, interpolated_rows_to_add=[])[source]
Check for missing slices in the object information DataFrame and add interpolated rows if necessary.
- Parameters:
object_information_df (pd.DataFrame) – The DataFrame containing object information with ‘z’ and ‘label’ columns.
window_min (int, optional) – The minimum window size for checking missing slices, by default 0
window_max (int, optional) – The maximum window size for checking missing slices, by default 2
interpolated_rows_to_add (List[int], optional) – A list to store rows to be added for interpolation, by default []
- Returns:
A list of DataFrames containing rows to be added for interpolation.
- Return type:
List[pd.DataFrame]
- image_analysis_3D.segmentation_utils.general_segmentation_utils.add_min_max_boundry_slices(object_information_df, global_min_z, global_max_z, interpolated_rows_to_add=[])[source]
Add slices to the object information DataFrame that are one slice away from the global min and max z slices.
- Parameters:
object_information_df (pd.DataFrame) – The DataFrame containing object information with ‘z’ and ‘label’ columns.
global_min_z (int) – The global minimum z slice.
global_max_z (int) – The global maximum z slice.
interpolated_rows_to_add (List[pd.DataFrame], optional) – A list to store rows to be added for interpolation, by default []
- Returns:
A list of DataFrames containing rows to be added for interpolation at the min and max z slices.
- Return type:
List[pd.DataFrame]
- image_analysis_3D.segmentation_utils.general_segmentation_utils.add_masks_where_missing(new_mask_image, interpolated_rows_to_add_df)[source]
Add masks to the new mask image where the slices are missing based on the interpolated rows.
- Parameters:
new_mask_image (np.ndarray) – The new mask image to which the slices will be added.
interpolated_rows_to_add_df (pd.DataFrame) – The DataFrame containing the rows to be added for interpolation, with columns ‘added_z’, ‘added_new_label’, and ‘zslice_to_copy’.
- Returns:
The new mask image with the added slices.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.reorder_organoid_labels(label_image)[source]
Reorder the labels in the label image to ensure they are sequential starting from 1.
- Parameters:
label_image (np.ndarray) – The label image where labels need to be reordered.
- Returns:
The label image with reordered labels.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.run_post_hoc_refinement(mask_image, sliding_window_context)[source]
Refine labels by interpolating across missing slices.
- image_analysis_3D.segmentation_utils.general_segmentation_utils.segment_cells_with_3D_watershed(cyto_signal, nuclei_mask)[source]
Segment cells using seeded 3D watershed.
- Parameters:
cyto_signal (np.ndarray) – Cytoplasm signal image stack.
nuclei_mask (np.ndarray) – Nuclei mask used as watershed seeds.
- Returns:
Cell segmentation mask.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.remove_edge_cases(mask, border=10)[source]
Remove masks that are image edge cases. In this case, the edge literally means the edge of the image. This is useful for removing masks that are not fully contained within the image.
- Parameters:
mask (np.ndarray) – The mask to process; should be a 3D numpy array.
border (int, optional) – The width, in pixels, of the border region scanned for edge cases, by default 10
- Returns:
The mask with edge cases removed
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.centroid_within_bbox_detection(centroid, bbox)[source]
Check if the centroid is within the bbox
- Parameters:
- Returns:
True if the centroid is within the bbox, False otherwise
- Return type:
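A containment test of this kind might look like the sketch below. It assumes the bbox lists all minimum coordinates followed by all maximum coordinates (for example (zmin, ymin, xmin, zmax, ymax, xmax)) and that the bounds are inclusive; neither detail is confirmed for centroid_within_bbox_detection, and the function name is hypothetical.

```python
def centroid_within_bbox(centroid, bbox):
    """Return True if every centroid coordinate lies within the bbox bounds (inclusive)."""
    n_dims = len(centroid)
    if len(bbox) != 2 * n_dims:
        raise ValueError("bbox must hold a minimum and a maximum per centroid dimension")
    mins, maxs = bbox[:n_dims], bbox[n_dims:]
    return all(low <= value <= high for value, low, high in zip(centroid, mins, maxs))


inside = centroid_within_bbox((5, 5, 5), (0, 0, 0, 10, 10, 10))
outside = centroid_within_bbox((11, 5, 5), (0, 0, 0, 10, 10, 10))
```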
- image_analysis_3D.segmentation_utils.general_segmentation_utils.check_if_centroid_within_mask(centroid, mask, label)[source]
Check if the centroid is within the mask
- image_analysis_3D.segmentation_utils.general_segmentation_utils.get_labels_for_post_hoc_reassignment(compartment_mask, compartment_name)[source]
Collect centroid and bbox data for mask relabeling.
- Parameters:
compartment_mask (np.ndarray) – Labeled mask for a compartment.
compartment_name (str) – Name of the compartment.
- Returns:
DataFrame of centroids, bboxes, and labels.
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.general_segmentation_utils.mask_label_reassignment(mask_df, mask_input)[source]
Reassign the labels of the mask based on the mask_df
- Parameters:
mask_df (pd.DataFrame) – DataFrame containing the labels and centroids of the mask
mask_input (np.ndarray) – The input mask to reassign the labels to
- Returns:
The mask with reassigned labels
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.run_post_hoc_mask_reassignment(nuclei_mask, cell_mask, nuclei_df, cell_df, return_dataframe=False)[source]
Reassign nuclei labels based on cell containment.
- Parameters:
nuclei_mask (np.ndarray) – Nuclei segmentation mask.
cell_mask (np.ndarray) – Cell segmentation mask.
nuclei_df (pd.DataFrame) – DataFrame with nuclei centroids and labels.
cell_df (pd.DataFrame) – DataFrame with cell centroids and labels.
return_dataframe (bool, optional) – Whether to return the merged DataFrame, by default False.
- Returns:
Updated nuclei mask and optional merged DataFrame.
- Return type:
tuple[np.ndarray, pd.DataFrame | None]
- image_analysis_3D.segmentation_utils.general_segmentation_utils.create_cytoplasm_masks(nuclei_masks, cell_masks)[source]
Create cytoplasm masks by subtracting nuclei from cells.
- Parameters:
nuclei_masks (np.ndarray) – Nuclei segmentation masks.
cell_masks (np.ndarray) – Cell segmentation masks.
- Returns:
Cytoplasm masks.
- Return type:
np.ndarray
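The subtraction can be pictured as zeroing every cell voxel that is also covered by a nucleus. The sketch below assumes both inputs are label arrays of the same shape with 0 as background; the function name is hypothetical.

```python
import numpy as np


def cytoplasm_from_cell_and_nuclei(nuclei_masks, cell_masks):
    """Keep cell labels only where no nucleus is present."""
    cytoplasm = cell_masks.copy()
    cytoplasm[nuclei_masks > 0] = 0
    return cytoplasm


cell = np.array([[[1, 1, 1, 0], [2, 2, 2, 2]]])
nuclei = np.array([[[0, 1, 0, 0], [0, 2, 2, 0]]])
cyto = cytoplasm_from_cell_and_nuclei(nuclei, cell)
```

Copying the cell mask first leaves the input arrays unmodified.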
- image_analysis_3D.segmentation_utils.general_segmentation_utils.clean_border_objects(segmentation, border_width=20)[source]
Remove objects touching the segmentation border.
- Parameters:
segmentation (np.ndarray) – Labeled segmentation mask.
border_width (int, optional) – Width of the border region, by default 20.
- Returns:
Cleaned segmentation mask.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.general_segmentation_utils.remove_label_id(mask_image, label_id_to_remove)[source]
Remove the label id
- Parameters:
mask_image (np.ndarray) – Mask image from which to remove the label id
label_id_to_remove (int) – Label id to remove from the mask image
- Returns:
Mask image with the label id removed
- Return type:
np.ndarray
Nuclei Segmentation
Nuclei segmentation using Cellpose in a 2D manner.
- image_analysis_3D.segmentation_utils.nuclei_segmentation.segmentaion_on_two_D(imgs, diameter=None)[source]
Run 2D Cellpose segmentation slice-by-slice.
- image_analysis_3D.segmentation_utils.nuclei_segmentation.build_complete_bipartite_graph(input_masks, distance_threshold=None)[source]
Build a complete bipartite graph from 2D segmentation masks.
For each pair of consecutive slices, connect EVERY object in slice N to EVERY object in slice N+1, computing Euclidean distance between centroids.
- Parameters:
- Returns:
G (networkx.Graph) – NetworkX graph.
df (pandas.DataFrame) – DataFrame with all edges.
- Return type:
tuple[networkx.Graph, pandas.DataFrame]
- image_analysis_3D.segmentation_utils.nuclei_segmentation.solve_graph_improved(G, max_distance=100.0, slices=None, verbose=False)[source]
Solve bipartite matching across consecutive slices using greedy matching based on edge weights (distances).
Instead of using the Hungarian algorithm (which can force bad matches), greedily match objects in order of smallest distance. This ensures that only objects that are genuinely close are connected.
- Parameters:
- Returns:
List of paths, where each path is a list of node IDs.
- Return type:
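The greedy idea can be illustrated on a plain edge list for a single pair of consecutive slices. This is a simplified sketch, not the package's implementation: it takes (node_a, node_b, distance) tuples instead of a networkx graph, and the function and node names are hypothetical.

```python
def greedy_match(edges, max_distance=100.0):
    """Match nodes across two slices, accepting the shortest available edges first."""
    matched_a, matched_b, matches = set(), set(), []
    for node_a, node_b, distance in sorted(edges, key=lambda edge: edge[2]):
        if distance > max_distance:
            break  # edges are sorted, so every remaining edge is too long as well
        if node_a in matched_a or node_b in matched_b:
            continue  # each object may be matched at most once
        matched_a.add(node_a)
        matched_b.add(node_b)
        matches.append((node_a, node_b))
    return matches


edges = [
    ("z0_a", "z1_a", 2.0),
    ("z0_a", "z1_b", 50.0),
    ("z0_b", "z1_a", 3.0),
    ("z0_b", "z1_b", 150.0),
]
matches = greedy_match(edges)
```

In this example z0_b stays unmatched because its only free partner is 150 pixels away. An assignment solver that must pair every object would instead accept a distant match.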
- image_analysis_3D.segmentation_utils.nuclei_segmentation.split_long_trajectories(paths, max_length)[source]
Split trajectories that exceed max_length into shorter ones.
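One simple splitting strategy is to cut each path into consecutive chunks of at most max_length nodes, as sketched below. Whether split_long_trajectories cuts paths this way is an assumption; the function name is hypothetical.

```python
def split_long_paths(paths, max_length):
    """Cut every path into consecutive chunks of at most `max_length` nodes."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    split = []
    for path in paths:
        for start in range(0, len(path), max_length):
            split.append(path[start : start + max_length])
    return split


shorter = split_long_paths([[1, 2, 3, 4, 5], [6, 7]], max_length=2)
```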
- image_analysis_3D.segmentation_utils.nuclei_segmentation.collapse_labels_from_paths(input_masks, paths)[source]
Assign unified labels based on trajectories.
- image_analysis_3D.segmentation_utils.nuclei_segmentation.stack_3d_segmentation(relabeled_masks)[source]
Stack 2D relabeled masks into 3D volume.
- Parameters:
relabeled_masks (list[numpy.ndarray])
- Return type:
- image_analysis_3D.segmentation_utils.nuclei_segmentation.remove_single_slice_objects(segmentation_mask)[source]
Remove objects that only appear in a single z-slice.
- Parameters:
segmentation_mask (numpy.ndarray) – 3D segmentation mask array.
- Returns:
Cleaned 3D segmentation with single-slice objects removed.
- Return type:
- image_analysis_3D.segmentation_utils.nuclei_segmentation.fill_object_gaps(segmentation_mask, max_gap_size=2)[source]
Fill gaps in object trajectories (missing slices between appearances).
For example, if object ID 5 appears in slices [10, 11, 14, 15], the gap between 11 and 14 will be filled if gap_size <= max_gap_size.
- Parameters:
segmentation_mask (numpy.ndarray) – 3D segmentation mask array.
max_gap_size (int, optional) – Maximum number of consecutive missing slices to fill (default: 2, meaning fill gaps of 1-2 slices).
- Returns:
Filled 3D segmentation.
- Return type:
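The gap rule from the example above can be checked on slice indices alone. This sketch only identifies which z-slices would be filled for one object; how the filled slices are populated is not described here, and the function name is hypothetical.

```python
def slices_to_fill(present_slices, max_gap_size=2):
    """Return the missing z-slices lying in gaps of at most `max_gap_size` slices."""
    ordered = sorted(set(present_slices))
    to_fill = []
    for previous, current in zip(ordered, ordered[1:]):
        gap_size = current - previous - 1
        if 0 < gap_size <= max_gap_size:
            to_fill.extend(range(previous + 1, current))
    return to_fill


# Object present in slices 10, 11, 14, 15: the gap between 11 and 14 spans 2 slices.
filled = slices_to_fill([10, 11, 14, 15], max_gap_size=2)
too_wide = slices_to_fill([10, 11, 14, 15], max_gap_size=1)
```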
- image_analysis_3D.segmentation_utils.nuclei_segmentation.postprocess_segmentation(segmentation_mask, remove_singletons=True, fill_gaps=True, max_gap_size=2)[source]
Post-process 3D segmentation to clean up artifacts.
- Parameters:
segmentation_mask (numpy.ndarray) – 3D segmentation array.
remove_singletons (bool, optional) – If True, remove objects that only appear in 1 slice.
fill_gaps (bool, optional) – If True, fill small gaps in object trajectories.
max_gap_size (int, optional) – Maximum gap size to fill (only used if fill_gaps=True).
- Returns:
Cleaned 3D segmentation.
- Return type:
- image_analysis_3D.segmentation_utils.nuclei_segmentation.object_stitching_and_relation(input_masks, max_match_distance=100.0, max_trajectory_length=None, verbose=False)[source]
Complete pipeline: build complete bipartite graph -> solve matching -> relabel.
- Parameters:
input_masks (list) – List of 2D segmentation masks.
max_match_distance (float, optional) – Maximum distance to accept a match (in pixels).
max_trajectory_length (int or None, optional) – Optional maximum number of consecutive slices an object can span. If None, no limit. Use to prevent unrealistically tall objects (e.g., set to 10 if cells shouldn’t span >10 slices).
verbose (bool, optional) – Print diagnostics.
- Returns:
segmentation_mask (numpy.ndarray) – 3D array with unified instance labels across slices.
diagnostics (dict) – Dict with stats about the matching.
- Return type:
Cell Segmentation
Cell segmentation in 3D.
- image_analysis_3D.segmentation_utils.cell_segmentation.fill_holes_in_mask(mask, compartment=None)[source]
This function fills holes in instance segmented mask images
- Parameters:
mask (np.ndarray) – 3D instance segmented mask image where each cell has a unique integer label and background is 0
compartment (str, optional) – Compartment type of the mask (e.g. “cell” or “organoid”), by default None. This is used to determine the hole filling strategy.
- Raises:
ValueError – If compartment is not specified, because the hole-filling strategy depends on the compartment type.
- Returns:
3D instance segmented mask image with holes filled
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.cell_segmentation.segment_cells_with_3D_watershed(cyto_signal, nuclei_mask, thresholded_signal, connectivity=1, compactness=0)[source]
Segment cells using 3D watershed algorithm.
Segments cells using a 3D watershed algorithm given cytoplasm signal (channel) and nuclei mask.
- Parameters:
cyto_signal (np.ndarray) – 3D numpy array representing the cytoplasm signal.
nuclei_mask (np.ndarray) – 3D numpy array representing the nuclei mask.
thresholded_signal (np.ndarray) – 3D numpy array representing the thresholded cytoplasm signal to be used as a mask for watershed.
connectivity (int, optional) – Connectivity parameter for the watershed algorithm. Default is 1. A value of 1 means only directly adjacent pixels (6-connectivity in 3D) are considered connected, preventing over-segmentation.
compactness (float, optional) – Compactness parameter controlling watershed region shape. Default is 0. A value of 0 means no compactness enforcement, allowing irregularly shaped segments to capture true cell morphology.
- Returns:
3D numpy array representing the segmented cell mask.
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.cell_segmentation.perform_morphology_dependent_segmentation(organoid_label, cyto_signal, nuclei_mask, min_size=1000, max_size=10000000)[source]
Perform morphology-dependent cell segmentation.
Performs morphology dependent segmentation based on the provided morphology label.
- Parameters:
organoid_label (str) – Morphology label indicating the type of morphology.
cyto_signal (np.ndarray) – 3D numpy array representing the cytoplasm signal.
nuclei_mask (np.ndarray) – 3D numpy array representing the nuclei mask.
min_size (int, optional) – Minimum size threshold for segmented objects. Default is 1,000 voxels.
max_size (int, optional) – Maximum size threshold for segmented objects. Default is 10,000,000 voxels.
- Returns:
3D numpy array representing the segmented cell mask.
- Return type:
np.ndarray
Segmentation Decoupling
Utilities for decoupling and merging segmentation masks.
- image_analysis_3D.segmentation_utils.segmentation_decoupling.euclidian_2D_distance(coord_set_1, coord_set_2)[source]
This function calculates the Euclidean distance between two sets of 2D coordinates:
sqrt((x1 - x2)^2 + (y1 - y2)^2)
- image_analysis_3D.segmentation_utils.segmentation_decoupling.check_coordinate_inside_box(coord, box)[source]
This function checks if a coordinate is inside a box
- image_analysis_3D.segmentation_utils.segmentation_decoupling.get_larger_bbox(bbox1, bbox2)[source]
This function returns the larger of two bounding boxes
- image_analysis_3D.segmentation_utils.segmentation_decoupling.extract_unique_masks(image_stack)[source]
This function extracts unique masks from an image stack
- Parameters:
image_stack (np.ndarray) – The image stack to extract unique masks from
- Returns:
The dataframe containing the unique masks
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.segmentation_decoupling.compare_masks_for_merged(df, index1, index2, distance_threshold=10)[source]
This function compares masks for merging
- image_analysis_3D.segmentation_utils.segmentation_decoupling.get_combinations_of_indices(df, distance_threshold=10)[source]
This function gets the combinations of indices
- Parameters:
df (pd.DataFrame) – The dataframe containing the masks
distance_threshold (int, optional) – The distance threshold, by default 10
- Returns:
The dataframe containing the combinations of indices
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.segmentation_decoupling.merge_sets(list_of_sets)[source]
Merge overlapping sets in-place and count merges.
- image_analysis_3D.segmentation_utils.segmentation_decoupling.merge_sets_df(merged_df)[source]
This function merges the sets of masks
- Parameters:
merged_df (pd.DataFrame) – The dataframe containing the masks
- Returns:
The dataframe containing the merged masks
- Return type:
pd.DataFrame
- image_analysis_3D.segmentation_utils.segmentation_decoupling.reassemble_each_mask(df, original_img_shape)[source]
This function reassembles the masks from the dataframe
- Parameters:
df (pd.DataFrame) – The dataframe containing the masks
original_img_shape (tuple) – The shape of the original image
- Returns:
The reassembled masks
- Return type:
np.ndarray
- image_analysis_3D.segmentation_utils.segmentation_decoupling.get_dimensionality(image_array)[source]
This function returns the dimensionality of an image array while checking if the input is a numpy array
- image_analysis_3D.segmentation_utils.segmentation_decoupling.get_number_of_unique_labels(image_array)[source]
This function returns the number of unique labels in an image array
- Parameters:
image_array (np.ndarray) – The image array to check the number of unique labels
- Returns:
The number of unique labels in the image array
- Return type:
Image Utilities
Image Processing
- image_analysis_3D.image_utils.image_utils.select_objects_from_label(label_image, object_ids)[source]
Selects objects from a label image based on the provided object IDs.
- Parameters:
label_image (numpy.ndarray) – The segmented label image.
object_ids (list) – The object IDs to select.
- Returns:
The label image with only the selected objects.
- Return type:
- image_analysis_3D.image_utils.image_utils.expand_box(min_coor, max_coord, current_min, current_max, expand_by)[source]
Expand the bounding box of an object in a 3D image.
- Parameters:
min_coor (int) – The minimum coordinate of the image for any dimension.
max_coord (int) – The maximum coordinate of the image for any dimension.
current_min (int) – The current minimum coordinate of the bounding box of an object for any dimension.
current_max (int) – The current maximum coordinate of the bounding box of an object for any dimension.
expand_by (int) – The amount to expand the bounding box by.
- Returns:
The new minimum and maximum coordinates of the bounding box.
- Return type:
Union[Tuple[int, int], ValueError]
- Raises:
ValueError – If the expansion is not possible.
- image_analysis_3D.image_utils.image_utils.new_crop_border(bbox1, bbox2, image)[source]
Expand the bounding boxes of two objects in a 3D image to match their sizes.
- Parameters:
bbox1 (Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]) – The bounding box of the first object.
bbox2 (Tuple[Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float], Union[int, float]]) – The bounding box of the second object.
image (numpy.ndarray) – The image to crop for each of the bounding boxes.
- Returns:
The new bounding boxes of the two objects.
- Return type:
Tuple[Tuple[int | float, int | float, int | float, int | float, int | float, int | float], Tuple[int | float, int | float, int | float, int | float, int | float, int | float]]
- Raises:
ValueError – If the expansion is not possible.
- image_analysis_3D.image_utils.image_utils.crop_3D_image(image, bbox)[source]
Crop a 3D image to the bounding box of a mask.
- image_analysis_3D.image_utils.image_utils.single_3D_image_expand_bbox(image, bbox, expand_pixels, anisotropy_factor)[source]
Expand the bbox while keeping the crop within the bounds of the image volume.
- Parameters:
image (numpy.ndarray) – 3D image array from which the bbox was derived
bbox (tuple[int, int, int, int, int, int]) – 3D bbox in the format (zmin, ymin, xmin, zmax, ymax, xmax)
expand_pixels (int) – Number of pixels by which to expand the bbox in each direction (z, y, x). The coordinates are treated as isotropic here, so the expansion is the same across dimensions; the anisotropy factor is used to adjust it for the z dimension.
anisotropy_factor (int) – The ratio of “pixel” size in um between the z dimension and the x/y dimensions. This is used to adjust the expansion of the bbox in the z dimension to account for anisotropy in the image volume. For example, if the z spacing is 5um and the x/y spacing is 1um, then the anisotropy factor would be 5.
- Returns:
Updated bbox in the format (zmin, ymin, xmin, zmax, ymax, xmax) after expansion and adjustment for anisotropy
- Return type:
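The anisotropy adjustment can be illustrated with a short sketch. It assumes the z expansion is the x/y expansion divided by the anisotropy factor (integer division) and that coordinates are clamped to the volume shape. The helper name `expand_bbox_anisotropic` is hypothetical, and it takes the volume shape instead of the image array to stay dependency-free.

```python
def expand_bbox_anisotropic(shape, bbox, expand_pixels, anisotropy_factor):
    """Expand a (zmin, ymin, xmin, zmax, ymax, xmax) bbox, clamped to the volume shape."""
    depth, height, width = shape
    zmin, ymin, xmin, zmax, ymax, xmax = bbox
    # Assumption: one z slice covers anisotropy_factor x/y pixels of physical distance,
    # so the z expansion is proportionally smaller.
    z_expand = expand_pixels // anisotropy_factor
    return (
        max(0, zmin - z_expand),
        max(0, ymin - expand_pixels),
        max(0, xmin - expand_pixels),
        min(depth, zmax + z_expand),
        min(height, ymax + expand_pixels),
        min(width, xmax + expand_pixels),
    )


# z spacing 5 um, x/y spacing 1 um: expanding by 10 x/y pixels adds 2 z slices.
expanded = expand_bbox_anisotropic((20, 100, 100), (1, 40, 5, 6, 60, 30), 10, 5)
```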
- image_analysis_3D.image_utils.image_utils.check_for_xy_squareness(bbox)[source]
Return the ratio of the x length to the y length of the bbox. A value of 1 indicates a square bbox.
- Parameters:
bbox (tuple[int, int, int, int, int, int]) – The bbox to check: (z_min, y_min, x_min, z_max, y_max, x_max), where each value is an int representing the pixel coordinate of the bbox in that dimension.
- Returns:
The ratio of the y length to the x length of the bbox. A value of 1 indicates a square bbox.
- Return type:
- image_analysis_3D.image_utils.image_utils.square_off_xy_crop_bbox(bbox)[source]
Adjust the bbox to be square in the XY plane.
The function computes the new bbox from the current X/Y dimensions.
- Parameters:
bbox (tuple[int, int, int, int, int, int]) –
The bbox to adjust: (z_min, y_min, x_min, z_max, y_max, x_max)
Each value is an integer pixel coordinate in that dimension.
- Returns:
The adjusted bbox that is square in the XY plane: (z_min, new_y_min, new_x_min, z_max, new_y_max, new_x_max)
Each value is an integer pixel coordinate in that dimension.
- Return type:
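A possible way to square a bbox in the XY plane is to pad the shorter side until it matches the longer one. The sketch below assumes symmetric padding and does not clamp to the image bounds; the helper name `square_off_xy` is hypothetical.

```python
def square_off_xy(bbox):
    """Pad the shorter of the X/Y sides so both have the same length; Z is untouched."""
    z_min, y_min, x_min, z_max, y_max, x_max = bbox
    y_len = y_max - y_min
    x_len = x_max - x_min
    side = max(y_len, x_len)
    # Padding needed per axis (zero for the longer axis).
    y_pad = side - y_len
    x_pad = side - x_len
    # Split the padding as evenly as integer coordinates allow.
    new_y_min = y_min - y_pad // 2
    new_y_max = y_max + (y_pad - y_pad // 2)
    new_x_min = x_min - x_pad // 2
    new_x_max = x_max + (x_pad - x_pad // 2)
    return (z_min, new_y_min, new_x_min, z_max, new_y_max, new_x_max)


squared = square_off_xy((0, 10, 10, 5, 30, 25))
```

A real implementation would also need to handle padding that pushes coordinates outside the image, which this sketch leaves out.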
Featurization Utilities
Area & Size & Shape
Area, size, and shape features for 3D objects.
- image_analysis_3D.featurization_utils.area_size_shape_utils.calculate_surface_area(label_object, props, spacing)[source]
Calculate the surface area of a 3D object using the marching cubes algorithm.
- Parameters:
label_object (numpy.array) – This is an array of the segmented objects of a given compartment.
props (numpy.array) – This is the output of the regionprops function, which contains information about the objects.
spacing (tuple) – This is the spacing of the image in each dimension (z, y, x).
- Returns:
The surface area for the object.
- Return type:
- image_analysis_3D.featurization_utils.area_size_shape_utils.measure_3D_area_size_shape(image_set_loader, object_loader)[source]
This function calculates the area, size, and shape of objects in a 3D image using the regionprops function. It uses the numpy library to perform the calculations on the CPU.
- Parameters:
image_set_loader (ImageSetLoader) – The image set loader object that contains the image and label image.
object_loader (ObjectLoader) – The object loader object that contains the image and label image.
- Returns:
A dictionary containing the area, size, and shape of the objects in the image.
- Return type:
Colocalization
Colocalization feature extraction utilities for 3D image objects.
Computes per-object colocalization metrics (Pearson correlation, Manders coefficients, overlap coefficient, K1/K2 coefficients) between pairs of fluorescence channels using the Costes automatic thresholding method.
- image_analysis_3D.featurization_utils.colocalization_utils.linear_costes_threshold_calculation(first_image, second_image, scale_max=255, fast_costes='Accurate')[source]
Finds the Costes Automatic Threshold for colocalization using a linear algorithm. Candidate thresholds are gradually decreased until Pearson R falls below 0. If “Fast” mode is enabled, the “steps” between tested thresholds will be increased when Pearson R is much greater than 0. The other mode is “Accurate”, which will always step down by the same amount.
- Parameters:
first_image (numpy.ndarray) – The first fluorescence image.
second_image (numpy.ndarray) – The second fluorescence image.
scale_max (int, optional) – The maximum value for the image scale, by default 255.
fast_costes (str, optional) – The mode for the Costes threshold calculation, by default “Accurate”.
- Returns:
The calculated thresholds for the first and second images.
- Return type:
- image_analysis_3D.featurization_utils.colocalization_utils.bisection_costes_threshold_calculation(first_image, second_image, scale_max=255)[source]
Finds the Costes Automatic Threshold for colocalization using a bisection algorithm. Candidate thresholds are selected from within a window of possible intensities, and this window is narrowed based on the R value of each tested candidate. The goal is to find the first point where R reaches 0, and the R value can become highly variable at lower thresholds in some samples. Therefore, the candidate tested in each loop is 1/6th of the window size below the maximum value (as opposed to the midpoint).
- Parameters:
first_image (numpy.ndarray) – The first fluorescence image.
second_image (numpy.ndarray) – The second fluorescence image.
scale_max (int, optional) – The maximum value for the image scale, by default 255.
- Returns:
The calculated thresholds for the first and second images.
- Return type:
- image_analysis_3D.featurization_utils.colocalization_utils.prepare_two_images_for_colocalization(label_object1, label_object2, image_object1, image_object2, object_id1, object_id2)[source]
This function prepares two images for colocalization analysis by cropping them to the bounding boxes of the specified objects. It selects the objects from the label images, calculates their bounding boxes, and crops the images accordingly.
- Parameters:
label_object1 (numpy.ndarray) – The segmented label image for the first object.
label_object2 (numpy.ndarray) – The segmented label image for the second object.
image_object1 (numpy.ndarray) – The spectral image to crop for the first object.
image_object2 (numpy.ndarray) – The spectral image to crop for the second object.
object_id1 (int) – The object index to select from the label image for the first object.
object_id2 (int) – The object index to select from the label image for the second object.
- Returns:
The two cropped images for colocalization analysis.
- Return type:
Tuple[numpy.ndarray, numpy.ndarray]
- image_analysis_3D.featurization_utils.colocalization_utils.measure_3D_colocalization(cropped_image_1, cropped_image_2, thr=15, fast_costes='Accurate')[source]
This function calculates the colocalization coefficients between two images. It computes the correlation coefficient, Manders’ coefficients, overlap coefficient, and Costes’ coefficients. The results are returned as a dictionary.
- Parameters:
cropped_image_1 (numpy.ndarray) – The first cropped image.
cropped_image_2 (numpy.ndarray) – The second cropped image.
thr (int, optional) – The threshold for the Manders’ coefficients, by default 15
fast_costes (str, optional) – The mode for Costes’ threshold calculation, by default “Accurate”. Options are “Accurate” or “Fast”. “Accurate” uses a linear algorithm, while “Fast” uses a bisection algorithm. The “Fast” mode is faster but less accurate.
- Returns:
The output features for colocalization analysis.
- Return type:
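To make the coefficients concrete, here is a minimal sketch of the Pearson, Manders, and overlap calculations. It assumes `thr` is a percentage of each image's maximum intensity, omits the Costes thresholding entirely, and uses hypothetical output keys; it is not the package's implementation.

```python
import numpy as np


def colocalization_sketch(image_1, image_2, thr=15):
    """Compute a few colocalization coefficients for two same-shaped images."""
    a = np.asarray(image_1, dtype=float).ravel()
    b = np.asarray(image_2, dtype=float).ravel()
    # Pearson correlation over all voxels.
    pearson = float(np.corrcoef(a, b)[0, 1])
    # Assumption: threshold expressed as a percentage of each image's maximum.
    mask_a = a > (thr / 100.0) * a.max()
    mask_b = b > (thr / 100.0) * b.max()
    both = mask_a & mask_b
    # Manders: fraction of above-threshold intensity that co-occurs with the other channel.
    m1 = float(a[both].sum() / a[mask_a].sum())
    m2 = float(b[both].sum() / b[mask_b].sum())
    # Overlap coefficient on the jointly above-threshold voxels.
    overlap = float(
        (a[both] * b[both]).sum()
        / np.sqrt((a[both] ** 2).sum() * (b[both] ** 2).sum())
    )
    return {"Correlation": pearson, "Manders1": m1, "Manders2": m2, "Overlap": overlap}


volume = np.arange(1, 28).reshape(3, 3, 3)
coloc = colocalization_sketch(volume, volume)
```

The sketch does not guard against empty masks, so a constant or all-zero crop would need separate handling.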
Granularity
Calculate the granularity spectrum of a 3D image.
- image_analysis_3D.featurization_utils.granularity_utils._fix_scipy_ndimage_result(result)[source]
Convert scipy.ndimage aggregation results to a consistent array.
Equivalent to centrosome.cpmorphology.fixup_scipy_ndimage_result. scipy.ndimage.mean/sum can return a scalar when there’s one label, or a list otherwise. This ensures we always get a numpy array.
- Parameters:
result (scalar, list, or numpy.ndarray) – Output from scipy.ndimage.mean or similar.
- Returns:
1-D array of results.
- Return type:
- image_analysis_3D.featurization_utils.granularity_utils._subsample_3d(data, new_shape, subsample_factor, order=1)[source]
Subsample a 3D array using map_coordinates, matching CellProfiler.
CellProfiler generates coordinates for the new shape and divides by subsample_factor to map back into the original coordinate space. The same scalar factor is used for all three axes.
- Parameters:
data (numpy.ndarray) – 3D array to subsample.
new_shape (numpy.ndarray) – Target shape as a float array (coordinate grid extent).
subsample_factor (float) – The factor used to divide coordinates (same for all axes).
order (int) – Interpolation order (1 for linear, 0 for nearest-neighbor).
- Returns:
Subsampled array.
- Return type:
- image_analysis_3D.featurization_utils.granularity_utils._upsample_3d(data, subsampled_shape, original_shape)[source]
Upsample a 3D array back to original shape using map_coordinates.
Matches CellProfiler’s approach for restoring reconstructed images to the original label resolution.
- Parameters:
data (numpy.ndarray) – Subsampled 3D array to upsample.
subsampled_shape (numpy.ndarray) – Shape of the subsampled space (float array, preserves CellProfiler precision).
original_shape (tuple) – Target shape to upsample to.
- Returns:
Upsampled array at original_shape resolution.
- Return type:
- image_analysis_3D.featurization_utils.granularity_utils.measure_3D_granularity(object_loader, radius=10, granular_spectrum_length=16, subsample_size=0.25, image_sample_size=0.25, mask_threshold=0.9, verbose=False, image_mask=None)[source]
Calculate the granularity spectrum of a 3D image.
Follows the CellProfiler MeasureGranularity algorithm exactly for 3D:
1. Subsample the image uniformly (same factor for Z, Y, X).
2. Further subsample for background tophat removal.
3. Iteratively erode with ball(1) and reconstruct, measuring signal lost at each scale as image-level and per-object values.
- Parameters:
object_loader (ObjectLoader) – Loader containing the image and label arrays.
radius (int) – Radius of the structuring element for background removal. Should correspond to texture radius after subsampling.
granular_spectrum_length (int) – Number of granularity scales to measure.
subsample_size (float) – Subsampling factor for the image (0, 1]. Applied uniformly to Z/Y/X.
image_sample_size (float) – Subsampling factor for background reduction (0, 1]. Applied relative to the already-subsampled image.
mask_threshold (float) – Threshold for converting interpolated masks back to boolean.
verbose (bool) – Print diagnostic information.
image_mask (numpy.ndarray or None) – Boolean mask matching the image shape. Corresponds to CellProfiler’s im.mask. If None (default), all pixels are considered valid (all-True mask), matching the typical CellProfiler behavior for unmasked images.
- Returns:
Dictionary with keys ‘object_id’, ‘feature’, ‘value’. Image-level measurements use object_id=0.
- Return type:
Intensity
Intensity feature extraction utilities for 3D image objects.
Provides functions to compute intensity statistics (mean, median, min, max, standard deviation, quartiles), edge-based measurements, center-of-mass coordinates, and mass displacement for segmented 3D objects.
- image_analysis_3D.featurization_utils.intensity_utils.get_outline(mask)[source]
Get the outline of a 3D mask.
- Parameters:
mask (numpy.ndarray) – The input mask.
- Returns:
The outline of the mask.
- Return type:
- image_analysis_3D.featurization_utils.intensity_utils.measure_3D_intensity_CPU(object_loader)[source]
Measure the intensity of objects in a 3D image.
- Parameters:
object_loader (ObjectLoader) – The object loader containing the image and label image.
- Returns:
A dictionary containing the measurements for each object. The keys are the measurement names and the values are the corresponding values.
- Return type:
Neighbors
- image_analysis_3D.featurization_utils.neighbors_utils.neighbors_expand_box(min_coor, max_coord, current_min, current_max, expand_by)[source]
Expand the bounding box of the object by a specified distance in each direction.
- Parameters:
min_coor (Union[int, float]) – The global minimum coordinate of the image.
max_coord (Union[int, float]) – The global maximum coordinate of the image.
current_min (Union[int, float]) – The current minimum coordinate of the object.
current_max (Union[int, float]) – The current maximum coordinate of the object.
expand_by (int) – The distance by which to expand the bounding box.
- Returns:
The new minimum and maximum coordinates of the bounding box.
- Return type:
- image_analysis_3D.featurization_utils.neighbors_utils.crop_3D_image(image, bbox)[source]
Crop the 3D image to the bounding box of the object.
- Parameters:
- Returns:
The cropped 3D image.
- Return type:
- image_analysis_3D.featurization_utils.neighbors_utils.measure_3D_number_of_neighbors(object_loader, distance_threshold=10, anisotropy_factor=10)[source]
This function calculates the number of neighbors for each object in a 3D image.
- Parameters:
object_loader (ObjectLoader) – The object loader object that contains the image and label image.
distance_threshold (int, optional) – The distance threshold for counting neighbors, by default 10
anisotropy_factor (int, optional) – The anisotropy factor for the image where the anisotropy factor is the ratio of the pixel size in the z direction to the pixel size in the x and y directions, by default 10
- Returns:
A dictionary containing the object ID and the number of neighbors for each object.
- Return type:
- image_analysis_3D.featurization_utils.neighbors_utils.get_coordinates(nuclei_mask, object_ids=None)[source]
Extract coordinates from a labeled mask.
- Parameters:
nuclei_mask (ndarray) – 3D labeled mask where each object has a unique ID
object_ids (list) – List of object IDs to extract
- Returns:
coords – DataFrame with columns: object_id, x, y, z
- Return type:
pandas.DataFrame
- image_analysis_3D.featurization_utils.neighbors_utils.calculate_centroid(coords)[source]
Calculate the centroid of cell coordinates.
- Parameters:
coords (pandas.DataFrame)
- Return type:
- image_analysis_3D.featurization_utils.neighbors_utils.euclidean_distance_from_centroid(coords, centroid)[source]
Calculate Euclidean distance from centroid for each cell.
- Parameters:
coords (numpy.ndarray)
centroid (numpy.ndarray)
- Return type:
- image_analysis_3D.featurization_utils.neighbors_utils.mahalanobis_distance_from_centroid(coords, centroid, min_cells_threshold=50)[source]
Calculate Mahalanobis distance from centroid for each cell. This accounts for the covariance structure (shape) of the organoid.
For small sample sizes (<50 cells), uses regularization or falls back to Euclidean.
- Parameters:
coords (ndarray) – Cell coordinates (n_cells, 3)
centroid (ndarray) – Centroid coordinates (3,)
min_cells_threshold (int) – Minimum cells needed for reliable Mahalanobis (default: 50)
- Returns:
distances – Mahalanobis distances for each cell
- Return type:
ndarray
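The distance itself can be sketched with NumPy as follows. The small ridge term stands in for the regularization mentioned above and is an assumption, as is the helper name `mahalanobis_sketch`; the Euclidean fallback for small samples is not shown.

```python
import numpy as np


def mahalanobis_sketch(coords, centroid, ridge=1e-6):
    """Mahalanobis distance of each row of coords (n_cells, 3) from centroid (3,)."""
    diffs = np.asarray(coords, dtype=float) - np.asarray(centroid, dtype=float)
    # Sample covariance of the cell cloud (columns are x, y, z).
    cov = np.cov(diffs, rowvar=False)
    # Assumption: a small ridge keeps the covariance invertible for degenerate clouds.
    cov = cov + ridge * np.eye(cov.shape[0])
    inv_cov = np.linalg.inv(cov)
    # d_i = sqrt(diff_i^T @ inv_cov @ diff_i), evaluated for every row at once.
    return np.sqrt(np.einsum("ij,jk,ik->i", diffs, inv_cov, diffs))


rng = np.random.default_rng(0)
# Synthetic elongated "organoid": wider in x/y than in z.
cells = rng.normal(size=(100, 3)) * np.array([10.0, 10.0, 3.0])
distances = mahalanobis_sketch(cells, cells.mean(axis=0))
```

Because the covariance rescales each axis, a cell that is far along the short z axis gets a larger distance than a cell equally far along x or y.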
- image_analysis_3D.featurization_utils.neighbors_utils.classify_cells_into_shells(coords, n_shells=5, method='mahalanobis', min_cells_per_shell=3, centroid=None)[source]
Classify cells into radial shells based on distance from centroid.
Automatically adjusts n_shells for small organoids to ensure meaningful statistics.
- Parameters:
coords (pandas.DataFrame or dict) – Cell coordinates with columns/keys: object_id, x, y, z
n_shells (int) – Number of concentric shells to create (will be adjusted if needed)
method (str) – ‘euclidean’ or ‘mahalanobis’
min_cells_per_shell (int) – Minimum average cells per shell (default: 3)
centroid (numpy.ndarray, optional) – Pre-calculated centroid (if None, will be calculated from coords)
- Returns:
results – Dictionary containing:
‘ShellAssignments’: Shell number for each cell (0 = innermost)
‘DistancesFromCenter’: Distance from centroid for each cell
‘DistancesFromExterior’: Distance from exterior for each cell
‘NormalizedDistancesFromCenter’: Normalized distances (0-1)
- Return type:
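As an illustration of the shell assignment, the sketch below bins normalized distances into equal-width shells. Equal-width binning and the definition of the distance from the exterior (largest distance minus each cell's distance) are assumptions, and the automatic adjustment of n_shells is left out. The helper name `assign_shells` is hypothetical.

```python
import numpy as np


def assign_shells(distances, n_shells=5):
    """Bin distances from the centroid into n_shells equal-width radial shells."""
    d = np.asarray(distances, dtype=float)
    normalized = d / d.max()
    # Equal-width bins on [0, 1]; the farthest cell (normalized == 1) joins the last shell.
    shells = np.minimum((normalized * n_shells).astype(int), n_shells - 1)
    return {
        "ShellAssignments": shells,
        "DistancesFromCenter": d,
        # Assumption: exterior distance measured relative to the farthest cell.
        "DistancesFromExterior": d.max() - d,
        "NormalizedDistancesFromCenter": normalized,
    }


shell_result = assign_shells([0, 2, 4, 6, 8], n_shells=4)
```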
- image_analysis_3D.featurization_utils.neighbors_utils.create_results_dataframe(results)[source]
Create a pandas DataFrame with all cell information.
- Parameters:
results (dict) – Results from classify_cells_into_shells
- Returns:
df – DataFrame with cell information
- Return type:
pandas.DataFrame
- image_analysis_3D.featurization_utils.neighbors_utils.visualize_organoid_shells(coords, classification_results, title='Organoid Shell Classification', centroid=None)[source]
Create 3D visualization of organoid with shell coloring.
- Parameters:
coords (pandas.DataFrame or dict) – Cell coordinates with columns/keys: object_id, x, y, z
classification_results (dict) – Results from classify_cells_into_shells
title (str) – Plot title
centroid (numpy.ndarray)
- Return type:
matplotlib.pyplot.figure
Texture
- image_analysis_3D.featurization_utils.texture_utils.scale_image(image, num_gray_levels=256)[source]
Scale the image to a specified number of gray levels. Example: 1024 gray levels will be scaled to 256 gray levels if num_gray_levels=256. An image with a pixel value of 0 will be scaled to 0 and a pixel value of 1023 will be scaled to 255.
- Parameters:
image (numpy.ndarray) – The input image to be scaled. Can be a ndarray of any shape.
num_gray_levels (int, optional) – The number of gray levels to scale the image to, by default 256
- Returns:
The gray level scaled image of any shape.
- Return type:
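The gray-level example above can be reproduced with a short sketch. It assumes min-max scaling followed by rounding, which may differ from what `scale_image` actually does; the helper name `rescale_gray_levels` is hypothetical.

```python
import numpy as np


def rescale_gray_levels(image, num_gray_levels=256):
    """Map image intensities onto the integer range [0, num_gray_levels - 1]."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    # A constant image has no dynamic range; map it to all zeros.
    if hi == lo:
        return np.zeros(image.shape, dtype=np.int64)
    # Assumption: linear min-max scaling, then rounding to the nearest level.
    scaled = (image - lo) / (hi - lo) * (num_gray_levels - 1)
    return np.round(scaled).astype(np.int64)


rescaled = rescale_gray_levels(np.array([0, 512, 1023]), num_gray_levels=256)
```

Reducing the number of gray levels this way keeps co-occurrence matrices for texture features at a manageable size.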
- image_analysis_3D.featurization_utils.texture_utils.measure_3D_texture(object_loader, distance=1, grayscale=256)[source]
Calculate texture features for each object in the image using Haralick features.
The features are calculated for each object separately and the mean value is returned.
- Parameters:
- Returns:
A dictionary containing the object ID, texture name, and texture value with keys:
object_id
texture_name
texture_value
Texture names include: Angular Second Moment, Contrast, Correlation, Variance, Inverse Difference Moment, Sum Average, Sum Variance, Sum Entropy, Entropy, and related texture measures.
AngularSecondMoment
Contrast
Correlation
Variance
InverseDifferenceMoment
SumAverage
SumVariance
SumEntropy
Entropy
DifferenceVariance
DifferenceEntropy
InformationMeasureOfCorrelation1
InformationMeasureOfCorrelation2
- Return type:
SAMMED3D Featurizer
SAM-Med3D Feature Extractor: convert SAM-Med3D from a segmentation model to a featurization model.
SAM-Med3D Architecture:
3D Image Encoder (ViT-based): Extracts features from 3D volumes
3D Prompt Encoder: Processes prompts, which are supervision signals provided by the user for segmentation at inference time (neither needed nor used for featurization)
3D Mask Decoder: Generates segmentation masks (not needed for featurization)
For featurization, we extract embeddings from the 3D image encoder.
- Requirements:
pip install torch torchvision monai einops timm
# For using pretrained SAM-Med3D: pip install medim
- class image_analysis_3D.featurization_utils.sammed3d_featurizer.SAMMed3DFeatureExtractor(model_path=None, device='cpu', use_medim=True, image_size=128)[source]
Extract features from 3D microscope volumes using SAM-Med3D encoder.
This class wraps the SAM-Med3D model and extracts dense or global features from the 3D image encoder for downstream tasks like classification, clustering, or retrieval.
- Parameters:
- extract(volume, normalize=True, feature_type=None)[source]
Extract features from a 3D volume.
- Parameters:
volume (numpy.ndarray or torch.Tensor) – 3D volume (Z, Y, X) or (C, Z, Y, X) or (B, C, Z, Y, X).
normalize (bool, optional) – Whether to normalize the volume.
feature_type (str | None)
- Returns:
Feature vector(s) as numpy array.
- Return type:
- class image_analysis_3D.featurization_utils.sammed3d_featurizer.TransformerBlock3D(*args, **kwargs)[source]
3D Transformer block.
- class image_analysis_3D.featurization_utils.sammed3d_featurizer.MicroscopySAMMed3DPipeline(sammed3d_path=None, device='cpu')[source]
End-to-end pipeline for microscopy feature extraction.
- preprocess_volume(volume)[source]
Preprocess microscopy volume.
- Parameters:
volume (numpy.ndarray)
- Return type:
- extract_features(volume, preprocess=True, feature_type=None)[source]
Extract features from microscopy volume.
- Parameters:
volume (numpy.ndarray) – 3D numpy array (Z, Y, X).
preprocess (bool, optional) – Whether to preprocess the volume.
feature_type (str | None)
- Returns:
Feature vector.
- Return type:
- image_analysis_3D.featurization_utils.sammed3d_featurizer.check_for_zero_objects(label_image)[source]
Check if there are any objects in the label image.
- Parameters:
label_image (numpy.ndarray)
- Return type:
- image_analysis_3D.featurization_utils.sammed3d_featurizer.call_SAMMed3D_pipeline(object_loader, SAMMed3D_model_path=None, feature_type=['global', 'patch', 'cls'], extractor=None)[source]
Call the SAMMed3D pipeline to extract features per patient, well-fov.
Here we call the SAMMed3D pipeline to extract features for each object in the label image.
- Parameters:
object_loader (ObjectLoader) – Class that loads the image and label image for a given patient, well-fov, channel, compartment
SAMMed3D_model_path (Optional[str], optional) – Path to the SAMMed3D model, by default None. Ignored if extractor is provided.
feature_type (str | List, optional) – Feature types to extract, by default [“global”, “patch”, “cls”]
extractor (Optional[MicroscopySAMMed3DPipeline], optional) – Pre-loaded extractor instance. If provided, SAMMed3D_model_path is ignored. Use this to avoid reloading the model in loops. By default None.
- Returns:
Dictionary of extracted features from SAMMed3D for each object with keys:
”object_id”: List of object IDs
”feature_name”: List of feature names
”channel”: List of channels
”compartment”: List of compartments
”value”: List of feature values
”feature_type”: List of feature types
- Return type:
- image_analysis_3D.featurization_utils.sammed3d_featurizer.call_whole_image_sammed3d_pipeline(image, SAMMed3D_model_path=None, feature_type=['global', 'patch', 'cls'], extractor=None)[source]
Call the SAMMed3D pipeline to extract features for the whole image.
This function is called per patient, well-fov and extracts features for the whole FOV volume using the SAMMed3D pipeline.
- Parameters:
image (np.ndarray) – 3D numpy array of the image
SAMMed3D_model_path (Optional[str], optional) – Path to the SAMMed3D model, by default None. Ignored if extractor is provided.
feature_type (str | List, optional) – Type of features to extract, by default [“global”, “patch”, “cls”]
extractor (Optional[MicroscopySAMMed3DPipeline], optional) – Pre-loaded extractor instance. If provided, SAMMed3D_model_path is ignored. Use this to avoid reloading the model in loops. By default None.
- Returns:
Dictionary of extracted features from SAMMed3D for the whole image with keys:
”feature_name”: List of feature names
”value”: List of feature values
”feature_type”: List of feature types
”compartment”: List of compartments (will be “Image” for whole image features)
- Return type:
CHAMMI-75 Featurizer
This utilities module contains functions that use CHAMMI-75’s featurization model, a self-supervised deep-learning model built on a Vision Transformer (ViT) architecture.
- image_analysis_3D.featurization_utils.chammi75_featurization.get_chammi75_model(device)[source]
Load the CHAMMI-75 (MorphEm) model from Hugging Face.
- Parameters:
device (str or None) – The device to load the model on ('cuda' or 'cpu'). If None, CUDA is used when available, otherwise CPU.
- Returns:
The CHAMMI-75 (MorphEm) model in evaluation mode.
- Return type:
torch.nn.Module
- class image_analysis_3D.featurization_utils.chammi75_featurization.SaturationNoiseInjector(*args, **kwargs)[source]
Inject uniform random noise into saturated pixels of an image tensor. The image has three channels, where channels 2 and 3 are duplicates of the first channel. Three channels are used to fit the ViT architecture, which expects three-channel input. This transformation replaces saturated pixels (value == 255) in the first channel of an image with uniform random noise sampled from [low, high]. It is applied as a pre-processing step before passing the image to the CHAMMI-75 model.
- forward(x)[source]
Apply saturation-noise injection to the first channel.
- Parameters:
x (torch.Tensor) – Image tensor of shape (C, H, W) where saturated pixels in the first channel (index 0) have value 255.
- Returns:
Tensor with the same shape as x where saturated pixels in the first channel have been replaced by uniform random noise.
- Return type:
torch.Tensor
- class image_analysis_3D.featurization_utils.chammi75_featurization.PerImageNormalize(*args, **kwargs)[source]
Normalize each image independently using InstanceNorm2d.
- Parameters:
eps (float)
- image_analysis_3D.featurization_utils.chammi75_featurization.featurize_2D_image_w_chammi75(image_tensor, model, device)[source]
Extract CHAMMI-75 CLS-token features from a multi-channel 2D image.
The function processes each channel of the input image independently (Bag-of-Channels strategy). In step 1, the function resizes the image tensor to 224×224. In step 2, the function injects random noise into saturated pixels. In step 3, the function normalizes each image. In step 4, the function passes the stacked image into the Vision Transformer encoder. Lastly, in step 5, the function outputs the x_norm_clstoken per channel.
- Parameters:
image_tensor (torch.Tensor) – Batch of images with shape (N, C, H, W) where N is the batch size, C is the number of channels, and H, W are the spatial dimensions.
model (torch.nn.Module) – The loaded CHAMMI-75 (MorphEm) model (see get_chammi75_model()).
device (torch.device) – Device on which to run inference ('cuda' or 'cpu').
- Returns:
A list of length C where each element is a (N, 384) array containing the CLS-token embedding for that channel.
- Return type:
- image_analysis_3D.featurization_utils.chammi75_featurization.call_chammi75_featurization_pipeline(cropped_image, model, device='cpu')[source]
Run the CHAMMI-75 featurization pipeline on a single cropped 2D image.
Converts the input NumPy array to a three-channel PyTorch tensor (by replicating the single channel) and extracts CLS-token features from the first channel. Because the ViT architecture expects three-channel input but we feed it a single fluorescence channel, the channel is replicated three times yet only the first copy’s features are returned.
- Parameters:
cropped_image (numpy.ndarray) – A 2D single-channel image array of shape (H, W) containing the cropped object region.
model (torch.nn.Module) – The loaded CHAMMI-75 model (see get_chammi75_model()).
device (str | torch.device)
- Returns:
A (1, 384) array of CLS-token embeddings for the input image.
- Return type:
Feature Writing
Functions for formatting morphology feature names in a consistent way across all morphology features.
- image_analysis_3D.featurization_utils.feature_writing_utils.remove_underscores_from_string(string)[source]
Remove unwanted delimiters from a string and replace them with hyphens.
- image_analysis_3D.featurization_utils.feature_writing_utils.format_morphology_feature_name(compartment, channel, feature_type, measurement)[source]
Format a morphology feature name in a consistent way across all morphology features. The format follows the specification at: https://github.com/WayScience/NF1_3D_organoid_profiling_pipeline/blob/main/docs/RFC-2119-Feature-Naming-Convention.md
- image_analysis_3D.featurization_utils.feature_writing_utils.save_features_as_parquet(parent_path, df, compartment, channel, feature_type, cpu_or_gpu)[source]
Save features as parquet files in a consistent way across all morphology features.
- Parameters:
parent_path (pathlib.Path) – The parent path to save the features to.
df (pandas.DataFrame) – The dataframe containing the features to save.
compartment (str) – The compartment name.
channel (str) – The channel name.
feature_type (str) – The feature type.
cpu_or_gpu (str) – Whether the features were generated using CPU or GPU processing.
- Return type:
Resource Profiling
This module provides utility functions for profiling memory and time usage during featurization runs.
- image_analysis_3D.featurization_utils.resource_profiling_util.start_profiling()[source]
Start memory and time profiling.
- image_analysis_3D.featurization_utils.resource_profiling_util.stop_profiling(start_time, well_fov, patient_id, feature_type, channel, compartment, CPU_GPU, output_file_dir, start_mem=None)[source]
Stop profiling, report results, and save them to a parquet file.
This function stops tracemalloc, computes peak memory usage (via tracemalloc) and elapsed wall-clock time, prints a summary, and persists the statistics to output_file_dir as a Parquet file.
- Parameters:
start_time (float) – Unix timestamp returned by start_profiling().
well_fov (str) – Well and field of view for the run.
patient_id (str) – Patient ID for the run.
feature_type (str) – Feature type for the run (e.g., 'intensity', 'shape').
channel (str) – Channel name for the run.
compartment (str) – Cellular compartment for the run (e.g., 'nucleus', 'cytoplasm').
CPU_GPU (str) – Processing unit used ('CPU' or 'GPU').
output_file_dir (pathlib.Path) – File path to save the run-statistics Parquet file.
start_mem (float, optional) – Starting RSS in MB (from start_profiling()). Included in the output for backward-compatibility but is not used for the peak-memory calculation.
- Returns:
True if the function ran successfully.
- Return type:
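The start/stop pattern described above can be sketched with the standard library alone. The helper names and dictionary keys below are hypothetical, and the Parquet export and run metadata handled by the real functions are omitted.

```python
import time
import tracemalloc


def start_profiling_sketch():
    """Begin tracing allocations and return the start timestamp."""
    tracemalloc.start()
    return time.time()


def stop_profiling_sketch(start_time):
    """Stop tracing and report elapsed wall-clock time and peak traced memory."""
    # get_traced_memory() returns (current, peak) in bytes since tracing started.
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "time_taken_seconds": time.time() - start_time,
        "peak_memory_mb": peak_bytes / (1024 ** 2),
    }


t0 = start_profiling_sketch()
buffer = [0] * 1_000_000  # allocate something measurable (about 8 MB of pointers)
stats = stop_profiling_sketch(t0)
```

Note that tracemalloc only sees allocations made through Python's memory allocator, so memory held by native libraries may be undercounted.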
Loading Classes
Data-loading classes for featurization workflows.
- class image_analysis_3D.featurization_utils.loading_classes.ImageSetLoader(image_set_path, mask_set_path, anisotropy_spacing, channel_mapping, image_set_name=None, mask_key_name=None, raw_image_key_name=None)[source]
Load an image set consisting of raw z stack images and segmentation masks.
A class to load an image set consisting of raw z stack images from multiple spectral channels and segmentation masks. The images are loaded into a dictionary, and various attributes and compartments are extracted from the images. The class also provides methods to retrieve images and their attributes.
- Parameters:
image_set_path (pathlib.Path) – Path to the image set directory.
mask_set_path (pathlib.Path) – Path to the mask set directory.
anisotropy_spacing (tuple) – The anisotropy spacing of the images in format (z_spacing, y_spacing, x_spacing).
channel_mapping (dict) – A dictionary mapping channel names to their corresponding image file names. Example: {'nuclei': 'nuclei_', 'cell': 'cell_', 'cytoplasm': 'cytoplasm_'}
image_set_name (str | None)
- image_set_name
The name of the image set.
- Type:
- anisotropy_spacing
The anisotropy spacing of the images.
- Type:
- anisotropy_factor
The anisotropy factor calculated from the spacing.
- Type:
- image_set_dict
A dictionary containing the loaded images, with keys as channel names.
- Type:
- unique_mask_objects
A dictionary containing unique object IDs for each mask in the image set.
- Type:
- unique_compartment_objects
A dictionary containing unique object IDs for each compartment in the image set. A compartment is defined as a segmented region in the image (e.g., Cell, Cytoplasm, Nuclei, Organoid). The compartments are bounds for measurements.
- Type:
- image_names
A list of image names in the image set.
- Type:
- compartments
A list of compartment names in the image set.
- Type:
- retrieve_image_attributes()[source]
Retrieve unique object IDs for each mask in the image set.
- Return type:
None
- get_unique_objects_in_compartments()[source]
Retrieve unique object IDs for each compartment in the image set.
- Return type:
None
- get_image(key)[source]
Retrieve the image corresponding to the specified key.
- Parameters:
key (str)
- Return type:
- get_compartments()[source]
Retrieve the names of compartments in the image set.
- retrieve_image_attributes()[source]
- This is also a quick and dirty way of loading two types of images:
Masks (multi-indexed segmentation masks)
Spectral images from which morphology features are extracted
The naming convention puts the word “mask” in the segmentation image names, which distinguishes each compartment's mask from the spectral images.
Future work should load the images in a more structured way that does not depend on the file naming convention.
- Return type:
None
- get_unique_objects_in_compartments()[source]
Populate unique object IDs per compartment.
- Return type:
None
- get_image(key)[source]
Return an image array for a given key.
- Parameters:
key (str) – Channel or mask key.
- Returns:
Image array for the requested key.
- Return type:
- get_image_names()[source]
Populate image (non-compartment) names.
- get_compartments()[source]
Populate compartment names from available keys.
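The notes for retrieve_image_attributes say that segmentation masks are told apart from spectral images by the word “mask” in their names. The following is a minimal sketch of that idea, not the class's actual implementation: the helper `split_keys` and the example keys are hypothetical, and the case-insensitive substring test is an assumption.

```python
def split_keys(image_set_dict):
    """Split dictionary keys into compartment (mask) keys and spectral image keys.

    Assumption: a key refers to a segmentation mask if it contains "mask"
    (case-insensitive); every other key is treated as a spectral image.
    """
    compartments = [key for key in image_set_dict if "mask" in key.lower()]
    image_names = [key for key in image_set_dict if "mask" not in key.lower()]
    return compartments, image_names


# Hypothetical keys; the real keys depend on the files on disk
# and on the channel_mapping passed to the loader.
example = {"DNA": None, "Mito": None, "Nuclei_mask": None, "Cell_mask": None}
compartments, image_names = split_keys(example)
print(compartments)
print(image_names)
```

Because the split relies only on key names, a spectral channel whose name happened to contain “mask” would be misclassified, which is the fragility the docstring's note on future work points at.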
- class image_analysis_3D.featurization_utils.loading_classes.ObjectLoader(image, label_image, channel_name, compartment_name)[source]
A class to load objects from a labeled image and extract their properties. An object is defined as a segmented region in the image, such as a cell, a nucleus, or any other segmented compartment.
- Parameters:
image (numpy.ndarray) – The image from which to extract objects. Preferably a 3D image with axes ordered (z, y, x).
label_image (numpy.ndarray) – The labeled image containing the segmented objects.
channel_name (str) – The name of the channel from which the objects are extracted.
compartment_name (str) – The name of the compartment from which the objects are extracted.
- image
The image from which the objects are extracted.
- Type:
- label_image
The labeled image containing the segmented objects.
- Type:
- channel
The name of the channel from which the objects are extracted.
- Type:
- compartment
The name of the compartment from which the objects are extracted.
- Type:
- objects
The labeled image containing the segmented objects.
- Type:
- object_ids
The unique object IDs for the segmented objects.
- Type:
- __init__(image, label_image, channel_name, compartment_name)[source]
Initializes the ObjectLoader with the image, label image, channel name, and compartment name.
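As an illustration of what “unique object IDs” means for a labeled image, here is a minimal NumPy sketch. It does not reproduce ObjectLoader itself: the toy arrays are made up, and treating label 0 as background is an assumption rather than documented behavior.

```python
import numpy as np

# Toy 3D label image with axes (z, y, x); 0 is assumed to be background.
label_image = np.zeros((2, 4, 4), dtype=np.int32)
label_image[:, :2, :2] = 1  # object 1
label_image[:, 2:, 2:] = 2  # object 2

# Toy intensity image with the same shape as the label image.
image = np.arange(label_image.size, dtype=np.float64).reshape(label_image.shape)

# Unique object IDs, with the assumed background label removed.
object_ids = np.unique(label_image)
object_ids = object_ids[object_ids != 0]

# Example per-object measurement: mean intensity inside each object.
mean_intensity = {
    int(object_id): float(image[label_image == object_id].mean())
    for object_id in object_ids
}
print(object_ids.tolist())
print(mean_intensity)
```

The boolean mask `label_image == object_id` selects the voxels of one object, so any per-object measurement can be computed on `image[mask]` in the same way.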
- class image_analysis_3D.featurization_utils.loading_classes.TwoObjectLoader(image_set_loader, compartment, channel1, channel2)[source]
A class to load two images and a label image for a specific compartment. This class is primarily used for loading images for two-channel analysis like co-localization.
- Parameters:
image_set_loader (ImageSetLoader) – An instance of the ImageSetLoader class containing the image set.
compartment (str) – The name of the compartment for which the label image is loaded.
channel1 (str) – The name of the first channel to be loaded.
channel2 (str) – The name of the second channel to be loaded.
- image_set_loader
An instance of the ImageSetLoader class containing the image set.
- Type:
ImageSetLoader
- compartment
The name of the compartment for which the label image is loaded.
- Type:
- label_image
The labeled image containing the segmented objects for the specified compartment.
- Type:
- image1
The image corresponding to the first channel.
- Type:
- image2
The image corresponding to the second channel.
- Type:
- object_ids
The unique object IDs for the segmented objects in the specified compartment.
- Type:
- __init__(image_set_loader, compartment, channel1, channel2)[source]
Initializes the TwoObjectLoader with the image set loader, compartment, and channel names.
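To show the kind of two-channel analysis this loader is meant to support, the sketch below computes a per-object Pearson correlation between two channels. This is one common co-localization measure chosen for illustration. It is not stated to be the measure this package uses, and the function `pearson_per_object` and the toy data are hypothetical.

```python
import numpy as np


def pearson_per_object(image1, image2, label_image):
    """Return {object_id: Pearson r} between two channels within each object.

    Assumption: label 0 is background and is skipped.
    """
    results = {}
    for object_id in np.unique(label_image):
        if object_id == 0:
            continue
        mask = label_image == object_id
        r = np.corrcoef(image1[mask], image2[mask])[0, 1]
        results[int(object_id)] = float(r)
    return results


# Toy data: two objects in a (z, y, x) volume.
label_image = np.zeros((2, 4, 4), dtype=np.int32)
label_image[:, :2, :2] = 1
label_image[:, 2:, 2:] = 2

rng = np.random.default_rng(0)
image1 = rng.random(label_image.shape)
image2 = 2.0 * image1 + 1.0  # linearly related to image1, so r should be ~1

correlations = pearson_per_object(image1, image2, label_image)
print(correlations)
```

Both channels are indexed with the same compartment mask, which is why a loader that bundles two images with a single label image is convenient for this kind of measurement.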
Featurization Errors
Visualization Utilities
Animation
Animation helpers for visualization outputs.
- image_analysis_3D.visualization_utils.animation_utils.mp4_to_gif(input_mp4, output_gif, fps=10)[source]
Convert an MP4 file to a looping GIF.
- image_analysis_3D.visualization_utils.animation_utils.animate_view(viewer, output_path_name, steps=30, easing='linear', dim=3)[source]
Animate a napari viewer and save to disk.
- Parameters:
viewer (Any) – Napari viewer instance to animate.
output_path_name (str) – Output file path for the animation.
steps (int, optional) – Steps per keyframe, by default 30.
easing (str, optional) – Easing style name, by default “linear”.
dim (int, optional) – Number of displayed dimensions, by default 3.
- Returns:
The output animation path.
- Return type: