API Documentation
Tracking module
roicat.tracking.alignment module
- class roicat.tracking.alignment.Aligner(use_match_search: bool = True, all_to_all: bool = False, radius_in: float = 4, radius_out: float = 20, order: int = 5, z_threshold: float = 4.0, um_per_pixel: float = 1.0, device: str = 'cpu', verbose: bool = True)[source]
Bases: ROICaT_Module
A class for registering ROIs to a template FOV. Currently relies on available OpenCV methods for rigid and non-rigid registration. RH 2023
- Parameters:
use_match_search (bool) – Whether to densely search all possible paths to match images to the template upon failure. (Default is True)
all_to_all (bool) – Whether to start with an all-to-all matching approach using our match_search algorithm. Much slower (scales as N^2 rather than N) but useful if you know you are working with challenging data. (Default is False)
radius_in (float) – Value in micrometers used to define the maximum shift/offset between two images that are considered to be aligned. Use larger values for more lenient alignment requirements. (Default is 4)
radius_out (float) – Value in micrometers used to define the minimum shift/offset between two images that are considered to be misaligned. Use smaller values for more stringent alignment requirements. (Default is 20)
order (int) – The order of the Butterworth filter used to define the ‘in’ and ‘out’ regions of the ImageAlignmentChecker class. (Default is 5)
z_threshold (float) – Z-score required to define two images as aligned. Larger values result in more stringent alignment requirements and possibly slower registration. The value is the threshold applied to the ‘z_in’ output of the ImageAlignmentChecker class to determine whether two images are properly aligned. (Default is 4.0)
um_per_pixel (float) – The number of micrometers per pixel in the FOV images. (Default is 1.0)
device (str) – The torch device used for various steps in the alignment process. (Default is 'cpu')
verbose (bool) – Whether to print progress updates. (Default is True)
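The relationship between radius_in, radius_out, order, and um_per_pixel can be illustrated with a minimal sketch. It converts the micrometer radii to pixels and evaluates the textbook order-n Butterworth low-pass gain at two shift distances. The helper names are hypothetical, the 2 um/pixel value is made up for the example, and how ImageAlignmentChecker actually builds its ‘in’ and ‘out’ regions is an assumption that this page does not specify.

```python
import math

def um_to_pixels(radius_um: float, um_per_pixel: float) -> float:
    """Convert a radius in micrometers to pixels (hypothetical helper)."""
    return radius_um / um_per_pixel

def butterworth_lowpass_gain(r: float, r_cutoff: float, order: int) -> float:
    """Textbook Butterworth low-pass magnitude at distance r from the origin."""
    return 1.0 / math.sqrt(1.0 + (r / r_cutoff) ** (2 * order))

# Default radii from the Aligner signature, with an assumed 2 um/pixel FOV.
radius_in_px = um_to_pixels(4.0, um_per_pixel=2.0)
radius_out_px = um_to_pixels(20.0, um_per_pixel=2.0)

# Gain exactly at the cutoff radius, and at the much larger 'out' radius.
gain_at_cutoff = butterworth_lowpass_gain(radius_in_px, radius_in_px, order=5)
gain_far_away = butterworth_lowpass_gain(radius_out_px, radius_in_px, order=5)
```

A higher order makes the transition between the two regions steeper, which is why the gain at the ‘out’ radius is already close to zero for order 5.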
- augment_FOV_images(FOV_images: List[ndarray], spatialFootprints: List[csr_array] | None = None, normalize_FOV_intensities: bool = True, roi_FOV_mixing_factor: float = 0.5, use_CLAHE: bool = True, CLAHE_grid_block_size: int = 10, CLAHE_clipLimit: int = 1, CLAHE_normalize: bool = True) None[source]
Augments the FOV images by mixing the FOV with the ROI images and optionally applying CLAHE. RH 2023
- Parameters:
FOV_images (List[np.ndarray]) – A list of FOV images.
spatialFootprints (Optional[List[scipy.sparse.csr_array]]) – A list of spatial footprints for each ROI. If None, then no mixing will be performed. (Default is None)
normalize_FOV_intensities (bool) – Whether to normalize the FOV images. Setting this to True will scale each FOV image to the same intensity range. (Default is True)
roi_FOV_mixing_factor (float) – The factor by which to mix the ROI images into the FOV images. If 0, then no mixing will be performed. (Default is 0.5)
use_CLAHE (bool) – Whether to apply CLAHE to the images. (Default is True)
CLAHE_grid_block_size (int) – The size of the blocks in the grid for CLAHE. Used to divide the image into small blocks and create the grid_size parameter for the cv2.createCLAHE function. Smaller block sizes will result in more local CLAHE. (Default is 10)
CLAHE_clipLimit (int) – The clip limit for CLAHE. See cv2.createCLAHE for more details. (Default is 1)
CLAHE_normalize (bool) – Whether to normalize the CLAHE output. See alignment.clahe for more details. (Default is True)
- Returns:
The augmented FOV images.
- Return type:
List[np.ndarray]
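The mixing step described above can be sketched as a linear blend of a normalized FOV image with a normalized ROI image. The blend formula and both helper names are assumptions made for illustration; the exact arithmetic used by augment_FOV_images is not documented on this page.

```python
import numpy as np

def normalize_01(im: np.ndarray) -> np.ndarray:
    """Scale an image to the [0, 1] range (constant images map to zeros)."""
    im = im.astype(np.float64)
    span = im.max() - im.min()
    return (im - im.min()) / span if span > 0 else np.zeros_like(im)

def mix_fov_with_rois(fov, roi_image, mixing_factor=0.5):
    """Assumed linear blend: a factor of 0 returns the normalized FOV unchanged."""
    return (1.0 - mixing_factor) * normalize_01(fov) + mixing_factor * normalize_01(roi_image)

rng = np.random.default_rng(0)
fov = rng.random((8, 8)) * 1000.0      # stand-in for a mean-intensity FOV image
roi_image = np.zeros((8, 8))
roi_image[2:4, 2:4] = 1.0              # stand-in for the summed ROI footprints

mixed = mix_fov_with_rois(fov, roi_image, mixing_factor=0.5)
unmixed = mix_fov_with_rois(fov, roi_image, mixing_factor=0.0)
```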
- fit_geometric(template: int | ndarray, ims_moving: List[ndarray], template_method: str = 'sequential', mask_borders: Tuple[int, int, int, int] = (0, 0, 0, 0), method: str = 'RoMa', kwargs_method: dict = {'DISK_LightGlue': {'num_features': 2048, 'threshold_confidence': 0.2}, 'ECC_cv2': {'auto_fix_gaussFilt_step': 10, 'gaussFiltSize': 31, 'mode_transform': 'euclidean', 'n_iter': 200, 'termination_eps': 1e-09}, 'LoFTR': {'model_type': 'indoor_new', 'threshold_confidence': 0.2}, 'NullRegistration': {}, 'ORB': {'WTA_K': 2, 'edgeThreshold': 31, 'fastThreshold': 20, 'firstLevel': 0, 'nfeatures': 1000, 'nlevels': 8, 'patchSize': 31, 'scaleFactor': 1.2, 'scoreType': 0}, 'PhaseCorrelation': {'bandpass_freqs': [1, 30], 'order': 5}, 'RoMa': {'batch_size': 1000, 'model_type': 'outdoor', 'n_points': 10000}, 'SIFT': {'contrastThreshold': 0.04, 'edgeThreshold': 10, 'nfeatures': 10000, 'sigma': 1.6}}, constraint: str = 'affine', kwargs_RANSAC: dict = {'confidence': 0.99, 'inl_thresh': 2.0, 'max_iter': 10}, verbose: bool | None = None) ndarray[source]
Performs geometric registration of ims_moving to a template using the specified method. RH 2023
- Parameters:
template (Union[int, np.ndarray]) – The template image or index. If template_method is ‘image’, this should be an image (np.ndarray) or an index of the image to use as the template. If template_method is ‘sequential’, then template is the integer index or fractional index of the image to use as the template.
ims_moving (List[np.ndarray]) – List of images to be aligned.
template_method (str) –
Method to use for template selection.
‘image’: use the image specified by ‘template’.
‘sequential’: register each image to the previous or next image.
(Default is ‘sequential’)
mask_borders (Tuple[int, int, int, int]) – Border mask for the image. Format is (top, bottom, left, right). (Default is (0, 0, 0, 0))
method (str) –
The method to use for registration. One of {‘RoMa’, ‘LoFTR’, ‘ECC_cv2’, ‘DISK_LightGlue’, ‘SIFT’, ‘ORB’, ‘PhaseCorrelation’, ‘NullRegistration’}.
‘RoMa’: Feature-based registration using the RoMa algorithm.
‘LoFTR’: Feature-based registration using LoFTR.
‘ECC_cv2’: Direct intensity-based registration using OpenCV’s findTransformECC.
‘DISK_LightGlue’: Feature-based registration using DISK features and LightGlue matcher.
‘SIFT’: Feature-based registration using SIFT keypoints.
‘ORB’: Feature-based registration using ORB keypoints.
‘PhaseCorrelation’: Phase correlation registration.
‘NullRegistration’: No registration, just returns identity.
(Default is ‘RoMa’)
kwargs_method (dict) –
Keyword arguments for the selected method. The keys are method names, and the values are dictionaries of keyword arguments. For example:
‘RoMa’: {‘model_type’: ‘outdoor’, ‘n_points’: 10000, ‘batch_size’: 1000},
‘LoFTR’: {‘model_type’: ‘indoor_new’, ‘threshold_confidence’: 0.2}
constraint (str) –
The type of transformation to use for the registration. One of:
‘rigid’: Rigid transformation (translation only)
‘euclidean’: Euclidean transformation (translation + rotation)
‘affine’: Affine transformation (translation + rotation + scale + shear)
‘homography’: Homography transformation (translation + rotation + scale + shear + perspective)
(Default is ‘affine’)
kwargs_RANSAC (dict) –
Keyword arguments for the RANSAC algorithm used in homography estimation.
‘inl_thresh’ (float): RANSAC inlier threshold. (Default is 2.0)
‘max_iter’ (int): Maximum number of iterations for RANSAC. (Default is 10)
‘confidence’ (float): Confidence level for RANSAC. (Default is 0.99)
verbose (Optional[bool]) – Whether to print progress updates. If None, the verbose level set during initialization will be used.
- Returns:
An array of shape (N, H, W, 2) representing the remap field for N images.
- Return type:
np.ndarray
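To make the (N, H, W, 2) remap field concrete, the sketch below builds an identity field and a one-pixel horizontal shift, then applies them with nearest-neighbor sampling. It assumes the last axis stores (x, y) sampling coordinates in the style of cv2.remap; that ordering and both helper names are assumptions, not something this page confirms.

```python
import numpy as np

def identity_remap(n_images: int, height: int, width: int) -> np.ndarray:
    """Remap field of shape (N, H, W, 2) that maps every pixel to itself."""
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))  # each (H, W)
    field = np.stack([xs, ys], axis=-1).astype(np.float32)     # (H, W, 2) as (x, y)
    return np.repeat(field[None], n_images, axis=0)

def apply_remap_nearest(im: np.ndarray, field: np.ndarray) -> np.ndarray:
    """Sample im at the (x, y) locations in field using nearest-neighbor lookup."""
    x = np.clip(np.rint(field[..., 0]).astype(int), 0, im.shape[1] - 1)
    y = np.clip(np.rint(field[..., 1]).astype(int), 0, im.shape[0] - 1)
    return im[y, x]

remap = identity_remap(n_images=2, height=4, width=5)
im = np.arange(20, dtype=np.float32).reshape(4, 5)

# The identity field leaves the image unchanged.
same = apply_remap_nearest(im, remap[0])

# Adding 1 to x makes each output pixel read from one column to its right.
shifted_field = remap[1].copy()
shifted_field[..., 0] += 1.0
shifted = apply_remap_nearest(im, shifted_field)
```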
- fit_nonrigid(template: int | ndarray, ims_moving: List[ndarray], remappingIdx_init: ndarray | None = None, template_method: str = 'sequential', method: str = 'RoMa', kwargs_method: dict = {'DeepFlow': {}, 'NullRegistration': {}, 'OpticalFlowFarneback': {'iterations': 15, 'levels': 5, 'poly_n': 5, 'poly_sigma': 1.5, 'pyr_scale': 0.7, 'winsize': 128}, 'RoMa': {'model_type': 'outdoor'}}) ndarray[source]
Performs non-rigid registration of ims_moving to a template using the specified method. RH 2023
- Parameters:
template (Union[int, np.ndarray]) – The template image or index. If template_method is ‘image’, this should be an image (np.ndarray) or an index of the image to use as the template. If template_method is ‘sequential’, then template is the integer index or fractional index of the image to use as the template.
ims_moving (List[np.ndarray]) – A list of images to be aligned.
remappingIdx_init (Optional[np.ndarray]) – An array of shape (N, H, W, 2) representing any initial remap field to apply to the images in ims_moving. The output of this method will be composed with remappingIdx_init. (Default is None)
template_method (str) –
Method to use for template selection.
‘image’: use the image specified by ‘template’.
‘sequential’: register each image to the previous or next image.
(Default is ‘sequential’)
method (str) –
The method to use for registration. One of {‘RoMa’, ‘DeepFlow’, ‘OpticalFlowFarneback’, ‘NullRegistration’}.
‘DeepFlow’: Optical flow using OpenCV’s DeepFlow algorithm.
‘RoMa’: Non-rigid registration using the RoMa algorithm.
‘OpticalFlowFarneback’: Optical flow using OpenCV’s calcOpticalFlowFarneback.
‘NullRegistration’: No registration, just returns identity (remappingIdx) for each image.
(Default is ‘RoMa’)
kwargs_method (dict) – Keyword arguments for the selected method. The keys are method names, and the values are dictionaries of keyword arguments.
- Returns:
An array of shape (N, H, W, 2) representing the remap field for N images.
- Return type:
np.ndarray
- get_ROIsAligned_maxIntensityProjection(H: int | None = None, W: int | None = None, normalize: bool = True) List[ndarray][source]
Returns the max intensity projection of the ROIs aligned to the template FOV.
- Parameters:
H (Optional[int]) – The height of the output projection. If not provided and if not already set, an error will be thrown. (Default is None)
W (Optional[int]) – The width of the output projection. If not provided and if not already set, an error will be thrown. (Default is None)
normalize (bool) – If True, the ROIs are normalized by the maximum value. (Default is True)
- Returns:
- max_projection (List[np.ndarray]):
The max intensity projections of the ROIs.
- Return type:
(List[np.ndarray])
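A max intensity projection for a single session can be sketched with dense arrays as follows. The function name and the (n_roi, H*W) dense layout are assumptions for illustration; the library itself works with sparse footprints and returns one projection per session.

```python
import numpy as np

def max_intensity_projection(rois_flat, height, width, normalize=True):
    """Project (n_roi, H*W) footprints onto a single (H, W) image."""
    rois = np.asarray(rois_flat, dtype=np.float64)
    if normalize:
        peak = rois.max(axis=1, keepdims=True)
        rois = rois / np.where(peak > 0, peak, 1.0)   # leave empty ROIs at zero
    return rois.max(axis=0).reshape(height, width)

H, W = 3, 4
rois_flat = np.zeros((2, H * W))
rois_flat[0, [0, 1]] = [2.0, 4.0]     # ROI 0 covers flat pixels 0 and 1
rois_flat[1, [1, 5]] = [10.0, 5.0]    # ROI 1 overlaps ROI 0 at flat pixel 1

projection = max_intensity_projection(rois_flat, H, W)
```

With normalize=True each ROI peaks at 1 before projection, so bright and dim ROIs contribute equally to the image.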
- get_flowFields(remappingIdx: ndarray | None = None) List[ndarray][source]
Returns the flow fields based on the remapping indices.
- Parameters:
remappingIdx (Optional[np.ndarray]) – The indices for remapping the flow fields. If None, geometric or nonrigid registration must be performed first. (Default is None)
- Returns:
- flow_fields (List[np.ndarray]):
The transformed flow fields.
- Return type:
(List[np.ndarray])
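One common way to relate a remap field to a flow field is to subtract the identity pixel grid, leaving a per-pixel displacement. The sketch below shows that conversion under the assumption that the last axis holds (x, y) coordinates; the function name is hypothetical and the library's own conversion may differ in convention or sign.

```python
import numpy as np

def remap_to_flow(remapping_idx: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 2) remap field into per-pixel (dx, dy) displacements."""
    height, width = remapping_idx.shape[:2]
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    grid = np.stack([xs, ys], axis=-1).astype(remapping_idx.dtype)
    return remapping_idx - grid

height, width = 4, 6
xs, ys = np.meshgrid(np.arange(width), np.arange(height))
# A remap field that samples 2 pixels to the right and 1 pixel up everywhere.
remap = np.stack([xs + 2.0, ys - 1.0], axis=-1)

flow = remap_to_flow(remap)
```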
- plot_alignment_results_geometric(plot_direct: bool = True) Tuple[Figure, Figure][source]
Plots the alignment results for geometric registration.
- Parameters:
plot_direct (bool) – If True, plots the direct alignment results.
- Returns:
- tuple containing:
- fig_final (plt.Figure):
Figure showing the final alignment results.
- fig_direct (plt.Figure):
Figure showing the direct alignment results.
- Return type:
(Tuple[plt.Figure, plt.Figure])
- plot_alignment_results_nonrigid() Tuple[Figure, Figure][source]
Plots the alignment results for non-rigid registration.
- Returns:
- tuple containing:
- fig_final (plt.Figure):
Figure showing the final alignment results.
- fig_direct (plt.Figure):
Figure showing the direct alignment results.
- Return type:
(Tuple[plt.Figure, plt.Figure])
- transform_ROIs(ROIs: ndarray, remappingIdx: ndarray | None = None, normalize: bool = True) List[ndarray][source]
Transforms ROIs based on remapping indices and normalization settings. RH 2023
- Parameters:
ROIs (np.ndarray) – The regions of interest to transform. (shape: (H, W))
remappingIdx (Optional[np.ndarray]) – The indices for remapping the ROIs. If None, geometric or nonrigid registration must be performed first. (Default is None)
normalize (bool) – If True, data is normalized. (Default is True)
- Returns:
- ROIs_aligned (List[np.ndarray]):
Transformed ROIs.
- Return type:
(List[np.ndarray])
- transform_images(ims_moving: List[ndarray], remappingIdx: List[ndarray]) List[ndarray][source]
Transforms images using the specified remapping index.
- Parameters:
ims_moving (List[np.ndarray]) – The images to be transformed. List of arrays with shape: (H, W) or (H, W, C)
remappingIdx (List[np.ndarray]) – The remapping index to apply to the images. List of arrays with shape: (H, W, 2). List length must match the number of images.
- Returns:
- ims_registered (List[np.ndarray]):
The transformed images. (N, H, W)
- Return type:
(List[np.ndarray])
- transform_images_geometric(ims_moving: List[ndarray], remappingIdx: ndarray | None = None) ndarray[source]
Transforms images based on geometric registration warps.
- Parameters:
ims_moving (np.ndarray) – The images to be transformed. (N, H, W)
remappingIdx (Optional[np.ndarray]) – An array specifying how to remap the images. If None, the remapping index from geometric registration is used. (Default is None)
- Returns:
- ims_registered_geo (np.ndarray):
The images after applying the geometric registration warps. (N, H, W)
- Return type:
(np.ndarray)
- transform_images_nonrigid(ims_moving: List[ndarray], remappingIdx: ndarray | None = None) ndarray[source]
Transforms images based on non-rigid registration warps.
- Parameters:
ims_moving (np.ndarray) – The images to be transformed. (N, H, W)
remappingIdx (Optional[np.ndarray]) – An array specifying how to remap the images. If None, the remapping index from non-rigid registration is used. (Default is None)
- Returns:
- ims_registered_nonrigid (np.ndarray):
The images after applying the non-rigid registration warps. (N, H, W)
- Return type:
(np.ndarray)
- class roicat.tracking.alignment.DISK_LightGlue(num_features: int = 2048, threshold_confidence: float = 0.2, window_nms: int = 5, device: str = 'cpu', verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Image registration method using DISK features and LightGlue matcher. RH 2024
- Parameters:
num_features (int) – Number of features to extract. (Default is 2048)
threshold_confidence (float) – Confidence threshold for filtering matches. (Default is 0.2)
window_nms (int) – Window size for non-maximum suppression. Must be odd integer. Larger values will result in fewer keypoints. (Default is 5)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.DeepFlow(device: str = 'cpu', verbose=False)[source]
Bases: ImageRegistrationMethod
Image registration method using OpenCV’s DeepFlow algorithm. RH 2024
- Parameters:
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.ECC_cv2(mode_transform='euclidean', n_iter: int = 200, termination_eps: float = 1e-09, gaussFiltSize: float | int = 1, auto_fix_gaussFilt_step: int | None = 10, device: str = 'cpu', verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Image registration method using OpenCV’s ECC algorithm. RH 2024
- Parameters:
mode_transform (str) – Type of geometric transformation. One of {‘translation’, ‘euclidean’, ‘affine’, ‘homography’}. (Default is ‘euclidean’)
n_iter (int) – Number of iterations for optimization. (Default is 200)
termination_eps (float) – Convergence tolerance. (Default is 1e-09)
gaussFiltSize (Union[float, int]) – Size of Gaussian blurring filter applied to images. (Default is 1)
auto_fix_gaussFilt_step (Optional[int]) – Increment in gaussFiltSize after a failed optimization. If None, no automatic fixing is performed. (Default is 10)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.ImageRegistrationMethod(device: str = 'cpu', verbose: bool = False)[source]
Bases: object
Base class for image registration methods. RH 2024
This class defines the interface for image registration methods, both rigid and non-rigid. Subclasses should implement the methods _forward_rigid and _forward_nonrigid.
- Parameters:
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- fit_rigid(im_template: ndarray | Tensor, im_moving: ndarray | Tensor, inl_thresh: float = 2.0, max_iter: int = 10, confidence: float = 0.99, constraint: str = 'homography', **kwargs) ndarray[source]
Estimate a constrained warp between two images using matched points.
- Returns:
- warp_matrix (np.ndarray):
The resulting 3x3 transformation matrix.
- class roicat.tracking.alignment.LoFTR(model_type: str = 'indoor_new', threshold_confidence: float = 0.2, device: str = 'cpu', verbose: bool = False)[source]
Bases: ImageRegistrationMethod
LoFTR-based image registration method. RH 2024
- Parameters:
model_type (str) – Type of LoFTR model to use. Default is ‘indoor_new’.
threshold_confidence (float) – Confidence threshold for filtering matches. (Default is 0.2)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.NullRegistration(device: str | None = None, verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Null registration method that does nothing. RH 2024
- Parameters:
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.ORB(nfeatures: int = 500, scaleFactor: float = 1.2, nlevels: int = 8, edgeThreshold: int = 31, firstLevel: int = 0, WTA_K: int = 2, scoreType: int = 0, patchSize: int = 31, fastThreshold: int = 20, device: str = 'cpu', verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Image registration method using ORB keypoints. RH 2024
- Parameters:
nfeatures (int) – Maximum number of features to retain. (Default is 500)
scaleFactor (float) – Pyramid decimation ratio. (Default is 1.2)
nlevels (int) – Number of pyramid levels. (Default is 8)
edgeThreshold (int) – Size of the border where the features are not detected. (Default is 31)
firstLevel (int) – The level of pyramid to put source image to. (Default is 0)
WTA_K (int) – Number of points that produce each element of the oriented BRIEF descriptor. (Default is 2)
scoreType (int) – Type of score to rank features. (Default is cv2.ORB_HARRIS_SCORE)
patchSize (int) – Size of the patch used by the oriented BRIEF descriptor. (Default is 31)
fastThreshold (int) – FAST threshold. (Default is 20)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.OpticalFlowFarneback(pyr_scale: float = 0.3, levels: int = 3, winsize: int = 128, iterations: int = 7, poly_n: int = 5, poly_sigma: float = 1.5, flags: int = 256, device: str = 'cpu', verbose=False)[source]
Bases: ImageRegistrationMethod
Image registration method using OpenCV’s calcOpticalFlowFarneback. RH 2024
- Parameters:
pyr_scale (float) – Parameter specifying the image scale (<1) to build pyramids for each image. (Default is 0.3)
levels (int) – Number of pyramid layers including the initial image. (Default is 3)
winsize (int) – Averaging window size. Larger values increase the algorithm robustness to noise and provide smoother motion field. (Default is 128)
iterations (int) – Number of iterations the algorithm does at each pyramid level. (Default is 7)
poly_n (int) – Size of the pixel neighborhood used to find polynomial expansion in each pixel. (Default is 5)
poly_sigma (float) – Standard deviation of the Gaussian used to smooth derivatives used as a basis for the polynomial expansion. (Default is 1.5)
flags (int) – Operation flags. (Default is cv2.OPTFLOW_FARNEBACK_GAUSSIAN)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.PhaseCorrelationRegistration(device: str = 'cpu', bandpass_freqs: List[float] | None = None, order: int = 5, verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Image registration method using helpers.phase_correlation. RH 2024
- Parameters:
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.RoMa(model_type: str = 'outdoor', n_points: int = 10000, batch_size: int = 1000, device: str = 'cpu', weight_urls: Dict[str, Dict[str, Dict[str, str]]] = {'dinov2': {'filename': 'dinov2_vitl14_pretrain.pth', 'hash': '19a02c10947ed50096ce382b46b15662', 'url': 'https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_pretrain.pth'}, 'romatch': {'indoor': {'filename': 'roma_indoor.pth', 'hash': '349a17aaa21883bb164b1a5884febb21', 'url': 'https://github.com/Parskatt/storage/releases/download/roma/roma_indoor.pth'}, 'outdoor': {'filename': 'roma_outdoor.pth', 'hash': '9a451dfb65745e777bf916db6ea84933', 'url': 'https://github.com/Parskatt/storage/releases/download/roma/roma_outdoor.pth'}}, 'tiny_roma_v1': {'outdoor': {'filename': 'tiny_roma_v1_outdoor.pth', 'hash': 'b8120606c6b027a07b856d64f20bd13e', 'url': 'https://github.com/Parskatt/storage/releases/download/roma/tiny_roma_v1_outdoor.pth'}}}, fallback_weight_urls={'dinov2': {'filename': 'dinov2_vitl14_pretrain.pth', 'hash': '19a02c10947ed50096ce382b46b15662', 'url': 'https://osf.io/tmj5c/download'}, 'romatch': {'indoor': {'filename': 'roma_indoor.pth', 'hash': '349a17aaa21883bb164b1a5884febb21', 'url': 'https://osf.io/uzx64/download'}, 'outdoor': {'filename': 'roma_outdoor.pth', 'hash': '9a451dfb65745e777bf916db6ea84933', 'url': 'https://osf.io/cmzpa/download'}}, 'tiny_roma_v1': {'outdoor': {'filename': 'tiny_roma_v1_outdoor.pth', 'hash': 'b8120606c6b027a07b856d64f20bd13e', 'url': 'https://osf.io/6anre/download'}}}, verbose=False)[source]
Bases: ImageRegistrationMethod
RoMa-based image registration method. RH 2024
- Parameters:
model_type (str) – Type of RoMa model to use. Either ‘outdoor’ or ‘indoor’.
n_points (int) – Number of points to sample for matching. (Default is 10000)
batch_size (int) – Batch size for processing matches. (Default is 1000)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- class roicat.tracking.alignment.SIFT(nfeatures: int = 500, contrastThreshold: float = 0.04, edgeThreshold: float = 10, sigma: float = 1.6, device: str = 'cpu', verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Image registration method using SIFT keypoints. RH 2024
- Parameters:
nfeatures (int) – Number of best features to retain. (Default is 500)
contrastThreshold (float) – Contrast threshold used to filter out weak features. (Default is 0.04)
edgeThreshold (float) – Threshold used to filter out edge-like features. (Default is 10)
sigma (float) – Sigma of the Gaussian applied to the input image at the octave #0. (Default is 1.6)
device (str) – Device to use for computations.
verbose (bool) – Whether to print progress updates.
- roicat.tracking.alignment.adaptive_brute_force_matcher(features_template: Tensor, features_moving: Tensor, thresh_prob: float = 0.05, metric: str = 'normalized_euclidean', moat_prob_ratio: float = 10, batch_size: int = 100)[source]
Perform adaptive brute force matching between two sets of features. Similar to brute force matching, but converts the distance threshold to a probability using statistics of the distances.
- roicat.tracking.alignment.clahe(im: ndarray, grid_size: int | Tuple[int, int] = 50, clipLimit: int = 0, normalize: bool = True) ndarray[source]
Perform Contrast Limited Adaptive Histogram Equalization (CLAHE) on an image.
- Parameters:
im (np.ndarray) – Input image.
grid_size (int) – Size of the grid. See cv2.createCLAHE for more info. (Default is 50)
clipLimit (int) – Clip limit. See cv2.createCLAHE for more info. (Default is 0)
normalize (bool) – Whether to normalize the image to the maximum value (0 - 1) during the CLAHE process, then return the image to the original dtype and range. (Default is True)
- Returns:
- im_out (np.ndarray):
Output image after applying CLAHE.
- Return type:
(np.ndarray)
roicat.tracking.blurring module
- class roicat.tracking.blurring.ROI_Blurrer(frame_shape: Tuple[int, int] = (512, 512), kernel_halfWidth: int = 2, plot_kernel: bool = False, verbose: bool = True)[source]
Bases: ROICaT_Module
Blurs the Region of Interest (ROI) spatial footprints using 2D convolution to account for registration uncertainty across imaging sessions. Uses the sparse_convolution library for fast sparse convolution via the 'direct' method (batch-parallel numba scatter). RH 2022
- Parameters:
frame_shape (Tuple[int, int]) – The shape of the frame/Field Of View (FOV). Product of frame_shape[0] and frame_shape[1] must equal the length of a single flattened/sparse spatialFootprint. (Default is (512, 512))
kernel_halfWidth (int) – The half-width of the cosine kernel to use for convolutional blurring. (Default is 2)
plot_kernel (bool) – Whether to plot an image of the kernel. (Default is False)
verbose (bool) – Whether to print the convolutional blurring operation progress. (Default is True)
- frame_shape
The shape of the frame/Field Of View (FOV). Product of frame_shape[0] and frame_shape[1] must equal the length of a single flattened/sparse spatialFootprint.
- Type:
Tuple[int, int]
- kernel_halfWidth
The half-width of the cosine kernel to use for convolutional blurring.
- Type:
int
- plot_kernel
Whether to plot an image of the kernel.
- Type:
bool
- verbose
Whether to print the convolutional blurring operation progress.
- Type:
bool
- blur_ROIs(spatialFootprints: List[object]) List[object][source]
Blurs the Region of Interest (ROI).
- Parameters:
spatialFootprints (List[object]) – A list of sparse matrices corresponding to spatial footprints from each session.
- Returns:
- ROIs_blurred (List[object]):
A list of blurred ROI spatial footprints.
- Return type:
(List[object])
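The blurring idea can be sketched with a dense footprint and a radial raised-cosine kernel. The kernel definition, the helper names, and the dense NumPy loop are all assumptions made for illustration; ROI_Blurrer itself operates on flattened sparse footprints through the sparse_convolution library, and its exact kernel shape is not given on this page.

```python
import numpy as np

def cosine_kernel(half_width: int) -> np.ndarray:
    """Radial raised-cosine kernel on a (2*hw+1, 2*hw+1) grid, summing to 1."""
    coords = np.arange(-half_width, half_width + 1)
    r = np.hypot(*np.meshgrid(coords, coords))
    kernel = np.where(r <= half_width, 0.5 * (1.0 + np.cos(np.pi * r / half_width)), 0.0)
    return kernel / kernel.sum()

def blur_footprint(footprint: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Zero-padded sliding-window sum; equals convolution for a symmetric kernel."""
    hw = kernel.shape[0] // 2
    padded = np.pad(footprint, hw)
    out = np.zeros_like(footprint, dtype=np.float64)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + footprint.shape[0], dx:dx + footprint.shape[1]]
    return out

frame_shape = (9, 9)
footprint = np.zeros(frame_shape)
footprint[4, 4] = 1.0                 # a single-pixel ROI in the middle of the FOV

blurred = blur_footprint(footprint, cosine_kernel(half_width=2))
```

Spreading each footprint over neighboring pixels lets two ROIs that are offset by a small registration error still overlap when their similarity is computed.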
roicat.tracking.clustering module
- class roicat.tracking.clustering.Clusterer(similarities: Dict[str, csr_array], metric_configs: List[SimilarityMetric], s_sesh: csr_array, n_bins: int | None = None, smoothing_window_bins: int | None = None, session_bool: ndarray | None = None, verbose: bool = True)[source]
Bases: ROICaT_Module
Class for clustering algorithms. Performs:
- Optimal mixing and pruning of similarity matrices:
self.find_optimal_parameters_for_pruning()
self.make_pruned_similarity_graphs()
- Clustering:
self.fit(): Which uses a modified HDBSCAN
self.fit_sequentialHungarian: Which uses a method similar to CaImAn’s clustering method.
- Quality control:
self.compute_cluster_quality_metrics()
Initialization ingests and stores similarity matrices. RH 2023 / 2025
- Parameters:
similarities (Dict[str, scipy.sparse.csr_array]) – Dict mapping metric name to sparse similarity matrix. All matrices must share the same nonzero pattern (same nnz). Example: {'sf': csr_array, 'nn': csr_array, 'swt': csr_array}.
metric_configs (List[SimilarityMetric]) – List of SimilarityMetric dataclass instances describing each metric’s optimization behavior (sparsity source, sigmoid, power).
s_sesh (scipy.sparse.csr_array) – Inter-session mask. Shape: (n_rois, n_rois). Boolean, with 1s where the two ROIs are from different sessions.
n_bins (Optional[int]) – Number of bins to use for the pairwise similarity distribution. If None, then a heuristic is used to estimate the value based on the number of nonzero pairs. (Default is None)
smoothing_window_bins (Optional[int]) – Number of bins to use when smoothing the distribution. If None, then a heuristic is used. (Default is None)
session_bool (Optional[np.ndarray]) – Boolean array indicating which ROIs belong to which session. Shape: (n_rois, n_sessions). (Default is None)
verbose (bool) – Specifies whether to print out information about the clustering process. (Default is True)
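The relationship between session_bool and the inter-session mask s_sesh can be sketched with dense arrays. The toy session layout is invented for the example, and dense boolean matrices stand in for the sparse csr_array objects that the class expects.

```python
import numpy as np

# Three sessions with 2, 1, and 2 ROIs respectively (5 ROIs in total).
session_index = np.array([0, 0, 1, 2, 2])
n_rois, n_sessions = session_index.size, 3

# session_bool[i, s] is True when ROI i was recorded in session s.
session_bool = np.zeros((n_rois, n_sessions), dtype=bool)
session_bool[np.arange(n_rois), session_index] = True

# Two ROIs share a session exactly when their rows are True in the same column.
sb = session_bool.astype(int)
same_session = (sb @ sb.T) > 0

s_sesh = ~same_session        # inter-session mask: True for different sessions
s_sesh_inv = same_session     # intra-session mask: True for the same session
```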
- similarities
Dict of similarity matrices keyed by metric name.
- Type:
Dict[str, scipy.sparse.csr_array]
- s_sesh
The inter-session similarity matrix. Shape: (n_rois, n_rois).
- Type:
scipy.sparse.csr_array
- s_sesh_inv
Intra-session mask (True where ROIs are from the SAME session). Shape: (n_rois, n_rois).
- Type:
scipy.sparse.csr_array
- n_bins
Number of bins to use for the pairwise similarity distribution.
- Type:
Optional[int]
- smooth_window
Number of bins to use when smoothing the distribution.
- Type:
Optional[int]
- verbose
Specifies how much information to print out:
0/False: Warnings only
1/True: Basic info, progress bar
2: All info
- Type:
bool
- apply_weighted_jaccard(s_conj: csr_array | None = None, d_conj: csr_array | None = None, alpha: float = 1.0) Tuple[csr_array, csr_array][source]
Apply weighted Jaccard preprocessing to a similarity graph. Returns new similarity and distance matrices without modifying self.
Can be called in two ways:
From stored state (no args): uses self.sConj_pruned from make_pruned_similarity_graphs().
Purely functional: pass s_conj or d_conj directly. Exactly one must be provided; the other is derived as 1 - x.
The weighted Jaccard replaces each pairwise similarity with a measure of shared neighborhood structure:
\[J_w(i,j) = \frac{\sum_k \min(s_{ik}, s_{jk})} {\sum_k \max(s_{ik}, s_{jk})}\]
This amplifies within-cluster similarity and suppresses cross-cluster noise. See weighted_jaccard_similarity() for details.
- Parameters:
s_conj (Optional[scipy.sparse.csr_array]) – Similarity matrix to transform. If None, uses self.sConj_pruned. Mutually exclusive with d_conj.
d_conj (Optional[scipy.sparse.csr_array]) – Distance matrix to transform (converted to similarity via 1 - d). If None, uses self.sConj_pruned. Mutually exclusive with s_conj.
alpha (float) – Blending weight between Jaccard and original similarity. alpha=1.0 uses pure Jaccard (default). alpha=0.0 returns copies of the originals. Values in between give a linear blend: s_final = alpha * s_jaccard + (1-alpha) * s_original.
- Returns:
- sConj_jaccard (scipy.sparse.csr_array):
Jaccard-refined similarity matrix. Same sparsity as input.
- dConj_jaccard (scipy.sparse.csr_array):
Corresponding distance matrix (1 - sConj_jaccard).
- Return type:
(Tuple[scipy.sparse.csr_array, scipy.sparse.csr_array])
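The weighted Jaccard formula and the alpha blend can be checked on a small dense matrix. This is a sketch under simplifying assumptions: it evaluates every pair, including the diagonal, whereas the method keeps the sparsity pattern of its input, and both function names are hypothetical.

```python
import numpy as np

def weighted_jaccard_dense(s: np.ndarray) -> np.ndarray:
    """J_w(i, j) = sum_k min(s_ik, s_jk) / sum_k max(s_ik, s_jk), all pairs."""
    mins = np.minimum(s[:, None, :], s[None, :, :]).sum(axis=-1)
    maxs = np.maximum(s[:, None, :], s[None, :, :]).sum(axis=-1)
    return np.divide(mins, maxs, out=np.zeros_like(mins), where=maxs > 0)

def blend(s_original: np.ndarray, s_jaccard: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Linear blend: alpha=1 is pure Jaccard, alpha=0 is the original similarity."""
    return alpha * s_jaccard + (1.0 - alpha) * s_original

# ROIs 0 and 1 are similar to each other; ROI 2 is dissimilar to both.
s = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])

s_jaccard = weighted_jaccard_dense(s)
s_final = blend(s, s_jaccard, alpha=0.5)
d_final = 1.0 - s_final
```

For the pair (0, 1) the numerator is 0.8 + 0.8 + 0.1 and the denominator is 1 + 1 + 0.2, so the pair keeps a high score because both rows have similar neighborhoods.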
- compute_quality_metrics(sim_mat: object | None = None, dist_mat: object | None = None, labels: ndarray | None = None) Dict[source]
Computes quality metrics of the dataset. RH 2023
- Parameters:
sim_mat (Optional[object]) – Similarity matrix of shape (n_samples, n_samples). If None then self.sConj must exist. (Default is None)
dist_mat (Optional[object]) – Distance matrix of shape (n_samples, n_samples). If None then self.dConj must exist. (Default is None)
labels (Optional[np.ndarray]) – Cluster labels of shape (n_samples,). If None, then self.labels must exist. (Default is None)
- Returns:
- quality_metrics (Dict):
Quality metrics dictionary that includes: ‘cluster_intra_means’, ‘cluster_intra_mins’, ‘cluster_intra_maxs’, ‘cluster_silhouette’, ‘sample_silhouette’, and other metrics if available.
- Return type:
(Dict)
- find_optimal_parameters_for_pruning(bounds_findParameters: Dict[str, List[float]] | None = None, de_kwargs: Dict[str, Any] = {'maxiter': 100, 'mutation': (0.5, 1.5), 'polish': True, 'popsize': 15, 'recombination': 0.7, 'tol': 1e-06}, n_bins: int | None = None, smoothing_window_bins: int | None = None, subsample_pairs: int | None = None, seed: int | None = None) Dict[source]
Find optimal mixing parameters for pruning the similarity graph.
Two-stage approach:
Naive Bayes calibration: For each similarity feature, estimates P(same | s_k) from histogram subtraction. The resulting per-feature calibration curves are used to analytically estimate optimal sigmoid parameters (mu, b) via Fisher’s linear discriminant.
Differential evolution: With sigmoid parameters frozen from stage 1, optimizes the remaining parameters (one power_<name> per metric with optimize_power=True, plus p_norm) by minimizing the histogram overlap loss.
This method replaces the original Optuna TPE search (see _find_optimal_parameters_for_pruning_optuna() in the legacy section). The two-stage approach achieves better separation quality (lower histogram overlap) on typical datasets. RH 2023 / 2025
- Parameters:
bounds_findParameters (Dict[str, List[float]]) – Bounds for the optimized parameters. Keys are power_<name> for each metric with optimize_power=True, plus p_norm. Auto-constructed from metric configs if None.
de_kwargs (Dict[str, Any]) –
Keyword arguments for scipy.optimize.differential_evolution:
maxiter (int): Maximum number of DE generations.
tol (float): Convergence tolerance on the loss.
popsize (int): Population size multiplier (actual population = popsize * n_params).
mutation (Tuple[float, float]): Differential weight range (min, max) for dithering.
recombination (float): Crossover probability in [0, 1].
polish (bool): If True, run L-BFGS-B from the best DE solution. Often has no effect on piecewise-constant histogram loss.
n_bins (Optional[int]) – Overwrites n_bins from __init__.
smoothing_window_bins (Optional[int]) – Overwrites smoothing_window_bins from __init__.
subsample_pairs (Optional[int]) – If not None, subsample this many pairs for histogram loss evaluation. Maintains intra/inter ratio. If None, auto-computed based on pair counts.
seed (Optional[int]) – Random seed for reproducibility.
- Returns:
- kwargs_makeConjunctiveDistanceMatrix_best (Dict):
Optimal parameters for
make_conjunctive_distance_matrix().
- Return type:
(Dict)
- fit(d_conj: csr_array, session_bool: ndarray, min_cluster_size: int = 2, max_cluster_size: int | None = None, min_samples: int | None = None, n_iter_violationCorrection: int = 5, cluster_selection_method: str = 'leaf', cluster_selection_persistence: float = 0.0, d_clusterMerge: float | None = None, alpha: float = 0.999, split_intraSession_clusters: bool = True, discard_failed_pruning: bool = True, n_steps_clusterSplit: int = 100, backend: str = 'fast_hdbscan', algorithm: str = 'kruskal', rescue_noise: bool = True) ndarray[source]
Fits clustering using HDBSCAN with same-session constraint enforcement.
By default (backend='fast_hdbscan'), uses fast_hdbscan.HDBSCAN with group-label cannot-link constraints. Each ROI’s session index is passed as a group label; same-session ROIs cannot co-cluster. This uses O(N) memory (an int32 vector) instead of the O(N^2) sparse cannot-link matrix, and O(1) per-merge conflict checks via bitmask. Transitive constraints are handled correctly: if components A and B both contain session-0 ROIs, merging them is blocked.
Set backend='legacy' to use the original hdbscan.HDBSCAN with iterative violation correction via dendrogram walk-back.
RH 2023 / 2025
- Parameters:
d_conj (scipy.sparse.csr_array) – Conjunctive distance matrix.
session_bool (np.ndarray) – Boolean array indicating which ROIs belong to which session. Shape: (n_rois, n_sessions)
min_cluster_size (int) – Minimum cluster size to be considered a cluster. Can be ‘all’. (Default is 2)
max_cluster_size (Optional[int]) – Maximum cluster size. Clusters larger than this are split. If None, defaults to n_sessions (one ROI per session), which is the natural constraint for neuron tracking. Set to a larger value to allow clusters spanning a subset of sessions. (Default is None)
min_samples (Optional[int]) – Number of neighbors a point needs to be considered “core” (non-noise) by HDBSCAN. Controls the density threshold independently of min_cluster_size. Lower values → fewer noise points. If None, defaults to min_cluster_size (HDBSCAN default behavior). (Default is None)
n_iter_violationCorrection (int) – Number of iterations to correct for clusters with multiple ROIs per session. Only used with backend='legacy'. (Default is 5)
cluster_selection_method (str) – Cluster selection method. Either 'leaf' or 'eom'. ‘leaf’ leans towards smaller clusters, ‘eom’ towards larger clusters. (Default is 'leaf')
cluster_selection_persistence (float) – Minimum stability (persistence) a cluster must have to survive selection. Clusters below this threshold are folded into their parent. Higher values → fewer but more stable clusters. Only used with backend='fast_hdbscan'. (Default is 0.0)
d_clusterMerge (Optional[float]) – Distance threshold for merging clusters (cluster_selection_epsilon in HDBSCAN). Clusters separated by less than this distance are merged. If None, defaults to self.d_cutoff (the pruning threshold from make_pruned_similarity_graphs), which is the inferred same/different decision boundary. Falls back to mean + 1*std of the distance data if d_cutoff is not available. (Default is None)
alpha (float) – Alpha value. Only used with backend='legacy'. (Default is 0.999)
split_intraSession_clusters (bool) – If True, clusters containing multiple ROIs from the same session will be split. Only used with backend='legacy'. (Default is True)
discard_failed_pruning (bool) – If True, clusters failing to prune are set to -1. Only used with backend='legacy'. (Default is True)
n_steps_clusterSplit (int) – Number of steps for splitting clusters with multiple ROIs from the same session. Only used with backend='legacy'. (Default is 100)
backend (str) –
Which HDBSCAN implementation to use:
'fast_hdbscan': Use fast_hdbscan with native cannot-link constraints (default).
'legacy': Use legacy hdbscan with iterative violation correction.
(Default is 'fast_hdbscan')
algorithm (str) –
Algorithm for fast_hdbscan MST construction. Only used with backend='fast_hdbscan':
'kruskal': Kruskal DSU on full CSR edge list. Supports cannot-link with any metric.
'boruvka': Boruvka parallel MST. Supports cannot-link only with metric='precomputed'.
(Default is 'kruskal')
rescue_noise (bool) – If True, run a post-HDBSCAN noise rescue pass that assigns noise ROIs to nearby clusters (or nucleates new small clusters) using a Kruskal-style sorted-edge traversal with DSU bitmask cannot-link constraints. Only used with backend='fast_hdbscan'. (Default is True)
- Returns:
- labels (np.ndarray):
Cluster labels for each ROI, shape: (n_rois_total)
- Return type:
(np.ndarray)
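The group-label constraint described for the default backend can be illustrated without any clustering library. The following is a minimal pure-Python sketch, not the code used by ROICaT or fast_hdbscan; the helper names session_index_per_roi, component_mask and merge_allowed are hypothetical. It derives one session index per ROI from a session_bool-style table and tests whether two components may merge by checking that their session bitmasks do not overlap.

```python
from typing import List


def session_index_per_roi(session_bool: List[List[bool]]) -> List[int]:
    """Return the session index of each ROI; every row must hold exactly one True."""
    group_labels = []
    for row in session_bool:
        if sum(row) != 1:
            raise ValueError("each ROI must belong to exactly one session")
        group_labels.append(row.index(True))
    return group_labels


def component_mask(members: List[int], group_labels: List[int]) -> int:
    """Bitmask with bit g set when the component holds an ROI from session g."""
    mask = 0
    for i in members:
        mask |= 1 << group_labels[i]
    return mask


def merge_allowed(mask_a: int, mask_b: int) -> bool:
    """A merge is allowed only if the two components share no session."""
    return (mask_a & mask_b) == 0


session_bool = [
    [True, False, False],   # ROI 0 -> session 0
    [False, True, False],   # ROI 1 -> session 1
    [True, False, False],   # ROI 2 -> session 0
    [False, False, True],   # ROI 3 -> session 2
]
group_labels = session_index_per_roi(session_bool)
mask_a = component_mask([0, 1], group_labels)  # sessions {0, 1}
mask_b = component_mask([2], group_labels)     # session {0}
mask_c = component_mask([3], group_labels)     # session {2}
```

Here merging the first two components would put two session-0 ROIs in one cluster, so the check rejects it, while the third component can join the first.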
- fit_sequentialHungarian(d_conj: csr_array, session_bool: ndarray, thresh_cost: float = 0.95) ndarray[source]
Applies CaImAn’s method for clustering.
- For further details, please refer to:
[CaImAn’s paper](https://elifesciences.org/articles/38173#s4)
[CaImAn’s repository](https://github.com/flatironinstitute/CaImAn)
[Relevant script in CaImAn’s repository](https://github.com/flatironinstitute/CaImAn/blob/master/caiman/base/rois.py)
- Parameters:
d_conj (scipy.sparse.csr_array) – Distance matrix. Shape: (n_rois, n_rois)
session_bool (np.ndarray) – Boolean array indicating which ROIs are in which sessions. Shape: (n_rois, n_sessions)
thresh_cost (float) – Threshold below which ROI pairs are considered potential matches. (Default is 0.95)
- Returns:
- labels (np.ndarray):
Cluster labels. Shape: (n_rois,)
- Return type:
(np.ndarray)
- make_conjunctive_distance_matrix(similarities: Dict[str, csr_array], mixing_params: Dict[str, Any]) Tuple[csr_array, csr_array, Dict[str, Tensor]][source]
Makes a conjunctive distance matrix from the similarity matrices using the given mixing parameters. RH 2023 / 2025
- Parameters:
similarities (Dict[str, scipy.sparse.csr_array]) – Dict mapping metric name to sparse similarity matrix.
mixing_params (Dict[str, Any]) –
Mixing parameters dict. Expected keys:
power_<name> (float): Power for each metric. Defaults to 1.0 if not present.
sig_<name>_kwargs (Dict[str, float]): Sigmoid parameters {'mu': float, 'b': float} per metric. None or absent means no sigmoid.
p_norm (float): p-norm exponent for combining activated similarities.
- Returns:
- Tuple containing:
- dConj (scipy.sparse.csr_array):
Conjunctive distance matrix (1 - sConj).
- sConj (scipy.sparse.csr_array):
Conjunctive similarity matrix.
- activated_data (Dict[str, torch.Tensor]):
Per-metric activated similarity data arrays.
- Return type:
(Tuple)
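The key layout expected in mixing_params can be shown with a plain dictionary. This is only a sketch: the metric names 'sf', 'nn' and 'swt' are the example names used elsewhere in this reference, every numeric value is a placeholder rather than a recommended setting, and the helpers get_power and get_sigmoid_kwargs are hypothetical and only mirror the documented defaults.

```python
from typing import Any, Dict, Optional

# Placeholder values only; real values would normally come from the optimizer.
mixing_params: Dict[str, Any] = {
    "power_sf": 1.0,
    "power_nn": 0.5,
    # "power_swt" is omitted on purpose: the docs say it then defaults to 1.0.
    "sig_sf_kwargs": {"mu": 0.5, "b": 1.0},
    "sig_nn_kwargs": {"mu": 0.5, "b": 1.0},
    "sig_swt_kwargs": None,  # None (or an absent key) means no sigmoid
    "p_norm": 1.0,
}


def get_power(params: Dict[str, Any], name: str) -> float:
    """Look up power_<name>, falling back to the documented default of 1.0."""
    return float(params.get(f"power_{name}", 1.0))


def get_sigmoid_kwargs(params: Dict[str, Any], name: str) -> Optional[Dict[str, float]]:
    """Look up sig_<name>_kwargs; None means the metric gets no sigmoid."""
    return params.get(f"sig_{name}_kwargs")
```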
- make_naive_bayes_distance_matrix(n_bins: int | None = None, smoothing_window_bins: int | None = None, prob_clip: Tuple[float, float] = (0.0001, 0.9999)) Tuple[csr_array, csr_array, Dict[str, Any]][source]
Compute pairwise distance matrix using independent per-feature calibration combined via naive Bayes.
For each similarity feature k (SF, NN, SWT), estimates the posterior P(same | s_k) from a 1D histogram of similarity values, using the intra-session (known-different) distribution as reference. The per-feature posteriors are combined under conditional independence:
\[\text{logit}(P(\text{same} | \mathbf{s})) = \sum_k \text{logit}(P(\text{same} | s_k)) - (K-1) \cdot \text{logit}(\pi)\]
where \(\pi\) is the estimated prior P(same) and K is the number of features.
No iterative optimization: just histogram + lookup. Typically completes in under 1 second even on large datasets. RH 2025
- Parameters:
n_bins (Optional[int]) – Number of histogram bins per feature. If None, uses self.n_bins.
smoothing_window_bins (Optional[int]) – Smoothing window. If None, uses self.smooth_window.
prob_clip (Tuple[float, float]) – Clamp P(same|s_k) to [lo, hi] before logit.
- Returns:
- dConj (scipy.sparse.csr_array):
Distance matrix d = 1 - P(same|all).
- sConj (scipy.sparse.csr_array):
Similarity matrix s = P(same|all).
- calibrations (Dict[str, Any]):
Diagnostic dict with per-feature calibrations, prior, and combined P(same).
- Return type:
(Tuple[scipy.sparse.csr_array, scipy.sparse.csr_array, Dict])
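The logit combination rule above can be checked numerically for a single ROI pair. The following is a minimal sketch using only the standard library; the function name combine_naive_bayes is hypothetical, and it assumes the per-feature posteriors and the prior are already available (the histogram calibration step is not reproduced here).

```python
import math
from typing import Sequence, Tuple


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def combine_naive_bayes(
    posteriors: Sequence[float],
    prior: float,
    prob_clip: Tuple[float, float] = (0.0001, 0.9999),
) -> float:
    """Combine per-feature P(same | s_k) into P(same | s) under conditional independence."""
    lo, hi = prob_clip
    clipped = [min(max(p, lo), hi) for p in posteriors]  # clamp before the logit
    k = len(clipped)
    z = sum(logit(p) for p in clipped) - (k - 1) * logit(prior)
    return 1.0 / (1.0 + math.exp(-z))  # inverse logit


p_single = combine_naive_bayes([0.3], prior=0.1)            # one feature: unchanged
p_neutral = combine_naive_bayes([0.1, 0.1, 0.1], prior=0.1)  # no evidence beyond prior
p_agree = combine_naive_bayes([0.6, 0.6], prior=0.1)         # agreeing evidence compounds
d_pair = 1.0 - p_agree                                       # distance as 1 - P(same)
```

With one feature the posterior is returned unchanged, posteriors equal to the prior leave the prior untouched, and two features that each favour "same" give a combined probability higher than either alone.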
- make_pruned_similarity_graphs(convert_to_probability: bool = False, stringency: float = 1.0, mixing_params: Dict | None = None, d_cutoff: float | None = None) None[source]
Constructs pruned similarity graphs. RH 2023
- Parameters:
convert_to_probability (bool) – Whether to convert the distance and similarity graphs to probabilities, p(different) and p(same), respectively. (Default is False)
stringency (float) – Modifies the threshold for pruning the distance matrix. This value is multiplied by the inferred threshold to generate a new one: a higher value results in less pruning, a lower value in more pruning. (Default is 1.0)
mixing_params (Optional[Dict]) – Mixing parameters for self.make_conjunctive_distance_matrix. If None, the best parameters found using self.find_optimal_parameters_for_pruning are used. Use 'precomputed' to use a previously stored self.dConj. (Default is None)
d_cutoff (Optional[float]) – The cutoff distance for pruning the distance matrix. If None, then the optimal cutoff distance is inferred. (Default is None)
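The effect of stringency on pruning can be sketched on a toy edge list. This is an illustration only, not the library's implementation: the function name prune_edges is hypothetical, the inferred cutoff is supplied by hand, and keeping edges with distance less than or equal to the cutoff is an assumption about the comparison used.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]


def prune_edges(
    distances: Dict[Edge, float],
    d_cutoff_inferred: float,
    stringency: float = 1.0,
) -> Tuple[Dict[Edge, float], float]:
    """Scale the inferred cutoff by stringency and drop edges beyond it."""
    d_cutoff = stringency * d_cutoff_inferred
    kept = {edge: d for edge, d in distances.items() if d <= d_cutoff}
    return kept, d_cutoff


distances = {(0, 1): 0.2, (0, 2): 0.5, (1, 2): 0.9}
kept_default, _ = prune_edges(distances, d_cutoff_inferred=0.4)
kept_lenient, _ = prune_edges(distances, d_cutoff_inferred=0.4, stringency=1.5)
kept_strict, _ = prune_edges(distances, d_cutoff_inferred=0.4, stringency=0.25)
```

A stringency above 1 raises the cutoff and keeps more edges; a value below 1 lowers it and prunes more.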
- plot_distSame(mixing_params: dict | None = None) None[source]
Plot the estimated distribution of the pairwise similarities between matched ROI pairs.
- Parameters:
mixing_params (Optional[dict]) – Mixing parameters for make_conjunctive_distance_matrix. If None, the function uses the object’s best parameters. (Default is None)
- plot_similarity_relationships(max_samples: int = 1000000, kwargs_scatter: Dict[str, int | float] = {'alpha': 0.1, 's': 1}, mixing_params: Dict[str, Any] | None = None) Tuple[figure, axes][source]
Plot pairwise similarity relationships for all N*(N-1)/2 metric pairs. Each subplot shows one pair of metrics, colored by conjunctive distance.
- Parameters:
max_samples (int) – Maximum number of samples to plot.
kwargs_scatter (Dict[str, Union[int, float]]) – Keyword arguments for matplotlib.pyplot.scatter.
mixing_params (Optional[Dict[str, Any]]) – Mixing parameters for make_conjunctive_distance_matrix. If None, uses self.best_params if available, else defaults.
- Returns:
fig, axs: Figure and axes objects.
- Return type:
(Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes])
- rescue_noise(d_conj: csr_array, labels: ndarray, session_bool: ndarray, d_cutoff: float) ndarray[source]
Assign noise ROIs (label == -1) to nearby clusters or nucleate new small clusters, respecting same-session cannot-link constraints.
Uses a Kruskal-style sorted-edge traversal with DSU and bitmask constraints (Phase 2 of two-phase clustering). See noise_rescue_kruskal() for algorithm details.
Non-mutating: does not modify self.labels or any stored state.
- Parameters:
d_conj (scipy.sparse.csr_array) – Inter-session masked distance matrix. Shape: (n_rois, n_rois).
labels (np.ndarray) – Phase 1 cluster labels from HDBSCAN. Shape: (n_rois,). -1 = noise.
session_bool (np.ndarray) – Boolean array, shape (n_rois, n_sessions). Each row has exactly one True.
d_cutoff (float) – Maximum edge distance to accept for noise rescue.
- Returns:
- new_labels (np.ndarray):
Updated cluster labels after noise rescue. Shape: (n_rois,).
- Return type:
(np.ndarray)
- roicat.tracking.clustering.attach_fully_connected_node(d: object, dist_fullyConnectedNode: float | None = None, n_nodes: int = 1) object[source]
Appends a single node to a sparse distance graph that is weakly connected to all nodes.
- Parameters:
d (object) – Sparse graph with multiple components. Refer to scipy.sparse.csgraph.connected_components for details.
dist_fullyConnectedNode (Optional[float]) – Value used for the connection strength to all other nodes. This value will be appended as elements in a new row and column at the ends of the ‘d’ matrix. If None, then the value will be set to 1000 times the difference between the maximum and minimum values in ‘d’. (Default is None)
n_nodes (int) – Number of nodes to append to the graph. (Default is 1)
- Returns:
- d2 (object):
Sparse graph with only one component.
- Return type:
(object)
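The idea of joining a multi-component graph through one far-away node can be sketched on a small dense matrix. This is a pure-Python illustration, not the library's sparse implementation: the names attach_node and count_components are hypothetical, zeros stand for missing edges, only a single node is appended, and taking the maximum and minimum over all entries (zeros included) is an assumption about how the default distance is derived.

```python
from typing import List, Optional

Matrix = List[List[float]]


def attach_node(d: Matrix, dist: Optional[float] = None) -> Matrix:
    """Return a copy of d with one extra row/column connected to every node."""
    n = len(d)
    if dist is None:
        flat = [v for row in d for v in row]
        dist = 1000.0 * (max(flat) - min(flat))  # assumption: zeros are included
    d2 = [list(row) + [dist] for row in d]       # new column
    d2.append([dist] * n + [0.0])                # new row, zero diagonal
    return d2


def count_components(d: Matrix) -> int:
    """Count connected components, treating non-zero entries as edges."""
    n = len(d)
    seen = [False] * n
    count = 0
    for start in range(n):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if d[i][j] != 0 and not seen[j]:
                    seen[j] = True
                    stack.append(j)
    return count


d = [
    [0.0, 0.2, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.7, 0.0],
]
d2 = attach_node(d)
```

The toy graph starts with two components; after the extra node is attached every node is reachable from every other, at a distance far larger than any real edge.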
- roicat.tracking.clustering.cluster_quality_metrics(sim: ndarray | csr_array, labels: ndarray) Tuple[source]
Computes the cluster quality metrics for a clustering solution including intra-cluster mean, minimum, maximum similarity, and cluster silhouette score. RH 2023
- Parameters:
sim (Union[np.ndarray, scipy.sparse.csr_array]) – Similarity matrix. (shape: (n_roi, n_roi)) It can be obtained as sConj, the second output of clusterer.make_conjunctive_distance_matrix().
labels (np.ndarray) – Cluster labels. (shape: (n_roi,))
- Returns:
- tuple containing:
- cs_intra_means (np.ndarray):
Intra-cluster mean similarity. (shape: (n_clusters,))
- cs_intra_mins (np.ndarray):
Intra-cluster minimum similarity. (shape: (n_clusters,))
- cs_intra_maxs (np.ndarray):
Intra-cluster maximum similarity. (shape: (n_clusters,))
- cs_sil (np.ndarray):
Cluster silhouette score. (shape: (n_clusters,)) Describes intra_mean - inter_max_of_maxes
- Return type:
(tuple)
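The intra-cluster statistics can be illustrated on a small dense similarity matrix. This is a minimal sketch, not the library's code: the function name intra_cluster_stats is hypothetical, the silhouette score is not reproduced, returning NaN for single-member clusters is an assumption, and no special handling of the -1 label is attempted.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def intra_cluster_stats(
    sim: List[List[float]], labels: List[int]
) -> Dict[int, Tuple[float, float, float]]:
    """Map each label to (mean, min, max) similarity over its within-cluster pairs."""
    stats = {}
    for lab in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == lab]
        pairs = [sim[i][j] for i, j in combinations(members, 2)]
        if pairs:
            stats[lab] = (sum(pairs) / len(pairs), min(pairs), max(pairs))
        else:
            stats[lab] = (math.nan, math.nan, math.nan)  # assumption for singletons
    return stats


sim = [
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.7, 0.0, 0.2],
    [0.8, 0.7, 1.0, 0.1, 0.1],
    [0.1, 0.0, 0.1, 1.0, 0.6],
    [0.0, 0.2, 0.1, 0.6, 1.0],
]
labels = [0, 0, 0, 1, 1]
stats = intra_cluster_stats(sim, labels)
```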
- roicat.tracking.clustering.make_label_variants(labels: ndarray, n_roi_bySession: ndarray) Tuple[source]
Creates convenient variants of label arrays. RH 2023
- Parameters:
labels (np.ndarray) – Cluster integer labels. (shape: (n_roi,))
n_roi_bySession (np.ndarray) – Number of ROIs in each session.
- Returns:
- tuple containing:
- labels_squeezed (np.ndarray):
Cluster labels squeezed into a continuous range starting from 0.
- labels_bySession (List[np.ndarray]):
List of label arrays split by session.
- labels_bool (scipy.sparse.csr_array):
Sparse boolean matrix representation of labels.
- labels_bool_bySession (List[scipy.sparse.csr_array]):
List of sparse boolean matrix representations of labels split by session.
- labels_dict (Dict[int, np.ndarray]):
Dictionary mapping unique labels to their locations in the labels array.
- Return type:
(tuple)
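Three of the label variants listed above can be sketched with plain lists. This is an illustration only: the helper names are hypothetical, keeping -1 as -1 when squeezing is an assumption, and building the dictionary from the squeezed labels rather than the original ones is also an assumption. The sparse boolean variants are not reproduced.

```python
from typing import Dict, List


def squeeze_labels(labels: List[int]) -> List[int]:
    """Renumber non-negative labels to 0..n-1 in sorted order; -1 is left as-is."""
    uniq = sorted({lab for lab in labels if lab != -1})
    mapping = {old: new for new, old in enumerate(uniq)}
    return [mapping.get(lab, -1) for lab in labels]


def split_by_session(labels: List[int], n_roi_by_session: List[int]) -> List[List[int]]:
    """Cut the concatenated label list into one list per session."""
    if sum(n_roi_by_session) != len(labels):
        raise ValueError("session sizes must add up to the number of labels")
    out, start = [], 0
    for n in n_roi_by_session:
        out.append(labels[start:start + n])
        start += n
    return out


def label_locations(labels: List[int]) -> Dict[int, List[int]]:
    """Map each unique label to the positions where it occurs."""
    locs: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        locs.setdefault(lab, []).append(idx)
    return locs


labels = [5, 9, -1, 5, 9]
n_roi_by_session = [3, 2]
labels_squeezed = squeeze_labels(labels)
labels_by_session = split_by_session(labels_squeezed, n_roi_by_session)
labels_dict = label_locations(labels_squeezed)
```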
- roicat.tracking.clustering.noise_rescue_kruskal(d_conj: csr_array, labels: ndarray, group_labels: ndarray, n_groups: int, d_cutoff: float) ndarray[source]
Assign HDBSCAN noise points to nearby clusters (or nucleate new small clusters) using a Kruskal-style sorted-edge traversal with DSU and bitmask cannot-link constraints.
- This is Phase 2 of a two-phase clustering strategy:
Phase 1: HDBSCAN with min_samples > 1 produces robust core clusters but marks many ROIs as noise (label == -1).
Phase 2 (this function): Processes edges from the distance graph where at least one endpoint is noise, sorted by distance. Merges are accepted only if no session conflict arises (checked via uint64 bitmask per DSU component). This can either assign noise points to existing Phase 1 clusters or nucleate new clusters when 2+ noise points are mutual neighbors within d_cutoff.
The algorithm uses the same DSU + bitmask pattern as fast_hdbscan’s _kruskal_core_group_constrained, but with the DSU pre-initialized from Phase 1’s pre-formed clusters.
- Parameters:
d_conj (scipy.sparse.csr_array) – Sparse distance matrix (inter-session masked). Shape: (n_rois, n_rois). Must be symmetric with sorted indices.
labels (np.ndarray) – Phase 1 cluster labels. Shape: (n_rois,). -1 = noise.
group_labels (np.ndarray) – Session index per ROI (int32). Shape: (n_rois,).
n_groups (int) – Number of distinct sessions (groups).
d_cutoff (float) – Maximum edge distance to accept. Edges with d > d_cutoff are ignored.
- Returns:
- new_labels (np.ndarray):
Updated cluster labels. Shape: (n_rois,). Noise points that were rescued get their assigned cluster label; new clusters of 2+ ex-noise points get fresh label IDs; remaining singletons stay -1.
- Return type:
(np.ndarray)
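The Phase 2 procedure described above can be sketched in pure Python. This is a simplified illustration and not the library's numba implementation: the function name noise_rescue_sketch is hypothetical, edges are passed as (i, j, distance) tuples instead of a CSR matrix, Python integers stand in for the uint64 bitmasks, and refusing to fuse two existing Phase 1 clusters is an assumption made to match the description.

```python
from typing import Dict, List, Tuple


def noise_rescue_sketch(
    edges: List[Tuple[int, int, float]],
    labels: List[int],
    group_labels: List[int],
    d_cutoff: float,
) -> List[int]:
    n = len(labels)
    parent = list(range(n))
    size = [1] * n
    mask = [1 << g for g in group_labels]  # session bitmask per component root
    comp_label = list(labels)              # Phase 1 label per root, -1 if none

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def link(ra: int, rb: int) -> None:
        parent[rb] = ra
        size[ra] += size[rb]
        mask[ra] |= mask[rb]
        if comp_label[ra] == -1:
            comp_label[ra] = comp_label[rb]

    # Pre-initialize the DSU from the Phase 1 clusters (assumed conflict-free).
    first_member: Dict[int, int] = {}
    for i, lab in enumerate(labels):
        if lab == -1:
            continue
        if lab in first_member:
            link(find(first_member[lab]), find(i))
        else:
            first_member[lab] = i

    for i, j, d in sorted(edges, key=lambda e: e[2]):
        if d > d_cutoff:
            break                          # remaining edges are all too long
        if labels[i] != -1 and labels[j] != -1:
            continue                       # only edges touching a noise point
        ra, rb = find(i), find(j)
        if ra == rb or (mask[ra] & mask[rb]):
            continue                       # already joined, or session conflict
        if comp_label[ra] != -1 and comp_label[rb] != -1:
            continue                       # assumption: never fuse two Phase 1 clusters
        link(ra, rb)

    new_labels = [-1] * n
    next_label = max(labels, default=-1) + 1
    fresh: Dict[int, int] = {}
    for i in range(n):
        r = find(i)
        if comp_label[r] != -1:
            new_labels[i] = comp_label[r]
        elif size[r] >= 2:
            if r not in fresh:
                fresh[r] = next_label
                next_label += 1
            new_labels[i] = fresh[r]
    return new_labels


labels = [0, 0, -1, -1, -1, -1]
group_labels = [0, 1, 2, 0, 1, 2]
edges = [(1, 2, 0.1), (0, 3, 0.15), (3, 4, 0.2), (2, 4, 0.3), (4, 5, 0.9)]
new_labels = noise_rescue_sketch(edges, labels, group_labels, d_cutoff=0.5)
```

In this toy case ROI 2 joins cluster 0, ROI 3 is blocked from cluster 0 because session 0 is already present, ROIs 3 and 4 nucleate a new cluster, and ROI 5 stays noise because its only edge exceeds the cutoff.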
- roicat.tracking.clustering.plot_quality_metrics(quality_metrics: dict, labels: ndarray | list, n_sessions: int) None[source]
- roicat.tracking.clustering.score_labels(labels_test: ndarray, labels_true: ndarray, ignore_negOne: bool = False, thresh_perfect: float = 0.9999999999) Dict[str, float | Tuple[int, int]][source]
Computes the score of the clustering by finding the best match using the linear sum assignment problem. The score is bounded between 0 and 1. Note: The score is not symmetric if the number of true and test labels are not the same. I.e., switching labels_test and labels_true can lead to different scores. This is because we are scoring how well each true set is matched by an optimally assigned test set.
RH 2022
- Parameters:
labels_test (np.ndarray) – Labels of the test clusters/sets. (shape: (n,))
labels_true (np.ndarray) – Labels of the true clusters/sets. (shape: (n,))
ignore_negOne (bool) – Whether to ignore -1 values in the labels. If set to True, -1 values will be ignored in the computation. (Default is False)
thresh_perfect (float) – Threshold for perfect match. Mostly used for numerical stability. (Default is 0.9999999999)
- Returns:
- dictionary containing:
- score_weighted_partial (float):
Average correlation between the best matched sets of true and test labels, weighted by the number of elements in each true set.
- score_weighted_perfect (float):
Fraction of perfect matches, weighted by the number of elements in each true set.
- score_unweighted_partial (float):
Average correlation between the best matched sets of true and test labels.
- score_unweighted_perfect (float):
Fraction of perfect matches.
- adj_rand_score (float):
Adjusted Rand score of the labels.
- adj_mutual_info_score (float):
Adjusted mutual info score of the labels. None if compute_mutual_info is False.
- ignore_negOne (bool):
Whether -1 values were ignored in the labels.
- idx_hungarian (Tuple[int, int]):
‘Hungarian Indices’. Indices of the best matched sets.
- Return type:
(dict)
- roicat.tracking.clustering.weighted_jaccard_similarity(s: csr_array) csr_array[source]
Compute weighted Jaccard (Ruzicka) similarity from a sparse similarity graph. For each pair (i, j) in the input, replaces the direct similarity s_ij with a neighborhood-based structural similarity:
\[J_w(i, j) = \frac{\sum_{k \neq i,j} \min(s_{ik}, s_{jk})} {\sum_{k \neq i,j} \max(s_{ik}, s_{jk})}\]
where the sum is over all k in the union of non-zero neighbors of i and j, excluding k = i and k = j.
This is a second-order similarity: two nodes score high if they share many strong connections to the same neighbors. Acts as a denoising step that amplifies community structure. Used in SNN clustering (Seurat), Louvain/Leiden, and UMAP local connectivity.
Uses a numba-accelerated merge-scan over sorted CSR rows. Complexity: O(nnz * avg_degree).
- Parameters:
s (scipy.sparse.csr_array) – Sparse symmetric similarity matrix. Shape: (n, n). Must have non-negative values, zero diagonal (not stored), and sorted indices within each row (standard CSR convention).
- Returns:
- s_jaccard (scipy.sparse.csr_array):
Weighted Jaccard similarity matrix. Same sparsity pattern as input. Values in [0, 1]. Symmetric if input is symmetric. Zero diagonal.
- Return type:
(scipy.sparse.csr_array)
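The formula for J_w can be evaluated by hand on a four-node graph. The following is a pure-Python sketch using a dict-of-dicts as the sparse structure; it is not the numba merge-scan used by the library, the names build_symmetric and weighted_jaccard_sketch are hypothetical, and returning 0.0 when the denominator is zero is an assumption.

```python
from typing import Dict, List, Tuple

Sparse = Dict[int, Dict[int, float]]


def build_symmetric(n: int, edges: List[Tuple[int, int, float]]) -> Sparse:
    """Store each undirected edge in both directions; no diagonal entries."""
    s: Sparse = {i: {} for i in range(n)}
    for i, j, w in edges:
        s[i][j] = w
        s[j][i] = w
    return s


def weighted_jaccard_sketch(s: Sparse) -> Sparse:
    """Replace each stored s_ij by sum(min) / sum(max) over shared neighborhoods."""
    out: Sparse = {i: {} for i in s}
    for i, row in s.items():
        for j in row:
            neighbors = (set(s[i]) | set(s[j])) - {i, j}
            num = sum(min(s[i].get(k, 0.0), s[j].get(k, 0.0)) for k in neighbors)
            den = sum(max(s[i].get(k, 0.0), s[j].get(k, 0.0)) for k in neighbors)
            out[i][j] = num / den if den > 0 else 0.0  # assumption for empty unions
    return out


s = build_symmetric(4, [(0, 1, 0.9), (0, 2, 0.8), (1, 2, 0.6), (2, 3, 0.5)])
j_w = weighted_jaccard_sketch(s)
```

For the pair (0, 1) the only other neighbor is node 2, giving min(0.8, 0.6) / max(0.8, 0.6) = 0.75. The pair (2, 3) shares no neighbor, so its score drops to zero even though the direct similarity was 0.5.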
roicat.tracking.scatteringWaveletTransformer module
- class roicat.tracking.scatteringWaveletTransformer.SWT(kwargs_Scattering2D: Dict[str, Any] = {'J': 2, 'L': 8}, image_shape: Tuple[int, int] = (36, 36), device: str = 'cpu', verbose: bool = True)[source]
Bases: ROICaT_Module
Performs scattering wavelet transform using the kymatio library. RH 2022
- Parameters:
kwargs_Scattering2D (Dict[str, Any]) – The keyword arguments to pass to the Scattering2D class. (Default is {'J': 2, 'L': 8})
image_shape (Tuple[int, int]) – The shape of the images to be transformed. (Default is (36,36))
device (str) – The device to use for the transformation. (Default is 'cpu')
verbose (bool) – If True, progress statements are printed. (Default is True)
Example
from roicat.tracking.scatteringWaveletTransformer import SWT
# ROI_images: np.ndarray of shape (n_ROIs, height, width), prepared beforehand
swt = SWT(kwargs_Scattering2D={'J': 2, 'L': 8}, image_shape=(36,36), device='cpu', verbose=True)
transformed_images = swt.transform(ROI_images, batch_size=100)
- transform(ROI_images: ndarray, batch_size: int = 100) ndarray[source]
Transforms the ROI images.
- Parameters:
ROI_images (np.ndarray) – The ROI images to transform. One should probably concatenate ROI images across sessions for passing through here. (n_ROIs, height, width)
batch_size (int) – The batch size to use for the transformation. (Default is 100)
- Returns:
- latents (np.ndarray):
The transformed ROI images. (n_ROIs, latent_size)
- Return type:
(np.ndarray)
roicat.tracking.similarity_graph module
- class roicat.tracking.similarity_graph.ROI_graph(n_workers: int = -1, frame_height: int = 512, frame_width: int = 1024, block_height: int = 100, block_width: int = 100, overlapping_width_Multiplier: float = 0.0, algorithm_nearestNeigbors_spatialFootprints: str = 'brute', verbose: bool = True, metric_configs: List[SimilarityMetric] | None = None, kwargs_nearestNeigbors_spatialFootprints: dict = {})[source]
Bases: ROICaT_Module
Class for building similarity and distance graphs between Regions of Interest (ROIs) based on their features, generating potential clusters of ROIs using linkage clustering, building a similarity graph between clusters of ROIs, and computing silhouette scores for each potential cluster. The computations are performed on ‘blocks’ of the full field of view to accelerate computation and reduce memory usage.
The similarity system is pluggable: each pairwise metric is described by a SimilarityMetric dataclass. The default configuration (DEFAULT_METRICS) reproduces the legacy 3-metric system (spatial footprints + ROInet NN + SWT). RH 2022 / 2026
- Parameters:
n_workers (int) – The number of workers to use for the computations. If -1, all available cpu cores will be used. (Default is -1)
frame_height (int) – The height of the frame. (Default is 512)
frame_width (int) – The width of the frame. (Default is 1024)
block_height (int) – The height of the block. (Default is 100)
block_width (int) – The width of the block. (Default is 100)
overlapping_width_Multiplier (float) – The multiplier for the overlapping width. (Default is 0.0)
algorithm_nearestNeigbors_spatialFootprints (str) – The algorithm to use for the nearest neighbors computation. See sklearn.neighbors.NearestNeighbors for more information. (Default is 'brute')
verbose (bool) – If set to True, outputs will be verbose. (Default is True)
metric_configs (Optional[List[SimilarityMetric]]) – Pluggable metric configurations. If None, defaults to DEFAULT_METRICS at compute time. (Default is None)
kwargs_nearestNeigbors_spatialFootprints (dict) – The keyword arguments to use for the nearest neighbors. See sklearn.neighbors.NearestNeighbors for more information. (Optional)
- similarities
Dict mapping metric name to pairwise similarity matrix. Populated after compute_similarity_blockwise is called.
- Type:
Dict[str, scipy.sparse.csr_array]
- s_sesh
Pairwise session-membership mask (True where ROIs come from different sessions).
- Type:
scipy.sparse.csr_array
- similarities_z
Dict mapping metric name to z-scored similarity matrix. Populated after make_normalized_similarities is called.
- Type:
Dict[str, scipy.sparse.csr_array]
- compute_similarity_blockwise(spatialFootprints: csr_array, ROI_session_bool: Tensor, features: Dict[str, Tensor], spatialFootprint_maskPower: float = 1.0, precomputed_similarities: Dict[str, csr_array] | None = None, metric_configs: List[SimilarityMetric] | None = None) Dict[str, csr_array][source]
Computes the similarity graph between ROIs for all configured metrics. Results are stored in self.similarities (dict mapping metric name to sparse similarity matrix) and self.s_sesh (session mask).
Computation is done block-by-block over the field of view.
- Parameters:
spatialFootprints (scipy.sparse.csr_array) – The spatial footprints of the ROIs. Can be obtained from blurring.ROI_blurrer.ROIs_blurred or data_importing.Data_suite2p.spatialFootprints.
ROI_session_bool (torch.Tensor) – Boolean array indicating which ROIs (across all sessions) belong to each session. Shape: (n_ROIs total, n_sessions).
features (Dict[str, torch.Tensor]) – Dict mapping metric name to feature tensor. For example: {'nn': roinet.latents, 'swt': swt.latents}. The 'sf' key is handled automatically from spatialFootprints and should NOT be in this dict.
spatialFootprint_maskPower (float) – Power to raise the spatial footprint mask to. Use 1.0 for no change, low values (e.g. 0.5) for more binary masks, high values (e.g. 2.0) for intensity-dependent similarities. Applied ONCE inside _compute_manhattan_similarity. (Default is 1.0)
precomputed_similarities (Optional[Dict[str, scipy.sparse.csr_array]]) – Optional dict of precomputed similarity matrices (keyed by metric name). These bypass the similarity computation step for the corresponding metric. (Default is None)
metric_configs (Optional[List[SimilarityMetric]]) – Override metric configs for this call. If None, uses self._metric_configs or DEFAULT_METRICS. (Default is None)
- Returns:
Dict mapping metric name to pairwise similarity matrix. Also stored as self.similarities.
- Return type:
Dict[str, scipy.sparse.csr_array]
- make_normalized_similarities(centers_of_mass: ndarray | List[ndarray], features: Dict[str, Tensor], k_max: int = 3000, k_min: int = 200, algo_NN: str = 'kd_tree', device: str = 'cpu', verbose: bool = True) None[source]
Normalizes similarity matrices by z-scoring using the mean and standard deviation from the distributions of pairwise similarities between ROIs that are spatially distant from each other. This makes similarity scores more comparable across different field-of-view regions.
Only metrics with normalize_zscore=True in their config are z-scored. Metrics with normalize_zscore=False are copied into self.similarities_z as-is.
- Parameters:
centers_of_mass (Union[np.ndarray, List[np.ndarray]]) – Centers of mass of the ROIs. Array shape: (n_ROIs total, 2), or a list of arrays with shape: (n_ROIs per session, 2).
features (Dict[str, torch.Tensor]) – Dict mapping metric name to feature tensor. Only needed for metrics with normalize_zscore=True and similarity_fn='cosine'.
k_max (int) – Maximum number of nearest centroid-distance neighbors. (Default is 3000)
k_min (int) – Minimum number of nearest centroid-distance neighbors. ROIs between k_min and k_max are used as the “different” reference distribution. (Default is 200)
algo_NN (str) – Algorithm for nearest neighbor search on centroids. (Default is 'kd_tree')
device (str) – Device for similarity computations. Output is always CPU. (Default is 'cpu')
verbose (bool) – If True, print progress updates. (Default is True)
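The normalization idea, z-scoring a similarity against statistics taken from spatially distant pairs, can be sketched with the standard library. This is an illustration only: the function name zscore_with_reference is hypothetical, the reference values are made up, and using the population standard deviation is an assumption.

```python
import statistics
from typing import List, Sequence


def zscore_with_reference(values: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Z-score values using the mean and std of a 'known different' reference set."""
    mu = statistics.fmean(reference)
    sigma = statistics.pstdev(reference)  # assumption: population std
    if sigma == 0:
        raise ValueError("reference distribution has zero spread")
    return [(v - mu) / sigma for v in values]


# Similarities between an ROI and spatially distant ROIs (assumed different cells).
reference = [0.1, 0.2, 0.3]
# Similarities to nearby candidates: one typical of 'different', one much higher.
z = zscore_with_reference([0.2, 0.9], reference)
```

A candidate whose raw similarity matches the distant-pair average lands near zero, while a genuinely similar candidate stands out by several standard deviations regardless of the local baseline.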
- property similarities_final: Dict[str, csr_array]
Returns the final similarity matrices for downstream use (e.g., Clusterer). For metrics with normalize_zscore=True, returns the z-scored version. For others, returns the raw version.
Requires make_normalized_similarities() to have been called. Falls back to self.similarities if z-scored versions are not available.
- Returns:
Dict mapping metric name to final similarity matrix.
- Return type:
Dict[str, scipy.sparse.csr_array]
- class roicat.tracking.similarity_graph.SimilarityMetric(name: str, similarity_fn: str | Callable | None = 'cosine', is_sparsity_source: bool = False, normalize_zscore: bool = False, optimize_power: bool = True, optimize_sigmoid: bool = True, power_bounds: Tuple[float, float] = (0.0, 2.0), similarity_fn_kwargs: dict = <factory>, post_process: Dict[str, Any] | None = None)[source]
Bases: object
Configuration for one similarity metric in the pluggable system.
Each metric represents a pairwise comparison between ROIs. It can be computed from raw features (via similarity_fn) or provided as a precomputed sparse similarity matrix.
- Parameters:
name (str) – Unique identifier for this metric. Used as dict key throughout. Examples: 'sf', 'nn', 'swt', 'temporal'.
similarity_fn (Union[str, Callable, None]) –
How to compute pairwise similarity from raw features:
'cosine': L2-normalize, then matmul. For dense feature vectors.
'manhattan': sklearn NearestNeighbors with manhattan metric. For sparse spatial footprints.
Callable: Custom function with signature (features_block, **kwargs) -> similarity_matrix. Must return a matrix (dense or sparse) of shape (n_roi_block, n_roi_block).
None: Metric uses a precomputed similarity matrix (no feature computation needed).
is_sparsity_source (bool) – If True, this metric’s nonzero pattern contributes to the global sparsity mask. When only one metric has this set, all others are masked to its pattern. When multiple metrics are sparsity sources, the intersection of their nonzero patterns is used.
normalize_zscore (bool) – If True, z-score normalize this metric’s similarity values using distant-neighbor statistics (same approach as the current make_normalized_similarities).
optimize_power (bool) – If True, include power_<name> in the DE optimization bounds. If False, similarity values are used as-is (power=1) in the conjunctive distance computation.
optimize_sigmoid (bool) – If True, estimate sigmoid parameters (mu, b) from NB calibration for this metric. If False, no sigmoid activation is applied.
power_bounds (Tuple[float, float]) – Bounds for the power parameter in DE optimization. Only used when optimize_power=True.
similarity_fn_kwargs (dict) – Additional keyword arguments passed to similarity_fn. For 'manhattan': can include 'algorithm', 'n_jobs', etc.
post_process (Optional[Dict[str, Any]]) –
Per-metric post-processing after similarity computation. Supported keys:
'clip_min' (Optional[float]): Clip values below this threshold to zero. None means no clipping.
'clip_near_one' (bool): If True, clip values above (1 - 1e-5) to exactly 1.0.
- classmethod from_dict(d: dict) SimilarityMetric[source]
Reconstruct from dict. similarity_fn must be str or None.
- is_sparsity_source: bool = False
- name: str
- normalize_zscore: bool = False
- optimize_power: bool = True
- optimize_sigmoid: bool = True
- post_process: Dict[str, Any] | None = None
- power_bounds: Tuple[float, float] = (0.0, 2.0)
- similarity_fn: str | Callable | None = 'cosine'
- similarity_fn_kwargs: dict
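The two documented post_process keys can be illustrated on a flat list of similarity values. This is a minimal sketch, not the library's code: the function name apply_post_process is hypothetical, and applying clip_min before clip_near_one is an assumption about ordering.

```python
from typing import Any, Dict, List, Optional, Sequence


def apply_post_process(
    values: Sequence[float], post_process: Optional[Dict[str, Any]]
) -> List[float]:
    """Apply 'clip_min' and 'clip_near_one' as described for SimilarityMetric."""
    if post_process is None:
        return list(values)
    clip_min = post_process.get("clip_min")
    clip_near_one = post_process.get("clip_near_one", False)
    out = []
    for v in values:
        if clip_min is not None and v < clip_min:
            v = 0.0                      # values below the threshold become zero
        if clip_near_one and v > 1.0 - 1e-5:
            v = 1.0                      # snap near-one values to exactly 1.0
        out.append(v)
    return out


raw = [0.05, 0.5, 0.999999, 1.0000003]
cleaned = apply_post_process(raw, {"clip_min": 0.1, "clip_near_one": True})
untouched = apply_post_process(raw, None)
```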
- roicat.tracking.similarity_graph.cosine_similarity_customIdx(features: Tensor, idx: ndarray) Tensor[source]
Calculate cosine similarity using custom indices.
- Parameters:
features (torch.Tensor) – A tensor of feature vectors. Shape: (n, d), where n is the number of data points and d is the dimensionality of the data.
idx (np.ndarray) – Array of indices. Shape should match the first dimension of the features tensor.
- Returns:
- result (torch.Tensor):
Cosine similarity tensor calculated using the provided indices. Shape: (n, d), where n is the number of data points and d is the dimensionality of the data.
- Return type:
(torch.Tensor)
- roicat.tracking.similarity_graph.get_idx_in_kRange(X: ndarray, k_max: int = 3000, k_min: int = 100, algo_kNN: str = 'brute', n_workers: int = -1) Tuple[ndarray, coo_array][source]
Get indices in a given range for k-Nearest Neighbors graph. RH 2022
- Parameters:
X (np.ndarray) – Input data array where each row is a data point and each column is a feature.
k_max (int) – Maximum number of neighbors to find. (Default is
3000)k_min (int) – Minimum number of neighbors to consider. (Default is
100)algo_kNN (str) – Algorithm to use for nearest neighbors search. (Default is
'brute')n_workers (int) – Number of worker processes to use. If
-1, use all available cores. (Default is-1)
- Returns:
- tuple containing:
- idx_diff (np.ndarray):
Indices of the non-zero values in the distance graph, with a range between k_min and k_max.
- d (scipy.sparse.coo_array):
Sparse matrix representing the distance graph from the k-Nearest Neighbors algorithm.
- Return type:
(Tuple[np.ndarray, scipy.sparse.coo_array])
Classification module
roicat.classification.classifier module
- class roicat.classification.classifier.Auto_LogisticRegression(X: ndarray, y: ndarray, params_LogisticRegression: Dict = {'C': [1e-14, 1000.0], 'fit_intercept': True, 'l1_ratio': None, 'max_iter': 1000, 'n_jobs': None, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.0001, 'warm_start': False}, n_startup: int = 15, kwargs_convergence: Dict = {'max_duration': 600, 'max_trials': 150, 'n_patience': 50, 'tol_frac': 0.05}, n_jobs_optuna: int = 1, penalty_testTrainRatio: float = 1.0, test_size: float = 0.3, class_weight: Dict[str, float] | str | None = 'balanced', sample_weight: List[float] | None = None, cv: BaseCrossValidator | None = None, verbose: bool = True)[source]
Bases: Autotuner_regression
Implements automatic hyperparameter tuning for Logistic Regression. RH 2023
- Parameters:
X (np.ndarray) – Training data. (shape: (n_samples, n_features))
y (np.ndarray) – Target variable. (shape: (n_samples,))
params_LogisticRegression (Dict) –
Dictionary of Logistic Regression parameters. For each item in the dictionary, if the item is:
a list: The parameter is tuned. If the values are numbers, then the list will be the bounds [low, high] to search over. If the values are strings, then the list will be the categorical values to search over.
not a list: The parameter is fixed to the given value.
See LogisticRegression for a full list of arguments.
n_startup (int) – Number of startup trials. (Default is 15)
kwargs_convergence (Dict[str, Union[int, float]]) –
Convergence settings for the optimization. Includes:
'n_patience' (int): The number of trials to wait for convergence before stopping the optimization.
'tol_frac' (float): The fractional tolerance for convergence. After n_patience trials, the optimization will stop if the loss has not improved by at least tol_frac.
'max_trials' (int): The maximum number of trials to run.
'max_duration' (int): The maximum duration of the optimization in seconds.
n_jobs_optuna (int) – Number of jobs for Optuna. Set to -1 to use all cores. Note that some 'solver' options are already parallelized (like 'lbfgs'). Set n_jobs_optuna to 1 for these solvers.
penalty_testTrainRatio (float) – Penalty ratio for test and train.
test_size (float) – Test set ratio.
class_weight (Union[Dict[str, float], str]) – Weights associated with classes in the form of a dictionary or string. If given “balanced”, class weights will be calculated. (Default is “balanced”)
sample_weight (Optional[List[float]]) –
Sample weights. See LogisticRegression for more details.
cv (Optional[sklearn.model_selection._split.BaseCrossValidator]) –
A Scikit-Learn cross-validator class. If not None, then must have:
Call signature:
idx_train, idx_test = next(self.cv.split(self.X, self.y))
If None, then a StratifiedShuffleSplit cross-validator will be used.
verbose (bool) – Whether to print progress messages.
- Demo:
## Initialize with NO TUNING. All parameters are fixed.
autoclassifier = Auto_LogisticRegression(
    X,
    y,
    params_LogisticRegression={
        'C': 1e-14,
        'penalty': 'l2',
        'solver': 'lbfgs',
    },
)

## Initialize with TUNING 'C', 'penalty', and 'l1_ratio'. 'solver' is fixed.
autoclassifier = Auto_LogisticRegression(
    X,
    y,
    params_LogisticRegression={
        'C': [1e-14, 1e3],
        'penalty': ['l1', 'l2', 'elasticnet'],
        'l1_ratio': [0.0, 1.0],
        'solver': 'lbfgs',
    },
)
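The list versus non-list rule for params_LogisticRegression can be pictured with a small standalone sketch. The function `split_params` and the `'kind'` labels are hypothetical names chosen for illustration; how the class translates these entries into its search space internally is not shown in this documentation.

```python
from typing import Any, Dict, Tuple


def split_params(params: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate tuned (list-valued) from fixed parameters."""
    tuned: Dict[str, Any] = {}
    fixed: Dict[str, Any] = {}
    for name, value in params.items():
        if isinstance(value, list):
            if all(isinstance(v, str) for v in value):
                # A list of strings is treated as categorical choices.
                tuned[name] = {'kind': 'categorical', 'choices': value}
            else:
                # A list of numbers is treated as [low, high] bounds.
                low, high = value
                tuned[name] = {'kind': 'numeric', 'low': low, 'high': high}
        else:
            # Anything that is not a list is a fixed value.
            fixed[name] = value
    return tuned, fixed


tuned, fixed = split_params({
    'C': [1e-14, 1e3],
    'penalty': ['l1', 'l2', 'elasticnet'],
    'solver': 'lbfgs',
})
```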
- evaluate_model(model: LogisticRegression | None = None, X: ndarray | None = None, y: ndarray | None = None, sample_weight: List[float] | None = None) Tuple[float, array][source]
Evaluates the given model on the given data. Makes label predictions, then computes the accuracy and confusion matrix.
- Parameters:
model (sklearn.linear_model.LogisticRegression) – A sklearn LogisticRegression model. If None, then self.model_best is used.
X (np.ndarray) – The data to evaluate on. If None, then self.X is used.
y (np.ndarray) – The labels to evaluate on. If None, then self.y is used.
sample_weight (List[float]) – The sample weights to evaluate on. If None, then self.sample_weight is used.
- Returns:
- Tuple containing:
- accuracy (float):
The accuracy of the model on the given data.
- confusion_matrix (np.array):
The confusion matrix of the model on the given data.
- Return type:
(tuple)
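For readers unfamiliar with the two returned quantities, here is a minimal pure-Python sketch of accuracy and a confusion matrix. The helper `accuracy_and_confusion` is hypothetical, ignores sample weights, and assumes integer class labels starting at 0; the row/column orientation and any normalization used by evaluate_model are not specified here and may differ.

```python
from typing import List, Sequence, Tuple


def accuracy_and_confusion(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_classes: int,
) -> Tuple[float, List[List[int]]]:
    """Hypothetical helper: accuracy and confusion matrix (rows: true, columns: predicted)."""
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1
    # Correct predictions lie on the diagonal.
    n_correct = sum(confusion[i][i] for i in range(n_classes))
    accuracy = n_correct / len(y_true)
    return accuracy, confusion


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
accuracy, confusion = accuracy_and_confusion(y_true, y_pred, n_classes=3)
```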
- class roicat.classification.classifier.Autotuner_regression(model_class: Type[BaseEstimator], params: Dict[str, Dict[str, Any]], X: Any, y: Any, cv: Any, fn_loss: Callable, n_jobs_optuna: int = -1, n_startup: int = 15, kwargs_convergence={'max_duration': 600, 'max_trials': 350, 'n_patience': 100, 'tol_frac': 0.05}, sample_weight: Any | None = None, catch_convergence_warnings: bool = True, verbose=True)[source]
Bases: ROICaT_Module
A class for automatic hyperparameter tuning and training of a regression model. RH 2023
- model_class
A Scikit-Learn estimator class. Must have:
Method: fit(X, y)
Method: predict_proba(X) (for classifiers) or predict(X) (for continuous regressors)
- Type:
Type[sklearn.base.BaseEstimator]
- params
A dictionary of hyperparameters with their names, types, and bounds.
- Type:
Dict[str, Dict[str, Any]]
- X
Input data. Shape: (n_samples, n_features)
- Type:
np.ndarray
- y
Output data. Shape: (n_samples,)
- Type:
np.ndarray
- cv
A Scikit-Learn cross-validator class. Must have:
Call signature:
idx_train, idx_test = next(self.cv.split(self.X, self.y))
- Type:
Type[sklearn.model_selection._split.BaseCrossValidator]
- fn_loss
Function to compute the loss. Must have:
Call signature:
loss, loss_train, loss_test = fn_loss(y_pred_train, y_pred_test, y_true_train, y_true_test, sample_weight_train, sample_weight_test)
- Type:
Callable
- n_jobs_optuna
Number of jobs for Optuna. Set to -1 to use all cores. Note that some 'solver' options are already parallelized (like 'lbfgs'). Set n_jobs_optuna to 1 for these solvers.
- Type:
int
- n_startup
The number of startup trials for the optuna pruner and sampler.
- Type:
int
- kwargs_convergence
Convergence settings for the optimization. Includes:
'n_patience' (int): The number of trials to wait for convergence before stopping the optimization.
'tol_frac' (float): The fractional tolerance for convergence. After n_patience trials, the optimization will stop if the loss has not improved by at least tol_frac.
'max_trials' (int): The maximum number of trials to run.
'max_duration' (int): The maximum duration of the optimization in seconds.
- Type:
Dict[str, Union[int, float]]
- sample_weight
Weights for the samples, equal to ones_like(y) if None.
- Type:
Optional[np.ndarray]
- catch_convergence_warnings
If True, ignore ConvergenceWarning during model fitting.
- Type:
bool
- verbose
If True, show progress bar and print running results.
- Type:
bool
Example
params = {
    'C': {'type': 'real', 'kwargs': {'log': True, 'low': 1e-4, 'high': 1e4}},
    'penalty': {'type': 'categorical', 'kwargs': {'choices': ['l1', 'l2']}},
}
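The kwargs_convergence settings described above can be read as a stopping rule. The sketch below is one plausible interpretation, not the library's implementation: `should_stop` is a hypothetical function, and measuring improvement as the fractional change in the best loss over the last n_patience trials is an assumption.

```python
import time
from typing import List


def should_stop(
    losses: List[float],
    t_start: float,
    n_patience: int = 100,
    tol_frac: float = 0.05,
    max_trials: int = 350,
    max_duration: float = 600,
) -> bool:
    """Hypothetical convergence check over the losses of completed trials."""
    # Hard limits on the number of trials and on wall-clock time (seconds).
    if len(losses) >= max_trials or (time.time() - t_start) >= max_duration:
        return True
    # Too little history to judge convergence yet.
    if len(losses) <= n_patience:
        return False
    best_before = min(losses[:-n_patience])  # best loss before the patience window
    best_now = min(losses)                   # best loss including the window
    # Fractional improvement achieved during the last n_patience trials.
    improvement = (best_before - best_now) / max(abs(best_before), 1e-12)
    return improvement < tol_frac


t_start = time.time()
stalled = should_stop([1.0, 0.99, 0.98, 0.98, 0.98], t_start, n_patience=3)
improving = should_stop([1.0, 0.9, 0.5, 0.4, 0.3], t_start, n_patience=3)
```

In this sketch the first history stops (the best loss improved by about 1% in the last three trials) while the second continues.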
- fit() BaseEstimator | Dict[str, Any] | None[source]
Fit and tune the hyperparameters and train the classifier.
- Returns:
- best_model (sklearn.base.BaseEstimator):
The best estimator obtained from hyperparameter tuning.
- best_params (Optional[Dict[str, Any]]):
The best parameters obtained from hyperparameter tuning.
- Return type:
(Union[sklearn.base.BaseEstimator, Optional[Dict[str, Any]]])
- save_model(filepath: str | None = None, allow_overwrite: bool = False)[source]
Uses ONNX to save the best model as a binary file.
- Parameters:
filepath (str) – The path to save the model to. If None, then the model will not be saved.
allow_overwrite (bool) – Whether to allow overwriting of existing files.
- Returns:
The ONNX model.
- Return type:
(onnx.ModelProto)
- roicat.classification.classifier.Load_ONNX_model_sklearnLogisticRegression
alias of
ONNX_model_sklearnLogisticRegression
- class roicat.classification.classifier.LossFunction_CrossEntropy_CV(penalty_testTrainRatio: float = 1.0, labels: List | ndarray | None = None, test_or_train: str = 'test')[source]
Bases: object
Calculates the cross-entropy loss of a classifier using cross-validation. RH 2023
- Parameters:
penalty_testTrainRatio (float) – The amount of penalty for the test loss to the train loss. Penalty is applied with formula: loss = loss_test_or_train * ((loss_test / loss_train) ** penalty_testTrainRatio).
labels (Optional[Union[List, np.ndarray]]) – A list or ndarray of labels. Shape: (n_samples,).
test_or_train (str) – A string indicating whether to apply the penalty to the test or train loss. It should be either 'test' or 'train'.
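The penalty formula above can be worked through numerically. The function `penalized_loss` below is a hypothetical standalone helper that takes the train and test losses as plain floats; the class itself computes cross-entropy from predictions, which is not reproduced here.

```python
def penalized_loss(
    loss_train: float,
    loss_test: float,
    penalty_testTrainRatio: float = 1.0,
    test_or_train: str = 'test',
) -> float:
    """Hypothetical helper: apply the test/train ratio penalty to the chosen loss."""
    if test_or_train == 'test':
        base = loss_test
    elif test_or_train == 'train':
        base = loss_train
    else:
        raise ValueError("test_or_train must be 'test' or 'train'")
    # loss = loss_test_or_train * ((loss_test / loss_train) ** penalty_testTrainRatio)
    return base * ((loss_test / loss_train) ** penalty_testTrainRatio)


loss_default = penalized_loss(loss_train=0.5, loss_test=1.0)
loss_no_penalty = penalized_loss(loss_train=0.5, loss_test=1.0, penalty_testTrainRatio=0.0)
loss_on_train = penalized_loss(loss_train=0.5, loss_test=1.0, test_or_train='train')
```

With a test loss twice the train loss, the default exponent of 1.0 doubles the chosen loss, and an exponent of 0.0 disables the penalty.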
- class roicat.classification.classifier.ONNX_model_sklearnLogisticRegression(path_or_bytes: str = 'path/to/model.onnx', providers: List[str] = ['CPUExecutionProvider'])[source]
Bases: object
Loads an ONNX model of an sklearn LogisticRegression model into a runtime session. RH 2023
- Parameters:
path_or_bytes (Union[str, bytes]) –
Either:
The filepath to the ONNX model.
The bytes of the ONNX model: model.SerializeToString().
- Returns:
A partial function that takes in a numpy array or torch Tensor and passes it through the ONNX runtime session (model).
- Return type:
(function)
roicat.pipelines module
- roicat.pipelines.pipeline_tracking(params: dict, custom_data: Data_roicat = None) tuple[source]
Pipeline for tracking ROIs across sessions. RH 2023
- Parameters:
params (dict) – Dictionary of parameters. See roicat.util.get_default_parameters(pipeline='tracking') for details.
custom_data (Optional[data_importing.Data_roicat]) – If not None, then this is a custom roicat data object that will be used instead of loading data in known formats from disk. Be careful to ensure that the object is prepared for the tracking pipeline.
- Returns:
- tuple containing:
- results (dict):
Dictionary of results.
- run_data (dict):
Dictionary containing the different class objects used in the pipeline.
- params (dict):
Parameters used in the pipeline. See roicat.helpers.prepare_params() for details.
- Return type:
(tuple)
roicat.ROInet module
OSF.io links to ROInet versions:
- ROInet_tracking:
Info: This version does not include occlusions or large affine transformations.
Hash (MD5 hex): 7a5fb8ad94b110037785a46b9463ea94
- ROInet_classification:
Info: This version includes occlusions and large affine transformations.
Hash (MD5 hex): 357a8d9b630ec79f3e015d0056a4c2d5
- class roicat.ROInet.Dataloader_ROInet(ROI_images: ndarray, batchSize_dataloader: int = 8, pinMemory_dataloader: bool = True, numWorkers_dataloader: int = -1, persistentWorkers_dataloader: bool = True, prefetchFactor_dataloader: int = 2, transforms: Callable | None = None, n_transforms: int = 1, img_size_out: Tuple[int, int] = (224, 224), jit_script_transforms: bool = False, shuffle_dataloader: bool = False, drop_last_dataloader: bool = False, verbose: bool = True)[source]
Bases: ROICaT_Module
Class for creating a dataloader for the ROInet network. JZ, RH 2023
- Parameters:
ROI_images (np.ndarray) – Array of ROIs to resize. Shape should be (nROIs, height, width).
pref_plot (bool) – If True, plots the sizes of the ROI images before and after normalization. (Default is False)
batchSize_dataloader (int) – The batch size to use for the DataLoader. (Default is 8)
pinMemory_dataloader (bool) – If True, pins the memory of the DataLoader, as per PyTorch’s best practices. (Default is True)
numWorkers_dataloader (int) – The number of worker processes for data loading. (Default is -1)
persistentWorkers_dataloader (bool) – If True, uses persistent worker processes. (Default is True)
prefetchFactor_dataloader (int) – The prefetch factor for data loading. (Default is 2)
transforms (Optional[Callable]) – The transforms to use for the DataLoader. If None, the function will only scale dynamic range (to 0-1), resize (to img_size_out dimensions), and tile channels (to 3) as a minimum to pass images through the network. (Default is None)
n_transforms (int) – The number of times to apply the transforms to each image. Should be 1 for inference and 2 for training. (Default is 1)
img_size_out (Tuple[int, int]) – The image output dimensions of DataLoader if transforms is None. (Default is (224, 224))
jit_script_transforms (bool) – If True, converts the transforms pipeline into a TorchScript pipeline, potentially improving calculation speed but can cause problems with multiprocessing. (Default is False)
shuffle_dataloader (bool) – If True, shuffles the data. Should be set to True for SimCLR training. (Default is False)
drop_last_dataloader (bool) – If True, drops the last batch if it is not full. Should be set to True for SimCLR training. (Default is False)
verbose (bool) – If True, print out extra information. (Default is True)
- class roicat.ROInet.ROInet_embedder(dir_networkFiles: str, device: str = 'cpu', download_method: str = 'check_local_first', download_url: str = 'https://osf.io/x3fd2/download', download_hash: dict = None, names_networkFiles: dict = None, forward_pass_version: str = 'latent', verbose: bool = True)[source]
Bases: ROICaT_Module
Class for loading the ROInet model, preparing data for it, and running it. RH, JZ 2022
OSF.io links to ROInet versions:
- ROInet_tracking:
Info: This version does not include occlusions or large affine transformations.
Hash (MD5 hex): 7a5fb8ad94b110037785a46b9463ea94
- ROInet_classification:
Info: This version includes occlusions and large affine transformations.
Hash (MD5 hex): 357a8d9b630ec79f3e015d0056a4c2d5
- Parameters:
dir_networkFiles (str) – Directory to find an existing ROInet.zip file or download and extract a new one into.
device (str) – Device to use for the model and data. (Default is 'cpu')
download_method (str) –
Approach to downloading the network files. Options are:
'check_local_first': Check if the network files are already in dir_networkFiles, if so, use them.
'force_download': Download an ROInet.zip file from download_url.
'force_local': Use an existing local copy of an ROInet.zip file, if they don’t exist, raise an error. Hash checking is done and download_hash must be specified.
(Default is 'check_local_first')
download_url (str) – URL to download the ROInet.zip file from. (Default is https://osf.io/x3fd2/download)
download_hash (dict) – MD5 hash of the ROInet.zip file. This can be obtained from ROICaT documentation. If you don’t have one, use download_method='force_download' and determine the hash using helpers.hash_file(). (Default is None)
names_networkFiles (dict) –
Names of the files in the ROInet.zip file. If uncertain, leave as None. The dictionary should have the form:
{'params': 'params.json', 'model': 'model.py', 'state_dict': 'ConvNext_tiny__1_0_unfrozen__simCLR.pth',}
Where 'params' is the parameters used to train the network (usually a .json file), 'model' is the model definition (usually a .py file), and 'state_dict' are the weights of the network (usually a .pth file). (Default is None)
forward_pass_version (str) – Version of the forward pass to use. Options are 'latent' (return the post-head output latents, use this for tracking), 'head' (return the output of the head layers, use this for classification), and 'base' (return the output of the base model). (Default is 'latent')
verbose (bool) – If True, print out extra information. (Default is True)
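The download_hash entry above refers to an MD5 hash of the ROInet.zip file. As a standalone illustration of what such a hash is, the sketch below computes the MD5 hex digest of a file using only the standard library. The function `md5_of_file` is a hypothetical stand-in and is not helpers.hash_file(); the temporary file is used only so that the example runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hypothetical helper: MD5 hex digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()


# Demonstrate on a small temporary file rather than a real ROInet.zip.
with tempfile.TemporaryDirectory() as tmp_dir:
    path_example = Path(tmp_dir) / 'example.bin'
    path_example.write_bytes(b'hello')
    digest = md5_of_file(path_example)
```

For a downloaded archive, the resulting 32-character hex string would be compared against the hash listed for the corresponding ROInet version.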
- generate_dataloader(ROI_images: List[ndarray], um_per_pixel: float | List[float], resize_ROI_images: bool = True, nan_to_num: bool = True, nan_to_num_val: float = 0.0, pref_plot: bool = False, batchSize_dataloader: int = 8, pinMemory_dataloader: bool = True, numWorkers_dataloader: int = -1, persistentWorkers_dataloader: bool = True, prefetchFactor_dataloader: int = 2, transforms: Callable | None = None, img_size_out: Tuple[int, int] = (224, 224), jit_script_transforms: bool = False)[source]
Generates a PyTorch DataLoader for a list of Region of Interest (ROI) images. Performs preprocessing such as rescaling, normalization, and resizing.
- Parameters:
ROI_images (List[np.ndarray]) – The ROI images to use for the dataloader. List of arrays, each array corresponds to a session and is of shape (n_rois, height, width).
um_per_pixel (Union[float, List[float]]) – The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
resize_ROI_images (bool) – If True, resizes the ROI images to a common size. (Default is True)
nan_to_num (bool) – Whether to replace NaNs with a specific value. (Default is True)
nan_to_num_val (float) – The value to replace NaNs with. (Default is 0.0)
pref_plot (bool) – If True, plots the sizes of the ROI images before and after normalization. (Default is False)
batchSize_dataloader (int) – The batch size to use for the DataLoader. (Default is 8)
pinMemory_dataloader (bool) – If True, pins the memory of the DataLoader, as per PyTorch’s best practices. (Default is True)
numWorkers_dataloader (int) – The number of worker processes for data loading. (Default is -1)
persistentWorkers_dataloader (bool) – If True, uses persistent worker processes. (Default is True)
prefetchFactor_dataloader (int) – The prefetch factor for data loading. (Default is 2)
transforms (Optional[Callable]) – The transforms to use for the DataLoader. If None, the function will only scale dynamic range (to 0-1), resize (to img_size_out dimensions), and tile channels (to 3) as a minimum to pass images through the network. (Default is None)
img_size_out (Tuple[int, int]) – The image output dimensions of DataLoader if transforms is None. (Default is (224, 224))
jit_script_transforms (bool) – If True, converts the transforms pipeline into a TorchScript pipeline, potentially improving calculation speed but can cause problems with multiprocessing. (Default is False)
- Returns:
- ROI_images (np.ndarray):
The ROI images after normalization and resizing. Shape is (n_sessions, n_rois, n_channels, height, width).
- Return type:
(np.ndarray)
Example
dataloader = generate_dataloader(ROI_images)
- class roicat.ROInet.ROInet_embedder_original(dir_networkFiles: str, device: str = 'cpu', download_method: str = 'check_local_first', download_url: str = 'https://osf.io/x3fd2/download', download_hash: dict = None, names_networkFiles: dict = None, forward_pass_version: str = 'latent', verbose: bool = True)[source]
Bases: ROICaT_Module
Class for loading the ROInet model, preparing data for it, and running it. RH, JZ 2022
OSF.io links to ROInet versions:
- ROInet_tracking:
Info: This version does not include occlusions or large affine transformations.
Hash (MD5 hex): 7a5fb8ad94b110037785a46b9463ea94
- ROInet_classification:
Info: This version includes occlusions and large affine transformations.
Hash (MD5 hex): 357a8d9b630ec79f3e015d0056a4c2d5
- Parameters:
dir_networkFiles (str) – Directory to find an existing ROInet.zip file or download and extract a new one into.
device (str) – Device to use for the model and data. (Default is 'cpu')
download_method (str) –
Approach to downloading the network files. Options are:
'check_local_first': Check if the network files are already in dir_networkFiles, if so, use them.
'force_download': Download an ROInet.zip file from download_url.
'force_local': Use an existing local copy of an ROInet.zip file, if they don’t exist, raise an error. Hash checking is done and download_hash must be specified.
(Default is 'check_local_first')
download_url (str) – URL to download the ROInet.zip file from. (Default is https://osf.io/x3fd2/download)
download_hash (dict) – MD5 hash of the ROInet.zip file. This can be obtained from ROICaT documentation. If you don’t have one, use download_method='force_download' and determine the hash using helpers.hash_file(). (Default is None)
names_networkFiles (dict) –
Names of the files in the ROInet.zip file. If uncertain, leave as None. The dictionary should have the form:
{'params': 'params.json', 'model': 'model.py', 'state_dict': 'ConvNext_tiny__1_0_unfrozen__simCLR.pth',}
Where 'params' is the parameters used to train the network (usually a .json file), 'model' is the model definition (usually a .py file), and 'state_dict' are the weights of the network (usually a .pth file). (Default is None)
forward_pass_version (str) – Version of the forward pass to use. Options are 'latent' (return the post-head output latents, use this for tracking), 'head' (return the output of the head layers, use this for classification), and 'base' (return the output of the base model). (Default is 'latent')
verbose (bool) – If True, print out extra information. (Default is True)
- generate_dataloader(ROI_images: List[ndarray], um_per_pixel: float = 1.0, nan_to_num: bool = True, nan_to_num_val: float = 0.0, pref_plot: bool = False, batchSize_dataloader: int = 8, pinMemory_dataloader: bool = True, numWorkers_dataloader: int = -1, persistentWorkers_dataloader: bool = True, prefetchFactor_dataloader: int = 2, transforms: Callable | None = None, img_size_out: Tuple[int, int] = (224, 224), jit_script_transforms: bool = False)[source]
Generates a PyTorch DataLoader for a list of Region of Interest (ROI) images. Performs preprocessing such as rescaling, normalization, and resizing.
- Parameters:
ROI_images (List[np.ndarray]) – The ROI images to use for the dataloader. List of arrays, each array corresponds to a session and is of shape (n_rois, height, width).
um_per_pixel (float) – The number of microns per pixel. Used to rescale the ROI images to the same size as the network input. (Default is 1.0)
nan_to_num (bool) – Whether to replace NaNs with a specific value. (Default is True)
nan_to_num_val (float) – The value to replace NaNs with. (Default is 0.0)
pref_plot (bool) – If True, plots the sizes of the ROI images before and after normalization. (Default is False)
batchSize_dataloader (int) – The batch size to use for the DataLoader. (Default is 8)
pinMemory_dataloader (bool) – If True, pins the memory of the DataLoader, as per PyTorch’s best practices. (Default is True)
numWorkers_dataloader (int) – The number of worker processes for data loading. (Default is -1)
persistentWorkers_dataloader (bool) – If True, uses persistent worker processes. (Default is True)
prefetchFactor_dataloader (int) – The prefetch factor for data loading. (Default is 2)
transforms (Optional[Callable]) – The transforms to use for the DataLoader. If None, the function will only scale dynamic range (to 0-1), resize (to img_size_out dimensions), and tile channels (to 3) as a minimum to pass images through the network. (Default is None)
img_size_out (Tuple[int, int]) – The image output dimensions of DataLoader if transforms is None. (Default is (224, 224))
jit_script_transforms (bool) – If True, converts the transforms pipeline into a TorchScript pipeline, potentially improving calculation speed but can cause problems with multiprocessing. (Default is False)
- Returns:
- ROI_images (np.ndarray):
The ROI images after normalization and resizing. Shape is (n_sessions, n_rois, n_channels, height, width).
- Return type:
(np.ndarray)
Example
dataloader = generate_dataloader(ROI_images)
- generate_latents() Tensor[source]
Passes the data in the dataloader through the network and generates latents.
- Returns:
- latents (torch.Tensor):
Latents for each ROI (Region of Interest).
- Return type:
(torch.Tensor)
- classmethod resize_ROIs(ROI_images: ndarray, um_per_pixel: float) ndarray[source]
Resizes the ROI (Region of Interest) images to prepare them for pass through network.
- Parameters:
ROI_images (np.ndarray) – The ROI images to resize. Array of shape (n_rois, height, width).
um_per_pixel (float) – The number of microns per pixel. This value is used to rescale the ROI images so that they occupy a standard region of the image frame.
- Returns:
- ROI_images_rs (np.ndarray):
The resized ROI images.
- Return type:
(np.ndarray)
- class roicat.ROInet.Resizer_ROI_images(function_scaleFactor: ~typing.Callable[[float, int], float] = <function Resizer_ROI_images.<lambda>>, nan_to_num: bool = True, nan_to_num_val: float = 0.0, verbose: bool = True, batch_size: int = 10000)[source]
Bases: ROICaT_Module
Class for resizing ROIs. RH 2023-2024
- Parameters:
function_scaleFactor (Callable) – The function used to convert um_per_pixel to a scale factor. (Default is lambda um_per_pixel, size_im: 1.2 * um_per_pixel * (size_im / 36)), where um_per_pixel is the number of microns per pixel and size_im is the edge length of the image.
nan_to_num (bool) – Whether to replace NaNs with a specific value. (Default is True)
nan_to_num_val (float) – The value to replace NaNs with. (Default is 0.0)
verbose (bool) – If True, print out extra information. (Default is True)
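To make the default function_scaleFactor concrete, the sketch below evaluates the documented lambda for two settings. The wrapper name `default_scale_factor` is hypothetical; only the formula itself comes from the signature above.

```python
def default_scale_factor(um_per_pixel: float, size_im: int) -> float:
    """Same formula as the documented default lambda for function_scaleFactor."""
    return 1.2 * um_per_pixel * (size_im / 36)


# 36-pixel images at 1.0 um/pixel: the scale factor is 1.2.
scale_a = default_scale_factor(um_per_pixel=1.0, size_im=36)
# 72-pixel images at 2.5 um/pixel: both terms increase the scale factor.
scale_b = default_scale_factor(um_per_pixel=2.5, size_im=72)
```

The factor grows linearly with both the pixel size in microns and the image edge length relative to 36 pixels.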
- plot_resized_comparison(ROI_images_cat: ndarray, ROI_images_rs: ndarray)[source]
Plot a comparison of the ROI sizes before and after resizing.
- Parameters:
ROI_images_cat (np.ndarray) – Array of ROIs to resize. Shape should be (nROIs, height, width).
ROI_images_rs (np.ndarray) – Array of resized ROIs. Shape should be (nROIs, height, width).
- resize_ROIs(ROI_images: ndarray, um_per_pixel: float) ndarray[source]
Resizes the ROI (Region of Interest) images to prepare them for pass through network.
- Parameters:
ROI_images (np.ndarray) – The ROI images to resize. Array of shape (n_rois, height, width).
um_per_pixel (float) – The number of microns per pixel. This value is used to rescale the ROI images so that they occupy a standard region of the image frame.
- Returns:
- ROI_images_rs (np.ndarray):
The resized ROI images.
- Return type:
(np.ndarray)
- class roicat.ROInet.ScaleDynamicRange(scaler_bounds=(0, 1), epsilon=1e-09)[source]
Bases: Module
Min-max scaling of the input tensor. RH 2021
- forward(tensor)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
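Min-max scaling as named in the ScaleDynamicRange description can be sketched on a plain list of numbers. The function `scale_dynamic_range` is a hypothetical stand-in that mirrors the scaler_bounds and epsilon arguments; placing epsilon in the denominator to avoid division by zero is an assumption about how the module uses it.

```python
from typing import List, Sequence, Tuple


def scale_dynamic_range(
    values: Sequence[float],
    scaler_bounds: Tuple[float, float] = (0, 1),
    epsilon: float = 1e-9,
) -> List[float]:
    """Hypothetical helper: min-max scale values into scaler_bounds."""
    v_min, v_max = min(values), max(values)
    # epsilon keeps the division finite when all values are equal (assumed placement).
    span = (v_max - v_min) + epsilon
    low, high = scaler_bounds
    return [low + (high - low) * (v - v_min) / span for v in values]


scaled = scale_dynamic_range([2.0, 4.0, 6.0])
constant = scale_dynamic_range([3.0, 3.0])
```

The smallest input maps to the lower bound and the largest to approximately the upper bound; a constant input maps to the lower bound.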
- class roicat.ROInet.TileChannels(dim=0, n_channels=3)[source]
Bases: Module
Expand dimension dim in X_in and tile to be N channels. RH 2021
- forward(tensor)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class roicat.ROInet.Unsqueeze(dim=0)[source]
Bases: Module
Expand dimension dim in X_in and tile to be N channels. JZ 2023
- forward(tensor)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class roicat.ROInet.dataset_simCLR(X: Tensor | array | List[float], y: Tensor | array | List[int], n_transforms: int = 2, transform: Callable | None = None, DEVICE: str = 'cpu', dtype_X: dtype = torch.float32, dtype_y: dtype = torch.int64)[source]
Bases: Dataset
- Parameters:
X (Union[torch.Tensor, np.array, List[float]]) – Images. Expected shape: (n_samples, height, width). Currently expects no channel dimension. If/when it exists, then shape should be (n_samples, n_channels, height, width).
y (Union[torch.Tensor, np.array, List[int]]) – Labels. Shape: (n_samples).
n_transforms (int) – Number of transformations to apply to each image. Should be >= 1. (Default is 2)
transform (Optional[Callable]) –
Optional transform to be applied on a sample. See torchvision.transforms for more information. Can use torch.nn.Sequential(a, bunch, of, transforms,) or other methods from torchvision.transforms.
If not None: Transform(s) are applied to each image and the output shape of X_sample_transformed for __getitem__ will be (n_samples, n_transforms, n_channels, height, width).
If None: No transform is applied and the output shape of X_sample_transformed for __getitem__ will be (n_samples, n_channels, height, width) (which is missing the n_transforms dimension).
(Default is None)
DEVICE (str) – Device on which the data will be stored and transformed. Best to leave this as 'cpu' and do .to(DEVICE) on the data for the training loop. (Default is 'cpu')
dtype_X (torch.dtype) – Data type of X. (Default is torch.float32)
dtype_y (torch.dtype) – Data type of y. (Default is torch.int64)
temp_uncetainty (float) – Temperature term applied to the CrossEntropyLoss input. (Default is 1.0 for no change)
Example
import torch
import torchvision

## images and labels are assumed to be defined already
transforms = torch.nn.Sequential(
    torchvision.transforms.RandomHorizontalFlip(p=0.5),
    torchvision.transforms.GaussianBlur(5, sigma=(0.01, 1.)),
    torchvision.transforms.RandomPerspective(
        distortion_scale=0.6,
        p=1,
        interpolation=torchvision.transforms.InterpolationMode.BILINEAR,
        fill=0,
    ),
    torchvision.transforms.RandomAffine(
        degrees=(-180, 180),
        translate=(0.4, 0.4),
        scale=(0.7, 1.7),
        shear=(-20, 20, -20, 20),
        interpolation=torchvision.transforms.InterpolationMode.BILINEAR,
        fill=0,
    ),
)
scripted_transforms = torch.jit.script(transforms)

dataset = dataset_simCLR(
    torch.tensor(images),
    labels,
    n_transforms=2,
    transform=scripted_transforms,
    DEVICE='cpu',
    dtype_X=torch.float32,
    dtype_y=torch.int64,
)

dataloader = torch.utils.data.DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    drop_last=True,
    pin_memory=False,
    num_workers=0,
)
- tile_channels(X_in: Tensor | ndarray, dim: int = -3) Tensor | ndarray[source]
Expand dimension dim in X_in and tile to be 3 channels.
- Parameters:
X_in (torch.Tensor or np.ndarray) – Input image with shape: (n_channels==1, height, width)
dim (int) – Dimension to expand. (Default is -3)
- Returns:
- X_out (torch.Tensor or np.ndarray):
Output image with shape: (n_channels==3, height, width)
- Return type:
(torch.Tensor or np.ndarray)
- roicat.ROInet.resize_affine(img: ndarray, scale: float, clamp_range: bool = False) ndarray[source]
Resizes an image using an affine transformation, scaled by a factor.
- Parameters:
img (np.ndarray) – The input image to resize. Shape: (H, W)
scale (float) – The scale factor to apply for resizing.
clamp_range (bool) – If True, the image will be clamped to the range [min(img), max(img)] to prevent interpolation from extending outside of the image’s range. (Default is False)
- Returns:
- resized_image (np.ndarray):
The resized image.
- Return type:
(np.ndarray)
- roicat.ROInet.resize_affine2(imgs: ndarray, scale: float, clamp_range: bool = False) ndarray[source]
Resizes images using an affine transformation, scaled by a factor.
- Parameters:
imgs (np.ndarray) – The input images to resize. Shape: (N, H, W)
scale (float) – The scale factor to apply for resizing.
clamp_range (bool) – If True, the image will be clamped to the range [min(img), max(img)] to prevent interpolation from extending outside of the image’s range. (Default is False)
- Returns:
- resized_image (np.ndarray):
The resized image.
- Return type:
(np.ndarray)
- roicat.ROInet.resize_images(imgs: ndarray, scale: float, clamp_range: bool = False) ndarray[source]
Resizes images using an affine transformation, scaled by a factor. Uses torch.nn.functional.grid_sample to perform the resizing.
- Parameters:
imgs (np.ndarray) – The input images to resize. Shape: (N, H, W)
scale (float) – The scale factor to apply for resizing.
clamp_range (bool) – If True, the image will be clamped to the range [min(img), max(img)] to prevent interpolation from extending outside of the image’s range. (Default is False)
- Returns:
- resized_images (np.ndarray):
The resized images. Shape: (N, H, W)
- Return type:
(np.ndarray)
roicat.data_importing module
- class roicat.data_importing.Data_caiman(paths_resultsFiles: List[str], include_discarded: bool = True, um_per_pixel: float = 1.0, out_height_width: List[int] = [36, 36], centroid_method: str = 'median', verbose: bool = True, class_labels: str | None = None)[source]
Bases: Data_roicat
Class for importing data from CaImAn output files, specifically hdf5 results files.
- Parameters:
paths_resultsFiles (List[str]) – List of paths to the results files.
include_discarded (bool) – If True, include ROIs that were discarded by CaImAn. Default is True.
um_per_pixel (Union[float, List[float]]) – Resolution in micrometers per pixel of the imaging field of view. The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
out_height_width (List[int]) – Output height and width. Default is [36, 36].
centroid_method (str) – Method for calculating the centroid of an ROI. Should be: 'centerOfMass' or 'median'.
verbose (bool) – If True, print statements will be printed. Default is True.
class_labels (str, optional) – Class labels. Default is None.
- import_FOV_images(paths_resultsFiles: List | None = None, images: List | None = None) List[ndarray][source]
Imports the FOV images from the CaImAn results files.
- Parameters:
paths_resultsFiles (Optional[List]) – List of paths to CaImAn results files. If not provided, will use the paths stored in the class instance.
images (Optional[List]) – List of FOV images. If None, the function will import the estimates.b image from the paths specified in paths_resultsFiles.
- Returns:
- FOV images (List[np.ndarray]):
List of FOV images, one per results file. Each image has shape (FOV_height, FOV_width).
- Return type:
List[np.ndarray]
- import_ROI_centeredImages(out_height_width: List[int] = [36, 36]) ndarray[source]
Imports the ROI centered images from the CaImAn results files.
- Parameters:
out_height_width (List[int]) – Height and width of the output images. Default is [36,36].
- Returns:
- ROI centered images (np.ndarray):
ROI centered images. Shape is (nROIs, out_height_width[0], out_height_width[1]).
- Return type:
(np.ndarray)
- import_cnn_caiman_preds(path_resultsFile: str | Path, include_discarded: bool = True) ndarray | None[source]
Imports the CNN-based CaImAn prediction probabilities from the given file.
- Parameters:
path_resultsFile (Union[str, pathlib.Path]) – Path to a single results file. Can be either a string or a pathlib.Path object.
include_discarded (bool) – If set to True, the function will include ROIs that were discarded by CaImAn. By default, this is set to True.
- Returns:
- preds (np.ndarray):
CNN-based CaImAn prediction probabilities.
- Return type:
(np.ndarray)
- import_overall_caiman_labels(path_resultsFile: str | Path, include_discarded: bool = True) ndarray[source]
Imports the overall CaImAn labels from the results file.
- Parameters:
path_resultsFile (Union[str, pathlib.Path]) – Path to a single results file.
include_discarded (bool) – If True, include ROIs that were discarded by CaImAn. Default is True.
- Returns:
- labels (np.ndarray):
Overall CaImAn labels.
- Return type:
(np.ndarray)
- import_spatialFootprints(path_resultsFile: str | Path, include_discarded: bool = True) csr_array[source]
Imports the spatial footprints from the results file. Note that CaImAn’s data['estimates']['A'] is similar to self.spatialFootprints, but uses ‘F’ order. This function converts this into ‘C’ order to form self.spatialFootprints.
- Parameters:
path_resultsFile (Union[str, pathlib.Path]) – Path to a single results file.
include_discarded (bool) – If True, include ROIs that were discarded by CaImAn. Default is True.
- Returns:
- Spatial footprints (scipy.sparse.csr_array):
Spatial footprints.
- Return type:
(scipy.sparse.csr_array)
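The difference between ‘F’ (column-major) and ‘C’ (row-major) flattening can be sketched without any library. The helper f_order_to_c_order below is hypothetical and is not the code roicat uses; it only shows how the flat index of each pixel changes between the two orders.

```python
from typing import List


def f_order_to_c_order(flat_f: List[float], height: int, width: int) -> List[float]:
    """Reorder a column-major ('F') flattened image into row-major ('C') order."""
    flat_c = [0.0] * (height * width)
    for k, value in enumerate(flat_f):
        row, col = k % height, k // height  # pixel position of element k in 'F' order
        flat_c[row * width + col] = value   # flat index of the same pixel in 'C' order
    return flat_c


# The 2 x 3 image [[0, 1, 2], [3, 4, 5]] flattened column by column ('F' order)
flat_f = [0, 3, 1, 4, 2, 5]
flat_c = f_order_to_c_order(flat_f, height=2, width=3)
```

After the conversion, reading flat_c row by row with the FOV width recovers the original image.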
- class roicat.data_importing.Data_roicat(verbose: bool = True)[source]
Bases: ROICaT_Module
Superclass for all data objects. Can be used as a template for creating custom data objects. RH 2022
- Parameters:
verbose (bool) – Determines whether to print status updates. (Default is True)
- type
The type of the data object. Set by the subclass.
- Type:
object
- n_sessions
The number of imaging sessions.
- Type:
int
- n_roi
The number of ROIs in each session.
- Type:
int
- n_roi_total
The total number of ROIs across all sessions.
- Type:
int
- FOV_height
The height of the field of view in pixels.
- Type:
int
- FOV_width
The width of the field of view in pixels.
- Type:
int
- FOV_images
A list of numpy arrays, each with shape (FOV_height, FOV_width). Each element represents an imaging session.
- Type:
List[np.ndarray]
- ROI_images
A list of numpy arrays, each with shape (n_roi, height, width). Each element represents an imaging session and each element of the numpy array (first dimension) is an ROI.
- Type:
List[np.ndarray]
- spatialFootprints
A list of scipy.sparse.csr_array objects, each with shape (n_roi, FOV_height * FOV_width). Each element represents an imaging session.
- Type:
List[object]
- class_labels_raw
A list of numpy arrays, each with shape (n_roi,), where each element is an integer. Each element of the list is an imaging session and each element of the numpy array is a class label.
- Type:
List[np.ndarray]
- class_labels_index
A list of numpy arrays, each with shape (n_roi,), where each element is an integer. Each element of the list is an imaging session and each element of the numpy array is the index of the class label obtained from passing the raw class label through np.unique(*, return_inverse=True).
- Type:
List[np.ndarray]
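What class_labels_index holds can be shown with a pure-Python sketch of the mapping that np.unique(labels, return_inverse=True) computes. The helper unique_inverse and the sample labels are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence, Tuple, Union

Label = Union[int, str]


def unique_inverse(labels: Sequence[Label]) -> Tuple[List[Label], List[int]]:
    """Return the sorted unique labels and, for each input label, its index
    into that sorted list."""
    unique = sorted(set(labels))
    position = {label: i for i, label in enumerate(unique)}
    return unique, [position[label] for label in labels]


raw = [3, 7, 3, 10, 7]  # hypothetical raw class labels for one session
unique, index = unique_inverse(raw)
```

Indexing unique with each entry of index reproduces the raw labels, which is why the index form is convenient for downstream classifiers.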
- um_per_pixel
The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
- Type:
Union[float, List[float]]
- session_bool
A boolean matrix with shape (n_roi_total, n_sessions). Each element is True if the ROI is present in the session.
- Type:
np.ndarray
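A minimal sketch of the session_bool layout, assuming ROIs are concatenated session by session so that each row has exactly one True entry. The helper make_session_bool is hypothetical and uses nested lists in place of a numpy array.

```python
from typing import List


def make_session_bool(n_roi: List[int]) -> List[List[bool]]:
    """Build an (n_roi_total, n_sessions) boolean matrix from per-session ROI counts."""
    n_sessions = len(n_roi)
    rows = []
    for i_session, n in enumerate(n_roi):
        for _ in range(n):
            # One row per ROI: True only in the column of its own session
            rows.append([j == i_session for j in range(n_sessions)])
    return rows


session_bool = make_session_bool([2, 3])  # two sessions with 2 and 3 ROIs
```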
- check_completeness(verbose: bool = True) None[source]
Checks which pipelines the data object is capable of running given the attributes that have been set.
- Parameters:
verbose (bool) – If True, outputs progress and error messages. (Default is True)
- get_maxIntensityProjection_spatialFootprints(sf: List[csr_array] | None = None, normalize: bool = True)[source]
Returns the maximum intensity projection of the spatial footprints.
- Parameters:
sf (List[scipy.sparse.csr_array]) – List of spatial footprints, one for each session.
normalize (bool) – If True, normalizes the [min, max] range of each ROI to [0, 1] before computing the maximum intensity projection.
- Returns:
List of maximum intensity projections, one for each session.
- Return type:
List[np.ndarray]
- import_from_dict(dict_load: Dict[str, Any]) None[source]
Imports attributes from a dictionary. This is useful if a dictionary that can be serialized was saved.
- Parameters:
dict_load (Dict[str, Any]) – Dictionary containing args to load. Format: {‘method’: [arg1, arg2, …], …}
Note
This method does not return anything. It modifies the object state by importing attributes from the provided dictionary.
- remove_rois_by_classLabel(classLabel_to_keep: int | List[int] | None = None, classLabel_to_remove: int | List[int] | None = None, in_place: bool = True, verbose: bool | None = None)[source]
Removes ROIs based on their class label. Remakes all attributes that are affected by the removal of ROIs. This includes:
spatialFootprints
ROI_images
centroids
class_labels_raw
class_labels_index
session_bool
all attributes related to above attributes
- Parameters:
classLabel_to_keep (Optional[Union[int, List[int]]]) – Class label(s) to keep. If None, classLabel_to_remove must be provided. The values should correspond to the values seen in the self.class_labels_raw attribute.
classLabel_to_remove (Optional[Union[int, List[int]]]) – Class label(s) to remove. If None, classLabel_to_keep must be provided. The values should correspond to the values seen in the self.class_labels_raw attribute.
in_place (bool) – If True, the object is modified in place. A new object is returned with the ROIs removed either way.
verbose (Optional[bool]) – Whether to print progress messages. If None, the verbosity level set in the class is used.
- Returns:
The object with the ROIs removed.
- Return type:
self (Data_roicat)
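The keep/remove selection can be sketched as a boolean mask over the raw class labels of one session. The helper rois_to_keep is hypothetical, and its handling of the case where both arguments are given is an assumption, not documented behavior of remove_rois_by_classLabel.

```python
from typing import List, Optional, Sequence


def rois_to_keep(
    labels: Sequence[int],
    keep: Optional[Sequence[int]] = None,
    remove: Optional[Sequence[int]] = None,
) -> List[bool]:
    """Boolean mask over the ROIs of one session: True where the ROI survives."""
    if keep is None and remove is None:
        raise ValueError("Provide keep or remove.")
    mask = [True] * len(labels)
    if keep is not None:
        mask = [m and (lab in keep) for m, lab in zip(mask, labels)]
    if remove is not None:
        mask = [m and (lab not in remove) for m, lab in zip(mask, labels)]
    return mask


labels = [0, 1, 2, 1, 0]  # hypothetical class_labels_raw for one session
mask_keep = rois_to_keep(labels, keep=[1])
mask_remove = rois_to_keep(labels, remove=[0, 2])
```

In this example both calls select the same ROIs, which shows that the two arguments are complementary ways of expressing one selection.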
- set_FOVHeightWidth(FOV_height: int, FOV_width: int)[source]
Sets the FOV_height and FOV_width attributes.
- Parameters:
FOV_height (int) – The height of the field of view (FOV) in pixels.
FOV_width (int) – The width of the field of view (FOV) in pixels.
- set_FOV_images(FOV_images: List[ndarray])[source]
Sets the FOV_images attribute.
- Parameters:
FOV_images (List[np.ndarray]) – List of 2D numpy.ndarray objects, one for each session. Each array should have shape (FOV_height, FOV_width).
- set_ROI_images(ROI_images: List[ndarray], um_per_pixel: List[float] | None = None) None[source]
Imports ROI images into the class. Images are expected to be formatted as a list of numpy arrays. Each element is an imaging session. Each element is a numpy array of shape (n_roi, FOV_height, FOV_width). This method will set the attributes: self.ROI_images, self.n_roi, self.n_roi_total, self.n_sessions. If any of these attributes are already set, it will verify the new values match the existing ones.
- Parameters:
ROI_images (List[np.ndarray]) – List of numpy arrays each of shape (n_roi, FOV_height, FOV_width).
um_per_pixel (Union[float, List[float]]) – The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
- set_class_labels(labels: List[ndarray | List[int | str | float]] | None = None, path_labels: str | List[str] | None = None, n_classes: int | None = None) None[source]
Imports class labels into the class.
labels are expected to be formatted as a list of arrays. The outer list should have length equal to the number of sessions (n_sessions). Each element in the list is either a 1D array or list of integers or strings and should have length equal to the number of ROIs in that session (n_roi). Each element in the array or list is the class label for the corresponding ROI and can be a number or a string. List[Union[np.ndarray, List[Union[int, str, float]]]].
- Parameters:
labels (Optional[List[Union[np.ndarray, List[Union[int, str, float]]]]]) –
If None: path_labels must be specified.
Else: labels are expected to be a list of arrays. The outer list should have length equal to the number of sessions (n_sessions). Each element in the list is either a 1D array or list of integers or strings and should have length equal to the number of ROIs in that session (n_roi). Each element in the array or list is the class label for the corresponding ROI and can be a number or a string. List[Union[np.ndarray, List[Union[int, str, float]]]].
path_labels (Optional[Union[str, List[str]]]) –
If None: labels must be specified.
Else: path_labels is expected to be a list of strings. Each element in the list is a path to a file containing the class labels. The outer list should have length equal to the number of sessions (n_sessions). Each file should be a json file containing a list of integers or strings corresponding to the class labels for each ROI in that session.
n_classes (Optional[int]) – Number of classes. If not provided, it will be inferred from the class labels. (Default is None)
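The file layout expected by path_labels (one json file per session, each holding a list of labels) can be produced with the standard library. The file names, the temporary directory, and the label values below are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical labels: two sessions with 3 and 2 ROIs
labels_per_session = [[0, 1, 1], ["cell", "dendrite"]]

dir_out = Path(tempfile.mkdtemp())
path_labels = []
for i_session, labels in enumerate(labels_per_session):
    path = dir_out / f"labels_session{i_session}.json"
    path.write_text(json.dumps(labels))  # one JSON list per session
    path_labels.append(str(path))

# Reading the files back recovers one list of labels per session
loaded = [json.loads(Path(p).read_text()) for p in path_labels]
```

The resulting path_labels list has one entry per session and could then be passed as the path_labels argument.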
- set_spatialFootprints(spatialFootprints: List[ndarray | csr_array | Dict[str, Any]], um_per_pixel: float | List[float] | None = None)[source]
Sets the spatialFootprints attribute.
- Parameters:
spatialFootprints (List[Union[np.ndarray, csr_array, Dict[str, Any]]]) –
One of the following:
List of numpy.ndarray objects, one for each session. Each array should have shape (n_ROIs, FOV_height, FOV_width).
List of scipy.sparse.csr_array objects, one for each session. Each matrix should have shape (n_ROIs, FOV_height * FOV_width). Reshaping should be done with ‘C’ indexing (standard).
List of dictionaries, one for each session. Each dictionary should be a serialized scipy.sparse.csr_array object containing the keys: ‘data’, ‘indices’, ‘indptr’, ‘shape’. See scipy.sparse.csr_array for more information.
um_per_pixel (Union[float, List[float]]) – The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
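The four keys of the dictionary form follow the standard compressed sparse row (CSR) layout. The sketch below builds them from a dense list of rows without scipy; the helper dense_to_csr_dict is hypothetical, and the exact value types roicat expects (lists versus arrays, tuple versus list for the shape) are not specified here.

```python
from typing import Any, Dict, List


def dense_to_csr_dict(rows: List[List[float]]) -> Dict[str, Any]:
    """Serialize a dense 2D list into CSR components.

    Each row is one ROI flattened in 'C' order.
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)   # nonzero value
                indices.append(col)  # its column (flat pixel index)
        indptr.append(len(data))     # where this row's entries end in data/indices
    n_cols = len(rows[0]) if rows else 0
    return {"data": data, "indices": indices, "indptr": indptr, "shape": (len(rows), n_cols)}


# Two ROIs on a hypothetical 2 x 2 FOV, each flattened to length 4
sf_dict = dense_to_csr_dict([[0.0, 0.5, 0.0, 0.5], [1.0, 0.0, 0.0, 0.0]])
```

Row i of the matrix occupies data[indptr[i]:indptr[i + 1]], so the shape is (n_ROIs, FOV_height * FOV_width) as described above.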
- transform_spatialFootprints_to_ROIImages(out_height_width: Tuple[int, int] = (36, 36)) ndarray[source]
Transforms sparse spatial footprints to dense ROI images.
- Parameters:
out_height_width (Tuple[int, int]) – Height and width of the output images. (Default is (36, 36))
- Returns:
- self.ROI_images (np.ndarray):
ROI images with shape (n_roi, out_height_width[0], out_height_width[1]).
- Return type:
(np.ndarray)
- class roicat.data_importing.Data_roiextractors(segmentation_extractor_objects: List[Any], um_per_pixel: float = 1.0, out_height_width: Tuple[int, int] = (36, 36), FOV_image_name: str | None = None, fallback_FOV_height_width: Tuple[int, int] = (512, 512), centroid_method: str = 'centerOfMass', class_labels: List[Any] | None = None, verbose: bool = True)[source]
Bases: Data_roicat
A class for importing all roiextractors supported data. This class will loop through each object and ingest data for roicat. RH, JB 2023
- Parameters:
segmentation_extractor_objects (list) – List of segmentation extractor objects. All objects must be of the same type.
um_per_pixel (Union[float, List[float]]) – Resolution in micrometers per pixel of the imaging field of view. The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
out_height_width (tuple of int, optional) – The height and width of output ROI images, specified as (y, x). Defaults to (36, 36).
FOV_image_name (str, optional) – If provided, this key will be used to extract the FOV image from the segmentation object’s self.get_images_dict() method. If None, the function will attempt to pull out a mean image. Defaults to None.
fallback_FOV_height_width (tuple of int, optional) – If the FOV images cannot be imported automatically, this will be used as the FOV height and width. Otherwise, FOV height and width are set from the first object in the list. Defaults to (512, 512).
centroid_method (str, optional) – The method for calculating the centroid of the ROI. This should be either 'centerOfMass' or 'median'. Defaults to 'centerOfMass'.
class_labels (list, optional) – A list of class labels for each object. Defaults to None.
verbose (bool, optional) – If set to True, print statements will be displayed. Defaults to True.
- class roicat.data_importing.Data_suite2p(paths_statFiles: str | Path | List[str | Path], paths_opsFiles: str | Path | List[str | Path] | None = None, um_per_pixel: float | List[float] = 1.0, new_or_old_suite2p: str = 'new', out_height_width: Tuple[int, int] = (36, 36), type_meanImg: str = 'meanImgE', FOV_images: ndarray | None = None, centroid_method: str = 'centerOfMass', class_labels: List[ndarray] | List[str] | None = None, paths_iscell: str | Path | List[str | Path] | None = None, FOV_height_width: Tuple[int, int] | None = None, verbose: bool = True)[source]
Bases: Data_roicat
Class for handling suite2p output files and data, in particular stat.npy and ops.npy files. Imports FOV images and spatial footprints, and prepares ROI images. RH 2022
- Parameters:
paths_statFiles (list of str or pathlib.Path) – List of paths to the stat.npy files. Elements should be one of: str, pathlib.Path, list of str or list of pathlib.Path.
paths_opsFiles (list of str or pathlib.Path, optional) – List of paths to the ops.npy files. Elements should be one of: str, pathlib.Path, list of str or list of pathlib.Path. Optional. Used to get FOV_images, FOV_height, FOV_width, and shifts (if old matlab ops file).
um_per_pixel (Union[float, List[float]]) – Resolution in micrometers per pixel of the imaging field of view. The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size. Should either be a float or a list of floats, one for each session.
new_or_old_suite2p (str) – Type of suite2p output files. Matlab=old, Python=new. Should be: 'new' or 'old'.
out_height_width (tuple of int) – Height and width of output ROI images. These are the little images of centered ROIs that are typically used for passing through the neural net. Unless your ROIs are larger than the default size, it’s best to just leave it as default. Should be: (int, int) (y, x).
type_meanImg (str) – Type of mean image to use. Should be: 'meanImgE' or 'meanImg'.
FOV_images (np.ndarray, optional) – FOV images. Array of shape (n_sessions, FOV_height, FOV_width). Optional.
centroid_method (str) – Method for calculating the centroid of an ROI. Should be: 'centerOfMass' or 'median'.
class_labels ((list of np.ndarray) or (list of str to paths) or None) – Optional. If None, class labels are not set. If list of np.ndarray, each element should be a 1D integer array of length n_roi specifying the class label for each ROI. If list of str, each element should be a path to a .npy file containing an array of length n_roi specifying the class label for each ROI.
paths_iscell (str or pathlib.Path or list of str or list of pathlib.Path) – Optional. Paths to the iscell.npy files. Elements should be one of: str, pathlib.Path, list of str or list of pathlib.Path. If provided, the iscell.npy files are used to set the class labels. If not provided, the class labels are set to None. An iscell.npy file is assumed to be a 2D numpy array of shape (n_ROIs, (iscell boolean, probability float))
FOV_height_width (tuple of int, optional) – FOV height and width. If None, paths_opsFiles must be provided to get FOV height and width.
verbose (bool) – If True, prints results from each function.
- import_FOV_images(type_meanImg: str = 'meanImgE') List[ndarray][source]
Imports the FOV images from ops files or user defined image arrays.
- Parameters:
type_meanImg (str) –
Type of the mean image. References the key in the ops.npy file. Options are:
'meanImgE': Enhanced mean image.
'meanImg': Mean image.
- Returns:
List of FOV images. Length of the list is the same as self.paths_files. Each element is a numpy.ndarray of shape (n_files, height, width).
- Return type:
FOV_images (List[np.ndarray])
- import_neuropil_masks(frame_height_width: List[int] | Tuple[int, int] | None = None) List[csr_array][source]
Imports and converts the neuropil masks of the ROIs in the stat files into images in sparse arrays.
- Parameters:
frame_height_width (Optional[Union[List[int], Tuple[int, int]]]) – The height and width of the frame in the form [height, width]. If None, the height and width will be taken from the FOV images. (Default is None)
- Returns:
- neuropilMasks (List[scipy.sparse.csr_array]):
List of neuropil masks. Length of the list is the same as self.paths_stat. Each element is a sparse array of shape (n_roi, frame_height, frame_width).
- Return type:
(List[scipy.sparse.csr_array])
- import_spatialFootprints(frame_height_width: Tuple[int, int] | None = None, dtype: dtype = <class 'numpy.float32'>) List[csr_array][source]
Imports and converts the spatial footprints of the ROIs in the stat files into images in sparse arrays.
Generates self.session_bool which is a bool np.ndarray of shape (n_roi, n_sessions) indicating which session each ROI belongs to.
- Parameters:
frame_height_width (Optional[Union[List[int], Tuple[int, int]]]) – The height and width of the frame in the form [height, width]. If None, self.import_FOV_images must be called before this method, and the frame height and width will be taken from the first FOV image. (Default is None)
dtype (np.dtype) – Data type of the sparse array. (Default is np.float32)
- Returns:
- sf (List[scipy.sparse.csr_array]):
Spatial footprints. Length of the list is the same as self.paths_files. Each element is a scipy.sparse.csr_array of shape (n_roi, frame_height * frame_width).
- Return type:
(List[scipy.sparse.csr_array])
- roicat.data_importing.fix_paths(paths: List[str | Path] | str | Path) List[str][source]
Ensures the input paths are a list of strings.
- Parameters:
paths (Union[List[Union[str, pathlib.Path]], str, pathlib.Path]) – The input can be either a list of strings or pathlib.Path objects, or a single string or pathlib.Path object.
- Returns:
A list of strings representing the paths.
- Return type:
List[str]
- Raises:
TypeError – If the input isn’t a list of str or pathlib.Path objects, a single str, or a pathlib.Path object.
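The documented contract of fix_paths can be sketched as follows. The function fix_paths_sketch is a hypothetical re-implementation written from the description above, not the library's source, and its error message is invented for illustration.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def fix_paths_sketch(paths: Union[PathLike, List[PathLike]]) -> List[str]:
    """Normalize a single path or a list of paths into a list of strings."""
    if isinstance(paths, (str, Path)):
        return [str(paths)]
    if isinstance(paths, list) and all(isinstance(p, (str, Path)) for p in paths):
        return [str(p) for p in paths]
    raise TypeError("paths must be a str, a pathlib.Path, or a list of those.")


single = fix_paths_sketch("stat.npy")
several = fix_paths_sketch([Path("a") / "stat.npy", "b/stat.npy"])
```

Wrapping a single path in a list lets the rest of a pipeline iterate over sessions without special-casing the one-session case.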
- roicat.data_importing.make_smaller_data(data: Data_roicat, n_ROIs: int | None = 300, n_sessions: int | None = 10, bounds_x: Tuple[int, int] = (200, 400), bounds_y: Tuple[int, int] = (200, 400)) Data_roicat[source]
Reduces the size of a Data_roicat object by limiting the number of regions of interest (ROIs) and sessions, and adjusting the bounds on the x and y axes. This function is useful for making test datasets.
- Parameters:
data (Data_roicat) – The input data object of the Data_roicat type.
n_ROIs (Optional[int]) – The number of regions of interest to include in the output data. If None, all ROIs will be included.
n_sessions (Optional[int]) – The number of sessions to include in the output data. If None, all sessions will be included.
bounds_x (Tuple[int, int]) – The x-axis bounds for the output data. The bounds should be a tuple of two integers.
bounds_y (Tuple[int, int]) – The y-axis bounds for the output data. The bounds should be a tuple of two integers.
- Returns:
- data_out (Data_roicat):
The output data, which is a reduced version of the input data according to the specified parameters.
- Return type:
Data_roicat
roicat.helpers module
- class roicat.helpers.Convergence_checker_optuna(n_patience: int = 10, tol_frac: float = 0.05, max_trials: int = 350, max_duration: float = 600, value_stop: float | None = None, verbose: bool = True)[source]
Bases: object
Checks if the optuna optimization has converged. RH 2023
- Parameters:
n_patience (int) – Number of trials to look back to check for convergence. Also the minimum number of trials that must be completed before starting to check for convergence. (Default is 10)
tol_frac (float) – Fractional tolerance for convergence. The best output value must change by less than this fractional amount to be considered converged. (Default is 0.05)
max_trials (int) – Maximum number of trials to run before stopping. (Default is 350)
max_duration (float) – Maximum number of seconds to run before stopping. (Default is 600)
value_stop (Optional[float]) – Value at which to stop the optimization. If the best value is equal to or less than this value, the optimization will stop. (Default is None)
verbose (bool) – If True, print messages. (Default is True)
- bests
List to hold the best values obtained in the trials.
- Type:
List[float]
- best
Best value obtained among the trials. Initialized with infinity.
- Type:
float
Example
    from roicat.helpers import Convergence_checker_optuna

    # Create a Convergence_checker_optuna instance
    convergence_checker = Convergence_checker_optuna(
        n_patience=15,
        tol_frac=0.01,
        max_trials=500,
        max_duration=60*20,
        verbose=True,
    )
    # Assume we have study and objective objects from optuna
    # Use the check method in the callback
    study.optimize(objective, n_trials=100, callbacks=[convergence_checker.check])
- class roicat.helpers.Convolver_1d(kernel: ndarray | object, length_x: int | None = None, dtype: object = torch.float32, pad_mode: str = 'same', correct_edge_effects: bool = True, device: str = 'cpu')[source]
Bases: object
Class for 1D convolution. Uses torch.nn.functional.conv1d. Stores the convolution and edge correction kernels for repeated use. RH 2023
- pad_mode
Mode for padding. See torch.nn.functional.conv1d for details.
- Type:
str
- dtype
Data type for the convolution. Default is torch.float32.
- Type:
object
- kernel
Convolution kernel as a tensor.
- Type:
object
- trace_correction
Kernel for edge correction.
- Type:
object
- Parameters:
kernel (Union[np.ndarray, object]) – 1D array to convolve with. The array can be a numpy array or a tensor.
length_x (Optional[int]) – Length of the array to be convolved. Must not be None if pad_mode is not ‘valid’. (Default is None)
dtype (object) – Data type to use for the convolution. (Default is torch.float32)
pad_mode (str) – Mode for padding. See torch.nn.functional.conv1d for details. (Default is ‘same’)
correct_edge_effects (bool) – Whether or not to correct for edge effects. (Default is True)
device (str) – Device to use for computation. (Default is ‘cpu’)
- convolve(arr: ndarray | Tensor) ndarray | Tensor[source]
Convolve array with kernel.
- Parameters:
arr (Union[np.ndarray, torch.Tensor]) – Array to convolve. Convolution performed along the last axis. Must be 1D, 2D, or 3D.
- Returns:
- out (Union[np.ndarray, torch.Tensor]):
The output tensor after performing convolution and correcting for edge effects.
- Return type:
(Union[np.ndarray, torch.Tensor])
Example
    # length_x is required because the default pad_mode is 'same'
    convolver = Convolver_1d(kernel=my_kernel, length_x=my_array.shape[-1])
    result = convolver.convolve(my_array)
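One common way to correct edge effects in a zero-padded ‘same’ convolution is to divide the result by the kernel's response to an all-ones signal. The pure-Python sketch below illustrates that idea; whether Convolver_1d computes its trace_correction exactly this way is an assumption, and the function names are hypothetical.

```python
from typing import List


def convolve_same(x: List[float], kernel: List[float]) -> List[float]:
    """Zero-padded 'same' convolution of x with an odd-length kernel."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for j, k in enumerate(reversed(kernel)):  # flip the kernel: true convolution
            idx = i + j - half
            if 0 <= idx < len(x):                 # samples outside x count as zero
                total += x[idx] * k
        out.append(total)
    return out


def convolve_edge_corrected(x: List[float], kernel: List[float]) -> List[float]:
    """Divide by the kernel's response to ones, which is smaller near the edges."""
    raw = convolve_same(x, kernel)
    correction = convolve_same([1.0] * len(x), kernel)
    return [r / c for r, c in zip(raw, correction)]


kernel = [1 / 3, 1 / 3, 1 / 3]  # boxcar smoothing kernel that sums to 1
signal = [2.0] * 6              # constant signal
raw = convolve_same(signal, kernel)
corrected = convolve_edge_corrected(signal, kernel)
```

Without the correction the first and last samples of a constant signal are attenuated by the zero padding; with it the output stays constant. The division also rescales by the kernel sum, so the sketch assumes a kernel that sums to 1.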
- class roicat.helpers.Equivalence_checker(kwargs_allclose: dict | None = {'equal_nan': True, 'rtol': 1e-07}, assert_mode=False, verbose=False)[source]
Bases: object
Class for checking if all items are equivalent or allclose (almost equal) in two complex data structures. Can check nested lists, dicts, and other data structures. Can also optionally assert (raise errors) if all items are not equivalent. RH 2023
- _kwargs_allclose
Keyword arguments for the numpy.allclose function.
- Type:
Optional[dict]
- _assert_mode
Whether to raise an assertion error if items are not close.
- Type:
bool
- Parameters:
kwargs_allclose (Optional[dict]) – Keyword arguments for the numpy.allclose function. (Default is {'rtol': 1e-7, 'equal_nan': True})
assert_mode (bool) – Whether to raise an assertion error if items are not close.
verbose (bool) – How much information to print out:
False / 0: No information printed out.
True / 1: Mismatched items only.
2: All items printed out.
- class roicat.helpers.ImageAlignmentChecker(hw: Tuple[int, int], radius_in: float | Tuple[float, float], radius_out: float | Tuple[float, float], order: int = 5, device: str = 'cpu')[source]
Bases: object
Class to check the alignment of images using phase correlation. RH 2024
- Parameters:
hw (Tuple[int, int]) – Height and width of the images.
radius_in (Union[float, Tuple[float, float]]) – Radius of the pixel shift / offset that can be considered as ‘aligned’. Used to create the ‘in’ filter which is an image of a small centered circle that is used as a filter and multiplied by the phase correlation images. If a single value is provided, the filter will be a circle with radius 0 to that value; it will be converted to a tuple representing a bandpass filter (0, radius_in).
radius_out (Union[float, Tuple[float, float]]) – Similar to radius_in, but for the ‘out’ filter, which defines the ‘null distribution’ for defining what is ‘aligned’. Should be a value larger than the expected maximum pixel shift / offset. If a single value is provided, the filter will be a donut / torus starting at that value and ending at the edge of the smallest dimension of the image; it will be converted to a tuple representing a bandpass filter (radius_out, min(hw)).
order (int) – Order of the Butterworth bandpass filters used to define the ‘in’ and ‘out’ filters. Larger values will result in sharper edges, but values higher than 5 can lead to collapse of the filter.
device (str) – Torch device to use for computations. (Default is ‘cpu’)
- hw
Height and width of the images.
- Type:
Tuple[int, int]
- order
Order of the butterworth bandpass filters used to define the ‘in’ and ‘out’ filters.
- Type:
int
- device
Torch device to use for computations.
- Type:
str
- filt_in
The ‘in’ filter used for scoring the alignment.
- Type:
torch.Tensor
- filt_out
The ‘out’ filter used for scoring the alignment.
- Type:
torch.Tensor
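The conversion of a single radius into a bandpass tuple, as described for radius_in and radius_out, can be sketched as below. The helper names are hypothetical; only the (0, radius_in) and (radius_out, min(hw)) rules come from the parameter descriptions.

```python
from typing import Tuple, Union

Radius = Union[float, Tuple[float, float]]


def radius_in_to_band(radius_in: Radius) -> Tuple[float, float]:
    """A single value becomes the band (0, radius_in); a tuple is kept as is."""
    if isinstance(radius_in, (int, float)):
        return (0.0, float(radius_in))
    return (float(radius_in[0]), float(radius_in[1]))


def radius_out_to_band(radius_out: Radius, hw: Tuple[int, int]) -> Tuple[float, float]:
    """A single value becomes the band (radius_out, min(hw)); a tuple is kept as is."""
    if isinstance(radius_out, (int, float)):
        return (float(radius_out), float(min(hw)))
    return (float(radius_out[0]), float(radius_out[1]))


band_in = radius_in_to_band(4)
band_out = radius_out_to_band(20, hw=(512, 256))
```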
- score_alignment(images: ndarray | Tensor, images_ref: ndarray | Tensor | None = None)[source]
Score the alignment of a set of images using phase correlation. Computes the stats of the center (‘in’) of the phase correlation image over the stats of the outer region (‘out’) of the phase correlation image. RH 2024
- Parameters:
images (Union[np.ndarray, torch.Tensor]) – A 3D array of images. Shape: (n_images, height, width)
images_ref (Optional[Union[np.ndarray, torch.Tensor]]) – Reference images to compare against. If provided, the images will be compared against these images. If not provided, the images will be compared against themselves. (Default is None)
- Returns:
Dictionary containing the following keys:
- ‘mean_out’:
Mean of the phase correlation image weighted by the ‘out’ filter
- ‘mean_in’:
Mean of the phase correlation image weighted by the ‘in’ filter
- ‘ptile95_out’:
95th percentile of the phase correlation image multiplied by the ‘out’ filter
- ‘max_in’:
Maximum value of the phase correlation image multiplied by the ‘in’ filter
- ‘std_out’:
Standard deviation of the phase correlation image weighted by the ‘out’ filter
- ‘std_in’:
Standard deviation of the phase correlation image weighted by the ‘in’ filter
- ‘max_diff’:
Difference between the ‘max_in’ and ‘ptile95_out’ values
- ‘z_in’:
max_diff divided by the ‘std_out’ value
- ‘r_in’:
max_diff divided by the ‘ptile95_out’ value
- Return type:
(Dict)
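The three derived scores follow directly from the definitions listed above. The sketch below recomputes them from hypothetical input statistics; the comparison of z_in against a threshold of 4.0 mirrors the default z_threshold of the Aligner class, and the use of a strict inequality is an assumption.

```python
from typing import Dict


def derived_alignment_scores(max_in: float, ptile95_out: float, std_out: float) -> Dict[str, float]:
    """Compute max_diff, z_in, and r_in from the raw 'in' and 'out' statistics."""
    max_diff = max_in - ptile95_out
    return {
        "max_diff": max_diff,
        "z_in": max_diff / std_out,      # peak height in units of the 'out' spread
        "r_in": max_diff / ptile95_out,  # peak height relative to the 'out' level
    }


# Hypothetical statistics for one image pair
scores = derived_alignment_scores(max_in=0.9, ptile95_out=0.1, std_out=0.05)
is_aligned = scores["z_in"] > 4.0
```

A sharp central peak in the phase correlation image gives a large z_in, while a misaligned pair leaves max_in close to the ‘out’ level and z_in near zero.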
- class roicat.helpers.ImageLabeler(image_array: ndarray, start_index: int = 0, path_csv: str | None = None, save_csv: bool = True, resize_factor: float = 10.0, normalize_images: bool = True, verbose: bool = True, key_end: str = 'Escape', key_prev: str = 'Left', key_next: str = 'Right')[source]
Bases: object
A simple graphical interface for labeling image classes. Use this class with a context manager to ensure the window is closed properly. The class provides a tkinter window which displays images from a provided numpy array one by one and lets you classify each image by pressing a key. The title of the window is the image index. The classification label and image index are stored as the self.labels_ attribute and saved to a CSV file in self.path_csv. RH 2023
- Parameters:
image_array (np.ndarray) – A numpy array of images. Either 3D: (n_images, height, width) or 4D: (n_images, height, width, n_channels). Images should be scaled between 0 and 255 and will be converted to uint8.
start_index (int) – The index of the first image to display. (Default is 0)
path_csv (str) – Path to the CSV file for saving results. If None, results will not be saved.
save_csv (bool) – Whether to save the results to a CSV. (Default is True)
resize_factor (float) – A scaling factor for the fractional change in image size. (Default is 10.0)
normalize_images (bool) – Whether to normalize the images between min and max values. (Default is True)
verbose (bool) – Whether to print status updates. (Default is True)
key_end (str) – Key to press to end the session. (Default is 'Escape')
key_prev (str) – Key to press to go back to the previous image. (Default is 'Left')
key_next (str) – Key to press to go to the next image. (Default is 'Right')
Example
    with ImageLabeler(images, start_index=0, resize_factor=4.0, key_end='Escape') as labeler:
        labeler.run()
    path_csv, labels = labeler.path_csv, labeler.labels_
- image_array
A numpy array of images. Either 3D: (n_images, height, width) or 4D: (n_images, height, width, n_channels). Images should be scaled between 0 and 255 and will be converted to uint8.
- Type:
np.ndarray
- start_index
The index of the first image to display. (Default is 0)
- Type:
int
- path_csv
Path to the CSV file for saving results. If None, results will not be saved.
- Type:
str
- save_csv
Whether to save the results to a CSV. (Default is True)
- Type:
bool
- resize_factor
A scaling factor for the fractional change in image size. (Default is 10.0)
- Type:
float
- normalize_images
Whether to normalize the images between min and max values. (Default is True)
- Type:
bool
- verbose
Whether to print status updates. (Default is True)
- Type:
bool
- key_end
Key to press to end the session. (Default is 'Escape')
- Type:
str
- key_prev
Key to press to go back to the previous image. (Default is 'Left')
- Type:
str
- key_next
Key to press to go to the next image. (Default is 'Right')
- Type:
str
- labels_
A list of tuples containing the image index and classification label for each image. The list is saved to a CSV file in self.path_csv.
- Type:
list
- classify(event)[source]
Adds the current image index and pressed key as a label. Then saves the results and moves to the next image.
- Parameters:
event (tkinter.Event) – A tkinter event object.
- get_labels(kind: str = 'dict') dict | List[Tuple[int, str]][source]
Returns the labels. The format of the output is determined by the kind parameter. If the labels dictionary is empty, returns None. RH 2023
- Parameters:
kind (str) –
The type of object to return. (Default is 'dict')
'dict': {idx: label, idx: label, …}
'list': [label, label, …] where the index is the image index and unlabeled images are represented as 'None'.
'dataframe': {'index': [idx, idx, …], 'label': [label, label, …]} This can be converted to a pandas dataframe with: pd.DataFrame(self.get_labels('dataframe'))
- Returns:
Depending on the kind parameter, it returns either:
- dict:
A dictionary where keys are the image indices and values are the labels.
- List[Tuple[int, str]]:
A list of tuples, where each tuple contains an image index and a label.
- dict:
A dictionary with keys ‘index’ and ‘label’ where values are lists of indices and labels respectively.
- Return type:
(Union[dict, List[Tuple[int, str]], dict])
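The three kind formats described above can be illustrated with a small standalone sketch. It does not call ROICaT: format_labels and n_images are hypothetical names, and the sorting of indices in the 'dataframe' form is an assumption made for readability.

```python
from typing import Dict, List, Optional, Union


def format_labels(
    labels: Dict[int, str],
    n_images: int,
    kind: str = "dict",
) -> Optional[Union[Dict, List[str]]]:
    """Hypothetical helper mimicking the documented output formats."""
    if len(labels) == 0:
        return None  # documented behaviour: empty labels give None
    if kind == "dict":
        return dict(labels)
    if kind == "list":
        # One entry per image; unlabeled images become the string 'None'.
        return [labels.get(idx, "None") for idx in range(n_images)]
    if kind == "dataframe":
        idx_sorted = sorted(labels)  # assumption: indices in ascending order
        return {"index": idx_sorted, "label": [labels[i] for i in idx_sorted]}
    raise ValueError(f"Unknown kind: {kind!r}")


labels = {0: "a", 2: "b"}
as_dict = format_labels(labels, n_images=4, kind="dict")
as_list = format_labels(labels, n_images=4, kind="list")
as_frame = format_labels(labels, n_images=4, kind="dataframe")
```

The 'dataframe' form is a plain dictionary of columns, which is why it can be passed directly to pd.DataFrame.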
- class roicat.helpers.IntegratedLabeler(images: ndarray, embeddings: ndarray = None, idx_images_overlay: ndarray | None = None, idx_selection: List[int] | None = None, figsize: float = 5, size_images_overlay: float | None = None, crop_images_overlay: float | None = 0.35, frac_overlap_allowed: float = 0.5, image_overlay_raster_size: Tuple[int, int] = (1000, 1000), alpha_points: float = 0.5, size_points: float = 20, normalize_images: bool = True, verbose: bool = True, path_csv: str | None = None, save_csv: bool = True, key_end: str = 'Escape', key_prev: str = 'Left', key_next: str = 'Right')[source]
Bases: object
A graphical interface for labeling image classes. The left panel displays a sequence of images, which can be labelled by pressing keys. The right panel is a scatterplot of an embedding of each image, with the option to overlay images on the scatterplot. The user can use a lasso tool to select points on the scatterplot, and the selected points are then shown in the left panel for labelling. The title of the window is the current image index. The overlays can be toggled by pressing Control-Shift-T. The classification label and image index are stored as the self.labels_ attribute and saved to a CSV file in self.path_csv.
- Parameters:
images (np.ndarray) – A numpy array of images. Either 3D: (n_images, height, width) or 4D: (n_images, height, width, n_channels). Images should be scaled between 0 and 255 and will be converted to uint8.
embeddings (np.ndarray) – A numpy array of embeddings for each image. Should be shape (n_images, 2).
idx_images_overlay (np.ndarray) – A numpy array of indices of images to overlay on the scatterplot.
idx_selection (List[int]) – A list of indices to select from the image array. If None, all images will be selected. (Default is None)
figsize (float) – The size of each panel in the figure (width and height). (Default is 5)
size_images_overlay (Optional[float]) – The size of the images to overlay. If None, the size is calculated based on nearest neighbors. (Default is None)
crop_images_overlay (Optional[float]) – The fraction of the image to crop on each side. (Default is 0.35)
frac_overlap_allowed (float) – The fraction of overlap allowed between images. (Default is 0.5)
image_overlay_raster_size (Tuple[int, int]) – The size of the raster for the composite overlay. (Default is (1000, 1000))
alpha_points (float) – The transparency of the scatterplot points. (Default is 0.5)
size_points (float) – The size of the scatterplot points. (Default is 20)
normalize_images (bool) – Whether to normalize the images between min and max values. (Default is True)
verbose (bool) – Whether to print status updates. (Default is True)
path_csv (Optional[str]) – Path to the CSV file for saving results. If None, results will not be saved.
save_csv (bool) – Whether to save the results to a CSV. (Default is True)
key_end (str) – Key to press to end the session. (Default is 'Escape')
key_prev (str) – Key to press to go back to the previous image. (Default is 'Left')
key_next (str) – Key to press to go to the next image. (Default is 'Right')
Example
- path_csv
Path to the CSV file for saving results. If None, results will not be saved.
- Type:
str
- save_csv
Whether to save the results to a CSV. (Default is True)
- Type:
bool
- labels_
A list of tuples containing the image index and classification label for each image. The list is saved to a CSV file in self.path_csv.
- Type:
list
- classify(event)[source]
Adds the current image index and pressed key as a label. Then saves the results and moves to the next image.
- Parameters:
event (tkinter.Event) – A tkinter event object.
- get_labels(kind: str = 'dict') dict | List[Tuple[int, str]][source]
Returns the labels. The format of the output is determined by the kind parameter. If the labels dictionary is empty, returns None. RH 2023
- Parameters:
kind (str) –
The type of object to return. (Default is 'dict')
'dict': {idx: label, idx: label, …}
'list': [label, label, …] where the index is the image index and unlabeled images are represented as 'None'.
'dataframe': {'index': [idx, idx, …], 'label': [label, label, …]} This can be converted to a pandas dataframe with: pd.DataFrame(self.get_labels('dataframe'))
- Returns:
Depending on the kind parameter, it returns either:
- dict:
A dictionary where keys are the image indices and values are the labels.
- List[Tuple[int, str]]:
A list of tuples, where each tuple contains an image index and a label.
- dict:
A dictionary with keys ‘index’ and ‘label’ where values are lists of indices and labels respectively.
- Return type:
(Union[dict, List[Tuple[int, str]], dict])
- save_classification()[source]
Saves the classification results to a CSV file. This function does not append, it overwrites the entire file. The file contains two columns: ‘image_index’ and ‘label’.
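The overwrite behaviour can be sketched with the standard csv module. This is not the ROICaT implementation: save_labels_csv is a hypothetical helper, and writing the two column names as a header row is an assumption.

```python
import csv
import os
import tempfile
from typing import List, Tuple


def save_labels_csv(path_csv: str, labels: List[Tuple[int, str]]) -> None:
    """Hypothetical helper: overwrite path_csv with two columns."""
    # Mode 'w' truncates the file, so earlier contents are replaced.
    with open(path_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image_index", "label"])
        writer.writerows(labels)


path_demo = os.path.join(tempfile.mkdtemp(), "labels.csv")
save_labels_csv(path_demo, [(0, "a"), (1, "b")])
save_labels_csv(path_demo, [(0, "c")])  # second call replaces the first

with open(path_demo, newline="") as f:
    rows = list(csv.reader(f))
```

Because each call rewrites the whole file, the file always reflects the most recent set of labels rather than a history of edits.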
- update_selection(idx_selection: List[int])[source]
Updates the selection of images to classify. The selection is a list of indices to select from the image array. Will show the first image in the new selection.
- Parameters:
idx_selection (List[int]) – A list of indices to select from the image array.
- class roicat.helpers.OptunaProgressBar(n_trials: int | None = None, timeout: float | None = None, **tqdm_kwargs: Any)[source]
Bases: object
A customizable progress bar for Optuna’s study.optimize().
- Parameters:
n_trials (int, optional) – The number of trials. Exactly one of n_trials or timeout must be set.
timeout (float, optional) – The maximum time to run in seconds. Exactly one of n_trials or timeout must be set.
tqdm_kwargs (dict, optional) –
Additional keyword arguments to pass to tqdm. These will override the default values EXCEPT for the following kwargs, which will defer to the environment variables:
disable
dynamic_ncols
- exception roicat.helpers.ParallelExecutionError(index, original_exception)[source]
Bases: Exception
Exception class for errors that occur during parallel execution. Intended to be used with the map_parallel function. RH 2023
- index
Index of the job that failed.
- Type:
int
- original_exception
The original exception that was raised.
- Type:
Exception
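The pattern of an exception that records which job failed can be sketched as follows. JobError and run_jobs are hypothetical stand-ins written for illustration; they are not ParallelExecutionError or map_parallel, whose internals are not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


class JobError(Exception):
    """Hypothetical stand-in for an exception that records the failing job."""

    def __init__(self, index: int, original_exception: Exception):
        super().__init__(f"Job {index} raised {original_exception!r}")
        self.index = index
        self.original_exception = original_exception


def run_jobs(func: Callable[[Any], Any], args: List[Any]) -> List[Any]:
    """Run func over args in threads, wrapping any failure in JobError."""

    def wrapped(item):
        index, arg = item
        try:
            return func(arg)
        except Exception as e:
            raise JobError(index, e) from e

    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(wrapped, enumerate(args)))


try:
    run_jobs(lambda x: 1 / x, [1, 0, 2])
    caught = None
except JobError as e:
    caught = e
```

Carrying the index alongside the original exception lets the caller identify the failing input without parsing a traceback.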
- class roicat.helpers.Toeplitz_convolution2d(x_shape: Tuple[int, int], k: ndarray, mode: str = 'same', dtype: dtype | None = None)[source]
Bases: object
Convolve a 2D array with a 2D kernel using the Toeplitz matrix multiplication method. This class is ideal when ‘x’ is very sparse (density<0.01), ‘x’ is small (shape <(1000,1000)), ‘k’ is small (shape <(100,100)), and the batch size is large (e.g. 1000+). Generally, it is faster than scipy.signal.convolve2d when convolving multiple arrays with the same kernel. It maintains a low memory footprint by storing the Toeplitz matrix as a sparse matrix. RH 2022
- x_shape
The shape of the 2D array to be convolved.
- Type:
Tuple[int, int]
- k
2D kernel to convolve with.
- Type:
np.ndarray
- mode
Either 'full', 'same', or 'valid'. See scipy.signal.convolve2d for details.
- Type:
str
- dtype
The data type to use for the Toeplitz matrix. If None, then the data type of the kernel is used.
- Type:
Optional[np.dtype]
- Parameters:
x_shape (Tuple[int, int]) – The shape of the 2D array to be convolved.
k (np.ndarray) – 2D kernel to convolve with.
mode (str) – Convolution method to use, either 'full', 'same', or 'valid'. See scipy.signal.convolve2d for details. (Default is 'same')
dtype (Optional[np.dtype]) – The data type to use for the Toeplitz matrix. Ideally, this matches the data type of the input array. If None, then the data type of the kernel is used. (Default is None)
Example
# create Toeplitz_convolution2d object
toeplitz_convolution2d = Toeplitz_convolution2d(
    x_shape=(100,30),
    k=np.random.rand(10,10),
    mode='same',
)
toeplitz_convolution2d(
    x=scipy.sparse.csr_array(np.random.rand(5,3000)),
    batch_size=True,
)
- roicat.helpers.add_text_to_images(images: array, text: List[List[str]], position: Tuple[int, int] = (10, 10), font_size: int = 1, color: Tuple[int, int, int] = (255, 255, 255), line_width: int = 1, font: str | None = None, frameRate: int = 30) array[source]
Adds text to images using cv2.putText(). RH 2022
- Parameters:
images (np.array) – Frames of video or images. Shape: (n_frames, height, width, n_channels).
text (list of lists) – Text to add to images. The outer list: one element per frame. The inner list: each element is a line of text.
position (tuple) – (x, y) position of the text (top left corner). (Default is (10,10))
font_size (int) – Font size of the text. (Default is 1)
color (tuple) – (r, g, b) color of the text. (Default is (255,255,255))
line_width (int) – Line width of the text. (Default is 1)
font (str) – Font to use. If None, then will use cv2.FONT_HERSHEY_SIMPLEX. See cv2.FONT... for more options. (Default is None)
frameRate (int) – Frame rate of the video. (Default is 30)
- Returns:
- images_with_text (np.array):
Frames of video or images with text added.
- Return type:
(np.array)
- roicat.helpers.apply_warp_transform(im_in: ndarray, warp_matrix: ndarray, interpolation_method: int = 1, borderMode: int = 0, borderValue: int = 0) ndarray[source]
Apply a warp transform to an image. Wrapper function for cv2.warpAffine and cv2.warpPerspective. RH 2022
- Parameters:
im_in (np.ndarray) – Input image with any dimensions.
warp_matrix (np.ndarray) – Warp matrix. Shape should be (2, 3) for affine transformations, and (3, 3) for homography. See cv2.findTransformECC for more info.
interpolation_method (int) – Interpolation method. See cv2.warpAffine for more info. (Default is cv2.INTER_LINEAR, which equals 1)
borderMode (int) – Border mode. Determines how to handle pixels from outside the image boundaries. See cv2.warpAffine for more info. (Default is cv2.BORDER_CONSTANT, which equals 0)
borderValue (int) – Value to use for border pixels if borderMode is set to cv2.BORDER_CONSTANT. (Default is 0)
- Returns:
- im_out (np.ndarray):
Transformed output image with the same dimensions as the input image.
- Return type:
(np.ndarray)
- roicat.helpers.bounded_logspace(start: float, stop: float, num: int) ndarray[source]
Creates a logarithmically spaced array, similar to np.logspace, but with a defined start and stop. RH 2022
- Parameters:
start (float) – The first value in the output array.
stop (float) – The last value in the output array.
num (int) – The number of values in the output array.
- Returns:
- out (np.ndarray):
An array of logarithmically spaced values between start and stop.
- Return type:
(np.ndarray)
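The difference from np.logspace is that start and stop are the actual first and last values rather than exponents. A minimal pure-Python sketch of that idea, assuming positive start and stop, is shown below; bounded_logspace_sketch is a hypothetical name and returns a list rather than an array.

```python
import math
from typing import List


def bounded_logspace_sketch(start: float, stop: float, num: int) -> List[float]:
    """Hypothetical analogue: num log-spaced values from start to stop."""
    if num == 1:
        return [float(start)]
    log_start, log_stop = math.log(start), math.log(stop)
    step = (log_stop - log_start) / (num - 1)
    # Evenly spaced in log space, then mapped back with exp.
    return [math.exp(log_start + i * step) for i in range(num)]


values = bounded_logspace_sketch(1.0, 1000.0, 4)
```

Here values is approximately [1, 10, 100, 1000], up to floating-point rounding.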
- roicat.helpers.check_keys_subset(d, default_dict, hierarchy=['defaults'])[source]
Checks that the keys in d are all in default_dict. Raises an error if not. RH 2023
- Parameters:
d (Dict) – Dictionary to check.
default_dict (Dict) – Dictionary containing the keys to check against.
hierarchy (List[str]) – Used internally for recursion. Hierarchy of keys to d.
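A recursive key check of this kind might look like the sketch below. It is an illustration only: check_keys_subset_sketch is a hypothetical name, and raising KeyError is an assumption, since the description above says only that an error is raised.

```python
from typing import Dict, List, Optional


def check_keys_subset_sketch(
    d: Dict, default_dict: Dict, hierarchy: Optional[List[str]] = None
) -> None:
    """Hypothetical analogue: raise if d has a key missing from default_dict."""
    hierarchy = ["defaults"] if hierarchy is None else hierarchy
    for key, value in d.items():
        if key not in default_dict:
            raise KeyError(f"Unknown key {key!r} at {' > '.join(hierarchy)}")
        # Recurse only where both sides are dictionaries.
        if isinstance(value, dict) and isinstance(default_dict[key], dict):
            check_keys_subset_sketch(value, default_dict[key], hierarchy + [key])


defaults = {"a": 1, "nested": {"b": 2, "c": 3}}
check_keys_subset_sketch({"nested": {"b": 5}}, defaults)  # passes silently

try:
    check_keys_subset_sketch({"nested": {"typo": 5}}, defaults)
    message = None
except KeyError as e:
    message = str(e)
```

Passing the hierarchy down through the recursion lets the error message name the full path to the offending key.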
- roicat.helpers.clear_gpu_cache(gc_collect: bool = True)[source]
Clear GPU memory cache and optionally run garbage collection.
Releases unreferenced GPU tensors back to the OS. Works with CUDA, MPS (Apple Silicon), and is a no-op on CPU-only systems. Call between pipeline steps to prevent memory accumulation.
- Parameters:
gc_collect (bool) – If True (default), run gc.collect() before clearing the GPU cache. Set to False for lightweight clearing inside tight loops where GC overhead or side effects are undesirable.
RH 2025
- roicat.helpers.compare_file_hashes(hash_dict_true: Dict[str, Tuple[str, str]], dir_files_test: str | None = None, paths_files_test: List[str] | None = None, verbose: bool = True) Tuple[bool, Dict[str, bool], Dict[str, str]][source]
Compares hashes of files in a directory or list of paths to provided hashes. RH 2022
- Parameters:
hash_dict_true (Dict[str, Tuple[str, str]]) – Dictionary of hashes to compare. Each entry should be in the format: {‘key’: (‘filename’, ‘hash’)}.
dir_files_test (str) – Path to directory containing the files to compare hashes. Unused if paths_files_test is not None. (Optional)
paths_files_test (List[str]) – List of paths to files to compare hashes. dir_files_test is used if None. (Optional)
verbose (bool) – If True, failed comparisons are printed out. (Default is True)
- Returns:
- tuple containing:
- total_result (bool):
True if all hashes match, False otherwise.
- individual_results (Dict[str, bool]):
Dictionary indicating whether each hash matched.
- paths_matching (Dict[str, str]):
Dictionary of paths that matched. Each entry is in the format: {‘key’: ‘path’}.
- Return type:
(tuple)
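The input and output structures can be illustrated with hashlib. The sketch below is not the ROICaT implementation: the helper names are hypothetical, SHA-256 is an arbitrary choice of hash, and matching files by base name is an assumption.

```python
import hashlib
import os
import tempfile
from typing import Dict, List, Tuple


def hash_file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_hashes_sketch(
    hash_dict_true: Dict[str, Tuple[str, str]], paths_files_test: List[str]
) -> Tuple[bool, Dict[str, bool], Dict[str, str]]:
    """Hypothetical analogue: match files by name, then compare digests."""
    by_name = {os.path.basename(p): p for p in paths_files_test}
    individual, matching = {}, {}
    for key, (filename, hash_true) in hash_dict_true.items():
        path = by_name.get(filename)
        ok = path is not None and hash_file_sha256(path) == hash_true
        individual[key] = ok
        if ok:
            matching[key] = path
    return all(individual.values()), individual, matching


dir_demo = tempfile.mkdtemp()
path_a = os.path.join(dir_demo, "a.bin")
with open(path_a, "wb") as f:
    f.write(b"hello")

expected = {
    "good": ("a.bin", hashlib.sha256(b"hello").hexdigest()),
    "bad": ("a.bin", "0" * 64),
}
total, individual, matching = compare_hashes_sketch(expected, [path_a])
```

The three return values mirror the documented tuple: an overall flag, a per-key result, and the paths of the entries that matched.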
- roicat.helpers.compose_remappingIdx(remap_AB: ndarray, remap_BC: ndarray, method: str = 'linear', fill_value: float | None = nan, bounds_error: bool = False) ndarray[source]
Composes two remapping index fields using scipy.interpolate.interpn.
This function computes ‘remap_AC’ from ‘remap_AB’ and ‘remap_BC’, where ‘remap_AB’ is a remapping index field that warps image A onto image B, and ‘remap_BC’ is a remapping index field that warps image B onto image C.
RH 2023
- Parameters:
remap_AB (np.ndarray) – An array of shape (H, W, 2) representing the remap field from image A to image B.
remap_BC (np.ndarray) – An array of shape (H, W, 2) representing the remap field from image B to image C.
method (str) –
Interpolation method to use. Either
'linear': Use linear interpolation (default).
'nearest': Use nearest interpolation.
'cubic': Use cubic interpolation.
fill_value (Optional[float]) – The value used for points outside the interpolation domain. (Default is np.nan)
bounds_error (bool) – If True, a ValueError is raised when interpolated values are requested outside of the domain of the input data. (Default is False)
- Returns:
- remap_AC (np.ndarray):
An array of shape (H, W, 2) representing the remap field from image A to image C.
- Return type:
(np.ndarray)
- roicat.helpers.compose_transform_matrices(matrix_AB: ndarray, matrix_BC: ndarray) ndarray[source]
Composes two transformation matrices to create a transformation from one image to another. RH 2023
This function is used to combine two transformation matrices, ‘matrix_AB’ and ‘matrix_BC’. ‘matrix_AB’ represents a transformation that warps an image A onto an image B. ‘matrix_BC’ represents a transformation that warps image B onto image C. The result is ‘matrix_AC’, a transformation matrix that would warp image A directly onto image C.
- Parameters:
matrix_AB (np.ndarray) – A transformation matrix from image A to image B. The array can have the shape (2, 3) or (3, 3).
matrix_BC (np.ndarray) – A transformation matrix from image B to image C. The array can have the shape (2, 3) or (3, 3).
- Returns:
- matrix_AC (np.ndarray):
A composed transformation matrix from image A to image C. The array has the shape (2, 3) or (3, 3).
- Return type:
(np.ndarray)
- Raises:
AssertionError – If the input matrices do not have the shape (2, 3) or (3, 3).
Example
# Define the transformation matrices
matrix_AB = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
matrix_BC = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
# Compose the transformation matrices
matrix_AC = compose_transform_matrices(matrix_AB, matrix_BC)
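Identity matrices do not show how the order of composition matters, so a non-trivial sketch may help. It assumes a column-vector convention in which a point maps as p_B = M_AB @ p_A, so that M_AC = M_BC @ M_AB. Whether ROICaT uses this convention internally is not confirmed here, and compose_sketch and to_3x3 are hypothetical names.

```python
import numpy as np


def to_3x3(matrix: np.ndarray) -> np.ndarray:
    """Promote a (2, 3) affine matrix to a (3, 3) homogeneous matrix."""
    matrix = np.asarray(matrix, dtype=np.float64)
    if matrix.shape == (2, 3):
        return np.vstack([matrix, [0.0, 0.0, 1.0]])
    assert matrix.shape == (3, 3), "Expected shape (2, 3) or (3, 3)"
    return matrix


def compose_sketch(matrix_AB: np.ndarray, matrix_BC: np.ndarray) -> np.ndarray:
    """Hypothetical composition: apply matrix_AB first, then matrix_BC."""
    matrix_AC = to_3x3(matrix_BC) @ to_3x3(matrix_AB)
    both_affine = np.shape(matrix_AB) == (2, 3) and np.shape(matrix_BC) == (2, 3)
    # Assumption: return (2, 3) only when both inputs were (2, 3).
    return matrix_AC[:2] if both_affine else matrix_AC


shift_x = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0]])  # translate x by 5
rot_90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0]])  # rotate by 90 degrees
matrix_AC = compose_sketch(shift_x, rot_90)

point_A = np.array([1.0, 0.0, 1.0])  # homogeneous (x, y, 1)
point_C = matrix_AC @ point_A
```

Under this convention the point (1, 0) is first shifted to (6, 0) and then rotated to (0, 6). Swapping the two arguments would give a different result.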
- roicat.helpers.compute_cluster_similarity_matrices(s: csr_array | ndarray | COO, l: ndarray, verbose: bool = True) Tuple[ndarray, ndarray, ndarray][source]
Computes the similarity matrices for each cluster in l. This algorithm works best on large and sparse matrices. RH 2023
- Parameters:
s (Union[scipy.sparse.csr_array, np.ndarray, sparse.COO]) – Similarity matrix. Entries should be non-negative floats.
l (np.ndarray) – Labels for each row of s. Labels should ideally be integers.
verbose (bool) – Whether to print warnings. (Default is True)
- Returns:
- tuple containing:
- labels_unique (np.ndarray):
Unique labels in l.
- cs_mean (np.ndarray):
Similarity matrix for each cluster. Each element is the mean similarity between all the pairs of samples in each cluster. Note that the diagonal here only considers non-self similarity, which excludes the diagonals of s.
- cs_max (np.ndarray):
Similarity matrix for each cluster. Each element is the maximum similarity between all the pairs of samples in each cluster. Note that the diagonal here only considers non-self similarity, which excludes the diagonals of s.
- cs_min (np.ndarray):
Similarity matrix for each cluster. Each element is the minimum similarity between all the pairs of samples in each cluster. Will be 0 if there are any sparse elements between the two clusters.
- Return type:
(tuple)
- roicat.helpers.confusion_matrix(y_hat: ndarray, y_true: ndarray, counts: bool = False) ndarray[source]
Computes the confusion matrix from y_hat and y_true. y_hat should be either predictions or probabilities. RH 2022
- Parameters:
y_hat (np.ndarray) –
Numpy array of predictions or probabilities. Either
1D array of predictions (n_samples,). Values should be integers.
2D array of probabilities (n_samples, n_classes). Values should be floats.
(Default is 1D array of predictions)
y_true (np.ndarray) –
Numpy array of ground truth labels. Either
1D array of labels (n_samples,). Values should be integers.
2D array of one-hot labels (n_samples, n_classes). Values should be integers.
(Default is 1D array of labels)
counts (bool) – If False, the output confusion matrix is normalized. If True, the output contains counts. (Default is False)
- Returns:
- cmat (np.ndarray):
The computed confusion matrix.
- Return type:
(np.ndarray)
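The difference between counts and normalized output can be sketched for the 1D integer case. The layout used here (rows index the predicted class, columns the true class) and the normalization (each true-class column sums to 1) are assumptions, and confusion_matrix_sketch is a hypothetical name; the documentation above does not state which convention ROICaT follows.

```python
from typing import List, Sequence


def confusion_matrix_sketch(
    y_hat: Sequence[int], y_true: Sequence[int], counts: bool = False
) -> List[List[float]]:
    """Hypothetical analogue for 1D integer inputs: cmat[predicted][true]."""
    n_classes = max(max(y_hat), max(y_true)) + 1
    cmat = [[0.0] * n_classes for _ in range(n_classes)]
    for pred, true in zip(y_hat, y_true):
        cmat[pred][true] += 1
    if counts:
        return cmat
    # Assumed normalization: divide each column by its total.
    col_sums = [sum(row[j] for row in cmat) for j in range(n_classes)]
    return [
        [row[j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n_classes)]
        for row in cmat
    ]


y_true = [0, 0, 1, 1]
y_hat = [0, 1, 1, 1]
cmat_counts = confusion_matrix_sketch(y_hat, y_true, counts=True)
cmat_norm = confusion_matrix_sketch(y_hat, y_true)
```

With these inputs, half of the true class-0 samples are predicted as class 1, and all true class-1 samples are predicted correctly.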
- roicat.helpers.cosine_kernel_2D(center: Tuple[int, int] = (5, 5), image_size: Tuple[int, int] = (11, 11), width: int = 5) ndarray[source]
Generates a 2D cosine kernel. RH 2021
- Parameters:
center (Tuple[int, int]) – The mean position (X, Y) where high value is expected. It is 0-indexed. Make the second value 0 to make it 1D. (Default is (5, 5))
image_size (Tuple[int, int]) – The total image size (width, height). Make the second value 0 to make it 1D. (Default is (11, 11))
width (int) – The full width of one cycle of the cosine. (Default is 5)
- Returns:
- k_cos (np.ndarray):
2D or 1D array of the cosine kernel.
- Return type:
(np.ndarray)
- roicat.helpers.cv2RemappingIdx_to_pytorchFlowField(ri: ndarray | Tensor) ndarray | Tensor[source]
Converts remapping indices from the OpenCV format to the PyTorch format. In the OpenCV format, the displacement is in pixels relative to the top left pixel of the image. In the PyTorch format, the displacement is in pixels relative to the center of the image. RH 2023
- Parameters:
ri (Union[np.ndarray, torch.Tensor]) – Remapping indices. Each pixel describes the index of the pixel in the original image that should be mapped to the new pixel. Shape: (H, W, 2). The last dimension is (x, y).
- Returns:
- normgrid (Union[np.ndarray, torch.Tensor]):
"Flow field", in the PyTorch format. Technically not a flow field, since it doesn’t describe displacement. Rather, it is a remapping index relative to the center of the image. Shape: (H, W, 2). The last dimension is (x, y).
- Return type:
(Union[np.ndarray, torch.Tensor])
- roicat.helpers.deep_update_dict(dictionary: dict, key: List[str], val: Any, in_place: bool = False) dict | None[source]
Updates a nested dictionary with a new value. RH 2023
- Parameters:
dictionary (dict) – The original dictionary to update.
key (List[str]) – List of keys representing the hierarchical path to the nested value to update. Each element should be a string that represents a level in the hierarchy. For example, to change a value in the dictionary params at key ‘dataloader_kwargs’ and subkey ‘prefetch_factor’, you would pass [‘dataloader_kwargs’, ‘prefetch_factor’].
val (Any) – The new value to set in the dictionary.
in_place (bool) –
If True, the original dictionary is updated in-place and no value is returned. If False, a new dictionary is created and returned. (Default is False)
- Returns:
- updated_dict (dict):
The updated dictionary. Only returned if in_place is False.
- Return type:
(Union[dict, None])
Example
original_dict = {"level1": {"level2": "old value"}}
updated_dict = deep_update_dict(original_dict, ["level1", "level2"], "new value", in_place=False)
# Now updated_dict is {"level1": {"level2": "new value"}}
- roicat.helpers.design_butter_bandpass(lowcut, highcut, fs, order=5, plot_pref=False)[source]
Designs a Butterworth bandpass filter. Makes a lowpass filter if lowcut is 0. Makes a highpass filter if highcut is fs/2. RH 2021
- Args:
- lowcut (scalar):
frequency (in Hz) of low pass band
- highcut (scalar):
frequency (in Hz) of high pass band
- fs (scalar):
sample rate (frequency in Hz)
- order (int):
order of the butterworth filter
- Returns:
- b (ndarray):
Numerator polynomial coeffs of the IIR filter
- a (ndarray):
Denominator polynomials coeffs of the IIR filter
- roicat.helpers.download_file(url: str | None, path_save: str, check_local_first: bool = True, check_hash: bool = False, hash_type: str = 'MD5', hash_hex: str | None = None, mkdir: bool = False, allow_overwrite: bool = True, write_mode: str = 'wb', verbose: bool = True, chunk_size: int = 1024) None[source]
Downloads a file from a URL to a local path using requests. Checks if file already exists locally and verifies the hash of the downloaded file against a provided hash if required. RH 2023
- Parameters:
url (Optional[str]) – URL of the file to download. If None, then no download is attempted. (Default is None)
path_save (str) – Path to save the file to.
check_local_first (bool) – Whether to check if the file already exists locally. If True and the file exists locally, the download is skipped. If True and check_hash is also True, the hash of the local file is checked. If the hash matches, the download is skipped. If the hash does not match, the file is downloaded. (Default is True)
check_hash (bool) – Whether to check the hash of the local or downloaded file against hash_hex. (Default is False)
hash_type (str) – Type of hash to use. Options are: 'MD5', 'SHA1', 'SHA256', 'SHA512'. (Default is 'MD5')
hash_hex (Optional[str]) – Hash to compare to, in hexadecimal format (e.g., ‘a1b2c3d4e5f6…’). Can be generated using hash_file() or hashlib.hexdigest(). If check_hash is True, hash_hex must be provided. (Default is None)
mkdir (bool) – If True, creates the parent directory of path_save if it does not exist. (Default is False)
write_mode (str) – Write mode for saving the file. Options include: 'wb' (write binary), 'ab' (append binary), 'xb' (write binary, fail if file exists). (Default is 'wb')
verbose (bool) – If True, prints status messages. (Default is True)
chunk_size (int) – Size of chunks in which to download the file. (Default is 1024)
- roicat.helpers.export_svg_hv_bokeh(obj: object, path_save: str) None[source]
Saves a scatterplot from holoviews as an SVG file. RH 2023
- Parameters:
obj (object) – Holoviews plot object.
path_save (str) – Path to save the SVG file.
- roicat.helpers.extract_zip(path_zip: str, path_extract: str | None = None, verbose: bool = True) List[str][source]
Extracts a zip file. RH 2022
- Parameters:
path_zip (str) – Path to the zip file.
path_extract (Optional[str]) – Path (directory) to extract the zip file to. If None, extracts to the same directory as the zip file. (Default is None)
verbose (bool) – Whether to print progress. (Default is True)
- Returns:
- paths_extracted (List[str]):
List of paths to the extracted files.
- Return type:
(List[str])
- roicat.helpers.fill_in_dict(d: Dict, defaults: Dict, verbose: bool = True, hierarchy: List[str] = ['dict'])[source]
In-place. Fills in dictionary d with values from defaults if they are missing. Works hierarchically. RH 2023
- Parameters:
d (Dict) – Dictionary to fill in. In-place.
defaults (Dict) – Dictionary of defaults.
verbose (bool) – Whether to print messages.
hierarchy (List[str]) – Used internally for recursion. Hierarchy of keys to d.
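A hierarchical in-place fill could be sketched as below. fill_in_dict_sketch is a hypothetical name, and deep-copying the default values is a design choice of this sketch, not a documented property of the ROICaT function.

```python
import copy
from typing import Dict


def fill_in_dict_sketch(d: Dict, defaults: Dict) -> None:
    """Hypothetical in-place fill: copy missing keys from defaults into d."""
    for key, default_value in defaults.items():
        if key not in d:
            # Deep copy so later edits to d do not alter defaults.
            d[key] = copy.deepcopy(default_value)
        elif isinstance(d[key], dict) and isinstance(default_value, dict):
            # Both sides are dictionaries: descend one level.
            fill_in_dict_sketch(d[key], default_value)


params = {"a": 1, "nested": {"b": 2}}
defaults = {"a": 0, "nested": {"b": 0, "c": 3}, "z": 9}
result = fill_in_dict_sketch(params, defaults)
```

Existing values in params are kept, and only the missing entries nested.c and z are added. The function returns None because the update happens in place.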
- roicat.helpers.find_geometric_transformation(im_template: ndarray, im_moving: ndarray, warp_mode: str = 'euclidean', n_iter: int = 5000, termination_eps: float = 1e-10, mask: ndarray | None = None, gaussFiltSize: int = 1) ndarray[source]
Find the transformation between two images. Wrapper function for cv2.findTransformECC. RH 2022
- Parameters:
im_template (np.ndarray) – Template image. The dtype must be either np.uint8 or np.float32.
im_moving (np.ndarray) – Moving image. The dtype must be either np.uint8 or np.float32.
warp_mode (str) –
Warp mode.
'translation': Sets a translational motion model; warpMatrix is 2x3 with the first 2x2 part being the identity matrix and the remaining two parameters being estimated.
'euclidean': Sets a Euclidean (rigid) transformation as motion model; three parameters are estimated; warpMatrix is 2x3. (Default)
'affine': Sets an affine motion model; six parameters are estimated; warpMatrix is 2x3.
'homography': Sets a homography as a motion model; eight parameters are estimated; warpMatrix is 3x3.
n_iter (int) – Number of iterations. (Default is 5000)
termination_eps (float) – Termination epsilon. This is the threshold of the increment in the correlation coefficient between two iterations. (Default is 1e-10)
mask (np.ndarray) – Binary mask. Regions where mask is zero are ignored during the registration. If None, no mask is used. (Default is None)
gaussFiltSize (int) – Gaussian filter size. If 0, no gaussian filter is used. (Default is 1)
- Returns:
- warp_matrix (np.ndarray):
Warp matrix. See cv2.findTransformECC for more info. Can be applied using cv2.warpAffine or cv2.warpPerspective.
- Return type:
(np.ndarray)
- roicat.helpers.find_nonredundant_idx(s: coo_array) ndarray[source]
Finds the indices of the nonredundant entries in a sparse matrix. Useful when manually populating a sparse matrix and you want to know which entries have already been populated. RH 2022
- Parameters:
s (scipy.sparse.coo_array) – Sparse matrix. Should be in COO format.
- Returns:
- idx_unique (np.ndarray):
Indices of the nonredundant entries.
- Return type:
(np.ndarray)
- roicat.helpers.find_paths(dir_outer: str | List[str], reMatch: str = 'filename', reMatch_in_path: str | None = None, find_files: bool = True, find_folders: bool = False, depth: int = 0, natsorted: bool = True, alg_ns: str | None = None, verbose: bool = False) List[str][source]
Searches for files and/or folders recursively in a directory using a regex match. RH 2022
- Parameters:
dir_outer (Union[str, List[str]]) – Path(s) to the directory(ies) to search. If a list of directories, then all directories will be searched.
reMatch (str) – Regular expression to match. Each file or folder name encountered will be compared using re.search(reMatch, filename). If the output is not None, the file will be included in the output.
reMatch_in_path (Optional[str]) –
Additional regular expression to match anywhere in the path. Useful for finding files/folders in specific subdirectories. If None, then no additional matching is done. (Default is None)
find_files (bool) – Whether to find files. (Default is True)
find_folders (bool) – Whether to find folders. (Default is False)
depth (int) –
Maximum folder depth to search. (Default is 0).
depth=0 means only search the outer directory.
depth=2 means search the outer directory and two levels of subdirectories below it.
natsorted (bool) – Whether to sort the output using natural sorting with the natsort package. (Default is True)
alg_ns (str) – Algorithm to use for natural sorting. See natsort.ns or https://natsort.readthedocs.io/en/4.0.4/ns_class.html/ for options. Default is PATH. Other common options are INT, FLOAT, VERSION. (Default is None)
verbose (bool) – Whether to print the paths found. (Default is False)
- Returns:
- paths (List[str]):
Paths to matched files and/or folders in the directory.
- Return type:
(List[str])
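The interaction between the regular expression and the depth limit can be sketched with the standard library. This is a simplified illustration: find_paths_sketch is a hypothetical name, it matches files only, and it uses plain alphabetical sorting instead of natsort.

```python
import os
import re
import tempfile
from typing import List


def find_paths_sketch(dir_outer: str, reMatch: str, depth: int = 0) -> List[str]:
    """Hypothetical analogue: regex-match file names down to a folder depth."""
    paths = []
    for name in sorted(os.listdir(dir_outer)):
        path = os.path.join(dir_outer, name)
        if os.path.isdir(path):
            if depth > 0:
                # Descend one level and reduce the remaining depth.
                paths += find_paths_sketch(path, reMatch, depth - 1)
        elif re.search(reMatch, name) is not None:
            paths.append(path)
    return paths


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for rel in ["stat.npy", "notes.txt", os.path.join("sub", "stat.npy")]:
    open(os.path.join(root, rel), "w").close()

found_shallow = find_paths_sketch(root, r"stat\.npy", depth=0)
found_deep = find_paths_sketch(root, r"stat\.npy", depth=1)
```

With depth=0 only the file in the outer directory is found; with depth=1 the copy inside sub is found as well.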
- roicat.helpers.flatten_dict(d: MutableMapping, parent_key: str = '', sep: str = '.') MutableMapping[source]
Flattens a dictionary of dictionaries into a single dictionary. NOTE: Turns all keys into strings. Stolen from https://stackoverflow.com/a/6027615. RH 2022
- Parameters:
d (Dict) – Dictionary to flatten
parent_key (str) – Key to prepend to flattened keys IGNORE: USED INTERNALLY FOR RECURSION
sep (str) – Separator to use between keys IGNORE: USED INTERNALLY FOR RECURSION
- Returns:
- flattened dictionary (dict):
Flat dictionary with the keys to deeper dictionaries joined by the separator.
- Return type:
(Dict)
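The flattening behaviour, including the conversion of keys to strings, can be sketched as follows. flatten_dict_sketch is a hypothetical re-implementation written for illustration, not the ROICaT source.

```python
from collections.abc import MutableMapping
from typing import Any, Dict


def flatten_dict_sketch(
    d: MutableMapping, parent_key: str = "", sep: str = "."
) -> Dict[str, Any]:
    """Hypothetical analogue: join nested keys with sep into one flat dict."""
    items: Dict[str, Any] = {}
    for key, value in d.items():
        # Keys are converted to strings when building the joined key.
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, MutableMapping):
            items.update(flatten_dict_sketch(value, new_key, sep=sep))
        else:
            items[new_key] = value
    return items


flat = flatten_dict_sketch({"a": 1, "b": {"c": 2, "d": {"e": 3}}})
```

The nested value 3 ends up under the key 'b.d.e', which shows how parent_key accumulates during recursion.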
- roicat.helpers.flowField_to_remappingIdx(ff: ndarray | object) ndarray | object[source]
Convert a flow field to a remapping index. WARNING: Technically, it is not possible to convert a flow field to a remapping index, since the remapping index describes an interpolation mapping, while the flow field describes a displacement. RH 2023
- Parameters:
ff (Union[np.ndarray, object]) – Flow field represented as a numpy ndarray or torch Tensor. It describes the displacement of each pixel. Shape (H, W, 2). Last dimension is (x, y).
- Returns:
- ri (Union[np.ndarray, object]):
Remapping index. It describes the index of the pixel in the original image that should be mapped to the new pixel. Shape (H, W, 2).
- Return type:
(Union[np.ndarray, object])
- roicat.helpers.generalised_logistic_function(x: ndarray | Tensor, a: float = 0, k: float = 1, b: float = 1, v: float = 1, q: float = 1, c: float = 1, mu: float = 0) ndarray | Tensor[source]
Calculates the generalized logistic function.
Refer to Generalised logistic function for detailed information on the parameters. RH 2021
- Parameters:
x (Union[np.ndarray, torch.Tensor]) – The input to the logistic function.
a (float) – The lower asymptote. (Default is 0)
k (float) – The upper asymptote when c=1. (Default is 1)
b (float) – The growth rate. (Default is 1)
v (float) – Should be greater than 0, it affects near which asymptote maximum growth occurs. (Default is 1)
q (float) – Related to the value Y(0). Center positions. (Default is 1)
c (float) – Typically takes a value of 1. (Default is 1)
mu (float) – The center position of the function. (Default is 0)
- Returns:
- out (Union[np.ndarray, torch.Tensor]):
The value of the logistic function for the input x.
- Return type:
(Union[np.ndarray, torch.Tensor])
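For scalar inputs, the standard form of the generalised logistic function is a + (k - a) / (c + q * exp(-b * (x - mu))) ** (1 / v). The sketch below assumes that ROICaT follows this standard form with mu as a horizontal shift; generalised_logistic_sketch is a hypothetical name.

```python
import math


def generalised_logistic_sketch(
    x: float,
    a: float = 0.0,
    k: float = 1.0,
    b: float = 1.0,
    v: float = 1.0,
    q: float = 1.0,
    c: float = 1.0,
    mu: float = 0.0,
) -> float:
    """Scalar generalised logistic function in its standard form."""
    return a + (k - a) / (c + q * math.exp(-b * (x - mu))) ** (1.0 / v)


y_mid = generalised_logistic_sketch(0.0)                  # defaults: plain sigmoid
y_low = generalised_logistic_sketch(-50.0, a=2.0, k=5.0)  # near lower asymptote a
y_high = generalised_logistic_sketch(50.0, a=2.0, k=5.0)  # near upper asymptote k
```

With the default parameters the expression reduces to the ordinary sigmoid 1 / (1 + exp(-x)), which equals 0.5 at x = 0.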
- roicat.helpers.get_balanced_class_weights(labels: ndarray) ndarray[source]
Balances the weights for classifier training.
RH, JZ 2022
- Parameters:
labels (np.ndarray) – Array that includes a list of labels to balance the weights for classifier training. shape: (n,)
- Returns:
- weights (np.ndarray):
Weights by samples. shape: (n,)
- Return type:
(np.ndarray)
- roicat.helpers.get_balanced_sample_weights(labels: ndarray, class_weights: ndarray | None = None) ndarray[source]
Balances the weights for classifier training.
RH/JZ 2022
- Parameters:
labels (np.ndarray) – Array that includes a list of labels to balance the weights for classifier training. shape: (n,)
class_weights (np.ndarray, Optional) – Optional parameter which includes an array of pre-fit class weights. If None, weights will be calculated using the provided sample labels. (Default is None)
- Returns:
- sample_weights (np.ndarray):
Sample weights by labels. shape: (n,)
- Return type:
(np.ndarray)
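The idea of balancing can be sketched as follows. This is an illustration under assumptions, not roicat's code: class weights are taken as inversely proportional to class counts, and the final normalisation to a sum of 1 is a choice made here. The name `balanced_sample_weights_sketch` is hypothetical.

```python
import numpy as np

def balanced_sample_weights_sketch(labels, class_weights=None):
    """Hypothetical stand-in: one weight per sample, with rare classes up-weighted."""
    labels = np.asarray(labels).astype(np.int64)
    # `inverse` maps each sample to the position of its class in `classes`.
    classes, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    if class_weights is None:
        # Assumed scheme: weight each class by n_samples / class_count.
        class_weights = len(labels) / counts
    sample_weights = np.asarray(class_weights, dtype=float)[inverse]
    # Normalisation is an assumption made for this sketch.
    return sample_weights / sample_weights.sum()

weights = balanced_sample_weights_sketch(np.array([0, 0, 0, 1]))
```

After weighting, each class contributes the same total weight, regardless of how many samples it has.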
- roicat.helpers.get_dir_contents(directory: str) Tuple[List[str], List[str]][source]
Retrieves the names of the folders and files in a directory (does not include subdirectories). RH 2021
- Parameters:
directory (str) – The path to the directory.
- Returns:
- tuple containing:
- folders (List[str]):
A list of folder names.
- files (List[str]):
A list of file names.
- Return type:
(tuple)
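A non-recursive listing of this kind can be sketched with the standard library. The helper name `dir_contents_sketch` is hypothetical, and the sorting is an addition made here so that the output is deterministic.

```python
import os
import tempfile

def dir_contents_sketch(directory):
    """Hypothetical stand-in: top-level folders and files only."""
    # The first item yielded by os.walk describes `directory` itself.
    _, folders, files = next(os.walk(directory))
    return sorted(folders), sorted(files)

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "nested.txt"), "w").close()
    open(os.path.join(tmp, "a.txt"), "w").close()
    folders, files = dir_contents_sketch(tmp)
```

Note that `nested.txt` is not reported, because subdirectories are not traversed.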
- roicat.helpers.get_nd_butterworth_filter(shape: ~typing.Tuple[int, ...], factor: float, order: float, high_pass: bool, real: bool, dtype: ~numpy.dtype = <class 'numpy.float64'>, squared_butterworth: bool = True) ndarray[source]
Creates an N-dimensional Butterworth mask for an FFT.
- Parameters:
shape (Tuple[int, ...]) – Shape of the n-dimensional FFT and mask.
factor (float) – Fraction of mask dimensions where the cutoff should be.
order (float) – Controls the slope in the cutoff region.
high_pass (bool) – Whether the filter is high pass (low frequencies attenuated) or low pass (high frequencies are attenuated).
real (bool) – Whether the FFT is of a real (True) or complex (False) image.
dtype (np.dtype) – The desired output data type of the Butterworth filter. (Default is np.float64)
squared_butterworth (bool) – If True, the square of the Butterworth filter is used. (Default is True)
- Returns:
- wfilt (np.ndarray):
The FFT mask.
- Return type:
(np.ndarray)
- roicat.helpers.get_nums_from_string(string_with_nums)[source]
Returns the numbers from a string as an int. RH 2021-2022
- Parameters:
string_with_nums (str) – String with numbers in it
- Returns:
The numbers from the string. If there are no numbers, returns None.
- Return type:
nums (int)
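One plausible reading of this behaviour, sketched below, is that all digit characters are concatenated and converted to a single int. That interpretation and the name `nums_from_string_sketch` are assumptions.

```python
def nums_from_string_sketch(string_with_nums):
    """Hypothetical stand-in: joins every ASCII digit into one integer."""
    digits = [ch for ch in string_with_nums if ch in "0123456789"]
    # No digits at all maps to None, as the documentation describes.
    return int("".join(digits)) if digits else None

session_number = nums_from_string_sketch("session_012_plane3")
nothing = nums_from_string_sketch("no digits here")
```

Under this reading, leading zeros are lost in the conversion to int.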
- roicat.helpers.get_path_between_nodes(idx_start: int, idx_end: int, predecessors: ndarray, max_length: int = 9999)[source]
Finds the indices corresponding to the shortest path between two nodes in a graph. Uses a predecessor matrix from a shortest path algorithm (e.g. scipy.sparse.csgraph.shortest_path). RH 2024
- Parameters:
idx_start (int) – Index of the starting node.
idx_end (int) – Index of the ending node.
predecessors (np.ndarray) – Predecessor matrix from a shortest path algorithm.
max_length (int) – Maximum length of the path. (Default is 9999)
- Returns:
List of node indices corresponding to the shortest path from idx_start to idx_end. [idx_start, …, idx_end]
- Return type:
path (List[int])
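The backtracking step can be sketched in plain Python. The sketch assumes the scipy convention that `predecessors[i][j]` holds the node visited just before `j` on the shortest path from `i`, with a negative value meaning no path. The function name and the hand-written matrix for the chain 0-1-2-3 are illustrative only.

```python
def path_between_nodes_sketch(idx_start, idx_end, predecessors, max_length=9999):
    """Hypothetical stand-in: walks backwards from idx_end to idx_start."""
    path = [idx_end]
    while path[-1] != idx_start:
        prev = predecessors[idx_start][path[-1]]
        if prev < 0:
            raise ValueError("No path between the requested nodes.")
        if len(path) >= max_length:
            raise ValueError("Path exceeds max_length.")
        path.append(int(prev))
    return path[::-1]

# Predecessor matrix for an undirected chain 0 - 1 - 2 - 3 (written by hand).
NO_PATH = -9999
predecessors = [
    [NO_PATH, 0, 1, 2],
    [1, NO_PATH, 1, 2],
    [1, 2, NO_PATH, 2],
    [1, 2, 3, NO_PATH],
]
forward = path_between_nodes_sketch(0, 3, predecessors)
backward = path_between_nodes_sketch(3, 1, predecessors)
```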
- roicat.helpers.grayscale_to_rgb(array: ndarray | Tensor | List) ndarray | Tensor[source]
Converts a grayscale image (2D array) or movie (3D array) to RGB (3D or 4D array).
RH 2023
- Parameters:
array (Union[np.ndarray, torch.Tensor, list]) – The 2D or 3D array of grayscale images.
- Returns:
- array (Union[np.ndarray, torch.Tensor]):
The converted 3D or 4D array of RGB images.
- Return type:
(Union[np.ndarray, torch.Tensor])
- roicat.helpers.h5_load(filepath: str | Path, return_dict: bool = True, verbose: bool = False) dict | object[source]
Returns a dictionary or an H5PY object from a given HDF file. RH 2023
- Parameters:
filepath (Union[str, Path]) – Full pathname of the file to read.
return_dict (bool) – Whether or not to return a dict object. (Default is True)
True: a dict object is returned.
False: an H5PY object is returned.
verbose (bool) – Whether to print detailed information during the execution. (Default is False)
- Returns:
- result (Union[dict, object]):
Either a dictionary containing the groups as keys and the datasets as values from the HDF file or an H5PY object, depending on the return_dict parameter.
- Return type:
(Union[dict, object])
- roicat.helpers.hash_file(path: str, type_hash: str = 'MD5', buffer_size: int = 65536) str[source]
Computes the hash of a file using the specified hash type and buffer size. RH 2022
- Parameters:
path (str) – Path to the file to be hashed.
type_hash (str) – Type of hash to use. (Default is 'MD5') One of:
'MD5': MD5 hash algorithm.
'SHA1': SHA1 hash algorithm.
'SHA256': SHA256 hash algorithm.
'SHA512': SHA512 hash algorithm.
buffer_size (int) – Buffer size (in bytes) for reading the file. 65536 corresponds to 64KB. (Default is 65536)
- Returns:
- hash_val (str):
The computed hash of the file.
- Return type:
(str)
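Buffered hashing keeps memory use constant for large files. The following is a minimal sketch using `hashlib`; the name `hash_file_sketch` is hypothetical, and returning a hex digest is an assumption about the output format.

```python
import hashlib
import os
import tempfile

def hash_file_sketch(path, type_hash="MD5", buffer_size=65536):
    """Hypothetical stand-in: hashes a file in fixed-size chunks."""
    hasher = hashlib.new(type_hash.lower())
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break  # end of file
            hasher.update(chunk)
    return hasher.hexdigest()

payload = b"hello world" * 1000
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")
    with open(path, "wb") as f:
        f.write(payload)
    # A small buffer forces several read/update iterations.
    digest = hash_file_sketch(path, type_hash="SHA256", buffer_size=4096)
```

The chunked digest matches the digest of the whole payload computed in one call.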
- roicat.helpers.idx2bool(idx: ndarray, length: int | None = None) ndarray[source]
Converts a vector of indices to a boolean vector. RH 2021
- Parameters:
idx (np.ndarray) – 1-D array of indices.
length (Optional[int]) – Length of boolean vector. If None, the length will be set to the maximum index in idx + 1. (Default is None)
- Returns:
- bool_vec (np.ndarray):
1-D boolean array.
- Return type:
(np.ndarray)
- roicat.helpers.idx_to_oneHot(arr: ndarray | Tensor, n_classes: int | None = None, dtype: Type | None = None) ndarray | Tensor[source]
Converts an array of class labels to a matrix of one-hot vectors. RH 2021
- Parameters:
arr (Union[np.ndarray, torch.Tensor]) – A 1-D array of class labels. Values should be integers >= 0. Values will be used as indices in the output array.
n_classes (Optional[int]) – The number of classes. If None, it will be derived from arr. (Default is None)
dtype (Optional[Type]) – The data type of the output array. If None, it defaults to bool for numpy array and torch.bool for Torch tensor. (Default is None)
- Returns:
- oneHot (Union[np.ndarray, torch.Tensor]):
A 2-D array of one-hot vectors.
- Return type:
(Union[np.ndarray, torch.Tensor])
- roicat.helpers.index_with_nans(values, indices)[source]
Indexes an array with a list of indices, allowing for NaNs in the indices. RH 2022
- Parameters:
values (np.ndarray) – Array to be indexed.
indices (Union[List[int], np.ndarray]) – 1D list or array of indices to use for indexing. Can contain NaNs. Datatype should be floating point. NaNs will be removed and values will be cast to int.
- Returns:
Indexed array. Positions where indices was NaN will be filled with NaNs.
- Return type:
np.ndarray
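The NaN-tolerant indexing described above can be sketched in NumPy as follows. The name `index_with_nans_sketch` is hypothetical, and casting the output to float (so that it can hold NaN) is an assumption.

```python
import numpy as np

def index_with_nans_sketch(values, indices):
    """Hypothetical stand-in: fancy indexing that passes NaN indices through."""
    values = np.asarray(values, dtype=float)
    indices = np.asarray(indices, dtype=float)
    # Start from an all-NaN output, then fill only the valid positions.
    out = np.full(indices.shape + values.shape[1:], np.nan)
    valid = ~np.isnan(indices)
    out[valid] = values[indices[valid].astype(int)]
    return out

picked = index_with_nans_sketch([10.0, 20.0, 30.0], [2.0, np.nan, 0.0])
```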
- roicat.helpers.invert_remappingIdx(remappingIdx: ndarray, method: str = 'linear', fill_value: float | None = nan) ndarray[source]
Inverts a remapping index field.
Requires the assumption that the remapping index field is invertible or bijective/one-to-one and non-occluding. If ‘remap_AB’ is defined as a remapping index field that warps image A onto image B, then ‘remap_BA’ is the remapping index field that warps image B onto image A. This function computes ‘remap_BA’ given ‘remap_AB’.
RH 2023
- Parameters:
remappingIdx (np.ndarray) – An array of shape (H, W, 2) representing the remap field.
method (str) – Interpolation method to use. See scipy.interpolate.griddata. Options are:
'linear'
'nearest'
'cubic'
(Default is 'linear')
fill_value (Optional[float]) – Value used to fill points outside the convex hull. (Default is np.nan)
- Returns:
An array of shape (H, W, 2) representing the inverse remap field.
- Return type:
(np.ndarray)
- roicat.helpers.invert_warp_matrix(warp_matrix: ndarray) ndarray[source]
Inverts a provided warp matrix for the transformation A->B to compute the warp matrix for B->A. RH 2023
- Parameters:
warp_matrix (np.ndarray) – A 2x3 or 3x3 array representing the warp matrix. Shape: (2, 3) or (3, 3).
- Returns:
- inverted_warp_matrix (np.ndarray):
The inverted warp matrix. Shape: same as input.
- Return type:
(np.ndarray)
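For intuition, a 2x3 affine matrix can be inverted by promoting it to a 3x3 homogeneous matrix. The sketch below assumes that approach; the name `invert_warp_matrix_sketch` is hypothetical and roicat's implementation may differ.

```python
import numpy as np

def invert_warp_matrix_sketch(warp_matrix):
    """Hypothetical stand-in: inverts a (2, 3) affine or (3, 3) homography matrix."""
    warp_matrix = np.asarray(warp_matrix, dtype=float)
    if warp_matrix.shape == (2, 3):
        # Append the implicit homogeneous row, invert, then drop it again.
        full = np.vstack([warp_matrix, [0.0, 0.0, 1.0]])
        return np.linalg.inv(full)[:2, :]
    if warp_matrix.shape == (3, 3):
        return np.linalg.inv(warp_matrix)
    raise ValueError("warp_matrix must have shape (2, 3) or (3, 3).")

# A pure translation by (+5, -3) inverts to a translation by (-5, +3).
translation = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, -3.0]])
inverse = invert_warp_matrix_sketch(translation)
```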
- roicat.helpers.json_load(filepath: str, mode: str = 'r') Any[source]
Loads an object from a json file. RH 2022
- Parameters:
filepath (str) – Path to the json file.
mode (str) – The mode to open the file in. (Default is 'r')
- Returns:
- obj (Any):
The object loaded from the json file.
- Return type:
(Any)
- roicat.helpers.json_save(obj: Any, filepath: str, indent: int = 4, mode: str = 'w', mkdir: bool = False, allow_overwrite: bool = True) None[source]
Saves an object to a json file using json.dump. RH 2022
- Parameters:
obj (Any) – The object to save.
filepath (str) – The path to save the object to.
indent (int) – Number of spaces for indentation in the output json file. (Default is 4)
mode (str) – The mode to open the file in. Options are:
'w': Write.
'a': Append.
'x': Exclusive write. Raises FileExistsError if the file already exists.
(Default is 'w')
mkdir (bool) – If True, creates parent directory if it does not exist. (Default is False)
allow_overwrite (bool) – If True, allows overwriting of existing file. (Default is True)
- class roicat.helpers.lazy_repeat_obj(obj: Any, pseudo_length: int | None = None)[source]
Bases: object
Makes a lazy iterator that repeats an object. RH 2021
- Parameters:
obj (Any) – Object to repeat.
pseudo_length (Optional[int]) – Length of the iterator. (Default is None)
- roicat.helpers.list_available_devices() dict[source]
Lists all available PyTorch devices on the system. RH 2024
- Returns:
A dictionary with device types as keys and lists of available devices as values.
- Return type:
(dict)
- roicat.helpers.make_2D_frequency_filter(hw: tuple, low: float = 5, high: float = 6, order: int = 3, distance_p: int = 100)[source]
Make a filter for scoring the alignment of images using phase correlation. RH 2024
- Parameters:
hw (tuple) – Height and width of the images.
low (float) – Low cutoff frequency for the bandpass filter. Units are in pixels.
high (float) – High cutoff frequency for the bandpass filter. Units are in pixels.
order (int) – Order of the butterworth bandpass filter. (Default is 3)
distance_p (int) – Distance parameter for the distance grid. Defines the Minkowski distance used to compute the distance grid.
- Returns:
Filter for scoring the alignment. Shape: (height, width)
- Return type:
(np.ndarray)
- roicat.helpers.make_Fourier_mask(frame_shape_y_x: Tuple[int, int] = (512, 512), bandpass_spatialFs_bounds: List[float] = [0.0078125, 0.3333333333333333], order_butter: int = 5, mask: ndarray | Tensor | None = None, dtype_fft: object = torch.complex64, plot_pref: bool = False, verbose: bool = False) Tensor[source]
Generates a Fourier domain mask for phase correlation, primarily used in BWAIN.
- Parameters:
frame_shape_y_x (Tuple[int, int]) – Shape of the images that will be processed through this function. (Default is (512, 512))
bandpass_spatialFs_bounds (List[float]) – Specifies the lowcut and highcut in spatial frequency for the butterworth filter. (Default is [1/128, 1/3])
order_butter (int) – Order of the butterworth filter. (Default is 5)
mask (Union[np.ndarray, torch.Tensor, None]) – If not None, this mask is used instead of creating a new one. (Default is None)
dtype_fft (object) – Data type for the Fourier transform. (Default is torch.complex64)
plot_pref (bool) – If True, the absolute value of the mask is plotted. (Default is False)
verbose (bool) – If True, enables the print statements for debugging. (Default is False)
- Returns:
- mask_fft (torch.Tensor):
The generated mask in the Fourier domain.
- Return type:
(torch.Tensor)
- roicat.helpers.make_batches(iterable: Iterable, batch_size: int | None = None, num_batches: int | None = None, min_batch_size: int = 0, return_idx: bool = False, length: int | None = None) Iterable[source]
Creates batches from an iterable. RH 2021
- Parameters:
iterable (Iterable) – The iterable to be batched.
batch_size (Optional[int]) – The size of each batch. If None, then the batch size is based on num_batches. (Default is None)
num_batches (Optional[int]) – The number of batches to create. (Default is None)
min_batch_size (int) – The minimum size of each batch. (Default is 0)
return_idx (bool) – If True, return the indices of the batches. Output will be [start, end] idx. (Default is False)
length (Optional[int]) – The length of the iterable. If None, then the length is len(iterable). This is useful if you want to make batches of something that doesn’t have a __len__ method. (Default is None)
- Returns:
- output (Iterable):
Batches of the iterable.
- Return type:
(Iterable)
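The batching logic can be sketched as a generator. This is an illustration, not roicat's code: it assumes the input supports slicing, that `num_batches` is converted to a batch size by ceiling division, and that a trailing batch smaller than `min_batch_size` is dropped. The name `make_batches_sketch` is hypothetical.

```python
def make_batches_sketch(iterable, batch_size=None, num_batches=None,
                        min_batch_size=0, return_idx=False, length=None):
    """Hypothetical stand-in: yields consecutive slices of a sequence."""
    if length is None:
        length = len(iterable)
    if batch_size is None:
        if num_batches is None:
            raise ValueError("Provide batch_size or num_batches.")
        batch_size = -(-length // num_batches)  # ceiling division
    for start in range(0, length, batch_size):
        end = min(start + batch_size, length)
        if end - start < min_batch_size:
            break  # assumed handling of an undersized final batch
        batch = iterable[start:end]
        yield (batch, [start, end]) if return_idx else batch

data = list(range(7))
by_size = list(make_batches_sketch(data, batch_size=3))
by_count = list(make_batches_sketch(data, num_batches=2))
with_idx = list(make_batches_sketch(data, batch_size=3, return_idx=True))
```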
- roicat.helpers.make_distance_grid(shape=(512, 512), p=2, idx_center=None, return_axes=False, use_fftshift_center=False)[source]
Creates a matrix of distances from the center. Can calculate the Minkowski distance for any p. RH 2023
- Parameters:
shape (Tuple[int, int, ...]) – Shape of the n-dimensional grid (i, j, k, …). If a shape value is odd, the center will be the center of that dimension. If a shape value is even, the center will be between the two center points.
p (int) – Order of the Minkowski distance.
p=1 is the Manhattan distance.
p=2 is the Euclidean distance.
p=inf is the Chebyshev distance.
idx_center (Optional[Tuple[int, int, ...]]) – The index of the center of the grid. If None, the center is assumed to be the center of the grid. If provided, the center will be set to this index. This is useful for odd shaped grids where the center is not obvious.
return_axes (bool) – If True, return the axes of the grid as well. Return will be a tuple.
use_fftshift_center (bool) – If True, the center of the grid will be the center of the FFT grid. This is useful for FFT operations where the center is assumed to be the top left corner.
- Returns:
- distance_image (np.ndarray):
array of distances to the center index
- axes (Optional[np.ndarray]):
axes of the grid as well. Only returned if return_axes=True
- Return type:
Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]
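The default behaviour (geometric center, arbitrary p) can be sketched in NumPy. The sketch omits `idx_center`, `return_axes`, and `use_fftshift_center`, and the name `distance_grid_sketch` is hypothetical.

```python
import numpy as np

def distance_grid_sketch(shape=(5, 5), p=2):
    """Hypothetical stand-in: Minkowski distance of every cell to the grid center."""
    # Centered coordinates: integer offsets for odd sizes, half-integers for even sizes.
    axes = [np.arange(n) - (n - 1) / 2 for n in shape]
    mesh = np.meshgrid(*axes, indexing="ij")
    offsets = np.stack([np.abs(m) for m in mesh], axis=0)
    if np.isinf(p):
        return offsets.max(axis=0)  # Chebyshev distance
    return (offsets ** p).sum(axis=0) ** (1.0 / p)

grid_euclid = distance_grid_sketch((5, 5), p=2)
grid_manhattan = distance_grid_sketch((5, 5), p=1)
grid_chebyshev = distance_grid_sketch((5, 5), p=float("inf"))
grid_even = distance_grid_sketch((4,), p=2)
```

For an even-sized axis, no cell has distance zero, because the center lies between the two middle cells.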
- roicat.helpers.make_even(n, mode='up')[source]
Make a number even. RH 2023
- Parameters:
n (int) – Number to make even
mode (str) – ‘up’ or ‘down’. Whether to round up or down to the nearest even number.
- Returns:
Even number
- Return type:
output (int)
- roicat.helpers.make_odd(n, mode='up')[source]
Make a number odd. RH 2023
- Parameters:
n (int) – Number to make odd
mode (str) – ‘up’ or ‘down’. Whether to round up or down to the nearest odd number.
- Returns:
Odd number
- Return type:
output (int)
- roicat.helpers.map_parallel(func: Callable, args: List[Any], method: str = 'multithreading', n_workers: int = -1, prog_bar: bool = True) List[Any][source]
Maps a function to a list of arguments in parallel. RH 2022
- Parameters:
func (Callable) – The function to be mapped.
args (List[Any]) – List of arguments to which the function should be mapped. Length of list should be equal to the number of arguments. Each element should then be an iterable for each job that is run.
method (str) – Method to use for parallelization. One of:
'multithreading': Use multithreading from concurrent.futures.
'multiprocessing': Use multiprocessing from concurrent.futures.
'mpire': Use mpire.
'serial': Use list comprehension.
(Default is 'multithreading')
n_workers (int) – Number of workers to use. If set to -1, all available workers are used. (Default is -1)
prog_bar (bool) – Whether to display a progress bar using tqdm. (Default is True)
- Returns:
- output (List[Any]):
List of results from mapping the function to the arguments.
- Return type:
(List[Any])
Example
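The argument layout (one iterable per function argument) can be illustrated with a small standard-library sketch of the multithreading case. The name `map_parallel_sketch` is hypothetical, and the progress bar and the other backends are omitted.

```python
from concurrent.futures import ThreadPoolExecutor

def map_parallel_sketch(func, args, n_workers=None):
    """Hypothetical stand-in: maps func over zipped argument lists using threads."""
    # args[0] holds the first argument of every job, args[1] the second, and so on.
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        return list(executor.map(func, *args))

def add(x, y):
    return x + y

# Three jobs: add(1, 10), add(2, 20), add(3, 30).
results = map_parallel_sketch(add, [[1, 2, 3], [10, 20, 30]])
```

Results come back in job order, even if the jobs finish out of order.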
- roicat.helpers.mask_image_border(im: ndarray, border_outer: int | Tuple[int, int, int, int] | None = None, border_inner: int | None = None, mask_value: float = 0) ndarray[source]
Masks an image within specified outer and inner borders. RH 2022
- Parameters:
im (np.ndarray) – Input image of shape: (height, width).
border_outer (Union[int, tuple[int, int, int, int], None]) – Number of pixels along the border to mask. If None, the border is not masked. If an int is provided, all borders are equally masked. If a tuple of ints is provided, borders are masked in the order: (top, bottom, left, right). (Default is None)
border_inner (int, Optional) – Number of pixels in the center to mask. Will be a square with side length equal to this value. (Default is None)
mask_value (float) – Value to replace the masked pixels with. (Default is 0)
- Returns:
- im_out (np.ndarray):
Masked output image.
- Return type:
(np.ndarray)
- roicat.helpers.matlab_load(filepath: str, simplify_cells: bool = True, kwargs_scipy: Dict = {}, kwargs_mat73: Dict = {}, verbose: bool = False) Dict[source]
Loads a matlab file. If the .mat file is not version 7.3, it uses scipy.io.loadmat. If the .mat file is version 7.3, it uses mat73.loadmat. RH 2023
- Parameters:
filepath (str) – Path to the matlab file.
simplify_cells (bool) – If set to True and file is not version 7.3, it simplifies cells to numpy arrays. (Default is True)
kwargs_scipy (Dict) – Keyword arguments to pass to scipy.io.loadmat. (Default is {})
kwargs_mat73 (Dict) – Keyword arguments to pass to mat73.loadmat. (Default is {})
verbose (bool) – If set to True, it prints information about the file. (Default is False)
- Returns:
- out (Dict):
The loaded matlab file content in a dictionary format.
- Return type:
(Dict)
- roicat.helpers.matlab_save(obj: Dict, filepath: str, mkdir: bool = False, allow_overwrite: bool = True, clean_string: bool = True, list_to_objArray: bool = True, none_to_nan: bool = True, kwargs_scipy_savemat: Dict = {'appendmat': True, 'do_compression': False, 'format': '5', 'long_field_names': False, 'oned_as': 'row'})[source]
Saves data to a matlab file. It uses scipy.io.savemat and provides additional functionality such as cleaning strings, converting lists to object arrays, and converting None to np.nan. RH 2023
- Parameters:
obj (Dict) – The data to save. This must be in dictionary format.
filepath (str) – The path to save the file to.
mkdir (bool) – If set to True, creates parent directory if it does not exist. (Default is False)
allow_overwrite (bool) – If set to True, allows overwriting of existing file. (Default is True)
clean_string (bool) – If set to True, converts strings to bytes. (Default is True)
list_to_objArray (bool) – If set to True, converts lists to object arrays. (Default is True)
none_to_nan (bool) – If set to True, converts None to np.nan. (Default is True)
kwargs_scipy_savemat (Dict) – Keyword arguments to pass to scipy.io.savemat.
'appendmat': Whether to append .mat to the end of the given filename, if it isn’t already there.
'format': The format of the .mat file. ‘4’ for Matlab 4 .mat files, ‘5’ for Matlab 5 and above.
'long_field_names': Whether to allow field names of up to 63 characters instead of the standard 31.
'do_compression': Whether to compress matrices on write.
'oned_as': Whether to save 1-D numpy arrays as row or column vectors in the .mat file. ‘row’ or ‘column’.
(Default is {'appendmat': True, 'format': '5', 'long_field_names': False, 'do_compression': False, 'oned_as': 'row'})
- roicat.helpers.merge_dicts(dicts: List[dict]) dict[source]
Merges a list of dictionaries into a single dictionary. RH 2022
- Parameters:
dicts (List[dict]) – List of dictionaries to merge.
- Returns:
- result_dict (dict):
A single dictionary that contains all keys and values from the dictionaries in the input list.
- Return type:
(dict)
- roicat.helpers.merge_sparse_arrays(s_list: List[csr_array], idx_list: List[ndarray], shape_full: Tuple[int, int], remove_redundant: bool = True, elim_zeros: bool = True) csr_array[source]
Merges a list of square sparse arrays into a single square sparse array. Redundant entries are not selected; only entries chosen by np.unique are kept.
- Parameters:
s_list (List[scipy.sparse.csr_array]) – List of sparse arrays to merge. Each array can be any shape.
idx_list (List[np.ndarray]) – List of integer arrays. Each array should be the same length as its corresponding array in s_list and contain integers in the range [0, shape_full[0]). These integers represent the row/column indices in the full array.
shape_full (Tuple[int, int]) – Shape of the full array.
remove_redundant (bool) –
True: Removes redundant entries from the output array.
False: Keeps redundant entries.
elim_zeros (bool) –
True: Eliminate zeros in the sparse matrix.
False: Keeps zeros.
- Returns:
- s_full (scipy.sparse.csr_array):
Full sparse matrix merged from the input list.
- Return type:
scipy.sparse.csr_array
- roicat.helpers.phase_correlation(im_template: ndarray | Tensor, im_moving: ndarray | Tensor, mask_fft: ndarray | Tensor | None = None, return_filtered_images: bool = False, eps: float = 1e-08) ndarray | Tuple[ndarray, ndarray, ndarray][source]
Perform phase correlation on two images. Calculation performed along the last two axes of the input arrays (-2, -1) corresponding to the (height, width) of the images. RH 2024
- Parameters:
im_template (np.ndarray) – The template image(s). Shape: (…, height, width). Can be any number of dimensions; last two dimensions must be height and width.
im_moving (np.ndarray) – The moving image. Shape: (…, height, width). Leading dimensions must broadcast with the template image.
mask_fft (Optional[np.ndarray]) – 2D array mask for the FFT. If None, no mask is used. Assumes mask_fft is fftshifted. (Default is None)
return_filtered_images (bool) – If set to True, the function will return filtered images in addition to the phase correlation coefficient. (Default is False)
eps (float) – Epsilon value to prevent division by zero. (Default is 1e-8)
- Returns:
- tuple containing:
- cc (np.ndarray):
The phase correlation coefficient.
- fft_template (np.ndarray):
The filtered template image. Only returned if return_filtered_images is True.
- fft_moving (np.ndarray):
The filtered moving image. Only returned if return_filtered_images is True.
- Return type:
(Tuple[np.ndarray, np.ndarray, np.ndarray])
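The core computation can be sketched in NumPy as a normalised cross-power spectrum followed by an inverse FFT. This sketch omits `mask_fft` and `return_filtered_images`. The fftshift of the output and the sign convention of the recovered shift are assumptions, and the name `phase_correlation_sketch` is hypothetical.

```python
import numpy as np

def phase_correlation_sketch(im_template, im_moving, eps=1e-8):
    """Hypothetical stand-in: phase correlation over the last two axes."""
    fft_template = np.fft.fft2(im_template)
    fft_moving = np.fft.fft2(im_moving)
    cross_power = fft_template * np.conj(fft_moving)
    # Keep only the phase; eps guards against division by zero.
    cross_power = cross_power / (np.abs(cross_power) + eps)
    cc = np.real(np.fft.ifft2(cross_power))
    # Move the zero-shift position to the image center.
    return np.fft.fftshift(cc, axes=(-2, -1))

rng = np.random.default_rng(0)
template = rng.random((32, 32))
moving = np.roll(template, shift=(3, -5), axis=(0, 1))

cc = phase_correlation_sketch(template, moving)
peak = np.unravel_index(np.argmax(cc), cc.shape)
# Offset of the peak from the center: the roll that re-aligns `moving` to `template`.
shift = (int(peak[0]) - 16, int(peak[1]) - 16)
realigned = np.roll(moving, shift=shift, axis=(0, 1))
```

In this convention, the peak offset is the correction to apply to the moving image, i.e. the negative of the displacement that was applied to the template.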
- roicat.helpers.pickle_load(filepath: str, zipCompressed: bool = False, mode: str = 'rb') Any[source]
Loads an object from a pickle file. RH 2022
- Parameters:
filepath (str) – Path to the pickle file.
zipCompressed (bool) – If True, the file is assumed to be a .zip file. The function will first unzip the file, then load the object from the unzipped file. (Default is False)
mode (str) – The mode to open the file in. (Default is 'rb')
- Returns:
- obj (Any):
The object loaded from the pickle file.
- Return type:
(Any)
- roicat.helpers.pickle_save(obj: Any, filepath: str, mode: str = 'wb', zipCompress: bool = False, mkdir: bool = False, allow_overwrite: bool = True, **kwargs_zipfile: Dict[str, Any]) None[source]
Saves an object to a pickle file using pickle.dump. Allows for zipping of the file.
RH 2022
- Parameters:
obj (Any) – The object to save.
filepath (str) – The path to save the object to.
mode (str) – The mode to open the file in. Options are:
'wb': Write binary.
'ab': Append binary.
'xb': Exclusive write binary. Raises FileExistsError if the file already exists.
(Default is 'wb')
zipCompress (bool) – If True, compresses pickle file using zipfileCompressionMethod, which is similar to savez_compressed in numpy (with zipfile.ZIP_DEFLATED). Useful for saving redundant and/or sparse arrays objects. (Default is False)
mkdir (bool) – If True, creates parent directory if it does not exist. (Default is False)
allow_overwrite (bool) – If True, allows overwriting of existing file. (Default is True)
kwargs_zipfile (Dict[str, Any]) – Keyword arguments that will be passed into zipfile.ZipFile. compression=zipfile.ZIP_DEFLATED by default. See https://docs.python.org/3/library/zipfile.html#zipfile-objects. Other options for ‘compression’ are (input can be either int or object):
0: zipfile.ZIP_STORED (no compression)
8: zipfile.ZIP_DEFLATED (usual zip compression)
12: zipfile.ZIP_BZIP2 (bzip2 compression) (usually not as good as ZIP_DEFLATED)
14: zipfile.ZIP_LZMA (lzma compression) (usually better than ZIP_DEFLATED but slower)
- roicat.helpers.plot_digital_filter_response(b, a=None, fs=30, worN=100000, plot_pref=True)[source]
Plots the frequency response of a digital filter. RH 2021
- Args:
- b (ndarray):
Numerator polynomial coeffs of the IIR filter
- a (ndarray):
Denominator polynomials coeffs of the IIR filter
- fs (scalar):
sample rate (frequency in Hz)
- worN (int):
number of frequencies at which to evaluate the filter
- roicat.helpers.plot_image_grid(images: List[ndarray] | ndarray, labels: List[str] | None = None, grid_shape: Tuple[int, int] = (10, 10), show_axis: str = 'off', cmap: str | None = None, kwargs_subplots: Dict = {}, kwargs_imshow: Dict = {}) Tuple[Figure, ndarray | Axes][source]
Plots a grid of images. RH 2021
- Parameters:
images (Union[List[np.ndarray], np.ndarray]) – A list of images or a 3D array of images, where the first dimension is the number of images.
labels (Optional[List[str]]) – A list of labels to be displayed in the grid. (Default is None)
grid_shape (Tuple[int, int]) – Shape of the grid. (Default is (10,10))
show_axis (str) – Whether to show axes or not. (Default is ‘off’)
cmap (Optional[str]) – Colormap to use. (Default is None)
kwargs_subplots (Dict) – Keyword arguments for subplots. (Default is {})
kwargs_imshow (Dict) – Keyword arguments for imshow. (Default is {})
- Returns:
- tuple containing:
- fig (plt.Figure):
Figure object.
- axs (Union[np.ndarray, plt.Axes]):
Axes object.
- Return type:
(Tuple[plt.Figure, Union[np.ndarray, plt.Axes]])
- roicat.helpers.prepare_directory_for_loading(directory: str, must_exist: bool = True) str[source]
Prepares a directory path for loading a file. This function is rarely used.
- Parameters:
directory (str) – The directory path to be prepared for loading.
must_exist (bool) – If set to True, the directory at the specified path must exist. (Default is True)
- Returns:
- path (str):
The prepared directory path for loading.
- Return type:
(str)
- roicat.helpers.prepare_directory_for_saving(directory: str, mkdir: bool = False, exist_ok: bool = True) str[source]
Prepares a directory path for saving a file. This function is rarely used.
- Parameters:
directory (str) – The directory path to be prepared for saving.
mkdir (bool) – If set to True, creates parent directory if it does not exist. (Default is False)
exist_ok (bool) – If set to True, allows overwriting of existing directory. (Default is True)
- Returns:
- path (str):
The prepared directory path for saving.
- Return type:
(str)
- roicat.helpers.prepare_filepath_for_loading(filepath: str, must_exist: bool = True) str[source]
Prepares a file path for loading a file. Ensures the file path is valid and has the necessary permissions.
- Parameters:
filepath (str) – The file path to be prepared for loading.
must_exist (bool) – If set to True, the file at the specified path must exist. (Default is True)
- Returns:
- path (str):
The prepared file path for loading.
- Return type:
(str)
- roicat.helpers.prepare_filepath_for_saving(filepath: str, mkdir: bool = False, allow_overwrite: bool = True) str[source]
Prepares a file path for saving a file. Ensures the file path is valid and has the necessary permissions.
- Parameters:
filepath (str) – The file path to be prepared for saving.
mkdir (bool) – If set to True, creates parent directory if it does not exist. (Default is False)
allow_overwrite (bool) – If set to True, allows overwriting of existing file. (Default is True)
- Returns:
- path (str):
The prepared file path for saving.
- Return type:
(str)
- roicat.helpers.prepare_params(params, defaults, verbose=True)[source]
- Does the following:
Checks that all keys in params are in defaults.
Fills in any missing keys in params with values from defaults.
Returns a deepcopy of the filled-in params.
- Parameters:
params (Dict) – Dictionary of parameters.
defaults (Dict) – Dictionary of defaults.
verbose (bool) – Whether to print messages.
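The three steps listed above can be sketched as follows. Recursing into nested dictionaries and raising KeyError on unknown keys are assumptions made here, and the name `prepare_params_sketch` is hypothetical.

```python
import copy

def prepare_params_sketch(params, defaults):
    """Hypothetical stand-in: validates keys, fills defaults, returns a deep copy."""
    filled = copy.deepcopy(defaults)
    for key, value in params.items():
        if key not in defaults:
            raise KeyError(f"Unknown parameter: {key!r}")
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            # Assumed behaviour: nested dictionaries are merged recursively.
            filled[key] = prepare_params_sketch(value, defaults[key])
        else:
            filled[key] = copy.deepcopy(value)
    return filled

defaults = {"device": "cpu", "alignment": {"radius_in": 4, "radius_out": 20}}
params = {"alignment": {"radius_in": 6}}
filled = prepare_params_sketch(params, defaults)

try:
    prepare_params_sketch({"not_a_key": 1}, defaults)
    rejected_unknown_key = False
except KeyError:
    rejected_unknown_key = True
```

Because the result is a deep copy, editing it afterwards leaves both `params` and `defaults` untouched.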
- roicat.helpers.prepare_path(path: str, mkdir: bool = False, exist_ok: bool = True) str[source]
Checks if a directory or file path is valid for different purposes: saving, loading, etc. RH 2023
- If exists:
If exist_ok=True: all good
If exist_ok=False: raises error
- If doesn’t exist:
- If file:
- If parent directory exists:
All good
- If parent directory doesn’t exist:
If mkdir=True: creates parent directory
If mkdir=False: raises error
- If directory:
If mkdir=True: creates directory
If mkdir=False: raises error
- Parameters:
path (str) – Path to be checked.
mkdir (bool) – If True, creates parent directory if it does not exist. (Default is False)
exist_ok (bool) – If True, allows overwriting of existing file. (Default is True)
- Returns:
- path (str):
Resolved path.
- Return type:
(str)
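The decision tree above can be sketched with `pathlib`. Treating a path with a file extension as a file is an assumption about how files and directories are told apart, and the name `prepare_path_sketch` and the exception types are illustrative.

```python
import tempfile
from pathlib import Path

def prepare_path_sketch(path, mkdir=False, exist_ok=True):
    """Hypothetical stand-in for the documented exists / mkdir decision tree."""
    path = Path(path).resolve()
    if path.exists():
        if not exist_ok:
            raise FileExistsError(f"{path} already exists.")
        return str(path)
    # Assumption: a suffix such as '.json' marks the path as a file.
    looks_like_file = path.suffix != ""
    needed_dir = path.parent if looks_like_file else path
    if not needed_dir.exists():
        if not mkdir:
            raise FileNotFoundError(f"{needed_dir} does not exist.")
        needed_dir.mkdir(parents=True, exist_ok=True)
    return str(path)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "new_dir" / "out.json"
    try:
        prepare_path_sketch(target, mkdir=False)
        raised_without_mkdir = False
    except FileNotFoundError:
        raised_without_mkdir = True
    resolved = prepare_path_sketch(target, mkdir=True)
    parent_created = Path(resolved).parent.is_dir()
    file_created = Path(resolved).exists()
```

Only the parent directory is created. The file itself is left for the caller to write.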
- roicat.helpers.pvalue_to_zscore(p, two_tailed=True)[source]
Convert a p-value to a z-score.
- Args:
- p (float):
The p-value.
- two_tailed (bool):
If True, the p-value is two-tailed. If False, the p-value is one-tailed.
- Returns:
The z-score.
- Return type:
float
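A minimal sketch of the conversion using the standard library's inverse normal CDF. Returning a positive z-score (upper tail) is an assumption about the sign convention, and the name `pvalue_to_zscore_sketch` is hypothetical.

```python
from statistics import NormalDist

def pvalue_to_zscore_sketch(p, two_tailed=True):
    """Hypothetical stand-in: z such that the tail area beyond z matches p."""
    # A two-tailed p-value splits its probability mass across both tails.
    tail = p / 2 if two_tailed else p
    return NormalDist().inv_cdf(1 - tail)

z_two_tailed = pvalue_to_zscore_sketch(0.05, two_tailed=True)
z_one_tailed = pvalue_to_zscore_sketch(0.05, two_tailed=False)
```

For p = 0.05 this gives the familiar critical values of roughly 1.96 (two-tailed) and 1.645 (one-tailed).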
- roicat.helpers.pydata_sparse_to_torch_coo(sp_array: object) object[source]
Converts a PyData Sparse array to a PyTorch sparse COO tensor.
This function extracts the coordinates and data from the sparse PyData array and uses them to create a new sparse COO tensor in PyTorch.
- Parameters:
sp_array (object) – The PyData Sparse array to convert. It should be a COO sparse matrix representation.
- Returns:
- coo_tensor (object):
The converted PyTorch sparse COO tensor.
- Return type:
(object)
Example
sp_array = sparse.COO(np.random.rand(1000, 1000))
coo_tensor = pydata_sparse_to_torch_coo(sp_array)
- roicat.helpers.pytorchFlowField_to_cv2RemappingIdx(normgrid: ndarray | Tensor) ndarray | Tensor[source]
Converts remapping indices from the PyTorch format to the OpenCV format. In the OpenCV format, the displacement is in pixels relative to the top left pixel of the image. In the PyTorch format, the displacement is in pixels relative to the center of the image. RH 2023
- Parameters:
normgrid (Union[np.ndarray, torch.Tensor]) – “Flow field”, in the PyTorch format. Technically not a flow field, since it doesn’t describe displacement. Rather, it is a remapping index relative to the center of the image. Shape: (H, W, 2). The last dimension is (x, y).
- Returns:
- ri (Union[np.ndarray, torch.Tensor]):
Remapping indices. Each pixel describes the index of the pixel in the original image that should be mapped to the new pixel. Shape: (H, W, 2). The last dimension is (x, y).
- Return type:
(Union[np.ndarray, torch.Tensor])
- roicat.helpers.rand_cmap(nlabels: int, first_color_black: bool = False, last_color_black: bool = False, verbose: bool = True, under: List[float] = [0, 0, 0], over: List[float] = [0.5, 0.5, 0.5], bad: List[float] = [0.9, 0.9, 0.9]) object[source]
Creates a random colormap to be used with matplotlib. Useful for segmentation tasks.
- Parameters:
nlabels (int) – Number of labels (size of colormap).
first_color_black (bool) – Option to use the first color as black. (Default is False)
last_color_black (bool) – Option to use the last color as black. (Default is False)
verbose (bool) – Prints the number of labels and shows the colormap if True. (Default is True)
under (List[float]) – RGB values to use for the ‘under’ threshold in the colormap. (Default is [0, 0, 0])
over (List[float]) – RGB values to use for the ‘over’ threshold in the colormap. (Default is [0.5, 0.5, 0.5])
bad (List[float]) – RGB values to use for ‘bad’ values in the colormap. (Default is [0.9, 0.9, 0.9])
- Returns:
- colormap (LinearSegmentedColormap):
Colormap for matplotlib.
- Return type:
(LinearSegmentedColormap)
- roicat.helpers.remap_images(images: ndarray | Tensor, remappingIdx: ndarray | Tensor, backend: str = 'torch', interpolation_method: str = 'linear', border_mode: str = 'constant', border_value: float = 0, device: str = 'cpu') ndarray | Tensor[source]
Applies remapping indices to a set of images. Remapping indices, similar to flow fields, describe the index of the pixel to sample from rather than the displacement of each pixel. RH 2023
- Parameters:
images (Union[np.ndarray, torch.Tensor]) – The images to be warped. Shapes can be (N, C, H, W), (C, H, W), or (H, W).
remappingIdx (Union[np.ndarray, torch.Tensor]) – The remapping indices, describing the index of the pixel to sample from. Shape is (H, W, 2).
backend (str) – The backend to use. Can be either 'torch' or 'cv2'. (Default is 'torch')
interpolation_method (str) – The interpolation method to use. Options are 'linear', 'nearest', 'cubic', and 'lanczos'. Refer to cv2.remap or torch.nn.functional.grid_sample for more details. (Default is 'linear')
border_mode (str) – The border mode to use. Options include 'constant', 'reflect', 'replicate', and 'wrap'. Refer to cv2.remap for more details. (Default is 'constant')
border_value (float) – The border value to use. Refer to cv2.remap for more details. (Default is 0)
device (str) – The device to use for computations. Commonly either 'cpu' or 'cuda'. (Default is 'cpu')
- Returns:
- warped_images (Union[np.ndarray, torch.Tensor]):
The warped images. The shape will be the same as the input images, which can be (N, C, H, W), (C, H, W), or (H, W).
- Return type:
(Union[np.ndarray, torch.Tensor])
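Example
The idea of sampling through remapping indices can be illustrated with a minimal NumPy sketch. This is not the roicat implementation: the helper name remap_nearest_sketch is hypothetical, it handles only a single (H, W) image, and it uses nearest-neighbor sampling with a constant border.

```python
import numpy as np

def remap_nearest_sketch(image, remapping_idx, border_value=0):
    """Sample `image` (H, W) at the (x, y) indices stored in `remapping_idx` (H', W', 2)."""
    H, W = image.shape
    # Round the requested source coordinates to the nearest pixel.
    xs = np.rint(remapping_idx[..., 0]).astype(np.int64)
    ys = np.rint(remapping_idx[..., 1]).astype(np.int64)
    # Output pixels whose source falls outside the image receive the border value.
    inside = (xs >= 0) & (xs < W) & (ys >= 0) & (ys < H)
    warped = np.full(remapping_idx.shape[:2], border_value, dtype=image.dtype)
    warped[inside] = image[ys[inside], xs[inside]]
    return warped

image = np.arange(12, dtype=np.float32).reshape(3, 4)
x_grid, y_grid = np.meshgrid(np.arange(4), np.arange(3), indexing="xy")
# Each output pixel samples from one column to its right in the input.
remapping_idx = np.stack([x_grid + 1, y_grid], axis=-1).astype(np.float32)
warped = remap_nearest_sketch(image, remapping_idx)
```

Because the indices name the source pixel, adding 1 to x shifts the image content one pixel to the left, and the last column is filled with the border value.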
- roicat.helpers.remap_sparse_images(ims_sparse: spmatrix | List[spmatrix], remappingIdx: ndarray, method: str = 'linear', fill_value: float = 0, dtype: str | dtype = None, safe: bool = True, n_workers: int = -1, verbose: bool = True) List[csr_array][source]
Remaps a list of sparse images using the given remap field. RH 2023
- Parameters:
ims_sparse (Union[scipy.sparse.spmatrix, List[scipy.sparse.spmatrix]]) – A single sparse image or a list of sparse images.
remappingIdx (np.ndarray) – An array of shape (H, W, 2) representing the remap field. It should be the same size as the images in ims_sparse.
method (str) –
Interpolation method to use. See scipy.interpolate.griddata. Options are:
'linear'
'nearest'
'cubic'
(Default is 'linear')
fill_value (float) – Value used to fill points outside the convex hull. (Default is 0.0)
dtype (Union[str, np.dtype]) – The data type of the resulting sparse images. Default is None, which will use the data type of the input sparse images.
safe (bool) – If True, checks if the image is 0D or 1D and applies a tiny Gaussian blur to increase the image width. (Default is True)
n_workers (int) – Number of parallel workers to use. Default is -1, which uses all available CPU cores.
verbose (bool) – Whether or not to use a tqdm progress bar. (Default is True)
- Returns:
- ims_sparse_out (List[scipy.sparse.csr_array]):
A list of remapped sparse images.
- Return type:
(List[scipy.sparse.csr_array])
- Raises:
AssertionError – If the image and remappingIdx have different spatial dimensions.
- roicat.helpers.remappingIdx_to_flowField(ri: ndarray | object) ndarray | object[source]
Convert a remapping index to a flow field. WARNING: Technically, it is not possible to convert a remapping index to a flow field, since the remapping index describes an interpolation mapping, while the flow field describes a displacement. RH 2023
- Parameters:
ri (Union[np.ndarray, object]) – Remapping index represented as a numpy ndarray or torch Tensor. It describes the index of the pixel in the original image that should be mapped to the new pixel. Shape (H, W, 2). Last dimension is (x, y).
- Returns:
- ff (Union[np.ndarray, object]):
Flow field. It describes the displacement of each pixel. Shape (H, W, 2).
- Return type:
(Union[np.ndarray, object])
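Example
The relationship between the two representations can be sketched in NumPy under the assumption that the flow field is the remapping index minus the identity sampling grid. The helper name below is hypothetical and is not the library function.

```python
import numpy as np

def remapping_idx_to_flow_field_sketch(ri):
    """Subtract the identity grid from a remapping index of shape (H, W, 2)."""
    H, W = ri.shape[:2]
    # Identity grid with last dimension ordered as (x, y).
    x_grid, y_grid = np.meshgrid(np.arange(W), np.arange(H), indexing="xy")
    grid = np.stack([x_grid, y_grid], axis=-1).astype(ri.dtype)
    return ri - grid

H, W = 3, 4
x_grid, y_grid = np.meshgrid(np.arange(W), np.arange(H), indexing="xy")
# A remapping index that samples every pixel from two columns to the right.
ri = np.stack([x_grid + 2, y_grid], axis=-1).astype(np.float32)
ff = remapping_idx_to_flow_field_sketch(ri)
```

In this sketch every entry of ff is (2, 0): a constant offset of two pixels in x and none in y.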
- roicat.helpers.remove_redundant_elements(s: coo_array, inPlace: bool = False) coo_array[source]
Removes redundant entries from a sparse matrix. Useful when manually populating a sparse matrix and you want to remove redundant entries. RH 2022
- Parameters:
s (scipy.sparse.coo_array) – Sparse matrix. Should be in COO format.
inPlace (bool) –
If True, the input matrix is modified in place.
If False, a new matrix is returned.
(Default is False)
- Returns:
- s (scipy.sparse.coo_array):
Sparse matrix with redundant entries removed.
- Return type:
(scipy.sparse.coo_array)
- roicat.helpers.reshape_coo_manual(coo, new_shape)[source]
Manually reshape a COO matrix using 64-bit arithmetic. This function is only needed because Windows does not switch gracefully from int32 to int64 when the values of idx need to exceed the int32 maximum (2147483647). Andrew helped figure this one out. This was an issue when there were more than around 30k ROIs. RH 2025
- Parameters:
coo (scipy.sparse.coo_array) – The input sparse matrix in COO format.
new_shape (tuple of ints) – The desired shape, e.g. (1, -1) expanded to a complete tuple.
- Returns:
- new_coo (scipy.sparse.coo_array):
The reshaped COO matrix.
- Return type:
(scipy.sparse.coo_array)
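Example
The 64-bit index arithmetic described above can be sketched on the raw COO row and column arrays. This is an illustration in plain NumPy, not the library code; the helper name reshape_coo_indices_int64 is hypothetical and the sketch assumes row-major (C-order) flattening.

```python
import numpy as np

def reshape_coo_indices_int64(row, col, old_shape, new_shape):
    """Recompute COO (row, col) indices for a new 2D shape using int64 arithmetic."""
    n_elements = int(old_shape[0]) * int(old_shape[1])
    assert int(new_shape[0]) * int(new_shape[1]) == n_elements, "shapes must hold the same number of elements"
    # Cast to int64 before multiplying so the flat index cannot overflow int32.
    flat = np.asarray(row, dtype=np.int64) * np.int64(old_shape[1]) + np.asarray(col, dtype=np.int64)
    # Convert the flat index back into (row, col) for the new shape.
    new_row, new_col = np.divmod(flat, np.int64(new_shape[1]))
    return new_row, new_col

# A 50,000 x 50,000 matrix has 2.5e9 elements, which exceeds the int32 range.
row = np.array([0, 49_999], dtype=np.int32)
col = np.array([5, 49_999], dtype=np.int32)
new_row, new_col = reshape_coo_indices_int64(row, col, (50_000, 50_000), (1, 2_500_000_000))
```

The last element lands at flat index 2,499,999,999, which cannot be represented in int32 but is handled without overflow here.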
- roicat.helpers.resize_images(images: ndarray | List[ndarray] | Tensor | List[Tensor], new_shape: Tuple[int, int] = (100, 100), interpolation: str = 'BILINEAR', antialias: bool = False, device: str | None = None, return_numpy: bool | None = None) ndarray[source]
Resizes images using the torchvision.transforms.Resize method. RH 2023
- Parameters:
images (Union[np.ndarray, List[np.ndarray], torch.Tensor, List[torch.Tensor]]) – Images or frames of a video. Can be 2D, 3D, or 4D.
For a 2D array: shape is (height, width)
For a 3D array: shape is (n_frames, height, width)
For a 4D array: shape is (n_frames, n_channels, height, width)
new_shape (Tuple[int, int]) – The desired height and width of resized images as a tuple. (Default is (100, 100))
interpolation (str) – The interpolation method to use. See torchvision.transforms.Resize for options.
'NEAREST': Nearest neighbor interpolation
'NEAREST_EXACT': Nearest neighbor interpolation
'BILINEAR': Bilinear interpolation
'BICUBIC': Bicubic interpolation
antialias (bool) – If True, antialiasing will be used. (Default is False)
device (Optional[str]) – The device to use for torchvision.transforms.Resize. If None, will use the device of the input images. (Default is None)
return_numpy (Optional[bool]) – If True, then will return a numpy array. Otherwise, will return a torch tensor on the defined device. If None, will return a numpy array only if the input is a numpy array. (Default is None)
- Returns:
- images_resized (np.ndarray):
The resized images or video frames. Returned as a numpy array or a torch tensor depending on return_numpy.
- Return type:
(np.ndarray)
- roicat.helpers.resize_remappingIdx(ri: ndarray | Tensor, new_shape: Tuple[int, int], interpolation: str = 'BILINEAR') ndarray | Tensor[source]
Resize a remapping index field. This function both resizes the shape of the actual remappingIdx arrays and scales the values to match the new shape. RH 2024
- Parameters:
ri (np.ndarray or torch.Tensor) – Remapping index field(s). Describes the index of the pixel in the original image that should be mapped to the new pixel. Shape (H, W, 2) or (B, H, W, 2). Last dimension is (x, y).
new_shape (Tuple[int, int]) – New shape of the remapping index field. Shape (H’, W’).
interpolation (str) –
The interpolation method to use. See torchvision.transforms.Resize for options.
'NEAREST': Nearest neighbor interpolation
'NEAREST_EXACT': Nearest neighbor interpolation
'BILINEAR': Bilinear interpolation
'BICUBIC': Bicubic interpolation
antialias (bool) – If True, antialiasing will be used. (Default is False)
- Returns:
Resized remapping index field. Shape (H’, W’, 2). Last dimension is (x, y).
- Return type:
ri_resized (np.ndarray or torch.Tensor)
- roicat.helpers.safe_set_attr(obj: Any, attr: str, value: Any, overwrite: bool = False) None[source]
Safely sets an attribute on an object. If the attribute is not present, it will be created. If the attribute is present, it will only be overwritten if overwrite is set to True. RH 2024
- Parameters:
obj (Any) – Object to set the attribute on.
attr (str) – Attribute name.
value (Any) – Value to set the attribute to.
overwrite (bool) – Whether to overwrite the attribute if it already exists. (Default is False)
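Example
The behavior described above can be sketched in a few lines of plain Python. The helper name safe_set_attr_sketch is hypothetical, and the sketch assumes an existing attribute is silently left unchanged when overwrite is False; the library may warn or behave differently in that case.

```python
from types import SimpleNamespace

def safe_set_attr_sketch(obj, attr, value, overwrite=False):
    """Set `attr` on `obj` only if it is missing, or if `overwrite` is True."""
    if overwrite or not hasattr(obj, attr):
        setattr(obj, attr, value)

obj = SimpleNamespace(a=1)
safe_set_attr_sketch(obj, "a", 2)   # 'a' exists and overwrite is False: left unchanged
safe_set_attr_sketch(obj, "b", 3)   # 'b' is missing: created
value_a_kept = obj.a
safe_set_attr_sketch(obj, "a", 4, overwrite=True)  # explicitly overwritten
```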
- roicat.helpers.save_gif(array: ndarray | List, path: str, frameRate: float = 5.0, loop: int = 0, kwargs_backend: Dict = {})[source]
Saves an array of images as a gif. RH 2023
- Parameters:
array (Union[np.ndarray, list]) –
The 3D (grayscale) or 4D (color) array of images.
If dtype is float type, then scale is from 0 to 1.
If dtype is int, then scale is from 0 to 255.
path (str) – The path where the gif is saved.
frameRate (float) – The frame rate of the gif. (Default is 5.0)
loop (int) –
The number of times to loop the gif. (Default is 0)
0 means loop forever
1 means play once
2 means play twice (loop once)
etc.
kwargs_backend (Dict) – The keyword arguments for the backend.
- class roicat.helpers.scipy_sparse_csr_with_length(*args: object, **kwargs: object)[source]
Bases: csr_array
A scipy sparse array with a length attribute. RH 2023
- length
The length of the array (shape[0])
- Type:
int
- Parameters:
*args (object) – Arbitrary arguments passed to scipy.sparse.csr_array.
**kwargs (object) – Arbitrary keyword arguments passed to scipy.sparse.csr_array.
- roicat.helpers.scipy_sparse_to_torch_coo(sp_array: coo_array, dtype: type | None = None) sparse_coo_tensor[source]
Converts a Scipy sparse array to a PyTorch sparse COO tensor.
- Parameters:
sp_array (scipy.sparse.coo_array) – Scipy sparse array to be converted to a PyTorch sparse COO tensor.
dtype (Optional[type]) – Data type to which the values of the input sparse array are to be converted before creating the PyTorch sparse tensor. If None, the data type of the input array’s values is retained. (Default is None)
- Returns:
PyTorch sparse COO tensor converted from the input Scipy sparse array.
- Return type:
coo_tensor (torch.sparse_coo_tensor)
- roicat.helpers.set_device(use_GPU: bool = True, device_num: int = 0, device_types: List[str] = ['cuda', 'mps', 'xpu', 'cpu'], verbose: bool = True) str[source]
Sets the device for PyTorch. If a GPU is available and use_GPU is True, it will be set as the device. Otherwise, the CPU will be set as the device. RH 2022
- Parameters:
use_GPU (bool) –
Determines if the GPU should be utilized:
True: the function will attempt to use the GPU if a GPU is available.
False: the function will use the CPU.
(Default is True)
device_num (int) – Specifies the index of the GPU to use. (Default is 0)
device_types (List[str]) – The types and order of devices to attempt to use. The first device type that is available will be used. Options are 'cuda', 'mps', 'xpu', and 'cpu'.
verbose (bool) –
Determines whether to print the device information.
True: the function will print out the device information.
(Default is True)
- Returns:
- device (str):
A string specifying the device, for example “cpu” or “cuda:<device_num>”.
- Return type:
(str)
- roicat.helpers.show_item_tree(hObj: object | dict | None = None, path: str | Path | None = None, depth: int | None = None, show_metadata: bool = True, print_metadata: bool = False, indent_level: int = 0) None[source]
Recursively displays all the items and groups in an HDF5 object or Python dictionary. RH 2021
- Parameters:
hObj (Optional[Union[object, dict]]) – Hierarchical object, which can be an HDF5 object or a Python dictionary. (Default is None)
path (Optional[Union[str, Path]]) – If not None, then the path to the HDF5 object is used instead of hObj. (Default is None)
depth (Optional[int]) – How many levels deep to show the tree. (Default is None which shows all levels)
show_metadata (bool) – Whether or not to list metadata with items. (Default is True)
print_metadata (bool) – Whether or not to show values of metadata items. (Default is False)
indent_level (int) – Used internally to the function. User should leave this as the default. (Default is 0)
Example
import h5py
with h5py.File('test.h5', 'r') as f:
    show_item_tree(f)
- roicat.helpers.simple_cmap(colors: List[List[float]] = [[1, 0, 0], [1, 0.6, 0], [0.9, 0.9, 0], [0.6, 1, 0], [0, 1, 0], [0, 1, 0.6], [0, 0.8, 0.8], [0, 0.6, 1], [0, 0, 1], [0.6, 0, 1], [0.8, 0, 0.8], [1, 0, 0.6]], under: List[float] = [0, 0, 0], over: List[float] = [0.5, 0.5, 0.5], bad: List[float] = [0.9, 0.9, 0.9], name: str = 'none') object[source]
Creates a colormap from a sequence of RGB values. Borrowed with permission from Alex (https://gist.github.com/ahwillia/3e022cdd1fe82627cbf1f2e9e2ad80a7ex)
- Parameters:
colors (List[List[float]]) – List of RGB values. Each sub-list contains three float numbers representing an RGB color. (Default is list of RGB colors ranging from red to purple)
under (List[float]) – RGB values for the colormap under range. (Default is [0,0,0] (black))
over (List[float]) – RGB values for the colormap over range. (Default is [0.5,0.5,0.5] (grey))
bad (List[float]) – RGB values for the colormap bad range. (Default is [0.9,0.9,0.9] (light grey))
name (str) – Name of the colormap. (Default is ‘none’)
- Returns:
- cmap (LinearSegmentedColormap):
The generated colormap.
- Return type:
(LinearSegmentedColormap)
Example
cmap = simple_cmap([(1,1,1), (1,0,0)])  # white to red colormap
cmap = simple_cmap(['w', 'r'])  # white to red colormap
cmap = simple_cmap(['r', 'b', 'r'])  # red to blue to red
- roicat.helpers.sparse_mask(x: csr_array, mask_sparse: csr_array, do_safety_steps: bool = True) csr_array[source]
Masks a sparse matrix with the non-zero elements of another sparse matrix. RH 2022
- Parameters:
x (scipy.sparse.csr_array) – Sparse matrix to mask.
mask_sparse (scipy.sparse.csr_array) – Sparse matrix to mask with.
do_safety_steps (bool) – Whether to do safety steps to ensure that things are working as expected. (Default is True)
- Returns:
- output (scipy.sparse.csr_array):
Masked sparse matrix.
- Return type:
(scipy.sparse.csr_array)
- roicat.helpers.sparse_to_dense_fill(arr_s: COO, fill_val: float = 0.0) ndarray[source]
Converts a sparse array to a dense array and fills in sparse entries with a specified fill value. RH 2023
- Parameters:
arr_s (sparse.COO) – Sparse array to be converted to dense.
fill_val (float) – Value to fill the sparse entries. (Default is 0.0)
- Returns:
- dense_arr (np.ndarray):
Dense version of the input sparse array.
- Return type:
(np.ndarray)
- roicat.helpers.squeeze_integers(intVec: list | ndarray | Tensor) ndarray | Tensor[source]
Makes integers in an array consecutive numbers starting from the smallest value. For example, [7,2,7,4,-1,0] -> [3,1,3,2,-1,0]. This is useful for removing unused class IDs. RH 2023
- Parameters:
intVec (Union[list, np.ndarray, torch.Tensor]) – 1-D array of integers.
- Returns:
- squeezed_integers (Union[np.ndarray, torch.Tensor]):
1-D array of integers with consecutive numbers starting from the smallest value.
- Return type:
(Union[np.ndarray, torch.Tensor])
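Example
A minimal NumPy sketch of the squeezing operation, assuming it amounts to ranking the unique values and offsetting the ranks by the smallest value. The helper name squeeze_integers_sketch is hypothetical and only handles list or numpy inputs.

```python
import numpy as np

def squeeze_integers_sketch(int_vec):
    """Map integers to consecutive values starting from the smallest input value."""
    int_vec = np.asarray(int_vec, dtype=np.int64)
    # unique_vals is sorted; inverse gives the rank of each element among the unique values.
    unique_vals, inverse = np.unique(int_vec, return_inverse=True)
    # Consecutive replacement values beginning at the smallest input value.
    squeezed_vals = np.arange(unique_vals[0], unique_vals[0] + len(unique_vals), dtype=np.int64)
    return squeezed_vals[inverse]

result = squeeze_integers_sketch([7, 2, 7, 4, -1, 0])
# result -> [3, 1, 3, 2, -1, 0]
```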
- roicat.helpers.torch_pca(X_in: Tensor | ndarray, device: str = 'cpu', mean_sub: bool = True, zscore: bool = False, rank: int | None = None, return_cpu: bool = True, return_numpy: bool = False) Tuple[Tensor | ndarray, Tensor | ndarray, Tensor | ndarray, Tensor | ndarray][source]
Conducts Principal Components Analysis using the Pytorch library. This function can run on either CPU or GPU devices. RH 2021
- Parameters:
X_in (Union[torch.Tensor, np.ndarray]) – The data to be decomposed. This should be a 2-D array, with columns representing features and rows representing samples. PCA is performed column-wise.
device (str) – The device to use for computation, e.g., ‘cuda’ or ‘cpu’. (Default is 'cpu')
mean_sub (bool) – If True, subtract the mean (‘center’) from the columns. (Default is True)
zscore (bool) – If True, z-score the columns. This is equivalent to conducting PCA on the correlation-matrix. (Default is False)
rank (int) – Maximum estimated rank of the decomposition. If None, then the rank is assumed to be X.shape[1]. (Default is None)
return_cpu (bool) –
If True, all outputs are forced to be on the ‘cpu’ device.
If False, and device is not ‘cpu’, then the returns will be on the provided device.
(Default is True)
return_numpy (bool) – If True, all outputs are forced to be of type numpy.ndarray. (Default is False)
- Returns:
- tuple containing:
- components (torch.Tensor or np.ndarray):
The components of the decomposition, represented as a 2-D array. Each column is a component vector and each row is a feature weight.
- scores (torch.Tensor or np.ndarray):
The scores of the decomposition, represented as a 2-D array. Each column is a score vector and each row is a sample weight.
- singVals (torch.Tensor or np.ndarray):
The singular values of the decomposition, represented as a 1-D array. Each element is a singular value.
- EVR (torch.Tensor or np.ndarray):
The explained variance ratio of each component, represented as a 1-D array. Each element is the explained variance ratio of the corresponding component.
- Return type:
(tuple)
Example
components, scores, singVals, EVR = torch_pca(X_in)
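The four outputs can be related to one another with a NumPy analogue based on a full SVD. This is only a sketch under that assumption: the helper name pca_svd_sketch is hypothetical, it ignores the device, rank, and return_* options, and the library may use a different (for example low-rank) decomposition.

```python
import numpy as np

def pca_svd_sketch(X, mean_sub=True, zscore=False):
    """PCA via SVD. Rows of X are samples, columns are features."""
    X = np.asarray(X, dtype=np.float64)
    if mean_sub:
        X = X - X.mean(axis=0, keepdims=True)   # center each column
    if zscore:
        X = X / X.std(axis=0, keepdims=True)    # scale each column to unit variance
    U, sing_vals, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt.T                 # each column is a component vector
    scores = X @ components           # each column is a score vector
    evr = sing_vals**2 / np.sum(sing_vals**2)   # explained variance ratio
    return components, scores, sing_vals, evr

rng = np.random.default_rng(0)
X_in = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
components, scores, sing_vals, evr = pca_svd_sketch(X_in)
```

In this sketch the components are orthonormal, the singular values are sorted in descending order, and the explained variance ratios sum to 1.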
- roicat.helpers.warp_matrix_to_remappingIdx(warp_matrix: ndarray | Tensor, x: int, y: int) ndarray | Tensor[source]
Convert a warp matrix (2x3 or 3x3) into remapping indices (2D). RH 2023
- Parameters:
warp_matrix (Union[np.ndarray, torch.Tensor]) – Warp matrix of shape (2, 3) for affine transformations, and (3, 3) for homography.
x (int) – Width of the desired remapping indices.
y (int) – Height of the desired remapping indices.
- Returns:
- remapIdx (Union[np.ndarray, torch.Tensor]):
Remapping indices of shape (x, y, 2) representing the x and y displacements in pixels.
- Return type:
(Union[np.ndarray, torch.Tensor])
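Example
A NumPy sketch of how a warp matrix can be applied to a pixel grid to obtain sampling coordinates. Several details are assumptions and not taken from the library: the helper name is hypothetical, the output here has shape (y, x, 2) with the last dimension ordered (x, y), and the matrix is applied directly to output pixel coordinates.

```python
import numpy as np

def warp_matrix_to_remapping_idx_sketch(warp_matrix, x, y):
    """Apply a (2, 3) affine or (3, 3) homography matrix to an x-by-y pixel grid."""
    warp_matrix = np.asarray(warp_matrix, dtype=np.float64)
    x_grid, y_grid = np.meshgrid(np.arange(x), np.arange(y), indexing="xy")
    # Homogeneous pixel coordinates with shape (3, y * x).
    coords = np.stack([x_grid.ravel(), y_grid.ravel(), np.ones(x * y)], axis=0)
    warped = warp_matrix @ coords
    if warp_matrix.shape == (3, 3):
        # Homography: divide by the projective coordinate.
        warped = warped[:2] / warped[2:3]
    return warped.T.reshape(y, x, 2)

# A pure translation: +2 pixels in x, -1 pixel in y.
affine = np.array([[1.0, 0.0, 2.0],
                   [0.0, 1.0, -1.0]])
ri_affine = warp_matrix_to_remapping_idx_sketch(affine, x=4, y=3)
ri_identity = warp_matrix_to_remapping_idx_sketch(np.eye(3), x=4, y=3)
```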
- roicat.helpers.yaml_load(filepath: str, mode: str = 'r', loader: object = <class 'yaml.loader.FullLoader'>) object[source]
Loads a YAML file. RH 2022
- Parameters:
filepath (str) – Path to the YAML file to load.
mode (str) – Mode to open the file in. (Default is 'r')
loader (object) –
The YAML loader to use.
yaml.FullLoader: Loads the full YAML language. Avoids arbitrary code execution. (Default for PyYAML 5.1+)
yaml.SafeLoader: Loads a subset of the YAML language, safely. This is recommended for loading untrusted input.
yaml.UnsafeLoader: The original Loader code that could be easily exploitable by untrusted data input.
yaml.BaseLoader: Only loads the most basic YAML. All scalars are loaded as strings.
(Default is yaml.FullLoader)
- Returns:
- loaded_obj (object):
The object loaded from the YAML file.
- Return type:
(object)
- roicat.helpers.yaml_save(obj: object, filepath: str, indent: int = 4, mode: str = 'w', mkdir: bool = False, allow_overwrite: bool = True) None[source]
Saves an object to a YAML file using the yaml.dump method. RH 2022
- Parameters:
obj (object) – The object to be saved.
filepath (str) – Path to save the object to.
indent (int) – The number of spaces for indentation in the saved YAML file. (Default is 4)
mode (str) –
Mode to open the file in.
'w': write (default)
'wb': write binary
'ab': append binary
'xb': exclusive write binary. Raises FileExistsError if file already exists.
(Default is 'w')
mkdir (bool) – If True, creates the parent directory if it does not exist. (Default is False)
allow_overwrite (bool) – If True, allows overwriting of existing files. (Default is True)
roicat.util module
- class roicat.util.Model_SWT(model: Module)[source]
Bases: Module
- forward(x)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class roicat.util.ROICaT_Module[source]
Bases: object
Super class for ROICaT modules. RH 2023
- _system_info
System information.
- Type:
object
- property serializable_dict: Dict[str, Any]
Returns a serializable dictionary that can be saved to disk. This method goes through all items in self.__dict__ and checks if they are serializable. If they are, add them to a dictionary to be returned.
- Returns:
- serializable_dict (Dict[str, Any]):
Dictionary containing serializable items.
- Return type:
(Dict[str, Any])
- class roicat.util.RichFile_ROICaT(path: str | Path | None = None, check: bool | None = True, safe_save: bool | None = True, backend: str | None = 'auto')[source]
Bases: RichFile
RichFile subclass with ROICaT-specific type registrations (numpy arrays, scipy sparse matrices, torch tensors, optuna studies, pandas DataFrames, etc.).
- Parameters:
path (Optional[Union[str, Path]]) – Path to save/load the richfile.
check (Optional[bool]) – Whether to perform validation checks.
safe_save (Optional[bool]) – Whether to use atomic save with temporary file.
backend (Optional[str]) –
Storage backend. One of:
'auto': auto-detect from existing path, or default to 'directory' for new saves.
'directory': classic richfile directory tree.
'sqlar': single-file SQLite archive (.sqlar).
'zip': single-file ZIP archive (.zip, stored/no compression).
'tar': single-file plain TAR archive (.tar).
- roicat.util.check_dataStructure__list_ofListOrArray_ofDtype(lolod: ~typing.List[~typing.List[int | float]] | ~typing.List[~numpy.ndarray], dtype: ~typing.Type = <class 'numpy.int64'>, fix: bool = True, verbose: bool = True) List[List[int | float]] | List[ndarray][source]
Verifies and optionally corrects the data structure of ‘lolod’ (list of list of dtype).
The structure should be a list of lists of dtypes or a list of numpy arrays of dtypes.
- Parameters:
lolod (Union[List[List[Union[int, float]]], List[np.ndarray]]) –
The data structure to check. It should be a list of lists of dtypes or a list of numpy arrays of dtypes.
dtype (Type) –
The expected dtype of the elements in ‘lolod’. (Default is np.int64)
fix (bool) –
If True, attempts to correct the data structure if it is not as expected. The corrections are as follows:
If ‘lolod’ is an array, it will be cast to [lolod]
If ‘lolod’ is a numpy object, it will be cast to [np.array(lolod, dtype=dtype)]
If ‘lolod’ is a list of lists of numbers (int or float), it will be cast to [np.array(lod, dtype=dtype) for lod in lolod]
If ‘lolod’ is a list of arrays of wrong dtype, it will be cast to [np.array(lod, dtype=dtype) for lod in lolod]
If False, raises an error if the structure is not as expected. (Default is True)
verbose (bool) –
If True, prints warnings when the structure is not as expected and is corrected. (Default is True)
- Returns:
- lolod (Union[List[List[Union[int, float]]], List[np.ndarray]]):
The verified or corrected data structure.
- Return type:
(Union[List[List[Union[int, float]]], List[np.ndarray]])
- roicat.util.discard_UCIDs_with_fewer_matches(ucids: List[List[int] | ndarray], n_sesh_thresh: int | str = 'all', verbose: bool = True) List[List[int] | ndarray][source]
Discards UCIDs that do not appear in at least n_sesh_thresh sessions. If n_sesh_thresh='all', then only UCIDs that appear in all sessions are kept.
- Parameters:
ucids (List[Union[List[int], np.ndarray]]) – List of lists of UCIDs for each session.
n_sesh_thresh (Union[int, str]) – Number of sessions that a UCID must appear in to be kept. If 'all', then only UCIDs that appear in all sessions are kept. (Default is 'all')
verbose (bool) – If True, print verbose output. (Default is True)
- Returns:
- ucids_out (List[Union[List[int], np.ndarray]]):
List of lists of UCIDs with UCIDs that do not appear in at least n_sesh_thresh sessions set to -1.
- Return type:
(List[Union[List[int], np.ndarray]])
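Example
The thresholding rule can be sketched in NumPy by counting the number of sessions in which each UCID appears. The helper name discard_ucids_sketch is hypothetical, and the sketch assumes that negative values mark unassigned ROIs.

```python
import numpy as np

def discard_ucids_sketch(ucids, n_sesh_thresh="all"):
    """Set UCIDs that appear in fewer than `n_sesh_thresh` sessions to -1."""
    n_sessions = len(ucids)
    thresh = n_sessions if n_sesh_thresh == "all" else int(n_sesh_thresh)
    ucids = [np.array(u, dtype=np.int64) for u in ucids]
    # Count the number of sessions in which each valid (non-negative) UCID appears.
    counts = {}
    for u in ucids:
        for uid in np.unique(u[u >= 0]):
            counts[int(uid)] = counts.get(int(uid), 0) + 1
    keep = np.array([uid for uid, c in counts.items() if c >= thresh], dtype=np.int64)
    return [np.where(np.isin(u, keep), u, -1) for u in ucids]

ucids = [[0, 1, 2], [1, 2, 3], [2, 0, -1]]
kept_all = discard_ucids_sketch(ucids, n_sesh_thresh="all")
kept_two = discard_ucids_sketch(ucids, n_sesh_thresh=2)
```

Here only UCID 2 appears in all three sessions, while UCIDs 0, 1, and 2 each appear in at least two.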
- roicat.util.get_default_parameters(pipeline='tracking', path_defaults=None)[source]
This function returns a dictionary of parameters that can be used to run different pipelines. RH 2023
- Parameters:
pipeline (str) –
The name of the pipeline to use. Options:
’tracking’: Tracking pipeline.
’classification_inference’: Classification inference pipeline (TODO).
’classification_training’: Classification training pipeline (TODO).
’model_training’: Model training pipeline (TODO).
path_defaults (str) – A path to a yaml file containing a parameters dictionary. The parameters from the file will be loaded as is. If None, the default parameters will be used.
- Returns:
- params (dict):
A dictionary containing the default parameters.
- Return type:
(dict)
- roicat.util.get_roicat_version() str[source]
Retrieves the version of the roicat package.
- Returns:
- version (str):
The version of the roicat package.
- Return type:
(str)
- roicat.util.invert_ucids(ucids: ndarray | list, max_ucid: int = None) ndarray | list[source]
Invert UCIDs to make ucids_inverse where ucids_inverse[i] = argwhere(ucids == i) for all i in range(len(ucids)). Elements with UCID=-1 are discarded. Missing ucid values are set to -1. RH 2025
- Parameters:
ucids (Union[np.ndarray, list]) – UCIDs to invert. Should be a 1D array or list of integers. Should be the ucids from a single session.
max_ucid (int, optional) – Maximum UCID value to use. If not provided, it will be inferred from the input UCIDs. If provided, it should be greater than or equal to the maximum UCID in the input. This is useful if you are combining multiple sessions with different UCID ranges. (Default is None)
- Returns:
Inverted UCIDs where ucids_inverse[i] = argwhere(ucids == i) for all i in range(len(ucids)).
- Return type:
(Union[np.ndarray, list])
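Example
A NumPy sketch of the inversion for a single session. The helper name invert_ucids_sketch is hypothetical, and the output length of max_ucid + 1 is an assumption made for this illustration; it also assumes each valid UCID appears at most once in the session.

```python
import numpy as np

def invert_ucids_sketch(ucids, max_ucid=None):
    """Return `inverse` such that inverse[ucid] is the ROI index carrying that UCID, else -1."""
    ucids = np.asarray(ucids, dtype=np.int64)
    if max_ucid is None:
        max_ucid = int(ucids.max())
    inverse = np.full(max_ucid + 1, -1, dtype=np.int64)
    valid = ucids >= 0                      # elements with UCID=-1 are discarded
    inverse[ucids[valid]] = np.nonzero(valid)[0]
    return inverse

inverse = invert_ucids_sketch([3, -1, 0, 1])
inverse_padded = invert_ucids_sketch([3, -1, 0, 1], max_ucid=5)
```

UCID 2 does not occur in the input, so its entry is -1.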
- roicat.util.labels_to_labelsBySession(labels, n_roi_bySession)[source]
Converts a list of labels to a list of lists of labels by session. RH 2024
- Parameters:
labels (list or np.ndarray) – List of labels.
n_roi_bySession (list or np.ndarray) – Number of ROIs by session.
- Returns:
List of lists of labels by session.
- Return type:
(list)
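Example
Splitting a concatenated label vector by session can be sketched with np.split and a cumulative sum of the per-session ROI counts. The helper name split_by_session_sketch is hypothetical and is not the library function.

```python
import numpy as np

def split_by_session_sketch(labels, n_roi_by_session):
    """Split a flat array into one array per session."""
    # Split points are the cumulative ROI counts, excluding the final total.
    boundaries = np.cumsum(n_roi_by_session)[:-1]
    return np.split(np.asarray(labels), boundaries)

labels = [0, 1, 2, 3, 4, 5, 6, 7, 8]
labels_by_session = split_by_session_sketch(labels, n_roi_by_session=[3, 4, 2])
```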
- roicat.util.make_session_bool(n_roi: ndarray) ndarray[source]
Generates a boolean array representing ROIs (Region Of Interest) per session from an array of ROI counts.
- Parameters:
n_roi (np.ndarray) – Array representing the number of ROIs per session. shape: (n_sessions,)
- Returns:
- session_bool (np.ndarray):
Boolean array of shape (n_roi_total, n_session) where each column represents a session and each row corresponds to an ROI.
- Return type:
(np.ndarray)
Example
n_roi = np.array([3, 4, 2])
session_bool = make_session_bool(n_roi)
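The layout of the returned array can be reproduced with a short NumPy sketch. The helper name make_session_bool_sketch is hypothetical and the construction shown is an assumption, not the library code.

```python
import numpy as np

def make_session_bool_sketch(n_roi):
    """Build a (n_roi_total, n_sessions) boolean array marking each ROI's session."""
    n_roi = np.asarray(n_roi, dtype=np.int64)
    # Session index of every ROI, e.g. [0, 0, 0, 1, 1, 1, 1, 2, 2].
    session_idx = np.repeat(np.arange(len(n_roi)), n_roi)
    # One-hot encode the session index along the columns.
    return session_idx[:, None] == np.arange(len(n_roi))[None, :]

session_bool_sketch = make_session_bool_sketch(np.array([3, 4, 2]))
```

Each row contains exactly one True value, and the column sums recover the ROI counts per session.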
- roicat.util.mask_UCIDs_by_label(ucids: List[List[int] | ndarray], labels: List[int] | ndarray) List[List[int] | ndarray][source]
Sets labels in the UCIDs to -1 if they are not present in the labels array.
RH 2024
- Parameters:
ucids (List[Union[List[int], np.ndarray]]) –
List of lists of UCIDs for each session.
Shape outer list: (n_sessions,)
Shape inner list: (n_roi_in_session,)
labels (Union[List[int], np.ndarray]) – Array of labels to keep. All other labels are set to -1. Shape: (n_labels,)
- Returns:
- ucids_out (List[Union[List[int], np.ndarray]]):
Masked list of lists of UCIDs. Elements that are not in the labels array are set to -1 in each session.
- Return type:
(List[Union[List[int], np.ndarray]])
Example
ucids = [[1, 2, 3], [2, -1, 4], [3, 0, 5]]
labels = [2, 3]
ucids_out = mask_UCIDs_by_label(ucids, labels)
# ucids_out = [[-1, 2, 3], [2, -1, -1], [3, -1, -1]]
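The masking rule itself can be sketched with np.isin. The helper name mask_ucids_by_label_sketch is hypothetical and is not the library implementation.

```python
import numpy as np

def mask_ucids_by_label_sketch(ucids, labels):
    """Set every UCID that is not in `labels` to -1, session by session."""
    out = []
    for u in ucids:
        u = np.array(u, dtype=np.int64)   # copy so the input is not modified
        u[~np.isin(u, labels)] = -1
        out.append(u)
    return out

ucids_masked = mask_ucids_by_label_sketch([[1, 2, 3], [2, -1, 4], [3, 0, 5]], labels=[2, 3])
```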
- roicat.util.mask_UCIDs_with_iscell(ucids: List[List[int] | ndarray], iscell: List[List[bool] | ndarray]) List[List[int] | ndarray][source]
Masks the UCIDs with the iscell array. If iscell is False, then the UCID is set to -1.
- Parameters:
ucids (List[Union[List[int], np.ndarray]]) –
List of lists of UCIDs for each session.
Shape outer list: (n_sessions,)
Shape inner list: (n_roi_in_session,)
iscell (List[Union[List[bool], np.ndarray]]) –
List of lists of boolean indicators for each UCID.
True means that ROI is a cell, False means that ROI is not a cell.
Shape outer list: (n_sessions,)
Shape inner list: (n_roi_in_session,)
- Returns:
- ucids_out (List[Union[List[int], np.ndarray]]):
Masked list of lists of UCIDs. Elements that are not cells are set to -1 in each session.
- Return type:
(List[Union[List[int], np.ndarray]])
- roicat.util.match_arrays_with_ucids(arrays: ndarray | List[ndarray], ucids: List[ndarray] | List[List[int]], return_indices: bool = False, squeeze: bool = False, force_sparse: bool = False, prog_bar: bool = False) List[ndarray | lil_array][source]
Matches the indices of the arrays using the UCIDs. Array indices with UCIDs corresponding to -1 are set to np.nan. This is useful for aligning Fluorescence and Spiking data across sessions using UCIDs.
- Parameters:
arrays (Union[np.ndarray, List[np.ndarray]]) – List of numpy arrays for each session. Matching is done along the first dimension.
ucids (Union[List[np.ndarray], List[List[int]]]) – List of lists of UCIDs for each session.
return_indices (bool) – If True, then the indices of the UCIDs will also be returned. The indices will be of dtype np.float32 because they may contain NaNs. (Default is False)
squeeze (bool) – If True, then UCIDs are squeezed to be contiguous integers. (Default is False)
force_sparse (bool) – If True, then the output will be a list of sparse matrices. (Default is False)
prog_bar (bool) – If True, then a progress bar will be displayed. (Default is False)
- Returns:
- arrays_out (List[Union[np.ndarray, scipy.sparse.lil_array]]):
List of arrays for each session. Array indices with UCIDs corresponding to -1 are set to np.nan. Each array will have shape: (n_ucids if squeeze==True OR max_ucid if squeeze==False, *array.shape[1:]). UCIDs will be used as the index of the first dimension.
- Return type:
(List[Union[np.ndarray, scipy.sparse.lil_array]])
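Example
A dense NumPy sketch of the matching step. The helper name match_arrays_with_ucids_sketch is hypothetical; it ignores the squeeze, return_indices, and force_sparse options, and the first-dimension length of (largest UCID + 1) is an assumption made for this illustration.

```python
import numpy as np

def match_arrays_with_ucids_sketch(arrays, ucids):
    """Re-index each session's array so that row i holds the data for UCID i."""
    ucids = [np.asarray(u, dtype=np.int64) for u in ucids]
    n_ucids = max(int(u.max()) for u in ucids) + 1
    out = []
    for arr, u in zip(arrays, ucids):
        arr = np.asarray(arr, dtype=np.float64)
        # Rows for UCIDs that are absent from this session stay NaN.
        matched = np.full((n_ucids,) + arr.shape[1:], np.nan)
        valid = u >= 0
        matched[u[valid]] = arr[valid]
        out.append(matched)
    return out

arrays = [np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]),
          np.array([[10.0, 10.0], [20.0, 20.0]])]
ucids = [[0, -1, 2], [2, 0]]
arrays_matched = match_arrays_with_ucids_sketch(arrays, ucids)
```

After matching, row 0 of every output refers to UCID 0 and row 2 to UCID 2, regardless of the original ROI order in each session.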
- roicat.util.match_arrays_with_ucids_inverse(arrays: ndarray | List[ndarray], ucids: List[ndarray] | List[List[int]], unsqueeze: bool = True) List[ndarray | lil_array][source]
Inverts the matching of the indices of the arrays using the UCIDs. Arrays should have indices that correspond to the UCID values. The return will be a list of arrays with indices that correspond to the original indices of the arrays / ucids. Essentially, this function undoes the matching done by match_arrays_with_ucids().
- Parameters:
arrays (Union[np.ndarray, List[np.ndarray]]) – List of numpy arrays for each session.
ucids (Union[List[np.ndarray], List[List[int]]]) – List of lists of UCIDs for each session.
unsqueeze (bool) – If True, then this algorithm assumes that the arrays were squeezed to remove unused UCIDs. This corresponds to and should match the argument squeeze used in match_arrays_with_ucids().
- Returns:
- arrays_out (List[Union[np.ndarray, scipy.sparse.lil_array]]):
List of arrays with indices that correspond to the original indices of the arrays / ucids.
- Return type:
(List[Union[np.ndarray, scipy.sparse.lil_array]])
- roicat.util.set_random_seed(seed=None, deterministic=False)[source]
Set random seed for reproducibility. RH 2023
- Parameters:
seed (int, optional) – Random seed. If None, a random seed (spanning int32 integer range) is generated.
deterministic (bool, optional) – Whether to make packages deterministic.
- Returns:
- seed (int):
Random seed.
- Return type:
(int)
- roicat.util.split_iby_session(x: Any, n_roi_per_session: ndarray | List[int])[source]
Splits an array or iterable into a list of arrays or iterables based on the number of ROIs per session.
- Parameters:
x (Any) – Array or iterable to split.
n_roi_per_session (Union[np.ndarray, List[int]]) – Number of ROIs per session.
- Returns:
List of arrays split by session.
- Return type:
(List[Any])
- roicat.util.squeeze_UCID_labels(ucids: List[List[int] | ndarray], return_array: bool = False) List[List[int] | ndarray][source]
Squeezes the UCID labels. Finds all the unique UCIDs across all sessions, then removes gaps in the UCID labels by mapping the unique UCIDs to new values. Output UCIDs are contiguous integers starting at 0, and elements with UCID=-1 are kept as -1.
- Parameters:
ucids (List[Union[List[int], np.ndarray]]) – List of lists of UCIDs for each session.
return_array (bool) – If True, then the output will be a numpy array. (Default is False)
- Returns:
- ucids_out (List[Union[List[int], np.ndarray]]):
List of lists of UCIDs remapped to contiguous integers starting at 0. Elements with UCID=-1 remain -1.
- Return type:
(List[Union[List[int], np.ndarray]])
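Example
The relabeling can be sketched in NumPy by ranking each valid UCID among the unique UCIDs pooled over all sessions. The helper name squeeze_ucid_labels_sketch is hypothetical and is not the library implementation.

```python
import numpy as np

def squeeze_ucid_labels_sketch(ucids):
    """Map valid UCIDs to contiguous integers starting at 0; keep -1 as -1."""
    ucids = [np.asarray(u, dtype=np.int64) for u in ucids]
    # Sorted unique UCIDs across all sessions, ignoring unassigned (-1) elements.
    unique_vals = np.unique(np.concatenate([u[u >= 0] for u in ucids]))
    out = []
    for u in ucids:
        squeezed = np.full_like(u, -1)
        valid = u >= 0
        # The position of each UCID in the sorted unique list is its new label.
        squeezed[valid] = np.searchsorted(unique_vals, u[valid])
        out.append(squeezed)
    return out

ucids_squeezed = squeeze_ucid_labels_sketch([[5, 9, -1], [9, 2, 5]])
```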
- roicat.util.system_info(verbose: bool = False) Dict[source]
Checks and prints the versions of various important software packages. RH 2022
- Parameters:
verbose (bool) – Whether to print the software versions. (Default is False)
- Returns:
- versions (Dict):
Dictionary containing the versions of various software packages.
- Return type:
(Dict)
roicat.visualization module
- roicat.visualization.compute_colored_FOV(spatialFootprints: List[csr_array], FOV_height: int, FOV_width: int, labels: List[ndarray] | ndarray | None = None, cmap: str | object = 'random', alphas_labels: ndarray | None = None, alphas_sf: List[ndarray] | ndarray | None = None, color_unlabeled: List[float] | None = None) List[ndarray][source]
Computes a set of images of fields of view (FOV) of spatial footprints, colored by the predicted class. RH 2023
- Parameters:
spatialFootprints (List[scipy.sparse.csr_array]) – Each element is all the spatial footprints for a given session.
FOV_height (int) – Height of the field of view.
FOV_width (int) – Width of the field of view.
labels (Optional[Union[List[np.ndarray], np.ndarray]]) – Label (will be a unique color) for each spatial footprint. Each element is all the labels for a given session. If -1, then the spatial footprint will be black / transparent. Can either be a list of integer labels for each session, or a single array with all the labels concatenated. Optional, if None, then all labels are set to random colors.
cmap (Union[str, object]) – Colormap to use for the labels. If ‘random’, then a random colormap is generated. Else, this is passed to matplotlib.colors.ListedColormap. (Default is ‘random’)
alphas_labels (Optional[np.ndarray]) – Alpha value for each label. shape: (n_labels,) which is the same as the number of unique labels len(np.unique(labels)). (Default is None)
alphas_sf (Optional[Union[List[np.ndarray], np.ndarray]]) – Alpha value for each spatial footprint. Can either be a list of alphas for each session, or a single array with all the alphas concatenated. (Default is None)
- Returns:
- rois_c_bySession_FOV (List[np.ndarray]):
List of images of fields of view (FOV) of spatial footprints, colored by the predicted class.
- Return type:
(List[np.ndarray])
- roicat.visualization.crop_cluster_ims(ims: ndarray) ndarray[source]
Crops the images to the smallest rectangle containing all non-zero pixels. RH 2022
- Parameters:
ims (np.ndarray) – Images to crop. (shape: (n, H, W))
- Returns:
- cropped_ims (np.ndarray):
Cropped images. (shape: (n, H’, W’))
- Return type:
(np.ndarray)
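A minimal NumPy sketch of this kind of cropping is shown below. The helper name `crop_to_nonzero` is hypothetical. It assumes a single bounding box shared by the whole stack, which matches the (n, H’, W’) output shape above, and its handling of an all-zero stack is an illustrative choice rather than the library's behavior.

```python
import numpy as np


def crop_to_nonzero(ims: np.ndarray) -> np.ndarray:
    """Hypothetical helper: crop a (n, H, W) stack to one shared bounding box."""
    # A pixel is kept if it is non-zero in any image of the stack.
    mask = (ims != 0).any(axis=0)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        # All-zero input: there is nothing to crop around, so return it unchanged.
        return ims
    return ims[:, rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


ims = np.zeros((2, 6, 8))
ims[0, 1:3, 2:4] = 1.0
ims[1, 2:5, 3:7] = 1.0
cropped = crop_to_nonzero(ims)
print(cropped.shape)
```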
- roicat.visualization.display_cropped_cluster_ims(spatialFootprints: List[ndarray], labels: ndarray, FOV_height: int = 512, FOV_width: int = 1024, n_labels_to_display: int = 100) None[source]
Displays the cropped cluster images. RH 2023
- Parameters:
spatialFootprints (List[np.ndarray]) – List of spatial footprints. Each footprint is a 2D array representing one region. (shape of each footprint: (H, W))
labels (np.ndarray) – Labels for each region of interest (ROI). (shape: (n,))
FOV_height (int) – Height of the field of view. (Default is 512)
FOV_width (int) – Width of the field of view. (Default is 1024)
n_labels_to_display (int) – Number of labels to display. (Default is 100)
- roicat.visualization.display_labeled_ROIs(images: ndarray, labels: ndarray | Dict[str, Any], max_images_per_label: int = 10, figsize: Tuple[int, int] = (10, 3), fontsize: int = 25, shuffle: bool = True) None[source]
Displays a grid of images, each row corresponding to a label, and each image is a randomly selected image from that label. RH 2023
- Parameters:
images (np.ndarray) – Array of images. Shape: (num_images, height, width) or (num_images, height, width, num_channels)
labels (Union[np.ndarray, Dict[str, Any]]) – If dict, it must contain keys ‘index’ and ‘label’, where ‘index’ is an array (or list) of indices corresponding to the indices of the images, and ‘label’ is an array (or list) of labels with the same length as ‘index’. If ndarray, it must be a 1D array of labels corresponding to each image.
max_images_per_label (int) – Maximum number of images to display per label. (Default is 10)
figsize (Tuple[int, int]) – Size of the figure. (Default is (10, 3))
fontsize (int) – Font size of the labels. (Default is 25)
shuffle (bool) – If True, the order of the images will be shuffled. (Default is True)
- roicat.visualization.display_toggle_image_stack(images: List[ndarray] | List[Tensor], image_size: Tuple[int, int] | int | float | None = None, clim: Tuple[float, float] | None = None, interpolation: str = 'nearest') None[source]
Displays images in a slider using Jupyter Notebook. RH 2023
- Parameters:
images (Union[List[np.ndarray], List[torch.Tensor]]) – List of images as numpy arrays or PyTorch tensors.
image_size (Optional[Tuple[int, int]]) – Tuple of (width, height) for resizing images. If None, images are not resized. If a single integer or float is provided, the images are resized by that factor. (Default is None)
clim (Optional[Tuple[float, float]]) – Tuple of (min, max) values for scaling pixel intensities. If None, min and max values are computed from the images and used as bounds for scaling. (Default is None)
interpolation (str) – String specifying the interpolation method for resizing. Options are ‘nearest’, ‘box’, ‘bilinear’, ‘hamming’, ‘bicubic’, ‘lanczos’. Uses the Image.Resampling.* methods from PIL. (Default is ‘nearest’)
- roicat.visualization.get_spread_out_points(data: ndarray, n_ims: int = 1000, dist_im_to_point: float = 0.3, border_frac: float = 0.05, device: str = 'cpu') ndarray[source]
Given a set of points, returns the indices of a subset of points that are spread out. Intended to be used to overlay images on a scatter plot of points. RH 2023
- Parameters:
data (np.ndarray) – Array containing the points to be spread out. Shape: (N, 2)
n_ims (int) – Number of indices to return corresponding to the number of images to be displayed. (Default is 1000)
dist_im_to_point (float) – Minimum distance between an image and its nearest point. Images with a minimum distance to a point greater than this value will be discarded. (Default is 0.3)
border_frac (float) – Fraction of the range of the data to add as a border around the points. (Default is 0.05)
device (str) – Device to use for torch operations. (Default is ‘cpu’)
- Returns:
- idx_images_overlay (np.ndarray):
Array containing the indices of the points to overlay images on. Shape: (n_ims,)
- Return type:
(np.ndarray)
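One simple way to obtain a spread-out subset is to lay a regular grid over the data and keep the nearest point to each grid node. The sketch below illustrates that idea only; it is not ROICaT's implementation. The helper name `spread_out_indices` and the parameter name `max_dist` are hypothetical, and unlike the function above it may return fewer than `n_ims` indices.

```python
import numpy as np


def spread_out_indices(data: np.ndarray, n_ims: int = 16, max_dist: float = 0.3,
                       border_frac: float = 0.05) -> np.ndarray:
    """Hypothetical helper: pick one data point near each node of a regular grid."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    pad = (hi - lo) * border_frac
    n_side = int(np.ceil(np.sqrt(n_ims)))
    gx = np.linspace(lo[0] - pad[0], hi[0] + pad[0], n_side)
    gy = np.linspace(lo[1] - pad[1], hi[1] + pad[1], n_side)
    grid = np.stack(np.meshgrid(gx, gy), axis=-1).reshape(-1, 2)
    # Pairwise distances between grid nodes and data points: (n_nodes, N).
    d = np.linalg.norm(grid[:, None, :] - data[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    # Discard grid nodes whose nearest point is farther away than max_dist.
    keep = d[np.arange(len(grid)), nearest] <= max_dist
    # Several nodes can share a nearest point, so de-duplicate the result.
    return np.unique(nearest[keep])


rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(500, 2))
idx = spread_out_indices(points, n_ims=16)
print(idx.shape)
```

The returned indices can then be used to choose which images to overlay on a scatter plot of `points`.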
- roicat.visualization.plot_confusion_matrix(confusion_matrix, class_names: List[str] = None, figsize: Tuple[int, int] = (4, 4), n_decimals: int = 2)[source]
Plots a confusion matrix using seaborn. RH 2023
- Parameters:
confusion_matrix (np.ndarray) – Array containing the confusion matrix. Shape: (num_classes, num_classes)
class_names (list) – List of class names. Length: num_classes. If None, the class names will be the indices of the confusion matrix.
figsize (Tuple[int, int]) – Size of the figure.
n_decimals (int) – Number of decimals to round the confusion matrix to.
- roicat.visualization.select_region_scatterPlot(data: ndarray, images_overlay: ndarray | None = None, idx_images_overlay: ndarray | None = None, size_images_overlay: float | None = None, frac_overlap_allowed: float = 0.5, image_overlay_raster_size: Tuple[int, int] | None = None, path: str | None = None, figsize: Tuple[int, int] = (300, 300), alpha_points: float = 0.5, size_points: float = 1, color_points: str | List[str] = 'k') Tuple[Callable, object, str][source]
Selects a region of a scatter plot and returns the indices of the points in that region.
- Parameters:
data (np.ndarray) – Input data to create a scatterplot. The shape must be (n_samples, 2).
images_overlay (np.ndarray, optional) – A 3D array of grayscale images or a 4D array of RGB images, where the first dimension is the number of images. (Default is None)
idx_images_overlay (np.ndarray, optional) – A vector of data indices corresponding to each image in images_overlay. The shape must be (n_images,). (Default is None)
size_images_overlay (float, optional) – Size of each overlay image. The unit is relative to each axis. This value scales the resolution of the overlay raster. (Default is None)
frac_overlap_allowed (float, optional) – Fraction of overlap allowed between the selected region and the overlay images. This is only used when size_images_overlay is None. (Default is 0.5)
image_overlay_raster_size (Tuple[int, int], optional) – Size of the rasterized image overlay in pixels. If None, the size will be set to figsize. (Default is None)
path (str, optional) – Temporary file path to save the selected indices. (Default is None)
figsize (Tuple[int, int], optional) – Size of the figure in pixels. (Default is (300, 300))
alpha_points (float, optional) – Alpha value of the scatter plot points. (Default is 0.5)
size_points (float, optional) – Size of the scatter plot points. (Default is 1)
color_points (Union[str, List[str]], optional) – Color of the scatter plot points. Single color only.
- Returns:
- tuple containing:
- fn_get_indices (Callable):
Function that returns the indices of the selected points.
- layout (object):
Holoviews layout object.
- path_tempfile (str):
Path to the temporary file that saves the selected indices.
- Return type:
(Tuple[Callable, object, str])
Example
fn_get_indices, layout, path_tempfile = select_region_scatterPlot(data)