API Documentation

Tracking module

roicat.tracking.alignment module

class roicat.tracking.alignment.Aligner(verbose=True)[source]

Bases: ROICaT_Module

A class for registering ROIs to a template FOV. Currently relies on available OpenCV methods for rigid and non-rigid registration. RH 2023

Parameters:

verbose (bool) – Whether to print progress updates. (Default is True)

classmethod augment_FOV_images(ims: List[ndarray], spatialFootprints: List[csr_matrix] | None = None, roi_FOV_mixing_factor: float = 0.5, use_CLAHE: bool = True, CLAHE_grid_size: int = 1, CLAHE_clipLimit: int = 1, CLAHE_normalize: bool = True) None[source]

Augments the FOV images by mixing the FOV with the ROI images and optionally applying CLAHE. RH 2023

Parameters:
  • ims (List[np.ndarray]) – A list of FOV images.

  • spatialFootprints (List[scipy.sparse.csr_matrix], optional) – A list of spatial footprints for each ROI. If None, then no mixing will be performed. (Default is None)

  • roi_FOV_mixing_factor (float) – The factor by which to mix the ROI images into the FOV images. If 0, then no mixing will be performed. (Default is 0.5)

  • use_CLAHE (bool) – Whether to apply CLAHE to the images. (Default is True)

  • CLAHE_grid_size (int) – The grid size for CLAHE. See alignment.clahe for more details. (Default is 1)

  • CLAHE_clipLimit (int) – The clip limit for CLAHE. See alignment.clahe for more details. (Default is 1)

  • CLAHE_normalize (bool) – Whether to normalize the CLAHE output. See alignment.clahe for more details. (Default is True)

Returns:

FOV_images_augmented (List[np.ndarray]):

The augmented FOV images.

Return type:

(List[np.ndarray])

fit_geometric(template: int | ndarray, ims_moving: List[ndarray], template_method: str = 'sequential', mode_transform: str = 'affine', gaussFiltSize: int = 11, mask_borders: Tuple[int, int, int, int] = (0, 0, 0, 0), n_iter: int = 1000, termination_eps: float = 1e-09, auto_fix_gaussFilt_step: bool | int = 10) ndarray[source]

Performs geometric registration of ims_moving to a template, using cv2.findTransformECC. RH 2023

Parameters:
  • template (Union[int, np.ndarray]) – Depends on the value of ‘template_method’. If ‘template_method’ == ‘image’, this should be a 2D np.ndarray image, an integer index of the image to use as the template, or a float between 0 and 1 representing the fractional index of the image to use as the template. If ‘template_method’ == ‘sequential’, then template is the integer index or fractional index of the image to use as the template.

  • ims_moving (List[np.ndarray]) – List of images to be aligned.

  • template_method (str) –

    Method to use for template selection.

    • ’image’: use the image specified by ‘template’.

    • ’sequential’: register each image to the previous or next image

    (Default is ‘sequential’)

  • mode_transform (str) – Mode of geometric transformation. Can be ‘translation’, ‘euclidean’, ‘affine’, or ‘homography’. See cv2.findTransformECC for more details. (Default is ‘affine’)

  • gaussFiltSize (int) – Size of the Gaussian filter. (Default is 11)

  • mask_borders (Tuple[int, int, int, int]) – Border mask for the image. Format is (top, bottom, left, right). (Default is (0, 0, 0, 0))

  • n_iter (int) – Number of iterations for cv2.findTransformECC. (Default is 1000)

  • termination_eps (float) – Termination criteria for cv2.findTransformECC. (Default is 1e-9)

  • auto_fix_gaussFilt_step (Union[bool, int]) – Automatically fixes convergence issues by increasing the gaussFiltSize. If False, no automatic fixing is performed. If True, the gaussFiltSize is increased by 2 until convergence. If int, the gaussFiltSize is increased by this amount until convergence. (Default is 10)

Returns:

remapIdx_geo (np.ndarray):

An array of shape (N, H, W, 2) representing the remap field for N images.

Return type:

(np.ndarray)
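
To make the remap field concrete: one way to read it is that each output pixel stores the coordinate in the moving image to sample from. The sketch below builds an identity field of shape (H, W, 2) and a pure-translation field, then applies them with nearest-neighbour sampling in NumPy. The (x, y) channel order (as in cv2.remap) and both helper names are assumptions for illustration, not part of the ROICaT API.

```python
import numpy as np

def identity_remap_field(H, W):
    """Hypothetical helper: remap field that leaves an image unchanged."""
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))  # each of shape (H, W)
    # Assumed channel order: [..., 0] is x (column), [..., 1] is y (row).
    return np.stack([xs, ys], axis=-1).astype(np.float32)

def apply_remap_nearest(im, remap, fill_val=0):
    """Sample im at the remap coordinates using nearest-neighbour lookup."""
    H, W = im.shape
    x = np.rint(remap[..., 0]).astype(int)
    y = np.rint(remap[..., 1]).astype(int)
    valid = (x >= 0) & (x < W) & (y >= 0) & (y < H)
    out = np.full((H, W), fill_val, dtype=im.dtype)
    out[valid] = im[y[valid], x[valid]]
    return out

im = np.arange(12).reshape(3, 4)
field = identity_remap_field(3, 4)
same = apply_remap_nearest(im, field)

# Sampling from x + 1 moves the image content one pixel to the left.
field_shift = field.copy()
field_shift[..., 0] += 1
shifted = apply_remap_nearest(im, field_shift)
```

A stack of N such fields gives the (N, H, W, 2) array described in the Returns section.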

fit_nonrigid(template: int | ndarray, ims_moving: List[ndarray], remappingIdx_init: ndarray | None = None, template_method: str = 'sequential', mode_transform: str = 'createOptFlow_DeepFlow', kwargs_mode_transform: dict | None = None) ndarray[source]

Performs non-rigid registration of ims_moving to a template, using an OpenCV optical flow method (see mode_transform). RH 2023

Parameters:
  • template (Union[int, np.ndarray]) –

    • If template_method == 'image': Then template is either an image or an integer index or a float fractional index of the image to use as the template.

    • If template_method == 'sequential': then template is the integer index of the image to use as the template.

  • ims_moving (List[np.ndarray]) – A list of images to be aligned.

  • remappingIdx_init (Optional[np.ndarray]) – An array of shape (N, H, W, 2) representing any initial remap field to apply to the images in ims_moving. The output of this method will be added/composed with remappingIdx_init. (Default is None)

  • template_method (str) –

    The method to use for template selection. Either:

    • 'image': use the image specified by ‘template’.

    • 'sequential': register each image to the previous or next image (will be next for images before the template and previous for images after the template)

    (Default is ‘sequential’)

  • mode_transform (str) – The type of transformation to use for registration. Either ‘createOptFlow_DeepFlow’ or ‘calcOpticalFlowFarneback’. (Default is ‘createOptFlow_DeepFlow’)

  • kwargs_mode_transform (Optional[dict]) – Keyword arguments for the transform chosen. See cv2 docs for chosen transform. (Default is None)

Returns:

remapIdx_nonrigid (np.ndarray):

An array of shape (N, H, W, 2) representing the remap field for N images.

Return type:

(np.ndarray)
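
The 'sequential' template method registers images before the template to the next image, and images after the template to the previous image. A minimal sketch of that pairing is shown below. The helper name is hypothetical, and treating the template as its own target is an assumption.

```python
def sequential_targets(n_images, idx_template):
    """Hypothetical helper: index each image is registered to in 'sequential' mode."""
    targets = []
    for i in range(n_images):
        if i < idx_template:
            targets.append(i + 1)   # images before the template use the next image
        elif i > idx_template:
            targets.append(i - 1)   # images after the template use the previous image
        else:
            targets.append(i)       # the template itself (assumed: no registration)
    return targets

targets = sequential_targets(n_images=5, idx_template=2)
print(targets)  # [1, 2, 2, 2, 3]
```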

get_ROIsAligned_maxIntensityProjection(H: int | None = None, W: int | None = None) List[ndarray][source]

Returns the max intensity projection of the ROIs aligned to the template FOV.

Parameters:
  • H (Optional[int]) – The height of the output projection. If not provided and if not already set, an error will be thrown. (Default is None)

  • W (Optional[int]) – The width of the output projection. If not provided and if not already set, an error will be thrown. (Default is None)

Returns:

max_projection (List[np.ndarray]):

The max intensity projections of the ROIs.

Return type:

(List[np.ndarray])
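
A max intensity projection of this kind can be sketched with NumPy, assuming the footprints of one session are stored as a (n_roi, H*W) array. A dense array stands in for the sparse matrix here, and the helper name is hypothetical.

```python
import numpy as np

def max_intensity_projection(footprints_flat, H, W):
    """Hypothetical helper: collapse (n_roi, H*W) footprints into one (H, W) image."""
    footprints_flat = np.asarray(footprints_flat)
    if footprints_flat.shape[1] != H * W:
        raise ValueError("H * W must equal the flattened footprint length")
    # Take the maximum over ROIs at every pixel, then restore the 2D layout.
    return footprints_flat.max(axis=0).reshape(H, W)

rois = np.zeros((2, 6))
rois[0, 0] = 0.5
rois[1, 0] = 0.9
rois[1, 5] = 0.2
mip = max_intensity_projection(rois, H=2, W=3)
```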

get_flowFields(remappingIdx: ndarray | None = None) List[ndarray][source]

Returns the flow fields based on the remapping indices.

Parameters:

remappingIdx (Optional[np.ndarray]) – The indices for remapping the flow fields. If None, geometric or nonrigid registration must be performed first. (Default is None)

Returns:

flow_fields (List[np.ndarray]):

The transformed flow fields.

Return type:

(List[np.ndarray])
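
One common way to relate a remap field to a flow field is to subtract the identity grid, leaving the per-pixel displacement. The sketch below assumes that relationship and an (x, y) channel order. Whether get_flowFields uses exactly this convention is not confirmed here, and the helper name is hypothetical.

```python
import numpy as np

def remap_to_flow(remap):
    """Hypothetical helper: displacement of each pixel relative to the identity grid."""
    H, W = remap.shape[:2]
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))
    grid = np.stack([xs, ys], axis=-1).astype(remap.dtype)
    return remap - grid

H, W = 3, 4
xs, ys = np.meshgrid(np.arange(W), np.arange(H))
remap = np.stack([xs + 2.0, ys - 1.0], axis=-1)  # sample 2 px right, 1 px up
flow = remap_to_flow(remap)
```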

transform_ROIs(ROIs: ndarray, remappingIdx: ndarray | None = None, normalize: bool = True) List[ndarray][source]

Transforms ROIs based on remapping indices and normalization settings. RH 2023

Parameters:
  • ROIs (np.ndarray) – The regions of interest to transform. (shape: (H, W))

  • remappingIdx (Optional[np.ndarray]) – The indices for remapping the ROIs. If None, geometric or nonrigid registration must be performed first. (Default is None)

  • normalize (bool) – If True, data is normalized. (Default is True)

Returns:

ROIs_aligned (List[np.ndarray]):

Transformed ROIs.

Return type:

(List[np.ndarray])

transform_images(ims_moving: List[ndarray], remappingIdx: List[ndarray]) List[ndarray][source]

Transforms images using the specified remapping index.

Parameters:
  • ims_moving (List[np.ndarray]) – The images to be transformed. List of arrays with shape: (H, W) or (H, W, C)

  • remappingIdx (List[np.ndarray]) – The remapping index to apply to the images.

Returns:

ims_registered (List[np.ndarray]):

The transformed images. (N, H, W)

Return type:

(List[np.ndarray])

transform_images_geometric(ims_moving: List[ndarray], remappingIdx: ndarray | None = None) ndarray[source]

Transforms images based on geometric registration warps.

Parameters:
  • ims_moving (np.ndarray) – The images to be transformed. (N, H, W)

  • remappingIdx (Optional[np.ndarray]) – An array specifying how to remap the images. If None, the remapping index from geometric registration is used. (Default is None)

Returns:

ims_registered_geo (np.ndarray):

The images after applying the geometric registration warps. (N, H, W)

Return type:

(np.ndarray)

transform_images_nonrigid(ims_moving: List[ndarray], remappingIdx: ndarray | None = None) ndarray[source]

Transforms images based on non-rigid registration warps.

Parameters:
  • ims_moving (np.ndarray) – The images to be transformed. (N, H, W)

  • remappingIdx (Optional[np.ndarray]) – An array specifying how to remap the images. If None, the remapping index from non-rigid registration is used. (Default is None)

Returns:

ims_registered_nonrigid (np.ndarray):

The images after applying the non-rigid registration warps. (N, H, W)

Return type:

(np.ndarray)

class roicat.tracking.alignment.PhaseCorrelationRegistration[source]

Bases: object

Performs rigid transformation using phase correlation. RH 2022

mask

Spectral mask created using set_spectral_mask().

Type:

np.ndarray

ims_registered

Registered images, set in register().

Type:

np.ndarray

shifts

Pixel shift values (y, x), set in register().

Type:

np.ndarray

ccs

Phase correlation coefficient images, set in register().

Type:

np.ndarray

ims_template_filt

Template images filtered by the spectral mask, set in register().

Type:

np.ndarray

ims_moving_filt

Moving images filtered by the spectral mask, set in register().

Type:

np.ndarray

register(template: ndarray | int, ims_moving: ndarray, template_method: str = 'sequential') Tuple[ndarray, ndarray][source]

Registers a set of images using phase correlation. RH 2022

Parameters:
  • template (Union[np.ndarray, int]) –

    Template image.

    • If template_method is ‘image’, template should be a single image.

    • If template_method is ‘sequential’, template should be an integer corresponding to the index of the image to set as ‘zero’ offset.

  • ims_moving (np.ndarray) – Images to align to the template. (shape: (n, H, W))

  • template_method (str) –

    Method used to register the images.

    • ’image’: template should be a single image.

    • ’sequential’: template should be an integer corresponding to the index of the image to set as ‘zero’ offset.

    (Default is ‘sequential’)

Returns:

tuple containing:
ims_registered (np.ndarray):

Registered images. (shape: (n, H, W))

shifts (np.ndarray):

Pixel shift values (y, x). (shape: (n, 2))

Return type:

(Tuple[np.ndarray, np.ndarray])

set_spectral_mask(freq_highPass: float = 0.01, freq_lowPass: float = 0.3, im_shape: Tuple[int, int] = (512, 512)) None[source]

Sets the spectral mask for the phase correlation.

Parameters:
  • freq_highPass (float) – High pass frequency. (Default is 0.01)

  • freq_lowPass (float) – Low pass frequency. (Default is 0.3)

  • im_shape (Tuple[int, int]) – Shape of the image. (Default is (512, 512))

roicat.tracking.alignment.clahe(im: ndarray, grid_size: int = 50, clipLimit: int = 0, normalize: bool = True) ndarray[source]

Perform Contrast Limited Adaptive Histogram Equalization (CLAHE) on an image.

Parameters:
  • im (np.ndarray) – Input image.

  • grid_size (int) – Size of the grid. See cv2.createCLAHE for more info. (Default is 50)

  • clipLimit (int) – Clip limit. See cv2.createCLAHE for more info. (Default is 0)

  • normalize (bool) – Whether to normalize the output image. (Default is True)

Returns:

im_out (np.ndarray):

Output image after applying CLAHE.

Return type:

(np.ndarray)

roicat.tracking.alignment.convert_phaseCorrelationImage_to_shifts(cc_im: ndarray) Tuple[int, int][source]

Convert phase correlation image to pixel shift values. RH 2022

Parameters:

cc_im (np.ndarray) – Phase correlation image. The middle of the image corresponds to a zero-shift.

Returns:

tuple containing:
shift_y (int):

The pixel shift in the y-axis.

shift_x (int):

The pixel shift in the x-axis.

Return type:

(Tuple[int, int])
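
Because the middle of the correlation image corresponds to a zero shift, the conversion amounts to locating the peak and measuring its offset from the centre. The sketch below assumes the centre is at (H // 2, W // 2) and reports peak minus centre. The sign convention and rounding used by ROICaT may differ, and the helper name is hypothetical.

```python
import numpy as np

def cc_image_to_shifts(cc_im):
    """Hypothetical helper: offset of the correlation peak from the image centre."""
    H, W = cc_im.shape
    peak_y, peak_x = np.unravel_index(np.argmax(cc_im), cc_im.shape)
    return int(peak_y) - H // 2, int(peak_x) - W // 2

cc_im = np.zeros((9, 9))
cc_im[6, 3] = 1.0  # peak 2 rows below and 1 column left of the centre (4, 4)
shift_y, shift_x = cc_image_to_shifts(cc_im)
```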

roicat.tracking.alignment.make_spectral_mask(freq_highPass: float = 0.01, freq_lowPass: float = 0.3, im_shape: Tuple[int, int] = (512, 512)) ndarray[source]

Generates a spectral mask for an image with given high pass and low pass frequencies.

Parameters:
  • freq_highPass (float) – High pass frequency to use. (Default is 0.01)

  • freq_lowPass (float) – Low pass frequency to use. (Default is 0.3)

  • im_shape (Tuple[int, int]) – Shape of the input image as a tuple (height, width). (Default is (512, 512))

Returns:

mask_out (np.ndarray):

The generated spectral mask.

Return type:

(np.ndarray)
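
As a rough illustration of a spectral mask, the sketch below builds a hard-edged band-pass mask over radial spatial frequency. It assumes frequencies are expressed in cycles per pixel and that the mask follows the unshifted FFT layout. The filter shape used by make_spectral_mask may well be smoother, so treat this only as a conceptual stand-in with a hypothetical name.

```python
import numpy as np

def ideal_bandpass_mask(freq_highPass=0.01, freq_lowPass=0.3, im_shape=(512, 512)):
    """Hypothetical hard-edged band-pass mask in unshifted FFT layout."""
    fy = np.fft.fftfreq(im_shape[0])  # cycles per pixel, range [-0.5, 0.5)
    fx = np.fft.fftfreq(im_shape[1])
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    # Keep frequencies between the high-pass and low-pass cutoffs.
    return ((radius >= freq_highPass) & (radius <= freq_lowPass)).astype(np.float64)

mask = ideal_bandpass_mask(im_shape=(64, 64))
```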

roicat.tracking.alignment.phase_correlation(im_template: ndarray, im_moving: ndarray, mask_fft: ndarray | None = None, return_filtered_images: bool = False) ndarray | Tuple[ndarray, ndarray, ndarray][source]

Perform phase correlation on two images. RH 2022

Parameters:
  • im_template (np.ndarray) – The template image.

  • im_moving (np.ndarray) – The moving image.

  • mask_fft (Optional[np.ndarray]) – Mask for the FFT. If None, no mask is used. (Default is None)

  • return_filtered_images (bool) – If set to True, the function will return filtered images in addition to the phase correlation coefficient. (Default is False)

Returns:

tuple containing:
cc (np.ndarray):

The phase correlation coefficient.

fft_template (np.ndarray):

The filtered template image. Only returned if return_filtered_images is True.

fft_moving (np.ndarray):

The filtered moving image. Only returned if return_filtered_images is True.

Return type:

(Tuple[np.ndarray, np.ndarray, np.ndarray])
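
For readers unfamiliar with the technique, here is a textbook phase correlation in NumPy. It is a sketch of the general method, not ROICaT's implementation: the function name, the small constant added for numerical stability, and the assumption that mask_fft is laid out like the unshifted FFT are all illustrative choices.

```python
import numpy as np

def phase_correlation_sketch(im_template, im_moving, mask_fft=None):
    """Minimal textbook phase correlation; zero shift maps to the image centre."""
    fft_t = np.fft.fft2(im_template)
    fft_m = np.fft.fft2(im_moving)
    if mask_fft is not None:
        fft_t = fft_t * mask_fft
        fft_m = fft_m * mask_fft
    R = fft_t * np.conj(fft_m)          # cross-power spectrum
    R /= np.abs(R) + 1e-12              # discard magnitude, keep phase only
    return np.fft.fftshift(np.real(np.fft.ifft2(R)))

rng = np.random.default_rng(0)
im = rng.random((32, 32))
im_moved = np.roll(im, shift=(3, -5), axis=(0, 1))

cc = phase_correlation_sketch(im, im_moved)
peak = np.unravel_index(np.argmax(cc), cc.shape)

# The peak offset from the centre is the shift that re-aligns the moving image.
shift_y, shift_x = int(peak[0]) - 16, int(peak[1]) - 16
realigned = np.roll(im_moved, shift=(shift_y, shift_x), axis=(0, 1))
```

In this example the moving image was rolled by (3, -5), so the recovered corrective shift is (-3, 5).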

roicat.tracking.alignment.shift_along_axis(X: ndarray, shift: int, fill_val: int = 0, axis: int = 0) ndarray[source]

Shifts the elements of an array along a specified axis. RH 2023

Parameters:
  • X (np.ndarray) – Input array to be shifted.

  • shift (int) – The number of places to shift. If the value is positive, the shift is to the right. If the value is negative, the shift is to the left.

  • fill_val (int) – The value to fill in the emptied places after the shift. (Default is 0)

  • axis (int) – The axis along which to apply the shift. (Default is 0)

Returns:

shifted_array (np.ndarray):

The array after shifting elements along the specified axis.

Return type:

(np.ndarray)
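
The behaviour described above can be sketched in NumPy as follows. This is an illustrative re-implementation under a hypothetical name, not the library's source. It assumes that shifts larger than the axis length leave only fill values.

```python
import numpy as np

def shift_along_axis_sketch(X, shift, fill_val=0, axis=0):
    """Shift X along axis, filling vacated positions with fill_val."""
    X = np.asarray(X)
    n = X.shape[axis]
    if shift == 0:
        return X.copy()
    out = np.full_like(X, fill_val)
    if abs(shift) >= n:
        return out  # everything is shifted out of range
    src = [slice(None)] * X.ndim
    dst = [slice(None)] * X.ndim
    if shift > 0:
        src[axis], dst[axis] = slice(0, n - shift), slice(shift, n)
    else:
        src[axis], dst[axis] = slice(-shift, n), slice(0, n + shift)
    out[tuple(dst)] = X[tuple(src)]
    return out

X = np.array([[1, 2, 3], [4, 5, 6]])
right = shift_along_axis_sketch(X, 1, axis=1)   # [[0, 1, 2], [0, 4, 5]]
up = shift_along_axis_sketch(X, -1, axis=0)     # [[4, 5, 6], [0, 0, 0]]
```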

roicat.tracking.blurring module

class roicat.tracking.blurring.ROI_Blurrer(frame_shape: Tuple[int, int] = (512, 512), kernel_halfWidth: int = 2, plot_kernel: bool = False, verbose: bool = True)[source]

Bases: ROICaT_Module

Blurs the Region of Interest (ROI). RH 2022

Parameters:
  • frame_shape (Tuple[int, int]) – The shape of the frame/Field Of View (FOV). Product of frame_shape[0] and frame_shape[1] must equal the length of a single flattened/sparse spatialFootprint. (Default is (512, 512))

  • kernel_halfWidth (int) – The half-width of the cosine kernel to use for convolutional blurring. (Default is 2)

  • plot_kernel (bool) – Whether to plot an image of the kernel. (Default is False)

  • verbose (bool) – Whether to print the convolutional blurring operation progress. (Default is True)

frame_shape

The shape of the frame/Field Of View (FOV). Product of frame_shape[0] and frame_shape[1] must equal the length of a single flattened/sparse spatialFootprint.

Type:

Tuple[int, int]

kernel_halfWidth

The half-width of the cosine kernel to use for convolutional blurring.

Type:

int

plot_kernel

Whether to plot an image of the kernel.

Type:

bool

verbose

Whether to print the convolutional blurring operation progress.

Type:

bool

blur_ROIs(spatialFootprints: List[object]) List[object][source]

Blurs the Region of Interest (ROI).

Parameters:

spatialFootprints (List[object]) – A list of sparse matrices corresponding to spatial footprints from each session.

Returns:

ROIs_blurred (List[object]):

A list of blurred ROI spatial footprints.

Return type:

(List[object])

get_ROIsBlurred_maxIntensityProjection() List[object][source]

Calculates the maximum intensity projection of the ROIs.

Returns:

ims (List[object]):

The maximum intensity projection of the ROIs.

Return type:

(List[object])

roicat.tracking.clustering module

class roicat.tracking.clustering.Clusterer(s_sf: csr_matrix | None = None, s_NN_z: csr_matrix | None = None, s_SWT_z: csr_matrix | None = None, s_sesh: csr_matrix | None = None, n_bins: int | None = None, smoothing_window_bins: int | None = None, verbose: bool = True)[source]

Bases: ROICaT_Module

Class for clustering algorithms. Performs:
  • Optimal mixing and pruning of similarity matrices:
    • self.find_optimal_parameters_for_pruning()

    • self.make_pruned_similarity_graphs()

  • Clustering:
    • self.fit(): Which uses a modified HDBSCAN

    • self.fit_sequentialHungarian(): Which uses a method similar to CaImAn’s clustering method.

  • Quality control:
    • self.compute_quality_metrics()

Initialization ingests and stores similarity matrices. RH 2023

Parameters:
  • s_sf (Optional[scipy.sparse.csr_matrix]) – The similarity matrix for spatial footprints. Shape: (n_rois, n_rois). Expecting input to be the Manhattan distance of spatial footprints, normalized between 0 and 1.

  • s_NN_z (Optional[scipy.sparse.csr_matrix]) – The z-scored similarity matrix for neural network output similarities. Shape: (n_rois, n_rois). Expecting input to be the cosine similarity matrix, z-scored row-wise.

  • s_SWT_z (Optional[scipy.sparse.csr_matrix]) – The z-scored similarity matrix for scattering transform output similarities. Shape: (n_rois, n_rois). Expecting input to be the cosine similarity matrix, z-scored row-wise.

  • s_sesh (Optional[scipy.sparse.csr_matrix]) – The similarity matrix for session similarity. Shape: (n_rois, n_rois). Boolean, with 1s where the two ROIs are from different sessions.

  • n_bins (Optional[int]) – Number of bins to use for the pairwise similarity distribution. If using automatic parameter finding, a large number of bins makes finding the separation point noisier and only slightly more accurate. If None, a heuristic is used to estimate the value based on the number of ROIs. (Default is None)

  • smoothing_window_bins (Optional[int]) – Number of bins to use when smoothing the distribution. Using a small number of bins makes finding the separation point more noisy, and only slightly more accurate. Aim for 5-10% of the number of bins. If None, a heuristic is used. (Default is None)

  • verbose (bool) – Specifies whether to print out information about the clustering process. (Default is True)

s_sf

The similarity matrix for spatial footprints. It is symmetric and has a shape of (n_rois, n_rois).

Type:

scipy.sparse.csr_matrix

s_NN_z

The z-scored similarity matrix for neural network output similarities. It is non-symmetric and has a shape of (n_rois, n_rois).

Type:

scipy.sparse.csr_matrix

s_SWT_z

The z-scored similarity matrix for scattering transform output similarities. It is non-symmetric and has a shape of (n_rois, n_rois).

Type:

scipy.sparse.csr_matrix

s_sesh

The similarity matrix for session similarity. It is symmetric and has a shape of (n_rois, n_rois).

Type:

scipy.sparse.csr_matrix

s_sesh_inv

The inverse of the session similarity matrix. It is symmetric and has a shape of (n_rois, n_rois).

Type:

scipy.sparse.csr_matrix

n_bins

Number of bins to use for the pairwise similarity distribution.

Type:

Optional[int]

smoothing_window_bins

Number of bins to use when smoothing the distribution.

Type:

Optional[int]

verbose

Specifies how much information to print out:

  • 0/False: Warnings only

  • 1/True: Basic info, progress bar

  • 2: All info

Type:

bool

compute_quality_metrics(sim_mat: object | None = None, dist_mat: object | None = None, labels: ndarray | None = None) Dict[source]

Computes quality metrics of the dataset. RH 2023

Parameters:
  • sim_mat (Optional[object]) – Similarity matrix of shape (n_samples, n_samples). If None then self.sConj must exist. (Default is None)

  • dist_mat (Optional[object]) – Distance matrix of shape (n_samples, n_samples). If None then self.dConj must exist. (Default is None)

  • labels (Optional[np.ndarray]) – Cluster labels of shape (n_samples,). If None, then self.labels must exist. (Default is None)

Returns:

quality_metrics (Dict):

Quality metrics dictionary that includes: ‘cluster_intra_means’, ‘cluster_intra_mins’, ‘cluster_intra_maxs’, ‘cluster_silhouette’, ‘sample_silhouette’, and other metrics if available.

Return type:

(Dict)

find_optimal_parameters_for_pruning(kwargs_findParameters: Dict[str, int | float | bool] = {'max_duration': 600, 'max_trials': 350, 'n_patience': 100, 'tol_frac': 0.05}, bounds_findParameters: Dict[str, Tuple[float, float]] = {'p_norm': (-5, 5), 'power_NN': (0.2, 2), 'power_SF': (0.3, 2), 'power_SWT': (0.1, 1), 'sig_NN_kwargs_b': (0.05, 2), 'sig_NN_kwargs_mu': (0, 0.5), 'sig_SWT_kwargs_b': (0.05, 2), 'sig_SWT_kwargs_mu': (0, 0.5)}, n_jobs_findParameters: int = -1, n_bins: int | None = None, smoothing_window_bins: int | None = None, seed=None) Dict[source]

Find the optimal parameters for pruning the similarity graph. How this function works:

  1. Makes a conjunctive distance matrix using a set of parameters for the self.make_conjunctive_distance_matrix function.

  2. Estimates the distribution of pairwise distances between ROIs assumed to be the same and those assumed to be different ROIs. This is done by comparing the difference in the distribution of pairwise distances between ROIs from the same session and those from different sessions. Ideally, the main difference will be the presence of ‘same’ ROIs in the inter-session distribution.

  3. The optimal parameters are then updated using optuna in order to maximize the separation between the ‘same’ and ‘different’ distributions.

RH 2023

Parameters:
  • kwargs_findParameters (Dict[str, Union[int, float, bool]]) – Keyword arguments for the Convergence_checker class __init__.

  • bounds_findParameters (Dict[str, Tuple[float, float]]) – Bounds for the parameters to be optimized.

  • n_jobs_findParameters (int) – Number of jobs to use when finding the optimal parameters. If -1, use all available cores.

  • n_bins (Optional[int]) –

    Overwrites n_bins specified in __init__.

    Number of bins to use when estimating the distributions. Using a large number of bins makes finding the separation point more noisy, and only slightly more accurate. (Default is None or 50)

  • smoothing_window_bins (Optional[int]) –

    Overwrites smoothing_window_bins specified in __init__.

    Number of bins to use when smoothing the distributions. Using a small number of bins makes finding the separation point more noisy, and only slightly more accurate. Aim for 5-10% of the number of bins. (Default is None or 5)

  • seed (Optional[int]) – Seed for the random number generator in the optuna sampler. If None, a random seed is used.

Returns:

kwargs_makeConjunctiveDistanceMatrix_best (Dict):

The optimal parameters for the self.make_conjunctive_distance_matrix function.

Return type:

Dict
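
Step 2 above can be illustrated with a small NumPy sketch: histogram the intra-session and inter-session distances on the same bins and treat the surplus in the inter-session histogram as an estimate of the 'same' distribution. The helper name, the [0, 1] distance range, and the simple clipped difference are assumptions made for illustration. The library's estimator (including its smoothing step) is likely more involved.

```python
import numpy as np

def estimate_same_excess(d_inter, d_intra, n_bins=50):
    """Hypothetical helper: excess density of inter- over intra-session distances."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)  # distances assumed to lie in [0, 1]
    h_inter, _ = np.histogram(d_inter, bins=edges, density=True)
    h_intra, _ = np.histogram(d_intra, bins=edges, density=True)
    # Intra-session pairs can only be 'different'; the surplus is attributed to 'same'.
    excess = np.clip(h_inter - h_intra, 0.0, None)
    return edges, excess

rng = np.random.default_rng(0)
d_intra = rng.uniform(0.6, 1.0, size=5000)                    # 'different' pairs only
d_inter = np.concatenate([rng.uniform(0.6, 1.0, size=5000),   # 'different' pairs
                          rng.uniform(0.0, 0.2, size=500)])   # plus some 'same' pairs
edges, excess = estimate_same_excess(d_inter, d_intra)
```

With this synthetic data the excess concentrates in the low-distance bins, which is the separation the optimizer tries to maximize.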

fit(d_conj: float, session_bool: ndarray, min_cluster_size: int = 2, n_iter_violationCorrection: int = 5, cluster_selection_method: str = 'leaf', d_clusterMerge: float | None = None, alpha: float = 0.999, split_intraSession_clusters: bool = True, discard_failed_pruning: bool = True, n_steps_clusterSplit: int = 100) ndarray[source]

Fits clustering using a modified HDBSCAN clustering algorithm. The approach is to use HDBSCAN but avoid having clusters with multiple ROIs from the same session. This is achieved by repeating three steps:

  1. Fit HDBSCAN to the data.

  2. Identify clusters that have multiple ROIs from the same session and walk back down the dendrogram until those clusters are split up into non-violating clusters.

  3. Disconnect graph edges between ROIs within each new cluster and all other ROIs outside the cluster that are from the same session.

Parameters:
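  • d_conj (scipy.sparse.csr_matrix) – Conjunctive distance matrix. The signature annotates this argument as float, but a distance matrix is expected.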

  • session_bool (np.ndarray) – Boolean array indicating which ROIs belong to which session. Shape: (n_rois, n_sessions)

  • min_cluster_size (int) – Minimum cluster size to be considered a cluster. (Default is 2)

  • n_iter_violationCorrection (int) – Number of iterations to correct for clusters with multiple ROIs per session. This is done to overcome the issues with single-linkage clustering finding clusters with multiple ROIs per session. (Default is 5)

  • cluster_selection_method (str) – Cluster selection method. Either 'leaf' or 'eom'. ‘leaf’ leans towards smaller clusters, ‘eom’ towards larger clusters. (Default is 'leaf')

  • d_clusterMerge (Optional[float]) – Distance threshold for merging clusters. All clusters with ROIs closer than this distance will be merged. If None, the distance is calculated as the mean + 1*std of the conjunctive distances. (Default is None)

  • alpha (float) – Alpha value. Smaller values result in more clusters. (Default is 0.999)

  • split_intraSession_clusters (bool) – If True, clusters containing multiple ROIs from the same session will be split. Only set to False if you want clusters containing multiple ROIs from the same session. (Default is True)

  • discard_failed_pruning (bool) – If True, clusters failing to prune are set to -1. (Default is True)

  • n_steps_clusterSplit (int) – Number of steps for splitting clusters with multiple ROIs from the same session. Lower values are faster but less accurate. (Default is 100)

Returns:

labels (np.ndarray):

Cluster labels for each ROI, shape: (n_rois_total)

Return type:

(np.ndarray)
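
The violation check in step 2 can be sketched as follows: a cluster violates the constraint when it holds more than one ROI from any single session. The helper name is hypothetical, and treating -1 as the label for unclustered ROIs is an assumption based on the discard_failed_pruning description.

```python
import numpy as np

def find_violating_clusters(labels, session_bool):
    """Hypothetical helper: cluster labels holding >1 ROI from any one session."""
    labels = np.asarray(labels)
    session_bool = np.asarray(session_bool, dtype=bool)
    violating = []
    for c in np.unique(labels):
        if c == -1:
            continue  # -1 is assumed to mark unclustered ROIs
        # Count how many ROIs of this cluster fall in each session.
        counts_per_session = session_bool[labels == c].sum(axis=0)
        if np.any(counts_per_session > 1):
            violating.append(int(c))
    return violating

# Four ROIs: the first two from session 0, the last two from session 1.
session_bool = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=bool)
labels = np.array([0, 0, 1, -1])
bad = find_violating_clusters(labels, session_bool)
print(bad)  # [0]
```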

fit_sequentialHungarian(d_conj: csr_matrix, session_bool: ndarray, thresh_cost: float = 0.95) ndarray[source]

Applies CaImAn’s method for clustering.

For further details, please refer to:
Parameters:
  • d_conj (scipy.sparse.csr_matrix) – Distance matrix. Shape: (n_rois, n_rois)

  • session_bool (np.ndarray) – Boolean array indicating which ROIs are in which sessions. Shape: (n_rois, n_sessions)

  • thresh_cost (float) – Threshold below which ROI pairs are considered potential matches. (Default is 0.95)

Returns:

labels (np.ndarray):

Cluster labels. Shape: (n_rois,)

Return type:

(np.ndarray)

classmethod make_conjunctive_distance_matrix(s_sf: csr_matrix | None = None, s_NN: csr_matrix | None = None, s_SWT: csr_matrix | None = None, s_sesh: csr_matrix | None = None, power_SF: float = 1, power_NN: float = 1, power_SWT: float = 1, p_norm: float = 1, sig_SF_kwargs: Dict[str, float] = {'b': 0.5, 'mu': 0.5}, sig_NN_kwargs: Dict[str, float] = {'b': 0.5, 'mu': 0.5}, sig_SWT_kwargs: Dict[str, float] = {'b': 0.5, 'mu': 0.5}) Tuple[csr_matrix, csr_matrix, ndarray, ndarray, ndarray, ndarray][source]

Makes a distance matrix from the three similarity matrices. RH 2023

Parameters:
  • s_sf (Optional[scipy.sparse.csr_matrix]) – Similarity matrix for spatial footprints. (Default is None)

  • s_NN (Optional[scipy.sparse.csr_matrix]) – Similarity matrix for neural network features. (Default is None)

  • s_SWT (Optional[scipy.sparse.csr_matrix]) – Similarity matrix for scattering wavelet transform features. (Default is None)

  • s_sesh (Optional[scipy.sparse.csr_matrix]) – The session similarity matrix. (Default is None)

  • power_SF (float) – Power to which to raise the spatial footprint similarity. (Default is 1)

  • power_NN (float) – Power to which to raise the neural network similarity. (Default is 1)

  • power_SWT (float) – Power to which to raise the scattering wavelet transform similarity. (Default is 1)

  • p_norm (float) – p-norm to use for the conjunction of the similarity matrices. (Default is 1)

  • sig_SF_kwargs (Dict[str, float]) – Keyword arguments for the sigmoid function applied to the spatial footprint overlap similarity matrix. See helpers.generalised_logistic_function for details. (Default is {‘mu’:0.5, ‘b’:0.5})

  • sig_NN_kwargs (Dict[str, float]) – Keyword arguments for the sigmoid function applied to the neural network similarity matrix. See helpers.generalised_logistic_function for details. (Default is {‘mu’:0.5, ‘b’:0.5})

  • sig_SWT_kwargs (Dict[str, float]) – Keyword arguments for the sigmoid function applied to the scattering wavelet transform similarity matrix. See helpers.generalised_logistic_function for details. (Default is {‘mu’:0.5, ‘b’:0.5})

Returns:

Tuple containing:
dConj (scipy.sparse.csr_matrix):

Conjunctive distance matrix built from the three similarity matrices.

sConj (scipy.sparse.csr_matrix):

Conjunctive similarity matrix.

sSF_data (np.ndarray):

Activated spatial footprint similarity matrix.

sNN_data (np.ndarray):

Activated neural network similarity matrix.

sSWT_data (np.ndarray):

Activated scattering wavelet transform similarity matrix.

sConj_data (np.ndarray):

Conjunctive similarity values.

Return type:

(Tuple)
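
To give a feel for how power_* and p_norm interact, the sketch below raises each similarity to its power and combines them with a generalized (power) mean, then converts to a distance as 1 - similarity. This exact formula is an assumption for illustration, the sigmoid activation step is omitted, and the helper name is hypothetical. Consult the source of make_conjunctive_distance_matrix for the actual computation.

```python
import numpy as np

def conjunctive_distance_sketch(sims, powers, p_norm=1.0):
    """Hypothetical: power-mean conjunction of similarities, returned as a distance."""
    sims_pow = np.stack(
        [np.asarray(s, dtype=float) ** pw for s, pw in zip(sims, powers)]
    )
    # Generalized mean across the stacked matrices; requires p_norm != 0.
    s_conj = np.mean(sims_pow ** p_norm, axis=0) ** (1.0 / p_norm)
    return 1.0 - s_conj, s_conj

s_sf = np.array([[1.0, 0.5], [0.5, 1.0]])
s_NN = np.array([[1.0, 0.9], [0.9, 1.0]])

d_conj, s_conj = conjunctive_distance_sketch([s_sf, s_NN], powers=[1.0, 1.0], p_norm=1.0)
_, s_conj_min_like = conjunctive_distance_sketch([s_sf, s_NN], powers=[1.0, 1.0], p_norm=-4.0)
```

Under this formula, p_norm = 1 is the arithmetic mean, while negative values pull the result toward the smallest of the input similarities.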

make_pruned_similarity_graphs(convert_to_probability: bool = False, stringency: float = 1.0, kwargs_makeConjunctiveDistanceMatrix: Dict | None = None, d_cutoff: float | None = None) None[source]

Constructs pruned similarity graphs. RH 2023

Parameters:
  • convert_to_probability (bool) – Whether to convert the distance and similarity graphs to probability, p(different) and p(same), respectively. (Default is False)

  • stringency (float) – Modifies the threshold for pruning the distance matrix. A higher value results in less pruning, a lower value leads to more pruning. This value is multiplied by the inferred threshold to generate a new one. (Default is 1.0)

  • kwargs_makeConjunctiveDistanceMatrix (Optional[Dict]) – Keyword arguments for the self.make_conjunctive_distance_matrix function. If None, the best parameters found using self.find_optimal_parameters_for_pruning are used. (Default is None)

  • d_cutoff (Optional[float]) – The cutoff distance for pruning the distance matrix. If None, then the optimal cutoff distance is inferred. (Default is None)
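
The effect of stringency on the cutoff can be sketched with a dense matrix. The helper below is hypothetical: it multiplies an inferred cutoff by stringency and removes pairs beyond it. Removed edges are marked with inf here, whereas a sparse graph would simply drop the entries.

```python
import numpy as np

def prune_distances(d, d_cutoff_inferred, stringency=1.0):
    """Hypothetical helper: remove pairs farther apart than the scaled cutoff."""
    d = np.asarray(d, dtype=float)
    cutoff = stringency * d_cutoff_inferred
    d_pruned = np.where(d <= cutoff, d, np.inf)  # inf marks a removed edge (assumed)
    return d_pruned, cutoff

d = np.array([
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.5],
    [0.9, 0.5, 0.0],
])
strict, cutoff_strict = prune_distances(d, d_cutoff_inferred=0.6, stringency=0.5)
loose, cutoff_loose = prune_distances(d, d_cutoff_inferred=0.6, stringency=1.0)
```

The lower stringency gives a cutoff of 0.3 and removes more pairs than the default, matching the description of the parameter.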

plot_distSame(kwargs_makeConjunctiveDistanceMatrix: dict | None = None) None[source]

Plot the estimated distribution of the pairwise similarities between matched pairs of ROIs.

Parameters:

kwargs_makeConjunctiveDistanceMatrix (Optional[dict]) – Keyword arguments for the make_conjunctive_distance_matrix method. If None, the function uses the object’s best parameters. (Default is None)

plot_similarity_relationships(plots_to_show: List[int] = [1, 2, 3], max_samples: int = 1000000, kwargs_scatter: Dict[str, int | float] = {'alpha': 0.1, 's': 1}, kwargs_makeConjunctiveDistanceMatrix: Dict[str, float | Dict[str, float]] = {'p_norm': -4.0, 'power_NN': 1.0, 'power_SF': 0.5, 'power_SWT': 0.1, 'sig_NN_kwargs': {'b': 0.5, 'mu': 0.5}, 'sig_SF_kwargs': {'b': 0.5, 'mu': 0.5}, 'sig_SWT_kwargs': {'b': 0.5, 'mu': 0.5}}) Tuple[figure, axes][source]

Plot the similarity relationships between the three similarity matrices.

Parameters:
  • plots_to_show (List[int]) –

    Which plots to show.

    • 1: Spatial footprints vs. neural network features.

    • 2: Spatial footprints vs. scattering wavelet transform features.

    • 3: Neural network features vs. scattering wavelet.

  • max_samples (int) – Maximum number of samples to plot. Use smaller numbers for faster plotting.

  • kwargs_scatter (Dict[str, Union[int, float]]) – Keyword arguments for the matplotlib.pyplot.scatter plot.

  • kwargs_makeConjunctiveDistanceMatrix (Dict[str, Union[float, Dict[str, float]]]) – Keyword arguments for the make_conjunctive_distance_matrix method.

Returns:

tuple containing:
fig (matplotlib.pyplot.figure):

Figure object.

axs (matplotlib.pyplot.axes):

Axes object.

Return type:

(Tuple[matplotlib.pyplot.figure, matplotlib.pyplot.axes])

roicat.tracking.clustering.attach_fully_connected_node(d: object, dist_fullyConnectedNode: float | None = None, n_nodes: int = 1) object[source]

Appends one or more nodes (see n_nodes) to a sparse distance graph, each weakly connected to all existing nodes.

Parameters:
  • d (object) – Sparse graph with multiple components. Refer to scipy.sparse.csgraph.connected_components for details.

  • dist_fullyConnectedNode (Optional[float]) – Value used for the connection strength to all other nodes. This value will be appended as elements in a new row and column at the ends of the ‘d’ matrix. If None, then the value will be set to 1000 times the difference between the maximum and minimum values in ‘d’. (Default is None)

  • n_nodes (int) – Number of nodes to append to the graph. (Default is 1)

Returns:

d2 (object):

Sparse graph with only one component.

Return type:

(object)
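The idea can be illustrated with a small dense stand-in for the sparse graph. This is a minimal sketch, not the library's implementation: the names attach_node_dense and count_components are hypothetical, a zero entry is assumed to mean "no edge", and the default distance is assumed to be computed over the nonzero entries only.

```python
import numpy as np

def attach_node_dense(d, dist_fullyConnectedNode=None, n_nodes=1):
    """Append n_nodes rows/columns that connect to every existing node (dense sketch)."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    if dist_fullyConnectedNode is None:
        vals = d[d != 0]  # assumption: only existing edges define the value range
        dist_fullyConnectedNode = 1000.0 * (vals.max() - vals.min())
    d2 = np.zeros((n + n_nodes, n + n_nodes))
    d2[:n, :n] = d                        # original graph
    d2[:n, n:] = dist_fullyConnectedNode  # new columns
    d2[n:, :n] = dist_fullyConnectedNode  # new rows
    return d2

def count_components(adj):
    """Count connected components with a depth-first search (edges treated as undirected)."""
    n = adj.shape[0]
    seen = np.zeros(n, dtype=bool)
    count = 0
    for start in range(n):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((adj[i] != 0) | (adj[:, i] != 0)):
                if not seen[j]:
                    seen[j] = True
                    stack.append(j)
    return count

# Two disconnected pairs of nodes: {0, 1} and {2, 3}
d = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 3],
    [0, 0, 3, 0],
], dtype=float)
d2 = attach_node_dense(d)
```

Because the appended distance is much larger than any existing edge, the new node joins the components without competing with real neighbors in distance-based clustering.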

roicat.tracking.clustering.cluster_quality_metrics(sim: ndarray | csr_matrix, labels: ndarray) Tuple[source]

Computes the cluster quality metrics for a clustering solution including intra-cluster mean, minimum, maximum similarity, and cluster silhouette score. RH 2023

Parameters:
  • sim (Union[np.ndarray, scipy.sparse.csr_matrix]) – Similarity matrix. (shape: (n_roi, n_roi)) It can be obtained using _, sConj, _,_,_,_ = clusterer.make_conjunctive_similarity_matrix().

  • labels (np.ndarray) – Cluster labels. (shape: (n_roi,))

Returns:

tuple containing:
cs_intra_means (np.ndarray):

Intra-cluster mean similarity. (shape: (n_clusters,))

cs_intra_mins (np.ndarray):

Intra-cluster minimum similarity. (shape: (n_clusters,))

cs_intra_maxs (np.ndarray):

Intra-cluster maximum similarity. (shape: (n_clusters,))

cs_sil (np.ndarray):

Cluster silhouette score. (shape: (n_clusters,))

Return type:

(tuple)
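The intra-cluster statistics can be sketched for a dense similarity matrix as follows. This is an illustration only: intra_cluster_stats is a hypothetical name, self-similarities on the diagonal are assumed to be excluded, single-member clusters are assumed to yield NaN, and the silhouette score is omitted.

```python
import numpy as np

def intra_cluster_stats(sim, labels):
    """Mean, min, and max pairwise similarity within each cluster (dense sketch)."""
    sim = np.asarray(sim, dtype=float)
    labels = np.asarray(labels)
    stats = []
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        block = sim[np.ix_(idx, idx)]
        # assumption: ignore the diagonal (similarity of an ROI with itself)
        pairs = block[~np.eye(idx.size, dtype=bool)]
        if pairs.size == 0:
            stats.append((np.nan, np.nan, np.nan))
        else:
            stats.append((pairs.mean(), pairs.min(), pairs.max()))
    means, mins, maxs = (np.array(col) for col in zip(*stats))
    return means, mins, maxs

sim = np.full((5, 5), 0.1)
np.fill_diagonal(sim, 1.0)
sim[0, 1] = sim[1, 0] = 0.9
sim[2, 3] = sim[3, 2] = 0.8
sim[2, 4] = sim[4, 2] = 0.6
sim[3, 4] = sim[4, 3] = 0.7
labels = np.array([0, 0, 1, 1, 1])

cs_means, cs_mins, cs_maxs = intra_cluster_stats(sim, labels)
```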

roicat.tracking.clustering.make_label_variants(labels: ndarray, n_roi_bySession: ndarray) Tuple[source]

Creates convenient variants of label arrays. RH 2023

Parameters:
  • labels (np.ndarray) – Cluster integer labels. (shape: (n_roi,))

  • n_roi_bySession (np.ndarray) – Number of ROIs in each session.

Returns:

tuple containing:
labels_squeezed (np.ndarray):

Cluster labels squeezed into a continuous range starting from 0.

labels_bySession (List[np.ndarray]):

List of label arrays split by session.

labels_bool (scipy.sparse.csr_matrix):

Sparse boolean matrix representation of labels.

labels_bool_bySession (List[scipy.sparse.csr_matrix]):

List of sparse boolean matrix representations of labels split by session.

labels_dict (Dict[int, np.ndarray]):

Dictionary mapping unique labels to their locations in the labels array.

Return type:

(tuple)
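A rough NumPy sketch of what these variants look like for a toy label array is given below. It is not the library's code: the example contains no -1 labels (how those are treated is not shown here), labels_bool is a dense array rather than a sparse matrix, its (n_roi, n_clusters) orientation is an assumption, and the dictionary is keyed by the squeezed labels.

```python
import numpy as np

labels = np.array([5, 9, 5, 12, 9, 12])  # arbitrary integer cluster labels
n_roi_bySession = np.array([2, 4])       # 2 ROIs in session 0, 4 in session 1

# Squeeze labels into the contiguous range 0..n_clusters-1
uniques, labels_squeezed = np.unique(labels, return_inverse=True)

# Split the flat label array back into one array per session
boundaries = np.cumsum(n_roi_bySession)[:-1]
labels_bySession = np.split(labels_squeezed, boundaries)

# Boolean membership matrix, shape (n_roi, n_clusters) in this sketch
labels_bool = labels_squeezed[:, None] == np.arange(len(uniques))[None, :]
labels_bool_bySession = np.split(labels_bool, boundaries, axis=0)

# Map each squeezed label to the positions where it occurs
labels_dict = {u: np.flatnonzero(labels_squeezed == u) for u in range(len(uniques))}
```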

roicat.tracking.clustering.score_labels(labels_test: ndarray, labels_true: ndarray, ignore_negOne: bool = False, thresh_perfect: float = 0.9999999999, compute_mutual_info: bool = False) Dict[str, float | Tuple[int, int]][source]

Computes the score of the clustering by finding the best match using the linear sum assignment problem. The score is bounded between 0 and 1. Note: The score is not symmetric if the numbers of true and test labels differ; i.e., switching labels_test and labels_true can lead to different scores. This is because the score measures how well each true set is matched by an optimally assigned test set.

RH 2022

Parameters:
  • labels_test (np.ndarray) – Labels of the test clusters/sets. (shape: (n,))

  • labels_true (np.ndarray) – Labels of the true clusters/sets. (shape: (n,))

  • ignore_negOne (bool) – Whether to ignore -1 values in the labels. If set to True, -1 values will be ignored in the computation. (Default is False)

  • thresh_perfect (float) – Threshold for perfect match. Mostly used for numerical stability. (Default is 0.9999999999)

  • compute_mutual_info (bool) – If set to True, the adjusted mutual info score is also computed. (Default is False)

Returns:

dictionary containing:
score_weighted_partial (float):

Average correlation between the best matched sets of true and test labels, weighted by the number of elements in each true set.

score_weighted_perfect (float):

Fraction of perfect matches, weighted by the number of elements in each true set.

score_unweighted_partial (float):

Average correlation between the best matched sets of true and test labels.

score_unweighted_perfect (float):

Fraction of perfect matches.

adj_rand_score (float):

Adjusted Rand score of the labels.

adj_mutual_info_score (float):

Adjusted mutual info score of the labels. None if compute_mutual_info is False.

ignore_negOne (bool):

Whether -1 values were ignored in the labels.

idx_hungarian (Tuple[int, int]):

’Hungarian Indices’. Indices of the best matched sets.

Return type:

(dict)
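The matching idea can be sketched on a toy example. This is a simplified illustration, not the function's implementation: score_by_best_matching is a hypothetical name, the optimal one-to-one assignment is found by exhaustive search over permutations (a stand-in for a linear sum assignment solver, practical only for a handful of sets), the number of true sets is assumed not to exceed the number of test sets, and matched correlations are not clipped to the [0, 1] range.

```python
import itertools
import numpy as np

def score_by_best_matching(labels_test, labels_true, thresh_perfect=0.9999999999):
    labels_test = np.asarray(labels_test)
    labels_true = np.asarray(labels_true)
    u_true, u_test = np.unique(labels_true), np.unique(labels_test)
    n_true, n_test = len(u_true), len(u_test)
    assert n_true <= n_test, "sketch assumes at least as many test sets as true sets"

    # Indicator vectors: one row per set, one column per element
    member_true = (labels_true[None, :] == u_true[:, None]).astype(float)
    member_test = (labels_test[None, :] == u_test[:, None]).astype(float)

    # Pearson correlation between every true set and every test set
    corr = np.corrcoef(np.vstack([member_true, member_test]))[:n_true, n_true:]

    # Exhaustive search for the assignment with the highest total correlation
    best = max(
        itertools.permutations(range(n_test), n_true),
        key=lambda p: sum(corr[i, j] for i, j in enumerate(p)),
    )
    matched = np.array([corr[i, j] for i, j in enumerate(best)])
    weights = member_true.sum(axis=1)  # number of elements in each true set
    perfect = matched >= thresh_perfect
    return {
        'score_weighted_partial': float(np.average(matched, weights=weights)),
        'score_weighted_perfect': float(np.average(perfect, weights=weights)),
        'score_unweighted_partial': float(matched.mean()),
        'score_unweighted_perfect': float(perfect.mean()),
        'idx_hungarian': (list(range(n_true)), list(best)),
    }

labels_true = [0, 0, 1, 1, 2, 2]
labels_test = [7, 7, 8, 8, 8, 9]
scores = score_by_best_matching(labels_test, labels_true)
```

Here only the first true set is reproduced exactly, so one third of the matches count as perfect. The threshold just below 1 absorbs floating-point error in the correlation.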

roicat.tracking.scatteringWaveletTransformer module

class roicat.tracking.scatteringWaveletTransformer.SWT(kwargs_Scattering2D: Dict[str, Any] = {'J': 2, 'L': 8}, image_shape: Tuple[int, int] = (36, 36), device: str = 'cpu', verbose: bool = True)[source]

Bases: ROICaT_Module

Performs scattering wavelet transform using the kymatio library. RH 2022

Parameters:
  • kwargs_Scattering2D (Dict[str, Any]) – The keyword arguments to pass to the Scattering2D class. (Default is {'J': 2, 'L': 8})

  • image_shape (Tuple[int, int]) – The shape of the images to be transformed. (Default is (36,36))

  • device (str) – The device to use for the transformation. (Default is 'cpu')

  • verbose (bool) – If True, prints progress messages. (Default is True)

Example

swt = SWT(kwargs_Scattering2D={'J': 2, 'L': 8}, image_shape=(36,36), device='cpu', verbose=True)
transformed_images = swt.transform(ROI_images, batch_size=100)
transform(ROI_images: ndarray, batch_size: int = 100) ndarray[source]

Transforms the ROI images.

Parameters:
  • ROI_images (np.ndarray) – The ROI images to transform. Concatenate the ROI images across sessions before passing them in. Shape: (n_ROIs, height, width)

  • batch_size (int) – The batch size to use for the transformation. (Default is 100)

Returns:

latents (np.ndarray):

The transformed ROI images. (n_ROIs, latent_size)

Return type:

(np.ndarray)
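The role of batch_size can be sketched without the kymatio dependency. In this hypothetical example, flatten_batch stands in for the scattering transform and transform_in_batches is an illustrative helper, not part of ROICaT. The point is only that images are processed in chunks and the per-batch outputs are concatenated into one (n_ROIs, latent_size) array.

```python
import numpy as np

def transform_in_batches(images, fn_batch, batch_size=100):
    """Apply fn_batch to consecutive chunks of images and stack the results."""
    outputs = [
        fn_batch(images[i:i + batch_size])
        for i in range(0, len(images), batch_size)
    ]
    return np.concatenate(outputs, axis=0)

def flatten_batch(batch):
    # Stand-in for the real transform: flatten each image into a feature vector
    return batch.reshape(batch.shape[0], -1)

ROI_images = np.random.default_rng(0).random((250, 36, 36))
latents = transform_in_batches(ROI_images, flatten_batch, batch_size=100)
```

A smaller batch size lowers peak memory use at the cost of more iterations; the result is the same either way.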

roicat.tracking.similarity_graph module

class roicat.tracking.similarity_graph.ROI_graph(n_workers: int = -1, frame_height: int = 512, frame_width: int = 1024, block_height: int = 100, block_width: int = 100, overlapping_width_Multiplier: float = 0.0, algorithm_nearestNeigbors_spatialFootprints: str = 'brute', verbose: bool = True, kwargs_nearestNeigbors_spatialFootprints: dict = {})[source]

Bases: ROICaT_Module

Class for building similarity and distance graphs between Regions of Interest (ROIs) based on their features, generating potential clusters of ROIs using linkage clustering, building a similarity graph between clusters of ROIs, and computing silhouette scores for each potential cluster. The computations are performed on ‘blocks’ of the full field of view to accelerate computation and reduce memory usage. RH 2022

Parameters:
  • n_workers (int) – The number of workers to use for the computations. If -1, all available cpu cores will be used. (Default is -1)

  • frame_height (int) – The height of the frame. (Default is 512)

  • frame_width (int) – The width of the frame. (Default is 1024)

  • block_height (int) – The height of the block. (Default is 100)

  • block_width (int) – The width of the block. (Default is 100)

  • overlapping_width_Multiplier (float) – The multiplier for the overlapping width. (Default is 0.0)

  • algorithm_nearestNeigbors_spatialFootprints (str) – The algorithm to use for the nearest neighbors computation. See sklearn.neighbors.NearestNeighbors for more information. (Default is 'brute')

  • verbose (bool) – If set to True, outputs will be verbose. (Default is True)

  • **kwargs_nearestNeigbors_spatialFootprints (dict) – The keyword arguments to use for the nearest neighbors. See sklearn.neighbors.NearestNeighbors for more information. (Optional)

s_sf

Pairwise similarity matrix based on spatial footprints.

Type:

scipy.sparse.csr_matrix

s_NN

Pairwise similarity matrix based on Neural Network features.

Type:

scipy.sparse.csr_matrix

s_SWT

Pairwise similarity matrix based on Scattering Wavelet Transform.

Type:

scipy.sparse.csr_matrix

s_sesh

Pairwise similarity matrix based on which session the ROIs belong to.

Type:

scipy.sparse.csr_matrix

compute_similarity_blockwise(spatialFootprints: csr_matrix, features_NN: Tensor, features_SWT: Tensor, ROI_session_bool: Tensor, spatialFootprint_maskPower: float = 1.0) None[source]

Computes the similarity graph between ROIs and updates the instance attributes: s_sf, s_NN, s_SWT, s_sesh.

Parameters:
  • spatialFootprints (scipy.sparse.csr_matrix) – The spatial footprints of the ROIs. Can be obtained from blurring.ROI_blurrer.ROIs_blurred or data_importing.Data_suite2p.spatialFootprints.

  • features_NN (torch.Tensor) – The output latents from the roinet neural network. Can be obtained from ROInet.ROInet_embedder.latents.

  • features_SWT (torch.Tensor) – The output latents from the scattering wavelet transform. Can be obtained from scatteringWaveletTransformer.SWT.latents.

  • ROI_session_bool (torch.Tensor) – The boolean array indicating which ROIs (across all sessions) belong to each session. shape: (n_ROIs total, n_sessions).

  • spatialFootprint_maskPower (float) – The power to raise the spatial footprint mask to. Use 1.0 for no change to the masks, low values (e.g., 0.5) to make the masks more binary looking, and high values (e.g., 2.0) to make the pairwise similarities highly dependent on the relative intensities of the pixels in each mask. (Default is 1.0)

Returns:

tuple containing:
s_sf (scipy.sparse.csr_matrix):

Pairwise similarity matrix based on spatial footprints.

s_NN (scipy.sparse.csr_matrix):

Pairwise similarity matrix based on Neural Network features.

s_SWT (scipy.sparse.csr_matrix):

Pairwise similarity matrix based on Scattering Wavelet Transform.

s_sesh (scipy.sparse.csr_matrix):

Pairwise similarity matrix based on which session the ROIs belong to.

Return type:

(tuple)
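The effect of spatialFootprint_maskPower can be illustrated with two toy footprints that cover the same pixels with different intensity profiles. This sketch assumes cosine similarity as the footprint similarity measure, and cosine_similarity_with_mask_power is a hypothetical helper, not a ROICaT function.

```python
import numpy as np

def cosine_similarity_with_mask_power(a, b, power=1.0):
    """Raise both footprints to `power`, then compute their cosine similarity."""
    a = np.asarray(a, dtype=float) ** power
    b = np.asarray(b, dtype=float) ** power
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Same support (first two pixels), opposite intensity profiles
fp_a = np.array([1.0, 0.1, 0.0])
fp_b = np.array([0.1, 1.0, 0.0])

sims = {p: cosine_similarity_with_mask_power(fp_a, fp_b, p) for p in (0.5, 1.0, 2.0)}
```

With a power of 0.5 the footprints look more binary and the shared support dominates, so the similarity is highest. With a power of 2.0 the intensity differences dominate and the similarity drops.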

make_normalized_similarities(centers_of_mass: ndarray | List[ndarray], features_NN: Tensor | None = None, features_SWT: Tensor | None = None, k_max: int = 3000, k_min: int = 200, algo_NN: str = 'kd_tree', device: str = 'cpu', verbose: bool = True) None[source]

Normalizes the similarity matrices s_NN, s_SWT (but not s_sf) by z-scoring using the mean and standard deviation from the distributions of pairwise similarities between ROIs that are spatially distant from each other. This is done to make the similarity scores more comparable across different regions of the field of view. RH 2022

Parameters:
  • centers_of_mass (Union[np.ndarray, List[np.ndarray]]) – The centers of mass of the ROIs. Can be an array with shape: (n_ROIs total, 2), or a list of arrays with shape: (n_ROIs for each session, 2).

  • features_NN (torch.Tensor) – The output latent embeddings of the NN model. Shape: (n_ROIs total, n_features). (Default is None)

  • features_SWT (torch.Tensor) – The output latent embeddings of the SWT model. Shape: (n_ROIs total, n_features). (Default is None)

  • k_max (int) – The maximum number of nearest neighbors to consider for each ROI. This value will result in an intermediate similarity matrix of shape (n_ROIs total, k_max) between each ROI and its k_max nearest neighbors. This value is based on centroid distance. (Default is 3000)

  • k_min (int) – The minimum number of nearest neighbors to consider for each ROI. This value should be less than k_max and be chosen such that it is likely that any potential ‘same’ ROIs are within k_min nearest neighbors. This value is based on centroid distance. (Default is 200)

  • algo_NN (str) – The algorithm to use for the nearest neighbor search. See sklearn.neighbors.NearestNeighbors for options. It can be ‘kd_tree’, ‘ball_tree’, or ‘brute’. ‘kd_tree’ seems to be the fastest. (Default is 'kd_tree')

  • device (str) – The device to use for the similarity computations. The output will still be on CPU. (Default is 'cpu')

  • verbose (bool) – If True, print progress updates. (Default is True)

s_NN_z

The z-scored similarity matrix between ROIs based on the statistics of the NN embedding. Shape: (n_ROIs total, n_ROIs total). Note: This matrix is not symmetric and therefore should be treated as a directed graph.

Type:

scipy.sparse.csr_matrix

s_SWT_z

The z-scored similarity matrix between ROIs based on the statistics of the SWT embedding. Shape: (n_ROIs total, n_ROIs total). Note: This matrix is not symmetric and therefore should be treated as a directed graph.

Type:

scipy.sparse.csr_matrix
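The normalization can be sketched with dense arrays and a brute-force neighbor search. This is a simplified illustration under stated assumptions, not the method's implementation: for each ROI, the neighbors ranked between k_min and k_max by centroid distance are assumed to serve as the "different ROI" reference distribution, and the small values of k_min and k_max are chosen only to fit the toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_roi = 50
centers_of_mass = rng.uniform(0, 100, size=(n_roi, 2))
features = rng.normal(size=(n_roi, 8))
features /= np.linalg.norm(features, axis=1, keepdims=True)

# Pairwise cosine similarity between feature vectors
sim = features @ features.T

# Rank every other ROI by centroid distance (column 0 is the ROI itself)
dist = np.linalg.norm(centers_of_mass[:, None] - centers_of_mass[None], axis=-1)
order = np.argsort(dist, axis=1)

# Reference set: neighbors that are too far away to be the same ROI
k_min, k_max = 10, 40
idx_ref = order[:, k_min:k_max]
ref = np.take_along_axis(sim, idx_ref, axis=1)

# z-score each row with the statistics of its own reference set
mu = ref.mean(axis=1, keepdims=True)
sd = ref.std(axis=1, keepdims=True)
sim_z = (sim - mu) / sd
```

Each row is normalized with its own mean and standard deviation, which is why the resulting matrix is not symmetric.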

visualize_blocks() None[source]

Visualizes the blocks over a field of view by displaying them. This is primarily used for checking the correct partitioning of the blocks.

roicat.tracking.similarity_graph.cosine_similarity_customIdx(features: Tensor, idx: ndarray) Tensor[source]

Calculate cosine similarity using custom indices.

Parameters:
  • features (torch.Tensor) – A tensor of feature vectors. Shape: (n, d), where n is the number of data points and d is the dimensionality of the data.

  • idx (np.ndarray) – Array of indices. Shape should match the first dimension of the features tensor.

Returns:

result (torch.Tensor):

Cosine similarity tensor calculated using the provided indices. Shape: (n, d), where n is the number of data points and d is the dimensionality of the data.

Return type:

(torch.Tensor)

roicat.tracking.similarity_graph.get_idx_in_kRange(X: ndarray, k_max: int = 3000, k_min: int = 100, algo_kNN: str = 'brute', n_workers: int = -1) Tuple[ndarray, coo_matrix][source]

Get indices in a given range for k-Nearest Neighbors graph. RH 2022

Parameters:
  • X (np.ndarray) – Input data array where each row is a data point and each column is a feature.

  • k_max (int) – Maximum number of neighbors to find. (Default is 3000)

  • k_min (int) – Minimum number of neighbors to consider. (Default is 100)

  • algo_kNN (str) – Algorithm to use for nearest neighbors search. (Default is 'brute')

  • n_workers (int) – Number of worker processes to use. If -1, use all available cores. (Default is -1)

Returns:

tuple containing:
idx_diff (np.ndarray):

Indices of the non-zero values in the distance graph, with a range between k_min and k_max.

d (scipy.sparse.coo_matrix):

Sparse matrix representing the distance graph from the k-Nearest Neighbors algorithm.

Return type:

(Tuple[np.ndarray, scipy.sparse.coo_matrix])

Classification module

roicat.classification.classifier module

class roicat.classification.classifier.Auto_LogisticRegression(X: ndarray, y: ndarray, params_LogisticRegression: Dict = {'C': [1e-14, 1000.0], 'fit_intercept': True, 'l1_ratio': None, 'max_iter': 1000, 'n_jobs': None, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.0001, 'warm_start': False}, n_startup: int = 15, kwargs_convergence: Dict = {'max_duration': 600, 'max_trials': 150, 'n_patience': 50, 'tol_frac': 0.05}, n_jobs_optuna: int = 1, penalty_testTrainRatio: float = 1.0, test_size: float = 0.3, class_weight: Dict[str, float] | str | None = 'balanced', sample_weight: List[float] | None = None, cv: BaseCrossValidator | None = None, verbose: bool = True)[source]

Bases: Autotuner_regression

Implements automatic hyperparameter tuning for Logistic Regression. RH 2023

Parameters:
  • X (np.ndarray) – Training data. (shape: (n_samples, n_features))

  • y (np.ndarray) – Target variable. (shape: (n_samples,))

  • params_LogisticRegression (Dict) –

    Dictionary of Logistic Regression parameters. For each item in the dictionary if item is:

    • list: The parameter is tuned. If the values are numbers, then the list will be the bounds [low, high] to search over. If the values are strings, then the list will be the categorical values to search over.

    • not a list: The parameter is fixed to the given value.

    See LogisticRegression for a full list of arguments.

  • n_startup (int) – Number of startup trials. (Default is 15)

  • kwargs_convergence (Dict[str, Union[int, float]]) –

    Convergence settings for the optimization. Includes:

    • 'n_patience' (int): The number of trials to wait for convergence before stopping the optimization.

    • 'tol_frac' (float): The fractional tolerance for convergence. After n_patience trials, the optimization will stop if the loss has not improved by at least tol_frac.

    • 'max_trials' (int): The maximum number of trials to run.

    • 'max_duration' (int): The maximum duration of the optimization in seconds.

  • n_jobs_optuna (int) – Number of jobs for Optuna. Set to -1 to use all cores. Note that some 'solver' options are already parallelized (like 'lbfgs'). Set n_jobs_optuna to 1 for these solvers.

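  • penalty_testTrainRatio (float) – Exponent of the penalty on the ratio of test loss to train loss. See LossFunction_CrossEntropy_CV for the formula. (Default is 1.0)

  • test_size (float) – Fraction of the data held out as the test set. (Default is 0.3)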

  • class_weight (Union[Dict[str, float], str]) – Weights associated with classes in the form of a dictionary or string. If given “balanced”, class weights will be calculated. (Default is “balanced”)

  • sample_weight (Optional[List[float]]) –

    Sample weights. See LogisticRegression for more details.

  • cv (Optional[sklearn.model_selection._split.BaseCrossValidator]) –

    A Scikit-Learn cross-validator class. If not None, then must have:

    • Call signature: idx_train, idx_test = next(self.cv.split(self.X, self.y))

    If None, then a StratifiedShuffleSplit cross-validator will be used.

  • verbose (bool) – Whether to print progress messages.

Demo:
## Initialize with NO TUNING. All parameters are fixed.
autoclassifier = Auto_LogisticRegression(
    X,
    y,
    params_LogisticRegression={
        'C': 1e-14,
        'penalty': 'l2',
        'solver': 'lbfgs',
    },
)

## Initialize with TUNING 'C', 'penalty', and 'l1_ratio'. 'solver' is fixed.
autoclassifier = Auto_LogisticRegression(
    X,
    y,
    params_LogisticRegression={
        'C': [1e-14, 1e3],
        'penalty': ['l1', 'l2', 'elasticnet'],
        'l1_ratio': [0.0, 1.0],
        'solver': 'lbfgs',
    },
)
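The list/non-list convention for params_LogisticRegression can be sketched as a small function. The name split_params is hypothetical, and the output format simply mirrors the 'type'/'kwargs' layout shown in the Autotuner_regression example; whether numeric bounds are searched on a log scale is not specified here and is left out.

```python
def split_params(params):
    """Separate fixed parameters from tuned ones according to the list convention."""
    fixed, tuned = {}, {}
    for name, value in params.items():
        if not isinstance(value, list):
            fixed[name] = value  # not a list: fixed to the given value
        elif all(isinstance(v, str) for v in value):
            tuned[name] = {'type': 'categorical', 'kwargs': {'choices': value}}
        else:
            low, high = value    # numeric list: interpreted as [low, high] bounds
            tuned[name] = {'type': 'real', 'kwargs': {'low': low, 'high': high}}
    return fixed, tuned

fixed, tuned = split_params({
    'C': [1e-14, 1e3],
    'penalty': ['l1', 'l2', 'elasticnet'],
    'l1_ratio': [0.0, 1.0],
    'solver': 'lbfgs',
})
```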
evaluate_model(model: LogisticRegression | None = None, X: ndarray | None = None, y: ndarray | None = None, sample_weight: List[float] | None = None) Tuple[float, array][source]

Evaluates the given model on the given data. Makes label predictions, then computes the accuracy and confusion matrix.

Parameters:
  • model (sklearn.linear_model.LogisticRegression) – A sklearn LogisticRegression model. If None, then self.model_best is used.

  • X (np.ndarray) – The data to evaluate on. If None, then self.X is used.

  • y (np.ndarray) – The labels to evaluate on. If None, then self.y is used.

  • sample_weight (List[float]) – The sample weights to evaluate on. If None, then self.sample_weight is used.

Returns:

Tuple containing:
accuracy (float):

The accuracy of the model on the given data.

confusion_matrix (np.array):

The confusion matrix of the model on the given data.

Return type:

(tuple)
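The two returned quantities can be computed from predictions as in the sketch below. It is illustrative only: accuracy_and_confusion is a hypothetical helper, the confusion matrix is assumed to hold weighted counts with true classes in rows and predicted classes in columns, and no normalization is applied.

```python
import numpy as np

def accuracy_and_confusion(y_true, y_pred, sample_weight=None):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if sample_weight is None:
        w = np.ones(len(y_true))
    else:
        w = np.asarray(sample_weight, dtype=float)

    # Weighted fraction of correct predictions
    accuracy = float(np.sum(w * (y_true == y_pred)) / np.sum(w))

    # Weighted counts: rows are true classes, columns are predicted classes
    classes = np.unique(np.concatenate([y_true, y_pred]))
    idx_true = np.searchsorted(classes, y_true)
    idx_pred = np.searchsorted(classes, y_pred)
    confusion = np.zeros((len(classes), len(classes)))
    np.add.at(confusion, (idx_true, idx_pred), w)
    return accuracy, confusion

y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
accuracy, confusion = accuracy_and_confusion(y_true, y_pred)
```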

plot_C_curve()[source]

Makes a scatter plot of C values vs loss values.

class roicat.classification.classifier.Autotuner_regression(model_class: Type[BaseEstimator], params: Dict[str, Dict[str, Any]], X: Any, y: Any, cv: Any, fn_loss: Callable, n_jobs_optuna: int = -1, n_startup: int = 15, kwargs_convergence={'max_duration': 600, 'max_trials': 350, 'n_patience': 100, 'tol_frac': 0.05}, sample_weight: Any | None = None, catch_convergence_warnings: bool = True, verbose=True)[source]

Bases: ROICaT_Module

A class for automatic hyperparameter tuning and training of a regression model. RH 2023

model_class

A Scikit-Learn estimator class. Must have:

  • Method: fit(X, y)

  • Method: predict_proba(X) (for classifiers) or predict(X) (for continuous regressors)

Type:

Type[sklearn.base.BaseEstimator]

params

A dictionary of hyperparameters with their names, types, and bounds.

Type:

Dict[str, Dict[str, Any]]

X

Input data. Shape: (n_samples, n_features)

Type:

np.ndarray

y

Output data. Shape: (n_samples,)

Type:

np.ndarray

cv

A Scikit-Learn cross-validator class. Must have:

  • Call signature: idx_train, idx_test = next(self.cv.split(self.X, self.y))

Type:

Type[sklearn.model_selection._split.BaseCrossValidator]

fn_loss

Function to compute the loss. Must have:

  • Call signature: loss, loss_train, loss_test = fn_loss(y_pred_train, y_pred_test, y_true_train, y_true_test, sample_weight_train, sample_weight_test)

Type:

Callable

n_jobs_optuna

Number of jobs for Optuna. Set to -1 to use all cores. Note that some 'solver' options are already parallelized (like 'lbfgs'). Set n_jobs_optuna to 1 for these solvers.

Type:

int

n_startup

The number of startup trials for the optuna pruner and sampler.

Type:

int

kwargs_convergence

Convergence settings for the optimization. Includes:

  • 'n_patience' (int): The number of trials to wait for convergence before stopping the optimization.

  • 'tol_frac' (float): The fractional tolerance for convergence. After n_patience trials, the optimization will stop if the loss has not improved by at least tol_frac.

  • 'max_trials' (int): The maximum number of trials to run.

  • 'max_duration' (int): The maximum duration of the optimization in seconds.

Type:

Dict[str, Union[int, float]]
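One possible reading of these convergence settings is sketched below. The function should_stop is hypothetical, and the exact rule is an assumption: the best loss before the last n_patience trials is compared with the best loss overall, and the search stops when the fractional improvement is below tol_frac or when max_trials is reached. The max_duration check is omitted.

```python
def should_stop(losses, n_patience=100, tol_frac=0.05, max_trials=350):
    """Decide whether to stop, given the loss of every trial so far (lower is better)."""
    n_trials = len(losses)
    if n_trials >= max_trials:
        return True
    if n_trials <= n_patience:
        return False  # not enough history to judge convergence
    best_before = min(losses[:-n_patience])  # best loss before the patience window
    best_now = min(losses)                   # best loss including the window
    if best_before == 0:
        return True  # cannot improve fractionally on a zero loss
    improvement = (best_before - best_now) / abs(best_before)
    return improvement < tol_frac

# Loss plateaus: only a 2% improvement over the last 5 trials
stalled = should_stop([1.0, 0.9, 0.5] + [0.49] * 5, n_patience=5)

# Loss is still dropping: a 60% improvement over the last 5 trials
improving = should_stop([1.0, 0.9, 0.5, 0.3, 0.3, 0.3, 0.3, 0.2], n_patience=5)
```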

sample_weight

Weights for the samples, equal to ones_like(y) if None.

Type:

Optional[np.ndarray]

catch_convergence_warnings

If True, ignore ConvergenceWarning during model fitting.

Type:

bool

verbose

If True, show progress bar and print running results.

Type:

bool

Example

params = {
    'C':             {'type': 'real',        'kwargs': {'log': True, 'low': 1e-4, 'high': 1e4}},
    'penalty':       {'type': 'categorical', 'kwargs': {'choices': ['l1', 'l2']}},
}
fit() BaseEstimator | Dict[str, Any] | None[source]

Fit and tune the hyperparameters and train the classifier.

Returns:

best_model (sklearn.base.BaseEstimator):

The best estimator obtained from hyperparameter tuning.

best_params (Optional[Dict[str, Any]]):

The best parameters obtained from hyperparameter tuning.

Return type:

(Union[sklearn.base.BaseEstimator, Optional[Dict[str, Any]]])

save_model(filepath: str | None = None, allow_overwrite: bool = False)[source]

Uses ONNX to save the best model as a binary file.

Parameters:
  • filepath (str) – The path to save the model to. If None, then the model will not be saved.

  • allow_overwrite (bool) – Whether to allow overwriting of existing files.

Returns:

The ONNX model.

Return type:

(onnx.ModelProto)

class roicat.classification.classifier.Load_ONNX_model_sklearnLogisticRegression(path_or_bytes: str = 'path/to/model.onnx', providers: List[str] = ['CPUExecutionProvider'])[source]

Bases: object

Loads an ONNX model of an sklearn LogisticRegression model into a runtime session. RH 2023

Parameters:

path_or_bytes (Union[str, bytes]) –

Either:

  • The filepath to the ONNX model.

  • The bytes of the ONNX model: model.SerializeToString().

Returns:

A partial function that takes in a numpy array or torch Tensor and passes it through the ONNX runtime session (model).

Return type:

(function)

class roicat.classification.classifier.LossFunction_CrossEntropy_CV(penalty_testTrainRatio: float = 1.0, labels: List | ndarray | None = None, test_or_train: str = 'test')[source]

Bases: object

Calculates the cross-entropy loss of a classifier using cross-validation. RH 2023

Parameters:
  • penalty_testTrainRatio (float) – Strength of the penalty on the ratio of test loss to train loss. The penalty is applied with the formula: loss = loss_test_or_train * ((loss_test / loss_train) ** penalty_testTrainRatio).

  • labels (Optional[Union[List, np.ndarray]]) – A list or ndarray of labels. Shape: (n_samples,).

  • test_or_train (str) – A string indicating whether to apply the penalty to the test or train loss. It should be either 'test' or 'train'.
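The penalty formula can be sketched as a plain function that follows the fn_loss call signature described for Autotuner_regression. This is an illustration, not the class itself: penalized_cross_entropy and cross_entropy are hypothetical names, predictions are assumed to be class probabilities of shape (n_samples, n_classes), and labels are assumed to be integers 0..n_classes-1.

```python
import numpy as np

def cross_entropy(y_proba, y_true, sample_weight=None):
    """Weighted mean negative log-probability assigned to the true class."""
    y_proba = np.asarray(y_proba, dtype=float)
    y_true = np.asarray(y_true)
    p_true = np.clip(y_proba[np.arange(len(y_true)), y_true], 1e-12, 1.0)
    return float(np.average(-np.log(p_true), weights=sample_weight))

def penalized_cross_entropy(
    y_pred_train, y_pred_test, y_true_train, y_true_test,
    sample_weight_train=None, sample_weight_test=None,
    penalty_testTrainRatio=1.0, test_or_train='test',
):
    loss_train = cross_entropy(y_pred_train, y_true_train, sample_weight_train)
    loss_test = cross_entropy(y_pred_test, y_true_test, sample_weight_test)
    base = loss_test if test_or_train == 'test' else loss_train
    # Penalize models whose test loss is large relative to their train loss
    loss = base * ((loss_test / loss_train) ** penalty_testTrainRatio)
    return loss, loss_train, loss_test

proba_train = [[0.9, 0.1], [0.2, 0.8]]
proba_test = [[0.6, 0.4], [0.5, 0.5]]
loss, loss_train, loss_test = penalized_cross_entropy(
    proba_train, proba_test, [0, 1], [0, 1],
)
```

Setting penalty_testTrainRatio to 0 removes the penalty, so the returned loss equals the plain test (or train) loss.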

roicat.pipelines module

roicat.pipelines.pipeline_tracking(params: dict)[source]

Pipeline for tracking ROIs across sessions. RH 2023

Parameters:

params (dict) – Dictionary of parameters. See roicat.util.get_default_parameters(pipeline='tracking') for details.

Returns:

tuple containing:
results (dict):

Dictionary of results.

run_data (dict):

Dictionary containing the different class objects used in the pipeline.

params (dict):

Parameters used in the pipeline. See roicat.helpers.prepare_params() for details.

Return type:

(tuple)

roicat.ROInet module

OSF.io links to ROInet versions:

  • ROInet_tracking:
    • Info: This version does not include occlusions or large affine transformations.

    • Link: https://osf.io/x3fd2/download

    • Hash (MD5 hex): 7a5fb8ad94b110037785a46b9463ea94

  • ROInet_classification:
    • Info: This version includes occlusions and large affine transformations.

    • Link: https://osf.io/c8m3b/download

    • Hash (MD5 hex): 357a8d9b630ec79f3e015d0056a4c2d5
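A downloaded file can be checked against one of the MD5 hashes listed above with the standard library alone. The sketch below is a stand-in, not ROICaT's own hashing helper: md5_hex_of_file is a hypothetical name, and a temporary file with placeholder contents takes the place of a real ROInet.zip.

```python
import hashlib
import os
import tempfile

def md5_hex_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder payload standing in for the downloaded archive
payload = b'stand-in for the contents of ROInet.zip'
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'ROInet.zip')
    with open(path, 'wb') as f:
        f.write(payload)
    file_hash = md5_hex_of_file(path)

# In practice, compare against the published hash for the chosen ROInet version
expected_hash = hashlib.md5(payload).hexdigest()
hash_matches = file_hash == expected_hash
```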

class roicat.ROInet.Dataloader_ROInet(ROI_images: ndarray, batchSize_dataloader: int = 8, pinMemory_dataloader: bool = True, numWorkers_dataloader: int = -1, persistentWorkers_dataloader: bool = True, prefetchFactor_dataloader: int = 2, transforms: Callable | None = None, n_transforms: int = 1, img_size_out: Tuple[int, int] = (224, 224), jit_script_transforms: bool = False, shuffle_dataloader: bool = False, drop_last_dataloader: bool = False, verbose: bool = True)[source]

Bases: ROICaT_Module

Class for creating a dataloader for the ROInet network. JZ, RH 2023

Parameters:
  • ROI_images (np.ndarray) – Array of ROIs to resize. Shape should be (nROIs, height, width).

  • pref_plot (bool) – If True, plots the sizes of the ROI images before and after normalization. (Default is False)

  • batchSize_dataloader (int) – The batch size to use for the DataLoader. (Default is 8)

  • pinMemory_dataloader (bool) – If True, pins the memory of the DataLoader, as per PyTorch’s best practices. (Default is True)

  • numWorkers_dataloader (int) – The number of worker processes for data loading. (Default is -1)

  • persistentWorkers_dataloader (bool) – If True, uses persistent worker processes. (Default is True)

  • prefetchFactor_dataloader (int) – The prefetch factor for data loading. (Default is 2)

  • transforms (Optional[Callable]) – The transforms to use for the DataLoader. If None, the function will only scale dynamic range (to 0-1), resize (to img_size_out dimensions), and tile channels (to 3) as a minimum to pass images through the network. (Default is None)

  • n_transforms (int) – The number of times to apply the transforms to each image. Should be 1 for inference and 2 for training. (Default is 1)

  • img_size_out (Tuple[int, int]) – The image output dimensions of DataLoader if transforms is None. (Default is (224, 224))

  • jit_script_transforms (bool) – If True, converts the transforms pipeline into a TorchScript pipeline, potentially improving calculation speed but can cause problems with multiprocessing. (Default is False)

  • shuffle_dataloader (bool) – If True, shuffles the data. Should be set to True for SimCLR training. (Default is False)

  • drop_last_dataloader (bool) – If True, drops the last batch if it is not full. Should be set to True for SimCLR training. (Default is False)

  • verbose (bool) – If True, print out extra information. (Default is True)
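The minimal preprocessing applied when transforms is None (scale to 0-1, resize, tile to 3 channels) can be sketched in NumPy. This is an approximation under stated assumptions, not the class's code: minimal_preprocess is a hypothetical name, scaling is assumed to be per image, and nearest-neighbor index sampling stands in for whatever interpolation the real resize uses.

```python
import numpy as np

def minimal_preprocess(ROI_images, img_size_out=(224, 224)):
    x = np.asarray(ROI_images, dtype=np.float32)

    # 1) Scale the dynamic range of each image to [0, 1]
    lo = x.min(axis=(1, 2), keepdims=True)
    hi = x.max(axis=(1, 2), keepdims=True)
    x = (x - lo) / np.maximum(hi - lo, 1e-12)

    # 2) Resize to img_size_out by nearest-neighbor sampling of rows and columns
    height, width = x.shape[1:]
    rows = np.arange(img_size_out[0]) * height // img_size_out[0]
    cols = np.arange(img_size_out[1]) * width // img_size_out[1]
    x = x[:, rows[:, None], cols[None, :]]

    # 3) Tile the single channel to 3 channels: (n, 3, H, W)
    return np.repeat(x[:, None], 3, axis=1)

ROI_images = np.random.default_rng(0).random((5, 36, 36)) * 40.0
batch = minimal_preprocess(ROI_images)
```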

class roicat.ROInet.ROInet_embedder(dir_networkFiles: str, device: str = 'cpu', download_method: str = 'check_local_first', download_url: str = 'https://osf.io/x3fd2/download', download_hash: dict = None, names_networkFiles: dict = None, forward_pass_version: str = 'latent', verbose: bool = True)[source]

Bases: ROICaT_Module

Class for loading the ROInet model, preparing data for it, and running it. RH, JZ 2022

OSF.io links to ROInet versions:

  • ROInet_tracking:
    • Info: This version does not include occlusions or large affine transformations.

    • Link: https://osf.io/x3fd2/download

    • Hash (MD5 hex): 7a5fb8ad94b110037785a46b9463ea94

  • ROInet_classification:
    • Info: This version includes occlusions and large affine transformations.

    • Link: https://osf.io/c8m3b/download

    • Hash (MD5 hex): 357a8d9b630ec79f3e015d0056a4c2d5

Parameters:
  • dir_networkFiles (str) – Directory to find an existing ROInet.zip file or download and extract a new one into.

  • device (str) – Device to use for the model and data. (Default is 'cpu')

  • download_method (str) –

    Approach to downloading the network files. Options are:

    • 'check_local_first': Check whether the network files are already in dir_networkFiles; if so, use them.

    • 'force_download': Download an ROInet.zip file from download_url.

    • 'force_local': Use an existing local copy of an ROInet.zip file; if it doesn’t exist, raise an error. Hash checking is done and download_hash must be specified.

    (Default is 'check_local_first')

  • download_url (str) – URL to download the ROInet.zip file from. (Default is https://osf.io/x3fd2/download)

  • download_hash (dict) – MD5 hash of the ROInet.zip file. This can be obtained from the ROICaT documentation. If you don’t have one, use download_method='force_download' and determine the hash using helpers.hash_file(). (Default is None)

  • names_networkFiles (dict) –

    Names of the files in the ROInet.zip file. If uncertain, leave as None. The dictionary should have the form:

    {'params': 'params.json', 'model': 'model.py', 'state_dict': 'ConvNext_tiny__1_0_unfrozen__simCLR.pth',}

    Where ‘params’ is the parameters used to train the network (usually a .json file), ‘model’ is the model definition (usually a .py file), and ‘state_dict’ are the weights of the network (usually a .pth file). (Default is None)

  • forward_pass_version (str) – Version of the forward pass to use. Options are ‘latent’ (return the post-head output latents, use this for tracking), ‘head’ (return the output of the head layers, use this for classification), and ‘base’ (return the output of the base model). (Default is 'latent')

  • verbose (bool) – If True, print out extra information. (Default is True)

generate_dataloader(ROI_images: List[ndarray], um_per_pixel: float = 1.0, nan_to_num: bool = True, nan_to_num_val: float = 0.0, pref_plot: bool = False, batchSize_dataloader: int = 8, pinMemory_dataloader: bool = True, numWorkers_dataloader: int = -1, persistentWorkers_dataloader: bool = True, prefetchFactor_dataloader: int = 2, transforms: Callable | None = None, img_size_out: Tuple[int, int] = (224, 224), jit_script_transforms: bool = False)[source]

Generates a PyTorch DataLoader for a list of Region of Interest (ROI) images. Performs preprocessing such as rescaling, normalization, and resizing.

Parameters:
  • ROI_images (List[np.ndarray]) – The ROI images to use for the dataloader. List of arrays, each array corresponds to a session and is of shape (n_rois, height, width).

  • um_per_pixel (float) – The number of microns per pixel. Used to rescale the ROI images to the same size as the network input. (Default is 1.0)

  • nan_to_num (bool) – Whether to replace NaNs with a specific value. (Default is True)

  • nan_to_num_val (float) – The value to replace NaNs with. (Default is 0.0)

  • pref_plot (bool) – If True, plots the sizes of the ROI images before and after normalization. (Default is False)

  • batchSize_dataloader (int) – The batch size to use for the DataLoader. (Default is 8)

  • pinMemory_dataloader (bool) – If True, pins the memory of the DataLoader, as per PyTorch’s best practices. (Default is True)

  • numWorkers_dataloader (int) – The number of worker processes for data loading. (Default is -1)

  • persistentWorkers_dataloader (bool) – If True, uses persistent worker processes. (Default is True)

  • prefetchFactor_dataloader (int) – The prefetch factor for data loading. (Default is 2)

  • transforms (Optional[Callable]) – The transforms to use for the DataLoader. If None, the function will only scale dynamic range (to 0-1), resize (to img_size_out dimensions), and tile channels (to 3) as a minimum to pass images through the network. (Default is None)

  • img_size_out (Tuple[int, int]) – The image output dimensions of DataLoader if transforms is None. (Default is (224, 224))

  • jit_script_transforms (bool) – If True, converts the transforms pipeline into a TorchScript pipeline, potentially improving calculation speed but can cause problems with multiprocessing. (Default is False)

Returns:

ROI_images (np.ndarray):

The ROI images after normalization and resizing. Shape is (n_sessions, n_rois, n_channels, height, width).

Return type:

(np.ndarray)

Example

dataloader = generate_dataloader(ROI_images)
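
The NaN handling controlled by nan_to_num and nan_to_num_val can be pictured with plain NumPy. This is a minimal sketch, assuming the replacement is applied to each session array independently; the helper name replace_nans is hypothetical and is not part of ROICaT.

```python
import numpy as np

def replace_nans(ROI_images, nan_to_num_val=0.0):
    """Return a new list in which NaNs in every session array are replaced."""
    return [np.nan_to_num(rois, nan=nan_to_num_val) for rois in ROI_images]

# Two toy sessions, each of shape (n_rois, height, width).
session_a = np.ones((2, 4, 4), dtype=np.float32)
session_a[0, 1, 1] = np.nan
session_b = np.ones((3, 4, 4), dtype=np.float32)

cleaned = replace_nans([session_a, session_b], nan_to_num_val=0.0)
```

The input keeps the documented layout: one array per session, with ROIs along the first axis.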
generate_latents() Tensor[source]

Passes the data in the dataloader through the network and generates latents.

Returns:

latents (torch.Tensor):

Latents for each ROI (Region of Interest).

Return type:

(torch.Tensor)

class roicat.ROInet.ROInet_embedder_original(dir_networkFiles: str, device: str = 'cpu', download_method: str = 'check_local_first', download_url: str = 'https://osf.io/x3fd2/download', download_hash: dict = None, names_networkFiles: dict = None, forward_pass_version: str = 'latent', verbose: bool = True)[source]

Bases: ROICaT_Module

Class for loading the ROInet model, preparing data for it, and running it. RH, JZ 2022

OSF.io links to ROInet versions:

  • ROInet_tracking:
    • Info: This version does not include occlusions or large affine transformations.

    • Link: https://osf.io/x3fd2/download

    • Hash (MD5 hex): 7a5fb8ad94b110037785a46b9463ea94

  • ROInet_classification:
    • Info: This version includes occlusions and large affine transformations.

    • Link: https://osf.io/c8m3b/download

    • Hash (MD5 hex): 357a8d9b630ec79f3e015d0056a4c2d5

Parameters:
  • dir_networkFiles (str) – Directory to find an existing ROInet.zip file or download and extract a new one into.

  • device (str) – Device to use for the model and data. (Default is 'cpu')

  • download_method (str) –

    Approach to downloading the network files. Options are:

    • 'check_local_first': Check if the network files are already in dir_networkFiles, if so, use them.

    • 'force_download': Download an ROInet.zip file from download_url.

    • 'force_local': Use an existing local copy of an ROInet.zip file; if it doesn’t exist, raise an error. Hash checking is done and download_hash must be specified.

    (Default is 'check_local_first')

  • download_url (str) – URL to download the ROInet.zip file from. (Default is https://osf.io/x3fd2/download)

  • download_hash (dict) – MD5 hash of the ROInet.zip file. This can be obtained from ROICaT documentation. If you don’t have one, use download_method='force_download' and determine the hash using helpers.hash_file(). (Default is None)

  • names_networkFiles (dict) –

    Names of the files in the ROInet.zip file. If uncertain, leave as None. The dictionary should have the form:

    {'params': 'params.json', 'model': 'model.py', 'state_dict': 'ConvNext_tiny__1_0_unfrozen__simCLR.pth',}

    Where ‘params’ is the parameters used to train the network (usually a .json file), ‘model’ is the model definition (usually a .py file), and ‘state_dict’ are the weights of the network (usually a .pth file). (Default is None)

  • forward_pass_version (str) – Version of the forward pass to use. Options are ‘latent’ (return the post-head output latents, use this for tracking), ‘head’ (return the output of the head layers, use this for classification), and ‘base’ (return the output of the base model). (Default is 'latent')

  • verbose (bool) – If True, print out extra information. (Default is True)
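
The MD5 hex hashes listed above can be compared against a local file with the standard library alone. This is a minimal sketch, not the behaviour of helpers.hash_file(); the function name md5_hex_of_file and the temporary file standing in for ROInet.zip are assumptions made for illustration.

```python
import hashlib
import os
import tempfile

def md5_hex_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demonstration on a small temporary file standing in for ROInet.zip.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"placeholder bytes")
demo_digest = md5_hex_of_file(tmp.name)
os.remove(tmp.name)
```

For a real download, the resulting string would be compared with the published hash for the chosen ROInet version.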

generate_dataloader(ROI_images: List[ndarray], um_per_pixel: float = 1.0, nan_to_num: bool = True, nan_to_num_val: float = 0.0, pref_plot: bool = False, batchSize_dataloader: int = 8, pinMemory_dataloader: bool = True, numWorkers_dataloader: int = -1, persistentWorkers_dataloader: bool = True, prefetchFactor_dataloader: int = 2, transforms: Callable | None = None, img_size_out: Tuple[int, int] = (224, 224), jit_script_transforms: bool = False)[source]

Generates a PyTorch DataLoader for a list of Region of Interest (ROI) images. Performs preprocessing such as rescaling, normalization, and resizing.

Parameters:
  • ROI_images (List[np.ndarray]) – The ROI images to use for the dataloader. List of arrays, each array corresponds to a session and is of shape (n_rois, height, width).

  • um_per_pixel (float) – The number of microns per pixel. Used to rescale the ROI images to the same size as the network input. (Default is 1.0)

  • nan_to_num (bool) – Whether to replace NaNs with a specific value. (Default is True)

  • nan_to_num_val (float) – The value to replace NaNs with. (Default is 0.0)

  • pref_plot (bool) – If True, plots the sizes of the ROI images before and after normalization. (Default is False)

  • batchSize_dataloader (int) – The batch size to use for the DataLoader. (Default is 8)

  • pinMemory_dataloader (bool) – If True, pins the memory of the DataLoader, as per PyTorch’s best practices. (Default is True)

  • numWorkers_dataloader (int) – The number of worker processes for data loading. (Default is -1)

  • persistentWorkers_dataloader (bool) – If True, uses persistent worker processes. (Default is True)

  • prefetchFactor_dataloader (int) – The prefetch factor for data loading. (Default is 2)

  • transforms (Optional[Callable]) – The transforms to use for the DataLoader. If None, the function will only scale dynamic range (to 0-1), resize (to img_size_out dimensions), and tile channels (to 3) as a minimum to pass images through the network. (Default is None)

  • img_size_out (Tuple[int, int]) – The image output dimensions of DataLoader if transforms is None. (Default is (224, 224))

  • jit_script_transforms (bool) – If True, converts the transforms pipeline into a TorchScript pipeline, potentially improving calculation speed but can cause problems with multiprocessing. (Default is False)

Returns:

ROI_images (np.ndarray):

The ROI images after normalization and resizing. Shape is (n_sessions, n_rois, n_channels, height, width).

Return type:

(np.ndarray)

Example

dataloader = generate_dataloader(ROI_images)
generate_latents() Tensor[source]

Passes the data in the dataloader through the network and generates latents.

Returns:

latents (torch.Tensor):

Latents for each ROI (Region of Interest).

Return type:

(torch.Tensor)

classmethod resize_ROIs(ROI_images: ndarray, um_per_pixel: float) ndarray[source]

Resizes the ROI (Region of Interest) images to prepare them for a pass through the network.

Parameters:
  • ROI_images (np.ndarray) – The ROI images to resize. Array of shape (n_rois, height, width).

  • um_per_pixel (float) – The number of microns per pixel. This value is used to rescale the ROI images so that they occupy a standard region of the image frame.

Returns:

ROI_images_rs (np.ndarray):

The resized ROI images.

Return type:

(np.ndarray)

class roicat.ROInet.Resizer_ROI_images(ROI_images: ndarray, um_per_pixel: float, nan_to_num: bool = True, nan_to_num_val: float = 0.0, verbose: bool = True)[source]

Bases: ROICaT_Module

Class for resizing ROIs. JZ, RH 2023

Parameters:
  • ROI_images (np.ndarray) – Array of ROIs to resize. Shape should be (nROIs, height, width).

  • um_per_pixel (float) – Size of a pixel in microns.

  • nan_to_num (bool) – Whether to replace NaNs with a specific value. (Default is True)

  • nan_to_num_val (float) – The value to replace NaNs with. (Default is 0.0)

  • verbose (bool) – If True, print out extra information. (Default is True)

plot_resized_comparison(ROI_images_cat: ndarray)[source]

Plot a comparison of the ROI sizes before and after resizing.

Parameters:

ROI_images_cat (np.ndarray) – Array of ROIs to resize. Shape should be (nROIs, height, width).

classmethod resize_ROIs(ROI_images: ndarray, um_per_pixel: float) ndarray[source]

Resizes the ROI (Region of Interest) images to prepare them for a pass through the network.

Parameters:
  • ROI_images (np.ndarray) – The ROI images to resize. Array of shape (n_rois, height, width).

  • um_per_pixel (float) – The number of microns per pixel. This value is used to rescale the ROI images so that they occupy a standard region of the image frame.

Returns:

ROI_images_rs (np.ndarray):

The resized ROI images.

Return type:

(np.ndarray)

class roicat.ROInet.ScaleDynamicRange(scaler_bounds=(0, 1), epsilon=1e-09)[source]

Bases: Module

Min-max scaling of the input tensor. RH 2021

forward(tensor)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
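
Min-max scaling with scaler_bounds and epsilon can be sketched in NumPy as follows. The exact formula used by ScaleDynamicRange is not stated in this reference, so the placement of epsilon in the denominator is an assumption, and scale_dynamic_range is a hypothetical name.

```python
import numpy as np

def scale_dynamic_range(x, scaler_bounds=(0, 1), epsilon=1e-9):
    """Min-max scale `x` into the interval given by `scaler_bounds`."""
    lo, hi = scaler_bounds
    x = np.asarray(x, dtype=np.float64)
    x_min, x_max = x.min(), x.max()
    # epsilon guards against division by zero for constant inputs.
    unit = (x - x_min) / (x_max - x_min + epsilon)
    return unit * (hi - lo) + lo

scaled = scale_dynamic_range(np.array([2.0, 4.0, 6.0]))
constant = scale_dynamic_range(np.ones(3))
```

With this placement of epsilon, a constant image maps to the lower bound instead of producing NaNs.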

class roicat.ROInet.TileChannels(dim=0, n_channels=3)[source]

Bases: Module

Expand dimension dim in X_in and tile to be N channels. RH 2021

forward(tensor)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class roicat.ROInet.Unsqueeze(dim=0)[source]

Bases: Module

Inserts a singleton dimension at position dim in the input tensor. JZ 2023

forward(tensor)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class roicat.ROInet.dataset_simCLR(X: Tensor | array | List[float], y: Tensor | array | List[int], n_transforms: int = 2, transform: Callable | None = None, DEVICE: str = 'cpu', dtype_X: dtype = torch.float32, dtype_y: dtype = torch.int64)[source]

Bases: Dataset

Parameters:
  • X (Union[torch.Tensor, np.array, List[float]]) – Images. Expected shape: (n_samples, height, width). Currently expects no channel dimension. If/when it exists, then shape should be (n_samples, n_channels, height, width).

  • y (Union[torch.Tensor, np.array, List[int]]) – Labels. Shape: (n_samples).

  • n_transforms (int) – Number of transformations to apply to each image. Should be >= 1. (Default is 2)

  • transform (Optional[Callable]) –

    Optional transform to be applied on a sample. See torchvision.transforms for more information. Can use torch.nn.Sequential(a, bunch, of, transforms,) or other methods from torchvision.transforms.

    • If not None: Transform(s) are applied to each image and the output shape of X_sample_transformed for __getitem__ will be (n_samples, n_transforms, n_channels, height, width).

    • If None: No transform is applied and output shape of X_sample_transformed for __getitem__ will be (n_samples, n_channels, height, width) (which is missing the n_transforms dimension).

    (Default is None)

  • DEVICE (str) – Device on which the data will be stored and transformed. Best to leave this as ‘cpu’ and do .to(DEVICE) on the data for the training loop. (Default is 'cpu')

  • dtype_X (torch.dtype) – Data type of X. (Default is torch.float32)

  • dtype_y (torch.dtype) – Data type of y. (Default is torch.int64)

  • temp_uncetainty (float) – Temperature term applied to the CrossEntropyLoss input. (Default is 1.0 for no change)

Example

transforms = torch.nn.Sequential(
    torchvision.transforms.RandomHorizontalFlip(p=0.5),

    torchvision.transforms.GaussianBlur(
        5,
        sigma=(0.01, 1.)
    ),
    torchvision.transforms.RandomPerspective(
        distortion_scale=0.6,
        p=1,
        interpolation=torchvision.transforms.InterpolationMode.BILINEAR,
        fill=0
    ),
    torchvision.transforms.RandomAffine(
        degrees=(-180,180),
        translate=(0.4, 0.4),
        scale=(0.7, 1.7),
        shear=(-20, 20, -20, 20),
        interpolation=torchvision.transforms.InterpolationMode.BILINEAR,
        fill=0
    ),
)
scripted_transforms = torch.jit.script(transforms)

dataset = dataset_simCLR(  torch.tensor(images),
                            labels,
                            n_transforms=2,
                            transform=scripted_transforms,
                            DEVICE='cpu',
                            dtype_X=torch.float32,
                            dtype_y=torch.int64)

dataloader = torch.utils.data.DataLoader(   dataset,
                                        batch_size=64,
                                        shuffle=True,
                                        drop_last=True,
                                        pin_memory=False,
                                        num_workers=0)
tile_channels(X_in: Tensor | ndarray, dim: int = -3) Tensor | ndarray[source]

Expand dimension dim in X_in and tile to be 3 channels.

Parameters:
  • X_in (torch.Tensor or np.ndarray) – Input image with shape: (n_channels==1, height, width)

  • dim (int) – Dimension to expand. (Default is -3)

Returns:

X_out (torch.Tensor or np.ndarray):

Output image with shape: (n_channels==3, height, width)

Return type:

(torch.Tensor or np.ndarray)
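
The channel tiling described above can be reproduced for the NumPy case with np.repeat. This is an illustrative sketch assuming a single-channel input; tile_to_three_channels is a hypothetical name, not the ROICaT method.

```python
import numpy as np

def tile_to_three_channels(X_in, dim=-3):
    """Repeat the singleton channel axis `dim` three times."""
    return np.repeat(X_in, 3, axis=dim)

# One image with shape (n_channels==1, height, width).
single = np.arange(6, dtype=np.float32).reshape(1, 2, 3)
tiled = tile_to_three_channels(single)
```

Each of the three output channels is a copy of the original single channel.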

roicat.ROInet.resize_affine(img: ndarray, scale: float, clamp_range: bool = False) ndarray[source]

Resizes an image using an affine transformation, scaled by a factor.

Parameters:
  • img (np.ndarray) – The input image to resize. Shape: (H, W)

  • scale (float) – The scale factor to apply for resizing.

  • clamp_range (bool) – If True, the image will be clamped to the range [min(img), max(img)] to prevent interpolation from extending outside of the image’s range. (Default is False)

Returns:

resized_image (np.ndarray):

The resized image.

Return type:

(np.ndarray)
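
Scaling an image about its center while keeping the frame size fixed can be sketched with an inverse coordinate mapping. This is not the ROICaT implementation: it uses nearest-neighbour sampling for brevity (so no clamping is needed), and resize_about_center is a hypothetical name.

```python
import numpy as np

def resize_about_center(img, scale):
    """Scale a 2D image about its center; output has the same shape."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Map each output pixel back to the source pixel it samples from.
    src_y = np.rint(cy + (yy - cy) / scale).astype(int)
    src_x = np.rint(cx + (xx - cx) / scale).astype(int)
    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out

img = np.zeros((9, 9))
img[3:6, 3:6] = 1.0
grown = resize_about_center(img, 2.0)
shrunk = resize_about_center(img, 0.5)
same = resize_about_center(img, 1.0)
```

An interpolating variant (bilinear or higher order) can overshoot the input range, which is the situation clamp_range is documented to address.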

roicat.data_importing module

class roicat.data_importing.Data_caiman(paths_resultsFiles: List[str], include_discarded: bool = True, um_per_pixel: float = 1.0, out_height_width: List[int] = [36, 36], centroid_method: str = 'median', verbose: bool = True, class_labels: str | None = None)[source]

Bases: Data_roicat

Class for importing data from CaImAn output files, specifically hdf5 results files.

Parameters:
  • paths_resultsFiles (List[str]) – List of paths to the results files.

  • include_discarded (bool) – If True, include ROIs that were discarded by CaImAn. Default is True.

  • um_per_pixel (float) – Microns per pixel. Default is 1.0.

  • out_height_width (List[int]) – Output height and width. Default is [36, 36].

  • centroid_method (str) – Method for calculating the centroid of an ROI. Should be: 'centerOfMass' or 'median'.

  • verbose (bool) – If True, status messages will be printed. Default is True.

  • class_labels (str, optional) – Class labels. Default is None.
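
The two centroid_method options can give different answers for the same ROI. The sketch below shows one plausible reading of each: an intensity-weighted mean for 'centerOfMass' and the median coordinate of nonzero pixels for 'median'. These definitions and both function names are assumptions; the ROICaT implementation may differ in detail.

```python
import numpy as np

def centroid_center_of_mass(roi):
    """Intensity-weighted mean position (y, x) of a 2D ROI image."""
    yy, xx = np.indices(roi.shape)
    total = roi.sum()
    return (yy * roi).sum() / total, (xx * roi).sum() / total

def centroid_median(roi):
    """Median position (y, x) of the nonzero pixels of a 2D ROI image."""
    ys, xs = np.nonzero(roi)
    return np.median(ys), np.median(xs)

roi = np.zeros((5, 5))
roi[1, 1] = 1.0
roi[1, 2] = 1.0
roi[3, 3] = 6.0  # one bright pixel pulls the center of mass toward it

com = centroid_center_of_mass(roi)
med = centroid_median(roi)
```

In this toy ROI the center of mass is drawn toward the bright pixel, while the median ignores pixel intensity.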

import_FOV_images(paths_resultsFiles: List | None = None, images: List | None = None) List[ndarray][source]

Imports the FOV images from the CaImAn results files.

Parameters:
  • paths_resultsFiles (Optional[List]) – List of paths to CaImAn results files. If not provided, will use the paths stored in the class instance.

  • images (Optional[List]) – List of FOV images. If None, the function will import the estimates.b image from the paths specified in paths_resultsFiles.

Returns:

FOV images (np.ndarray):

FOV images, one per session. Each image has shape (FOV_height, FOV_width).

Return type:

List[np.ndarray]

import_ROI_centeredImages(out_height_width: List[int] = [36, 36]) ndarray[source]

Imports the ROI centered images from the CaImAn results files.

Parameters:

out_height_width (List[int]) – Height and width of the output images. Default is [36,36].

Returns:

ROI centered images (np.ndarray):

ROI centered images. Shape is (nROIs, out_height_width[0], out_height_width[1]).

Return type:

(np.ndarray)

import_cnn_caiman_preds(path_resultsFile: str | Path, include_discarded: bool = True) ndarray | None[source]

Imports the CNN-based CaImAn prediction probabilities from the given file.

Parameters:
  • path_resultsFile (Union[str, pathlib.Path]) – Path to a single results file. Can be either a string or a pathlib.Path object.

  • include_discarded (bool) – If set to True, the function will include ROIs that were discarded by CaImAn. By default, this is set to True.

Returns:

preds (np.ndarray):

CNN-based CaImAn prediction probabilities.

Return type:

(np.ndarray)

import_overall_caiman_labels(path_resultsFile: str | Path, include_discarded: bool = True) ndarray[source]

Imports the overall CaImAn labels from the results file.

Parameters:
  • path_resultsFile (Union[str, pathlib.Path]) – Path to a single results file.

  • include_discarded (bool) – If True, include ROIs that were discarded by CaImAn. Default is True.

Returns:

labels (np.ndarray):

Overall CaImAn labels.

Return type:

(np.ndarray)

import_spatialFootprints(path_resultsFile: str | Path, include_discarded: bool = True) csr_matrix[source]

Imports the spatial footprints from the results file. Note that CaImAn’s data['estimates']['A'] is similar to self.spatialFootprints, but uses ‘F’ order. This function converts this into ‘C’ order to form self.spatialFootprints.

Parameters:
  • path_resultsFile (Union[str, pathlib.Path]) – Path to a single results file.

  • include_discarded (bool) – If True, include ROIs that were discarded by CaImAn. Default is True.

Returns:

Spatial footprints (scipy.sparse.csr_matrix):

Spatial footprints.

Return type:

(scipy.sparse.csr_matrix)
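
The difference between 'F' and 'C' ordering of a flattened image can be seen on a small dense array. This sketch only illustrates the reordering for a single image; the variable names are illustrative and it does not reproduce the sparse handling in the method.

```python
import numpy as np

H, W = 2, 3
image = np.arange(H * W).reshape(H, W)  # [[0, 1, 2], [3, 4, 5]]

# Column-major ('F') flattening walks down each column first.
flat_F = image.flatten(order="F")

# Recover the image from the 'F' vector, then flatten row-major ('C').
flat_C = flat_F.reshape((H, W), order="F").flatten(order="C")
```

Reading an 'F'-ordered vector as if it were 'C'-ordered would scramble pixel positions, which is why the conversion matters.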

set_caimanLabels(overall_caimanLabels: List[List[bool]]) None[source]

Sets the CaImAn labels.

Parameters:

overall_caimanLabels (List[List[bool]]) – List of lists of CaImAn labels. The outer list corresponds to sessions, and the inner list corresponds to ROIs.

set_caimanPreds(cnn_caimanPreds: List[List[bool]]) None[source]

Sets the CNN-CaImAn predictions.

Parameters:

cnn_caimanPreds (List[List[bool]]) – List of lists of CNN-CaImAn predictions. The outer list corresponds to sessions, and the inner list corresponds to ROIs.

class roicat.data_importing.Data_roicat(verbose: bool = True)[source]

Bases: ROICaT_Module

Superclass for all data objects. Can be used as a template for creating custom data objects. RH 2022

Parameters:

verbose (bool) – Determines whether to print status updates. (Default is True)

type

The type of the data object. Set by the subclass.

Type:

object

n_sessions

The number of imaging sessions.

Type:

int

n_roi

The number of ROIs in each session.

Type:

int

n_roi_total

The total number of ROIs across all sessions.

Type:

int

FOV_height

The height of the field of view in pixels.

Type:

int

FOV_width

The width of the field of view in pixels.

Type:

int

FOV_images

A list of numpy arrays, each with shape (FOV_height, FOV_width). Each element represents an imaging session.

Type:

List[np.ndarray]

ROI_images

A list of numpy arrays, each with shape (n_roi, height, width). Each element represents an imaging session and each element of the numpy array (first dimension) is an ROI.

Type:

List[np.ndarray]

spatialFootprints

A list of scipy.sparse.csr_matrix objects, each with shape (n_roi, FOV_height * FOV_width). Each element represents an imaging session.

Type:

List[object]

class_labels_raw

A list of numpy arrays, each with shape (n_roi,), where each element is an integer. Each element of the list is an imaging session and each element of the numpy array is a class label.

Type:

List[np.ndarray]

class_labels_index

A list of numpy arrays, each with shape (n_roi,), where each element is an integer. Each element of the list is an imaging session and each element of the numpy array is the index of the class label obtained from passing the raw class label through np.unique(*, return_inverse=True).

Type:

List[np.ndarray]

um_per_pixel

The conversion factor from pixels to microns. This is used to scale the ROI_images to a common size.

Type:

float

session_bool

A boolean matrix with shape (n_roi_total, n_sessions). Each element is True if the ROI is present in the session.

Type:

np.ndarray
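
The relationship between the per-session ROI counts, n_roi_total, and session_bool can be illustrated with toy values. This is a sketch of the documented shapes only; how ROICaT builds the matrix internally is not shown here.

```python
import numpy as np

n_roi = [3, 2]                # toy ROI counts for two sessions
n_sessions = len(n_roi)
n_roi_total = sum(n_roi)

# Session index of every ROI, concatenated across sessions: [0, 0, 0, 1, 1]
session_index = np.repeat(np.arange(n_sessions), n_roi)

# Boolean matrix of shape (n_roi_total, n_sessions).
session_bool = session_index[:, None] == np.arange(n_sessions)[None, :]
```

Each row has exactly one True entry, marking the session that ROI belongs to.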

check_completeness(verbose: bool = True) None[source]

Checks which pipelines the data object is capable of running given the attributes that have been set.

Parameters:

verbose (bool) – If True, outputs progress and error messages. (Default is True)

get_maxIntensityProjection_spatialFootprints(sf: List[csr_matrix] | None = None, normalize: bool = True)[source]

Returns the maximum intensity projection of the spatial footprints.

Parameters:
  • sf (List[scipy.sparse.csr_matrix]) – List of spatial footprints, one for each session.

  • normalize (bool) – If True, normalizes the [min, max] range of each ROI to [0, 1] before computing the maximum intensity projection.

Returns:

List of maximum intensity projections, one for each session.

Return type:

List[np.ndarray]
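
A maximum intensity projection of one session's footprints can be sketched on a dense array. The method itself takes sparse matrices; the dense input, the small floor on the denominator, and the name max_intensity_projection are assumptions made to keep the example short.

```python
import numpy as np

def max_intensity_projection(sf_dense, height, width, normalize=True):
    """sf_dense has shape (n_roi, height * width), flattened in 'C' order."""
    sf = np.asarray(sf_dense, dtype=np.float64)
    if normalize:
        mins = sf.min(axis=1, keepdims=True)
        maxs = sf.max(axis=1, keepdims=True)
        # Rescale each ROI to [0, 1]; the floor avoids division by zero.
        sf = (sf - mins) / np.maximum(maxs - mins, 1e-12)
    return sf.max(axis=0).reshape(height, width)

footprints = np.array([
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 10.0],
])
mip = max_intensity_projection(footprints, 2, 2)
mip_raw = max_intensity_projection(footprints, 2, 2, normalize=False)
```

With normalization, dim and bright ROIs contribute equally to the projection; without it, bright ROIs dominate.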

import_from_dict(dict_load: Dict[str, Any]) None[source]

Imports attributes from a dictionary. This is useful if a dictionary that can be serialized was saved.

Parameters:

dict_load (Dict[str, Any]) – Dictionary containing attributes to load.

Note

This method does not return anything. It modifies the object state by importing attributes from the provided dictionary.

set_FOVHeightWidth(FOV_height: int, FOV_width: int)[source]

Sets the FOV_height and FOV_width attributes.

Parameters:
  • FOV_height (int) – The height of the field of view (FOV) in pixels.

  • FOV_width (int) – The width of the field of view (FOV) in pixels.

set_FOV_images(FOV_images: List[ndarray])[source]

Sets the FOV_images attribute.

Parameters:

FOV_images (List[np.ndarray]) – List of 2D numpy.ndarray objects, one for each session. Each array should have shape (FOV_height, FOV_width).

set_ROI_images(ROI_images: List[ndarray], um_per_pixel: float | None = None) None[source]

Imports ROI images into the class. Images are expected to be formatted as a list of numpy arrays, one per imaging session. Each array has shape (n_roi, height, width). This method will set the attributes: self.ROI_images, self.n_roi, self.n_roi_total, self.n_sessions. If any of these attributes are already set, it will verify the new values match the existing ones.

Parameters:
  • ROI_images (List[np.ndarray]) – List of numpy arrays each of shape (n_roi, height, width).

  • um_per_pixel (Optional[float]) – The number of microns per pixel. This is used to resize the images to a common size. (Default is None)

set_class_labels(labels: List[ndarray] | ndarray | None = None, path_labels: str | List[str] | None = None, n_classes: int | None = None) None[source]

Imports class labels into the class.

  • labels are expected to be formatted as a list of numpy arrays or strings. Each element in the list is a session, and each element in the numpy array is associated with the nth element of the self.ROI_images list. Each element is a numpy array of shape (n_roi,).

  • Sets the attributes: self.class_labels_raw, self.class_labels_index, self.n_classes, self.n_class_labels, self.n_class_labels_total, self.unique_class_labels. If any of these attributes are already set, the method verifies that the new values match the existing ones.

Parameters:
  • labels (Optional[Union[List[np.ndarray], np.ndarray]]) –

    • If None: path_labels must be specified.

    • If a list of np.ndarray: each element should be a 1D array of integers or strings of length n_roi specifying the class label for each ROI.

    (Default is None)

  • path_labels (Optional[Union[str, List[str]]]) –

    • If None: labels must be specified.

    • If a list of str: each element should be a path to either:

      • A .npy file containing a numpy array of shape (n_roi,) OR

      • A .pkl or .npy file containing a dictionary with an item that has key ‘labels’ and value of a numpy array of shape (n_roi,).

    The numpy array should be of integers or strings specifying the class label for each ROI. (Default is None)

  • n_classes (Optional[int]) – Number of classes. If not provided, it will be inferred from the class labels. (Default is None)
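
The mapping from raw class labels to integer indices via np.unique(*, return_inverse=True) can be sketched as follows. Pooling labels across sessions before calling np.unique is an assumption made here so that indices are comparable between sessions.

```python
import numpy as np

# Toy raw labels for two sessions.
class_labels_raw = [
    np.array(["cell", "noise", "cell"]),
    np.array(["noise", "dendrite"]),
]

# Pool all sessions, index the labels, then split back per session.
all_labels = np.concatenate(class_labels_raw)
unique_class_labels, index_flat = np.unique(all_labels, return_inverse=True)
split_points = np.cumsum([len(x) for x in class_labels_raw])[:-1]
class_labels_index = np.split(index_flat, split_points)
```

np.unique returns the unique labels in sorted order, so the index of a label is its position in that sorted list.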

set_spatialFootprints(spatialFootprints: List[ndarray | csr_matrix | Dict[str, Any]], um_per_pixel: float | None = None)[source]

Sets the spatialFootprints attribute.

Parameters:
  • spatialFootprints (List[Union[np.ndarray, csr_matrix, Dict[str, Any]]]) –

    One of the following:

    • List of numpy.ndarray objects, one for each session. Each array should have shape (n_ROIs, FOV_height, FOV_width).

    • List of scipy.sparse.csr_matrix objects, one for each session. Each matrix should have shape (n_ROIs, FOV_height * FOV_width). Reshaping should be done with ‘C’ indexing (standard).

    • List of dictionaries, one for each session. Each dictionary should be a serialized scipy.sparse.csr_matrix object. It should contain the keys: ‘data’, ‘indices’, ‘indptr’, ‘shape’. See scipy.sparse.csr_matrix for more information.

  • um_per_pixel (Optional[float]) – The number of microns per pixel. This is used to resize the images to a common size. (Default is None)
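
The meaning of the four keys in the dictionary form can be shown by expanding them into a dense array by hand. This is a sketch of the standard CSR layout using NumPy only; csr_dict_to_dense is a hypothetical helper and the values are toy data.

```python
import numpy as np

def csr_dict_to_dense(d):
    """Expand a CSR dictionary (data, indices, indptr, shape) to dense."""
    n_rows, n_cols = d["shape"]
    dense = np.zeros((n_rows, n_cols), dtype=np.asarray(d["data"]).dtype)
    for row in range(n_rows):
        # indptr marks where each row's entries start and stop.
        start, stop = d["indptr"][row], d["indptr"][row + 1]
        dense[row, d["indices"][start:stop]] = d["data"][start:stop]
    return dense

serialized = {
    "data": np.array([0.5, 1.0, 0.25], dtype=np.float32),
    "indices": np.array([1, 2, 0]),
    "indptr": np.array([0, 2, 3]),
    "shape": (2, 4),  # (n_ROIs, FOV_height * FOV_width)
}
footprints = csr_dict_to_dense(serialized)

# 'C' order reshape back to (n_ROIs, FOV_height, FOV_width).
fov_stack = footprints.reshape(2, 2, 2)
```

In practice the sparse form is kept as is; the dense expansion is only meant to make the indexing explicit.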

class roicat.data_importing.Data_roiextractors(segmentation_extractor_objects: List[Any], um_per_pixel: float = 1.0, out_height_width: Tuple[int, int] = (36, 36), FOV_image_name: str | None = None, fallback_FOV_height_width: Tuple[int, int] = (512, 512), centroid_method: str = 'centerOfMass', class_labels: List[Any] | None = None, verbose: bool = True)[source]

Bases: Data_roicat

A class for importing all roiextractors supported data. This class will loop through each object and ingest data for roicat. RH, JB 2023

Parameters:
  • segmentation_extractor_objects (list) – List of segmentation extractor objects. All objects must be of the same type.

  • um_per_pixel (float, optional) – The resolution, specified as ‘micrometers per pixel’ of the imaging field of view. Defaults to 1.0.

  • out_height_width (tuple of int, optional) – The height and width of output ROI images, specified as (y, x). Defaults to (36, 36).

  • FOV_image_name (str, optional) – If provided, this key will be used to extract the FOV image from the segmentation object’s self.get_images_dict() method. If None, the function will attempt to pull out a mean image. Defaults to None.

  • fallback_FOV_height_width (tuple of int, optional) – If the FOV images cannot be imported automatically, this will be used as the FOV height and width. Otherwise, FOV height and width are set from the first object in the list. Defaults to (512, 512).

  • centroid_method (str, optional) – The method for calculating the centroid of the ROI. This should be either 'centerOfMass' or 'median'. Defaults to 'centerOfMass'.

  • class_labels (list, optional) – A list of class labels for each object. Defaults to None.

  • verbose (bool, optional) – If set to True, status messages will be printed. Defaults to True.

class roicat.data_importing.Data_suite2p(paths_statFiles: str | Path | List[str | Path], paths_opsFiles: str | Path | List[str | Path] | None = None, um_per_pixel: float = 1.0, new_or_old_suite2p: str = 'new', out_height_width: Tuple[int, int] = (36, 36), type_meanImg: str = 'meanImgE', FOV_images: ndarray | None = None, centroid_method: str = 'centerOfMass', class_labels: List[ndarray] | List[str] | None = None, FOV_height_width: Tuple[int, int] | None = None, verbose: bool = True)[source]

Bases: Data_roicat

Class for handling suite2p output files and data. In particular stat.npy and ops.npy files. Imports FOV images and spatial footprints, and prepares ROI images. RH 2022

Parameters:
  • paths_statFiles (list of str or pathlib.Path) – List of paths to the stat.npy files. Elements should be one of: str, pathlib.Path, list of str or list of pathlib.Path.

  • paths_opsFiles (list of str or pathlib.Path, optional) – List of paths to the ops.npy files. Elements should be one of: str, pathlib.Path, list of str or list of pathlib.Path. Optional. Used to get FOV_images, FOV_height, FOV_width, and shifts (if old matlab ops file).

  • um_per_pixel (float) – Resolution in micrometers per pixel of the imaging field of view.

  • new_or_old_suite2p (str) – Type of suite2p output files. Matlab=old, Python=new. Should be: 'new' or 'old'.

  • out_height_width (tuple of int) – Height and width of output ROI images. These are the little images of centered ROIs that are typically used for passing through the neural net. Unless your ROIs are larger than the default size, it’s best to just leave it as default. Should be: (int, int) (y, x).

  • type_meanImg (str) – Type of mean image to use. Should be: 'meanImgE' or 'meanImg'.

  • FOV_images (np.ndarray, optional) – FOV images. Array of shape (n_sessions, FOV_height, FOV_width). Optional.

  • centroid_method (str) – Method for calculating the centroid of an ROI. Should be: 'centerOfMass' or 'median'.

  • class_labels ((list of np.ndarray) or (list of str to paths) or None) – Optional. If None, class labels are not set. If list of np.ndarray, each element should be a 1D integer array of length n_roi specifying the class label for each ROI. If list of str, each element should be a path to a .npy file containing an array of length n_roi specifying the class label for each ROI.

  • FOV_height_width (tuple of int, optional) – FOV height and width. If None, paths_opsFiles must be provided to get FOV height and width.

  • verbose (bool) – If True, prints results from each function.

import_FOV_images(type_meanImg: str = 'meanImgE') List[ndarray][source]

Imports the FOV images from ops files or user defined image arrays.

Parameters:

type_meanImg (str) –

Type of the mean image. References the key in the ops.npy file. Options are:

  • 'meanImgE': Enhanced mean image.

  • 'meanImg': Mean image.

Returns:

List of FOV images. Length of the list is the same as self.paths_files. Each element is a numpy.ndarray of shape (height, width).

Return type:

FOV_images (List[np.ndarray])

import_neuropil_masks(frame_height_width: List[int] | Tuple[int, int] | None = None) List[csr_matrix][source]

Imports and converts the neuropil masks of the ROIs in the stat files into images in sparse arrays.

Parameters:

frame_height_width (Optional[Union[List[int], Tuple[int, int]]]) – The height and width of the frame in the form [height, width]. If None, the height and width will be taken from the FOV images. (Default is None)

Returns:

neuropilMasks (List[scipy.sparse.csr_matrix]):

List of neuropil masks. Length of the list is the same as self.paths_stat. Each element is a sparse array of shape (n_roi, frame_height * frame_width).

Return type:

(List[scipy.sparse.csr_matrix])

import_spatialFootprints(frame_height_width: List[int] | Tuple[int, int] | None = None, dtype: dtype = np.float32) List[csr_matrix][source]

Imports and converts the spatial footprints of the ROIs in the stat files into images in sparse arrays.

Generates self.session_bool, which is a bool np.ndarray of shape (n_roi_total, n_sessions) indicating which session each ROI belongs to.

Parameters:
  • frame_height_width (Optional[Union[List[int], Tuple[int, int]]]) – The height and width of the frame in the form [height, width]. If None, self.import_FOV_images must be called before this method, and the frame height and width will be taken from the first FOV image. (Default is None)

  • dtype (np.dtype) – Data type of the sparse array. (Default is np.float32)

Returns:

sf (List[scipy.sparse.csr_matrix]):

Spatial footprints. Length of the list is the same as self.paths_files. Each element is a scipy.sparse.csr_matrix of shape (n_roi, frame_height * frame_width).

Return type:

(List[scipy.sparse.csr_matrix])

roicat.data_importing.fix_paths(paths: List[str | Path] | str | Path) List[str][source]

Ensures the input paths are a list of strings.

Parameters:

paths (Union[List[Union[str, pathlib.Path]], str, pathlib.Path]) – The input can be either a list of strings or pathlib.Path objects, or a single string or pathlib.Path object.

Returns:

A list of strings representing the paths.

Return type:

List[str]

Raises:

TypeError – If the input isn’t a list of str or pathlib.Path objects, a single str, or a pathlib.Path object.
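
The normalization described above can be sketched with the standard library. This is an illustrative stand-in, not the ROICaT source; the name to_list_of_str is hypothetical.

```python
from pathlib import Path

def to_list_of_str(paths):
    """Return `paths` as a list of strings, or raise TypeError."""
    if isinstance(paths, (str, Path)):
        return [str(paths)]
    if isinstance(paths, list) and all(isinstance(p, (str, Path)) for p in paths):
        return [str(p) for p in paths]
    raise TypeError("paths must be a str, a pathlib.Path, or a list of them")

single = to_list_of_str("a/stat.npy")
mixed = to_list_of_str(["a/stat.npy", Path("b") / "stat.npy"])

# Any other input type is rejected.
try:
    to_list_of_str(42)
    raised = False
except TypeError:
    raised = True
```

Accepting a single path as well as a list lets callers pass one session without wrapping it themselves.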

roicat.data_importing.make_smaller_data(data: Data_roicat, n_ROIs: int | None = 300, n_sessions: int | None = 10, bounds_x: Tuple[int, int] = (200, 400), bounds_y: Tuple[int, int] = (200, 400)) Data_roicat[source]

Reduces the size of a Data_roicat object by limiting the number of regions of interest (ROIs) and sessions, and adjusting the bounds on the x and y axes. This function is useful for making test datasets.

Parameters:
  • data (Data_roicat) – The input data object of the Data_roicat type.

  • n_ROIs (Optional[int]) – The number of regions of interest to include in the output data. If None, all ROIs will be included.

  • n_sessions (Optional[int]) – The number of sessions to include in the output data. If None, all sessions will be included.

  • bounds_x (Tuple[int, int]) – The x-axis bounds for the output data. The bounds should be a tuple of two integers.

  • bounds_y (Tuple[int, int]) – The y-axis bounds for the output data. The bounds should be a tuple of two integers.

Returns:

data_out (Data_roicat):

The output data, which is a reduced version of the input data according to the specified parameters.

Return type:

(Data_roicat)

roicat.helpers module

class roicat.helpers.Convergence_checker_optuna(n_patience: int = 10, tol_frac: float = 0.05, max_trials: int = 350, max_duration: float = 600, verbose: bool = True)[source]

Bases: object

Checks if the optuna optimization has converged. RH 2023

Parameters:
  • n_patience (int) – Number of trials to look back to check for convergence. Also the minimum number of trials that must be completed before starting to check for convergence. (Default is 10)

  • tol_frac (float) – Fractional tolerance for convergence. The best output value must change by less than this fractional amount to be considered converged. (Default is 0.05)

  • max_trials (int) – Maximum number of trials to run before stopping. (Default is 350)

  • max_duration (float) – Maximum number of seconds to run before stopping. (Default is 600)

  • verbose (bool) – If True, print messages. (Default is True)

bests

List to hold the best values obtained in the trials.

Type:

List[float]

best

Best value obtained among the trials. Initialized with infinity.

Type:

float

Example

# Create a Convergence_checker_optuna instance
convergence_checker = Convergence_checker_optuna(
    n_patience=15,
    tol_frac=0.01,
    max_trials=500,
    max_duration=60*20,
    verbose=True
)

# Assume we have a study and trial objects from optuna
# Use the check method in the callback
study.optimize(objective, n_trials=100, callbacks=[convergence_checker.check])
check(study: object, trial: object)[source]

Checks if the optuna optimization has converged. This function should be used as the callback function for the optuna study.

Parameters:
  • study (optuna.study.Study) – Optuna study object.

  • trial (optuna.trial.FrozenTrial) – Optuna trial object.
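
The patience and tolerance rule described by n_patience and tol_frac can be illustrated without optuna. The sketch below is an assumption about how such a rule might work (it assumes the objective is minimized and measures the fractional change relative to the current best); the class ConvergenceSketch is hypothetical and is not the roicat implementation.

```python
from typing import List


class ConvergenceSketch:
    """Hypothetical, optuna-free illustration of a patience/tolerance rule."""

    def __init__(self, n_patience: int = 10, tol_frac: float = 0.05, max_trials: int = 350):
        self.n_patience = n_patience
        self.tol_frac = tol_frac
        self.max_trials = max_trials
        self.bests: List[float] = []  # running best after each trial
        self.best = float("inf")      # assumes the objective is minimized

    def update(self, value: float) -> bool:
        """Record one trial value; return True if the search should stop."""
        self.best = min(self.best, value)
        self.bests.append(self.best)
        n = len(self.bests)
        if n >= self.max_trials:
            return True
        if self.n_patience >= n:
            return False  # not enough trials yet to judge convergence
        # Compare the current best with the best from n_patience trials ago.
        old = self.bests[-self.n_patience - 1]
        frac_change = abs(old - self.best) / max(abs(self.best), 1e-12)
        return self.tol_frac > frac_change


checker = ConvergenceSketch(n_patience=3, tol_frac=0.05)
values = [10.0, 5.0, 4.0, 3.99, 3.98, 3.97, 3.97]
stops = [checker.update(v) for v in values]
```

In this toy run the checker first signals a stop at the sixth trial, once the best value has changed by less than 5% over the previous three trials.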

class roicat.helpers.Convolver_1d(kernel: ndarray | object, length_x: int | None = None, dtype: object = torch.float32, pad_mode: str = 'same', correct_edge_effects: bool = True, device: str = 'cpu')[source]

Bases: object

Class for 1D convolution. Uses torch.nn.functional.conv1d. Stores the convolution and edge correction kernels for repeated use. RH 2023

pad_mode

Mode for padding. See torch.nn.functional.conv1d for details.

Type:

str

dtype

Data type for the convolution. Default is torch.float32.

Type:

object

kernel

Convolution kernel as a tensor.

Type:

object

trace_correction

Kernel for edge correction.

Type:

object

Parameters:
  • kernel (Union[np.ndarray, object]) – 1D array to convolve with. The array can be a numpy array or a tensor.

  • length_x (Optional[int]) – Length of the array to be convolved. Must not be None if pad_mode is not ‘valid’. (Default is None)

  • dtype (object) – Data type to use for the convolution. (Default is torch.float32)

  • pad_mode (str) – Mode for padding. See torch.nn.functional.conv1d for details. (Default is ‘same’)

  • correct_edge_effects (bool) – Whether or not to correct for edge effects. (Default is True)

  • device (str) – Device to use for computation. (Default is ‘cpu’)

convolve(arr: ndarray | Tensor) ndarray | Tensor[source]

Convolve array with kernel.

Parameters:

arr (Union[np.ndarray, torch.Tensor]) – Array to convolve. Convolution performed along the last axis. Must be 1D, 2D, or 3D.

Returns:

out (Union[np.ndarray, torch.Tensor]):

The output tensor after performing convolution and correcting for edge effects.

Return type:

(Union[np.ndarray, torch.Tensor])

Example

convolver = Convolver_1d(kernel=my_kernel)
result = convolver.convolve(my_array)
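
One common way to correct edge effects in a 'same'-padded convolution is to divide by the convolution of an all-ones trace with the same kernel. The NumPy sketch below shows that idea only; whether Convolver_1d computes its trace_correction exactly this way is an assumption.

```python
import numpy as np

kernel = np.ones(5) / 5.0  # simple boxcar kernel that sums to 1
x = np.ones(20)            # constant input signal

# Plain 'same' convolution: values near the edges are attenuated
# because part of the kernel hangs off the end of the signal.
raw = np.convolve(x, kernel, mode="same")

# Convolving an all-ones trace measures how much kernel mass overlaps
# the signal at each position; dividing by it undoes the attenuation.
correction = np.convolve(np.ones_like(x), kernel, mode="same")
corrected = raw / correction
```
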
class roicat.helpers.Equivalence_checker(kwargs_allclose: dict | None = {'equal_nan': True, 'rtol': 1e-07}, assert_mode=False, verbose=False)[source]

Bases: object

Class for checking if all items are equivalent or allclose (almost equal) in two complex data structures. Can check nested lists, dicts, and other data structures. Can also optionally assert (raise errors) if all items are not equivalent. RH 2023

_kwargs_allclose

Keyword arguments for the numpy.allclose function.

Type:

Optional[dict]

_assert_mode

Whether to raise an assertion error if items are not close.

Type:

bool

Parameters:
  • kwargs_allclose (Optional[dict]) – Keyword arguments for the numpy.allclose function. (Default is {'rtol': 1e-7, 'equal_nan': True})

  • assert_mode (bool) – Whether to raise an assertion error if items are not close.

  • verbose (bool) –

    How much information to print out:
    • False / 0: No information printed out.

    • True / 1: Mismatched items only.

    • 2: All items printed out.
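
The idea of comparing nested structures item by item can be sketched with the standard library. The helper nested_allclose below is hypothetical; it uses math.isclose rather than numpy.allclose, so its tolerance semantics are only an approximation of what Equivalence_checker does.

```python
import math
from typing import Any


def nested_allclose(a: Any, b: Any, rtol: float = 1e-7, equal_nan: bool = True) -> bool:
    """Hypothetical helper: recursively compare nested dicts, lists, tuples, scalars."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            nested_allclose(a[k], b[k], rtol, equal_nan) for k in a
        )
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(
            nested_allclose(x, y, rtol, equal_nan) for x, y in zip(a, b)
        )
    if isinstance(a, float) or isinstance(b, float):
        # Floats are compared with a relative tolerance; NaNs match if equal_nan.
        if not (isinstance(a, (int, float)) and isinstance(b, (int, float))):
            return False
        if math.isnan(a) and math.isnan(b):
            return equal_nan
        return math.isclose(a, b, rel_tol=rtol)
    # Everything else falls back to exact equality.
    return a == b


same = nested_allclose(
    {"a": [1.0, float("nan")], "b": ("x", 2)},
    {"a": [1.0 + 1e-12, float("nan")], "b": ("x", 2)},
)
different = nested_allclose({"a": [1.0]}, {"a": [1.1]})
```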

class roicat.helpers.ImageLabeler(image_array: ndarray, start_index: int = 0, path_csv: str | None = None, save_csv: bool = True, resize_factor: float = 10.0, normalize_images: bool = True, verbose: bool = True, key_end: str = 'Escape', key_prev: str = 'Left', key_next: str = 'Right')[source]

Bases: object

A simple graphical interface for labeling image classes. Use this class with a context manager to ensure the window is closed properly. The class provides a tkinter window which displays images from a provided numpy array one by one and lets you classify each image by pressing a key. The title of the window is the image index. The classification label and image index are stored as the self.labels_ attribute and saved to a CSV file in self.path_csv. RH 2023

Parameters:
  • image_array (np.ndarray) – A numpy array of images. Either 3D: (n_images, height, width) or 4D: (n_images, height, width, n_channels). Images should be scaled between 0 and 255 and will be converted to uint8.

  • start_index (int) – The index of the first image to display. (Default is 0)

  • path_csv (str) – Path to the CSV file for saving results. If None, results will not be saved.

  • save_csv (bool) – Whether to save the results to a CSV. (Default is True)

  • resize_factor (float) – A scaling factor for the fractional change in image size. (Default is 10.0)

  • normalize_images (bool) – Whether to normalize the images between min and max values. (Default is True)

  • verbose (bool) – Whether to print status updates. (Default is True)

  • key_end (str) – Key to press to end the session. (Default is 'Escape')

  • key_prev (str) – Key to press to go back to the previous image. (Default is 'Left')

  • key_next (str) – Key to press to go to the next image. (Default is 'Right')

Example

with ImageLabeler(images, start_index=0, resize_factor=4.0,
                  key_end='Escape') as labeler:
    labeler.run()
path_csv, labels = labeler.path_csv, labeler.labels_
image_array

A numpy array of images. Either 3D: (n_images, height, width) or 4D: (n_images, height, width, n_channels). Images should be scaled between 0 and 255 and will be converted to uint8.

Type:

np.ndarray

start_index

The index of the first image to display. (Default is 0)

Type:

int

path_csv

Path to the CSV file for saving results. If None, results will not be saved.

Type:

str

save_csv

Whether to save the results to a CSV. (Default is True)

Type:

bool

resize_factor

A scaling factor for the fractional change in image size. (Default is 10.0)

Type:

float

normalize_images

Whether to normalize the images between min and max values. (Default is True)

Type:

bool

verbose

Whether to print status updates. (Default is True)

Type:

bool

key_end

Key to press to end the session. (Default is 'Escape')

Type:

str

key_prev

Key to press to go back to the previous image. (Default is 'Left')

Type:

str

key_next

Key to press to go to the next image. (Default is 'Right')

Type:

str

labels_

A list of tuples containing the image index and classification label for each image. The list is saved to a CSV file in self.path_csv.

Type:

list

classify(event)[source]

Adds the current image index and pressed key as a label. Then saves the results and moves to the next image.

Parameters:

event (tkinter.Event) – A tkinter event object.

end_session(event)[source]

Ends the classification session by destroying the tkinter window.

get_labels(kind: str = 'dict') dict | List[Tuple[int, str]][source]

Returns the labels. The format of the output is determined by the kind parameter. If the labels dictionary is empty, returns None. RH 2023

Parameters:

kind (str) –

The type of object to return. (Default is 'dict')

  • 'dict': {idx: label, idx: label, …}

  • 'list': [(idx, label), (idx, label), …]

  • 'dataframe': {'index': [idx, idx, …], 'label': [label, label, …]}. This can be converted to a pandas DataFrame with: pd.DataFrame(self.get_labels('dataframe'))

Returns:

Depending on the kind parameter, it returns either:

  • dict:

    A dictionary where keys are the image indices and values are the labels.

  • List[Tuple[int, str]]:

    A list of tuples, where each tuple contains an image index and a label.

  • dict:

    A dictionary with keys ‘index’ and ‘label’ where values are lists of indices and labels respectively.

Return type:

(Union[dict, List[Tuple[int, str]], dict])
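
The three output formats can be illustrated with a small standalone function. This sketch assumes the labels are held internally as a dict mapping image index to label; format_labels is a hypothetical name, not a roicat function.

```python
from typing import Dict


def format_labels(labels: Dict[int, str], kind: str = "dict"):
    """Hypothetical helper mirroring the 'dict', 'list' and 'dataframe' formats."""
    if not labels:
        return None  # empty labels: nothing to return
    if kind == "dict":
        return dict(labels)
    if kind == "list":
        return list(labels.items())
    if kind == "dataframe":
        # Column-oriented dict, suitable for passing to pandas.DataFrame
        return {"index": list(labels.keys()), "label": list(labels.values())}
    raise ValueError(f"Unknown kind: {kind!r}")


labels = {0: "a", 1: "b", 4: "a"}
as_list = format_labels(labels, "list")
as_columns = format_labels(labels, "dataframe")
```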

next_img(event=None)[source]

Displays the next image in the array, and resizes the image.

prev_img(event=None)[source]

Displays the previous image in the array.

run()[source]

Runs the image labeler. Opens a tkinter window and displays the first image.

save_classification()[source]

Saves the classification results to a CSV file. This function does not append, it overwrites the entire file. The file contains two columns: ‘image_index’ and ‘label’.

exception roicat.helpers.ParallelExecutionError(index, original_exception)[source]

Bases: Exception

Exception class for errors that occur during parallel execution. Intended to be used with the map_parallel function. RH 2023

index

Index of the job that failed.

Type:

int

original_exception

The original exception that was raised.

Type:

Exception
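
An exception that carries the failing job index and the original error might be used as sketched below. The class and the sequential run_jobs helper are hypothetical stand-ins; map_parallel itself is not reproduced here.

```python
class ParallelExecutionErrorSketch(Exception):
    """Hypothetical stand-in that records which job failed and why."""

    def __init__(self, index, original_exception):
        super().__init__(f"Job {index} raised {original_exception!r}")
        self.index = index
        self.original_exception = original_exception


def run_jobs(func, args):
    """Run jobs one by one, wrapping any failure with its job index."""
    results = []
    for i, a in enumerate(args):
        try:
            results.append(func(a))
        except Exception as e:
            raise ParallelExecutionErrorSketch(i, e) from e
    return results


try:
    run_jobs(lambda v: 1 / v, [1, 2, 0])
    failure = None
except ParallelExecutionErrorSketch as err:
    failure = err
```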

class roicat.helpers.Toeplitz_convolution2d(x_shape: Tuple[int, int], k: ndarray, mode: str = 'same', dtype: dtype | None = None)[source]

Bases: object

Convolve a 2D array with a 2D kernel using the Toeplitz matrix multiplication method. This class is ideal when ‘x’ is very sparse (density<0.01), ‘x’ is small (shape <(1000,1000)), ‘k’ is small (shape <(100,100)), and the batch size is large (e.g. 1000+). Generally, it is faster than scipy.signal.convolve2d when convolving multiple arrays with the same kernel. It maintains a low memory footprint by storing the toeplitz matrix as a sparse matrix. RH 2022

x_shape

The shape of the 2D array to be convolved.

Type:

Tuple[int, int]

k

2D kernel to convolve with.

Type:

np.ndarray

mode

Either 'full', 'same', or 'valid'. See scipy.signal.convolve2d for details.

Type:

str

dtype

The data type to use for the Toeplitz matrix. If None, then the data type of the kernel is used.

Type:

Optional[np.dtype]

Parameters:
  • x_shape (Tuple[int, int]) – The shape of the 2D array to be convolved.

  • k (np.ndarray) – 2D kernel to convolve with.

  • mode (str) – Convolution method to use, either 'full', 'same', or 'valid'. See scipy.signal.convolve2d for details. (Default is ‘same’)

  • dtype (Optional[np.dtype]) – The data type to use for the Toeplitz matrix. Ideally, this matches the data type of the input array. If None, then the data type of the kernel is used. (Default is None)

Example

# create Toeplitz_convolution2d object
toeplitz_convolution2d = Toeplitz_convolution2d(
    x_shape=(100,30),
    k=np.random.rand(10,10),
    mode='same',
)
toeplitz_convolution2d(
    x=scipy.sparse.csr_matrix(np.random.rand(5,3000)),
    batch_size=True,
)
roicat.helpers.add_text_to_images(images: array, text: List[List[str]], position: Tuple[int, int] = (10, 10), font_size: int = 1, color: Tuple[int, int, int] = (255, 255, 255), line_width: int = 1, font: str | None = None, frameRate: int = 30) array[source]

Adds text to images using cv2.putText(). RH 2022

Parameters:
  • images (np.array) – Frames of video or images. Shape: (n_frames, height, width, n_channels).

  • text (list of lists) – Text to add to images. The outer list: one element per frame. The inner list: each element is a line of text.

  • position (tuple) – (x, y) position of the text (top left corner). (Default is (10,10))

  • font_size (int) – Font size of the text. (Default is 1)

  • color (tuple) – (r, g, b) color of the text. (Default is (255,255,255))

  • line_width (int) – Line width of the text. (Default is 1)

  • font (str) – Font to use. If None, then will use cv2.FONT_HERSHEY_SIMPLEX. See cv2.FONT... for more options. (Default is None)

  • frameRate (int) – Frame rate of the video. (Default is 30)

Returns:

images_with_text (np.array):

Frames of video or images with text added.

Return type:

(np.array)

roicat.helpers.apply_warp_transform(im_in: ndarray, warp_matrix: ndarray, interpolation_method: int = 1, borderMode: int = 0, borderValue: int = 0) ndarray[source]

Apply a warp transform to an image. Wrapper function for cv2.warpAffine and cv2.warpPerspective. RH 2022

Parameters:
  • im_in (np.ndarray) – Input image with any dimensions.

  • warp_matrix (np.ndarray) – Warp matrix. Shape should be (2, 3) for affine transformations, and (3, 3) for homography. See cv2.findTransformECC for more info.

  • interpolation_method (int) – Interpolation method. See cv2.warpAffine for more info. (Default is cv2.INTER_LINEAR)

  • borderMode (int) – Border mode. Determines how to handle pixels from outside the image boundaries. See cv2.warpAffine for more info. (Default is cv2.BORDER_CONSTANT)

  • borderValue (int) – Value to use for border pixels if borderMode is set to cv2.BORDER_CONSTANT. (Default is 0)

Returns:

im_out (np.ndarray):

Transformed output image with the same dimensions as the input image.

Return type:

(np.ndarray)

roicat.helpers.bounded_logspace(start: float, stop: float, num: int) ndarray[source]

Creates a logarithmically spaced array, similar to np.logspace, but with a defined start and stop. RH 2022

Parameters:
  • start (float) – The first value in the output array.

  • stop (float) – The last value in the output array.

  • num (int) – The number of values in the output array.

Returns:

out (np.ndarray):

An array of logarithmically spaced values between start and stop.

Return type:

(np.ndarray)
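
np.logspace takes exponents as its endpoints, whereas this function takes the endpoint values themselves. A minimal sketch of that behavior, assuming positive start and stop, is shown below; bounded_logspace_sketch is a hypothetical name and may differ from the roicat implementation.

```python
import numpy as np


def bounded_logspace_sketch(start: float, stop: float, num: int) -> np.ndarray:
    """Log-spaced values whose first and last entries are start and stop."""
    # Space the points evenly in log space, then map back.
    return np.exp(np.linspace(np.log(start), np.log(stop), num))


vals = bounded_logspace_sketch(1.0, 1000.0, 4)
```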

roicat.helpers.check_keys_subset(d, default_dict, hierarchy=['defaults'])[source]

Checks that the keys in d are all in default_dict. Raises an error if not. RH 2023

Parameters:
  • d (Dict) – Dictionary to check.

  • default_dict (Dict) – Dictionary containing the keys to check against.

  • hierarchy (List[str]) – Used internally for recursion. Hierarchy of keys to d.
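
A recursive key check of this kind can be sketched as follows. The function name is hypothetical, and raising KeyError is an assumption; the documentation above only says that an error is raised.

```python
def check_keys_subset_sketch(d, default_dict, hierarchy=("defaults",)):
    """Hypothetical sketch: every key in d must also exist in default_dict."""
    for key, val in d.items():
        path = list(hierarchy) + [key]
        if key not in default_dict:
            raise KeyError("Unexpected key: " + " > ".join(map(str, path)))
        # Recurse into nested dictionaries, extending the key hierarchy.
        if isinstance(val, dict) and isinstance(default_dict[key], dict):
            check_keys_subset_sketch(val, default_dict[key], hierarchy=path)


defaults = {"general": {"verbose": True, "seed": 0}, "n_jobs": 1}
check_keys_subset_sketch({"general": {"seed": 42}}, defaults)  # passes silently

try:
    check_keys_subset_sketch({"general": {"sead": 42}}, defaults)
    message = None
except KeyError as e:
    message = e.args[0]
```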

roicat.helpers.compare_file_hashes(hash_dict_true: Dict[str, Tuple[str, str]], dir_files_test: str | None = None, paths_files_test: List[str] | None = None, verbose: bool = True) Tuple[bool, Dict[str, bool], Dict[str, str]][source]

Compares hashes of files in a directory or list of paths to provided hashes. RH 2022

Parameters:
  • hash_dict_true (Dict[str, Tuple[str, str]]) – Dictionary of hashes to compare. Each entry should be in the format: {‘key’: (‘filename’, ‘hash’)}.

  • dir_files_test (str) – Path to directory containing the files to compare hashes. Unused if paths_files_test is not None. (Optional)

  • paths_files_test (List[str]) – List of paths to files to compare hashes. dir_files_test is used if None. (Optional)

  • verbose (bool) – If True, failed comparisons are printed out. (Default is True)

Returns:

tuple containing:
total_result (bool):

True if all hashes match, False otherwise.

individual_results (Dict[str, bool]):

Dictionary indicating whether each hash matched.

paths_matching (Dict[str, str]):

Dictionary of paths that matched. Each entry is in the format: {‘key’: ‘path’}.

Return type:

(tuple)
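
The comparison against a {'key': ('filename', 'hash')} dictionary can be sketched with hashlib. The helper names are hypothetical, and SHA-256 is chosen here only for illustration; the hash type that roicat actually uses is not stated above.

```python
import hashlib
import os
import tempfile
from typing import Dict, Tuple


def sha256_of_file(path: str) -> str:
    """Hex SHA-256 digest of a file (hash choice is an assumption of this sketch)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def compare_hashes_sketch(hash_dict_true: Dict[str, Tuple[str, str]], dir_files_test: str):
    individual, matching = {}, {}
    for key, (filename, hash_true) in hash_dict_true.items():
        path = os.path.join(dir_files_test, filename)
        ok = os.path.isfile(path) and sha256_of_file(path) == hash_true
        individual[key] = ok
        if ok:
            matching[key] = path
    return all(individual.values()), individual, matching


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "model.bin"), "wb") as f:
        f.write(b"weights")
    expected = {
        "model": ("model.bin", hashlib.sha256(b"weights").hexdigest()),
        "params": ("params.json", "0" * 64),  # file is absent, so this entry fails
    }
    total, individual, matching = compare_hashes_sketch(expected, tmp)
```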

roicat.helpers.compose_remappingIdx(remap_AB: ndarray, remap_BC: ndarray, method: str = 'linear', fill_value: float | None = nan, bounds_error: bool = False) ndarray[source]

Composes two remapping index fields using scipy.interpolate.interpn.

This function computes ‘remap_AC’ from ‘remap_AB’ and ‘remap_BC’, where ‘remap_AB’ is a remapping index field that warps image A onto image B, and ‘remap_BC’ is a remapping index field that warps image B onto image C.

RH 2023

Parameters:
  • remap_AB (np.ndarray) – An array of shape (H, W, 2) representing the remap field from image A to image B.

  • remap_BC (np.ndarray) – An array of shape (H, W, 2) representing the remap field from image B to image C.

  • method (str) –

    Interpolation method to use. Either

    • 'linear': Use linear interpolation (default).

    • 'nearest': Use nearest interpolation.

    • 'cubic': Use cubic interpolation.

  • fill_value (Optional[float]) – The value used for points outside the interpolation domain. (Default is np.nan)

  • bounds_error (bool) – If True, a ValueError is raised when interpolated values are requested outside of the domain of the input data. (Default is False)

Returns:

remap_AC (np.ndarray):

An array of shape (H, W, 2) representing the remap field from image A to image C.

Return type:

(np.ndarray)

roicat.helpers.compose_transform_matrices(matrix_AB: ndarray, matrix_BC: ndarray) ndarray[source]

Composes two transformation matrices to create a transformation from one image to another. RH 2023

This function is used to combine two transformation matrices, ‘matrix_AB’ and ‘matrix_BC’. ‘matrix_AB’ represents a transformation that warps an image A onto an image B. ‘matrix_BC’ represents a transformation that warps image B onto image C. The result is ‘matrix_AC’, a transformation matrix that would warp image A directly onto image C.

Parameters:
  • matrix_AB (np.ndarray) – A transformation matrix from image A to image B. The array can have the shape (2, 3) or (3, 3).

  • matrix_BC (np.ndarray) – A transformation matrix from image B to image C. The array can have the shape (2, 3) or (3, 3).

Returns:

matrix_AC (np.ndarray):

A composed transformation matrix from image A to image C. The array has the shape (2, 3) or (3, 3).

Return type:

(np.ndarray)

Raises:

AssertionError – If the input matrices do not have the shape (2, 3) or (3, 3).

Example

# Define the transformation matrices
matrix_AB = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
matrix_BC = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# Compose the transformation matrices
matrix_AC = compose_transform_matrices(matrix_AB, matrix_BC)
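
The identity matrices above do not show what composition does, so here is a sketch with a scale followed by a translation. It assumes the matrices act on homogeneous column vectors (so applying AB and then BC is BC @ AB); roicat's own multiplication order may differ, for example if the matrices are treated as inverse maps. compose_sketch and to_3x3 are hypothetical names.

```python
import numpy as np


def to_3x3(m):
    """Pad a (2, 3) affine matrix to (3, 3); pass (3, 3) through unchanged."""
    m = np.asarray(m, dtype=float)
    assert m.shape in [(2, 3), (3, 3)], "matrix must have shape (2, 3) or (3, 3)"
    return np.vstack([m, [0.0, 0.0, 1.0]]) if m.shape == (2, 3) else m


def compose_sketch(matrix_AB, matrix_BC):
    both_affine = np.shape(matrix_AB) == (2, 3) and np.shape(matrix_BC) == (2, 3)
    # Column-vector convention (assumption): apply AB first, then BC.
    m = to_3x3(matrix_BC) @ to_3x3(matrix_AB)
    return m[:2] if both_affine else m


scale = [[2, 0, 0], [0, 2, 0]]   # A to B: scale by 2
shift = [[1, 0, 3], [0, 1, -1]]  # B to C: translate by (3, -1)
AC = compose_sketch(scale, shift)
point_C = AC @ np.array([1.0, 1.0, 1.0])  # maps (1, 1) to (5, 1)
```
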
roicat.helpers.compute_cluster_similarity_matrices(s: csr_matrix | ndarray | COO, l: ndarray, verbose: bool = True) Tuple[ndarray, ndarray, ndarray][source]

Computes the similarity matrices for each cluster in l. This algorithm works best on large and sparse matrices. RH 2023

Parameters:
  • s (Union[scipy.sparse.csr_matrix, np.ndarray, sparse.COO]) – Similarity matrix. Entries should be non-negative floats.

  • l (np.ndarray) – Labels for each row of s. Labels should ideally be integers.

  • verbose (bool) – Whether to print warnings. (Default is True)

Returns:

tuple containing:
labels_unique (np.ndarray):

Unique labels in l.

cs_mean (np.ndarray):

Similarity matrix for each cluster. Each element is the mean similarity between all the pairs of samples in each cluster. Note that the diagonal here only considers non-self similarity, which excludes the diagonals of s.

cs_max (np.ndarray):

Similarity matrix for each cluster. Each element is the maximum similarity between all the pairs of samples in each cluster. Note that the diagonal here only considers non-self similarity, which excludes the diagonals of s.

cs_min (np.ndarray):

Similarity matrix for each cluster. Each element is the minimum similarity between all the pairs of samples in each cluster. Will be 0 if there are any sparse elements between the two clusters.

Return type:

(tuple)

roicat.helpers.confusion_matrix(y_hat: ndarray, y_true: ndarray, counts: bool = False) ndarray[source]

Computes the confusion matrix from y_hat and y_true. y_hat should be either predictions or probabilities. RH 2022

Parameters:
  • y_hat (np.ndarray) –

    Numpy array of predictions or probabilities. Either

    • 1D array of predictions (n_samples,). Values should be integers.

    • 2D array of probabilities (n_samples, n_classes). Values should be floats.

    (Default is 1D array of predictions)

  • y_true (np.ndarray) –

    Numpy array of ground truth labels. Either

    • 1D array of labels (n_samples,). Values should be integers.

    • 2D array of one-hot labels (n_samples, n_classes). Values should be integers.

    (Default is 1D array of labels)

  • counts (bool) – If False, the output confusion matrix is normalized. If True, the output contains counts. (Default is False)

Returns:

cmat (np.ndarray):

The computed confusion matrix.

Return type:

(np.ndarray)
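
A minimal confusion-matrix sketch is given below. It assumes rows index the predicted class, columns index the true class, and normalization is per true class (per column); the orientation and normalization used by roicat are not stated above, so treat these as assumptions. confusion_matrix_sketch is a hypothetical name.

```python
import numpy as np


def confusion_matrix_sketch(y_hat, y_true, counts=False):
    y_hat, y_true = np.asarray(y_hat), np.asarray(y_true)
    # Probabilities or one-hot labels are reduced to integer class indices.
    if y_hat.ndim == 2:
        y_hat = y_hat.argmax(axis=1)
    if y_true.ndim == 2:
        y_true = y_true.argmax(axis=1)
    n_classes = int(max(y_hat.max(), y_true.max())) + 1
    cmat = np.zeros((n_classes, n_classes))
    np.add.at(cmat, (y_hat, y_true), 1)  # rows: predicted, columns: true
    if not counts:
        col_sums = cmat.sum(axis=0, keepdims=True)
        cmat = cmat / np.where(col_sums == 0, 1, col_sums)
    return cmat


probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7]]
y_true = [0, 0, 1, 1]
cmat_counts = confusion_matrix_sketch(probs, y_true, counts=True)
cmat_norm = confusion_matrix_sketch(probs, y_true)
```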

roicat.helpers.cosine_kernel_2D(center: Tuple[int, int] = (5, 5), image_size: Tuple[int, int] = (11, 11), width: int = 5) ndarray[source]

Generates a 2D cosine kernel. RH 2021

Parameters:
  • center (Tuple[int, int]) – The mean position (X, Y) where high value is expected. It is 0-indexed. Make the second value 0 to make it 1D. (Default is (5, 5))

  • image_size (Tuple[int, int]) – The total image size (width, height). Make the second value 0 to make it 1D. (Default is (11, 11))

  • width (int) – The full width of one cycle of the cosine. (Default is 5)

Returns:

k_cos (np.ndarray):

2D or 1D array of the cosine kernel.

Return type:

(np.ndarray)

roicat.helpers.cv2RemappingIdx_to_pytorchFlowField(ri: ndarray | Tensor) ndarray | Tensor[source]

Converts remapping indices from the OpenCV format to the PyTorch format. In the OpenCV format, the displacement is in pixels relative to the top left pixel of the image. In the PyTorch format, the displacement is in pixels relative to the center of the image. RH 2023

Parameters:

ri (Union[np.ndarray, torch.Tensor]) – Remapping indices. Each pixel describes the index of the pixel in the original image that should be mapped to the new pixel. Shape: (H, W, 2). The last dimension is (x, y).

Returns:

normgrid (Union[np.ndarray, torch.Tensor]):

"Flow field", in the PyTorch format. Technically not a flow field, since it doesn’t describe displacement. Rather, it is a remapping index relative to the center of the image. Shape: (H, W, 2). The last dimension is (x, y).

Return type:

(Union[np.ndarray, torch.Tensor])
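
One way to express pixel remapping indices relative to the image center is the normalized grid used by torch.nn.functional.grid_sample, where -1 and 1 are the centers of the corner pixels (align_corners=True). The NumPy sketch below shows that conversion under that assumption; it is not necessarily identical to the roicat implementation, and remap_to_normgrid_sketch is a hypothetical name.

```python
import numpy as np


def remap_to_normgrid_sketch(ri):
    """Map pixel indices in [0, size - 1] to normalized coordinates in [-1, 1]."""
    ri = np.asarray(ri, dtype=float)
    h, w = ri.shape[:2]
    size_xy = np.array([w, h], dtype=float)  # last axis of ri is (x, y)
    return ri / (size_xy - 1) * 2 - 1


# Identity remapping for a 3 x 5 image: each pixel maps to itself.
h, w = 3, 5
xx, yy = np.meshgrid(np.arange(w), np.arange(h))
identity_ri = np.stack([xx, yy], axis=-1)
grid = remap_to_normgrid_sketch(identity_ri)
```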

roicat.helpers.deep_update_dict(dictionary: dict, key: List[str], val: Any, in_place: bool = False) dict | None[source]

Updates a nested dictionary with a new value. RH 2023

Parameters:
  • dictionary (dict) – The original dictionary to update.

  • key (List[str]) – List of keys representing the hierarchical path to the nested value to update. Each element should be a string that represents a level in the hierarchy. For example, to change a value in the dictionary params at key ‘dataloader_kwargs’ and subkey ‘prefetch_factor’, you would pass [‘dataloader_kwargs’, ‘prefetch_factor’].

  • val (Any) – The new value to set in the dictionary.

  • in_place (bool) –

    • True: the original dictionary will be updated in-place and no value will be returned.

    • False: a new dictionary will be created and returned. (Default is False)

Returns:

updated_dict (dict):

The updated dictionary. Only returned if in_place is False.

Return type:

(Union[dict, None])

Example

original_dict = {"level1": {"level2": "old value"}}
updated_dict = deep_update_dict(original_dict, ["level1", "level2"], "new value", in_place=False)
# Now updated_dict is {"level1": {"level2": "new value"}}
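
The difference between in_place=True and in_place=False can be sketched with copy.deepcopy. deep_update_sketch is a hypothetical stand-in, not the roicat implementation.

```python
import copy
from typing import Any, List, Optional


def deep_update_sketch(dictionary: dict, key: List[str], val: Any, in_place: bool = False) -> Optional[dict]:
    # Work on a deep copy unless the caller asked for an in-place update.
    target = dictionary if in_place else copy.deepcopy(dictionary)
    node = target
    for k in key[:-1]:
        node = node[k]  # walk down the hierarchy of keys
    node[key[-1]] = val
    return None if in_place else target


params = {"dataloader_kwargs": {"prefetch_factor": 2, "shuffle": True}}
new_params = deep_update_sketch(params, ["dataloader_kwargs", "prefetch_factor"], 4)
unchanged_value = params["dataloader_kwargs"]["prefetch_factor"]

returned = deep_update_sketch(params, ["dataloader_kwargs", "shuffle"], False, in_place=True)
```
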
roicat.helpers.download_file(url: str | None, path_save: str, check_local_first: bool = True, check_hash: bool = False, hash_type: str = 'MD5', hash_hex: str | None = None, mkdir: bool = False, allow_overwrite: bool = True, write_mode: str = 'wb', verbose: bool = True, chunk_size: int = 1024) None[source]

Downloads a file from a URL to a local path using requests. Checks if file already exists locally and verifies the hash of the downloaded file against a provided hash if required. RH 2023

Parameters:
  • url (Optional[str]) – URL of the file to download. If None, then no download is attempted. (Default is None)

  • path_save (str) – Path to save the file to.

  • check_local_first (bool) – Whether to check if the file already exists locally. If True and the file exists locally, the download is skipped. If True and check_hash is also True, the hash of the local file is checked. If the hash matches, the download is skipped. If the hash does not match, the file is downloaded. (Default is True)

  • check_hash (bool) – Whether to check the hash of the local or downloaded file against hash_hex. (Default is False)

  • hash_type (str) – Type of hash to use. Options are: 'MD5', 'SHA1', 'SHA256', 'SHA512'. (Default is 'MD5')

  • hash_hex (Optional[str]) – Hash to compare to, in hexadecimal format (e.g., ‘a1b2c3d4e5f6…’). Can be generated using hash_file() or hashlib.hexdigest(). If check_hash is True, hash_hex must be provided. (Default is None)

  • mkdir (bool) – If True, creates the parent directory of path_save if it does not exist. (Default is False)

  • write_mode (str) – Write mode for saving the file. Options include: 'wb' (write binary), 'ab' (append binary), 'xb' (write binary, fail if file exists). (Default is 'wb')

  • verbose (bool) – If True, prints status messages. (Default is True)

  • chunk_size (int) – Size of chunks in which to download the file. (Default is 1024)
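
The interaction between check_local_first, check_hash and hash_hex can be summarized as a small decision function. This is a sketch of the logic as described in the parameter list above; needs_download is a hypothetical helper, and no network request is made.

```python
import hashlib
import os
import tempfile


def needs_download(path_save, check_local_first=True, check_hash=False,
                   hash_type="MD5", hash_hex=None):
    """Return True if, per the rules described above, the file should be fetched."""
    if not (check_local_first and os.path.isfile(path_save)):
        return True   # no usable local copy
    if not check_hash:
        return False  # local copy exists and no verification was requested
    with open(path_save, "rb") as f:
        digest = hashlib.new(hash_type.lower(), f.read()).hexdigest()
    return digest != hash_hex  # re-download only on a hash mismatch


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.bin")
    missing = needs_download(path)
    with open(path, "wb") as f:
        f.write(b"abc")
    good = hashlib.sha256(b"abc").hexdigest()
    present = needs_download(path)
    hash_ok = needs_download(path, check_hash=True, hash_type="SHA256", hash_hex=good)
    hash_bad = needs_download(path, check_hash=True, hash_type="SHA256", hash_hex="0" * 64)
```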

roicat.helpers.export_svg_hv_bokeh(obj: object, path_save: str) None[source]

Saves a scatterplot from holoviews as an SVG file. RH 2023

Parameters:
  • obj (object) – Holoviews plot object.

  • path_save (str) – Path to save the SVG file.

roicat.helpers.extract_zip(path_zip: str, path_extract: str | None = None, verbose: bool = True) List[str][source]

Extracts a zip file. RH 2022

Parameters:
  • path_zip (str) – Path to the zip file.

  • path_extract (Optional[str]) – Path (directory) to extract the zip file to. If None, extracts to the same directory as the zip file. (Default is None)

  • verbose (bool) – Whether to print progress. (Default is True)

Returns:

paths_extracted (List[str]):

List of paths to the extracted files.

Return type:

(List[str])
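
The behavior described above maps closely onto the standard-library zipfile module. The following sketch is an illustration under that assumption; extract_zip_sketch is a hypothetical name.

```python
import os
import tempfile
import zipfile
from typing import List, Optional


def extract_zip_sketch(path_zip: str, path_extract: Optional[str] = None) -> List[str]:
    # Default to the directory that contains the zip file.
    if path_extract is None:
        path_extract = os.path.dirname(os.path.abspath(path_zip))
    with zipfile.ZipFile(path_zip, "r") as zf:
        zf.extractall(path_extract)
        names = zf.namelist()
    return [os.path.join(path_extract, n) for n in names]


with tempfile.TemporaryDirectory() as tmp:
    path_zip = os.path.join(tmp, "archive.zip")
    with zipfile.ZipFile(path_zip, "w") as zf:
        zf.writestr("data/a.txt", "hello")
    paths_extracted = extract_zip_sketch(path_zip, os.path.join(tmp, "out"))
    extracted_ok = all(os.path.isfile(p) for p in paths_extracted)
```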

roicat.helpers.fill_in_dict(d: Dict, defaults: Dict, verbose: bool = True, hierarchy: List[str] = ['dict'])[source]

In-place. Fills in dictionary d with values from defaults if they are missing. Works hierarchically. RH 2023

Parameters:
  • d (Dict) – Dictionary to fill in. In-place.

  • defaults (Dict) – Dictionary of defaults.

  • verbose (bool) – Whether to print messages.

  • hierarchy (List[str]) – Used internally for recursion. Hierarchy of keys to d.
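
Hierarchical in-place filling can be sketched as a short recursion. fill_in_dict_sketch is a hypothetical stand-in that omits the verbose and hierarchy arguments.

```python
import copy


def fill_in_dict_sketch(d: dict, defaults: dict) -> None:
    """Hypothetical sketch: add missing keys from defaults to d, in place."""
    for key, default_val in defaults.items():
        if key not in d:
            # Copy so that later edits to d do not mutate the defaults.
            d[key] = copy.deepcopy(default_val)
        elif isinstance(d[key], dict) and isinstance(default_val, dict):
            fill_in_dict_sketch(d[key], default_val)  # recurse into sub-dicts


user = {"general": {"seed": 42}}
defaults = {"general": {"seed": 0, "verbose": True}, "n_jobs": 1}
fill_in_dict_sketch(user, defaults)
```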

roicat.helpers.find_geometric_transformation(im_template: ndarray, im_moving: ndarray, warp_mode: str = 'euclidean', n_iter: int = 5000, termination_eps: float = 1e-10, mask: ndarray | None = None, gaussFiltSize: int = 1) ndarray[source]

Find the transformation between two images. Wrapper function for cv2.findTransformECC. RH 2022

Parameters:
  • im_template (np.ndarray) – Template image. The dtype must be either np.uint8 or np.float32.

  • im_moving (np.ndarray) – Moving image. The dtype must be either np.uint8 or np.float32.

  • warp_mode (str) –

    Warp mode.

    • 'translation': Sets a translational motion model; warpMatrix is 2x3, with the first 2x2 part being the identity matrix and the remaining two parameters being estimated.

    • 'euclidean': Sets a Euclidean (rigid) transformation as the motion model; three parameters are estimated; warpMatrix is 2x3. (Default)

    • 'affine': Sets an affine motion model; six parameters are estimated; warpMatrix is 2x3.

    • 'homography': Sets a homography as the motion model; eight parameters are estimated; warpMatrix is 3x3.

  • n_iter (int) – Number of iterations. (Default is 5000)

  • termination_eps (float) – Termination epsilon. This is the threshold of the increment in the correlation coefficient between two iterations. (Default is 1e-10)

  • mask (np.ndarray) – Binary mask. Regions where mask is zero are ignored during the registration. If None, no mask is used. (Default is None)

  • gaussFiltSize (int) – Gaussian filter size. If 0, no gaussian filter is used. (Default is 1)

Returns:

warp_matrix (np.ndarray):

Warp matrix. See cv2.findTransformECC for more info. Can be applied using cv2.warpAffine or cv2.warpPerspective.

Return type:

(np.ndarray)

roicat.helpers.find_nonredundant_idx(s: coo_matrix) ndarray[source]

Finds the indices of the nonredundant entries in a sparse matrix. Useful when manually populating a sparse matrix and you want to know which entries have already been populated. RH 2022

Parameters:

s (scipy.sparse.coo_matrix) – Sparse matrix. Should be in COO format.

Returns:

idx_unique (np.ndarray):

Indices of the nonredundant entries.

Return type:

(np.ndarray)

roicat.helpers.find_paths(dir_outer: str | List[str], reMatch: str = 'filename', reMatch_in_path: str | None = None, find_files: bool = True, find_folders: bool = False, depth: int = 0, natsorted: bool = True, alg_ns: str | None = None, verbose: bool = False) List[str][source]

Searches for files and/or folders recursively in a directory using a regex match. RH 2022

Parameters:
  • dir_outer (Union[str, List[str]]) – Path(s) to the directory(ies) to search. If a list of directories, then all directories will be searched.

  • reMatch (str) – Regular expression to match. Each file or folder name encountered will be compared using re.search(reMatch, filename). If the output is not None, the file will be included in the output.

  • reMatch_in_path (Optional[str]) –

    Additional regular expression to match anywhere in the path. Useful for finding files/folders in specific subdirectories. If None, then no additional matching is done.

    (Default is None)

  • find_files (bool) – Whether to find files. (Default is True)

  • find_folders (bool) – Whether to find folders. (Default is False)

  • depth (int) –

    Maximum folder depth to search. (Default is 0).

    • depth=0 means only search the outer directory.

    • depth=2 means search the outer directory and two levels of subdirectories below it.

  • natsorted (bool) – Whether to sort the output using natural sorting with the natsort package. (Default is True)

  • alg_ns (str) – Algorithm to use for natural sorting. See natsort.ns or https://natsort.readthedocs.io/en/4.0.4/ns_class.html/ for options. The default algorithm is PATH. Other common options are INT, FLOAT, and VERSION. (Default is None)

  • verbose (bool) – Whether to print the paths found. (Default is False)

Returns:

paths (List[str]):

Paths to matched files and/or folders in the directory.

Return type:

(List[str])
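
The meaning of depth can be illustrated with a depth-limited directory walk. The sketch below uses only the standard library, matches file names with re.search, and sorts alphabetically rather than with natsort; find_paths_sketch is a hypothetical name.

```python
import os
import re
import tempfile
from typing import List


def find_paths_sketch(dir_outer: str, reMatch: str, depth: int = 0) -> List[str]:
    found: List[str] = []

    def walk(directory: str, level: int) -> None:
        for entry in sorted(os.scandir(directory), key=lambda e: e.name):
            if entry.is_file() and re.search(reMatch, entry.name):
                found.append(entry.path)
            elif entry.is_dir() and depth > level:
                walk(entry.path, level + 1)  # descend one more level

    walk(dir_outer, 0)
    return found


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "s1", "deep"))
    for parts in [("stat.npy",), ("s1", "stat.npy"),
                  ("s1", "deep", "stat.npy"), ("s1", "ops.npy")]:
        open(os.path.join(tmp, *parts), "w").close()
    counts = [len(find_paths_sketch(tmp, r"^stat\.npy$", depth=d)) for d in range(3)]
```

Here depth=0 finds only the file in the outer directory, while each extra level of depth reaches one more layer of subdirectories.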

roicat.helpers.flatten_dict(d: MutableMapping, parent_key: str = '', sep: str = '.') MutableMapping[source]

Flattens a dictionary of dictionaries into a single dictionary. NOTE: Turns all keys into strings. Stolen from https://stackoverflow.com/a/6027615. RH 2022

Parameters:
  • d (Dict) – Dictionary to flatten

  • parent_key (str) – Key to prepend to flattened keys. Ignore: used internally for recursion.

  • sep (str) – Separator to use between keys. Ignore: used internally for recursion.

Returns:

flattened dictionary (dict):

Flat dictionary with the keys to deeper dictionaries joined by the separator.

Return type:

(Dict)
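
A minimal sketch of the recursive flattening described above, in the spirit of the linked answer. The name flatten_dict_sketch and the example dictionary are hypothetical.

```python
from collections.abc import MutableMapping


def flatten_dict_sketch(d, parent_key="", sep="."):
    """Hypothetical helper: join nested keys with `sep`, converting keys to strings."""
    items = []
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, MutableMapping):
            # Recurse, passing the accumulated key as the prefix.
            items.extend(flatten_dict_sketch(value, new_key, sep=sep).items())
        else:
            items.append((new_key, value))
    return dict(items)


nested = {"a": 1, "b": {"c": 2, "d": {"e": 3}}}
flat = flatten_dict_sketch(nested)
```

Here flat holds the keys 'a', 'b.c', and 'b.d.e'.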

roicat.helpers.flowField_to_remappingIdx(ff: ndarray | object) ndarray | object[source]

Convert a flow field to a remapping index. WARNING: Technically, it is not possible to convert a flow field to a remapping index, since the remapping index describes an interpolation mapping, while the flow field describes a displacement. RH 2023

Parameters:

ff (Union[np.ndarray, object]) – Flow field represented as a numpy ndarray or torch Tensor. It describes the displacement of each pixel. Shape (H, W, 2). Last dimension is (x, y).

Returns:

ri (Union[np.ndarray, object]):

Remapping index. It describes the index of the pixel in the original image that should be mapped to the new pixel. Shape (H, W, 2).

Return type:

(Union[np.ndarray, object])
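
Under the convention stated above, where ff stores a per-pixel displacement and ri stores the absolute coordinates to sample from, the two are commonly related by adding the pixel grid. The following is a sketch of that assumed relation, not a statement of the exact implementation:

```latex
\mathrm{ri}[y, x] = \begin{pmatrix} x \\ y \end{pmatrix} + \mathrm{ff}[y, x],
\qquad 0 \le y < H,\quad 0 \le x < W
```

Under this assumption, a zero flow field corresponds to the identity remapping ri[y, x] = (x, y).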

roicat.helpers.generalised_logistic_function(x: ndarray | Tensor, a: float = 0, k: float = 1, b: float = 1, v: float = 1, q: float = 1, c: float = 1, mu: float = 0) ndarray | Tensor[source]

Calculates the generalized logistic function.

Refer to Generalised logistic function for detailed information on the parameters. RH 2021

Parameters:
  • x (Union[np.ndarray, torch.Tensor]) – The input to the logistic function.

  • a (float) – The lower asymptote. (Default is 0)

  • k (float) – The upper asymptote when c=1. (Default is 1)

  • b (float) – The growth rate. (Default is 1)

  • v (float) – Should be greater than 0, it affects near which asymptote maximum growth occurs. (Default is 1)

  • q (float) – Related to the value Y(0). Center positions. (Default is 1)

  • c (float) – Typically takes a value of 1. (Default is 1)

  • mu (float) – The center position of the function. (Default is 0)

Returns:

out (Union[np.ndarray, torch.Tensor]):

The value of the logistic function for the input x.

Return type:

(Union[np.ndarray, torch.Tensor])
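
For orientation, the standard form of the generalised logistic function with the parameter names above can be sketched for scalar inputs. The shift by mu and the name generalised_logistic_sketch are assumptions made for illustration; the library function operates on arrays and tensors.

```python
import math


def generalised_logistic_sketch(x, a=0.0, k=1.0, b=1.0, v=1.0, q=1.0, c=1.0, mu=0.0):
    """Hypothetical scalar version: a + (k - a) / (c + q * exp(-b * (x - mu))) ** (1 / v)."""
    return a + (k - a) / (c + q * math.exp(-b * (x - mu))) ** (1.0 / v)


y_mid = generalised_logistic_sketch(0.0)     # defaults reduce to the ordinary sigmoid
y_low = generalised_logistic_sketch(-20.0)   # approaches the lower asymptote a
y_high = generalised_logistic_sketch(20.0)   # approaches the upper asymptote k
```

With the default parameters this reduces to the ordinary sigmoid, so the value at x = 0 is 0.5.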

roicat.helpers.get_balanced_class_weights(labels: ndarray) ndarray[source]

Balances the weights for classifier training.

RH, JZ 2022

Parameters:

labels (np.ndarray) – Array that includes a list of labels to balance the weights for classifier training. shape: (n,)

Returns:

weights (np.ndarray):

Weights by samples. shape: (n,)

Return type:

(np.ndarray)

roicat.helpers.get_balanced_sample_weights(labels: ndarray, class_weights: ndarray | None = None) ndarray[source]

Balances the weights for classifier training.

RH/JZ 2022

Parameters:
  • labels (np.ndarray) – Array that includes a list of labels to balance the weights for classifier training. shape: (n,)

  • class_weights (np.ndarray, Optional) – Optional parameter which includes an array of pre-fit class weights. If None, weights will be calculated using the provided sample labels. (Default is None)

Returns:

sample_weights (np.ndarray):

Sample weights by labels. shape: (n,)

Return type:

(np.ndarray)
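
A common way to balance a classifier is inverse-frequency weighting, sketched below with plain Python containers. Whether the library uses exactly this formula is an assumption; class_weights_sketch and sample_weights_sketch are hypothetical names.

```python
from collections import Counter


def class_weights_sketch(labels):
    """Hypothetical helper: weight each class by n_samples / n_samples_in_class."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / count for cls, count in sorted(counts.items())}


def sample_weights_sketch(labels, class_weights=None):
    """Look up one weight per sample, fitting class weights first if none are given."""
    if class_weights is None:
        class_weights = class_weights_sketch(labels)
    return [class_weights[label] for label in labels]


labels = [0, 0, 0, 1]
cw = class_weights_sketch(labels)
sw = sample_weights_sketch(labels)
```

Under this scheme the summed weight of every class is equal, so a rare class contributes as much to the loss as a common one.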

roicat.helpers.get_dir_contents(directory: str) Tuple[List[str], List[str]][source]

Retrieves the names of the folders and files in a directory (does not include subdirectories). RH 2021

Parameters:

directory (str) – The path to the directory.

Returns:

tuple containing:
folders (List[str]):

A list of folder names.

files (List[str]):

A list of file names.

Return type:

(tuple)

roicat.helpers.get_nd_butterworth_filter(shape: ~typing.Tuple[int, ...], factor: float, order: float, high_pass: bool, real: bool, dtype: ~numpy.dtype = <class 'numpy.float64'>, squared_butterworth: bool = True) ndarray[source]

Creates an N-dimensional Butterworth mask for an FFT.

Parameters:
  • shape (Tuple[int, ...]) – Shape of the n-dimensional FFT and mask.

  • factor (float) – Fraction of mask dimensions where the cutoff should be.

  • order (float) – Controls the slope in the cutoff region.

  • high_pass (bool) – Whether the filter is high pass (low frequencies attenuated) or low pass (high frequencies are attenuated).

  • real (bool) – Whether the FFT is of a real (True) or complex (False) image.

  • dtype (np.dtype) – The desired output data type of the Butterworth filter. (Default is np.float64)

  • squared_butterworth (bool) – If True, the square of the Butterworth filter is used. (Default is True)

Returns:

wfilt (np.ndarray):

The FFT mask.

Return type:

(np.ndarray)

roicat.helpers.get_nums_from_string(string_with_nums)[source]

Returns the numbers from a string as an int. RH 2021-2022

Parameters:

string_with_nums (str) – String with numbers in it

Returns:

The numbers from the string. If there are no numbers, returns None.

Return type:

nums (int)
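
One reading of the description above is that all digit characters are concatenated and converted to a single int. The sketch below implements that reading; the concatenation behavior and the name get_nums_from_string_sketch are assumptions.

```python
def get_nums_from_string_sketch(string_with_nums):
    """Hypothetical helper: concatenate ASCII digits and convert to int, or return None."""
    digits = "".join(ch for ch in string_with_nums if ch in "0123456789")
    return int(digits) if digits else None


session_id = get_nums_from_string_sketch("session_12_plane3")
no_digits = get_nums_from_string_sketch("no digits here")
```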

roicat.helpers.grayscale_to_rgb(array: ndarray | Tensor | List) ndarray | Tensor[source]

Converts a grayscale image (2D array) or movie (3D array) to RGB (3D or 4D array).

RH 2023

Parameters:

array (Union[np.ndarray, torch.Tensor, list]) – The 2D or 3D array of grayscale images.

Returns:

array (Union[np.ndarray, torch.Tensor]):

The converted 3D or 4D array of RGB images.

Return type:

(Union[np.ndarray, torch.Tensor])

roicat.helpers.h5_load(filepath: str | Path, return_dict: bool = True, verbose: bool = False) dict | object[source]

Returns a dictionary or an H5PY object from a given HDF file. RH 2023

Parameters:
  • filepath (Union[str, Path]) – Full pathname of the file to read.

  • return_dict (bool) –

    Whether or not to return a dict object. (Default is True).

    • True: a dict object is returned.

    • False: an H5PY object is returned.

  • verbose (bool) – Whether to print detailed information during the execution. (Default is False)

Returns:

result (Union[dict, object]):

Either a dictionary containing the groups as keys and the datasets as values from the HDF file or an H5PY object, depending on the return_dict parameter.

Return type:

(Union[dict, object])

roicat.helpers.hash_file(path: str, type_hash: str = 'MD5', buffer_size: int = 65536) str[source]

Computes the hash of a file using the specified hash type and buffer size. RH 2022

Parameters:
  • path (str) – Path to the file to be hashed.

  • type_hash (str) –

    Type of hash to use. (Default is 'MD5'). Options are:

    • 'MD5': MD5 hash algorithm.

    • 'SHA1': SHA1 hash algorithm.

    • 'SHA256': SHA256 hash algorithm.

    • 'SHA512': SHA512 hash algorithm.

  • buffer_size (int) – Buffer size (in bytes) for reading the file. 65536 corresponds to 64KB. (Default is 65536)

Returns:

hash_val (str):

The computed hash of the file.

Return type:

(str)
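
Reading the file in fixed-size chunks keeps memory use bounded regardless of file size. A minimal sketch of that pattern with hashlib follows; hash_file_sketch is a hypothetical name, and returning the hex digest is an assumption about the output format.

```python
import hashlib
import os
import tempfile


def hash_file_sketch(path, type_hash="MD5", buffer_size=65536):
    """Hypothetical helper: hash a file incrementally, buffer_size bytes at a time."""
    hashers = {"MD5": hashlib.md5, "SHA1": hashlib.sha1,
               "SHA256": hashlib.sha256, "SHA512": hashlib.sha512}
    hasher = hashers[type_hash]()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()


# Check that chunked hashing matches hashing the whole payload at once.
payload = b"roicat" * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name
try:
    digest_chunked = hash_file_sketch(tmp_path, type_hash="SHA256", buffer_size=64)
finally:
    os.remove(tmp_path)
digest_direct = hashlib.sha256(payload).hexdigest()
```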

roicat.helpers.idx2bool(idx: ndarray, length: int | None = None) ndarray[source]

Converts a vector of indices to a boolean vector. RH 2021

Parameters:
  • idx (np.ndarray) – 1-D array of indices.

  • length (Optional[int]) – Length of boolean vector. If None, the length will be set to the maximum index in idx + 1. (Default is None)

Returns:

bool_vec (np.ndarray):

1-D boolean array.

Return type:

(np.ndarray)
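
The conversion, including the default length of max(idx) + 1, can be sketched with plain lists. idx2bool_sketch is a hypothetical name; the library function returns a numpy array.

```python
def idx2bool_sketch(idx, length=None):
    """Hypothetical list-based version of the index-to-boolean conversion."""
    if length is None:
        length = max(idx) + 1  # default stated in the parameter description
    out = [False] * length
    for i in idx:
        out[i] = True
    return out


mask_default = idx2bool_sketch([0, 2, 5])
mask_padded = idx2bool_sketch([0, 2, 5], length=8)
```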

roicat.helpers.idx_to_oneHot(arr: ndarray | Tensor, n_classes: int | None = None, dtype: Type | None = None) ndarray | Tensor[source]

Converts an array of class labels to a matrix of one-hot vectors. RH 2021

Parameters:
  • arr (Union[np.ndarray, torch.Tensor]) – A 1-D array of class labels. Values should be integers >= 0. Values will be used as indices in the output array.

  • n_classes (Optional[int]) – The number of classes. If None, it will be derived from arr. (Default is None)

  • dtype (Optional[Type]) – The data type of the output array. If None, it defaults to bool for numpy array and torch.bool for Torch tensor. (Default is None)

Returns:

oneHot (Union[np.ndarray, torch.Tensor]):

A 2-D array of one-hot vectors.

Return type:

(Union[np.ndarray, torch.Tensor])

roicat.helpers.index_with_nans(values, indices)[source]

Indexes an array with a list of indices, allowing for NaNs in the indices. RH 2022

Parameters:
  • values (np.ndarray) – Array to be indexed.

  • indices (Union[List[int], np.ndarray]) – 1D list or array of indices to use for indexing. Can contain NaNs. Datatype should be floating point. NaNs will be removed and values will be cast to int.

Returns:

Indexed array. Positions where indices was NaN will be filled with NaNs.

Return type:

np.ndarray
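
The NaN-propagating lookup can be sketched with plain lists. index_with_nans_sketch is a hypothetical name; the library function works on numpy arrays.

```python
import math


def index_with_nans_sketch(values, indices):
    """Hypothetical helper: NaN indices yield NaN, other indices are cast to int."""
    return [math.nan if math.isnan(i) else values[int(i)] for i in indices]


picked = index_with_nans_sketch([10.0, 20.0, 30.0], [2.0, float("nan"), 0.0])
```

The output has the same length as indices, with NaN wherever the index was NaN.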

roicat.helpers.invert_remappingIdx(remappingIdx: ndarray, method: str = 'linear', fill_value: float | None = nan) ndarray[source]

Inverts a remapping index field.

Requires the assumption that the remapping index field is invertible (bijective, i.e. one-to-one) and non-occluding. If ‘remap_AB’ is defined as a remapping index field that warps image A onto image B, then ‘remap_BA’ is the remapping index field that warps image B onto image A. This function computes ‘remap_BA’ given ‘remap_AB’.

RH 2023

Parameters:
  • remappingIdx (np.ndarray) – An array of shape (H, W, 2) representing the remap field.

  • method (str) –

    Interpolation method to use. See scipy.interpolate.griddata. Options are:

    • 'linear'

    • 'nearest'

    • 'cubic'

    (Default is 'linear')

  • fill_value (Optional[float]) – Value used to fill points outside the convex hull. (Default is np.nan)

Returns:

An array of shape (H, W, 2) representing the inverse remap field.

Return type:

(np.ndarray)

roicat.helpers.invert_warp_matrix(warp_matrix: ndarray) ndarray[source]

Inverts a provided warp matrix for the transformation A->B to compute the warp matrix for B->A. RH 2023

Parameters:

warp_matrix (np.ndarray) – A 2x3 or 3x3 array representing the warp matrix. Shape: (2, 3) or (3, 3).

Returns:

inverted_warp_matrix (np.ndarray):

The inverted warp matrix. Shape: same as input.

Return type:

(np.ndarray)

roicat.helpers.json_load(filepath: str, mode: str = 'r') Any[source]

Loads an object from a json file. RH 2022

Parameters:
  • filepath (str) – Path to the json file.

  • mode (str) – The mode to open the file in. (Default is 'r')

Returns:

obj (Any):

The object loaded from the json file.

Return type:

(Any)

roicat.helpers.json_save(obj: Any, filepath: str, indent: int = 4, mode: str = 'w', mkdir: bool = False, allow_overwrite: bool = True) None[source]

Saves an object to a json file using json.dump. RH 2022

Parameters:
  • obj (Any) – The object to save.

  • filepath (str) – The path to save the object to.

  • indent (int) – Number of spaces for indentation in the output json file. (Default is 4)

  • mode (str) –

    The mode to open the file in. Options are:

    • 'w': Write text.

    • 'a': Append text.

    • 'x': Exclusive write. Raises FileExistsError if the file already exists.

    (Default is 'w')

  • mkdir (bool) – If True, creates parent directory if it does not exist. (Default is False)

  • allow_overwrite (bool) – If True, allows overwriting of existing file. (Default is True)
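
Because json.dump writes str rather than bytes, the file has to be opened in a text mode. A minimal round-trip sketch with the standard library follows; json_save_sketch and json_load_sketch are hypothetical names, and the mkdir and allow_overwrite handling is omitted.

```python
import json
import os
import tempfile


def json_save_sketch(obj, filepath, indent=4, mode="w"):
    """Hypothetical helper: json.dump into a file opened in a text mode."""
    with open(filepath, mode) as f:
        json.dump(obj, f, indent=indent)


def json_load_sketch(filepath, mode="r"):
    """Hypothetical helper: json.load from a file."""
    with open(filepath, mode) as f:
        return json.load(f)


params = {"n_iter": 1000, "mode_transform": "affine", "mask_borders": [0, 0, 0, 0]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "params.json")
    json_save_sketch(params, path)
    reloaded = json_load_sketch(path)
    # Exclusive mode refuses to overwrite the file that now exists.
    try:
        json_save_sketch(params, path, mode="x")
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True
```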

class roicat.helpers.lazy_repeat_obj(obj: Any, pseudo_length: int | None = None)[source]

Bases: object

Makes a lazy iterator that repeats an object. RH 2021

Parameters:
  • obj (Any) – Object to repeat.

  • pseudo_length (Optional[int]) – Length of the iterator. (Default is None).
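
The idea of a lazy repeater is that every index returns the same object without copying it. A minimal sketch, assuming that indexing past pseudo_length raises IndexError; LazyRepeatSketch is a hypothetical name.

```python
class LazyRepeatSketch:
    """Hypothetical sequence-like object that returns the same object for any index."""

    def __init__(self, obj, pseudo_length=None):
        self.obj = obj
        self.pseudo_length = pseudo_length

    def __getitem__(self, i):
        # With a pseudo length set, behave like a finite sequence.
        if self.pseudo_length is not None and i >= self.pseudo_length:
            raise IndexError("index out of range")
        return self.obj

    def __len__(self):
        if self.pseudo_length is None:
            raise TypeError("no pseudo_length was set")
        return self.pseudo_length


shared = {"lr": 0.1}
repeated = LazyRepeatSketch(shared, pseudo_length=3)
collected = list(repeated)  # iteration stops at the first IndexError
```

Every element of collected is the same object as shared, so no memory is spent on copies.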

roicat.helpers.make_Fourier_mask(frame_shape_y_x: Tuple[int, int] = (512, 512), bandpass_spatialFs_bounds: List[float] = [0.0078125, 0.3333333333333333], order_butter: int = 5, mask: ndarray | Tensor | None = None, dtype_fft: object = torch.complex64, plot_pref: bool = False, verbose: bool = False) Tensor[source]

Generates a Fourier domain mask for phase correlation, primarily used in BWAIN.

Parameters:
  • frame_shape_y_x (Tuple[int, int]) – Shape of the images that will be processed through this function. (Default is (512, 512))

  • bandpass_spatialFs_bounds (List[float]) – Specifies the lowcut and highcut in spatial frequency for the butterworth filter. (Default is [1/128, 1/3])

  • order_butter (int) – Order of the butterworth filter. (Default is 5)

  • mask (Union[np.ndarray, torch.Tensor, None]) – If not None, this mask is used instead of creating a new one. (Default is None)

  • dtype_fft (object) – Data type for the Fourier transform. (Default is torch.complex64)

  • plot_pref (bool) – If True, the absolute value of the mask is plotted. (Default is False)

  • verbose (bool) – If True, enables the print statements for debugging. (Default is False)

Returns:

mask_fft (torch.Tensor):

The generated mask in the Fourier domain.

Return type:

(torch.Tensor)

roicat.helpers.make_batches(iterable: Iterable, batch_size: int | None = None, num_batches: int | None = None, min_batch_size: int = 0, return_idx: bool = False, length: int | None = None) Iterable[source]

Creates batches from an iterable. RH 2021

Parameters:
  • iterable (Iterable) – The iterable to be batched.

  • batch_size (Optional[int]) – The size of each batch. If None, then the batch size is based on num_batches. (Default is None)

  • num_batches (Optional[int]) – The number of batches to create. (Default is None)

  • min_batch_size (int) – The minimum size of each batch. (Default is 0)

  • return_idx (bool) – If True, return the indices of the batches. Output will be [start, end] idx. (Default is False)

  • length (Optional[int]) – The length of the iterable. If None, then the length is len(iterable). This is useful if you want to make batches of something that doesn’t have a __len__ method. (Default is None)

Returns:

output (Iterable):

Batches of the iterable.

Return type:

(Iterable)
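
The batching behavior described by these parameters can be sketched as a generator over a sliceable sequence. This assumes that a final batch smaller than min_batch_size is dropped and that num_batches is converted to a batch size by ceiling division; make_batches_sketch is a hypothetical name.

```python
import math


def make_batches_sketch(seq, batch_size=None, num_batches=None,
                        min_batch_size=0, return_idx=False):
    """Hypothetical generator yielding consecutive slices of `seq`."""
    n = len(seq)
    if batch_size is None:
        batch_size = math.ceil(n / num_batches)
    for start in range(0, n, batch_size):
        end = min(start + batch_size, n)
        if end - start < min_batch_size:
            break  # assumed behavior: drop a too-small trailing batch
        yield (seq[start:end], [start, end]) if return_idx else seq[start:end]


data = list(range(10))
batches = list(make_batches_sketch(data, batch_size=4))
indexed = list(make_batches_sketch(data, num_batches=3, return_idx=True))
trimmed = list(make_batches_sketch(data, batch_size=4, min_batch_size=3))
```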

roicat.helpers.map_parallel(func: Callable, args: List[Any], method: str = 'multithreading', n_workers: int = -1, prog_bar: bool = True) List[Any][source]

Maps a function to a list of arguments in parallel. RH 2022

Parameters:
  • func (Callable) – The function to be mapped.

  • args (List[Any]) – List of arguments to which the function should be mapped. Length of list should be equal to the number of arguments. Each element should then be an iterable for each job that is run.

  • method (str) –

    Method to use for parallelization. Options are:

    • 'multithreading': Use multithreading from concurrent.futures.

    • 'multiprocessing': Use multiprocessing from concurrent.futures.

    • 'mpire': Use mpire.

    • 'serial': Use list comprehension.

    (Default is 'multithreading')

  • n_workers (int) – Number of workers to use. If set to -1, all available workers are used. (Default is -1)

  • prog_bar (bool) – Whether to display a progress bar using tqdm. (Default is True)

Returns:

output (List[Any]):

List of results from mapping the function to the arguments.

Return type:

(List[Any])

Example
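
A minimal sketch of the multithreading case using only concurrent.futures. It follows the args layout described above (one iterable per function argument); map_threads_sketch and scaled_sum are hypothetical names, and the handling of n_workers=-1 and the progress bar is not reproduced.

```python
from concurrent.futures import ThreadPoolExecutor


def map_threads_sketch(func, args, n_workers=None):
    """Hypothetical helper: args[k] is the iterable of values for the k-th argument."""
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        # executor.map zips the iterables and returns results in input order.
        return list(executor.map(func, *args))


def scaled_sum(x, y, scale):
    return (x + y) * scale


# Three jobs: (1, 10, 2), (2, 20, 2), (3, 30, 2)
results = map_threads_sketch(scaled_sum, [[1, 2, 3], [10, 20, 30], [2, 2, 2]])
```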

roicat.helpers.mask_image_border(im: ndarray, border_outer: int | Tuple[int, int, int, int] | None = None, border_inner: int | None = None, mask_value: float = 0) ndarray[source]

Masks an image within specified outer and inner borders. RH 2022

Parameters:
  • im (np.ndarray) – Input image of shape: (height, width).

  • border_outer (Union[int, tuple[int, int, int, int], None]) – Number of pixels along the border to mask. If None, the border is not masked. If an int is provided, all borders are equally masked. If a tuple of ints is provided, borders are masked in the order: (top, bottom, left, right). (Default is None)

  • border_inner (int, Optional) – Number of pixels in the center to mask. Will be a square with side length equal to this value. (Default is None)

  • mask_value (float) – Value to replace the masked pixels with. (Default is 0)

Returns:

im_out (np.ndarray):

Masked output image.

Return type:

(np.ndarray)

roicat.helpers.matlab_load(filepath: str, simplify_cells: bool = True, kwargs_scipy: Dict = {}, kwargs_mat73: Dict = {}, verbose: bool = False) Dict[source]

Loads a matlab file. If the .mat file is not version 7.3, it uses scipy.io.loadmat. If the .mat file is version 7.3, it uses mat73.loadmat. RH 2023

Parameters:
  • filepath (str) – Path to the matlab file.

  • simplify_cells (bool) – If set to True and file is not version 7.3, it simplifies cells to numpy arrays. (Default is True)

  • kwargs_scipy (Dict) – Keyword arguments to pass to scipy.io.loadmat. (Default is {})

  • kwargs_mat73 (Dict) – Keyword arguments to pass to mat73.loadmat. (Default is {})

  • verbose (bool) – If set to True, it prints information about the file. (Default is False)

Returns:

out (Dict):

The loaded matlab file content in a dictionary format.

Return type:

(Dict)

roicat.helpers.matlab_save(obj: Dict, filepath: str, mkdir: bool = False, allow_overwrite: bool = True, clean_string: bool = True, list_to_objArray: bool = True, none_to_nan: bool = True, kwargs_scipy_savemat: Dict = {'appendmat': True, 'do_compression': False, 'format': '5', 'long_field_names': False, 'oned_as': 'row'})[source]

Saves data to a matlab file. It uses scipy.io.savemat and provides additional functionality such as cleaning strings, converting lists to object arrays, and converting None to np.nan. RH 2023

Parameters:
  • obj (Dict) – The data to save. This must be in dictionary format.

  • filepath (str) – The path to save the file to.

  • mkdir (bool) – If set to True, creates parent directory if it does not exist. (Default is False)

  • allow_overwrite (bool) – If set to True, allows overwriting of existing file. (Default is True)

  • clean_string (bool) – If set to True, converts strings to bytes. (Default is True)

  • list_to_objArray (bool) – If set to True, converts lists to object arrays. (Default is True)

  • none_to_nan (bool) – If set to True, converts None to np.nan. (Default is True)

  • kwargs_scipy_savemat (Dict) –

    Keyword arguments to pass to scipy.io.savemat.

    • 'appendmat': Whether to append .mat to the end of the given filename, if it isn’t already there.

    • 'format': The format of the .mat file. ‘4’ for Matlab 4 .mat files, ‘5’ for Matlab 5 and above.

    • 'long_field_names': Whether to allow field names of up to 63 characters instead of the standard 31.

    • 'do_compression': Whether to compress matrices on write.

    • 'oned_as': Whether to save 1-D numpy arrays as row or column vectors in the .mat file. ‘row’ or ‘column’.

    (Default is {'appendmat': True, 'format': '5', 'long_field_names': False, 'do_compression': False, 'oned_as': 'row'})

roicat.helpers.merge_dicts(dicts: List[dict]) dict[source]

Merges a list of dictionaries into a single dictionary. RH 2022

Parameters:

dicts (List[dict]) – List of dictionaries to merge.

Returns:

result_dict (dict):

A single dictionary that contains all keys and values from the dictionaries in the input list.

Return type:

(dict)

roicat.helpers.merge_sparse_arrays(s_list: List[csr_matrix], idx_list: List[ndarray], shape_full: Tuple[int, int], remove_redundant: bool = True, elim_zeros: bool = True) csr_matrix[source]

Merges a list of square sparse arrays into a single square sparse array. Redundant entries are not selected; only entries chosen by np.unique are kept.

Parameters:
  • s_list (List[scipy.sparse.csr_matrix]) – List of sparse arrays to merge. Each array can be any shape.

  • idx_list (List[np.ndarray]) – List of integer arrays. Each array should be the same length as its corresponding array in s_list and contain integers in the range [0, shape_full[0]). These integers represent the row/column indices in the full array.

  • shape_full (Tuple[int, int]) – Shape of the full array.

  • remove_redundant (bool) –

    • True: Removes redundant entries from the output array.

    • False: Keeps redundant entries.

  • elim_zeros (bool) –

    • True: Eliminate zeros in the sparse matrix.

    • False: Keeps zeros.

Returns:

s_full (scipy.sparse.csr_matrix):

Full sparse matrix merged from the input list.

Return type:

scipy.sparse.csr_matrix

roicat.helpers.pickle_load(filepath: str, zipCompressed: bool = False, mode: str = 'rb') Any[source]

Loads an object from a pickle file. RH 2022

Parameters:
  • filepath (str) – Path to the pickle file.

  • zipCompressed (bool) – If True, the file is assumed to be a .zip file. The function will first unzip the file, then load the object from the unzipped file. (Default is False)

  • mode (str) – The mode to open the file in. (Default is 'rb')

Returns:

obj (Any):

The object loaded from the pickle file.

Return type:

(Any)

roicat.helpers.pickle_save(obj: Any, filepath: str, mode: str = 'wb', zipCompress: bool = False, mkdir: bool = False, allow_overwrite: bool = True, **kwargs_zipfile: Dict[str, Any]) None[source]

Saves an object to a pickle file using pickle.dump. Allows for zipping of the file.

RH 2022

Parameters:
  • obj (Any) – The object to save.

  • filepath (str) – The path to save the object to.

  • mode (str) –

    The mode to open the file in. Options are:

    • 'wb': Write binary.

    • 'ab': Append binary.

    • 'xb': Exclusive write binary. Raises FileExistsError if the file already exists.

    (Default is 'wb')

  • zipCompress (bool) – If True, compresses pickle file using zipfileCompressionMethod, which is similar to savez_compressed in numpy (with zipfile.ZIP_DEFLATED). Useful for saving redundant and/or sparse array objects. (Default is False)

  • mkdir (bool) – If True, creates parent directory if it does not exist. (Default is False)

  • allow_overwrite (bool) – If True, allows overwriting of existing file. (Default is True)

  • kwargs_zipfile (Dict[str, Any]) –

    Keyword arguments that will be passed into zipfile.ZipFile. compression=zipfile.ZIP_DEFLATED by default. See https://docs.python.org/3/library/zipfile.html#zipfile-objects. Other options for ‘compression’ are (input can be either int or object):

    • 0: zipfile.ZIP_STORED (no compression)

    • 8: zipfile.ZIP_DEFLATED (usual zip compression)

    • 12: zipfile.ZIP_BZIP2 (bzip2 compression) (usually not as good as ZIP_DEFLATED)

    • 14: zipfile.ZIP_LZMA (lzma compression) (usually better than ZIP_DEFLATED but slower)
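
To illustrate how the compression option affects file size, here is a sketch that stores a pickled object as a single member of a zip archive. The member name 'data' and the helper names are assumptions made for this example and may not match the library's file layout.

```python
import os
import pickle
import tempfile
import zipfile


def pickle_save_zip_sketch(obj, filepath, compression=zipfile.ZIP_DEFLATED):
    """Hypothetical helper: write pickle bytes into one zip member named 'data'."""
    with zipfile.ZipFile(filepath, "w", compression=compression) as zf:
        zf.writestr("data", pickle.dumps(obj))


def pickle_load_zip_sketch(filepath):
    """Hypothetical helper: read the 'data' member back and unpickle it."""
    with zipfile.ZipFile(filepath, "r") as zf:
        return pickle.loads(zf.read("data"))


# Highly redundant payload: compresses well with ZIP_DEFLATED.
payload = {"zeros": [0] * 100_000}
with tempfile.TemporaryDirectory() as tmp:
    path_deflated = os.path.join(tmp, "deflated.zip")
    path_stored = os.path.join(tmp, "stored.zip")
    pickle_save_zip_sketch(payload, path_deflated, compression=zipfile.ZIP_DEFLATED)
    pickle_save_zip_sketch(payload, path_stored, compression=zipfile.ZIP_STORED)
    size_deflated = os.path.getsize(path_deflated)
    size_stored = os.path.getsize(path_stored)
    restored = pickle_load_zip_sketch(path_deflated)
```

For redundant data such as this, the deflated archive is much smaller than the stored one, which is the case the zipCompress option targets.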

roicat.helpers.plot_image_grid(images: List[ndarray] | ndarray, labels: List[str] | None = None, grid_shape: Tuple[int, int] = (10, 10), show_axis: str = 'off', cmap: str | None = None, kwargs_subplots: Dict = {}, kwargs_imshow: Dict = {}) Tuple[Figure, ndarray | Axes][source]

Plots a grid of images. RH 2021

Parameters:
  • images (Union[List[np.ndarray], np.ndarray]) – A list of images or a 3D array of images, where the first dimension is the number of images.

  • labels (Optional[List[str]]) – A list of labels to be displayed in the grid. (Default is None)

  • grid_shape (Tuple[int, int]) – Shape of the grid. (Default is (10,10))

  • show_axis (str) – Whether to show axes or not. (Default is ‘off’)

  • cmap (Optional[str]) – Colormap to use. (Default is None)

  • kwargs_subplots (Dict) – Keyword arguments for subplots. (Default is {})

  • kwargs_imshow (Dict) – Keyword arguments for imshow. (Default is {})

Returns:

tuple containing:
fig (plt.Figure):

Figure object.

axs (Union[np.ndarray, plt.Axes]):

Axes object.

Return type:

(Tuple[plt.Figure, Union[np.ndarray, plt.Axes]])

roicat.helpers.prepare_directory_for_loading(directory: str, must_exist: bool = True) str[source]

Prepares a directory path for loading a file. This function is rarely used.

Parameters:
  • directory (str) – The directory path to be prepared for loading.

  • must_exist (bool) – If set to True, the directory at the specified path must exist. (Default is True)

Returns:

path (str):

The prepared directory path for loading.

Return type:

(str)

roicat.helpers.prepare_directory_for_saving(directory: str, mkdir: bool = False, exist_ok: bool = True) str[source]

Prepares a directory path for saving a file. This function is rarely used.

Parameters:
  • directory (str) – The directory path to be prepared for saving.

  • mkdir (bool) – If set to True, creates parent directory if it does not exist. (Default is False)

  • exist_ok (bool) – If set to True, allows overwriting of existing directory. (Default is True)

Returns:

path (str):

The prepared directory path for saving.

Return type:

(str)

roicat.helpers.prepare_filepath_for_loading(filepath: str, must_exist: bool = True) str[source]

Prepares a file path for loading a file. Ensures the file path is valid and has the necessary permissions.

Parameters:
  • filepath (str) – The file path to be prepared for loading.

  • must_exist (bool) – If set to True, the file at the specified path must exist. (Default is True)

Returns:

path (str):

The prepared file path for loading.

Return type:

(str)

roicat.helpers.prepare_filepath_for_saving(filepath: str, mkdir: bool = False, allow_overwrite: bool = True) str[source]

Prepares a file path for saving a file. Ensures the file path is valid and has the necessary permissions.

Parameters:
  • filepath (str) – The file path to be prepared for saving.

  • mkdir (bool) – If set to True, creates parent directory if it does not exist. (Default is False)

  • allow_overwrite (bool) – If set to True, allows overwriting of existing file. (Default is True)

Returns:

path (str):

The prepared file path for saving.

Return type:

(str)

roicat.helpers.prepare_params(params, defaults, verbose=True)[source]

Does the following:
  • Checks that all keys in params are in defaults.

  • Fills in any missing keys in params with values from defaults.

  • Returns a deepcopy of the filled-in params.

Parameters:
  • params (Dict) – Dictionary of parameters.

  • defaults (Dict) – Dictionary of defaults.

  • verbose (bool) – Whether to print messages.
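
The three steps listed above can be sketched for top-level keys as follows. Raising KeyError on an unknown key is an assumption about how the check fails, nested dictionaries are not handled, and prepare_params_sketch is a hypothetical name.

```python
import copy


def prepare_params_sketch(params, defaults):
    """Hypothetical helper: validate keys, fill in defaults, return a deep copy."""
    unknown = set(params) - set(defaults)
    if unknown:
        raise KeyError(f"Unexpected parameter(s): {sorted(unknown)}")
    out = copy.deepcopy(params)
    for key, value in defaults.items():
        if key not in out:
            out[key] = copy.deepcopy(value)
    return out


defaults = {"n_iter": 100, "device": "cpu", "verbose": True}
user = {"n_iter": 5}
filled = prepare_params_sketch(user, defaults)

# A misspelled key is caught instead of being silently ignored.
try:
    prepare_params_sketch({"n_itr": 5}, defaults)
    typo_rejected = False
except KeyError:
    typo_rejected = True
```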

roicat.helpers.prepare_path(path: str, mkdir: bool = False, exist_ok: bool = True) str[source]

Checks if a directory or file path is valid for different purposes: saving, loading, etc. RH 2023

  • If exists:
    • If exist_ok=True: all good

    • If exist_ok=False: raises error

  • If doesn’t exist:
    • If file:
      • If parent directory exists:
        • All good

      • If parent directory doesn’t exist:
        • If mkdir=True: creates parent directory

        • If mkdir=False: raises error

    • If directory:
      • If mkdir=True: creates directory

      • If mkdir=False: raises error

Parameters:
  • path (str) – Path to be checked.

  • mkdir (bool) – If True, creates parent directory if it does not exist. (Default is False)

  • exist_ok (bool) – If True, allows overwriting of existing file. (Default is True)

Returns:

path (str):

Resolved path.

Return type:

(str)
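
The decision tree above can be sketched with pathlib. How a nonexistent path is classified as a file or a directory is not stated here, so this sketch assumes that a path with a suffix is a file; prepare_path_sketch and the chosen exception types are likewise hypothetical.

```python
import os
import tempfile
from pathlib import Path


def prepare_path_sketch(path, mkdir=False, exist_ok=True):
    """Hypothetical helper following the exists / file / directory branches."""
    p = Path(path).resolve()
    if p.exists():
        if not exist_ok:
            raise FileExistsError(f"{p} already exists and exist_ok=False")
        return str(p)
    # Assumption: a suffix marks a file, so its parent is the directory that must exist.
    needed_dir = p.parent if p.suffix else p
    if not needed_dir.exists():
        if not mkdir:
            raise FileNotFoundError(f"{needed_dir} does not exist and mkdir=False")
        needed_dir.mkdir(parents=True, exist_ok=True)
    return str(p)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "results", "run.json")
    try:
        prepare_path_sketch(target, mkdir=False)
        raised_without_mkdir = False
    except FileNotFoundError:
        raised_without_mkdir = True
    resolved = prepare_path_sketch(target, mkdir=True)
    parent_created = os.path.isdir(os.path.dirname(resolved))
```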

roicat.helpers.pydata_sparse_to_torch_coo(sp_array: object) object[source]

Converts a PyData Sparse array to a PyTorch sparse COO tensor.

This function extracts the coordinates and data from the sparse PyData array and uses them to create a new sparse COO tensor in PyTorch.

Parameters:

sp_array (object) – The PyData Sparse array to convert. It should be a COO sparse matrix representation.

Returns:

coo_tensor (object):

The converted PyTorch sparse COO tensor.

Return type:

(object)

Example

sp_array = sparse.COO(np.random.rand(1000, 1000))
coo_tensor = pydata_sparse_to_torch_coo(sp_array)

roicat.helpers.rand_cmap(nlabels: int, first_color_black: bool = False, last_color_black: bool = False, verbose: bool = True, under: List[float] = [0, 0, 0], over: List[float] = [0.5, 0.5, 0.5], bad: List[float] = [0.9, 0.9, 0.9]) object[source]

Creates a random colormap to be used with matplotlib. Useful for segmentation tasks.

Parameters:
  • nlabels (int) – Number of labels (size of colormap).

  • first_color_black (bool) – Option to use the first color as black. (Default is False)

  • last_color_black (bool) – Option to use the last color as black. (Default is False)

  • verbose (bool) – Prints the number of labels and shows the colormap if True. (Default is True)

  • under (List[float]) – RGB values to use for the ‘under’ threshold in the colormap. (Default is [0, 0, 0])

  • over (List[float]) – RGB values to use for the ‘over’ threshold in the colormap. (Default is [0.5, 0.5, 0.5])

  • bad (List[float]) – RGB values to use for ‘bad’ values in the colormap. (Default is [0.9, 0.9, 0.9])

Returns:

colormap (LinearSegmentedColormap):

Colormap for matplotlib.

Return type:

(LinearSegmentedColormap)

roicat.helpers.remap_images(images: ndarray | Tensor, remappingIdx: ndarray | Tensor, backend: str = 'torch', interpolation_method: str = 'linear', border_mode: str = 'constant', border_value: float = 0, device: str = 'cpu') ndarray | Tensor[source]

Applies remapping indices to a set of images. Remapping indices, similar to flow fields, describe the index of the pixel to sample from rather than the displacement of each pixel. RH 2023

Parameters:
  • images (Union[np.ndarray, torch.Tensor]) – The images to be warped. Shapes can be (N, C, H, W), (C, H, W), or (H, W).

  • remappingIdx (Union[np.ndarray, torch.Tensor]) – The remapping indices, describing the index of the pixel to sample from. Shape is (H, W, 2).

  • backend (str) – The backend to use. Can be either 'torch' or 'cv2'. (Default is 'torch')

  • interpolation_method (str) – The interpolation method to use. Options are 'linear', 'nearest', 'cubic', and 'lanczos'. Refer to cv2.remap or torch.nn.functional.grid_sample for more details. (Default is 'linear')

  • border_mode (str) – The border mode to use. Options include 'constant', 'reflect', 'replicate', and 'wrap'. Refer to cv2.remap for more details. (Default is 'constant')

  • border_value (float) – The border value to use. Refer to cv2.remap for more details. (Default is 0)

  • device (str) – The device to use for computations. Commonly either 'cpu' or 'cuda'. (Default is 'cpu')

Returns:

warped_images (Union[np.ndarray, torch.Tensor]):

The warped images. The shape will be the same as the input images, which can be (N, C, H, W), (C, H, W), or (H, W).

Return type:

(Union[np.ndarray, torch.Tensor])

roicat.helpers.remap_sparse_images(ims_sparse: spmatrix | List[spmatrix], remappingIdx: ndarray, method: str = 'linear', fill_value: float = 0, dtype: str | dtype = None, safe: bool = True, n_workers: int = -1, verbose: bool = True) List[csr_matrix][source]

Remaps a list of sparse images using the given remap field. RH 2023

Parameters:
  • ims_sparse (Union[scipy.sparse.spmatrix, List[scipy.sparse.spmatrix]]) – A single sparse image or a list of sparse images.

  • remappingIdx (np.ndarray) – An array of shape (H, W, 2) representing the remap field. It should be the same size as the images in ims_sparse.

  • method (str) –

    Interpolation method to use. See scipy.interpolate.griddata. Options are:

    • 'linear'

    • 'nearest'

    • 'cubic'

    (Default is 'linear')

  • fill_value (float) – Value used to fill points outside the convex hull. (Default is 0.0)

  • dtype (Union[str, np.dtype]) – The data type of the resulting sparse images. Default is None, which will use the data type of the input sparse images.

  • safe (bool) – If True, checks if the image is 0D or 1D and applies a tiny Gaussian blur to increase the image width. (Default is True)

  • n_workers (int) – Number of parallel workers to use. Default is -1, which uses all available CPU cores.

  • verbose (bool) – Whether or not to use a tqdm progress bar. (Default is True)

Returns:

ims_sparse_out (List[scipy.sparse.csr_matrix]):

A list of remapped sparse images.

Return type:

(List[scipy.sparse.csr_matrix])

Raises:
  • AssertionError – If the image and remappingIdx have different spatial dimensions.

roicat.helpers.remappingIdx_to_flowField(ri: ndarray | object) ndarray | object[source]

Convert a remapping index to a flow field. WARNING: Technically, it is not possible to convert a remapping index to a flow field, since the remapping index describes an interpolation mapping, while the flow field describes a displacement. RH 2023

Parameters:

ri (Union[np.ndarray, object]) – Remapping index represented as a numpy ndarray or torch Tensor. It describes the index of the pixel in the original image that should be mapped to the new pixel. Shape (H, W, 2). Last dimension is (x, y).

Returns:

ff (Union[np.ndarray, object]):

Flow field. It describes the displacement of each pixel. Shape (H, W, 2).

Return type:

(Union[np.ndarray, object])
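
The relationship between the two representations can be illustrated with plain NumPy. This is a minimal sketch, not ROICaT's implementation: it assumes the flow field is the remapping index minus an identity grid of (x, y) pixel coordinates, and remap_to_flow_sketch is a hypothetical name.

```python
import numpy as np

def remap_to_flow_sketch(ri):
    """Subtract the identity grid from a remapping index of shape (H, W, 2)."""
    h, w = ri.shape[:2]
    # Identity grid: the entry at [row, col] holds (x, y) = (col, row).
    xx, yy = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack([xx, yy], axis=-1).astype(ri.dtype)
    return ri - grid

# A remapping index that samples every pixel from 2 columns to the right.
h, w = 3, 4
xx, yy = np.meshgrid(np.arange(w), np.arange(h))
ri = np.stack([xx + 2, yy], axis=-1).astype(np.float32)
ff = remap_to_flow_sketch(ri)  # constant displacement of (2, 0) at every pixel
```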

roicat.helpers.remove_redundant_elements(s: coo_matrix, inPlace: bool = False) coo_matrix[source]

Removes redundant entries from a sparse matrix. Useful when manually populating a sparse matrix and you want to remove redundant entries. RH 2022

Parameters:
  • s (scipy.sparse.coo_matrix) – Sparse matrix. Should be in COO format.

  • inPlace (bool) –

    • If True, the input matrix is modified in place.

    • If False, a new matrix is returned.

    (Default is False)

Returns:

s (scipy.sparse.coo_matrix):

Sparse matrix with redundant entries removed.

Return type:

(scipy.sparse.coo_matrix)

roicat.helpers.resize_images(images: ndarray | List[ndarray] | Tensor | List[Tensor], new_shape: Tuple[int, int] = (100, 100), interpolation: str = 'BILINEAR', antialias: bool = False, device: str = 'cpu', return_numpy: bool = True) ndarray[source]

Resizes images using the torchvision.transforms.Resize method. RH 2023

Parameters:
  • images (Union[np.ndarray, List[np.ndarray], torch.Tensor, List[torch.Tensor]]) –

    Images or frames of a video. Can be 2D, 3D, or 4D.

    • For a 2D array: shape is (height, width)

    • For a 3D array: shape is (n_frames, height, width)

    • For a 4D array: shape is (n_frames, n_channels, height, width)

  • new_shape (Tuple[int, int]) – The desired height and width of resized images as a tuple. (Default is (100, 100))

  • interpolation (str) –

    The interpolation method to use. See torchvision.transforms.Resize for options.

    • 'NEAREST': Nearest neighbor interpolation

    • 'NEAREST_EXACT': Nearest neighbor interpolation

    • 'BILINEAR': Bilinear interpolation

    • 'BICUBIC': Bicubic interpolation

    (Default is 'BILINEAR')

  • antialias (bool) – If True, antialiasing will be used. (Default is False)

  • device (str) – The device to use for torchvision.transforms.Resize. (Default is 'cpu')

  • return_numpy (bool) – If True, then will return a numpy array. Otherwise, will return a torch tensor on the defined device. (Default is True)

Returns:

images_resized (np.ndarray):

Resized images or video frames.

Return type:

(np.ndarray)

roicat.helpers.save_gif(array: ndarray | List, path: str, frameRate: float = 5.0, loop: int = 0, kwargs_backend: Dict = {})[source]

Saves an array of images as a gif. RH 2023

Parameters:
  • array (Union[np.ndarray, list]) –

    The 3D (grayscale) or 4D (color) array of images.

    • If dtype is float type, then scale is from 0 to 1.

    • If dtype is int, then scale is from 0 to 255.

  • path (str) – The path where the gif is saved.

  • frameRate (float) – The frame rate of the gif. (Default is 5.0)

  • loop (int) –

    The number of times to loop the gif. (Default is 0)

    • 0 means loop forever

    • 1 means play once

    • 2 means play twice (loop once)

    • etc.

  • kwargs_backend (Dict) – The keyword arguments for the backend.

class roicat.helpers.scipy_sparse_csr_with_length(*args: object, **kwargs: object)[source]

Bases: csr_matrix

A scipy sparse matrix with a length attribute. RH 2023

length

The length of the matrix (shape[0])

Type:

int

Parameters:
  • *args (object) – Arbitrary arguments passed to scipy.sparse.csr_matrix.

  • **kwargs (object) – Arbitrary keyword arguments passed to scipy.sparse.csr_matrix.

roicat.helpers.scipy_sparse_to_torch_coo(sp_array: coo_matrix, dtype: type | None = None) sparse_coo_tensor[source]

Converts a Scipy sparse array to a PyTorch sparse COO tensor.

Parameters:
  • sp_array (scipy.sparse.coo_matrix) – Scipy sparse array to be converted to a PyTorch sparse COO tensor.

  • dtype (Optional[type]) – Data type to which the values of the input sparse array are to be converted before creating the PyTorch sparse tensor. If None, the data type of the input array’s values is retained. (Default is None).

Returns:

coo_tensor (torch.sparse_coo_tensor):

PyTorch sparse COO tensor converted from the input Scipy sparse array.

Return type:

(torch.sparse_coo_tensor)

roicat.helpers.set_device(use_GPU: bool = True, device_num: int = 0, verbose: bool = True) str[source]

Sets the device for PyTorch. If a GPU is available and use_GPU is True, it will be set as the device. Otherwise, the CPU will be set as the device. RH 2022

Parameters:
  • use_GPU (bool) –

    Determines if the GPU should be utilized:

    • True: the function will attempt to use the GPU if one is available.

    • False: the function will use the CPU.

    (Default is True)

  • device_num (int) – Specifies the index of the GPU to use. (Default is 0)

  • verbose (bool) –

    Determines whether to print the device information.

    • True: the function will print out the device information.

    (Default is True)

Returns:

device (str):

A string specifying the device, either “cpu” or “cuda:<device_num>”.

Return type:

(str)
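
The selection rule described above can be sketched without PyTorch. This is an illustration only: GPU availability is passed in as a plain boolean (a real implementation would typically query something like torch.cuda.is_available()), and choose_device_sketch and gpu_available are hypothetical names.

```python
def choose_device_sketch(use_GPU=True, device_num=0, gpu_available=False):
    """Return 'cuda:<device_num>' only when a GPU is both requested and available."""
    if use_GPU and gpu_available:
        return f"cuda:{device_num}"
    # Fall back to the CPU when no GPU is requested or none is present.
    return "cpu"

device_gpu = choose_device_sketch(use_GPU=True, device_num=1, gpu_available=True)
device_fallback = choose_device_sketch(use_GPU=True, gpu_available=False)
print(device_gpu, device_fallback)
```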

roicat.helpers.show_item_tree(hObj: object | dict | None = None, path: str | Path | None = None, depth: int | None = None, show_metadata: bool = True, print_metadata: bool = False, indent_level: int = 0) None[source]

Recursively displays all the items and groups in an HDF5 object or Python dictionary. RH 2021

Parameters:
  • hObj (Optional[Union[object, dict]]) – Hierarchical object, which can be an HDF5 object or a Python dictionary. (Default is None)

  • path (Optional[Union[str, Path]]) – If not None, then the path to the HDF5 object is used instead of hObj. (Default is None)

  • depth (Optional[int]) – How many levels deep to show the tree. (Default is None which shows all levels)

  • show_metadata (bool) – Whether or not to list metadata with items. (Default is True)

  • print_metadata (bool) – Whether or not to show values of metadata items. (Default is False)

  • indent_level (int) – Used internally to the function. User should leave this as the default. (Default is 0)

Example

import h5py
with h5py.File('test.h5', 'r') as f:
    show_item_tree(f)
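
For the dictionary case, the recursive traversal can be sketched in pure Python. This is not the library's code: tree_lines_sketch is a hypothetical helper, the output format is invented for illustration, and metadata handling is left out.

```python
def tree_lines_sketch(obj, depth=None, indent_level=0):
    """Return one text line per item in a nested dict, indented by nesting level."""
    lines = []
    if depth is not None and indent_level >= depth:
        return lines  # stop descending once the requested depth is reached
    for key, value in obj.items():
        prefix = "  " * indent_level
        if isinstance(value, dict):
            lines.append(f"{prefix}{key}/")
            lines.extend(tree_lines_sketch(value, depth, indent_level + 1))
        else:
            lines.append(f"{prefix}{key}: {type(value).__name__}")
    return lines

nested = {"session_1": {"n_roi": 120, "label": "day 1"}, "version": "1.0"}
lines = tree_lines_sketch(nested)
print("\n".join(lines))
```
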
roicat.helpers.simple_cmap(colors: List[List[float]] = [[1, 0, 0], [1, 0.6, 0], [0.9, 0.9, 0], [0.6, 1, 0], [0, 1, 0], [0, 1, 0.6], [0, 0.8, 0.8], [0, 0.6, 1], [0, 0, 1], [0.6, 0, 1], [0.8, 0, 0.8], [1, 0, 0.6]], under: List[float] = [0, 0, 0], over: List[float] = [0.5, 0.5, 0.5], bad: List[float] = [0.9, 0.9, 0.9], name: str = 'none') object[source]

Creates a colormap from a sequence of RGB values. Borrowed with permission from Alex (https://gist.github.com/ahwillia/3e022cdd1fe82627cbf1f2e9e2ad80a7ex)

Parameters:
  • colors (List[List[float]]) – List of RGB values. Each sub-list contains three float numbers representing an RGB color. (Default is list of RGB colors ranging from red to purple)

  • under (List[float]) – RGB values for the colormap under range. (Default is [0,0,0] (black))

  • over (List[float]) – RGB values for the colormap over range. (Default is [0.5,0.5,0.5] (grey))

  • bad (List[float]) – RGB values for the colormap bad range. (Default is [0.9,0.9,0.9] (light grey))

  • name (str) – Name of the colormap. (Default is ‘none’)

Returns:

cmap (LinearSegmentedColormap):

The generated colormap.

Return type:

(LinearSegmentedColormap)

Example

cmap = simple_cmap([(1,1,1), (1,0,0)]) # white to red colormap
cmap = simple_cmap(['w', 'r'])         # white to red colormap
cmap = simple_cmap(['r', 'b', 'r'])    # red to blue to red

roicat.helpers.sparse_mask(x: csr_matrix, mask_sparse: csr_matrix, do_safety_steps: bool = True) csr_matrix[source]

Masks a sparse matrix with the non-zero elements of another sparse matrix. RH 2022

Parameters:
  • x (scipy.sparse.csr_matrix) – Sparse matrix to mask.

  • mask_sparse (scipy.sparse.csr_matrix) – Sparse matrix to mask with.

  • do_safety_steps (bool) – Whether to do safety steps to ensure that things are working as expected. (Default is True)

Returns:

output (scipy.sparse.csr_matrix):

Masked sparse matrix.

Return type:

(scipy.sparse.csr_matrix)

roicat.helpers.sparse_to_dense_fill(arr_s: COO, fill_val: float = 0.0) ndarray[source]

Converts a sparse array to a dense array and fills in sparse entries with a specified fill value. RH 2023

Parameters:
  • arr_s (sparse.COO) – Sparse array to be converted to dense.

  • fill_val (float) – Value to fill the sparse entries. (Default is 0.0)

Returns:

dense_arr (np.ndarray):

Dense version of the input sparse array.

Return type:

(np.ndarray)

roicat.helpers.squeeze_integers(intVec: list | ndarray | Tensor) ndarray | Tensor[source]

Makes integers in an array consecutive numbers starting from the smallest value, preserving their relative order. For example, [7,2,7,4,-1,0] -> [3,1,3,2,-1,0]. This is useful for removing unused class IDs. RH 2023

Parameters:

intVec (Union[list, np.ndarray, torch.Tensor]) – 1-D array of integers.

Returns:

squeezed_integers (Union[np.ndarray, torch.Tensor]):

1-D array of integers with consecutive numbers starting from the smallest value.

Return type:

(Union[np.ndarray, torch.Tensor])
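
One way to obtain this mapping is to rank each element among the sorted unique values. The sketch below uses NumPy only and assumes that the relative order of the values is preserved; squeeze_integers_sketch is a hypothetical name, not the library function.

```python
import numpy as np

def squeeze_integers_sketch(int_vec):
    """Map the sorted unique values onto consecutive integers starting at the minimum."""
    int_vec = np.asarray(int_vec, dtype=np.int64)
    # `inverse` holds, for each element, the rank of its value among the unique values.
    unique_vals, inverse = np.unique(int_vec, return_inverse=True)
    consecutive = np.arange(unique_vals[0], unique_vals[0] + len(unique_vals))
    return consecutive[inverse]

squeezed = squeeze_integers_sketch([7, 2, 7, 4, -1, 0])
print(squeezed.tolist())  # [3, 1, 3, 2, -1, 0]
```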

roicat.helpers.torch_pca(X_in: Tensor | ndarray, device: str = 'cpu', mean_sub: bool = True, zscore: bool = False, rank: int | None = None, return_cpu: bool = True, return_numpy: bool = False) Tuple[Tensor | ndarray, Tensor | ndarray, Tensor | ndarray, Tensor | ndarray][source]

Conducts Principal Components Analysis using the Pytorch library. This function can run on either CPU or GPU devices. RH 2021

Parameters:
  • X_in (Union[torch.Tensor, np.ndarray]) – The data to be decomposed. This should be a 2-D array, with columns representing features and rows representing samples. PCA is performed column-wise.

  • device (str) – The device to use for computation, e.g., ‘cuda’ or ‘cpu’. (Default is 'cpu')

  • mean_sub (bool) – If True, subtract the mean (‘center’) from the columns. (Default is True)

  • zscore (bool) – If True, z-score the columns. This is equivalent to conducting PCA on the correlation-matrix. (Default is False)

  • rank (int) – Maximum estimated rank of the decomposition. If None, then the rank is assumed to be X.shape[1]. (Default is None)

  • return_cpu (bool) –

    • If True, all outputs are forced to be on the ‘cpu’ device.

    • If False, and device is not ‘cpu’, then the returns will be on the provided device.

    (Default is True)

  • return_numpy (bool) – If True, all outputs are forced to be of type numpy.ndarray. (Default is False)

Returns:

tuple containing:
components (torch.Tensor or np.ndarray):

The components of the decomposition, represented as a 2-D array. Each column is a component vector and each row is a feature weight.

scores (torch.Tensor or np.ndarray):

The scores of the decomposition, represented as a 2-D array. Each column is a score vector and each row is a sample weight.

singVals (torch.Tensor or np.ndarray):

The singular values of the decomposition, represented as a 1-D array. Each element is a singular value.

EVR (torch.Tensor or np.ndarray):

The explained variance ratio of each component, represented as a 1-D array. Each element is the explained variance ratio of the corresponding component.

Return type:

(tuple)

Example

components, scores, singVals, EVR = torch_pca(X_in)
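
The four returned quantities relate to one another as in the following NumPy stand-in. It is a sketch under stated assumptions, not the PyTorch implementation: it uses a full SVD (no rank truncation, no device handling), and pca_svd_sketch is a hypothetical name.

```python
import numpy as np

def pca_svd_sketch(X, mean_sub=True, zscore=False):
    """PCA of a (n_samples, n_features) array via singular value decomposition."""
    X = np.asarray(X, dtype=np.float64)
    if mean_sub or zscore:
        X = X - X.mean(axis=0)       # center each column (feature)
    if zscore:
        X = X / X.std(axis=0)        # scale each column to unit variance
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt.T                # (n_features, n_components); each column is a component
    scores = X @ components          # (n_samples, n_components); each column is a score vector
    evr = S**2 / np.sum(S**2)        # explained variance ratio of each component
    return components, scores, S, evr

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([5.0, 2.0, 0.5])
components, scores, singVals, EVR = pca_svd_sketch(X)
```
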
roicat.helpers.warp_matrix_to_remappingIdx(warp_matrix: ndarray | Tensor, x: int, y: int) ndarray | Tensor[source]

Convert a warp matrix (2x3 or 3x3) into remapping indices (2D). RH 2023

Parameters:
  • warp_matrix (Union[np.ndarray, torch.Tensor]) – Warp matrix of shape (2, 3) for affine transformations, and (3, 3) for homography.

  • x (int) – Width of the desired remapping indices.

  • y (int) – Height of the desired remapping indices.

Returns:

remapIdx (Union[np.ndarray, torch.Tensor]):

Remapping indices of shape (y, x, 2), i.e. (height, width, 2). The last dimension holds the (x, y) coordinates, in pixels, of the source location for each output pixel.

Return type:

(Union[np.ndarray, torch.Tensor])
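
The conversion amounts to applying the warp matrix to a grid of pixel coordinates. The NumPy sketch below makes two assumptions that the text above does not confirm: the matrix is applied directly (not inverted), and the output uses the (height, width, 2) layout described for remapping indices elsewhere in this documentation. warp_to_remap_sketch is a hypothetical name.

```python
import numpy as np

def warp_to_remap_sketch(warp_matrix, x, y):
    """Apply a 2x3 or 3x3 warp to a pixel grid that is x wide and y tall."""
    warp = np.asarray(warp_matrix, dtype=np.float64)
    if warp.shape == (2, 3):
        # Promote an affine matrix to homogeneous 3x3 form.
        warp = np.vstack([warp, [0.0, 0.0, 1.0]])
    xx, yy = np.meshgrid(np.arange(x), np.arange(y))             # each of shape (y, x)
    coords = np.stack([xx.ravel(), yy.ravel(), np.ones(x * y)])  # (3, y*x) homogeneous coordinates
    mapped = warp @ coords
    mapped = mapped[:2] / mapped[2]                              # divide out the projective scale
    return mapped.T.reshape(y, x, 2)

# Pure translation: shift 3 pixels in x and 1 pixel in y.
warp = np.array([[1.0, 0.0, 3.0],
                 [0.0, 1.0, 1.0]])
remap = warp_to_remap_sketch(warp, x=5, y=4)
```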

roicat.helpers.yaml_load(filepath: str, mode: str = 'r', loader: object = <class 'yaml.loader.FullLoader'>) object[source]

Loads a YAML file. RH 2022

Parameters:
  • filepath (str) – Path to the YAML file to load.

  • mode (str) – Mode to open the file in. (Default is 'r')

  • loader (object) –

    The YAML loader to use.

    • yaml.FullLoader: Loads the full YAML language. Avoids arbitrary code execution. (Default for PyYAML 5.1+)

    • yaml.SafeLoader: Loads a subset of the YAML language, safely. This is recommended for loading untrusted input.

    • yaml.UnsafeLoader: The original Loader code that could be easily exploitable by untrusted data input.

    • yaml.BaseLoader: Only loads the most basic YAML. All scalars are loaded as strings.

    (Default is yaml.FullLoader)

Returns:

loaded_obj (object):

The object loaded from the YAML file.

Return type:

(object)

roicat.helpers.yaml_save(obj: object, filepath: str, indent: int = 4, mode: str = 'w', mkdir: bool = False, allow_overwrite: bool = True) None[source]

Saves an object to a YAML file using the yaml.dump method. RH 2022

Parameters:
  • obj (object) – The object to be saved.

  • filepath (str) – Path to save the object to.

  • indent (int) – The number of spaces for indentation in the saved YAML file. (Default is 4)

  • mode (str) –

    Mode to open the file in.

    • 'w': write (default)

    • 'wb': write binary

    • 'ab': append binary

    • 'xb': exclusive write binary. Raises FileExistsError if file already exists.

    (Default is 'w')

  • mkdir (bool) – If True, creates the parent directory if it does not exist. (Default is False)

  • allow_overwrite (bool) – If True, allows overwriting of existing files. (Default is True)

roicat.util module

class roicat.util.ROICaT_Module[source]

Bases: object

Super class for ROICaT modules. RH 2023

_system_info

System information.

Type:

object

load(path_load: str | Path) None[source]

Loads attributes from a Data_roicat object from a pickle file.

Parameters:

path_load (Union[str, Path]) – Path to the pickle file.

Note

After calling this method, the attributes of this object are updated with those loaded from the pickle file. If the object in the pickle file is a dictionary, the object’s attributes are set directly from the dictionary. Otherwise, if the object in the pickle file has an ‘import_from_dict’ method, it is used to load attributes. If it does not, the attributes are loaded directly from the object’s __dict__ attribute.

Example

obj = Data_roicat()
obj.load('/path/to/pickle/file')
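
The branching described in the note can be sketched with the standard library. This is a simplified illustration: it covers only the dictionary and __dict__ cases and omits the ‘import_from_dict’ branch, and load_attributes_sketch and Holder are hypothetical names. As with any pickle, only load files from trusted sources.

```python
import os
import pickle
import tempfile

def load_attributes_sketch(target, path_load):
    """Copy attributes from a pickled dict (or pickled object) onto `target`."""
    with open(path_load, "rb") as f:
        loaded = pickle.load(f)
    # A dictionary is used directly; any other object contributes its __dict__.
    attrs = loaded if isinstance(loaded, dict) else vars(loaded)
    target.__dict__.update(attrs)
    return target

class Holder:
    """Empty container standing in for a module instance."""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.pkl")
    with open(path, "wb") as f:
        pickle.dump({"n_sessions": 3, "name": "demo"}, f)
    obj = load_attributes_sketch(Holder(), path)
```
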
save(path_save: str | Path, save_as_serializable_dict: bool = False, allow_overwrite: bool = False) None[source]

Saves Data_roicat object to pickle file.

Parameters:
  • path_save (Union[str, pathlib.Path]) – Path to save pickle file.

  • save_as_serializable_dict (bool) – An archival-type format that is easy to load data from, but typically cannot be used to re-instantiate the object. If True, save the object as a serializable dictionary. If False, save the object as a Data_roicat object. (Default is False)

  • allow_overwrite (bool) – If True, allow overwriting of existing file. (Default is False)

property serializable_dict: Dict[str, Any]

Returns a serializable dictionary that can be saved to disk. This method goes through all items in self.__dict__ and checks if they are serializable. If they are, add them to a dictionary to be returned.

Returns:

serializable_dict (Dict[str, Any]):

Dictionary containing serializable items.

Return type:

(Dict[str, Any])

roicat.util.check_dataStructure__list_ofListOrArray_ofDtype(lolod: ~typing.List[~typing.List[int | float]] | ~typing.List[~numpy.ndarray], dtype: ~typing.Type = <class 'numpy.int64'>, fix: bool = True, verbose: bool = True) List[List[int | float]] | List[ndarray][source]

Verifies and optionally corrects the data structure of ‘lolod’ (list of list of dtype).

The structure should be a list of lists of dtypes or a list of numpy arrays of dtypes.

Parameters:
  • lolod (Union[List[List[Union[int, float]]], List[np.ndarray]]) –

    • The data structure to check. It should be a list of lists of dtypes or a list of numpy arrays of dtypes.

  • dtype (Type) –

    • The expected dtype of the elements in ‘lolod’. (Default is np.int64)

  • fix (bool) –

    • If True, attempts to correct the data structure if it is not as expected. The corrections are as follows:

      • If ‘lolod’ is an array, it will be cast to [lolod]

      • If ‘lolod’ is a numpy object, it will be cast to [np.array(lolod, dtype=dtype)]

      • If ‘lolod’ is a list of lists of numbers (int or float), it will be cast to [np.array(lod, dtype=dtype) for lod in lolod]

      • If ‘lolod’ is a list of arrays of wrong dtype, it will be cast to [np.array(lod, dtype=dtype) for lod in lolod]

    • If False, raises an error if the structure is not as expected. (Default is True)

  • verbose (bool) –

    • If True, prints warnings when the structure is not as expected and is corrected. (Default is True)

Returns:

lolod (Union[List[List[Union[int, float]]], List[np.ndarray]]):

The verified or corrected data structure.

Return type:

(Union[List[List[Union[int, float]]], List[np.ndarray]])

roicat.util.discard_UCIDs_with_fewer_matches(ucids: List[List[int] | ndarray], n_sesh_thresh: int | str = 'all', verbose: bool = True) List[List[int] | ndarray][source]

Discards UCIDs that do not appear in at least n_sesh_thresh sessions. If n_sesh_thresh='all', then only UCIDs that appear in all sessions are kept.

Parameters:
  • ucids (List[Union[List[int], np.ndarray]]) – List of lists of UCIDs for each session.

  • n_sesh_thresh (Union[int, str]) – Number of sessions that a UCID must appear in to be kept. If 'all', then only UCIDs that appear in all sessions are kept. (Default is 'all')

  • verbose (bool) – If True, print verbose output. (Default is True)

Returns:

ucids_out (List[Union[List[int], np.ndarray]]):

List of lists of UCIDs with UCIDs that do not appear in at least n_sesh_thresh sessions set to -1.

Return type:

(List[Union[List[int], np.ndarray]])
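
The thresholding rule can be sketched in pure Python by counting, for each UCID, the number of sessions in which it occurs. discard_rare_ucids_sketch is a hypothetical name, and the sketch works on plain lists only.

```python
from collections import Counter

def discard_rare_ucids_sketch(ucids, n_sesh_thresh="all"):
    """Set UCIDs seen in fewer than n_sesh_thresh sessions to -1."""
    thresh = len(ucids) if n_sesh_thresh == "all" else n_sesh_thresh
    # Count sessions (not ROIs) per UCID; set() avoids double counting within a session.
    n_sessions = Counter(u for sesh in ucids for u in set(sesh) if u != -1)
    return [[u if n_sessions.get(u, 0) >= thresh else -1 for u in sesh] for sesh in ucids]

ucids = [[0, 1, 2], [1, 2, 3], [2, -1, 3]]
kept_all = discard_rare_ucids_sketch(ucids)      # only UCID 2 appears in all 3 sessions
kept_two = discard_rare_ucids_sketch(ucids, 2)   # UCIDs 1, 2 and 3 appear in at least 2
```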

roicat.util.get_default_parameters(pipeline='tracking', path_defaults=None)[source]

This function returns a dictionary of parameters that can be used to run different pipelines. RH 2023

Parameters:
  • pipeline (str) –

    The name of the pipeline to use. Options:

    • 'tracking': Tracking pipeline.

    • 'classification_inference': Classification inference pipeline (TODO).

    • 'classification_training': Classification training pipeline (TODO).

    • 'model_training': Model training pipeline (TODO).

  • path_defaults (str) – A path to a yaml file containing a parameters dictionary. The parameters from the file will be loaded as is. If None, the default parameters will be used.

Returns:

params (dict):

A dictionary containing the default parameters.

Return type:

(dict)

roicat.util.get_roicat_version() str[source]

Retrieves the version of the roicat package.

Returns:

version (str):

The version of the roicat package.

Return type:

(str)

roicat.util.labels_to_labelsBySession(labels, n_roi_bySession)[source]

Converts a list of labels to a list of lists of labels by session. RH 2024

Parameters:
  • labels (list or np.ndarray) – List of labels.

  • n_roi_bySession (list or np.ndarray) – Number of ROIs by session.

Returns:

List of lists of labels by session.

Return type:

(list)
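
A minimal sketch of the splitting step, assuming that the flat labels are concatenated in session order and that the ROI counts sum to the number of labels. split_labels_by_session_sketch is a hypothetical name.

```python
def split_labels_by_session_sketch(labels, n_roi_by_session):
    """Cut a flat label sequence into consecutive chunks, one per session."""
    assert sum(n_roi_by_session) == len(labels), "ROI counts must add up to the number of labels"
    out, start = [], 0
    for n in n_roi_by_session:
        out.append(list(labels[start:start + n]))
        start += n
    return out

labels_by_session = split_labels_by_session_sketch([5, 5, 2, -1, 2, 7], [2, 3, 1])
print(labels_by_session)  # [[5, 5], [2, -1, 2], [7]]
```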

roicat.util.make_session_bool(n_roi: ndarray) ndarray[source]

Generates a boolean array indicating which session each ROI (region of interest) belongs to, from an array of ROI counts per session.

Parameters:

n_roi (np.ndarray) – Array representing the number of ROIs per session. shape: (n_sessions,)

Returns:

session_bool (np.ndarray):

Boolean array of shape (n_roi_total, n_session) where each column represents a session and each row corresponds to an ROI.

Return type:

(np.ndarray)

Example

n_roi = np.array([3, 4, 2])
session_bool = make_session_bool(n_roi)
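
The structure of the output can be reproduced with a few NumPy calls. This is a sketch rather than the library's code, and session_bool_sketch is a hypothetical name.

```python
import numpy as np

def session_bool_sketch(n_roi):
    """Build a (n_roi_total, n_sessions) boolean membership array from ROI counts."""
    n_roi = np.asarray(n_roi)
    # Session index of every ROI, e.g. [0, 0, 0, 1, 1, 1, 1, 2, 2] for counts [3, 4, 2].
    session_idx = np.repeat(np.arange(len(n_roi)), n_roi)
    # Comparing against each session number gives one boolean column per session.
    return session_idx[:, None] == np.arange(len(n_roi))[None, :]

sb = session_bool_sketch([3, 4, 2])
print(sb.shape)  # (9, 3)
```
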
roicat.util.mask_UCIDs_by_label(ucids: List[List[int] | ndarray], labels: List[int] | ndarray) List[List[int] | ndarray][source]

Sets labels in the UCIDs to -1 if they are not present in the labels array.

RH 2024

Parameters:
  • ucids (List[Union[List[int], np.ndarray]]) –

    List of lists of UCIDs for each session.

    Shape outer list: (n_sessions,)

    Shape inner list: (n_roi_in_session,)

  • labels (Union[List[int], np.ndarray]) – Array of labels to keep. All other labels are set to -1. Shape: (n_labels,)

Returns:

ucids_out (List[Union[List[int], np.ndarray]]):

Masked list of lists of UCIDs. Elements that are not in the labels array are set to -1 in each session.

Return type:

(List[Union[List[int], np.ndarray]])

Example


ucids = [[1, 2, 3], [2, -1, 4], [3, 0, 5]]
labels = [2, 3]
ucids_out = mask_UCIDs_by_label(ucids, labels)
# ucids_out = [[-1, 2, 3], [2, -1, -1], [3, -1, -1]]

roicat.util.mask_UCIDs_with_iscell(ucids: List[List[int] | ndarray], iscell: List[List[bool] | ndarray]) List[List[int] | ndarray][source]

Masks the UCIDs with the iscell array. If iscell is False, then the UCID is set to -1.

Parameters:
  • ucids (List[Union[List[int], np.ndarray]]) –

    List of lists of UCIDs for each session.

    Shape outer list: (n_sessions,)

    Shape inner list: (n_roi_in_session,)

  • iscell (List[Union[List[bool], np.ndarray]]) –

    List of lists of boolean indicators for each UCID.

    True means that ROI is a cell, False means that ROI is not a cell.

    Shape outer list: (n_sessions,)

    Shape inner list: (n_roi_in_session,)

Returns:

ucids_out (List[Union[List[int], np.ndarray]]):

Masked list of lists of UCIDs. Elements that are not cells are set to -1 in each session.

Return type:

(List[Union[List[int], np.ndarray]])
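
The masking is element-wise within each session, as this pure-Python sketch shows. mask_with_iscell_sketch is a hypothetical name, and plain lists are assumed.

```python
def mask_with_iscell_sketch(ucids, iscell):
    """Replace the UCID of every ROI flagged as not-a-cell with -1."""
    return [
        [u if is_cell else -1 for u, is_cell in zip(ucids_sesh, iscell_sesh)]
        for ucids_sesh, iscell_sesh in zip(ucids, iscell)
    ]

ucids = [[0, 1, 2], [1, 2, 3]]
iscell = [[True, False, True], [True, True, False]]
masked = mask_with_iscell_sketch(ucids, iscell)
print(masked)  # [[0, -1, 2], [1, 2, -1]]
```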

roicat.util.match_arrays_with_ucids(arrays: ndarray | List[ndarray], ucids: List[ndarray] | List[List[int]], squeeze: bool = False, force_sparse: bool = False, prog_bar: bool = False) List[ndarray | lil_matrix][source]

Matches the indices of the arrays using the UCIDs. Array indices with UCIDs corresponding to -1 are set to np.nan. This is useful for aligning Fluorescence and Spiking data across sessions using UCIDs.

Parameters:
  • arrays (Union[np.ndarray, List[np.ndarray]]) – List of numpy arrays for each session. Matching is done along the first dimension.

  • ucids (Union[List[np.ndarray], List[List[int]]]) – List of lists of UCIDs for each session.

  • squeeze (bool) – If True, then UCIDs are squeezed to be contiguous integers. (Default is False)

  • force_sparse (bool) – If True, then the output will be a list of sparse matrices. (Default is False)

  • prog_bar (bool) – If True, then a progress bar will be displayed. (Default is False)

Returns:

arrays_out (List[Union[np.ndarray, scipy.sparse.lil_matrix]]):

List of arrays for each session. Array indices with UCIDs corresponding to -1 are set to np.nan. Each array will have shape: (n_ucids if squeeze==True OR max_ucid if squeeze==False, *array.shape[1:]). UCIDs will be used as the index of the first dimension.

Return type:

(List[Union[np.ndarray, scipy.sparse.lil_matrix]])

roicat.util.match_arrays_with_ucids_inverse(arrays: ndarray | List[ndarray], ucids: List[ndarray] | List[List[int]], unsqueeze: bool = True) List[ndarray | lil_matrix][source]

Inverts the matching of the indices of the arrays using the UCIDs. Arrays should have indices that correspond to the UCID values. The return will be a list of arrays with indices that correspond to the original indices of the arrays / ucids. Essentially, this function undoes the matching done by match_arrays_with_ucids().

Parameters:
  • arrays (Union[np.ndarray, List[np.ndarray]]) – List of numpy arrays for each session.

  • ucids (Union[List[np.ndarray], List[List[int]]]) – List of lists of UCIDs for each session.

  • unsqueeze (bool) – If True, then this algorithm assumes that the arrays were squeezed to remove unused UCIDs. This corresponds to and should match the argument squeeze used in match_arrays_with_ucids().

Returns:

arrays_out (List[Union[np.ndarray, scipy.sparse.lil_matrix]]):

List of arrays with indices that correspond to the original indices of the arrays / ucids.

Return type:

(List[Union[np.ndarray, scipy.sparse.lil_matrix]])

roicat.util.squeeze_UCID_labels(ucids: List[List[int] | ndarray]) List[List[int] | ndarray][source]

Squeezes the UCID labels. Finds all the unique UCIDs across all sessions, then removes gaps in the UCID labels by mapping the unique UCIDs to new values. Output UCIDs are contiguous integers starting at 0; elements with UCID=-1 are left unchanged.

Parameters:

ucids (List[Union[List[int], np.ndarray]]) – List of lists of UCIDs for each session.

Returns:

ucids_out (List[Union[List[int], np.ndarray]]):

List of lists of UCIDs remapped to contiguous integers starting at 0. Elements with UCID=-1 remain -1.

Return type:

(List[Union[List[int], np.ndarray]])
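
A pure-Python sketch of the remapping, assuming that the new labels follow the sorted order of the original UCIDs. squeeze_ucids_sketch is a hypothetical name.

```python
def squeeze_ucids_sketch(ucids):
    """Map the unique UCIDs across all sessions onto 0..n-1, leaving -1 untouched."""
    unique = sorted({u for sesh in ucids for u in sesh if u != -1})
    mapping = {u: i for i, u in enumerate(unique)}
    mapping[-1] = -1  # unassigned ROIs stay unassigned
    return [[mapping[u] for u in sesh] for sesh in ucids]

squeezed_ucids = squeeze_ucids_sketch([[4, 9, -1], [9, 12, 4]])
print(squeezed_ucids)  # [[0, 1, -1], [1, 2, 0]]
```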

roicat.util.system_info(verbose: bool = False) Dict[source]

Checks and prints the versions of various important software packages. RH 2022

Parameters:

verbose (bool) – Whether to print the software versions. (Default is False)

Returns:

versions (Dict):

Dictionary containing the versions of various software packages.

Return type:

(Dict)

roicat.visualization module

roicat.visualization.compute_colored_FOV(spatialFootprints: List[csr_matrix], FOV_height: int, FOV_width: int, labels: List[ndarray] | ndarray, cmap: str | object = 'random', alphas_labels: ndarray | None = None, alphas_sf: List[ndarray] | ndarray | None = None) List[ndarray][source]

Computes a set of images of fields of view (FOV) of spatial footprints, colored by the predicted class. RH 2023

Parameters:
  • spatialFootprints (List[scipy.sparse.csr_matrix]) – Each element is all the spatial footprints for a given session.

  • FOV_height (int) – Height of the field of view.

  • FOV_width (int) – Width of the field of view.

  • labels (Union[List[np.ndarray], np.ndarray]) – Label (will be a unique color) for each spatial footprint. Each element is all the labels for a given session. Can either be a list of integer labels for each session, or a single array with all the labels concatenated.

  • cmap (Union[str, object]) – Colormap to use for the labels. If ‘random’, then a random colormap is generated. Else, this is passed to matplotlib.colors.ListedColormap. (Default is ‘random’)

  • alphas_labels (Optional[np.ndarray]) – Alpha value for each label. shape: (n_labels,) which is the same as the number of unique labels len(np.unique(labels)). (Default is None)

  • alphas_sf (Optional[Union[List[np.ndarray], np.ndarray]]) – Alpha value for each spatial footprint. Can either be a list of alphas for each session, or a single array with all the alphas concatenated. (Default is None)

Returns:

rois_c_bySession_FOV (List[np.ndarray]):

List of images of fields of view (FOV) of spatial footprints, colored by the predicted class.

Return type:

(List[np.ndarray])

roicat.visualization.crop_cluster_ims(ims: ndarray) ndarray[source]

Crops the images to the smallest rectangle containing all non-zero pixels. RH 2022

Parameters:

ims (np.ndarray) – Images to crop. (shape: (n, H, W))

Returns:

cropped_ims (np.ndarray):

Cropped images. (shape: (n, H’, W’))

Return type:

(np.ndarray)
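
The cropping can be sketched with NumPy as below. The sketch assumes a single bounding box shared by the whole stack and at least one non-zero pixel; crop_to_nonzero_sketch is a hypothetical name.

```python
import numpy as np

def crop_to_nonzero_sketch(ims):
    """Crop a stack of shape (n, H, W) to the bounding box of all non-zero pixels."""
    ims = np.asarray(ims)
    occupied = (ims != 0).any(axis=0)            # (H, W): True where any image is non-zero
    rows = np.flatnonzero(occupied.any(axis=1))  # rows that contain a non-zero pixel
    cols = np.flatnonzero(occupied.any(axis=0))  # columns that contain a non-zero pixel
    return ims[:, rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

ims = np.zeros((2, 6, 8))
ims[0, 2, 3] = 1.0
ims[1, 4, 5] = 2.0
cropped = crop_to_nonzero_sketch(ims)
print(cropped.shape)  # (2, 3, 3)
```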

roicat.visualization.display_cropped_cluster_ims(spatialFootprints: List[ndarray], labels: ndarray, FOV_height: int = 512, FOV_width: int = 1024, n_labels_to_display: int = 100) None[source]

Displays the cropped cluster images. RH 2023

Parameters:
  • spatialFootprints (List[np.ndarray]) – List of spatial footprints. Each footprint is a 2D array representing one region. (shape of each footprint: (H, W))

  • labels (np.ndarray) – Labels for each region of interest (ROI). (shape: (n,))

  • FOV_height (int) – Height of the field of view. (Default is 512)

  • FOV_width (int) – Width of the field of view. (Default is 1024)

  • n_labels_to_display (int) – Number of labels to display. (Default is 100)

roicat.visualization.display_labeled_ROIs(images: ndarray, labels: ndarray | Dict[str, Any], max_images_per_label: int = 10, figsize: Tuple[int, int] = (10, 3), fontsize: int = 25, shuffle: bool = True) None[source]

Displays a grid of images, each row corresponding to a label, and each image is a randomly selected image from that label. RH 2023

Parameters:
  • images (np.ndarray) – Array of images. Shape: (num_images, height, width) or (num_images, height, width, num_channels)

  • labels (Union[np.ndarray, Dict[str, Any]]) – If dict, it must contain keys ‘index’ and ‘label’. If ndarray, it must be a 1D array of labels.

  • max_images_per_label (int) – Maximum number of images to display per label. (Default is 10)

  • figsize (Tuple[int, int]) – Size of the figure. (Default is (10, 3))

  • fontsize (int) – Font size of the labels. (Default is 25)

  • shuffle (bool) – If True, the order of the images will be shuffled. (Default is True)

roicat.visualization.display_toggle_image_stack(images: List[ndarray] | List[Tensor], image_size: Tuple[int, int] | int | float | None = None, clim: Tuple[float, float] | None = None, interpolation: str = 'nearest') None[source]

Displays images in a slider using Jupyter Notebook. RH 2023

Parameters:
  • images (Union[List[np.ndarray], List[torch.Tensor]]) – List of images as numpy arrays or PyTorch tensors.

  • image_size (Optional[Union[Tuple[int, int], int, float]]) –

    Tuple of (width, height) for resizing images.

    If None, images are not resized.

    If a single integer or float is provided, the images are resized by that factor.

    (Default is None)

  • clim (Optional[Tuple[float, float]]) – Tuple of (min, max) values for scaling pixel intensities. If None, min and max values are computed from the images and used as bounds for scaling. (Default is None)

  • interpolation (str) – String specifying the interpolation method for resizing. Options are ‘nearest’, ‘box’, ‘bilinear’, ‘hamming’, ‘bicubic’, ‘lanczos’. Uses the Image.Resampling.* methods from PIL. (Default is ‘nearest’)

roicat.visualization.get_spread_out_points(data: ndarray, n_ims: int = 1000, dist_im_to_point: float = 0.3, border_frac: float = 0.05, device: str = 'cpu') ndarray[source]

Given a set of points, returns the indices of a subset of points that are spread out. Intended to be used to overlay images on a scatter plot of points. RH 2023

Parameters:
  • data (np.ndarray) – Array containing the points to be spread out. Shape: (N, 2)

  • n_ims (int) – Number of indices to return corresponding to the number of images to be displayed. (Default is 1000)

  • dist_im_to_point (float) – Minimum distance between an image and its nearest point. Images with a minimum distance to a point greater than this value will be discarded. (Default is 0.3)

  • border_frac (float) – Fraction of the range of the data to add as a border around the points. (Default is 0.05)

  • device (str) – Device to use for torch operations. (Default is ‘cpu’)

Returns:

idx_images_overlay (np.ndarray):

Array containing the indices of the points to overlay images on. Shape: (n_ims,)

Return type:

(np.ndarray)

roicat.visualization.plot_confusion_matrix(confusion_matrix, class_names: List[str] = None, figsize: Tuple[int, int] = (4, 4), n_decimals: int = 2)[source]

Plots a confusion matrix using seaborn. RH 2023

Parameters:
  • confusion_matrix (np.ndarray) – Array containing the confusion matrix. Shape: (num_classes, num_classes)

  • class_names (list) – List of class names. Length: num_classes. If None, the class names will be the indices of the confusion matrix. (Default is None)

  • figsize (Tuple[int, int]) – Size of the figure. (Default is (4, 4))

  • n_decimals (int) – Number of decimals to round the confusion matrix to. (Default is 2)

roicat.visualization.select_region_scatterPlot(data: ndarray, images_overlay: ndarray | None = None, idx_images_overlay: ndarray | None = None, size_images_overlay: float | None = None, frac_overlap_allowed: float = 0.5, image_overlay_raster_size: Tuple[int, int] | None = None, path: str | None = None, figsize: Tuple[int, int] = (300, 300), alpha_points: float = 0.5, size_points: float = 1, color_points: str | List[str] = 'k') Tuple[Callable, object, str][source]

Selects a region of a scatter plot and returns the indices of the points in that region.

Parameters:
  • data (np.ndarray) – Input data to create a scatterplot. The shape must be (n_samples, 2).

  • images_overlay (np.ndarray, optional) – A 3D array of grayscale images or a 4D array of RGB images, where the first dimension is the number of images. (Default is None)

  • idx_images_overlay (np.ndarray, optional) – A vector of data indices corresponding to each image in images_overlay. The shape must be (n_images,). (Default is None)

  • size_images_overlay (float, optional) – Size of each overlay image. The unit is relative to each axis. This value scales the resolution of the overlay raster. (Default is None)

  • frac_overlap_allowed (float, optional) – Fraction of overlap allowed between the selected region and the overlay images. This is only used when size_images_overlay is None. (Default is 0.5)

  • image_overlay_raster_size (Tuple[int, int], optional) – Size of the rasterized image overlay in pixels. If None, the size will be set to figsize. (Default is None)

  • path (str, optional) – Temporary file path to save the selected indices. (Default is None)

  • figsize (Tuple[int, int], optional) – Size of the figure in pixels. (Default is (300, 300))

  • alpha_points (float, optional) – Alpha value of the scatter plot points. (Default is 0.5)

  • size_points (float, optional) – Size of the scatter plot points. (Default is 1)

  • color_points (Union[str, List[str]], optional) – Color of the scatter plot points. Single color only. (Default is 'k')

Returns:

tuple containing:
fn_get_indices (Callable):

Function that returns the indices of the selected points.

layout (object):

Holoviews layout object.

path_tempfile (str):

Path to the temporary file that saves the selected indices.

Return type:

(Tuple[Callable, object, str])

Example

fn_get_indices, layout, path_tempfile = select_region_scatterPlot(data)