uchrom.utils

uchrom.utils.cauchy_combination(pvals, weights=None, axis: int = -1)

Aggregate p-values along axis using the Cauchy combination test.

T = Σ_k w_k · tan((0.5 − p_k) · π)

T is asymptotically standard Cauchy under the null, so the combined p-value is 1 − Cauchy.cdf(T).

Accepts numpy arrays or torch tensors; the output matches the input backend. Numerically safe clamping keeps p away from {0, 1}.

Parameters:
  • pvals (ndarray | Tensor) – Per-test p-values. The reduction is over axis.

  • weights (ndarray | Tensor | None) – Non-negative weights. If given, broadcast along pvals; if None, uniform weights.

  • axis (int) – Axis to combine along.
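The statistic described above can be sketched for a single group of p-values in pure Python. This is an illustrative sketch, not the library's implementation: the name cauchy_combination_sketch, the clamp width eps, and the normalisation of the weights to sum to 1 are all assumptions, and the axis / backend handling of the real function is left out. It uses the closed form 1 − Cauchy.cdf(T) = 0.5 − arctan(T)/π for the standard Cauchy distribution.

```python
import math


def cauchy_combination_sketch(pvals, weights=None, eps=1e-15):
    """Combine one group of p-values with the Cauchy combination test.

    Hypothetical helper for illustration only; `eps` and the weight
    normalisation are assumptions, not documented behaviour.
    """
    if len(pvals) == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(pvals)  # uniform weights
    if len(weights) != len(pvals):
        raise ValueError("weights and pvals must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")

    # Assumption: weights are normalised to sum to 1, so that T is
    # standard Cauchy under the null.
    norm = [w / total for w in weights]

    # Assumption: clamp p into [eps, 1 - eps] so tan() stays finite.
    clamped = [min(max(p, eps), 1.0 - eps) for p in pvals]

    # T = sum_k w_k * tan((0.5 - p_k) * pi)
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(norm, clamped))

    # 1 - Cauchy.cdf(T) for the standard Cauchy distribution.
    return 0.5 - math.atan(t) / math.pi
```

With uniform weights, combining several copies of the same p-value returns that p-value, which is a quick sanity check on the formula.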

uchrom.utils.get_device(device: str = 'auto')

Resolve 'auto' | 'cpu' | 'cuda' | 'mps' into a torch.device.

Priority for auto: CUDA > MPS > CPU. Mirrors the convention used by uchrom.recon.bulk.mds.torch_mds so the whole project shares one device-selection rule.
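The priority rule can be illustrated without importing PyTorch. The sketch below is hypothetical: resolve_device_name and its two availability flags are names introduced here, and it returns a device name rather than a torch.device. In a PyTorch setting the flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available(). How the real function reacts to an explicitly requested but unavailable device is not stated in this page, so the sketch simply passes explicit requests through.

```python
def resolve_device_name(requested="auto", cuda_available=False, mps_available=False):
    """Resolve a device request to a concrete name (illustrative sketch).

    Hypothetical helper: the availability flags are passed in so the
    priority logic can be exercised without any GPU library installed.
    """
    allowed = ("auto", "cpu", "cuda", "mps")
    if requested not in allowed:
        raise ValueError(f"device must be one of {allowed}, got {requested!r}")

    # Explicit requests are passed through unchanged (an assumption).
    if requested != "auto":
        return requested

    # Priority for 'auto': CUDA > MPS > CPU.
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```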

uchrom.utils.lowess_log_log(x: ndarray, y: ndarray, frac: float = 0.1, eps: float = 1e-12) → ndarray

LOWESS regression of log(y) ~ log(x) evaluated at the input x.

Used throughout the ArcFISH pipeline to get a “distance-stratified” expected value — e.g. expected per-pair variance as a smooth function of 1D genomic separation.

NaNs / non-positive values are ignored during fitting and yield NaN in the output for those entries. Returns predictions on the linear scale (exp applied).

Parameters:
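  • x (1D ndarray) – Predictor values (the x in log(y) ~ log(x)). Must have the same length as y.

  • y (1D ndarray) – Response values. Must have the same length as x.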

  • frac (float) – LOWESS span (0–1). Default 0.1 matches ArcFISH.

  • eps (float) – Numerical floor to keep log defined.
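The masking, log transform, and back-transform described above can be sketched in pure Python. This is a simplified stand-in, not the library's code: it swaps the statsmodels LOWESS for a plain tricube-weighted local linear fit with no robustness iterations, and the name lowess_log_log_sketch, the neighbourhood-size rule, and the use of eps as a floor inside log are assumptions made for illustration.

```python
import math


def _positive_finite(v):
    """True for finite, strictly positive numbers (NaN and inf fail)."""
    return isinstance(v, (int, float)) and math.isfinite(v) and v > 0


def lowess_log_log_sketch(x, y, frac=0.1, eps=1e-12):
    """Local linear fit of log(y) ~ log(x), evaluated at the input x.

    Hypothetical, simplified smoother: tricube weights over the nearest
    neighbours, no robustness iterations. Invalid entries yield NaN.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    out = [math.nan] * len(x)

    # Keep only pairs where both values are finite and positive.
    valid = [i for i in range(len(x))
             if _positive_finite(x[i]) and _positive_finite(y[i])]
    if not valid:
        return out

    # Assumption: eps acts as a floor before taking the log.
    lx = [math.log(max(x[i], eps)) for i in valid]
    ly = [math.log(max(y[i], eps)) for i in valid]

    m = len(valid)
    # Assumption: at least 3 neighbours, at most all valid points.
    k = min(m, max(3, math.ceil(frac * m)))

    for j, i in enumerate(valid):
        x0 = lx[j]
        dist = [abs(v - x0) for v in lx]
        h = sorted(dist)[k - 1]  # bandwidth: k-th smallest distance
        if h > 0:
            w = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dist]
        else:
            w = [1.0 if d == 0 else 0.0 for d in dist]

        # Weighted least-squares line through the neighbourhood.
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, lx)) / sw
        my = sum(wi * yi for wi, yi in zip(w, ly)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, lx))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, lx, ly))
        slope = sxy / sxx if sxx > 0 else 0.0

        # Back-transform to the linear scale.
        out[i] = math.exp(my + slope * (x0 - mx))
    return out
```

Because an exact power law is a straight line in log-log space, feeding the sketch y = 2 · x^1.5 should return the inputs unchanged up to rounding, which makes a convenient check.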

Shared statistical / device primitives used by multiple analysis modules.

The Cauchy combination test (ACAT) and the LOWESS-on-log-log helper are the two fundamental building blocks of the ArcFISH-style pipeline (uchrom.fea.arc, uchrom.strc.loop, .tad, .comp).

Tensor-heavy code is written against PyTorch so it runs on CUDA / MPS / CPU interchangeably; LOWESS stays on CPU via statsmodels because it is a non-parallel kernel smoother on ~n_bins² points (small).

uchrom.utils.stats.cauchy_combination(pvals, weights=None, axis: int = -1)

Aggregate p-values along axis using the Cauchy combination test.

T = Σ_k w_k · tan((0.5 − p_k) · π)

T is asymptotically standard Cauchy under the null, so the combined p-value is 1 − Cauchy.cdf(T).

Accepts numpy arrays or torch tensors; the output matches the input backend. Numerically safe clamping keeps p away from {0, 1}.

Parameters:
  • pvals (ndarray | Tensor) – Per-test p-values. The reduction is over axis.

  • weights (ndarray | Tensor | None) – Non-negative weights. If given, broadcast along pvals; if None, uniform weights.

  • axis (int) – Axis to combine along.

uchrom.utils.stats.default_float_dtype(device)

Pick a floating dtype that the given device supports well.

MPS only supports float32; elsewhere we default to float64 for the better numeric behaviour during LOWESS-normalised F-tests.
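The selection rule amounts to a single branch on the device type. The sketch below is hypothetical: default_float_dtype_name works on plain strings rather than torch objects, and round_to_float32 is a helper added here only to give a generic illustration of the precision gap between the two dtypes (it is not specific to the F-tests mentioned above).

```python
import struct


def default_float_dtype_name(device_type):
    """Return 'float32' for MPS and 'float64' otherwise (illustrative sketch)."""
    return "float32" if device_type == "mps" else "float64"


def round_to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


# A small offset from 1.0 survives in float64 but is lost in float32.
offset_value = 1.0 + 1e-8
survives_in_float64 = offset_value != 1.0
survives_in_float32 = round_to_float32(offset_value) != 1.0
```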

uchrom.utils.stats.get_device(device: str = 'auto')

Resolve 'auto' | 'cpu' | 'cuda' | 'mps' into a torch.device.

Priority for auto: CUDA > MPS > CPU. Mirrors the convention used by uchrom.recon.bulk.mds.torch_mds so the whole project shares one device-selection rule.

uchrom.utils.stats.lowess_log_log(x: ndarray, y: ndarray, frac: float = 0.1, eps: float = 1e-12) → ndarray

LOWESS regression of log(y) ~ log(x) evaluated at the input x.

Used throughout the ArcFISH pipeline to get a “distance-stratified” expected value — e.g. expected per-pair variance as a smooth function of 1D genomic separation.

NaNs / non-positive values are ignored during fitting and yield NaN in the output for those entries. Returns predictions on the linear scale (exp applied).

Parameters:
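  • x (1D ndarray) – Predictor values (the x in log(y) ~ log(x)). Must have the same length as y.

  • y (1D ndarray) – Response values. Must have the same length as x.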

  • frac (float) – LOWESS span (0–1). Default 0.1 matches ArcFISH.

  • eps (float) – Numerical floor to keep log defined.