uchrom.fea

Geometric / statistical features over chromatin traces.

uchrom.fea.add_annotation_features(cdata, gtf: str | Path | DataFrame, *, features: Sequence[str] | None = None, prefix: str = 'gtf', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False, promoter_window: tuple[int, int] = (-2000, 500), hash_source: bool = False) → DataFrame

Compute annotation features for a ChromData object.

uchrom.fea.add_peak_features(cdata, peaks: DataFrame, *, name: str = 'peak', prefix: str = 'peak', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False, source_peak_key: str | None = None) → DataFrame

Project an existing peak table into interval and spot features.

uchrom.fea.add_sequence_features(cdata, fasta: str | Path, *, features: Sequence[str] | None = None, prefix: str = 'seq', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False, hash_source: bool = False, g4_pattern: str = 'G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}') → DataFrame

Compute sequence features for a ChromData object.

The canonical interval table is returned. When store=True it is also merged into cdata.results[result_key]. When project=True feature columns are projected onto cdata.tracks using prefix.

uchrom.fea.aggregate_track_by_interval(cdata, track: str, *, agg: str = 'mean', signal_col: str = 'signal') → DataFrame

Aggregate a spot-aligned track to unique genomic intervals.

Multiple spots can share the same genomic bin across cells or traces. Peak callers should operate on the genomic bin signal rather than individual spot rows, so this helper collapses duplicate intervals first.
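The collapsing step can be pictured with a minimal pandas sketch. It assumes a flat spot table with chrom, start, end, and one signal column; the helper name collapse_to_intervals is hypothetical and is not part of uchrom.

```python
import pandas as pd

def collapse_to_intervals(spots, signal_col="signal", agg="mean"):
    """Collapse spot rows that share a genomic bin into one row per interval."""
    keys = ["chrom", "start", "end"]
    # sort=False keeps intervals in first-occurrence order.
    grouped = spots.groupby(keys, sort=False, as_index=False)
    return grouped[[signal_col]].agg(agg)

# Two bins, each observed in two traces.
spots = pd.DataFrame({
    "chrom": ["chr1", "chr1", "chr1", "chr1"],
    "start": [0, 0, 1000, 1000],
    "end": [1000, 1000, 2000, 2000],
    "signal": [1.0, 3.0, 2.0, 6.0],
})
intervals = collapse_to_intervals(spots)
```

A peak caller would then see one signal value per genomic bin rather than one per spot row.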

uchrom.fea.aggregate_tracks_by_interval(cdata, tracks: Sequence[str], *, agg: str = 'mean') → DataFrame

Aggregate multiple spot-aligned tracks to unique genomic intervals.

uchrom.fea.append_feature_registry_entry(cdata, entry: Mapping[str, Any], *, key: str = 'feature_registry') → dict[str, Any]

Append a provenance record to cdata.uns[key].

The registry is stored as a list of JSON-like dicts so it round-trips through the existing uns serializer.
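A registry of this shape could look roughly like the sketch below. It uses a plain dict as a stand-in for cdata.uns; the helper name and the entry fields (name, source) are illustrative assumptions, not the library's schema.

```python
import json

def append_registry_entry(uns, entry, key="feature_registry"):
    """Append a JSON-safe copy of ``entry`` to the list stored at ``uns[key]``."""
    # Round-tripping through json rejects non-serializable values early and
    # detaches the stored record from the caller's mapping.
    record = json.loads(json.dumps(dict(entry)))
    uns.setdefault(key, []).append(record)
    return record

uns = {}  # stand-in for cdata.uns (assumption)
append_registry_entry(uns, {"name": "seq.gc_fraction", "source": "genome.fa"})
append_registry_entry(uns, {"name": "gtf.gene_count", "source": "genes.gtf"})
```

Because every record is built from JSON-compatible values, the whole registry survives a serialize/deserialize cycle unchanged.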

uchrom.fea.axis_variance_cube(cd, chrom: str, device: str = 'auto') → dict

Compute per-axis pairwise variance + sample-count cubes.

Returns a dict with var, count, mean (all (3, B, B), where B is the number of bins) plus bin_ids, n_traces, chrom, and, for downstream filter_normalize(), the full (3, T, B, B) pairwise diff tensor (T is the number of traces) on the selected torch device under key "diff".

uchrom.fea.axis_weight(cd, chrom: str | None = None, device: str = 'auto') → ndarray

Compute per-axis weights w ∝ 1 / median(trace_variance).

For each axis, every trace is centred at its own mean (bins with NaN excluded), and the median across traces of the per-trace variance is taken. The inverse of that median is the axis weight, normalised so that the three weights sum to 1. This is consistent with ArcFISH’s axis_weight routine.
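The weighting described above can be sketched in NumPy as follows. This is an illustration under assumptions, not the uchrom implementation: it takes a ready-made (n_traces, n_bins, 3) coordinate array rather than a ChromData object, runs on CPU only, and the function name is hypothetical.

```python
import numpy as np

def axis_weight_sketch(coords):
    """coords: (n_traces, n_bins, 3) array with NaN for missing spots."""
    # Centre each trace at its own per-axis mean, ignoring missing bins.
    centred = coords - np.nanmean(coords, axis=1, keepdims=True)
    # Per-trace, per-axis variance -> shape (n_traces, 3).
    trace_var = np.nanmean(centred ** 2, axis=1)
    # Median across traces, inverted, then normalised to sum to 1.
    inv = 1.0 / np.nanmedian(trace_var, axis=0)
    return inv / inv.sum()

# Synthetic traces whose z axis is twice as noisy as x and y.
rng = np.random.default_rng(0)
coords = rng.normal(scale=[1.0, 1.0, 2.0], size=(200, 20, 3))
coords[0, 3, :] = np.nan  # one missing spot
weights = axis_weight_sketch(coords)
```

The noisier z axis receives the smallest weight, which is the intended effect of inverse-variance weighting.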

uchrom.fea.call_macs_bdgpeaks_from_bedgraph(path: str | Path, *, signal_col: str = 'score', cutoff: float = 5.0, min_length: int = 200, max_gap: int = 30, name: str = 'MACS', name_prefix: str | None = None) → DataFrame

Read a bedGraph file and call MACS3 bdgpeakcall-compatible peaks.

uchrom.fea.call_macs_bdgpeaks_from_signal(signal_table: DataFrame, *, signal_col: str = 'score', cutoff: float = 5.0, min_length: int = 200, max_gap: int = 30, name: str = 'MACS', name_prefix: str | None = None) → DataFrame

Call peaks using a Python port of MACS3 bdgpeakcall.

This mirrors MACS3’s bedGraphTrackI.call_peaks behavior for score tracks: regions with values at or above cutoff are merged when the intervening gap is at most max_gap bases, peaks shorter than min_length bases are discarded, and the summit follows MACS3’s tie-breaking rule. The port follows MACS3’s BSD-licensed bdgpeakcall and narrowPeak semantics without vendoring the MACS3 package at runtime.
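The cutoff, gap-merge, and minimum-length rules can be illustrated with a small pure-Python sketch. It is a simplification written for this page, not the uchrom port: it works on (chrom, start, end, score) tuples, omits summit selection and peak scoring, and the function name is hypothetical.

```python
def call_threshold_peaks(rows, cutoff=5.0, min_length=200, max_gap=30):
    """rows: (chrom, start, end, score) tuples sorted by chrom and start."""
    peaks = []
    current = None  # [chrom, start, end] of the region being grown
    for chrom, start, end, score in rows:
        if score < cutoff:
            continue  # only intervals at or above the cutoff seed a peak
        if current and current[0] == chrom and start - current[2] <= max_gap:
            current[2] = end  # bridge a gap of at most max_gap bases
        else:
            if current:
                peaks.append(tuple(current))
            current = [chrom, start, end]
    if current:
        peaks.append(tuple(current))
    # Discard peaks shorter than min_length bases.
    return [p for p in peaks if p[2] - p[1] >= min_length]

rows = [
    ("chr1", 0, 100, 1.0),
    ("chr1", 100, 250, 6.0),
    ("chr1", 250, 270, 2.0),   # 20-base dip, bridged by max_gap=30
    ("chr1", 270, 400, 8.0),
    ("chr1", 400, 900, 0.5),
    ("chr1", 900, 1000, 9.0),  # isolated 100-base region, too short
]
peaks = call_threshold_peaks(rows)
```

Here the two high-scoring regions separated by a 20-base dip merge into one 300-base peak, while the isolated 100-base region is dropped by the length filter.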

uchrom.fea.call_macs_peaks_from_signal(signal_table: DataFrame, *, signal_col: str = 'signal', control_col: str | None = None, qvalue: float | None = 0.05, pvalue: float | None = None, genome_size: int | float | None = None, small_local_window_bins: int = 10, large_local_window_bins: int = 100, fragment_size_bins: int = 1, nolambda: bool = False, min_width_bins: int = 1, max_gap_bins: int = 0, signal_scale: float = 1.0, control_scale: float | None = None, negative: str = 'clip', name: str = 'peak') → DataFrame

Call peaks with U-Chrom’s legacy dynamic Poisson approximation.

Observed binned signal is tested against a dynamic local lambda using Poisson tail probabilities, p-values are Benjamini-Hochberg adjusted, and significant adjacent bins are merged into peak intervals. Use call_macs_bdgpeaks_from_signal for the MACS3 bdgpeakcall port.
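The statistical core, Poisson tail probabilities followed by Benjamini-Hochberg adjustment, can be sketched with the standard library alone. How the local lambda is estimated from the windows is outside this sketch; the lambda values below are supplied by hand, and both function names are hypothetical.

```python
import math

def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

counts = [2, 3, 15, 14, 2]                 # observed signal per bin
local_lambda = [3.0, 3.0, 3.0, 3.0, 3.0]   # assumed background per bin
pvals = [poisson_sf(c, lam) for c, lam in zip(counts, local_lambda)]
qvals = benjamini_hochberg(pvals)
significant = [q <= 0.05 for q in qvals]
```

The two adjacent significant bins (indices 2 and 3) are the ones that a subsequent merge step would join into a single peak interval.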

uchrom.fea.call_peaks_from_signal(signal_table: DataFrame, *, signal_col: str = 'signal', threshold: float | None = None, quantile: float = 0.95, min_width_bins: int = 1, max_gap_bins: int = 0, name: str = 'peak') → DataFrame

Call simple contiguous high-signal peaks from an interval signal table.

The input table should have chrom, start, end, and signal_col. If threshold is omitted it is estimated from the requested quantile of finite signal values.

uchrom.fea.call_peaks_from_track(cdata, track: str, *, method: str = 'auto', control_track: str | None = None, agg: str = 'mean', threshold: float | None = None, quantile: float = 0.95, cutoff: float | None = None, min_length: int = 200, max_gap: int = 30, qvalue: float | None = 0.05, pvalue: float | None = None, genome_size: int | float | None = None, small_local_window_bins: int = 10, large_local_window_bins: int = 100, fragment_size_bins: int = 1, nolambda: bool = False, min_width_bins: int = 1, max_gap_bins: int = 0, signal_scale: float = 1.0, control_scale: float | None = None, negative: str = 'clip', peaks_key: str | None = None, feature_name: str | None = None, prefix: str = 'peak', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False) → DataFrame

Aggregate a track, call peaks, and project peak features to spots.

uchrom.fea.compute_annotation_features(intervals: DataFrame, gtf: str | Path | DataFrame, *, features: Sequence[str] | None = None, promoter_window: tuple[int, int] = (-2000, 500), gene_feature: str = 'gene', exon_feature: str = 'exon') → DataFrame

Compute GTF-derived features for genomic intervals.

The returned table preserves the input interval order and uses 0-based half-open coordinates.
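To make promoter_window concrete, the sketch below derives a promoter interval from a gene in 0-based half-open coordinates. The strand handling (mirroring the window around the gene end for minus-strand genes) and the clipping at 0 are assumptions made for illustration; the source does not state how uchrom treats strand, and both helper names are hypothetical.

```python
def promoter_interval(start, end, strand, window=(-2000, 500)):
    """Promoter of a gene given as a 0-based half-open [start, end) interval."""
    upstream, downstream = window
    if strand == "-":
        # Assumed: the TSS sits at the gene end and upstream means larger coordinates.
        lo, hi = end - downstream, end - upstream
    else:
        lo, hi = start + upstream, start + downstream
    return max(lo, 0), hi

def overlaps(a, b):
    """True when two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

plus = promoter_interval(5000, 9000, "+")
minus = promoter_interval(5000, 9000, "-")
clipped = promoter_interval(1000, 3000, "+")
```

With half-open coordinates, an interval that ends exactly where the promoter starts does not count as overlapping.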

uchrom.fea.compute_peak_features(intervals: DataFrame, peaks: DataFrame, *, name: str = 'peak') → DataFrame

Compute overlap/count/distance features from peak intervals.

uchrom.fea.compute_sequence_features(intervals: DataFrame, fasta: str | Path, *, features: Sequence[str] | None = None, g4_pattern: str = 'G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}', missing: str = 'raise') → DataFrame

Compute FASTA-derived features for genomic intervals.

Parameters:

intervals – DataFrame with chrom, start, and end columns.
fasta – FASTA file path. Plain text and .gz files are supported.
features – Sequence feature names. Defaults to DEFAULT_SEQUENCE_FEATURES.
g4_pattern – Regular expression used for G-quadruplex motif counting.
missing – "raise" to reject intervals whose chromosomes are absent from the FASTA, or "ignore" to leave their feature values as NaN.

Returns:

Interval columns plus requested feature columns, preserving the input row order.

Return type:

DataFrame
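Two sequence features of this kind can be sketched with the standard library. The regular expression is the documented g4_pattern default; everything else is an assumption for illustration: the helper names are hypothetical, GC fraction is taken over called bases only (N excluded), and motif matches are counted on the forward strand without overlap.

```python
import re

G4_PATTERN = "G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}"

def gc_fraction(seq):
    """Fraction of G/C among called bases; NaN when the sequence is all N."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / called

def g4_count(seq, pattern=G4_PATTERN):
    """Number of non-overlapping forward-strand matches of the G4 pattern."""
    return sum(1 for _ in re.finditer(pattern, seq.upper()))

seq = "ATAT" + "GGGAGGGAGGGAGGG" + "ATAT"
n_g4 = g4_count(seq)
gc = gc_fraction(seq)
```

A full implementation would apply these per interval after slicing the chromosome sequence with the interval's start and end.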

uchrom.fea.contact_frequency(df: DataFrame, threshold: float, chrom=None)

Fraction of traces where a pair of bins are within threshold.

NaN distances (missing spots) are excluded from both numerator and denominator — each bin pair’s frequency is over the set of traces that have both endpoints detected.

Parameters:

df – DataFrame with spots + coords.
threshold – Distance threshold in the same units as x/y/z.
chrom – Optional chromosome filter.

Returns:

frequency – ndarray (n_bins, n_bins) in [0, 1]; NaN where no trace had both endpoints detected.
bin_ids – list of (start, end).
n_traces – int.
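The NaN handling can be illustrated with a NumPy sketch that starts from an (n_traces, n_bins, 3) coordinate array with NaN for missing spots instead of the flat DataFrame. The function name is hypothetical, and treating a distance equal to the threshold as a contact is an assumption.

```python
import numpy as np

def contact_frequency_sketch(cube, threshold):
    """cube: (n_traces, n_bins, 3) coordinates, NaN where a spot is missing."""
    diff = cube[:, :, None, :] - cube[:, None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # (n_traces, n_bins, n_bins)
    valid = ~np.isnan(dist)                    # both endpoints detected
    n_valid = valid.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        n_contact = ((dist <= threshold) & valid).sum(axis=0)
        # Pairs never observed together stay NaN instead of 0.
        return np.where(n_valid > 0, n_contact / n_valid, np.nan)

cube = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [np.nan, np.nan, np.nan]],
])
freq = contact_frequency_sketch(cube, threshold=2.0)
```

Bins 0 and 1 are in contact in one of the two traces that observed both, giving 0.5; the pair (0, 2) was observed in a single trace, so its denominator is 1 rather than 2.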

uchrom.fea.filter_normalize(cube: dict, k_sigma: float = 4.0, frac: float = 0.1) → dict

ArcFISH-style per-trace LOWESS filter + normalise.

Operates on the full (3, n_traces, n_bins, n_bins) pairwise-diff tensor kept on the selected torch device (under cube['diff']). Two passes:

First pass: per-pair raw_var = nanmedian(trace_diff²) → LOWESS over log(genomic_distance) → strata_std. Individual trace observations where |diff - median(diff)| > k_sigma × strata_std are set to NaN in place in the 4D tensor.
Second pass: per-pair filtered_var = nanmean((diff - mean)²) and per-pair count = n_valid are recomputed from the filtered tensor. LOWESS is fitted again over log(genomic_distance) → expected; normalised variance = filtered_var / expected.

Output (numpy, on CPU): var, count (refreshed after filter), norm_var, expected, raw_var, genomic_distance. The original 4D tensor under "diff" is consumed and may be modified.
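The outlier rule of the first pass can be sketched in NumPy once a per-pair strata_std is available. The LOWESS fit that produces strata_std is outside this sketch, the function name is hypothetical, and unlike the description above this version returns a filtered copy instead of modifying the tensor in place.

```python
import warnings
import numpy as np

def mask_outliers(diff, strata_std, k_sigma=4.0):
    """diff: (3, T, B, B); strata_std: (3, B, B). Returns a filtered copy."""
    with warnings.catch_warnings():
        # Pairs with no valid observation produce an all-NaN warning.
        warnings.simplefilter("ignore", RuntimeWarning)
        centre = np.nanmedian(diff, axis=1, keepdims=True)
        outlier = np.abs(diff - centre) > k_sigma * strata_std[:, None]
    # Comparisons against NaN are False, so missing entries stay NaN.
    return np.where(outlier, np.nan, diff)

# Five traces, one bin pair, identical values on all three axes.
values = np.array([0.0, 0.1, -0.1, 0.05, 10.0])
diff = np.tile(values.reshape(1, 5, 1, 1), (3, 1, 1, 1))
strata_std = np.full((3, 1, 1), 0.5)
filtered = mask_outliers(diff, strata_std, k_sigma=4.0)
```

Only the observation at 10.0 deviates from the per-pair median by more than 4 × 0.5, so it is the only entry replaced by NaN on each axis.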

uchrom.fea.macs_bdgpeaks_to_narrowpeak(peaks: DataFrame, *, name: str = 'MACS', name_prefix: str | None = None, trackline: bool = False, score_column: str = 'score') → str

Render MACS bdgpeakcall-compatible peaks as narrowPeak text.

uchrom.fea.mean_distance_matrix(df: DataFrame, chrom=None, reduce: str = 'median')

Population-level mean/median pairwise distance matrix.

For each pair of genomic bins (i, j), the distance is computed per-trace and then reduced across traces with np.nanmedian (the Bintu 2018 convention) or np.nanmean.

Parameters:

df – DataFrame with spots + coords.
chrom – Optional chromosome filter.
reduce – 'median' (default) or 'mean'.

Returns:

matrix – ndarray (n_bins, n_bins).
bin_ids – list of (start, end).
n_traces – int.
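The reduction across traces can be sketched in NumPy, again starting from an (n_traces, n_bins, 3) coordinate array with NaN for missing spots. The function name is hypothetical and the sketch skips the DataFrame handling and chromosome filter.

```python
import numpy as np

def distance_matrix_sketch(cube, reduce="median"):
    """cube: (n_traces, n_bins, 3) coordinates, NaN where a spot is missing."""
    diff = cube[:, :, None, :] - cube[:, None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # per-trace pairwise distances
    reducer = {"median": np.nanmedian, "mean": np.nanmean}[reduce]
    return reducer(dist, axis=0)               # NaN-aware reduction over traces

# Four traces, two bins; bin 1 is missing in the last trace.
cube = np.zeros((4, 2, 3))
cube[:, 1, 0] = [1.0, 2.0, 6.0, np.nan]
median_matrix = distance_matrix_sketch(cube, reduce="median")
mean_matrix = distance_matrix_sketch(cube, reduce="mean")
```

The per-trace distances are 1, 2, and 6, so the median (2) is less affected by the stretched trace than the mean (3).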

uchrom.fea.project_interval_features_to_spots(cdata_or_spots, features: DataFrame, *, prefix: str | None = None, value_columns: Sequence[str] | None = None, into: DataFrame | None = None, overwrite: bool = False) → DataFrame

Project interval-level features onto the spot axis.

Parameters:

cdata_or_spots – A ChromData-like object or a spot DataFrame with chrom, start, and end columns.
features – DataFrame with one row per interval and feature columns.
prefix – Optional namespace prefix for projected columns. For example, prefix="seq" maps gc_fraction to seq.gc_fraction.
value_columns – Feature columns to project. By default all non-interval columns are used.
into – Optional existing spot-aligned DataFrame to extend.
overwrite – If False, raise when projected columns already exist in into.

Returns:

A spot-aligned DataFrame containing the existing into columns plus projected feature columns.

Return type:

DataFrame
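The projection amounts to a left join on the interval key. The pandas sketch below assumes one feature row per interval, as the parameter description requires; the helper name is hypothetical and the into and overwrite handling is left out.

```python
import pandas as pd

KEYS = ["chrom", "start", "end"]

def project_to_spots(spots, features, prefix=None):
    """Return a spot-aligned table of feature columns, optionally prefixed."""
    value_cols = [c for c in features.columns if c not in KEYS]
    # A left merge keeps one output row per spot, in spot order.
    merged = spots[KEYS].merge(features, on=KEYS, how="left")
    out = merged[value_cols].set_axis(spots.index, axis=0)
    if prefix:
        out = out.add_prefix(f"{prefix}.")
    return out

spots = pd.DataFrame(
    {"chrom": ["chr1"] * 3, "start": [0, 0, 1000], "end": [1000, 1000, 2000]},
    index=["s0", "s1", "s2"],
)
features = pd.DataFrame({
    "chrom": ["chr1", "chr1"],
    "start": [0, 1000],
    "end": [1000, 2000],
    "gc_fraction": [0.4, 0.6],
})
projected = project_to_spots(spots, features, prefix="seq")
```

Spots s0 and s1 share a genomic bin and therefore receive the same seq.gc_fraction value.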

uchrom.fea.radius_of_gyration(df: DataFrame, chrom=None) → Series

Per-trace radius of gyration.

Rg = sqrt(mean over spots of ||r - centroid||²). Traces with fewer than 2 spots contribute NaN.
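The formula can be written out directly in NumPy. The sketch assumes an (n_traces, n_bins, 3) coordinate array with NaN for missing spots and returns a plain array rather than a Series; the function name is hypothetical.

```python
import numpy as np

def radius_of_gyration_sketch(cube):
    """cube: (n_traces, n_bins, 3) coordinates, NaN where a spot is missing."""
    rg = np.full(cube.shape[0], np.nan)
    for t, trace in enumerate(cube):
        pts = trace[~np.isnan(trace).any(axis=1)]   # detected spots only
        if len(pts) < 2:
            continue                                # too few spots: stays NaN
        centred = pts - pts.mean(axis=0)
        rg[t] = np.sqrt((centred ** 2).sum(axis=1).mean())
    return rg

cube = np.full((2, 2, 3), np.nan)
cube[0] = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]   # two spots, 2 units apart
cube[1, 0] = [1.0, 1.0, 1.0]                   # a single detected spot
rg = radius_of_gyration_sketch(cube)
```

The first trace has its centroid midway between the two spots, so each spot lies 1 unit away and Rg is 1; the second trace has only one spot and yields NaN.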

uchrom.fea.read_bedgraph(path: str | Path, *, signal_col: str = 'score') → DataFrame

Read a MACS-compatible bedGraph score track into an interval table.

uchrom.fea.read_gtf(path: str | Path) → DataFrame

Read a GTF/GFF-like file into a normalized annotation table.

Coordinates are converted from 1-based inclusive GTF convention to 0-based half-open intervals.
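The coordinate conversion is a one-base shift of the start only. The sketch below parses a single GTF line by its standard tab-separated columns; the function name and the output keys are illustrative assumptions, not the column names read_gtf produces.

```python
def parse_gtf_line(line):
    """Parse one GTF record into a dict with 0-based half-open coordinates."""
    fields = line.rstrip("\n").split("\t")
    chrom, _source, feature, start, end, _score, strand = fields[:7]
    return {
        "chrom": chrom,
        "start": int(start) - 1,  # 1-based inclusive start -> 0-based
        "end": int(end),          # inclusive end equals the exclusive end
        "feature": feature,
        "strand": strand,
    }

line = 'chr1\thavana\tgene\t1000\t2000\t.\t+\t.\tgene_id "G1";\n'
record = parse_gtf_line(line)
```

The feature covers 1001 bases in both conventions: 2000 - 1000 + 1 in GTF coordinates and end - start after conversion.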

uchrom.fea.unique_spot_intervals(cdata_or_spots) → DataFrame

Return unique chrom/start/end intervals from a spot table.

The first occurrence order is preserved. cdata_or_spots may be a ChromData-like object with a spots attribute or a DataFrame.

Distance-based aggregates

Distance-based aggregate statistics over a population of traces.

Input convention: a flat DataFrame with columns chrom, start, end, x, y, z, trace_id (what ChromData.to_dataframe() produces, or what the browser’s ChromatinLayer.df stores).

The core helper _bin_coord_cube() pivots the flat table into a (n_traces, n_bins, 3) array with NaN for missing spots, which lets every aggregate statistic be computed as a straightforward NaN-aware reduction.
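A pivot of this kind could look roughly like the following. It is a guess at the general shape, not the private helper itself: the function name, the return values, and the ordering of bins and traces are assumptions, and the table is assumed to hold a single chromosome.

```python
import numpy as np
import pandas as pd

def bin_coord_cube_sketch(df):
    """Pivot a flat spot table into an (n_traces, n_bins, 3) array."""
    bins = df[["start", "end"]].drop_duplicates().sort_values(["start", "end"])
    bin_ids = list(bins.itertuples(index=False, name=None))
    trace_ids = list(pd.unique(df["trace_id"]))
    bin_index = {b: i for i, b in enumerate(bin_ids)}
    trace_index = {t: i for i, t in enumerate(trace_ids)}
    # Start from all-NaN so undetected (trace, bin) cells stay missing.
    cube = np.full((len(trace_ids), len(bin_ids), 3), np.nan)
    rows = df["trace_id"].map(trace_index).to_numpy()
    cols = [bin_index[b] for b in zip(df["start"], df["end"])]
    cube[rows, cols] = df[["x", "y", "z"]].to_numpy(dtype=float)
    return cube, bin_ids, trace_ids

df = pd.DataFrame({
    "chrom": ["chr1"] * 3,
    "start": [0, 1000, 0],
    "end": [1000, 2000, 1000],
    "x": [0.0, 1.0, 5.0],
    "y": [0.0, 0.0, 5.0],
    "z": [0.0, 0.0, 5.0],
    "trace_id": ["a", "a", "b"],
})
cube, bin_ids, trace_ids = bin_coord_cube_sketch(df)
```

Trace "b" has no spot in the second bin, so that cell of the cube remains NaN and drops out of any NaN-aware reduction.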

uchrom.fea.distance.contact_frequency(df: DataFrame, threshold: float, chrom=None)

Fraction of traces where a pair of bins are within threshold.

NaN distances (missing spots) are excluded from both numerator and denominator — each bin pair’s frequency is over the set of traces that have both endpoints detected.

Parameters:

df – DataFrame with spots + coords.
threshold – Distance threshold in the same units as x/y/z.
chrom – Optional chromosome filter.

Returns:

frequency – ndarray (n_bins, n_bins) in [0, 1]; NaN where no trace had both endpoints detected.
bin_ids – list of (start, end).
n_traces – int.

uchrom.fea.distance.mean_distance_matrix(df: DataFrame, chrom=None, reduce: str = 'median')

Population-level mean/median pairwise distance matrix.

For each pair of genomic bins (i, j), the distance is computed per-trace and then reduced across traces with np.nanmedian (the Bintu 2018 convention) or np.nanmean.

Parameters:

df – DataFrame with spots + coords.
chrom – Optional chromosome filter.
reduce – 'median' (default) or 'mean'.

Returns:

matrix – ndarray (n_bins, n_bins).
bin_ids – list of (start, end).
n_traces – int.

uchrom.fea.distance.radius_of_gyration(df: DataFrame, chrom=None) → Series

Per-trace radius of gyration.

Rg = sqrt(mean over spots of ||r - centroid||²). Traces with fewer than 2 spots contribute NaN.

Axis-wise preprocessing

ArcFISH-style axis-wise preprocessing for chromatin tracing data.

References

Yu H. et al. Accurate and robust 3D genome feature discovery from multiplexed DNA FISH, bioRxiv 2025.11.26.690837v1.

Independent implementation in uchrom — not derived from the GPL-3.0 ArcFISH source.

Pipeline (per chromosome)

axis_variance_cube – Builds (3, n_bins, n_bins) per-axis pairwise variance + count cubes from ChromData spots. Each trace contributes a rank-1 outer difference for each axis; aggregation is NaN-aware.
filter_normalize – Two-pass LOWESS stratification on log(1D genomic distance):
- first pass: flag entries whose per-pair squared deviation is more than k_sigma × stratified std as outliers and NaN them;
- second pass: refit LOWESS on the cleaned variances to give each entry a genome-distance-matched expectation, then normalise.
axis_weight – Returns a 3-vector of weights (sum 1) inversely proportional to the per-axis trace-variance median; this is the exact weighting used by the ACAT combination step in the loop / tad / comp callers.
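The first stage of the pipeline can be sketched in NumPy. This is a CPU-only illustration under assumptions, not the torch implementation: it takes an (n_traces, n_bins, 3) coordinate array instead of a ChromData object, uses the population variance (divide by the valid count), and the function name is hypothetical.

```python
import numpy as np

def axis_variance_cube_sketch(cube):
    """cube: (n_traces, n_bins, 3) -> per-axis (3, n_bins, n_bins) statistics."""
    coords = np.moveaxis(cube, 2, 0)                       # (3, T, B)
    # Outer difference per trace and axis: diff[a, t, i, j] = x_i - x_j.
    diff = coords[:, :, :, None] - coords[:, :, None, :]   # (3, T, B, B)
    valid = ~np.isnan(diff)
    count = valid.sum(axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(valid, diff, 0.0).sum(axis=1) / count
        sq_dev = np.where(valid, (diff - mean[:, None]) ** 2, 0.0)
        var = sq_dev.sum(axis=1) / count                   # NaN where count == 0
    return {"var": var, "count": count, "mean": mean, "diff": diff}

# Two traces, two bins; only the x coordinate of bin 1 varies.
cube = np.zeros((2, 2, 3))
cube[0, 1, 0] = 1.0
cube[1, 1, 0] = 3.0
result = axis_variance_cube_sketch(cube)
```

The x differences between bins 0 and 1 are -1 and -3, giving mean -2 and variance 1, while the y and z variances are 0.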

All tensor-heavy computation runs on a user-selected torch device ('auto' | 'cpu' | 'cuda' | 'mps'). LOWESS stays on CPU via statsmodels because it’s a non-vectorised kernel smoother whose input size is O(n_bins²) (typically ≤ 10 k).

uchrom.fea.arc.axis_variance_cube(cd, chrom: str, device: str = 'auto') → dict

Compute per-axis pairwise variance + sample-count cubes.

uchrom.fea.arc.axis_weight(cd, chrom: str | None = None, device: str = 'auto') → ndarray

Compute per-axis weights w ∝ 1 / median(trace_variance).

uchrom.fea.arc.filter_normalize(cube: dict, k_sigma: float = 4.0, frac: float = 0.1) → dict

ArcFISH-style per-trace LOWESS filter + normalise.

Operates on the full (3, n_traces, n_bins, n_bins) pairwise-diff tensor kept on the selected torch device (under cube['diff']). Two passes:

First pass: per-pair raw_var = nanmedian(trace_diff²) → LOWESS over log(genomic_distance) → strata_std. Individual trace observations where |diff - median(diff)| > k_sigma × strata_std are set to NaN in place in the 4D tensor.
Second pass: per-pair filtered_var = nanmean((diff - mean)²) and per-pair count = n_valid are recomputed from the filtered tensor. LOWESS is fitted again over log(genomic_distance) → expected; normalised variance = filtered_var / expected.

Output (numpy, on CPU): var, count (refreshed after filter), norm_var, expected, raw_var, genomic_distance. The original 4D tensor under "diff" is consumed and may be modified.

Interval projection

Projection helpers for genomic interval features.

The functions in this module are deliberately independent of uchrom.auto_discovery. They provide a small shared layer for mapping canonical interval-level feature tables onto the spot axis of a ChromData object.

uchrom.fea.project.project_interval_features_to_spots(cdata_or_spots, features: DataFrame, *, prefix: str | None = None, value_columns: Sequence[str] | None = None, into: DataFrame | None = None, overwrite: bool = False) → DataFrame

Project interval-level features onto the spot axis.

Parameters:

cdata_or_spots – A ChromData-like object or a spot DataFrame with chrom, start, and end columns.
features – DataFrame with one row per interval and feature columns.
prefix – Optional namespace prefix for projected columns. For example, prefix="seq" maps gc_fraction to seq.gc_fraction.
value_columns – Feature columns to project. By default all non-interval columns are used.
into – Optional existing spot-aligned DataFrame to extend.
overwrite – If False, raise when projected columns already exist in into.

Returns:

A spot-aligned DataFrame containing the existing into columns plus projected feature columns.

Return type:

DataFrame

uchrom.fea.project.unique_spot_intervals(cdata_or_spots) → DataFrame

Return unique chrom/start/end intervals from a spot table.

The first occurrence order is preserved. cdata_or_spots may be a ChromData-like object with a spots attribute or a DataFrame.

Sequence features

Sequence-derived genomic interval features.

uchrom.fea.sequence.add_sequence_features(cdata, fasta: str | Path, *, features: Sequence[str] | None = None, prefix: str = 'seq', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False, hash_source: bool = False, g4_pattern: str = 'G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}') → DataFrame

Compute sequence features for a ChromData object.

The canonical interval table is returned. When store=True it is also merged into cdata.results[result_key]. When project=True feature columns are projected onto cdata.tracks using prefix.

uchrom.fea.sequence.compute_sequence_features(intervals: DataFrame, fasta: str | Path, *, features: Sequence[str] | None = None, g4_pattern: str = 'G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}[ACGTN]{1,7}G{3,}', missing: str = 'raise') → DataFrame

Compute FASTA-derived features for genomic intervals.

Parameters:

intervals – DataFrame with chrom, start, and end columns.
fasta – FASTA file path. Plain text and .gz files are supported.
features – Sequence feature names. Defaults to DEFAULT_SEQUENCE_FEATURES.
g4_pattern – Regular expression used for G-quadruplex motif counting.
missing – "raise" to reject intervals whose chromosomes are absent from the FASTA, or "ignore" to leave their feature values as NaN.

Returns:

Interval columns plus requested feature columns, preserving the input row order.

Return type:

DataFrame

Annotation features

Genome annotation-derived interval features.

uchrom.fea.annotation.add_annotation_features(cdata, gtf: str | Path | DataFrame, *, features: Sequence[str] | None = None, prefix: str = 'gtf', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False, promoter_window: tuple[int, int] = (-2000, 500), hash_source: bool = False) → DataFrame

Compute annotation features for a ChromData object.

uchrom.fea.annotation.compute_annotation_features(intervals: DataFrame, gtf: str | Path | DataFrame, *, features: Sequence[str] | None = None, promoter_window: tuple[int, int] = (-2000, 500), gene_feature: str = 'gene', exon_feature: str = 'exon') → DataFrame

Compute GTF-derived features for genomic intervals.

The returned table preserves the input interval order and uses 0-based half-open coordinates.

uchrom.fea.annotation.read_gtf(path: str | Path) → DataFrame

Read a GTF/GFF-like file into a normalized annotation table.

Coordinates are converted from 1-based inclusive GTF convention to 0-based half-open intervals.

Peak features

Peak calling and peak-derived interval features.

uchrom.fea.peaks.add_peak_features(cdata, peaks: DataFrame, *, name: str = 'peak', prefix: str = 'peak', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False, source_peak_key: str | None = None) → DataFrame

Project an existing peak table into interval and spot features.

uchrom.fea.peaks.aggregate_track_by_interval(cdata, track: str, *, agg: str = 'mean', signal_col: str = 'signal') → DataFrame

Aggregate a spot-aligned track to unique genomic intervals.

uchrom.fea.peaks.aggregate_tracks_by_interval(cdata, tracks: Sequence[str], *, agg: str = 'mean') → DataFrame

Aggregate multiple spot-aligned tracks to unique genomic intervals.

uchrom.fea.peaks.call_macs_bdgpeaks_from_bedgraph(path: str | Path, *, signal_col: str = 'score', cutoff: float = 5.0, min_length: int = 200, max_gap: int = 30, name: str = 'MACS', name_prefix: str | None = None) → DataFrame

Read a bedGraph file and call MACS3 bdgpeakcall-compatible peaks.

uchrom.fea.peaks.call_macs_bdgpeaks_from_signal(signal_table: DataFrame, *, signal_col: str = 'score', cutoff: float = 5.0, min_length: int = 200, max_gap: int = 30, name: str = 'MACS', name_prefix: str | None = None) → DataFrame

Call peaks using a Python port of MACS3 bdgpeakcall.

uchrom.fea.peaks.call_macs_peaks_from_signal(signal_table: DataFrame, *, signal_col: str = 'signal', control_col: str | None = None, qvalue: float | None = 0.05, pvalue: float | None = None, genome_size: int | float | None = None, small_local_window_bins: int = 10, large_local_window_bins: int = 100, fragment_size_bins: int = 1, nolambda: bool = False, min_width_bins: int = 1, max_gap_bins: int = 0, signal_scale: float = 1.0, control_scale: float | None = None, negative: str = 'clip', name: str = 'peak') → DataFrame

Call peaks with U-Chrom’s legacy dynamic Poisson approximation.

uchrom.fea.peaks.call_peaks_from_signal(signal_table: DataFrame, *, signal_col: str = 'signal', threshold: float | None = None, quantile: float = 0.95, min_width_bins: int = 1, max_gap_bins: int = 0, name: str = 'peak') → DataFrame

Call simple contiguous high-signal peaks from an interval signal table.

The input table should have chrom, start, end, and signal_col. If threshold is omitted it is estimated from the requested quantile of finite signal values.

uchrom.fea.peaks.call_peaks_from_track(cdata, track: str, *, method: str = 'auto', control_track: str | None = None, agg: str = 'mean', threshold: float | None = None, quantile: float = 0.95, cutoff: float | None = None, min_length: int = 200, max_gap: int = 30, qvalue: float | None = 0.05, pvalue: float | None = None, genome_size: int | float | None = None, small_local_window_bins: int = 10, large_local_window_bins: int = 100, fragment_size_bins: int = 1, nolambda: bool = False, min_width_bins: int = 1, max_gap_bins: int = 0, signal_scale: float = 1.0, control_scale: float | None = None, negative: str = 'clip', peaks_key: str | None = None, feature_name: str | None = None, prefix: str = 'peak', result_key: str = 'bin_features', project: bool = True, store: bool = True, overwrite: bool = False) → DataFrame

Aggregate a track, call peaks, and project peak features to spots.

uchrom.fea.peaks.compute_peak_features(intervals: DataFrame, peaks: DataFrame, *, name: str = 'peak') → DataFrame

Compute overlap/count/distance features from peak intervals.

uchrom.fea.peaks.macs_bdgpeaks_to_narrowpeak(peaks: DataFrame, *, name: str = 'MACS', name_prefix: str | None = None, trackline: bool = False, score_column: str = 'score') → str

Render MACS bdgpeakcall-compatible peaks as narrowPeak text.

uchrom.fea.peaks.read_bedgraph(path: str | Path, *, signal_col: str = 'score') → DataFrame

Read a MACS-compatible bedGraph score track into an interval table.

Feature registry

Feature registry helpers for ChromData.uns.

uchrom.fea.registry.append_feature_registry_entry(cdata, entry: Mapping[str, Any], *, key: str = 'feature_registry') → dict[str, Any]

Append a provenance record to cdata.uns[key].

The registry is stored as a list of JSON-like dicts so it round-trips through the existing uns serializer.

uchrom.fea.registry.file_sha256(path: str | Path, *, chunk_size: int = 1048576) → str

Return the SHA-256 digest for a local source file.
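A chunked digest of this kind is commonly written with hashlib as below. This is a generic sketch rather than the uchrom source, and the function name is hypothetical; the default chunk size mirrors the documented 1048576 bytes.

```python
import hashlib
import os
import tempfile

def file_sha256_sketch(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in fixed-size chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes returned on EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hash a small temporary file with a deliberately tiny chunk size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"ACGT" * 1000)
checksum = file_sha256_sketch(tmp.name, chunk_size=256)
os.remove(tmp.name)
```

The digest does not depend on the chunk size, so large FASTA or GTF sources can be hashed without loading them into memory.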

uchrom.fea.registry.table_provenance(table, *, value_columns=None) → dict[str, Any]

Small JSON-friendly summary for an enrichment output table.