uchrom.recon
Single-cell reconstruction
Bulk reconstruction (MDS)
- uchrom.recon.bulk.mds.apply_distance_decay_prior(contact_mat, weight=0.05)
Apply a distance-decay prior to smooth contact frequencies. Expected values are computed from nonzero contacts only (matching miniMDS).
- uchrom.recon.bulk.mds.apply_transform(coords, rotation=None, translation=None, scale=1.0)
Apply rotation, translation and scaling.
- uchrom.recon.bulk.mds.compute_radius_of_gyration(coords)
Rg = sqrt(mean(||x - centroid||^2)).
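The formula above can be checked by hand on a small point set. The following is a minimal NumPy sketch of the same expression, not the library's implementation; the function name radius_of_gyration is hypothetical.

```python
import numpy as np

def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid."""
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    # Squared Euclidean distance of every point to the centroid
    sq_dist = ((coords - centroid) ** 2).sum(axis=1)
    return float(np.sqrt(sq_dist.mean()))

# Four corners of a 2 x 2 square in the z = 0 plane
square = np.array([[0, 0, 0], [2, 0, 0], [0, 2, 0], [2, 2, 0]])
rg = radius_of_gyration(square)  # every corner is sqrt(2) from the centroid
```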
- uchrom.recon.bulk.mds.compute_stress(coords, dist_mat, weights=None)
Compute MDS stress, skipping missing-data pairs (dist_mat == 0).
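To illustrate what skipping missing-data pairs means, here is a sketch of a raw (unnormalized) stress that ignores entries where dist_mat is zero. The exact stress definition and normalization used by the library are not stated in this reference, so the formula below is an assumption, and masked_stress is a hypothetical name.

```python
import numpy as np

def masked_stress(coords, dist_mat, weights=None):
    """Sum of weighted squared residuals over observed pairs only."""
    coords = np.asarray(coords, dtype=float)
    dist_mat = np.asarray(dist_mat, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    embedded = np.sqrt((diff ** 2).sum(axis=-1))
    if weights is None:
        weights = np.ones_like(dist_mat)
    resid = weights * (embedded - dist_mat) ** 2
    # Upper triangle only, so each pair is counted once
    iu = np.triu_indices_from(dist_mat, k=1)
    observed = dist_mat[iu] > 0  # zero entries mark missing pairs
    return float(resid[iu][observed].sum())

coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
dist_mat = np.array([[0.0, 1.0, 0.0],   # pair (0, 2) is missing
                     [1.0, 0.0, 1.0],
                     [0.0, 1.0, 0.0]])
stress = masked_stress(coords, dist_mat)
```

Here pair (0, 1) fits exactly, pair (0, 2) is skipped, and pair (1, 2) contributes (2 - 1)^2, so only one residual enters the sum.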
- uchrom.recon.bulk.mds.contact_to_distance(contact_mat, alpha=4.0)
Convert contact frequencies to distances: d = c^(-1/alpha). Zero contacts are treated as missing data (distance = 0).
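A small NumPy sketch of the conversion described above, applying d = c^(-1/alpha) to observed contacts and leaving zero contacts as 0. This mirrors the documented behavior but is not the library's code; contact_to_distance_sketch is a hypothetical name.

```python
import numpy as np

def contact_to_distance_sketch(contact_mat, alpha=4.0):
    """Apply d = c^(-1/alpha) to nonzero contacts; zeros stay 0 (missing)."""
    contact_mat = np.asarray(contact_mat, dtype=float)
    dist = np.zeros_like(contact_mat)
    observed = contact_mat > 0
    dist[observed] = contact_mat[observed] ** (-1.0 / alpha)
    return dist

contacts = np.array([[0.0, 16.0, 0.0],
                     [16.0, 0.0, 1.0],
                     [0.0, 1.0, 0.0]])
dist = contact_to_distance_sketch(contacts, alpha=4.0)
# 16 contacts map to 16 ** -0.25 = 0.5; a single contact maps to 1.0
```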
- uchrom.recon.bulk.mds.contacts_to_matrix(bin1, bin2, counts, n, dtype=torch.float64)
Build symmetric contact matrix from sparse contact data.
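As an illustration of building a symmetric matrix from sparse triplets, the sketch below uses NumPy rather than torch. It assumes each bin pair is listed once (for example, upper triangle only) and that repeated entries should accumulate; neither assumption is confirmed by this reference, and contacts_to_matrix_sketch is a hypothetical name.

```python
import numpy as np

def contacts_to_matrix_sketch(bin1, bin2, counts, n):
    """Scatter (bin1, bin2, count) triplets into a symmetric n x n matrix."""
    bin1 = np.asarray(bin1)
    bin2 = np.asarray(bin2)
    counts = np.asarray(counts, dtype=np.float64)
    mat = np.zeros((n, n), dtype=np.float64)
    np.add.at(mat, (bin1, bin2), counts)
    # Mirror off-diagonal entries; diagonal entries are added only once
    off_diag = bin1 != bin2
    np.add.at(mat, (bin2[off_diag], bin1[off_diag]), counts[off_diag])
    return mat

mat = contacts_to_matrix_sketch(bin1=[0, 0, 1], bin2=[1, 2, 1],
                                counts=[5, 2, 7], n=3)
```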
- uchrom.recon.bulk.mds.fill_missing_distances(dist_mat, contact_mat=None)
Fill zero (missing) distances using a genomic-distance prior. After contact_to_distance, zeros represent missing data, not zero distance.
- uchrom.recon.bulk.mds.inter_mds(input_path, resolution_inter=1000000, resolution_intra=100000, chroms=None, alpha=4.0, weight=0.05, n_iter=1000, device='auto', output_dir=None, verbose=True)
Whole-genome 3D reconstruction with inter-chromosomal contacts.
- Parameters:
input_path – Path to .hic or .mcool file
resolution_inter – Resolution for inter-chromosomal scaffold (default 1Mb)
resolution_intra – Resolution for intra-chromosomal structures (default 100kb)
chroms – List of chromosomes (default: autosomes + X)
alpha – Contact-to-distance exponent
weight – Distance decay prior weight
n_iter – MDS iterations
device – ‘auto’, ‘cpu’, ‘cuda’, ‘mps’
output_dir – Output directory (None = don’t save)
verbose – Print progress
- Returns:
genome_df – DataFrame with chrom, start, end, x, y, z for all bins
- Return type:
DataFrame
- uchrom.recon.bulk.mds.normalize_distances(dist_mat)
Normalize the distance matrix to unit mean. Zeros are included in the mean to match miniMDS, which divides by np.mean(distMat).
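The effect of including zeros in the mean is easy to see on a small matrix. This is a minimal sketch of the documented behavior, with normalize_distances_sketch as a hypothetical name.

```python
import numpy as np

def normalize_distances_sketch(dist_mat):
    """Divide by the mean over all entries, zeros included."""
    dist_mat = np.asarray(dist_mat, dtype=float)
    return dist_mat / dist_mat.mean()

dist_mat = np.array([[0.0, 2.0, 0.0],
                     [2.0, 0.0, 4.0],
                     [0.0, 4.0, 0.0]])
normed = normalize_distances_sketch(dist_mat)
# The mean over all 9 entries is 12 / 9, so 2.0 becomes 1.5 and 4.0 becomes 3.0
```

Because the zeros pull the mean down, the nonzero entries end up with a mean above 1 after normalization, even though the matrix as a whole has unit mean.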
- uchrom.recon.bulk.mds.partitioned_mds(contact_mat, tad_regions=None, device='auto', res_ratio=10, alpha=4.0, alpha2=2.5, weight=0.05, n_iter=1000, verbose=False, n_workers=1)
Partitioned MDS for high-resolution Hi-C data.
- uchrom.recon.bulk.mds.procrustes_alignment(source, target, scale=True)
Align source to target using SVD-based Procrustes analysis.
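For readers unfamiliar with SVD-based Procrustes, the following sketch shows the standard orthogonal Procrustes solution with optional scaling. It is an assumption that the library follows this form; in particular, this version allows reflections, and the return values (aligned, rotation, factor) and the name procrustes_align_sketch are hypothetical.

```python
import numpy as np

def procrustes_align_sketch(source, target, scale=True):
    """Least-squares alignment of source onto target (reflections allowed)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_mean, tgt_mean = source.mean(axis=0), target.mean(axis=0)
    a, b = source - src_mean, target - tgt_mean
    # The orthogonal matrix maximizing trace(R^T a^T b) is U V^T
    u, s, vt = np.linalg.svd(a.T @ b)
    rotation = u @ vt
    factor = s.sum() / (a ** 2).sum() if scale else 1.0
    aligned = factor * (a @ rotation) + tgt_mean
    return aligned, rotation, factor

# Build a source that is a rotated, doubled, shifted copy of the target
rng = np.random.default_rng(0)
target = rng.normal(size=(10, 3))
theta = np.pi / 6
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
source = 2.0 * target @ rot_z.T + np.array([1.0, -2.0, 3.0])
aligned, rotation, factor = procrustes_align_sketch(source, target)
```

In this noise-free example the alignment recovers the target exactly, with a scale factor that undoes the doubling.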
- uchrom.recon.bulk.mds.run_mds(contact_mat, alpha=4.0, device='auto', weight=0.05, **kwargs)
Full MDS pipeline: contact matrix -> 3D coordinates.
Zero-contact bins (rows/columns with no observed contacts) are removed before MDS, matching miniMDS behavior. Returns coordinates only for non-zero bins.
- Returns:
coords – np.ndarray of shape (n_nonzero, 3)
nonzero_mask – np.ndarray boolean mask of shape (n_total,) indicating which bins were kept
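Because coordinates are returned only for the kept bins, downstream code often needs to map them back to the full bin index. The sketch below shows one way to do that with NaN padding; the arrays are stand-ins with the documented shapes, not output of run_mds.

```python
import numpy as np

# Stand-ins for the documented return values (not produced by uchrom here)
nonzero_mask = np.array([True, False, True, True, False])  # shape (n_total,)
coords = np.arange(9, dtype=float).reshape(3, 3)           # shape (n_nonzero, 3)

# Scatter kept coordinates into a full-length array; removed bins stay NaN
full = np.full((nonzero_mask.size, 3), np.nan)
full[nonzero_mask] = coords
```

Using NaN for removed bins keeps row i of the result aligned with genomic bin i, so bin coordinates can be joined to other per-bin annotations without re-indexing.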
- uchrom.recon.bulk.mds.torch_mds(dist_mat, device='auto', n_iter=1000, lr=0.01, tol=1e-06, init='cmds', verbose=False, method='smacof')
Run iterative MDS.
- Parameters:
method – ‘smacof’ (default, fast) or ‘adam’ (gradient descent).
- uchrom.recon.bulk.mds.torch_mds.cmds_init(dist_mat)
Classical MDS initialization via eigendecomposition.
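Classical MDS via eigendecomposition can be sketched in a few lines of NumPy. This assumes a complete distance matrix (no missing entries); how cmds_init treats zero (missing) distances is not described here. The names cmds_sketch and n_dims are hypothetical.

```python
import numpy as np

def cmds_sketch(dist_mat, n_dims=3):
    """Classical MDS: double-center squared distances, keep top eigenpairs."""
    d = np.asarray(dist_mat, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    evals, evecs = np.linalg.eigh(gram)
    top = np.argsort(evals)[::-1][:n_dims]
    # Clip small negative eigenvalues caused by noise or rounding
    lam = np.clip(evals[top], 0.0, None)
    return evecs[:, top] * np.sqrt(lam)

def pairwise(x):
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(1)
truth = rng.normal(size=(6, 3))
dist_mat = pairwise(truth)
init = cmds_sketch(dist_mat)
```

For exact Euclidean distances the embedding reproduces the input distances up to rotation, reflection and translation, which is why it makes a reasonable starting point for iterative refinement.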
- uchrom.recon.bulk.mds.torch_mds.compute_stress(coords, dist_mat, weights=None)
Compute MDS stress, skipping missing-data pairs (dist_mat == 0).
- uchrom.recon.bulk.mds.torch_mds.run_mds(contact_mat, alpha=4.0, device='auto', weight=0.05, **kwargs)
Full MDS pipeline: contact matrix -> 3D coordinates.
Zero-contact bins (rows/columns with no observed contacts) are removed before MDS, matching miniMDS behavior. Returns coordinates only for non-zero bins.
- Returns:
coords – np.ndarray of shape (n_nonzero, 3)
nonzero_mask – np.ndarray boolean mask of shape (n_total,) indicating which bins were kept
- uchrom.recon.bulk.mds.torch_mds.smacof(dist_mat, device='auto', n_iter=1000, tol=1e-06, init='cmds', verbose=False)
Run SMACOF (Scaling by MAjorizing a Complicated Function) MDS.
Unlike the Adam-based approach, SMACOF uses a majorization algorithm that does not require autograd, resulting in much lower per-iteration overhead on CPU.
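The majorization step referred to above is the Guttman transform, which updates coordinates with a closed-form matrix product instead of a gradient. The sketch below is the textbook unit-weight version on a complete distance matrix, written in NumPy; it does not handle the missing-data masking the library describes, and all names are hypothetical.

```python
import numpy as np

def pairwise(x):
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def raw_stress(x, delta):
    iu = np.triu_indices_from(delta, k=1)
    return float(((pairwise(x) - delta)[iu] ** 2).sum())

def smacof_sketch(delta, x0, n_iter=1000, tol=1e-6):
    """Unit-weight SMACOF: repeat the Guttman transform until stress stalls."""
    delta = np.asarray(delta, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    n = delta.shape[0]
    prev = cur = raw_stress(x, delta)
    for _ in range(n_iter):
        d = pairwise(x)
        # delta_ij / d_ij, with 0 where the embedded distance is 0
        ratio = np.divide(delta, d, out=np.zeros_like(d), where=d > 0)
        b = -ratio
        b[np.diag_indices(n)] = ratio.sum(axis=1)
        x = b @ x / n                      # Guttman transform
        cur = raw_stress(x, delta)
        if prev - cur < tol:               # stress never increases
            break
        prev = cur
    return x, cur

rng = np.random.default_rng(0)
delta = pairwise(rng.normal(size=(8, 3)))  # target distances
x0 = rng.normal(size=(8, 3))               # random start, for illustration
initial_stress = raw_stress(x0, delta)
x, final_stress = smacof_sketch(delta, x0, n_iter=200)
```

Each iteration needs only pairwise distances and one matrix product, which is consistent with the low per-iteration overhead noted above.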
- uchrom.recon.bulk.mds.torch_mds.torch_mds(dist_mat, device='auto', n_iter=1000, lr=0.01, tol=1e-06, init='cmds', verbose=False, method='smacof')
Run iterative MDS.
- Parameters:
method – ‘smacof’ (default, fast) or ‘adam’ (gradient descent).
- uchrom.recon.bulk.mds.inter.inter_mds(input_path, resolution_inter=1000000, resolution_intra=100000, chroms=None, alpha=4.0, weight=0.05, n_iter=1000, device='auto', output_dir=None, verbose=True)
Whole-genome 3D reconstruction with inter-chromosomal contacts.
- Parameters:
input_path – Path to .hic or .mcool file
resolution_inter – Resolution for inter-chromosomal scaffold (default 1Mb)
resolution_intra – Resolution for intra-chromosomal structures (default 100kb)
chroms – List of chromosomes (default: autosomes + X)
alpha – Contact-to-distance exponent
weight – Distance decay prior weight
n_iter – MDS iterations
device – ‘auto’, ‘cpu’, ‘cuda’, ‘mps’
output_dir – Output directory (None = don’t save)
verbose – Print progress
- Returns:
genome_df – DataFrame with chrom, start, end, x, y, z for all bins
- Return type:
DataFrame
- uchrom.recon.bulk.mds.transforms.align_substructure_to_scaffold(high_res_coords, low_res_coords, scaffold_coords, res_ratio=10)
Align high-res substructure to global scaffold via Procrustes.
- uchrom.recon.bulk.mds.transforms.apply_transform(coords, rotation=None, translation=None, scale=1.0)
Apply rotation, translation and scaling.
- uchrom.recon.bulk.mds.transforms.compute_radius_of_gyration(coords)
Rg = sqrt(mean(||x - centroid||^2)).