Importing PyHiM chromatin trace tables
PyHiM (Devos et al. 2024, Genome Biology) is a Python toolkit for multiplexed DNA-FISH image analysis. It outputs chromatin trace tables as Astropy ECSV files with columns:
Spot_ID, Trace_ID, x, y, z, Chrom, Chrom_Start, Chrom_End,
ROI #, Mask_id, Barcode #, label
U-Chrom’s ChromData.from_pyhim_trace() reads these ECSV files directly
into the standard ChromData container, so you can use the browser,
structure callers, and embedding tools on PyHiM outputs without manual
conversion.
This notebook demonstrates the reader on Bintu et al. 2018 IMR90 chr21 chromatin tracing data (1,277 traces × 66 loci, 30 kb spacing), converted to PyHiM ECSV format.
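For orientation, an ECSV file is plain text: a block of `#`-prefixed YAML header lines followed by a delimited table. The sketch below parses a tiny hand-written excerpt using only the standard library. The two rows and all their values are invented for illustration, and `read_ecsv_rows` is a hypothetical helper; real files are better read with `ChromData.from_pyhim_trace()` or Astropy's own ECSV reader.

```python
import csv

# Hypothetical two-spot excerpt; every value here is made up.
ECSV_TEXT = """\
# %ECSV 1.0
# ---
# datatype:
# - {name: Spot_ID, datatype: string}
# - {name: Trace_ID, datatype: string}
# - {name: x, datatype: float64}
# - {name: y, datatype: float64}
# - {name: z, datatype: float64}
# - {name: Chrom, datatype: string}
# - {name: Chrom_Start, datatype: int64}
# - {name: Chrom_End, datatype: int64}
# - {name: 'ROI #', datatype: int64}
# - {name: Mask_id, datatype: int64}
# - {name: 'Barcode #', datatype: int64}
# - {name: label, datatype: string}
Spot_ID Trace_ID x y z Chrom Chrom_Start Chrom_End "ROI #" Mask_id "Barcode #" label
s1 1 0.10 0.20 0.30 chr21 18600000 18630000 1 1 1 demo
s2 1 0.15 0.22 0.35 chr21 18630000 18660000 1 1 2 demo
"""


def read_ecsv_rows(text):
    """Return the table rows as dicts, skipping the '#'-prefixed header."""
    body = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    # Space-delimited; column names containing spaces are double-quoted.
    return list(csv.DictReader(body, delimiter=" "))


rows = read_ecsv_rows(ECSV_TEXT)
print(list(rows[0]))          # column names, in file order
print(rows[1]["Barcode #"])   # values are still strings at this point
```

Note that this sketch ignores the declared datatypes, so every value comes back as a string; a full ECSV reader uses the header to convert them.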
1. Locate the demo ECSV
We use a PyHiM-format ECSV derived from Bintu et al. 2018 (Science
362:eaau1783) — IMR90 cells, chr21:18.6–20.6 Mb, 66 segments × 30 kb.
The conversion script is in example-data/bintu_to_pyhim_ecsv.py.
from pathlib import Path
import numpy as np
import pandas as pd
from uchrom import ChromData
def _repo_root():
    for p in [Path.cwd(), *Path.cwd().parents]:
        if (p / "pyproject.toml").exists():
            return p
    return Path.cwd()
root = _repo_root()
ecsv_path = root / "example-data" / "IMR90_chr21_pyhim.ecsv"
if not ecsv_path.exists():
    # Generate it from the Bintu CSV, using the interpreter that runs this notebook
    import subprocess
    import sys
    script = root / "example-data" / "bintu_to_pyhim_ecsv.py"
    subprocess.run([sys.executable, str(script)], check=True)
print(f"ECSV: {ecsv_path}")
print(f"Size: {ecsv_path.stat().st_size / 1e6:.1f} MB")
2. Read the ECSV into ChromData
ChromData.from_pyhim_trace() parses the ECSV and maps PyHiM columns
to ChromData’s schema:
- x, y, z → cd.coords
- Chrom, Chrom_Start, Chrom_End, Trace_ID → cd.spots
- Mask_id → cell_id (PyHiM convention: each trace = one “cell”)
- ECSV meta['comments'] → cd.uns['xyz_unit'], cd.uns['genome_assembly']
cd = ChromData.from_pyhim_trace(ecsv_path)
cd
print(f"Traces: {cd.spots['trace_id'].nunique()}")
print(f"Spots per trace: {len(cd.spots) // cd.spots['trace_id'].nunique()}")
print(f"Genomic region: {cd.spots['chrom'].iloc[0]}:{cd.spots['start'].min():,}–{cd.spots['end'].max():,}")
print(f"xyz_unit: {cd.uns.get('xyz_unit')}")
print(f"genome_assembly: {cd.uns.get('genome_assembly')}")
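The column mapping described in this section can be pictured with plain Python and NumPy. This is only an illustrative sketch, not U-Chrom's implementation: `split_pyhim_rows` is a hypothetical helper, the input rows are made up, and the lower-case field names (`trace_id`, `chrom`, `start`, `end`, `cell_id`) are taken from the way this notebook accesses `cd.spots`.

```python
import numpy as np


def split_pyhim_rows(rows):
    """Split PyHiM-style records into an (N, 3) coordinate array and per-spot metadata."""
    coords = np.array([[r["x"], r["y"], r["z"]] for r in rows], dtype=float)
    spots = [
        {
            "trace_id": str(r["Trace_ID"]),
            "chrom": r["Chrom"],
            "start": int(r["Chrom_Start"]),
            "end": int(r["Chrom_End"]),
            "cell_id": r["Mask_id"],
        }
        for r in rows
    ]
    return coords, spots


# Two made-up spots belonging to the same trace.
rows = [
    {"Trace_ID": "1", "x": 0.10, "y": 0.20, "z": 0.30,
     "Chrom": "chr21", "Chrom_Start": 18_600_000, "Chrom_End": 18_630_000, "Mask_id": 1},
    {"Trace_ID": "1", "x": 0.15, "y": 0.22, "z": 0.35,
     "Chrom": "chr21", "Chrom_Start": 18_630_000, "Chrom_End": 18_660_000, "Mask_id": 1},
]

coords, spots = split_pyhim_rows(rows)
print(coords.shape)   # (2, 3)
print(spots[0])
```

Keeping coordinates in one numeric array and annotations in a separate table is what lets distance and Rg calculations run on the array without touching the metadata.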
3. Inspect a single trace
Each trace is a chromatin fiber — 66 consecutive loci on chr21.
trace0 = cd.get_trace("1") # first trace
print(f"Trace '1': {len(trace0.spots)} spots")
print(trace0.spots[["chrom", "start", "end"]].head(10))
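A quick sanity check on a trace is that its loci tile the region without gaps. The sketch below runs that check on synthetic intervals built from the rounded figures quoted earlier (66 segments of 30 kb starting near 18.6 Mb). The exact start coordinate and the half-open interval convention are assumptions, and `is_contiguous` is a hypothetical helper.

```python
def is_contiguous(starts, ends):
    """True if each segment begins exactly where the previous one ends."""
    return all(s == e for s, e in zip(starts[1:], ends[:-1]))


step = 30_000
first = 18_600_000  # assumed value, rounded from "18.6 Mb"
starts = [first + i * step for i in range(66)]
ends = [s + step for s in starts]

print(is_contiguous(starts, ends))           # True
print((ends[-1] - starts[0]) / 1e6, "Mb")    # 1.98 Mb
```

On real data the same function could be applied to `trace0.spots["start"].tolist()` and `trace0.spots["end"].tolist()`, provided the spots are sorted by genomic position.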
4. Compute radius of gyration (Rg)
Rg measures the spatial compactness of each trace.
def compute_rg(coords):
    centroid = coords.mean(axis=0)
    return np.sqrt(((coords - centroid) ** 2).sum(axis=1).mean())

rg_per_trace = []
for tid in cd.spots["trace_id"].unique():
    tr = cd.get_trace(tid)
    rg_per_trace.append(compute_rg(tr.coords))
rg_per_trace = np.array(rg_per_trace)
print(f"Rg (microns): mean={rg_per_trace.mean():.2f}, std={rg_per_trace.std():.2f}")
print(f" min={rg_per_trace.min():.2f}, max={rg_per_trace.max():.2f}")
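Rg as computed here is the root-mean-square distance of the spots from their centroid. The sketch below checks the formula on a geometry with a known answer and adds a NaN-tolerant variant for traces with undetected loci. Whether missing spots appear as NaN rows in a given `ChromData` object is an assumption, and `compute_rg_nan` is a hypothetical helper.

```python
import numpy as np


def compute_rg(coords):
    centroid = coords.mean(axis=0)
    return np.sqrt(((coords - centroid) ** 2).sum(axis=1).mean())


def compute_rg_nan(coords):
    """Rg over the rows whose x, y and z are all finite."""
    coords = np.asarray(coords, dtype=float)
    valid = coords[np.isfinite(coords).all(axis=1)]
    if len(valid) == 0:
        return np.nan
    return compute_rg(valid)


# Two points 2 units apart: each lies 1 unit from the centroid, so Rg = 1.
pair = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(compute_rg(pair))          # 1.0

# The same pair plus one missing spot.
with_gap = np.vstack([pair, [np.nan, np.nan, np.nan]])
print(compute_rg(with_gap))      # nan
print(compute_rg_nan(with_gap))  # 1.0
```

A single NaN row is enough to turn the plain version into NaN, so it is worth checking `np.isnan(rg_per_trace).any()` before summarizing the distribution.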
5. Visualize Rg distribution
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6, 4))
ax.hist(rg_per_trace, bins=40, edgecolor="black", alpha=0.7)
ax.set_xlabel("Radius of gyration (μm)")
ax.set_ylabel("Number of traces")
ax.set_title(f"Bintu 2018 IMR90 chr21 (n={len(rg_per_trace)} traces)")
plt.tight_layout()
6. Plot a few example traces in 3D
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3D projection; only required on Matplotlib < 3.2)
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection="3d")
# Plot first 5 traces
for i, tid in enumerate(list(cd.spots["trace_id"].unique())[:5]):
    tr = cd.get_trace(tid)
    ax.plot(tr.coords[:, 0], tr.coords[:, 1], tr.coords[:, 2],
            marker="o", markersize=3, alpha=0.7, label=f"Trace {tid}")
ax.set_xlabel("x (μm)")
ax.set_ylabel("y (μm)")
ax.set_zlabel("z (μm)")
ax.set_title("5 example chromatin traces")
ax.legend()
plt.tight_layout()
7. Compute pairwise distance matrix for one trace
The distance matrix shows spatial proximity between loci.
trace0 = cd.get_trace("1")
dist_mat = trace0.compute_distances()
fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(dist_mat, cmap="viridis", origin="lower")
ax.set_xlabel("Locus index")
ax.set_ylabel("Locus index")
ax.set_title("Pairwise distance matrix (trace '1')")
plt.colorbar(im, ax=ax, label="Distance (μm)")
plt.tight_layout()
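For reference, a pairwise Euclidean distance matrix can be computed directly from an (N, 3) coordinate array with NumPy broadcasting. This is a sketch of the general technique, not necessarily how `compute_distances()` is implemented; `pairwise_distances` is a hypothetical name and the three points are made up.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (N, N) matrix of Euclidean distances between rows of coords."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]  # shape (N, N, 3)
    return np.sqrt((diff ** 2).sum(axis=-1))


# Three made-up loci forming a 3-4-5 right triangle in the xy-plane.
pts = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
d = pairwise_distances(pts)
print(d)
```

The result is symmetric with a zero diagonal, which is why the heatmap above mirrors across its diagonal. The intermediate array holds N × N × 3 values, which is negligible for 66 loci.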
Next steps
- Save as .h5cd: cd.write("imr90_chr21.h5cd") for fast reload
- 3D browser: python -m uchrom.browser imr90_chr21.h5cd (interactive PyVista viewer)
- Structure calling: TAD/loop/compartment callers work on imaging data too
- Embedding: If you have multiple cells, run FastHigashi on the population
For PyHiM outputs with empty Chrom columns, pass a barcode_dict:
cd = ChromData.from_pyhim_trace(
    "Trace_3D.ecsv",
    barcode_dict={1: ("chr1", 10_000_000, 10_050_000), ...}
)
See the ChromData.from_pyhim_trace docstring for details.
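When barcodes tile a region at a fixed spacing, the dictionary can be generated rather than typed out. The helper below is hypothetical (it is not part of U-Chrom) and assumes barcodes are numbered from 1 and map to consecutive, equally sized segments; it only reproduces the `{barcode: (chrom, start, end)}` shape shown above.

```python
def make_barcode_dict(chrom, first_start, segment_size, n_barcodes):
    """Map barcode numbers 1..n to consecutive (chrom, start, end) segments."""
    return {
        i + 1: (
            chrom,
            first_start + i * segment_size,
            first_start + (i + 1) * segment_size,
        )
        for i in range(n_barcodes)
    }


barcode_dict = make_barcode_dict("chr1", 10_000_000, 50_000, 3)
print(barcode_dict[1])   # ('chr1', 10000000, 10050000)
print(barcode_dict[3])   # ('chr1', 10100000, 10150000)
```

If the barcodes in an experiment are irregularly spaced or numbered differently, build the dictionary from the probe design table instead.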