Importing FOF-CT imaging data
The 4DN FISH Omics Format — Chromatin Tracing
(FOF-CT) is the community standard for chromatin-tracing / multiplexed DNA
FISH experiments. A FOF-CT core table is a CSV with a metadata header
(##FOF-CT_Version=v1.0, ##Columns=(…)) followed by per-spot rows.
This notebook loads a FOF-CT dataset with ChromData.from_fofct (a real file
if you have one, otherwise the synthetic stand-in generated below) and
shows how to navigate the Cell → Trace → Spot hierarchy that imaging
data naturally exposes.
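To make the file layout concrete, here is a minimal sketch of reading such a table with plain pandas, independent of uchrom. It is not how ChromData.from_fofct works internally (that is not shown here); it only assumes that machine-readable header fields start with `##`, free-text lines start with `#`, and the column names sit inside `##Columns=(...)`. The in-memory sample and the names `sample`, `header_fields`, and `columns` are made up for illustration.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for a FOF-CT core table (values are made up).
sample = """##FOF-CT_Version=v1.0
##XYZ_Unit=micron
#Lab_Name: Demo
##Columns=(Spot_ID, Trace_ID, X, Y, Z, Chrom, Chrom_Start, Chrom_End, Cell_ID)
1,0_0,0.10,0.20,0.30,chr17,1000000,1010000,0
2,0_0,0.15,0.22,0.41,chr17,1010000,1020000,0
"""

header_fields = {}  # '##Key=value' entries
columns = None      # names listed in '##Columns=(...)'
for line in sample.splitlines():
    if not line.startswith('##'):
        continue  # skip '#' free-text lines and data rows
    key, _, value = line[2:].partition('=')
    if key.lower() == 'columns':
        columns = [c.strip() for c in value.strip('()').split(',')]
    else:
        header_fields[key] = value

# Every header line starts with '#', so pandas can skip them as comments.
# Caveat: comment='#' would also truncate a data field that contains '#'.
spots = pd.read_csv(
    io.StringIO(sample),
    comment='#',
    header=None,
    names=columns,
    dtype={'Trace_ID': str},  # keep trace IDs as strings
)
print(header_fields)
print(spots)
```

For real files, prefer ChromData.from_fofct as used in the rest of this notebook; the sketch is only meant to show what the header and the per-spot rows look like once separated.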
import numpy as np
import pandas as pd
from uchrom import ChromData
# Change this to a FOF-CT file you have locally
FOFCT = 'example-data/fofct_core.csv' # or similar
# For this tutorial we build a synthetic example that looks like FOF-CT
# output in case you do not have a file handy. Skip this cell if you
# downloaded a real file.
import os
if not os.path.exists(FOFCT):
    os.makedirs('example-data', exist_ok=True)
    rng = np.random.default_rng(0)
    # Header string: its lines must start at column 0 in the written file
    header = '''##FOF-CT_Version=v1.0
##Table_Namespace=4dn_FOF-CT_core
##Genome_Assembly=GRCh38
##XYZ_Unit=micron
#Lab_Name: Demo
##Columns=(Spot_ID, Trace_ID, X, Y, Z, Chrom, Chrom_Start, Chrom_End, Cell_ID)
'''
    rows = []
    sid = 1
    for cell in range(20):
        for trace in range(2):  # 2 alleles per cell
            # Random walk in 3D: one position per genomic bin
            walk = np.cumsum(rng.normal(scale=0.12, size=(12, 3)), axis=0)
            for bin_i in range(12):  # 12 bins per trace
                x, y, z = walk[bin_i]
                start = 1_000_000 + bin_i * 10_000
                rows.append((sid, f'{cell}_{trace}', x, y, z,
                             'chr17', start, start + 10_000, cell))
                sid += 1
    with open(FOFCT, 'w') as f:
        f.write(header)
        for r in rows:
            f.write(','.join(str(v) for v in r) + '\n')
FOFCT
Load the file
cd = ChromData.read(FOFCT) if FOFCT.endswith(".h5cd") else ChromData.from_fofct(FOFCT)
cd
Inspect metadata
The FOF-CT header fields go into cd.uns so you can always trace the origin of a dataset.
cd.uns['fofct_header']
print('genome :', cd.uns.get('genome_assembly'))
print('xyz_unit :', cd.uns.get('xyz_unit'))
print()
print(cd.spots.head())
Traces per chromosome
Imaging data typically has many traces per chromosome (each trace = one allele observation).
summary = (
cd.spots
.groupby('chrom', observed=True)
.agg(n_spots=('trace_id', 'size'),
n_traces=('trace_id', 'nunique'))
)
summary
Subsetting a single trace
A single trace is the natural unit of analysis in chromatin tracing.
first_trace_id = cd.spots['trace_id'].iloc[0]
trace = cd.get_trace(first_trace_id)
print(trace)
print()
print(trace.to_dataframe().head())
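As an illustration of why a single trace is a convenient unit, the sketch below computes the pairwise spatial distance matrix for one trace using only numpy and pandas. It does not use the uchrom API: `trace_df` is a hypothetical stand-in for a single-trace table, and the column names `x`, `y`, `z`, and `chrom_start` are assumptions that may differ from what `trace.to_dataframe()` returns.

```python
import numpy as np
import pandas as pd

# Hypothetical single-trace table: one row per spot, coordinates in microns.
rng = np.random.default_rng(0)
xyz = np.cumsum(rng.normal(scale=0.12, size=(12, 3)), axis=0)
trace_df = pd.DataFrame(xyz, columns=['x', 'y', 'z'])
trace_df['chrom_start'] = 1_000_000 + np.arange(12) * 10_000

# Order spots along the genome, then take all pairwise Euclidean distances.
coords = trace_df.sort_values('chrom_start')[['x', 'y', 'z']].to_numpy()
diff = coords[:, None, :] - coords[None, :, :]  # shape (12, 12, 3)
dist = np.sqrt((diff ** 2).sum(axis=-1))        # shape (12, 12)
print(dist.shape)
```

Entry `dist[i, j]` is the distance between the i-th and j-th genomic bins of the trace, so the matrix is symmetric with zeros on the diagonal.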
Convert to ChromData-backed analysis format
Downstream analysis modules in uchrom.fea and uchrom.strc either operate on the flat DataFrame produced by cd.to_dataframe() or accept the ChromData object itself.
df = cd.to_dataframe()
print(df.columns.tolist())
df.head()
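Once the data is flat, the Cell → Trace → Spot hierarchy can be walked with ordinary pandas group-bys. The sketch below uses a small hand-made table rather than the real `df`; the column names `cell_id` and `trace_id` are assumptions (only `trace_id` and `chrom` appear earlier in this notebook), so check `df.columns` before adapting it.

```python
import pandas as pd

# Hypothetical flat spot table; 'cell_id' and 'trace_id' are assumed names.
df_demo = pd.DataFrame({
    'cell_id':  [0, 0, 0, 0, 1, 1, 1],
    'trace_id': ['0_0', '0_0', '0_1', '0_1', '1_0', '1_0', '1_0'],
    'x':        [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
})

# Cell -> Trace: number of distinct traces in each cell.
traces_per_cell = df_demo.groupby('cell_id')['trace_id'].nunique()

# Trace -> Spot: number of spots in each trace.
spots_per_trace = df_demo.groupby(['cell_id', 'trace_id']).size()

print(traces_per_cell)
print(spots_per_trace)
```

The same two group-bys applied to a real table give a quick sanity check, for example that each cell has the expected number of alleles and that no trace is missing most of its spots.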
Writing your own FOF-CT-compatible data
If you produced tracing data with a tool that does not write FOF-CT, you
can create a ChromData object directly (see the ChromData basics tutorial) and
write it to an .h5cd file. The on-disk format is versioned so downstream tools
can safely consume it.