Importing FOF-CT imaging data

The 4DN FISH Omics Format — Chromatin Tracing (FOF-CT) is the community standard for chromatin-tracing / multiplexed DNA FISH experiments. A FOF-CT core table is a CSV with a metadata header (##FOF-CT_Version=v1.0, ##Columns=(…)) followed by per-spot rows.
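To make the header layout concrete, here is a minimal sketch of reading the leading `#` lines with plain Python. It is independent of uchrom and does not describe how ChromData.from_fofct works internally; the helper name read_fofct_header is hypothetical, and the sample text reuses the header fields from the synthetic file built in this notebook.

```python
import io

def read_fofct_header(handle):
    """Collect the leading '#' lines into a dict of fields plus a column list.

    Hypothetical helper for illustration only, not part of uchrom.
    """
    fields, columns = {}, []
    for line in handle:
        if not line.startswith('#'):
            break                                  # first data row: header is over
        body = line.lstrip('#').strip()
        if body.lower().startswith('columns='):
            inner = body.split('=', 1)[1].strip().strip('()')
            columns = [c.strip() for c in inner.split(',')]
        elif '=' in body:                          # '##Key=value' style
            key, value = body.split('=', 1)
            fields[key.strip()] = value.strip()
        elif ':' in body:                          # '#Key: value' style
            key, value = body.split(':', 1)
            fields[key.strip()] = value.strip()
    return fields, columns

sample = io.StringIO(
    '##FOF-CT_Version=v1.0\n'
    '##XYZ_Unit=micron\n'
    '#Lab_Name: Demo\n'
    '##Columns=(Spot_ID, Trace_ID, X, Y, Z, Chrom, Chrom_Start, Chrom_End, Cell_ID)\n'
    '1,0_0,0.1,0.2,0.3,chr17,1000000,1010000,0\n'
)
fields, columns = read_fofct_header(sample)
print(fields)
print(columns)
```

With the column list in hand, the per-spot rows can be read with pandas, for example pd.read_csv(path, comment='#', header=None, names=columns).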

This notebook loads a real FOF-CT dataset with ChromData.from_fofct and shows how to navigate the Cell → Trace → Spot hierarchy that imaging data naturally exposes.

import numpy as np
import pandas as pd
from uchrom import ChromData

# Change this to a FOF-CT file you have locally
FOFCT = 'example-data/fofct_core.csv'  # or similar

# Build a synthetic file that mimics FOF-CT output in case you do not have
# one handy. The block below only runs when FOFCT does not exist, so it is
# safe to execute even if you downloaded a real file.
import os
if not os.path.exists(FOFCT):
    # Create the parent directory of FOFCT (not a hard-coded folder).
    os.makedirs(os.path.dirname(FOFCT) or '.', exist_ok=True)
    rng = np.random.default_rng(0)
    header = '''##FOF-CT_Version=v1.0
##Table_Namespace=4dn_FOF-CT_core
##Genome_Assembly=GRCh38
##XYZ_Unit=micron
#Lab_Name: Demo
##Columns=(Spot_ID, Trace_ID, X, Y, Z, Chrom, Chrom_Start, Chrom_End, Cell_ID)
'''
    rows = []
    sid = 1
    for cell in range(20):
        for trace in range(2):          # 2 alleles per cell
            walk = np.cumsum(rng.normal(scale=0.12, size=(12, 3)), axis=0)
            for bin_i in range(12):      # 12 bins per trace
                x, y, z = walk[bin_i]
                start = 1_000_000 + bin_i * 10_000
                rows.append((sid, f'{cell}_{trace}', x, y, z,
                             'chr17', start, start + 10_000, cell))
                sid += 1
    with open(FOFCT, 'w') as f:
        f.write(header)
        for r in rows:
            f.write(','.join(str(x) for x in r) + '\n')

FOFCT

Load the file

cd = ChromData.read(FOFCT) if FOFCT.endswith(".h5cd") else ChromData.from_fofct(FOFCT)
cd

Inspect metadata

The FOF-CT header fields go into cd.uns so you can always trace the origin of a dataset.

# A bare expression in the middle of a cell is not displayed, so print it.
print(cd.uns['fofct_header'])
print('genome   :', cd.uns.get('genome_assembly'))
print('xyz_unit :', cd.uns.get('xyz_unit'))
print()
print(cd.spots.head())

Traces per chromosome

Imaging data typically contains many traces per chromosome, because each trace is one observation of a single allele.

summary = (
    cd.spots
      .groupby('chrom', observed=True)
      .agg(n_spots=('trace_id', 'size'),
           n_traces=('trace_id', 'nunique'))
)
summary
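The same groupby pattern extends to the Cell → Trace → Spot hierarchy. The sketch below uses a small stand-in DataFrame instead of cd.spots so that it runs on its own; the cell_id and spot_id column names are assumptions and may differ from what cd.spots actually exposes.

```python
import pandas as pd

# Stand-in for cd.spots: 2 cells x 2 traces x 3 spots (column names are assumed).
spots = pd.DataFrame({
    'cell_id':  [0] * 6 + [1] * 6,
    'trace_id': ['0_0'] * 3 + ['0_1'] * 3 + ['1_0'] * 3 + ['1_1'] * 3,
    'spot_id':  range(1, 13),
})

# Spots per trace, keeping track of the cell each trace belongs to.
spots_per_trace = (
    spots.groupby(['cell_id', 'trace_id']).size().rename('n_spots')
)

# Traces per cell.
traces_per_cell = (
    spots.groupby('cell_id')['trace_id'].nunique().rename('n_traces')
)

print(spots_per_trace)
print(traces_per_cell)
```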

Subsetting a single trace

A single trace is the natural unit of analysis in chromatin tracing.

first_trace_id = cd.spots['trace_id'].iloc[0]
trace = cd.get_trace(first_trace_id)
print(trace)
print()
print(trace.to_dataframe().head())
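As an illustration of what a per-trace analysis can look like, the sketch below computes a pairwise spatial distance matrix and a radius of gyration for one trace using NumPy only. The coordinates are a stand-in random walk generated the same way as the synthetic data above; with real data you would take the x, y, z columns of a single trace instead (how those columns are named in uchrom is not assumed here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one trace: 12 spots along a random walk, coordinates in microns.
xyz = np.cumsum(rng.normal(scale=0.12, size=(12, 3)), axis=0)

# Pairwise Euclidean distances via broadcasting: (12, 1, 3) - (1, 12, 3).
diff = xyz[:, None, :] - xyz[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))       # shape (12, 12), symmetric

# Radius of gyration: root-mean-square distance of spots from the centroid.
centroid = xyz.mean(axis=0)
rg = np.sqrt(((xyz - centroid) ** 2).sum(axis=1).mean())

print(dist.shape)
print(f'radius of gyration: {rg:.3f} micron')
```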

Convert to a flat DataFrame for analysis

Downstream analysis modules in uchrom.fea and uchrom.strc accept either the ChromData object itself or the flat DataFrame produced by cd.to_dataframe().

df = cd.to_dataframe()
print(df.columns.tolist())
df.head()

Writing your own FOF-CT-compatible data

If you produced tracing data with a tool that does not write FOF-CT, you can create a ChromData object directly (see the ChromData basics tutorial) and write it to an .h5cd file. The on-disk format is versioned so downstream tools can safely consume it.
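Another route is to reshape your tool's output into the FOF-CT column layout and then load it as shown earlier. The sketch below does this with pandas only, writing to an in-memory buffer. The input column names (allele, px, py, pz, start, cell) and the fixed 10 kb bin size are made up for illustration, and the header repeats the fields used for the synthetic file in this notebook rather than the full FOF-CT specification.

```python
import io
import pandas as pd

# Hypothetical output of a custom tracing tool (column names are made up).
raw = pd.DataFrame({
    'allele': ['a', 'a', 'b'],
    'px': [0.10, 0.25, 0.05],
    'py': [0.00, 0.12, 0.31],
    'pz': [0.40, 0.38, 0.29],
    'start': [1_000_000, 1_010_000, 1_000_000],
    'cell': [0, 0, 0],
})

columns = ['Spot_ID', 'Trace_ID', 'X', 'Y', 'Z',
           'Chrom', 'Chrom_Start', 'Chrom_End', 'Cell_ID']

# Map the tool's columns onto the FOF-CT-style layout.
core = pd.DataFrame({
    'Spot_ID': range(1, len(raw) + 1),
    'Trace_ID': raw['cell'].astype(str) + '_' + raw['allele'],
    'X': raw['px'], 'Y': raw['py'], 'Z': raw['pz'],
    'Chrom': 'chr17',
    'Chrom_Start': raw['start'],
    'Chrom_End': raw['start'] + 10_000,      # assumed 10 kb bins
    'Cell_ID': raw['cell'],
})[columns]

buf = io.StringIO()                           # swap for open(path, 'w') to write a file
buf.write('##FOF-CT_Version=v1.0\n')
buf.write('##Table_Namespace=4dn_FOF-CT_core\n')
buf.write('##Genome_Assembly=GRCh38\n')
buf.write('##XYZ_Unit=micron\n')
buf.write('##Columns=(' + ', '.join(columns) + ')\n')
core.to_csv(buf, header=False, index=False)

text = buf.getvalue()
print(text)
```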