I/O and file formats
uchrom.io handles reading sequencing and imaging data, and saving
reconstruction output.
Supported inputs
| Format | Reader | Returns |
|---|---|---|
|  |  | DataFrame (read-level pairs) |
|  |  | contact pixel DataFrame / bin+matrix |
|  |  | bin+matrix |
|  |  | DataFrame (chrom, start, x, y, z) |
|  |  | DataFrame |
|  |  | DataFrame / |
| FOF-CT core CSV |  |  |
Reconstruction output: save_particles
All reconstruction modules write through
uchrom.io.save_particles(df, path):
- path ending in .csv → CSV (legacy)
- path ending in .h5cd or anything else → ChromData HDF5 (default)
from uchrom.io import save_particles
save_particles(df, "out.h5cd") # ChromData
save_particles(df, "out.csv") # plain CSV
read_particles auto-detects the format by extension so downstream code
(browser, analysis) does not care which format is on disk.
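
The extension rule can be sketched as a small helper. This is only an illustration: `detect_particle_format` is a hypothetical name, not part of uchrom.io, and it assumes that read_particles applies the same rule as save_particles and that the extension check is case-insensitive.

```python
from pathlib import Path


def detect_particle_format(path):
    """Return "csv" for a .csv path and "h5cd" for anything else."""
    # Hypothetical helper mirroring the rule stated for save_particles:
    # only a .csv suffix selects CSV, everything else is ChromData HDF5.
    suffix = Path(path).suffix.lower()  # case-insensitivity is an assumption
    return "csv" if suffix == ".csv" else "h5cd"


for name in ("out.csv", "out.h5cd", "out.dat"):
    print(name, "->", detect_particle_format(name))
```

Keeping the decision in one function means the writer and the reader cannot disagree about which format a given path implies.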
FOF-CT import details
ChromData.from_fofct(path) handles several real-world FOF-CT quirks:
- ##Columns=(…) matched case-insensitively (some writers use lower-case)
- CSV column header row repeated after the ##Columns= metadata
- Header lines wrapped in double quotes and padded with trailing commas
- Optional extra columns (e.g. Readout) preserved in spots
- ##XYZ_Unit, ##Genome_Assembly promoted to top-level uns keys
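
To make these quirks concrete, here is a minimal standard-library sketch of a tolerant parser. It is not the code behind ChromData.from_fofct: the function name `parse_fofct_text`, the lower-cased uns keys, and the sample text are all assumptions made for illustration.

```python
import csv
import io


def parse_fofct_text(text):
    """Tolerantly split FOF-CT-style text into (uns, columns, rows)."""
    uns, columns, rows = {}, None, []
    for raw in text.splitlines():
        line = raw.strip()
        # Undo trailing-comma padding and surrounding double quotes.
        stripped = line.rstrip(",").strip().strip('"').strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            key, sep, value = stripped.lstrip("#").strip().partition("=")
            if not sep:
                continue  # free-text comment, nothing to record
            key, value = key.strip().lower(), value.strip()
            if key == "columns":
                # Accepts ##Columns=(...) and ##columns=(...) alike.
                columns = [c.strip() for c in value.strip("()").split(",")]
            elif key in ("xyz_unit", "genome_assembly"):
                uns[key] = value  # promoted to top-level metadata
            continue
        fields = [f.strip() for f in next(csv.reader(io.StringIO(line)))]
        if columns and [f.lower() for f in fields] == [c.lower() for c in columns]:
            continue  # column header row repeated after ##Columns=
        rows.append(fields)  # extra columns such as Readout are kept
    return uns, columns, rows


# Invented sample exercising each quirk listed above.
sample = '''"##XYZ_Unit=micron",,,,,
##genome_assembly=hg38
"##columns=(Spot_ID,Trace_ID,X,Y,Z,Readout)",,,,,
Spot_ID,Trace_ID,X,Y,Z,Readout
1,1,0.10,0.20,0.30,A
2,1,0.15,0.25,0.35,B
'''
uns, columns, rows = parse_fofct_text(sample)
print(uns, columns, len(rows))
```

Normalising each header line before inspecting it keeps the quote and comma handling in one place, so the metadata branches only ever see clean `key=value` text.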
Unit conversion: FOF-CT coordinates are typically in µm. You can rescale them
by multiplying cd.coords in place yourself, or set a custom unit label in
uns['xyz_unit'] so downstream plots are labeled correctly.
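
As a rough illustration, the sketch below rescales coordinates from µm to nm and updates the unit label to match. It uses a plain stand-in object rather than a real ChromData: the assumption is that cd.coords holds one x, y, z triple per spot and that cd.uns behaves like a dict. If cd.coords is a NumPy array, a single in-place multiplication would replace the loop.

```python
from types import SimpleNamespace

# Stand-in for a ChromData object (assumption: .coords holds x, y, z
# per spot and .uns is a dict; nested lists keep this dependency-free).
cd = SimpleNamespace(
    coords=[[0.5, 1.0, 2.0], [0.25, 0.75, 1.5]],
    uns={"xyz_unit": "µm"},
)

UM_TO_NM = 1000.0  # 1 µm = 1000 nm

# Scale every coordinate in place, then relabel so plots stay consistent.
for row in cd.coords:
    row[:] = [value * UM_TO_NM for value in row]
cd.uns["xyz_unit"] = "nm"

print(cd.coords[0], cd.uns["xyz_unit"])
```

Changing the values and the label together avoids a plot that shows nanometre numbers under a micrometre axis label.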
.h5cd format versioning
The on-disk .h5cd layout is versioned with two root attributes:
| Attribute | Example |
|---|---|
|  |  |
|  |  |
Read semantics:
- Same MAJOR → read (higher MINOR warns, unknown fields ignored)
- Different MAJOR → ValueError with upgrade guidance
- Missing attribute → assume legacy 1.0 with a warning
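
These rules can be sketched as a small check. This is not the library's implementation: `check_format_version` and `READER_VERSION` are hypothetical names, and the sketch assumes the stored version can be represented as a "MAJOR.MINOR" string, with `None` standing for a missing attribute.

```python
import warnings

READER_VERSION = (1, 0)  # assumed (MAJOR, MINOR) this reader understands


def check_format_version(raw, reader=READER_VERSION):
    """Apply the read semantics to a stored "MAJOR.MINOR" string."""
    if raw is None:
        # Missing attribute: treat the file as legacy 1.0, but say so.
        warnings.warn("no format version found; assuming legacy 1.0")
        raw = "1.0"
    major, minor = (int(part) for part in str(raw).split("."))
    if major != reader[0]:
        raise ValueError(
            f"file format {major}.{minor} cannot be read by a "
            f"{reader[0]}.x reader; upgrade the library or migrate the file"
        )
    if minor > reader[1]:
        # Same MAJOR, newer MINOR: readable, unknown fields are skipped.
        warnings.warn(
            f"file MINOR version {minor} is newer than {reader[1]}; "
            "unknown fields will be ignored"
        )
    return major, minor


print(check_format_version("1.0"))
```

Raising only on a MAJOR mismatch lets older readers open newer files within the same MAJOR series while still failing loudly on layouts they cannot interpret.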
New MAJOR versions add a _read_vN(cls, f) function in
uchrom/core/cdata.py plus any migration helper needed; the dispatch in
ChromData.read is the single place that maps version to reader.
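
One way to picture that dispatch is a table keyed by MAJOR version. The sketch below uses a stand-in class and made-up reader bodies; only the `_read_vN(cls, f)` naming comes from the text above. Passing `major` explicitly and using a dict for `f` are simplifications, and raising ValueError for an unregistered MAJOR is an assumption about how the pieces fit together.

```python
def _read_v1(cls, f):
    # Hypothetical reader for MAJOR version 1 layouts.
    return cls(layout="v1", payload=dict(f))


def _read_v2(cls, f):
    # Hypothetical reader that a future MAJOR version 2 would add.
    return cls(layout="v2", payload=dict(f))


# Single table mapping MAJOR version to its reader function.
_READERS = {1: _read_v1, 2: _read_v2}


class ChromDataSketch:
    """Stand-in for ChromData; only the dispatch logic is shown."""

    def __init__(self, layout, payload):
        self.layout = layout
        self.payload = payload

    @classmethod
    def read(cls, f, major):
        reader = _READERS.get(major)
        if reader is None:
            raise ValueError(f"no reader for format MAJOR version {major}")
        return reader(cls, f)


obj = ChromDataSketch.read({"coords": []}, major=1)
print(obj.layout)  # v1
```

With this shape, supporting a new MAJOR version means writing one reader function and adding one table entry, leaving existing readers untouched.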