9.2 Output Structure
All outputs are written to a directory named after the execution, inside the specified results folder.
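As a rough illustration of this layout, the sketch below builds the run directory from a results folder and an execution name and reports which top-level folders are missing. The folder names are those listed in this section; that they sit directly under the execution directory is an assumption, and the helper names `run_dir` and `missing_top_level` are hypothetical, not part of the pipeline.

```python
from pathlib import Path

# Top-level folders as listed in this section. That they sit directly under
# the execution-name directory is an assumption of this sketch.
TOP_LEVEL = ("Primary_Outputs", "QC", "Secondary_Outputs")


def run_dir(results_folder, execution_name):
    """Return the directory expected to hold one run's outputs."""
    return Path(results_folder) / execution_name


def missing_top_level(results_folder, execution_name):
    """List the expected top-level folders that are absent for a run."""
    base = run_dir(results_folder, execution_name)
    return [name for name in TOP_LEVEL if not (base / name).is_dir()]
```

An empty list from `missing_top_level` suggests the run wrote all three output groups. It does not check the contents of any folder.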
Primary Outputs
Primary_Outputs/
├── Breakcounts/ # Sites where adjacent breaks were merged and counted
├── Break_Site_Plots/ # Per-base break distribution plots for each non-singleton nominated site
├── Break_Site_Summary_Tables/ # Summary of all non-singleton break sites per condition with metadata
├── Homology/ # Observed break sites intersected with in silico-predicted guide-homologous sequences
├── Nomination_Lists/ # Break sites nominated as putatively induced (frequency- and/or homology-based)
├── report.html # Run summary — START HERE
└── Sample_Report/ # Per-condition summary of results
QC
QC/
├── Break_Metrics/ # Absolute break counts, normalised counts, and breaks vs library concentration
├── FastQC/
│   ├── Raw_FastQC/ # FastQC quality assessment on raw FASTQ files
│   ├── Trimmed/ # FastQC quality assessment on trimmed FASTQ files
│   └── Trimming/ # Trimming quality assessment
├── Lab_Metrics/ # gDNA yield, library yield, and library concentration plots
├── Mapping/ # Read loss breakdown (primary reads, unmapped, MAPQ-filtered)
└── Trimming/ # Per-sample trimming report
sample_manifest_combined.valid.csv — combined sample and nuclease manifest used for the run.
Secondary Outputs
Secondary_Outputs/
├── Bedfiles/ # Reads converted from FASTQ to BED format
├── Breakcount_Frequencies/ # Unique break counts per break-class for each condition (table + interactive plot)
├── Breakends/ # Genomic coordinates and strand for each break
├── Region_Exclusions_List/ # Genomic regions excluded from analysis
├── Sample_Thresholds/ # Per-sample break count thresholds (TSV)
└── Sequence_Mapping/
    ├── *.bam # Mapped reads (primary only, MAPQ ≥ 30)
    ├── pre-filter_flagstat/ # Summary statistics for unfiltered BAM files
    ├── post-filter_flagstat/ # Summary statistics for filtered BAM files
    └── Unfiltered_BAMs/ # Unfiltered mapped reads
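The "primary only, MAPQ ≥ 30" filter on the BAM files can be pictured with the minimal sketch below, which tests a single alignment's FLAG and MAPQ fields. The bit values come from the SAM specification. The function name `passes_filter` is hypothetical, and excluding supplementary alignments along with secondary ones is an assumption, since the pipeline's exact filter expression is not given here.

```python
# FLAG bits defined by the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def passes_filter(flag, mapq, min_mapq=30):
    """Return True for a mapped, primary alignment with MAPQ >= min_mapq.

    Treating supplementary alignments as non-primary is an assumption of
    this sketch, not a documented pipeline setting.
    """
    if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return False
    return mapq >= min_mapq
```

For example, a reverse-strand primary alignment (FLAG 16) with MAPQ 60 passes, while a secondary alignment (FLAG 256) or any read with MAPQ 29 does not.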
Sample Thresholds
Each TSV file in Sample_Thresholds/ reports the break count above which the control
condition is predicted to produce fewer than one site. Sites in the treated sample that
exceed this threshold are less likely to be explained by endogenous processes and more
likely to have been induced by the treatment. Each file gives the threshold together
with its upper and lower bounds.
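One way to apply such a threshold is sketched below: read the value from a TSV and keep only the treated-sample sites whose break count exceeds it. The column name `threshold`, the dictionary of per-site counts, and both function names are hypothetical; check the header of the real file before adapting this.

```python
import csv
import io


def read_threshold(tsv_text, column="threshold"):
    """Read the threshold from the first data row of a TSV.

    The column name is a hypothetical placeholder, not the documented header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    first_row = next(reader)
    return float(first_row[column])


def sites_above(site_counts, threshold):
    """Keep sites whose break count is strictly greater than the threshold."""
    return {site: n for site, n in site_counts.items() if n > threshold}
```

With a made-up threshold of 12 and counts of 40, 12, and 3 at three sites, only the site with 40 breaks is kept, because the comparison is strict. Repeating the filter with the lower and upper bounds shows how sensitive the selection is to the threshold estimate.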