9.6 Summary TSVs
`*_breaksite_summary.tsv` and `*_breaksite_summary.condensed.tsv` summarise all
non-singleton break sites for each condition.
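As a minimal sketch of reading one of these files, the snippet below parses tab-separated text with Python's standard `csv` module. It assumes the file starts with a header row carrying the column names documented in this section. The two data rows are made up for illustration, and `read_summary` is a hypothetical helper, not part of the pipeline.

```python
import csv
import io

# Made-up two-row excerpt standing in for a real *_breaksite_summary.tsv file;
# only a few of the documented columns are included.
EXAMPLE_TSV = (
    "chr\tstart\tend\tcount\tplus_count\tminus_count\n"
    "chr1\t1000\t1004\t40\t30\t10\n"
    "chr2\t5000\t5002\t12\t6\t6\n"
)


def read_summary(handle):
    """Return the rows of a tab-separated summary file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_summary(io.StringIO(EXAMPLE_TSV))
for row in rows:
    # csv yields strings, so numeric columns need explicit conversion.
    total = int(row["plus_count"]) + int(row["minus_count"])
    print(row["chr"], row["start"], row["end"], total == int(row["count"]))
```

For a file on disk, the same helper can be called as `read_summary(handle)` inside `with open(path, newline="") as handle:`.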
Columns
| Column | Description |
|---|---|
| `chr` | Chromosome of the break site. |
| `start` | Genomic start coordinate. |
| `end` | Genomic end coordinate. |
| `count` | Total breaks at the site (sum of `plus_count` and `minus_count`). |
| `plus_count` | Breaks on the positive (+) strand. |
| `minus_count` | Breaks on the negative (–) strand. |
| `count_ratio` | Strand ratio: max(`plus_count`, `minus_count`) / min(`plus_count`, `minus_count`). The value is always ≥ 1, and equals 1 when both strands have the same count. |
| `width` | Width of the recurrent break site. |
| `site_to_context_relative_density` | Break site density (`count` / `width`) divided by context density (breaks in the flanking ±50 bp region divided by 100). |
| `percent_id_to_guide_name` | Percent identity score from semi-global alignment of the guide sequence against the break site ±25 bp. |
| `guide_name_match_seq` | Guide-like target sequence at the site (if the site intersects an in silico prediction). |
| `guide_name_mismatches` | Number of mismatches to the guide sequence (if the site intersects an in silico prediction). |
| `reproducibility_count` | How frequently the break site is observed across replicates, reported as n/m. |
| `count_(sample_name)` | Break count at the site in each individual replicate. |
| `intersect_to_repeat_mask` | Intersection with repeat annotations. |
| `intersect_to_gene` | Intersection with reference genome gene annotations. |
| `control_count` | Breaks observed at the same interval in the control sample. |
| `normalized_sample_count` | Breaks at the site in the treated sample, normalised per million total breaks. |
| `normalized_control_count` | Breaks at the site in the control, normalised per million total breaks. |
| `normalized_sample_to_control_ratio` | Ratio of normalised treated to normalised control break counts at the site. |
| `normalized_sample_to_control_context_ratio` | As above, including a ±50 bp context window. |
| `sample_context_count` | Breaks in the treated sample within the context region (site ±50 bp). |
| `control_context_count` | Breaks in the control within the context region (site ±50 bp). |
| `context_width` | Width of the context region (site width + 100 bp). |
| `rationale` | Reason the site was nominated: `frequency-based`, `homology-based`, or `frequency-based,homology-based`. |
| `guide_name` | Name of the matched guide sequence (if the site intersects an in silico prediction). |
| `break_site_probability_score` | Percentage probability that the site resulted from the treatment rather than an endogenous process. A cutoff of ≥ 80% is suggested to select sites more likely to be true positives. |
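The arithmetic behind several of the derived columns can be sketched as follows. This is an illustration of the definitions in the table, not the pipeline's own code. All function names are hypothetical, and so are these choices: returning `None` when a denominator is zero, taking the flanking break count and the total break count as explicit inputs, and treating `break_site_probability_score` as a plain number between 0 and 100.

```python
def strand_ratio(plus_count, minus_count):
    """count_ratio: larger strand count divided by the smaller one."""
    low, high = sorted((plus_count, minus_count))
    if low == 0:
        return None  # assumption: ratio is undefined when one strand is empty
    return high / low


def relative_density(count, width, flank_breaks, flank_bp=100):
    """site_to_context_relative_density from site and flanking break counts."""
    if flank_breaks == 0:
        return None  # assumption: undefined when the flanks hold no breaks
    site_density = count / width
    context_density = flank_breaks / flank_bp  # +/-50 bp flanks = 100 bp
    return site_density / context_density


def per_million(count, total_breaks):
    """Normalise a break count per million total breaks."""
    return count * 1_000_000 / total_breaks


def passes_cutoff(row, cutoff=80.0):
    """Apply the suggested >= 80% cutoff to one row (a dict of strings)."""
    return float(row["break_site_probability_score"]) >= cutoff


# Made-up numbers for illustration only.
print(strand_ratio(30, 10))            # 3.0
print(strand_ratio(6, 6))              # 1.0, equal strands give the minimum
print(relative_density(40, 4, 50))     # (40 / 4) / (50 / 100) = 20.0
print(per_million(40, 2_000_000))      # 20.0
print(passes_cutoff({"break_site_probability_score": "85.5"}))  # True
```

The normalised sample-to-control ratio then follows by dividing two `per_million` values, one for the treated sample and one for the control.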