8.1 Set Up Executions

Steps

Select a reference genome:

Standard Reference Genome

Toggle the button and select from hg19, hg38, or t2t.

Genome Selection panel showing the Standard Reference Genome toggle and hg38 selected

Custom Genome

Upload the following files to the LatchBio Data tab before proceeding (mounting an AWS or GCP bucket is recommended):

| File | Format | Required? |
| --- | --- | --- |
| BWA Index Directory | Directory containing the genome indexed with `bwa index` | Required |
| Custom genome sequence | FASTA format | Required |
| Chromosome Sizes | BED file: `chr.. 0 chromosome_length` | Required |
| Gene Annotations | BED file: `chr.. start end gene_name` | Optional |
| Repeat Mask | BED file: `chr.. start end repeat_name` | Optional |
| Exclude List | BED file: `chr.. start end`; the file name must end with `.excludelist.bed` | Optional |

Warning

Contig names in the genome must start with `chr`.

Custom Genome input panel showing all mandatory and optional file fields
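
The Chromosome Sizes file and the `chr` naming requirement can both be checked against the custom FASTA before upload. The following is a minimal Python sketch and is not part of the workflow: the helper name `chrom_sizes_bed`, the tab-separated output, and the choice to raise an error on a contig that does not start with `chr` are assumptions made for illustration.

```python
import io


def chrom_sizes_bed(fasta_handle):
    """Return BED lines `name<TAB>0<TAB>length`, one per contig in a FASTA stream.

    Hypothetical helper for illustration only. Raises ValueError if a
    contig name does not start with "chr".
    """
    sizes = []  # [contig_name, length] pairs, in file order
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The contig name is the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            if not name.startswith("chr"):
                raise ValueError(f"contig name does not start with 'chr': {name!r}")
            sizes.append([name, 0])
        elif sizes:
            # Sequence line: add its length to the current contig.
            sizes[-1][1] += len(line)
    return [f"{name}\t0\t{length}" for name, length in sizes]


if __name__ == "__main__":
    demo = io.StringIO(">chr1 demo\nACGT\nAC\n>chr2\nGGG\n")
    print("\n".join(chrom_sizes_bed(demo)))
```

For a real genome, pass an open file handle instead of the in-memory demo string and write the returned lines to a `.bed` file.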

Under Output Parameters, click Select Folder and choose an output folder for results. For recommended folder organisation, see Example Folder Structure.
Enter an execution name. Only alphanumeric characters, underscores (_), and hyphens (-) are permitted. Date and time are added automatically.
Configure Advanced Parameters (optional):
Read Filtering
- Minimum read threshold: Minimum number of reads required for inclusion in downstream analysis.
- MAPQ threshold: By default, reads with MAPQ below 30 are removed. The threshold can be lowered to retain reads in repetitive regions, where MAPQ values may fall below 30, but this is generally not recommended because it increases the risk of false positives.
ExAmp / Optical Duplicate Parameters
- Duplicate detection distance: Detection distances have been increased to match the NextSeq 2000 flow cell characteristics. Override with the `clumpify_dupedist` and `markdup_dist` parameters if required.
- ExAmp duplicate filtering: Optional step to increase ExAmp duplicate artefact removal stringency. Non-reproducible single-stranded sites are excluded by default when enabled. The minimum break count for site removal at this step is configurable (default: 5).
Select Launch Workflow to start the run.
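
The execution-name rule from the steps above can be checked locally before a name is entered. This is a small Python sketch under two assumptions that the text does not state: that "alphanumeric" means ASCII letters and digits, and that an empty name is rejected. The function name `is_valid_execution_name` is hypothetical.

```python
import re

# Permitted characters: letters, digits, underscores, and hyphens.
# Assumption: "alphanumeric" is interpreted as ASCII only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_execution_name(name: str) -> bool:
    """Return True if `name` is non-empty and uses only permitted characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["run_01-hg38", "run 01", "run.01"]:
        print(candidate, is_valid_execution_name(candidate))
```

Because the date and time are added automatically, the name itself does not need a timestamp.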

Workflow setup showing sample manifest rows and nuclease manifest rows loaded, ready to launch

Complete parametrised workflow with genome selection, output folder, and Launch Workflow button visible
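
To make the MAPQ threshold under Read Filtering concrete, the sketch below applies a cutoff of 30 to SAM-formatted text lines. It only illustrates what the threshold means. The workflow's own filtering implementation is not described in this section, and the function name `filter_by_mapq` and the demo records are invented for the example.

```python
def filter_by_mapq(sam_lines, min_mapq=30):
    """Yield SAM header lines and alignment records with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # MAPQ is the fifth mandatory field of a SAM alignment record.
        if len(fields) >= 5 and int(fields[4]) >= min_mapq:
            yield line


if __name__ == "__main__":
    records = [
        "@SQ\tSN:chr1\tLN:1000",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t0\tchr1\t200\t12\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    for kept in filter_by_mapq(records):
        print(kept)
```

With the default cutoff, the demo read `r2` (MAPQ 12) is dropped. Passing a lower `min_mapq` would retain it, which mirrors the trade-off described for repetitive regions.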