Alignment with BWA and Bowtie

Bioinformatics
bwa
bowtie2
alignment
bwt
Fast read alignment to a reference genome via Burrows-Wheeler Transform indexing
Published

April 17, 2026

Introduction

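BWA-MEM and Bowtie2 are two of the most widely used short-read aligners for DNA. Both index the reference genome with an FM-index built on the Burrows-Wheeler transform (BWT), which allows fast exact matching of read substrings; those exact matches then seed approximate alignment. Production-scale alignment is usually run at the command line; R wrappers such as Rbowtie2 exist for convenience.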
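
To make the indexing idea concrete, here is a minimal base-R sketch of the Burrows-Wheeler transform and of counting exact matches by backward search. It is a toy illustration only: the function names `bwt` and `count_matches` are made up for this example, the transform is built naively by sorting all rotations, and real aligners use compressed indexes with sampled lookup tables rather than anything like this.

```r
# Naive Burrows-Wheeler transform: sort all rotations, keep the last column.
bwt <- function(s) {
  s <- paste0(s, "$")                      # "$" marks the end of the text
  n <- nchar(s)
  rotations <- vapply(seq_len(n), function(i) {
    paste0(substr(s, i, n), substr(s, 1, i - 1))
  }, character(1))
  # method = "radix" sorts in the C locale, so "$" sorts before A, C, G, T
  sorted <- sort(rotations, method = "radix")
  paste(substr(sorted, n, n), collapse = "")
}

# Count exact occurrences of `pattern` in the text using only its BWT.
count_matches <- function(bwt_string, pattern) {
  last <- strsplit(bwt_string, "")[[1]]    # last column of the sorted rotations
  first <- sort(last, method = "radix")    # first column
  symbols <- unique(first)
  # smaller[c]: number of characters in the text that sort before c
  smaller <- match(symbols, first) - 1
  names(smaller) <- symbols
  # occ(c, i): occurrences of c among the first i characters of the BWT
  occ <- function(ch, i) sum(last[seq_len(i)] == ch)

  lo <- 1
  hi <- length(last)
  # Process the pattern right to left, narrowing the range of matching rows
  for (ch in rev(strsplit(pattern, "")[[1]])) {
    if (!(ch %in% symbols)) return(0)
    lo <- smaller[[ch]] + occ(ch, lo - 1) + 1
    hi <- smaller[[ch]] + occ(ch, hi)
    if (lo > hi) return(0)
  }
  hi - lo + 1
}

bwt("GATTACA")                        # "ACTGA$TA"
count_matches(bwt("GATTACA"), "TA")   # 1
count_matches(bwt("GATTACA"), "A")    # 3
```

The search cost depends on the pattern length rather than on scanning the whole reference, which is why this style of index suits matching millions of short reads.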

Prerequisites

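FASTQ files with the reads and a reference genome in FASTA format. The R examples use the Bioconductor packages Rbowtie2 and Rsamtools.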

Theory

BWA-MEM: seeds local alignments with maximal exact matches against the index, then extends the seeds with Smith-Waterman dynamic programming. A common default for whole-genome DNA sequencing.
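
The extension step can be illustrated with a small Smith-Waterman scorer in base R. This is a sketch under simplifying assumptions, not BWA-MEM's implementation: the function name `smith_waterman_score` and the scoring values are chosen for illustration, a single linear gap penalty is used where production aligners use affine gaps and banding, and only the best local score is returned, not the alignment itself.

```r
# Best local alignment score between two sequences (Smith-Waterman,
# linear gap penalty). Scores are illustrative, not an aligner's defaults.
smith_waterman_score <- function(a, b, match_score = 2, mismatch = -1, gap = -2) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  # H[i + 1, j + 1] holds the best score of a local alignment ending at x[i], y[j]
  H <- matrix(0, nrow = length(x) + 1, ncol = length(y) + 1)
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      s <- if (x[i] == y[j]) match_score else mismatch
      H[i + 1, j + 1] <- max(
        0,                    # start a new local alignment here
        H[i, j] + s,          # align x[i] with y[j]
        H[i, j + 1] + gap,    # x[i] aligned to a gap
        H[i + 1, j] + gap     # y[j] aligned to a gap
      )
    }
  }
  max(H)
}

smith_waterman_score("ACGT", "ACGT")         # 8
smith_waterman_score("TTACGTTT", "GGACGGG")  # 6, from the shared "ACG"
```

Because the score can never drop below zero, poorly matching read ends are simply left out of the alignment, which is what makes the alignment local.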

Bowtie2: aligns end-to-end (the default) or locally (--local, which allows read ends to be soft-clipped). A common choice for ChIP-seq and ATAC-seq.

Both write SAM, which is usually converted to sorted, indexed BAM for downstream use.
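
A SAM alignment record is one tab-separated line whose second field, FLAG, packs several yes/no properties into a single integer. The base-R sketch below pulls out the leading fields and decodes the flag bits defined in the SAM specification. The helper names `parse_sam_line` and `decode_flag`, the short labels given to the bits, and the example record are all invented for illustration.

```r
# Extract the first six mandatory fields of one SAM alignment line.
parse_sam_line <- function(line) {
  f <- strsplit(line, "\t", fixed = TRUE)[[1]]
  list(qname = f[1], flag = as.integer(f[2]), rname = f[3],
       pos = as.integer(f[4]), mapq = as.integer(f[5]), cigar = f[6])
}

# Translate a FLAG value into the names of the bits that are set.
decode_flag <- function(flag) {
  bits <- c(paired = 1L, proper_pair = 2L, unmapped = 4L, mate_unmapped = 8L,
            reverse = 16L, mate_reverse = 32L, first_in_pair = 64L,
            second_in_pair = 128L, secondary = 256L, qc_fail = 512L,
            duplicate = 1024L, supplementary = 2048L)
  names(bits)[bitwAnd(flag, bits) != 0]
}

# Made-up record for demonstration
rec <- parse_sam_line("read1\t99\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tFFFF")
rec$pos                 # 100
decode_flag(rec$flag)   # "paired" "proper_pair" "mate_reverse" "first_in_pair"
```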

Assumptions

Reads are short (up to roughly 300 bp), and the reference matches the sequenced organism and genome build.

R Implementation

library(Rbowtie2); library(Rsamtools)

# Build index (one-off)
# bowtie2_build(references = "reference.fa", bt2Index = "idx")

# Align
# bowtie2(bt2Index = "idx", samOutput = "aln.sam",
#         seq1 = "reads.fastq", overwrite = TRUE)

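# Convert SAM to sorted, indexed BAM
# (asBam sorts and indexes by default, writing aln_sorted.bam and aln_sorted.bam.bai)
# asBam("aln.sam", destination = "aln_sorted")
# Only needed if asBam was called with indexDestination = FALSE
# indexBam("aln_sorted.bam")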

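# Read the alignments overlapping chr1:1-1000
# (the sequence name must match a name in the reference, e.g. "chr1" vs "1")
# scanBam("aln_sorted.bam", param = ScanBamParam(which = GRanges("chr1", IRanges(1, 1000))))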

Output & Results

A BAM file of aligned reads, plus a flagstat summary (total, mapped, and duplicate read counts). The duplicate count is meaningful only after duplicates have been marked.
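
The counts in such a summary come straight from the FLAG field of each record. The sketch below tallies them from a plain integer vector in base R. `flag_summary` is a hypothetical helper, the example flags are made up, and the result only approximates what samtools flagstat reports, which covers more categories.

```r
# Flagstat-style counts from a vector of SAM FLAG values.
flag_summary <- function(flags) {
  has <- function(bit) bitwAnd(flags, bit) != 0
  c(total      = length(flags),                   # every record
    primary    = sum(!has(256L) & !has(2048L)),   # not secondary or supplementary
    mapped     = sum(!has(4L)),                   # unmapped bit not set
    duplicates = sum(has(1024L)))                 # marked as duplicate
}

# Made-up flags: a proper pair, an unmapped read, a duplicate, a secondary hit
flags <- c(99L, 147L, 4L, 1040L, 256L)
flag_summary(flags)
# total = 5, primary = 4, mapped = 4, duplicates = 1
```

With Rsamtools, the flags of a real file can be read with `scanBam("aln_sorted.bam", param = ScanBamParam(what = "flag"))[[1]]$flag` and passed to the same function.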

Interpretation

“BWA-MEM achieved 97 % alignment rate on a 30x human WGS library, with <1 % ambiguously mapped reads.”

Practical Tips

  • Use minimap2 for long reads (PacBio, Nanopore).
  • Report alignment statistics (flagstat) after alignment.
  • Index BAM files for downstream tools.
  • Deduplicate after alignment (Picard MarkDuplicates, samtools markdup).
  • For paired-end data, pass both mate files (seq1 and seq2 in Rbowtie2; -1 and -2 on the bowtie2 command line).