Genome 541 Spring 2023 - Introduction to Computational Molecular Biology

Cancer Genomics Module

GENOME 541 A Sp 23 Course Website

The Cancer Genomics Module is the third module in the course and consists of 4 lectures.
Homework #7 and #8 accompany this module.

Schedule of lectures and homework

Dates: May 9 - May 18

Times: Tuesday & Thursday @ 10:30am - 11:50am

Location: Foege S110

Instructor: Gavin Ha

Date | Lecture Title | Lecture Slides | Homework Assigned
May 9 | Introduction to Cancer Genome Analysis | Lecture 1 | -
May 11 | Probabilistic Methods for Mutation Detection | Lecture 2 | Homework 7
May 16 | Probabilistic Methods for Copy Number Alteration Detection | Lecture 3 | -
May 18 | Allelic copy number, Tumor heterogeneity, Mutation power analysis, Structural variation in cancer | Lecture 4 | Homework 8


Homework, files, and due dates

Homework #7: Single nucleotide variant genotyping
Due date: May 19, 11:59pm
Files:
  1. Assignment
  2. R Markdown template
  3. Python Jupyter notebook template
  4. Homework7_alleleCounts.txt

Homework #8: Profiling copy number alterations
Due date: May 26, 11:59pm
Files:
  1. Assignment
  2. R Markdown template
  3. Python Jupyter notebook template
  4. Homework8_log2ratios_chr1.txt

Module Outline

Lecture 1: Introduction to cancer genome analysis

Lecture 1 Slides

  1. Background on Cancer Genome Alterations
    • Genomic alterations in cancer: drivers vs passengers, somatic vs germline
    • Tumor evolution and heterogeneity
  2. Overview of Cancer Genome Analysis
    • Computational strategy and workflow
    • Tumor DNA Sequencing
    • Types of genomic alterations predicted from tumor sequencing
    • Methods/tools/algorithms in following lectures
  3. Primer on statistical modeling
    • Probability distribution, Bayesian statistics, inference

Lecture 2: Probabilistic methods for mutation detection

Lecture 2 Slides

  1. Primer on statistical modeling (cont’d)
    • Mixture models and inference using the EM algorithm
  2. Detecting Mutations in Cancer Genomes
    • Visualizing somatic vs germline SNVs
    • Sequencing read count data
    • SNV genotyping strategy
  3. Mixture Models for SNV Detection
    • SNVMix probabilistic model and EM inference
    • Predicting somatic SNVs in cancer
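
As a rough illustration of the mixture-model idea outlined for this lecture, the sketch below fits a three-component binomial mixture to reference-allele read counts with EM and assigns each position to its most probable genotype (AA, AB, BB). It is a simplified stand-in, not the SNVMix implementation or the homework solution: it uses plain maximum-likelihood updates with a small smoothing constant instead of prior distributions, and the function names, starting parameters, and toy counts are all assumptions made for this example.

```python
import math

def log_binom_pmf(k, n, p):
    """Log probability of k reference reads out of n under Binomial(n, p)."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)

def em_binomial_mixture(ref_counts, depths, mu, pi, n_iter=50):
    """Fit genotype-specific reference-allele probabilities (mu) and
    genotype weights (pi) by EM. Returns (mu, pi, responsibilities)."""
    mu, pi = list(mu), list(pi)
    n_states = len(mu)
    n_sites = len(ref_counts)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each genotype at each position.
        resp = []
        for a, n in zip(ref_counts, depths):
            logw = [math.log(pi[k]) + log_binom_pmf(a, n, mu[k])
                    for k in range(n_states)]
            m = max(logw)  # subtract the max for numerical stability
            w = [math.exp(x - m) for x in logw]
            total = sum(w)
            resp.append([x / total for x in w])
        # M-step: re-estimate weights and binomial parameters.
        for k in range(n_states):
            r_k = sum(r[k] for r in resp)
            # Small smoothing constant keeps every weight strictly positive.
            pi[k] = (r_k + 1e-3) / (n_sites + n_states * 1e-3)
            num = sum(r[k] * a for r, a in zip(resp, ref_counts))
            den = sum(r[k] * n for r, n in zip(resp, depths))
            if den > 0:
                # Clamp away from 0 and 1 so the logs stay finite.
                mu[k] = min(max(num / den, 1e-6), 1 - 1e-6)
    return mu, pi, resp

def call_genotypes(resp):
    """Index of the most probable genotype at each position."""
    return [max(range(len(r)), key=lambda k: r[k]) for r in resp]

# Toy data (made up for illustration): reference read counts and depths.
toy_ref = [30, 29, 15, 14, 1, 0, 30, 16]
toy_depth = [30] * len(toy_ref)
mu_hat, pi_hat, resp = em_binomial_mixture(
    toy_ref, toy_depth, mu=[0.99, 0.5, 0.01], pi=[1 / 3, 1 / 3, 1 / 3])
print(call_genotypes(resp))  # 0 = AA, 1 = AB, 2 = BB
```

The responsibilities computed in the E-step are the per-position genotype posteriors, so a somatic call would additionally require comparing the tumor genotype against a matched normal sample, which this sketch does not attempt.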

Lecture 3: Probabilistic methods for copy number alteration detection

Lecture 3 Slides

  1. Detecting Copy Number Alterations in Cancer Genomes
    • Predicting copy number features from sequence data
    • Copy number analysis workflow
    • Data normalization
  2. Continuous Hidden Markov Model (HMM)
    • Graphical model representation
    • Components of a continuous HMM
    • Inference & parameter estimation using expectation-maximization (EM)
  3. Copy Number Profiling using a Hidden Markov Model
    • Probabilistic model for copy number analysis
    • Predicting copy number segments using the Viterbi algorithm
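
To make the last item above more concrete, here is a minimal sketch of Viterbi decoding for an HMM with Gaussian emissions, applied to a short series of log2 ratios. Everything specific in it is an assumption for illustration: the three states (loss, neutral, gain), their means and standard deviations, the transition probabilities, and the toy data. In practice the parameters would be estimated with EM rather than fixed by hand, and this is not the model used in the lecture or in Homework 8.

```python
import math

def log_normal_pdf(x, mean, sd):
    """Log density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def viterbi(obs, means, sds, trans, init):
    """Most probable state path for a Gaussian-emission HMM (log space)."""
    if not obs:
        return []
    n_states, n_obs = len(means), len(obs)
    # delta[t][k]: best log probability of any path ending in state k at t.
    delta = [[0.0] * n_states for _ in range(n_obs)]
    # back[t][k]: previous state on that best path.
    back = [[0] * n_states for _ in range(n_obs)]
    for k in range(n_states):
        delta[0][k] = math.log(init[k]) + log_normal_pdf(obs[0], means[k], sds[k])
    for t in range(1, n_obs):
        for k in range(n_states):
            best_j = max(range(n_states),
                         key=lambda j: delta[t - 1][j] + math.log(trans[j][k]))
            back[t][k] = best_j
            delta[t][k] = (delta[t - 1][best_j] + math.log(trans[best_j][k])
                           + log_normal_pdf(obs[t], means[k], sds[k]))
    # Trace back from the best final state.
    path = [0] * n_obs
    path[-1] = max(range(n_states), key=lambda k: delta[-1][k])
    for t in range(n_obs - 2, -1, -1):
        path[t] = back[t + 1][path[t + 1]]
    return path

# Hypothetical parameters: states 0 = loss, 1 = neutral, 2 = gain.
means = [-0.5, 0.0, 0.5]
sds = [0.2, 0.2, 0.2]
trans = [[0.98, 0.01, 0.01],
         [0.01, 0.98, 0.01],
         [0.01, 0.01, 0.98]]
init = [1 / 3, 1 / 3, 1 / 3]

# Toy log2 ratios (made up for illustration).
log2_ratios = [0.02, -0.05, 0.01, 0.55, 0.48, 0.60, 0.52,
               -0.03, 0.04, 0.00, -0.45, -0.55, -0.50]
print(viterbi(log2_ratios, means, sds, trans, init))
```

Runs of identical states in the returned path correspond to copy number segments. The high self-transition probability is what discourages the path from switching state on a single noisy bin.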

Lecture 4: Additional topics

Lecture 4 Slides

  1. Additional Copy Number Analysis Features
    • Allelic copy number analysis
  2. Estimating tumor heterogeneity
    • Modeling tumor-normal admixture
    • Modeling tumor clonality and heterogeneity
  3. Assessing Statistical Power for Variant Discovery
    • Power analysis
    • Calibrating sequencing depth for variant discovery
  4. Structural Rearrangement Analysis in Cancer Genomes
    • Structural variant types predicted from sequencing analysis
    • Complex genomic structural rearrangements
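
The tumor-normal admixture and power analysis topics above can be connected with a small calculation. The sketch below is one simplified way to do it, under assumptions that are not taken from the lecture: a fixed sequencing depth, no sequencing errors, a binomial model for variant read counts, and a variant being "detected" when at least a chosen number of variant reads is seen. The function names and default thresholds are hypothetical.

```python
import math

def expected_vaf(purity, tumor_cn=2, multiplicity=1, normal_cn=2):
    """Expected variant allele fraction of a clonal somatic mutation in a
    sample that mixes tumor cells (fraction = purity) with normal cells."""
    mutant_copies = purity * multiplicity
    total_copies = purity * tumor_cn + (1 - purity) * normal_cn
    return mutant_copies / total_copies

def detection_power(depth, vaf, min_alt_reads=3):
    """Probability of seeing at least min_alt_reads variant reads at the
    given depth, assuming variant reads ~ Binomial(depth, vaf)."""
    p_fewer = sum(math.comb(depth, k) * vaf ** k * (1 - vaf) ** (depth - k)
                  for k in range(min_alt_reads))
    return 1 - p_fewer

def min_depth_for_power(vaf, target_power=0.8, min_alt_reads=3, max_depth=10000):
    """Smallest depth reaching the target power, or None if not found."""
    for depth in range(min_alt_reads, max_depth + 1):
        if detection_power(depth, vaf, min_alt_reads) >= target_power:
            return depth
    return None

# Example: heterozygous clonal mutation, diploid region, 50% tumor purity.
vaf = expected_vaf(purity=0.5)
print(vaf, min_depth_for_power(vaf))
```

Lower purity or subclonal mutations reduce the expected variant allele fraction, which in this model raises the depth needed to reach the same power.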