Genome 541 - Introduction to Computational Molecular Biology

Cancer Genomics Module


The Cancer Genomics Module is the third module in the course and consists of four lectures.
Homework #5 and Homework #6 accompany this module.

Schedule of lectures and homework

Dates: April 28 - May 7

Times: Tuesdays and Thursdays, 10:30am - 11:50am

TA: Anna-Lisa Doebley

Date | Lecture Title | Lecture Slides | Homework Assigned
April 28 | Introduction to Cancer Genome Analysis | Lecture 1 | Homework 5
April 30 | Probabilistic Methods for Mutation Detection | Lecture 2 | -
May 5 | Probabilistic Methods for Copy Number Alteration Detection | Lecture 3 | Homework 6
May 7 | Additional Topics: Tumor Heterogeneity, Mutation Power Analysis, Structural Variants in Cancer | Lecture 4 | -


Homework files and due dates

Homework #5: Single nucleotide variant genotyping
  Due Date: May 8, 11:59pm
  Files:
    1. Assignment
    2. R Markdown template
    3. Python Jupyter notebook template
    4. Homework5_alleleCounts.txt

Homework #6: Profiling copy number alterations
  Due Date: May 15, 11:59pm
  Files:
    1. Assignment
    2. R Markdown template
    3. Python Jupyter notebook template
    4. Homework6_log2ratios_chr1.txt

Module Outline

Lecture 1: Introduction to cancer genome analysis

Lecture 1 Slides

  1. Background on Cancer Genome Alterations
    • Genomic alterations in cancer: drivers vs passengers, somatic vs germline
    • Tumor evolution and heterogeneity
  2. Overview of Cancer Genome Analysis
    • Computational strategy and workflow
    • Tumor DNA Sequencing
    • Types of genomic alterations predicted from tumor sequencing
    • Methods/tools/algorithms in following lectures
  3. Primer on Statistical Modeling
    • Probability distributions, Bayesian statistics, and inference
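
As a small illustration of the Bayesian inference named in the primer, the sketch below computes a posterior over three genotypes from the read counts at a single position. The binomial likelihood, the values in MU and PRIOR, and the function names are assumptions chosen for illustration; they are not taken from the lecture slides.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def genotype_posterior(ref_count, depth, mu, prior):
    """Bayes' rule: posterior is proportional to likelihood times prior."""
    joint = [binom_pmf(ref_count, depth, m) * p for m, p in zip(mu, prior)]
    total = sum(joint)  # marginal probability of the data
    return [j / total for j in joint]

GENOTYPES = ["AA", "AB", "BB"]   # A = reference allele, B = variant allele
MU = [0.99, 0.5, 0.01]           # assumed P(read shows reference | genotype)
PRIOR = [0.98, 0.01, 0.01]       # assumed prior genotype probabilities

# Hypothetical position: 12 reference reads out of 30 total reads.
posterior = genotype_posterior(ref_count=12, depth=30, mu=MU, prior=PRIOR)
for genotype, prob in zip(GENOTYPES, posterior):
    print(f"P({genotype} | data) = {prob:.4g}")
```

Even with a prior that strongly favors the homozygous reference genotype, 12 reference reads out of 30 are far better explained by the heterozygous genotype, so the posterior concentrates on AB.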

Lecture 2: Probabilistic methods for mutation detection

Lecture 2 Slides

  1. Primer on Statistical Modeling (cont’d)
    • Mixture models and inference using the EM algorithm
  2. Detecting Mutations in Cancer Genomes
    • Visualizing somatic vs germline SNVs
    • Sequencing read count data
    • SNV genotyping strategy
  3. Mixture Models for SNV Detection
    • SNVMix probabilistic model and EM inference
    • Predicting somatic SNVs in cancer
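
The mixture-model and EM topics in this lecture can be sketched with a three-component binomial mixture over reference-read counts. This is a simplified maximum-likelihood version written for illustration: it is not the SNVMix implementation, it omits any priors on the parameters, and the toy counts, starting values, and function names are all assumptions.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def em_binomial_mixture(counts, depths, mu, pi, n_iter=50, eps=1e-3):
    """Fit mixture weights (pi) and binomial parameters (mu) by EM."""
    mu, pi = list(mu), list(pi)
    n_comp = len(mu)
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each genotype for each position.
        resp = []
        for a, n in zip(counts, depths):
            joint = [pi[k] * binom_pmf(a, n, mu[k]) for k in range(n_comp)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from responsibility-weighted counts.
        for k in range(n_comp):
            pi[k] = sum(r[k] for r in resp) / len(counts)
            num = sum(r[k] * a for r, a in zip(resp, counts))
            den = sum(r[k] * n for r, n in zip(resp, depths))
            if den > 0:
                # Keep mu away from exactly 0 or 1 to avoid degenerate fits.
                mu[k] = min(max(num / den, eps), 1 - eps)
    return mu, pi, resp

# Hypothetical data: reference-read counts and total depths at 10 positions.
ref_counts = [30, 29, 31, 28, 15, 14, 16, 0, 1, 30]
depths = [30, 30, 32, 28, 30, 29, 31, 30, 30, 30]

mu_hat, pi_hat, resp = em_binomial_mixture(
    ref_counts, depths, mu=[0.9, 0.5, 0.1], pi=[1 / 3, 1 / 3, 1 / 3]
)
# Call each position as the genotype with the largest responsibility.
calls = [max(range(3), key=lambda k: r[k]) for r in resp]
print("mu:", [round(m, 3) for m in mu_hat])
print("pi:", [round(p, 3) for p in pi_hat])
print("calls (0=AA, 1=AB, 2=BB):", calls)
```

Component order is fixed by the starting values of mu, so index 0 stays the homozygous reference component only because it starts near 1.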

Lecture 3: Probabilistic methods for copy number alteration detection

Lecture 3 Slides

  1. Detecting Copy Number Alterations in Cancer Genomes
    • Predicting copy number features from sequence data
    • Copy number analysis workflow
    • Data normalization
  2. Continuous Hidden Markov Model (HMM)
    • Graphical model representation
    • Components of a continuous HMM
    • Inference & parameter estimation using expectation-maximization (EM)
  3. Copy Number Profiling using a Hidden Markov Model
    • Probabilistic model for copy number analysis
    • Predicting copy number segments using the Viterbi algorithm
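
The Viterbi step listed above can be sketched for a three-state HMM with Gaussian emissions over log2 ratios. The state means (for a pure tumor sample against a diploid baseline), the shared standard deviation, the transition probabilities, and the toy observations are assumptions for illustration; in the lecture's setting these parameters would be estimated with EM rather than fixed by hand.

```python
from math import log, log2, pi as PI

def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * log(2 * PI * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def viterbi(obs, means, sd, log_init, log_trans):
    """Most probable state path for a Gaussian-emission HMM (log space)."""
    n_states = len(means)
    score = [log_init[k] + log_gauss(obs[0], means[k], sd) for k in range(n_states)]
    backpointers = []
    for x in obs[1:]:
        new_score, ptr = [], []
        for k in range(n_states):
            # Best previous state for arriving in state k.
            best_j = max(range(n_states), key=lambda j: score[j] + log_trans[j][k])
            ptr.append(best_j)
            new_score.append(
                score[best_j] + log_trans[best_j][k] + log_gauss(x, means[k], sd)
            )
        score = new_score
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

STATES = ["loss", "neutral", "gain"]          # 1, 2, and 3 copies
MEANS = [log2(1 / 2), log2(2 / 2), log2(3 / 2)]  # assumed expected log2 ratios
STAY, SWITCH = 0.98, 0.01                     # assumed transition probabilities
log_trans = [
    [log(STAY if i == j else SWITCH) for j in range(3)] for i in range(3)
]
log_init = [log(1 / 3)] * 3

# Hypothetical log2 ratios for 12 consecutive bins.
log2_ratios = [0.05, -0.02, 0.10, 0.61, 0.55, 0.70, 0.52,
               -0.95, -1.10, -0.90, 0.02, -0.04]
path = viterbi(log2_ratios, MEANS, sd=0.15, log_init=log_init, log_trans=log_trans)
print([STATES[k] for k in path])
```

Runs of identical states in the returned path correspond to copy number segments; the high self-transition probability discourages single-bin state changes.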

Lecture 4: Additional topics

Lecture 4 Slides

  1. Additional Copy Number Analysis Features
    • Allelic copy number analysis
  2. Estimating Tumor Heterogeneity
    • Modeling tumor-normal admixture
    • Modeling tumor clonality and heterogeneity
  3. Assessing Statistical Power for Variant Discovery
    • Power analysis
    • Calibrating sequencing depth for variant discovery
  4. Structural Rearrangement Analysis in Cancer Genomes
    • Structural variant types predicted from sequencing analysis
    • Complex genomic structural rearrangements
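
The power analysis and depth calibration topics can be illustrated with a simple binomial model: the probability of observing at least a minimum number of variant-supporting reads at a given depth. The model ignores sequencing error and false positives, and the purity value, read threshold, target power, and function names are assumptions for illustration.

```python
from math import comb

def detection_power(depth, allele_fraction, min_variant_reads):
    """P(at least min_variant_reads variant reads) under Binomial(depth, allele_fraction)."""
    p_miss = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - p_miss

def min_depth_for_power(allele_fraction, min_variant_reads,
                        target_power=0.8, max_depth=2000):
    """Smallest depth reaching the target power, or None if not reached."""
    for depth in range(min_variant_reads, max_depth + 1):
        if detection_power(depth, allele_fraction, min_variant_reads) >= target_power:
            return depth
    return None

# Assumed scenario: clonal heterozygous mutation in a diploid region,
# so the expected variant allele fraction is purity / 2.
purity = 0.4
vaf = purity / 2

for depth in (10, 30, 60):
    print(f"depth {depth}: power = {detection_power(depth, vaf, 3):.3f}")
depth_needed = min_depth_for_power(vaf, min_variant_reads=3)
print("minimum depth for 80% power:", depth_needed)
```

Lower tumor purity or subclonal mutations reduce the expected allele fraction, which raises the depth needed to reach the same power.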