This course primarily prepares students to participate in collegiate programming competitions such as the ACM ICPC.

Launched in 1995 with the completion of the first genome sequence of a free-living organism, H. influenzae, the genomic era has produced thousands of complete genome sequences deposited in public databases, with many more genome projects at various stages of completion. The large-scale availability of genome data is revolutionizing biological and medical research, with data-driven computational approaches taking a central role. This course covers fundamental computational methods for genomic data analysis, with an emphasis on statistical methods and on current applications in genomics and genetic epidemiology. Topics include statistical modeling of biological sequences, probabilistic models of DNA and protein evolution, expectation-maximization and Gibbs sampling algorithms, and genomic sequence variation.

This course is an introduction to the fundamental mathematical models and algorithmic techniques used in bioinformatics. Emphasis will be placed on modeling computational problems arising in biology as graph-theoretic, statistical, or mathematical optimization problems, and on designing, analyzing, and implementing efficient algorithms for these problems. Algorithmic techniques covered will include exhaustive search, integer programming, greedy algorithms, dynamic programming, divide-and-conquer, graph algorithms, combinatorial pattern matching, clustering, and randomized algorithms. Biological applications covered will include motif finding, sequence assembly, pairwise sequence alignment, genome rearrangement analysis, gene expression analysis, and evolutionary tree reconstruction.