CS 795/BINF 730 (Biological Sequence Analysis) [Spring 2010]     

Class Information

Class Link: http://www.cs.gmu.edu/~hrangwal/node/20


Instructor:

Huzefa Rangwala, Room #4423 EB, rangwala@cs.gmu.edu

Class Time & Location:

Mon: 7:20-10:00pm, Robinson B 228

Text Book:

Understanding Bioinformatics by Zvelebil & Baum

Teaching Assistant:


Office Hours:

M: 4-6pm, EEB 4423

About the Course

Course Description

CS 795 (Biological Sequence Analysis) is an interdisciplinary course aimed at bridging the gap between biology and computer science by exposing students to the widely used algorithms and methods that play a key role in bioinformatics and computational biology. The Human Genome Project and advances in sequencing technologies have left us with a wealth of DNA, RNA, and protein sequence data. It is important to infer key characteristics of biological systems from this data using sequence analysis methods. The first half of the course will help students understand basic sequence alignment algorithms, hidden Markov models, and classification and prediction methods. The second half will apply these concepts to current bioinformatics applications, each motivated with sufficient biological background.

Course Prerequisites

Programming experience in a language of your choice. The class will cover the needed biology.

Course Outcomes

As an outcome of taking this class, a student will be able to

      Conceptualize and implement sequence alignment algorithms based on dynamic programming.

      Study the workings of large genomic sequence database search tools like FASTA and BLAST.

      Analyze the vast amount of genomic and proteomic data using machine learning and data mining tools (discriminative and generative models).

      Understand the theoretical aspects of Markov chains and hidden Markov models and their application to gene prediction, protein sequence annotation and multiple sequence alignment.

      Read research papers pertaining to bioinformatics and computational biology.

      Learn about new sequencing technologies and the development of short-read assembly algorithms.

Course Format

Lectures will be given by the instructor. Besides material from the textbook, topics not discussed in the book may also be covered. Research papers and handouts of material not covered in the book will be made available. Grading will be based on homework assignments, exams, and a project. Homework assignments will require some programming. Exams and homework assignments must be done on an individual basis. Any deviation from this policy will be considered a violation of the GMU Honor Code.

Tentative Class Topics

Sequence Alignment, Sequence Assembly, Markov Models, Genome Annotation, Short-Read Sequencing, Protein Structure and Function Prediction.


Grading

      3 Programming Assignments (50%)

      Exams (20%)

      Final Project (30%)