BNFO 601 
Integrated Bioinformatics
Scenarios
Fall 2004 
Identification of genes turned on by a transcription factor cascade during a developmental process

Scientific story (html)

In brief: You are studying the developmental process that leads to sporulation in Bacillus subtilis, an organism closely related to the bacterium that causes anthrax. You hope to learn which genes are expressed during different stages of sporulation. It so happens that you have a few B. subtilis microarrays and some half-finished software kicking about. Will they help?
Bioinformatic tools
Reverse Engineering
     Reverse engineer the output from Michael Eisen's Cluster program.
Cluster Analysis
     Apply the concepts of cluster analysis to the exploration of microarray datasets.
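
To make the idea of cluster analysis concrete, here is a minimal Perl sketch of average linkage clustering on a toy dataset. It is not Eisen's Cluster and not MyCluster.pl: the gene names and values in %expr are made up, Euclidean distance is an assumed metric, and the distance between clusters is taken as the mean of all pairwise distances between their members, which may differ from what either program does. The tree is held as nested array references and printed by a recursive subroutine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical toy data: gene name => log ratios at three time points
my %expr = (
    geneA => [ 1.0,  2.0, 3.0],
    geneB => [ 1.1,  2.1, 2.9],
    geneC => [-2.0, -1.0, 0.0],
    geneD => [-2.2, -0.9, 0.1],
);

# Euclidean distance between two expression profiles (an assumed metric)
sub distance {
    my ($p, $q) = @_;
    my $sum = 0;
    $sum += ($p->[$_] - $q->[$_]) ** 2 for 0 .. $#{$p};
    return sqrt($sum);
}

# Average linkage: mean of all pairwise distances between two clusters' members
sub cluster_distance {
    my ($c1, $c2) = @_;
    my ($total, $n) = (0, 0);
    for my $g1 (@{ $c1->{members} }) {
        for my $g2 (@{ $c2->{members} }) {
            $total += distance($expr{$g1}, $expr{$g2});
            $n++;
        }
    }
    return $total / $n;
}

# Recursively render the tree in nested-parenthesis form
sub tree_to_string {
    my ($node) = @_;
    return $node unless ref $node;    # leaf: a gene name
    return '(' . join(',', map { tree_to_string($_) } @{$node}) . ')';
}

# Start with one cluster per gene; each keeps its members and its subtree
my @clusters = map { +{ members => [$_], tree => $_ } } sort keys %expr;

while (@clusters > 1) {
    # Find the closest pair of clusters
    my ($best_i, $best_j, $best_d);
    for my $i (0 .. $#clusters - 1) {
        for my $j ($i + 1 .. $#clusters) {
            my $d = cluster_distance($clusters[$i], $clusters[$j]);
            ($best_i, $best_j, $best_d) = ($i, $j, $d)
                if !defined $best_d || $d < $best_d;
        }
    }
    # Remove the pair (higher index first) and replace it with a joined node
    my ($right) = splice(@clusters, $best_j, 1);
    my ($left)  = splice(@clusters, $best_i, 1);
    push @clusters, {
        members => [ @{ $left->{members} }, @{ $right->{members} } ],
        tree    => [ $left->{tree}, $right->{tree} ],
    };
}

print tree_to_string($clusters[0]{tree}), "\n";
# prints ((geneA,geneB),(geneC,geneD))
```

Recomputing every pairwise distance on each pass is slow but keeps the logic visible; a real implementation would typically cache a distance matrix.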
Molecular biology concepts
Regulation of gene expression
Spotted DNA microarray technology
Hierarchies in biological systems
Perl focus: Trees and Recursion.  Advanced Hashes.

Programs

        Cluster  (ZIP compressed installer)  Mike Eisen's implementation of a clustering program
        TreeView  (installer)  Mike Eisen's implementation of a program for visualizing clustered microarray data

        MyCluster.pl  A home-grown shell of a program that accomplishes part of the functionality of Eisen's Cluster
                     An oh-so-close-to-working implementation of average linkage clustering that aims
                     to produce output compatible with Eisen's TreeView. If only some nice, diligent
                     graduate student would come along and finish a couple of subroutines...
        MergeSort.pl  A subroutine for sorting a list  (for Monday)
                     Perl already knows how to sort a list, but this simple program illustrates how
                     recursion is a natural tool for solving problems that are naturally represented as a tree
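
As a rough illustration of the recursion that merge sort relies on, here is a minimal sketch. It is not the course's MergeSort.pl; the subroutine name merge_sort and the sample numbers are assumptions made for this example. Each call splits its list in half, sorts the halves by calling itself, and merges the results, so the calls form a binary tree whose leaves are single items.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Recursively sort a list of numbers in ascending order.
sub merge_sort {
    my @list = @_;
    return @list if @list < 2;    # base case: 0 or 1 items are already sorted

    # Split the list in half and sort each half recursively
    my $mid   = int(@list / 2);
    my @left  = merge_sort(@list[0 .. $mid - 1]);
    my @right = merge_sort(@list[$mid .. $#list]);

    # Merge the two sorted halves, always taking the smaller front item
    my @merged;
    while (@left && @right) {
        push @merged, ($left[0] <= $right[0]) ? shift(@left) : shift(@right);
    }
    # At most one half still has items, and they are already in order
    return (@merged, @left, @right);
}

my @sorted = merge_sort(5, 2, 9, 1, 5, 6);
print "@sorted\n";
# prints 1 2 5 5 6 9
```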

Data: Small data file for testing MyCluster.pl or exploring Cluster (ratiodata.txt)

Data: Real microarray data file with data from B. subtilis microarrays for use with MyCluster.pl  (BacillusData2.txt)
Data: Much bigger data file (from mouse microarrays) suitable for demonstrating Mike Eisen's Cluster (Final_with_zeros.txt)
Notes & Problem Set 
Introduction to clustering (Clustering_BNFO601.ppt)
Combined Notes and Problem Set (DOC)

External Links

        Michael Eisen's original paper describing clustering of microarray data (PNAS)