Bioinformatics Assignment for Developmental Biology (Bio 433)

Introduction

The exponential rise in genetic and genomic information has provided a host of powerful new tools for virtually every sub-discipline within biology. This is particularly true for developmental biology. The vast amount of sequence and expression information from a multitude of different organisms has enhanced our ability to identify developmental genes implicated in birth defects and developmental abnormalities, determine the downstream developmental and/or genetic effects of various teratogens and environmental contaminants, identify specific gene functions, and construct more accurate phylogenetic relationships. The goal of this assignment is to introduce you to a few of the many bioinformatics tools currently available on the Web. Detailed instructions are provided here, but the best way to familiarize yourself with these tools is to use and experiment with them on your own.

The Assignment: Phylogenetic and Functional Analysis of a Gene of Your Choice

Step #1: Select a gene. You may choose any gene that appeals to you: one that you read about in the news, one discussed in class or in the text (there is an interesting list of genes associated with human disease on pages 710-711 of your text), or one identified through online resources. Your gene of choice must have been cloned in at least five different organisms for you to use it in this assignment (most genes have been; if you have trouble, ask for help). In addition, your sequence MUST be either an mRNA or a cDNA (not a genomic sequence) for the phylogenetic comparisons to work (why?). Thus, the title of your gene's GenBank entry must contain CDS (coding sequence), mRNA, or cDNA.
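If you collect many candidate GenBank entries, the title requirement above can be checked with a few lines of Python. This is only a minimal sketch: the function name is made up for illustration, the example titles are hypothetical, and matching the keywords as whole words without regard to case is an assumption rather than a GenBank rule.

```python
import re

# Keywords taken from the assignment text. Matching them as whole words,
# ignoring case, is an assumption of this sketch.
TRANSCRIPT_KEYWORDS = ("CDS", "mRNA", "cDNA")


def is_transcript_entry(title):
    """Return True if a GenBank entry title mentions CDS, mRNA, or cDNA."""
    pattern = r"\b(" + "|".join(TRANSCRIPT_KEYWORDS) + r")\b"
    return re.search(pattern, title, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    titles = [
        "Example species gene X, mRNA",
        "Example species gene X, complete cds",
        "Example species chromosome 7 genomic sequence",
    ]
    for title in titles:
        print(title, "->", is_transcript_entry(title))
```

A title that fails this check is probably a genomic sequence and should be replaced with an mRNA or cDNA entry for the same gene.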

There are two main tools to help you in selecting a gene:

PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed)
This is the NCBI database of primary literature. Every published journal article relating to the molecular or biomedical field in the last 15 years is catalogued here and available for you to search. In the "Search" box, type in your query. A good place to start might be the name of a disease or disorder in which you are interested. This will bring up a list of journal articles that match your query. To view the abstract for a particular article, click on the blue link found over the authors' names (see attached sheet for more info on using PubMed). Remember to vary your search queries; if something doesn't come up on the first search, try using more or fewer terms to refine your search.

OMIM (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM)
This is the NCBI database of genetically inherited diseases and disorders. Every gene that has been implicated in human disease is in this database (see attached sheet for more info on using OMIM).

- You may search the "Gene Map," which will identify genes known to be associated with human disease. To do so, click on "Search Gene Map" in the left-hand toolbar, enter your search terms in the box provided, and click on "Find." You will obtain a list of matches. Clicking on a six-digit OMIM number (shown in blue) brings up a detailed description of what is known about the gene and its association with disease.

- You may also search the "Morbid Map," which will allow you to search by syndrome or disease. Most, but not all, of these genetic diseases are associated with a known gene. You must select one for which there is an associated gene. To do so, click on "Search Morbid Map" in the left-hand toolbar. This will bring up a screen with two options. You may either enter a search term in the box or scroll down the alphabetical list of diseases. Click on the OMIM number to bring up a complete description.

At the upper right of each OMIM entry you will see links to "Related Entries, PubMed, Protein, Nucleotide, LinkOut." To obtain a GenBank entry (that is, the actual sequence information), click on "Nucleotide."
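The advice above about varying your PubMed queries can also be followed from a script. The sketch below builds a series of progressively narrower queries and the corresponding search URLs. It assumes NCBI's E-utilities "esearch" interface, which is not mentioned in this handout, and the function names and the example disease are illustrative only. Nothing is sent over the network; the code only constructs the URLs.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (an assumption of this sketch; the
# assignment itself uses the web search forms).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(term, db="pubmed", retmax=20):
    """Build an esearch URL for a query term in the given Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


def query_variants(disease, extra_terms):
    """Return queries from broadest to narrowest, adding one term at a time."""
    variants = [disease]
    for term in extra_terms:
        variants.append(variants[-1] + " AND " + term)
    return variants


if __name__ == "__main__":
    # Hypothetical starting point: a disorder name plus refining terms.
    for query in query_variants("holoprosencephaly", ["gene", "mouse"]):
        print(build_search_url(query))
```

Starting with the broadest query and adding terms one at a time makes it easy to see which term narrowed the results too far.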

Step #2: Address the following questions regarding your gene in a written report.

[1]    Begin with a one paragraph introduction on the significance of the gene and why you selected it.

[2]    What type of molecule does this gene encode? (For example, a tyrosine kinase receptor or a transcription factor)
(Use PubMed to browse journal abstracts, or consult the GenBank entries)

[3]    In what organisms has the gene been cloned?

- Approach this in two ways. First use Entrez (from the NCBI homepage, click on "Entrez" in the toolbar at the top of the page, and choose "nucleotide" in the pulldown box to search) to enter the name of the gene (be sure to try both the acronym as well as the full, spelled-out name) and obtain a list of matches.

- Second, using BLAST (a program that compares sequences for similarity), determine which genes or sequences are similar to yours. To do this, go to the BLAST homepage http://www.ncbi.nlm.nih.gov/BLAST/. Click on "standard nucleotide-nucleotide BLAST [blastn]." To perform a search, use the standard parameters that come up (in other words, don't uncheck or change anything). Enter the accession number in the large "Search" box, and click on "BLAST!" A list of matches will then be displayed. (In general, if the region of overlap is quite short, that is, fewer than 50 bases, the match is not a homologous gene.) Keep a copy of this list for generating a phylogenetic tree later in the assignment.
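The 50-base rule of thumb above can be applied to your saved list of matches as follows. This is a minimal sketch that assumes you have copied each match into a small record by hand; the field names, accession labels, and overlap lengths are hypothetical and are not BLAST output.

```python
MIN_OVERLAP = 50  # bases; the rule of thumb given in the assignment


def filter_hits(hits, min_overlap=MIN_OVERLAP):
    """Keep matches whose region of overlap is at least min_overlap bases."""
    return [hit for hit in hits if hit["align_len"] >= min_overlap]


def one_per_organism(hits):
    """Keep only the longest-overlapping match for each organism."""
    best = {}
    for hit in hits:
        current = best.get(hit["organism"])
        if current is None or hit["align_len"] > current["align_len"]:
            best[hit["organism"]] = hit
    return list(best.values())


if __name__ == "__main__":
    # Hypothetical matches, typed in by hand from a results page.
    hits = [
        {"accession": "HYPO_001", "organism": "mouse", "align_len": 1200},
        {"accession": "HYPO_002", "organism": "mouse", "align_len": 300},
        {"accession": "HYPO_003", "organism": "zebrafish", "align_len": 42},
        {"accession": "HYPO_004", "organism": "chicken", "align_len": 950},
    ]
    for hit in one_per_organism(filter_hits(hits)):
        print(hit["organism"], hit["accession"], hit["align_len"])
```

Keeping one entry per organism gives a cleaner starting list for the phylogenetic tree, though a short overlap alone does not prove that two sequences are unrelated.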

[4]    Where is it expressed within the embryo? Is it expressed in the adult organism? Is the expression pattern conserved among all the organisms in which the gene has been cloned?
(Use PubMed to browse journal abstracts)

[5]    What is the role of this gene? That is, what function does it perform in the embryo and in the adult? Is the function conserved in different organisms?
(Use PubMed to browse journal abstracts)

[6]    Is the gene implicated in any human disease, birth defect, or syndrome? Is the nature of the mutation known? Is the defect present in other organisms as well?
(Use OMIM or PubMed to browse journal abstracts)

[7]    Using "Biology Workbench," reconstruct a putative phylogenetic relationship among the organisms selected by generating a dendrogram. Is the dendrogram consistent with the phylogeny based upon morphological characteristics? Why or why not? You should use ten different entries for this, if possible, from different types of organisms (see attached pages for information on using Biology Workbench).
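To get a feel for how a dendrogram is derived from sequences, the sketch below clusters a few toy sequences by average pairwise distance (UPGMA). UPGMA is used here as a simple stand-in and is not claimed to be the method Biology Workbench applies. The sequences are invented, and they are assumed to be already aligned to the same length.

```python
def p_distance(a, b):
    """Fraction of positions that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def upgma(seqs):
    """Return a nested-tuple tree for a dict of name -> aligned sequence."""
    clusters = {name: [name] for name in seqs}  # label -> member names
    names = list(seqs)
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist[frozenset((a, b))] = p_distance(seqs[a], seqs[b])

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        labels = list(clusters)
        best = None
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                d = dist[frozenset((a, b))]
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        members_a, members_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the merged cluster is the size-weighted average.
        na, nb = len(members_a), len(members_b)
        for c in clusters:
            da = dist[frozenset((a, c))]
            db = dist[frozenset((b, c))]
            dist[frozenset((merged, c))] = (na * da + nb * db) / (na + nb)
        clusters[merged] = members_a + members_b

    return next(iter(clusters))


if __name__ == "__main__":
    # Invented toy sequences, already aligned (10 bases each).
    toy = {
        "human": "ATGCATGCAT",
        "mouse": "ATGCATGCAA",
        "chick": "ATGGATCCAA",
        "fly": "TACGTTCCAA",
    }
    print(upgma(toy))
```

Each pair of parentheses in the result is one branch point, with the most similar sequences nested most deeply. Comparing this grouping with the grouping expected from morphology is the comparison that question [7] asks for.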

This assignment should be in the range of 4-6 double-spaced pages.