Bioinformatics - Lab 6
Phylogenetic Trees
David Gilbert
The aim of this lab is to give you practical experience with the concepts from
the lecture on phylogenetic trees.
Software
Polytree:
- You can access the polytree software on the web at
www.dina.dk/~sestoft/bsa/Match7Applet.html, or
- you can run it locally under Linux with
java -jar /users/students4/software/public/Bio4/bin/PolyTreeToy.jar
Note that
polytree does not implement neighbour-joining correctly: it displays the trees
as rooted, whereas neighbour-joining trees are unrooted.
Phylip:
- You can access Phylip as a web-service at
http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html
- You can run Phylip by typing the name of one of the modules,
e.g. neighbor.
Be careful to run these programs
from your own directory, since they need to write out files as they run!
You do not need to give any arguments to the programs: each one puts up a
(basic) dialogue for you to interact with.
A list of programs in the phylip package is at
http://evolution.genetics.washington.edu/phylip/programs.html
- If you are really keen, after the lab, you can download and install Phylip from
http://evolution.gs.washington.edu/phylip
Blast:
- Blast is installed in
/usr/local/lab/packages/blast-2.2.14/bin/; e.g.
you can run it by typing blastall
Exercises
(A) Identify the phylogeny of Globin Sequences (Beta-Chain).
Using Clustalw and Phylip to compute distance matrices and trees,
step by step:
- Download the globin sequences (amino-acids, in fasta format) from the
course website
- Run clustalw, using either the command-line version or the web-based version (see Lab 4).
E.g. (web-based):
- Open the page for clustalw. Upload the file with the help of the
"Browse..."-button.
- Use the default settings and make sure that the output format PHYLIP is
selected. (The PHYLIP programs in the later steps need this format.)
- Click on "Submit" and run the sequence alignment.
- Copy the alignment into a new text file and save it, e.g. as
'globins_all.phy'. (This will be the input file for the PHYLIP programs.)
- Go to the directory where you have saved the input file.
- Calculate the distance matrix for the input file. In your terminal,
invoke the protdist program by typing the command
protdist
- Enter the name of the input file (your alignment from clustalw) at the console,
e.g. 'globins_all.phy'
- protdist writes the distance matrix to a file called outfile.
- Rename this file (e.g. mv outfile outfile.matrix) so that it does not clash
with the file called outfile that the next program writes.
- Now we will produce neighbour-joining trees for the input sequences
using neighbor. Invoke the neighbor program by typing the command
neighbor
- Enter the matrix file name at the console (e.g. outfile.matrix).
- At the console you can choose whether to produce a neighbour-joining tree
or a UPGMA tree.
- neighbor writes two files: outfile (a text drawing of the tree)
and treefile (the tree with its branch lengths, in Newick format).
- You can plot an unrooted tree (for neighbour-joining) using
drawtree
or a rooted tree (for UPGMA) using
drawgram
Both programs require your treefile as input.
Note that both programs will also ask you for a 'fontfile'; the font files are
/users/students4/software/public/Bio4/bin/phylip3.65/font1
/users/students4/software/public/Bio4/bin/phylip3.65/font2
and so on, up to font6
- You can try to generate trees using the fitch or kitsch programs from the
phylip package. You can plot these using either the drawtree or drawgram
programs.
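To see roughly what happens when you pick UPGMA in neighbor, the following Python sketch may help. It is not PHYLIP's code: it is a minimal illustration that reads a square PHYLIP-style distance matrix (like the one protdist writes), repeatedly merges the two closest clusters, and returns the resulting rooted tree in Newick format. The function names parse_phylip_distances and upgma and the four-taxon matrix are made up for this example, and the parser assumes that sequence names contain no spaces.

```python
from itertools import combinations

# Hypothetical toy matrix in square PHYLIP layout (not real globin data).
EXAMPLE = """4
A 0.0 0.2 0.6 0.6
B 0.2 0.0 0.6 0.6
C 0.6 0.6 0.0 0.4
D 0.6 0.6 0.4 0.0
"""


def parse_phylip_distances(text):
    """Parse a square distance matrix; assumes names contain no whitespace.

    Token-based parsing, so rows wrapped over several lines are handled too.
    """
    tokens = text.split()
    n = int(tokens[0])
    names, rows = [], []
    pos = 1
    for _ in range(n):
        names.append(tokens[pos])
        rows.append([float(t) for t in tokens[pos + 1:pos + 1 + n]])
        pos += 1 + n
    return names, rows


def upgma(names, rows):
    """Cluster with UPGMA and return the rooted tree as a Newick string."""
    # cluster id -> (Newick text, number of leaves, height above the leaves)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    dist = {frozenset((i, j)): rows[i][j]
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        a, b = sorted(pair)
        (na, sa, ha), (nb, sb, hb) = clusters.pop(a), clusters.pop(b)
        height = dist[pair] / 2                 # new node sits halfway
        newick = f"({na}:{height - ha:.4f},{nb}:{height - hb:.4f})"
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        merged = {}
        for c in clusters:
            d_ac = dist[frozenset((a, c))]
            d_bc = dist[frozenset((b, c))]
            merged[frozenset((next_id, c))] = (sa * d_ac + sb * d_bc) / (sa + sb)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(merged)
        clusters[next_id] = (newick, sa + sb, height)
        next_id += 1
    newick, _, _ = next(iter(clusters.values()))
    return newick + ";"


if __name__ == "__main__":
    names, rows = parse_phylip_distances(EXAMPLE)
    print(upgma(names, rows))
```

With the toy matrix, A and B are joined first, then C and D, and finally the two pairs. You could paste the printed Newick string into a file and compare it with the treefile that neighbor writes for the same matrix.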
(B) Origin and evolution of HIV
Data: this can be either
- some HIV nucleotide sequences that you can investigate, in fasta format.
These sequences are from HIV subtype C viruses from South Africa (C.ZA sequences)
and from India (C.IN sequences).
(from http://www.sanbi.ac.za/mrc/tdr2003/material/phylo_tut.2.html) OR
- FASTA files containing the amino acid sequences for the env, gag and pol
proteins from the isolates in the list below.
(from http://artedi.ebc.uu.se/course/UGSBR/hiv/)
- Using the same procedure as in (A), perform a multiple alignment using clustalw, with output in phylip
format.
- Using the appropriate programs from the phylip package:
- construct the distance matrix, this time using dnadist (why?),
- generate the tree and
- visualise the tree
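As an aid for thinking about distances between nucleotide sequences, here is a small Python sketch of the simplest distance of this kind: the proportion of differing sites (the p-distance) and its Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). Jukes-Cantor is one of the models that dnadist offers, but this is not dnadist's code and not necessarily the model it uses by default; the function names and the example sequences are invented for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a nucleotide p-distance."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from the HIV data set.
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as the p-distance, because it allows for several substitutions having happened at the same site. The constants 3/4 and 4/3 come from there being four nucleotides, so the formula as written does not carry over to a 20-letter amino acid alphabet.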
(C) Origin and evolution of H5N1
Try to construct phylogenetic trees from some example sequences of the
H5N1 virus.