BRCA1 Protein Phylogeny

Phylogeny.fr phylogenetic tree

The figure above shows the phylogenetic tree obtained from Phylogeny.fr in "One-click mode." The Homo sapiens BRCA1 protein was aligned with its homologs in Pan troglodytes, Mus musculus, Bos taurus, Canis lupus familiaris, Gallus gallus, and Caenorhabditis elegans, using all default settings. [1,4,5,6,7,8] (Each species name links to the GenPept entry for that species' BRCA1 homolog.)


GeneBee phylogenetic tree using ClustalW alignment

Figure 1. This phylogenetic tree was generated by the GeneBee service using the ClustalW protein alignment results. [2,3]


GeneBee phylogenetic tree using T-COFFEE alignment

Figure 2. This phylogenetic tree was generated by the GeneBee service using the T-COFFEE protein alignment results. [2,3]


GeneBee phylogenetic tree using MUSCLE alignment

Figure 3. This phylogenetic tree was generated by the GeneBee service using the MUSCLE protein alignment results. [2,3]


Analysis

Phylogeny.fr and GeneBee were used to generate phylogenetic trees based on different algorithms. Phylogeny.fr takes the raw protein sequences as input. Its pipeline aligns the sequences with MUSCLE, curates the alignment with the Gblocks algorithm, and then builds the phylogenetic tree. GeneBee, on the other hand, accepts sequences that have already been aligned, so I submitted all three alignments that I had previously generated with the ClustalW, T-COFFEE, and MUSCLE algorithms. [1,2,3,4,5,6,7,8]
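To make the step from alignment to tree more concrete: a distance-based tree builder typically begins by turning the aligned sequences into a matrix of pairwise distances. The following is a minimal Python sketch of that first step, using the simple p-distance (the fraction of differing residues over ungapped columns). Everything in it is an assumption for illustration: the taxon names and sequences are made up, and it is not claimed that GeneBee or Phylogeny.fr computes distances this way.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Return a symmetric dict {(name_a, name_b): p-distance} for all pairs."""
    names = sorted(alignment)
    dist = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Hypothetical aligned fragments (invented; not real BRCA1 sequence).
toy_alignment = {
    "taxon_a": "ACDEFGHIKL",
    "taxon_b": "ACDEFGHIKL",
    "taxon_c": "ACDEFAHLRL",
    "taxon_d": "ACD-FAQLRM",
}

matrix = distance_matrix(toy_alignment)
for pair in [("taxon_a", "taxon_b"), ("taxon_a", "taxon_c"), ("taxon_a", "taxon_d")]:
    print(pair, round(matrix[pair], 3))
```

A different choice of distance or substitution matrix at this stage changes the numbers that the tree-building step sees, even when the alignment itself is identical.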

Comparing the two trees built from MUSCLE alignments (the Phylogeny.fr tree and the GeneBee tree in Figure 3) gives interesting results. The trees have very different distances between branchpoints, and the order of branching differs as well. This difference is likely due not to the alignment method but to the matrices used for generating the phylogenetic tree. The GeneBee website is much clearer about which matrices are available, whereas Phylogeny.fr provides limited information about the algorithms it uses and offers few options to change (even in the "advanced" mode, it only allows you to select whether or not all 4 processing steps are run). In terms of ease of use, both were reasonable. For the extreme novice, I would suggest Phylogeny.fr, because it does not require the initial step of generating an alignment. The tree it generated, however, seems less accurate: Mus musculus diverges earlier than Gallus gallus, even though the mouse is a mammal and would be expected to group with the other mammals, with the bird branching off before them.
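A difference in branching order can be checked mechanically by listing the clades (groups of leaves under each internal node) in each tree and comparing the two sets, in the spirit of a rooted Robinson-Foulds comparison. The Python sketch below does this for trees written as nested tuples. Both topologies are hypothetical illustrations and are not the actual Phylogeny.fr or GeneBee output: one follows the commonly accepted species relationships, and the other places Mus outside a group that contains Gallus.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a rooted nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])  # a leaf contributes only its own name
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)  # record the clade under this internal node
        return members

    leaves(tree)
    return found


def clade_difference(tree_a, tree_b):
    """Clades present in exactly one of the two trees (empty if topologies match)."""
    return clades(tree_a) ^ clades(tree_b)


# Hypothetical topology following the accepted species relationships.
expected = ((((("Homo", "Pan"), "Mus"), ("Bos", "Canis")), "Gallus"), "Caenorhabditis")

# Hypothetical topology in which Mus branches off before Gallus.
mouse_early = ((((("Homo", "Pan"), ("Bos", "Canis")), "Gallus"), "Mus"), "Caenorhabditis")

mammals = frozenset({"Homo", "Pan", "Mus", "Bos", "Canis"})
print("mammals form a clade in expected:", mammals in clades(expected))
print("mammals form a clade in mouse_early:", mammals in clades(mouse_early))
print("clades not shared:", len(clade_difference(expected, mouse_early)))
```

In this sketch the mammal clade is missing from the second tree, which is the kind of disagreement described above.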

Within GeneBee itself, the choice of alignment algorithm (ClustalW, T-COFFEE, or MUSCLE) only slightly affects the phylogenetic tree created. The ClustalW and T-COFFEE alignments generated the most similar trees, with effectively the same distances between branchpoints. The MUSCLE alignment generated the same order of branches but slightly different distances between branchpoints. Overall, the alignment method used does not change the branching order of the tree, which further suggests that the difference between the Phylogeny.fr and GeneBee trees is due to the matrices used for the analysis of phylogeny rather than to differences in alignment.
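The distinction drawn here, same order of branches but different distances between branchpoints, can also be expressed in code. The Python sketch below is illustrative only: it represents each node as a (label_or_children, branch_length) pair, checks whether two trees contain the same clades, and if so reports the largest change in any branch length. The trees and their lengths are invented and do not come from the GeneBee figures.

```python
def branch_lengths(tree):
    """Map each clade (frozenset of leaf names) to the length of the branch above it."""
    lengths = {}

    def visit(node):
        label, length = node
        if isinstance(label, str):
            members = frozenset([label])  # leaf node
        else:
            members = frozenset().union(*(visit(child) for child in label))
        lengths[members] = length
        return members

    visit(tree)
    return lengths


def compare(tree_a, tree_b):
    """Return (same_topology, largest branch-length change or None)."""
    a = branch_lengths(tree_a)
    b = branch_lengths(tree_b)
    if a.keys() != b.keys():
        return False, None  # different clades, so lengths are not comparable
    return True, max(abs(a[clade] - b[clade]) for clade in a)


# Hypothetical trees over three taxa with made-up branch lengths.
tree_x = ([([("A", 0.25), ("B", 0.25)], 0.125), ("C", 0.5)], 0.0)
tree_y = ([([("A", 0.25), ("B", 0.375)], 0.125), ("C", 0.5)], 0.0)
tree_z = ([([("A", 0.25), ("C", 0.25)], 0.125), ("B", 0.5)], 0.0)

print(compare(tree_x, tree_y))  # same branching order, lengths differ
print(compare(tree_x, tree_z))  # different branching order
```

Trees like tree_x and tree_y correspond to the situation described for the three GeneBee trees, while tree_x and tree_z correspond to a change in branching order.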


[1] Anisimova, M., and Gascuel, O. (2006). Approximate likelihood ratio test for branches: A fast, accurate and
     powerful alternative. Syst Biol, 55(4):539-52. doi:10.1080/10635150600755453.
[2] Brodsky, L.I., Ivanov, V.V., Kalaydzidis, Y.L., Leontovich, A.M., Nikolaev, V.K., Feranchuk, S.I., and Drachev, V.A.
     (1995). GeneBee-NET: Internet-based server for analyzing biopolymers structure. Biochemistry (Moscow),
     60(8):923-928. http://www.genebee.msu.su/services/papers/GNB-NET/GNB-NET.htm.
[3] Brodsky, L.I., Vasilyev, A.V., Kalaydzidis, Y.L., Osipov, Y.S., Tatuzov, A.R.L., and Feranchuk, S.I. (1992).
     GeneBee: the program package for biopolymer structure analysis. Dimacs, 8:129-139.
     http://www.genebee.msu.su/services/papers/DIMA-BRO/DIMA-BRO.htm.
[4] Castresana, J. (2000). Selection of conserved blocks from multiple alignments for their use in phylogenetic
     analysis. Mol Biol Evol, 17(4):540-52. Retrieved from
     http://mbe.oxfordjournals.org/cgi/content/abstract/17/4/540.
[5] Chevenet, F., Brun, C., Banuls, A.L., Jacq, B., and Christen, R. (2006). TreeDyn: towards dynamic graphics and
     annotations for analyses of trees. BMC Bioinformatics, 7:439. doi:10.1186/1471-2105-7-439.
[6] Dereeper, A., Guignon, V., Blanc, G., Audic, S., Buffet, S., Chevenet, F., Dufayard, J.F., Guindon, S., Lefort, V.,
     Lescot, M., Claverie, J.M., and Gascuel, O. (2008). Phylogeny.fr: robust phylogenetic analysis for the
     non-specialist. Nucleic Acids Research, 36(Web Server issue):W465-9. doi:10.1093/nar/gkn180.
[7] Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic
     Acids Research, 32(5):1792-1797. doi:10.1093/nar/gkh340.
[8] Guindon, S., and Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by
     maximum likelihood. Syst Biol, 52(5):696-704. doi:10.1080/10635150390235520.

Site created by Jessica D. Kueck
Genetics 677 Assignment, Spring 2009
University of Wisconsin-Madison