Gene Homologs and Sequence Alignments

Homologs of BRCA1

BRCA1 has homologous genes in many different species. I chose to highlight a few well-studied homologs for phylogenetic analysis, which were found on HomoloGene and WormBase. An E-value near 0 indicates a highly significant match, as expected between closely related sequences; the larger the E-value, the more likely the match arose by chance and the weaker the evidence of similarity.
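To make the E-value more concrete: it is the number of hits with an equal or better score that would be expected by chance in a search of that size. A minimal sketch of the standard relationship E = m × n × 2^(-S'), where S' is the bit score, is given below. The function name and the example lengths and scores are hypothetical and are not taken from the searches reported on this page; BLAST itself uses length-corrected ("effective") values for m and n, so real output will differ somewhat.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate a BLAST E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical numbers, for illustration only (not from the BRCA1 searches).
weak_hit = expect_value(bit_score=30.0, query_len=1000, db_len=1_000_000)
strong_hit = expect_value(bit_score=300.0, query_len=1000, db_len=1_000_000)

print(f"weak hit:   E = {weak_hit:.3g}")
print(f"strong hit: E = {strong_hit:.3g}")
```

A higher bit score drives the E-value toward 0, which is why the closest homologs are reported with an E-value of 0.0.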

Pan troglodytes (Chimpanzee)
Name: Breast cancer 1, early onset
Abbreviation: BRCA1
Gene accession number: NC_006484.2
     E-value: 0.0 (mega-blast)
Protein accession number: XP_001157352.1: Isoform 23 (closest homology to isoform 1 in humans)
     E-value: 0.0

Mus musculus (Mouse)
Name: Breast cancer 1
Abbreviation: brca1
Gene accession number: NC_000077.5
     E-value: 9 × 10^-84 (mega-blast)
Protein accession number: NP_033894.2
     E-value: 0.0

Canis lupus familiaris (Domestic dog)
Name: Breast cancer 1, early onset
Abbreviation: BRCA1
Gene accession number: NC_006591.2
     E-value: 0.42 (blastn)
Protein accession number: NP_001013434.1
     E-value: 0.0

Bos taurus (Domestic cow)
Name: Breast cancer 1, early onset
Abbreviation: BRCA1
Gene accession number: NC_007317.3
     E-value: 0.037 (blastn)
Protein accession number: NP_848668.1
     E-value: 0.0

Gallus gallus (Chicken)
Name: Breast cancer 1, early onset
Abbreviation: BRCA1
Gene accession number: NC_006114.2
     E-value: cannot be determined
Protein accession number: NP_989500.1
     E-value: 2 × 10^-82

Caenorhabditis elegans (Worm)
Name: BRCa homolog (tumor suppressor gene Brca1)
Abbreviation: brc-1
Gene accession number: NC_003281.8
     E-value: cannot be determined
Protein accession number: NP_497780.2
     E-value: 2 × 10^-13
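
The protein E-values listed above can be compared directly by parsing them as numbers and sorting. The sketch below is only an illustration: the dictionary and the function name `rank_by_evalue` are my own, and the values are copied from the entries on this page.

```python
# Protein E-values as reported above (BLAST writes 2e-82 for 2 x 10^-82).
protein_evalues = {
    "Pan troglodytes": "0.0",
    "Mus musculus": "0.0",
    "Canis lupus familiaris": "0.0",
    "Bos taurus": "0.0",
    "Gallus gallus": "2e-82",
    "Caenorhabditis elegans": "2e-13",
}


def rank_by_evalue(evalues):
    """Return (species, E-value) pairs, most significant match first."""
    parsed = {species: float(text) for species, text in evalues.items()}
    # sorted() is stable, so species tied at 0.0 keep their listed order.
    return sorted(parsed.items(), key=lambda item: item[1])


ranked = rank_by_evalue(protein_evalues)
for species, evalue in ranked:
    print(f"{species:<26} {evalue:g}")
```

Sorted this way, the four mammals tie at 0.0, followed by chicken and then C. elegans.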

Alignments of homologous genes

ClustalW, T-COFFEE, and MUSCLE failed to produce alignments of the BRCA1 gene because of the length of the sequence. As a consequence, GeneBee phylogenetic trees could not be generated either. Please see the Protein structure page for alignments based on the protein sequence.


Homologs of the human BRCA1 gene are found across a wide range of species. Using BLAST, an E-value could be determined for every mammalian species tested. The E-values followed the pattern predicted by evolutionary relatedness: Pan troglodytes was most similar (0.0), and Canis lupus familiaris was most dissimilar (0.42). The length of the gene also appears to increase with organismal complexity: the C. elegans sequence is only 9815 bp, nearly one-eighth the length of the human BRCA1 gene.
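
The length comparison can be checked with a line of arithmetic. In the sketch below, the C. elegans length comes from the paragraph above, while the human figure of roughly 81,000 bp is an assumption (the commonly cited approximate genomic span of BRCA1), not a number reported on this page.

```python
C_ELEGANS_BRC1_BP = 9815   # from the text above
HUMAN_BRCA1_BP = 81_000    # assumption: approximate genomic span, not from this page

ratio = C_ELEGANS_BRC1_BP / HUMAN_BRCA1_BP
implied_human_bp = 8 * C_ELEGANS_BRC1_BP  # length implied by "one-eighth" exactly

print(f"brc-1 / BRCA1 length ratio: {ratio:.3f} (one-eighth is {1 / 8:.3f})")
print(f"Human length implied by an exact one-eighth ratio: {implied_human_bp} bp")
```

Under that assumed human length, the ratio comes out slightly below one-eighth, consistent with the "nearly one-eighth" description.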

As noted above, the freely available online alignment tools limit the length of the sequences they accept. Because the BRCA1 gene is very large, alignments of the full-length genes could not be performed.

Site created by Jessica D. Kueck
Genetics 677 Assignment, Spring 2009
University of Wisconsin-Madison