BRCA1
Gene Motifs and Domains

MOTIF

No motifs could be found using the full-length gene sequence, presumably because of its length.

The mRNA sequence for Isoform 1 was used instead. With this input many motifs were found; the number depended on the cut-off value used.
     Cut-off value     Number of motifs
     85 (default)      121
     90                76
     95                28
     97                16
The motifs returned with a cut-off value of 97 are described below, as taken from the MOTIF search.

When the complementary sequence to the Isoform 1 mRNA was entered, no motifs were found.

Transfac returned 16 motifs in the mRNA of isoform 1 of BRCA1:
     Motif       Consensus         Number   Function
     CdxA        MTTTATR           2
     CdxA        WWTWMTR           13
     SRY         AAACWAM           5        Sex-determining Region Y gene product
     Nkx-2.5     TYAATGT           2
     AML-1a      TGTGGT            9
     C/EBP       NNTKTGGWNANNN     1        Enhancer Binding Protein
     Lyf-1       TTTGGGAGR         1
     Cap         NCANNNNN          1        Transcription initiation
     CDP         NATYGATSSS        1
     Sox-5       NNAACAATNN        1
     AP-1        NTGASTCAG         1        AP-1 binding site
     c-Myb       NNNAACKGNC        1
     GKLF        RAANRARRRARRGG    1
     HSF2        NGAANNWTCK        1        Heat Shock Factor 2
     deltaEF1    NNNCACCTNAN       1
     IK-2        NNNYGGGAWNNN      1
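
The consensus strings in the table use IUPAC ambiguity letters (for example W = A or T, M = A or C, N = any base). As a rough illustration of how such a consensus can be read, the sketch below turns one into a regular expression and lists where it matches. This is only an approximation: the MOTIF search scores sites against Transfac matrices with a cut-off, which is not the same as exact consensus matching. The function names and the toy sequence are my own and are not taken from MOTIF or from BRCA1.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(consensus, sequence):
    """Return 0-based start positions of every match, overlapping ones included."""
    # The lookahead lets the search report sites that overlap each other.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical toy sequence for illustration only (not BRCA1).
toy = "GGTGTGGTCCAAACAACTTGTGGT"

print(consensus_to_regex("AAACWAM"))   # AAAC[AT]A[AC]
print(find_sites("TGTGGT", toy))       # AML-1a consensus -> [2, 18]
print(find_sites("AAACWAM", toy))      # SRY consensus    -> [10]
```

Short consensus strings such as TGTGGT match often by chance in a long sequence, which may be part of why the shorter motifs in the table have the higher counts.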

MEME

MEME's 60,000 bp maximum input length did not allow searching the full gene sequence. As with the MOTIF search, the mRNA sequence for Isoform 1 was used instead. Three motifs were returned: [1]

Images taken from MEME search for the complementary DNA sequence for BRCA1 isoform 1.

Analysis

I used two programs that find DNA motifs from a DNA sequence alone: MOTIF and MEME. MOTIF differed significantly from MEME in how it displayed results. MOTIF accepted a sequence as input but offered few adjustable parameters. I found it useful to start the search with the pre-defined cut-off score of 85, then narrowed the search to a cut-off score of 97 to obtain a manageable number of results. The results were returned with limited information: the name of the motif, a link to the Transfac database entry, the score, and a consensus of the motif found, displayed not as a logo but as letters that each represent one or more possible nucleotides at a position. In some cases a very vague description of the motif was given. The most disappointing thing about this program, which also came up in MEME, was that in many cases the meaning of the motif could not be predicted. The descriptions given by the program and by the Transfac database gave no information about which proteins or other interactions the region is important for.

MEME found results that were much longer: MEME's results were around 20 nucleotides, and MOTIF's were about 10. MEME displayed results as logos, which I feel is much easier to interpret because it eliminates the need for a key. MEME presented e-values as follows: 4.4e+002. I was unsure of what this meant, because generally e-values are presented with the e to a negative power. The help section for MEME did not help make the distinction. MEME also differed from MOTIF in that it did not accompany the motifs found with a link or even a name of the motif. This is troublesome if you are trying to determine known motifs within an unknown gene, but may be more useful for determining unknown motifs in a gene of interest. The motifs found, however, appear to be somewhat similar to those found in MOTIF: for example, the first motif has a similar consensus sequence to AML-1a found in MOTIF.
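
On the notation itself: a value written as 4.4e+002 is ordinary scientific "E notation", meaning 4.4 x 10^2 = 440, so the positive exponent simply indicates a number larger than 1. The short sketch below checks this by parsing the string. The helper name parse_e_value is my own, and the interpretation in the last comment rests on my understanding that a MEME E-value estimates how many equally good motifs would be expected in random sequences; treat that as an assumption rather than a statement from the MEME help pages.

```python
def parse_e_value(text):
    """Convert E notation such as '4.4e+002' to a float (4.4 x 10**2)."""
    return float(text)

reported = parse_e_value("4.4e+002")   # the value shown by MEME
print(reported)                        # 440.0
print(parse_e_value("4.4e-002"))       # 0.044, the more familiar negative-exponent form

# Assumption: an E-value far above 1 suggests the motif could easily
# arise by chance, so it would not be considered significant.
print(reported > 1)                    # True
```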

I would also like to note that I am unsure how useful these motifs are for the gene. Because I used the mRNA sequence as input, the sequence searched is complementary to the template strand of the gene rather than being the genomic sequence itself. I also tried the complement of the mRNA sequence in the MOTIF search, and no results were returned even with a cut-off value as low as 25. As for the transcription factor and enhancer-binding protein motifs, I feel these may not indicate any function in the BRCA1 gene itself. Other motifs also seemed to have little connection to the function of the gene, for example the motif for the SRY gene product; SRY is found on the Y chromosome.
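
For the strand question, the sketch below shows the three sequences that are easy to confuse: the mRNA written as DNA (which matches the coding strand, with T in place of U), its plain complement, and its reverse complement (the opposite strand read 5' to 3'). Motif tools read the input 5' to 3', so a plain complement that is not reversed is a different sequence from the opposite strand. I do not know which of these forms I should have given to MOTIF, so this is only an illustration; the function names and the toy sequence are my own.

```python
# Translation table for complementing DNA or RNA bases (U is treated like T).
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")

def mrna_to_coding_dna(mrna):
    """Write an mRNA as DNA: same bases as the coding strand, with T for U."""
    return mrna.upper().replace("U", "T")

def complement(seq):
    """Complement each base without changing the order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Opposite strand read 5' to 3': complement, then reverse."""
    return complement(seq)[::-1]

# Hypothetical toy mRNA fragment for illustration only (not BRCA1).
toy_mrna = "AUGGCAUUC"
coding = mrna_to_coding_dna(toy_mrna)

print(coding)                      # ATGGCATTC
print(complement(coding))          # TACCGTAAG
print(reverse_complement(coding))  # GAATGCCAT
```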


[1] Bailey, T.L., and Elkan, C. (1994). Fitting a mixture model by expectation maximization to discover motifs in
     biopolymers. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology,
     pp. 28-36. AAAI Press, Menlo Park, California.

Site created by Jessica D. Kueck
Genetics 677 Assignment, Spring 2009
University of Wisconsin-Madison