BRCA1 Protein Motifs and Domains

MOTIF

With all default settings, MOTIF found one motif:
ZF_RING_1 (PS00518), positions 39-48: CDHIFCKFCM
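
To make the hit above concrete, here is a minimal Python sketch of how a PROSITE-style pattern can be turned into a regular expression and scanned against a sequence. The pattern string is what I believe PS00518 to be and should be checked against the PROSITE entry; the helper name prosite_to_regex is my own, and the test sequence is a placeholder (38 alanines followed by the reported motif), not the real BRCA1 sequence.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern such as 'C-x(2)-[LIV]' into a Python regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Separate an optional repeat count such as (2) or (2,4) from the element.
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            core = "."                       # x means any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # {..} lists excluded residues
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

# Assumed pattern for ZF_RING_1 (PS00518); verify against PROSITE before relying on it.
RING_PATTERN = "C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA]"
ring_regex = re.compile(prosite_to_regex(RING_PATTERN))

# Placeholder sequence: filler residues so the motif from the MOTIF output
# lands at positions 39-48 (1-based), as reported above.
toy_sequence = "A" * 38 + "CDHIFCKFCM" + "A" * 10

hits = [(m.start() + 1, m.end(), m.group()) for m in ring_regex.finditer(toy_sequence)]
for start, end, motif in hits:
    print(f"positions {start}-{end}: {motif}")
```

Swapping toy_sequence for the actual isoform 1 sequence (for example, read from a FASTA file) would be the natural next step.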

Pfam

A Pfam search with the sequence of BRCA1 isoform 1 produced the domain map shown below, which includes only the significant matches. The green domain is a zinc finger, C3HC4 type (RING finger), with an e-value of 6.8e-17 (that is, 6.8 x 10^-17). The red domains are BRCA1 C-terminus (BRCT) domains, with e-values of 5.7e-12 and 2.5e-12 for the first and second domains, respectively. The search returned six other matches, but Pfam flagged them as insignificant, so they are not described here. All matches, significant and insignificant, were to manually curated Pfam-A families. [1,2,3,4,7,10,11,16,17]

From Pfam, 2009. BRCA1 Isoform 1 Sequence Search Results. Retrieved from pfam.sanger.ac.uk/search/sequence/results?jobId=ecd73797-0e41-40a0-8630-50707b3810e1.

SMART

SMART found the RING domain near the N terminus, here with an e-value of 1.82e-7. SMART notes that this domain binds Zn, which allows it to have ubiquitin ligase activity.
SMART also found the two BRCT domains near the C terminus, with e-values of 8.38e-7 and 8.56e-13 for the first and second domains, respectively. These domains are sometimes found in proteins involved in the DNA damage response.
SMART also reported a coiled-coil region and eight unknown regions, for which no e-values were given. [13,15]

From SMART, 2009. BRCA1 Isoform 1 Sequence Search Results. Retrieved from http://smart.embl-heidelberg.de/.
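
The Pfam and SMART e-values reported above can be put side by side with a few lines of Python. This is only a rough sketch: the values are copied from the text (written as Python floats, so 6.8e-17 means 6.8 x 10^-17), the domain labels are my own shorthand, and e-values from different tools depend on each tool's models and database size, so they are not strictly comparable.

```python
import math

# (tool, domain label, e-value) as reported in the Pfam and SMART sections.
reported_hits = [
    ("Pfam",  "RING",   6.8e-17),
    ("Pfam",  "BRCT 1", 5.7e-12),
    ("Pfam",  "BRCT 2", 2.5e-12),
    ("SMART", "RING",   1.82e-7),
    ("SMART", "BRCT 1", 8.38e-7),
    ("SMART", "BRCT 2", 8.56e-13),
]

# For each domain, keep the tool that reported the lowest (most significant) e-value.
best = {}
for tool, domain, evalue in reported_hits:
    if domain not in best or evalue < best[domain][1]:
        best[domain] = (tool, evalue)

for domain, (tool, evalue) in best.items():
    print(f"{domain}: lowest e-value from {tool} "
          f"({evalue:.2e}, -log10 = {-math.log10(evalue):.1f})")
```

Both tools agree on which domains are present; they differ only in how strongly each match is scored.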

Prosite

With all default settings, Prosite gave little information on the protein domains. As with MOTIF, only the RING finger domain was returned. [12]

ProDom

The ProDom matches with the lowest e-values were a number of BRCA1 homolog domains related to the BRCT domain found in the other domain searches. The next matches corresponded to the known zinc-finger RING domains ZF_RING_1 and ZF_RING_2. The e-values for all of these domains were fairly small (all less than 3e-6).
Some weaker matches with higher e-values were also found (e-values in parentheses):
Microencephalin repeat Riken product: Hypothetical full containing (0.0005)
     Relates to the BRCT domain
DNA/RNA helicase (0.003)
     Relates to the C-terminal domain of DNA/RNA helicase and the zinc-finger RING domain
[5,8]

PRINTS

PRINTS was difficult to use. It appears that you must already know the domain of interest in order to search for more information about it. For this reason, I did not explore this website further at this time.

Structure of RING domain

Brzovic, et al. solved the structure of the BRCA1-BARD1 RING domain heterodimer using solution NMR [6]. Figure 1 below (Figure 1a from the original paper) shows the heterodimer structure, with BRCA1 in red and BARD1 in blue. Black spheres represent zinc ions. Note the structural similarity between BARD1 and BRCA1, and how this contributes to dimerization.

Figure 1. This is Figure 1a from the original Brzovic, et al. paper [6]. BRCA1 is shown in red, and BARD1 in blue.

From Nature, 2001. "Structure of a BRCA1-BARD1 heterodimeric RING-RING complex" Figure 1a. Retrieved from http://www.nature.com/nsmb/journal/v8/n10/full/nsb1001-833.html. Copyright Brzovic, et al., 2001.

Figure 2. This is Figure 1c from the original Brzovic, et al. paper [6].

From Nature, 2001. "Structure of a BRCA1-BARD1 heterodimeric RING-RING complex" Figure 1c. Retrieved from http://www.nature.com/nsmb/journal/v8/n10/full/nsb1001-833.html. Copyright Brzovic, et al., 2001.

Structure of BRCT domain

Williams, et al. solved the structure of the BRCT domains of BRCA1 (residues 1646-1859) using X-ray diffraction [18]. The orientation of the two domains relative to each other appears to be influenced by the amino acids that link them, shown in blue (Figure 3).

Figure 3. This is Figure 2a from the original Williams, et al. paper [18]. Alpha helices are shown in yellow, and beta sheets are shown in green. The amino acid sequence that links the two domains is in blue.

From Nature, 2001. "Crystal structure of the BRCT repeat region from the breast cancer-associated protein BRCA1" Figure 2a. Retrieved from http://dx.doi.org/10.1038/nsb1001-838. Copyright Williams, et al., 2001.

pTARGET

pTARGET allows users to input a protein sequence and predict where the protein would localize based on sequence characteristics. Searching pTARGET for localization of the BRCA1 protein sequence returned the following results:
----------------------------------------------------------------------
ACCESSION NO.                   LOCALIZATION              CONFIDENCE
----------------------------------------------------------------------
gi|6552299|ref|NP_009225.1|     Nucleus                   100.0%
----------------------------------------------------------------------
This is consistent with the function of BRCA1.

UniProt confirmed this result. Nuclear localization is conserved in all of the homologs I examined in previous analyses, and it is also predicted for the distantly related Arabidopsis thaliana homolog.

Protein modifications

The most notable protein modification of BRCA1 is phosphorylation. UniProt lists 25 amino acids that have been found to be phosphorylated, 20 of which are located in a poly-serine region.

A 2000 review by Deng and Brodie also describes other protein modifications that modulate the activity of the protein. They note that RING domain-containing proteins are predicted to be regulated by ubiquitination. [9]

Analysis

The two domain types found in BRCA1 are the zinc-finger RING domain and the BRCA1 C-terminal domain (BRCT; present in two copies). Both were found by Pfam, SMART, and ProDom.

MOTIF found only the zinc-finger RING domain, however. This engine was very simple to use with the defaults, but the options for changing the parameters are difficult to interpret. Tutorials and help pages are available to guide these choices.

Pfam was also easy to use and generated a wide variety of results from the input sequence. The domains found were mapped on the protein, as shown above. This engine also made the distinction between significant and insignificant domains very clearly.

SMART was my favorite engine to use. From the protein sequence alone, SMART identifies the protein along with its domains. The domains returned were very similar to those from Pfam, and SMART likewise included a schematic showing where the domains lie. What I really liked was the aesthetics of the website, as well as the gene ontology and protein interaction information it provided from the sequence alone.

Like MOTIF, Prosite returned only the zinc-finger RING domain.

ProDom probably returned the most results. Though the aesthetics of the website were lacking, especially compared to SMART, the information provided was still fairly simple to read and covered a number of domains of similar types. For example, several BRCT and RING domains were found. I found it interesting that the DNA/RNA helicase match was returned, as BRCA1 is also known to localize to the nucleus.

I looked into the function of the RING and BRCT domains, as they seem to be the most important and conserved domains in the protein. The RING domain of BRCA1 is responsible for ubiquitination of other proteins, which supports the corresponding gene ontology biological process term. BRCT domains are found in many proteins involved in DNA damage repair, where they mediate the protein interactions involved in this response. [14]

I found it interesting that none of the domain searches returned a nuclear localization signal. This could be because the default settings were not changed for any of the searches. pTARGET, though, did predict nuclear localization from the sequence. The program did not, however, report how it made this prediction.
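
One way to follow up on the missing nuclear localization signal would be a simple pattern scan. The sketch below is an illustration only: it uses a simplified monopartite consensus, K-(K/R)-x-(K/R), which is my assumption rather than the method used by pTARGET or any of the tools above, and it is run on a toy sequence containing the well-known SV40 large T antigen signal (PKKKRKV), not on BRCA1 itself. The function name find_nls_candidates is hypothetical.

```python
import re

# Simplified monopartite NLS consensus K-(K/R)-x-(K/R) (an assumption for illustration).
# The lookahead lets overlapping candidates be reported.
NLS_RE = re.compile(r"(?=(K[KR].[KR]))")

def find_nls_candidates(sequence: str):
    """Return (1-based start position, matched residues) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in NLS_RE.finditer(sequence.upper())]

# Toy sequence: the SV40 large T antigen NLS flanked by filler residues.
toy_sequence = "MGSAG" + "PKKKRKV" + "GSAGM"

candidates = find_nls_candidates(toy_sequence)
for position, residues in candidates:
    print(f"candidate NLS at position {position}: {residues}")
```

A scan like this produces many false positives on real proteins, so any hits in BRCA1 would need to be checked against experimentally supported signals.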


[1] Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy, S.R., Griffiths-Jones, S., Howe, K.L., Marshall,
     M., and Sonnhammer, E.L.L. (2002). The Pfam protein families database. Nucleic Acids Research,
     30(1):276-280. doi:10.1093/nar/30.1.276.
[2] Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Finn, R.D., and Sonnhammer, E.L.L. (1999). Pfam 3.1: 1313
     multiple alignments match the majority of proteins. Nucleic Acids Research, 27:260-262.
     doi:10.1093/nar/27.1.260.
[3] Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Howe, K.L., and Sonnhammer, E.L.L. (2000). The Pfam protein
     families database. Nucleic Acids Research, 28(1):263-266. doi:10.1093/nar/28.1.263.
[4] Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon,
     S., Sonnhammer, E.L.L., Studholme, D.J., Yeats, C., and Eddy, S.R. (2004). The Pfam protein families database.
     Nucleic Acids Research, 32(Database issue):D138-D141. doi:10.1093/nar/gkh121.
[5] Bru, C., Courcelle, E., Carrère, S., Beausse, Y., Dalmar, S., and Kahn, D. (2005). The ProDom database of
     protein domain families: more emphasis on 3D. Nucleic Acids Research, 33:D212-D215.
     doi:10.1093/nar/gki034.
[6] Brzovic, P.S., Rajagopal, P., Hoyt, D.W., King, M.C., and Klevit, R.E. (2001). Structure of a BRCA1-BARD1
     heterodimeric RING-RING complex. Nat. Struct. Biol., 8:833-837. doi:10.1038/nsb1001-833.
[7] Coin, L., Bateman, A., and Durbin, R. (2003). Enhanced protein domain discovery by using language modeling
     techniques from speech recognition. Proc Natl Acad Sci USA, 100(8):4516-4520.
     doi:10.1073/pnas.0737502100.
[8] Corpet, F., Servant, F., Gouzy, J., and Kahn, D. (2000). ProDom and ProDom-CG: tools for protein domain
     analysis and whole genome comparisons. Nucleic Acids Research, 28:267-269. doi:10.1093/nar/28.1.267.
[9] Deng, C., and Brodie, S.G. (2000). Roles of BRCA1 and its interacting proteins. BioEssays,
     22(8):728-737. doi:10.1002/1521-1878(200008)22:8<728::AID-BIES6>3.0.CO;2-B.
[10] Finn, R.D., Mistry, J., Schuster-Bockler, B., Griffiths-Jones, S., Hollich, V., Lassmann, T., Moxon, S., Marshall,
     M., Khanna, A., Durbin, R., Eddy, S.R., Sonnhammer, E.L.L., and Bateman, A. (2006). Pfam: clans, web tools and
     services. Nucleic Acids Research, 34(Database issue):D247-D251. doi:10.1093/nar/gkj149.
[11] Finn, R.D., Tate, J., Mistry, J., Coggill, P.C., Sammut, S.J., Hotz, H., Ceric, G., Forslund, K., Eddy, S.R.,
     Sonnhammer, E.L.L., and Bateman, A. (2007). The Pfam protein families database. Nucleic Acids Research,
     36(Database issue):D281-D288. doi:10.1093/nar/gkm960.
[12] Hulo, N., Bairoch, A., Bulliard, V., Cerutti, L., Cuche, B., De Castro, E., Lachaize, C., Langendijk-Genevaux, P.S.,
     and Sigrist, C.J.A. (2007). The 20 years of PROSITE. Nucleic Acids Research, 36(Database issue):D245-D249.
     doi:10.1093/nar/gkm977.
[13] Letunic, I., Doerks, T., and Bork, P. (2009). SMART 6: recent updates and new developments. Nucleic Acids
     Research, 37(Database issue):D229-D232. doi:10.1093/nar/gkn808.
[14] Narod, S.A., and Foulkes, W.D. (2004). BRCA1 and BRCA2: 1994 and beyond. Nature Reviews Cancer,
     4:665-676. doi:10.1038/nrc1431.
[15] Schultz, J., Milpetz, F., Bork, P., and Ponting, C.P. (1998). SMART, a simple modular architecture research
     tool: Identification of signaling domains. PNAS, 95(11):5857-5864. doi:10.1073/pnas.95.11.5857.
[16] Sonnhammer, E.L.L., Eddy, S.R., Birney, E., Bateman, A., and Durbin, R. (1998). Nucleic Acids Research,
     26:320-322. doi:10.1093/nar/26.1.320.
[17] Sonnhammer, E.L.L., Eddy, S.R., and Durbin, R. (1997). Pfam: a comprehensive database of protein families
     based on seed alignments. Proteins, 28:405-420.
     doi:10.1002/(SICI)1097-0134(199707)28:3<405::AID-PROT10>3.0.CO;2-L.
[18] Williams, R.S., Green, R., and Glover, J.N. (2001). Crystal structure of the BRCT repeat region from the breast
     cancer-associated protein BRCA1. Nat. Struct. Biol., 8:838-842. doi:10.1038/nsb1001-838.

Site created by Jessica D. Kueck
Genetics 677 Assignment, Spring 2009
University of Wisconsin-Madison