Conserved protein sequences
Shown below is an amino acid sequence alignment between two human zinc finger proteins, with GenBank accession numbers AAB24882 and AAB24881. The alignment was carried out using the ClustalW sequence alignment program. Conserved amino acid sequences are marked by strings of * on the third line of the sequence alignment. As can be seen from this alignment, these two proteins contain a number of conserved amino acid sequences.
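To make the * marker line concrete, here is a minimal Python sketch of how such a line could be derived from two already-aligned sequences. The function names and the two short fragments are hypothetical illustrations; they are not the actual AAB24882 and AAB24881 sequences. The sketch marks only identical columns, whereas ClustalW also uses other symbols for similar residues.

```python
def conservation_line(seq_a: str, seq_b: str, gap: str = "-") -> str:
    """Return a marker line with '*' under identical, non-gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(seq_a, seq_b)
    )


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns that are identical (gaps count as mismatches)."""
    marks = conservation_line(seq_a, seq_b, gap)
    return 100.0 * marks.count("*") / len(marks) if marks else 0.0


# Hypothetical aligned fragments, used only to illustrate the marker line.
frag_a = "CPECGKSF-SQK"
frag_b = "CEECGKAFNS-K"

if __name__ == "__main__":
    print(frag_a)
    print(frag_b)
    print(conservation_line(frag_a, frag_b))
    print(f"identity: {percent_identity(frag_a, frag_b):.1f}%")
```

Counting gap columns as mismatches is one convention among several; other tools exclude gapped columns from the denominator, which gives a higher identity figure.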
Tags: Carbohydrates
This entry was posted
on Sunday, November 30th, 2008 at 11:56 pm and is filed under Nutrients.
In biology, conserved sequences are similar or identical sequences that may occur within nucleic acids, proteins or polymeric carbohydrates within multiple species of organism or within different molecules produced by the same organism. In the case of cross-species conservation, this indicates that a particular sequence may have been maintained by evolution despite speciation. The further back up the phylogenetic tree a particular conserved sequence occurs, the more highly conserved it is said to be.
The TATA promoter sequence is an example of a highly conserved DNA sequence, being found in most eukaryotes.
Sequence similarities serve as evidence for structural and functional conservation, as well as of evolutionary relationships between the sequences. Consequently, comparative analysis is the primary means by which functional elements are identified.
Among the most highly conserved sequences are the active sites of enzymes and the binding sites of protein receptors.
CCP domains, also called Sushi or Short Consensus Repeat (SCR) domains, contain consecutive domains of about 60 residues each that have 4 conserved cysteines, arranged in two conserved disulfide bonds, and one conserved tryptophan, but otherwise can vary greatly in sequence.
The Signal Peptide Peptidase (SPP) and its homologs (SPPL2a/b/c, SPPL3) are a class of transmembrane aspartyl proteases with the conserved motifs YD ... GxGD ... PALL. Their sequences are highly conserved in different vertebrate species. Their substrates are Type II transmembrane proteins.[1]
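As an illustration of how the ordered motifs YD ... GxGD ... PALL might be located in a sequence, the following Python sketch uses a regular expression in which `x` is treated as any single residue and `...` as an arbitrary stretch of residues. The function name and the toy sequence are hypothetical; real SPP sequences are much longer, and a plain pattern match like this is only a rough screen, not evidence of protease function.

```python
import re

# 'x' in GxGD is any residue; '...' between motifs is any (possibly empty) stretch.
SPP_PATTERN = re.compile(r"(YD).*?(G.GD).*?(PALL)")


def find_spp_motifs(sequence: str):
    """Return 0-based start positions of (YD, GxGD, PALL), or None if absent."""
    match = SPP_PATTERN.search(sequence.upper())
    if match is None:
        return None
    return tuple(match.start(group) for group in (1, 2, 3))


# Hypothetical toy sequence, not a real SPP family member.
toy_sequence = "MKAYDLLLAGLGDIVAAPALLK"

if __name__ == "__main__":
    print(find_spp_motifs(toy_sequence))
```

The lazy quantifiers (`.*?`) make the search pick the nearest downstream occurrence of each motif after the first YD, which keeps the reported positions in N-to-C order.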
Physiologically, SPP processes signal peptides of HLA preproteins. A 9-amino-acid cleavage fragment is then presented on HLA-E receptors and modulates the activity of natural killer cells.[2]
SPP also plays a pathophysiological role; it cleaves the core protein of Hepatitis C virus and thus influences the reproduction rate of HCV.[3]
SPPL2a/b promotes the intramembrane cleavage of TNFα in activated dendritic cells and
Intrinsically unstructured proteins are characterized by a low content of bulky hydrophobic amino acids and a high proportion of polar and charged amino acids. Thus disordered sequences cannot bury sufficient hydrophobic core to fold like stable globular proteins. In some cases, hydrophobic clusters in disordered sequences provide the clues for identifying the regions that undergo coupled folding and binding.
Many disordered proteins also reveal low complexity sequences, i.e. sequences with overrepresentation of a few residues. While low complexity sequences are a strong indication of disorder, the reverse is not necessarily true, that is, not all disordered proteins have low complexity sequences.
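One common way to quantify "overrepresentation of a few residues" is the Shannon entropy of the residue composition in a sliding window: the fewer distinct residues a window contains, the lower its entropy. The sketch below is a minimal illustration of that idea; the function names, window size and threshold are assumptions chosen for the example, not values taken from any published disorder predictor.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_complexity_windows(sequence: str, size: int = 12, threshold: float = 2.2):
    """Start indices of windows whose entropy falls below the threshold.

    `size` and `threshold` are illustrative defaults, not calibrated values.
    """
    return [
        start
        for start in range(len(sequence) - size + 1)
        if shannon_entropy(sequence[start:start + size]) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a diverse stretch followed by a glutamine-rich run.
    example = "ACDEFGHIKLMN" + "QQQQQQQQQQQQ"
    print(low_complexity_windows(example))
```

As the text notes, a low-entropy hit suggests disorder, but a sequence with no hits is not thereby shown to be ordered.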
Disordered
Initiation: the construction of the polymerase complex on the promoter. Pol III is unusual (compared to Pol II) in requiring no control sequences upstream of the gene, instead normally relying on internal control sequences, that is, sequences within the transcribed section of the gene (although upstream sequences are occasionally seen, e.g., the U6 snRNA gene has an upstream TATA box as seen in Pol II promoters).
sHsps have some structural features in common: Very characteristic is a homologous and highly conserved amino acid sequence, the so-called α-crystallin domain at the C-terminus. These sequences consist of 80 to 100 residues with a homology between 20% and 60% and form β-sheets, which are important for the formation of stable dimers.[1][2]
The N-terminus consists of a less conserved region, the so-called WD/EPF domain, followed by a short variable sequence with a rather conservative site near the C-terminus of this domain. The C-terminal part of the sHsps consists of the above mentioned α-crystallin domain, followed by a variable sequence with high motility
All of the mammalian Mef2 genes share approximately 50% overall amino acid identity and about 95% similarity throughout the highly conserved N-terminal MADS-box and Mef2 domains; however, their sequences diverge in their C-terminal transactivation domain (see figure to the right).[4]
The MADS-box serves as the minimal DNA-binding domain; however, an adjacent 29-amino acid extension called the Mef2 domain is required for high-affinity DNA binding and dimerization. Through an interaction with the MADS-box, Mef2 transcription factors have the ability to homo- and heterodimerize,[5] and a classic nuclear localization sequence (NLS) in the C-terminus of Mef2A, -C, and -D ensures nuclear localization
PIR was established in 1984 by the National Biomedical Research Foundation (NBRF) as a resource to assist researchers in the identification and interpretation of protein sequence information. Prior to that, the NBRF compiled the first comprehensive collection of macromolecular sequences in the Atlas of Protein Sequence and Structure, published from 1965 to 1978 under the editorship of Margaret Dayhoff. Dr. Dayhoff and her research group pioneered the development of computer methods for the comparison of protein sequences, for the detection of distantly related sequences and duplications within sequences, and for the inference of evolutionary histories from alignments of protein sequences.
Dr. Winona
The concept of protein family was conceived at a time when very few protein structures or sequences were known; those known at that time were primarily small, single-domain proteins such as myoglobin, hemoglobin, and cytochrome c. Since then, it has been found that many proteins comprise multiple independent structural and functional units or domains. Due to evolutionary shuffling, different domains in a protein have evolved independently. This has led, in recent years, to a focus on families of protein domains. A number of online resources are devoted to identifying and cataloging such domains (see list of links at the end of this article).
Regions
Adaptor proteins usually contain several domains within their structure (e.g., Src homology 2 (SH2) and SH3 domains) which allow specific interactions with several other specific proteins. SH2 domains recognise specific amino acid sequences within proteins containing phosphotyrosine residues and SH3 domains recognise proline-rich sequences within specific peptide sequence contexts of proteins.
There are many other types of interaction domains found within adaptor and other signalling proteins which allow a rich diversity of specific and coordinated protein-protein interactions to occur within the cell during signal transduction.
The DNA sequence that a transcription factor binds to is called a transcription factor binding site or response element.
Chemically, transcription factors interact with their binding sites using a combination of electrostatic (of which hydrogen bonds are a special case) and van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence-specific manner. However, not all bases in the transcription factor binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are
LIM domains are protein structural domains, composed of two contiguous zinc finger domains, separated by a two-amino acid residue hydrophobic linker.[1] They are named after their initial discovery in the proteins Lin11, Isl-1 and Mec-3.[2] LIM-domain containing proteins have been shown to play roles in cytoskeletal organisation, organ development and oncogenesis. LIM domains mediate protein-protein interactions that are critical to cellular processes.
LIM domains have highly divergent sequences, apart from certain key residues. The sequence divergence allows a great many different binding sites to be grafted onto the same basic domain. The conserved residues are those involved in zinc binding or the