InterPro Database | Introduction | Amino acid Sequence | Protein Family......
InterPro Database-
Introduction-
InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites. The contents of InterPro consist of diagnostic signatures and the proteins that they significantly match.
Models are built from the amino acid sequences of known families or domains, and they are subsequently used to search unknown sequences (such as those arising from novel genome sequencing) in order to classify them. Each of the member databases of InterPro contributes towards a different niche, from very high-level, structure-based classifications (SUPERFAMILY and CATH-Gene3D) through to quite specific sub-family classifications (PRINTS and PANTHER).
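To make the "build a model from known sequences, then search unknown ones" idea concrete, here is a minimal sketch in Python. It is not how any InterPro member database actually works: it uses a simple position-specific log-odds profile instead of a hidden Markov model, and the toy alignment, the uniform background frequencies and the function names `build_profile` and `best_match` are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical, ungapped alignment of a known "family" (illustrative only).
FAMILY_ALIGNMENT = [
    "GDSGGP",
    "GDSGGA",
    "GDSGSP",
    "GDAGGP",
]


def build_profile(alignment, pseudocount=1.0):
    """Return one log-odds score table per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_match(profile, sequence):
    """Slide the profile along the sequence; return (best_score, start)."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the 20 standard amino acids score 0 (neutral).
        score = sum(col.get(aa, 0.0) for col, aa in zip(profile, window))
        best = max(best, (score, start))
    return best


profile = build_profile(FAMILY_ALIGNMENT)
score, start = best_match(profile, "MKTAGDSGGPLV")
print(f"best window starts at index {start} with score {score:.2f}")
```

A sequence that contains the conserved segment scores well above an unrelated one, which is the basis for deciding whether a match is significant. Real signature databases use far richer models and carefully calibrated score thresholds.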
The proteins in UniProtKB are also the central protein entities in InterPro. Information about which signatures significantly match these proteins is calculated as the sequences are released by UniProtKB.
The signatures in InterPro come from the following "member databases":
1) CATH-Gene3D
Describes protein families and domain architectures in complete genomes. Protein families are formed using a Markov clustering algorithm, followed by multi-linkage clustering according to sequence identity.
2) CDD
The Conserved Domain Database is a protein annotation resource that consists of a collection of annotated multiple sequence alignment models for ancient domains and full-length proteins.
3) HAMAP
Stands for High-quality Automated and Manual Annotation of microbial Proteomes. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved bacterial, archaeal and plastid-encoded protein families or subfamilies.
4) MobiDB
is a database annotating intrinsic disorder in proteins.
5) PANTHER
is a large collection of protein families that have been subdivided into functionally related subfamilies, using human expertise.
6) Pfam
is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families.
7) PIRSF
protein classification system is a network with multiple levels of sequence diversity, from superfamilies to subfamilies, that reflects the evolutionary relationship of full-length proteins and domains.
8) PRINTS
is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of UniProt.
9) ProDom
domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using a novel procedure based on recursive PSI-BLAST searches.
10) PROSITE
is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs.
11) SMART
allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than 800 domain families found in signaling, extracellular and chromatin-associated proteins are detectable.
12) SUPERFAMILY
has been used to carry out structural assignments to all completely sequenced genomes.
13) TIGRFAMs
is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology.
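Several of the signatures listed above are easier to picture with a small example. The sketch below illustrates the "patterns" mentioned under PROSITE by translating a pattern written in PROSITE-style syntax into a Python regular expression and scanning a sequence with it. It handles only a subset of the syntax (`x`, `[..]`, `{..}`, repeat counts and the `<` / `>` anchors), and the helper names `prosite_to_regex` and `find_sites` as well as the test sequence are hypothetical. The example pattern `N-{P}-[ST]-{P}` is the commonly cited N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")  # patterns are often written with a final dot
    parts = []
    for element in pattern.split("-"):
        anchored_start = element.startswith("<")  # must match at the N-terminus
        anchored_end = element.endswith(">")      # must match at the C-terminus
        element = element.strip("<>")
        # Split off an optional repeat count such as x(2) or x(2,4).
        repeat = ""
        if "(" in element:
            element, count = element.rstrip(")").split("(")
            repeat = "{" + count + "}"
        if element == "x":
            core = "."                           # any residue
        elif element.startswith("{"):
            core = "[^" + element[1:-1] + "]"    # any residue except these
        else:
            core = element                       # one residue or a [..] choice
        parts.append(("^" if anchored_start else "") + core + repeat
                     + ("$" if anchored_end else ""))
    return "".join(parts)


def find_sites(pattern, sequence):
    """Return (start, matched_text) for every match, including overlaps."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical sequence used only to exercise the functions.
for start, site in find_sites("N-{P}-[ST]-{P}", "MKNGSAANPTL"):
    print(start, site)
```

Short patterns like this one match by chance quite often, which is one reason databases such as PROSITE also provide profiles and why InterPro combines evidence from several member databases.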