20th International CODATA Conference

 

Session: Primary Biological Databases

 

UniProt: the Universal Protein Resource

 

Claire O’Donovan (odonovan@ebi.ac.uk), European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Cambridge, UK

 

The ability to store and interconnect all available information on proteins is crucial to modern biological research. Accordingly, the Universal Protein Resource (UniProt) plays an ever more important role by providing a stable, comprehensive, high-quality, freely accessible central resource on protein sequences and functional annotation.

UniProt is produced by the UniProt consortium, formed by the European Bioinformatics Institute (EBI), the Georgetown University Protein Information Resource (PIR) and the Swiss Institute of Bioinformatics (SIB). The core activities of UniProt include manual curation of protein sequences assisted by automated annotation, sequence archiving, development of a user-friendly UniProt web site, and provision of value-added information on proteins through cross-references to other databases. UniProt comprises three database components, each of which addresses a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB) provides protein sequences with extensive annotation and cross-references. The UniProt Archive (UniParc) is the main sequence storehouse. The UniProt Reference Clusters (UniRef) condense sequence information and annotation to facilitate both sequence similarity searches and analysis of the results.

 

Keywords: Protein, amino acid sequence, database, annotation