5.2 PROSITE Field Definitions

Field codes organize the information in a PROSITE flat file so that it is both readable by people and easy for programs to parse. An entry contains several field codes, each a two-character abbreviation at the start of a line. Table 5-1 provides definitions and descriptions of these field codes.

Table 5-1. PROSITE field definitions

Field  Definition                         Description
-----  ---------------------------------  -----------
ID     Identification                     The second item indicates the type of entry: PATTERN, MATRIX, or RULE.
AC     Accession number                   In the form PSnnnnn.
DT     Date                               Date of entry or last modification of the entry.
DE     Short description                  Descriptive information about the entry content.
PA     Pattern                            The definition of a PROSITE pattern.
MA     Matrix/profile                     The definition of a PROSITE profile/matrix.
RU     Rule                               The definition of a PROSITE rule.
NR     Numerical results                  Information about the results of scanning the complete SWISS-PROT knowledgebase with a pattern. The following qualifiers are used: /RELEASE, /TOTAL, /POSITIVE, /UNKNOWN, /FALSE_POS, /FALSE_NEG, /PARTIAL.
CC     Comments                           Various types of comments. The following qualifiers are used: /TAXO-RANGE, /MAX-REPEAT, /SITE, /SKIP-FLAG, /MATRIX_TYPE, /SCALING_DB, /AUTHOR, /FT_KEY, /FT_DESC.
DR     Cross-references to SWISS-PROT     Pointers to SWISS-PROT entries.
3D     Cross-references to PDB            Lists the Protein Data Bank entries.
DO     Pointer to the documentation file  A pointer to the entry in the PROSITE documentation file that describes the entry.
//     Termination line                   Designates the end of an entry.
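
Because every line begins with a field code and every entry ends with a // line, a flat file can be split into entries with very little code. The following Python sketch is one possible approach, not an official parser. The sample entry is invented for illustration and is not a real PROSITE record, the function names (parse_entries, qualifiers, entry_type) are hypothetical, and the /NAME=value; qualifier syntax is an assumption about how NR and CC lines are laid out; check it against the PROSITE user manual before relying on it.

```python
import re
from collections import defaultdict

# Hypothetical entry, invented for illustration; not a real PROSITE record.
SAMPLE = """\
ID   EXAMPLE_MOTIF; PATTERN.
AC   PS99999;
DE   Made-up motif used only for this sketch.
PA   C-x(2)-[DE]-H.
NR   /RELEASE=0,0;
NR   /TOTAL=3(3); /POSITIVE=2(2); /UNKNOWN=0(0); /FALSE_POS=1(1);
CC   /TAXO-RANGE=??E??; /MAX-REPEAT=1;
DO   PDOC99999;
//
"""

# Assumed qualifier layout: /NAME=value; (several may share one line).
QUALIFIER = re.compile(r"/([A-Z0-9_-]+)=([^;]*);")


def parse_entries(text):
    """Yield one dict per entry, mapping each field code to its list of lines."""
    entry = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            # Termination line: emit the entry collected so far.
            if entry:
                yield dict(entry)
            entry = defaultdict(list)
        elif line.strip():
            # The first two characters are the field code; the rest is content.
            code, content = line[:2], line[2:].strip()
            entry[code].append(content)


def qualifiers(lines):
    """Collect /NAME=value; pairs from a list of NR or CC lines into a dict."""
    return {name: value for line in lines for name, value in QUALIFIER.findall(line)}


def entry_type(entry):
    """Return the second item of the ID line (PATTERN, MATRIX, or RULE)."""
    return entry["ID"][0].rstrip(".").split(";")[1].strip()


if __name__ == "__main__":
    for record in parse_entries(SAMPLE):
        print(record["AC"][0], entry_type(record))
        print(qualifiers(record.get("NR", [])))
        print(qualifiers(record.get("CC", [])))
```

Storing each field as a list of lines keeps repeated codes such as NR, CC, and DR intact, so later steps can decide how to join or interpret them.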

5.3 References

  • Falquet L, Pagni M, Bucher P, Hulo N, Sigrist CJA, Hofmann K, Bairoch A. 2002. The PROSITE database, its status in 2002. Nucleic Acids Research. Jan 1;30(1):235-8.

  • Sigrist CJA, Cerutti L, Hulo N, Gattiker A, Falquet L, Pagni M, Bairoch A, Bucher P. 2002. PROSITE: a documented database using patterns and profiles as motif descriptors. Brief Bioinform. Sep;3(3):265-74.

  • Main page: http://us.expasy.org/prosite/

  • Release notes: http://us.expasy.org/prosite/psrelnot.html

  • User manual: http://us.expasy.org/prosite/prosuser.html

  • Download: ftp://us.expasy.org/databases/prosite/

Part II: Tools

Now that we've described the common data formats and databases, it's time to get to work! What can you do with the data? You can compare two or more sequences, compute properties for the sequences, and look for patterns and subsequences. The possibilities are nearly limitless.

Since there's no way to describe all of the available tools—or even just the ones we use—we decided to showcase the tools we use most. We've included the classics: Readseq, BLAST, ClustalW, and HMMER. We also cover MEME and MAST (two tools that deserve to be better known), and a rising star called BLAT. The final chapter contains a wealth of information about the widely used open source suite of EMBOSS tools.

Each tool's brief description is followed by examples and command-line options. We've also included helpful web sites and other references.

We'd like to thank all the developers for making this rich abundance of documentation available to users!
