The field codes in a Pfam flat file organize each entry's information so that it is both human-readable and easy to parse by machine. A typical entry contains several two-letter Pfam field codes. Table 4-1 provides definitions and descriptions of these codes.
Table 4-1. Pfam field definitions
| Field | Definition | Description |
| --- | --- | --- |
| AC | Accession number | PFxxxxx or PBxxxxxx. |
| ID | Identification | 15 characters or less. |
| DE | Definition | 80 characters or less. |
| AU | Author | Author of the entry. |
| AL | Alignment method of seed | Method used to align the seed members. Approved AL lines are: Clustalv, Clustalw, Clustalw_mask_xxxx, Domainer, HMM_built_from_alignment, HMM_simulated_annealing, Manual, Prosite_pattern, Prodom, Structure_superposition, pftools, Unknown. |
| BM | HMM building command lines | |
| SE | Source of seed | The source suggesting that seed members belong to a family. |
| GA | Gathering threshold | Search threshold used to build the full alignment. |
| NC | Noise cutoff | Bit scores of the highest-scoring match not in the full alignment. |
| TC | Trusted cutoff | Bit scores of the lowest-scoring match in the full alignment. |
| TP | Type field | Compulsory field describing the type of family. At present it can be one of: Family, Domain, Repeat, Motif. |
| PI | Previous IDs | |
| DC | Database Comment | Comment for database reference. |
| DR | Database Reference | Reference to external database. |
| RC | Reference Comment | Comment for literature reference. |
| RN | Reference Number | Digit in square brackets. |
| RM | Reference Medline | Eight-digit number. |
| RT | Reference Title | Title of paper. |
| RA | Reference Author | Author of paper. |
| RL | Reference Location | Location of paper. |
| CC | Comment | Comment lines provide annotation and other information. |
| NE | Pfam accession | Indicates those cases where there is a nested domain. |
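To make the field codes concrete, the following Python sketch collects them from an entry and checks a few of the limits listed in Table 4-1. It assumes that each annotation line has the Stockholm-style form `#=GF <code> <text>`; that layout is not described in this section, so treat it as an assumption. The function names `parse_fields` and `check_entry` and the sample entry are invented for illustration and are not part of Pfam.

```python
import re
from collections import defaultdict

# Assumption: annotation lines look like "#=GF <two-letter code> <text>".
GF_LINE = re.compile(r"^#=GF\s+([A-Z]{2})\s+(.*\S)\s*$")

# Allowed TP values, taken from Table 4-1.
FAMILY_TYPES = {"Family", "Domain", "Repeat", "Motif"}


def parse_fields(lines):
    """Collect field-code annotations into {code: [value, ...]}."""
    fields = defaultdict(list)
    for line in lines:
        match = GF_LINE.match(line)
        if match:
            code, text = match.groups()
            # Codes such as CC or RN can repeat, so keep every value.
            fields[code].append(text)
    return dict(fields)


def check_entry(fields):
    """Return a list of problems found against the limits in Table 4-1."""
    problems = []
    ac = fields.get("AC", [""])[0]
    # PF + 5 digits or PB + 6 digits, as in the AC row. Real files may
    # append extra characters (e.g. a version); this sketch does not
    # handle that.
    if not re.fullmatch(r"PF\d{5}|PB\d{6}", ac):
        problems.append(f"AC has unexpected form: {ac!r}")
    if len(fields.get("ID", [""])[0]) > 15:
        problems.append("ID longer than 15 characters")
    if any(len(de) > 80 for de in fields.get("DE", [])):
        problems.append("DE longer than 80 characters")
    tp = fields.get("TP", [])
    if not tp:
        problems.append("TP is compulsory but missing")
    elif tp[0] not in FAMILY_TYPES:
        problems.append(f"TP has unknown type: {tp[0]!r}")
    return problems


# Invented sample entry, for illustration only (not a real Pfam record).
sample = """\
#=GF ID   Example_dom
#=GF AC   PF00000
#=GF DE   Hypothetical example domain
#=GF TP   Domain
#=GF CC   First comment line.
#=GF CC   Second comment line.
""".splitlines()

fields = parse_fields(sample)
print(fields["AC"])         # ['PF00000']
print(len(fields["CC"]))    # 2
print(check_entry(fields))  # []
```

Each code maps to a list rather than a single string because fields such as CC and the reference fields can occur more than once in an entry.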