2.5 EMBL Example Flat File

Example 2-3 shows a sample sequence entry from EMBL. The entry uses terms from the EMBL Field Definitions and the DDBJ/EMBL/GenBank Feature Table, both discussed later in this chapter.

Example 2-3. Sample EMBL entry
ID   HSCDK2MR   standard; RNA; HUM; 1476 BP.
XX
AC   X61622;
XX
SV   X61622.1
XX
DT   15-JAN-1992 (Rel. 30, Created)
DT   15-JAN-1992 (Rel. 30, Last updated, Version 1)
XX
DE   H.sapiens CDK2 mRNA
XX
KW   CDK2 gene; cell cycle regulation protein; cyclin A binding; protein kinase.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-1476
RX   MEDLINE; 91330891.
RA   Elledge S.J., Spottswood M.R.;
RT   "A new human p34 protein kinase, CDK2, identified by complementation of a
RT   cdc28 mutation in Saccharomyces cerevisiae, is a homolog of Xenopus Eg1";
RL   EMBO J. 10:2653-2659(1991).
XX
RN   [2]
RP   1-1476
RA   Elledge S.J.;
RT   ;
RL   Submitted (28-NOV-1991) to the EMBL/GenBank/DDBJ databases.
RL   S.J. Elledge, Dept. of Biochemistry, Baylor College of Medicine, 1 Baylor
RL   Place, Houston, TX 77030, USA
XX
DR   GDB; 128984; CDK2.
DR   SWISS-PROT; P24941; CDK2_HUMAN.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..1476
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT                   /cell_line="EBV transformed Human peripheral lymphocyte
FT                   (B-cell)"
FT                   /clone_lib="lambda YES-R cDNA library"
FT                   /clone="pSE1000"
FT   CDS             1..897
FT                   /db_xref="SWISS-PROT:P24941"
FT                   /note="cell division kinase. CDC2 homolog"
FT                   /gene="CDK2"
FT                   /function="protein kinase"
FT                   /protein_
FT                   /translation="MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGVP
FT                   STAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLI
FT                   KSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYTHEV
FT                   VTLWYRAPEILLGSKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTP
FT                   DEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAAL
FT                   AHPFFQDVTKPVPHLRL"
XX
SQ   Sequence 1476 BP; 368 A; 372 C; 351 G; 385 T; 0 other;
     atggagaact tccaaaaggt ggaaaagatc ggagagggca cgtacggagt tgtgtacaaa        60
     gccagaaaca agttgacggg agaggtggtg gcgcttaaga aaatccgcct ggacactgag       120
     actgagggtg tgcccagtac tgccatccga gagatctctc tgcttaagga gcttaaccat       180
     cctaatattg tcaagctgct ggatgtcatt cacacagaaa ataaactcta cctggttttt       240
     gaatttctgc accaagatct caagaaattc atggatgcct ctgctctcac tggcattcct       300
     cttcccctca tcaagagcta tctgttccag ctgctccagg gcctagcttt ctgccattct       360
     catcgggtcc tccaccgaga ccttaaacct cagaatctgc ttattaacac agagggggcc       420
     atcaagctag cagactttgg actagccaga gcttttggag tccctgttcg tacttacacc       480
     catgaggtgg tgaccctgtg gtaccgagct cctgaaatcc tcctgggctc gaaatattat       540
     tccacagctg tggacatctg gagcctgggc tgcatctttg ctgagatggt gactcgccgg       600
     gccctgttcc ctggagattc tgagattgac cagctcttcc ggatctttcg gactctgggg       660
     accccagatg aggtggtgtg gccaggagtt acttctatgc ctgattacaa gccaagtttc       720
     cccaagtggg cccggcaaga ttttagtaaa gttgtacctc ccctggatga agatggacgg       780
     agcttgttat cgcaaatgct gcactacgac cctaacaagc ggatttcggc caaggcagcc       840
     ctggctcacc ctttcttcca ggatgtgacc aagccagtac cccatcttcg actctgatag       900
     ccttcttgaa gcccccgacc ctaatcggct caccctctcc tccagtgtgg gcttgaccag       960
     cttggccttg ggctatttgg actcaggtgg gccctctgaa cttgccttaa acactcacct      1020
     tctagtctta accagccaac tctgggaata caggggtgaa aggggggaac cagtgaaaat      1080
     gaaaggaagt ttcagtatta gatgcactta agttagcctc caccaccctt tcccccttct      1140
     cttagttatt gctgaagagg gttggtataa aaataatttt aaaaaagcct tcctacacgt      1200
     tagatttgcc gtaccaatct ctgaatgccc cataattatt atttccagtg tttgggatga      1260
     ccaggatccc aagcctcctg ctgccacaat gtttataaag gccaaatgat agcgggggct      1320
     aagttggtgc ttttgagaat taagtaaaac aaaaccactg ggaggagtct attttaaaga      1380
     attcggttaa aaaatagatc caatcagttt ataccctagt tagtgttttc ctcacctaat      1440
     aggctgggag actgaagact cagcccgggt gggggt                                1476
//
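
The layout of the entry (a two-letter line code, a data field beginning at column 6, XX spacer lines, and a // terminator) lends itself to simple line-by-line parsing. The following Python sketch is one way to read such an entry. It is a minimal illustration, not a complete EMBL parser: the function name parse_embl_entry and the SAMPLE string are hypothetical, the sample is a truncated excerpt of Example 2-3, and the code assumes that sequence lines are the only lines that begin with blanks.

```python
from collections import defaultdict


def parse_embl_entry(text):
    """Group the lines of one EMBL entry by their two-letter line code.

    Returns (fields, sequence). fields maps each line code (ID, AC, DE, ...)
    to the list of data strings found on lines with that code. sequence is
    the concatenated residues from the lines that follow the SQ header.
    """
    fields = defaultdict(list)
    residues = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):        # entry terminator
            break
        code = line[:2]
        if in_sequence and not code.strip():
            # Sequence lines start with blanks and end with a running base
            # count, so keep only the letters.
            residues.append("".join(ch for ch in line if ch.isalpha()))
            continue
        if code == "XX" or not line.strip():
            continue                     # spacer lines carry no data
        fields[code].append(line[5:].strip())
        in_sequence = code == "SQ"
    return dict(fields), "".join(residues)


# Truncated excerpt of Example 2-3: only two sequence lines are included,
# so the totals in the SQ header do not match this shortened sequence.
SAMPLE = """\
ID   HSCDK2MR   standard; RNA; HUM; 1476 BP.
XX
AC   X61622;
XX
DE   H.sapiens CDK2 mRNA
XX
SQ   Sequence 1476 BP; 368 A; 372 C; 351 G; 385 T; 0 other;
     atggagaact tccaaaaggt ggaaaagatc ggagagggca cgtacggagt tgtgtacaaa        60
     gccagaaaca agttgacggg agaggtggtg gcgcttaaga aaatccgcct ggacactgag       120
//
"""

fields, seq = parse_embl_entry(SAMPLE)
print(fields["AC"])   # ['X61622;']
print(len(seq))       # 120
```

The sketch keeps every field as raw text and does not join continuation lines or interpret feature table qualifiers. For anything beyond quick inspection, an established parser such as Biopython's SeqIO module, which accepts the format name "embl", is probably the safer choice.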


Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases
ISBN: 059600494X
Year: 2005
Pages: 312