2.2 GenBank Example Flat File

Example 2-1 contains a sample sequence entry from GenBank. This entry contains terms from the GenBank Field Definitions and the DDBJ/EMBL/GenBank Feature Table, discussed later in this chapter.

Example 2-1. Sample GenBank entry
LOCUS       HSCDK2MR                1476 bp    mRNA    linear   PRI 15-JAN-1992
DEFINITION  H.sapiens CDK2 mRNA.
ACCESSION   X61622
VERSION     X61622.1  GI:29848
KEYWORDS    CDK2 gene; cell cycle regulation protein; cyclin A binding; protein
            kinase.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1476)
  AUTHORS   Elledge,S.J. and Spottswood,M.R.
  TITLE     A new human p34 protein kinase, CDK2, identified by complementation
            of a cdc28 mutation in Saccharomyces cerevisiae, is a homolog of
            Xenopus Eg1
  JOURNAL   EMBO J. 10 (9), 2653-2659 (1991)
  MEDLINE   91330891
REFERENCE   2  (bases 1 to 1476)
  AUTHORS   Elledge,S.J.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-NOV-1991) S.J. Elledge, Dept. of Biochemistry, Baylor
            College of Medicine, 1 Baylor Place, Houston, TX 77030, USA
FEATURES             Location/Qualifiers
     source          1..1476
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="pSE1000"
                     /cell_line="EBV transformed Human peripheral lymphocyte
                     (B-cell)"
                     /clone_lib="lambda YES-R cDNA library"
     gene            1..1476
                     /gene="CDK2"
     CDS             1..897
                     /gene="CDK2"
                     /function="protein kinase"
                     /note="cell division kinase. CDC2 homolog"
                     /codon_start=1
                     /protein_
                     /db_xref="GI:29849"
                     /db_xref="SWISS-PROT:P24941"
                     /translation="MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGV
                     PSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLP
                     LIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYT
                     HEVVTLWYRAPEILLGSKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRT
                     LGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRIS
                     AKAALAHPFFQDVTKPVPHLRL"
BASE COUNT      368 a    372 c    351 g    385 t
ORIGIN
        1 atggagaact tccaaaaggt ggaaaagatc ggagagggca cgtacggagt tgtgtacaaa
       61 gccagaaaca agttgacggg agaggtggtg gcgcttaaga aaatccgcct ggacactgag
      121 actgagggtg tgcccagtac tgccatccga gagatctctc tgcttaagga gcttaaccat
      181 cctaatattg tcaagctgct ggatgtcatt cacacagaaa ataaactcta cctggttttt
      241 gaatttctgc accaagatct caagaaattc atggatgcct ctgctctcac tggcattcct
      301 cttcccctca tcaagagcta tctgttccag ctgctccagg gcctagcttt ctgccattct
      361 catcgggtcc tccaccgaga ccttaaacct cagaatctgc ttattaacac agagggggcc
      421 atcaagctag cagactttgg actagccaga gcttttggag tccctgttcg tacttacacc
      481 catgaggtgg tgaccctgtg gtaccgagct cctgaaatcc tcctgggctc gaaatattat
      541 tccacagctg tggacatctg gagcctgggc tgcatctttg ctgagatggt gactcgccgg
      601 gccctgttcc ctggagattc tgagattgac cagctcttcc ggatctttcg gactctgggg
      661 accccagatg aggtggtgtg gccaggagtt acttctatgc ctgattacaa gccaagtttc
      721 cccaagtggg cccggcaaga ttttagtaaa gttgtacctc ccctggatga agatggacgg
      781 agcttgttat cgcaaatgct gcactacgac cctaacaagc ggatttcggc caaggcagcc
      841 ctggctcacc ctttcttcca ggatgtgacc aagccagtac cccatcttcg actctgatag
      901 ccttcttgaa gcccccgacc ctaatcggct caccctctcc tccagtgtgg gcttgaccag
      961 cttggccttg ggctatttgg actcaggtgg gccctctgaa cttgccttaa acactcacct
     1021 tctagtctta accagccaac tctgggaata caggggtgaa aggggggaac cagtgaaaat
     1081 gaaaggaagt ttcagtatta gatgcactta agttagcctc caccaccctt tcccccttct
     1141 cttagttatt gctgaagagg gttggtataa aaataatttt aaaaaagcct tcctacacgt
     1201 tagatttgcc gtaccaatct ctgaatgccc cataattatt atttccagtg tttgggatga
     1261 ccaggatccc aagcctcctg ctgccacaat gtttataaag gccaaatgat agcgggggct
     1321 aagttggtgc ttttgagaat taagtaaaac aaaaccactg ggaggagtct attttaaaga
     1381 attcggttaa aaaatagatc caatcagttt ataccctagt tagtgttttc ctcacctaat
     1441 aggctgggag actgaagact cagcccgggt gggggt
//
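
The column layout of the entry above (keyword in the first 12 columns, value from column 13, sequence after ORIGIN, `//` as terminator) can be read with a few lines of code. The following is a minimal sketch in Python, not an official GenBank parser: the function name `parse_flat_file` and the `SAMPLE` string are hypothetical, `SAMPLE` is an abbreviated excerpt of Example 2-1 that keeps only the first two sequence lines, and the sketch assumes the fixed 12-column keyword layout and deliberately skips the feature table.

```python
import re

# Abbreviated excerpt of Example 2-1 (illustration only: most header fields,
# most features, and all but the first two sequence lines are omitted).
SAMPLE = """\
LOCUS       HSCDK2MR                1476 bp    mRNA    linear   PRI 15-JAN-1992
DEFINITION  H.sapiens CDK2 mRNA.
ACCESSION   X61622
VERSION     X61622.1  GI:29848
KEYWORDS    CDK2 gene; cell cycle regulation protein; cyclin A binding; protein
            kinase.
FEATURES             Location/Qualifiers
     gene            1..1476
                     /gene="CDK2"
ORIGIN
        1 atggagaact tccaaaaggt ggaaaagatc ggagagggca cgtacggagt tgtgtacaaa
       61 gccagaaaca agttgacggg agaggtggtg gcgcttaaga aaatccgcct ggacactgag
//
"""


def parse_flat_file(text):
    """Return (fields, sequence) for a single flat-file entry.

    fields maps each keyword to a list of values, because keywords such
    as REFERENCE can occur more than once in one entry.
    """
    fields = {}
    sequence = []
    current = None      # keyword whose value may continue on the next line
    state = "header"    # one of "header", "features", "origin"
    for line in text.splitlines():
        if line.startswith("//"):           # end-of-entry marker
            break
        if state == "origin":
            # Drop the position numbers and spaces, keep only the bases.
            sequence.append(re.sub(r"[^A-Za-z]", "", line))
            continue
        top_level = line[:1].strip() != ""  # keyword starts in column 1
        if state == "features" and not top_level:
            continue                        # feature table is not parsed here
        keyword = line[:12].strip()
        value = line[12:].strip()
        if keyword == "ORIGIN":
            state = "origin"
        elif keyword == "FEATURES":
            state = "features"
            current = None
        elif keyword:                       # keyword or indented sub-keyword
            state = "header"
            current = keyword
            fields.setdefault(keyword, []).append(value)
        elif current is not None and value:
            # Continuation line: the first 12 columns are blank.
            fields[current][-1] += " " + value
    return fields, "".join(sequence)


fields, seq = parse_flat_file(SAMPLE)
print(fields["ACCESSION"][0], len(seq))
```

Run against the complete entry rather than the excerpt, the same function would also collect sub-keywords such as ORGANISM and AUTHORS as separate keys. For real work, an established parser (for example, the ones shipped with BioPerl or Biopython) is a safer choice than a hand-written one.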


Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases
ISBN: 059600494X
Year: 2005
Pages: 312
