The tokenize Module

The tokenize module splits a Python source file into individual tokens. It can be used for syntax highlighting or for various kinds of code-analysis tools.

In Example 13-17, we simply print the tokens.

Example 13-17. Using the tokenize Module
File: tokenize-example-1.py

import tokenize

file = open("tokenize-example-1.py")

# Python 2 only: tuple parameters and the callback form of tokenize.tokenize
def handle_token(type, token, (srow, scol), (erow, ecol), line):
    print "%d,%d-%d,%d:\t%s\t%s" % \
        (srow, scol, erow, ecol, tokenize.tok_name[type], repr(token))

tokenize.tokenize(
    file.readline,
    handle_token
    )

1,0-1,6: NAME import
1,7-1,15: NAME tokenize
1,15-1,16: NEWLINE \012
2,0-2,1: NL \012
3,0-3,4: NAME file
3,5-3,6: OP =
3,7-3,11: NAME open
3,11-3,12: OP (
3,12-3,35: STRING "tokenize-example-1.py"
3,35-3,36: OP )
3,36-3,37: NEWLINE \012
...

Note that the tokenize function takes two callable objects: the first argument is called repeatedly to fetch new code lines, and the second argument is called for each token.
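
The two-callable interface described above is the Python 2 form. For readers on Python 3, the following is a minimal sketch of the same idea, assuming tokenize.generate_tokens, which still takes a readline callable but yields the tokens one by one instead of passing them to a second callable. The helper name describe_tokens and the sample source string are illustrative and not part of the original example.

```python
import io
import tokenize

# Sample source to tokenize (illustrative; any Python source text works).
source = 'file = open("tokenize-example-1.py")\n'

def describe_tokens(text):
    """Return one formatted line per token found in text."""
    # generate_tokens calls readline repeatedly to fetch new code lines.
    readline = io.StringIO(text).readline
    lines = []
    for tok in tokenize.generate_tokens(readline):
        (srow, scol), (erow, ecol) = tok.start, tok.end
        lines.append("%d,%d-%d,%d:\t%s\t%r" % (
            srow, scol, erow, ecol,
            tokenize.tok_name[tok.type], tok.string))
    return lines

for line in describe_tokens(source):
    print(line)
```

Each yielded token carries the same fields that the callback receives in the example above: the token type, the token string, the start and end positions, and the source line.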

Python Standard Library
ISBN: 0596000960
Year: 2000
Pages: 252
Authors: Fredrik Lundh