Other Client-Side Tools

So far in this chapter, we have focused on Python's FTP and email processing tools and have met a handful of client-side scripting modules along the way: ftplib, poplib, smtplib, mhlib, mimetools, urllib, rfc822, and so on. This set is representative of Python's library tools for transferring and processing information over the Internet, but it's not at all complete. A more or less comprehensive list of Python's Internet-related modules appears at the start of the previous chapter. Among other things, Python also includes client-side support libraries for Internet news, Telnet, HTTP, and other standard protocols.

11.5.1 NNTP: Accessing Newsgroups

Python's nntplib module supports the client-side interface to NNTP -- the Network News Transfer Protocol -- which is used for reading and posting articles to Usenet newsgroups in the Internet. Like other protocols, NNTP runs on top of sockets and merely defines a standard message protocol; like other modules, nntplib hides most of the protocol details and presents an object-based interface to Python scripts.

We won't get into protocol details here, but in brief, NNTP servers store a range of articles on the server machine, usually in a flat-file database. If you have the domain or IP name of a server machine that runs an NNTP server program listening on the NNTP port, you can write scripts that fetch or post articles from any machine that has Python and an Internet connection. For instance, the script in Example 11-24 by default fetches and displays the last 10 articles from Python's Internet news group, comp.lang.python, from the news.rmi.net NNTP server at my ISP.

Example 11-24. PP2EInternetOther eadnews.py

###############################################
# fetch and print usenet newsgroup postings
# from comp.lang.python via the nntplib module
# which really runs on top of sockets; nntplib 
# also supports posting new messages, etc.;
# note: posts not deleted after they are read;
###############################################

listonly = 0
showhdrs = ['From', 'Subject', 'Date', 'Newsgroups', 'Lines']
try:
 import sys
 servername, groupname, showcount = sys.argv[1:]
 showcount = int(showcount)
except:
 servername = 'news.rmi.net'
 groupname = 'comp.lang.python' # cmd line args or defaults
 showcount = 10 # show last showcount posts

# connect to nntp server
print 'Connecting to', servername, 'for', groupname
from nntplib import NNTP
connection = NNTP(servername)
(reply, count, first, last, name) = connection.group(groupname)
print '%s has %s articles: %s-%s' % (name, count, first, last)

# get request headers only
fetchfrom = str(int(last) - (showcount-1))
(reply, subjects) = connection.xhdr('subject', (fetchfrom + '-' + last))

# show headers, get message hdr+body
for (id, subj) in subjects: # [-showcount:] if fetch all hdrs
 print 'Article %s [%s]' % (id, subj)
 if not listonly and raw_input('=> Display?') in ['y', 'Y']:
 reply, num, tid, list = connection.head(id)
 for line in list:
 for prefix in showhdrs:
 if line[:len(prefix)] == prefix:
 print line[:80]; break
 if raw_input('=> Show body?') in ['y', 'Y']:
 reply, num, tid, list = connection.body(id)
 for line in list:
 print line[:80]
 print
print connection.quit()

As for FTP and email tools, the script creates an NNTP object and calls its methods to fetch newsgroup information and articles' header and body text. The xhdr method, for example, loads selected headers from a range of messages. When run, this program connects to the server and displays each article's subject line, pausing to ask whether it should fetch and show the article's header information lines (headers listed in variable showhdrs only) and body text:

C:...PP2EInternetOther>python readnews.py
Connecting to news.rmi.net for comp.lang.python
comp.lang.python has 3376 articles: 30054-33447
Article 33438 [Embedding? file_input and eval_input]
=> Display?

Article 33439 [Embedding? file_input and eval_input]
=> Display?y
From: James Spears 
Newsgroups: comp.lang.python
Subject: Embedding? file_input and eval_input
Date: Fri, 11 Aug 2000 10:55:39 -0700
Lines: 34
=> Show body?

Article 33440 [Embedding? file_input and eval_input]
=> Display?

Article 33441 [Embedding? file_input and eval_input]
=> Display?

Article 33442 [Embedding? file_input and eval_input]
=> Display?

Article 33443 [Re: PYHTONPATH]
=> Display?y
Subject: Re: PYHTONPATH
Lines: 13
From: sp00fd 
Newsgroups: comp.lang.python
Date: Fri, 11 Aug 2000 11:06:23 -0700
=> Show body?y
Is this not what you were looking for?

Add to cgi script:
import sys
sys.path.insert(0, "/path/to/dir")
import yourmodule

-----------------------------------------------------------
Got questions? Get answers over the phone at Keen.com.
Up to 100 minutes free!
http://www.keen.com

Article 33444 [Loading new code...]
=> Display?

Article 33445 [Re: PYHTONPATH]
=> Display?

Article 33446 [Re: Compile snags on AIX & IRIX]
=> Display?

Article 33447 [RE: string.replace() can't replace newline characters???]
=> Display?

205 GoodBye

We can also pass this script an explicit server name, newsgroup, and display count on the command line to apply it in different ways. Here is this Python script checking the last few messages in Perl and Linux newsgroups:

C:...PP2EInternetOther>python readnews.py news.rmi.net comp.lang.perl.misc 5
Connecting to news.rmi.net for comp.lang.perl.misc
comp.lang.perl.misc has 5839 articles: 75543-81512
Article 81508 [Re: Simple Argument Passing Question]
=> Display?

Article 81509 [Re: How to Access a hash value?]
=> Display?

Article 81510 [Re: London =?iso-8859-1?Q?=A330-35K?= Perl Programmers Required]
=> Display?

Article 81511 [Re: ODBC question]
=> Display?

Article 81512 [Re: ODBC question]
=> Display?

205 GoodBye


C:...PP2EInternetOther>python readnews.py news.rmi.net comp.os.linux 4
Connecting to news.rmi.net for comp.os.linux
comp.os.linux has 526 articles: 9015-9606
Article 9603 [Re: Simple question about CD-Writing for Linux]
=> Display?

Article 9604 [Re: How to start the ftp?]
=> Display?

Article 9605 [Re: large file support]
=> Display?

Article 9606 [Re: large file support]
=> Display?y
From: andy@physast.uga.edu (Andreas Schweitzer)
Newsgroups: comp.os.linux.questions,comp.os.linux.admin,comp.os.linux
Subject: Re: large file support
Date: 11 Aug 2000 18:32:12 GMT
Lines: 19
=> Show body?n

205 GoodBye

With a little more work, we could turn this script into a full-blown news interface. For instance, new articles could be posted from within a Python script with code of this form (assuming the local file already contains proper NNTP header lines):

# to post, say this (but only if you really want to post!)
connection = NNTP(servername)
localfile = open('filename') # file has proper headers
connection.post(localfile) # send text to newsgroup
connection.quit()

We might also add a Tkinter-based GUI frontend to this script to make it more usable, but we'll leave such an extension on the suggested exercise heap (see also the PyMailGui interface's suggested extensions in the previous section).

11.5.2 HTTP: Accessing Web Sites

Python's standard library (that is, modules that are installed with the interpreter) also includes client-side support for HTTP -- the Hypertext Transfer Protocol -- a message structure and port standard used to transfer information on the World Wide Web. In short, this is the protocol that your web browser (e.g., Internet Explorer, Netscape) uses to fetch web pages and run applications on remote servers as you surf the Net. At the bottom, it's just bytes sent over port 80.

To really understand HTTP-style transfers, you need to know some of the server-side scripting topics covered in the next three chapters (e.g., script invocations and Internet address schemes), so this section may be less useful to readers with no such background. Luckily, though, the basic HTTP interfaces in Python are simple enough for a cursory understanding even at this point in the book, so let's take a brief look here.

Python's standard httplib module automates much of the protocol defined by HTTP and allows scripts to fetch web pages much like web browsers. For instance, the script in Example 11-25 can be used to grab any file from any server machine running an HTTP web server program. As usual, the file (and descriptive header lines) is ultimately transferred over a standard socket port, but most of the complexity is hidden by the httplib module.

Example 11-25. PP2EInternetOtherhttp-getfile.py

#######################################################################
# fetch a file from an http (web) server over sockets via httplib;
# the filename param may have a full directory path, and may name a cgi
# script with query parameters on the end to invoke a remote program;
# fetched file data or remote program output could be saved to a local
# file to mimic ftp, or parsed with string.find or the htmllib module;
#######################################################################

import sys, httplib
showlines = 6
try:
 servername, filename = sys.argv[1:] # cmdline args?
except:
 servername, filename = 'starship.python.net', '/index.html'

print servername, filename
server = httplib.HTTP(servername) # connect to http site/server
server.putrequest('GET', filename) # send request and headers
server.putheader('Accept', 'text/html') # POST requests work here too
server.endheaders() # as do cgi script file names 

errcode, errmsh, replyheader = server.getreply() # read reply info headers
if errcode != 200: # 200 means success
 print 'Error sending request', errcode
else:
 file = server.getfile() # file obj for data received
 data = file.readlines()
 file.close() # show lines with eoln at end
 for line in data[:showlines]: print line, # to save, write data to file 

Desired server names and filenames can be passed on the command line to override hardcoded defaults in the script. You need to also know something of the HTTP protocol to make the most sense of this code, but it's fairly straightforward to decipher. When run on the client, this script makes a HTTP object to connect to the server, sends it a GET request along with acceptable reply types, and then reads the server's reply. Much like raw email message text, the HTTP server's reply usually begins with a set of descriptive header lines, followed by the contents of the requested file. The HTTP object's getfile method gives us a file object from which we can read the downloaded data.

Let's fetch a few files with this script. Like all Python client-side scripts, this one works on any machine with Python and an Internet connection (here it runs on a Windows client). Assuming that all goes well, the first few lines of the downloaded file are printed; in a more realistic application, the text we fetch would probably be saved to a local file, parsed with Python's htmllib module, and so on. Without arguments, the script simply fetches the HTML index page at http://starship.python.org:

C:...PP2EInternetOther>python http-getfile.py
starship.python.net /index.html

Starship Python

Introducing Python

Part I: System Interfaces

System Tools

Parallel System Tools

Larger System Examples I

Larger System Examples II

Part II: GUI Programming

Graphical User Interfaces

A Tkinter Tour, Part 1

A Tkinter Tour, Part 2

Larger GUI Examples

Part III: Internet Scripting

Network Scripting

Client-Side Scripting

Server-Side Scripting

Larger Web Site Examples I

Larger Web Site Examples II

Advanced Internet Topics

Part IV: Assorted Topics

Databases and Persistence

Data Structures

Text and Language

Part V: Integration

Extending Python

Embedding Python

VI: The End

Conclusion Python and the Development Cycle



Programming Python
Python Programming for the Absolute Beginner, 3rd Edition
ISBN: 1435455002
EAN: 2147483647
Year: 2000
Pages: 245

Similar book on Amazon

Flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net