Section 14.9. NNTP: Accessing Newsgroups



So far in this chapter, we have focused on Python's FTP and email processing tools and have met a handful of client-side scripting modules along the way: ftplib, poplib, smtplib, email, mimetools, urllib, and so on. This set is representative of Python's client-side library tools for transferring and processing information over the Internet, but it's not at all complete.

A more or less comprehensive list of Python's Internet-related modules appears at the start of the previous chapter. Among other things, Python also includes client-side support libraries for Internet news, Telnet, HTTP, XML-RPC, and other standard protocols. Most of these are analogous to modules we've already met: they provide an object-based interface that automates the underlying sockets and message structures.

For instance, Python's nntplib module supports the client-side interface to NNTP (the Network News Transfer Protocol), which is used for reading and posting articles to Usenet newsgroups on the Internet. Like other protocols, NNTP runs on top of sockets and merely defines a standard message protocol; like other modules, nntplib hides most of the protocol details and presents an object-based interface to Python scripts.
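To give a rough feel for what nntplib hides, here is a minimal sketch of the text exchanged when a client selects a newsgroup: the client sends a GROUP command line, and the server answers with a status line of the form "211 count first last name". The function names group_command and parse_group_reply are made up for this illustration and are not part of nntplib; unlike the library's own group method, this sketch converts the numbers to integers.

```python
def group_command(groupname):
    """Return the request line a client sends to select a newsgroup."""
    return 'GROUP %s\r\n' % groupname


def parse_group_reply(reply):
    """Split a '211 count first last name' status line into its fields.

    Hypothetical helper for illustration only; nntplib does this work
    (and much more) inside its NNTP.group method.
    """
    fields = reply.split()
    if len(fields) < 5 or fields[0] != '211':
        raise ValueError('unexpected reply: %r' % reply)
    count, first, last, name = fields[1:5]
    return int(count), int(first), int(last), name


request = group_command('comp.lang.python')
summary = parse_group_reply('211 3376 30054 33447 comp.lang.python\r\n')
```

In a real session these lines travel over a socket connected to the server's NNTP port; the module takes care of sending, receiving, and error checking for us.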

We won't get into protocol details here, but in brief, NNTP servers store a range of articles on the server machine, usually in a flat-file database. If you have the domain name or IP address of a machine that runs an NNTP server program listening on the NNTP port, you can write scripts that fetch articles from it or post articles to it, from any machine that has Python and an Internet connection. For instance, the script in Example 14-28 by default fetches and displays the last 10 articles from Python's Internet newsgroup, comp.lang.python, from the news.rmi.net NNTP server at my ISP.

Example 14-28. PP3E\Internet\Other\readnews.py

    ############################################################################
    # fetch and print usenet newsgroup posting from comp.lang.python via the
    # nntplib module, which really runs on top of sockets; nntplib also supports
    # posting new messages, etc.; note: posts not deleted after they are read;
    ############################################################################

    listonly = 0
    showhdrs = ['From', 'Subject', 'Date', 'Newsgroups', 'Lines']
    try:
        import sys
        servername, groupname, showcount = sys.argv[1:]
        showcount  = int(showcount)
    except:
        servername = 'news.rmi.net'
        groupname  = 'comp.lang.python'          # cmd line args or defaults
        showcount  = 10                          # show last showcount posts

    # connect to nntp server
    print 'Connecting to', servername, 'for', groupname
    from nntplib import NNTP
    connection = NNTP(servername)
    (reply, count, first, last, name) = connection.group(groupname)
    print '%s has %s articles: %s-%s' % (name, count, first, last)

    # get request headers only
    fetchfrom = str(int(last) - (showcount-1))
    (reply, subjects) = connection.xhdr('subject', (fetchfrom + '-' + last))

    # show headers, get message hdr+body
    for (id, subj) in subjects:                  # [-showcount:] if fetch all hdrs
        print 'Article %s [%s]' % (id, subj)
        if not listonly and raw_input('=> Display?') in ['y', 'Y']:
            reply, num, tid, list = connection.head(id)
            for line in list:
                for prefix in showhdrs:
                    if line[:len(prefix)] == prefix:
                        print line[:80]; break
            if raw_input('=> Show body?') in ['y', 'Y']:
                reply, num, tid, list = connection.body(id)
                for line in list:
                    print line[:80]
        print
    print connection.quit()

As with the FTP and email tools, the script creates an NNTP object and calls its methods to fetch newsgroup information and articles' header and body text. The xhdr method, for example, loads selected headers from a range of messages.
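The two bits of bookkeeping the script does around these calls can be sketched on their own, without a server: building the "first-last" range string passed to xhdr, and picking out the header lines worth showing. The function names below are invented for this illustration. Note one deliberate difference from the script: this sketch matches on the header name plus its colon, so a name such as Lines would not also match a longer header name that merely begins with it.

```python
def last_articles_range(last, showcount):
    """Build the 'first-last' range string covering the final showcount articles."""
    fetchfrom = int(last) - (showcount - 1)
    return '%d-%s' % (fetchfrom, last)


def select_headers(lines, showhdrs):
    """Keep only the header lines whose name appears in showhdrs."""
    return [line for line in lines
            if any(line.startswith(name + ':') for name in showhdrs)]


# sample values only; the header lines here are made up for the demo
article_range = last_articles_range('33447', 10)
shown = select_headers(
    ['From: someone@example.com', 'X-Trace: abc', 'Subject: Re: PYTHONPATH'],
    ['From', 'Subject', 'Date', 'Newsgroups', 'Lines'])
```

With the default display count of 10 and a last article number of 33447, the range starts at 33438.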

For NNTP servers that require authentication, you may also have to pass a username, a password, and possibly a reader-mode flag to the NNTP call, and you may need to be connected directly to the server's network in order to access it at all (e.g., not connected to an intermediate broadband provider). See the Python Library manual for more on other NNTP parameters and object methods.
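As a rough sketch of how such a call might be wrapped, the code below collects the optional login settings and passes along only those that were supplied. The helper names nntp_options and connect are hypothetical; user, password, and readermode are keyword arguments of the NNTP constructor. The import is placed inside connect so that the option-building part can be tried without nntplib or a server at hand (the module is not included in the newest Python releases).

```python
def nntp_options(user=None, password=None, readermode=False):
    """Collect the optional keyword arguments for the NNTP() call."""
    options = {}
    if user is not None:
        options['user'] = user
        options['password'] = password
    if readermode:
        options['readermode'] = True
    return options


def connect(servername, **login):
    """Open a connection, passing only the login options that were given."""
    from nntplib import NNTP      # imported lazily; needs a reachable server to succeed
    return NNTP(servername, **nntp_options(**login))


# placeholder credentials, for illustration only
anonymous = nntp_options()
with_login = nntp_options(user='myname', password='secret', readermode=True)
```

A script would then call something like connect(servername, user=..., password=...) in place of the plain NNTP(servername) call.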

When run, this program connects to the server and displays each article's subject line, pausing to ask whether it should fetch and show the article's header information lines (headers listed in the variable showhdrs only) and body text:

    C:\...\PP3E\Internet\Other>python readnews.py
    Connecting to news.rmi.net for comp.lang.python
    comp.lang.python has 3376 articles: 30054-33447
    Article 33438 [Embedding? file_input and eval_input]
    => Display?

    Article 33439 [Embedding? file_input and eval_input]
    => Display?y
    From: James Spears <jimsp@ichips.intel.com>
    Newsgroups: comp.lang.python
    Subject: Embedding? file_input and eval_input
    Date: Fri, 11 Aug 2000 10:55:39 -0700
    Lines: 34
    => Show body?

    Article 33440 [Embedding? file_input and eval_input]
    => Display?

    Article 33441 [Embedding? file_input and eval_input]
    => Display?

    Article 33442 [Embedding? file_input and eval_input]
    => Display?

    Article 33443 [Re: PYTHONPATH]
    => Display?y
    Subject: Re: PYTHONPATH
    Lines: 13
    From: sp00fd <sp00fdNOspSPAM@yahoo.com.invalid>
    Newsgroups: comp.lang.python
    Date: Fri, 11 Aug 2000 11:06:23 -0700
    => Show body?y
    Is this not what you were looking for?
    Add to cgi script:
    import sys
    sys.path.insert(0, "/path/to/dir")
    import yourmodule
    -----------------------------------------------------------
    Got questions?  Get answers over the phone at Keen.com.
    Up to 100 minutes free!
    http://www.keen.com

    Article 33444 [Loading new code...]
    => Display?

    Article 33445 [Re: PYTHONPATH]
    => Display?

    Article 33446 [Re: Compile snags on AIX & IRIX]
    => Display?

    Article 33447 [RE: string.replace( ) can't replace newline characters???]
    => Display?

    205 GoodBye

We can also pass this script an explicit server name, newsgroup, and display count on the command line to apply it in different ways. Here is this Python script checking the last few messages in Perl and Linux newsgroups:

    C:\...\PP3E\Internet\Other>python readnews.py news.rmi.net comp.lang.perl.misc 5
    Connecting to news.rmi.net for comp.lang.perl.misc
    comp.lang.perl.misc has 5839 articles: 75543-81512
    Article 81508 [Re: Simple Argument Passing Question]
    => Display?

    Article 81509 [Re: How to Access a hash value?]
    => Display?

    Article 81510 [Re: London =?iso-8859-1?Q?=A330-35K?= Perl Programmers Required]
    => Display?

    Article 81511 [Re: ODBC question]
    => Display?

    Article 81512 [Re: ODBC question]
    => Display?

    205 GoodBye

    C:\...\PP3E\Internet\Other>python readnews.py news.rmi.net comp.os.linux 4
    Connecting to news.rmi.net for comp.os.linux
    comp.os.linux has 526 articles: 9015-9606
    Article 9603 [Re: Simple question about CD-Writing for Linux]
    => Display?

    Article 9604 [Re: How to start the ftp?]
    => Display?

    Article 9605 [Re: large file support]
    => Display?

    Article 9606 [Re: large file support]
    => Display?y
    From: andy@physast.uga.edu (Andreas Schweitzer)
    Newsgroups: comp.os.linux.questions,comp.os.linux.admin,comp.os.linux
    Subject: Re: large file support
    Date: 11 Aug 2000 18:32:12 GMT
    Lines: 19
    => Show body?n

    205 GoodBye

With a little more work, we could turn this script into a full-blown news interface. For instance, new articles could be posted from within a Python script with code of this form (assuming the local file already contains proper NNTP header lines):

    # to post, say this (but only if you really want to post!)
    connection = NNTP(servername)
    localfile = open('filename')      # file has proper headers
    connection.post(localfile)        # send text to newsgroup
    connection.quit()
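What "proper headers" means can be sketched briefly: an article is a block of header lines, one blank line, and then the body text, much like an email message. The helper below, build_article, is a made-up name for this illustration, and the address and newsgroup are placeholders. It emits only From, Newsgroups, and Subject lines, on the assumption that the server fills in or requires others according to its own policy.

```python
def build_article(sender, newsgroups, subject, body):
    """Return article text: header lines, one blank line, then the body."""
    headers = [
        'From: ' + sender,
        'Newsgroups: ' + newsgroups,
        'Subject: ' + subject,
    ]
    return '\n'.join(headers) + '\n\n' + body


# placeholder values; misc.test is used here only as an example group name
article = build_article('someone@example.com', 'misc.test',
                        'Test post', 'Hello from a Python script.\n')
```

Text built this way could be saved to the local file that the posting code opens and sends.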

We might also add a Tkinter-based GUI frontend to this script to make it more usable, but we'll leave such an extension on the suggested exercise heap (see also the PyMailGUI interface's suggested extensions at the end of the next chapter; email and news messages have a similar structure).




Programming Python
ISBN: 0596009259
Year: 2004
Pages: 270
Authors: Mark Lutz