The urlparse Module

The urlparse module contains functions to split URLs into their components, to put the components back together, and to resolve relative URLs against a base URL. Example 7-16 shows the basic urlparse function.

Example 7-16. Using the urlparse Module

File: urlparse-example-1.py

import urlparse

print urlparse.urlparse("http://host/path;params?query#fragment")

('http', 'host', '/path', 'params', 'query', 'fragment')
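The same result can also be read field by field. The following is a minimal sketch, assuming Python 2.5 or later (where the result gained named attributes) or Python 3 (where these functions live in urllib.parse); the try/except import is an addition for illustration and is not part of the original example.

```python
# Assumption: fall back to urllib.parse, where Python 3 keeps these functions.
try:
    import urlparse  # Python 2
except ImportError:
    from urllib import parse as urlparse  # Python 3

parts = urlparse.urlparse("http://host/path;params?query#fragment")

# The result still behaves like the 6-tuple shown above...
as_tuple = tuple(parts)

# ...but each field can also be read by name instead of by position.
print(parts.scheme)
print(parts.netloc)
print(parts.path)
```

Reading fields by name avoids having to remember the order of the six components.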

A common use is to split an HTTP URL into host and path components (an HTTP request involves asking the host to return data identified by the path), as shown in Example 7-17.

Example 7-17. Using the urlparse Module to Parse HTTP Locators

File: urlparse-example-2.py

import urlparse

scheme, host, path, params, query, fragment = \
    urlparse.urlparse("http://host/path;params?query#fragment")

if scheme == "http":
    print "host", "=>", host
    # rebuild the part of the URL that is sent to the host
    if params:
        path = path + ";" + params
    if query:
        path = path + "?" + query
    print "path", "=>", path

host => host
path => /path;params?query
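The logic above can be wrapped in a small helper. This is only a sketch: the name split_http_url is hypothetical, and the fallback to "/" for an empty path is an assumption added here because an HTTP request needs a non-empty path; the original example does not do this.

```python
# Assumption: fall back to urllib.parse, where Python 3 keeps these functions.
try:
    import urlparse  # Python 2
except ImportError:
    from urllib import parse as urlparse  # Python 3


def split_http_url(url):
    """Return (host, request_path) for an HTTP URL, or None otherwise.

    split_http_url is a hypothetical helper, not part of the module.
    """
    scheme, host, path, params, query, fragment = urlparse.urlparse(url)
    if scheme != "http":
        return None
    # Assumption: use "/" when the URL has no path at all.
    path = path or "/"
    if params:
        path = path + ";" + params
    if query:
        path = path + "?" + query
    # The fragment is dropped on purpose; it is not sent to the host.
    return host, path


result = split_http_url("http://host/path;params?query#fragment")
```

Returning None for other schemes keeps the caller from sending an HTTP request for, say, an FTP locator.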

Alternatively, Example 7-18 shows how you can use the urlunparse function to put the URL back together again.

Example 7-18. Using the urlparse Module to Rebuild HTTP Locators

File: urlparse-example-3.py

import urlparse

scheme, host, path, params, query, fragment = \
    urlparse.urlparse("http://host/path;params?query#fragment")

if scheme == "http":
    print "host", "=>", host
    # leave out scheme, host, and fragment to get the request path
    print "path", "=>", urlparse.urlunparse(
        (None, None, path, params, query, None)
        )

host => host
path => /path;params?query
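As a rough check that urlunparse reverses urlparse, the sketch below rebuilds both the full URL and the request path. It assumes the same sample URL as above, and it passes empty strings rather than None for the omitted components, since Python 3's urllib.parse may reject a mix of None and string values.

```python
# Assumption: fall back to urllib.parse, where Python 3 keeps these functions.
try:
    import urlparse  # Python 2
except ImportError:
    from urllib import parse as urlparse  # Python 3

url = "http://host/path;params?query#fragment"
parts = urlparse.urlparse(url)

# Passing the parsed components straight back gives the original URL.
rebuilt = urlparse.urlunparse(parts)

# Blank out scheme, host, and fragment to keep only the request path.
# Empty strings are used here instead of None (an assumption, see above).
request_path = urlparse.urlunparse(("", "", parts[2], parts[3], parts[4], ""))

print(rebuilt)
print(request_path)
```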

Example 7-19 uses the urljoin function to combine an absolute URL with a second, possibly relative URL.

Example 7-19. Using the urlparse Module to Combine Relative Locators

File: urlparse-example-4.py

import urlparse

base = "http://spam.egg/my/little/pony"

for path in "/index", "goldfish", "../black/cat":
    print path, "=>", urlparse.urljoin(base, path)

/index => http://spam.egg/index
goldfish => http://spam.egg/my/little/goldfish
../black/cat => http://spam.egg/my/black/cat
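A typical use of urljoin is to resolve every link found on a page against the page's own URL. The sketch below assumes a hypothetical helper named resolve_links and adds one case not shown above: when the second argument is already an absolute URL, it is returned as is. The host other.example is made up for the illustration.

```python
# Assumption: fall back to urllib.parse, where Python 3 keeps these functions.
try:
    import urlparse  # Python 2
except ImportError:
    from urllib import parse as urlparse  # Python 3


def resolve_links(base, links):
    """Resolve each link against base (hypothetical helper)."""
    return [urlparse.urljoin(base, link) for link in links]


base = "http://spam.egg/my/little/pony"
links = ["/index", "goldfish", "../black/cat", "http://other.example/x"]

resolved = resolve_links(base, links)
for link, target in zip(links, resolved):
    print("%s => %s" % (link, target))
```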

Python Standard Library (Nutshell Handbooks)
ISBN: 0596000960
Year: 2000
Pages: 252
Authors: Fredrik Lundh
