You want to retrieve the contents of a URL. For example, you want to include part of one web page in another page's content.
Provide the URL to file_get_contents( ), as shown in Example 13-1.
Fetching a URL with file_get_contents( )
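The following is a minimal sketch of this approach, not a definitive listing. The helper name fetch_page( ) and the example.com URL in the comment are assumptions made for illustration.

```php
<?php
// Hypothetical helper: fetch a URL (or a local path) and return its contents.
function fetch_page($url)
{
    // file_get_contents() returns false on failure; @ suppresses the warning
    // so that the failure can be reported as an exception instead.
    $page = @file_get_contents($url);
    if ($page === false) {
        throw new RuntimeException("Could not retrieve $url");
    }
    return $page;
}

// Usage (placeholder URL):
// echo fetch_page('http://www.example.com/robots.txt');
```

Because the function goes through PHP's stream wrappers, the same call works for a local filename as well as for an HTTP URL.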
Or you can use the cURL extension, as shown in Example 13-2.
Fetching a URL with cURL
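A sketch of the cURL version might look like the following. It assumes the cURL extension is loaded; the helper name fetch_with_curl( ) and the URL in the comment are placeholders.

```php
<?php
// Hypothetical helper: fetch a URL with the cURL extension.
function fetch_with_curl($url)
{
    $c = curl_init($url);
    // Return the response body from curl_exec() instead of printing it.
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    $page = curl_exec($c);
    if ($page === false) {
        throw new RuntimeException('cURL error: ' . curl_error($c));
    }
    // The handle is released when $c goes out of scope; older code
    // often calls curl_close($c) explicitly at this point.
    return $page;
}

// Usage (placeholder URL):
// echo fetch_with_curl('http://www.example.com/robots.txt');
```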
You can also use the HTTP_Request class from PEAR, as shown in Example 13-3.
Fetching a URL with HTTP_Request
file_get_contents( ), like all PHP file-handling functions, uses PHP's streams feature. This means that it can handle local files as well as a variety of network resources, including HTTP URLs. There's a catch, though: the allow_url_fopen configuration setting must be turned on (which it usually is).
This makes for extremely easy retrieval of remote documents. As Example 13-4 shows, you can use the same technique to grab a remote XML document.
Fetching a remote XML document
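As a sketch, simplexml_load_file( ) accepts a URL just as file_get_contents( ) does. The helper below is hypothetical and assumes the remote document is an RSS 2.0 feed with the usual rss/channel/item/title layout; the feed URL in the comment is a placeholder.

```php
<?php
// Hypothetical helper: return the item titles from an RSS 2.0 feed.
function feed_titles($url)
{
    // simplexml_load_file() uses the same stream wrappers as
    // file_get_contents(), so $url may be remote or local.
    $xml = @simplexml_load_file($url);
    if ($xml === false) {
        throw new RuntimeException("Could not load XML from $url");
    }
    $titles = array();
    // Assumption: the document follows the RSS 2.0 structure.
    foreach ($xml->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
    return $titles;
}

// Usage (placeholder URL):
// print_r(feed_titles('http://www.example.com/feed.xml'));
```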
To retrieve a page that includes query string variables, use http_build_query( ) to create the query string. It accepts an array of key/value pairs and returns a single string with everything properly escaped. You're still responsible for the ? in the URL that sets off the query string. Example 13-5 demonstrates http_build_query( ).
Building a query string with http_build_query( )
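A short sketch of the technique follows. The variable names, the values, and the search.php URL are made up for illustration; the separator is passed explicitly so the result does not depend on the arg_separator.output setting.

```php
<?php
// Illustrative key/value pairs; the values need escaping in a URL.
$vars = array('page' => 4, 'search' => 'this & that');

// http_build_query() escapes each value and joins the pairs.
$qs = http_build_query($vars, '', '&');

// The ? that introduces the query string is added by hand.
$url = 'http://www.example.com/search.php?' . $qs;

// $page = file_get_contents($url);
```

With these inputs, $qs is page=4&search=this+%26+that: the spaces become + and the ampersand inside the value becomes %26.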
To retrieve a page protected by HTTP authentication, put the username and password in the URL. In Example 13-6, the username is david, and the password is hax0r.
Retrieving a protected page
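One way to sketch this is shown below. The helper name protected_url( ) and the host and path are assumptions; the username and password are the ones used in the text. The credentials are passed through rawurlencode( ) so that characters such as @ or : in a password do not break the URL.

```php
<?php
// Hypothetical helper: build an http:// URL with embedded credentials.
function protected_url($user, $password, $hostAndPath)
{
    return 'http://' . rawurlencode($user) . ':' . rawurlencode($password)
         . '@' . $hostAndPath;
}

$url = protected_url('david', 'hax0r', 'www.example.com/secrets.php');

// The http stream wrapper sends the credentials with the request.
// $page = file_get_contents($url);
```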
Example 13-7 shows how to retrieve a protected page with cURL.
Retrieving a protected page with cURL
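With cURL, the credentials are typically supplied through the CURLOPT_USERPWD option rather than in the URL. The following is a sketch under that assumption; the helper name and the URL in the comment are placeholders.

```php
<?php
// Hypothetical helper: fetch a page that requires a username and password.
function fetch_protected_with_curl($url, $user, $password)
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    // CURLOPT_USERPWD takes the credentials as "username:password".
    curl_setopt($c, CURLOPT_USERPWD, $user . ':' . $password);
    $page = curl_exec($c);
    if ($page === false) {
        throw new RuntimeException('cURL error: ' . curl_error($c));
    }
    return $page;
}

// Usage (placeholder URL):
// echo fetch_protected_with_curl('http://www.example.com/secrets.php', 'david', 'hax0r');
```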
Example 13-8 shows how to retrieve a protected page with HTTP_Request.
Retrieving a protected page with HTTP_Request
PHP's http stream wrapper automatically follows redirects. Since PHP 5.0.0, file_get_contents( ) and fopen( ) support a stream context argument that allows for specifying options about how the stream is retrieved. In PHP 5.1.0 and later, one of those options is max_redirects: the maximum number of redirects to follow. Example 13-9 sets max_redirects to 1, which turns off redirect following.
Not following redirects
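A minimal sketch of a context that disables redirect following is shown here. The helper names and the URL in the comment are assumptions made for illustration.

```php
<?php
// Hypothetical helper: a stream context that makes at most one request.
function no_redirect_context()
{
    return stream_context_create(array(
        'http' => array('max_redirects' => 1),
    ));
}

// Hypothetical helper: fetch a URL without following any redirect.
function fetch_without_redirects($url)
{
    // The second argument (use_include_path) is false; the context is third.
    return file_get_contents($url, false, no_redirect_context());
}

// Usage (placeholder URL):
// $page = fetch_without_redirects('http://www.example.com/redirector.php');
```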
The max_redirects stream wrapper option really indicates not how many redirects should be followed, but the maximum number of requests that should be made when following the redirect chain. That is, a value of 1 tells PHP to make at most 1 request (follow no redirects). A value of 2 tells PHP to make at most 2 requests (follow no more than 1 redirect). A value of 0, however, behaves like a value of 1: PHP makes just 1 request.
If the redirect chain would have PHP make more requests than are allowed by max_redirects, PHP issues a warning.
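Since the option counts requests rather than redirects, it can help to do the conversion in one place. The helper below is a hypothetical sketch that takes the number of redirects to follow and sets max_redirects to one more than that.

```php
<?php
// Hypothetical helper: build a context that follows at most $redirects
// redirects. max_redirects counts requests, so add 1 for the first request.
function context_following($redirects)
{
    $requests = max(0, (int) $redirects) + 1;
    return stream_context_create(array(
        'http' => array('max_redirects' => $requests),
    ));
}

// Follow up to three redirects (four requests in total):
// $page = file_get_contents($url, false, context_following(3));
```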
cURL only follows redirects when the CURLOPT_FOLLOWLOCATION option is set, as shown in Example 13-10.
Following redirects with cURL
To set a maximum number of redirects that cURL should follow, set CURLOPT_FOLLOWLOCATION to true and then set the CURLOPT_MAXREDIRS option to that maximum number.
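Both options can be combined as in the following sketch. The helper name, the default limit of 5, and the URL in the comment are assumptions, not values taken from the text.

```php
<?php
// Hypothetical helper: fetch a URL, following at most $maxRedirects redirects.
function fetch_following_redirects($url, $maxRedirects = 5)
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    // cURL ignores Location headers unless this option is set.
    curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);
    // Give up once the redirect chain exceeds this length.
    curl_setopt($c, CURLOPT_MAXREDIRS, $maxRedirects);
    $page = curl_exec($c);
    if ($page === false) {
        throw new RuntimeException('cURL error: ' . curl_error($c));
    }
    return $page;
}

// Usage (placeholder URL):
// echo fetch_following_redirects('http://www.example.com/redirector.php', 2);
```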
HTTP_Request does not follow redirects, but another PEAR module, HTTP_Client, can. HTTP_Client wraps around HTTP_Request and provides additional capabilities. Example 13-11 shows how to use HTTP_Client to follow redirects.
Following redirects with HTTP_Client
cURL can do a few different things with the page it retrieves. As you've seen in previous examples, if CURLOPT_RETURNTRANSFER is set, curl_exec( ) returns the body of the page requested. If CURLOPT_RETURNTRANSFER is not set, curl_exec( ) prints the response body.
To write the retrieved page to a file, open a file handle for writing with fopen( ) and set the CURLOPT_FILE option to that file handle. Example 13-12 uses cURL to copy a remote web page to a local file.
Writing a response body to a file with cURL
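The following sketch shows one way to do this. The helper name and the URL and filename in the comment are placeholders.

```php
<?php
// Hypothetical helper: copy the body of $url into the local file $path.
function save_url_to_file($url, $path)
{
    $fh = fopen($path, 'w');
    if ($fh === false) {
        throw new RuntimeException("Could not open $path for writing");
    }
    $c = curl_init($url);
    // Send the response body to the file handle instead of printing it.
    curl_setopt($c, CURLOPT_FILE, $fh);
    $ok = curl_exec($c);
    $error = curl_error($c);
    fclose($fh);
    if ($ok === false) {
        throw new RuntimeException("cURL error: $error");
    }
    return true;
}

// Usage (placeholder URL and filename):
// save_url_to_file('http://www.example.com/robots.txt', '/tmp/robots.txt');
```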
To pass the cURL resource and the contents of the retrieved page to a function, set the CURLOPT_WRITEFUNCTION option to a callback for that function (either a function name as a string, or an array containing a class name or object instance followed by a method name). The "write function" must return the number of bytes it was passed. Note that with large responses, the write function might get called more than once as cURL processes the response in chunks. Example 13-13 uses a cURL write function to save page contents in a database.
Saving a page to a database table with cURL
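A sketch of a write function is given below. The class name PageSaver, the helper save_page( ), and the pages table with its url and page columns are all hypothetical, and PDO is assumed as the database layer. The callback collects every chunk and stores the page only after the transfer has finished.

```php
<?php
// Hypothetical class: collects response chunks and stores them in a table.
class PageSaver
{
    protected $page = '';

    // cURL calls this once per chunk; it must return the chunk's length.
    public function write($curl, $chunk)
    {
        $this->page .= $chunk;
        return strlen($chunk);
    }

    public function getPage()
    {
        return $this->page;
    }

    // Assumption: a table named "pages" with "url" and "page" columns.
    public function save(PDO $db, $url)
    {
        $st = $db->prepare('INSERT INTO pages (url, page) VALUES (?, ?)');
        return $st->execute(array($url, $this->page));
    }
}

// Hypothetical helper: fetch $url through the write function, then save it.
function save_page($url, PDO $db)
{
    $saver = new PageSaver();
    $c = curl_init($url);
    // The callback is given as an array of object instance and method name.
    curl_setopt($c, CURLOPT_WRITEFUNCTION, array($saver, 'write'));
    if (curl_exec($c) === false) {
        throw new RuntimeException('cURL error: ' . curl_error($c));
    }
    return $saver->save($db, $url);
}
```

Accumulating the chunks in an object property keeps the page in one piece even when cURL invokes the callback several times for a large response.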
13.1.4. See Also
Recipe 13.2 for fetching a URL with the POST method; documentation on file_get_contents( ) at http://www.php.net/file_get_contents, simplexml_load_file( ) at http://www.php.net/simplexml_load_file, stream_context_create( ) at http://www.php.net/stream_context_create, curl_init( ) at http://www.php.net/curl-init, curl_setopt( ) at http://www.php.net/curl-setopt, curl_exec( ) at http://www.php.net/curl-exec, curl_getinfo( ) at http://www.php.net/curl_getinfo, and curl_close( ) at http://www.php.net/curl-close; the PEAR HTTP_Request class at http://pear.php.net/package/HTTP_Request; and the PEAR HTTP_Client class at http://pear.php.net/package/HTTP_Client.