This chapter discusses sessions and the inherent risks associated with stateful web applications. You will first learn the fundamentals of state, cookies , and sessions; then I will discuss several concernscookie theft, exposed session data, session fixation, and session hijackingalong with practices that you can employ to help prevent them.
The rumors are true: HTTP is a stateless protocol. This description recognizes the lack of association between any two HTTP requests. Because the protocol does not provide any method that the client can use to identify itself, the server cannot distinguish between clients.
While the stateless nature of HTTP has some important benefitsafter all, maintaining state requires some overheadit presents a unique challenge to developers who need to create stateful web applications. With no way to identify the client, it is impossible to determine whether the user is already logged in, has items in a shopping cart, or needs to register.
An elegant solution to this problem, originally conceived by Netscape, is a state management mechanism called cookies. Cookies are an extension of the HTTP protocol. More precisely, they consist of two HTTP headers: the Set-Cookie response header and the Cookie request header.
When a client sends a request for a particular URL, the server can opt to include a Set-Cookie header in the response. This is a request for the client to include a corresponding Cookie header in its future requests. Figure 4-1 illustrates this basic exchange.
Figure 4-1. A complete cookie exchange that involves two HTTP transactions
If you use this concept to allow a unique identifier to be included in each request (in a Cookie header), you can begin to uniquely identify clients and associate their requests together. This is all that is required for state, and this is the primary use of the mechanism.
The concept of session management builds upon the ability to maintain state by maintaining data associated with each unique client. This data is kept in a session data store, and it is updated on each request. Because the unique identifier specifies a particular record in the session data store, it's most often called the session identifier.
If you use PHP's native session mechanism, all of this complexity is handled for you. When you call session_start( ), PHP first determines whether a session identifier is included in the current request. If one is, the session data for that particular session is read and provided to you in the $_SESSION superglobal array. If one is not, PHP generates a session identifier and creates a new record in the session data store. It also handles propagating the session identifier and updating the session data store on each request. Figure 4-2 illustrates this process.
While this convenience is helpful, it is important to realize that it is not a complete solution. There is no inherent security in PHP's session mechanism, aside from the fact that the session identifier it generates is sufficiently random, thereby eliminating the practicality of prediction. You must provide your own safeguards to protect against all other session attacks. I will show you a few problems and solutions in this chapter.