HTTP, Browsers, and Credentials

It is easy to draw incorrect conclusions about the behavior of the Web; when you have a page displayed in your browser, it is natural to think that you are still connected to that site. In actuality, however, that's not the case: once your browser fetches the page from the server, both disconnect and forget about each other. If you follow a link, or ask for another page from the same server, a completely new exchange begins.

When you think about it, this is fairly obvious. It would make no sense for your browser to stay connected to the server while you went off to lunch or home for the day.

A protocol in which each transaction is unique and unrelated to the others is called stateless, and this has a bearing on how HTTP access control works.

When it comes to password-protected pages, the web server doesn't remember whether you've accessed them before. Down at the HTTP level, where the client (browser) and server talk to each other, the client has to prove who it is every time; it's the client that remembers your information.

When accessing a protected area for the first time in a session, here's what actually gets exchanged between the client and the server:

1. The client requests the page.

2. The server responds with a 401 Unauthorized status, saying in effect, "You are not authorized to access this resource. This resource is part of authentication realm XYZ." (This information is conveyed using the WWW-Authenticate response header field; see RFC 2616 for more information.)

3. If the client isn't an interactive browser, at this point it probably goes to step 7. If it is interactive, it asks the user for a username and password, and shows the name of the realm the server mentioned.

4. Having gotten credentials from the user, the client reissues the request for the document, including the credentials this time.

5. The server examines the provided credentials. If they're valid, it grants access and returns the document. If they aren't, it responds as it did in step 2.

6. If the client receives the unauthorized response again, it displays some message about it and asks the user whether to try entering the username and password again. If the user says yes, the client goes back to step 3.

7. If the user chooses not to reenter the username and password, the client gives up and accepts the "unauthorized" response from the server.

Once the client has successfully authenticated with the server, it remembers the credentials, URL, and realm involved. Subsequent requests that it makes for the same document or one "beneath" it (e.g., /foo/bar/index.html is "beneath" /foo/index.html) cause it to send the same credentials automatically. This makes the process start at step 4, so even though the challenge/response exchange is still happening between the client and the server, it's hidden from the user. This is why it's easy to get caught up in the fallacy of users being "logged on" to a site.

This is how all HTTP weak authentication works. Most interactive web browsers forget the credentials when the client is shut down. This is why you need to reauthenticate each time you access a protected document in a new browser session.
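
The challenge/response exchange described above can be sketched in a few lines of Python. This is only an in-process illustration, not how any real server or browser is implemented. It assumes the Basic scheme and the Authorization request header, neither of which is named in the text, and the names `server_handle`, `Client`, `USERS`, and the `alice`/`secret` account are all hypothetical. The server function keeps no memory between calls; only the client remembers anything.

```python
import base64

REALM = "XYZ"                    # realm name borrowed from the example in the text
USERS = {"alice": "secret"}      # hypothetical credential store (assumption)


def server_handle(path, headers):
    """Stateless server: decides using only what this one request carries."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Basic "):
        try:
            decoded = base64.b64decode(auth[len("Basic "):]).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            decoded = ""
        user, _, password = decoded.partition(":")
        if USERS.get(user) == password:
            return 200, {}, f"contents of {path}"
    # Steps 2 and 5: challenge the client and name the realm.
    return 401, {"WWW-Authenticate": f'Basic realm="{REALM}"'}, ""


class Client:
    """Toy browser: it, not the server, remembers the credentials."""

    def __init__(self, prompt):
        self.prompt = prompt     # callable(realm) -> (user, password) or None
        self.remembered = []     # (path prefix, Authorization header value)
        self.requests_sent = 0

    def _send(self, path, auth=None):
        self.requests_sent += 1
        headers = {"Authorization": auth} if auth else {}
        return server_handle(path, headers)

    def get(self, path):
        # Reuse remembered credentials for this document or one "beneath" it.
        auth = next((a for prefix, a in self.remembered
                     if path.startswith(prefix)), None)
        status, headers, body = self._send(path, auth)      # step 1 (or step 4)
        while status == 401:
            realm = headers["WWW-Authenticate"].split('"')[1]
            answer = self.prompt(realm)                      # step 3
            if answer is None:
                break                                        # step 7: give up
            user, password = answer
            token = base64.b64encode(f"{user}:{password}".encode()).decode("ascii")
            auth = "Basic " + token
            status, headers, body = self._send(path, auth)  # step 4
            if status == 200:
                prefix = path.rsplit("/", 1)[0] + "/"
                self.remembered.append((prefix, auth))
        return status, body
```

With a prompt function that always answers `("alice", "secret")`, the first `get("/foo/index.html")` costs two requests (the challenge, then the retry with credentials), while a later `get("/foo/bar/index.html")` costs one and never calls the prompt. The server still checks credentials on that request, but the user sees nothing, which is the "logged on" illusion the text describes.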