Denial of service (DoS) attacks work by swamping your web server with a great number of simultaneous requests, slowing down the server or preventing access altogether. DoS attacks are difficult to prevent in general, and usually the most effective way to address them is at the network or operating system level. One example would be to block specific addresses from making requests to the server; although you can block addresses at the web server level, it is more efficient to block them at the network firewall/router or with the operating system network filters.
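As a minimal sketch of blocking an address at the web server level, the fragment below assumes Apache 2.4 with mod_authz_core. The directory path and the address 192.0.2.10 (taken from a documentation-only range) are placeholders, not values from this text. Apache 2.2 and earlier use the Order, Allow, and Deny directives instead.

```apache
# Hypothetical document root; adjust to your own layout
<Directory "/var/www/html">
    <RequireAll>
        # Allow everyone by default...
        Require all granted
        # ...except the offending address (placeholder value)
        Require not ip 192.0.2.10
    </RequireAll>
</Directory>
```

Every request still reaches Apache before it is rejected, which is why filtering at the firewall or in the operating system is usually cheaper under heavy load.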
Other kinds of abuse include posting extremely large requests or opening many simultaneous connections. You can limit request sizes and shorten timeouts to minimize the effect of such attacks. The default request timeout is 300 seconds in Apache 2.2 and earlier (60 seconds in Apache 2.4), and you can change it with the TimeOut directive. A number of directives enable you to control the size of the request body and headers: LimitRequestBody, LimitRequestFields, LimitRequestFieldSize, LimitRequestLine, and LimitXMLRequestBody.
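The directives named above might be combined in the main server configuration roughly as follows. The numeric values are illustrative assumptions rather than recommendations; suitable limits depend on what your site legitimately needs to accept, such as file uploads or long query strings.

```apache
# Give up on a stalled client after 30 seconds (illustrative value)
TimeOut 30

# Reject request bodies larger than 1 MB (0 means no limit)
LimitRequestBody 1048576

# Accept at most 50 header fields per request
LimitRequestFields 50

# Cap the size of each individual header field, in bytes
LimitRequestFieldSize 4094

# Cap the size of the request line (method, URI, and protocol), in bytes
LimitRequestLine 4094

# Cap the size of XML request bodies, such as those sent by WebDAV clients
LimitXMLRequestBody 512000
```

Limits that are too tight will reject legitimate requests, so it is worth testing them against real traffic before deploying them.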
Robots, web spiders, and web crawlers are names for a category of programs that access pages in your website, recursively following your site's links. Web search engines use these programs to scan the Internet for web servers, download their content, and index it. Users also run these types of programs to download an entire website, or a portion of one, for later offline browsing. Normally, these programs are well behaved, but sometimes they can be aggressive and swamp your website with too many simultaneous connections or become caught in cyclic loops.
Well-behaved spiders will request a special file, called robots.txt, that contains instructions about how to access your website and which parts of the website won't be available to them. The syntax for the file can be found at http://www.robotstxt.org/. By placing a properly formatted robots.txt file in your web server document root, you can control spider activity. Because compliance with the file is voluntary, you can additionally stop the requests of spiders that ignore it at the router or operating system level.
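A small robots.txt file might look like the sketch below. The paths and the crawler name ExampleBot are hypothetical and chosen only for illustration.

```text
# Rules for every crawler that honors robots.txt
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

# Keep one specific (hypothetical) crawler out of the entire site
User-agent: ExampleBot
Disallow: /
```

Each Disallow line names a URL path prefix that the matching crawlers are asked not to request. The file is served like any other document, so it should not be treated as an access control mechanism.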