8.1 Online Reconnaissance

Online recon can be divided into passive (performed by querying third-party resources) and active (performed in direct contact with target network resources). The recon begins by naming a target, such as a web site.

8.1.1 Passive Reconnaissance

The first intelligence-gathering step is to perform passive online reconnaissance, staying under the company's radar. The information typically available at this stage is just the company name and the web site address. The web site address can yield information about web hosting (through whois and traceroute), IP addresses (using nslookup, traceroute, and whois), and some employee names (through whois).

Utilities

Here are some examples of this simple reconnaissance technique, using standard Unix utilities. For instance, the nslookup command queries the default DNS server for the information. The server relays the request to the appropriate DNS servers (starting from the so-called root servers) to finally receive the answer from the target organization's server, as follows:

 $ nslookup www.example.com
 Server:  ns1.example.edu
 Address:
 Name:    www.example.com
 Address:

This query yields only an IP address. However, from an IP address you can make an educated guess that adjacent IP addresses also belong to the company, and that vulnerable servers might use those addresses. In the above case, you can infer that the neighboring addresses probably belong to the same company. Again, it's just a guess, but it can be verified via other means (see below). The first thing to check in this case is whether the web site is hosted at a third-party ISP or on the company premises. In the first case, an attack on adjacent addresses will hit the ISP, but not the intended victim. However, if the focus of the attack is indeed the web server, then looking at the nearby IP addresses makes sense, since the related application servers may use them.

nslookup also has a more detailed, interactive mode of operation. To activate this mode, type nslookup with no arguments, and a new command prompt will appear. Now you can send various types of DNS queries, such as for an address resolution, for an email server, and for other data (type help to see all the options). You can also choose which server to query (set this option using server whatever.example.com).
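The adjacent-address guess described above can be enumerated mechanically. The following is a minimal Python sketch, not part of the original text; the function name neighbors, the default /24 boundary, and the sample address 192.0.2.10 (a reserved documentation address) are all assumptions chosen for illustration.

```python
import ipaddress


def neighbors(ip: str, prefix: int = 24) -> list[str]:
    """Return the other usable host addresses in the enclosing /prefix block.

    The prefix length is a guess: nothing in a DNS answer says how large
    the organization's allocation really is.
    """
    # strict=False accepts a host address and rounds down to the network
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    target = ipaddress.ip_address(ip)
    return [str(host) for host in network.hosts() if host != target]
```

For instance, neighbors("192.0.2.10", prefix=29) lists the five other host addresses in 192.0.2.8/29. Whether those addresses actually belong to the target still has to be confirmed, for example with the whois queries shown later in this section.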

Using the host command allows you to get more detailed information in the default query, as follows:

 $ host www.example.edu
 www.example.edu is a nickname for ws.web.example.edu
 ws.web.example.edu has address
 ws.web.example.edu mail is handled (pri=1) by ws.mail.example.edu

This example shows the IP address, the "true" hostname (ws.web.example.edu), and the address for a mail server. The mail server presents a useful avenue for email reconnaissance attacks (described below), denial-of-service attacks, spamming, email relaying, and other dirty tricks. The host command uses the same information sources as the previous example of nslookup.

To get more information from a single query, perform the following:

 $ host -l -v -t any example.edu
 Found 1 addresses for dns.example.edu
 Trying
 Connection failed, trying next server: Connection refused
 Trying
 example.edu 7200 IN SOA     dns.example.edu jexample.example.edu(
                         2001021390      ;serial (version)
                         7200    ;refresh period
                         1800    ;retry refresh this often
                         3600000 ;expiration period
                         7200    ;minimum TTL
                         )
 example.edu 7200 IN NS      ns1.example.edu
 example.edu 7200 IN NS      ns2.example.edu
 example.edu 7200 IN MX      10 ns1.example.edu
 localhost.example.edu       7200 IN A
 mail.example.edu    7200 IN CNAME   ns1.example.edu
 gateway.example.edu 7200 IN A
 ftp.example.edu     7200 IN CNAME   ns1.example.edu
                        )

This example is a complete download of a DNS zone (a zone transfer, which the -l option requests); i.e., the collection of records for a domain.
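Once a listing like this has been saved, it helps to group the records by type so that name servers, mail exchangers, and aliases stand out. The sketch below is an illustration only: the function name group_records is hypothetical, and it assumes one-line records in the "name TTL IN type value" layout shown above.

```python
from collections import defaultdict


def group_records(lines):
    """Group 'name ttl IN type value...' lines by record type."""
    by_type = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip status messages, SOA continuation lines, and records
        # whose value is missing from the listing
        if len(fields) < 5 or fields[2] != "IN" or not fields[1].isdigit():
            continue
        name, _ttl, _cls, rtype = fields[:4]
        by_type[rtype].append((name, " ".join(fields[4:])))
    return dict(by_type)
```

Called on the lines of the listing, this returns a dictionary keyed by record type (NS, MX, CNAME, and so on), each holding (name, value) pairs. CNAME entries that all point at one host, as in the example, suggest that several services share a single machine.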

The whois command queries public databases maintained by domain registrars in various regions of the world. It is often necessary to know which whois server to interrogate, depending upon the address used. Some newer Unix variants have a magic whois command that queries multiple sources of information. Windows users can install one of the many "network tools" packages available as freeware or shareware. whois uses its own TCP-based protocol (described in RFC 954, since superseded by RFC 3912) to send queries across networks.
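The protocol itself is simple enough to sketch in a few lines: the client opens a TCP connection to port 43, sends the query followed by a carriage return and line feed, and reads until the server closes the connection. The function names below are hypothetical, and the code is an illustration of the exchange rather than a replacement for the whois utility.

```python
import socket


def build_whois_query(name: str) -> bytes:
    """A WHOIS request is just the query text terminated by CRLF."""
    return name.strip().encode("ascii") + b"\r\n"


def whois(name: str, server: str, port: int = 43, timeout: float = 10.0) -> str:
    """Send one query to a WHOIS server and return the full text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_whois_query(name))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

The caller still has to pick the right server for the kind of object being looked up, which is exactly the difficulty noted above.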

The following is an example of the whois command:

 Domain Name: EXAMPLE.EDU

 Registrant:
    Example University
    Exampleville, NY 11700
    UNITED STATES

 Contacts:

    Administrative Contact:
    Joe Example
    Main Build. Room 13
    Exampleville, NY 11700
    UNITED STATES
    (800) 555 - 1212
    jexample@noc.example.edu

    Technical Contact:
    Joe Example
    Main Build. Room 13
    Exampleville, NY 11700
    UNITED STATES
    (800) 555 - 1212
    jexample@noc.example.edu

 Name Servers:
    NOCNOC.EXAMPLE.EDU
    WNS2.EXAMPLE.EDU

 Domain record activated:    20-Apr-1988
 Domain record last updated: 31-May-2002

It might seem that querying public information databases reveals nothing new, but in fact the above excerpt contains the following information:

  • The IP addresses for several DNS servers (that can be directly queried for more information)

  • An email and a phone number contact for at least one person (useful for social engineering, email attacks, and even domain hijacking)

  • Domain registration dates (knowing when a registration is due to lapse can assist with domain hijacking)

  • Physical location of the facilities (for dumpster diving, etc.)
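Pulling the contact details out of a saved whois response is easy to automate. The following sketch is an assumption-laden illustration: the function name extract_contacts is hypothetical, and the phone pattern only matches the "(800) 555 - 1212" style seen in the excerpt above, not phone numbers in general.

```python
import re

# Deliberately simple patterns; real-world addresses and numbers vary more
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(\d{3}\)\s*\d{3}\s*-\s*\d{4}")


def extract_contacts(whois_text):
    """Return the distinct email addresses and phone numbers in the text."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(whois_text))),
        "phones": sorted(set(PHONE_RE.findall(whois_text))),
    }
```

Because the administrative and technical contacts are often the same person, the results are de-duplicated before being returned.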

traceroute is another useful reconnaissance tool. It sends UDP packets (on Unix) or ICMP packets (on Windows) with successively increasing values of the TTL field (explained in Chapter 6). Each hop between you and the target that decrements the TTL to zero returns an ICMP response. The responding machines usually include routers and other boxes that are on the path. Here's a traceroute example:

 $ traceroute www.example.edu
 traceroute to ws.web.example.edu (, 30 hops max, 38 byte packets
  1  tesost1.all.example.com (  2.026 ms  1.572 ms  1.533 ms
  *  Roubox12.example.com (  3.479 ms  3.114 ms  3.032 ms
  *  ...
 15  25.140 ms  29.966 ms  23.824 ms
 16  ws.web.example.edu (  27.539 ms  33.461 ms  66.995 ms

The most important information derived from the traceroute is the IP addresses of the hosts just before the target host, which hopefully share the same domain name. Often, you would see a "firewall.example.com" right before "www.example.com". Admittedly, this illustration is largely artificial, but you may see something like "pix12.example.com" ("pix12" most likely indicates the Cisco PIX firewall). In general, the last hops are often routers or firewalls that might be fun to examine.
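Picking out the last few hops and flagging suggestive hostnames can be scripted. The sketch below assumes the common Unix output layout of "hop  host (address)  times"; the function names and the list of name fragments in HINTS are hypothetical, and substring matching on hostnames is a crude heuristic that will produce false positives.

```python
import re

# hop number, hostname, then the address in parentheses
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([^)]+)\)")
# Assumed naming conventions; adjust to what the target actually uses
HINTS = ("firewall", "fw", "pix", "gw", "gateway")


def parse_hops(output):
    """Return (hop, host, address) for every line that names a responder."""
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), match.group(3)))
    return hops


def interesting_last_hops(output, count=3):
    """Return the final hops, each tagged True if its name looks like a gateway."""
    tail = parse_hops(output)[-count:]
    return [
        (hop, host, addr, any(hint in host.lower() for hint in HINTS))
        for hop, host, addr in tail
    ]
```

Hops that time out (shown as asterisks) and the header line are skipped, so the result contains only machines that identified themselves.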

You can also obtain more IP addresses from direct and reverse whois queries. For example, the query above gives the IP addresses for the name servers. In other words, you can determine the IP addresses owned by the organization. After getting the IP address for the web server, you can run the following:

 $ whois -h whois.arin.net 

You get something similar to:

 OrgName:    State University of Example at Exampleville
 OrgID:      EXAEDU
 NetRange: -
 CIDR:
 NetName:    SUNY-SB
 NetHandle:  NET-192-0-0-0-1
 Parent:     NET-192-0-0-0-0
 NetType:    Direct Assignment
 NameServer: NOCNOC.EXAMPLE.EDU
 NameServer: WHOISTHERE.EXAMPLE.EDU
 Comment:
 RegDate:    1986-08-03
 Updated:    1998-02-29
 TechHandle: EX666-ARIN
 TechName:   Exampleton, John
 TechPhone:  +1-888-555-1212
 TechEmail:  jex@example.edu

 # ARIN Whois database, last updated 2002-09-11 19:05
 # Enter ? for additional hints on searching ARIN's Whois database

This response produces a plethora of useful information. Some of the data is similar to the whois information above, but one piece is crucial. In this case, the query returned a list of IP addresses owned by the organization, which can be used for further penetration.
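The NetRange and CIDR fields describe the same block in two notations. As a rough illustration, Python's standard ipaddress module can convert a first and last address into CIDR blocks and count the addresses covered. The function names are hypothetical, and any addresses passed in are the reader's own; the values in the response above are not reproduced here.

```python
import ipaddress


def netrange_to_cidrs(first: str, last: str):
    """Convert a 'first - last' NetRange into the CIDR blocks that cover it."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return list(ipaddress.summarize_address_range(start, end))


def total_addresses(networks):
    """Count the addresses contained in a list of networks."""
    return sum(net.num_addresses for net in networks)
```

A range that is not aligned on a power-of-two boundary comes back as several blocks, which is why the function returns a list rather than a single network.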

More advanced whois queries allow you to search for the contact's name and other attributes. All of the advanced queries are described in RFC 954 on whois.

Samspade.org (http://www.samspade.org) and many other web sites provide a one-stop shop for such information. It is worth noting that for the case of a direct DNS query (as in our example above), there is a tiny degree of interaction between the attacker and the target (namely, the DNS query is processed by the organization's DNS server). The additional benefit of using such third-party sites is increased separation from the target, and thus safety from detection.

Other reconnaissance methods include querying the DNS server directly for more information (such as attempting a zone transfer), but we classify such recon techniques as active, as there is direct interaction with the target. Those methods are shown later in the chapter.

Some examples of what can be done with an IP address are illustrated in Figure 8-1, a screenshot from a Windows reconnaissance tool called NetScanTools Pro.

Figure 8-1. NetScanTools Pro

This tool divides reconnaissance actions into those that contact the target and those that do not, parallel to the format presented in this chapter. It's interesting that such a distinction is introduced in a commercial tool not originally designed for penetration testing.

Web reconnaissance

Another preliminary passive recon technique is web searching. Querying search engines for terms related to the target company can yield important data. At the time of this writing, a comprehensive list of search engines is available at http://directory.google.com/Top/Computers/Internet/Searching/Search_Engines/. More advanced searchers will want to hone their skills at +Fravia's http://www.searchlores.org.

Some effective search terms include:

  • Company and product names

  • Company domain names (make sure you find all the secondary domain names; the company might have separate DNS servers and contact people)

  • Names and email addresses of key employees

Multiple search engines should be used for greater coverage. While Google is the best, AltaVista and AllTheWeb might turn up a gem or two that Google misses. Read http://searchenginewatch.com to find more search engines to scour.

Google can also be used to search a company web site via the "site:example.com" string. The fun part of this search is that a large site is bound to have a juicy bit of confidential information posted by mistake. Just search for "password", and you will find some interesting results (make sure to get written permission first).

In addition, word processor documents are often distributed off the company home page. Seemingly innocent Microsoft Word documents might contain embedded company proprietary information, revision history, and pointers to people. Not all users are aware of these "hidden" features of Word files, but Word forensics is a rapidly growing field.

In addition to web searching, look at mailing list and newsgroup postings (some mailing lists are mirrored as newsgroups as well). The one-stop shop for newsgroup searching is Google at http://www.google.com/grphp. It is often productive to search for postings made from the target company's email addresses, or by company personnel from private email addresses. Many of the security and technology mailing lists are mirrored on the Web; thus, you can also just search the Web for interesting postings.

A lot of material that used to be on the Web, but has since been removed, might survive in a Google cache or on the Internet Archive site, at http://www.archive.org. Using this site, you can access the target site as it was at some moment in the past, which lets you track the development of the web site and possibly use knowledge of past mistakes in current attacks.

A word on data reduction is appropriate here. If you are searching for data on a large company, the number of web hits will be vast. For example, searching for Microsoft on Google produces a staggering 33,100,000 hits. In this case, combining search terms will save you.
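Combining terms is easy to do by hand, but a small helper keeps the queries consistent when many of them are needed. The sketch below is illustrative only: the function names are hypothetical, the site: restriction comes from the text above, and the quoted-phrase and minus-sign exclusion syntax are common search-engine conventions assumed here rather than taken from the original.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, exclude=()):
    """Combine search terms into one query string."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    for term in terms:
        # Quote multi-word terms so they are matched as a phrase
        parts.append(f'"{term}"' if " " in term else term)
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Build a search URL carrying the query in the q parameter."""
    return f"{base}?{urlencode({'q': query})}"
```

For example, build_query(["password"], site="example.com") produces the query mentioned earlier, and adding further terms or exclusions narrows a multimillion-hit result set to something a person can read.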

A method for searching print media for references to the company (thus getting more contact names, email addresses, and possibly network defenses) would be nice to have, but print media is not searchable online. Or is it? Actually, the mammoth Lexis-Nexis database aggregates most of the print media periodicals and can be searched online at http://www.lexisnexis.com. Access to this database is not free, but the fee might be worth it for serious intrusion preparations.

Another extremely useful area to search is a list of instant messenger (IM) users. Just look through the databases of AOL's AIM, Yahoo! Messenger, and MSN Messenger users. Most IM systems have web sites and user directories. For example, every ICQ user has a personal web page (located at http://web.icq.com/wwp?Uin=<userid>, where <userid> is the user's ICQ ID number). If you find company people among them, some new attacks become possible (especially if those users are engaging in IM communications in violation of a security policy).

Searching job sites (such as Monster.com or Hotjobs.com) may prove helpful as well. The company's job requirements for technical positions might shed some light on its IT defenses. If the company hires Check Point FireWall-1 administrators, it makes sense to assume that it uses that product. The same applies to computing platforms and application software.

Yet another engaging source for juicy bits of intelligence is peer-to-peer networks. Submissions from the company's employees or from the company's IP addresses can lead to new ways of penetrating the company.

A nice source of general company data is Sec.gov, the site of the U.S. Securities and Exchange Commission. By using the company search at http://www.sec.gov/edgar/searchedgar/companysearch.html, you can "leech" seemingly innocuous information that can help in a serious penetration exercise. For example, addresses, names, and sometimes contact information of critical employees, as well as financial records (for publicly traded companies), may sometimes be discovered there.

For a more advanced and meticulous analysis, it makes sense to take a peek at the personal web sites of the target company's employees. (It's not recommended, though, as your penetration-testing contract will not apply to sites outside of the company.) Apart from providing ample material for social engineering attacks (detailed in Chapter 7), such knowledge will help you in standard penetration testing, if that is indeed the nature of your interest in the subject.

8.1.2 Active Reconnaissance

Active reconnaissance is performed in direct contact with target network resources. For instance, email reconnaissance is a more active kind of reconnaissance.

Email

Email intelligence gathering is a separate project in itself. The simplest form of email recon is to send an email message to a nonexistent user within the organization. For a simple network setup, the response will be something similar to the following:

 <john_baton@example.net>:
 does not like recipient.
 Remote host said: 550 5.1.1 <john_baton@example.net>... User unknown
 Giving up on .
 --- Original message follows.
 Return-Path: <ahdjhd@yahoo.com>
 ...

The above example shows the email server responding to the message with SMTP code 550 (user unknown). This email was sent to a simple network. For a more complicated mail architecture, however, the same technique produces a response from the internal mail server. For example, the following message was a response from a major organization. Read it to see how much we can learn about the company's IT defenses:

 Return-path: <john@ns1.evil.net>
 Received: from ms.cc.example.edu (ms.cc.example.edu [])
         by ns1.evil.net (8.11.0/8.11.0) with ESMTP id g8J5pHW31611
         for <john@eviluser.org>; Thu, 19 Sep 2002 01:51:17 -0400
 Received: from ms.cc.example.edu (ms.cc.example.edu [])
         by ms.cc.example.edu (8.12.2/8.9.3) with SMTP id g8J5pMlB026143
         for <john@eviluser.org>; Thu, 19 Sep 2002 01:51:22 -0400 (EDT)
 Date: Thu, 19 Sep 2002 01:51:22 -0400
 From: Norton_AntiVirus_Gateways@cc.example.edu
 Subject: Returned mail
 To: john@eviluser.org
 Message-id: <M2002091901512215601@ms.cc.example.edu>
 MIME-version: 1.0
 Content-type: multipart/report; report-type=delivery-status;
  boundary="Boundary_(ID_FHa8wIAtDscecSrUiy54BA)"

 --Boundary_(ID_FHa8wIAtDscecSrUiy54BA)
 Content-type: text/plain; charset=us-ascii
 Content-transfer-encoding: 7BIT

 --- The message cannot be delivered to the following address. ---
 john_chuzokin@example.edu    Mailbox unknown or not accepting mail.
 550 5.1.1 <john_chuzokin@example.edu>... User unknown

 --Boundary_(ID_FHa8wIAtDscecSrUiy54BA)
 Content-type: message/delivery-status

 Reporting-MTA: Norton AntiVirus Gateway;Norton_AntiVirus_Gateways@cc.example.edu
 Final-Recipient: rfc822;john_chuzokin@example.edu
 Action: failed
 Status: 5.1.1
 Diagnostic-Code: X-Notes; Cannot route mail to user (john_chuzokin@example.edu).

 --Boundary_(ID_FHa8wIAtDscecSrUiy54BA)
 Content-type: message/rfc822

 Received: from ms.cc.example.edu ([])
  by ms.cc.example.edu (NAVGW with SMTP id M2002091901512206729 for
  <john_chuzokin@example.edu>; Thu, 19 Sep 2002 01:51:22 -0400

We now know the following:

  • Mail server manufacturer and version

  • Presence of antivirus defenses

  • Email topology

Not a bad chunk of intelligence from a simple email!
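Reading Received headers by eye works for one bounce; for a pile of them, the standard email module can do the unfolding. The sketch below is a hedged illustration: the function names are hypothetical, and the "by host (version)" pattern matches the sendmail-style headers in the message above, which other mail software may not follow.

```python
import re
from email import message_from_string

# Matches "by <host> (<version string>)" inside one Received header
BY_RE = re.compile(r"\bby\s+(\S+)\s+\(([^)]+)\)")


def received_hops(raw_message):
    """Return each Received header as a single line, newest first."""
    msg = message_from_string(raw_message)
    # Folded headers span several lines; collapse the whitespace
    return [" ".join(value.split()) for value in msg.get_all("Received", [])]


def mta_versions(raw_message):
    """Return (host, version) for every hop that advertises its software."""
    found = []
    for hop in received_hops(raw_message):
        match = BY_RE.search(hop)
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

Applied to the bounce above, this would surface the mail software version strings that each relay reports about itself, which is the "manufacturer and version" item in the list.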

In other cases, careful analysis of email headers reveals internal IP addresses (typically private addresses from the RFC 1918 ranges), mail client and server type (useful for attacking with email malware), gateway antivirus software (as in the above case), and operating systems. In other words, email systems leak like a sieve. While it is possible to tune the software to disclose less, such tuning is almost never done, even by the most cautious organizations.

Web site analysis

We only skim web reconnaissance techniques here: they are covered extensively in most general network security references.

An effective way to collect recon data on the company's Internet presence is to take a close look at the company's web site. Web hosts, middleware servers, and backend database servers can be discovered. In addition, the web technologies can be identified and scoured for vulnerabilities.
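Part of that close look can be automated on pages you have already saved. The following sketch, built on Python's standard html.parser, collects HTML comments and the file extensions used in links, which often hint at the technology behind a site. The class and function names are hypothetical, and the choice of href, src, and action attributes is an assumption about where such clues usually appear.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageClues(HTMLParser):
    """Collect HTML comments and the file extensions used in links."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.extensions = set()

    def handle_comment(self, data):
        self.comments.append(data.strip())

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                # Ignore the query string; only the path carries the extension
                ext = posixpath.splitext(urlparse(value).path)[1].lower()
                if ext:
                    self.extensions.add(ext)


def collect_clues(html_text):
    """Return (comments, sorted extensions) found in one page."""
    parser = PageClues()
    parser.feed(html_text)
    parser.close()
    return parser.comments, sorted(parser.extensions)
```

An extension such as .jsp or .php narrows down the server-side platform, and a stray comment may name an author or point to an older copy of the page.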

The following is a primer for web site scrubbing. First, the visit to the web site should determine the high-level structure of the site. At the same time, look at the URLs for hostname changes and file extensions (in order to identify the technology being used). Comments in the HTML, pointers to older versions, backups, and author names can all aid in a subsequent attack. Often, something as simple as trying to access a directory listing by removing the tail part of the URL produces results.

FTP

If a target company has an anonymous FTP site, it makes sense to take a peek. An FTP site is generally a relatively poor source of intelligence, since most companies do not store confidential documents on public FTP sites. However, you may be surprised. Perhaps some documents have been forgotten. You can also search word processor documents for embedded information. In fact, there are cases in which the erased portions of Microsoft Word documents have been recovered.

To avoid leaving sensitive information detritus, write in a text editor first, and then copy and paste your writing into a word processor.

A word on stealth

While passive reconnaissance methods (such as web searching and public database querying) do not put the attacker in direct contact with the target, more active methods, such as requesting information from the company's DNS servers, might leave your IP address in a log or two. Thus, keeping in the shadows is appropriate even at this stage.

Some techniques for remaining anonymous are as follows:

  • Using public web proxies

  • Using an anonymizer service

  • Using third-party reconnaissance and attack sites

  • Using throwaway accounts

  • Using your own proxy machines (obtained using whatever channel)

  • Using a public Internet café or other free computer access (such as a neighboring university's computer lab)

Public web proxies are useful for stealthy reconnaissance. Simply search for anonymous proxies operated by someone on the Internet, change your browser settings, and look for the target web site. For example, going to http://www.openproxies.com at the time of this writing yielded a huge list of unsecured SOCKS and Squid proxies that may be used to "launder" HTTP requests. You simply need to change the browser to go to the proxy, such as (for IE) by going to Tools > Internet Options > Connections > LAN Settings and then setting the IP address and port of the discovered proxy server (e.g., port TCP 3128). [1]

[1] Detailed instructions for using anonymous proxies under Windows can be found in the book Windows .NET Server Security Handbook by Cyrus Peikari and Seth Fogie (Prentice-Hall).

You can try malformed requests and so on, all without revealing your IP address. Proxies will happily pass many of the web attacks (SQL injection, cross-site scripting, and others). Well, hopefully. If you're using somebody else's proxy, check it by visiting your own web site or a known proxy test web site (search Google for "proxy test" to find such a site).

The problem we are trying to solve is figuring out whether the proxy is sneaking your IP address to the destination web server somewhere in the HTTP request headers (X-Forwarded-For, Via, and so on). Here is some brief background on what is happening. How does a proxied HTTP connection work? The browser, configured to go through the proxy server as above, sends its usual request for a web page to the proxy and not to the server the user intends to surf. The proxy receives the request and forwards it to the server. The server returns the desired page to the proxy for subsequent forwarding back to the user. However, the proxy might choose to insert the requestor's IP address as a part of its request to the web server. This might be done via some of the HTTP header fields, such as X-Forwarded-For. Proxies that do not do that are sometimes called anonymous proxies. [2]

[2] For directions on setting up your own proxies, look up "Anonymizing with Squid Proxy," by Anton Chuvakin (http://www.securityfocus.com/infocus/1508).
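The header check described above can be sketched as a small function that runs on your own test web server and inspects the request headers that actually arrived through the proxy. Everything here is illustrative: the function names and the three result labels are hypothetical, and the header list goes beyond the X-Forwarded-For and Via headers named in the text by adding Forwarded, X-Real-IP, and Client-IP as further assumed candidates.

```python
# Lowercase names of headers that proxies commonly add (partly assumed)
REVEALING_HEADERS = ("x-forwarded-for", "via", "forwarded", "x-real-ip", "client-ip")


def leaked_headers(received_headers, real_ip):
    """Return the headers that reveal a proxy or contain the real address."""
    findings = {}
    for name, value in received_headers.items():
        if name.lower() in REVEALING_HEADERS or real_ip in value:
            findings[name] = value
    return findings


def classify(received_headers, real_ip):
    """Summarize what the destination server can learn from the headers."""
    findings = leaked_headers(received_headers, real_ip)
    if any(real_ip in value for value in findings.values()):
        return "leaks-ip"        # your address reaches the server
    if findings:
        return "proxy-visible"   # the proxy announces itself but hides you
    return "clean"               # nothing in the headers gives the proxy away
```

A "clean" result only covers the headers; it says nothing about what the proxy operator records in its own logs.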

Even if the proxy does not send your IP address, stay away from doing anything particularly vile: the proxy might be operated by your friendly cybercrime police unit or by a local "honey net." In this case, your anonymous browsing habits will end up in some security research paper, or worse. In any case, you can never be sure who reads the access logs on a freely available proxy (which you can find by searching Google for "free web proxy").

Using an anonymizer service such as the Anonymizer (http://www.Anonymizer.com) is a stopgap solution. It is very simple to set up and does not require proxy IP address searching. There is a nice list of various sites that offer such services at http://dmoz.org/Computers/Internet/Proxies/Free/ (you can find some proxy test sites in the list as well). The Anonymizer shields your IP address from the target site and does not transmit it in headers. However, many anonymizer logs are released to third parties.

Third-party reconnaissance and scanning sites are a one-stop shop for intelligence (DNS, whois, traceroute), "anonymous" surfing, and maybe even port scanning or web server querying. If you access such a site via a proxy, it becomes an "offense in depth" and can contribute to the overall stealth of the approach. In addition, if you know that the person operating the site does not keep logs, the possibility of someone tracing your intelligence-gathering activity is lower.

Throwaway Internet accounts are also a choice for advanced hackers (although wasting valuable assets on mere reconnaissance isn't always the smartest thing to do). However they are obtained, these accounts are often difficult to trace. For example, in one case the attack was traced to a small ISP in a remote region of the U.S. The ISP had no data retention policy (actually, no data retention at all), no caller ID, and only analog phone lines; in other words, it proved to be a dead end for the investigation. Thus, throwaway Internet accounts provide a high level of stealth and are a great option, provided that they can be obtained freely or inexpensively.

Deploying your own proxy on a remote machine allows you to be sure that there is indeed no logging. However, finding an accessible machine that is not affiliated with you in any way presents a challenge. Ideally, such a machine would be placed in another country with xenophobic locals, a different native language, a poor state of computer security, and no applicable computer-crime laws.

Using Internet cafés, public libraries, and university labs for anonymous Internet access is a frequent strategy in Hollywood hacker movies. If you can get online from such a location inconspicuously, tracing you is tough. In addition, if each location is used exactly once, the challenge increases by an order of magnitude. However, note that in the U.S. nearly every public Internet terminal is now believed by some to be under some sort of surveillance (except wifi).

Attempting to achieve true stealth reveals one of the paradoxes of the Internet: you appear to be anonymous all the time, but every action is likely to be recorded. Viewed from one angle, on the Web "nobody knows you are a dog," and a single mouse-click can take you from one continent to another. Disappearing seems to be easy. But from another angle, every click is recorded somewhere, and your Internet provider will happily give up their logs if a legal investigation is opened.

It is also important to note that some no-contact reconnaissance methods can still be tracked. For example, some intrusion detection systems can be set to track DNS queries against the company's DNS servers launched from popular "tool sites" such as http://www.all-nettools.com.

Another site worth mentioning is http://cotse.com. It contains many tools, including various queries, port scans, Windows NetBIOS requests, Unix finger, and more. At the time of this writing, it also offers remote OS fingerprinting. Finally, a useful site from which to perform preliminary reconnaissance on a web server is http://www.netcraft.com. Netcraft.com allows you to query the remote web server for versions, software, and even some web components (such as Apache modules in use). Use it for your pre-attack investigation (with permission, of course).

Human reconnaissance

While spy novels contain dramatic descriptions of human reconnaissance, their accuracy is dubious. A discussion of such techniques is beyond the scope of this book. However, Chapter 7 contains many information-gathering techniques that can be used with technical reconnaissance methods.

Dumpster-diving is one such technique. Searching the company's trash for confidential information does not require any advanced social skills (except to explain your behavior when confronted by security guards). Nevertheless, this technique has been known to yield valuable papers, manuals, data disks, and even hard drives. Dumpster-diving may seem like an extreme measure. However, many well-known hacker cases involve someone picking up internal or proprietary information from such an unhygienic source.

Security Warrior
ISBN: 0596005458
Year: 2004
Pages: 211
