18.3 Using the Internet as an Investigative Tool


An important aspect of following the cybertrail in an investigation is to search for related information on the Internet such as a victim's Web pages or Usenet messages, an offender's e-mail address or telephone number, and personal data in various online databases. Because the Internet contains so much loosely ordered information, searching for something in particular can be like looking for a needle in a haystack. This is why it is crucial to learn how to search the Internet effectively. In addition to becoming familiar with various search tools, it is necessary to develop search strategies.

One method of searching for digital evidence on the Internet is to look for online resources in a particular geographical area. For instance, if a victim or unknown offender lives in San Francisco, there is likely to be a higher concentration of related information in that area. Searching online telephone directories, newspaper archives, bulletin boards, chat rooms, and other resources dedicated to San Francisco can uncover unknown aspects of a known victim's online activities and can lead to the identity of a previously unknown offender. Search engines that focus on a particular country (e.g. www.google.it, ie.altavista.com) can also be useful for a geographically focused search.

Another strategy is to search within a particular organization. For instance, if a victim or offender is affiliated with a particular company or school, there is likely to be a higher concentration of personal information in associated online resources. As with a geographically focused search, looking through an organization's online telephone directory, internal bulletins or newsletters, discussion boards or mailing lists, and other publicly accessible online resources can lead to useful information. Additionally, it may be possible to query systems on an organization's network for information about users. Although it is permissible to access information on an organization's computer systems in non-invasive ways, care should be taken not to cross the line into unauthorized access.

Besides searching for real names, nicknames, full e-mail addresses, and segments of e-mail addresses, it can be productive to focus searches around unusual interests, searching areas on the Internet that the victim or suspect frequented. Given the difficulty in making informed guesses of where a victim or offender might go on the Internet, this type of search usually develops from a lead. For instance, interviews with family and friends, or an examination of a victim's computer may reveal that she subscribed to a particular newsgroup and frequented a particular IRC chat room to arrange sexual encounters. An offender or victim may have left traces of their activities in these online areas. Searching these areas can be particularly productive if the offender and victim communicated with each other in a public area on the Internet, revealing connections between them.

In addition to the traces of activities that remain on the Internet, online witnesses who used the same areas may have logs of the activities on their computers. For instance, in the Sharon Lopatka case, participants in the AOL and IRC channels that the victim and offender frequented recalled that neither of them employed "safe-words" to prevent injury during rough sex (Cairns 1996). As another example, after apprehending an offender, some digital evidence examiners will contact people with whom the offender was in contact on the Internet (e.g. e-mail recipients, AOL Buddy list). By sending a letter to these individuals informing them of the situation and asking them for any related information, it is possible to locate witnesses and other victims. In some cases, victims of a common offender seek each other out to form online support networks. These associations can be helpful to the victims. They can also be useful to investigators because the networks make identifying and contacting victims easier. However, sharing information about the criminal activity and the offender among victims who are also potential witnesses may complicate matters when the time comes for them to testify.

Notably, these search strategies are not mutually exclusive and can be effectively combined to locate the majority of available information on the Internet regarding the search subject. Whichever combination of search strategies is used, investigators should document important searches, indicating when, where, and how specific items were found. Handwritten notes combined with the investigator's Web browser history are generally sufficient to show when, where, and how information was located. Also, because information on the Internet can change at any moment, screenshots and copies of Web pages are useful for documenting what investigators saw at the time. Some tools for capturing a Web site efficiently and fairly completely are:

  • Web Whacker: www.webwhacker.com

  • Adobe Acrobat: www.adobe.com

  • Teleport: www.tenmax.com/teleport/pro/home.htm

  • Httrack: www.httrack.com

  • Web Copier: www.maximumsoft.com

  • Snagit: www.techsmith.com

  • Anawave's WebSnake: http://www.websnake.com/

  • Htdig: http://www.htdig.org

  • Surfsaver: www.surfsaver.com/download

  • Wget: http://www.gnu.org/software/wget/wget.html

  • Black Widow: www.softbytelabs.com/BlackWidow

Some of these tools will not copy subpages of a Web site if links to these subpages are encoded in a scripting language that the tool does not understand. Therefore, it is advisable to test a tool to ensure that it is adequate for the task and inspect the resulting files to verify that they are satisfactory. Any files that are generated during the search process should be inventoried, documenting file names, MD5 values, and date-time stamps.
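The inventory described above can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function names and the record layout are assumptions made for the example, and the time stamp recorded is the file's last-modified time reported by the operating system.

```python
import hashlib
import os
from datetime import datetime, timezone


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(directory):
    """Return one record per file under directory, sorted by relative name."""
    records = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            stat = os.stat(path)
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            records.append({
                "name": os.path.relpath(path, directory),  # file name relative to the capture folder
                "md5": md5_of_file(path),                  # MD5 value of the file contents
                "size": stat.st_size,                      # size in bytes
                "modified_utc": modified.isoformat(),      # last-modified time stamp in UTC
            })
    return sorted(records, key=lambda record: record["name"])
```

Calling inventory() on the folder where a capture tool saved its files (the folder name is up to the investigator) returns a list that can be printed or saved alongside the handwritten notes.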

18.3.1 Search Engines

Search engines are among the most useful tools for finding information on the Internet. Although search engines are not particularly difficult to use, there is some skill involved in using them effectively. Each search engine has different contents, archiving methods, search features, and limitations. Therefore, it is important to understand how each search engine works and which ones are best suited for particular tasks.

Many search engines, like Altavista, actively update themselves by running programs that search the Web incessantly for new data. As a result, they can turn up recent information but lack older, outdated data.[10] Google compensates for this shortcoming by retaining a copy of Web pages it has found; this "cached" information is useful when the original is gone. Google is also capable of searching Word documents and PDF files that other search engines overlook. Additionally, Google has a searchable archive of Usenet messages stretching back to 1981. Another unique feature of Google is its search algorithm (PageRank), which estimates the relevance and quality of data based on the number of links to the data from other sources on the Web. It is important to be aware of how each search engine attempts to "help" with a search so that this "help" can be utilized when it is useful and avoided when it is not.
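To make the link-counting idea concrete, the following Python sketch implements a simplified, textbook-style version of PageRank. It is not Google's production algorithm: the damping value of 0.85, the fixed number of iterations, and the page names in the usage note are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure.

    links maps each page name to the list of pages it links to.
    Returns a dict of page name to score; the scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    count = len(pages)
    rank = {page: 1.0 / count for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / count for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / count
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

For a hypothetical set of pages such as {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}, the page "c" receives the highest score because three other pages link to it.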

Investigators can employ the language of the search engines they are using to create more narrowly focused searches. For example, some search engines understand words like AND, OR, NOT, and NEAR. Some search engines also allow symbols such as "-" to exclude terms from the search and "+" to include terms. For instance, in Altavista, either of the following commands can be used to find documents containing the words "unsolved" and "homicide" but not the words "mystery" or "mysteries":

+homicide +unsolved -mystery -mysteries

homicide AND unsolved AND NOT myster*

Some offenders protect themselves by using computer-smart nicknames such as En0ch|an instead of Enochian. The zero instead of an "o" and the pipe (|) instead of an "i" confound search algorithms. In such cases, clever use of search engine syntax (e.g. AND, OR, NEAR) is required. Search engines can also be useful for finding connections on the Web. For instance, pages containing links to a suspect's Web site can be found by searching Google or Altavista using the syntax "link:www.suspectswebpage.com" (without a trailing period). For additional discussion about utilizing advanced features of search engines see SearchEngineWatch.[11]
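One way to cope with nicknames like En0ch|an is to generate the likely character substitutions and combine them with OR. The Python sketch below illustrates the idea; the substitution table is a hypothetical starting point rather than a complete list, and the quoting style of the resulting query is an assumption that may need adjusting for a given search engine.

```python
from itertools import product

# Hypothetical substitution table; real nicknames may use other swaps.
SUBSTITUTIONS = {
    "o": ["o", "0"],
    "i": ["i", "|", "1"],
    "e": ["e", "3"],
    "a": ["a", "4", "@"],
}


def nickname_variants(name, limit=200):
    """Return up to `limit` spellings of name using common character swaps."""
    # Letters found in the table are lowercased; other characters are kept as typed.
    choices = [SUBSTITUTIONS.get(ch.lower(), [ch]) for ch in name]
    variants = []
    for combo in product(*choices):
        variants.append("".join(combo))
        if len(variants) >= limit:
            break
    return variants


def or_query(variants):
    """Join the variants into a single quoted OR expression."""
    return " OR ".join('"%s"' % variant for variant in variants)
```

For the nickname in the text, nickname_variants("Enochian") includes the spelling en0ch|an, and or_query() turns the list into one search expression.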

Keep in mind that searching for obviously illegal terms will rarely turn up anything illegal. Many Web sites use illegal terms to attract interest, but actual criminals make some effort to hide their activities using euphemisms. For instance, some offenders use the terms "lolita" or "nature shots" to refer to images of children, or "family fun" to refer to incest. These euphemisms may turn up during the initial searches, in which case it will be necessary to expand the search using this new knowledge and gradually narrow the search again. Also, individuals who want their Web pages to be excluded by search engines can simply place "robots.txt" files on their Web sites asking search engine crawlers to skip those pages.
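Because a "robots.txt" file can keep pages out of search engine results, it can be worth checking which parts of a site are excluded. The Python sketch below uses the standard library's urllib.robotparser module to interpret the text of such a file; the sample rules and the example.com addresses are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content asking all crawlers to skip /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the rules in robots_txt allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A page under the excluded path, such as http://www.example.com/private/page.html, is reported as not crawlable, which means a search engine that honors the file would not index it even though the page remains reachable directly.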

Metasearch engines such as Copernic and Metacrawler enable individuals to search multiple search engines simultaneously from a single site. Because they utilize many other search engines, metasearch engines can be useful for brainstorming or finding very specific details. However, since metasearch engines tend to usurp control of the search, their results can be incomplete or can contain pages that are unrelated to the subject in question but that happen to contain some of the keywords. Metasearch engines also make it harder to determine why certain pages were included in the results, and therefore harder to explain to others how a page was found. Failing to explain exactly how a particular piece of evidence was found can weaken a case. Furthermore, the large number of hits that are common in metasearch engines can be overwhelming and can hinder an investigation.

Although metasearch engines can be useful when searching for very specific details (e.g. occurrences of a telephone number on a Web page), it is important to also search specialized search engines or databases (e.g. telephone directories) when looking for fine details.

18.3.2 Online Databases (The Invisible Web)

There are many databases on the Web containing data within specific subject areas. For example, online databases contain information about sex offenders, missing children, individuals' assets and credit history, and medical information. Many of these databases can be located using search engines but the information they contain can only be queried directly. For instance, using Google or Altavista for "sex AND offender AND database" leads to various Sex Offender Registries around the United States. Some databases are organized on the following Web sites, making them easier to find.

  • InvisibleWeb: http://invisibleweb.com

  • Internets: http://www.internets.com

  • JournalismNet: http://www.journalismnet.com

  • PowerReporting: http://www.powerreporting.com

There are also online databases, such as AutoTrack and KnowX, containing a wide variety of information about individuals but these databases charge fees for use.

Whois databases are particularly useful for investigations involving the Internet. Whois databases are maintained by Internet registrars and contain the names and contact information of people who are responsible for the many computer systems that make up the Internet. These databases can reveal who is responsible for a particular Web site, including their name, telephone number and address. There are separate Whois databases for different countries; some of the main databases are listed here and others can be found at Allwhois.[12]

  • United States (NetSol): http://www.netsol.com/cgi-bin/whois/whois

  • United States (ARIN): http://whois.arin.net/whois/index.html

  • Europe: http://www.ripe.net/db/whois.html

  • Asia: http://whois.apnic.net/

Some registrar databases only have information on high-level domains while others have information on IP addresses. For instance, to find the contact information for "www.wsex.com", search Netsol, whereas to find contact information for the associated IP address (207.42.132.101), search ARIN. Note that these databases have slightly different contact information for the World Sports Exchange.

Netsol record (Domain name: www.wsex.com)

  Registrant: Big Green (WSEX-DOM)
    Woods Center #11
    St. Johns Antigua
    AG

  Domain Name: WSEX.COM

  Administrative Contact:
    holowchak, jason (NZHOWTMQZI)
    jasonholowchak@hotmail.com
    hodges bay
    st. johns, na na
    AG
    268-480-3861 123 123 1234

    Hanson, Spencer (SH2534)
    spencer@WWW.WSEX.COM
    World Sports Exchange Ltd
    Ryan's Place, High Street
    St. John's
    AG
    268 480-3888

  Record expires on 19-Sep-2009.
  Record created on 18-Sep-1996.

  Domain servers in listed order:
    NS.WSEX.COM              207.42.132.101
    NS2.JASONHOLOWCHAK.COM   207.42.132.119
    NS.JASONHOLOWCHAK.COM    66.216.122.143

ARIN record (IP Address: 207.42.132.101)

  ISP: Cable & Wireless Antigua SPRINT-CF2A87

  OrgName:    World Sports Exchange
  OrgID:      WSE-9
  Address:    Friar's Hill Road
  Address:    Woods Center, St John's
  City:
  StateProv:
  PostalCode:
  Country:    AG

  NetRange:   207.42.132.96-207.42.132.127
  CIDR:       207.42.132.96/27
  NetName:    CWAG-207-42-132-96
  NetHandle:  NET-207-42-132-96-1
  Parent:     NET-207-42-132-0-1
  NetType:    Reassigned
  Comment:
  RegDate:    2001-04-20
  Updated:    2001-04-20

  TechHandle: MH1271-ARIN
  TechName:   Hayden, Matthew
  TechPhone:  (268)-480-3888
  TechEmail:  jay@wsex.com
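The NetRange and CIDR values in the ARIN record describe the same block of addresses, and it is easy to confirm which of the addresses in the two records fall inside it. The short Python sketch below does this with the standard ipaddress module; the helper name in_block is an assumption made for the example, while the addresses are taken from the records above.

```python
import ipaddress

# Address block reported in the ARIN record.
BLOCK = ipaddress.ip_network("207.42.132.96/27")


def in_block(address, block=BLOCK):
    """Return True if the dotted-quad address lies inside the block."""
    return ipaddress.ip_address(address) in block


def range_to_cidr(first, last):
    """Convert a NetRange (first and last address) into CIDR notation."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(network) for network in networks]
```

With these helpers, the Web server (207.42.132.101) and the second name server (207.42.132.119) are found to be inside the block assigned to the World Sports Exchange, whereas the third name server (66.216.122.143) is not, which suggests it is hosted on a different network.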

Sites such as Geektools[13] facilitate searches by providing a single interface to many Whois databases. It is also possible to search some Whois databases for other fields such as names and e-mail addresses. Some individuals use services like Domains by Proxy[14] to prevent their contact information from being placed in the Whois database system.
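Whois databases can also be queried directly rather than through a Web form, because the Whois protocol (RFC 3912) is simply a line of text sent to TCP port 43. The Python sketch below is a minimal illustration under that assumption: the function names are made up for the example, the server name must be supplied by the investigator, and the "Key: value" parsing only suits records laid out that way, such as ARIN's.

```python
import socket


def whois_query(server, query, port=43, timeout=10):
    """Send one query to a Whois server and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(text):
    """Collect 'Key: value' lines into a dict of lists (keys may repeat)."""
    fields = {}
    for line in text.splitlines():
        # Skip comment lines and lines that are not key/value pairs.
        if line.startswith(("#", "%")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields
```

For example, whois_query("whois.arin.net", "207.42.132.101") would request the record for the IP address discussed above (this requires network access, and the reply may differ from the record shown in the text). The raw reply should be saved as received, with parse_fields() used only as a convenience for reading it.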

18.3.3 Usenet Archive versus Actual Newsgroups

Archives such as Google Groups contain millions of messages from tens of thousands of newsgroups. These archives are invaluable tools for investigators because they contain a vast amount of detailed information about individuals and their interactions. By searching this archive, it may be possible to learn about a person's interests, personality, and much more. However, these archives are not comprehensive and should not be depended on completely when dealing with Usenet. Few archives include message attachments and anyone can specify that they do not want their postings to be archived. Any newsgroup posting with "x-no-archive: yes" as its first line will be ignored by archiving software. Also, there are private newsgroups that are not archived.
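When examining a saved Usenet message, it can be helpful to check whether its author asked for it not to be archived, since that explains why it may be missing from an archive. The Python sketch below looks for "x-no-archive: yes" as the first non-blank line of the message body, as described above, and also checks for an X-No-Archive header, which is a common convention but an assumption here. The function name and the sample messages used to exercise it are hypothetical.

```python
from email import message_from_string


def opts_out_of_archiving(raw_message):
    """Return True if the message asks archives to ignore it."""
    message = message_from_string(raw_message)

    # Convention 1 (assumed): an X-No-Archive header set to "yes".
    header = message.get("X-No-Archive", "")
    if header.strip().lower() == "yes":
        return True

    # Convention 2: "x-no-archive: yes" as the first non-blank body line.
    body = message.get_payload()
    if isinstance(body, str):
        for line in body.splitlines():
            if line.strip():
                normalized = line.strip().lower().replace(" ", "")
                return normalized == "x-no-archive:yes"
    return False
```

A message that returns True here is unlikely to appear in an archive, so investigators would need to look for it in the live newsgroup or on the computers of people who received it.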

Therefore, it is important for investigators to become familiar with and involved in the actual newsgroups related to an investigation rather than rely entirely on the archives. As well as seeing information that is not archived by Google Groups (e.g. images and other file attachments), it is useful to see discussions develop and progress, get to know the characters of the participants, and observe patterns of a particular group's behavior. Additionally, investigators may be able to observe offenders of their local community in newsgroups dedicated to a specific geographic region.

[10] An archive of many Web pages can be found at http://web.archive.org/

[11] http://www.searchenginewatch.com/facts/index.php

[12] http://www.allwhois.com

[13] http://www.geektools.com

[14] http://www.domainsbyproxy.com




Digital Evidence and Computer Crime
Digital Evidence and Computer Crime, Second Edition
ISBN: 0121631044
EAN: 2147483647
Year: 2003
Pages: 279
