3.8 Link Analysis Applications
The associations that link analysis uncovers can be grouped into categories such as entity-to-entity, entity-to-event, and event-to-event. Entity-to-entity associations between individuals may include blood relative, spouse, employee, friend, acquaintance, and neighbor. Associations between an individual and an organization may carry labels such as leader, owner, member, head, and employee. Associations between an individual and a place may include place of birth, point of entry, residence, current location, country visited, and place of training. Entity-to-event associations may include labels such as bomber, victim, object, visit, meeting, enrollment, weapon, or place. For these events, the date and time attached to each role are also a factor. Event-to-event associations tie communications, such as phone calls, e-mails, physical meetings, or planning events, to one another.
Links, as well as nodes, may have attributes specific to the domain or relevant to the method of collection. For example, link attributes might indicate the certainty or strength of a relationship, the dollar value of a transaction, or the probability of an infection. Some linkage data are simple, such as a record of a single meeting. Other data are voluminous but uniform, such as phone calls, wire transfers, or e-mails, with few node and link types and a great deal of regularity. At the other extreme, data may be extremely rich and varied, though sparse, such as the records of clandestine meetings common in law enforcement investigative data, whose elements possess many domain-specific attributes, as well as confidence and value ratings that can change over time.
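To make the vocabulary above concrete, the following is a minimal sketch of how typed nodes and attributed links might be represented. The class names, field names, sample individuals, and attribute keys (such as "certainty") are all hypothetical and chosen only for illustration; they do not describe the data model of any particular link analysis tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str  # hypothetical kinds: "person", "organization", "place", "event"


@dataclass
class Link:
    source: Node
    target: Node
    label: str                                      # e.g. "spouse", "owner", "residence"
    attributes: dict = field(default_factory=dict)  # e.g. certainty, dollar value

    @property
    def category(self) -> str:
        """Return entity-to-entity, entity-to-event, or event-to-event."""
        ends = sorted(
            "event" if n.kind == "event" else "entity"
            for n in (self.source, self.target)
        )
        return "-to-".join(ends)


# Invented sample data, not drawn from any real case.
alice = Node("alice", "person")
bob = Node("bob", "person")
shell_co = Node("shell_co", "organization")
meeting = Node("meeting_1", "event")
call = Node("call_1", "event")

links = [
    Link(alice, bob, "acquaintance", {"certainty": 0.6}),
    Link(alice, shell_co, "owner", {"certainty": 0.9}),
    Link(bob, meeting, "attendee", {"date": "2001-03-14"}),
    Link(call, meeting, "planned", {"certainty": 0.4}),
]

# Group link labels by association category.
by_category = defaultdict(list)
for link in links:
    by_category[link.category].append(link.label)
```

Keeping the attributes in a free-form dictionary reflects the point that link attributes vary by domain and by method of collection; a production system would likely impose a stricter schema.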
The ability of link analysis to represent relationships and associations among objects of different types has proven crucial in assisting human investigators to comprehend complex webs of evidence and draw conclusions that are not apparent from any single piece of information. Link analysis often raises the following questions for a criminal investigator and intelligence analyst:
Which nodes are the main leaders of a network?
What are the underlying relationships in the network?
What are the relevant sub-networks in a large network?
Are there undetected links or hidden nodes in the data?
What level of aggregation best reveals the links in a network?
Which links can be severed to impede the operation of the network?
Fortunately, link analysis tools can answer these and other questions that investigative analysts may want to ask of the data. Many more applications of link analysis techniques exist in the areas of fraud prevention and criminal analysis, as we shall see in the following chapters.
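Two of the questions listed above, which nodes are the main leaders and which links can be severed, can be approximated with elementary graph measures. The sketch below is an illustration under stated assumptions, not the method of any specific tool: it treats the number of direct links (degree) as a crude proxy for leadership, and it finds critical links by brute force, removing each link in turn and checking whether the network stays connected. The function names and the sample network are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each node to the set of nodes it is directly linked to."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def degree_ranking(adj):
    """Nodes ordered by number of direct links, highest first (ties by name)."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))


def is_connected(adj, nodes):
    """True if every node can be reached from an arbitrary starting node."""
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in adj[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(nodes)


def critical_links(edges):
    """Links whose removal splits a connected network (brute-force check)."""
    nodes = {n for edge in edges for n in edge}
    critical = []
    for edge in edges:
        remaining = [e for e in edges if e != edge]
        if not is_connected(build_adjacency(remaining), nodes):
            critical.append(edge)
    return critical


# Invented network: two tightly knit groups joined by a single link.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"),
    ("D", "E"), ("D", "F"), ("E", "F"),
]

adjacency = build_adjacency(edges)
leaders = degree_ranking(adjacency)[:2]   # ["C", "D"]
severable = critical_links(edges)         # [("C", "D")]
```

The brute-force check is adequate for small charts; for large networks a linear-time bridge-finding algorithm would be the more usual choice, and measures other than degree may identify leaders more reliably.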
3.9 Focusing on Money Laundering via Link Analysis: A Case Study
NETMAP is used extensively by several government agencies, including the U.S. Treasury Department's Financial Crimes Enforcement Network (FinCEN), which is involved in the proactive detection of financial crimes such as money laundering. Money laundering usually involves the conversion of large amounts of cash from illegal activities into seemingly legitimate funds. Current estimates are that $100 billion to $300 billion is laundered annually, which generates far more transactional data than can be analyzed manually. In a single day, more than $1 trillion is wired through New York City alone.
Financial crime analysts and investigators use tools like NETMAP to define the important parts of financial transactions as they relate to individuals, organizations, and locations, including dates, amounts, institutions, sources, and ID numbers (see Figure 3.2). In money-laundering investigations, the tool is used to expose and associate indirect relationships among the bank accounts, home and business addresses, and identification numbers reported by multiple filers of required forms with financial organizations. For example, a reliable indicator of wrongdoing is the use of the same address by several individuals conducting transactions in excess of $10,000, which must be reported to the IRS. The gang member or terrorist cell operative involved may be using the address of a dummy or vacant safe house.
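The shared-address indicator described above can be sketched in a few lines. This is an illustrative approximation only: the record layout ("filer", "address", "amount"), the simple address normalization, and the choice of two filers as the minimum for "several" are assumptions made for the example, and the sample records are invented.

```python
from collections import defaultdict

REPORTING_THRESHOLD = 10_000  # dollars; transactions above this require reporting


def shared_addresses(reports, min_filers=2):
    """Return addresses used by at least `min_filers` distinct filers
    on transactions above the reporting threshold."""
    filers_by_address = defaultdict(set)
    for report in reports:
        if report["amount"] > REPORTING_THRESHOLD:
            # Crude normalization: lowercase and collapse repeated whitespace.
            key = " ".join(report["address"].lower().split())
            filers_by_address[key].add(report["filer"])
    return {
        address: sorted(filers)
        for address, filers in filers_by_address.items()
        if len(filers) >= min_filers
    }


# Invented sample records.
reports = [
    {"filer": "J. Doe", "address": "12 Elm St", "amount": 12_500},
    {"filer": "R. Roe", "address": "12  elm st", "amount": 15_000},
    {"filer": "M. Poe", "address": "12 Elm St", "amount": 4_000},
    {"filer": "K. Lee", "address": "9 Oak Ave", "amount": 20_000},
]

flagged = shared_addresses(reports)
# flagged == {"12 elm st": ["J. Doe", "R. Roe"]}
```

Real address matching is considerably harder than this normalization suggests, since abbreviations, unit numbers, and misspellings all have to be reconciled before two filings can be linked.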
Figure 3.2: A NETMAP link chart displaying financial relationships.
Analysts use NETMAP in money-laundering investigations to search for links between individuals and organizations such as car dealerships, wire transfer companies, and casas de cambio (houses of exchange), which buy and sell dollars and are prevalent in the border cities between the United States and Mexico.
Recently, FinCEN completed the first phase of a data mining study. It piloted three tools, Darwin from Oracle, Clementine from SPSS of Chicago, and SGI's MineSet, along with two data mining contractors, Visual Analytics of Poolesville, Md., and Nautilus Systems of Fairfax, Va. Not surprisingly, the study quickly found a problem with the quality of the data. Since 1996, Treasury has required financial institutions to use a three-page form called the Suspicious Activity Report to describe potential embezzlement, money laundering, check kiting, loan fraud, or other crimes. The form has a space for a narrative account of the suspect activity, a category of information not present on other Bank Secrecy Act reporting forms. This free-format data, which can be cryptic, detailed, or nearly unintelligible, presented a huge challenge to the data miners.
Cleaning the raw data turned out to be a crucial first step. Not only did the researchers have to fix the usual typographical errors and misspellings, but they also had to resolve inconsistencies among the free-format text fields. For example, bank officials would describe a suspect's occupation with terms such as "worked," "worker," "working at," "works on," or "worked for." The data mining software must lump all of these variations under one term, "employment." To find correlations in the data (say, between suspects' job titles and their crimes), the researchers tested several standard mining approaches. One of them, a clustering algorithm, sought to group the data from a given field into natural clusters. The broad patterns of criminal activity detected in the forms could eventually help filter the suspicious cases out of Treasury's database of more than 12 million annual currency transaction reports. In the next phase, the agency plans to define specific data abstraction routines and train FinCEN analysts. In the long run, the form used to capture the data needs to be redesigned to make it easier to mine with neural networks and machine-learning algorithms.
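The consolidation of wording variations under a single term can be illustrated with a small normalization routine. The sketch below is not the procedure the FinCEN researchers used, which the text does not describe. It assumes a hand-built table of variants (taken from the examples quoted above) and uses regular expressions to replace each variant with its canonical term. The names CANONICAL_TERMS, build_patterns, and normalize are hypothetical.

```python
import re

# Canonical term -> wording variations seen in the free-format fields.
CANONICAL_TERMS = {
    "employment": ["worked for", "works on", "working at", "worked", "worker"],
}


def build_patterns(canonical_terms):
    """Compile one case-insensitive pattern per canonical term."""
    patterns = {}
    for term, variants in canonical_terms.items():
        # Try longer variants first so "worked for" wins over "worked".
        ordered = sorted(variants, key=len, reverse=True)
        alternation = "|".join(
            r"\s+".join(re.escape(word) for word in variant.split())
            for variant in ordered
        )
        patterns[term] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def normalize(text, patterns):
    """Replace every known variant in `text` with its canonical term."""
    for term, pattern in patterns.items():
        text = pattern.sub(term, text)
    return text


PATTERNS = build_patterns(CANONICAL_TERMS)

cleaned = normalize("Suspect WORKED  for a casa de cambio", PATTERNS)
# cleaned == "Suspect employment a casa de cambio"
```

A lookup table of this kind only handles variants someone has already noticed; misspellings and unseen phrasings would still need fuzzy matching or manual review before clustering is applied to the field.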