5.3 Text Mining Applications



In investigative data mining, text mining techniques and tools can be used to sort and organize large collections of text-based data, such as licenses, registrations, airline tickets, credit-card transactions, point-of-entry passport records, criminal files, transcripts of investigations, and any other text-based data set in which a name, word, or concept needs to be identified and tracked. However, as with every data mining project, the results of text mining depend heavily on the quality and relevance of the data and on the objective of the analyst.

For text mining to be effective, the content and focus of the documents and databases are very important. For example, applying text mining to a collection of random e-mail files probably won't generate many relevant findings or leads for an ongoing investigation or counter-intelligence analysis unless the e-mail files are specifically those of confirmed suspects. However, using text mining to analyze the e-mails of individuals who are related to, or have had some contact with, a group of suspects in a wide area is likely to provide important leads for an ongoing discovery-and-detection investigation, where the objective is to identify, for example, unknown associates in a criminal ring or terrorist cell.

Text mining software can also be used to construct investigation dossiers or intranet directories by classifying hundreds of thousands of documents based on multiple, inherent concepts found in the source text. For example, criminal files can be organized by modus operandi by applying NLP techniques and other advanced algorithms. Text mining software can automatically identify and extract key concepts from investigation-related records. These concepts can be automatically linked to a taxonomy that meets the information requirements of an agency, a department, or a specific investigative team. Such a taxonomy gives users a directory structure they can explore further with link analysis tools or by browsing or searching an intranet.
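As a rough illustration of linking extracted concepts to a taxonomy, the following minimal Python sketch assigns documents to categories when they contain an indicative term. The category names, the term lists, and the functions classify and build_directory are hypothetical and chosen only for illustration; a production tool would rely on NLP-based concept extraction rather than literal phrase matching.

```python
import re
from collections import defaultdict

# Hypothetical taxonomy: category -> indicative terms (illustrative only).
TAXONOMY = {
    "burglary": {"forced entry", "window", "crowbar"},
    "fraud": {"forged", "counterfeit", "wire transfer"},
}

def classify(text, taxonomy=TAXONOMY):
    """Return the sorted categories whose terms appear in the text."""
    lowered = text.lower()
    hits = []
    for category, terms in taxonomy.items():
        # Whole-word (or whole-phrase) match, case-insensitive.
        if any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms):
            hits.append(category)
    return sorted(hits)

def build_directory(docs, taxonomy=TAXONOMY):
    """Map each category to the IDs of the documents filed under it."""
    directory = defaultdict(list)
    for doc_id, text in docs.items():
        for category in classify(text, taxonomy):
            directory[category].append(doc_id)
    return dict(directory)

# Made-up sample records, used only to exercise the sketch.
docs = {
    "case-1": "Suspect used a crowbar; forced entry through rear window.",
    "case-2": "Forged checks and a wire transfer abroad.",
    "case-3": "Counterfeit bills found after forced entry.",
}
directory = build_directory(docs)
```

A document can land in more than one category, which mirrors the idea that a record carries multiple concepts rather than a single label.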

Because text mining extracts the key concepts in the documents rather than a single keyword, the taxonomies make it easier for investigators and analysts to find relevant case-related information existing in multiple, linked documents. Such concept-based indexing also eliminates the need to force documents into predefined categories. Text mining software also replaces manual categorization and tagging efforts that add to the costs and deployment/update times for agency- or department-wide portals. This type of organization of crime-related information allows for the institutionalization of modus operandi and of criminal detection procedures.
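The difference between keyword and concept indexing can be sketched in a few lines of Python. In this simplified example, a small hypothetical lexicon (CONCEPTS) maps different surface words to a shared concept, so that reports using different wording are still linked. The lexicon, the sample reports, and the function names are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: surface word -> concept (illustrative only).
CONCEPTS = {
    "pistol": "firearm", "handgun": "firearm", "revolver": "firearm",
    "sedan": "vehicle", "pickup": "vehicle",
}

def extract_concepts(text):
    """Return the set of concepts mentioned in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {CONCEPTS[token] for token in tokens if token in CONCEPTS}

def build_index(docs):
    """Build an inverted index: concept -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in extract_concepts(text):
            index[concept].add(doc_id)
    return index

def related(doc_id, docs, index):
    """Return the other documents that share at least one concept."""
    linked = set()
    for concept in extract_concepts(docs[doc_id]):
        linked |= index[concept]
    linked.discard(doc_id)
    return sorted(linked)

# Made-up sample reports, used only to exercise the sketch.
reports = {
    "r1": "Witness saw a pistol.",
    "r2": "A revolver was recovered from a sedan.",
    "r3": "Pickup abandoned near the bridge.",
}
index = build_index(reports)
```

Here a keyword search for "pistol" would find only r1, whereas the concept index links r1 and r2 through the shared concept "firearm".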

Text mining software uses the source text itself to automate portal taxonomy creation by extracting multiple key concepts from the documents, mapping the interrelationships between these concepts in the document collection, and creating a taxonomy database that references and links these concepts. For example, a text mining tool can organize criminal cases into distinct categories based on type, time, location, modus operandi, rate, cost, and any other characteristic or feature the user chooses, or the software can organize and cluster them automatically.
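One simple way to approximate the mapping of interrelationships between concepts is to count how often two concepts appear in the same document. The Python sketch below assumes the concepts have already been extracted for each case; the sample data, the min_count threshold, and the function names cooccurrence and links are hypothetical, and commercial tools may well use more sophisticated measures.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(doc_concepts):
    """Count how often each pair of concepts appears in the same document."""
    pairs = Counter()
    for concepts in doc_concepts.values():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(concepts)), 2):
            pairs[(a, b)] += 1
    return pairs

def links(pairs, min_count=2):
    """Keep pairs seen at least min_count times, as an undirected graph."""
    graph = {}
    for (a, b), count in pairs.items():
        if count >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

# Made-up concepts per case, used only to exercise the sketch.
case_concepts = {
    "c1": ["night", "forced entry", "jewelry"],
    "c2": ["night", "forced entry"],
    "c3": ["daytime", "jewelry"],
}
pairs = cooccurrence(case_concepts)
graph = links(pairs, min_count=2)
```

The resulting graph could be stored as the linked-concept portion of a taxonomy database, or passed to a link analysis tool for further exploration.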




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
