2.3 The Data Warehouse



The concept of data warehousing (that is, assembling a cohesive view of customers from multiple internal databases coupled with external demographic data sources) has been accepted practice at large companies, especially retailers, for several years. The idea of the data warehouse is to build a multidimensional picture of customers, combining information about their spending habits with lifestyle demographics. While this type of consumer data warehouse is not directly applicable to law enforcement and counter-intelligence, its data architecture does have merit: it assembles information about individuals from disparate databases into a composite that gives a comprehensive view of their identities and behaviors.

The analyses most commonly run against private-sector data warehouses are online analytical processing (OLAP) and data mining. OLAP tools are used to extract data cubes, which are reports segmenting customer or sales information by area, for example by zip code, city, state, and region. They are a fairly straightforward, analysis-driven type of reporting. While OLAP reports are valuable for summarizing customer activity, data mining is more valuable because it often identifies hidden patterns of customer behavior.
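As a rough illustration of the kind of area-based segmentation an OLAP report provides, the sketch below totals sales at each level of a geographic hierarchy. It is a minimal stand-in written in plain Python, not the interface of any OLAP product; the `SALES` records, the `LEVELS` tuple, and the `roll_up` function are all hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical sales records: (region, state, city, zip_code, amount).
SALES = [
    ("West", "CA", "Oakland", "94607", 120.0),
    ("West", "CA", "Oakland", "94612", 80.0),
    ("West", "CA", "Fresno", "93721", 50.0),
    ("West", "NV", "Reno", "89501", 40.0),
    ("South", "TX", "Austin", "78701", 200.0),
]

# Geographic hierarchy, from coarsest to finest (an assumption of this sketch).
LEVELS = ("region", "state", "city", "zip_code")


def roll_up(records, level):
    """Total the sales amount at one level of the geographic hierarchy.

    Keys are tuples covering the hierarchy down to `level`, so cities that
    share a name but sit in different states stay separate.
    """
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for record in records:
        key = record[:depth]          # e.g. ("West", "CA") for level "state"
        totals[key] += record[-1]     # last field is the sale amount
    return dict(totals)


by_region = roll_up(SALES, "region")
by_city = roll_up(SALES, "city")
```

Calling `roll_up` with a finer level ("city", "zip_code") corresponds to drilling down in a report, and a coarser level ("state", "region") to rolling up. Summaries like these describe activity that is already known to matter; they do not search for unexpected patterns, which is the role the text assigns to data mining.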

The ability of companies to run these types of analyses on their data warehouses has led to the practice of customer-relationship management (CRM). In CRM, firms integrate all point-of-contact customer data, including Web site forms, e-mail, dealership sales data, phone call site data, and transactional data, in order to provide better service and retain their customers. While the concept of CRM does not apply directly to law enforcement either, the lesson about integrating data from multiple sources to assemble a picture of an individual is applicable because, again, it yields a cohesive view of perpetrators and suspects.
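The integration step described above can be sketched as follows. This is only an illustrative outline: it assumes every source already carries a shared `customer_id`, which real systems usually have to establish through record matching first. The source lists, field names, and the `build_composite` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical point-of-contact sources, each keyed by a shared customer_id.
WEB_FORMS = [{"customer_id": "c1", "email": "ana@example.com"}]
SALES = [
    {"customer_id": "c1", "purchase": "sedan"},
    {"customer_id": "c2", "purchase": "truck"},
]
CALLS = [{"customer_id": "c2", "call_reason": "warranty"}]


def build_composite(**sources):
    """Merge records from every source into one profile per customer_id.

    Each field becomes a list, so repeated contacts accumulate instead of
    overwriting one another; "sources" records where the data came from.
    """
    profiles = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record in records:
            profile = profiles[record["customer_id"]]
            for field, value in record.items():
                if field != "customer_id":
                    profile[field].append(value)
            profile["sources"].append(source_name)
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {cid: dict(fields) for cid, fields in profiles.items()}


profiles = build_composite(web_forms=WEB_FORMS, sales=SALES, calls=CALLS)
```

Keeping the originating source alongside each profile is a deliberate choice in this sketch: when records from unrelated systems are combined, being able to trace a field back to where it came from makes the composite easier to verify.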

September 11 demonstrated the need to share and access multiple data sets containing critical strategic information, as well as to be more effective in the use of data mining techniques normally used for profiling individuals in marketing, call centers, insurance, telecommunications, utilities, retailing, and e-commerce. The same type of CRM analysis, which uses data warehousing and analytic techniques, can be applied to counter-intelligence and criminal detection applications. This is not to suggest the simplistic type of racial profiling that has been used in the past, but a more effective methodology: using data mining as a modeling tool for sorting through vast databases to identify perpetrators based on behavioral patterns and on socioeconomic, Internet, consumer, credit, criminal, lifestyle, and other commercial and government data sources.

As was mentioned in the preceding chapter, individuals cannot exist without leaving a trail of digital data in commercial and government databases and in other online and offline information sources. Appendix A includes a partial listing of several hundred Web sites that provide links to some of these files. However, the sites listed in Appendix A are just a start; there are many more potential data sources for enhancing the value of an investigative data mining analysis. Users of data mining tools and techniques in industries such as financial services, retailing, and marketing have long employed the concept of overlaying information about their customers and prospects with external lifestyle, socioeconomic, and demographic data.

For example, an e-commerce site can mine not only the clickstream data of its most loyal and profitable online customers, but also their zip-code and geo-code demographics in an attempt to build a profile of them. It can also look at the geographic location of their Internet Protocol (IP) address. Using a similar method, perpetrators may be profiled via data appends from diverse and unrelated databases. Unexpected results may occur when this is done; for example, the German authorities used utility-power usage records to identify potential dormant terrorists: foreign students who rented (safe) houses and used no electricity.
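A data append of the kind described above might look like the following sketch, which enriches rental records with utility usage and zip-code demographics and then selects rented addresses with no recorded electricity use. All records, field names, and both functions are hypothetical toy examples; nothing here reflects the actual data or method used in the case mentioned.

```python
# Hypothetical rental records, utility usage by address, and zip-level data.
RENTALS = [
    {"address": "12 Elm St", "tenant": "A", "zip_code": "10001"},
    {"address": "9 Oak Ave", "tenant": "B", "zip_code": "10002"},
    {"address": "3 Pine Rd", "tenant": "C", "zip_code": "10001"},
]
USAGE_KWH = {"12 Elm St": 310.0, "9 Oak Ave": 0.0}  # "3 Pine Rd" has no record
ZIP_DEMOGRAPHICS = {
    "10001": {"median_income": 52000},
    "10002": {"median_income": 61000},
}


def append_data(rentals, usage_kwh, zip_demographics):
    """Return copies of the rental records enriched with usage and zip data."""
    enriched = []
    for rental in rentals:
        row = dict(rental)  # copy so the source records are left untouched
        row["kwh"] = usage_kwh.get(rental["address"])  # None means no record
        row.update(zip_demographics.get(rental["zip_code"], {}))
        enriched.append(row)
    return enriched


def rented_but_unpowered(rows):
    """Addresses whose recorded usage is exactly zero.

    Rows with missing usage data are skipped rather than treated as zero.
    """
    return [row["address"] for row in rows if row["kwh"] == 0]


rows = append_data(RENTALS, USAGE_KWH, ZIP_DEMOGRAPHICS)
flagged = rented_but_unpowered(rows)
```

The sketch distinguishes a missing usage record from a recorded zero. When databases that were never designed to work together are joined, gaps are common, and treating a gap as a zero would produce spurious matches.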




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
