Fear often accompanies progress. Historically, the invasion of personal privacy was more a potential threat than a reality. However, with the increased use of electronic communications and the World Wide Web (WWW), it has become quite easy and inexpensive to share information among trading partners. Prior to the mid-1990s, there were technical barriers as well as economic disincentives to the sharing of information. As these barriers have fallen, the potential for both the use and the abuse of data mining has increased.
At one time, society was very concerned about "Big Brother" (the government) gathering data and determining what individuals were doing in their personal lives. Interestingly, as a new century begins, it appears that the organizations most likely to invade your privacy are local businesses. The quantity of data collected about consumers is mind-boggling. Just consider the level of detail contained in the purchasing history of individuals who use VIP cards, shopper club cards, or credit cards to obtain store discounts. Through these cards, companies are able to track your purchases and possibly deduce your interests. In addition, data may be gathered about you in the most unlikely of places. For example, imagine working out at your local gym on a computerized stair stepper or stationary bike while a computer tracks your heartbeat or the number of steps taken per minute. Netpulse Communications Incorporated (http://www.netpulse.com/) does just that. It links its exercise equipment to a national database of healthcare member profiles. "By surveying members, Netpulse plans to flesh out the profiles to include the person's age, weight, gender, birth date, address, and product-buying preferences" (Markoff, 1999, p. 96). Netpulse's intention is to provide online advertising based on individual profiles.
Data, information, and knowledge vary in their stability. For example, knowing that a customer bought Scooby Doo fruit snacks is less important than knowing that the customer has children. "The fact that a customer has diabetes is more stable than a particular pattern of food purchases that may allow inferring he or she has diabetes" (McCarthy, 2000, p. 75). More stable facts, such as that a person has children or diabetes, are more predictive of future behaviors than simple observational facts, such as that diapers were purchased on the 12th of last month.
Needless to say, no matter how you categorize data, the quantity of data collected about an individual is substantial: demographic information, customer satisfaction, legal history, insurance records, purchase preferences, financial and banking information, as well as medical profiles. One thing that IS and business professionals must realize is that following ethical practices and respecting the privacy of individuals makes good business sense. Bad publicity associated with a single incident can taint a company's reputation for years, even when that company has followed the law and done everything that it perceives as possible to ensure the privacy of those from whom the data was gathered. An example of a company that knows all too well the politics of the privacy debate is N2H2. Its Internet filtering software is used by 40% of U.S. schools. N2H2 decided last year to sell its aggregated data. It followed the rules set forth by the Children's Online Privacy Protection Act, and the data did not contain names or personal information (Wilder & Soat, 2001). However, so many people were up in arms over the sale of this data that N2H2 scrapped the project. Thus, even though N2H2 was well within its legal rights to sell its aggregate data, the public viewed the sale as unethical. In the next section, we discuss ethics and their relevance to data mining.