Data Mining: There's Gold in Them Thar Bits

Web surfing data is not the only source of potentially valuable employee information. One of the more interesting by-products of technology is the phenomenal value that it can add to existing data. It's all well and good to have millions of census forms moldering in a government warehouse, but if they have to be sorted and collated by hand, their value and utility are limited. Even the relatively cumbersome punch card technology helped make the data on those census forms much more useful. Computers, with their vastly faster data handling capabilities, more compact storage, and programmability, are capable of extracting even greater value. With the enormous improvements taking place in computing power, massive amounts of data can be manipulated, sorted, massaged, and viewed from an almost infinite number of angles.

The ability of computers to organize and extract value from large databases has given rise to a new term and a new segment of the software industry: data mining. The key element in data mining is the discovery of relationships between various pieces of data. Using the information contained in its customer database, for instance, an online retailer might quickly be able to determine which items in its inventory are doing well in different areas of the country. With a little more massaging, a company can tailor its e-mail advertisements to its customers based on their recent purchases, both in terms of interest and the amount they typically spend on given types of products.

During the summer of 1999, the online bookstore Amazon.com received some criticism for its enthusiastic implementation of data mining technology. Using a combination of zip codes and domain names, Amazon began compiling specialized bestseller lists, detailing the most popular books in specific communities, businesses, and educational institutions. Amazon called its new feature "purchase circles" and marketed it both as a diversion and as a gift-giving guide. If your brother worked for National Semiconductor, for instance, you might think about giving him a copy of 101 Nights of Grrreat Sex, which hit the National Semiconductor bestseller list that summer.
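The grouping idea behind a feature like purchase circles can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Amazon's actual method: the order records, the e-mail addresses, the book titles, and the function name purchase_circles are all hypothetical, and only the domain-name half of the "zip codes and domain names" combination is shown.

```python
from collections import Counter, defaultdict

# Hypothetical order records: (customer e-mail address, book title).
ORDERS = [
    ("ann@example-chips.com", "Book A"),
    ("bob@example-chips.com", "Book A"),
    ("cat@example-chips.com", "Book B"),
    ("dan@example-univ.edu", "Book C"),
    ("eve@example-univ.edu", "Book C"),
    ("fay@example-univ.edu", "Book B"),
]


def purchase_circles(orders, top_n=2):
    """Group purchases by e-mail domain and rank titles within each group."""
    counts = defaultdict(Counter)
    for email, title in orders:
        # Everything after the last "@" stands in for the buyer's organization.
        domain = email.rsplit("@", 1)[-1].lower()
        counts[domain][title] += 1
    # Keep only the top_n best-selling titles for each domain.
    return {
        domain: [title for title, _ in counter.most_common(top_n)]
        for domain, counter in counts.items()
    }


circles = purchase_circles(ORDERS)
for domain, titles in circles.items():
    print(domain, titles)
```

Swapping the domain for a zip code taken from the shipping address would produce the community lists in the same way. The sketch also shows why the feature drew criticism: nothing more than an e-mail address is needed to tie a purchase to an employer.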

The public outcry that followed the introduction of purchase circles gave Amazon some pause, and the company instituted policies (albeit moderately well-hidden ones) that allow customers to block the collection of their purchase information. The company also no longer promotes its purchase circles as aggressively as it first did, although you can still find the top-selling books for various communities, governmental bodies, and organizations. On October 6, 2002, for example, the top five best-selling books among Microsoft employees were:

  • Yookoso! An Invitation to Contemporary Japanese (student edition plus listening comprehension audio cassette)

  • Trust on Trial: How the Microsoft Case is Reframing the Rules of Competition

  • Proudly Serving My Corporate Masters: What I Learned in Ten Years as a Microsoft Programmer

  • The Person and The Situation

  • Communicating and Mobile Systems: The Pi-Calculus

One field that has quickly adopted data mining technology is professional basketball. A number of teams in the National Basketball Association (NBA) use an IBM program called Advanced Scout, which analyzes the reams of data generated during the league's basketball games. Coaches can download the data, use Advanced Scout to organize and assess it, and then run different queries on the data to locate and identify patterns and trends. The information provided by Advanced Scout can then be used to make game films more informative and educational for both the coaching staff and players.

Similar technology is making its way into more mainstream companies, where the queries are not so much "who are the best scorers and in what game situations?" as they are "who are our best salespeople and with what types of customers?" Software that allows employers to ask those types of questions, of course, is entirely in keeping with the normal goals of business—figuring out how to make the best use of the assets on hand, so as to maximize profits and minimize costs.
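A query such as "who are our best salespeople and with what types of customers?" comes down to grouping and ranking. The Python sketch below is a rough illustration only; the sales records, the names, the customer types, and the function best_seller_by_customer_type are invented for the example and do not describe any particular product.

```python
from collections import defaultdict

# Hypothetical sales records: (salesperson, customer type, sale amount in dollars).
SALES = [
    ("Lee", "retail", 1200),
    ("Lee", "wholesale", 300),
    ("Kim", "retail", 400),
    ("Kim", "wholesale", 2500),
    ("Lee", "retail", 800),
    ("Kim", "wholesale", 500),
]


def best_seller_by_customer_type(sales):
    """Return the top salesperson and total for each customer type."""
    totals = defaultdict(lambda: defaultdict(int))
    for person, customer_type, amount in sales:
        totals[customer_type][person] += amount
    # For each customer type, pick the person with the largest total.
    return {
        customer_type: max(people.items(), key=lambda item: item[1])
        for customer_type, people in totals.items()
    }


best = best_seller_by_customer_type(SALES)
for customer_type, (person, total) in best.items():
    print(customer_type, person, total)
```

With this sample data, Lee leads on retail accounts and Kim on wholesale accounts, which is the kind of pattern such a query is meant to surface.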

Some of the data mining software that's hitting the market threatens to go beyond the boundaries of normal business analysis. For instance, a Boulder, Colorado software company called Fatline recently released a new program called FastTracker. Ostensibly, FastTracker is designed to enable management and employees to analyze all of the information on each other's hard drives so that when a new issue arises or specific information is required, it's possible to locate the person in the organization who is most knowledgeable.

An important subsidiary purpose of FastTracker is to introduce peer-to-peer pressure into the workplace, by making it possible for employees to see what other employees are doing on the Web or are storing on their hard drives. While a coworker's Web surfing habits might reveal that she has developed an expertise in an important new technology, privacy specialists are concerned that peer-to-peer peering will further erode employee privacy.

A similar, albeit more tightly controlled, product is AltaVista Enterprise Search, offered by former search engine leader AltaVista. Like FastTracker, the AV Enterprise Search program is designed to scan files located in disparate places on a company network and prepare an index that can assist the company in locating information and resources. AltaVista stresses that its program is structured in such a way that a company's management can limit the resources available to employees. Without such controls, it might be possible for employees to gain access to highly sensitive material typically stored on computers in the human resources department: salary schedules, evaluations, medical records, etc.
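The combination of a company-wide index and management-imposed limits can be illustrated with a small Python sketch. It is not a description of how AltaVista Enterprise Search or FastTracker works internally; the file paths, departments, roles, and the ALLOWED policy table are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical files: path -> (department that owns the file, file text).
DOCUMENTS = {
    "eng/wifi-notes.txt": ("engineering", "wireless printer prototype notes"),
    "hr/salaries.txt": ("hr", "salary schedule for printer division"),
    "sales/leads.txt": ("sales", "printer leads for the northwest"),
}

# Hypothetical policy set by management: departments each role may search.
ALLOWED = {
    "employee": {"engineering", "sales"},
    "hr_manager": {"engineering", "sales", "hr"},
}


def build_index(documents):
    """Map each word to the set of file paths that contain it."""
    index = defaultdict(set)
    for path, (_, text) in documents.items():
        for word in text.lower().split():
            index[word].add(path)
    return index


def search(index, documents, word, role):
    """Return matching paths, filtered by what the role is allowed to see."""
    allowed = ALLOWED.get(role, set())
    hits = index.get(word.lower(), set())
    return sorted(path for path in hits if documents[path][0] in allowed)


index = build_index(DOCUMENTS)
print(search(index, DOCUMENTS, "printer", "employee"))
print(search(index, DOCUMENTS, "printer", "hr_manager"))
```

In this sketch the index covers every file, and the restriction is applied only when results are returned. If the filter in search were dropped, an ordinary employee searching for "salary" would be pointed straight at the human resources file.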

As we've seen, businesses are fairly cavalier in how they treat their employees' privacy for their own purposes (security, productivity, etc.), but so far, at least, there are no indications that companies are selling large quantities of employee data for economic gain. Most companies are undoubtedly aware that it would be extremely difficult to attract good candidates if they were discovered making a practice of selling private employee information.

The very real possibility exists, nonetheless, that at some point, the value of such information may prove to be too great a temptation. A number of major corporations are saving costs in their human resources departments by providing employee payroll and work history information to a third-party database. When other employers, banks, and income verification services need information about a particular employee's work history, they call the third-party database rather than the employee's former company. By turning employment information over to the central database, companies are able to shrink the size of their HR departments and reduce their personnel costs.

One of the leaders in this growing field is Talx Inc., whose service, The Work Number, lets prospective employers call to verify a job applicant's information (at a charge of ten to thirteen dollars per inquiry). Since 1995, Talx has contracted with companies to obtain data on their employees' employment dates and income; in June 2002, the company said that it had 70 million employment records on file. Taking into account duplicate records, Talx says that its Work Number service has information on roughly 20 percent of the American workforce.

Talx minimizes the potential privacy concerns by stating that the data it collects and redistributes is limited to employment dates and salary. The system does not contain more subjective information like job evaluations or recommendations, although it's certainly not hard to imagine a system that would offer copies of your last three evaluations for an additional fee. The Work Number also contains several security checkpoints to prevent unauthorized access: In order to get access to your employment data, your prospective employer must be a subscriber of The Work Number and must have (surprise!) your Social Security number.

The temptation to peddle employee information will surely rise as data mining software grows more sophisticated and allows employers to draw increasingly detailed portraits of their employees. In this regard, employees may ultimately be hoist by their own petard. The ability of employers to compile detailed information about their employees (information that might eventually be attractive to third parties) is a direct result of the employees' tendency to use company Internet resources to send e-mail, surf the Web, and engage in online chats.

According to a February 2001 Forbes article by Victoria Murphy, the growing use of e-mail and Web surfing by employees, when tracked and analyzed by software like FastTracker and AV Enterprise Search, is allowing businesses to identify employees with particular skills or experience. For instance, if a printer company is considering adding Wi-Fi capabilities to its machines, a manager could use software from companies like Tacit Knowledge Systems or the London-based Autonomy to find out if there's anyone in his company who's been quietly developing expertise in the new technology.




The Naked Employee. How Technology Is Compromising Workplace Privacy
ISBN: 0814471498
Year: 2003
Pages: 93
