A database is a mechanism used to define, store, and manipulate data. In modern information systems as database management systems (DBMSs), which generally exist apart from application software, a database contains the data as well as a programming and command interface used to create, manage, and administer data.
Cross-Reference DBMSs generally contain an access-control mechanism used to protect data and permit only certain users, or classes of users, to view or modify certain portions of the database. We describe access control mechanisms in databases more fully in Chapter 4.
The two most common types of databases found in use today are relational databases and object-oriented databases.
The granularity of access control is a description of how finely you can control who can see which tables, rows, and fields. An example of low granularity is read or read/write access to all rows and fields in a table. High granularity restricts access to certain fields and even certain rows. But realize that high granularity means that either the database administrator (good if this isn’t you) or the security administrator (bad if this is you) is going to be working a lot of extra hours managing all those permissions.
Views can be employed to simplify security issues. A view is a virtual table that consists of the rows and fields from one or more tables in the database. Then you give access to these views rather than to the actual tables. This can be a lot easier to manage, meaning that the database administrator or security administrator can manage security with somewhat higher granularity and still have a life.
Aggregation is a matter of serious concern from a security and privacy perspective. Aggregation refers to the process of combining low-sensitivity data items together, resulting in high-sensitivity data. Consider this example: If you get Ed McMahon’s home address from one database, his Social Security Number from another, his driver’s license from another, and his date of birth from yet another, then you’ve got something potentially damaging . . . or valuable. Add a diabolical, hungry mind, some motivation, and soon you’re buying Brooks Brothers suits and driving shiny red Italian sports cars, all courtesy of Ed McMahon. If you still don’t get it, we’re talking about identity theft here. By themselves, dates of birth or a home address don’t mean a lot, but together, you’ve got something. That’s aggregation.
Instant Answer The term aggregation refers to the process of combining low-sensitivity data items to produce a high-sensitivity data item.
We’d be remiss if we didn’t mention inference. This concept refers to the ability of someone to deduce or infer something about sensitive information that’s beyond normal reach because of its sensitivity level.
A data dictionary is a database of databases. In a large application with many tables (some applications have hundreds of tables), a data dictionary could exist as a database containing all the information about all the tables and fields in a given application. The DBMS, the application, and security tools such as access control use a data dictionary. A data dictionary can be used to create or re-create tables, manage security access, and as a control point for managing the schema of the application’s database.
A data warehouse is a special purpose database that’s used for business research, decision support, and planning; comparatively, typical databases support daily business operations.
Why have a data warehouse? Ultimately, for performance reasons. Every large company has a department full of people who play what-if games with the corporate data - and they get paid for it! For instance, a bank executive may want a list of customers with more than $35,000 in their checking accounts who haven’t made any deposits in the past month and who also live in the 94027 Zip code. (You know, the Silicon Valley zillionaires.) The people in Operations may get a little upset if the VP is combing through the database all day long doing queries on nonindexed fields. The tellers who are waiting 45 seconds for deposits to clear because of all this extra processing might be getting a little miffed, too.
The point is that the company’s main production database isn’t the place for the kind of activity that the what-iffers play. You can tune the database for inserts and updates (activities typical in regular production databases) or large queries (activities typical of data warehouses), but not both. These two activities require opposite tuning settings in the database. They don’t get along. Analogy: If you want to drive fast, pick the sports car; if you want to haul stuff, choose the pickup truck. Use the right tool for the right job. The same goes for production business data processing versus research and forecasting.
The respectable term for the activity that the what-iffers play is decision support. These folks try to figure out trends about their customers, their business activities, or what-have-you, in order to support strategic decisions. The new fancy-pants term for decision support is data mining. Data mining should conjure up a vivid word picture of a strip mine where frantic diggers comb through tons of ore for grams of gold, uranium, and kryptonite. Data mining is like that: Executives and business planners do comb through tons of historical data in order to find hard-to-spot and potentially valuable trends.
Another common application of data mining is fraud detection. Banks and credit card companies sift through mountains of transactions looking for spending trends in individual accounts to sniff out whether a credit card number may have been stolen from a poorly protected e-commerce company’s data warehouse. Talk about a cruel irony.
Someone who needed to sell more databases came up with the idea that building a separate database optimized for humongous queries to support decision support would, in the end, be better than trying to accommodate production operations and decision support on the same system. History bears this out, and data warehouses work pretty well when they’re well designed and managed.
Instant Answer The type of database used for decision support is called a data warehouse.