Index_C

C

cardinality, 249-250

causes investigation, 87-94

approaches, 88

connecting, with foreign keys, 181

data events analysis, 89-94

error clustering analysis, 88-89

not possible, 88

requirements, 88

checkers

batch, 212

building, 170

data rule, 234

defensive, 96

periodic, 253

transaction, 252-253

classifying

columns, 202-203

functional dependencies, 198-199

table relationships, 207

COBOL copybooks, 54

column names, 149-151

defined, 149

descriptive, 150-151

overdependence on, 149-150

prefixed/postfixed, 150

column properties

business meaning, 155-157, 168

confidence, 165-166

data rules vs., 220

defined, 143

discrete value list, 160-161

empty condition rules, 165

example, 263-265

length, 159

multiple conflicting rules, 164

names, 149-151

patterns, 164

physical data type, 157-159

precision, 159-160

profiling, 155-167

range of values, 161

skip-over rules, 162

special domains, 164-165

storage, 157-160, 168

text column rules, 162-163

time-related consistency, 166-167

typical, 149

valid value, 160-165, 168

violations of, 132

See also property lists

column property analysis, 132, 143-172

data validation, 155

defined, 143

discovery from data, 153-154

goals, 152

information gathering, 153

process, 152-155

results verification, 154-155

summary, 171-172

value-level remedies, 169-171

columns, 132, 144-145

breaking overloaded fields into, 127

candidate-redundant, 201

classifying, 194, 202-203

constant, 176

date, 195-196

defined, 121, 144

defining with structure rules, 133

derived, 179

descriptor, 195

discovering, 203-204

documentation, 144-145

duplicate, 268

free-form text field, 196

identifier, 194-195

mapping, 167-169

object subgrouping, 227-228

with one value, 156, 194

quantifier, 195

redundant, 133

synonyms, 133, 184-187

text, 162-163

unused, 156

values, 144, 148

See also column properties; data profiling

completeness, 26

complex data rule analysis, 135, 237-245

data gathering, 239-240

definitions, 237-238

mapping with other applications, 244

output validation, 240

process, 238-240

process illustration, 239

summary, 245

testing, 240

See also data rule analysis; simple data rule analysis

complex data rules

aggregations, 243

dates and time, 241-242

example, 270

exclusivity, 242-243

execution, 241

location, 242

lookup, 243-244

remedies, 245

types of, 241-244

See also data rules; simple data rules

confidence, 165-166

defined, 165

examples, 165-166

scorekeeping, 166

See also column properties

consistency, 29-30

as accuracy part, 29-30

time-related, 166-167

See also inconsistencies

consolidations, 53, 231-232

consultants, 17

continuous monitoring, 100-101

elements, 101

with issues tracking, 101

corporations

business case for, 115-117

data sources, importance to, 111

decisions, 115-117

correct information

not given, 47-48

not known, 47

correlation, 21-22

aggregation, 38

complex data rules, 244

value, 38

costs

of achieving accurate data, 108

of conducting assessment, 112

hidden, 111

identified, 111

new system implementation, 13

poor-quality data, 255-256

of slow response, 106-107

transaction rework, 13

typical, 106

wasted, 106

cross-company systems, 7

customer relationship management (CRM), 6, 14

concept, 107

implementations, 16

initiatives, 107

customer-centric model, 16



Data Quality: The Accuracy Dimension (The Morgan Kaufmann Series in Data Management Systems)
ISBN: 1558608915
Year: 2003
Pages: 133
Authors: Jack E. Olson
