cardinality, 249-250
causes investigation, 87-94
approaches, 88
connecting, with foreign keys, 181
data events analysis, 89-94
error clustering analysis, 88-89
not possible, 88
requirements, 88
checkers
batch, 212
building, 170
data rule, 234
defensive, 96
periodic, 253
transaction, 252-253
classifying
columns, 202-203
functional dependencies, 198-199
table relationships, 207
COBOL copybooks, 54
column names, 149-151
defined, 149
descriptive, 150-151
overdependence on, 149-150
prefixed/postfixed, 150
column properties
business meaning, 155-157, 168
confidence, 165-166
data rules vs., 220
defined, 143
discrete value list, 160-161
empty condition rules, 165
example, 263-265
length, 159
multiple conflicting rules, 164
names, 149-151
patterns, 164
physical data type, 157-159
precision, 159-160
profiling, 155-167
range of values, 161
skip-over rules, 162
special domains, 164-165
storage, 157-160, 168
text column rules, 162-163
time-related consistency, 166-167
typical, 149
valid value, 160-165, 168
violations of, 132
See also property lists
column property analysis, 132, 143-172
data validation, 155
defined, 143
discovery from data, 153-154
goals, 152
information gathering, 153
process, 152-155
results verification, 154-155
summary, 171-172
value-level remedies, 169-171
columns, 132, 144-145
breaking overloaded fields into, 127
candidate-redundant, 201
classifying, 194, 202-203
constant, 176
date, 195-196
defined, 121, 144
defining with structure rules, 133
derived, 179
descriptor, 195
discovering, 203-204
documentation, 144-145
duplicate, 268
free-form text field, 196
identifier, 194-195
mapping, 167-169
object subgrouping, 227-228
with one value, 156, 194
quantifier, 195
redundant, 133
synonyms, 133, 184-187
text, 162-163
unused, 156
values, 144, 148
See also column properties; data profiling
completeness, 26
complex data rule analysis, 135, 237-245
data gathering, 239-240
definitions, 237-238
mapping with other applications, 244
output validation, 240
process, 238-240
process illustration, 239
summary, 245
testing, 240
See also data rule analysis; simple data rule analysis
complex data rules
aggregations, 243
dates and time, 241-242
example, 270
exclusivity, 242-243
execution, 241
location, 242
lookup, 243-244
remedies, 245
types of, 241-244
See also data rules; simple data rules
confidence, 165-166
defined, 165
examples, 165-166
scorekeeping, 166
See also column properties
consistency, 29-30
as accuracy part, 29-30
time-related, 166-167
See also inconsistencies
consolidations, 53, 231-232
consultants, 17
continuous monitoring, 100-101
elements, 101
with issues tracking, 101
corporations
business case for, 115-117
data sources, importance to, 111
decisions, 115-117
correct information
not given, 47-48
not known, 47
correlation, 21-22
aggregation, 38
complex data rules, 244
value, 38
costs
of achieving accurate data, 108
of conducting assessment, 112
hidden, 111
identified, 111
new system implementation, 13
poor-quality data, 255-256
of slow response, 106-107
transaction rework, 13
typical, 106
wasted, 106
cross-company systems, 7
customer relationship management (CRM), 6, 14
concept, 107
implementations, 16
initiatives, 107
customer-centric model, 16