data archiving
data staging used for, 66–67
File System Task for, 8, 352
data banding, 99
data cleansing
data profiling for, 48–52
example of, 177–180
transformations for, 97
Data Conversion transformation, 97, 98–99, 319–321
Data Driven Query tasks, migrating to SSIS, 339
Data Encryption Standard (DES), 40
data extraction
configuring in Data Flow task, 8–9
data lineage, 55
definition of, 55
incremental extraction
approaches for, 68
with audit table, 68
with auto-incrementing identifier, 68
with change identifier columns, 68
correlating source with destination for, 83–86
definition of, 55, 68
with log-based auditing, 68
with timestamp data type, 79–81
objectives of, 55
process of, 56
data flow destinations
correlating source with, for incremental extraction, 83–86
Dimension Processing destination, 197, 205–207
optimizing, 371–376
Oracle destination, by Persistent Systems, 330–331
Partition Processing destination
for processing cubes, 221
for SSAS processing, 197, 207–208
for processing cubes, 220–221
Raw File destination, 321–322
raw files created by, 63
for SSAS objects, 197, 204–208
table locking in, 170
data flow optimization
data flow properties for, 369–371
destination optimization, 371–376
pipeline optimization, 365–369
data flow source adapters
creating connections for, 57
DataReader source adapter, 60
definition of, 57, 58
delaying validation of, 64
Excel source adapter, 61–63
flat file source adapter
change identifier columns in, 82–83
using, 58–60
OLE DB source adapter
parameterized source queries, 70–71
using, 60–61
raw file source adapter, 63
XML source adapter, 63
Data Flow Task
memory utilization of, 357–360
source adapter in, 57
SQL operations compared to, 360–362
data flows
backpressure of, 365
execution time of, measuring, 371
execution trees for, 363
data grain changes
example of, 175–181
forms of, 173
transformations for, 173–175
data integration, 6
data lineage
approaches for, 86–87
definition of, 55, 86
sequence lineage number for, 87–88
data mining objects (SSAS)
data flow features for, 361
processing of, 192, 196
data processing, tools for, 1–4.
See also ETL (extraction, transformation, and loading)
data profiling, 48–52
data quality
data cleansing
data profiling for, 48–52
example of, 177–180
transformations for, 97
data validation
data profiling for, 48–52
data staging used for, 66–67
data sources
correlating to destination, for incremental extraction, 83–86
DB2 sources, 331–332
joining to dimension data, 133–136
mapping to dimension and fact tables, 144
multiple record types in, 316–323
Oracle sources
code page warnings from, 328
Oracle client network components for, 326
parameterized queries with, 328–330
providers for, 327–328
reading from, 327–330
writing to, 330–331
preparing for dimension ETL, 97–101
raw files, 323–325
sharing among packages, 57
supported by SSIS, 56
VSAM sources, 332
data staging, 66–67
data type of variable, 23
data validation
data profiling for, 48–52
data staging used for, 66–67
data warehousing ETL.
See also dimension ETL;
dimension tables;
fact table ETL
book about, 102
definition of, 4–6
NULL values in, 12
database (SSAS), processing of, 192
database (SSIS)
administration of, 6–7
growth of, effect on performance, 371
msdb database
Package Store in, 287–288
roles in, used by package roles, 298
DataDirect Providers
for DB2, 332
for Oracle, 328
DataReader source adapter, 60
date dimension tables, with SCD Wizard, 128–132
db_dtsadmin database role, 298
db_dtsoperator database role, 298
DB2 sources, 331–332
debug mode for package execution, 17
decision support, data warehousing focusing on, 4
Decrypt() method, 40
DefaultBufferMaxRows property, 370
DefaultBufferSize property, 370
degenerate dimension, 88
DelayValidation property, connections, 64
deployment of packages
package installation wizard for, 275–279
Package Store location affecting, 287–288
utility for, creating, 274–275
derived calculations for measures, 159
Derived Column transformation
for data grain changes, 174
for measure calculations, 159, 160–161
preparing source data for dimension ETL, 97, 99–100
DES (Data Encryption Standard), 40
destination server package execution, 379–380
destinations. See data flow destinations
dimension ETL
definition of, 91–92, 96–97
preparing source data for, 97–101
transformations for, 97
Dimension Processing destination, 197, 205–207
dimension tables
business key (candidate key) in, 95
correspondence to source tables, 97
definition of, 4
indexes for, optimizing, 121
loading of, updating inferred members during, 106–107
mapping source files to, 144
surrogate key for, 95, 143–144
updates to, optimizing, 121–122
dimensional model, 4
dimensions (SSAS), processing of, 192–194, 209–211
dimensions (SSIS)
attributes of, 92
change types of
in custom package, 137–141
inferred members, handling with, 104–107
list of, 102–104
specifying in SCD Wizard, 108–112
supported by SCD Wizard, 107
definition of, 92–93
fact tables relating to, 94–95
hierarchy within
definition of, 92–93
ragged, 126
unbalanced, 126
inferred members of, 104–107
slowly changing dimension, 102
disk input/output (I/O), 357–360
Dispose() method, IDisposable interface, 29
distributed server package execution, 381–382
Do…While loop, in For Loop container, 19
downloading files, over HTTP, 29–30
DTExec.exe utility
definition of, 300
with distributed package execution, 382
location where executed, 376–377
for packages executed outside debug mode, 17
32-bit and 64-bit versions of, 306–308
DTExecUI.exe utility
32-bit and 64-bit issues regarding, 306–308
using DTS Migration Wizard, 290–297
DTS Migration Wizard, 339–340, 341–344
DTS (SQL Server 2000 DTS)
migrating to SSIS
DTS Migration Wizard for, 339–340, 341–344
handling exceptions from, 344–352
SQL Server Upgrade Advisor, 340–341
success rates of, 339–340
packages
administration of, 335–339
editing, 336
importing, 336
running, 336–339
scheduling, 336
DTSER_SUCCESS code, 17
DTSRun.exe command, 336–337
DTSRunUI.exe command, 337
DTUtil.exe utility, 300–301
dynamic connections, 35–37
Dynamic Properties tasks, migrating to SSIS, 339, 345–347