What About ETL Tools?


I get asked about ETL tools at every single consulting and speaking engagement. To date, I've not used any ETL tool on any data warehouse. My reasons are simple:

  • ETL tools generally do not produce optimally efficient code.

  • ETL tools cost money ”often more than the hours they replace.

  • There are too darn many to choose from (see Table 6-1).

Of course, the ETL vendors will firmly dispute my first point; so be it. But, these vendors (even Oracle with their Warehouse Builder) are generally at least a year behind when it comes to supporting the latest database features for optimally efficient data loading. Remember that I've said several times throughout this book that you'll want to be on the bleeding edge of Oracle releases and patches so you can utilize and benefit from such new features. More importantly, the lack of such features may limit your data loading architecture implementation options (e.g., you may be forced into a multi-step loading process with intermediate staging tables). Thus, having an ETL tool at least a year behind the database features may limit your implementation options, and this defeats the purpose. And we're talking about loading massive amounts of data. I've benchmarked several of these ETL tools versus custom code. The closest competition was twice as slow as well-written custom code. And ETL tools with Java transform engines ran orders of magnitude slower. So, you're trading data loading speed for a pretty GUI and sub-optimal code generation, period.

As for cost, I mean much more than just the purchase price ”although for some of these ETL tools, the price can reach hundreds of thousands of dollars. There are some very big hidden costs as well. Some ETL tools require your software architecture to conform to their deployment paradigm. I call this buying a square-pegged ETL tool to fit into the round hole of your ETL environment. For example, Oracle's Warehouse Builder generates TCL scripts for scheduling via Oracle's Enterprise Manager. Well, what if you don't use Oracle's Enterprise Manager or TCL? Ever try telling a production support center they had to adopt a new job scheduler? And how about training all your staff on a new language (in case they have to debug the generated code)? These are costs, and big ones at that. Is it really worth it for generating sub-optimal code? Therefore, it's all these hidden costs that make ETL tools a sketchy bet at best.

There's one other cost to ETL tools I'm hesitant to expose because it's going to offend some people ”oh well, so be it. ETL tools permit staffing your data loading team with nearly anyone available to generate your ETL programs, regardless of their Oracle and data warehousing experience. It's bad enough that the Oracle field is littered with people who claim to be premier Oracle developers but don't know how to write basic SQL. For example, my all-time favorite is the senior developer with supposedly four plus years of Oracle experience walking into my cubicle and asking what the plus sign is for in a SELECT statement's WHERE clause. Many Oracle developers I've met don't know SQL basics such as sub-queries, correlated sub-queries, tree-walk queries (i.e., START WITH and CONNECT BY ), exists, not exists, unions, minus, etc. Do you really want to give these people a code generator to hide behind? Most DBAs know exactly what I'm talking about here. "Garbage in, garbage out" only gets worse with code generators in the hands of weak developers. It very quickly becomes a little garbage in, a lot of garbage out, with the DBA left holding the bag.

Finally, there is the overwhelming multitude of ETL tools to choose from, as shown in Table 6-1.

Table 6-1. World of ETL Tools Available

ETL Product Name

Vendor

ActaWorks

Acta Technologies

Amadea

ISoft

ASG-XPATH

Allen Systems Group

AT Sigma W-Import

Advanced Technologies

AutoImport

White Crane Systems

Automatic Data Warehouse Builder

Gilbert Babin

Blue Data Miner

Blue Data

Catalyst

Synectics Solutions

CDB/Superload

CDB Software

Cerebellum Portal Integrator

Cerebellum Software

Checkmate

BitbyBit International Ltd.

Chyfo

Ispirer Systems

CMS TextMap

Cornerstone Management Systems

Compleo

Symtrax

Content Connect

One Page

Convert /IDMS-DB, Convert/VSAM

Forecross Corporation

Conversions Plus

DataViz

Copy Manager

Information Builders, Inc.

CoSORT

Innovative Routines International, Inc.

CrossXpress

Cross Access Corporation

Cubeware Importer

CubeWare

Cyklop

Tokab Software AB

Data Cycle

APE Software Components S.L.

Data Exchange

XSB

Data EXTRactor

DogHouse Enterprises

Data Flow Manager

Peter's Software

Data Junction, Content Extractor

Data Junction

Data Manager

Joe Spanicek

Data Mapper

Applied Database Technology

Data Migration Tools

Friedman & Associates

Data Migrator for SAP, PeopleSoft

Information Builders, Inc.

Data Propagation System

Treehouse Software

Data Warehouse Tools

Javacorporate

Data 3

Inform Information Systems

DataBlaster 2

Bus-Tech, Inc.

DataBrix Data Manager

Lakeview Technology

DataConvert

Metadata Information Partners

DataDigger

Donnell Systems

DataExchanger SRV

CrossDataBase Technology

Datagration

Paladyne

DataImport

Spalding Software

DataLoad

Software Technologies Corporation

DataManager

Joe Spanicek

DataMIG

Dulcian, Inc.

DataMiner

Placer Group

DataMirror Constellar Hub

DataMirror Corporation

DataMirror Transformation Server

DataMirror Corporation

DataProF

IT Consultancy Group BV

DataPropagator

IBM

DataProvider

Order Software Company

DataPump for SAP R/3

Transcope AG

DataStage XE

Ascential Software

DataSuite

Pathlight Data Systems

Datawhere

Miab Systems Ltd.

DataX

Data Migrators

DataXPress

EPIQ Systems

DB/Access

Datastructure

DBMS/Copy

Conceptual Software, Inc.

DBridge

Software Conversion House

DEAP I

DEAP Systems

DecisionBase

Computer Associates

DecisionStream

Cognos

DECISIVE Advantage

InfoSAGE, Inc.

Departmental Suite 2000

Analytical Tools Inc.

DETAIL

Striva Technology

Distribution Agent for MVS

Sybase

DocuAnalyzer

Mobius Management

DQtransform

Metagon Technologies

DT/Studio

Embarcadero Technologies

DTS

Microsoft

e-Sense Gather

Vigil Technologies

e-zMigrate

e-zdata.net

eIntegration Suite

Taviz Technology

Environment Manager

WhiteLight Technology

ETI Extract

Evolutionary Technologies, Inc.

ETL Engine

FireSprout

ETL Manager

iWay Software

eWorker Portal, eWorker Legacy

entrinsic.com

EZ-Pickin's

ExcelSystems

FastCopy

SoftLink

File-AID/Express

CompuWare

FileSpeed

Computer Network Technology

Formware

Captiva Software

FOXTROT

EnableSoft, Inc.

Fusion FTMS

Proginet

Gate/1

Selesta

G nio

Hummingbird Communications Ltd.

Gladstone Conversion Package

Gladstone Computer Services

GoHunter

Gordian Data

Graphical Performance Series

Vanguard Solutions

Harvester

Object Technology UK

HIREL

SWS Software Services

iManageData

BioComp Systems

iMergence

iMergence Technologies

InfluX

Network Software Associates, Inc.

InfoLink/400

Batcom

InfoManager

InfoManager Oy

InfoRefiner, InfoTransport, InfoHub, InfoPump

Computer Associates

Information Discovery Platform

Cymfony

Information Logistics Network

D2K

InformEnt

Fiserv

InfoScanner

WisoSoftCom

InScribe

Critical Path

InTouch/2000

Blue Isle Software, Inc.

ISIE

Artaud, Courth oux & Associ s

John Henry

Acme Software

KM.Studio

Knowmadic

LiveTransfer

Intellicorp

LOADPLUS

BMC Software

Mainframe Data Engine

Flatiron Solutions

Manheim

PowerShift

Mercator

TSI International

Meta Integration Works

Meta Integration Technologies

MetaSuite

Minerva Softcare

MetaTrans

Metagenix

MineWorks/400

Computer Professional Systems

MinePoint

MinePoint

MITS

Management Information Tools

Monarch

Datawatch Corporation

Mozart

Magma Solutions

mpower

Ab Initio

MRE

SolutionsIQ

NatQuery

NatWorks, Inc

netConvert

The Workstation Group, Ltd.

NGS-IQ

New Generation Software

NSX Data Stager

NSX Software

ODBCFace

System Tech Consulting

OLAP Data Migrator

Legacy to Web Solutions

OmniReplicator

Lakeview Technology

OpalisRendezVous

Opalis

Open Exchange

IST

OpenMigrator

PrismTech

OpenWizard Professional

OpenData Systems

OptiLoad

Leveraged Solutions, Inc.

Oracle Warehouse Builder

Oracle Corporation

Orchestrate

Torrent Systems Inc.

Outbound

Firesign Computer Company

Parse-O-Matic

Pinnacle Software

ParseRat

Guy Software

pcMainframe

cfSOFTWARE

PinnPoint Plus

Pinnacle Decision Systems

PL/Loader

Hanlon Consulting

PointOut

mSE GmbH

Power*Loader Suite

SQL Power Group

PowerDesigner WarehouseArchitect

Powersoft

PowerMart

Informatica

PowerStage

Sybase

Rapid Data

Open Universal Software

Relational DataBridge

Liant Software Corporation

Relational Tools

Princeton Softech

ReTarGet

Tominy

Rodin

Coglin Mill Pty Ltd.

Roll-Up

Ironbridge Software

Sagent Solution

Sagent Technology, Inc.

SAS/Warehouse Adminstrator

SAS Institute

Schemer Advanced

Appligator.com

Scribe Integrate

Scribe Software Corporation

Scriptoria

Bunker Hill

SERdistiller

SER Solutions

Signiant

Signiant

SpeedLoader

Benchmark Consulting

SPINA PRO

Diagnos

SRTransport

Schema Research Corp.

StarQuest Data Replicator

StarQuest Software

StarTools

StarQuest

Stat/Transfer

Circle Systems

Strategy

SPSS

Sunopsis

Sunopsis

SyncSort Unix

Syncsort

TableTrans

PPD Informatics

Text Agent

Tasc, Inc.

TextPipe

Crystal Software Australia

TextProc2000

LVRA

Textractor

Textkernel

Tilion

Tilion

Transporter Fountain

Digital Fountain

TransportIT

Computer Associates

ViewShark

infoShark

Vignette Business Integration Studio

Vignette

Visual Warehouse

IBM

Volantia

Volantia

vTag Web

Connotate Technologies

Waha

Beacon Information Technology

Warehouse

Taurus Software

Warehouse Executive

Ardent Software

Warehouse Plus

eNVy Systems

Warehouse Workbench

systemfabrik

Web Automation

webMethods

Web Data Kit

LOTONtech

Web Mining

Blossom Software

Web Replicator

Media Consulting

WebQL

Caesius Software

WhizBang! Extraction Library

WhizBang! Labs

Wizport

Turning Point

Xentis

GrayMatter Software Corporation



Oracle DBA Guide to Data Warehousing and Star Schemas
Oracle DBA Guide to Data Warehousing and Star Schemas
ISBN: 0130325848
EAN: 2147483647
Year: 2003
Pages: 79
Authors: Bert Scalzo

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net