I get asked about ETL tools at every single consulting and speaking engagement. To date, I've not used any ETL tool on any data warehouse. My reasons are simple: -
ETL tools generally do not produce optimally efficient code. -
ETL tools cost money ”often more than the hours they replace. -
There are too darn many to choose from (see Table 6-1). Of course, the ETL vendors will firmly dispute my first point; so be it. But, these vendors (even Oracle with their Warehouse Builder) are generally at least a year behind when it comes to supporting the latest database features for optimally efficient data loading. Remember that I've said several times throughout this book that you'll want to be on the bleeding edge of Oracle releases and patches so you can utilize and benefit from such new features. More importantly, the lack of such features may limit your data loading architecture implementation options (e.g., you may be forced into a multi-step loading process with intermediate staging tables). Thus, having an ETL tool at least a year behind the database features may limit your implementation options, and this defeats the purpose. And we're talking about loading massive amounts of data. I've benchmarked several of these ETL tools versus custom code. The closest competition was twice as slow as well-written custom code. And ETL tools with Java transform engines ran orders of magnitude slower. So, you're trading data loading speed for a pretty GUI and sub-optimal code generation, period. As for cost, I mean much more than just the purchase price ”although for some of these ETL tools, the price can reach hundreds of thousands of dollars. There are some very big hidden costs as well. Some ETL tools require your software architecture to conform to their deployment paradigm. I call this buying a square-pegged ETL tool to fit into the round hole of your ETL environment. For example, Oracle's Warehouse Builder generates TCL scripts for scheduling via Oracle's Enterprise Manager. Well, what if you don't use Oracle's Enterprise Manager or TCL? Ever try telling a production support center they had to adopt a new job scheduler? And how about training all your staff on a new language (in case they have to debug the generated code)? These are costs, and big ones at that. Is it really worth it for generating sub-optimal code? Therefore, it's all these hidden costs that make ETL tools a sketchy bet at best. There's one other cost to ETL tools I'm hesitant to expose because it's going to offend some people ”oh well, so be it. ETL tools permit staffing your data loading team with nearly anyone available to generate your ETL programs, regardless of their Oracle and data warehousing experience. It's bad enough that the Oracle field is littered with people who claim to be premier Oracle developers but don't know how to write basic SQL. For example, my all-time favorite is the senior developer with supposedly four plus years of Oracle experience walking into my cubicle and asking what the plus sign is for in a SELECT statement's WHERE clause. Many Oracle developers I've met don't know SQL basics such as sub-queries, correlated sub-queries, tree-walk queries (i.e., START WITH and CONNECT BY ), exists, not exists, unions, minus, etc. Do you really want to give these people a code generator to hide behind? Most DBAs know exactly what I'm talking about here. "Garbage in, garbage out" only gets worse with code generators in the hands of weak developers. It very quickly becomes a little garbage in, a lot of garbage out, with the DBA left holding the bag. Finally, there is the overwhelming multitude of ETL tools to choose from, as shown in Table 6-1. Table 6-1. World of ETL Tools Available ETL Product Name | Vendor | ActaWorks | Acta Technologies | Amadea | ISoft | ASG-XPATH | Allen Systems Group | AT Sigma W-Import | Advanced Technologies | AutoImport | White Crane Systems | Automatic Data Warehouse Builder | Gilbert Babin | Blue Data Miner | Blue Data | Catalyst | Synectics Solutions | CDB/Superload | CDB Software | Cerebellum Portal Integrator | Cerebellum Software | Checkmate | BitbyBit International Ltd. | Chyfo | Ispirer Systems | CMS TextMap | Cornerstone Management Systems | Compleo | Symtrax | Content Connect | One Page | Convert /IDMS-DB, Convert/VSAM | Forecross Corporation | Conversions Plus | DataViz | Copy Manager | Information Builders, Inc. | CoSORT | Innovative Routines International, Inc. | CrossXpress | Cross Access Corporation | Cubeware Importer | CubeWare | Cyklop | Tokab Software AB | Data Cycle | APE Software Components S.L. | Data Exchange | XSB | Data EXTRactor | DogHouse Enterprises | Data Flow Manager | Peter's Software | Data Junction, Content Extractor | Data Junction | Data Manager | Joe Spanicek | Data Mapper | Applied Database Technology | Data Migration Tools | Friedman & Associates | Data Migrator for SAP, PeopleSoft | Information Builders, Inc. | Data Propagation System | Treehouse Software | Data Warehouse Tools | Javacorporate | Data 3 | Inform Information Systems | DataBlaster 2 | Bus-Tech, Inc. | DataBrix Data Manager | Lakeview Technology | DataConvert | Metadata Information Partners | DataDigger | Donnell Systems | DataExchanger SRV | CrossDataBase Technology | Datagration | Paladyne | DataImport | Spalding Software | DataLoad | Software Technologies Corporation | DataManager | Joe Spanicek | DataMIG | Dulcian, Inc. | DataMiner | Placer Group | DataMirror Constellar Hub | DataMirror Corporation | DataMirror Transformation Server | DataMirror Corporation | DataProF | IT Consultancy Group BV | DataPropagator | IBM | DataProvider | Order Software Company | DataPump for SAP R/3 | Transcope AG | DataStage XE | Ascential Software | DataSuite | Pathlight Data Systems | Datawhere | Miab Systems Ltd. | DataX | Data Migrators | DataXPress | EPIQ Systems | DB/Access | Datastructure | DBMS/Copy | Conceptual Software, Inc. | DBridge | Software Conversion House | DEAP I | DEAP Systems | DecisionBase | Computer Associates | DecisionStream | Cognos | DECISIVE Advantage | InfoSAGE, Inc. | Departmental Suite 2000 | Analytical Tools Inc. | DETAIL | Striva Technology | Distribution Agent for MVS | Sybase | DocuAnalyzer | Mobius Management | DQtransform | Metagon Technologies | DT/Studio | Embarcadero Technologies | DTS | Microsoft | e-Sense Gather | Vigil Technologies | e-zMigrate | e-zdata.net | eIntegration Suite | Taviz Technology | Environment Manager | WhiteLight Technology | ETI Extract | Evolutionary Technologies, Inc. | ETL Engine | FireSprout | ETL Manager | iWay Software | eWorker Portal, eWorker Legacy | entrinsic.com | EZ-Pickin's | ExcelSystems | FastCopy | SoftLink | File-AID/Express | CompuWare | FileSpeed | Computer Network Technology | Formware | Captiva Software | FOXTROT | EnableSoft, Inc. | Fusion FTMS | Proginet | Gate/1 | Selesta | G nio | Hummingbird Communications Ltd. | Gladstone Conversion Package | Gladstone Computer Services | GoHunter | Gordian Data | Graphical Performance Series | Vanguard Solutions | Harvester | Object Technology UK | HIREL | SWS Software Services | iManageData | BioComp Systems | iMergence | iMergence Technologies | InfluX | Network Software Associates, Inc. | InfoLink/400 | Batcom | InfoManager | InfoManager Oy | InfoRefiner, InfoTransport, InfoHub, InfoPump | Computer Associates | Information Discovery Platform | Cymfony | Information Logistics Network | D2K | InformEnt | Fiserv | InfoScanner | WisoSoftCom | InScribe | Critical Path | InTouch/2000 | Blue Isle Software, Inc. | ISIE | Artaud, Courth oux & Associ s | John Henry | Acme Software | KM.Studio | Knowmadic | LiveTransfer | Intellicorp | LOADPLUS | BMC Software | Mainframe Data Engine | Flatiron Solutions | Manheim | PowerShift | Mercator | TSI International | Meta Integration Works | Meta Integration Technologies | MetaSuite | Minerva Softcare | MetaTrans | Metagenix | MineWorks/400 | Computer Professional Systems | MinePoint | MinePoint | MITS | Management Information Tools | Monarch | Datawatch Corporation | Mozart | Magma Solutions | mpower | Ab Initio | MRE | SolutionsIQ | NatQuery | NatWorks, Inc | netConvert | The Workstation Group, Ltd. | NGS-IQ | New Generation Software | NSX Data Stager | NSX Software | ODBCFace | System Tech Consulting | OLAP Data Migrator | Legacy to Web Solutions | OmniReplicator | Lakeview Technology | OpalisRendezVous | Opalis | Open Exchange | IST | OpenMigrator | PrismTech | OpenWizard Professional | OpenData Systems | OptiLoad | Leveraged Solutions, Inc. | Oracle Warehouse Builder | Oracle Corporation | Orchestrate | Torrent Systems Inc. | Outbound | Firesign Computer Company | Parse-O-Matic | Pinnacle Software | ParseRat | Guy Software | pcMainframe | cfSOFTWARE | PinnPoint Plus | Pinnacle Decision Systems | PL/Loader | Hanlon Consulting | PointOut | mSE GmbH | Power*Loader Suite | SQL Power Group | PowerDesigner WarehouseArchitect | Powersoft | PowerMart | Informatica | PowerStage | Sybase | Rapid Data | Open Universal Software | Relational DataBridge | Liant Software Corporation | Relational Tools | Princeton Softech | ReTarGet | Tominy | Rodin | Coglin Mill Pty Ltd. | Roll-Up | Ironbridge Software | Sagent Solution | Sagent Technology, Inc. | SAS/Warehouse Adminstrator | SAS Institute | Schemer Advanced | Appligator.com | Scribe Integrate | Scribe Software Corporation | Scriptoria | Bunker Hill | SERdistiller | SER Solutions | Signiant | Signiant | SpeedLoader | Benchmark Consulting | SPINA PRO | Diagnos | SRTransport | Schema Research Corp. | StarQuest Data Replicator | StarQuest Software | StarTools | StarQuest | Stat/Transfer | Circle Systems | Strategy | SPSS | Sunopsis | Sunopsis | SyncSort Unix | Syncsort | TableTrans | PPD Informatics | Text Agent | Tasc, Inc. | TextPipe | Crystal Software Australia | TextProc2000 | LVRA | Textractor | Textkernel | Tilion | Tilion | Transporter Fountain | Digital Fountain | TransportIT | Computer Associates | ViewShark | infoShark | Vignette Business Integration Studio | Vignette | Visual Warehouse | IBM | Volantia | Volantia | vTag Web | Connotate Technologies | Waha | Beacon Information Technology | Warehouse | Taurus Software | Warehouse Executive | Ardent Software | Warehouse Plus | eNVy Systems | Warehouse Workbench | systemfabrik | Web Automation | webMethods | Web Data Kit | LOTONtech | Web Mining | Blossom Software | Web Replicator | Media Consulting | WebQL | Caesius Software | WhizBang! Extraction Library | WhizBang! Labs | Wizport | Turning Point | Xentis | GrayMatter Software Corporation | |