COMMIT and ROLLBACK

   

It should be true that a COMMIT will actually guarantee that data is written to the disk and flushed. That is, it should be safe to pull the power plug as soon as COMMIT is done. Unfortunately, any DBMS that makes that guarantee must fight the operating system, because the operating system prefers to do lazy writes and keep its caches until a physical write is convenient . Therefore COMMIT is always going to be slow. This is true even if the DBMS doesn't wait until the database is updated, because at the very least it must wait until the log file is updated.

We've mentioned before that all DBMSs have a nonstandard flag called the auto-commit flag. If the flag is on , then COMMIT happens automatically after every data-change statement has been executed. If the auto-commit flag is off , then you must issue an explicit COMMIT yourself. Here are the usual rules:

  • If there is only one data-change statement in the transaction, then auto-commit should be on .

  • If there are multiple data-change statements in the transaction, then auto-commit should be off .

  • The auto-commit flag is on by default if you are using ODBC or JDBC or most other APIs.

Transactions can be difficult, though, because of the following odd behavior by DBMSs:

  • IBM, Informix, and Ingres act correctly whether the auto-commit flag is on or off .

  • InterBase and Oracle won't allow you to ROLLBACK Data Definition Language (DDL) statements (e.g., CREATE TABLE) even if the auto-commit flag is off . But they do allow you to ROLLBACK Data Manipulation Language (DML) statements (e.g., INSERT) if the auto-commit flag is off .

  • Even if the auto-commit flag is off , Microsoft and Sybase will do automatic COMMITs unless you issue an explicit BEGIN TRANSACTION statement. Generally, you can't ROLLBACK DDL statements with these DBMSs, but you can ROLLBACK DML statements.

  • If your DBMS is MySQL, auto-commit is always on and ROLLBACK is not possible.

The net effect is that it's really rather difficult to use transactions. The idea is to reduce the number of network messages because a COMMIT message doesn't have to be sent. But you should make what efforts you can to delay COMMIT if (a) auto-commit would compromise data integrity, or (b) you have several statements that will probably change the same pages in the same database files.

When performing a transaction, the DBMS can make one of two possible assumptions: that COMMIT is likely to happen, or that ROLLBACK is likely to happen. If COMMIT is likely, then the DBMS's best strategy is this:

 When a data change happens {   Put a copy of the original data page in the log file.   Change the data page immediately. } When a COMMIT happens {   Just flush the log file. } When a ROLLBACK happens {   Read the original from the log file.   Copy it to the data page. } 

If ROLLBACK is likely, then this is the DBMS's best strategy:

 When a data change happens {   Put a copy of the changed data page in the log file. } When a COMMIT happens {   Read the change from the log file.   Copy it to the data page.   Flush. } When a ROLLBACK happens {   Mark the log entry as invalid.   Do nothing else. The original data is untouched. } 

Most DBMSs assume that COMMIT is likely. That's okay, but it has the unfortunate consequence that you can't play "what-if" games in your transactions. Start with the belief that your transaction will go through, because if you have to abort it, a ROLLBACK takes longer than a COMMIT. Remember also that logging and COMMIT processing happen in separate threads so you won't see this effect by simply timing your transaction. But the effect is there and will affect throughput.

Portability

Ingres, InterBase, Oracle, and Sybase assume COMMIT is likely. IBM, Informix, and Microsoft assume ROLLBACK is likely. MySQL doesn't support transactions.


Usually, the actual event order is that a data-change statement changes an in-memory copy of a page and creates an in-memory log record. COMMIT ensures that a separate thread (the log writer) has flushed the log records only. Flushing of the changed data pages can happen later.

Tip

If ROLLBACK is expensive, then start a transaction with the operation that's most likely to fail. For example, in a library circulation transaction, scan the patron's card before scanning the material the patron wishes to borrow (GAIN: 7/7).


The Bottom Line: COMMIT and ROLLBACK

COMMIT is always going to be slow.

Watch out for operating system caches. COMMIT hasn't really happened until these buffers are written to disk and flushed.

If only one data-change statement is in the transaction, then auto-commit should be on .

If multiple data-change statements are in the transaction, then auto-commit should be off .

Make an effort to delay COMMIT if (a) auto-commit would compromise data integrity, or (b) you have several statements that will probably change the same pages in the same database files.

Start with the belief that your transaction will go throughthat is, write your transaction based on the assumption that COMMIT will be successful, not that ROLLBACK will be necessary.

If ROLLBACK is expensive, start a transaction with the operation that's most likely to fail.

   


SQL Performance Tuning
SQL Performance Tuning
ISBN: 0201791692
EAN: 2147483647
Year: 2005
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net