|< Day Day Up >|
With the server availability problem tackled, your next job is to examine what, if any, issues can be found with the reservation system and its transactions. Once again, reports range from "hung" programs and browsers to cryptic error messages that flash by before the user had a chance to write them down.
What is known for sure is that somehow the airline's data integrity became badly damaged: The combination of nearly empty departing jumbo jets juxtaposed with the sight of police riot squads clearing mobs of hundreds of people away from oversold airplanes confirms that assertion.
When faced with application or transaction issues, it's natural to want to dive in and start reading code. A better approach is to attempt to take advantage of any diagnostic tools or reports to help narrow your search. In this case, the SHOW INNODB STATUS command will likely provide useful clues. Fortunately, a quick-thinking system administrator repeatedly ran this command during the height of the crisis, and kept copies of its output. You would like to thank her, but she was terminated along with many others, so this will have to wait.
Poring through the output, you notice a pattern. You see the same type of message about transactions and locks:
---TRANSACTION 0 1291, ACTIVE 58 sec, OS thread id 3632 starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 320 MySQL thread id 16, query id 1150 webserver19 10.68.0.19 webclient Updating update flight_seats set seat_number = '23B' where reservation_id = 7944 and flight_date = '2006-06-10' ------- TRX HAS BEEN WAITING 58 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 0 page no 52 n bits 456 index `PRIMARY` of table `high_hat /flight_seats` trx id 0 1291 lock_mode X waiting Record lock, heap no 2 PHYSICAL RECORD: n_fields 6; 1-byte offs TRUE; info bits 0 0: len 4; hex 00000001; asc ;; 1: len 6; hex 000000000508; asc ;; 2: len 7; hex 800000002d0084; asc - ?;; 3: len 3; hex 8f9f04; asc ? ;; 4: len 4; hex 80003cfd; asc <?;; 5: len 3; hex 30396f; asc 09o;;
Something is definitely wrong here no transaction should wait 5 seconds, much less 58 seconds for a lock to be granted. Few users will be so patient.
It appears that this transaction originates in the "change your seat" portion of the web application. Reading the code, you observe the following processing sequence, expressed here in pseudocode:
Start the transaction Display a list of seats to the user Wait for their input Update the flight_seats table with the new seat request Commit the transaction
You also notice errors like the following sprinkled through the output:
------------------------ LATEST FOREIGN KEY ERROR ------------------------ 060604 21:51:23 Transaction: TRANSACTION 0 1061815959, ACTIVE 32 sec, process no 20846, OS thread id 21087876 32 inserting, thread declared inside InnoDB 500 mysql tables in use 1, locked 1 22 lock struct(s), heap size 2496, undo log entries 1 MySQL thread id 73, query id 1584835 10.68.0.132 hberson update insert into customer_flights(customer_flight_id,customer_id,flight_id) values(20321,23304,11472) Foreign key constraint fails for table `high_hat/customer_flights`: , CONSTRAINT `customer_flights_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `c ustomer_master` (`customer_id`) Trying to add in child table, in index `customer_id` tuple: DATA TUPLE: 2 fields; 0: len 4; hex 00005b08; asc [ ;; 1: len 4; hex 00004f61; asc Oa;; But in parent table `high_hat/customer_master`, in index `PRIMARY`, the closest match we can find is record: PHYSICAL RECORD: n_fields 15; 1-byte offs TRUE; info bits 0 0: len 4; hex 00005b5a; asc [Z;; 1: len 6; hex 00003f4a068f; asc ?J ;; 2: len 7; hex 800005c00207a2; asc ;; 3: len 0; hex ; asc ;; 4: len 0; hex ; asc ;; 5: len 30; hex 317a686f756b71787333386a7135373276783273613166373237327473 68; asc 1zhoukqxs38jq572vx2sa1f7272tsh;...(truncated); 6: len 8; hex 683867346d3 73977; asc h8g4m79w;; 7: SQL NULL, size 0 ; 8: len 14; hex 6a367a786e78747838337 0786d76; asc j6zxnxtx83pxmv;; 9: len 19; hex 7a32616933696961306a65346b71727a357 961; asc z2ai3iia0je4kqrz5ya;; 10: len 3; hex 727573; asc rus;; 11: len 3; hex 8 fa243; asc C;; 12: len 1; hex 01; asc ;; 13: len 3; hex 8fa110; asc ;; 14: len 8; hex 800012356f8be2e0; asc 5o ;;
Finally, as part of your normal review of the application, you discern a pattern of frequent, explicit temporary table creation. A statement to build one of these tables typically looks like this:
CREATE TABLE tmp_customer_master_address AS SELECT cm.*, ca.address1 FROM customer_master cm, customer_address ca WHERE cm.customer_id = ca.customer_id AND cm.last_name = 'Meadows' AND cm.first_name = 'Charles';
After being built, these tables can hold large collections of rows, depending on the uniqueness of the name combination. They are then further searched before being dropped. You do notice, however, that most columns in this temporary table go unused.
Using the results from SHOW INNODB STATUS has saved you a tremendous amount of time, helping you hone in on the source of many problems without having to wade through thousands of lines of application code.
As currently defined, at least two serious problems exist with the transaction you saw earlier:
Given the low quality of the application code you have already reviewed, you recommend a more in-depth study of all the transactional and referential capabilities of the software involved in the ill-fated airfare promotion: It's a necessary step in tracking down the root causes of the over- and underbooking problems.
Your investigation showed that temporary tables are being created all the time, and that these tables effectively serve as small subsets of information used for follow-on queries. At least four problems exist with these tables; luckily, all of these issues have solutions.
|< Day Day Up >|