Chapter 63: Fun and Games: Simulating Gambling and Sporting Event Probabilities


Overview

  • What is the probability of winning at craps?

  • In five-card draw poker, what is the probability of getting three of a kind?

  • Before the 2003 Super Bowl, Oakland was favored by 3 points. What was the probability that Tampa Bay would beat Oakland?

  • Going into the 2003 NCAA men’s basketball Final Four, what was the probability of each team winning the tournament?

Gambling and following sporting events are popular pastimes. I think gambling and sports are exciting because you never know what’s going to happen. Monte Carlo simulation is a powerful tool that can be used to predict gambling and sporting event probabilities. Essentially, we predict probability by playing out our gambling or sporting event situation multiple times. If, for example, we have Microsoft Office Excel 2007 play out the game of craps 10,000 times and we win 4900 times, we would estimate the probability of winning at craps to equal 4900/10,000, or 49 percent. If we play out the 2003 NCAA men’s Final Four 1000 times and Syracuse wins 300 of our iterations, we would estimate Syracuse’s probability of winning the championship as 300/1000, or 30 percent.

  • What is the probability of winning at craps?

  • In the game of craps, a player rolls two dice. If the combination is 2, 3, or 12, the player loses. If the combination is 7 or 11, the player wins. If the combination is a different number, the player continues rolling the dice until he or she either matches the number thrown on the first roll (called the point) or rolls 7. If the player rolls the point before rolling 7, the player wins. If the player rolls 7 before rolling the point, the player loses. By complex calculations, we can show that the probability of a player winning at craps is 0.493. We can use Excel to simulate the game of craps many times (I chose 2000) to demonstrate this probability.

  • In this example, it is crucial to keep in mind that we don’t know how many rolls the game will take. We can show that the chance of a game requiring more than 50 rolls of the dice is highly unlikely, so we’ll play out 50 rolls of the dice. After each roll, we keep track of the game status as follows:

    • 0 equals the game is lost.

    • 1 equals the game is won.

    • 2 equals the game continues.

  • Our output cell will keep track of the status of the game after the 50th roll by recording 1 to indicate a win and 0 to indicate a loss. You can review the work I did in the file Craps.xlsx, shown in Figure 63-1.

    image from book
    Figure 63-1: Simulating a game of craps

  • In cell B2, I used the RANDBETWEEN function to generate the number on the first die on the first roll by using the formula RANDBETWEEN(1,6). The RANDBETWEEN function ensures that each of its arguments is equally likely, so each die has an equal (1/6) chance of yielding 1, 2, 3, 4, 5, or 6. Copying this formula to the range B2:AY3 generates 50 rolls of the dice. (In Figure 63-1, I’ve hidden rolls 8–48.)

  • In the cell range B4:AY4, I compute the total dice combination for each of the 50 rolls by copying from B4 to C4:AY4 the formula SUM(B2:B3). In cell B5, I determine the game status after the first roll with the formula IF(OR(B4=2,B4=3,B4=12),0,IF(OR(B4=7, B4=11),1,2)). Remember that a roll of 2, 3, or 12 results in a loss (entering 0 in the cell); a roll of 7 or 11 results in a win (1); and any other roll results in the game continuing (2).

  • In cell C5, I compute the status of the game after the second roll with the formula IF(OR(B5=0,B5=1),B5,IF(C4=$B4,1,IF(C4=7,0,2))). If the game ended on the first roll, we maintain the status of the game. If we make our point, we record a win with 1. If we roll a 7, we record a loss. Otherwise, the game continues. I added a dollar sign in the reference to column B ($B4) in this formula to ensure that with each roll, we try to match the point thrown on the first roll. Copying this formula from C5 to D5:AY5 records the game status after rolls 2 through 50.

  • The game result is in cell AY5, which is copied to C6 so that we can easily see it. I then use a one-way data table to play out the game of craps 2000 times. In cell E9 I enter the formula =C6, which tracks the final outcome of the game (0 if a loss or 1 if a win). Next I select the table range (D9:E2009), click What-If Analysis in the Data Tools group on the Data tab, and then click Data Table. I choose a one-way table with any blank cell as our Column Input Cell. After pressing F9, I’ve simulated the game of craps 2000 times.

  • In cell E8, I can compute the fraction of our simulations that result in wins with the formula AVERAGE(E10:E2009). For our 2000 iterations, we won 49.25 percent of the time. If we had run more trials (for example, 10,000 iterations), we would have come closer to the true probability of winning at craps (49.3%). For information about using a one-way data table, see Chapter 15, “Sensitivity Analysis with Data Tables.”

  • In five-card draw poker, what is the probability of getting three of a kind?

  • An ordinary deck of cards contains four cards of each type-four aces, four deuces, and so on, up to four kings. To estimate the probability of getting a particular poker hand, we’ll assign the value 1 to an ace, 2 to a deuce, and on up through the deck so that a jack is assigned the value 11, a queen is assigned 12, and a king is assigned 13.

  • In five-card draw poker, you are dealt five cards. Many probabilities are of interest, but let’s use simulation to estimate the probability of getting three of a kind, which requires that you have three of one type of card and no pairs. (If you have a pair and three of a kind, the hand is a full house.) To simulate the five cards drawn, we proceed as follows. (See the file Poker.xlsx, shown in Figure 63-2.)

    • Associate a random number with each card in the deck.

    • The five cards chosen will be the five cards associated with the five smallest random numbers, which gives each card an equal chance of being chosen.

    • Count how many of each card (ace through king) are drawn.

    image from book
    Figure 63-2: Estimating the probability that we’ll draw three of a kind in a poker game

  • To begin, we list in cells D3:D54 all the cards in the deck: four 1s, four 2s, and so on up to four 13s. Then we copy from cell E3 to E4:E54 the RAND() function to associate a random number with each card in the deck. Copying from C3 to C4:C54 the formula RANK(E3,$E$3:$E$54,1) gives the rank (ordered from smallest to largest) of each random number. For example, in Figure 63-2, you can see that the first 3 in the deck (row 11) is associated with the 32nd smallest random number. (You will see different results in the worksheet because the random numbers are automatically recalculated when you open the worksheet.)

  • The syntax of the RANK function is RANK(number,array,1 or 0). If the last argument of the RANK function is 1, the function returns the rank of the number in the array with the smallest number receiving a rank of 1, the second smallest number a rank of 2, and so on. If the last argument of the RANK function is 0, the function returns the rank of the number in the array with the largest number receiving a rank of 1, the second largest number receiving a rank of 2, and so on.

  • When ranking random numbers, no ties can occur (because the random numbers would have to match 16 digits).

  • Suppose, for example, we were ranking the numbers 1, 3, 3, and 4 and that the last argument of the RANK function was 1. Excel would return the following ranks:

    Open table as spreadsheet

    Number

    Rank (smallest number has rank of 1)

    1

    1

    3

    2

    3

    2

    4

    4

  • Because 3 is the second smallest number, 3 would be assigned a rank of 2. The other 3 would also be assigned a rank of 2. Because 4 is the fourth smallest number, it will be assigned a rank of 4. Understanding the treatment of ties by the RANK function will help you complete Problem 1 at the end of the chapter.

  • By copying from cell B3 to B4:B7 the formula VLOOKUP(A3,lookup,2,FALSE), we can draw our five cards from the deck. This formula draws the five cards corresponding to the five smallest random numbers. (The lookup table range C3:D54 has been named Lookup.) We use FALSE in the VLOOKUP function because the ranks need not be in ascending order.

  • Having assigned the range name Drawn to our drawn cards (the range B3:B7), copying from J3 to J4:J15 the formula COUNTIF(drawn,I3) counts how many of each card are in our drawn hand. In cell J17, we determine whether we have three of a kind with the formula IF(AND(MAX(J3:J15)=3,COUNTIF(J3:J15,2)=0),1,0). This formula returns a 1 if, and only if, our hand has three of one kind and no pairs.

  • We now use a one-way data table to simulate 4000 poker hands. In cell J19, we recopy the results of cell J17 with the formula =J17. Next we select our table range (I19:J4019). After clicking What-If Analysis in the Data Tools group on the Data tab, and then clicking Data Table, we set up a one-way data table by selecting any blank cell as our Column Input Cell. After clicking OK, we have simulated 4000 poker hands. In cell G21, we record our estimated probability of three of a kind with the formula AVERAGE(J20: J4019). We estimate the chance of three of a kind at 1.9 percent. (Using basic probability theory, we can show that the true probability of drawing three of a kind is 2.1 percent.)

  • Before the 2003 Super Bowl, Oakland was favored by 3 points. What was the probability that Tampa Bay would beat Oakland?

  • Extensive study by my friend, Jeff Sagarin (check out his sports ratings at http://www.usatoday.com/sports/sagarin.htm), has shown that the number of points by which the favorite wins a college or professional football or basketball game follows a normal distribution with the mean equal to the bookies’ forecast, with a standard deviation of 16 points for professional football, 14 points for college football, 12 points for professional basketball, and 10 points for college basketball. Therefore, the number of points by which Oakland wins the Super Bowl (Oakland winning by a negative number of points means they lose) is normally distributed with a mean of 3 and a standard deviation of 16 points. Again, for Oakland to lose, they must win by 0 points or less.

  • This problem can be computed with the formula NORMDIST(0,3,16, True). (See Chapter 58, “The Normal Random Variable,” for a discussion of the NORMDIST function.) This function yields a 42.6 percent chance for Oakland to lose. As we know, Tampa Bay won the game, but this was not a totally unexpected outcome.

  • Going into the 2003 NCAA men’s basketball Final Four, what was the probability of each team winning the tournament?

  • Using a methodology similar to that described in Chapter 32, “Using Solver to Rate Sports Teams,” where we used the Excel Solver to rate sports teams, you can use the scores of previous games to develop ratings for each college basketball team. On the eve of the 2003 men’s Final Four, the ratings of the four teams were Syracuse, 91.03; Kansas, 92.76; Marquette, 89.01; and Texas, 90.66. Given this information, we can play out the Final Four several thousand times to estimate the chance that each team will win.

  • Our mean prediction for the number of points by which the home team wins equals favorite rating–underdog rating. In the Final Four, there is no home team, but if there was one, we would add 5 points to the home team’s rating. (In professional basketball, the home edge is 4 points, and in college and pro football, the home edge is three points.) Then we can use the NORMINV function to simulate the outcome of each game. (See Chapter 58, “The Normal Random Variable,” for a discussion of using the NORMINV function to simulate a normal random variable.)

  • We’ve calculated the likely outcome of the 2003 Final Four in the file Final4sim.xlsx, shown in Figure 63-3 on the next page. The semifinals pitted Kansas against Marquette and Syracuse against Texas.

    image from book
    Figure 63-3: Simulating the outcome of the NCAA 2003 Final Four

  • To begin, we enter each team’s name and rating in the cell range C4:D5 and C8:D9. In cell F4, we use the RAND() function to enter a random number for the Marquette–Kansas game, and in cell F8, we enter a random number for the Syracuse–Texas game. Our simulated outcome is always relative to the top team listed.

  • In cell E4, we determine the outcome of the Kansas–Marquette game (from the standpoint of Kansas) with the formula NORMINV(F4,D4–D5,10). Note that Kansas is favored by D4–D5 points. In cell E8, we determine the outcome of the Texas–Syracuse game (from the standpoint of Syracuse) with the formula NORMINV(F8,D8–D9,10). (Remember that the standard deviation for the winning margin of college basketball games is 10 points.)

  • In cells G5 and G6, we ensure that the winner of each semifinal game moves on to the finals. A semifinal outcome of greater than 0 causes the top-listed team to win; otherwise, the bottom-listed team wins. Thus, in cell G5, we enter the winner of the first game by using the formula IF(E4>0,"KU", "MARQ"). In cell G6, we enter the winner of the second game by using the formula IF(E8>0,"SYR","TEX").

  • In cell H5, we enter a random number that will be used to simulate the outcome of the championship game. Copying from I5 to I6 the formula VLOOKUP(G5,$C$4:$D$9, 2,FALSE) obtains the rating for each team in the championship game. Next, in cell J5, we compute the outcome of the championship game (from the reference point of the top-listed team in cell G5) with the formula NORMINV(H5,I5–I6,10). Finally, in cell K5, we determine the actual champion with the formula IF(J5>0,G5,G6).

  • Now we can use a one-way data table in the usual fashion to play out the Final Four 2000 times. Our simulated winners are in the cell range M12:M2011. Copying from K12 to K13:K15 the formula COUNTIF($M$12:$M$2011,J12)/2000 computes the predicted probability for each team winning: 39 percent for Kansas, 25 percent for Syracuse, 21 percent for Texas, and 15 percent for Marquette. These probabilities can be translated to odds using the following formula:

    image from book

  • For example, the odds against Kansas are 1.56 to 1:

    image from book

  • This means that if we placed $1 on Kansas to win and a bookmaker paid us $1.56 for a Kansas championship, the bet is fair. Of course, the bookie will lower these odds slightly to ensure that he makes money. (See Problem 6 at the end of this chapter for a related exercise.)

  • Our methodology can easily be extended to simulate the entire NCAA tournament. Just use IF statements to ensure that each winner advances, and use LOOKUP functions to find each team’s rating. See the file Ncaa2003.xlsx for a simulation of the 2003 tournament. We gave Syracuse a 4-percent chance to win at the start of tournament.

  • In this worksheet, I used comments to explain my work, which you can see in Figure 63-4 on the next page. Here is some background about using comments:

    • To insert a comment in a cell, select the cell, and on the Review tab, in the Comments group, click New Comment and then type your comment text. You will see a small red mark in the upper-right corner of any cell containing a comment.

    • To edit a comment, right-click the cell containing the comment, and click Edit Comment.

    • To make a comment always visible, right-click the cell, and click Show/Hide Comments. Clicking Hide Comment when the comment is displayed causes the comment to be hidden unless the pointer is in the cell containing the comment.

    • If you want to print your comments, click the Page Setup dialog box launcher (the small arrow in the bottom right corner of the group) on the Page Layout tab to display the Page Setup dialog box. In the Comments box on the Sheet tab, you can indicate whether you want comments printed as displayed on the sheet or at the end of the sheet.

    image from book
    Figure 63-4: Comments in a worksheet




Microsoft Press - Microsoft Office Excel 2007. Data Analysis and Business Modeling
MicrosoftВ® Office ExcelВ® 2007: Data Analysis and Business Modeling (Bpg -- Other)
ISBN: 0735623961
EAN: 2147483647
Year: 2007
Pages: 200

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net