Hack 54. Check Your iPod's Honesty

Find out how random your iPod's "random" shuffle really is.

Personalized song ratings in Apple's iTunes, the software that allows you to play songs on your iPod, let you quickly find your favorites and help the Party Shuffle feature play more of what you like most. The algorithm iTunes uses to pick what comes next in the playlist is meant to select randomly from your favorites. Is it really random, though?

After hearing one artist played over and over during a shuffled play of your entire music library in iTunes, you might think your player has a preference of its own. Apple, though, claims that the iTunes shuffle algorithm is completely random. The shuffle algorithm chooses songs without replacement. In other words, much like going through a shuffled deck of cards, you will hear each song only once until you have heard them all (or until you have stopped the player or selected a different playlist).

iTunes's Party Shuffle is a different matter. Its algorithm selects songs with replacement, meaning the entire library is reshuffled after each song is played (like reshuffling a deck of cards each time a card is drawn). The "Play higher rated songs more often" option does exactly what it says, but how much preference is given to higher-rated songs?
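The difference between the two selection styles can be sketched in a few lines of Python. This is only an illustration of sampling without and with replacement, not Apple's actual code; the function names and the six-song library are made up for the example.

```python
import random

def shuffle_without_replacement(library, rng):
    """Return every song exactly once, like dealing out a shuffled deck."""
    order = list(library)
    rng.shuffle(order)
    return order

def party_shuffle(library, n_plays, rng):
    """Pick each play independently from the whole library (with replacement)."""
    return [rng.choice(library) for _ in range(n_plays)]

rng = random.Random(54)  # fixed seed so the run is repeatable
library = [f"song {i}" for i in range(1, 7)]  # hypothetical six-song library

deck = shuffle_without_replacement(library, rng)
plays = party_shuffle(library, len(library), rng)

print("without replacement:", deck)   # each song appears exactly once
print("with replacement:   ", plays)  # repeats are possible
```

With replacement, nothing stops the same song from being drawn twice in six plays, which is exactly why Party Shuffle can repeat a song while the regular shuffle cannot.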

This hack originally appeared as an article on the OmniNerd web site at http://www.omninerd.com/.


Assessing iTunes's Selection Procedures

I wanted to test two different song selection options: Party Shuffle and "Play higher rated songs more often." I created a short playlist of six songs: one for each star rating (one through five) plus one song left unrated. The songs were from the same genre and artist, and each was trimmed to play for only one second.

I conducted my tests on iTunes 5. iTunes 6 has added a Smart Shuffle feature, which may decrease the chances of hearing songs from the same artist or album consecutively, but I haven't tested it yet.


After resetting the play count to zero, I hit Play and left my desk for the weekend. I ran the same songs twice: once selecting random (Party Shuffle) and once selecting both random and the "Play higher rated songs more often" option. Table 5-8 shows the play counts, as of Monday morning.

Table 5-8. Song selection distribution

               Random selection                     Based on rating
Song rating    Times played   Percentage of total   Times played   Percentage of total
None           9,105          16.70%                2,052          3.9%
1              9,055          16.60%                6,238          11.8%
2              9,090          16.67%                8,125          15.4%
3              9,114          16.71%                10,020         18.9%
4              9,027          16.55%                12,158         23.0%
5              9,146          16.77%                14,293         27.0%
Total          54,537         100%                  52,886         100%


The play counts in the random trial were very close to one another, as can be expected with a random selection. For the trial based on song ratings (the rating-biased selection), the preference appears to rise linearly from about 12 percent to 27 percent across the rated songs. Moving from the five-star rating downward, the share of plays declines by around 4 percentage points with each step down in rating, but the drop doubles from one star to unrated, with a fall of about 8 percentage points. While one star might seem like the lowest rating, no rating proved to be the black sheep of the lot.
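If you would rather check those two claims with numbers than by eye, here is a minimal sketch using the play counts from the table. The chi-square goodness-of-fit test is my own choice of check, not something the original test run used, and the 11.07 cutoff is the standard table value for 5 degrees of freedom at the 0.05 level.

```python
# Play counts from the weekend run, ordered: unrated, 1, 2, 3, 4, 5 stars.
random_counts = [9105, 9055, 9090, 9114, 9027, 9146]
biased_counts = [2052, 6238, 8125, 10020, 12158, 14293]

def chi_square_uniform(counts):
    """Chi-square statistic against the hypothesis that all counts are equal."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

chi2 = chi_square_uniform(random_counts)
CRITICAL_DF5_ALPHA_05 = 11.07  # chi-square table value, 5 df, alpha = 0.05

# Share of plays per rating in the biased run, and the jump between neighbors.
total_biased = sum(biased_counts)
shares = [100 * c / total_biased for c in biased_counts]
steps = [later - earlier for earlier, later in zip(shares, shares[1:])]

print(f"chi-square for the random run: {chi2:.2f} (cutoff {CRITICAL_DF5_ALPHA_05})")
print("percentage-point steps (unrated->1, 1->2, ..., 4->5):",
      [round(s, 1) for s in steps])
```

A statistic far below the cutoff means the random run shows no detectable departure from equal play counts, and the list of steps makes the roughly equal jumps between star ratings (and the double-sized jump from unrated to one star) easy to see.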

Your iPod assumes that if you haven't provided a rating for a song, you must want to hear it even less frequently than those songs to which you have assigned your lowest rating. This is a bit like choosing a movie with bad reviews over a movie that hasn't been reviewed.


Figure 5-2 shows the effects of different song selection options. You can judge the randomness of the true random selection option by seeing whether the "Random" bars in the figure all seem to be the same height. The linear nature of the "Rating Biased" bars can be judged by checking whether the jumps in height are equal as one moves from a rating of 1 to a rating of 5.

Figure 5-2. Patterns of song selection


Calculating the Statistics of the Selection Process

Changing the number of songs within each rating changes the probabilities for each song's selection. With multiple songs of each rating, the chance of a song with rating r coming up next in the ratings-biased Party Shuffle can be calculated using this expression:

$$\Pr(\text{next song has rating } r) = \frac{x_r P_r}{\sum_{i \in \{\text{none},1,\dots,5\}} x_i P_i}$$

Subscripts in this expression indicate the song rating. The chance of a song being chosen is based on x (number of songs with each rating) and P (the proportional weight assigned by the iTunes algorithm for each rating).

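With iTunes's preference probabilities for each rating determined from the weekend-long sampling run, here's the resulting expression:

$$\Pr(\text{next song has rating } r) = \frac{x_r P_r}{0.039\,x_{\text{none}} + 0.118\,x_1 + 0.154\,x_2 + 0.189\,x_3 + 0.230\,x_4 + 0.270\,x_5}$$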

Although the higher-rated songs are given preference, you will not necessarily hear more five-star songs than songs of any other rating. Let's assume most people follow a normal distribution for their ratings [Hack #23], with the three-star rating being the most common. Table 5-9 displays a hypothetical iTunes library with this bell-shaped curve for the song count at each rating.

Table 5-9. Typical song rating distribution

Song rating    Number of songs
None           72
1              321
2              1,527
3              1,812
4              507
5              95


If I run these hypothetical numbers through our frequency equations, I get a distribution that looks like Figure 5-3.

Figure 5-3. Probability distribution of song selection


As you can see in Figure 5-3, the chance of a song with a particular rating coming up next in the playlist is largely determined by the song count within the rating. The iTunes preference for higher-rated songs and dislike for lower-rated songs only slightly raises or lowers the probability set by the song count.
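To reproduce the numbers behind that kind of chart, here is a minimal sketch that multiplies each rating's song count by its selection weight and normalizes. The weights are the rounded percentages from the weekend test and the counts are the hypothetical library above; the dictionary and function names are mine, and rating 0 stands in for "unrated."

```python
# Rounded selection weights from the weekend test; 0 means "unrated".
WEIGHTS = {0: 0.039, 1: 0.118, 2: 0.154, 3: 0.189, 4: 0.230, 5: 0.270}

# Hypothetical library: number of songs at each rating.
SONG_COUNTS = {0: 72, 1: 321, 2: 1527, 3: 1812, 4: 507, 5: 95}

def rating_probabilities(counts, weights):
    """Chance the next song has each rating: x_r * P_r / sum(x_i * P_i)."""
    total = sum(counts[r] * weights[r] for r in counts)
    return {r: counts[r] * weights[r] / total for r in counts}

probs = rating_probabilities(SONG_COUNTS, WEIGHTS)
library_size = sum(SONG_COUNTS.values())

for rating in sorted(probs):
    label = "unrated" if rating == 0 else f"{rating}-star"
    share = SONG_COUNTS[rating] / library_size
    print(f"{label:>8}: {probs[rating]:6.2%} chance next, {share:6.2%} of library")
```

Comparing the two columns shows the point of the paragraph above: three-star songs still dominate because there are so many of them, while the rating weights only nudge each group's share up or down.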

These chances of hearing a song with a certain rating can be applied to find the chances of hearing a particular song. If we remove the song count from the numerator in the song selection expression, we can calculate the chance of a certain specific song, not just the rating, coming up next:

$$\Pr(\text{a specific song with rating } r \text{ plays next}) = \frac{P_r}{\sum_{i \in \{\text{none},1,\dots,5\}} x_i P_i}$$
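As a rough worked example of the per-song chance, the sketch below applies the same rounded weights and hypothetical library to a single five-star song and a single unrated song. The names are again my own, and the "uniform" figure is simply what a plain random pick from the same library would give.

```python
# Rounded selection weights from the weekend test; 0 means "unrated".
WEIGHTS = {0: 0.039, 1: 0.118, 2: 0.154, 3: 0.189, 4: 0.230, 5: 0.270}

# Hypothetical library: number of songs at each rating.
SONG_COUNTS = {0: 72, 1: 321, 2: 1527, 3: 1812, 4: 507, 5: 95}

def song_probability(rating, counts, weights):
    """Chance one specific song of this rating plays next: P_r / sum(x_i * P_i)."""
    total = sum(counts[r] * weights[r] for r in counts)
    return weights[rating] / total

five_star = song_probability(5, SONG_COUNTS, WEIGHTS)
unrated = song_probability(0, SONG_COUNTS, WEIGHTS)
uniform = 1 / sum(SONG_COUNTS.values())  # plain random pick, for comparison

print(f"one five-star song: about 1 in {1 / five_star:,.0f}")
print(f"one unrated song:   about 1 in {1 / unrated:,.0f}")
print(f"any song, unbiased: 1 in {1 / uniform:,.0f}")
```

Because the denominator is the same for every song, the ratio between any two songs' chances is just the ratio of their rating weights.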

Explaining Statistical Surprises

About a month after running these tests, I noticed my iTunes Party Shuffle at work played the same song two times in a row. This was the first time I had noticed a consecutive repeat, and I checked the playlist. Not only did I find Nirvana's "Territorial Pissings" listed twice in a row, but A.F.I.'s "Death of Seasons" was listed twice in a row three tracks later.

I use the "Play higher rated songs more often" option, but these were each middle-of-the-road 3-star songs, and my song library has nearly 4,000 songs. The odds might seem outrageous at first, but you have to realize just how many songs you hear throughout a workday. If I average 10 hours at work each day and average a 3 1/2-minute song duration, odds say I should hear a consecutive repeat in less than a month.
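The arithmetic behind that estimate can be sketched as follows. To keep it simple, I assume every one of 4,000 songs is equally likely on each pick; the rating bias makes some songs more likely than others, which can only raise the chance of a back-to-back repeat, so treat these figures as a conservative approximation.

```python
LIBRARY_SIZE = 4000      # "nearly 4,000 songs", rounded for the estimate
HOURS_PER_DAY = 10
MINUTES_PER_SONG = 3.5

songs_per_day = HOURS_PER_DAY * 60 / MINUTES_PER_SONG

# Assumption: uniform picks with replacement, so the next song matches the
# one that just played with probability 1/N.
p_repeat = 1 / LIBRARY_SIZE

# On average it takes 1/p picks to see the first back-to-back repeat.
expected_days = (1 / p_repeat) / songs_per_day

def chance_of_repeat_within(days):
    """Chance of at least one consecutive repeat in the given number of workdays."""
    plays = days * songs_per_day
    return 1 - (1 - p_repeat) ** plays

print(f"songs per workday: {songs_per_day:.0f}")
print(f"expected workdays until a repeat: {expected_days:.1f}")
print(f"chance within 22 workdays: {chance_of_repeat_within(22):.0%}")
```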

Many claim to still see patterns as iTunes rambles through their music collection, but the majority of these patterns are simply multiple songs from the same artist. Think of it this way: if you have 2,000 songs and 40 of them are from the same artist, there is always about a 2 percent chance of hearing that artist next with random play. Right after one of the artist's songs finishes, odds show a 50 percent chance that a song by the same artist will play again within the next 35 songs and a 64 percent chance that one will play again within the next 50 songs. This can be calculated following this equation, where n is the number of songs played:

$$\Pr(\text{same artist again within } n \text{ songs}) = 1 - \left(1 - \frac{40}{2000}\right)^{n} = 1 - 0.98^{n}$$

As we have seen in other hacks, a low likelihood event (such as our 2 percent chance of repeating an artist) becomes a highly likely event after just a few opportunities [Hack #46].
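The artist-repeat figures are easy to check for yourself. This short sketch assumes each pick is independent with a fixed 2 percent chance of landing on the artist (40 of 2,000 songs, chosen with replacement); the function name is just for illustration.

```python
import math

def chance_within(n_songs, p_next):
    """Chance an event with per-song probability p_next happens within n_songs."""
    return 1 - (1 - p_next) ** n_songs

p_artist = 40 / 2000  # 2 percent chance the next song is by this artist

# Smallest number of songs that pushes the chance to 50 percent or more.
songs_for_even_odds = math.ceil(math.log(0.5) / math.log(1 - p_artist))

print(f"within 35 songs: {chance_within(35, p_artist):.0%}")
print(f"within 50 songs: {chance_within(50, p_artist):.0%}")
print(f"songs needed for even odds: {songs_for_even_odds}")
```

Try other values for the library size and artist song count to see how quickly a "rare" event becomes the expected one.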

It's simply the mind's tendency to find a pattern that makes you think iTunes has a preference.

See Also

Additional technical information about iPods and shuffling can be found at these sources:

  • Levy, Steven. "Does Your iPod Play Favorites." January 31, 2005. http://msnbc.msn.com/id/6854309/site/newsweek/.

  • Hofferth, Jerrod. "Using Party Shuffle in iTunes." August 22, 2004. http://ipodlounge.com/index.php/articles/comments/using-party-shuffle-in-itunes/.

Brian Hansen




Statistics Hacks
Statistics Hacks: Tips & Tools for Measuring the World and Beating the Odds
ISBN: 0596101643
Year: 2004
Pages: 114
Authors: Bruce Frey
