Refuse and Reveal


One of the approaches to understanding a large amount of data is to characterize it using a few numbers. Statistics such as minimum, maximum, and the various kinds of averages tell you global properties. Sometimes they are enough to reveal information about individuals. This is why databases that give only statistical information are an issue for privacy advocates: Enough statistical questions can reveal personal data.

Consider a simple game between a questioner Quentin and a responder Rosalba. Quentin can ask only about global properties of a group of numbers, e.g., are they all whole numbers, are they distinct, and the statistics - mean, median, minimum, and maximum.

Rosalba always tells the truth. She may refuse to answer, however, if she knows the answer may divulge all the numbers. As we’ll see, the refusal to answer may itself divulge all the numbers. Sometimes, she will volunteer information just for the fun of it.

Warm Up

  • Rosalba: “I have five integers, all distinct.”

  • Quentin: “What is the minimum?”

  • Rosalba: “15.”

  • Quentin: “What is the maximum?”

  • Rosalba: “I won’t tell, because you would know everything.”

What are the numbers?

Solution to Warm-Up

Because the numbers are all distinct, the maximum would tell everything only if it were 19. Then the collection consists of 15, 16, 17, 18, and 19. Okay, this was easy, but the deductions will get more interesting.

Before we go on, though, let me remind you of the definition of the mean and median. The mean of a collection of numbers is their sum divided by the number in the collection. For example the mean of 20, 22, 22, 40, and 101 is 205/5 = 41. The median is the middle number in the sorted order, so 22 for this example. That is, the median is the middle in a sorted ordering of the values (our examples will always have an odd number of values).

  1. Rosalba: “I have five integers that may or may not be distinct.”

    Quentin: “What is the minimum?”

    Rosalba: “20.”

    Quentin: “Which of these would not allow me to infer all their values: number distinct, mean, maximum, or median?”

    Rosalba: “Only the median.”

    Quentin: “Great. I know the numbers.”

    What are they?

  2. Rosalba: “I have seven integers that may or may not be distinct.”

    Quentin: “What is the minimum?”

    Rosalba: “20.”

    Quentin: “Which of these are you willing to tell me (i.e., would not allow me to infer all their values): mean, median, and maximum?”

    Rosalba: “All of them.”

    Quentin: “Ok, what is the maximum?”

    Rosalba: “21.”

    Quentin: “I know which of mean and median you’re willing to tell me now.”

    Which? Why?

  3. Rosalba: “Can you find some situation in which I would be happier to tell you the mean rather than the median?”

    Quentin: “Could you give me a hint?”

    Rosalba: “In an example I can think of, there are three numbers, two of them distinct.” Give it a try.

  4. Rosalba: “Can you find some situation in which all of minimum, maximum, mean, and median are necessary and sufficient to find the identities of five numbers that are all integers?”

Rosalba: “So far we’ve been playing games with just a few numbers. I’ve given you hints and you’ve been able to infer them all. But just a handful of numbers are not interesting. Let’s try for more.

“Before we do that, let me define one new global property: the total distance to a point. Let’s say we have the five numbers 10, 15, 20, 30, 60. The total distance to a point - say, 22 - is the sum of (22-10), (22-15), (22-20), (30-22), and (60-22). Mathematically, the total distance to x from a set of numbers is the sum of the absolute values of the differences between each number and x.

  1. Rosalba: “Now we are ready. There are 17 numbers that are not all distinct. Their minimum is 30, mean is 34, and median is 35.”

    Quentin: “What is their total distance to 35?”

    Rosalba: “I won’t tell you, but the total distance to 35 is five less than the total distance to 38. Whoops! I shouldn’t have told you that.”

    Quentin laughing: “You’re right. Now I know the numbers.”

    What are they?

  2. How would your answer to this question change if there were 1701 numbers, but otherwise the same information as in the previous question?




Puzzles for Programmers and Pros
Puzzles for Programmers and Pros
ISBN: 0470121688
EAN: 2147483647
Year: 2007
Pages: 81
Authors: Dennis Shasha

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net