The Grand Locus / Life for statistical sciences

Drunk man walking

Lotteries fascinate the human mind. In the The Lottery in Babylon, Jorge Luis Borges describes a city where the lottery takes a progressively dominant part in people’s life, to the extent that every decision, even life and death, becomes subject to the lottery.

In this story, Borges brings us face to face with the discomfort that the concept of randomness creates in our mind. Paradoxes are like lighthouses, they indicate a dangerous reef, where the human mind can easily slip and fall into madness, but they also show us the way to greater understanding.

One of the oldest paradoxes of probability theory is the so called Saint Petersburg paradox, which has been teasing statisticians since 1713. Imagine I offered you to play the following game: if you toss ‘tails’, you gain $1, and as long as you toss ‘tails’, you double your gains. The first ‘heads’ ends the spree and determines how much you gain. So you could gain$0, $1,$2, $4,$8... with probability 1/2, 1/4, 1/8, 1/16, 1/32 etc. What is the fair price I can ask you to play the Saint Petersburg lottery?

Probability theory says that the fair price should be the expected gain. This way, you neither lose nor gain money on average. In this case, the expected gain is $0 +$0.25 + $0.25 +$0.25 + $0.25 + ... which adds up to inifinity! So you could accept to pay any price to play the Saint Petersburg lottery, since you know that your expected gain beats it. Now, take a moment to imagine the situation of the Saint Petersbourg lottery and let me ask you a simple question: would you play for$1 million?

My guess is that you would not. At least I wouldn’t. Intuitively, the reason is that you will be bankrupt before you make any profit. In other words, the timing of gains and loss is also an issue.

This is where the drunk man steps in. If you don’t know him, let me do the presentation. He has been walking randomly left or right in every textbook since the invention of random processes. Some say, he was walking even before statistics were invented! As the story goes, the drunk man gets out of a bar and is so drunk that he will take one step at a time, left or right at random. At some distance on the left is his house, and somewhere on the right, an ice-cold river. It is a classical problem to determine the probability that the drunk man will get home safe. Playing the lottery is like going on a drunk man walk: you hope you will hit the jackpot before you are bankrupt, but you’ll take random steps between the two.

The drunk man approach to playing the lottery is to pay if your chances of making profit are higher than your chances of going bankrupt. As the example of the Saint Petersburg paradox shows, the expected gain poorly reflects this probability. We can define a ‘breakpoint’ price, that will drive the behaviour of the player (to play or not). At the breakpoint price, the chances of losing everything are the same as the chances of doubling your fortune. So the breakpoint price is not the same for everyone: it depends on your capital. For example, if you have $100, this price is$2.18, and if you have $1000 it is$3.01.

It is easier to get that number by simulation than solving the general problem. Here are the bits of R code I assembled to get to that result. Please not that this is only a quick and dirty fix.

First we write a function to simulate the process.

drunk_man_walk <- function(price, lower, upper=0) {# Simulate steps of a Saint Petersburg drunk man walk.# Args#   price: the price to pay for one round of lottery.#   lower: the lower bound.#   upper: the upper bound.# Return#   TRUE if player hits upper bound first, FALSE otherwise.   n <- 1024;     # Steps to simulate at a time (speed issues).   fortune <- 0;  # Current fortune.   while (TRUE) {      # Simulate a uniform distribution, take log2 and      # then the ceiling value.      # This gives the length of the streak of 'tails'.      n_tails <- abs(ceiling(log2(runif(n))));      # Gain is 0 if no 'tail' at all,       # 2^(number of tails -1) otherwise.      gains <- (n_tails > 0) * 2^(n_tails - 1);       losses <- rep(price, n);      # Here are the ups and down of fortune.      path <- fortune + diffinv(gains - losses);      # Did we hit one of the boundaries?      if (min(path) <= lower || max(path) > upper) {         # Yes we did.         hit_lower_at <- min(which(path <= lower), n);         hit_upper_at <- min(which(path > upper), n)         # Return FALSE if we hit lower, first         # and TRUE if it was upper.         return (hit_lower_at > hit_upper_at);      }      # Otherwise keep walking (and increase n for speed).      fortune <- path[n];      n <- 2*n;   }}

Here I use a trick to determine how many ‘tails‘ are tossed in a row. The rest is addition and subtraction of gain and losses, and checking whether we hit the top or the bottom boundary first.

Next we write functions to find the breakpoint price.

breakpoint_price <- function(capital, precision) {# Compute the breakpoint price for a given capital.# Args#   capital: the initial fortune.#   precision: the numeric precision for estimation.   average_times_doubled_capital <- function(n, price) {      # Convenience function to compute the average      # number of times the player doubled their capital      # out of n tries for a given price.      made_profit <- 0;      for (i in 1:n) {         # Remember that 'drunk_man_walk()' returns 1 if         # player hits upper bound first.         made_profit <- made_profit +             drunk_man_walk(price, -capital, capital);      }      return (made_profit / n);   }   # Try a breakpoint price between 0 and 10.   down <- 0;   up <- 10;   # Now we get to the breakpoint price by bisection.   # The breakpoint price 'bkprice' is such that   #   #   average_times_doubled_capital(n, bkprice) = 0.5   #   # as n grows large.   while (TRUE) {      # Quickly get an upper bound for the breakpoint price.      # For speed, fix n = 100 only.      if (average_times_doubled_capital(100, up) > 0.5) {         up <- 2*up;      }      else {         break;      }   }   # Now up is greater than breakpoint price. Bisect   # until we get breakpoint price within 'precision'.   # Meanwhile n will grow larger with a limit of 5000.   n <- 100;   while (TRUE) {      average <- average_times_doubled_capital(n, (up+down)/2);      if (abs(average - 0.5) < precision) {         break;      }      if (average > 0.5) {         # Can pay more.         down <- (up + down) / 2;      }      else {         # Too expensive.         up <- (up + down) / 2;      }      # Increase n up to 5000 for more precision.      n <- min(2*n, 5000);   }   return ((up + down) / 2);}

With that code loaded in R, you can get the approximate breakpoint prices by calling:

breakpoint_price(100, 0.01)breakpoint_price(1000, 0.01)

Notice how the drunk man carried us a long way from the Saint Petersburg paradox. He took us from an immaterial infinite number issue to a tangible and sensible solution. En passant, we also saw that there are two types of Saint Petersburg lotteries. The first, that we can call ‘lose-lose’, is such that the player has more chances of being ruined than making profit. Still, the expected loss of the organizers is infinite so they would lose money by allowing a large number of people to play the lottery. The other kind, ‘win-lose’, is such that the player has more chances of making profit, but the organizers are still in the same situation.

Clearly, these situations don’t occur in practice for lack of a suicidal lottery organizers. However, similar situations can show up in poker or financial investment: you may turn down an easy win because you cannot pay for potential early loss.

Classical lotteries are typicaly ‘lose-win’. But harder to imagine is the ‘win-win’ case where players have higher chances of recovering their investment than going brankrupt, while organizers still have a positive expected gain... The closest to that situation I can think of is the capitalist banking system. By contracting a debt to the bank, customers always hit the jackpot before they are bankrupt, while the bank has positive expected gain through the interest rate.

Which by the way reminds me of that joke: A banker walks into a bar...