2022 FIFA World Cup – Can You Separate The World Cup Fans?

A certain hotel in Qatar is hosting 11 American fans and seven Dutch fans. Since no alcohol is available inside the stadiums, the fans spend the afternoon at the hotel bar before shuttle buses will take them to a match. Then, they haphazardly write their room numbers on a big board by the concierge desk.

To avoid any rowdiness between rival fans, shuttle bus drivers have been instructed to ferry American and Dutch fans separately. To ensure this, a shuttle pulls up in front of the hotel, and its driver calls out room numbers from the board, one by one at random. As long as they support the same team, each fan climbs aboard the bus and their room number is erased. Once the driver calls out the room number of a fan for the second team, the shuttle leaves with only the fans of the single team aboard. The next shuttle then pulls up and repeats the process.

What is the probability that the last shuttle ferries American fans?

Extra credit: On average, what is the expected number of shuttle buses needed to ferry all 18 fans?

We need to make a decision about how many fans can stay in a room. The question text implies there is one fan per room, and this is what I will address. If more than one fan can stay in a room (fans choosing to attend with a partner, family, and/or friends and booking together to save money, rationally), then we would have to decide how many people *could* stay in a room, and the probability of a room containing some number of fans.

Every shuttle leaves with a homogeneous group of fans, which means that it could leave with all \(11\) American or all \(7\) Dutch fans or just one fan. The only room that matters is which room is listed last on the list. If the rooms are randomly ordered and called, then we just need the last person that is called to be American. Since there are \(18\) fans, \(11\) of which are American, \(\mathbf{\frac{11}{18}}\) **of the time, the last shuttle will be an American shuttle.**

First, note that the order that the Americans are called in won’t affect our calculation of the average – if we take a given order (say, \(AAAAAAAAAAADDDDDDD\)) we can reorder the Americans within that ordering and it won’t affect the number of runs there are in the overall order (in this case, two). Similarly, we can reorder the Dutch. So, instead of there being \(18! = 6,402,373,705,728,000\) orderings to contend with, we only need to deal with \(\frac{18!}{11!\;7!} = 31,824\) orders of American and Dutch fans. Here, I represent Americans as “\(1\)’s” and the Dutch as “\(0\)’s”.

```
library(arrangements)
shuttlers = arrangements::permutations(k = 18, freq = c(7, 11)) - 1
nrow(shuttlers) ## good -- we've got the 31,824 unique orders of interest
```

`[1] 31824`

Three example orders:

```
shuttlers |> head(3) |> as.data.frame() |> kable(col.names = NULL) |>
kableExtra::kable_styling(position = "center")
```

0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

Now, a quick little function to get the number of runs (shuttles needed) for a given ordering (row):

```
fun.runs = function(vector){
## base R function rle (Run Length Encoding) does the heavy lifting
return(rle(vector)[1]$lengths |> length())
}
## For the three example orderings above, we have 2, 4, and 4 shuttles required
apply(X = shuttlers[1:3, ], MARGIN = 1, FUN = fun.runs)
```

`[1] 2 4 4`

Now, we just need to apply the function to all rows and calculate the mean:

**So, we would need an average of** \(\mathbf{9 \frac{5}{9} = 9.\bar5}\) **shuttles to complete this job. **

Fun color hex code facts I just learned (and used):

- Dutch “Oranje” is #F36C21
- US “red” is #BF0A30
- US “blue” is #002868

The solution on fivethirtyeight suggests that if a fan of another team is called up, they don’t automatically get the next shuttle, which seems like an arbitrary rule and frankly unfair from a queuing perspective. Imagine getting called to come up next at the DMV, being turned away, and then returning back to the seating area for a random chance at another chance to be served. 🤦