Data Science Interview Prep: Q10
Calculate Expectation Using the Law of Total Probability. (Category: Statistics)
There are 100 noodles in a bowl. At each step, you randomly select two noodle ends from the bowl and tie them together. What is the expectation on the number of loops formed?
Solution:
Let us first derive a generalized expression for the expectation on the number of loops formed. The question can be reworded as the expectation of the number of loops formed when two ends from a noodle is chosen.
There are two scenarios:
Choosing two ends of the same noodle.
Choosing two ends from a different noodle.
Let n be the number of noodles in a bowl.
Let S be the event that represents the end chosen from the same noodle.
Let ~S be the event that represents the end chosen from a different noodle.
Let the expected number of loops formed from n noodles be E(n).
We are interested in E(n) which can be obtained using the law of total expectation as shown below.
E(n|S) is the expectation of the number of loops formed given that you chose the two ends from the same noodle.
If you chose the two ends from the same noodle then one loop is added to the bowl, leaving (n-1) noodles. Hence we obtain the following expression,
If you choose two ends from a different noodle then no loop is added to the bowl, leaving (n-1) noodles. Hence we obtain the following expression,
For n noodles, there are 2n ends in total. Since there are n noodles, there are n favorable ways to pick two ends that belong to the same noodle.
P(S) = Probability of two ends chosen from the same noodle
= Number of ways to choose two ends from the same noodle / Total number of ways of choosing two ends.
Given that we have n noodles,
The number of ways to choose two ends from the same noodle = n.
Total number of ways of choosing two ends = 2nC2 = 2n! / 2!(2n-2)! = n(2n-1)
Hence we obtain the following expression,
P(~S) = Probability of two ends chosen from different noodles = [1 - P(S)]
Now we substitute equations 3, 4, 5 and 6 into Equation (2), we obtain the following:
In this problem, there are 100 noodles in a bowl, (i.e.) n = 100.
We know from Equation (1) that E(1) = 1.
Using Equation (7), we can deduce the following:
E(2) = E(1) + 1/(2*2 - 1) = 1 + 1/3
E(3) = E(2) + 1/(2*3 - 1) = 1 + 1/3 + 1/5
.
E(100) = 1 + 1/3 + 1/5 + 1/7 + ….. + 1/199
Hence, the expectation on the number of loops formed using 100 noodles is:
This should be enough from an interview stand point.
To stand out as a top candidate in the interview, you can simplify the expression above using the harmonic series approximation (a concept found in number theory and asymptotic analysis).
A harmonic series is the sum of reciprocal of natural numbers. A generalized expression of the same can be expressed as:
Equation (9) can be approximated using the expression shown below. Proof of the following expression is beyond the scope of this newsletter.
From Equation (8), we obtain the following:
E(100) = 1 + 1/3 + 1/5 + 1/7 + ….. + 1/199
Let us simplify E(100) using Equation (9) and Equation (10).
E(100) can be rewritten as follows:
E(100) = (1 + 1/2 + 1/3 + ….+ 1/199) - (1/2 + 1/4 + 1/6 + …. + 1/198)
= (1 + 1/2 + 1/3 + ….+ 1/199) - 1/2 * (1 + 1/2 + 1/3 + …. + 1/99)
= H199 - ½ H99
= ln(199) + 0.577 - 1/2 * [ln(99) + 0.577]
= 3.284 ≈ 3.3
Hence, we get E(100) ≈ 3.3, (i.e.) the expectation on the number of loops formed when there are 100 noodles in a bowl is approximately 3.3.
We hope you found the article both enlightening and valuable! For more insightful content delivered straight to your inbox, simply enter your email address below and hit the subscribe button. Stay tuned for future updates!
Reference:
Connecting Noodles Problem: https://math.stackexchange.com/questions/1417274/connecting-noodles-probability-question











