Balls and Bins

Lemma

[1] Consider a process that throws balls uniformly at random into b bins, and let C be a subset of these bins. If the process throws q \leq b \log|C| balls, then the probability that each bin in C has at least one ball is at most \frac{1}{\exp(\gamma \cdot ((1 - \frac{q}{b \cdot \log|C|}) \cdot \log|C|)^2)} if |C| \geq 2, where \gamma is some constant strictly less than 1. If |C| = 1, then the probability is at most 1 - (1/4)^{q/b}.

Comment: coupon-collector analysis + Chernoff bound
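
As a sanity check on this lemma, here is a small Monte Carlo sketch (our own, not from [1]); the function name and parameters are illustrative, and by symmetry we take C to be the first |C| bins:

    import random

    def prob_all_covered(b, c_size, q, trials=100_000):
        # Throw q balls uniformly at random into b bins, with C = bins
        # 0..c_size-1, and return the empirical probability that every
        # bin in C receives at least one ball.
        hits = 0
        for _ in range(trials):
            covered = {ball for ball in (random.randrange(b) for _ in range(q))
                       if ball < c_size}
            if len(covered) == c_size:
                hits += 1
        return hits / trials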

Lemma

[1] Consider a process that throws t balls into b bins uniformly at random. If t \leq b/e, then the probability that there are at most t/2 occupied bins is at most 2^{-t/2}.
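
A matching check for this lemma (again our own sketch, not from [1]):

    import random

    def prob_few_occupied(b, t, trials=100_000):
        # Empirical probability that at most t/2 distinct bins are
        # occupied after throwing t balls uniformly at random into
        # b bins; the lemma bounds this by 2^{-t/2} whenever t <= b/e.
        return sum(len({random.randrange(b) for _ in range(t)}) <= t / 2
                   for _ in range(trials)) / trials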

Lemma

[1] Consider a process that throws balls uniformly at random into b bins, and let C be a subset of these bins. If the process throws q balls, then the probability that at least \theta \cdot |C| of the bins in C have at least one ball is at most \frac{1}{\exp(\frac{\theta \cdot |C|}{6})} if q \leq \theta \cdot b/2; and at most \frac{1}{\exp(\frac{\theta \cdot |C|}{6} \cdot (\frac{\theta \cdot b}{q} - 1)^2)} if \theta \cdot b/2 < q < \theta \cdot b.
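
The same simulation pattern adapts to this lemma; only the event being counted changes (our own sketch):

    import random

    def prob_theta_covered(b, c_size, q, theta, trials=100_000):
        # Empirical probability that at least theta * |C| bins in C
        # receive at least one ball, with C = bins 0..c_size-1.
        hits = 0
        for _ in range(trials):
            covered = {ball for ball in (random.randrange(b) for _ in range(q))
                       if ball < c_size}
            if len(covered) >= theta * c_size:
                hits += 1
        return hits / trials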

Reference

[1] Co-Location-Resistant Clouds, by Yossi Azar et al., CCSW 2014.

Online Hiring Problem

Problem

There are n candidates, and we do not want to interview all of them to find the best one. We also do not wish to hire and fire repeatedly as we find better and better candidates.

Instead, we are willing to settle for a candidate who is close to the best, in exchange for hiring exactly once.

For each interview we must either immediately offer the position to the applicant or immediately reject the applicant.

What is the trade-off between minimizing the amount of interviewing and maximizing the quality of the candidate hired?

Code
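
The code itself is missing from these notes; below is a minimal Python sketch of the strategy analyzed next (reject the first k candidates outright, then hire the first one who beats all of them). It uses 0-based indexing, as the analysis notes; the function name and score representation are our own.

    def online_hire(scores, k):
        # Interview candidates in order (0-indexed). Reject the first k,
        # remembering the best score among them; then hire the first
        # later candidate who beats that score. If nobody does, we are
        # forced to hire the last candidate.
        best_seen = max(scores[:k], default=float("-inf"))
        for i in range(k, len(scores)):
            if scores[i] > best_seen:
                return i  # hire immediately
        return len(scores) - 1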

Analysis

In the following, we determine the value of k that maximizes the probability that we hire the best candidate. In the analysis, we assume indices start at 1 (rather than 0, as in the code above).

Let B_i be the event that the best candidate is the i-th candidate.
Let O_i be the event that none of the candidates in positions k+1 through i-1 is chosen.
Let S_i be the event that we hire the best candidate and the best candidate is in the i-th position.

We have Pr(S_i) = Pr(B_i \cap O_i) = Pr(B_i) \cdot Pr(O_i), since B_i and O_i are independent events. Clearly Pr(B_i) = \frac{1}{n}. For O_i, observe that no candidate in positions k+1 through i-1 is chosen exactly when the best of the first i-1 candidates falls within the first k positions, so Pr(O_i) = \frac{k}{i-1}.

Let S = S_{k+1} \cup \cdots \cup S_n be the event that we hire the best candidate. Since the S_i are disjoint,

Pr(S) = \sum^n_{i=k+1} Pr(S_i)
= \sum^n_{i=k+1} \frac{1}{n} \cdot \frac{k}{i-1}
= \frac{k}{n} \sum^n_{i=k+1} \frac{1}{i-1}
= \frac{k}{n} \sum^{n-1}_{i=k} \frac{1}{i}
\geq \frac{k}{n} \cdot (\ln n - \ln k),

where the last step lower-bounds the sum by \int^n_k \frac{dx}{x}.

Differentiating the lower bound \frac{k}{n} \cdot (\ln n - \ln k) with respect to k gives \frac{1}{n} \cdot (\ln n - \ln k - 1), which equals 0 when \ln k = \ln n - 1, i.e., when k = n/e. Substituting back, the probability of hiring the best candidate is at least \frac{1}{e} \cdot (\ln n - \ln(n/e)) = \frac{1}{e}.
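
As an empirical sanity check (our own experiment, reusing online_hire from the Code section above), simulating random candidate orders with k = round(n/e) yields a success rate close to 1/e \approx 0.368:

    import math, random

    def success_rate(n, trials=100_000):
        # Estimate the probability that the cutoff strategy with
        # k = round(n/e) hires the best of n randomly ordered candidates.
        k = round(n / math.e)
        wins = 0
        for _ in range(trials):
            scores = random.sample(range(n), n)  # random permutation of ranks
            hired = online_hire(scores, k)
            wins += scores[hired] == n - 1       # did we hire the best?
        return wins / trials

    # success_rate(100) typically returns about 0.37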

Reference

[1] Introduction to Algorithms, by Cormen, Leiserson, Rivest, and Stein (CLRS).