[pdf] Mermin, N.D., "Bringing home the atomic world: Quantum mysteries for anybody", Am. J. Phys. 49(10), pp. 940-943, (1981).
Bell inequalities are conditions obeyed by random variables with classical probability distributions, and thus by any (local) hidden variable theory. Quantum mechanics correctly predicts that physical systems may violate these conditions. However there are weaker versions of the Bell inequalities, the so-called Tsirelson bound, that even quantum mechanical systems must obey.
After reading Mermin's beautiful paper, I decided to work out a few extra details, including the Tsirelson inequality for the class of systems he describes. It's totally trivial but I couldn't find it written down explicitly anywhere so I thought I would write about it here.
You should be able to think of a strategy for creating boxes for which $p_{ij} = \frac{1}{3}$ (in the limit) when $i \ne j$. However the rule for winning the game is that you must come up with a strategy for which $p_{ij} \lt \frac{1}{3}$ when $i \ne j$.
Would you play this game?
Now encode the colour red as $-1$ and green as $+1$. This means that the product of two of these random variables is $+1$ if the colours agree and $-1$ otherwise, and thus taking expectations: $$ \mathbb{E}(A_i B_j) = p_{ij}\cdot 1 + (1 - p_{ij})\cdot (-1) = 2p_{ij} - 1. $$
Finally consider the random variable: $$ \begin{array}{crccccc} X = & - A_1 B_1 & + & A_1 B_2 & + & A_1 B_3 & +\\ & A_2 B_1 & - & A_2 B_2 & + & A_2 B_3 & +\\ & A_3 B_1 & + & A_3 B_2 & - & A_3 B_3 &\\ \end{array} $$ Now calculate: $$ \mathbb{E}(X) = -9 + 2\sum_{i \ne j}p_{ij}. $$ Furthermore, by condition (ii) of the game, we must have $A_i = B_i$ and thus $p_{ij} = p_{ji}$ and so we can simplify the above to: $$ \mathbb{E}(X) = -9 + 4(p_{12} + p_{13} + p_{23}). $$ However from the definition of $X$ we have: $$ |X| \le |-B_1 + B_2 + B_3| + |B_1 - B_2 + B_3| + |B_1 + B_2 - B_3| \le 5. $$ In particular $X \ge -5$ and thus: $$ \mathbb{E}(X) \ge -5. $$
Putting everything together we have $-9 + 4(p_{12} + p_{13} + p_{23}) \ge -5$ and thus finally: $$ p_{12} + p_{13} + p_{23} \ge 1. $$ From this we must have $p_{ij} \ge \frac{1}{3}$ for some $i \ne j$ and so if you are playing the game in a classical universe, you cannot win!
(Note that the above does not assume any physical non-determinism, the measure space is just a mathematical tool to free us from having to make assumptions about the details of the physical system. I have sketched an alternate proof that avoids using this language in the appendix.)
The details of the quantum system are irrelevant for everything else I say but for completeness I'll give a brief sketch of the setup that Mermin describes.
The boxes can be built by having each box contain one of a pair of spin-$\frac{1}{2}$ particles entangled in a so-called singlet state. The selector switch picks an angle from $\{0, \pm 2\pi/3\}$ at which to measure a component of the particle's spin, the button carries out the measurement, and the light indicates the result. An easy calculation with Pauli matrices reveals that if the angle difference between Alice and Bob's selectors is $\theta$ then the probability that their results disagree is $\cos^2(\theta/2)$. By having opposite colour-conventions on the two boxes, we get the required behaviour.
Thus, quantum mechancially we still have $A_i, B_i$ but they are now operators. Even though these operators do not all commute (e.g., $A_1 A_2 \ne A_2 A_1$) we do still have:
As before, set: $$ \begin{array}{crccccc} X = & - A_1 B_1 & + & A_1 B_2 & + & A_1 B_3 & +\\ & A_2 B_1 & - & A_2 B_2 & + & A_2 B_3 & +\\ & A_3 B_1 & + & A_3 B_2 & - & A_3 B_3, &\\ \end{array} $$ but now also define: $$ \begin{align} P &= \frac{1}{2}(-A_1 + A_2 + A_3) + B_1,\\ Q &= \frac{1}{2}( A_1 - A_2 + A_3) + B_2,\\ R &= \frac{1}{2}( A_1 + A_2 - A_3) + B_3,\\ S &= \frac{1}{2}( A_1 + A_2 + A_3). \end{align} $$ Then calculate: $$ \begin{align} P^2 + Q^2 + R^2 + S^2 &= X + A_1^2 + A_2^2 + A_3^2 + B_1^2 + B_2^2 + B_3^2,\\ &= X + 6I. \end{align} $$ Since $P$ is self-adjoint, $\mathbb{E}(P^2) \ge 0$ and likewise for $Q, R, S$. We thus obtain: $$ \mathbb{E}(X) \ge -6. $$ This is the Tsirelson bound for this system. Note that it is weaker than what we obtained when using the stronger classical assumptions since it has $-6$ instead of $-5$.
We can calculate $\mathbb{E}(X)$ for Mermin's setup. This has $p_{ij} = \frac{1}{4}$ for all $i \ne j$ and $p_{ii} = 1$, and thus $\mathbb{E}(A_iB_j) = -\frac{1}{2}$ for $i \ne j$ and $\mathbb{E}(A_iB_i) = 1$, so that: $$ \mathbb{E}(X) = -3 + 6(-\frac{1}{2}) = -6. $$ We thus see that the Tsirelson bound is tight. Equivalently, we get: $$ p_{12} + p_{13} + p_{23} \ge \frac{3}{4}, $$ and so we see that Mermin's system is extremal.
Lemma 1
Proof (sketch) Construct a new matrix $\hat A$ of dimensions $l \times {m\choose 2}$ whose rows correspond to those of $A$ and whose columns correspond to unordered pairs of distinct columns of $A$. The values in $\hat A$ are 0 or 1 according to whether the corresponding pair of values in $A$ match.
Let $S$ be the number of 1s in $\hat A$. By the pigeon hole principle, since $k < m$ there is at least one repeated value in any row of $A$ and thus at least one 1 in each row of $\hat A$. Counting $S$ row-wise, we thus have: $$ l \le S. $$ On the other hand, counting $S$ column-wise gives: $$ S = \sum_{i \ne j}c_{ij} \le {m\choose 2}\max_{i \ne j}c_{ij}. $$ These two inequalities combine to the required result. $\blacksquare$
Note that when $m$ is significantly larger than $k$, the inequality of lemma 1 is far from tight (because the row-wise count is loose). Nevertheless, the above is all we need.
Lemma 2
Proof (sketch) There is a straightforward proof by contradiction based on lemma 1. $\blacksquare$
Lemma 3
Proof (sketch) Pass to the subsequence of $B$ where all values in the top row match the corresponding values in the bottom row. Since $p_{ii} = 1$, the corresponding limits for this subsequence exist and take the same values as those for $B$. Now apply lemma 2 to the subsequence, viewed as a sequence of $m$-tuples. $\blacksquare$