Gemp Shuffler Investigation

Posted: Sun Apr 05, 2026 7:47 pm
by ketura

If you've played on Gemp for any length of time, you've probably heard someone say — or said yourself — that the shuffler feels off. Maybe you've experienced it firsthand: you mulligan and get back half the same cards. Or you keep drawing the same clump of companions every game. Or your deck just feels like it isn't really shuffling.

This post presents the results of an investigation into whether the shuffler actually has a problem.


What People Report

The complaints generally fall into a few buckets:

  • "I mulligan and get the same cards back." You pitch your opening hand, shuffle up, draw again — and three of the same cards are right back staring at you.
  • "The shuffle just doesn't feel random." Cards seem to clump. You go half the game without seeing a card you're running four copies of, or you draw all four in a row.
  • "This never seems to happen when I shuffle at my kitchen table!"

The question is: are they evidence of a bug, or evidence of something else?


What We Did

To test this properly, we needed to remove human bias from the equation; hard numbers only.

We built a test harness using a 60-card Highlander deck, meaning 60 1x cards; no duplicates. If you mulligan a normal deck and see "another copy" of a card you had before, that could be a completely different copy. With a Highlander deck, if a card comes back after a mulligan, it is the exact same card.

We then ran two tests:

Test 1: Shuffle Fairness. We created 50,000 fresh games, shuffled the deck, and drew an opening hand each time. Then we checked: does every card in the deck show up in the opening hand with roughly equal frequency? Or do some cards mysteriously turn up more or less often than others?

Test 2: Mulligan Overlap. We simulated the mulligan 50,000 times: shuffle, draw 8 cards, put them all back, shuffle again, draw 6 (the standard mulligan penalty). Then we counted exactly how many cards from the original hand reappeared.
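
For the curious, the Test 2 procedure can be sketched in a few lines. This is a Python illustration, not the actual harness (Gemp itself is Java); the function name, the seed, and the use of Python's random.shuffle as a stand-in for the server's shuffle are all assumptions made for the example.

```python
import random
from collections import Counter

DECK_SIZE = 60       # Highlander deck: every card is unique
OPENING_HAND = 8
MULLIGAN_HAND = 6


def simulate_mulligan_overlap(trials, seed=None):
    """Count how many original-hand cards reappear after a mulligan.

    Returns a Counter mapping overlap size -> number of trials.
    """
    rng = random.Random(seed)  # hypothetical stand-in for the server RNG
    deck = list(range(DECK_SIZE))
    overlaps = Counter()
    for _ in range(trials):
        rng.shuffle(deck)
        original = set(deck[:OPENING_HAND])
        # The hand goes back into the deck and the whole deck is reshuffled.
        rng.shuffle(deck)
        new_hand = deck[:MULLIGAN_HAND]
        overlaps[sum(1 for card in new_hand if card in original)] += 1
    return overlaps


if __name__ == "__main__":
    trials = 50_000
    result = simulate_mulligan_overlap(trials, seed=2026)
    for k in sorted(result):
        print(f"{k} repeated: {result[k] / trials:.2%}")
```

Because every card is unique, counting set membership is enough; with a normal deck you would have to track individual copies.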


The Results

Shuffle Fairness: PASS

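Every card appeared in the opening hand almost exactly as often as every other card, across 50,000 games. The statistical test (chi-squared) came in at 49.82 against a failure threshold of 77.93; a lower number means the counts sit closer to a perfectly even spread. Not a single card out of 60 was flagged as appearing unusually often or rarely (the flag trips at |z| > 3). The two largest deviations, z = 2.68 and z = 2.37, are the kind of noise you expect to see somewhere among 60 cards. In short, the test found no evidence of bias in the shuffle.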

Full output:

=== SEEDING BIAS TEST ===
Iterations: 50000  |  Hand size: 8  |  Deck size: 60
Expected frequency per card: 6666.7  (stddev 76.0)

Card         Observed Expected  Z-score
------------------------------------------
1_1              6556   6666.7    -1.46
1_2              6628   6666.7    -0.51
1_3              6730   6666.7     0.83
1_4              6788   6666.7     1.60
1_5              6757   6666.7     1.19
1_6              6696   6666.7     0.39
1_7              6632   6666.7    -0.46
1_8              6599   6666.7    -0.89
1_9              6847   6666.7     2.37
1_10             6624   6666.7    -0.56
1_11             6623   6666.7    -0.57
1_12             6713   6666.7     0.61
1_13             6771   6666.7     1.37
1_14             6616   6666.7    -0.67
1_15             6672   6666.7     0.07
1_16             6634   6666.7    -0.43
1_17             6724   6666.7     0.75
1_18             6697   6666.7     0.40
1_19             6577   6666.7    -1.18
1_20             6799   6666.7     1.74
1_21             6804   6666.7     1.81
1_22             6598   6666.7    -0.90
1_23             6673   6666.7     0.08
1_24             6692   6666.7     0.33
1_25             6546   6666.7    -1.59
1_26             6870   6666.7     2.68
1_27             6648   6666.7    -0.25
1_28             6645   6666.7    -0.29
1_29             6639   6666.7    -0.36
1_30             6575   6666.7    -1.21
1_31             6604   6666.7    -0.82
1_32             6628   6666.7    -0.51
1_33             6571   6666.7    -1.26
1_34             6634   6666.7    -0.43
1_35             6573   6666.7    -1.23
1_36             6673   6666.7     0.08
1_37             6704   6666.7     0.49
1_38             6659   6666.7    -0.10
1_39             6593   6666.7    -0.97
1_40             6653   6666.7    -0.18
1_41             6772   6666.7     1.39
1_42             6703   6666.7     0.48
1_43             6631   6666.7    -0.47
1_44             6556   6666.7    -1.46
1_45             6713   6666.7     0.61
1_46             6724   6666.7     0.75
1_47             6770   6666.7     1.36
1_48             6578   6666.7    -1.17
1_49             6652   6666.7    -0.19
1_50             6748   6666.7     1.07
1_51             6593   6666.7    -0.97
1_52             6654   6666.7    -0.17
1_53             6681   6666.7     0.19
1_54             6730   6666.7     0.83
1_55             6567   6666.7    -1.31
1_56             6639   6666.7    -0.36
1_57             6684   6666.7     0.23
1_58             6635   6666.7    -0.42
1_59             6625   6666.7    -0.55
1_60             6680   6666.7     0.18

Chi-squared: 49.82  (df=59, critical=77.93 at p=0.05)
Outliers (|z| > 3): 0 of 60 cards
Verdict: PASS — no significant bias detected
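
If you want to check the Expected, stddev, and Z-score columns yourself, the arithmetic is short. The sketch below is Python and is not the harness code; the function names are made up for illustration, and the small simulation at the end uses Python's own shuffle rather than Gemp's.

```python
import math
import random

DECK_SIZE = 60
HAND_SIZE = 8


def expected_and_stddev(iterations, hand_size=HAND_SIZE, deck_size=DECK_SIZE):
    """Expected opening-hand appearances per card, and the binomial stddev."""
    p = hand_size / deck_size  # chance a given card is in one opening hand
    return iterations * p, math.sqrt(iterations * p * (1 - p))


def z_score(observed, expected, stddev):
    return (observed - expected) / stddev


def chi_squared(observed_counts, expected):
    """Sum of (observed - expected)^2 / expected over all cards."""
    return sum((o - expected) ** 2 / expected for o in observed_counts)


def simulate_opening_hands(iterations, rng):
    """Shuffle a fresh deck each time and tally which cards land in the hand."""
    counts = [0] * DECK_SIZE
    deck = list(range(DECK_SIZE))
    for _ in range(iterations):
        rng.shuffle(deck)
        for card in deck[:HAND_SIZE]:
            counts[card] += 1
    return counts


if __name__ == "__main__":
    expected, stddev = expected_and_stddev(50_000)
    print(f"expected {expected:.1f}, stddev {stddev:.1f}")
    # Card 1_9 from the table above: observed 6847.
    print(f"z for 6847: {z_score(6847, expected, stddev):.2f}")
```

Feeding the counts from simulate_opening_hands into chi_squared gives a statistic comparable to the one in the output, though the exact value depends on the seed.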

Mulligan Overlap: PASS — but the expected results are surprising

Here's the summary from 50,000 mulligan trials:

Cards repeated from original hand   How often it happened   How often math says it should happen
------------------------------------------------------------------------------------------------
0 (completely fresh hand)           40.77%                  40.67%
1 card repeated                     41.49%                  41.53%
2 cards repeated                    15.24%                  15.14%
3 cards repeated                     2.33%                   2.47%
4+ cards repeated                    0.17%                   0.19%

The observed results match the mathematically predicted distribution almost perfectly. Chi-squared: 5.92 against a failure threshold of 9.46.

Full output:

=== MULLIGAN OVERLAP TEST ===
Iterations: 50000  |  Deck: 60  |  Initial hand: 8  |  Mulligan hand: 6
Expected overlap: 0.80 cards

Overlap      Observed   Expected       Obs%       Exp%
-------------------------------------------------------
0               20384    20332.6     40.77%     40.67%
1               20745    20765.2     41.49%     41.53%
2                7620     7570.6     15.24%     15.14%
3                1166     1236.0      2.33%      2.47%
4                  81       92.7      0.16%      0.19%
5                   4        2.9      0.01%      0.01%
6                   0        0.0      0.00%      0.00%

Chi-squared: 5.92  (df=4, critical=9.46 at p=0.05)
Verdict: PASS — overlap matches hypergeometric expectation
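
The chi-squared figure can be sanity-checked straight from the table. The Python sketch below sums over overlaps 0 through 4 only, which is consistent with df=4 and lands within rounding of the reported 5.92. Which rows the harness actually includes is an assumption inferred from the numbers, not something taken from its source.

```python
# Observed and expected counts copied from the output above (overlap 0..4).
# Overlaps 5 and 6 are left out; that choice is an assumption (see lead-in).
observed = [20384, 20745, 7620, 1166, 81]
expected = [20332.6, 20765.2, 7570.6, 1236.0, 92.7]


def chi_squared(observed_counts, expected_counts):
    """Sum of (observed - expected)^2 / expected over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))


statistic = chi_squared(observed, expected)
print(f"chi-squared over overlaps 0-4: {statistic:.2f}")
```

Nearly all of that total comes from the 3-card and 4-card rows, where the observed counts ran a little below expectation.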

Read that table again, though. Only 41% of mulligans give you a completely fresh hand. The majority of the time — 59% — you will see at least one card from your original hand come back.

This is similar to the so-called Birthday Paradox: with just 23 people in a room, the chance that at least two of them share a birthday is about 50%. That surprises most people, whose intuition says you would need closer to 100 or more. The same thing is going on here: more often than not, a mulligan gives you back at least 1 card you just shuffled away.

And that's with a Highlander deck. In a real deck with 4 copies of key cards, you're not just getting the same card back — you're getting cards that look identical from a much larger pool. If you're running 4 copies of five different staples, that's 20 of your 60 cards that all feel like "the same stuff." No shuffler in the world is going to make those feel rare.


Technical Details

For those who want to peek under the hood:

The RNG. Gemp uses java.util.Collections.shuffle() backed by java.util.concurrent.ThreadLocalRandom. This is a solid PRNG. In OpenJDK, each thread's generator gets its own seed, derived from a shared seeder that is initialized from System.currentTimeMillis() and System.nanoTime(). It does not use the 48-bit linear congruential algorithm of plain java.util.Random, and it doesn't suffer from the classic "two Random instances created at the same millisecond get the same seed" problem.

The mulligan code path. When you mulligan, the server removes your hand, appends those cards back to the deck, and runs a full Collections.shuffle() on the entire deck before drawing your new hand. There is no partial shuffle, no "put them on top and cut" — it's a complete Fisher-Yates shuffle of all 60 cards.
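
To make "complete Fisher-Yates shuffle" concrete, here is a minimal Python sketch of the steps described above. It is not Gemp's code (the server is Java and calls Collections.shuffle()); the function names and the list-based deck and hand are assumptions for illustration.

```python
import random


def fisher_yates_shuffle(cards, rng):
    """Shuffle `cards` in place by walking backwards and swapping.

    Each position is swapped with a uniformly chosen position at or below it.
    """
    for i in range(len(cards) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        cards[i], cards[j] = cards[j], cards[i]


def mulligan(deck, hand, rng, new_hand_size=6):
    """Return the hand to the deck, reshuffle everything, draw a new hand."""
    deck.extend(hand)   # old hand is appended back to the deck
    hand.clear()
    fisher_yates_shuffle(deck, rng)  # full shuffle of the entire deck
    for _ in range(new_hand_size):
        hand.append(deck.pop())


if __name__ == "__main__":
    rng = random.Random()
    deck = list(range(60))
    fisher_yates_shuffle(deck, rng)
    hand = [deck.pop() for _ in range(8)]
    mulligan(deck, hand, rng)
    print(f"new hand: {hand}, cards left in deck: {len(deck)}")
```

The key point is that the returned cards get no special treatment: once they are back in the list, the shuffle touches every position.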

Why the overlap math works out this way. The probability of getting exactly k cards back from your original hand follows the hypergeometric distribution. You're drawing 6 cards from a 60-card deck where 8 are "marked" (your original hand). The expected overlap is 6 x 8/60 = 0.8 cards. That's less than 1 — but the distribution is lumpy. Getting 0 or 1 back are both very common (~ 41% each), while getting 2 back happens about 1 in 7 mulligans. Rare enough to feel wrong, common enough to happen regularly.
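
The hypergeometric numbers are easy to compute directly. A small Python sketch follows; the function name and defaults are just for this example.

```python
from math import comb


def overlap_probability(k, deck=60, marked=8, drawn=6):
    """Chance that exactly k of the `marked` cards appear among `drawn` cards."""
    return comb(marked, k) * comb(deck - marked, drawn - k) / comb(deck, drawn)


probs = [overlap_probability(k) for k in range(7)]
expected_overlap = sum(k * p for k, p in enumerate(probs))

for k, p in enumerate(probs):
    print(f"{k} back: {p:.2%}")
print(f"expected overlap: {expected_overlap:.2f} cards")
print(f"at least one back: {1 - probs[0]:.2%}")
```

The first few values match the Exp% column in the mulligan output, and the weighted sum gives the 0.8 expected overlap.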

The birthday paradox factor. In a real (non-Highlander) deck, perceived overlap is dramatically amplified. Consider: if you run 4 copies each of 5 key cards, you have 20 cards that are "memorable." The chance of drawing at least one of those 20 in any 6-card draw is approximately 92%. So even with a perfect shuffle, you'll almost always see something familiar, because your deck is built to give you those cards.
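
That "at least one" figure is one minus the chance of missing all the memorable cards. A quick Python check (the function name is made up for this example):

```python
from math import comb


def chance_of_at_least_one(memorable, deck=60, drawn=6):
    """Chance that a `drawn`-card hand contains one or more memorable cards."""
    none = comb(deck - memorable, drawn) / comb(deck, drawn)
    return 1 - none


print(f"20 memorable cards: {chance_of_at_least_one(20):.1%}")
print(f" 8 memorable cards: {chance_of_at_least_one(8):.1%}")
```

The second line is the Highlander mulligan case again: 8 marked cards, 6 drawn, and a better-than-even chance of seeing at least one of them.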

Bottom line. We tested it. The shuffler does what a shuffler should do. The human brain is just really bad at intuiting probability, and really good at remembering the time it got the same three cards back after a mulligan.