There is a circular race track of circumference
. people, with (distinct) speeds start at the start of the race track, and run around at their speed, until a random time . After this, the runners are at locations . I then cut the race track into contiguous chunks, each of length and count how many runners are in each chunk. Let , the āmaxloadā, denote the number of runners in the fullest chunk. This is a random variable depending on the time when I stop the runners. In this document I will analyze .
Relation to prior work
TFAE:
(by rescaling distances). ( for all , so are very unlikely to be in different buckets).
In other words, the problem considered in this post, which Iāll call āreal hashingā is equivalent to a special case of the linear hashing problem. I think that real hashing is basically just as interesting as the normal linear hashing problem, and am quite happy to study it.
Because real hashing is a special case of the normal linear hashing problem, Knudsenās argument here shows that
Defining triple collisions
Define
Observe:
.
In what follows, Iāll write
The triple collision lemma
Lemma
Let
Proof: By rescaling the length of the circle we have:
Integers are more favorable than arbitrary reals, so we can switch to taking
Now, hereās a helpful fact from number theory:
The equation
has a unique solution in .
There are
Remark:
Intuitively, youād have guessed
The āall numbers are sufficiently coprimeā lemma
Lemma:
Proof: By the triple collision lemma:
The
Weāll now analyze the second term.
I think you can do better than the analysis that Iām about to do, using a technique from Alekās paper on LH. But the below method suffices for our needs.
Partition
We have:
Fix any
We have:
There are at most
Thus,
Thus,
as desired.
Overall, this means that weāve bounded the second and third terms of the big sum that we care about for the lemma by
To finish the lemma we bound the fourth term.
We use a similar set of tricks to the ones that we used in bounding the second term. We dyadically partition in the same way. The fourth term is then at most
Fix any
This last expression is bounded by
A famous result in number theory is that
This is obviously way overkill, but anyways, applying this we bound the expression that weāre working on by
And, weāre done.
A simple bound on maxload
Theorem:
Proof
Clearly,
Recalling that we are basically ok to replace
Applying the āall numbers are sufficiently coprimeā lemma and setting
Authorās note
Iām pretty sure that the above proof goes through, using the āquantum dinosaurā technique for normal linear hashing of the problem. But itās probably at least a moderate amount of additional work. (The quantum dinos technique is just expertly using the fact that any
The path forward
Ultimately, weād like to prove some stronger bounds on
Hereās the rough plan for how to prove this stronger bound:
- Eat
of the times around each fraction with denominator at most . - Figure out the quadruple collision lemma, and do some vibe checks on it. Itās going to be much trickier than I originally thought. Itās going to need to take into account that we ate some stuff.
- Show that the expected number of quadruple collisions, over times that were not eaten is at most
, by using the quadruple collision lemma and basic number theory.
Trying to prove a quadruple collision lemma
Letās assume --- bc idk how to solve even given this assumption, and simpler problems are nice --- that
Let
Naively, youād hope that
Honestly, idk if I even expect this thing to be true.
Whatever happens, itās all going to crucially depend on the fact that weāre taking
Note that the weaker conjecture with all the
It feels very productive to write
for some
Iād like to argue that in the square
This would be enough to win.
Note that banning
The remainder of the proof is left as an exercise to the reader.
By which I mean, Iāll think about this more next weekend probably.