One of the interesting results in “Deterministic Time-Space Tradeoffs for k-SUM” is:

Theorem. 3-SUM is in deterministic time \(\hat{O}(n^{2} / \log n)\) and space \(\hat{O}(\sqrt{n} \log n)\).

(We use \(\hat{O}\) to hide \(\mathrm{poly}(\log\log n)\) factors.)

Another interesting result that they prove is:

Theorem. The following problem is exactly as hard as 3-SUM: solve 3-SUM in subquadratic time with \(n^{1/2+\varepsilon}\) space. That is, 3-SUM can be solved in subquadratic time if and only if it can be solved in subquadratic time using only \(n^{1/2+\varepsilon}\) space.

So either there is no subquadratic 3-SUM algorithm at all, or there is one with space requirement \(O(n^{0.51})\).

Remark. It’s widely believed that \(k\)-SUM requires \(n^{\left\lceil k/2 \right\rceil-o(1)}\) time. It’s not known how to make the polynomial method work for 3-SUM: the best known improvement is just a polylog factor. For 4-SUM it is not even known how to shave off a log factor.

Then they do some domination argument with posets that I really didn’t follow. If you are a reader and can explain this to me, please contact me! I do get why they need it, though. They do a self-reduction based on sorting. Well, they don’t actually sort, because writing down the sorted list would blow their space bound; they just “implicitly sort”. Then they split the (implicitly) sorted list into buckets of \(g\) contiguous elements. I originally (naively) thought it was obvious that each pair of buckets would interact with only \(O(1)\) other buckets. This would be true if, e.g., the buckets were sets with bounded doubling :). However, they do some fancy stuff to show this, I guess. Or rather, I’m not even sure the per-pair bound is true anymore, but they at least show that it’s true “on average”: the number of triples of buckets that interact is \(O((n/g)^{2})\).
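Here’s a toy checker for that “on average” claim (mine, not from the paper; `interacting_triples` is a name I made up). It splits a sorted array into buckets of \(g\) contiguous elements and counts, for each ordered pair of buckets, how many buckets could hold a completing third element:

```python
import bisect

def interacting_triples(arr, g):
    """Count bucket triples (i, j, l) such that a + b + c = 0 is possible
    with a in bucket i, b in bucket j, c in bucket l.  arr must be sorted;
    buckets are g contiguous elements each."""
    buckets = [arr[t:t + g] for t in range(0, len(arr), g)]
    lows, highs = [b[0] for b in buckets], [b[-1] for b in buckets]
    count = 0
    for i in range(len(buckets)):
        for j in range(len(buckets)):
            # c must lie in [-(highs[i] + highs[j]), -(lows[i] + lows[j])];
            # count the buckets whose range intersects that interval.
            lo, hi = -(highs[i] + highs[j]), -(lows[i] + lows[j])
            count += max(0, bisect.bisect_right(lows, hi)
                            - bisect.bisect_left(highs, lo))
    return count

# Compare the count against (len(arr) / g) ** 2 to sanity-check the
# "O((n/g)^2) interacting triples on average" claim.
```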

Here’s how they “implicitly sort”, because I thought it was neat:

Lemma. The \(s\)-select problem (find the \(s\)-th smallest item in a list) can be solved in \(O(n)\) time and \(O(s)\) space.

Proof. Keep a buffer of size \(2s\). Start by putting the first \(2s\) elements into the buffer, then throw out the \(s\) largest by doing (linear-time) median select and filtering, so the buffer holds the \(s\) smallest elements seen so far.

Then absorb the next \(s\) elements and again throw out the \(s\) largest, and so on. Each batch of \(s\) elements costs \(O(s)\), and there are \(O(n/s)\) batches, so the total time is \(O(n)\); at the end the buffer holds the \(s\) smallest elements overall, and the answer is their maximum.
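A minimal sketch of the lemma in Python (my code, not the paper’s). For simplicity it sorts the buffer, which costs \(O(s\log s)\) per batch; the lemma’s \(O(n)\) bound needs linear-time (median-of-medians) selection in that spot instead.

```python
def s_select(stream, s):
    """Return the s-th smallest element (1-indexed) of stream, assuming
    stream has at least s elements, while holding at most 2s items."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) >= 2 * s:
            # Keep only the s smallest of the buffer.  The lemma's O(n)
            # bound uses linear-time selection; sorting is a lazy stand-in.
            buf.sort()
            del buf[s:]
    # The buffer now contains the s smallest elements seen overall.
    buf.sort()
    return buf[s - 1]
```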

Ok, now here’s a bit more about how their reduction works:

Recall, we have so far done the following: (implicitly) sort the array and partition it into chunks of \(g\) contiguous elements. Now:

Theorem. Suppose we choose a random prime \(p \in [mg^{7}]\). Then for any set \(S\) of \(3g\) numbers in \([2^{m}]\):

  • If \(S\) has a 3SUM solution, then \(\Pr[S\text{ has a 3SUM solution }\bmod p] = 1\).
  • If \(S\) doesn’t have a 3SUM solution, then \(\Pr[S\text{ has a 3SUM solution }\bmod p]\) is at most \(O(\log m + \log g)/g^{4}\).

Proof. Any false triple has \(a+b+c \neq 0\) and \(|a+b+c| < 3\cdot 2^{m}\), so it is a nonzero \(O(m)\)-bit number, and such numbers have at most \(O(m)\) prime factors. There are at most \((3g)^{3}\) triples, and \(\Theta(mg^{7}/\log(mg^{7}))\) primes in \([mg^{7}]\), so the probability that some false triple vanishes \(\bmod p\) is at most \((3g)^{3}\cdot O(m)\cdot \log(mg^{7})/(mg^{7}) = O(\log m + \log g)/g^{4}\).
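A little demo of this hashing step (my own toy code; `random_prime` and `has_3sum_mod` are made-up names, and I allow \(a, b, c\) to repeat, which may not match the paper’s exact 3SUM convention):

```python
import random

def random_prime(limit):
    """Sample a uniformly random prime in [2, limit] by rejection;
    trial division is fine at the small scales (m * g**7) used here."""
    def is_prime(q):
        if q < 2:
            return False
        d = 2
        while d * d <= q:
            if q % d == 0:
                return False
            d += 1
        return True
    while True:
        q = random.randint(2, limit)
        if is_prime(q):
            return q

def has_3sum_mod(S, p):
    """Does S contain a, b, c (repeats allowed) with a + b + c = 0 mod p?"""
    residues = {x % p for x in S}
    pair_sums = {(a + b) % p for a in residues for b in residues}
    return any((-c) % p in pair_sums for c in residues)
```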

We are going to set \(g = O(\log n / \log \log n)\).
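To see why this is the right scale for \(g\) (my arithmetic, not a quote from the paper): at this setting \(g\log g = O(\log n)\), so the lookup table in the next remark has size \(g^{O(g)} = 2^{O(g\log g)} = n^{O(1)}\), and shrinking the constant in \(g\) makes it \(n^{\delta}\) for any fixed \(\delta>0\).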

Remark. For any prime \(p\le g^{O(1)}\) there is a data structure of size \(g^{O(g)}\) that can answer any 3SUM instance on \(3g\) numbers modulo \(p\).
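A toy version of such a table (mine; it is only runnable for tiny \(p\) and \(k\), since it literally enumerates all \(p^{k}\) instances, which is the point: \(p^{3g} = g^{O(g)}\) when \(p\le g^{O(1)}\)):

```python
from itertools import product

def build_3sum_table(p, k):
    """Map every tuple of k residues mod p to whether it contains
    a, b, c (repeats allowed) with a + b + c = 0 mod p.
    p**k entries in total, i.e. g^{O(g)} for k = 3g, p <= g^{O(1)}."""
    def yes(inst):
        return any((a + b + c) % p == 0
                   for a in inst for b in inst for c in inst)
    return {inst: yes(inst) for inst in product(range(p), repeat=k)}

# Lookups then take O(k) time: reduce your 3g numbers mod p and index in,
# e.g. build_3sum_table(5, 6)[(1, 2, 2, 0, 3, 4)] is True since 1 + 0 + 4 = 5.
```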

Final algorithm:

Proof.

  1. Pick a random prime \(p\le g^{O(1)}\) and construct the lookup table for all 3SUM instances on \(3g\) numbers mod \(p\).
  2. Reduce the numbers mod \(p\). The probability that this creates a fake 3SUM solution in any given subproblem is less than \(1/g^{3}\).
  3. Run the 3SUM self-reduction to get \(O((n/g)^{2})\) 3SUM instances on \(3g\) numbers each. Solve these with the lookup table.
  4. If more than \(100 (n^2/g^{5})\) of the subproblems said yes, report yes: the expected number of false positives is at most \((n/g)^{2}\cdot 1/g^{3} = n^{2}/g^{5}\), so by Markov there’s no way we could have this many false positives (except with probability at most \(1/100\)).
  5. Else, just iterate through all the reported solutions and check whether they are real solutions (see the sketch after this list).
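Here’s a toy end-to-end sketch of steps 3-5 (mine again: it sorts honestly instead of implicitly, brute-forces each small instance instead of using the lookup table, and assumes the repeats-allowed convention), just to make the control flow concrete:

```python
import bisect

def three_sum_sketch(A, g, p):
    """Toy version of the final algorithm: True iff A (probably) has
    a, b, c with a + b + c = 0, repeats allowed."""
    A = sorted(A)                     # the paper only sorts implicitly
    buckets = [A[t:t + g] for t in range(0, len(A), g)]
    lows, highs = [b[0] for b in buckets], [b[-1] for b in buckets]

    candidates = []                   # bucket triples that said yes mod p
    for i in range(len(buckets)):
        for j in range(len(buckets)):
            lo, hi = -(highs[i] + highs[j]), -(lows[i] + lows[j])
            for l in range(bisect.bisect_left(highs, lo),
                           bisect.bisect_right(lows, hi)):
                S = buckets[i] + buckets[j] + buckets[l]
                R = {x % p for x in S}
                if any((a + b + c) % p == 0 for a in R for b in R for c in R):
                    candidates.append((i, j, l))

    if len(candidates) > 100 * len(A) ** 2 / g ** 5:
        return True                   # w.h.p. these can't all be false positives
    for i, j, l in candidates:        # verify each candidate over the integers
        S = buckets[i] + buckets[j] + buckets[l]
        if any(a + b + c == 0 for a in S for b in S for c in S):
            return True
    return False
```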

Oh, so actually this saves a \(\log^{2}\) factor if you are on the unit-cost word-RAM, and still a \(\log\) factor if not.