edit: Unfortunately, there are major errors in this document. Sorry about that.
Introduction
This has led ARC to thinking about the following question:
(
) Can we come up with a mechanistic method for estimating the expectation of a NN?
The word âmechanisticâ intuitively means âfeels pretty different from samplingâ. If you have a solution to this that you think is borderline mechanistic then you can test it by seeing if itâs useful for some of the downstream tasks that ARC cares about.
I think itâs pretty unclear + confusing exactly why solving
ARC is interested in three settings:
- Competing with sampling on random problem instances.
- Competing with sampling on worst case problem instances, but where we get an optimized âadvice stringâ.
- Competing with sampling on trained instances, where we get to learn an advice string in parallel to the training process, using the same compute budget as the training process.
Iâll consider these in turn.
(2) Worst case instances
Hereâs a problem statement:
Conjecture 1
Let
be the set of 3SAT instances with clauses. Let be the set of âexplanationsâ ( bit strings) There exists a mechanistic âexplanation quality measuring machineâ , where can be computed in time, such that for all , there is a mechanistic âexpectation estimation machineâ such that
runs in time - If
then should be within of
The relevant fact about the compute / error tradeoff is that this is the same compute / error tradeoff that naive sampling gets.
(Note: Someone said that Conjecture 1 might give an alternate proof of BPP in Sigma2. TODO: think about whether or not thatâs true, and whether or not itâs concerning if true (probably isnât concerning bc that proof isnât too hard?).)
Conjecture
Conjecture 1 implies that we can solve any similar flavor problem that we care about (because
#3SAT
feels like a pretty hard and pretty general problem).
Victor is pretty sure that this conjecture is False. But maybe itâs at least spiritually true.
Def: T is a measure preserving reduction if
- T is computable in polynomial time
- T preserves measure For example:
- T: 4SAT instances -â 3SAT instances such that
Question: Do measure preserving reductions exist? Probably exactly measure preserving is impossible for dumb number theory reasons. - Approximately measure preserving in some sense would also maybe be fine.
- Iâve thought about this a bit and suspect that itâs pretty tricky to do / maybe impossible.
Mike thinks that I may not have given the optimization oracle enough power. Iâm unsure but will think about my version for now.
(1) Random instances
This is the only question that ARC has had progress on so far. ARC has proved the following theorem:
Theorem 1
Let
be the vanilla distribution over 3SAT instances with clauses; Namely, each clause has three random (distinct) variables in it, and each of the variables is randomly negated or not. For all , there is a âexpectation estimation machineâ such that
runs in time , and
Note: ARCâs proof of theorem 1 is imo very complicated (it uses the words âhomogeneous polynomials are the irreducible representations of the hyperoctahedral groupâ). Iâd like a much simpler proof please.
Currently, ARC is interested in investigating more complicated distributions. For instance, a distribution where some variables are more common than others, or where we have more negations than positive versions of literals.
(Note: ok ARC technically hasnât written down a proof of theorem 1 but weâre pretty confident that we have such a proof, maybe up to log factors).
(Note: Eric thinks that
SAT might be a better problem.)
(Note: Eric thinks that you might need advice to solve this if the distribution is sufficiently weird. Iâd be pretty interested in a proof or vibes based proof as to why this would be true. Eric gave some vibes based on obfuscation, but I didnât find it too compelling.)
(Note: Maybe if we solved random cases, we could do random hacky stuff like this better).
(3) Trained instances
- Thinking about random search as a hill climbing procedure seems like a good first step.
- Eventually weâll need to think about SGD though probably. Which is an L.
Breaking ARC
Hereâs a problem that I thought it would be lethal to ARCâs agenda to be unable to solve:
Conjecture 1
Let
be the set of 3SAT instances with clauses. There exists a deterministic function and a deterministic function such that
runs in time - If
then should be within of
Suppose we could prove Conjecture 1. Then weâd have the following corollary:
Corollary 2
Let
be the set of 3SAT instances with clauses such that , and let be the set of 3SAT instances with clauses such that . There exists a deterministic function and a deterministic function such that
runs in time - For
if , and then .
Next, weâd have the following corollary:
Corollary 3
Let
be the set of 3SAT instances with clauses such that , and let be the set of 3SAT instances with clauses such that . Distinguishing between and can be done in .
Next, weâd have the following corollary:
Corollary 4
.
Unfortunately, according to my friend, Corollary 4 is beyond the reach of complexity theory for now. Thus we should not expect to be able to prove Conjecture 1.