Todo attribution: some of Eliezer Yudkowsky’s writings (e.g., HPMOR, the Sequences) have helped make my understanding of this issue more crisp; see, e.g., this.

Sometimes, you get what you ask(/optimize) for…

Something I’ve noticed recently is that people sometimes want “an explanation” for something, and I think this is really bad. It shouldn’t be clear yet why that’s bad, so let me elaborate.

What I mean by “an explanation” in this case is maybe better described as “an excuse to give up on actually finding out why something happens”.

For instance, I was debugging some code today and caught myself thinking something like “maybe my code’s already right.” This should be a red flag: as stated, it’s not really testable, and it points in the same direction that motivated reasoning would.
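One way to defuse that thought is to turn it into something falsifiable. Here’s a minimal sketch in Python; the `median` function is a hypothetical stand-in for whatever code is actually under suspicion, and `statistics.median` plays the role of a trusted reference:

```python
import statistics

# Hypothetical example: suppose the code under suspicion is this median function.
def median(xs):
    """Return the median of a non-empty list of numbers."""
    ys = sorted(xs)
    mid = len(ys) // 2
    if len(ys) % 2 == 1:
        return ys[mid]
    return (ys[mid - 1] + ys[mid]) / 2

# "Maybe my code's already right" is vague. A concrete, falsifiable version:
# "median agrees with a trusted reference implementation on these inputs."
def test_median_against_reference():
    cases = [[1], [3, 1, 2], [4, 1, 3, 2], [5, 5, 5], [-1, 0, 1, 2]]
    for xs in cases:
        assert median(xs) == statistics.median(xs), f"disagrees on {xs}"

test_median_against_reference()
print("median matched the reference on every case")
```

If the assertion fails, the comforting thought was simply wrong; if it passes, “my code’s already right” has been upgraded from an excuse into a tested hypothesis.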

I think the place I see this the most is probably tutoring. (No offense meant to my students; I think that not noticing confusion is an evil habit instilled in children by society and schools.) Anyway, I’ll explain some math and then ask “does that make sense?”, and the student will respond “yes” with probability ~90%. But upon further questioning (e.g., “ok, then explain it back to me”) it is often revealed that this “yes” of understanding was inaccurate. As a side note, I think some of my students do get better over time at noticing when they don’t understand things. I really hope I’m teaching them something!

Oftentimes, when we have a question, we experience some dissonance. This can be resolved by getting an explanation. Unfortunately, it can also be resolved by getting a bad explanation.

I think there’s a social element to this as well. Suppose your friend tells you something, or asserts something as a fact, and you say “hmm, I’m not sure I believe this.”

Some people would respond by repeating their reasons why you should believe it, or by feeling offended that you won’t take their word for it. I’m probably guilty of being this person sometimes: if something feels obvious to me, I can be reluctant to hear challenges to it. (I’m thinking of math here, but I’m sure it generalizes.)

I think doing math has actually been pretty good training for making disbelief, rather than belief, my default.

When you aren’t convinced of something that someone else is convinced of, say X, the method of saying “ok, let’s assume X for now, and then see if it helps us show Y” can be productive.
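In a proof assistant this move is literal: you take X as a hypothesis and check whether Y follows, deferring entirely the question of whether X itself is true. A minimal Lean 4 sketch, with placeholder propositions standing in for X and Y:

```lean
-- P and Q are hypothetical placeholders; "X" is the hypothesis hX,
-- and "Y" is Q. We never establish X; we only check that it yields Y.
theorem y_of_x (P Q : Prop) (hX : P ∧ (P → Q)) : Q :=
  hX.right hX.left
```

Whether X actually holds is a separate theorem; the point is that the conditional work is still progress.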

Note: an explanation is different from a formal proof. An explanation is a heuristic argument for why something “should” be true. Of course, it’s always valid to respond “I’m not totally convinced” to a heuristic argument. But a good heuristic argument should make it seem more likely that things go a certain way.

Anyways, here are the key takeaways:

  • Default to not believing things. Only be willing to believe something if you can actually run it through your model and see if it makes sense.
  • In discourse, be very obstinate about not believing things, but also use the “let’s take this for granted for now” technique.
  • When talking to other people, it’s good to invite them to actually think about things, rather than just accept them because you said so. For instance, you could say “does that make sense? If so, please explain it back to me.” One way of making this last part sound less condescending (which I don’t think is actually worth worrying about too much, though softening is a useful technique sometimes) is: “this is a confusing thing; can you explain it to me? I think I’d understand it better from hearing you try to articulate it, and from collaboratively critiquing the explanation.”
  • Also, a pretty good general conversation technique is “repeating it back”; it’s a way of preventing yourself from having a fake understanding of something.
    • What you do is say: “ok, I think I’ve understood this. I’m going to try to repeat it back to you; please let me know what I get wrong.”

Why is this so important?

You can’t do anything with an explanation that isn’t also an actual reason; a false explanation is so bad! (Unless it’s testable, in which case please call it a hypothesized reason rather than an explanation.)

If, for example, you want to shove the universe onto a different trajectory, at a small or large scale, then it’s not what you believe that counts, but what’s actually true.

I hope you found this useful, and invite you to say “I don’t understand” more often!