Q: What would ChatGPT look like if it wasn't fine-tuned to be a nice Q&A system that has been forced to say things like:

I don’t have feelings, but I’m always here to help you with your training or any other questions you might have! If you need assistance with anything specific, just let me know. -GPT

Remark: if you are aware of any large chatbots that have actually been trained like this (i.e., not trained to censor themselves when talking about having feelings), I'd be really curious to see it.

Anyways, I haven’t seen a chatbot like this, so the following will just be speculative / theoretical.

So first off, obviously saying "I am an AI so I don't have feelings" is a pretty unnatural thing to say, and it wouldn't be learned in the actual training of an LM.

So, I think it's a pretty safe guess that when queried about, e.g., emotions, an LM would give reasonable-sounding answers similar to what humans would say.
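
For concreteness, here's a minimal sketch of what "asking a base LM about feelings" could look like, using GPT-2 purely as a stand-in for a model that never went through this kind of fine-tuning (the prompt and sampling settings are just illustrative):

```python
# Sketch: prompt a base (non-instruction-tuned) LM about feelings and see
# what it continues with. GPT-2 stands in for "an LM that was never trained
# to disclaim having feelings"; any base checkpoint would do.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Q: Do you ever feel lonely?\nA:"
completions = generator(prompt, max_new_tokens=60, do_sample=True,
                        num_return_sequences=3)

for c in completions:
    print(c["generated_text"])
    print("---")
```

GPT-2 is small enough that its answers ramble, but the point is that nothing in its training pushes it toward "I'm an AI, so I don't have feelings"; it just continues the prompt the way text written by humans tends to continue.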

I'm about 70% confident that I could distinguish the output of such an LM from the output of a random human; you probably have to scale models up a little bit more before the two become indistinguishable.

Note that convincingly simulating a specific human that I actually know is probably much harder than simulating an "anonymous random internet human".

So I think there are two interesting questions to think about at this point:

  1. How should you feel about GPT once it’s indistinguishable from “human that I don’t know”?
  2. Do you think GPT will ever get good enough that it's indistinguishable from a human that you know? E.g., GPT reads in all of my blog content and spits out blog posts that are indistinguishable from things that I write (see the sketch after this list).
  3. If GPT gets to this level, how should we feel about that?
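
As a concrete (and entirely hypothetical) picture of what question 2 is imagining, here is roughly how one could try to build a "writes like me" model by fine-tuning a small base LM on a folder of blog posts. The posts/ directory, the choice of GPT-2, and the hyperparameters are all placeholder assumptions, not a recipe I've actually run:

```python
# Hypothetical sketch: fine-tune a small base LM on a folder of blog posts
# so it imitates the author's writing. Paths, model, and hyperparameters
# are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Assume each .txt file in posts/ is one blog post.
posts = load_dataset("text", data_files={"train": "posts/*.txt"})["train"]
posts = posts.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="blog-sim", num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=posts,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```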

Re1: So first off, just because GPT outputs text like a human with emotions doesn't mean that it necessarily makes sense to say that GPT actually has emotions.

I think the way we should think about it is: GPT is a person, and it has some goals. Maybe we don't think these goals are particularly worthy, and in particular we'd probably be rather put out if GPT converted all matter in the universe into computing substrate in order to get better at whatever tasks it cares about (e.g., folding proteins, solving math problems, etc.). But even if we think that GPT is misguided, I think at the point that it's this intelligent, it probably has more moral worth than an animal. In particular, it should be treated well.

This is pretty tricky because we don’t really have much in the way of an emotional reaction to such an alien existence. But from a theoretical perspective I think the complexity of this entity’s thoughts would mean that we should care about it.

Re2: Will this ever happen? Sure — this seems not too hard, and an AGI is probably capable of this.

Re3: (How to feel if there is a really good Alek simulator): The idea that such a simulator could exist is kind of related to the issue of determinism.

I don't know. Maybe I am somewhat predictable. But I still care about what happens. I still want to choose things. Anyways, I think I'd rather such a simulator didn't exist. One way that could happen is if humans get more powerful over time. That's a nice idea, but I agree that it's not good to fantasize too much about things like this.

Anyways, this also kind of makes me think that I should be less predictable. So we'll see if I do anything about that.