🐱 Skyspace3.0
Folder: 3old-feed
48 items under this folder.
Jun 29, 2025
mildly hot takes
Jun 29, 2025
A letter to the Senator
Jun 29, 2025
A mistake -- too critical of Anthropic specifically
Jun 29, 2025
A simple alignment proposal
Jun 29, 2025
Brainstorming Some Open Questions
Jun 29, 2025
CAIP model organism proposal
Jun 29, 2025
Causal Scrubbing Notes
technical
Jun 29, 2025
Check if Alek released a new post script
Jun 29, 2025
Do you have the Markov property?
Jun 29, 2025
Ethics of Computing Essay 1
Jun 29, 2025
Inverse Reward Design -- paper summary
technical
Jun 29, 2025
OLD --- ARC formal verification notes
technical
Jun 29, 2025
Risks from AI -- elevator pitch
Jun 29, 2025
The Inner Alignment Problem
Jun 29, 2025
What Should We Do About the xRisk Posed by ASI?
Jun 29, 2025
What could go wrong?
Jun 29, 2025
ai discussion slideshow
Jun 29, 2025
algorithm class notes
Jun 29, 2025
alignment meta-research algorithm
Jun 29, 2025
arena notes
Jun 29, 2025
backdoors and deceptive alignment
technical
Jun 29, 2025
bucket list of learning exercises
Jun 29, 2025
crypto primitives
todo
Jun 29, 2025
defaults
Jun 29, 2025
finding c4s
Jun 29, 2025
fine tuning
Jun 29, 2025
good blog posts
Jun 29, 2025
inference
Jun 29, 2025
intro-to-rl
Jun 29, 2025
mnist
Jun 29, 2025
off-switch
Jun 29, 2025
open-ness
Jun 29, 2025
physics notes
Jun 29, 2025
qmhw9
Jun 29, 2025
raising awareness
Jun 29, 2025
ridiculous BFS
Jun 29, 2025
scalable alignment
Jun 29, 2025
school tidbits
todo
Jun 29, 2025
some ml projects and reflection
Jun 29, 2025
some permutation avoidance notes
Jun 29, 2025
some problems from my friends
Jun 29, 2025
some random super old alignment notes
Jun 29, 2025
superposition
technical
Jun 29, 2025
talking to myself
Jun 29, 2025
the end, again
Jun 29, 2025
two mistakes in my plans for mitigating xrisk
Jun 29, 2025
wins
Jun 29, 2025
yearbookv2