🐱 Skyspace3.0

    Tag: technical

    7 items with this tag.

    • Feb 16, 2025

      smoothing backdoors

      • technical
      • todo
    • Jan 27, 2025

      low vs high stakes alignment

      • technical
    • Jan 26, 2025

      Causal Scrubbing Notes

      • technical
    • Jan 26, 2025

      OLD --- ARC formal verification notes

      • technical
    • Jan 21, 2025

      superposition

      • technical
    • Nov 09, 2024

      backdoors and deceptive alignment

      • technical
    • Oct 02, 2024

      Inverse Reward Design -- paper summary

      • technical

    Created with Quartz v4.2.3 © 2025
