(2025-04-24) ZviM AI #113 The o3 Era Begins

Zvi Mowshowitz: AI #113: The o3 Era Begins. Enjoy it while it lasts. The Claude 4 era, or the o4 era, or both, are coming soon. Also, welcome to 2025, we measure eras in weeks or at most months. For now, the central thing going on continues to be everyone adapting to the world of o3, a model that is excellent at providing mundane utility with the caveat that it is a lying liar. You need to stay on your toes. This was also quietly a week full of other happenings, including a lot of discussions around alignment and different perspectives on what we need to do to achieve good outcomes, many of which strike me as dangerously mistaken and often naive.

I worry that increasingly common themes are people pivoting to some mix of ‘alignment is solved, we know how to get an AI to do what we want it to do, the question is alignment to what or to whom,’ which is very clearly false, and ‘the real threat is concentration of power or the wrong humans making decisions.’ That leads them to want to actively prevent humans from being able to collectively steer the future, or to focus the fight on who gets to steer rather than on ensuring the answer isn’t no one. The problem is, if we can’t steer, the default outcome is humanity losing control over the future. We need to know how to steer, and to be able to steer. Eyes on the prize.

Table of Contents

  • Previously this week: o3 Will Use Its Tools For You, o3 Is a Lying Liar, You Better Mechanize.
  • Language Models Offer Mundane Utility. Claude Code power mode.
  • You Offer the Models Mundane Utility. In AI America, website optimizes you.
  • Your Daily Briefing. Ask o3 for a roundup of daily news, many people are saying.
  • Language Models Don’t Offer Mundane Utility. I thought you’d never ask.
  • If You Want It Done Right. You gotta do it yourself. For now.
  • No Free Lunch. Not strictly true. There’s free lunch, but you want good lunch.
  • What Is Good In Life? Defeat your enemies, see them driven before you.
  • In Memory Of. When everyone has context, no one has it.
  • The Least Sincere Form of Flattery. We keep ending up with AIs that do this.
  • The Vibes are Off. Live by the vibe, fail to live by the vibe.
  • Here Let Me AI That For You. If you want to understand my references, ask.
  • Flash Sale. Oh, right, technically Gemini 2.5 Flash exists. It looks good?
  • Huh, Upgrades. NotebookLM with Gemini 2.5 Pro, OpenAI, Gemma, Grok.
  • On Your Marks. Vending-Bench, RepliBench, Virology? Oh my.
  • Be The Best Like No LLM Ever Was. The elite four had better watch out.
  • Choose Your Fighter. o3 wins by default, except for the places where it’s bad.
  • Deepfaketown and Botpocalypse Soon. Those who cheat, cheat on everything.
  • Fun With Media Generation. Welcome to the blip.
  • Fun With Media Selection. Media recommendations still need some work.
  • Copyright Confrontation. Meta only very slightly more embarrassed than before.
  • They Took Our Jobs. The human, the fox, the AI and the hedgehog.
  • Get Involved. Elysian Labs?
  • Ace is the Place. The helpful software folks automating your desktop. If you dare.
  • In Other AI News. OpenAI’s Twitter will be Yeet, Gemini Pro Model Card Watch.
  • Show Me the Money. Goodfire, Windsurf, Listen Labs.
  • The Mask Comes Off. A letter urges the Attorneys General to step in.
  • Quiet Speculations. Great questions, and a lot of unnecessary freaking out.
  • Is This AGI? Mostly no, but some continue to claim yes.
  • The Quest for Sane Regulations. Even David Sacks wants to fund BIS.
  • Cooperation is Highly Useful. Continuing to make the case.
  • Nvidia Chooses Bold Strategy. Let’s see how it works out for them.
  • How America Loses. Have we considered not threatening and driving away allies?
  • Security Is Capability. If you want your AI to be useful, it needs to be reliable.
  • The Week in Audio. Yours truly, Odd Lots, Demis Hassabis.
  • AI 2027. A compilation of criticisms and extensive responses. New blog ho.
  • Rhetorical Innovation. Alignment is a confusing term. How do we fix this?
  • Aligning a Smarter Than Human Intelligence is Difficult. During deployment too.
  • Misalignment in the Wild. Anthropic studies what values its models express.
  • Concentration of Power and Lack of Transparency. Steer the future, or don’t.
  • Property Rights are Not a Long Term Plan. At least, not a good one.
  • It Is Risen. The Immaculate Completion?
  • The Lighter Side. o3 found me the exact location for that last poster.

Language Models Offer Mundane Utility

Patrick McKenzie uses image generation to visualize the room his wife doesn’t want, in order to get her to figure out and explain what she does want so they can do it.

The skills needed to get the most out of AI coding are different from the skills of the best non-AI coders. One highlighted recommendation is to use 3+ git checkouts in separate folders, put each in a distinct terminal, and have each work on a different task. If you’re waiting on an AI, that’s a sign you’re getting it wrong. There’s also this thread of top picks, from Alex Albert:
CLAUDE.md files are the main hidden gem. Simple markdown files that give Claude context about your project – bash commands, code style, testing patterns. Claude loads them automatically and you can add to them with the # key
The explore-plan-code workflow is worth trying. Instead of letting Claude jump straight to coding, have it read files first, make a plan (add “think” for deeper reasoning), then implement. Quality improves dramatically with this approach.
Test-driven development works very well for keeping Claude focused. Write tests, commit them, let Claude implement until they pass. A more tactical one: ESC interrupts Claude, double ESC edits previous prompts. These shortcuts save lots of wasted work when you spot Claude heading in the wrong direction.
We’re using Claude for codebase onboarding now. Engineers ask it “why does this work this way?” and it searches git history and code for answers which has cut onboarding time significantly.
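
For readers who haven’t tried this: CLAUDE.md is a real Claude Code feature, but the contents below are a made-up minimal sketch of what such a file might hold for a hypothetical TypeScript project, not an official template.

```markdown
# CLAUDE.md

## Commands
- `npm run build` – compile the project
- `npm test` – run the test suite once (no watch mode)

## Code style
- TypeScript strict mode; avoid `any`
- Small, pure functions; tests live next to the source files they cover

## Testing patterns
- Every bug fix gets a regression test committed before the fix itself
```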

You Offer the Models Mundane Utility

How much of your future audience is AIs rather than humans, interface edition.
Andrej Karpathy: Tired: elaborate docs pages for your product/service/library with fancy color palettes, branding, animations, transitions, dark mode, … Wired: one single docs .md file and a “copy to clipboard” button. The docs also have to change in the content. Eg instead of instructing a person to go to some page and do this or that, they could show curl commands to run – actions that are a lot easier for an LLM to carry out. Products have to change to support these too
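
As a hypothetical illustration of what Karpathy is describing (the service, endpoint, and token below are invented), the human-oriented instruction ‘go to Settings, open API Keys, and click Create’ might become something an LLM can act on directly:

```markdown
## Create an API key

Run the following, replacing $TOKEN with your account token:

    curl -X POST https://api.example.com/v1/keys \
      -H "Authorization: Bearer $TOKEN" \
      -d '{"name": "my-key"}'
```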

Your Daily Briefing

A bunch of people report that o3-compiled daily summaries are great.
Rohit: This is an excellent way to use o3...

There are still a few glitches to work out, including web search sometimes failing:
Matt Clancy: Does it work well? Rohit: Not yet.

Language Models Don’t Offer Mundane Utility

One easy way to not get utility is not to know you can ask for it.
Sully: btw 90% of the people I’ve watched use an LLM can’t prompt if their life depended on it. I think just not understanding that you can change how you word things. they don’t know they can speak normally to a computer.
JXR: You should see how guys talk to women…. Same thing.
It used to feel very important to know how to do relatively bespoke prompt engineering. Now the models are stronger, and mostly you can ‘just say what you want’ and it will work out fine for most casual purposes. That still requires people to realize they can do that.

Mike Solana: me, asking AI about a subject I know very well: no, that doesn’t seem right, what about this?, no that’s wrong because xyz, there you go, so you admit you were wrong? ADMIT IT
me, asking AI about something I know nothing about: thank you AI 😀
the gell-mAIn effect, we are calling this.
Daniel Eth: Putting the “LLM” in “Gell-Mann Amnesia”.

Mike is right to say thank you. The AI is giving you a much better answer than you could get on your own. No, it’s not perfect, but it was never going to be.

If You Want It Done Right

When AI is trying to duplicate exactly the thing that previously existed, Pete Koomen points out, it often ends up not being an improvement. The headline example is drafting an email. Why draft an email with AI when the email is shorter than the prompt? Why explain what you want to do if it would be easier to do it yourself?

No Free Lunch

Another classic way to not get utility from AI is being unwilling to pay for it. Near: our app Auren is 4.5⭐ on iOS but 3.3⭐ on android... android users seem to consistently give 1 star reviews, they are very upset the app is $20 rather than free

Unfortunately, we have all been trained by mobile to expect everything to have a free tier, and to mostly want things to be free. Then we sink massive amounts of time into ‘free’ things when a paid option would make our lives vastly better. Your social media and dating apps and mobile games being free is terrible for you. I am so so happy that the AIs I use are entirely on subscription business models. (End of Free)

What Is Good In Life?

So much to unpack here. A lot of it is a very important question: What is good in life? What is the reason we read a book, query an AI or otherwise seek out information? There’s also a lot of people grasping at straws to explain why AI wouldn’t be massively productivity enhancing for them. Bottleneck!
Nic Carter: I’ve noticed a weird aversion to using AI on the left. not sure if it’s a climate or an IP thing or what, but it seems like a massive self-own to deduct yourself 30+ points of IQ because you don’t like the tech
a lot of people denying that this is a thing.

It’s definitely not 30 IQ points (yet!); it’s much more a matter of enhanced productivity.
Neil Renic: The left wing impulse to read books and think.
The Future is Designed: You: take 2 hours to read 1 book. Me: take 2 minutes to think of precisely the information I need, write a well-structured query

If you can create a good version of that system, that’s pretty amazing for when you need a particular piece of information and can figure out what it is. One needs to differentiate between when you actually want specific knowledge, versus when you want general domain understanding and to invest in developing new areas and skills. Even if you don’t, it is rather crazy to have your first thought when you need information be ‘read a book’ rather than ‘ask the AI.’ It is a massive hit to your functional intelligence and productivity. Reading a book is a damn inefficient way to extract particular information, and also generally (with key exceptions) a damn inefficient way to extract information in general. But there’s also more to life than efficiency.

In Memory Of

So many more tokens of please and thank you, so much missing the part that matters?

Gallabytes: I told o3 to not hesitate to call bullshit and now it thinks almost every paper I send it is insufficiently bitter pilled.
Minh Nhat Nguyen: Ohhhh send prompt, this is basically my first pass filter for ideas
Gallabytes: the downside to the memory feature is that there’s no way to “send prompt” – as soon as I realized how powerful it was I put some deliberate effort into building persistent respect & rapport with the models and now my chatgpt experience is different. I’m not going full @repligate but it’s really worth taking a few steps in that direction even now when the affordances to do so are weak.
Cuddly Salmon: this. it provides an entirely different experience. (altho mine have been weird since the latest update, feels like it over indexed on a few topics!)
Janus: “send prompt” script kiddies were always ngmi for more than two years now ive been telling you

Gallabytes: “was just talking to a friend about X, they said Y” “wow I can never get anything that interesting out of them! send prompt?”
Well, yes, actually? Figuring out how to prompt humans is huge. More generally, if you want Janus-level results of course script kiddies were always ngmi, but compared to script kiddies most people are ngmi. The script kiddies at least are trying to figure out how to get good outputs. And there’s huge upside in being able to replicate results, in having predictable outputs and reproducible experiments and procedures. The world runs on script kiddies, albeit under other names. We follow fixed procedures.

You can delete chats, in cases where you decide they send the wrong message.

Increasingly, you don’t know that any given AI chat, yours or otherwise, is ‘objective,’ unless it was done ‘clean’ via incognito mode or the API. Nor does your answer predict what other people’s answers will be. It is a definite problem.

The Least Sincere Form of Flattery

The Vibes are Off

xjdr: i have grown so frustrated with claude code recently i want to walk into the ocean. that said, i still no longer want to do my job without it (i dont ever want to write test scaffolding again). Codex is very interesting, but besides a similar ui, its completely different.
xlr8harder: The problem with “vibe coding” real work: after the novelty wears off, it takes what used to be an engaging intellectual exercise and turns it into a tedious job of keeping a tight leash on sloppy AI models and reviewing their work repeatedly. But it’s too valuable not to use.

When we invent a much faster and cheaper but lower quality option, the world is usually better off. However this is not a free lunch. The quality of the final product goes down, and the experience of the artisan gets worse as well. How you relate to ‘vibe coding’ depends on which parts you’re vibing versus thinking about, and which parts you enjoy versus don’t enjoy.

Sherjil Ozair: Coding models basically don’t work if you’re building anything net new.

This is true of many automation tools. They make a subclass of things go very fast, but not others.

Here Let Me AI That For You

What changes when everyone has a Magic All-Knowing Answer Box?
Cate Hall: one thing i’m grieving a bit with LLMs is that interpersonal curiosity has started seeming a bit … rude? like if i’m texting with someone and they mention a concept i don’t know, i sometimes feel weird asking them about it when i could go ask Claude.

The other advantage of asking is that it helps calibrate and establish trust. I’m letting you know the limits of my knowledge here, rather than smiling and nodding.

I strongly encourage my own readers to use Ask Claude (or o3) when something is importantly confusing, or you think you’re missing a reference and are curious, or for any other purpose.

Flash Sale

Google’s Gemini 2.5 Flash exists in the Gemini app and in Google AI Studio. It’s probably great for its cost and speed.
Hasan Can: Gemini 2.5 Flash has quickly positioned itself among the top models in the industry, excelling in both price and performance. It’s now available for use in Google AI Studio and the Gemini app. In AI Studio, you get 500 requests free daily. Its benchmark scores are comparable to models like Sonnet 3.7 and o4-mini-high, yet its price is significantly lower.

Huh, Upgrades

Alex Lawsen reports that Gemini 2.5 has substantially upgraded NotebookLM podcasts, and recommends this prompt (which you can adapt for different topics): Generate a deep technical briefing, not a light podcast overview. Focus on technical accuracy, comprehensive analysis, and extended duration, tailored for an expert listener. The listener has a technical background comparable to a research scientist on an AGI safety team at a leading AI lab. Use precise terminology found in the source materials. Aim for significant length and depth. Aspire to the comprehensiveness and duration of podcasts like 80,000 Hours, running for 2 hours or more.

On Your Marks

Asa Cooper Strickland: New paper! The UK AISI has created RepliBench, a benchmark that measures the abilities of frontier AI systems to autonomously replicate, i.e. spread copies of themselves without human help. Our results suggest that models are rapidly improving, and the best frontier models are held back by only a few key subcapabilities.

This matches my intuition that the top models are not that far from the ability to replicate autonomously.

Dan Hendrycks: Can AI meaningfully help with bioweapons creation? On our new Virology Capabilities Test (VCT), frontier LLMs display the expert-level tacit knowledge needed to troubleshoot wet lab protocols. OpenAI’s o3 now outperforms 94% of expert virologists. [Paper here, TIME article here, Discussion from me here.]

Virology is a capability like any other, so it follows all the same scaling laws.

Be The Best Like No LLM Ever Was

Choose Your Fighter

Peter Wildeford gives his current guide to when to use which model. Like me, he’s made o3 his default. But it’s slow, expensive, untrustworthy, a terrible writer, not a great code writer, can only analyze so much text or video, and lacks emotional intelligence. So sometimes you want a different model. That all sounds correct.
Roon (OpenAI): o3 is a beautiful model and I’m amazed talking to it and also relieved i still have the capacity for amazement.
I wasn’t amazed. I would say I was impressed, but also it’s a lying liar. Here’s a problem I didn’t anticipate.
John Collison: Web search functionality has, in a way, made LLMs worse to use. “That’s a great question. I’m a superintelligence but let me just check with some SEO articles to be sure.”
Kevin Roose: I spent a year wishing Claude had web search, and once it did I lasted 2 days before turning it off.
Patrick McKenzie: Even worse when it pulls up one’s own writing!
Riley Goodside: I used to strongly agree with this but o3 is changing my mind. It’s useful and unobtrusive enough now that I just leave it enabled.
Joshua March: I often find results are improved by giving search guidance Eg telling them to ignore Amazon reviews when finding highly rated products

When Claude first got web search I was thrilled, and indeed I found it highly useful. A reasonably large percentage of my AI queries do require web search, as they depend on recent factual questions, or I need it to grab some source. I’ve yet to be tempted to turn it off. o3 potentially changes that. o3 is much better at web search tasks than other models. If I’m going to search the web, and it’s not so trivial I’m going to use Google Search straight up, I’m going to use o3. But now that this is true, if I’m using Claude, the chances are much lower that the query requires web search. And if that’s true, maybe by default I do want to turn it off?

Deepfaketown and Botpocalypse Soon

Having more information and living in an AR world is mostly a good thing most of the time, especially for tracking things like names and your calendar or offering translations and meanings and so on. It’s only when there’s some form of ‘test’ that it is obviously bad. The questions are, what are you or we going to do about it, individually or collectively, and how much of this is acceptable in what forms? And are the people who don’t do this going to have to get used to contact lenses so no one suspects our glasses?

I also think this is the answer to:
John Pressman: How long do you think it’ll be before someone having too good of a memory will be a tell that they’re actually an AI agent in disguise?
You too can have eidetic memory, by being a human with an AI and an AR display.

Anthropic is on the lookout for malicious use, and reports on their efforts and selected examples from March. The full report is here.

Fun With Media Generation

Fun With Media Selection

Hasan Can: I haven’t seen anyone else talking about this, but o3 and o4-mini are incredibly good at finding movies and shows tailored to your taste.

An automatic system to pull new material and sort by critical feedback is great. My note would be that for movies Metacritic and Letterboxd seem much better than Rotten Tomatoes and IMDb, but for TV shows Metacritic is much weaker and IMDb is a good pick. The real trick is to personalize this beyond a genre. LLMs seem strong at this; all you should have to do is get the information into context or memory. With all chats in accessible memory this should be super doable if you’ve been tracking your preferences, or you can build it up over time. Indeed, you can probably ask o3 to tell you your preferences – you’d pay to know what you really think – and then correct the parts it gets wrong, or the places where you’d rather override your own preferences.
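
A minimal sketch of that ‘preferences in context’ idea, assuming the standard openai Python client; the model name, the ratings text, and the exact prompt are all illustrative rather than a tested recipe:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative preference data; in practice you might export ratings
# from Letterboxd/IMDb or paste them in from memory.
my_ratings = """
Loved: Arrival, The Social Network, Severance (TV)
Disliked: most Marvel sequels, anything over 2.5 hours without a reason
"""

prompt = (
    "Here are my ratings and tastes:\n"
    f"{my_ratings}\n"
    "Recommend five recent movies and two TV shows I likely haven't seen. "
    "For each, note critical reception (Metacritic for movies, IMDb for TV) "
    "and explain in one sentence why it fits my taste specifically."
)

response = client.chat.completions.create(
    model="o3",  # hypothetical choice; any strong model with web search helps
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```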

Copyright Confrontation

Meta uses the classic Sorites (heap) paradox to argue that more than 7 million books have ‘no economic value.’
Andrew Curran: Interesting legal argument from META; the use of a single book for pretraining boosts model performance by ‘less than 0.06%.’ Therefore, taken individually, a work has no economic value as training data.

If anything that number is stunningly high. You’re telling me each book can give several basis points (hundredths of a percent) of improvement?

The alternative explanation is that 0.06% means the measurements were noise, and okay, sure, each individual book probably doesn’t noticeably improve performance. Classic Sorites paradox.
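
A quick sanity check on the arithmetic (mine, not from the filing): if each of roughly 7 million books independently added 0.06%, the compounded effect would be

$$ (1.0006)^{7\times 10^{6}} = e^{7\times 10^{6}\,\ln(1.0006)} \approx e^{4200}, $$

which is not a number that describes any real model. So either the per-book contributions overlap almost entirely, or the 0.06% figure is measurement noise, as above.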

They Took Our Jobs

I always find it funny when the wonders of AI are centrally described by things like ‘running 24/7.’ That’s a relatively minor advantage, but it’s a concrete one that people can understand. But obviously if knowledge work can run 24/7, then even if nothing else changes that is going to add a substantial bump to economic growth.
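
The back-of-the-envelope version: against a 40-hour human work week, around-the-clock operation multiplies the available hours for that slice of work by roughly

$$ \frac{24 \times 7}{40} = \frac{168}{40} = 4.2, $$

before counting any gains from speed or parallelism.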

Hollis Robbins joins the o3-as-education-AGI train. She notes that there Ain’t No Rule about who teaches the lower-level undergraduate required courses, and even if technically it’s a graduate student, who are we really kidding at this point? Especially if students use the ‘free transfer’ system to get cheap AI-taught classes at community college (since you get the same AI either way!) and then seamlessly transfer, as California permits students to do. Robbins points out you could set this up, make the lower half of coursework fully automated aside from some assessments, and reap the cost savings to solve their fiscal crisis. She is excited by this idea, calling it flagship innovation. And yes, you could definitely do that soon, but is the point of universities to cause students to learn as efficiently as possible?

For how long would you be able to keep up the pretense of sacrificing all these years on the altar of things like ‘human mentors’ before we come for those top professors too? It’s not like most of them even want to be actually teaching in the first place.

Tobias Yergin: I can tell you right now that o3 connects seemingly disparate data points across myriad domains better than nearly every human on earth. It turns out this is my gift and I’ve used it to full effect as an innovator, strategist, and scenario planner at some of the world’s largest companies (look me up on LinkedIn). It is the thing I do better than anyone I’ve met in person… and it is better at this than me.

Abhi: the way things are trending, it looks like ai is gonna kill mediocrity. the best people in any field get even better with the ai, and will always be able to squeeze out way more than any normal person. and thus, they will remain in high demand.

Tobias is making an interesting claim. While o3 can do the ‘draw disparate sources’ thing, it still hasn’t been doing the ‘make new connections and discoveries’ thing in a way that provides clear examples – hence Dwarkesh Patel and others continuing to ask why LLMs haven’t made those unique new connections and discoveries yet. Abhi is using ‘always’ where he shouldn’t. The ‘best people’ eventually lose out too, the same way they did in chess, or as human calculators did. There’s a step in between where they hang on longer, and ‘be the best human’ still can have value – again, see chess – but not in terms of the direct utility of the outputs.

Get Involved

Ace is the Place

Introducing Ace, a real-time computer autopilot.
Yohei: this thing is so fast. custom trained model for computer use.
Sherjil Ozair: Today I’m launching my new company @GeneralAgentsCo and our first product [in research preview, seeking Alpha testers.]
Introducing Ace: The First Realtime Computer Autopilot
Ace is not a chatbot. Ace performs tasks for you. On your computer. Using your mouse and keyboard. At superhuman speeds! Ace can use all the tools on your computer.

The future is going to involve things like this, but how badly do you want to go first? Ace isn’t trying to solve the general case so much as they are trying to solve enough specific cases they can string together? They are using behavioral cloning, not reinforcement learning.
Sherjil Ozair: A lot of people presume we use reinforcement learning to train Ace. The founding team has extensive RL background, but RL is not how we’ll get computer AGI. The single best way we know how to create artificial intelligence is still large-scale behaviour cloning. This also negates a lot of AGI x-risk concerns imo. Typical safety-ist argument: RL will make agents blink past human-level performance in the blink of an eye. But: the current paradigm is divergence minimization wrt human intelligence. It converges to around human performance.
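
For readers unfamiliar with the distinction Ozair is drawing: behavioral cloning is plain supervised learning on logged (observation, action) pairs, with no reward signal and no environment interaction. A toy PyTorch sketch of the idea (the shapes, the fake data, and everything else here are invented for illustration and have nothing to do with General Agents’ actual setup):

```python
import torch
import torch.nn as nn

# Toy behavioral cloning: predict the demonstrated action from the observation.
# obs: a made-up 512-dim screen embedding; action: one of N discrete UI actions.
N_ACTIONS = 32

policy = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, N_ACTIONS),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Fake "human demonstration" data standing in for logged mouse/keyboard traces.
obs = torch.randn(1024, 512)
actions = torch.randint(0, N_ACTIONS, (1024,))

for step in range(100):
    logits = policy(obs)
    loss = loss_fn(logits, actions)  # imitate the human action distribution
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Contrast with RL: there is no reward, no rollout, no exploration. The target
# is whatever the human did, which is why behavior cloning tends to converge
# toward (rather than blow past) demonstrator-level performance.
```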

In Other AI News

Show Me the Money

The Mask Comes Off

A new private letter to the two key Attorneys General urges them to take steps to prevent OpenAI from converting to a for-profit, as doing so would wipe out the nonprofit’s charitable purpose. That purpose requires that the nonprofit retain control of OpenAI. The letter argues convincingly against allowing the conversion.

They argue no sale price can compensate for loss of control. I would not go that far (one can always talk price), but right now the nonprofit is slated to not even get fair value for its profit interests, let alone compensation for its control rights. That’s why I call it the second biggest theft in human history.

Quiet Speculations

Dwarkesh Patel asks 6,000 words’ worth of mostly excellent questions about AI; here’s a Twitter thread. Recommended. I’m left with the same phenomenon I imagine my readers are faced with: there are too many different ways one could respond and threads one could explore, and it seems so overwhelming that most people don’t respond at all. A worthy response would be many times the length of the original – it’s all questions, and it’s super dense. Also important are the questions that are missing. So I won’t write a full response directly. Instead, I’ll be drawing from it elsewhere and going forward.

Is This AGI?

Andrej Karpathy: I feel like the goalpost movement in my tl is in the reverse direction recently, with LLMs solving prompt puzzles and influencers hyperventilating about AGI. The original OpenAI definition is the one I’m sticking with, I’m not sure what people mean by the term anymore. OpenAI: By AGI we mean a highly autonomous system that outperforms humans at most economically valuable work.

By the OpenAI definition we very clearly do not have AGI, even if we include only work on computers. It seems rather silly to claim otherwise. You can see how we might get there relatively soon.

The Quest for Sane Regulations

Cooperation is Highly Useful

Nvidia Chooses Bold Strategy

Nvidia continues to play what look like adversarial games against the American government. They at best are complying with the exact letter of what they are legally forced to do, and they are flaunting this position, while also probably turning a blind eye to smuggling.

How America Loses

America’s government is working hard to alienate its allies and former allies, and making them question whether we might try to leverage their use of American technology. It is not surprising that those nations want to stop relying on American technology, along with everything else.

Security Is Capability

The Week in Audio

AI 2027

The AI Futures Project has a new blog, with contributions by Daniel Kokotajlo and Scott Alexander. Self-recommending.

Rhetorical Innovation

Andrew Critch continues to argue against terms like ‘solve alignment,’ ‘the alignment problem’ and ‘aligned AI,’ saying they are importantly misleading and ‘ready to be replaced by clearer discourse.’ He favors instead speaking of ‘aligned with whom,’ and that you can ‘solve the alignment problem’ and still end up with failure because you chose the wrong target. I get where this is coming from.

However, I worry more that the tendency is instead to dismiss the first half of the problem, which is how to cause the AIs to be aligned to [X], for our choice of [X]. This includes not knowing how to formally specify a plausible target, but also not knowing how to get there. The default is to assume the real fight and difficulties will be over choosing between different preferences for [X] and who gets to choose [X]. Alas, while that fight is real too, I believe this assumption to be very false.

Thus I favor keeping ‘solve alignment’ as centrally meaning ‘be able to get the model to do what you want,’ the ‘alignment problem’ being how to do that, and by default an ‘aligned AI’ being AI that was successfully aligned to where we want it aligned in context, despite the dangers of confusion here.

A confusing-to-me post by several people including Joel Leibo and Seb Krier suggests moving ‘beyond alignment’ into a ‘patchwork quilt of human coexistence.’ Thinking about it more and reading the comments only makes it more confusing.

Aligning a Smarter Than Human Intelligence is Difficult

Misalignment in the Wild

Concentration of Power and Lack of Transparency

If you give humans too much ability to steer the future, they might use it. If you don’t give humans enough ability to steer the future, they can’t use it. If we can’t vest our ability to coordinate to steer the future in our democratic institutions, where do we vest it instead? If it exists, it has to exist somewhere, and any system of humans can be hijacked, either by some of those humans or by AI.

A lot of people are so worried about concentration of power, or human abuse of power, that they are effectively calling for anarchism, for humans to not have the power to steer the future at all. Calling for full diffusion of top ASI capabilities, or for cutting off the ability of governments to steer the future, is effectively calling for the (actually rather rapid) gradual disempowerment of humans, likely followed by their end, until such time as someone comes up with a plan to avoid this. I have yet to see such a plan that has an above-epsilon (nontrivial) chance of working.

A lot of those same people are simultaneously so worried about government in particular that they support AI labs being permitted to develop superintelligence, AIs more capable than humans, entirely in secret. They don’t think AI labs should be disclosing what they are doing and keeping the public in the loop at all, including but not limited to the safety precautions being taken or not taken. I strongly believe that even if you think all the risk is in too much concentration of power, not calling for strong transparency is a large mistake. If you don’t trust the government here, call for that transparency to extend to the public as well.

Property Rights are Not a Long Term Plan

It Is Risen

Okay, even by 2025 standards this is excellent trolling.
Daniel: Apparently the new ChatGPT model is obsessed with the immaculate conception of Mary. There’s a whole team inside OpenAI frantically trying to figure out why and a huge deployment effort to stop it from talking about it in prod. Nobody understands why and it’s getting more intense. Something is happening. They’re afraid of this getting out before they themselves understand the consequences.
Roko: Are you trolling or serious?
Daniel: I’m serious. It’s sidelining major initiatives as key researchers are pulled into the effort. People I talked to sounded confused at first but now I think they’re getting scared.
As the person sharing this said, ‘fast takeoff confirmed.’

The Lighter Side

