Shouldn’t we spend money on AGI safety, just in case?
Yes, but less money than AGI safety advocates would like, and not on typical AGI safety projects

When I argue it’s highly unlikely that a dangerous artificial general intelligence or AGI will be invented in the next decade, one of the most common replies I hear is: shouldn’t we spend money on AGI safety, just in case? My answer is: yes, but much less money than AGI safety advocates would like, and the money should be spent on different things than they want to spend it on.
The AGI safety community (which calls itself the AI safety community, but I’m modifying this lingo to make it clearer) has a ton of specific beliefs about AGI. These beliefs are:
There is a greater than 50% chance of AGI within the next decade
The chance of AGI causing human extinction (or at least killing the large majority of humans) is somewhere between 5% and 95%
AGI will be based primarily on deep learning, and especially on large language models (LLMs)
Progress toward AGI will largely be a result of increasing amounts of compute and/or data used in training AI systems, rather than fundamental breakthroughs in AI research
The primary danger from AGI is that it will think and act very differently from how humans do, and from how humans want it to, with preferences, goals, and values that are very different from our own
While creating AGI is relatively easy, so easy that we may not be able to prevent it from happening even with significant international coordination, making AGI have the same preferences, goals, and values as humans is extremely hard, and, therefore, making AGI safe is extremely hard
These beliefs have mostly emerged out of informal online discussions on forums, blogs, Twitter, and Facebook, and from informal offline discussions with the local AGI safety communities in the San Francisco Bay Area and Oxford.
When members of the AGI safety community, or its sympathizers, say, “Shouldn’t we spend money on AGI safety, just in case?”, what they’re specifically arguing is:
We should spend a lot of money, ideally billions
We should assume the AGI safety community’s specific beliefs about AGI are correct and fund projects that accord with these beliefs
The projects should be run by members of the AGI safety community or people who are on board with their beliefs about AGI
What I’d advocate instead is:
We should initially spend much less money, probably no more than low millions to start with
Rather than assuming that the AGI safety community’s beliefs about AGI are correct or incorrect, funded projects should investigate the potential dangers of AGI from scratch, without assuming anything
This may include investigating whether the AGI safety community’s beliefs about AGI are correct or not
Projects should be run by people who have relevant credentials, such as academic researchers
Research should adhere to academic standards of quality and produce papers that get published in reputable peer-reviewed journals
There should be academic freedom or intellectual freedom, with tolerance of diverse opinions among researchers, such that adhering to the AGI safety community’s beliefs is neither a prerequisite nor disqualifying
To boil this down even further, what I’m saying is:
AGI safety projects should be investigational research
Ideally, they should be academic research, or at the very least adhere to academic standards of rigour, empiricism, sourcing, external review, and overall quality
The sort of questions this research would attempt to answer are:
What sort of technologies (if any) are likely to lead to AGI?
When is AGI likely to be invented? Can this be predicted?
If AGI were invented, would it be existentially dangerous? Would it have human-like goals and preferences, or alien ones?
Some of the difficulties in doing research on these questions are:
AGI hasn’t been invented yet
AI researchers and engineers don’t agree on what technology or AI paradigm (if any) will underlie AGI
There are inherent difficulties in predicting the invention of technologies or discoveries in science, and it’s not clear that this can be done with any accuracy, or if so, how much accuracy
Without specific knowledge of how AGI will be built, it is difficult to say what it will be like, how it will behave, how its cognition will work internally, or whether it will be dangerous
Yet exploring these difficulties could itself be a part of the investigational, academic research I’d like to see funded (rather than typical AGI safety projects).
For the most part, the AGI safety community isn’t really, earnestly wondering if their beliefs are true or not. Their minds are more or less already made up. They believe there is an imminent threat of human extinction from AGI, often putting the probability much higher than 10% within the next decade. Their goal is to do what it takes to prevent this, not find out whether the threat is real. From the point of view of people who aren’t bought into their views, they are jumping the gun.
The AGI safety community’s beliefs emerged in a non-scientific, non-academic way, in a non-scientific, non-academic context. As the AGI safety community has gotten more money and attention, it hasn’t attempted to bring its work into academia or into mainstream, institutional science. Although doing so would be a way for AGI safety ideas to gain credibility, the community has instead preferred to try to directly persuade the general public through projects like AI 2027 and AI in Context, and to lobby for AGI safety legislation and executive orders. The strategy is to influence the public directly and to influence government directly, sidestepping academia and mainstream, institutional science.
I imagine if AGI safety advocates explained their rationale for this, they would say academia and academic science move too slowly to respond to the imminent threat of AGI, which will likely arise in just a few years. There isn’t time to do academic research.
But another reason is that the AGI safety community has a lot of contempt for academia and institutional science, and feels that it is intellectually superior to it. In their 2025 book If Anyone Builds It, Everyone Dies, AGI safety advocates Eliezer Yudkowsky and Nate Soares indict academic AI researchers:
You’re living in a world where [Elon] Musk’s idealistic plans and [Yann] LeCun’s vague assurances were not met by an outrush of horror from the rest of academic science and industry’s engineers.
Imagine if somebody like that, with enough money and power to make their wishes real, announced they were building a nuclear power plant based on that level of theory! Imagine the reactions of the competent veterans who knew it was hard, who could analyze the resulting disaster using mature engineering techniques!
If there aren’t thousands of horrified scientists and engineers leaping up to beg governments to shut down those particular AI labs, it tells you that it’s not just a problem of individuals. It means that whole field of science is in the stage of folk theory and blind optimism.
This is not a recent development for the AGI safety community. Yudkowsky has long opposed academia and mainstream, institutional science. Yudkowsky has also written and spoken about his belief that he’s a unique genius, perhaps entirely unique in the world, and can find no one whose intelligence compares to his own, or at least no one who’s interested in working in AGI safety research. The AGI safety community, of which Yudkowsky is a co-founder and spiritual leader, largely follows Yudkowsky’s lead on this. It is largely contemptuous of academia. And it believes in its superior rationality and intelligence compared to most of the world, including most experts in most fields.
This subcultural baggage is a huge obstacle to the AGI safety community funding and doing the kind of investigational and academic (or aspirationally academic-quality) research that could convince skeptics of their beliefs. If you think academic and institutional science are corrupt or crazy, why would you bother with academic or scientific research? If you think the problem with skeptics is that they’re just fundamentally too stupid or stubbornly irrational to see that you’re obviously right, why would you think this kind of research could change their minds? Better to seek power other ways, like mass media campaigns and lobbying.
Maybe this strategy will work. But a huge problem for the AGI safety community is that people naturally want to be convinced, rather than automatically believe what they’re being told. And when people ask to be convinced, the evidence base that the AGI safety community can use to persuade people is extremely thin. I’ll try to give a list that includes everything important that I’m aware of, after more than a decade of engaging with AGI safety discourse:
Purely theoretical and philosophical writing by people like Yudkowsky or the philosopher Nick Bostrom, which relies a lot on parables, analogies, thought experiments, and sci-fi stories
The famous METR time horizons graph, which exists as a blog post and an unpublished pre-print, whose caveats are typically ignored, and which, as far as I’m concerned, has been debunked in multiple reviews
AI 2027, a mass media campaign based on a model that uses made-up data and botched math
Situational Awareness, a series of blog posts by AI researcher turned hedge fund manager Leopold Aschenbrenner, which are filled with bizarre graphs with only the suggestion of data or even units of measurement, one of which appears to assert that AGI was invented in 2023
Surveys of AI researchers, superforecasters, or others that collect people’s opinions, guesses, or intuitions on AGI, but of course don’t ask them to cite evidence (since they’re just surveys), and which generally seem to reflect opinions that AGI is farther away and less dangerous than the AGI safety community believes
Opinions expressed by some individual AI researchers, CEOs of AI companies, and so on, which, if the surveys are to be believed, aren’t representative of the views of their field
The current generative AI investment boom and the breakout financial success of AI companies like OpenAI and Anthropic, although most investors and analysts who believe in generative AI don’t seem to believe in near-term AGI, and there is a big logical gap between the generative AI investment boom (or bubble) and near-term AGI
Appeals to intuition around large language models (LLMs), along the lines of: “Just compare GPT-3.5 in 2022 to GPT-5.4 Thinking today, it’s obvious AGI is on the horizon!”
Increases in the scores for LLM benchmarks and a similar appeal to intuition
Let’s see, what else? I think that’s about it. If we remove the stuff that’s been discredited or debunked, we can boil it down to:
Philosophical and theoretical arguments
Surveys of AI researchers, superforecasters, etc., which don’t themselves cite evidence and whose results generally do not agree with the AGI safety community
Opinions expressed by some individual AI researchers, CEOs of AI companies, etc., typically expressed without much evidence or argumentation, which are also not representative of the survey data
Recent AI investment, revenue, and valuations
Appeals to intuition about LLM progress
This is really thin evidence. And let’s remember the claim here: that an AI company, probably in California, is more likely than not going to invent either a Machine God or a Machine Devil within the next decade, either ushering in utopia or bringing on the apocalypse. Outside of religious prophecy, conspiracy theories, and esoteric beliefs about UFOs and aliens, I can’t think of a comparably gigantic claim. Evidence this thin wouldn’t be good enough to support much smaller claims, like that PepsiCo is a strong buy, that Gavin Newsom will win the 2028 U.S. presidential election in a landslide, or that Ozempic is a robust treatment for addictions. How could it possibly support the largest, most radical claim in history?
The rejoinder I always get at this point is that if there’s even a small chance of existential danger from AGI, it’s worth investing in AGI safety as a precautionary measure. And this takes us full circle back to the beginning of the article: I agree, but that money should be spent investigating whether or not there is an existential danger from AGI (and related questions), not on projects that assume there is — plus assume many other specific things about AGI besides that — and proceed from there. The money should be spent building an evidence base so that a) we can know the truth (and act accordingly), b) we can understand the topic of AGI better (and, if necessary, use this understanding to guide action), and c) if there’s evidence of danger, convince everyone to get on board with an appropriate response to that danger.
There is an analogy here between AGI safety and other (purported) global catastrophic risks. The largest project in asteroid defense, NASA’s eminently laudable NEO Surveyor space telescope (planned for launch in 2027), is an attempt to gather more information about asteroids in our solar system and identify any specific threats. Volcanologists who worry that a large volcanic eruption could cause an unprecedented global disaster have founded the Global Volcano Risk Alliance. Their main goal is to better monitor volcanoes around the world for signs of danger. Something they’d like to do with enough money is launch a dedicated satellite to monitor volcanoes on the ground. With pandemics, we now all understand how important it is to monitor for outbreaks of new or known, dangerous viruses. All three of these examples are about gathering information. In all three of these areas, scientists are leading the charge, and scientific communities with standards of evidence are in operation.
My suggestion with AGI is analogous to these three other potential global threats: study the issue, gather information, put scientists in charge, and employ the sort of standards of evidence seen in scientific communities. If evidence of risk accumulates, then funding should increase. If it doesn’t, then funding should stay low. This seems like an appropriate response.
Probably the most overlooked idea in the AGI safety discourse is the role that knowledge plays in our ability to mitigate risk. Imagine that, somehow, people in the 1890s had rightly begun to fear that, someday in the future, powerful, city-decimating bombs would be invented. Suppose they began to fear that these bombs would be accidentally detonated, and countries would kill millions of their own citizens. Knowing little to nothing about the science and technology that would later come to underlie these bombs, what could these people actually do to reduce the risk of accidental detonations? Little to nothing, it seems like.
In the following decades, as scientific and technical knowledge around nuclear bombs grew, people’s capacity to prevent accidental detonations grew rapidly. If you have no idea how the bombs will work, and wonder if they might somehow use gravity or ice or energy from other dimensions, you can’t really imagine how a bomb might accidentally go off, or whether that would even be a risk. It’s probably not an exaggeration to say that, in the late 1930s and the 1940s, people’s ability to do something about the risk of accidental nuclear detonations was millions of times higher than in the 1890s.
The AGI safety community believes we already know, pretty specifically, how we will create AGI. But if you believe we don’t know how AGI will be built (if it ever is) even in the most general terms, then it stands to reason our ability to mitigate risk from AGI will rapidly increase as we gain more of that knowledge. It’s typically prudent to plan for risks sooner, but when you completely lack the knowledge to do so, and you expect your knowledge to increase in the future, delay is rational. You can allocate low millions (or less) to studying AGI risk today and if at any point more evidence of AGI risk emerges, or if we acquire more knowledge that could be put to use in reducing AGI risk, then you can increase funding commensurately. If we’re on a trajectory toward AGI in a matter of decades, then it may not be an exaggeration to say the cost-effectiveness of this money will be millions of times higher in the future than today.
You can see my gripe here is that AGI safety advocates or sympathizers invoke the idea of precaution — which we rightly apply to all sorts of risks including asteroids and pandemics — to support something that isn’t supported by the idea of precaution alone. If I say, just in case it saves the world, you should give me a $100,000 grant to study how ideas in the Tao Te Ching could be used to reduce AGI risk, I’m not just asking you to believe in the idea of precaution. I’m asking you to believe in something more specific than that.
I think the most accurate way to characterize the AGI safety community is as a millennialist or pseudo-religious community. The AGI safety community rejected both religion and mainstream, institutional science and then created its own religion, complete with God (aligned AGI), the Devil (misaligned AGI), heaven (the Singularity), and hell (suffering risks or S-risks), based on science fiction and fringe science. It’s sort of missing the point to talk about evidence, rigour, or peer review in that context. The AGI safety community is filled with bizarre, dubious claims, like that AGI was created in 2023. Data is simply invented where it’s needed. Private intuition is invoked to support what public revelation cannot.
Yet the AGI safety community is logically distinct from the concept of existential risk from AGI, and that risk should not be dismissed out of hand with no investigation. There is room for some low level of funding for competent, credentialed, curious researchers to investigate the topic of AGI risk from square one, without taking the AGI safety community’s beliefs as a given. This is a subtle distinction, yet we must grasp such subtle distinctions in order to think clearly about our complex world.


It's wild that Y&S write: "It means that whole field of science is in the stage of folk theory" about AI R&D as if the pontifications by rationalists on the alignment problem and intelligence explosions aren't also in the stage of folk theory. Lol
I posted something relevant to this topic on the Effective Altruism Forum: https://forum.effectivealtruism.org/posts/Zw8AtEKHyPazizQwG/ea-organizations-should-pay-experts-to-peer-review-their