Why Science Became a Church
In 1660, twelve men met in London after a lecture at Gresham College. They decided to form a society for "the promoting of Physico-Mathematicall Experimentall Learning." Two years later, Charles II granted them a royal charter. They chose a motto:
Nullius in verba.
"On the word of no one." Take nobody's word for it.
This was radical. For two thousand years, natural philosophy had operated on authority. Aristotle said it; therefore it was true. The Church endorsed it; therefore it was doctrine. The motto was a declaration of war against this entire tradition.
The Royal Society would not accept claims because someone important made them. They would accept claims because someone demonstrated them. Show us. Let us see. Let us repeat. Let us verify.
Robert Boyle demonstrated his air pump at Society meetings. Members watched the candle flame extinguish in the vacuum. They didn't take Boyle's word that air was necessary for combustion. They saw it. They could build their own pumps and check.
This was the founding compact of modern science: I do not trust you. Show me your data. Let me replicate your experiment. Prove it.
Three and a half centuries later, we are told to "Trust the Science."
Something inverted.
The Scientific Revolution didn't succeed because scientists were virtuous. It succeeded because it built a mechanism that produced truth regardless of virtue.
The mechanism was a trade:
The Compact: We (the scientific community) agree to give you Status (publications, citations, tenure, prizes) if and only if you agree to subordinate your Ego to Reality (publish your methods, accept falsification, let others check your work).
This was alchemy. It took the darkest human drive—status-seeking—and transmuted it into the noblest output: objective truth.
Scientists weren't saints. They were ambitious, competitive, jealous, desperate for recognition. Newton and Leibniz fought viciously over priority for calculus. Rivalry was everywhere. But the compact channeled that rivalry toward verification. You got glory for being right—but only if others could check your work and couldn't prove you wrong.
The incentive was "be first to discover something real." Not "be first to publish something plausible." The difference is everything.
Adam Smith observed that it is not from the benevolence of the butcher that we get our dinner, but from his regard to his own interest. The same insight applies:
It is not from the curiosity of the scientist that we get the truth. It is from their desire for prestige, constrained by a system that only awards prestige for verified discovery.
Break the constraint, and you keep the prestige-seeking but lose the truth.
The compact had specific architectural features. Each feature served a function:
| Feature | Function | Why It Worked |
|---|---|---|
| Publication | Make claims public | You can't verify secrets. Public claims can be checked. |
| Methods disclosure | Enable replication | If you won't show how you did it, no one can verify you did it. |
| Peer review | Adversarial filter | Your rivals check your work. They want to find errors. |
| Replication | Actual verification | Others repeat your experiment. If it fails, you're wrong. |
| Priority | Incentive to publish | First to publish gets credit. Racing accelerates discovery. |
| Citation | Credit trail | You must acknowledge prior work. Others can trace lineage. |
Notice what these features have in common: distrust.
Every feature assumes scientists will cheat, exaggerate, or fool themselves if given the chance. The architecture doesn't prevent cheating through moral education. It prevents cheating by making cheating detectable.
The system assumed bad actors and designed around them. This is why it worked.
Each architectural feature can decay. When a functional mechanism becomes a symbolic ritual, the form persists while the function dies. This is the liturgical failure mode.
Peer review. Original function (1665): Filter for scarce resources. Philosophical Transactions could only print so many pages. Someone had to decide what was worth the paper. Reviewers asked: "Is this worth printing? Is the method sound? Are there obvious errors?"
Current ritual: Paradigm enforcement. Reviewers ask: "Does this threaten my field? Does it cite the right people? Does it use approved methods? Does it reach conclusions that won't embarrass us?"
The replication crisis proves the filter isn't working. The Open Science Collaboration (2015) attempted to replicate 100 psychology studies published in top journals. Only 36% replicated. These papers passed peer review. Peer review didn't catch the problem because peer review wasn't checking for replicability. It was checking for publishability—a different criterion.
Citation. Original function: Breadcrumb trail. If you use someone's work, cite it so others can trace the lineage, check the source, verify the foundation.
Current ritual: Currency. Citations are counted, measured, compared. They determine hiring, tenure, grants. A currency that can be counted can be gamed.
Citation cartels form: "I cite you, you cite me." This is counterfeiting. The citation looks like intellectual credit but represents mutual back-scratching. The form (citation) persists; the function (verification trail) is absent.
Hypothesis testing. Original function: Betting against yourself. You state a prediction. You test it. If nature says no, you were wrong. The p-value was supposed to measure how surprising your data would be if there were no real effect. It was a confession mechanism: "I tried to prove myself wrong; I couldn't."
Current ritual: p-Hacking. Torture the data until it confesses to p < 0.05. Run multiple analyses, report the one that worked. Adjust hypotheses after seeing results. The form remains ("statistically significant"); the function (risk of falsification) is gone.
Simmons, Nelson, and Simonsohn demonstrated in "False-Positive Psychology" (2011) that using standard "researcher degrees of freedom"—choices about outliers, covariates, stopping rules—they could produce statistically significant evidence that listening to "When I'm Sixty-Four" makes people younger. The ritual worked. The mechanism was dead.
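To see the arithmetic, here is a minimal simulation, in the spirit of Simmons et al. but not their actual analysis, of two common researcher degrees of freedom: measuring two correlated outcomes and reporting whichever is significant, plus optional stopping. Even with no true effect anywhere, the false-positive rate lands far above the nominal 5%.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def one_null_study(n_initial=20, n_extra=10, alpha=0.05):
    """One study where the true effect is zero, run with two researcher
    degrees of freedom: (1) two correlated outcome measures, reporting
    either one; (2) optional stopping (add subjects and retest)."""
    def any_significant(a, b):
        return any(stats.ttest_ind(a[:, i], b[:, i]).pvalue < alpha
                   for i in range(2))

    cov = [[1.0, 0.5], [0.5, 1.0]]  # two outcomes, correlated at r = 0.5
    a = rng.multivariate_normal([0, 0], cov, n_initial)
    b = rng.multivariate_normal([0, 0], cov, n_initial)
    if any_significant(a, b):
        return True
    # Nothing significant? Collect more data and try again.
    a = np.vstack([a, rng.multivariate_normal([0, 0], cov, n_extra)])
    b = np.vstack([b, rng.multivariate_normal([0, 0], cov, n_extra)])
    return any_significant(a, b)

trials = 10_000
rate = sum(one_null_study() for _ in range(trials)) / trials
print(f"Nominal alpha: 0.05; realized false-positive rate: {rate:.3f}")
# Prints a rate well above 0.05, roughly two to three times nominal,
# from just two of the many degrees of freedom Simmons et al. catalog.
```

Each choice looks innocent in isolation; stacked, they let a determined researcher extract "significance" from noise.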
Replication. Original function: Actual checking. Others repeat your experiment. If they get different results, something is wrong—with your experiment, theirs, or the underlying claim.
Current ritual: Almost never performed. Replications don't get published in top journals ("not novel"). They don't earn grants ("not original research"). They don't advance careers. The incentive structure actively punishes the one activity that would verify whether any of this is true.
We are funding the Liturgy of Science, not the Architecture of Science.
The robes are worn. The Latin is spoken. The rituals are performed. The planes don't land.
Somewhere between 1660 and now, the motto inverted.
| Nullius in Verba (1660) | "Trust the Science" (2020) |
|---|---|
| I don't trust you; show me | Trust the experts |
| Verification is the point | Verification is "misinformation" |
| Status for being checked and right | Status for being credentialed |
| Science as Market | Science as Church |
Consider the phrase "Trust the Science." What does it mean?
If science is a method, then "trust" is irrelevant. You verify. The method produces results regardless of whether you trust the practitioner. That's the point of having a method.
If science is a social compact, then trust is the output of the architecture working correctly—not an input. You trust the results because the system forces verification. The trust is earned, not demanded.
"Trust the Science" asks for faith. It asks you to trust a class of people (those with credentials) rather than a mechanism of verification. This is precisely what the Royal Society was founded to reject.
The phrase is the negation of the motto. It is the inversion.
In May 2020, The Lancet published a study claiming hydroxychloroquine increased mortality in COVID patients. The study used data from a company called Surgisphere. Within days, outside researchers noticed anomalies: the data claimed more COVID deaths in Australian hospitals than Australia reported nationally. The African data showed patterns inconsistent with African medical infrastructure.
The study passed peer review at one of the world's most prestigious medical journals. The anomalies were caught by people on Twitter and in preprint discussions—the informal, uncredentialed, "unverified" parts of the system.
The Lancet retracted the paper. But notice what happened: the liturgy (peer review) failed; the architecture (adversarial checking by motivated outsiders) worked.
When someone objected to the original publication, they could be told to "trust the peer review." When they persisted, they were doing exactly what the Royal Society intended: refusing to take anyone's word for it.
"Trust the Science" would have silenced the correction. "Nullius in verba" enabled it.
If the compact worked, why did it stop working?
The answer is thermodynamic: Truth is high-entropy to a social system.
New truth disrupts. It invalidates textbooks. It threatens careers built on old paradigms. It forces restructuring. It creates conflict. It requires people to admit they were wrong.
Palatability is homeostasis. It maintains existing relationships, preserves existing hierarchies, avoids conflict. A palatable finding doesn't threaten anyone. It can be absorbed without restructuring.
Institutions naturally drift toward homeostasis unless forced otherwise. The forcing function was scarcity.
Early science operated in an environment where being wrong was expensive. The environment provided feedback: bad physics produced bad outcomes, and the bad outcomes were visible, immediate, and attributable. Selection pressure maintained the compact.
Modern science operates differently. Feedback is delayed, diffuse, and deniable, and the selection pressure that maintained the compact has relaxed. You can get status without delivering truth. Once you can get status without sacrificing ego, the alchemy engine stops transmuting narcissism into physics. It just produces narcissism.
Here's the puzzle: everything I've described is documented. Kuhn wrote about paradigm defense in 1962. Merton codified the norms of science; Mitroff documented the counter-norms. The replication crisis has been studied extensively. Sociologists of science have analyzed these dynamics for decades.
Why doesn't the critique propagate?
Because the system contains its own critique by professionalizing it.
Science and Technology Studies (STS) exists. Sociology of Scientific Knowledge (SSK) exists. These fields have journals, conferences, tenure lines. They document exactly the dynamics described in this essay.
But scientists don't read them. It's not their field. The critique is contained within a specialty, and the specialty has its own boundaries. A physicist who read STS literature would be stepping outside their lane—doing something professionally unrewarded and potentially suspicious.
The system neutralizes critique by granting it territory. "We have an STS department" is like "we have a bio-ethicist on the board." It means: "We have contained the ethical critique. It has been professionalized. It lives in its box. It doesn't threaten operations."
The recursion trap: a system cannot fully model itself without crashing. If physicists integrated the insights of sociology—that their truth-claims are partially socially constructed, that their boundary-work is tribal, that their paradigms defend themselves—they would lose the "God's Eye View" that gives them authority.
To maintain authority, Science must treat itself as Subject (the watcher), never as Object (the watched). The critique exists but is quarantined where it can't infect operations.
There are two ways to respond to institutional failure:
The Saint View: Science fails because scientists are greedy, careless, or dishonest. The solution is ethics training, integrity pledges, better people. This is the disposition move—change the agents.
The Gamer View: Scientists are rational players maximizing their score. If the game rewards p-hacking, they will p-hack. If citations count for tenure, they will game citations. If replications aren't published, they won't replicate. This is the architecture move—change the rules.
The Saint View has been tried for decades. Every university has research ethics training. Every journal has integrity guidelines. The replication crisis persists. p-Hacking persists. Citation gaming persists.
The disposition move failed because disposition doesn't scale. You can't make everyone a saint. You can make everyone respond to incentives.
This connects to Ethics Is an Engineering Problem: 3,000 years of disposition training produced no reliable improvement. Architectural constraints (markets, courts, verification mechanisms) produced dramatic improvement. The pattern holds here.
Don't try to reform the priests. Build a mechanism where the priest only eats if it rains.
The original compact worked because it aligned incentives with truth. The current system misaligns them. The fix is architectural.
In a prediction market, you bet real money on outcomes. If you're wrong, you lose. This creates the skin in the game that academic discourse lacks.
Robin Hanson proposed prediction markets for science: researchers bet on whether results will replicate. If you're confident in your finding, bet on it. If replication fails, you lose money. The market aggregates information from everyone who has an opinion and forces them to pay for being wrong.
DARPA's Policy Analysis Market (PAM) was canceled in 2003 after political outcry—it looked like "betting on terrorism." The rejection is itself evidence for the claim: the mechanism would have revealed uncomfortable truths, so it was killed.
Prediction markets and forecasting platforms exist now (Polymarket, Metaculus). They have repeatedly matched or outperformed expert panels. The mechanism works. Adoption is slow because it threatens incumbents.
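As a concrete sketch of the mechanics, here is Hanson's logarithmic market scoring rule (LMSR) for a binary claim such as "this finding will replicate." The class and the numbers are illustrative assumptions, not any real platform's API.

```python
import math

class LMSRMarket:
    """Hanson's logarithmic market scoring rule for a binary claim.
    Traders buy YES/NO shares; prices act as probabilities and move
    with demand. A share pays 1 unit if its outcome occurs, else 0."""

    def __init__(self, liquidity=100.0):
        self.b = liquidity                 # higher b = deeper market
        self.q = {"YES": 0.0, "NO": 0.0}   # outstanding shares

    def _cost(self):
        return self.b * math.log(sum(math.exp(x / self.b)
                                     for x in self.q.values()))

    def price(self, outcome):
        """Current share price = the market's implied probability."""
        total = sum(math.exp(x / self.b) for x in self.q.values())
        return math.exp(self.q[outcome] / self.b) / total

    def buy(self, outcome, shares):
        """Buy shares of an outcome; returns what the trader pays."""
        before = self._cost()
        self.q[outcome] += shares
        return self._cost() - before

market = LMSRMarket()
print(f"P(replicates) = {market.price('YES'):.2f}")           # 0.50
paid = market.buy("NO", 80)   # a skeptic bets against replication
print(f"Skeptic pays {paid:.1f} for 80 NO shares; "
      f"P(replicates) drops to {market.price('YES'):.2f}")
```

The design point is that anyone can move the price, but only by taking a position that loses money if they're wrong. Opinion becomes costly; confidence becomes measurable.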
In a pre-registered study, you specify your hypothesis, methods, and analysis plan before collecting data. The protocol is public and timestamped. You can't move the goalposts after seeing results.
This kills p-hacking by construction. If you pre-register "I will test hypothesis X with method Y and analysis Z," you can't later pretend you were testing hypothesis W all along.
Registered Reports go further: journals accept papers based on the methods section before results are known. This eliminates publication bias (the tendency to publish only positive results). If the method is good, the result gets published regardless of whether it's interesting.
The Open Science Framework (OSF) provides infrastructure. Adoption is growing but remains a minority practice. The majority of published science is not pre-registered.
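What makes the commitment binding is that it is public and made before the data exist. Here is a minimal sketch of that step, using a content hash as the commitment; the protocol fields are invented for illustration, and this is not how OSF's actual registration service is implemented:

```python
import hashlib
import json
from datetime import datetime, timezone

# A pre-registration is a frozen, public commitment made before data
# collection. These fields are hypothetical, not the OSF schema.
protocol = {
    "hypothesis": "X increases Y in population P",
    "primary_outcome": "Y, measured by instrument Z",
    "sample_size": 200,   # fixed in advance: no optional stopping
    "analysis": "two-sided t-test, alpha = 0.05, no covariates",
    "exclusions": "discard responses completed in under 30 seconds",
}

# Hash the canonical form and publish the digest (or the full text).
# Any later deviation is detectable: the reported analysis either
# matches the committed protocol or it doesn't.
canonical = json.dumps(protocol, sort_keys=True).encode()
digest = hashlib.sha256(canonical).hexdigest()
print(f"Registered at {datetime.now(timezone.utc).isoformat()}")
print(f"Protocol digest: {digest}")
```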
Daniel Kahneman proposed adversarial collaboration: researchers who disagree design a study together, specify in advance what results would change their minds, and commit to publishing regardless of outcome.
This forces engagement with the strongest counterarguments. You can't ignore your opponent when you're designing the study with them. You can't move goalposts when they're watching.
Adversarial collaboration requires ego subordination. It's expensive and rare. But when it happens, it produces unusually reliable results—precisely because both sides are trying to prove the other wrong.
Currently, replications are unrewarded. What if they were rewarded?
A replication bounty system would pay researchers to replicate published findings. If the replication succeeds, modest payout. If it fails, larger payout (you discovered something important). The original researchers could post bonds against their claims—skin in the game.
This inverts the current incentive. Currently, you're rewarded for novel claims regardless of truth. With bounties, you're rewarded for checking claims—and especially for catching false ones.
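A sketch of the incentive arithmetic, with every dollar figure invented for illustration:

```python
# Hypothetical replication-bounty payoffs (all numbers are assumptions).
BOND = 1_000           # posted by the original author against the claim
CONFIRM_PAYOUT = 300   # replicator's reward if the finding holds up
REFUTE_PAYOUT = 1_500  # larger reward for catching a false claim
REPLICATION_COST = 400

def replicator_ev(p_replicates):
    """Expected profit from attempting a replication."""
    return (p_replicates * CONFIRM_PAYOUT
            + (1 - p_replicates) * REFUTE_PAYOUT
            - REPLICATION_COST)

def author_ev(p_replicates, status_value=2_000):
    """Author's expected value of publishing: status if the claim
    survives checking, forfeited bond if it doesn't."""
    return p_replicates * status_value - (1 - p_replicates) * BOND

for p in (0.9, 0.5, 0.2):
    print(f"P(replicates)={p:.1f}: replicator EV={replicator_ev(p):+6.0f}, "
          f"author EV={author_ev(p):+6.0f}")
# Checking is profitable at every probability shown; publishing is
# only profitable when the author believes the claim will survive.
```

Under these (invented) numbers, shaky claims become negative expected value for the author and positive expected value for the checker, which is exactly the inversion the current system lacks.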
All these mechanisms share a structure: they make truth-seeking behavior incentive-compatible regardless of virtue.
You don't need scientists to be saints. You need a system where selfish scientists doing selfish things produce truth as a byproduct. That's what the original compact achieved. These mechanisms attempt to restore it.
Science is the steelman case.
Of all institutions claiming to produce truth, science has the strongest architecture. Explicit verification norms. Replication tradition. Mathematical and empirical grounding. Historical track record of producing discoveries that work—bridges that stand, vaccines that protect, technologies that function.
If science has this problem, every weaker institution has it worse.
| Institution | Claims to Produce | Verification Mechanism | Actual Selection Pressure |
|---|---|---|---|
| Science | Truth about nature | Replication (in theory) | Publications, citations, grants |
| Journalism | Truth about events | None systematic | Clicks, engagement, narrative fit |
| Think Tanks | Policy truth | None (explicitly advocacy) | Donor satisfaction, influence |
| Intelligence | Strategic truth | Classification (anti-verification) | Institutional survival, budget |
| Academia (non-STEM) | Humanistic truth | Peer review (tribal) | Paradigm conformity, status |
| Corporate R&D | Profitable truth | Market (eventually) | Quarterly results, internal politics |
Science at least had the architecture. The other institutions never did, or had weaker versions. Journalism never had replication. Think tanks were always advocacy organs with research aesthetics. Intelligence agencies invert publication—they classify rather than share, making verification impossible by design.
There's also a crucial divide within science itself: reality's veto power varies by field.
The replication crisis is concentrated in fields where reality can't enforce. Psychology: 36% replication. Cancer biology: similar. Physics: the compact still mostly holds because reality kills bad physics quickly. The solution for soft fields: import hard constraints (prediction markets, pre-registration) to simulate reality's veto.
This essay focused on science because it's the clearest case, but the mechanism is universal: wherever status can be earned without verification, liturgy displaces architecture. The pattern is the same across institutions. Science is just where it's most visible, because science had the most to lose.
The Macro Magisterium documents this pattern in economics. Cargo Cult Epistemology documents it in general discourse. Belonging Is Axiology explains why the pattern is so stable—because truth was never the terminal value.
This essay is about science specifically. The phenomenon is about truth-claiming institutions generally. Science is the proof of concept: if even the best institution decays, institutional decay is the default, not the exception.
Lies have a thermodynamic cost. They generate entropy—social friction, bad policy, misallocated resources, systems built on false foundations.
A bridge built on bad physics falls down. A medical treatment based on unreplicated findings harms patients. A policy based on false social science fails. The costs are real, even when delayed.
The current system defers these costs. It launders them through complexity and time. But thermodynamics doesn't care about laundering. Eventually, the bill comes due.
The question is whether we rebuild the truth-forcing architecture before the failures compound, or whether we wait for a crisis large enough to force reconstruction.
History suggests crises do the forcing. The replication crisis triggered reform movements (Open Science, pre-registration). Catastrophic failures (Theranos, various drug recalls, retracted papers that influenced policy) slowly erode faith in the liturgy.
But crisis-driven reform is expensive. It happens only after damage is done. Proactive architectural reform would be cheaper. It would also threaten incumbents, which is why it doesn't happen.
The Royal Society was founded on institutionalized distrust. "Nullius in verba"—take nobody's word for it—was a mechanism design choice, not a slogan. It built a system where status-seeking was channeled toward verification.
That system is broken. The mechanisms have decayed into rituals. The forms persist—peer review, citation, hypothesis testing—but the functions are gone. Status now flows to those who perform science, not those who do science.
"Trust the Science" is the inversion of the founding motto. It asks for faith in credentials rather than confidence in verification. It transforms science from a market (where claims compete and are tested) into a church (where authorities pronounce and are believed).
The fix is not "better scientists" but better architecture. Prediction markets, pre-registration, adversarial collaboration, replication bounties—mechanisms that force truth-seeking regardless of individual virtue. The original compact harnessed narcissism to produce physics. We need mechanisms that do the same.
The critique will be contained. This essay will be filed under "Science Studies" or "Anti-Science" or "Philosophy"—some category that keeps it from infecting operations. The containment is itself evidence for the claim.
The planes will keep not landing until we stop funding the liturgy and start funding the architecture.
Nullius in verba. It was right the first time.
Key Takeaways

- The Royal Society's compact traded status for ego subordination: prestige only for claims others could check.
- The compact's mechanisms (peer review, citation, hypothesis testing, replication) have decayed into liturgy: the forms persist while the functions are gone.
- "Trust the Science" inverts "Nullius in verba": faith in credentials instead of confidence in verification.
- The fix is architectural, not moral: prediction markets, pre-registration, adversarial collaboration, and replication bounties re-align status with truth.
This essay synthesizes several research traditions: Kuhn on paradigm defense, Merton on the norms of science, the replication-crisis literature, and the sociology of scientific knowledge (STS/SSK).
The contribution here is synthesis and framing: the "compact" framing, the "liturgy vs. architecture" distinction, the connection to thermodynamic homeostasis, and the containment mechanism explanation for why the critique doesn't propagate.
Related essays: Ethics Is an Engineering Problem, The Macro Magisterium, Cargo Cult Epistemology, and Belonging Is Axiology.
This essay is part of Aliveness—a framework for understanding what sustains organized complexity over time. The framework itself is a proof of concept: outsider synthesis that would not have emerged from any single academic department.