Here’s a summary of the strongest arguments, structured so each one states the Popperian position being challenged and then the criticism.
1. Salmon’s Corroboration Dilemma
Popper’s position. Science proceeds by conjecture and refutation. We propose bold theories, attempt to falsify them, and when they survive severe tests, we say they are “corroborated.” Crucially, corroboration is not confirmation — it does not raise the probability that the theory is true or that it will succeed in future applications. Corroboration is merely a backward-looking report card: this theory has survived these tests so far. Popper insists this is purely deductive — no inductive inference is involved at any stage.
Salmon’s critique. Wesley Salmon (The Foundations of Scientific Inference, 1967; “Rational Prediction,” 1981) pressed a simple but devastating question: why should we prefer well-corroborated theories for practical decisions?
Suppose you need to build a bridge. Two theories of materials science are available. Theory A has been severely tested and survived — it is highly corroborated. Theory B was proposed yesterday and has never been tested. Popper says corroboration confers no expectation of future success. If that’s true, you have no rational basis for choosing Theory A over Theory B for your bridge. You might as well consult an astrologer.
Salmon formulated this as a dilemma:
- Horn 1: If corroboration really confers no rational expectation about future performance, then Popper's methodology gives scientists no reason to rely on well-tested theories. Modus tollens without corroboration is empty — you can eliminate falsified theories but have no basis for preferring among the survivors.
- Horn 2: If corroboration does provide rational grounds for preferring well-tested theories for future application, then corroboration is functioning as inductive support under a different name. Modus tollens with corroboration is just induction.
Either way, Popper’s program fails on its own terms. Salmon’s formulation — “modus tollens without corroboration is empty; modus tollens with corroboration is induction” — has not been satisfactorily answered by Popperians in the nearly six decades since it was published. Popper’s own responses tended to reassert that corroboration is backward-looking, but this doesn’t resolve why a rational agent should use backward-looking assessments to guide forward-looking decisions unless some inductive principle is operating.
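The deductive core Popper permits is easy to state, which makes Horn 1 vivid. A minimal formalization of the falsifying inference (my notation, not Salmon's):

```latex
% From theory T, deduce a test prediction O; the test shows O is false.
\frac{T \rightarrow O \qquad \neg O}{\neg T} \quad (\text{modus tollens})
% The schema is purely eliminative: it licenses discarding T, but says
% nothing about the future performance of theories not yet refuted.
```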
Strength of engagement: Salmon represented Popper’s position accurately and even sympathetically. He acknowledged the genuine insight in falsificationism before showing the gap. This is considered by many philosophers of science to be the single strongest challenge to Popper’s anti-inductivism.
2. Putnam’s “Inductivist Quaver”
Popper’s position. Scientific theories gain credibility through bold, risky predictions that could have falsified them but didn’t. A theory that predicts something surprising and turns out to be right has been severely tested. This is what distinguishes science from pseudoscience — genuine scientific theories stick their necks out.
Putnam’s critique. Hilary Putnam (“The ‘Corroboration’ of Theories,” 1974) pointed out a historical counterexample that strikes at the heart of this picture. Newton’s theory of universal gravitation — one of the greatest achievements in the history of science — was not accepted because it made bold novel predictions that were then tested. It was accepted because it explained things people already knew: planetary orbits (Kepler’s laws), the tides, the behavior of falling objects. These weren’t risky predictions — they were established phenomena that Newton’s framework unified.
On a strict Popperian account, the fact that Newton’s theory accommodated already-known data shouldn’t count for much. After all, any clever theorist can construct a theory to fit existing data. Popper emphasized novel predictions — predictions made before the data was in. But historically, the most important evidence for Newton was old evidence, not new predictions.
Putnam then examined how Popper actually discussed theory preference and detected what he called “an inductivist quaver” — moments where Popper’s reasoning about why we should prefer well-corroborated theories implicitly relied on the very inductive logic he officially rejected. The logical structure of “this theory has passed severe tests, therefore it is more worthy of tentative acceptance” is, Putnam argued, indistinguishable from inductive reasoning regardless of what label you attach to it.
Strength of engagement: Putnam opened with genuine respect for Popper’s influence and worked from a concrete historical case rather than abstract logic, making the argument difficult to deflect with technical maneuvers. The Newton example is particularly effective because Popper himself held Newton in high regard.
3. Carnap’s Mixed-Quantifier Argument Against the Falsification/Verification Asymmetry
Popper’s position. There is a fundamental logical asymmetry between falsification and verification. A universal statement like “all swans are white” can never be verified (you’d have to check every swan that ever exists) but can be conclusively falsified by a single black swan. This asymmetry is the foundation of Popper’s entire methodology — it’s why he favored falsification over confirmation as the engine of science.
Carnap’s critique. Rudolf Carnap (in the Schilpp volume The Philosophy of Rudolf Carnap, 1963) identified a technical but profound problem: Popper’s asymmetry only holds for sentences of a simple universal form (“for all x, Fx”). But most actual laws of physics aren’t of this form. They involve mixed quantifiers — sentences of the form “for all x, there exists a y such that…”
Consider the law “every material has a melting point.” This says: for every substance x, there exists a temperature y such that x melts at y. To falsify this, you’d need to show that some substance has no melting point at any temperature — which requires testing every possible temperature, an infinite task. To verify it for a specific substance, you just need to find one temperature where it melts.
For sentences with this mixed-quantifier structure, falsification is as difficult as verification. The clean asymmetry that Popper’s entire methodology depends on simply vanishes. And since real physical laws routinely involve existential quantifiers (limits, equilibrium points, conservation quantities), this isn’t an exotic edge case — it’s the norm in actual science.
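To make the contrast explicit, here is the quantifier structure of the two cases in standard first-order notation (the predicate names are illustrative):

```latex
% Strictly universal form: a single counterinstance refutes it.
\forall x\,(\mathrm{Swan}(x) \rightarrow \mathrm{White}(x))
% is falsified by one observation:
\mathrm{Swan}(a) \wedge \neg\mathrm{White}(a)

% Mixed-quantifier form: "every material has a melting point".
\forall x\,\exists y\,\mathrm{MeltsAt}(x,y)
% Falsifying it means establishing its negation,
\exists x\,\forall y\,\neg\mathrm{MeltsAt}(x,y)
% which itself quantifies universally over all temperatures y:
% no finite set of observations entails it.
```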
According to historian A.W. Carus, neither Popper nor any of his followers ever directly addressed this argument. It remains one of the most technically precise and under-discussed challenges to Popper’s foundations.
Strength of engagement: This is a formal logical argument that attacks Popper on his own preferred terrain — the logical structure of scientific statements. It doesn’t require any controversial philosophical premises, just careful attention to quantifier structure.
4. Mayo’s Error-Statistical Completion of Severe Testing
Popper’s position. Science progresses through severe testing. A good test is one that a theory would be likely to fail if it were false. The more severe the tests a theory survives, the more corroborated it is. Popper explicitly required that tests be severe — a sycophantic “test” designed to confirm a theory doesn’t count. He even proposed that we can measure a theory’s testability by its degree of falsifiability.
However, Popper insisted this entire framework could operate without any notion of probability or inductive inference. Corroboration is not confirmation. Severe testing does not raise the probability that a theory is true.
Mayo’s critique. Deborah Mayo (Error and the Growth of Experimental Knowledge, 1996; Statistical Inference as Severe Testing, 2018) argued that Popper’s insight about severity was exactly right — but that he was unable to cash it out formally because he refused to use the mathematical tools required.
The problem: how do you determine whether a test is actually severe? Popper’s answer was essentially qualitative — a theory is more severely testable if it makes more precise predictions, prohibits more possible observations, etc. But this leaves crucial questions unanswered. Suppose a theory predicts a measurement of 5.0 ± 0.1 and you observe 5.05. Has the theory passed a severe test? That depends on things like: how precise is your instrument? What’s the probability you’d see 5.05 even if the theory were false? How much experimental noise is there?
Answering these questions requires error probabilities — the probability that your test procedure would produce a given result under various hypotheses. This is precisely the statistical machinery of Neyman-Pearson testing, which involves reasoning about how often a procedure would err. And this, Mayo argued, is inductive reasoning — reasoning from the performance characteristics of a procedure to a conclusion about the case at hand.
Mayo’s claim is that Popper saw the right destination (severe testing as the key to scientific method) but refused to take the only road that leads there (error-probabilistic reasoning). She characterized this as Popper making a “bait and switch” — getting assent to the intuitive idea that severity matters, then delivering a framework that cannot formally distinguish severe tests from non-severe ones. Her own error-statistical framework fills the gap by defining severity precisely: a claim passes a severe test when the test had a high probability of detecting the specific flaw being probed, and the claim survived anyway.
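As a rough illustration of what her framework adds, here is a minimal sketch of a severity calculation for the measurement example above. The setup, function, and numbers are all assumptions for illustration (Gaussian noise with known sigma, a one-sided discrepancy claim), not Mayo's own notation:

```python
# Sketch: how severely do the data support the claim that the true
# value exceeds the predicted 5.0 by less than a discrepancy d?
from scipy.stats import norm

def severity(x_obs: float, mu0: float, sigma: float, d: float) -> float:
    """Probability the procedure would have produced a result fitting
    the claim 'mu <= mu0 + d' WORSE than x_obs (i.e., a larger reading)
    if that claim were false, evaluated at the boundary mu = mu0 + d.
    High severity: the probed flaw would almost surely have shown up."""
    return norm.sf(x_obs, loc=mu0 + d, scale=sigma)

# Precise instrument: a discrepancy of 0.1 would almost surely have
# produced a worse-fitting result, so the claim passes severely.
print(severity(x_obs=5.05, mu0=5.0, sigma=0.02, d=0.1))  # ~0.99
# Noisy instrument: the same observation barely probes the flaw.
print(severity(x_obs=5.05, mu0=5.0, sigma=0.50, d=0.1))  # ~0.54
```

The same observed value passes severely or not depending on the error characteristics of the procedure, which is exactly the information Popper's qualitative account has no way to register.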
Strength of engagement: Mayo is arguably the fairest critic in this literature because she explicitly positions herself as completing Popper’s project rather than demolishing it. She carefully distinguishes “naive” Popper (the textbook straw figure) from the more sophisticated philosopher, credits his insights about severity as foundational, and presents her alternative as what Popper should have said rather than what he’s wrong about.
5. Howson’s Bayesian Solution That Partly Vindicates Popper
Popper’s position. The problem of induction (Hume’s problem) is insoluble. There is no way to justify inductive inference — no way to show that observing regularities in the past gives rational grounds for expecting them to continue. Popper accepted Hume’s negative conclusion completely and proposed that science doesn’t need induction because it can proceed purely through conjecture and refutation.
Howson’s critique. Colin Howson (Hume’s Problem: Induction and the Justification of Belief, 2000) developed the remarkable position that Popper was right about Hume but wrong about Bayesianism.
Howson agreed with Popper that there is no non-circular justification for induction. He then argued that Bayesian probabilistic reasoning doesn’t require such a justification. Bayesianism provides a logic for updating beliefs in light of evidence — Bayes’ theorem is a deductive consequence of the probability axioms — without making any claim that this logic is “justified” in the foundational sense Hume demanded. Just as deductive logic tells you what follows from your premises without justifying the premises themselves, Bayesian inference tells you how to update your credences without justifying your starting credences (priors).
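The deductive status of the updating rule is easy to check: Bayes' theorem falls out of the definition of conditional probability in two steps.

```latex
% Definition of conditional probability (assuming P(e) > 0):
P(h \mid e)\,P(e) \;=\; P(h \wedge e) \;=\; P(e \mid h)\,P(h)
% Divide both sides by P(e):
P(h \mid e) \;=\; \frac{P(e \mid h)\,P(h)}{P(e)}
% No assumption about the uniformity of nature appears anywhere;
% only the probability axioms are used.
```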
This dissolves Popper’s main motivation for rejecting probabilistic reasoning. Popper rejected Bayesianism because he thought it required solving the problem of induction — that you’d need some inductive warrant to show that updating on evidence is rational. Howson argued this conflates two different things: the problem of justifying induction as a general principle (which Howson agrees is insoluble) and the problem of whether probabilistic inference is internally coherent and useful (which Howson argues it is).
Additionally, Howson and Urbach (Scientific Reasoning: The Bayesian Approach, 1989/2006) showed through detailed analysis that Popper’s concept of corroboration — when you work out its formal properties — is essentially a monotonic transformation of Bayesian posterior probability. Corroboration tracks the same evidential relationships that Bayesian confirmation does. The two programs agree on which evidence supports which theories; they disagree only on what to call it.
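The flavor of that result can be seen with a simplified corroboration-style measure (an illustration in Popper's spirit, not his exact definition, which carries extra normalizing terms):

```latex
% A simple support measure, with x = p(e \mid h) and a = p(e) fixed:
c(h,e) \;=\; \frac{p(e \mid h) - p(e)}{p(e \mid h) + p(e)}
       \;=\; \frac{x - a}{x + a},
\qquad \frac{\partial c}{\partial x} \;=\; \frac{2a}{(x+a)^2} \;>\; 0
% so c is strictly increasing in p(e \mid h). By Bayes' theorem,
p(h \mid e) \;=\; \frac{p(e \mid h)\,p(h)}{p(e)}
% is also strictly increasing in p(e \mid h) for fixed p(h) and p(e).
% The two quantities therefore rank evidence identically, which is the
% monotone relationship Howson and Urbach point to.
```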
Strength of engagement: Howson was at the LSE (Popper’s department) and knew Popper’s work intimately. His strategy of partly agreeing with Popper — yes, Hume’s problem is insoluble — before showing this doesn’t lead where Popper thought it led, is a sophisticated form of engagement. He even defended Popper against weak attacks (notably defending the Popper-Miller theorem against a bad objection by Dunn and Hellman) before offering his own independent critique.
6. Zamora Bonilla: Popper’s Own Measures Are Inductivist
Popper’s position. Popper spent considerable effort developing formal measures of corroboration, testability, and verisimilitude (truthlikeness). These measures were designed to capture, in logical and mathematical terms, his anti-inductivist methodology. A theory’s corroboration is a function of its logical content, the severity of the tests it has passed, and related factors — but crucially, it is not a probability and does not involve inductive reasoning.
Zamora Bonilla’s critique. Jesús Zamora Bonilla (“On Popper’s Strong Inductivism (or Strongly Inconsistent Anti-Inductivism),” Studies in History and Philosophy of Science, 2010) performed a careful formal analysis of Popper’s own mathematical desiderata for what a good measure of evidential support should look like. He showed that once you strip away the ad hoc elements and take Popper’s formal requirements at face value, the resulting measures are qualitatively indistinguishable from standard inductivist confirmation measures.
In other words, Popper’s own criteria for what makes evidence support a theory — when you work them out rigorously — produce the same orderings as the Bayesian confirmation measures he rejected. The formal properties Popper demanded of corroboration are the formal properties that confirmation theorists had independently identified. The difference between “corroboration” and “confirmation” turns out to be terminological rather than structural.
Strength of engagement: This is a particularly elegant critique because it uses Popper’s own formal machinery against him. It doesn’t require accepting any philosophical premises Popper would reject — it just requires taking his mathematics seriously and following where it leads.
What These Arguments Share
The strongest critiques all converge on a common theme: Popper’s framework implicitly relies on the inductive reasoning it officially rejects. Whether the argument focuses on the practical role of corroboration (Salmon), the historical practice of science (Putnam), the formal structure of physical laws (Carnap), the requirements of severe testing (Mayo), the logic of belief updating (Howson), or the mathematics of evidential support (Zamora Bonilla), the conclusion is structurally similar — that anti-inductivism is unstable, collapsing into either a covert inductivism or an empty formalism that can’t guide actual scientific practice.
What distinguishes these from weaker critiques is that each author takes Popper’s actual position seriously, represents it accurately, and identifies a specific gap in the logic rather than dismissing the project wholesale.