Philosophy of Scientific Inference

The scientific method is perhaps humankind's greatest invention. Built on a strict scaffolding of experimentation and confirmation, it has been our primary means of learning about the world since Francis Bacon popularized the need for a structured, empirical approach to natural investigation in the early 17th century. As successful as science is when applied to nature, we are unable to turn it upon itself, to use it to critically examine its own workings. Doing science is evidently not a science, but a philosophy. This philosophy of science is concerned with how the method works, the character of the laws and facts it reveals, and the limits of its application.

I am interested in this topic as a scientist, not a philosopher. I believe that every practicing scientist should have an acquaintance with the tools of his or her trade. Importantly, we must be clear about what science can and cannot do, and we should understand how it is that we know what we know. Below is a reading list I've assembled to help those interested in learning about the philosophy of science.

A Brief History of the Hypothesis
David Glass and Ned Hall (in Cell, Vol 134, 3 (2008))

An excellent overview of the history of science and the current state of the art in fewer than 4 pages.

Historical Science, Experimental Science, and the Scientific Method
Carol Cleland (in Geology, Vol 29, 11 (2001)).

A paper on the comparison between so-called "experimental" and "historical" approaches to science: the former typically taking place in a laboratory setting where the experimenter can vary conditions in a controlled manner; the latter constituting an investigation into the necessary and sufficient causes of past events. The methods of inquiry are different but Cleland argues that both pursuits are fundamentally scientific despite cited claims to the contrary vis-à-vis historical investigations.

A key distinction between the practice of experimental and historical science seems to be due to something called "the asymmetry of overdetermination": it is generally easier to verify that an event occurred than to predict that it will occur. Cleland chooses the eruption of a volcano as an example: the eruption will generally leave extensive traces (this is the overdetermination), only a small fraction of which are needed to support an inference that an eruption occurred. On the other hand, there are many factors that must be accounted for to support a prediction that an eruption will occur. Historical science deals with discovering and investigating the traces, usually looking for a "smoking gun" to corroborate a single hypothesis among the many that might be supported by other traces. The iridium anomaly and shocked quartz at the K-T boundary provided such a smoking gun, pointing towards the hypothesis that an asteroid impact caused the K-T extinction event. The overdetermination suggests that there are many traces, and so the chances of finding a smoking gun are good -- it is the job of historical science to find it. Meanwhile, experimental science must wrestle with the seeming "underdetermination" of the future by localized present events: test conditions must be controlled and varied to tease out the various necessary and sufficient causal conditions for the observed effects.

In historical science, though, we are not generally content with merely verifying the occurrence of an event and corroborating a proposed hypothesis via a single trace or a small number of traces. Ideally we also want to know all the causal conditions that led to it, just like the lab experimentalist. This can be done, in principle, so long as each causal condition leaves a trace. While the smoking gun wields the power to fell bad hypotheses, it is not enough for understanding the nature of the cause well.

Choice & Chance: An Introduction to Inductive Logic
Brian Skyrms (1986)

"Choice & Chance" is a light but provocative introduction to the problem of induction. Inductive reasoning is how we learn -- it increases our knowledge of things as we go from premise to conclusion. It stands in contrast to deductive reasoning: "An argument is deductively valid if and only its conclusion is false when its premises are true; an argument is inductively strong if and only if it is improbable that its conclusion is false when its premises are true." These definitions emphasize that induction is not the opposite of deduction (as is too often falsely alleged), but instead inductive reasoning applies to a spectrum of inference: from deduction (truth by logical necessity) to the most egregious non sequiturs. Ideally, our inferences should be inductively strong -- that they are correct "most of the time" or with "high probability" -- essential for distinguishing the desirable strength of induction, but notoriously difficult to formalize. This is Skyrms' entrè into Hume's problem of induction -- how do we confirm that conclusions assigned high probability by our inductive model actually occur most of the time?

The problem is with the validity of the inductive approach itself. The go-to method for learning new things about the world is induction, whether these things be how to tie a shoe, build a rocket, or conduct scientific inquiry. It's a bit of a meta-problem: can we use inductive logic to justify the use of induction? No, not unless we wish to commit logical suicide by running round and round in a vicious circle. And we cannot use deduction either, because its conclusions never tell us anything not already implicitly contained in the premises (i.e. we cannot learn from it). Skyrms describes an inductive approach to validating induction that introduces a hierarchy of arguments, wherein the inductive inferences made on one level are justified by the one above it. Acknowledging that I've not examined this solution carefully, it's difficult to see how it succeeds: like a rug that's too large for the room, we can smooth it out here only at the expense of creating a ripple over there. Skyrms spends some time discussing the other commonly attempted solutions: 1) induction works if any method does (a clever argument and well worth understanding (p. 44), even if it's not a solution per se), and 2) there isn't actually a problem and we all need to get over it.

Inductive inference in the natural sciences is generally used to project knowledge of particular cases to universal knowledge. Strong induction in these cases relies on the presumed uniformity of nature, across space and time, that supports extrapolation from known events here and now to unknown events there and then. The degree of relevant uniformity dictates the strength of the induction: the claim that it will rain tomorrow because it has rained all week projects only a temporary regularity and risks being false, whereas the claim that the sun will rise tomorrow projects a firm, well-substantiated regularity.

The problem of induction is how to identify the relevant uniformity in general. Skyrms explores this nuanced challenge in Chapter 3, beginning with the work of Goodman on how regularities depend on the language used to describe events (we'll discuss Goodman's work on this and related issues in a later reference). In short, one can perform linguistic shenanigans -- words with situation- and time-dependent meanings -- to deeply confuse and thwart attempts at establishing regularities. It is not clear from Skyrms' treatment whether this is actually a problem for the practical sciences or only an academic curiosity. Of more immediate concern to the practicing scientist is how one discovers patterns and regularity in data for the purpose of establishing law-like relationships between quantities. Skyrms has in mind discrete data points represented in 2D: how do we draw a curve through these points? Any way we please! And each such curve will support a different prediction for the values of points lying outside the domain supported by the data (a different extrapolation): "For any prediction whatsoever, we can find a regularity whose projection licenses that prediction" (p. 65). Skyrms closes Chapter 3 without any resolution to this dismal state of affairs, without any pep talk. This is unfortunate because I believe the situation is not so dire, despite the clear challenges so well articulated by the author. Happily, experimental science augmented with a suitable helping of goodness-of-fit tests goes a long way towards clarifying some of these issues. It's unfortunate that Skyrms doesn't mention any of them, but they'll be covered in other references in this list.
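
To make the curve-fitting worry concrete, here is a minimal sketch (mine, not Skyrms'; the data and polynomial degrees are invented for illustration) using NumPy. Several polynomials of different degree fit the same handful of points essentially equally well in-sample, yet disagree once we extrapolate beyond the data:

    import numpy as np

    # Invented data: five points that happen to lie near a straight line.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([0.1, 1.1, 1.9, 3.2, 3.9])

    x_new = 8.0  # a point well outside the range of the data

    for degree in (1, 2, 4):
        # Fit a polynomial of the given degree to the same five points.
        coeffs = np.polyfit(x, y, degree)
        residual = float(np.sum((y - np.polyval(coeffs, x)) ** 2))  # in-sample misfit
        prediction = float(np.polyval(coeffs, x_new))               # extrapolated value
        print(f"degree {degree}: residual = {residual:.3f}, prediction at x = 8: {prediction:.2f}")

The degree-4 polynomial threads all five points exactly, yet its prediction at x = 8 can differ drastically from the linear fit's -- precisely the "any curve we please" problem. In-sample goodness of fit alone does not single out the regularity worth projecting; that takes controlled experiments, held-out data, and the model-selection ideas discussed later in this list.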

At this point, Skyrms makes a rather abrupt break from his inquiry into induction and discusses Mill's methods of identifying necessary and sufficient conditions of observed events; duly interesting but admittedly off topic (Skyrms suggests skipping the rather lengthy Chapter 4 entirely on a first read). In fact, the remainder of the book is seemingly off topic, covering probability theory without making any clear connections back to the deep problems we were left with at the end of Chapter 3. Because the latter portions of the book seem to wither in isolation, I recommend the first three chapters (only 75 pages or so) as a basic introduction to the problem of induction.

The Foundations of Scientific Inference
Wesley C. Salmon (1979)

Salmon's book covers the broad problem of scientific inference, from induction and its challenges, to potential solutions, to alternative methods of inference like deductivism. This is not a book on induction per se, but given induction's central role in scientific inference, the subject sets the tone and directs the course of Salmon's investigations.

According to empiricism, knowledge must be founded on evidence. David Hume's problem of induction concerns how knowledge is actually derived from evidence. It is a logical problem about the relationship between evidence and conclusion, in particular, whether our attempts at extrapolating knowledge of the observed to the unobserved are logically justified. Logically, inductive arguments are essentially deductions with missing premises, and this is the problem -- how can we be sure of conclusions built on such shaky ground?

In Chapter 2, Salmon considers a variety of solutions to the problem of induction: some attempt a direct counter, while others consider altogether different routes to scientific inference. One such approach, the hypothetico-deductive method, is investigated as an induction-free process of inference. The idea is that predictions are deduced from hypotheses and then tested against empirical evidence. The process is apparently fully deductive, but there is one initial, glaring problem: suppose we assess that the collected data confirm the prediction; are we to conclude that the hypothesis is correct? Not without committing the unforgivable logical crime called "affirming the consequent". The fallacy arises because there could be other hypotheses, different from the one being tested, which also happen to support the given set of empirical data. Simply put, we cannot deduce a unique hypothesis from data: "While we are concerned with the status of the general hypothesis -- whether we should accept or reject it -- the hypothesis must be treated as a conclusion to be supported by evidence, not as a premise lending support to other conclusions. The inference from observational evidence to hypothesis is certainly not deductive." (p. 19) Further, hypotheses don't just fall from the sky. While the construction of a hypothesis might be a creative, non-logical affair, it certainly isn't deductive, or else it could not be ampliative and would hardly be a candidate for an informative, generalized hypothesis. As Salmon puts it, "A scientific theory that merely summarized what had already been observed would not deserve to be called a theory. If scientific inference were not ampliative, science would be useless for prediction, postdiction, and explanation." (p. 20) Karl Popper will have more to say on the origin of hypotheses in later references.

Next, we come to Prof. Popper's deductivism, another attempt to do inference without induction. Salmon gives only a quick synopsis, but we'll have an opportunity to delve further into Popper's work in later references. Popper maintains that the only logical method available to scientific inference is falsification: it is possible to falsify a general statement by observing only a single contradictory instance or event, whereas we have just noted the impossibility of confirming a general statement by observing particular instances or events (from above, we cannot confirm a single unique hypothesis by a body of empirical evidence). Falsification therefore seems to sidestep the deductive fallacy committed in confirming hypotheses. A hypothesis that survives falsification can be corroborated, gaining favor over competing hypotheses by being more falsifiable (which is another way of saying that the hypothesis has more content, but we'll get into this in later references). The important point is that Salmon doesn't buy it, that there is more than just deduction in Popper's program. After all, we are learning something in the corroboration of hypotheses, and we know deduction to be non-ampliative. In fact, the process of corroboration is a "nondemonstrative form of inference. It is a way of providing for the acceptance of hypotheses even though the content of these hypotheses go beyond basic statements (particular statements of observed fact). Modus tollens without corroboration is empty; modus tollens with corroboration is induction." (p. 26) Elsewhere, "To maintain that the truth of a deduced prediction supports a hypothesis is straightforwardly inductive." (p. 109) Or, at least something akin to it.

Returning to induction, Salmon explores whether there is any hope of establishing a uniformity of nature on which to base inductive inference. He concludes similarly to Skyrms: "The problem is how to ferret out the genuine uniformities: coincidence vs genuine causal regularity" (p. 42). Compounding the problem is that the uniformity of nature is itself an empirical question: we cannot use induction to infer global uniformity (since induction is what we are trying to justify through this very uniformity), and it doesn't appear (to me) that we can get very far using Popper's deductivism, since induction requires that a certain uniformity holds universally, not only that it not be falsified by a local observation. Without an a priori statement of the uniformity of nature (as with, for example, Kant's synthetic a priori propositions), we appear stuck, no matter how strongly experience suggests such regularities.

What about a probabilistic interpretation of induction? After all, inductive conclusions are not supposed to hold with certainty, and strong induction is defined by its conclusions holding true with high probability. Chapters 4 and 5 spend a lot of time discussing probability, investigating and mulling the various interpretations of what it means for something to be probable. The two most popular conceptions of probability, frequency of occurrence and degree of rational belief based on available evidence, both fall short of supplying a version of probability adequate to resolve the problem of induction (all versions end up somehow needing to assume uniformity or invoke induction in circular ways). But that's OK -- by this time Salmon has set the stage for an improved, probabilistic hypothetico-deductive method that incorporates some of Popper's concerns. This is what remains after the smoke settles on the bloody, corpse-littered battlefield of scientific inference: a hopeful procedure built from the exploded shrapnel of the other brave but failed attempts at a solution.

The idea is to use Bayes' theorem, which furnishes the probability of the hypothesis given evidence. Bayes' theorem relates this quantity to the probability of the evidence given the hypothesis: these are the two quantities of interest to the standard hypothetico-deductive method, but there's more. Two additional vital ingredients appear: the prior probability of the hypothesis, and the probability that the evidence would obtain even if the hypothesis were false, i.e. under a different hypothesis. This latter quantity is essential for avoiding the logical trap of affirming the consequent that we mentioned earlier. The prior probability favors likely hypotheses: surely subjective, but here is where we can bring prior knowledge to bear on the question of the plausibility of the hypothesis. Nowadays, a hypothesis that cited evil spirits as the cause of a new kind of migraine would be roundly considered less plausible than one based on neurobiology, and rightly so. The prior is key to singling out the plausible hypotheses from the infinity of riff-raff conjectures, saving us the impossible task of testing all of them. Meanwhile, the third ingredient -- the probability of the evidence under a different hypothesis -- is what Popper has in mind when he talks about falsifiability: if this quantity is large, the hypothesis under consideration is not strongly falsifiable and is therefore only weakly corroborated by the data. Bayes' theorem appears to salvage the hypothetico-deductive method: "it provides a coherent schema in terms of which we can understand the roles of confirmation, falsification, corroboration, and plausibility." (p. 120)
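
In symbols (my notation, not Salmon's), with H the hypothesis and E the evidence, the three ingredients sit together in Bayes' theorem as

\[ P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)} \]

The likelihood P(E|H) is what the plain hypothetico-deductive method checks; the prior P(H) carries the plausibility judgment; and P(E|~H), the probability of obtaining the same evidence under rival hypotheses, is what keeps a successful prediction from proving too much: if it is large, the evidence hardly discriminates between hypotheses and the corroboration is weak.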

As a Bayesian, I find ending as Salmon does with Bayes' theorem extremely satisfying. It really ties the room together.

Fact, Fiction, and Forecast
Nelson Goodman (1955)

This is Nelson Goodman's famous work that introduced us to color-shifting gems and other puzzles of inductive inference. We met some of these ideas in Skyrms' introduction, and I include this book only for those seeking an elaboration of some of these ideas. It's a short text, divided into four lectures. The first is Goodman's well-known piece on the "Problem of Counterfactuals", essentially the problem of assigning truth values to counterfactual conditional statements in logic. It is difficult, and lies somewhat outside the main avenue of inductive inference; I am not familiar with this problem and so cannot comment further.

Goodman's main contribution to the philosophy of scientific inference is his statement and examination of the projection problem of scientific hypotheses. This problem is laid out in lectures two through four. Lecture 2, "The Passing of the Possible", deals with what Goodman calls the problem of dispositions. A disposition is a quality or "capacity" of a thing, like flexibility or inflammability. The problem takes on the hefty task of understanding whether these dispositions are real, in the sense that an object's size and shape are. Something is flexible if it bends under suitable pressure; but the same object is still said to be flexible even if we don't apply the pressure, right? So dispositions deal in the possible. Goodman argues that the reality of dispositions depends on whether they are causal consequences of other predicates, reducing the problem to the discovery and enumeration of all these causal predicates. This discussion exists in the same rarefied air as the first lecture, and is in fact related to the problem of counterfactual conditionals ("If I had applied pressure to this object, it would have bent."). Though abstract, Goodman grounds it by tying it to the problem of induction by the end of the lecture: "the problem of projecting manifest to non-manifest cases [cases when the disposition is exemplified and cases when it is not] is obviously not very different from going from the known to the unknown or from past to future cases." (p. 58) This is the problem of induction.

In lecture 3, "The New Riddle of Induction" Goodman declares emphatically that the problem of induction is not the justification of the program itself, but rather as the problem of defining the difference between valid and invalid predictions. So not the grand, meta-problem of induction as a method, but the use of it in its role in forming individual hypotheses and confirming predictions. The immediate problem is the difficulty in ascertaining the difference between lawlike and merely contingent hypotheses, the latter including accidental generalities. Only lawlike statements can be confirmed by data. This is Goodman's "New Riddle of Induction", and it is a formidable one: "the problem of justifying induction has been dipslaced by the problem of defining confirmation...this has left us the residual problem of distinguishig between confirmable and non-confirmable hypotheses" (p. 81) The conclusion -- that "lawlike or projectible statements cannot be distinguished on merely syntactical grounds" (p. 83) has important consequences for the theory of inductive logic developed by Rudolph Carnap, which we'll examine later.

Goodman's argument at the close of lecture 3, that the act of confirmation presupposes a unique language, is further fleshed out in his fourth lecture, "Prospects for a Theory of Projection". Goodman elaborates on the problem of projectability in terms of his famous grue and bleen emeralds. A "grue" emerald is one that is green if examined before, say, July 1, 2015, and blue if examined thereafter. Meanwhile, emeralds that are "green" are always green. We get into trouble if we try to do induction on the property grue: if we examine a bunch of emeralds today and find them all to be grue, this property will not hold arbitrarily into the future. Grue is evidently not a projectible property, whereas green is. The problem of projection, as laid out in Lecture 3, is how to tell these apart. Goodman's proposed solution requires that we bring past experience to bear on the distinction between predicates like "green" and those like "grue". "We must consult the record of past projections of the two predicates. Plainly, 'green', as a veteran of earlier and many more projections than 'grue', has the more impressive biography. The predicate 'green', we may say, is much better entrenched than the predicate 'grue'." (p. 94) These are some of the first hints that an operative theory of confirmation is necessarily Bayesian in nature -- that prior knowledge of things is an essential ingredient in forming inferences.
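
For the programming-minded, here is a toy rendering of the riddle (my own illustration, not Goodman's; the cutoff date is the one quoted above). Every emerald examined before the cutoff supports "all emeralds are green" and "all emeralds are grue" equally well, yet the two projections disagree about emeralds examined afterwards:

    from datetime import datetime

    CUTOFF = datetime(2015, 7, 1)  # the arbitrary date in the definition of "grue"

    def is_green(color):
        return color == "green"

    def is_grue(color, observed_at):
        # "Grue": green if examined before the cutoff, blue if examined thereafter.
        if observed_at < CUTOFF:
            return color == "green"
        return color == "blue"

    # A green emerald examined before the cutoff is both green and grue...
    print(is_green("green"), is_grue("green", datetime(2014, 1, 1)))  # True True

    # ...but the same observation made after the cutoff satisfies only "green".
    print(is_green("green"), is_grue("green", datetime(2016, 1, 1)))  # True False

Only the entrenchment of "green" -- its more impressive biography of past projections -- tells us which regularity to project.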

Natural Kinds
Willard V. Quine (1969)

Just when you think the conversation has become sufficiently philosophical, Quine comes along and ups the ante. The piece "Natural Kinds" is Quine's attempt to define the essential properties of things that permit their categorization into "kinds of things", with the ultimate goal of understanding which qualities of things are projectible, that is, which can serve as the basis of scientific prediction. His motivation is to deal systematically and formally with the issues raised by thorny thought problems like Hempel's ravens and Goodman's emeralds. Quine suggests that Hempel's "non-black non-ravens" and Goodman's "grue emeralds" are not projectible; in the case of Hempel, Quine argues that the complement of a projectible predicate (like "black ravens") is not itself projectible. In the case of Goodman, Quine reasserts Goodman's own argument that "green" is projectible while "grue" is not because "green" is better entrenched, that "two green emeralds are more similar than two grue ones would be if only one were green" (p. 42). By introducing the notion of "similarity", Quine is now charged with the task of defining what, exactly, it means for things to be "similar". This is apparently really hard.

As animals, our sense of similarity is in some sense innate: a pink ellipse is more similar to a red circle than to a blue triangle. This suggests a "prior spacing of qualities" that is not learned, but rather serves as scaffolding for learning; without it, all stimuli would be equally alike and equally different. Quine claims that the "ostensive" learning of words makes use of this prior spacing, or else we'd never successfully resolve how to apply extensive qualities like colors to things (e.g. this requires, among other things, the ability to discern when something turns from yellow to orange). The fact that two strangers will generally agree on when something is yellow and when not is proof that induction works for this kind of thing: apparently the kind of similarity that anchors the use of language is projectible. The problem is that color is not always the most useful or relevant projectible quality: two square blocks of different color would be deemed dissimilar if we base our appraisal on color. It is remarkable that mankind has been successful in "working around the blinding dazzle of color vision and found more significant regularities elsewhere." He has done so largely through science.

At this point Quine asks a deep and interesting question: how is it that this prior "subjective spacing of qualities accord[s] so well with the functionally relevant groupings in nature"? In other words, there's no a priori reason that human pattern-seeking and perceptiveness should successfully organize the world for us; why is the world understandable in this way? The question is not whether induction works (it does), but why. The answer seems to be that our subjective spacing is of the world -- it has evolved through Darwinian selection as a reward for those organisms that employed it as the basis for learning. Our sense of perception, capable of so organizing the world, is itself a product of this organization.

Quine argues that the notion of similarity is tied to the maturity of the science at hand: things that appear dissimilar from the vantage point of a rudimentary understanding of the science may later come to be seen as different manifestations of a unifying principle or kind. For example, take electrons and positrons. These are separate, different (in fact opposite) particles of different kinds: one is positively and the other negatively charged. It wasn't until the advent of relativistic quantum field theory that electrons and positrons were understood as excitations of the same underlying quantum field (the Dirac field). Not until the problem could be understood on a deeper level could electrons and positrons be recognized as being of the "same kind". (Perhaps a better example is Quine's own, that of whales and fish -- a perceived similarity not overturned until biologists better understood evolution and anatomy.) The relevance of this observation to the question of similarity is that similarity is evidently a contextual matter that depends on the depth of understanding of the nature of the things being compared.

Conjectures and Refutations: The Growth of Scientific Knowledge
Karl Popper (1963)

This is a collection of essays spanning a rich range of topics in the modern philosophy of science and human knowledge. It was written after his opus, The Logic of Scientific Discovery (1934), and several essays offer important elaborations and clarifications of many of the ideas presented there. Though a key reference, The Logic of Scientific Discovery is written in a formal, no-nonsense Teutonic tone; I found these essays to be more human and engaging. I am mostly ambivalent about whether to read these essays before or after The Logic of Scientific Discovery, but I recommend reading at least the first essay, "Science: Conjectures and Refutations", beforehand.

In this essay, Popper gives a concise summary of how he sees science working, and it's valuable as a bird's-eye view of Popper's philosophy. The essay elaborates on the formation of hypotheses, in contrast to the confirmation of hypotheses, which generally seems to get more attention. Popper takes aim at induction by enumeration as a common account of how hypotheses are developed, arguing instead that "Without waiting, passively, for the repetitions to impress or impose regularities upon us, we actively try to impose regularities upon the world...This was a theory of trial and error -- of conjectures and refutations," and that "scientific theories were not the digest of observations, but that they were inventions -- conjectures boldly put forward for trial, to be eliminated if they clashed with observations..." (p. 46) Popper disagrees with the idea that we build hypotheses piecemeal from observations, though he concedes that prior experience and observation surely inform them. A key concept underlying much of Popper's thought appears to be a relative of Kant's synthetic a priori causality, or at least a rudimentary emulation of it. Popper speaks often of a presupposed frame of reference, or "frame of expectations", that is a priori and that we use to impose patterns on observations; you can think of it as giving context to otherwise scattered and meaningless measurements: "Thus we are born with expectations; with 'knowledge' that, while not valid a priori, is psychologically or genetically a priori, i.e. prior to all observational experience. One of the most important of these expectations is the expectation of finding a regularity." (p. 47)

In Essay 11, "The Demarcation Between Science and Metaphysics", Popper gives really a heartfelt critique of Rudolph Carnap's approach to induction and confirmation. It's worth a read to see how their ideas square, at least from Popper's perspective, and I don't have much to comment on there. I do want to gripe a bit about something in Popper's philosophy that has been bothering me since first reading his Logic of Scientific Discovery. It has to do with his insistence on the drive to find probabilities with small logical (inductive) probability, which he considers to vary inversely with their empirical content. He seems right in this: for example, a hypothesis with high logical probability verges on tautology and can tell us nothing about the real world. Hypotheses with low logical probability are prone to failure and refutation, because they are exacting and precise in their predictions, but the pay-off is high. He therefore argues that confirmation of hypotheses should not be about high probability, as supposed by Carnap and others, but about low probability. Carnap's view is that the theory with the highest probability, given the evidence, is the most confirmed. There is an apparent tension in these two conceptions of confirmation, but I think these probabilities are different things. Popper's is a kind prior probability that concerns the prediction space of the theory: how universal or precise it is (think number of free parameters in a model, along with their permitted ranges). This has nothing whatever to do with evidence or data. Meanwhile, Carnap's confirmation does concern the evidence in an obviously necessary way: it holds the hypothesis against the data. Carnap's probability, which has I think been superseded by the Bayesian posterior, p(H|d), tells us the probability that the hypothesis is true given the data -- surely we are to select the H with the highest posterior odds. Popper's probability is about the general testability -- the form -- of the hypothesis (how it relates to the whole space of possible evidence) whereas Carnap's is about the results of a particular test -- how the hypothesis stands up to particular pieces of evidence. So, I am confused about the confusion. (It should be noted that Popper's logical probability, though referred to here as a "prior" probability, is not necessarily a Bayesian prior (as this term appears in Bayes' theorem) because that probability includes past confirmations a la Carnap, as well as predictability of the hypothesis under test.

The Logic of Scientific Discovery
Karl Popper (1934)

The Logic of Scientific Discovery is Karl Popper's great work in which he lays out his thesis of deductivism -- a logical approach to science based on the falsification, rather than confirmation, of hypotheses. The act of confirming a hypothesis cannot be made deductively valid, because it commits the logical fallacy of affirming the consequent (if hypothesis H predicts observation O, it is incorrect to conclude that observing O implies the correctness of H). The act of falsification, on the other hand, can be made deductive via modus tollens: if H predicts O and we observe ~O (not O), we conclude ~H (not H; H is wrong). This, quite simply, is the heart of Popper's philosophy.
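
Schematically (my rendering, not Popper's notation), the invalid and valid inference patterns are

\[ \text{affirming the consequent (invalid):} \qquad H \rightarrow O,\; O \;\nvdash\; H \]
\[ \text{modus tollens (valid):} \qquad H \rightarrow O,\; \neg O \;\vdash\; \neg H \]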

It is easy to understand this asymmetry between confirmation (or verifiability) and falsifiability. Scientific theories and laws generally take the form of universal statements. The problem of induction asserts that such universal statements can never be derived from singular statements: for example, upon observing several white swans, we are of course not logically justified in extending this property to all swans (nor can we search the whole universe to check). On the other hand, such universal statements can be contradicted by singular statements: for example, the claim that all swans are white can be refuted by the observation of only a single black swan -- "consequently it is possible by means of purely deductive inferences (with the help of the modus tollens of classical logic) to argue from the truth of singular statements to the falsity of universal statements." (p. 19) Popper's confidence in this realization encouraged him to promote falsification as the line of demarcation between science and pseudoscience: "it must be possible for an empirical scientific system to be refuted by experience." (p. 18)

Objections to Popper's program tend to center around the perceived negativity associated with the act of falsification; call it a psychological aversion to refutation. More practically, the concern is that science is not about what doesn't work -- how can we hope to improve our knowledge of the world if we cannot confirm, or verify, scientific hypotheses? Shouldn't science be constructive, rather than destructive? How else are we supposed to establish scientific laws? Popper counters that our view as scientists should not be that our theories are correct, only that they have not yet been shown to be wrong. He believes in a relentless, vigorous assault on all hypotheses and scientific proposals in the hope of striking down those that miss the mark; he envisions a sort of selection process in which the "unfit" hypotheses are weeded out and the best ones selected by "exposing them all to the fiercest struggle for survival" (p. 20). So Popper does have some semblance of confirmation in mind -- he evidently associates it with the notion of fitness. Importantly, though, Popper views the cycle of conjecture and refutation as generative, as one that puts scientific hypotheses through a sort of optimization process in order to develop theories that, though ultimately incorrect, are the very best possible prototype of the truth: "Theories are nets cast to catch what we call 'the world': to rationalize, to explain, to master it. We endeavor to make the mesh ever finer and finer."

The fitness of a theory is established through the process that Popper terms corroboration: "A positive decision can only temporarily support a theory...but so long as the theory withstands severe tests and is not superseded by another theory, then we say it has been corroborated by past experience." (p. 10) To corroborate a theory, it must withstand more extensive and rigorous testing than its competitors. In short, it must be more "testable", or equivalently, more "falsifiable". The reason testability is important is that it is logically related to the empirical content of the theory: "the more a theory forbids, the more it says about the world of experience." What this means is that theories with more universal or precise statements are more testable because there are more opportunities for a misstep, more places for them to go wrong. Those of us who have ever been torn between two competing explanatory models for a set of data might have relied on various heuristics of model complexity, like the number of free parameters or the predictiveness of the model. These are precisely the ideas Popper has in mind when he advocates for empirical content. It is striking that the modern statistical basis of hypothesis testing and model selection was so accurately anticipated by Popper.
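
As a concrete, admittedly anachronistic illustration of that intuition, here is a small sketch -- my own, not Popper's -- comparing a straight-line model with a fifth-degree polynomial on the same invented data using the Akaike information criterion, which rewards fit but penalizes free parameters:

    import numpy as np

    rng = np.random.default_rng(0)

    # Invented data: a noisy straight line.
    x = np.linspace(0.0, 10.0, 50)
    y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

    def aic(degree):
        """AIC (up to an additive constant) for a least-squares polynomial fit."""
        k = degree + 1  # number of free parameters
        coeffs = np.polyfit(x, y, degree)
        rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
        return x.size * np.log(rss / x.size) + 2 * k  # Gaussian-likelihood AIC

    for degree in (1, 5):
        print(f"degree {degree}: AIC = {aic(degree):.1f}")

The higher-degree model fits the noise a little better, but the penalty for its extra parameters typically leaves the simpler, more "falsifiable" model with the lower (better) score -- a statistical echo of Popper's preference for content over safety.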

Recall Salmon's suspicion of Popper's process of corroboration, in particular his assertion that Popper was smuggling in some induction in order to make ampliative statements about the world. Indeed, Popper's description of conjecture and refutation as a matter of life and death in the face of experiment, with its steady climb towards the fittest explanation of a given set of data, must be ampliative. After all, the process of corroboration essentially optimizes the information content of passable theories. But is this induction? Popper would say "no", but I believe this is a point that deserves considerable thought.

Popper's ideas, if nothing else, have made it OK to accept that science is not about being certain -- that, as scientists, all we can really be certain about is the inadequacy and ultimate imperfection of our theories. Karl Popper argued passionately, and I think decisively, that this is not a weakness but the hallmark, the unique power, of the scientific enterprise in comparison to other modes of inquiry. Indeed, "it is not his possession of knowledge, of irrefutable truth, that makes the man of science, but his persistent and recklessly critical quest for truth."