Hypotheses in science are formed, supported and refuted on the basis of both evidence and logical reasoning. For this to occur, all hypotheses and claims should be constructed and treated with scepticism, not confirmation. This means keeping every claim open to questioning and to evidence that could prove it incorrect in some way.
If we are uncritical we shall always find what we want: we shall look for, and find, confirmations, and we shall look away from, and not see, whatever might be dangerous to our pet theories. In this way it is only too easy to obtain what appears to be overwhelming evidence in favour of a theory which, if approached critically, would have been refuted (Popper, 1957).
Software testing is no different. All testers should treat existing claims of quality with scepticism, not confirmation. Do not test to verify or validate, but instead test to disconfirm and invalidate. The tester’s goal is falsification.
The Swan Experiment
A scientific researcher examined all the swans she could find and saw that they all had white feathers. Based on this evidence, she concluded that all swans are white. This can be considered a scientific claim or hypothesis because:
- It’s logical reasoning based on empirical evidence
- It’s open to being proved incorrect (falsified or refuted) by finding evidence of non-white swans
The statement “In 1697, black swans were sighted in Australia” means that the reasoning is no longer sound and the hypothesis has been falsified. It would have to be revisited to account for the new evidence (Popper, 1959).
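The swan experiment can be sketched in a few lines of Python. This is only an illustration: the function name `find_counterexample`, the `is_white` claim and the sample observations are hypothetical and not part of any testing library. The point it tries to show is that a search over observations can refute a claim, but an empty result only means that no counter-evidence was found in the data examined.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical observation format: (location, feather colour).
Swan = Tuple[str, str]


def find_counterexample(
    claim: Callable[[Swan], bool], observations: Iterable[Swan]
) -> Optional[Swan]:
    """Return the first observation that contradicts the claim, or None."""
    for observation in observations:
        if not claim(observation):
            return observation
    return None


def is_white(swan: Swan) -> bool:
    """The hypothesis under test: every swan has white feathers."""
    return swan[1] == "white"


# Illustrative data only, invented for this sketch.
early_sightings = [("England", "white"), ("France", "white")]
later_sightings = early_sightings + [("Australia", "black")]

print(find_counterexample(is_white, early_sightings))  # None
print(find_counterexample(is_white, later_sightings))  # ('Australia', 'black')
```

A result of `None` for the early sightings does not show that all swans are white. It shows only that these particular observations did not refute the hypothesis.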
This is the goal of the software tester: take existing claims of quality and find evidence that they are not true or may not be true. This contrary evidence is then communicated to stakeholders in formats such as test summary reports, bug reports or feedback during meetings. If no evidence can be found to refute a claim, then say so. Never confirm that anything works or that any claim is valid, complete or passing. This is impossible because quality can’t be verified. Testers also form their own hypotheses of quality, and while there is a strong emphasis on looking for evidence to support counter-claims of lost value or harm, these should be open to falsification too. Testers should provide a balanced argument and not carelessly mislead stakeholders with false claims of any type.
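One way to carry this wording into a test summary is sketched below. The `ClaimReport` class, its fields and the example claim are assumptions made for illustration, not an established reporting format. The sketch deliberately has no "passed" state: a claim is reported either as refuted, with the counter-evidence listed, or as not refuted after a stated number of attempts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClaimReport:
    """Hypothetical report on one claim of quality."""

    claim: str
    attempts: int = 0
    counter_evidence: List[str] = field(default_factory=list)

    def record_attempt(self, evidence: Optional[str] = None) -> None:
        """Log one attempt to refute the claim, with any counter-evidence found."""
        self.attempts += 1
        if evidence is not None:
            self.counter_evidence.append(evidence)

    def summary(self) -> str:
        """Describe the outcome without ever claiming the software works."""
        if self.counter_evidence:
            details = "; ".join(self.counter_evidence)
            return f"Refuted: '{self.claim}'. Counter-evidence: {details}"
        return (
            f"Not refuted: '{self.claim}'. "
            f"No counter-evidence found in {self.attempts} attempt(s)."
        )


# Illustrative usage with an invented claim.
report = ClaimReport(claim="Users can reset their password")
report.record_attempt()  # an attempt that found nothing
print(report.summary())
report.record_attempt("Reset email is never sent for addresses containing '+'")
print(report.summary())
```

Reporting the number of attempts alongside a "not refuted" outcome lets stakeholders judge how hard the claim was challenged, rather than reading silence as confirmation.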
Testers do not confirm, verify or validate the quality of the software. The goal isn’t to demonstrate conformance to specification or anything else. Instead, testers take any claim of quality, look for evidence against it and form ideas of how the software doesn’t deliver value, may not deliver value or causes harm. The goal of testing is to disconfirm and invalidate. Both science and software testing are exercises in falsification.
- Deduction vs Induction - Testing is constantly building and falsifying hypotheses of quality using logical reasoning to uncover information
- Popper, K. 1957. The Poverty of Historicism. 1st ed. London: Routledge. p. 124
- Popper, K. 1959. The Logic of Scientific Discovery. 2nd ed. London: Routledge. p. 17