Testing’s goal is to gain knowledge and understanding of software quality. This knowledge and understanding needs to be communicated to stakeholders as information. Information is therefore at the heart of testing (Ashby, 2016). There are two approaches to science that use different types of information.
Quantitative approaches to science are based around numerical data. For natural science, this may be collected in laboratories using instruments to gather observations as measurements. For social science, people’s thoughts may be collected using survey questions where answers are rated on a numerical scale. Data can be statistically analysed using formal science approaches to draw conclusions. The quantitative approach values independent objectivity, which means keeping distanced from what is being researched. It is the only approach available for natural, positivist science, but is also available for social, relativist science (Robson, 2011a).
Examples include using electronvolts (eV) to measure the mass and existence of the Higgs Boson subatomic particle (Aad et al., 2012) and how much social media influences people’s trust in science across different countries (Huber, Barnidge, Gil de Zúñiga and Liu, 2019).
Qualitative approaches to science are typically based around field studies of people (and animals) in various cultural environments. Journals are used to capture observations and interview notes from study participants about how they interact with each other and the world around them. These are later evaluated to draw conclusions. Qualitative approaches may also be used to “qualify” the nature or meaning of something, including what counts as “one” of something, in order to measure the degree to which something is present. In social science, this means every quantitative approach must start with a qualitative one. Qualitative approaches value subjectivity and embedding oneself into the social world being researched. It is therefore a useful approach for constructionist and interpretivist social science, but the approach is not available for positivist science (Robson, 2011a; Kirk and Miller, 1986a).
Claims of Quality
Anything said about the quality of a product, system or service is describing its quality in some form. In the beginning, these may only be claims with partial context: hopes and assumptions about how the software may deliver value or customer satisfaction. At this stage there’s no supporting evidence observed from interacting with the software itself; no software may even have been written. However, each claim can be tested for evidence once the software application starts to come together.
These ideas or concepts about the software application may be documented in some form as artefacts. The main examples are requirements (user stories) and designs (architectural, UI/UX). The claims themselves also need to be tested for quality by asking questions and probing for more information. Examples include: In what ways may this claim not deliver value or customer satisfaction? What risks have not been considered that may cause harm? (Ashby, 2021). From this, counter-claims of quality can be made and tested that highlight risk causing lost value and harm.
Examples of initial claims (ideas and artefacts) of value that come from the business and are tested by testers:
- Specifications and specification documents (user stories, requirements and designs)
- Help documents and user guides
- Marketing material
- Bugs with “resolved” status
- Any project tasks with “done” status
- Developers demonstrating the software’s value
- User interfaces are also an implied claim of quality: if it looks like a calculator, we expect it to work like one
- Anything positive said about the product, formally or informally
Examples of counter-claims (ideas and artefacts) of lost value or harm that come from the testers back to the business:
- Product bugs with “new” status
- Any development or test tasks with “new” or “reopened” status
- Testers demonstrating the software’s problems
- Anything negative said about the product, including complaints by users, formally or informally, documented or undocumented
- Any claim of quality that’s contradictory, confusing, ambiguous or misleading
- Any unresolved project issues and blockers
Facts, Laws, Hypotheses and Theories
Facts, laws, hypotheses and theories are important steps in science towards uncovering truth, and are also important to testers uncovering information on quality.
- Facts are anything that can be objectively observed. They form the empirical evidence in science. Examples include an apple falling from a tree or someone’s emotional reaction to something.
- Laws are collections of related observations (known as classes). Laws can be described as either quantitative formulas, like Newton’s laws of motion and gravity (Newton, 1726), or qualitative generalisations, such as how predators and prey react to each other (Langley, 2012).
- Hypotheses are ideas that explain laws and facts, although assumptions have to be made. There could be many hypotheses that are eventually ruled out or reformed through experimentation and observation. Examples include gravity as a type of force (Verlinde, 2010) or whether it’s nature or nurture that explains how animals react to predators or prey (Morehouse, Graves, Mikle and Boyce, 2016).
- Theories are the hypotheses that remain and fit all the facts so are the closest to scientific truth. Examples include Einstein’s general theory of relativity which accounts for gravity (Einstein, 1916) or various theories of genetic influence based on observations of natural selection (Darwin & Wallace, 1858).
In software quality and testing, facts and laws are the contextual factors that make up the story of quality, such as how software responds (or doesn’t respond) to user input or how a user would feel about that response. Hypotheses describe how the facts and laws (contextual factors) fit together to tell the story of quality. Each hypothesis can be tested for empirical evidence that supports or refutes it. Once all the story pieces are in place and sufficient testing of the hypothesis has been completed, the tester then has a solid theory for any given quality story that can be reported as information for stakeholders to act upon. However, constructing valid and reliable hypotheses and theories is a challenging part of science and software testing.
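One way to make a quality hypothesis concrete is to express part of it as an automated check, where each run produces one observation (a fact) that supports or refutes it. The sketch below is an illustration of that mapping; the operation, names and threshold are hypothetical, not taken from the original text:

```python
import time

def save_document(doc):
    """Stand-in for the real operation under test (hypothetical)."""
    time.sleep(0.1)  # simulate the work of saving
    return True

# Hypothesis: "saving completes successfully in under 2 seconds,
# so users will not perceive the app as unresponsive."
start = time.monotonic()
succeeded = save_document("report.txt")
elapsed = time.monotonic() - start

# One run is one observation; repeated runs under varied conditions
# build the body of evidence behind (or against) the hypothesis.
supports_hypothesis = succeeded and elapsed < 2.0
print(supports_hypothesis)
```

A passing check alone is not a theory; it is one piece of empirical evidence that, combined with further observation and exploration, builds towards one.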
Quantitative Measurements and Metrics
The quality story is a qualitative assessment, but quantitative approaches and measurements may still be used in support of this assessment. Examples of measurements used in support of the quality story include how long a user has to wait for a process to complete (performance), how many clicks it takes to complete a task (usability) or the text-to-background colour contrast ratio (accessibility). Customer satisfaction scores, customer usage data or magazine review ratings can also provide important data. Equally, this extends to testing and the project itself: KPIs may be valuable for gaining insight into how testing is going, identifying what improvements can be made to the processes, and communicating with management. However, quantitative data doesn’t negate the need for qualitative assessment. Data isn’t information. For example, how long is too long? How many clicks is too many? At what point does contrast become problematic? Can customer usage data and review scores be trusted? After all, “there is no such thing as raw data. Human beings do not simply perceive, then interpret, but rather go through a process called cognition” (Kirk and Miller, 1986b).
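To illustrate one of the measurements above, the text-to-background contrast ratio has a well-defined quantitative formula (the WCAG relative-luminance calculation). The sketch below is an illustrative implementation, not part of the original text:

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB colour (0-255 per channel)."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio between two colours, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background gives the maximum ratio of 21:1.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))
```

Even here the number alone still needs qualitative judgment: WCAG treats 4.5:1 as a minimum for normal text, but whether that suffices for a particular audience, typeface and lighting condition is an assessment, not a measurement.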
A qualitative assessment is required to hypothesise risk to customers and the project. If attempts are made to use measurements only, it can cause problems. Statistics and metrics are useful, but they don’t tell a good story of quality by themselves. In fact, they can be highly misleading. This doesn’t mean the quantitative approach doesn’t have its place or that numbers can’t be used. Performance data on various timings is important, test automation is critical to provide fast feedback and production metrics give valuable insight. It just means metrics and statistics should be used in support of a qualitative assessment of quality. The quantitative vs qualitative debate is a “well-worn argument” among social researchers; however, if done well, the combination of both (known as “mixed-mode” or “multi-strategy”) can help achieve reliability and validity. For most testers working within project constraints, use whichever approaches are most suitable at the time to get the job done (known as a “pragmatic” approach) (Robson, 2011b).
Numbers and measurements can be useful, but if their limits aren’t known and respected, quantitative measurements can be dangerous and highly misleading. In such cases, it’s best to stick to qualitative approaches to social research (Kaner, Bach and Pettichord, 2002).
It’s not wrong to define and clarify requirements in ways that may facilitate measurement. Although I claim that whatever you do along that line will not result in an objective “measurement of quality” of your service or product, it may be useful! It may be good enough. It may give you what you need to make an assessment of quality. (Bach, 2019)
The best way to think about quantitative measurements is to form a “sandwich” between two layers of qualitative assessment:
- Qualitative assessment of what to measure, how to measure it and why it should be measured, to ensure usefulness and validity to the quality story
- Quantitative measurement to gather the numerical data needed
- Qualitative assessment to ensure ongoing usefulness and validity, and to contextualise the data into the quality story as information for stakeholders
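The sandwich can be sketched in code: the two qualitative layers appear as human decisions (the comments and the chosen threshold), with the quantitative measurement in between. All names, timings and thresholds here are hypothetical illustrations:

```python
import time

# --- Qualitative layer 1: decide what to measure and why. ---
# We suspect checkout feels slow for the pilot customer, so elapsed
# wall-clock time is a useful, valid thing to measure (hypothetical).

def checkout():
    """Stand-in for the real workflow under test (hypothetical)."""
    time.sleep(0.05)

# --- Quantitative layer: gather the numerical data. ---
samples = []
for _ in range(5):
    start = time.monotonic()
    checkout()
    samples.append(time.monotonic() - start)
worst = max(samples)

# --- Qualitative layer 2: contextualise the data as information. ---
# "How long is too long?" is a judgment call: for this audience we
# (hypothetically) agreed anything over 3 seconds risks abandonment.
verdict = "acceptable" if worst < 3.0 else "too slow - investigate"
print(verdict)
```

The middle layer produces only numbers; both the decision to collect them and the verdict on what they mean remain qualitative assessments.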
Quality is assessed, described and evaluated as information for stakeholders. As a social science, testing benefits greatly from qualitative approaches. Testers make use of journals, mind maps and note-taking approaches to record thoughts and feelings as they go. Testers put themselves into the shoes of stakeholders to find problems that matter to them. Quality cannot be measured directly and testers only apply metrics and measurements carefully in support of an evaluation of quality. This makes software testing a qualitative science.
- Reliability vs Validity - Avoid misleading information by ensuring consistent and accurate testing through diversity of testing methods and approaches
- Ashby, D., 2016. Information, and its relationship with testing and checking. [online] Dan Ashby. Available at: Link
- Ashby, D., 2021. Continuous Testing Throughout the SDLC. [online] Ministry of Testing. Available at: Link
- Bach, J., 2019. Assess Quality, Don’t Measure It. [online] Satisfice. Available at: Link
- Kaner, C., Bach, J. and Pettichord, B., 2002. Lessons Learned In Software Testing: A Context Driven Approach. 1st ed. New York: Wiley, p.262.
- Kirk, J. and Miller, M., 1986. Reliability and Validity in Qualitative Research. 1st ed. Newbury Park: Sage, p.9(a), p. 49-50(b).
- Robson, C., 2011. Real World Research. 3rd ed. Oxford: Wiley, pp. 18-19(a), 29(b).
- Aad et al., 2012. Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Physics Letters B, [online] 716(1), pp.1-29. Available at: Link
- Darwin, C. and Wallace, A., 1858. On the Tendency of Species to form Varieties; and on the Perpetuation of Varieties and Species by Natural Means of Selection. Journal of the Proceedings of the Linnean Society of London. Zoology, 3(9), pp.45-62.
- Einstein, A., 1916. Die Grundlage der allgemeinen Relativitätstheorie. Annalen der Physik. 49 (7): 769–822, Available at Link (En)
- Goodall, J. et al, 1999. Cultures in chimpanzees. Nature, [online] 399(6737), pp.682-685. Available at: Link
- Huber, B., Barnidge, M., Gil de Zúñiga, H. and Liu, J., 2019. Fostering public trust in science: The role of social media. Public Understanding of Science, [online] 28(7), pp.759-777. Available at: Link
- Langley, P., 2012. Discovering Qualitative and Quantitative Laws. [online] The University of Auckland School of Computer Science. Available at: Link
- Morehouse, A., Graves, T., Mikle, N. and Boyce, M., 2016. Nature vs. Nurture: Evidence for Social Learning of Conflict Behaviour in Grizzly Bears. [online] PLOS ONE 11(11). Available at: Link
- Newton, I., 1726. Axioms or Laws of Motion. Mathematical Principles of Natural Philosophy. 3rd ed. p. 19. Available at: Link
- Šileika, A. and Bekerytė, J., 2013. The Theoretical Issues of Unemployment, Poverty and Crime Coherence in the Terms of Sustainable Development. Journal of Security and Sustainability Issues, [online] 2(3), pp.59-70. Available at: Link
- Verlinde, E., 2010. On the origin of gravity and the laws of Newton. Journal of High Energy Physics. 2011(4). Available at: Link