One commonsense reasoning benchmark suite is the Choice of Plausible Alternatives (COPA) challenge.4 Each COPA problem presents a statement, a question about it, and two alternative answers. The task is to determine which of the two alternatives is more plausible. Drawing this conclusion requires a substantial amount of knowledge about the world, and yet humans are capable of doing it instantaneously.5
Handling large amounts of knowledge turns out to be a major problem for reasoning systems when they process commonsense operations. The situation is different in experiments that involve humans. Some researchers argue that, in contrast to artificial reasoners, human reasoning is more error-prone when problems are presented in abstract terms; yet as soon as there is an applied context, for example one derived from social interactions or concerned with matters of self-precaution, human performance is clearly better than machine performance.
A well-studied example regarding the challenge of drawing an inference is the Wason Selection Task.6 In the standard version of the task, a subject is presented with four different cards. The subject is told that each card contains a letter on one side and a number on the opposite side. The instructions include a statement such as "If there is a vowel on one side, the opposite side contains an even number." The subject is asked to verify or disprove this statement by turning over a minimum number of cards. In this abstract task, fewer than 25% of the subjects were able to find the solution; this result has since been confirmed with a wide variety of subjects. Even students attending logic lectures at universities achieve similarly poor results. It should be obvious (at least to a logician) that the statement "If there is a vowel on one side, the opposite side contains an even number" is formulated as a material implication of the form "If P, then Q." Hence one needs to flip the card with A: in this instance P is true, so it is necessary to check whether Q holds. Similarly, one needs to turn the card with the number 5, for if there is a vowel on the other side, the implication would be false.7
Turning the card with the number 2 is not necessary, as the statement would be true regardless of what is depicted on the other side. Numerous experiments have shown that people have problems correctly executing this abstract, but quite simple, inference. The situation changes drastically when context is added
4 Melissa Roemmele, Cosmin Adrian Bejan, Andrew S. Gordon, "Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning," in Papers from the 2011 AAAI Spring Symposium, No. 6: Logical Formalizations of Commonsense Reasoning, Palo Alto, CA: AAAI Press 2011, pp. 90-5.
5 Andrew S. Gordon, Choice of Plausible Alternatives (COPA), https://people.ict.usc.edu/~gordon/copa.html; the set of 500 questions is posted at https://people.ict.usc.edu/~gordon/downloads/COPA-questions-dev.txt.
6 Peter C. Wason, "Reasoning About A Rule," The Quarterly Journal of Experimental Psychology 20/3 (August 1968), 273–281.
7 Ulrike Barthelmeß and Ulrich Furbach, A Different Look at Artificial Intelligence: On Tour with Bergson, Proust and Nabokov, Wiesbaden, DE: Springer Nature 2023, pp. 84-6; the image of Figure 2 is taken from p. 85.
Figure 1: Example problem #65 from the COPA challenge.
Figure 2: The Wason Selection Task
(a) If one side of a card contains a vowel, the other side is an even number.
(b) Each card represents a person's age and beverage. Persons below the age of 21 years are not allowed to drink beer.
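The logic behind the Wason Selection Task can be made explicit in a few lines of Python. The following is only an illustrative sketch, not part of the cited experiments: a rule of the form "If P, then Q" can be falsified only by a card on which P is true and Q is false, so the cards worth turning are exactly those whose visible face shows that P is true or that Q is false. The function names and the concrete card values are assumptions made for illustration. Of the abstract cards, only A, 5 and 2 are named in the text; the consonant K is a hypothetical fourth card. All four cards in the age and beverage version are likewise hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Union

Face = Union[str, int]
# A status function returns True or False when the visible face settles the
# proposition, and None when the proposition concerns the hidden side.
Status = Callable[[Face], Optional[bool]]


def cards_to_turn(visible: Sequence[Face], p: Status, q: Status) -> List[Face]:
    """Return the visible faces that must be turned to test 'If P, then Q'."""
    return [face for face in visible if p(face) is True or q(face) is False]


# Abstract version: "If there is a vowel on one side, the opposite side
# contains an even number."
def is_vowel(face: Face) -> Optional[bool]:
    if isinstance(face, str):
        return face.upper() in {"A", "E", "I", "O", "U"}
    return None  # a number is showing, the letter is hidden


def is_even(face: Face) -> Optional[bool]:
    if isinstance(face, int):
        return face % 2 == 0
    return None  # a letter is showing, the number is hidden


# Contextual version: "If a person drinks beer, the person is at least 21."
def drinks_beer(face: Face) -> Optional[bool]:
    if isinstance(face, str):
        return face == "beer"
    return None  # an age is showing, the beverage is hidden


def is_of_age(face: Face) -> Optional[bool]:
    if isinstance(face, int):
        return face >= 21
    return None  # a beverage is showing, the age is hidden


# Hypothetical card sets (see the assumptions stated above).
abstract_cards: List[Face] = ["A", "K", 2, 5]
social_cards: List[Face] = ["beer", "cola", 25, 16]

print(cards_to_turn(abstract_cards, is_vowel, is_even))
print(cards_to_turn(social_cards, drinks_beer, is_of_age))
```

Under these assumptions the same selection rule picks A and 5 in the abstract version, and the beer drinker and the 16-year-old in the contextual version. The two tasks are logically identical; only the framing differs.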