Common-sense Inference

“We use words to talk about the world. Therefore, to understand what words mean, we must have a prior explication of how we view the world.” – Hobbs (1987)

Researchers in Artificial Intelligence and (Computational) Linguistics have long cited the requirement of common-sense knowledge in language understanding.[1] This knowledge is viewed as a key component in filling in the gaps left by the telegraphic style of natural language statements: we are able to convey considerable information in a relatively sparse channel, presumably owing to a partially shared model at the start of any discourse.[2]
Common-sense inference – inference based on common-sense knowledge – is possibilistic: it concerns things that more or less everyone would expect to hold in a given context, but without the strength of logical entailment.[3] Because natural language corpora exhibit human reporting bias (Gordon and Van Durme, 2013), systems that derive knowledge exclusively from such corpora may be more accurately considered models of language, rather than of the world (Rudinger et al., 2015). Facts such as “A person walking into a room is very likely to be blinking and breathing” are usually unstated in text, so their real-world likelihoods do not align with language model probabilities. We would like to have systems capable of, e.g., reading a sentence that describes a real-world situation and inferring how likely other statements about that situation are to hold true in the real world. This capability is subtly but crucially distinct from the ability to predict other sentences reported in the same text, as a language model may be trained to do.


The JHU Ordinal Common-sense Inference (JOCI) corpus is a collection of 39k automatically generated common-sense inference pairs manually labelled for ordinal inference with the labels very likely, likely, plausible, technically possible, and impossible. JOCI is created to support ordinal common-sense inference, which is an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context.
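Because the five labels are ordered, a system's prediction can be compared with a human label by distance on the scale rather than by exact match alone. The following is a minimal sketch of that idea, not the corpus's official tooling: the integer codes 1 to 5, the helper names `encode` and `mean_absolute_error`, and the choice of mean absolute error as the metric are all assumptions made for illustration.

```python
# Ordinal labels named in the text, ordered from least to most likely.
# ASSUMPTION: the integer codes 1..5 are chosen here for illustration only.
LABELS = [
    "impossible",
    "technically possible",
    "plausible",
    "likely",
    "very likely",
]
LABEL_TO_SCORE = {label: score for score, label in enumerate(LABELS, start=1)}


def encode(label: str) -> int:
    """Map a label string to its (assumed) ordinal score."""
    return LABEL_TO_SCORE[label.strip().lower()]


def mean_absolute_error(gold: list[str], predicted: list[str]) -> float:
    """Average distance on the ordinal scale between gold and predicted labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one label pair is required")
    total = sum(abs(encode(g) - encode(p)) for g, p in zip(gold, predicted))
    return total / len(gold)


# Hypothetical labels, not taken from the corpus.
gold = ["very likely", "plausible", "impossible"]
predicted = ["likely", "plausible", "technically possible"]
print(mean_absolute_error(gold, predicted))
```

Under this encoding, predicting likely for a very likely pair costs one step, while predicting impossible would cost four, which is the distinction an exact-match accuracy score would miss.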

Examples of JOCI

Context | Hypothesis
John was excited to go to the fair | The fair opens.
The politician’s argument was considered absurd. | He lost the support of voters.
Several bike riders in a parade, wearing American paraphernalia with onlookers nearby. | People are sitting and watching a parade.
A bare headed man wearing a dark blue cassock, sandals, and dark blue socks mounts the stone steps leading into a weathered old building | A man is in the middle of home building.
A brown-haired lady dressed all in blue denim sits in a group of pigeons. | People are made of the denim.

Statistics of JOCI

Subset | # pairs | Context Source | Hypothesis Source
AGCI | 22,086 | SNLI-train | AGCI-WK
AGCI | 2,456 | SNLI-dev | AGCI-WK
AGCI | 2,362 | SNLI-test | AGCI-WK
AGCI | 5,002 | ROCStories | AGCI-WK
AGCI | 1,211 | SNLI-train | AGCI-NN
SNLI | 993 | SNLI-train | SNLI-entailment
SNLI | 998 | SNLI-train | SNLI-neutral
SNLI | 995 | SNLI-train | SNLI-contradiction
ROCStories | 1,000 | ROCStories-1st | ROCStories-2nd
ROCStories | 1,000 | ROCStories-1st | ROCStories-3rd
COPA | 1,000 | COPA-premise | COPA-effect
Total | 39,093


1. Sheng Zhang, Rachel Rudinger, Kevin Duh and Benjamin Van Durme.  Ordinal Common-sense Inference.  Transactions of the ACL, 2017.

Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural sequence-to-sequence model on this dataset, which we use to score and generate possible inferences. Further, we annotate subsets of previously established datasets via our ordinal annotation protocol in order to then analyze the distinctions between these and what we have constructed.

Paper link: TACL site, PDF download

Data: The JOCI corpus





[1] Schank (1975): It has been apparent … within … natural language understanding … that the eventual limit to our solution … would be our ability to characterize world knowledge.

[2] McCarthy (1959): a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows.

[3] E.g., many of the bridging inferences of Clark (1975) make use of common-sense knowledge, such as the following example of “Probable part”: I walked into the room. The windows looked out to the bay. To resolve the definite reference the windows, one needs to know that it is probable that rooms have windows.