Monday, November 28, 2016

Semantic Graphs and Examples

You've got to admit it, graphs have an allure that terms do not have. From silly graphs to matters-of-life-and-death graphs, people simply love the stuff. Terms, even the lambda-calculus ones, do not have such an appeal.

So it makes some sense to see if we can capitalize on graphs' affordances for Natural Language semantics, of the style we like.

This is a guest blog post by Dick Crouch. Nah, I lie. It's the mathematics of his stuff that I am trying to understand. There is also a collection of slides in the archive of the Delphi-in meeting at Stanford, summer 2016 ("Standardizing Interface Representations for Downstream Tasks like Entailment or Reasoning"), but the notes below are older, from May 2015.


These are notes towards a proposal for a graphical semantic representation for natural language. Its main feature is that it layers a number of sub-graphs over a basic sub-graph representing the predicate-argument structure of the sentence. These sub-graphs, sketched in code right after the list, include:

  • A context / scope sub-graph. This represents the structure of propositional contexts (approximately possible worlds) against which predicates and arguments are to be interpreted. This layer is used to handle Boolean connectives like negation and disjunction, propositional attitude and other clausal contexts (belief, knowledge, imperatives, questions, conditionals), and quantifier scope (under development). The predicate-argument and context graphs go hand in hand, and one cannot properly interpret a predicate-argument graph without its associated context graph.
  • A property sub-graph. This associates terms in the predicate-argument graph with lexical, morphological, and syntactic features (e.g. cardinality, tense and aspect morphology, specifiers).
  • A lexical sub-graph. This associates terms in the predicate-argument graph with lexical entries. There can be more than one lexical entry for each word, and the sub-graph is populated by the concepts and semantic information obtainable from a knowledge base such as Princeton WordNet.
  • A link sub-graph. This contains co-reference and discourse links between terms in the pred-arg graph. (It has also been used in entailment and contradiction detection to record term matches between premise and conclusion graphs.)
  • Other sub-graphs are possible. A separate temporal sub-graph for spelling out the semantics of tense and aspect is under consideration.
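
To make the layering concrete, here is a minimal sketch in Python of one way such a structure could be kept: a shared set of nodes with one labelled edge set per layer. Everything in it (the class name, the layer names, the edge encoding) is my own assumption, not Crouch's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticGraph:
    """One set of nodes, several layers of labelled edges over them."""
    nodes: set = field(default_factory=set)
    layers: dict = field(default_factory=lambda: {
        "pred_arg": set(),  # predicate-argument structure
        "context": set(),   # propositional contexts / scope
        "property": set(),  # cardinality, tense/aspect, specifiers, ...
        "lexical": set(),   # links to word senses / KB concepts
        "link": set(),      # co-reference and discourse links
    })

    def add_edge(self, layer, source, label, target):
        self.nodes.update({source, target})
        self.layers[layer].add((source, label, target))

    def edges(self, layer):
        return sorted(self.layers[layer])
```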


This proposal has been partially implemented, and appears to have some practical utility. But theoretically it has not been fully fleshed out. These notes do not perform this fleshing out task, but just aim to describe some of the motivations and issues.

To give an initial idea of what these graphs are like, here are some examples showing the basic predicate-argument and context structures for some simple sentences. The predicate-argument nodes are shown in blue, and the contexts in grey.
 

1. John did not sleep.

produces the graph above. All sentences are initially embedded under the true context (t) -- on the top right. However, the negation induces a new context embedded under t. In this negated context, an instance of the concept "sleeping by John" can be instantiated. But the effect of the "not" link between t and the embedded context means that this concept is held to be uninstantiable in t.

Every context will have a context-head (ctx_hd) link to a node in the predicate-argument graph. That node represents a lexical concept (possibly further restricted by its syntactic arguments). The context-head concept is always held to be instantiable in its corresponding context. But whether it continues to be instantiable in sub- or super-ordinate contexts depends on the kind of link between the contexts.
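
As a sketch, and assuming the SemanticGraph class from above, example 1 might be encoded as follows. Only t, the "not" link, and ctx_hd come from the text; the node names (sleep_2, John_1, ctx_not) are invented for illustration.

```python
g = SemanticGraph()

# Predicate-argument layer: a sleeping event with John as its subject.
g.add_edge("pred_arg", "sleep_2", "subj", "John_1")

# Context layer: the negation introduces a new context under the top
# context t; the "not" link is what makes the head concept of the
# embedded context uninstantiable in t.
g.add_edge("context", "t", "not", "ctx_not")

# The embedded context's head (ctx_hd) is the "sleeping by John" concept,
# which is instantiable in ctx_not itself.  (The ctx_hd of t is omitted
# from this sketch.)
g.add_edge("context", "ctx_not", "ctx_hd", "sleep_2")
```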

Not explicitly shown in this graph, but present in the actual graphs, are further non-head links from each predicate-argument term to its introducing context.

If you want to relate this to Discourse Representation Structures, you can see the context labels as being the names of DRS boxes.

2. John believes that Mary does not like him.

This is a slightly more complex example, where we can see that the word "believe" introduces an additional context for the complement clause "Mary does not like him". In the t context, there is a believing by John of something. What that something is is spelled out in the clausal context (ctx_5x1), which is the negation of the clausal context "Mary likes him". The example also shows a co-reference link between the subject of "believe" and the object of "like".
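
Continuing the sketch, example 2 might come out as below. The context name ctx_5x1 is the one used in the text; the remaining labels (ctx_like, the "believe" link, the node names) are my own invention.

```python
g = SemanticGraph()
g.add_edge("pred_arg", "believe_2", "subj", "John_1")
g.add_edge("pred_arg", "believe_2", "comp", "ctx_5x1")  # clausal complement
g.add_edge("pred_arg", "like_6", "subj", "Mary_4")
g.add_edge("pred_arg", "like_6", "obj", "him_8")

# The believing is instantiable in t; "believe" introduces ctx_5x1,
# which is the negation of the context headed by the liking.
g.add_edge("context", "t", "ctx_hd", "believe_2")
g.add_edge("context", "t", "believe", "ctx_5x1")
g.add_edge("context", "ctx_5x1", "not", "ctx_like")
g.add_edge("context", "ctx_like", "ctx_hd", "like_6")

# Link layer: "him" co-refers with the subject of "believe".
g.add_edge("link", "him_8", "coref", "John_1")
```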

3. John or Mary slept.

This illustrates the treatment of disjunction. Like negation, disjunction is viewed as a context introducer (i.e. natural language disjunction is inherently modal / intensional, unlike disjunction in classical propositional or first-order logic). The way to read the graph is that there is some group object that is the subject of sleep. Both the group object and the sleeping by the group object are asserted to be instantiable in the top level context. The group object is further restricted by its membership properties: in one context John is an element of the object, and in another Mary is an element of the group object.
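
A sketch of example 3 under the same assumptions; the group node and the element_of edges are invented labels that simply follow the prose description.

```python
g = SemanticGraph()

# A group object is the subject of sleep; both the group and the
# sleeping are instantiable in the top context t.
g.add_edge("pred_arg", "sleep_4", "subj", "group_1")
g.add_edge("context", "t", "ctx_hd", "sleep_4")

# Disjunction introduces one context per disjunct; each restricts the
# group's membership differently.
g.add_edge("pred_arg", "John_1", "element_of", "group_1")
g.add_edge("pred_arg", "Mary_3", "element_of", "group_1")
g.add_edge("context", "t", "or", "ctx_john")
g.add_edge("context", "t", "or", "ctx_mary")
g.add_edge("context", "ctx_john", "ctx_hd", "John_1")
g.add_edge("context", "ctx_mary", "ctx_hd", "Mary_3")
```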

4. John loves Mary.

OK, I bet this one caught you by surprise!

Just for the hell of it, this time here is a fuller graph for a simpler sentence, showing the other lexical and property sub-graphs. The "lex" arcs point to possible word senses for the predicate-argument terms. Not shown in the diagram is that the labels on the sense nodes encode information about the taxonomic concepts associated with the word senses. Likewise not illustrated in any of these graphs is the fact that the predicate-argument node labels encode things like part of speech, stem and surface form, position in the sentence, etc.
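
And a sketch of example 4 with the property and lexical layers filled in. The sense identifiers imitate Princeton WordNet's naming scheme but are illustrative, not looked up; the property features are likewise only examples.

```python
g = SemanticGraph()
g.add_edge("pred_arg", "love_2", "subj", "John_1")
g.add_edge("pred_arg", "love_2", "obj", "Mary_3")
g.add_edge("context", "t", "ctx_hd", "love_2")

# Property layer: morphological and syntactic features.
g.add_edge("property", "love_2", "tense", "present")
g.add_edge("property", "John_1", "cardinality", "1")

# Lexical layer: "lex" arcs to candidate word senses; in the real graphs
# the sense-node labels also encode taxonomic information.
g.add_edge("lexical", "love_2", "lex", "love.v.01")
g.add_edge("lexical", "love_2", "lex", "love.v.02")
g.add_edge("lexical", "John_1", "lex", "john.n.01")
```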
 
The way these graphs are obtained is completely separable from, and less important than, an abstract definition of semantic graph structures that allows one to specify how to process the semantics in various ways (e.g. direct inference on graphs, conversion of graphs to representations suitable for theorem proving, etc.).
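
For instance, here is one hedged sketch of such processing: a naive walk over the context layer that reads the graph back as a crude logical form. The traversal and the output syntax are my assumptions; the point is only that the operation is defined over the abstract graph, independently of how the graph was obtained.

```python
def to_formula(g, ctx="t"):
    """Crudely read a context and its sub-contexts back as a formula."""
    parts, disjuncts = [], []
    for (src, label, tgt) in g.edges("context"):
        if src != ctx:
            continue
        if label == "ctx_hd":
            # The head concept, applied to its pred-arg dependents.
            args = [t for (s, _, t) in g.edges("pred_arg") if s == tgt]
            parts.append(f"{tgt}({', '.join(args)})")
        elif label == "not":
            parts.append(f"not({to_formula(g, tgt)})")
        elif label == "or":
            disjuncts.append(to_formula(g, tgt))
    if disjuncts:
        parts.append("(" + " | ".join(disjuncts) + ")")
    return " & ".join(parts) if parts else "true"
```

Run over the example 1 sketch this yields not(sleep_2(John_1)); over example 3, sleep_4(group_1) & (John_1(group_1) | Mary_3(group_1)).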

Maybe you think that the use of transfer semantics as above seems like overkill, at least for the purposes of providing inputs for natural language inference. The transfer semantics pipeline was originally set up to ease the conversion of linguistic semantic representations into more canonical knowledge representations. As such, there is considerable emphasis on normalizing different semantic representations so that, wherever possible, the same content is represented in the same way: this simplifies the conversion to KR.

But maybe there is no particular reason to do all this normalization on the inputs if all you want to do is inference. It might be better to figure out a lighter-weight process for adding extra layers of semantic information directly to the dependency structures produced by the parser. Much like many others are doing nowadays.

But which kinds of representations make sense for inference is indeed something worth thinking hard about.