Formalisms and Methodology for Learning by Reading

FAM-LbR 2010

Machine Reading as a Process of Partial QA

Peter Clark and Phil Harrison

Machine reading goal:

  • construct an inference-supporting representation from text
  • connect what is read with what is known -- reader knows something, and the text elaborates/deepens that knowledge

In QA, the "remainder" that we don't know counts as failure; in machine reading, the "remainder" that we don't know is new knowledge.

Interleave interpretation with answering:

  • start with logical form
  • consider several alternative disambiguations
  • compare them with the existing knowledge base; are they provable or (partially) known?
  • iterate
  • end up with a disambiguated semantics that is consistent with the knowledge base

Example: if we see "joined", it could mean "accompanied by" or "attached to" (WSD).

Create a tree of possible interpretations; interpret and try to prove parts of the logical form.
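A minimal sketch of the interleaving idea, not the authors' system: each candidate reading of a sentence is a set of facts, the reading that overlaps most with the knowledge base wins, and whatever remains unproved is kept as new knowledge. The fact tuples, the `interpret` function, and the use of plain set membership in place of real inference are all assumptions made for illustration.

```python
# Toy knowledge base of (relation, arg1, arg2) facts. All names are hypothetical.
kb = {("is-a", "engine", "part"), ("is-a", "wing", "part")}

# Two candidate readings of "The engine is joined to the wing."
candidates = [
    frozenset({("accompanied-by", "engine", "wing"), ("is-a", "engine", "agent")}),
    frozenset({("attached-to", "engine", "wing"), ("is-a", "engine", "part")}),
]

def interpret(candidates, kb):
    """Return (best reading, facts already known, facts that are new)."""
    # Set membership stands in for "provable from the KB" in this sketch.
    best = max(candidates, key=lambda facts: len(facts & kb))
    return best, best & kb, best - kb

best, known, new = interpret(candidates, kb)
kb |= new  # the unproved remainder is kept as new knowledge
print(sorted(known))
print(sorted(new))
```

Because the knowledge base grows after each sentence, a later sentence is scored against a different knowledge base than an earlier one, which is one way the reading order could matter.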

Open question: how sensitive is this approach to the order in which we 'read' documents?

Audience questions mostly focused on "what's new here?" and "does this scale?"

Building an end-to-end text reading system based on a packed representation

Doo Soon Kim, Ken Barker and Bruce Porter

A simple pipeline prunes very aggressively at each stage. The alternative is to keep the n-best candidates using a beam, but that leads to combinatorial expansion.

So they used a packed representation through the entire system.

Target representation: graphical representation of dependencies between events/objects. Nodes are things like "has-part" and "object-of-event"

Packed Graphical (PG) Representation

  • Base representation plus constraints.
  • Base representation is a graph with variables.
  • We can then put constraints on those variables (e.g. R1=(foo|bar)), and we can put constraints on the relations between the variables.
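To make the "base plus constraints" idea concrete, here is a small sketch under stated assumptions: variables carry a set of allowed values (as in R1=(foo|bar)), and a constraint between variables is a predicate over an assignment. The second variable `R2`, its values, the specific constraint, and the `unpack` helper are hypothetical and are not taken from the talk.

```python
from itertools import product

# Constraints on single variables: each variable's allowed values.
domains = {
    "R1": ["foo", "bar"],
    "R2": ["baz", "qux"],  # hypothetical second variable
}

# Constraints between variables, written as predicates over an assignment.
# Hypothetical rule: if R1 is "foo" then R2 must be "baz".
constraints = [
    lambda a: a["R1"] != "foo" or a["R2"] == "baz",
]

def unpack(domains, constraints):
    """Enumerate every concrete reading that satisfies all constraints."""
    names = sorted(domains)
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            yield assignment

readings = list(unpack(domains, constraints))
print(len(readings))
```

The packed form stores two small domains and one rule; enumerating readings as `unpack` does is exactly the expansion that a packed representation lets a system postpone.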

Ambiguity types:

  • parsing ambiguity
  • type ambiguity
  • relation ambiguity
  • coref ambiguity

Disambiguation: identify mappings between (hopefully) redundant texts, and then merge the information from them.

To do the merging: convert constraints to Bayesian networks, then merge the networks.

Pruning: discard low-probability candidates. This can propagate, because of the derivation constraints. This is actually done with Bayesian networks.
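The propagation step can be sketched without Bayesian networks, using a plain dependency table instead. This swaps in a much simpler technique than the one described in the talk: the candidate names, the probabilities, the `derived_from` table, and the rule that a candidate needs all of its listed parents are assumptions for illustration only.

```python
def prune(probabilities, derived_from, threshold):
    """Drop low-probability candidates and everything derived from them."""
    removed = {c for c, p in probabilities.items() if p < threshold}
    changed = True
    while changed:  # repeat until no further candidate loses a parent
        changed = False
        for candidate, parents in derived_from.items():
            if candidate not in removed and any(p in removed for p in parents):
                removed.add(candidate)
                changed = True
    return set(probabilities) - removed

# Made-up toy values, not results from the paper.
probabilities = {
    "parse-A": 0.9, "parse-B": 0.05,
    "type-A1": 0.8, "type-B1": 0.7, "rel-B1": 0.6,
}
derived_from = {
    "type-A1": ["parse-A"],
    "type-B1": ["parse-B"],
    "rel-B1": ["type-B1"],
}

survivors = prune(probabilities, derived_from, threshold=0.1)
print(sorted(survivors))
```

Here discarding one unlikely parse also removes the type and relation candidates that were built on it, even though their own scores were above the threshold.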

Semantic Enrichment of Text with Background Knowledge

Anselmo Peñas and Eduard Hovy

Typically, texts omit important information.

Goal: automatically recover the omitted information. "Enrichment"

Use a domain-specific knowledge base of pattern counts to enrich semantically poor relationships (e.g. noun-noun compounds).
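One way to picture count-based enrichment, as a sketch only: given a noun-noun compound, look up the patterns that link the two nouns and take the most frequent relation. The pattern triples, the counts, and the `enrich` function are invented for this example and do not come from the paper.

```python
from collections import Counter

# Hypothetical background knowledge: counts of (head, relation, modifier)
# patterns. The numbers are made up for illustration.
pattern_counts = Counter({
    ("company", "produce", "car"): 40,
    ("company", "sell", "car"): 25,
    ("company", "own", "car"): 3,
    ("company", "hire", "driver"): 7,
})

def enrich(head, modifier, counts):
    """Return the most frequent relation linking head and modifier, if any."""
    candidates = {
        rel: n for (h, rel, m), n in counts.items()
        if h == head and m == modifier
    }
    if not candidates:
        return None  # no background knowledge for this pair
    return max(candidates, key=candidates.get)

# "car company": which relation most plausibly links the two nouns?
print(enrich("company", "car", pattern_counts))
```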

Large Scale Relation Detection

Chris Welty, James Fan, David Gondek and Andrew Schlaikjer

Mining Script-Like Structures from the Web

Niels Kasch and Tim Oates

Open-domain Commonsense Reasoning Using Discourse Relations from a Corpus of Weblog Stories

Matthew Gerber, Andrew Gordon and Kenji Sagae

Semantic Role Labeling for Open Information Extraction

Janara Christensen, Mausam, Stephen Soderland and Oren Etzioni

Empirical Studies in Learning to Read

Marjorie Freedman, Edward Loper, Elizabeth Boschee and Ralph Weischedel

Learning Rules from Incomplete Examples: A Pragmatic Approach

Janardhan Rao Do

Unsupervised techniques for discovering ontology elements from Wikipedia article links

Zareen Syed and Tim Finin

Machine Reading at the University of Washington

Hoifung Poon, Janara Christensen, Pedro Domingos, Oren Etzioni, Raphael Hoffmann, Chloe Kiddon, Thomas Lin, Xiao Ling, Mausam, Alan Ritter, Stefan Schoenmackers, Stephen Soderland, Dan Weld, Fei Wu and Congle Zhang

Analogical Dialogue Acts: Supporting Learning by Reading Analogies

David Barbella and Kenneth Forbus

A Hybrid Approach to Unsupervised Relation Discovery Based on Linguistic Analysis and Semantic Typing

Zareen Syed and Evelyne Viegas

Supporting rule-based representations with corpus-derived lexical information

Annie Zaenen, Cleo Condoravdi, Daniel Bobrow and Raphael Hoffmann

PRISMATIC: Inducing Knowledge from a Large Scale Lexicalized Relation Resource

James Fan, David Ferrucci, David Gondek and Aditya Kalyanpur