*
* Lecture notes by Edward Loper
*
* Course: CIS 630 (NLP Seminar: Structural Representations)
* Professor: Joshi
* Institution: University of Pennsylvania
*
[01/15/01 05:17 PM]
>> Representationally Oriented Grammars
(or Grammars for Analysis, Grammars as Constraint Satisfaction)
Look at grammar as a set of constraints that the sentence must
satisfy. Sentence is associated with a description:
# S \to D
Grammar's job is to decide whether a description D is consistent.
Just tells you what representations are licensed, doesn't say how to
get them.
>> Derivationally Oriented Grammars
(or grammars for generation)
Say how to derive a sentence: how to construct a derivation D.
Generative system.
[01/22/01 04:34 PM]
Define finite automata on tree:
type\_node x (state\_child1, state\_child2, \ldots) \to state\_node
e.g.
The x () \to q1
table x () \to q2
NP x (q1, q2) \to q3
At the top, check if you're in the set of accepting states.
"recognizable set" of trees is exactly those trees that are accepted
by FSA on trees.
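The scheme above can be sketched in Python (my own toy encoding; the transition table, state names, and `accepts` helper are illustrative, not from the lecture):

```python
# Sketch of a bottom-up finite-state tree automaton (encoding mine).
# A transition maps (node_label, tuple_of_child_states) -> state;
# a tree is accepted iff the state reached at its root is accepting.

def run(tree, transitions):
    """Return the state assigned to the root of `tree`.
    `tree` is a pair (label, children); leaves have children == []."""
    label, children = tree
    child_states = tuple(run(c, transitions) for c in children)
    return transitions[(label, child_states)]

def accepts(tree, transitions, accepting):
    return run(tree, transitions) in accepting

# The example from the notes: The x () -> q1, table x () -> q2,
# NP x (q1, q2) -> q3.
transitions = {
    ("The", ()): "q1",
    ("table", ()): "q2",
    ("NP", ("q1", "q2")): "q3",
}
print(accepts(("NP", [("The", []), ("table", [])]), transitions, {"q3"}))  # True
```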
> to do
- send email re what i want to work on
[01/29/01 04:41 PM]
nested dependencies vs. crossed dependencies: CFGs can only give nested
dependencies.
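A minimal illustration of the point (my own example): the CFG S -> a S b | epsilon generates a^n b^n, and every derivation pairs each "a" with a "b" inside-out, i.e., nested dependencies.

```python
# Toy example (mine): S -> a S b | epsilon. The derivation nests the
# pairs, (a3 (a2 (a1 b1) b2) b3). Crossed dependencies, a1 a2 a3
# b1 b2 b3 with a_i linked to b_i, have no such CFG derivation.

def nested(n):
    """One derivation of S -> a S b | epsilon, with indices showing
    which a is paired with which b."""
    if n == 0:
        return []
    return [f"a{n}"] + nested(n - 1) + [f"b{n}"]

print(nested(3))  # ['a3', 'a2', 'a1', 'b1', 'b2', 'b3']
```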
[02/05/01 04:34 PM]
see joshi re lexical semantic info after class..
take initiative? ;)
> Unification
Implementing constraints on substitution and adjoining (and in
particular adjoining)
3 types of constraints:
- selective adjoining -- Feature structures implicitly specify
constraints.
- null adjoining
- obligatory adjoining -- at that node, at least one adjunction must
take place
Adjoining changes already-built structure. It's a higher order
operation than substitution, and a higher order abstraction..
#       ^                              ^
#      / \      adjoin X2 gives       / \
#     / X1\          / \             / X4\
#    /_/_\_\        /_X3\           /_/_\_\
#                                    /_X5\
#                                     /_\
# Where:
#   t(X4) = t(X2) U t(X1)
#   b(X4) = b(X2)
#   t(X5) = t(X3)
#   b(X5) = b(X3) U b(X1)
Where "U" is unification..
Substitution is a special case, where X is a leaf, and it has no
bottom features.
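The t/b bookkeeping above can be sketched as dictionary unification (a simplification I'm assuming: feature structures are flat attribute-value dicts, and unification fails on any clash of atomic values; real TAG feature structures are recursive):

```python
# Sketch of the t/b feature equations for adjoining (assumptions mine).

def unify(f, g):
    """Unify two flat feature dicts; return None on a clash."""
    out = dict(f)
    for k, v in g.items():
        if k in out and out[k] != v:
            return None  # clash: this adjunction is blocked
        out[k] = v
    return out

def adjoin_features(x1, x2, x3):
    """x1 = adjunction site, x2 = root of the auxiliary tree,
    x3 = foot of the auxiliary tree; each is a (top, bottom) pair.
    Implements:  t(X4) = t(X2) U t(X1);  b(X4) = b(X2)
                 t(X5) = t(X3);          b(X5) = b(X3) U b(X1)"""
    t1, b1 = x1
    t2, b2 = x2
    t3, b3 = x3
    t4 = unify(t2, t1)
    b5 = unify(b3, b1)
    if t4 is None or b5 is None:
        return None  # selective adjoining: this aux tree is ruled out
    return (t4, b2), (t3, b5)

# Hypothetical example: the aux tree's root must agree with the site.
site = ({"cat": "VP"}, {"tense": "inf"})
aux_root, aux_foot = ({"cat": "VP"}, {}), ({}, {"tense": "inf"})
x4, x5 = adjoin_features(site, aux_root, aux_foot)
```

A clash (e.g., conflicting `cat` values) makes `adjoin_features` return None, which is how feature structures implicitly encode selective-adjoining constraints.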
>> LTAG
# G = (I, A)
# I = initial trees
# A = adjunction trees
We don't give rewrite rules because adjunction and substitution are
language independent.
# T(G) = trees
# L(G) = strings
# TAL = \cup_G L(G)
Theorems:
1. TAL is more powerful than CFGs (CFL \subset TAL, proper subset)
Note: You can get crossed dependencies (2 nested dependencies
sharing elements gives crossed dependencies)
2. even when TAG's surface form is CFG, SD's of CFGs \subset SD's of
TAGs. (SD=structural description)
#   S
#  /|                      S
# a S   = crossed         /|\   = nested
#   |\                   a S b
#   S b
proper analysis?
[02/07/01 04:37 PM]
Representation should make it clear what the constituents are. What
about discontinuous constituents, etc.?
we have to define what we mean by constituents..
Representations should also make the dependencies clear. what are
dependencies? what are dependencies between constituents?
[02/20/01 05:12 PM]
> Generation talk
>> Formal structure for generation
- Syntax
- Semantics
- Links to context (& goals of conversation)
Consider the LTAG derivation tree:
#           slide
#          /     \
#  coupling-nut  onto
#                  \
#                 elbow
define:
- new
- assertion: move(e, h, n, p) is next(e)
- shared
- presup
- pragmatics
[03/01/01 04:39 PM]
> Dependency and Locality in TAG
What is dependency?
- thematic dependency:
  - relationship between predicates and arguments
  - locality: syntactically realized within some constrained
    structural domain.
  - always local, but sometimes operating on traces, etc.
- structural dependency
  - relationship between 2 elements in a structure
  - e.g., moved element and trace (coindexing)
  - subject to locality constraints
Why?
- accounts for syntax data..
What *is* the local structural domain?
Primary argument of Frank: a privileged structural domain to express
local dependency relations can be defined.
Terms:
- Basis: atomic units out of which structures are built
- Structural Composition: closure of basis over composition rules
- Transformation: modify existing rep. transformations create
  structural dependencies!
Kernel structures from Chomsky '55: basically simple active sentences.
can they be the domain of locality? but what about transformations??
if we interleave transformations and compositions, things get
arbitrarily far apart..
In TAG, all transformations take place prior to formation of
elementary trees:
# basis --(move + merge)\longrightarrow elementary tree
# elementary tree --(adjoin + subst)\longrightarrow sentences
[03/06/01 04:47 PM]
> Raising, Superraising, There-Insertion
Raising in GB is defined by a transformational account. Raise an
element to a site higher in the tree. Raising attempts to preserve
some sense of locality -- trace..
In TAG, there's no transformation. Raising is defined by adjoining.
E.g., define "seems" as an auxiliary tree that adjoins between "John"
and "to like broccoli." Locality is preserved because they come from
the same fundamental tree.
Recursion in GB: successive cyclic movement
Recursion in TAG: multiple adjoins.
>> Super-Raising
- John_i seems [IP t_i to be likely [IP t_i to eat broccoli ]]
- *John_i seems [IP it is likely [IP t_i to eat broccoli ]]
Why is the second one bad? In GB:
- representational constraint
- derivational constraint
Either way, we must give an explicit constraint.
But in TAG, it comes for "free."
How to deal with "it is likely"?
No super-raising: you can't combine I'..IP and IP..I' to get I'..I'
[03/08/01 04:39 PM]
> Bob Frank
- What makes an elementary tree valid for a language?
- What makes a derived tree an acceptable sentence?
- Not all elementary trees are acceptable sentences.
Consider:
- There [seems] to be a VP in the hospital
* There [seems] a VP to be in the hospital
Second one is invalid because we don't have the elementary tree:
* There a VP to be in the hospital
But what if we interpret it as:
* [There seems] a VP to be in the hospital
So why can't "there seems" be an elementary tree? maybe because seems
wants to take predicates, and "there" isn't an argument..
But what about [it seems]? If our elementary tree for it is:
# [TP it [T' [T \ldots] [VP [V seems] [CP \ldots]]]]
And what about "it is raining"?
And what about sentences (in other languages) like:
- it was danced by John.
Does EPP hold on elementary trees? Yes. Otherwise, we'd never
generate the elementary trees for "it is raining." But on the other
hand we have elementary trees that don't satisfy the EPP.
So why are "there" and "it" different? I.e., why do we get:
# [TP it [T' [T \ldots] [VP [V seems] [CP \ldots]]]]
but not:
# [TP there [T' [T \ldots] [VP [V seems] [CP \ldots]]]]
If we assume that subjects always begin in VP, we have to explain
why/how they get to spec/TP.
Define a lexical array = a list of lexical items with selectional
features. Selectional features require that an item merge with
certain types of object. Then use Merge & Move to construct your
elementary trees from small LAs. LAs must have at most one
semantically contentful element..
When creating elementary trees, keep going until you've dealt with as
many uninterpretable features as you can.. E.g., assume T has [+EPP]
feature. So:
# [VP DP [V' [V expected] TP]] \to
# [TP DP [VP t [V' [V expected] TP]]]
But:
# [VP [V seems] TP] \to
# [TP [VP [V seems] TP]]
\phi features are agreement features. But they're not selectional.
"It" agrees with \phi, but "there" doesn't. Which means that we can get:
# [TP it [T [VP [V seems] TP]]]
because "it" satisfies both the EPP and the \phi features of T. But we
can't get the same thing with "there" because there doesn't satisfy
the \phi features:
# * [TP there [T [VP [V seems] TP]]]
c.f.:
# it seems that A and B (note: seems is singular)
# there are A and B (note: are is plural)
Claim: after we've generated our elementary trees, the unchecked
features play the role of placing restrictions on what adjunctions are
allowed.
A can only adjoin to B if doing so satisfies some of B's features.
comments to: rfrank@jhu.edu
[03/27/01 03:07 PM]
> Reduced Constructions
>> Scrambling and TAG
# that no one dared [ the bike to repair]
# that the bike no one dared [ to repair]
This movement is unbounded.
Scrambling not allowed by all verbs.
>> Clitic Climbing
Clitic can "climb" to higher clauses, for some verbs.
>> The problem
moved constituent ends up in the *middle* of the upper clause..
For:
# John seems to like the pizza
We can just adjoin in "seems" to the E.T.:
# NP to like NP.
But where does "does" belong in:
# Does John seem to like the pizza?
? XTAG makes "does" its own tree. But it seems like "does" is on
C, so we might want:
# [Cbar [C does] [IP [Ibar [I seem] \ldots]]]
But then what adjoins into what? One option is to use multicomponent
tree-local tag, and do:
# [Cbar [C does] \ldots]
and
# [Ibar [I seem] \ldots]
But you can get unbounded raising.. So the components can get
arbitrarily far apart..
[04/03/01 03:37 PM]
> More fun with C/G
>> Reconstructing C/G in the form of LTAG..
CG derivation trees parallel LTAG elementary trees.
4 rules:
# Function application:
# (S/NP) NP \to S
# NP (NP\backslash S) \to S
# Function composition:
# X/Y Y/Z \to X/Z
# X\backslash Y Y\backslash Z \to X\backslash Z
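The two rule schemata can be sketched over a toy category encoding (my own representation, not from the lecture: atoms as strings, complex categories as 3-tuples written in the order the category reads):

```python
# Toy CG combinators (encoding mine). Atomic categories are strings.
# ("/", X, Y) stands for X/Y: result X, takes Y on its right.
# ("\\", Y, X) stands for Y\X: takes Y on its left, result X --
# matching the rules (S/NP) NP -> S and NP (NP\S) -> S.

def apply_cats(left, right):
    """Function application; returns the result category or None."""
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        return left[1]                        # X/Y  Y  ->  X
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        return right[2]                       # Y  Y\X  ->  X
    return None

def compose_cats(left, right):
    """Function composition: X/Y Y/Z -> X/Z and X\\Y Y\\Z -> X\\Z."""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[0] == right[0] and left[2] == right[1]):
        return (left[0], left[1], right[2])
    return None

# "John likes Mary": NP, (NP\S)/NP, NP
iv = ("\\", "NP", "S")      # NP\S
tv = ("/", iv, "NP")        # (NP\S)/NP
print(apply_cats("NP", apply_cats(tv, "NP")))  # S
```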
You can assume [A], and derive [B]. This lets us prove that A\to B.
This is "withdrawing the assumption" or "discharging the assumption"
# [A] ; make assumption
# \vdots
# B
# ----- ; withdraw assumption
# A\to B
# CFG \longrightarrow LTAG
# \downarrow \downarrow
# CG(AB) \to CG(PPT)
#
# Where:
# \to is strong-lex EDL, FRD
# \downarrow is weak equivalence
# CG(AB) = standard, vanilla CG
# CG(PPT) = C/G with partial proof tree
This is similar to what Bob Frank was doing with LTAG and minimalism,
but for CG.
Instead of assigning "likes" the type "(NP\backslash S)/NP", assign it a partial
proof tree:
#                  likes
#                    |
#     [NP]   (NP\backslash S)/NP   [NP]
#            ------------------
#                 NP\backslash S
#     ---------------------------
#                 S
Thus, these PPT types have assumptions.
Normally, the only way to satisfy an assumption is to withdraw it.
But we will introduce new ways of satisfying an assumption?
Construct a finite collection of partial proof trees, where each PPT
is a syntactic type associated with a lexical item. These are
analogous to elementary trees. Then introduce a composition method.
These must be inference rules..
>> Inference rule: linking
B(PPT) = the set of basic partial proof trees. How do we construct
them?
* Unfold arguments of types by introducing assumptions.
* No unfolding past an argument that's not an argument of the
lexical item.. e.g., adverb is VP\to VP, i.e., its type is
(NP\backslash S)\backslash(NP\backslash S). But how do we keep from unfolding the VP? Mark
a node that stops unfolding with an asterisk: (NP\backslash S)\backslash(NP*\backslash S).
* If a trace assumption is introduced while unfolding then it must
be locally discharged, i.e., within the basic PPT which is being
introduced.
* While unfolding we can interpolate, say, from X to Y where X is a
conclusion node and Y is an assumption node.
>> Stretching
# Y = u v w, with X = the single conclusion of v
# Then we can say Y = u [X] w; X \to v
e.g., stretched likes:
# likes
# |
# [NP] (NP\backslash S)/NP [NP]
# ------------------
# NP\backslash S
# :
# :
# [NP\backslash S]
# ---------------------------
# S
Then we can splice in "passionately"
# passionately
# |
# [NP\backslash S] (NP\backslash S)\backslash(NP*\backslash S)
# ------------------------
# NP*\backslash S
We are still just using linking here when we combine..
Stretching is used during composition.
>> Traces
Introduce a special assumption, which we'll call a trace
assumption.. and then discharge it on the other side. You must
discharge within one elementary tree.
# likes e
# | |
# [NP] [NP] (NP\backslash S)/NP [NP]
# ---------------
# NP\backslash S
# ----------------------
# S
# -------- # Discharge assumption
# NP\backslash S
# -------------------
# S
We must disallow the possibility of doing discharges outside of
elementary trees, because then we lose locality.. we want to keep
dependencies in elementary trees.
Use a permutation operation to allow assumption and discharge to occur
on different sides?
In normal CG, you have to introduce assumptions at the periphery. So
introducing e in a sentence like "who John saw e yesterday" would be
difficult. But since we can stretch, we can stretch "who John saw e"
and splice in "yesterday"..
>> Interpolate
Interpolation is basically introducing a gap in a PPT that must be
satisfied at the time PPTs are put together (i.e., while constructing
full proof trees for sentences).
How to do something like "John seems to be happy"? We want a "John to
be happy" tree with a spliced in "seems to".. but how do we rule out
"John to be happy"? (In LTAG, we used features) Consider "John tries
to walk"
# [NP] walk
# | |
# NP NP\backslash Sinf
# ----------------
# Sinf
# tries
# |
# NP (NP\backslash S)/Sinf [Sinf]
# --------------------
# NP\backslash S
# ------------------
# S
# seems
# |
# (NP*\backslash S)/(NP\backslash Sinf) [NP\backslash Sinf]
# ------------------------------
# NP*\backslash S
So we need a new tree for walk.
# walk(inf)
# |
# [NP] NP\backslash Sinf
# : \gets Interpolation. I must have a proof
# : tree to splice in here.
# [NP\backslash S]
# ----------------
# S
You can only interpolate while constructing PPTs, not while using them
to construct sentences. (c.f., trace assumptions). Then at
run-time, we can splice things in (e.g., seems).. This is equivalent
to forced adjunction in LTAG..
How does interpolation relate to multicomponent LTAG?
* Question: discharging and introducing on different sides??
[04/24/01 04:37 PM]
> Synchronous Grammars
- Grammars generate languages (sets of strings)
- Synchronous grammars generate string relations (sets of pairs of
strings).
Useful for:
- translation
- interpretation (to internal language)
>> Synchronous CFG
(aka Syntax-directed translation schemata)
Set of pairs of rules. E.g.:
# <S \to X Y,  S \to A B>
In addition, a system of coindexation..
# <S \to X_1 Y_2,  S \to A_2 B_1>
# <X \to x Z_1,    B \to b C_1>
# <Y \to y,        A \to a>
# <Z \to z,        C \to c>
Indexes say that generated nonterminals are "linked."
Start with a pair of "top" nonterminals, and rewrite them in pairs,
using paired rules:
# <S, S> \Rightarrow <X_1 Y_2, A_2 B_1>
If you rewrite a nonterminal with a given index on the left, you must
rewrite the nonterminal with the same index on the right. In this
case, if we rewrite X1, we must also rewrite B1.
# \Rightarrow <x Z_3 Y_2, A_2 b C_3>
# \Rightarrow <x Z_3 y, a b C_3>
# \Rightarrow <x z y, a b c>
In this case, this is the only string generated by the grammar:
# <xzy, abc>
Derivation trees:
# < (S (X x (Z z)) (Y y)), (S (A a) (B b (C c)))>
Property of result: Structure of the tree doesn't change, except that
sisters may be reversed and nodes may be renamed. This is (almost?)
always a property of synchronous grammars.
Indices are what give us this isomorphism.
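The linked rewriting can be sketched in Python, using rule pairs read off the derivation-tree pair above (the encoding, and the assumption of exactly one rule pair per linked nonterminal pair, are mine):

```python
# Sketch of synchronous-CFG rewriting (encoding mine). Linked
# nonterminals are (symbol, index) pairs; terminals are plain strings.
# Each rule pair rewrites a linked nonterminal pair on both sides at
# once. Toy grammar matching the derivation-tree pair in the notes.

RULES = {
    ("S", "S"): ([("X", 1), ("Y", 2)], [("A", 2), ("B", 1)]),
    ("X", "B"): (["x", ("Z", 1)], ["b", ("C", 1)]),
    ("Y", "A"): (["y"], ["a"]),
    ("Z", "C"): (["z"], ["c"]),
}

def derive(pair):
    """Synchronously expand a linked pair into a pair of strings."""
    lrhs, rrhs = RULES[pair]
    # Pair up the linked nonterminals on the two sides by index.
    links = {}
    for item in lrhs:
        if isinstance(item, tuple):
            links[item[1]] = [item[0], None]
    for item in rrhs:
        if isinstance(item, tuple):
            links[item[1]][1] = item[0]
    results = {i: derive((l, r)) for i, (l, r) in links.items()}
    left = " ".join(x if isinstance(x, str) else results[x[1]][0]
                    for x in lrhs)
    right = " ".join(x if isinstance(x, str) else results[x[1]][1]
                     for x in rrhs)
    return left, right

print(derive(("S", "S")))  # ('x z y', 'a b c')
```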
>>> Synchronous Grammar \alpha:
# \to
# \to
# \to <\epsilon, \epsilon>
# \to
# \to <\epsilon, \epsilon>
>>> Synchronous Grammar \beta:
# \to
# \to
# \to <\epsilon, \epsilon>
# \to
# \to <\epsilon, \epsilon>
# \alpha\circ\beta = a^n b^n c^n