Reference: citeseer
PCFG Models of Linguistic Tree Representations

The early work on probabilistic parsing focused on PCFGs, which assign a probability to each rule in a CFG and compute the probability of a parse as the product of the probabilities of the rules used to build it. Mark Johnson points out that this framework assumes that the form of the probabilistic model for a parse tree must exactly match the form of the tree itself. After showing that this assumption can lead to poor models, Johnson suggests that reversible transformations can be used to construct a probabilistic model whose form differs from the form of the desired output tree. He describes four transformations for prepositional-phrase attachment structures, and evaluates those transformations using both a theoretical analysis based on toy training sets and an empirical analysis based on performance on the Penn Treebank II.
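The core PCFG assumption above can be made concrete with a small sketch. The grammar and probabilities below are illustrative, not taken from Johnson's paper; trees are encoded as nested tuples `(label, child, ...)` with strings as terminals:

```python
# Toy PCFG: rule probability indexed by (parent label, child labels).
# These rules and numbers are made up for illustration.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("det", "noun")): 0.7,
    ("NP", ("noun",)): 0.3,
    ("VP", ("verb", "NP")): 1.0,
}

def parse_probability(tree):
    """Probability of a parse = product of the probabilities of the
    rules used to build it. A tree is (label, child, ...); leaves are
    plain strings (terminals, which contribute no rule)."""
    if isinstance(tree, str):
        return 1.0
    label, *children = tree
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULES[(label, child_labels)]
    for c in children:
        p *= parse_probability(c)
    return p

tree = ("S", ("NP", "det", "noun"), ("VP", "verb", ("NP", "noun")))
print(round(parse_probability(tree), 2))  # 1.0 * 0.7 * 1.0 * 0.3 -> 0.21
```

Note that each rule's probability is used independently of where the rule appears in the tree; this is exactly the independence assumption that Johnson's transformations are designed to weaken.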

Two of these transformations result in significant improvements to
performance: **flatten** and **parent**. The
**flatten** transformation replaces select nested tree structures
with flat structures, effectively weakening the independence
assumptions that are made by the original structure. The
**parent** transformation augments each node's label with the
label of its parent node, in effect opening a "communication
channel" that creates a conditional dependency between a node and its
grandparent. Both of these transformations result in a weakening
of the model's independence assumptions, while increasing the number
of parameters that must be estimated (because they result in a larger
set of possible productions). Thus, they can be thought of as an
example of the classic "bias versus variance" trade-off. Johnson's
empirical results show that, in the case of these two transformations,
the reduction in bias overcomes the increase in variance.
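Both transformations are simple tree rewrites. The sketch below shows a minimal version of each on the same tuple encoding as above; the **flatten** function here is a simplification (it splices any same-labelled nested child into its parent), whereas Johnson applies flattening to specific Treebank configurations:

```python
def parent_annotate(tree, parent="ROOT"):
    """The **parent** transformation: augment each nonterminal label
    with its parent's label (e.g. NP under S becomes NP^S), so the
    distribution over a node's children can depend on its grandparent."""
    if isinstance(tree, str):               # terminals are left unchanged
        return tree
    label, *children = tree
    return (f"{label}^{parent}",
            *(parent_annotate(c, parent=label) for c in children))

def flatten(tree):
    """A simplified **flatten** transformation: splice a nested child
    with the same label directly into its parent, replacing two nested
    productions with one flat (longer) production."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    out = []
    for c in children:
        c = flatten(c)
        if not isinstance(c, str) and c[0] == label:
            out.extend(c[1:])               # splice same-label child
        else:
            out.append(c)
    return (label, *out)

print(parent_annotate(("S", ("NP", "noun"), ("VP", "verb"))))
# -> ('S^ROOT', ('NP^S', 'noun'), ('VP^S', 'verb'))
print(flatten(("NP", ("NP", "det", "noun"), ("PP", "prep", ("NP", "noun")))))
# -> ('NP', 'det', 'noun', ('PP', 'prep', ('NP', 'noun')))
```

The flat output production `NP -> det noun PP` illustrates the bias-variance trade-off directly: it removes the independence assumption between the inner NP and the PP, but it is a rarer, longer production whose probability is harder to estimate reliably.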