CL/MT Research Group
SECBaLC Meeting April 2002
10:30-12 Jonathan Ginzburg: tutorial session on Interrogative Investigations by Ginzburg and Sag
12-1 Lunch
1-1:45 Aline Villavicencio (Cambridge): The Acquisition of a Unification-Based Generalised Categorial Grammar
1:45-2:30 Matt Purver (King's College, London): Lexical Acquisition in a Grammar-based Dialogue System
2:30-2:50 Coffee
2:50-3:10 Business/planning meeting
3:10-3:55 Frank Richter and Manfred Sailer (Tübingen): Distributionally Restricted Words in Lexical Resource Semantics (LRS)
http://www.streetmap.co.uk/streetmap.dll?P2M?P=WC2R2LS&Z=1
Enter from the Strand, take the lift in the main foyer to the 5th floor, turn right and follow notices to the Computer Science Dept. 23D is at the end of the first corridor you will be walking along.
The Acquisition of a Unification-Based Generalised Categorial Grammar Aline Villavicencio (Cambridge)
In this talk I am going to describe work investigating the process of grammatical acquisition from data. I am using a computational learning system composed of a Universal Grammar with associated parameters and a learning algorithm, following Principles and Parameters Theory. The Universal Grammar is implemented as a Unification-Based Generalised Categorial Grammar, embedded in a default inheritance network of lexical types. The learning algorithm receives input from a corpus annotated with logical forms and sets the parameters based on this input. This framework is used as a basis to investigate several aspects of language acquisition. I will describe the components of the learning system and some experiments performed, concentrating on the acquisition of word order for different learners. The results of these experiments show the different learners performing similarly and converging towards the target grammar given the input data available, regardless of their starting points. They also show that convergence is possible even in the face of ambiguous and noisy input data, which only affect the speed at which the learners converge towards the target.
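As a rough illustration of the convergence claim only, and not of the system presented in the talk, the toy sketch below sets a single binary word-order parameter by counting evidence. Everything in it is an assumption made for the example: the `Learner` class, the "forward"/"backward" values, the majority-vote update, and the deterministic noise pattern (every `noise_every`-th trigger is misleading). The actual system works with a Unification-Based Generalised Categorial Grammar and logical-form annotated input, none of which is modelled here.

```python
from dataclasses import dataclass


@dataclass
class Learner:
    """Toy learner for one binary word-order parameter (hypothetical)."""
    forward: int   # evidence so far for the "forward" direction
    backward: int  # evidence so far for the "backward" direction

    def setting(self) -> str:
        # Current parameter value: whichever direction has more evidence.
        return "forward" if self.forward > self.backward else "backward"

    def observe(self, trigger: str) -> None:
        # Each trigger counts as one vote for a direction.
        if trigger == "forward":
            self.forward += 1
        else:
            self.backward += 1


def triggers(n: int, target: str, noise_every: int = 0):
    """Yield n triggers for `target`; every `noise_every`-th one is misleading."""
    other = "backward" if target == "forward" else "forward"
    for i in range(1, n + 1):
        noisy = noise_every and i % noise_every == 0
        yield other if noisy else target


def steps_to_converge(learner: Learner, data, target: str) -> int:
    """Return the step at which the learner last switched to the target value.

    A result of 0 means the learner never had to switch.
    """
    last_switch = 0
    for step, trigger in enumerate(data, start=1):
        before = learner.setting()
        learner.observe(trigger)
        if before != target and learner.setting() == target:
            last_switch = step
    return last_switch


if __name__ == "__main__":
    # Both learners start with a prior biased against the target value.
    clean = steps_to_converge(Learner(0, 10), triggers(200, "forward"), "forward")
    noisy = steps_to_converge(
        Learner(0, 10), triggers(200, "forward", noise_every=3), "forward"
    )
    print("noise-free:", clean, "noisy:", noisy)
```

With these toy inputs the noise-free learner settles on the target after 11 triggers and the noisy one after 31. Both end up with the same setting, so in this sketch the noise changes only how long convergence takes.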
Lexical Acquisition in a Grammar-based Dialogue System Matt Purver (King's College, London)
Practical NLP applications have to deal with unknown words. Systems based around grammars that require detailed lexical syntactic and semantic information must somehow acquire this information for such words.
Techniques for automatic acquisition depend on either world/domain knowledge (as in script-based approaches) or plentiful contextual information (as in corpus-based approaches). In an open- or wide-domain dialogue system, neither of these is available. This talk describes a solution to this problem based on interaction with the user via clarification questions.
Use of a highly contextualized semantic representation - an extension of the abstracted contextual-parameter approach of (Ginzburg & Cooper, forthcoming) to all open-class words - allows the lexical-semantic content of words to be treated separately from their role within the sentence. This representation, together with an utterance-anaphoric view of clarification, allows the clarificational dialogue to be integrated within a constraint-based grammar and governed by standard rules of conversation.
References:
Jonathan Ginzburg and Robin Cooper, "Clarification, Ellipsis and Utterance Representation", forthcoming.
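Distributionally Restricted Words in Lexical Resource Semantics (LRS) Frank Richter and Manfred Sailer (Tübingen)
In this talk, we bring together two lines of research that we have been pursuing over the last few years: Lexical Resource Semantics (LRS) and the development of a collocation module for Head-Driven Phrase Structure Grammar.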
In the first part of the talk we will introduce the major properties of LRS. We assume that the semantic representation of a linguistic sign is a term of some standard semantic representation language. We specify these terms using techniques of underspecified semantics (Reyle 1993, Bos 1996, Egg 1998, Egg et al. 2001). In LRS, underspecification is used at the level of linguistic descriptions, not at the level of the objects denoted by the grammar. As in systems of underspecified semantics, LRS splits a semantic term into its subterms. This allows us to express combinatorial semantics without lambda-conversion. Furthermore, we can use identities of subterms to provide a natural account of linguistic phenomena such as negative concord (Richter and Sailer 2001a) or multiple wh-questions (Richter and Sailer 2001b), where the very same semantic operator is introduced by several words within a sentence.
In the second part of the talk we will investigate the distribution of particular lexical items, focusing on the German verb "fackeln" ("dither", in the sense of acting nervously or indecisively). It is a Negative Polarity Item which has to occur in the scope of a durational modifier. We will argue that a collocation theory is needed to account for the obligatory presence of the durational modifier. The collocational requirement has to be expressed with reference to the logical form of the modifier. We will show that the negation sensitivity can be accounted for by the same collocational mechanism, making reference to the logical form of the licensing context rather than to its entailment properties. Both collocational restrictions on "fackeln" thus crucially refer to the logical form of a sentence. We will show that an LRS semantics allows us to state them in a straightforward way.
The resulting system builds on a division of labor between (a) "regular" semantic processes such as basic semantic combinatorics, semantic "concord" phenomena, etc., and (b) "irregular" or idiosyncratic properties of lexical elements.
References:
Bos, J. 1996: Predicate logic unplugged. In P. Dekker and M. Stokhof, eds, Proceedings of the 10th Amsterdam Colloquium, pp. 133-143.
Egg, M. 1998: Wh-questions in Underspecified Minimal Recursion Semantics. Journal of Semantics, vol. 15, pp. 37-82.
Egg, M., A. Koller and J. Niehren 2001: The Constraint Language for Lambda Structures. Journal of Logic, Language, and Information, vol. 10(4), pp. 457-485.
Reyle, U. 1993: Dealing with ambiguities by underspecification: Construction, representation and deduction. Journal of Semantics, vol. 10(2), pp. 123-179.
Richter, F. and M. Sailer 2001a: Polish Negation and Lexical Resource Semantics. In Geert-Jan M. Kruijff, Lawrence S. Moss and Richard T. Oehrle, eds, Proceedings of FGMOL 2001.
Richter, F. and M. Sailer 2001b: On the Left Periphery of German Finite Sentences. In W. Detmar Meurers and Tibor Kiss, eds, Constraint-Based Approaches to Germanic Syntax. CSLI Publications, pp. 257-300.