Cobalise Meeting: Oct 2002

                                    23 Oct 2002,
                      Cambridge University Computer Laboratory

Below you will find (1) a short report on the meeting, and (2) the meeting programme, including abstracts of the talks.

Meeting Report

A meeting of the (South-Eastern) Constraint-based Linguistics Group took place in the Computer Laboratory of the University of Cambridge on Wednesday, October 23rd. The following talks were given:

Ann Copestake (University of Cambridge): Tutorial Session on practical NLP using constraint-based grammars

Ryo Otoguro (University of Essex): Japanese syntactic verb-verb compounding and grammatical information spreading in LFG

Fabre Lambeau (University of Cambridge): Light Verb Constructions: description and representation in an HPSG large grammar

Dick Hudson (University College London): Word Grammar as a constraint-based theory of language

Jason Baldridge (University of Edinburgh): Getting a grip on combinators: Multi-Modal Combinatory Categorial Grammar

Jim Blevins (University of Cambridge): Subsumption-based alternatives to unification-based formalisms & computational implications

It was agreed at a business meeting that the next meeting would probably be in Essex or Sussex in April.

It was agreed that external funding would be valuable, and in particular that it would allow outside speakers to be brought in. It was agreed that it would be helpful if the successful application for funding of the morphology group could be circulated.

There was a brief discussion of the group's name. Two possibilities are South-Eastern Constraint-Based Linguistics Group and Constraint-Based Linguists in the South-East, but some questioned whether South-East(ern) should be part of the name.

The group has a (rudimentary) website on:

http://www.essex.ac.uk/linguistics/clmt/cobalise/

Comments to Doug Arnold (doug#essex.ac.uk)

Meeting Programme

The recently established regional Constraint-Based Grammar group will be meeting at the Computer Laboratory of the University of Cambridge on Wednesday, October 23rd. The meeting will run from 10am to 5pm, and the schedule is as follows:

# 1100-1130: coffee/tea pause

# 1230-1400: lunch

# 1500-1530: coffee/tea pause

# 1630: business meeting/planning

The meeting will take place in the William Gates Building, where the Computer Lab has recently found a new home. Details on how to get here can be found at http://www.cl.cam.ac.uk/UoCCL/contacts/. Once in the building, go to reception (directly in front of the main entrance) and you will be directed to room FW11, where the meeting will be held.

Please distribute this message to all of those that you think might be interested.

If you are planning to come, please let us know.

Looking forward to seeing you here,

Aline Villavicencio and Fabre Lambeau - local organizers


Abstracts:

Dick Hudson, Word Grammar as a constraint-based theory of language.

I shall introduce the three most general and controversial ideas of WG:
1. That language is a (conceptual) network.
2. That information is exploited by means of (a) spreading activation and (b) multiple default inheritance.
3. That sentence structure is based on rich word-word dependencies rather than on phrase structure.
I shall show how these ideas relate to the goal of modelling human competence and performance, but I shall also consider how they relate to some of the goals of NLP.

Jason Baldridge, Getting a grip on combinators: Multi-Modal Combinatory Categorial Grammar

In the Combinatory Categorial Grammar (CCG) framework, syntactic patterns arise as the result of the interaction of complex, directionally specified categories with a small set of rules that merge these categories together. While CCG provides a theory of grammar in which nearly all cross-linguistic variation is found in the lexicon, it nonetheless is frequently necessary to place language-specific restrictions on the applicability of the rules to ensure that ungrammatical sequences cannot be derived. In this talk, I will show how the more fine-grained conception of the categorial slash provided by the Categorial Type Logic tradition of categorial grammar can be utilized to rid CCG of rule restrictions and endow the theory with greater explanatory power. I will show how the control that is obtained by this move simplifies previous CCG accounts of English and Dutch and how it provides a cross-linguistic account of syntactic extraction asymmetries in English, Tagalog, and Toba Batak.
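As a rough illustration of what "directionally specified categories" combined by "a small set of rules" can look like, here is a minimal sketch in Python. It is not taken from the talk: the class and function names are hypothetical, only forward/backward application and forward composition are shown, and the slashes carry no modalities, so the multi-modal refinement discussed in the abstract is not modelled.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S or NP (hypothetical representation)."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Slash:
    """A complex category: result/argument seeks its argument to the right,
    result\\argument seeks it to the left."""
    result: "Category"
    direction: str  # "/" or "\\"
    argument: "Category"

    def __str__(self) -> str:
        return f"({self.result}{self.direction}{self.argument})"


Category = Union[Atom, Slash]


def forward_apply(left: Category, right: Category) -> Optional[Category]:
    """Forward application: X/Y  Y  =>  X."""
    if isinstance(left, Slash) and left.direction == "/" and left.argument == right:
        return left.result
    return None


def backward_apply(left: Category, right: Category) -> Optional[Category]:
    """Backward application: Y  X\\Y  =>  X."""
    if isinstance(right, Slash) and right.direction == "\\" and right.argument == left:
        return right.result
    return None


def forward_compose(left: Category, right: Category) -> Optional[Category]:
    """Forward composition: X/Y  Y/Z  =>  X/Z."""
    if (
        isinstance(left, Slash) and left.direction == "/"
        and isinstance(right, Slash) and right.direction == "/"
        and left.argument == right.result
    ):
        return Slash(left.result, "/", right.argument)
    return None


S, NP = Atom("S"), Atom("NP")
TV = Slash(Slash(S, "\\", NP), "/", NP)  # transitive verb: (S\NP)/NP


def derive_svo() -> Optional[Category]:
    """Derive a subject-verb-object string: NP  (S\\NP)/NP  NP."""
    vp = forward_apply(TV, NP)      # (S\NP)/NP  NP  =>  S\NP
    if vp is None:
        return None
    return backward_apply(NP, vp)   # NP  S\NP  =>  S
```

In this sketch every rule applies to any pair of matching categories; the language-specific rule restrictions mentioned above would have to be added as extra conditions on the rule functions, which is the kind of stipulation the talk proposes to replace with a more fine-grained slash.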

Ann Copestake, An introduction to practical NLP using constraint-based grammars.

Starting with a very brief overview of the state of the art in NLP, I will discuss the need for deep processing in some applications and motivate the use of constraint-based grammars for parsing and generation. I will mention some issues that arise when developing large grammars and briefly talk about semantic representation. Specific examples will be taken from the LinGO grammars and grammar engineering tools which are freely available and which have been used as components of several applications, including commercially deployed systems.

Fabre Lambeau, Light Verb Constructions: description and representation in an HPSG large grammar

The keystone of any NLP application with decent to large coverage is arguably its lexicon. A lexicon containing only simple words would clearly undermine the overall quality of a system. Recent experiments were carried out to account for multiword expressions in the LinGO English Resource Grammar, a large-scale bi-directional grammar of English in the HPSG framework. Light Verb Constructions are one particular type of multiword expression whose special properties are partly predictable. I will present them and describe them using a wider definition than is traditionally used in the literature. From that perspective, I shall discuss how Light Verb Constructions can be represented syntactically as well as semantically and the advantages of such a representation for further processing.

Ryo Otoguro, Japanese syntactic verb-verb compounding and grammatical information spreading in LFG

Japanese has four related verb-verb compound constructions. One is purely lexical, but the other three (`syntactic compounds') show varying degrees of syntactic independence: the two verbs (V1, V2) are distinct syntactic terminals. The syntactic compounds are classified into three types by the properties of V2 (Kageyama 1993, Matsumoto 1996): (1) the unaccusative, (2) the transitive and (3) the serial verb type. The serial verb type forms a complex predicate in syntax, which makes it challenging for a framework like LFG.

The paper analyses the structural relations between V1 and V2 within Andrews and Manning's (1999) information spreading approach in LFG. This architecture allows more flexible sharing of grammatical information among c-structure nodes by dividing the information into natural classes. Under this proposal, the V1 clause is a sentential SUBJ of V2 in the unaccusative type and an XCOMP in the transitive type. In the serial verb type, the two verbs are serialised under the V node in c-structure and V1 shares only the grammatical relations with its mother, namely the whole VV compound. Moreover, the V1 clause is a pseudo-complement, i.e. a semantic argument (ARG) of V2.