<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1502"> <Title>The Grammar Matrix: An Open-Source Starter-Kit for the Rapid Development of Cross-Linguistically Consistent Broad-Coverage Precision Grammars</Title> <Section position="9" start_page="3" end_page="3" type="concl"> <SectionTitle> 7 Conclusion </SectionTitle> <Paragraph position="0"> This project carries linguistic, computational, and practical interest. The linguistic interest lies in the HPSG community's general bottom-up approach to language universals, which involves aiming for good coverage of a variety of languages first, and leaving the task of identifying what they have in common for later. (Of course, theory building is never purely data-driven, and there are substantive hypotheses within HPSG about language universals.) Now that we have implementations with fairly extensive coverage for a somewhat typologically diverse set of languages, it is a good time to take the next step in this program, working to extract and generalize what is similar across these existing wide-coverage grammars. Moreover, the central role of types in the representation of linguistic generalizations enables the kind of underspecification which is useful for expressing what is common among related languages while allowing for the further specialization which necessarily distinguishes one language from another.</Paragraph> <Paragraph position="1"> The computational interest is threefold. First, there is the question of what formal devices the grammar matrix will require. Should it include defaults? What about domain union (linearization theory)? The selection and deployment of formal devices should be informed by on-going research on processing schemes, and here the crosslinguistic perspective can be particularly helpful.
Where there are several equivalent analyses of the same linguistic phenomena (e.g., morphosyntactic ambiguity or optionality), the choice of analysis can have processing implications that are not necessarily apparent in a single grammar. Second, having a set of wide-coverage HPSGs with fairly standardized fundamentals could prove interesting for research on stochastic processing and disambiguation, especially if the languages differ in gross typological features such as word order. Finally, there are also computational issues involved in how the grammar matrix would evolve over time as it is used in new grammars. The matrix enables the developer of a grammar for a new language to get a quick start on producing a system that parses and generates with non-trivial semantics, while also building the foundation for a wide-coverage grammar of the language. But the matrix itself may well change in parallel with the development of the grammar for a particular language, so appropriate mechanisms must be developed to support the merging of enhancements to both.</Paragraph> <Paragraph position="2"> There is also practical industrial benefit to this project. Companies that are consumers of these grammars benefit when grammars of multiple languages work with the same parsing and generation algorithms and produce standardized semantic representations derived from a rich, linguistically motivated syntax-semantics interface. More importantly, the grammar matrix will help to remove one of the primary remaining obstacles to commercial deployment of grammars of this type and indeed to the commercial use of deep linguistic analysis: the immense cost of developing the resource.</Paragraph> </Section> </Paper>