<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1001">
  <Title>Optimization in Multimodal Interpretation</Title>
  <Section position="5" start_page="2" end_page="2" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"> As in natural language interpretation addressed by Optimality Theory, the idea of optimizing constraints is beneficial, and there is evidence in favor of competition and constraint ranking in multimodal language interpretation. We developed a graph-based approach to address optimization for multimodal interpretation, in particular the interpretation of multimodal references. Our approach applies temporal, semantic, and contextual constraints simultaneously and achieves the best interpretation among all alternatives. Although the referent graph currently corresponds to gesture input and conversation context, it can easily be extended to incorporate other modalities such as gaze input.</Paragraph>
    <Paragraph position="1"> "a" indicates the number of inputs in which the referring expressions were correctly recognized by the speech recognizer; "b" indicates the number of inputs in which the referring expressions were correctly recognized and correctly resolved; "c" indicates the number of inputs in which the referring expressions were not correctly recognized; "d" indicates the number of inputs in which the referring expressions were not correctly recognized but were nevertheless correctly resolved. The sum of "a" and "c" gives the total number of inputs with a particular combination of speech and gesture.</Paragraph>
    <Paragraph position="2"> constraint in Optimality Theory. Given these compatibility functions, the graph-matching algorithm provides an optimization process to find the best match between two graphs. This process corresponds to the Evaluator component of Optimality Theory.</Paragraph>
    <Paragraph position="3"> Third, in our approach, different compatibility functions return different values to address the Constraint Ranking component in Optimality Theory. For example, as discussed earlier, once a  ) returns 1. Here, we consider the compatibility between identifiers to be more important than the compatibility between semantic types.</Paragraph>
    <Paragraph position="4"> However, we have not yet addressed the strict dominance aspect of Optimality Theory.</Paragraph>
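    <Paragraph> The ranking idea discussed in this section, where compatibility between identifiers counts for more than compatibility between semantic types, can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation used in the paper: the names node_compatibility and best_assignment, the dictionary fields, the weights, and the sample data are all hypothetical, and the exhaustive search stands in for the graph-matching algorithm. Only node-level scores are shown; temporal and other relational constraints are left out.

```python
from itertools import permutations

# Hypothetical weights: the identifier constraint is ranked above the
# semantic-type constraint simply by giving it a larger value.
ID_WEIGHT = 1.0
TYPE_WEIGHT = 0.5


def node_compatibility(expression, candidate):
    """Score how well one referring expression fits one candidate referent."""
    if expression.get("id") is not None and expression["id"] == candidate["id"]:
        return ID_WEIGHT
    if expression.get("type") is not None and expression["type"] == candidate["type"]:
        return TYPE_WEIGHT
    return 0.0


def best_assignment(expressions, candidates):
    """Try every one-to-one assignment and keep the highest-scoring one.

    Exhaustive search is an assumption made for clarity; it plays the role
    of the Evaluator by picking the optimal candidate interpretation.
    """
    best_score, best_match = -1.0, None
    for chosen in permutations(candidates, len(expressions)):
        score = sum(node_compatibility(e, c) for e, c in zip(expressions, chosen))
        if score > best_score:
            best_score, best_match = score, chosen
    return best_score, best_match


# Invented sample input: one expression carries only a semantic type,
# the other carries only an identifier.
expressions = [{"id": None, "type": "house"}, {"id": "h2", "type": None}]
candidates = [
    {"id": "h1", "type": "house"},
    {"id": "h2", "type": "house"},
    {"id": "t1", "type": "townhouse"},
]
score, match = best_assignment(expressions, candidates)
print(score, [c["id"] for c in match])
```

Because the weights are summed, several weaker matches could in principle outweigh one stronger match, which is one way to see why this kind of scoring does not capture strict dominance.</Paragraph>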
    <Paragraph position="5"> We have only taken an initial step to investigate optimization for multimodal language processing.</Paragraph>
    <Paragraph position="6"> Although preliminary studies have shown the effectiveness of the optimization approach based on graph matching, this approach also has its limitations. Graph matching is an NP-complete problem and can become intractable as the size of the graphs increases. However, we did not experience delays in system responses during real-time user studies, because most user inputs were relatively concise (they contained no more than four referring expressions). This brevity limited the size of the graphs and thus allowed the approach to remain effective. Our future work will address how to extend this approach to optimize the overall interpretation of multimodal user inputs.</Paragraph>
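    <Paragraph> To give a rough sense of why short inputs keep matching manageable, the sketch below counts the one-to-one assignments that an exhaustive matcher would have to examine. Exhaustive enumeration is an assumption made for illustration, since the search strategy is not described in this section, and the function name and the candidate counts are hypothetical.

```python
from math import perm


def candidate_assignments(num_expressions, num_candidates):
    """Number of ways to map each expression to a distinct candidate referent."""
    return perm(num_candidates, num_expressions)


# Growth of the search space for a hypothetical scene with 20 candidates.
for num_expressions in (1, 2, 4, 8):
    print(num_expressions, candidate_assignments(num_expressions, 20))
```

With at most four referring expressions per input the count stays small, whereas doubling the number of expressions to eight already yields billions of assignments for the same scene.</Paragraph>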
  </Section>
</Paper>