Building on multi-level analysis at the lexical, phrase and sentence levels, discourse analysis has become one of the key issues in natural language processing research in recent years. However, Chinese discourse analysis is still at a very early stage and lags significantly behind English discourse analysis in both theory and methodology. This project aims to establish a computational theory for analyzing the logical structure and semantics of Chinese discourse by building on the state of the art, and to validate the research results empirically in practical applications. In particular, the project focuses on the following research tasks:
1) propose a theory and model for analyzing the logical structure, topic structure, cohesion and coherence of Chinese discourse; 2) based on the proposed theory, develop an annotation scheme and build a large-scale Chinese discourse-annotated corpus; 3) study and implement the core algorithms of Chinese discourse analysis; 4) apply the research results to machine translation and question answering.
We believe that the results of this project will have great scientific significance and application value for Chinese information processing and Chinese computational linguistics, advancing the state of the art and filling research gaps in the automatic analysis and application of Chinese discourse.