References

Bing Liu and Xiaoli Li and Yang Dai and Wee Sun Lee and Philip Yu. Building text classifiers using positive and unlabeled examples. Proc. ICDM. 2003.

Fei Huang and Ying Zhang and Stephan Vogel. Mining key phrase translations from web corpora. EMNLP. 2005.

Frank Keller and Maria Lapata and Olga Ourioupina. Using the web to overcome data sparseness. EMNLP. 2002.

Kamal Nigam and Andrew Kachites McCallum and Sebastian Thrun and Tom Mitchell. Text classification from labeled and unlabeled documents using EM. Machine Learning. 2000.

Mirella Lapata and Frank Keller. Web-based models for NLP. ACM Transactions on Speech and Language Processing. 2005.

O. Cetin and Andreas Stolcke. Language modeling in ICSI-SRI Spring 2005 meeting speech recognition evaluation. ICSI Tech report TR-05-006. 2005.

Philip Resnik and Noah A. Smith. The web as a parallel corpus. Computational Linguistics. 2003.

Ruhi Sarikaya and Agustin Gravano and Yuqing Gao. Rapid language model development using external resources for new spoken dialog domains. Proc. of ICASSP. 2005.

Tim Ng and Mari Ostendorf and Mei-yuh Hwang and Manhung Siu and Ivan Bulyko and Xin Lei. Web-data augmented language model for Mandarin speech recognition. Proc. of ICASSP. 2005.

Xiaojin Zhu. Semi-supervised learning literature survey. Computer Science, Univ. of Wisconsin-Madison.

R. C. Carrasco. Accurate computation of the relative entropy between stochastic regular grammars. RAIRO Theoretical Informatics and Applications. 1997.
