References

1   Chinatsu Aone , Mila Ramos-Santacruz, REES: a large-scale relation and event extraction system, Proceedings of the sixth conference on Applied natural language processing, p.76-83, April 29-May 04, 2000, Seattle, Washington 

2   C. Aone, L. Halverson, T. Hampton, and M. Ramos-Santacruz. 1998. SRA: Description of the IE2 system used for MUC-7. In Proceedings of MUC-7. 

3   Daniel M. Bikel , Richard Schwartz , Ralph M. Weischedel, An Algorithm that Learns What's in a Name, Machine Learning, v.34 n.1-3, p.211-231, Feb. 1999

4   M. Collins and N. Duffy. 2001. Convolution kernels for natural language. In Proceedings of NIPS-2001. 

5   Corinna Cortes , Vladimir Vapnik, Support-Vector Networks, Machine Learning, v.20 n.3, p.273-297, Sept. 1995 

6   Nello Cristianini , John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, New York, NY, 1999

7   R. O. Duda and P. E. Hart. 1973. Pattern Classification and Scene Analysis. John Wiley, New York. 

8   Yoav Freund , Robert E. Schapire, Large Margin Classification Using the Perceptron Algorithm, Machine Learning, v.37 n.3, p.277-296, Dec. 1999 

9   T. Furey, N. Cristianini, N. Duffy, D. Bednarski, M. Schummer, and D. Haussler. 2000. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16.

10   D. Haussler. 1999. Convolution kernels on discrete structures. UC Santa Cruz Technical Report UCSC-CRL-99-10.

11   Thorsten Joachims, Text Categorization with Support Vector Machines: Learning with Many Relevant Features, Proceedings of the 10th European Conference on Machine Learning, p.137-142, April 21-23, 1998

12   Thorsten Joachims, Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms, Kluwer Academic Publishers, Norwell, MA, 2002 

13   John D. Lafferty , Andrew McCallum , Fernando C. N. Pereira, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, Proceedings of the Eighteenth International Conference on Machine Learning, p.282-289, June 28-July 01, 2001 

14   Nick Littlestone, Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm, Machine Learning, v.2 n.4, p.285-318, April 1988 

15   Huma Lodhi , Craig Saunders , John Shawe-Taylor , Nello Cristianini , Chris Watkins, Text classification using string kernels, The Journal of Machine Learning Research, 2, p.419-444, 3/1/2002 

16   Andrew McCallum , Dayne Freitag , Fernando C. N. Pereira, Maximum Entropy Markov Models for Information Extraction and Segmentation, Proceedings of the Seventeenth International Conference on Machine Learning, p.591-598, June 29-July 02, 2000 

17   S. Miller, M. Crystal, H. Fox, L. Ramshaw, R. Schwartz, R. Stone, and R. Weischedel. 1998. Algorithms that learn to extract information - BBN: Description of the SIFT system. In Proceedings of MUC-7. 

18   Marcia Munoz , Vasin Punyakanok , Dan Roth , Dav Zimak, A Learning Approach to Shallow Parsing, University of Illinois at Urbana-Champaign, Champaign, IL, 1999 

19   D. Roth and W. Yih. 2001. Relational learning via propositional algorithms: An information extraction case study. In Proceedings of IJCAI-01. 

20   Dan Roth, Learning in Natural Language, Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, p.898-904, July 31-August 06, 1999 

21   V. Vapnik. 1998. Statistical Learning Theory. John Wiley. 
