In this post, I want to show how I use NLTK for preprocessing and tokenization, and then hand the resulting tokens to scikit-learn for the machine learning step (e.g. training a linear SVM with stochastic gradient descent).
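To make the division of labour concrete, here is a minimal sketch of that kind of pipeline. It rests on several assumptions that are mine rather than part of the original workflow: the helper name `nltk_tokenize`, the choice of `wordpunct_tokenize` plus Porter stemming (picked because neither needs an NLTK data download), the TF-IDF weighting, and the tiny toy corpus, which is invented purely for illustration.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus, invented for illustration only.
train_docs = [
    "The team won the football match",
    "The goalkeeper saved a penalty in the final",
    "Parliament voted on the new tax law",
    "The minister announced an election date",
]
train_labels = ["sports", "sports", "politics", "politics"]

model = make_pipeline(
    # NLTK does the tokenization; scikit-learn only builds the feature matrix.
    TfidfVectorizer(tokenizer=nltk_tokenize, token_pattern=None, lowercase=False),
    # Hinge loss with SGD training gives a linear SVM.
    SGDClassifier(loss="hinge", penalty="l2", max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(train_docs, train_labels)

predictions = model.predict(
    ["The striker scored a late goal", "The senate passed the budget"]
)
print(list(predictions))
```

Passing the NLTK function as `tokenizer` keeps all text handling on the NLTK side, so the scikit-learn estimator can be swapped out without touching the preprocessing. With only four training documents the predictions themselves are not meaningful; the point is the wiring.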