In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g. building a linear SVM trained with stochastic gradient descent) in Scikit-Learn.
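As a rough preview of that division of labour, here is a minimal sketch, not the exact code developed in the post. It assumes NLTK's `wordpunct_tokenize` and `PorterStemmer` for preprocessing (neither needs a corpus download), and uses Scikit-Learn's `SGDClassifier` with hinge loss, which amounts to a linear SVM trained by stochastic gradient descent. The helper name `nltk_tokenize`, the toy documents, and the labels are all made up for illustration.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK's regex tokenizer,
    keep purely alphabetic tokens, and stem each one."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus invented for this sketch; a real post would load actual data.
docs = [
    "The team won the football match",
    "A great goal in the final game",
    "The striker scored twice in the match",
    "Parliament passed the new budget",
    "The senator proposed a tax reform",
    "Voters elected a new government",
]
labels = ["sports", "sports", "sports", "politics", "politics", "politics"]

# NLTK handles tokenization; Scikit-Learn handles features and the model.
model = Pipeline([
    # token_pattern=None because the custom tokenizer replaces the default regex.
    ("tfidf", TfidfVectorizer(tokenizer=nltk_tokenize, token_pattern=None)),
    # loss="hinge" gives a linear SVM objective, optimized with SGD.
    ("svm", SGDClassifier(loss="hinge", random_state=0)),
])
model.fit(docs, labels)

predictions = model.predict(["The government announced a budget reform"])
print(predictions)
```

Passing the NLTK function through the vectorizer's `tokenizer` argument keeps the two libraries loosely coupled: the tokenizer can be swapped out without touching the classifier, and the whole pipeline can still be cross-validated as one estimator.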