Natural Language Processing (NLP) Resources
This is a set of materials for learning and practicing NLP. It can also serve as a general reference to return to when you need a refresher.
Courses and Course Materials (Start Here)
- Recurrent Neural Networks by Andrew Ng (Course, YouTube Material): highly recommended as a starting point if you’ve never done NLP
- Stanford Deep Learning for NLP (cs224n) Course Material
Tutorials
Topic | Title/Description | Link
Topic Modeling | Topic modeling from Gensim official Docs | Tutorial
Topic Modeling and Clustering | A topic identification and document clustering algorithm tutorial with Gensim/NLTK from PyCon | Video
Intent and Entity Recognition | Language Understanding with Recurrent Networks from CNTK official Docs | Tutorial
Word2Vec | Vector Representations of Words from TensorFlow official Docs | Tutorial
Text categorization | Analysing a collection of text documents from Scikit-Learn official Docs | Tutorial
Sequence to Sequence | A tutorial on how to summarize text and generate features using deep learning with Keras and TensorFlow | Tutorial
Examples - Try Me!
- Document clustering with k-means, official scikit-learn Example
- Featurize free-form text data using mmlspark on top of primitives in SparkML via a single transformer, in this official mmlspark Notebook
- Sequence Classification with CNTK Example
- Sequence2Sequence with CNTK Example
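As a rough illustration of what the document clustering example above involves, here is a minimal sketch that turns a handful of texts into TF-IDF vectors and groups them with k-means in scikit-learn. The toy documents, the function name `cluster_documents`, and the parameter values are assumptions made up for this sketch; the linked official example is the authoritative version and works on a real corpus.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, invented for illustration: two obvious themes.
TOY_DOCS = [
    "cat dog pet animal",
    "dog cat pet vet",
    "pet animal cat dog",
    "stock market trade price",
    "market price stock bond",
    "trade stock market price",
]


def cluster_documents(docs, n_clusters=2, seed=0):
    """Return one cluster label per document (label numbering is arbitrary)."""
    # Represent each document as a sparse TF-IDF vector.
    features = TfidfVectorizer().fit_transform(docs)
    # Fix the seed and the number of restarts so repeated runs agree.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(features).tolist()


labels = cluster_documents(TOY_DOCS)
for label, doc in zip(labels, TOY_DOCS):
    print(label, doc)
```

Which group receives label 0 and which receives label 1 is arbitrary; only the grouping itself is meaningful.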
NLP-Specific Packages
gensim: topic modelling (Docs) - good for word2vec, semantic similarity, LDA, LSA, etc.
nltk: Natural Language Toolkit (Docs) - good for tokenization, stemming, tagging, parsing, corpora, etc.
spacy: efficient NLP toolkit backed by ANNs (Docs) - good for parsing, tagging, entity recognition, text categorization, phrase matching, etc.
allennlp: deep learning for NLP, built on PyTorch (Ref) - good for conditional random fields, encoders/decoders, reading comprehension, semantic role labeling, etc.
Blog Articles
Topic | Title/Description | Link
Basics | 7 types of Artificial Neural Networks for Natural Language Processing | Link
TF/IDF | Calculating TF/IDF on How I Met Your Mother transcripts (with scikit-learn) | Link
General/Sentiment Analysis | Breakthrough Research Papers and Models for Sentiment Analysis | Link
 | | Link
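For readers who want to see the arithmetic behind the TF/IDF entry in the table above, the following is a minimal sketch in plain Python. It uses one common textbook definition (term frequency as a share of the document's tokens, multiplied by the natural log of the inverse document frequency); this choice, the function name `tf_idf`, and the toy sentences are assumptions for illustration. scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    score = (count of term in doc / tokens in doc) * ln(N / docs containing term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy sentences, invented for illustration.
toy_scores = tf_idf(["the cat sat", "the dog sat", "the bird flew"])
print(toy_scores[0])
```

A word that occurs in every document (here "the") scores zero, while a word unique to one document (here "cat") scores highest within it.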
Papers
Topic | Title/Description | Author(s) | Link
Text Classification | Fine-tuned Language Models for Text Classification (with Transfer Learning) | Jeremy Howard, Sebastian Ruder | Link
NLP at Scale
- Document classification with pyspark and HDInsight on Azure Doc
Kaggle
- Toxic Comment Classification Challenge Competition
Books
TBD
Exercises - Try Me!
Topic | Title/Description | Link
Sentiment Analysis | Build a sentiment analysis / polarity model with scikit-learn | Exercise and Code to start
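To give a feel for the sentiment exercise above before opening the linked starter code, here is a minimal sketch of a polarity classifier built as a scikit-learn pipeline. It is not the exercise's solution: the toy reviews, the labels, the function name `train_polarity_model`, and the choice of TF-IDF features with logistic regression are all assumptions made for this sketch, and a real model would need a proper labeled dataset and a held-out evaluation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews and labels, invented for illustration only.
TOY_REVIEWS = [
    "great wonderful movie",
    "loved it great acting",
    "wonderful and fun",
    "terrible boring movie",
    "hated it terrible acting",
    "boring and awful",
]
TOY_LABELS = ["pos", "pos", "pos", "neg", "neg", "neg"]


def train_polarity_model(texts, labels):
    """Fit a TF-IDF + logistic regression pipeline on raw text."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    return model


model = train_polarity_model(TOY_REVIEWS, TOY_LABELS)
print(model.predict(["great fun", "awful boring"]).tolist())
```

Keeping the vectorizer and the classifier in one pipeline means raw strings can be passed straight to `predict`, and the vocabulary learned during training is reused automatically.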
List updated 2017-01-26