NLPlay is a toolbox / repository centralizing implementations of key NLP algorithms in one place, to tackle Text Classification, Sentiment Analysis & Question Answering problems. The idea is to offer a collection of ready-to-use algorithms & building blocks, so that people can quickly benchmark and customize these different model architectures on standard datasets or on their own data.
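To give a concrete feel for the kind of baseline the toolbox centralizes, here is a minimal, self-contained sketch of the first model listed below (TFIDF n-grams + an SGD linear classifier), written with plain scikit-learn rather than NLPlay's own API; the dataset choice and hyperparameters are illustrative assumptions.

```python
# Illustrative TFIDF n-grams + SGD linear baseline (scikit-learn), not NLPlay's own API.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

# Word unigrams + bigrams weighted by TFIDF, fed to a linear model trained with SGD
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2, sublinear_tf=True),
    SGDClassifier(alpha=1e-5),
)
model.fit(train.data, train.target)
print("Test accuracy:", accuracy_score(test.target, model.predict(test.data)))
```

This kind of linear baseline is usually the first benchmark to beat before moving to the neural architectures listed below.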
Models :
- TFIDF Ngrams + SGD Linear Model : A Statistical Interpretation of Term Specificity and Its Application in Retrieval - 1972
- FastText : Bag of Tricks for Efficient Text Classification - 2016
- NBSVM : Baselines and Bigrams: Simple, Good Sentiment and Topic Classification - 2012
- DAN : Deep Unordered Composition Rivals Syntactic Methods for Text Classification - 2015
- MLP : A model based on an embedding layer and a configurable pooling & feed-forward neural network on top (see the sketch after this list)
- NBSVM++ : Baselines and Bigrams: Simple, Good Sentiment and Topic Classification - 2012 - Source : FastAI
- CharCNN : Character-level Convolutional Networks for Text Classification - 2015
- TextCNN : Convolutional Neural Networks for Sentence Classification - 2014 - Source : Galsang
- TextRCNN : Recurrent Convolutional Neural Networks for Text Classification - 2015
- AttentiveConvNet : Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms - 2017 - Source : Tencent
- LEAM : Joint Embedding of Words and Labels for Text Classification - 2018
- EXAM : Explicit Interaction Model towards Text Classification - 2018 - !UNDER DEVELOPMENT!
- DPCNN : Deep Pyramid Convolutional Neural Networks for Text Categorization - 2017 - Source : Cheneng
- QRNN : Quasi-Recurrent Neural Networks - 2016 - Source : Dreamgonfly
- SWEM : Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms - 2018
- SRU : Simple Recurrent Units for Highly Parallelizable Recurrence - 2017 - Source : Asappresearch
- LSTM/BiLSTM : Long Short-Term Memory - 1997, Neural Architectures for Named Entity Recognition - 2016
- GRU/BiGRU : Neural Machine Translation by Jointly Learning to Align and Translate - 2014
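As referenced in the MLP entry above, several of these models boil down to an embedding layer, a pooling step and a feed-forward head. The sketch below shows that pattern in plain PyTorch with masked mean pooling; the layer sizes, dropout and pooling choice are assumptions, not NLPlay's exact configuration.

```python
import torch
import torch.nn as nn

class PooledMLP(nn.Module):
    """Embedding layer -> masked mean pooling -> feed-forward classifier.
    Illustrative sketch only; NLPlay's MLP model exposes configurable pooling."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=2, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, num_classes),
        )
        self.pad_idx = pad_idx

    def forward(self, token_ids):                      # token_ids: (batch, seq_len)
        mask = (token_ids != self.pad_idx).unsqueeze(-1).float()
        emb = self.embedding(token_ids) * mask         # zero out padding positions
        pooled = emb.sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)  # mean over real tokens
        return self.classifier(pooled)                 # logits: (batch, num_classes)

logits = PooledMLP(vocab_size=10_000)(torch.randint(1, 10_000, (4, 20)))
```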
Optimizers :
- AdaBelief : AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients - 2020 - Source : juntang-zhuang
- AdaBound : Adaptive Gradient Methods with Dynamic Bound of Learning Rate - 2019 - Source : Luolc
- DiffGrad : diffGrad: An Optimization Method for Convolutional Neural Networks - 2019 - Source : Less Wright
- Lookahead : Lookahead Optimizer: k steps forward, 1 step back - 2019 - Source : lonePatient (see the sketch after this list)
- QHAdam : Quasi-hyperbolic momentum and Adam for deep learning - 2019 - Source : FacebookResearch
- RAdam : On the Variance of the Adaptive Learning Rate and Beyond - 2020 - Source : LiyuanLucasLiu
- Ranger : An Adaptive Remote Stochastic Gradient Method for Training Neural Networks - 2019 - Source : Less Wright
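The Lookahead entry's title is essentially the algorithm: run the inner optimizer for k fast steps, then pull a set of slow weights a fraction alpha of the way towards the fast weights and restart from them. Below is a minimal wrapper sketch around any PyTorch optimizer; the class name, defaults and the small subset of the optimizer interface shown here are assumptions, not the referenced implementation's API.

```python
import torch

class Lookahead:
    """Minimal 'k steps forward, 1 step back' wrapper around a PyTorch optimizer."""

    def __init__(self, base_optimizer, k=5, alpha=0.5):
        self.optimizer = base_optimizer
        self.k = k
        self.alpha = alpha
        self.step_count = 0
        # One slow-weight snapshot per parameter of the inner optimizer
        self.slow_weights = [[p.clone().detach() for p in group["params"]]
                             for group in base_optimizer.param_groups]

    def zero_grad(self):
        self.optimizer.zero_grad()

    def step(self, closure=None):
        loss = self.optimizer.step(closure)            # one fast step
        self.step_count += 1
        if self.step_count % self.k == 0:              # every k steps: 1 step back
            for group, slow_group in zip(self.optimizer.param_groups, self.slow_weights):
                for p, slow in zip(group["params"], slow_group):
                    slow += self.alpha * (p.data - slow)   # slow <- slow + alpha * (fast - slow)
                    p.data.copy_(slow)                     # fast <- slow
        return loss

# Usage: wrap any inner optimizer, e.g. Adam
model = torch.nn.Linear(10, 2)
optimizer = Lookahead(torch.optim.Adam(model.parameters(), lr=1e-3), k=5, alpha=0.5)
```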
Activation functions :
- Mish : Mish: A Self Regularized Non-Monotonic Neural Activation Function - 2019 - Source : Diganta Misra (see the sketch after this list)
- Swish/SwishPlus : Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning - 2019 - Source : Geffnet
- LiSHT/LightRelu : LiSHT: Non-Parametric Linearly Scaled Hyperbolic Tangent Activation Function for Neural Networks - 2019 - Source : Less Wright
- Threshold Relu : An improved activation function for deep learning - Threshold Relu, or TRelu - 2019 - Source : Less Wright
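The first three activations above have compact closed-form definitions; the sketch below writes them as plain PyTorch functions. The thresholded variants (SwishPlus, LightRelu, TRelu) add offsets/thresholds that are not reproduced here, see the cited sources for those.

```python
import torch
import torch.nn.functional as F

def mish(x: torch.Tensor) -> torch.Tensor:
    # Mish(x) = x * tanh(softplus(x))
    return x * torch.tanh(F.softplus(x))

def swish(x: torch.Tensor) -> torch.Tensor:
    # Swish(x) = x * sigmoid(x)   (beta fixed to 1 here)
    return x * torch.sigmoid(x)

def lisht(x: torch.Tensor) -> torch.Tensor:
    # LiSHT(x) = x * tanh(x)
    return x * torch.tanh(x)
```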
Losses :
- FocalLoss : Focal Loss for Dense Object Detection - 2017 - Source : mbsariyildiz (see the sketch after this list)
- LabelSmoothingLoss : Rethinking the Inception Architecture for Computer Vision - 2015 - Source : OpenNMT
- Supervised Contrastive Loss: Supervised Contrastive Learning - 2020 - Source : Yonglong Tian
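As an illustration of the first entry, here is a compact multi-class focal loss in PyTorch: the standard cross-entropy term is scaled by (1 - p_t)^gamma so that easy, well-classified examples contribute less. The default gamma=2 and the optional per-class alpha weights are common choices, not necessarily those of the referenced source.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Multi-class focal loss sketch: FL = (1 - p_t) ** gamma * CE.
    `alpha` is an optional (num_classes,) tensor of per-class weights."""
    log_probs = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_probs, targets, weight=alpha, reduction="none")
    p_t = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1).exp()
    return ((1.0 - p_t) ** gamma * ce).mean()

# Example: 4 samples, 3 classes
loss = focal_loss(torch.randn(4, 3), torch.tensor([0, 2, 1, 2]))
```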
Datasets :
- Sentiment analysis : IMDB, MR
- Question classification : TREC6, TREC50
- Text classification : 20 Newsgroups, AGNews, Amazon Review Polarity, Amazon Review Full, DBpedia, Yelp Review Polarity, Yelp Review Full, Sogou News, Yahoo Answers
Utilities :
- parlib : Parallel processing for large lists (i.e. corpus pre-processing), Pandas DataFrames or Series, using joblib (see the sketch after this list)
- DSManager / WordVectorsManager : Automatic reference and download of key datasets & pretrained vectors (GloVe, FastText...)
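parlib's exact signatures are not listed here, so the sketch below only illustrates the underlying joblib pattern it relies on: split a large list into chunks and map a pre-processing function over the chunks in parallel worker processes. The helper names and the chunk size are hypothetical.

```python
from joblib import Parallel, delayed

def _apply_to_chunk(func, chunk):
    # Worker-side helper: apply the function to every item of one chunk
    return [func(item) for item in chunk]

def parallel_map(func, items, n_jobs=-1, chunk_size=10_000):
    """Hypothetical helper showing the pattern: chunk a large list, then
    process the chunks in parallel with joblib and flatten the results."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    results = Parallel(n_jobs=n_jobs)(delayed(_apply_to_chunk)(func, chunk) for chunk in chunks)
    return [item for chunk in results for item in chunk]

def tokenize(doc):
    return doc.lower().split()

corpus = ["Some RAW Text", "Another document"] * 1000
tokens = parallel_map(tokenize, corpus, chunk_size=500)
```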
- Include additional Models :
- HAN : Hierarchical Attention Networks for Document Classification - 2016
- SIF : A Simple but Tough-to-Beat Baseline for Sentence Embeddings - 2016
- USIF : Unsupervised Random Walk Sentence Embeddings: A Strong but Simple Baseline - 2018
- RE2 : Simple and Effective Text Matching with Richer Alignment Features - 2019
- BiMPM : Bilateral Multi-Perspective Matching for Natural Language Sentences - 2017
- MaLSTM/MaGRU : Siamese Recurrent Architectures for Learning Sentence Similarity - 2016
- Include additional Datasets :
- Others :
- Include Nvidia Apex - Mixed Precision, to improve GPU memory footprint on Turing/Volta/Ampere architectures
- Include support of Google TPU for training & inference via PyTorch/XLA
- Include a Cross-Validation mechanism
- Include Metrics (F1, AUC...) + Confusion Matrix
- Include automatic EDA reporting features
- Include a Streamlit app to easily explore & debug model prediction errors and identify potential root causes (i.e. tokenization, unseen tokens, sentence length, class confusion...)
- Include Microsoft NNI for Hyperparameter Tuning (TPE, SMAC, Hyperband, BOHB...)
- Include MLflow for Experiment tracking