The data is first pre-processed: cleaning, keyword identification and tokenisation. Because only a very small set of labeled data is available, self-supervised learning has been tried.
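
The pre-processing steps named above (cleaning, keyword identification, tokenisation) could look roughly like the sketch below. It is only an illustration using the Python standard library: the cleaning rules, the tiny `STOP_WORDS` set, and the frequency-based `keywords` helper are assumptions, not the rules actually used in this project.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "it", "this"}


def clean(text: str) -> str:
    """Lowercase, drop HTML-like tags and URLs, keep only letters, digits and spaces."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)       # tags first, so URLs do not swallow them
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # punctuation and other symbols
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


def tokenise(text: str) -> List[str]:
    """Whitespace tokenisation of the cleaned text."""
    return clean(text).split()


def keywords(docs: Iterable[str], top_k: int = 5) -> List[str]:
    """Assumed keyword heuristic: the most frequent non-stop-word tokens."""
    counts = Counter(
        tok for doc in docs for tok in tokenise(doc) if tok not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(top_k)]
```

For example, `tokenise("<p>The Model is GREAT!</p>")` yields the lowercase tokens with the tags and punctuation removed, and `keywords(...)` ranks the remaining tokens by corpus frequency.
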
Three approaches have been tried:
- Word2vec and TF-IDF
- BiLSTM
- Fine-tuning DistilBERT
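
To make the TF-IDF half of the first approach concrete, here is a minimal standard-library sketch. It assumes documents are already tokenised, uses relative term frequency, and uses one common smoothed IDF variant, `log((1 + N) / (1 + df)) + 1`; the weighting actually used in this project is not stated, so treat the formula and the `tf_idf` name as illustrative. The Word2vec, BiLSTM and DistilBERT parts depend on external libraries and are not sketched here.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one sparse {term: weight} vector per tokenised document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {term: math.log((1 + n_docs) / (1 + d)) + 1.0 for term, d in df.items()}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


docs = [["good", "film"], ["bad", "film"], ["good", "good", "plot"]]
vectors = tf_idf(docs)
```

In this toy corpus, "bad" occurs in only one document while "film" occurs in two, so within the second document "bad" receives the higher weight. The resulting vectors could then be fed to any standard classifier.
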