My bachelor's thesis was a research project on natural language processing (NLP). It focused on generating multilingual texts in underrepresented languages. Because existing metrics performed very poorly when evaluating the outputs of models trained on my dataset, I needed to train a learned regression metric.
Regression would be useful for many textual tasks, such as:
- Sentiment Analysis: Predict the strength of positive or negative sentiment instead of a simple binary classification.
- Writing quality estimation: Predict how good a piece of writing is on a continuous scale.
For my use case, I needed the model to rate how good another model's prediction was for a given task. Each row in my dataset consisted of a text input and a label: 0 (bad prediction) or 1 (good prediction).
- Input: Text
- Label: 0 or 1
- The task: Predict a numerical probability between 0 and 1
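A common way to get a score between 0 and 1 out of a dataset with binary labels is to put a single-output regression head on top of the encoder's pooled representation, squash it with a sigmoid, and train against the 0/1 labels with a binary cross-entropy loss; at inference, the sigmoid output is read as a probability-like score. Here is a minimal stdlib-only sketch of such a head. The pooled vector is a random stand-in for a real transformer embedding, and the weights are untrained, so this only illustrates the shape of the computation, not my actual model:

```python
import math
import random

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def regression_head(pooled: list, weights: list, bias: float) -> float:
    """Single-output head: dot product + sigmoid -> score in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, pooled)) + bias
    return sigmoid(z)

def bce_loss(pred: float, label: int) -> float:
    """Binary cross-entropy against a 0/1 label; eps avoids log(0)."""
    eps = 1e-9
    return -(label * math.log(pred + eps) + (1 - label) * math.log(1 - pred + eps))

random.seed(0)
dim = 8
pooled = [random.uniform(-1, 1) for _ in range(dim)]    # stand-in for encoder output
weights = [random.uniform(-0.5, 0.5) for _ in range(dim)]

score = regression_head(pooled, weights, bias=0.0)
print(0.0 < score < 1.0)        # the sigmoid keeps the score inside (0, 1)
print(bce_loss(score, 1) > 0.0) # loss is positive unless the prediction is perfect
```

In practice the head would sit on top of a pretrained encoder (for example, a single `Linear(hidden_size, 1)` layer in PyTorch), and the sigmoid/BCE pair is what lets a model trained on hard 0/1 labels emit soft scores in between.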
But transformer-based models are usually used for generation tasks. Why would you use a pre-trained LM to…?