HuggingFace serves as the home for many popular open-source NLP models. Many of these models are effective as-is, but often require some type of training or fine-tuning to improve performance for your specific use case. As the explosion of LLMs continues, we will take a step back in this article and review some of the building blocks HuggingFace provides that simplify training NLP models.
Traditionally, NLP models can be trained using vanilla PyTorch, TensorFlow/Keras, and other popular machine learning frameworks. While you can go this route, it requires a deeper understanding of the framework you are using, as well as more code to write the training loop. With the HuggingFace Trainer class, there is an easier way to interact with the NLP Transformers models you want to use.
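To make that contrast concrete, here is a minimal sketch of what a Trainer-driven run looks like. The `model`, `train_dataset`, and `eval_dataset` names are placeholders for objects built later in the article, and the hyperparameter values are illustrative, not taken from the original walkthrough:

```python
from transformers import Trainer, TrainingArguments

# Illustrative hyperparameters; tune these for your own use case
training_args = TrainingArguments(
    output_dir="./results",            # where checkpoints and logs are written
    num_train_epochs=3,
    per_device_train_batch_size=16,
    evaluation_strategy="epoch",       # run evaluation at the end of each epoch
)

trainer = Trainer(
    model=model,                       # any Transformers model (placeholder here)
    args=training_args,
    train_dataset=train_dataset,       # tokenized training split (placeholder)
    eval_dataset=eval_dataset,         # tokenized evaluation split (placeholder)
)

# One call replaces the hand-written PyTorch/TensorFlow training loop
trainer.train()
```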
Trainer is a class specifically optimized for Transformers models, and it also provides tight integration with other HuggingFace libraries such as Datasets and Evaluate. At a more advanced level, Trainer also supports distributed training libraries and can easily integrate with infrastructure platforms such as Amazon SageMaker.
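As a sketch of that integration, the Evaluate library can supply the metric that Trainer reports during evaluation. The `compute_metrics` function below follows the standard Trainer callback signature; note that the `accuracy` metric pulls in scikit-learn under the hood:

```python
import evaluate
import numpy as np

# Load a metric from the Evaluate library
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels) for the evaluation set
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

# Plugged into the Trainer via: Trainer(..., compute_metrics=compute_metrics)
```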
In this example, we will use the Trainer class locally to fine-tune the popular BERT model on the IMDB dataset for a text classification use case ([Large Movie Review Dataset](https://ai.stanford.edu/~amaas/data/sentiment/), [citation](https://ai.stanford.edu/~amaas/papers/wvSent_acl2011.bib)).
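As a preview of what the later sections cover, a minimal sketch of pulling down the IMDB dataset and a BERT checkpoint might look like the following; the model and column names are the standard Hub identifiers, not values specific to this article:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# IMDB movie reviews from the HuggingFace Hub (25k train / 25k test)
dataset = load_dataset("imdb")

# BERT base checkpoint with a two-label classification head (positive/negative)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Tokenize the review text so the Trainer can feed it to the model
def tokenize(batch):
    return tokenizer(batch["text"], padding="max_length", truncation=True)

tokenized_dataset = dataset.map(tokenize, batched=True)
```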
NOTE: This article assumes basic knowledge of Python and familiarity with NLP. We won't go into machine learning theory around model construction or selection; this article is dedicated to understanding how you can fine-tune the existing pre-trained models available in the HuggingFace Model Hub.
- Setup
- Fine-Tuning BERT
- Additional Resources and Conclusion
For this example, we will work in SageMaker Studio and use a conda_python3 kernel on an ml.g4dn.12xlarge instance. Note that you can use a smaller instance type, but this could affect training speed depending on the number of CPUs/workers available.
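Once the kernel is running, the HuggingFace libraries need to be available in the environment. A minimal install cell might look like the following; the package list and the lack of version pins are assumptions rather than the article's exact environment:

```python
# Install the core libraries used in this walkthrough (run in a notebook cell;
# the exact package set and versions are illustrative assumptions)
%pip install transformers datasets evaluate scikit-learn torch

import torch

# Confirm that the instance's GPUs are visible to PyTorch
print(torch.cuda.is_available(), torch.cuda.device_count())
```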