Imagine you want to build an NLP model to solve a given problem. You need to define the task scope, find or create data that specifies the intended system behaviour, choose a suitable model architecture, train the model, evaluate its performance, and then deploy it for real-world use. Researchers have now made it possible to prototype such NLP models, which normally require all of these steps, with a single line of code!
Prompt2Model is a system that retains the ability to specify system behaviour using simple prompts while producing a deployable special-purpose model, preserving the benefits of both. The figure above shows the architecture of Prompt2Model. Essentially, it works as an automated pipeline that extracts the necessary details about the task from the user's prompt, gathers task-related data and a suitable model, and then trains and deploys that model through the following components.
- Dataset retrieval: Given a prompt, the first task is to discover existing manually annotated data that can support a user’s task description.
- Dataset generation: To support a wide range of tasks, a Dataset Generator produces synthetic training data according to the user-specific requirements parsed by the Prompt Parser. The Prompt Parser is an LLM with in-context learning, using OpenAI’s gpt-3.5-turbo-0613, that segments user prompts.
- Model retrieval: Using the provided prompt, a pre-trained language model with suitable knowledge for the user’s goal is selected. This chosen model serves as the student model and is further fine-tuned and evaluated using the generated and retrieved data.
- WebApp: Finally, there exists an easy-to-use graphical user interface that allows downstream users to interact with the trained model. This web application, built using Gradio, can then be easily deployed publicly on a server.
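The flow through these components can be illustrated with a minimal Python sketch. It is not the Prompt2Model API: every name here (`ParsedPrompt`, `parse_prompt`, `retrieve`, `generate_examples`, `build_plan`, and the toy catalogs) is hypothetical. The prompt parser is replaced by a simple rule that treats lines containing `->` as examples, retrieval is replaced by word overlap, and the dataset generator simply repeats the user's examples, where the real system uses LLMs and learned retrievers. Fine-tuning and the Gradio app are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedPrompt:
    """Hypothetical container for a segmented user prompt."""
    instruction: str
    examples: list = field(default_factory=list)


def parse_prompt(prompt: str) -> ParsedPrompt:
    """Rule-based stand-in for the LLM-based Prompt Parser.

    Lines containing '->' are treated as input/output examples;
    all other non-empty lines are joined into the instruction.
    """
    instruction_lines, examples = [], []
    for line in prompt.strip().splitlines():
        if "->" in line:
            source, target = line.split("->", 1)
            examples.append((source.strip(), target.strip()))
        elif line.strip():
            instruction_lines.append(line.strip())
    return ParsedPrompt(" ".join(instruction_lines), examples)


def _overlap(query: str, description: str) -> int:
    """Count shared lowercase words: a crude proxy for a real retriever."""
    return len(set(query.lower().split()) & set(description.lower().split()))


def retrieve(query: str, catalog: dict) -> str:
    """Return the catalog key whose description best matches the query."""
    return max(catalog, key=lambda name: _overlap(query, catalog[name]))


def generate_examples(parsed: ParsedPrompt, n: int) -> list:
    """Placeholder for the Dataset Generator: cycles through user examples."""
    if not parsed.examples:
        return []
    return [parsed.examples[i % len(parsed.examples)] for i in range(n)]


def build_plan(prompt: str, datasets: dict, models: dict, n_synthetic: int = 4) -> dict:
    """Run the parse, retrieve and generate stages; return what training would consume."""
    parsed = parse_prompt(prompt)
    return {
        "instruction": parsed.instruction,
        "dataset": retrieve(parsed.instruction, datasets),
        "model": retrieve(parsed.instruction, models),
        "synthetic_examples": generate_examples(parsed, n_synthetic),
    }


# Toy catalogs and prompt, invented purely for illustration.
DATASETS = {
    "toy_qa": "question answering over short passages",
    "toy_sentiment": "classify the sentiment of movie reviews",
}
MODELS = {
    "toy_seq2seq": "text to text model for question answering and translation",
    "toy_classifier": "encoder model to classify sentiment",
}
PROMPT = "Answer each question from a short passage.\nWho wrote Hamlet? -> Shakespeare"

if __name__ == "__main__":
    print(build_plan(PROMPT, DATASETS, MODELS, n_synthetic=3))
```

In a real pipeline the selected student model would then be fine-tuned on the retrieved and generated data before being wrapped in a web interface.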
In conclusion, Prompt2Model is a tool for quickly building small and competent NLP systems. It can be used directly to produce, within a few hours, task-specific models that outperform LLMs, without manual data annotation or architecture design. Given its extensible design, it can offer a platform for exploring new techniques in model distillation, dataset generation, synthetic evaluation, dataset retrieval, and model retrieval.
Looking ahead, we can envision Prompt2Model as a catalyst for collaborative innovation. By proposing distinct challenges, researchers aim to foster the development of diverse implementations and improvements across the framework’s components in the future.
Check out the Paper and Github. All Credit For This Research Goes To the Researchers on This Project.
Janhavi Lande is an Engineering Physics graduate from IIT Guwahati, class of 2023. She is an upcoming data scientist and has been working in the world of ML/AI research for the past two years. She is most fascinated by this ever-changing world and its constant demand on humans to keep up with it. In her spare time she enjoys traveling, reading and writing poems.