Introduction
I’m pretty sure most of you have already used ChatGPT. If so, great: you’ve already taken the first step on the journey we’re about to embark on in this article. When it comes to mastering any new technology, the first thing you need to do is use it. It’s like learning to swim by jumping into the water!
If you want to explore GenAI, pick a real-world problem and start building an app to solve it. At the heart of everything GenAI is a large language model (LLM), which some people call a foundation model (FM).
You may have heard of model consumers, model tuners, and model builders. But wait, we’re about to break it down even more.
McKinsey describes these roles as takers, shapers, and makers, as mentioned in its GenAI Recognize session.
We’ll take a closer look at each of these layers in this article.
The proliferation of platforms as a use case
To make this concrete, let’s turn to a real-life example. In today’s technology landscape, most applications need to run on multiple platforms. Here’s the catch: each platform has its own interface and quirks. Extending an application to additional platforms, and maintaining it across all of them, is equally challenging.
That’s where GenAI comes in to save the day. It allows us to create a unified, easy-to-use interface for our applications, regardless of the platforms they serve. The magic ingredient? Large language models (LLMs), which turn this interface into natural, intuitive language.
Linux, Windows and Mac commands
To make this more concrete, let’s say we want to know the exact command to run for different scenarios on our machine, which could be Linux, Windows, or Mac. Consider a scenario like the following:
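For instance, the same request can map to quite different commands depending on the platform. The interaction below is purely illustrative; the exact commands can vary by shell and configuration:

User: List all files in the current directory, including hidden ones
Linux (bash): ls -la
Mac (zsh): ls -la
Windows (PowerShell): Get-ChildItem -Force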
Value for the end user and application developer
As an end user, you don’t need to learn or know the commands for each of these platforms; you can get things done naturally and intuitively. As the application developer, you don’t need to explicitly translate each of the application’s user-facing interfaces to each of the underlying supported platforms.
Reference architecture
Several LLMs, including GPT-3, GPT-3.5, and GPT-4, are hosted in the cloud by various providers such as OpenAI and Azure OpenAI. They are easily accessible via various APIs such as completion, chat completion, and so on.
AI orchestrators make this access even more seamless and consistent across models and providers. That’s why GenAI applications today often interact with an AI orchestrator rather than directly with the underlying providers and models. The orchestrator then handles the orchestration with configurable, and possibly multiple, providers and underlying models as required by the application.
For flexibility and modularity, you can have a plugin for each of the platforms your application wants to support. We’ll delve into all the things we can do with these plugins and orchestrators in the following sections.
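For example, with Semantic Kernel the plugins can simply be folders of prompt files. The layout below is only a possible structure; the directory name platform_commands matches the orchestration code shown later, and the platform names are illustrative:

platform_commands/
    linux/
        skprompt.txt
        config.json
    windows/
        skprompt.txt
        config.json
    mac/
        skprompt.txt
        config.json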
Finally, the application has one or more connectors to interact with the platforms you want to support, in order to execute the commands generated by GenAI.
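As a minimal sketch, assuming the simplest case where the generated command should just run on the local machine, such a connector could be a thin wrapper around a subprocess call (the function name run_command is only illustrative):

import subprocess

def run_command(command: str) -> str:
    # Execute the GenAI-generated command in the local shell and capture its output.
    # A real application should validate or sandbox the command before running it.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else result.stderr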
Reference technologies
Configuration
There are numerous settings you can adjust to achieve the desired results. Here is a typical config.json for a Semantic Kernel plugin:
{
  "schema": 1,
  "description": "My Application",
  "type": "completion",
  "completion": {
    "max_tokens": 300,
    "temperature": 0.0,
    "top_p": 0.0,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "stop_sequences": [
      "++++++"
    ]
  },
  "input": {
    "parameters": [
      {
        "name": "input",
        "description": "Command Execution Scenario",
        "defaultValue": ""
      }
    ]
  }
}
The “type” specifies the type of API you want to invoke on the underlying LLM; here we are using the “completion” API. The “temperature” determines the variability or creativity of the model. While chatting, for example, you may want the AI to respond with different phrasings at different times, even though they all convey the same intent, to keep the conversation interesting. Here, however, we always want the same precise answer, so we use a value of 0. The response can consist of several sections separated by predefined separators; if you only want the first section (the exact command, in our case) to be returned, you define “stop_sequences” as shown here. Finally, you define your input with all of its parameters, only one in this case.
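To see what these settings correspond to at the API level, here is a rough sketch of an equivalent raw completion call. It assumes the pre-1.0 openai Python SDK and placeholder Azure OpenAI endpoint and deployment names; Semantic Kernel makes a call like this for you under the hood:

import openai

# Placeholder Azure OpenAI settings (assumptions, not values from this article)
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<your-api-key>"

response = openai.Completion.create(
    engine="<your-deployment>",   # your model deployment name
    prompt="User: Get my IP\nAssistant:",
    max_tokens=300,
    temperature=0.0,              # deterministic, precise answers
    top_p=0.0,
    stop=["++++++"],              # cut the response off at the separator
)
print(response["choices"][0]["text"])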
Prompt engineering
Now let’s delve into the much-talked-about prompt engineering and how we can take advantage of it.
System messages
System messages tell the model exactly how we want it to behave. For example, the Linux bash plugin in our case might have something like the following at the beginning of its skprompt.txt:
You are a helpful assistant that generates commands for Linux bash machines based on user input. Your response should contain ONLY the command and NO explanation. For all user input, generate a response considering only Linux bash commands to solve it.
This specifies your system message.
Few-shot prompting: instructions and examples
It helps the model give the exact answer you’re looking for if you provide it with some sample questions and the corresponding answers. This is also called few-shot prompting. For example, our Linux bash plugin might have something like the following in its skprompt.txt, after the system message mentioned above:
Examples
User: Get my IP
Assistant: curl ifconfig.me
++++++
User: Get the weather in San Francisco
Assistant: curl wttr.in/SanFrancisco
++++++
User:"{{$input}}"
Assistant:
You may need to iterate on which examples (shots) you include to get the desired results.
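For comparison, a hypothetical Windows PowerShell plugin could follow exactly the same structure, with its own system message and platform-appropriate shots, for example:

You are a helpful assistant that generates commands for Windows PowerShell based on user input. Your response should contain ONLY the command and NO explanation.
Examples
User: List files in the current directory
Assistant: Get-ChildItem
++++++
User:"{{$input}}"
Assistant: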
AI orchestration
Let’s now put this configuration and prompt engineering together in our simple example and see how we can manage the AI orchestration with Semantic Kernel.
import argparse
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion

# Parse the target platform from the command line
parser = argparse.ArgumentParser(description='GANC')
parser.add_argument('platform', type=str,
                    help='A platform needs to be specified')
parser.add_argument('--verbose', action='store_true',
                    help='is verbose')
args = parser.parse_args()

# Set up the kernel with an Azure OpenAI text completion service (settings read from .env)
kernel = sk.Kernel()
deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
kernel.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))

# Load the plugins (one per platform) and pick the one for the requested platform
platformFunctions = kernel.import_semantic_skill_from_directory("./", "platform_commands")
platformFunction = platformFunctions[args.platform]

# Take the user's query, invoke the function, and print the generated command
user_query = input()
response = platformFunction(user_query)
print(response)
This Python script takes “platform” as a required argument, selects the correct plugin for that platform from the “platform_commands” folder, takes the user’s query, invokes the function, and prints the response.
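Assuming the script is saved as, say, app.py (the file name is illustrative) and a .env file with your Azure OpenAI settings is in place, a run might look like this:

$ python app.py linux
Get the weather in San Francisco
curl wttr.in/SanFrancisco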
For your first use cases, you may not need to go further than this, since LLMs already pack a lot of intelligence. This simple configuration and prompt engineering alone can deliver results very close to the desired behavior, very quickly.
The following techniques are quite advanced, require more effort and knowledge, and should be employed with return on investment in mind; the technology is still evolving and maturing in this space. For now, we will only take a cursory glance at them, for completeness and so we know what lies ahead.
Fine-tuning
Fine-tuning involves updating the weights of a pre-trained language model on a new task and dataset. It is typically used for transfer learning, personalization, and domain specialization. Several tools and techniques are available for this. One way is to use OpenAI’s CLI tools: you can give it your data and have it prepare training data for fine-tuning with commands like:
openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
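The training data for this legacy fine-tuning flow is typically a JSONL file of prompt/completion pairs. For our use case it might look something like the following; these records are purely illustrative:

{"prompt": "Get my IP ->", "completion": " curl ifconfig.me"}
{"prompt": "Get the weather in San Francisco ->", "completion": " curl wttr.in/SanFrancisco"}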
You can then create a custom model using Azure AI Studio, providing the fine-tuning data you prepared earlier.
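If you would rather stay on the command line than use the Azure AI Studio UI, the same legacy OpenAI CLI also offered a command to start the fine-tuning job; the base model name here is just an example:

openai api fine_tunes.create -t <PREPARED_TRAIN_FILE> -m curie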
Custom LLMs
If you are brave enough to dig deeper and experiment, read on! We will see how to build our custom models.
Retraining
This is very similar to the fine-tuning we saw earlier. Here is how we can do it using Hugging Face Transformers:
from transformers import AutoTokenizer

# Prepare your data
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# let's say my dataset is loaded into my_dataset
tokenized_datasets = my_dataset.map(tokenize_function, batched=True)

# load your model
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

# Train
from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(output_dir="mydir")
trainer = Trainer(model=model, args=training_args, train_dataset=tokenized_datasets)
trainer.train()

# save your model, which can later be reloaded by pointing to the saved directory
trainer.save_model()
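To use the saved model later, you can point from_pretrained at the output directory; the directory name here matches the output_dir used above:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Reload the fine-tuned model from the saved directory and the original tokenizer
model = AutoModelForSequenceClassification.from_pretrained("mydir")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")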
Training from scratch
Here you can start with a known model structure and train it from scratch. This takes a lot of time, resources, and training data, but the resulting model is completely under your control.
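As a rough sketch using the same Transformers library, training from scratch means initializing a model from a configuration with random weights instead of loading pre-trained ones; the sizes below are purely illustrative:

from transformers import BertConfig, BertForMaskedLM

# Define the model structure yourself; these dimensions are illustrative
config = BertConfig(
    hidden_size=256,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=1024,
)

# Random weights: the model knows nothing until you train it on your own corpus
model = BertForMaskedLM(config)
print(model.num_parameters())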
New models
You can define the structure of your model, potentially improving on existing models, and then follow the process above. Amazon’s Titan and CodeWhisperer fall into this category.
Conclusion
GenAI has immense potential for various use cases. This article exemplified its application in cross-platform support and rapid solution creation. While skepticism surrounds GenAI, the path to harnessing its power is clear. However, the journey becomes complex when you delve deeper into model tuning and training.
Key takeaways:
- As you can see, GenAI is very fascinating and enables various use cases.
- We walked through one such use case and saw how we could quickly start building a solution.
- Some wonder whether GenAI is a bubble. Pick your favorite use case, follow the steps set out in this article, and answer that question for yourself!
- The process can become complex and laborious very quickly as you begin to delve into territories such as tuning, training, and model building.
Frequently asked questions
Q. Is GenAI just a bubble?
A. I don’t think so. You can pick your favorite use case, try it yourself by following the steps set out in this article, and answer that question for yourself!
Q. What are the different layers of GenAI users?
A. There are end users, model consumers, model tuners, and model builders.