Language models are fine-tuned on in-context input-label pairs in which the natural language labels are remapped to arbitrary symbols. For a given task, the model must rely on the in-context input-label mappings to figure out and perform the task. In a new research paper, the Google AI team presents this simple fine-tuning procedure, which significantly improves a language model's ability to reason over and learn from input-label mappings presented in context. They call it Symbol Tuning. The research team fine-tunes on a mixture of 22 NLP datasets with various arbitrary symbols as labels and experiments with multiple Flan-PaLM models.
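To make the setup concrete, here is a minimal Python sketch of how a symbol-tuning example could be constructed: the natural language labels of a classification dataset are remapped to arbitrary symbols, and the exemplars are concatenated into a single prompt. The symbol pool, prompt format, and the `make_symbol_tuned_example` helper are illustrative assumptions, not the paper's exact implementation.

```python
import random

# Hypothetical pool of arbitrary symbols used in place of natural language labels.
# (Illustrative only; the paper samples from a much larger symbol set.)
SYMBOL_POOL = ["foo", "XJ-27", "mango", "174", "qvx", "delta"]

def make_symbol_tuned_example(exemplars, query):
    """Build one symbol-tuning prompt from labeled (text, label) exemplars.

    Natural language labels (e.g. "positive"/"negative") are remapped to
    arbitrary symbols, so the task can only be inferred from the in-context
    input-label mappings, not from the label names or any instruction.
    """
    labels = sorted({label for _, label in exemplars})
    symbols = random.sample(SYMBOL_POOL, len(labels))
    remap = dict(zip(labels, symbols))

    lines = [f"Input: {text}\nOutput: {remap[label]}" for text, label in exemplars]
    lines.append(f"Input: {query[0]}\nOutput:")
    return "\n\n".join(lines), remap[query[1]]  # prompt and its target symbol

# Example: a sentiment task whose labels are hidden behind arbitrary symbols.
exemplars = [
    ("The movie was a delight.", "positive"),
    ("I want my two hours back.", "negative"),
    ("A heartfelt, beautifully shot film.", "positive"),
]
prompt, target = make_symbol_tuned_example(exemplars, ("Utterly forgettable.", "negative"))
print(prompt)
print("Expected output:", target)
```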
Symbol tuning improves the performance of baseline models on unseen in-context learning tasks. The models are fine-tuned on examples in which semantically unrelated symbols replace the natural language labels, so the task is not clear from a single in-context example and multiple in-context examples are needed to define it. On average, symbol tuning yields an improvement of +11.1% across eleven evaluation tasks for Flan-cont-PaLM-62B.
The symbol-tuning procedure uses only natural language data and no numerical or algorithmic data, yet symbol-tuned models perform better on algorithmic reasoning tasks. To verify this, the researchers experiment with a set of list function tasks, in which the model must identify the transformation function between input and output lists of non-negative integers, and with simple turing concepts, in which the model reasons over binary strings to map an input to an output. They find that symbol tuning yields an average performance improvement across all tasks of 18.2% for Flan-PaLM-8B, 11.1% for Flan-PaLM-62B, 15.5% for Flan-cont-PaLM-62B, and 3.6% for Flan-PaLM-540B.
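The following toy episode illustrates the list-functions setup. The prompt format and the particular transformation (keep every second element) are made up for illustration; the paper evaluates on an existing suite of list function tasks.

```python
# A toy list-functions episode: the model must infer the transformation
# purely from the input-output pairs shown in context.
def make_list_function_prompt(pairs, query):
    lines = [f"Input: {inp}\nOutput: {out}" for inp, out in pairs]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

# Hidden rule in this illustrative example: keep every second element.
pairs = [
    ([0, 3, 4, 6, 2], [0, 4, 2]),
    ([9, 1, 8, 5],    [9, 8]),
    ([7, 7, 2, 0, 1], [7, 2, 1]),
]
print(make_list_function_prompt(pairs, [4, 5, 6, 1, 3]))
# A model that has inferred the rule should answer: [4, 6, 3]
```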
Compared to instruction-tuned models, symbol-tuned models are much better at following flipped labels presented in context. Instruction-tuned models perform well below random guessing because they cannot flip their predictions to follow the flipped labels. Symbol tuning, on the other hand, forces models to treat the label presented in context as an arbitrary symbol, which reduces the model's reliance on prior knowledge that contradicts the flipped labels. The researchers find that symbol tuning yields an average improvement across all datasets of 26.5% for Flan-PaLM-8B, 33.7% for Flan-PaLM-62B, and 34.0% for Flan-PaLM-540B.
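Below is a rough sketch of what a flipped-label evaluation could look like: every in-context exemplar carries a flipped label, and the model is scored on whether it predicts the flipped label for the query rather than the one its prior knowledge suggests. The `FLIP` mapping, the prompt format, and the `model_predict` stand-in are hypothetical, not the paper's evaluation code.

```python
# Illustrative flipped-label evaluation sketch.
FLIP = {"positive": "negative", "negative": "positive"}

def flipped_label_prompt(exemplars, query_text):
    # All in-context exemplars have their labels flipped.
    lines = [f"Input: {text}\nOutput: {FLIP[label]}" for text, label in exemplars]
    lines.append(f"Input: {query_text}\nOutput:")
    return "\n\n".join(lines)

def flipped_label_accuracy(model_predict, episodes):
    """`model_predict(prompt) -> str` is a stand-in for querying the model."""
    correct = 0
    for exemplars, (query_text, true_label) in episodes:
        prompt = flipped_label_prompt(exemplars, query_text)
        # The "correct" answer under flipped labels is the flipped true label.
        correct += model_predict(prompt) == FLIP[true_label]
    return correct / len(episodes)

# Toy usage with a dummy "model" that always answers "positive":
episodes = [([("Great film.", "positive"), ("Terrible.", "negative")],
             ("Loved it.", "positive"))]
print(flipped_label_accuracy(lambda prompt: "positive", episodes))  # 0.0
```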
The researchers note that symbol tuning does not require many fine-tuning steps for any model when small datasets are used: the largest changes in performance occur within the initial 1k to 2k steps, after which performance remains relatively constant. Given this plateau, they suggest that larger models may require a larger or more diverse set of symbol-tuning data.
The researchers also find that, beyond the initial steps, higher proportions of symbol-tuning data do not affect in-context learning (ICL) performance: as long as a non-trivial amount of symbol-tuning data is used, the exact proportion is largely irrelevant. However, the team finds a strong correlation between the proportion of symbol-tuning data in the mixture and how likely the model is to follow flipped labels, which improves the model's ability to override prior knowledge with in-context exemplars. The method succeeds only insofar as the model generalizes from the diverse set of tasks seen during symbol tuning to new, unseen tasks.
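As a rough illustration of this mixing ablation, the sketch below builds a fine-tuning mixture in which symbol-tuning examples occupy a chosen proportion of the data. The `build_mixture` helper and its arguments are hypothetical and only illustrate the idea of varying the proportion of symbol-tuning data.

```python
import random

def build_mixture(instruction_data, symbol_data, symbol_proportion, total, seed=0):
    """Sample a tuning mixture with the given proportion of symbol-tuning examples."""
    rng = random.Random(seed)
    n_symbol = int(total * symbol_proportion)
    mixture = (rng.choices(symbol_data, k=n_symbol)
               + rng.choices(instruction_data, k=total - n_symbol))
    rng.shuffle(mixture)
    return mixture
```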
Check out the Paper and the Google article. Don't forget to join our 26k+ ML SubReddit, Discord channel, and email newsletter, where we share the latest AI research news, exciting AI projects, and more. If you have any questions about the article above or if we missed anything, feel free to email us at [email protected]
🚀 Check out over 800 AI tools at AI Tools Club
Arshad is an intern at MarktechPost. He is currently pursuing his Integrated Master's degree in Physics from the Indian Institute of Technology, Kharagpur. He believes that understanding things down to the fundamental level leads to new discoveries that advance technology. He is passionate about understanding nature fundamentally with the help of tools like mathematical models, ML models, and AI.