ChatGPT is all the rage, and millions of people use it every day. With its impressive human-mimicking capabilities, such as answering questions, generating unique and creative content, summarizing massive amounts of text, completing code, and powering highly useful virtual assistants, ChatGPT makes life easier for us. Developed by OpenAI, ChatGPT is built on the GPT-3.5 and GPT-4 families of Generative Pre-trained Transformer models. GPT-4, the latest of the language models released by OpenAI, is multimodal: unlike previous versions, it accepts input in the form of both text and images. Other large language models (LLMs) such as PaLM, LLaMA, and BERT are likewise being used in applications across domains including healthcare, e-commerce, finance, and education.
In a recently published research paper, a team of researchers has highlighted the gap between LLMs' impressive performance on complex tasks and their difficulties with simple ones. Delving deeper into the capabilities and limitations of Transformer LLMs, the team ran experiments on three representative compositional tasks: multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks require breaking a problem into smaller steps and combining those steps to produce an exact solution.
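To make the compositional structure concrete, here is a small sketch (not code from the paper) of one of the three tasks, multi-digit multiplication, decomposed into the single-digit sub-steps a model would have to chain and then combine:

```python
def long_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers the way long multiplication does:
    one single-digit product at a time, then sum the shifted partial rows."""
    partials = []
    for i, da in enumerate(reversed(str(a))):        # digits of a, least significant first
        row = 0
        for j, db in enumerate(reversed(str(b))):    # digits of b
            row += int(da) * int(db) * 10 ** j       # single-digit sub-step
        partials.append(row * 10 ** i)               # shift the row by place value
    return sum(partials)                             # compose the sub-results

print(long_multiply(358, 47))  # 16826
```

Each single-digit product is trivial in isolation; the difficulty the paper studies lies in chaining many such steps without error.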
To study the limits of Transformers in solving compositional tasks that require multi-step reasoning, the authors propose two hypotheses. The first is that Transformers handle such tasks by reducing multi-step reasoning to linearized path matching, relying on pattern matching and shortcut learning rather than understanding and applying the underlying computational rules needed to construct proper solutions. This approach allows fast, accurate predictions on patterns similar to those seen during training, but it does not generalize to rare, complex examples. The second hypothesis is that Transformers may have inherent limitations when solving highly complex compositional tasks with novel patterns: early calculation errors can propagate into serious compositional errors in later steps, preventing the model from reaching the correct solution.
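The error-propagation hypothesis can be illustrated with a toy calculation (an assumption for illustration, not a result from the paper): if a model gets each primitive step right with probability p, independently, then the chance of a fully correct n-step composition is p**n, which collapses quickly as the chain grows:

```python
def chain_accuracy(p: float, n_steps: int) -> float:
    """Probability that all n independent steps succeed, each with probability p."""
    return p ** n_steps

# Even a 95%-reliable step leaves long chains mostly wrong.
for n in (1, 5, 10, 20):
    print(n, round(chain_accuracy(0.95, n), 3))  # 0.95, 0.774, 0.599, 0.358
```

The independence assumption is a simplification, but it captures why a single early slip can doom the final answer.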
To investigate the two hypotheses, the authors formulate the compositional tasks as computation graphs. These graphs break the problem-solving process into smaller, submodular functional steps, enabling structured measures of problem complexity and the verbalization of computational steps as input sequences for language models. They also use information gained from the graphs to predict, without running full computations, which patterns the models are likely to learn from the underlying task distribution.
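A minimal sketch of the computation-graph idea (node names and the depth metric are hypothetical, chosen here for illustration): nodes are intermediate values, each node lists the sub-results it is computed from, and the longest input-to-output path serves as one structured measure of reasoning complexity:

```python
# Computation graph for multiplying two 2-digit numbers:
# four single-digit products (leaves) -> two partial rows -> final answer.
edges = {
    "p00": [], "p01": [], "p10": [], "p11": [],  # single-digit products (leaves)
    "row0": ["p00", "p01"],                      # first partial product row
    "row1": ["p10", "p11"],                      # second partial product row
    "answer": ["row0", "row1"],                  # final composition
}

def depth(node: str) -> int:
    """Length of the longest path from a leaf up to this node."""
    inputs = edges[node]
    return 0 if not inputs else 1 + max(depth(c) for c in inputs)

print(depth("answer"))  # 2
```

Wider operands add more leaves and rows, so both the graph's width and its depth grow with problem size, matching the paper's use of graph structure as a complexity measure.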
Based on their empirical findings, the authors propose that Transformers handle compositional challenges by reducing multi-step reasoning to linearized subgraph matching. They also provide theoretical arguments, based on multi-step abstract reasoning problems, showing that Transformers' performance deteriorates rapidly as task complexity increases. This suggests that the models may be inherently limited in their ability to handle highly complex compositional problems.
In conclusion, the empirical and theoretical results imply that Transformers' performance relies primarily on pattern matching and subgraph matching rather than a deep understanding of the underlying reasoning process, which also supports the idea that Transformers would find it difficult to perform increasingly complex compositional tasks.
Tanya Malhotra is a final-year student at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a data science enthusiast with strong analytical and critical-thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.