Large language models (LLMs) have excelled at a wide range of NLP tasks and have shown encouraging signs of achieving some features of artificial general intelligence. Recent research has also shown that LLMs can be augmented with external tools, greatly increasing their problem-solving power and efficiency, much as tool use shaped the evolution of human intelligence. However, the applicability of these tool-using methods depends largely on whether appropriate tools are available. The history of human evolution suggests that a significant turning point came when people gained the ability to create their own tools to solve new problems.
In this study, researchers from Google DeepMind, Princeton University, and Stanford University apply this evolutionary notion to LLMs, motivated by the importance of tool making to humans. The framework they propose, dubbed LLMs as Tool Makers (LATM), allows LLMs to create their own reusable tools to take on new tasks. Their approach consists of two key phases: 1) tool making: an LLM, called the tool maker, creates tools (implemented as Python functions) for a specific task. 2) tool using: a second LLM, called the tool user, which may be the same model as the tool maker, applies the tools to handle new requests. Thanks to this two-stage design, LATM can assign the work at each stage to the most suitable LLM.
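The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: `tool_maker`, `load_tool`, and `tool_user` are hypothetical names, and both "models" are plain stub functions so that the example runs without calling any LLM API.

```python
from typing import Any, Callable, Dict


def tool_maker(task_description: str) -> str:
    """Hypothetical stand-in for the tool maker LLM.

    A real tool maker would prompt a strong model with the task description;
    here the returned Python source is hard-coded purely for illustration.
    """
    return (
        "def tool(numbers):\n"
        "    # Return the numbers in ascending order.\n"
        "    return sorted(numbers)\n"
    )


def load_tool(source: str, name: str = "tool") -> Callable[..., Any]:
    """Turn generated source code into a callable Python function."""
    namespace: Dict[str, Any] = {}
    exec(source, namespace)  # Assumption: the generated code is trusted.
    return namespace[name]


def tool_user(tool: Callable[..., Any], request: Any) -> Any:
    """Hypothetical stand-in for the tool user LLM.

    A real tool user would translate a natural-language request into a
    function call; here the request is already the function argument.
    """
    return tool(request)


# Phase 1: make the tool once.
sort_tool = load_tool(tool_maker("Sort a list of numbers in ascending order."))

# Phase 2: reuse the same tool for several requests.
results = [tool_user(sort_tool, r) for r in ([3, 1, 2], [10, -5, 7])]
print(results)
```

The point of the sketch is the separation of concerns: the expensive step produces source code once, and the cheap step only has to call the resulting function.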
In particular, a powerful but resource-intensive model (such as GPT-4) can handle the tool-making process, which demands a high level of capability. A lightweight and affordable model (such as GPT-3.5 Turbo), on the other hand, can be assigned the significantly simpler tool-using process. This approach greatly reduces the average computing cost of handling many tasks while improving the problem-solving skills of LLMs. For a given capability, the tool-making process only needs to be performed once, so the resulting tools can be reused across many task instances.
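The cost argument comes down to simple amortization: a one-time tool-making cost is spread over every task instance that reuses the tool. The short sketch below makes this concrete. The function name and all cost figures are hypothetical and expressed in arbitrary units; they are not taken from the paper.

```python
def average_cost_per_task(making_cost: float, using_cost: float, num_tasks: int) -> float:
    """Average cost when the tool is made once and then reused num_tasks times."""
    if num_tasks <= 0:
        raise ValueError("num_tasks must be positive")
    return (making_cost + using_cost * num_tasks) / num_tasks


# Hypothetical costs in arbitrary units (assumptions for illustration only).
MAKING_COST = 50.0       # one-time cost of the strong model making the tool
USING_COST = 1.0         # per-task cost of the lightweight model using the tool
STRONG_ONLY_COST = 20.0  # per-task cost if the strong model solved every task

for n in (1, 10, 100):
    amortized = average_cost_per_task(MAKING_COST, USING_COST, n)
    print(f"{n:>3} tasks: amortized {amortized:.2f} vs strong-only {STRONG_ONLY_COST:.2f}")
```

Under these assumed numbers, making a tool costs more than solving a single task directly, but the average cost falls toward the lightweight model's per-task cost as the number of task instances grows.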
This method provides a scalable and inexpensive way to address challenging problems. Consider a scenario where a user asks the LLM to schedule a meeting that works for everyone (for example, based on email exchanges). Complex arithmetic reasoning problems like this are often difficult for lightweight models such as GPT-3.5 Turbo to solve. Stronger models, such as GPT-4, can get the answers right but have significantly higher inference costs. By using a powerful but expensive model as the tool maker and handing the tool over to a cost-effective model as the tool user, LATM overcomes these obstacles. Once the tool has been forged, the tool user can apply it to perform the work quickly and efficiently.
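To give a sense of what a forged tool for the meeting scenario might look like, here is a hypothetical example of the kind of Python function a tool maker could produce. The function name, the input format (a mapping from each participant to the slots they are free), and the sample data are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def find_common_slots(availability: Dict[str, Iterable[str]]) -> List[str]:
    """Return the time slots at which every participant is available.

    Assumption: slots are plain strings and two slots match only if the
    strings are identical.
    """
    if not availability:
        return []
    # Intersect the slot sets of all participants.
    common = set.intersection(*(set(slots) for slots in availability.values()))
    return sorted(common)


# Hypothetical availability extracted from an email thread.
availability = {
    "alice": ["Mon 10:00", "Tue 14:00"],
    "bob": ["Tue 14:00", "Wed 09:00"],
    "carol": ["Mon 10:00", "Tue 14:00"],
}
print(find_common_slots(availability))
```

With a tool like this in place, the lightweight model no longer has to reason about overlapping schedules itself; it only needs to extract each person's availability and call the function.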
This paradigm can also be applied to well-known games such as the 24 game and Sudoku, and to repetitive work in other workflows, such as parsing and analyzing online articles into specific data formats or creating routing plans that meet various specialized requirements. The researchers also introduce a dispatcher, a lightweight LLM that decides whether an incoming problem can be solved with existing tools or whether a new tool should be made. This adds a further degree of dynamism to the architecture and allows tools to be created and used in real time. Their experiments demonstrate the effectiveness of this approach on a variety of challenging BIG-bench tasks and on complex reasoning tasks in general.
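The dispatcher's role can be sketched as a cache lookup with a fallback to tool making. The code below is a simplified illustration under stated assumptions, not the paper's design: the `Dispatcher` class, its methods, and the `make_tool` stub are hypothetical, and requests are assumed to carry an explicit task label, whereas a real dispatcher would be a lightweight LLM that infers the task from the request.

```python
from typing import Any, Callable, Dict


def make_tool(task: str) -> Callable[[Any], Any]:
    """Hypothetical stand-in for the tool maker; picks a ready-made function."""
    library: Dict[str, Callable[[Any], Any]] = {"sort": sorted, "total": sum}
    return library[task]


class Dispatcher:
    """Reuses an existing tool when one fits, otherwise requests a new one."""

    def __init__(self, tool_maker: Callable[[str], Callable[[Any], Any]]) -> None:
        self.tool_maker = tool_maker
        self.tools: Dict[str, Callable[[Any], Any]] = {}
        self.tools_made = 0

    def classify(self, request: Dict[str, Any]) -> str:
        # Assumption: the task label is given explicitly in the request.
        return request["task"]

    def handle(self, request: Dict[str, Any]) -> Any:
        task = self.classify(request)
        if task not in self.tools:
            # No suitable tool yet: invoke the (expensive) tool maker once.
            self.tools[task] = self.tool_maker(task)
            self.tools_made += 1
        # Apply the cached tool to the request payload.
        return self.tools[task](request["payload"])


dispatcher = Dispatcher(make_tool)
print(dispatcher.handle({"task": "sort", "payload": [3, 1, 2]}))
print(dispatcher.handle({"task": "sort", "payload": [9, 8]}))
print(dispatcher.handle({"task": "total", "payload": [1, 2, 3]}))
print("tools made:", dispatcher.tools_made)
```

In this sketch the second sorting request reuses the cached tool, so the tool maker is invoked only once per task type.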
The results show that LATM can perform as well as more resource-intensive models while being more cost-effective. By mimicking the evolutionary leap humans made in creating and using tools, this novel approach opens up exciting possibilities for a growing community built around LLM-generated tools.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor’s degree in Information Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing, and she is passionate about building solutions around it. She loves connecting with people and collaborating on interesting projects.