Long prompts present a significant challenge for practical LLM-based systems that need to operate with low latency and limited resources. We investigate prompt compression for zero-shot dialogue systems that learn to use unseen APIs directly in context from their documentation, which may consume hundreds of prompt tokens per API. We build on a recently introduced approach (Mu et al., 2023) that learns to compress the prompt into a few “gist token” activations during finetuning. However, this simple idea is ineffective for compressing API documentation, resulting in low accuracy compared to the baseline with an uncompressed prompt. In this work, we introduce two major improvements. First, we specialize gist tokens for different hierarchies within an API: we use one Gist_arg token to compress an argument and one Gist_value token to compress an acceptable value of a categorical argument. We then dynamically reveal Gist_value tokens only when they are needed. Second, we add a reconstruction loss to predict the API documentation from the gist tokens. Across multiple API-calling tasks, our proposed system maintains the simplicity, efficiency, and large compression factor (20x on SGD) of the gist token approach while achieving significantly higher accuracy.
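As a rough illustration of the second improvement, the training objective can be viewed as the original gist-compression (task) loss plus an auxiliary reconstruction term; the weighting hyperparameter $\lambda$ below is an assumption for exposition, not something stated in the abstract:
\[
\mathcal{L} \;=\; \mathcal{L}_{\text{task}} \;+\; \lambda \, \mathcal{L}_{\text{recon}},
\qquad
\mathcal{L}_{\text{recon}} \;=\; -\sum_{t} \log p_\theta\!\left(d_t \mid d_{<t}, \mathbf{g}\right),
\]
where $d$ denotes the API documentation tokens and $\mathbf{g}$ the gist token activations from which the documentation is predicted.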