This article was accepted to the Natural Language Reasoning and Structured Explanations Workshop at ACL 2024.
Reinforcement learning from AI feedback (RLAIF) has shown significant potential in several domains, including mitigating harm in LLM outputs, improving text summarization, and mathematical reasoning. This paper presents an RLAIF framework for improving the code generation capabilities of lightweight LLMs (< 1B parameters). We specifically focus on code generation tasks that require writing appropriate API calls, which is challenging due to the well-known hallucination problem in LLMs. Our framework extracts AI feedback from a larger LLM (e.g., GPT-3.5) through a specialized prompting strategy and uses this data to train a reward model that better aligns the smaller LLM. We run our experiments on the Gorilla dataset and meticulously evaluate the quality of the generated code on several metrics, including AST matching, ROUGE, and CodeBLEU, and we develop a pipeline to accurately compute its executability rate. Our approach significantly improves on the fine-tuned LLM baseline, achieving a 4.5% gain in executability rate. Notably, a smaller LLM (780M parameters) trained with RLAIF outperforms a much larger fine-tuned baseline with 7B parameters, achieving a 1.0% higher code executability rate.
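To make the reward-modeling step described above concrete, the sketch below shows a standard pairwise (Bradley-Terry) reward-model objective of the kind commonly used in RLAIF pipelines. It is an illustrative minimal sketch, not the paper's implementation: the model architecture, dimensions, and tensor shapes are assumptions chosen for brevity, and the AI labeler's preferences are stood in for by random token ids.

```python
# Minimal sketch of a pairwise reward-model objective for RLAIF.
# Architecture and dimensions are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scores a (prompt, generated code) pair with a single scalar reward."""
    def __init__(self, vocab_size=32000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, token_ids):
        x = self.embed(token_ids)
        _, h = self.encoder(x)           # final hidden state summarizes the sequence
        return self.score(h[-1]).squeeze(-1)

def preference_loss(model, chosen_ids, rejected_ids):
    """Bradley-Terry loss: the completion preferred by the AI labeler
    (e.g., a larger LLM such as GPT-3.5) should receive a higher reward."""
    r_chosen = model(chosen_ids)
    r_rejected = model(rejected_ids)
    return -torch.log(torch.sigmoid(r_chosen - r_rejected)).mean()

# Toy usage: random token ids stand in for tokenized API-call completions.
model = RewardModel()
chosen = torch.randint(0, 32000, (4, 64))    # completions the AI labeler preferred
rejected = torch.randint(0, 32000, (4, 64))  # completions it ranked lower
loss = preference_loss(model, chosen, rejected)
loss.backward()
```

Once trained, such a reward model would typically be used to score candidate completions during an RL fine-tuning loop (e.g., PPO) over the smaller LLM.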