Machine learning has prominent applications in programming languages, from code understanding and code representation to code completion. Earlier work, such as Code2Vec, Code2Seq, and graph representation learning for code, focused on exploiting the deep semantic structure underlying programs. These architectures are tailored to native abstract syntax tree (AST) and data flow graph (DFG) structures, which imposes an important limitation: they can only be applied to tasks that involve complete, parsable code.
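This constraint is easy to see in practice: an AST can only be built from syntactically complete source, while the mid-edit fragments that dominate real editing sessions fail to parse. A minimal Python illustration (not tied to any particular model) using the standard-library `ast` module:

```python
import ast

complete = "def add(a, b):\n    return a + b\n"
partial = "def add(a, b):\n    return a +"  # mid-edit fragment, typical of completion settings

ast.parse(complete)  # parses fine: AST/DFG pipelines can consume this

try:
    ast.parse(partial)
except SyntaxError as exc:
    # AST-based pipelines break here; a lexical (text-level) model does not.
    print(f"Cannot build an AST from partial code: {exc.msg}")
```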
Later research showed that transformer-based language models can treat code as natural language at the lexical (text) level. Since then, language models have been widely used to model code across a variety of tasks. In code completion in particular, these models are invoked every few seconds, so small but powerful models that run locally on consumer devices are preferred: they avoid network latency and sidestep the privacy and availability issues of closed APIs.
Stability AI researchers presented Stable Code, a general-purpose base code language model intended for code completion, reasoning, mathematics, and other software-engineering tasks. In addition, they introduced an instruction-tuned variant, Stable Code Instruct, which supports a natural chat interface for instruction-following and question-answering tasks.
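As a quick illustration of the completion use case, here is a minimal sketch that loads the model through Hugging Face `transformers` and completes a function stub. The checkpoint name `stabilityai/stable-code-3b` is assumed from Stability AI's public Hugging Face releases, and `device_map="auto"` additionally requires the `accelerate` package:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name assumed from Stability AI's Hugging Face releases.
model_id = "stabilityai/stable-code-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 3B model in bf16 fits on consumer GPUs
    device_map="auto",
)

prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```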
Stable Code is based on Stable LM, Stability AI's LLM for English natural language at the 3-billion-parameter scale. The model is a causal decoder-only transformer similar in design to the LLaMA architecture. The main differences from LLaMA are:
- Position embeddings: Rotary position embeddings (RoPE) are applied to the first 25% of head embedding dimensions to improve performance (a sketch of this partial scheme follows the list).
- Normalization: LayerNorm with learned bias terms, rather than RMSNorm.
- Biases: All bias terms were removed from the feed-forward networks and multi-head self-attention layers, except for the biases of the key, query, and value projections.
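To make the first bullet concrete, here is a minimal PyTorch sketch of partial rotary embeddings, rotating only the leading 25% of each head's dimensions and passing the rest through unchanged. The tensor shapes are illustrative, and this is a sketch of the general technique, not Stability AI's exact implementation:

```python
import torch

def partial_rope(x: torch.Tensor, rotary_pct: float = 0.25, base: float = 10_000.0) -> torch.Tensor:
    """Rotate the first `rotary_pct` of head dimensions with RoPE; pass the rest through.

    x has shape (batch, seq_len, n_heads, head_dim).
    """
    seq_len, head_dim = x.shape[1], x.shape[-1]
    rot_dim = int(head_dim * rotary_pct)          # e.g. 20 of 80 dims at 25%
    x_rot, x_pass = x[..., :rot_dim], x[..., rot_dim:]

    # Standard RoPE frequencies, computed only over the rotated slice.
    inv_freq = 1.0 / (base ** (torch.arange(0, rot_dim, 2, dtype=torch.float32) / rot_dim))
    pos = torch.arange(seq_len, dtype=torch.float32)
    freqs = torch.outer(pos, inv_freq)            # (seq_len, rot_dim // 2)
    cos = freqs.cos()[None, :, None, :]           # broadcast over batch and heads
    sin = freqs.sin()[None, :, None, :]

    # GPT-NeoX-style "rotate half" applied to the rotated slice only.
    x1, x2 = x_rot[..., : rot_dim // 2], x_rot[..., rot_dim // 2 :]
    rotated = torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
    return torch.cat([rotated, x_pass], dim=-1)

# Toy shapes for illustration only (not taken from the model's config):
q = torch.randn(2, 128, 32, 80)                   # batch 2, 128 tokens, 32 heads, head_dim 80
q_rope = partial_rope(q)                          # only the first 20 dims per head are rotated
assert q_rope.shape == q.shape
```

Rotating only a fraction of the dimensions preserves most of the positional signal while reducing the per-token rotation cost, which is the performance trade-off this design targets.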
Although it is considerably smaller, Stable Code matches the average performance of Code Llama and StarCoder across programming languages. Furthermore, Stable Code 3B achieves strong results at the 3B scale, showing notable capability on code-completion tasks. The researchers also evaluated the instruction-tuned models on the code subset of the challenging Multi-turn Benchmark (MT-Bench).
In conclusion, Stability AI researchers introduced Stable Code and Stable Code Instruct to address a range of software-development use cases. Both are compact decoder-only language models. The researchers performed extensive evaluations and comparisons against models of similar size, demonstrating the strong performance of Stable Code and Stable Code Instruct, and they also analyze the models on typical edge-computing architectures.