With the latest models in its Qwen series of open-source AI models, Alibaba Cloud is further pushing the boundaries of AI technology. Alibaba has expanded its lineup with the launch of Qwen-1.8B and Qwen-72B, alongside specialized chat and audio models. These releases demonstrate Alibaba's commitment to advancing AI capabilities, offering improved performance and versatility in language and audio processing.
With the launch of Qwen-1.8B and its much larger counterpart, Qwen-72B, the Qwen series, which already comprises Qwen-7B and Qwen-14B, has been significantly expanded. Qwen-1.8B is a transformer-based model with 1.8 billion parameters, pre-trained on a massive corpus of over 2.2 trillion tokens. It outperforms many similarly sized, and even larger, models on a range of linguistic tasks in both Chinese and English, and supports a context length of 8192 tokens.
Notably, Qwen-1.8B's int4 and int8 quantized variants offer a low-cost deployment path: quantization dramatically reduces memory requirements, making the model a practical choice for a wide range of applications. Its extensive vocabulary of over 150,000 tokens further strengthens its linguistic coverage.
The largest model, Qwen-72B, has been trained on 3 trillion tokens. It outperforms GPT-3.5 on most tasks and LLaMA2-70B on all tested tasks. Despite their large parameter counts, Alibaba has designed the models for low-cost deployment: quantized versions can run in as little as roughly 3 GB of memory. This dramatically lowers the barrier to working with large models that once cost millions of dollars in cloud computing.
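To see why quantization shrinks the footprint so much, a back-of-the-envelope estimate of weight memory helps. The figures below are our own illustrative arithmetic, not official Alibaba numbers: real usage adds overhead for activations, the KV cache, and any layers kept at higher precision, which is how the weights-only estimate grows toward the roughly 3 GB total quoted above.

```python
# Back-of-the-envelope memory estimate for quantized model weights.
# Illustrative only: real deployments add overhead for activations,
# the KV cache, and layers kept at higher precision.

def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes."""
    return num_params * bits_per_param / 8 / 1024**3

QWEN_1_8B = 1.8e9  # parameters

for bits, label in [(16, "fp16"), (8, "int8"), (4, "int4")]:
    print(f"Qwen-1.8B {label}: ~{weight_memory_gb(QWEN_1_8B, bits):.2f} GB")
```

At int4, the weights of a 1.8-billion-parameter model fit in well under 1 GB, which is what makes consumer-grade hardware a realistic target.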
In addition to the base Qwen models, Alibaba introduced Qwen-Chat, versions fine-tuned for AI assistance and conversational use. Beyond generating content and holding natural conversations, Qwen-Chat can perform code interpretation and summarization tasks.
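Qwen-Chat models consume conversations in the ChatML format. The sketch below shows how such a prompt is assembled; the helper name and example messages are ours, and in practice the Qwen tokenizer and chat API build this prompt for you.

```python
# Illustrative sketch of the ChatML conversation format used by Qwen-Chat.
# The helper name and example messages are hypothetical; the Qwen chat API
# normally assembles this prompt internally.

def build_chatml_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """Render a system message and (user, assistant) turns as ChatML."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user, assistant in turns:
        parts.append(f"<|im_start|>user\n{user}<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant}<|im_end|>")
    # Leave the final assistant turn open for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    [("Summarize this function for me.", "Sure, paste the snippet.")],
)
print(prompt)
```

The `<|im_start|>` and `<|im_end|>` markers delimit each role's turn, and the trailing open assistant turn is where the model's reply is generated.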
Alibaba's Qwen-Audio represents a notable advance in multimodal AI, accepting multiple audio inputs alongside text and producing text outputs. Remarkably, Qwen-Audio achieves state-of-the-art performance in speech recognition and across a variety of audio-understanding benchmarks without task-specific fine-tuning.
In the audio domain, Qwen-Audio sets a new standard as a foundational audio-language model. It uses a multi-task learning framework to handle many audio types, and it achieves impressive results across multiple benchmarks, including state-of-the-art scores on tasks such as AISHELL-1 and VocalSound.
Qwen-Audio's versatility extends to multi-turn chat from text and audio inputs, with capabilities ranging from voice editing tools to music appreciation and sound interpretation.
Check out the Paper, GitHub, and Model. All credit for this research goes to the researchers of this project.
Dhanshree Shenwai is a computer science engineer with solid experience in FinTech companies spanning finance, cards & payments, and banking, and a keen interest in AI applications. She is excited to explore new technologies and advancements in today's evolving world that make life easier for everyone.