In a significant step for the Japanese generative AI landscape, Stability AI, the company behind Stable Diffusion, has introduced its first Japanese language model (LM), Japanese StableLM Alpha. The launch has drawn attention because the company claims the model is the best-performing publicly available LM for Japanese speakers, a claim it backs with a benchmark evaluation against four other Japanese LMs.
The newly introduced Japanese StableLM Alpha is a 7-billion-parameter model intended as a versatile, high-performing tool for a range of Japanese language tasks. According to Stability AI, it outperforms comparable models in multiple evaluation categories.
The commercial version, Japanese StableLM Base Alpha 7B, is slated for release under the widely used Apache License 2.0. The model was trained on a dataset of 750 billion tokens of Japanese and English text sourced from online repositories.
The achievement also rests on collaboration. Stability AI drew on the expertise of the EleutherAI Polyglot project’s Japanese team and on datasets created by Stability AI’s Japanese community, and it carried out training with an extended version of EleutherAI’s GPT-NeoX software.
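For readers who want to try the base model, the following is a minimal usage sketch, not official sample code: the Hugging Face model ID and tokenizer behavior are assumptions based on the announcement, so consult the model card for the exact loading recipe.

```python
# Minimal sketch for loading and sampling from the base model.
# The model ID below is an assumption; check the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/japanese-stablelm-base-alpha-7b"  # assumed HF repo name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model within ~14 GB
    trust_remote_code=True,     # GPT-NeoX-derived repos may ship custom modeling code
)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
model.eval()

prompt = "人工知能の最近の進歩について説明してください。"  # "Explain recent advances in AI."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```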
Alongside it, Stability AI released Japanese StableLM Instruct Alpha 7B, a model intended exclusively for research use. It is tuned to follow user instructions through supervised fine-tuning (SFT) on multiple open datasets; a generic sketch of the technique follows below.
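Stability AI has not published its exact SFT recipe here, so the sketch below illustrates the general technique with the TRL library; the dataset, prompt format, and hyperparameters are placeholders, not the company’s actual training setup.

```python
# Generic SFT illustration (not Stability AI's recipe): flatten instruction/response
# pairs into text and fine-tune a causal LM on them.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base_id = "stabilityai/japanese-stablelm-base-alpha-7b"  # assumed HF model ID
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")  # one open instruction dataset

def to_text(example):
    # Flatten an (instruction, response) pair into a single training string.
    return {"text": f"指示: {example['instruction']}\n応答: {example['response']}"}

dataset = dataset.map(to_text)

model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # column holding the flattened prompt/response
    max_seq_length=1024,
    args=TrainingArguments(
        output_dir="sft-out",
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
)
trainer.train()
```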
The models were evaluated with EleutherAI’s Language Model Evaluation Harness across tasks spanning sentence classification, sentence pair classification, question answering, and sentence summarization, where Japanese StableLM Instruct Alpha 7B achieved an average score of 54.71%. Stability AI contends that this result places the model ahead of its contemporaries.
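To make the methodology concrete, here is a hedged sketch of a harness-style evaluation. The task identifiers are assumptions modeled on common Japanese benchmarks (Stability AI’s evaluation used its own task suite), and the "hf-causal" backend name matches v0.3-era releases of the harness.

```python
# Hedged sketch of running the evaluation harness from Python.
# Task IDs and few-shot counts below are assumptions, not Stability AI's config.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",  # Hugging Face causal-LM backend in v0.3-era harness releases
    model_args=(
        "pretrained=stabilityai/japanese-stablelm-instruct-alpha-7b,"
        "trust_remote_code=True"
    ),
    tasks=["jcommonsenseqa", "jnli", "marc_ja", "jsquad"],  # assumed Japanese task IDs
    num_fewshot=3,
)

# Print per-task metrics (accuracy, F1, etc.); averaging them gives the kind of
# aggregate score used to rank models against each other.
for task, metrics in results["results"].items():
    print(task, metrics)
```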
The timing of the launch is notable in light of SoftBank’s recent announcement. Last week, SoftBank revealed plans to build homegrown large language models (LLMs) for the Japanese market, backed by roughly 20 billion JPY (over $140 million) for a generative AI computing platform set to debut later this year.
As the landscape continues to unfold, it remains to be seen which Japanese language model will establish itself at the front of this dynamic and evolving field.
Niharika is a technical consulting intern at Marktechpost. She is a third-year undergraduate pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in machine learning, data science, and AI, and an avid reader of the latest developments in these fields.