Tensoic has recently introduced Kannada Llama (Kan-LLaMA) to address the limitations of large language models (LLMs), specifically their proprietary nature, their computational requirements, and the barriers these pose to broader contributions from the research community. The authors strongly emphasize the importance of open models in facilitating innovation in natural language processing (NLP) and machine translation. Despite the success of models like Meta's LLaMA 2, native support for languages other than English remains limited, requiring an expansion of the model's linguistic capabilities.
Today's LLMs, while impressive, often pose challenges because of their closed nature and the substantial resources needed for training and deployment. The paper presents Kannada Llama as a solution, aiming to adapt Llama-2 effectively for lower-resource Indian languages, especially Kannada. The approach modifies the model vocabulary with a SentencePiece tokenizer, uses Low-Rank Adaptation (LoRA) for efficient training, and fine-tunes the model on conversation-specific datasets to improve its dialogue capabilities, with an emphasis on releasing the model weights, datasets, and documentation. A sketch of the vocabulary-extension step appears below.
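To make the vocabulary-extension idea concrete, here is a minimal sketch in Python: train a SentencePiece model on raw Kannada text, then add its pieces to the base Llama-2 tokenizer. The corpus path, the 20,000-piece vocabulary size, and the use of `add_tokens` are illustrative assumptions; the authors' actual merge may operate on the SentencePiece model proto directly.

```python
# Sketch: extend the Llama-2 vocabulary with Kannada subword pieces.
import sentencepiece as spm
from sentencepiece import sentencepiece_model_pb2 as sp_pb2
from transformers import LlamaTokenizer

# 1. Train a SentencePiece BPE model on raw Kannada text (hypothetical corpus path).
spm.SentencePieceTrainer.train(
    input="kannada_corpus.txt",
    model_prefix="kannada_sp",
    vocab_size=20000,          # assumed size of the added Kannada vocabulary
    model_type="bpe",
    character_coverage=1.0,    # keep all Kannada characters
)

# 2. Load the base Llama-2 tokenizer and the newly trained Kannada model.
base_tok = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
kn_model = sp_pb2.ModelProto()
with open("kannada_sp.model", "rb") as f:
    kn_model.ParseFromString(f.read())

# 3. Add every Kannada piece that Llama-2 does not already know.
new_pieces = [p.piece for p in kn_model.pieces
              if p.piece not in base_tok.get_vocab()]
base_tok.add_tokens(new_pieces)
base_tok.save_pretrained("llama2-kannada-tokenizer")
print(f"added {len(new_pieces)} Kannada tokens")
```

Because the tokenizer now has more entries than the base model's embedding table, the model's embeddings must be resized before training, which the next sketch handles.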
The proposed method extends the Llama-2 vocabulary for efficient processing of Kannada text. A SentencePiece tokenizer is trained on a Kannada text corpus and integrated with the existing Llama-2 tokenizer. The researchers use Low-Rank Adaptation (LoRA) during pre-training, which freezes the pre-trained model weights and sharply reduces the number of trainable parameters, making continued training of the LLM computationally affordable. Pre-training is performed on around 600 million Kannada tokens from the CulturaX dataset using 80GB Nvidia A100 instances and takes around 50 hours at an estimated cost of $170. A sketch of the LoRA setup follows.
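The following is an illustrative LoRA setup using the Hugging Face PEFT library. The rank, alpha, dropout, target modules, and the 52,000-entry extended vocabulary size are common defaults assumed for illustration, not values reported by Tensoic; only the frozen-base, low-rank-adapter idea comes from the article.

```python
# Sketch: continued pre-training of Llama-2 with LoRA adapters.
import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model

model = LlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,   # a 7B model in bf16 fits on a single 80GB A100
)
# The extended Kannada tokenizer needs a larger embedding matrix (size assumed).
model.resize_token_embeddings(52000)

lora_cfg = LoraConfig(
    r=16,                          # low-rank dimension (assumed)
    lora_alpha=32,                 # scaling factor (assumed)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```

Because the base weights stay frozen, only the adapter parameters (a small fraction of the 7B total) receive gradients, which is what keeps the reported training cost in the $170 range.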
In conclusion, the article addresses the challenges associated with LLMs, emphasizing the importance of open-source models in fostering innovation. The introduction of Kannada Llama represents a concerted effort to broaden linguistic coverage, especially for lower-resource Indian languages. The comprehensive approach, combining vocabulary extension, low-rank adaptation, and conversational fine-tuning, offers a well-rounded way to address the limitations of existing models. The commitment to open model weights, and collaboration with organizations like Microsoft to make LLMs more accessible for research and public use, reflects broader goals and contributes to the development of next-generation language models.
Pragati Jhunjhunwala is a Consulting Intern at MarktechPost. She is currently pursuing a B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a technology enthusiast with a keen interest in data science software and applications. She is always reading about advancements in different fields of AI and ML.