Semiconductors power a wide range of electronic devices and drive development in the telecommunications, automotive, healthcare, renewable energy, and IoT industries. Semiconductor design and manufacturing span two main phases, front-end-of-line (FEOL) and back-end-of-line (BEOL), each of which presents unique challenges. Large language models (LLMs), trained on vast amounts of text with self-supervised learning, can capture rich domain knowledge and assist with tasks such as design rule verification, design generation, and design space exploration in integrated circuit (IC) design. By learning from large IC design and design-rule datasets, LLMs can generate new designs that adhere to specified constraints and are optimized for target performance metrics. However, most models are general-purpose and lack specific knowledge of the semiconductor industry. This gap matters because the domain poses unique problems, such as the complex physics and chemistry of semiconductor devices and processes.
Current LLMs, despite their power, are general-purpose models that require more specialized knowledge for specific tasks in the semiconductor industry. Artificial intelligence (AI) has already improved semiconductor manufacturing, for example in mask optimization and hotspot detection, using machine learning, deep reinforcement learning, and datasets such as LithoBench. Within the industry, domain-specific large language models such as ChipGPT and ChatEDA have outperformed general models on tasks like code generation, debugging, and chatbot assistance. These models have also been evaluated on natural language generation tasks, using expert feedback to improve benchmarks and address the challenges of complex domain-specific assessment.
To bring the power of LLMs to the semiconductor industry, researchers from Aitomatic Inc., FPT Software AI Center, and Tokyo Electron Ltd. carried out a detailed investigation and proposed SemiKong, the first industry-specific LLM for the semiconductor domain, which provides a foundation for developing customized proprietary models. SemiKong 1.0 focuses on building a foundation model with an expert-level understanding of etching problems. The approach trains models on comprehensive domain-specific data, in two stages: pre-training and fine-tuning.
High-quality datasets for the semiconductor domain are scarce. To address this, the researchers curated a large-scale text dataset focused on semiconductor concepts and etching problems, combining pre-training data from technical books, articles, and patents with an instruction dataset of 50,000 questions. GPT-4o-mini handled the formatting, while GPT-4o generated and answered some of the questions. The SemiKong model was trained in three steps. First, it was pre-trained from Llama3 checkpoints to absorb knowledge of the semiconductor industry. It then underwent supervised fine-tuning to improve its ability to handle tasks such as question answering and reasoning. Finally, the model was fine-tuned with quantization to prepare it for real-world use, gaining deeper knowledge of semiconductor manufacturing along the way. The researchers used 8 NVIDIA A100 80GB GPUs to improve performance and training speed.
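As an illustration of the instruction-data preparation step, the following minimal sketch shows how raw Q&A pairs could be wrapped into chat-style records for supervised fine-tuning. The field names and the system prompt are assumptions for demonstration; the paper's actual data schema is not specified here.

```python
# Hypothetical sketch: formatting semiconductor Q&A pairs into chat-style
# instruction records for supervised fine-tuning. The "messages" schema and
# system prompt are assumptions, not the paper's actual format.

def format_instruction_record(question: str, answer: str) -> dict:
    """Wrap one Q&A pair in a minimal chat-template structure."""
    return {
        "messages": [
            {"role": "system",
             "content": "You are an expert in semiconductor etching processes."},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }

raw_pairs = [
    ("What does plasma etching remove from a wafer?",
     "It selectively removes material from the wafer surface using reactive ions."),
]

records = [format_instruction_record(q, a) for q, a in raw_pairs]
print(len(records))  # one record per Q&A pair
```

In practice such records would then be serialized (e.g., to JSONL) and passed through the tokenizer's chat template before training.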
Evaluation of the SemiKong model compared its performance against several criteria: Clarity and Directness (C&D), Practicality and Immediate Usability (PIU), Efficiency and Brevity (E&B), Logical Flow and Coherence (LFC), Communication Expert-to-Expert (CEE), and Use of Examples and Specificity (UES). Experiments showed that fine-tuning alone did not significantly improve performance, since domain-specific knowledge was crucial; combining pre-training with fine-tuning did improve it. The larger 70B-parameter models outperformed the smaller ones, and the SemiKong 70B model excelled across all criteria.
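The multi-criteria comparison described above can be sketched as a simple score-aggregation step. The scores below are invented for illustration (equal weighting across criteria is also an assumption); they are not the paper's reported results.

```python
# Illustrative sketch: aggregating rubric scores across the six evaluation
# criteria used for SemiKong. All numbers are made up for demonstration and
# are NOT the paper's reported results.

CRITERIA = ["C&D", "PIU", "E&B", "LFC", "CEE", "UES"]

def mean_score(scores: dict) -> float:
    """Average a model's per-criterion scores (equal weighting assumed)."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)

models = {
    "base-70B":     {"C&D": 3.1, "PIU": 3.0, "E&B": 3.4,
                     "LFC": 3.2, "CEE": 2.9, "UES": 2.8},
    "SemiKong-70B": {"C&D": 4.2, "PIU": 4.0, "E&B": 4.1,
                     "LFC": 4.3, "CEE": 4.0, "UES": 3.9},
}

best = max(models, key=lambda m: mean_score(models[m]))
print(best)  # model with the highest average score
```

A real evaluation would collect such scores from multiple expert raters per response and average over a held-out question set rather than a single dictionary.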
In summary, the proposed method provides a robust way to integrate LLM technology into the semiconductor industry and achieved strong performance, outperforming the base open-source model. However, SemiKong is in its initial phase, and much work remains. This effort to bring the latest LLM technology into manufacturing can serve as a foundation for future research in the semiconductor domain and change it forever!
Check out the paper and GitHub page. All credit for this research goes to the researchers of this project. Also, don't forget to follow us on <a target="_blank" href="https://twitter.com/Marktechpost">Twitter</a> and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter. Don't forget to join our 55k+ ML SubReddit.
Divyesh is a Consulting Intern at Marktechpost. He is pursuing a BTech in Agricultural and Food Engineering from the Indian Institute of Technology Kharagpur. He is a data science and machine learning enthusiast who wants to integrate these leading technologies into agriculture and solve its challenges.