YuLan-Mini: A 2.42-Billion-Parameter Open, Data-Efficient Language Model with Long-Context Capabilities and Advanced Training Techniques
Large language models (LLMs) built on the Transformer architecture rely heavily on pre-training with large-scale data to predict successive tokens. This ...
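For concreteness, the token-prediction pre-training mentioned above is conventionally formulated as the autoregressive cross-entropy objective below; this is the standard generic formulation, not a detail specific to YuLan-Mini:

\[
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right),
\]

where \(x_1, \dots, x_T\) is a training sequence of tokens and \(p_\theta(\cdot \mid x_{<t})\) is the model's predicted distribution over the vocabulary given the preceding tokens.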