Intelligent scaling: Accelerating pre-training of large language models with small model initialization
This paper was accepted into the Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2024. The pre-training phase ...