Vision Foundation Models (VFMs) pretrained on massive datasets exhibit impressive performance on various downstream tasks, especially with limited labeled target data. However, due to their high inference compute cost, these models cannot be deployed for many real-world applications. Motivated by this, we pose the following important question: "How can we leverage the knowledge of a large VFM to train a small task-specific model for a new target task with limited labeled training data?", and propose a simple task-oriented knowledge transfer approach as a highly effective solution to this problem. Our experimental results on five target tasks show that the proposed approach outperforms task-agnostic VFM distillation, web-scale CLIP pretraining, supervised ImageNet pretraining, and self-supervised DINO pretraining by up to 11.6%, 22.1%, 13.7%, and 29.8%, respectively. Furthermore, the proposed approach also demonstrates up to 9x, 4x, and 15x reductions in pretraining compute cost compared to task-agnostic VFM distillation, ImageNet pretraining, and DINO pretraining, respectively, while outperforming them. We also show that the dataset used for knowledge transfer has a significant effect on final target task performance, and introduce a retrieval-augmented knowledge transfer strategy that uses web-scale image retrieval to curate effective transfer sets.
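To make the two ideas in the abstract concrete, the following is a minimal PyTorch sketch of task-oriented knowledge transfer. The abstract does not specify the exact training objective, so this sketch assumes the student is optimized jointly on (a) a supervised target-task loss over the small labeled set and (b) a feature-matching distillation loss against the frozen VFM over an unlabeled transfer set; all names (`student`, `proj`, `vfm`, `task_head`, `alpha`) are hypothetical illustration choices, not the paper's confirmed recipe.

```python
import torch
import torch.nn.functional as F

def transfer_step(student, proj, vfm, task_head, labeled_batch, transfer_batch,
                  optimizer, alpha=1.0):
    """One combined update: supervised task loss + VFM feature distillation.

    Assumed setup (hypothetical, not the paper's confirmed objective):
      student   - small task-specific backbone being trained
      proj      - linear head mapping student features to the VFM's width
      vfm       - frozen large vision foundation model (e.g. a CLIP/DINO encoder)
      task_head - prediction head for the target task
    """
    x_l, y = labeled_batch            # small labeled target-task batch
    x_t = transfer_batch              # unlabeled transfer-set batch

    # (a) Supervised target-task loss on the limited labeled data.
    task_loss = F.cross_entropy(task_head(student(x_l)), y)

    # (b) Distillation: align student features with frozen VFM features
    # on the transfer set, using cosine similarity as the matching loss.
    with torch.no_grad():
        teacher_feat = vfm(x_t)
    student_feat = proj(student(x_t))
    distill_loss = 1.0 - F.cosine_similarity(student_feat, teacher_feat, dim=-1).mean()

    loss = task_loss + alpha * distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return task_loss.item(), distill_loss.item()
```

Similarly, a hedged sketch of the retrieval-augmented transfer-set curation: embed the labeled target images with the VFM and retrieve their nearest neighbors from a large unlabeled web-scale pool. Here `target_feats` (M x D) and `pool_feats` (N x D) are assumed to be L2-normalized embedding tensors; the retrieval mechanics are an assumption based on the abstract's one-line description.

```python
def curate_transfer_set(target_feats, pool_feats, k_per_query=100):
    """Select a transfer set by k-NN retrieval in VFM embedding space."""
    sims = target_feats @ pool_feats.T           # cosine similarities (M x N)
    topk = sims.topk(k_per_query, dim=1).indices # nearest pool images per query
    return torch.unique(topk.flatten())          # deduplicated pool indices
```

In this sketch, the curated indices would define the unlabeled `transfer_batch` stream consumed by `transfer_step`, which is one plausible way the retrieval strategy and the knowledge transfer objective fit together.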