Pre-trained model representations such as Bidirectional Encoder Representations from Transformers (BERT) and Hidden-Unit BERT (HuBERT) have helped achieve state-of-the-art performance in dimensional speech emotion recognition. However, both BERT and HuBERT produce fairly high-dimensional representations, and neither model was trained with the emotion recognition task in mind. Such high-dimensional representations lead to speech emotion models with large parameter counts and, consequently, high memory and computational costs. In this work, we investigate selecting representations based on their task saliency, which can reduce model complexity without sacrificing dimensional emotion estimation performance. Furthermore, we investigate modeling label uncertainty in the form of rater sentiment variation, and demonstrate that such information can improve the generalizability and robustness of the model. Finally, we analyze the robustness of the speech emotion model to acoustic degradation and observe that selecting salient representations from pre-trained models, together with modeling label uncertainty, improves generalization to unseen data containing acoustic distortions in the form of ambient noise and reverberation.
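To make the idea of task-saliency-based selection concrete, the following is a minimal sketch (not the paper's actual method): it ranks embedding dimensions by the absolute correlation of each dimension with the target label, a simple saliency proxy, and keeps the top-k. The data, dimensionality, and scoring rule here are illustrative assumptions only.

```python
import numpy as np

def select_salient_dims(embeddings, labels, k):
    """Rank embedding dimensions by absolute Pearson correlation with the
    label (a simple saliency proxy) and return the indices of the top-k."""
    X = (embeddings - embeddings.mean(0)) / (embeddings.std(0) + 1e-8)
    y = (labels - labels.mean()) / (labels.std() + 1e-8)
    saliency = np.abs(X.T @ y) / len(y)      # |correlation| per dimension
    top = np.argsort(saliency)[::-1][:k]     # indices of most salient dims
    return np.sort(top)

# Toy data: 768-dim "HuBERT-like" features where only dims 5 and 42
# actually carry label information (hypothetical setup for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 768))
y = 2.0 * X[:, 5] - 1.5 * X[:, 42] + 0.1 * rng.normal(size=200)

keep = select_salient_dims(X, y, k=8)
print(keep)  # the informative dimensions 5 and 42 should appear here
```

A downstream emotion regressor would then be trained only on `embeddings[:, keep]`, shrinking the input (and hence parameter count) from 768 to k dimensions.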