We study the problem of locally private mean estimation of high-dimensional vectors on the Euclidean ball. Existing algorithms for this problem incur suboptimal errors or have high communication and/or execution time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that produces algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a factor of 1+o(1). Our framework is deceptively simple: each randomizer projects its input into a low-dimensional random subspace, normalizes the result, and then runs an optimal algorithm like PrivUnitG on the low-dimensional space. Furthermore, we show that by properly mapping random projection matrices across devices, we can achieve fast server execution time. We mathematically analyze the error of the algorithm in terms of properties of the random projections and study two instances. Finally, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically achieve almost the same utility as the optima, while having significantly lower computational and communication cost.