We present a foundation model for zero-shot metric monocular depth estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high-frequency detail. The predictions are metric, with absolute scale, and do not rely on the availability of metadata such as camera intrinsics. The model is also fast: it produces a 2.25-megapixel depth map in 0.3 seconds on a standard GPU. These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision transformer for dense prediction, a training protocol that combines real and synthetic datasets to achieve high metric accuracy alongside fine boundary tracing, dedicated evaluation metrics for boundary accuracy in estimated depth maps, and state-of-the-art focal length estimation from a single image. Extensive experiments analyze specific design choices and demonstrate that Depth Pro outperforms prior work along multiple dimensions.