Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which are indeed calibrated. However, typical machine learning models are trained to approximately minimize loss over restricted families of predictors that are unlikely to contain the ground truth. Under what circumstances does optimizing a proper loss over a restricted family yield calibrated models? What precise calibration guarantees does it give? In this work we provide a rigorous answer to these questions. We replace the global optimality condition with a local optimality condition stipulating that the (proper) loss of the predictor cannot be reduced much by post-processing its predictions with a certain family of Lipschitz functions. We show that any predictor satisfying this local optimality condition is smoothly calibrated in the sense of Kakade and Foster (2008) and Błasiok et al. (2023). Local optimality is plausibly satisfied by well-trained DNNs, which suggests an explanation for why they are calibrated from proper loss minimization alone. Finally, we show that the connection between local optimality and calibration error goes both ways: nearly calibrated predictors are also nearly locally optimal.
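To make the two notions concrete, one natural formalization, in the spirit of the definitions used by Błasiok et al. (2023) (the notation and normalizations below are an illustrative sketch, not the paper's exact statements), is the post-processing gap
\[
  \mathrm{pGap}(f) \;=\; \mathbb{E}\big[\ell(y, f(x))\big] \;-\; \inf_{\eta \in \mathcal{K}_{\mathrm{Lip}}} \mathbb{E}\Big[\ell\big(y,\; f(x) + \eta(f(x))\big)\Big],
\]
where \(\mathcal{K}_{\mathrm{Lip}}\) denotes a family of Lipschitz post-processing updates of the predictions, and the smooth calibration error
\[
  \mathrm{smCE}(f) \;=\; \sup_{\substack{w : [0,1] \to [-1,1] \\ w \text{ 1-Lipschitz}}} \mathbb{E}\big[\, w(f(x))\,(y - f(x)) \,\big].
\]
In this notation, the results described above say that the two quantities control each other: a small post-processing gap implies a small smooth calibration error, and nearly calibrated predictors have a small post-processing gap.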