We show that a GPT-3 model can learn to express uncertainty about its own answers in natural language, without the use of model logits. When asked a question, the model generates both an answer and a level of confidence (e.g., “90% confidence” or “high confidence”). These confidence levels map to probabilities that are well calibrated. The model also remains moderately calibrated under distribution shift, and is sensitive to uncertainty in its own answers rather than imitating human examples. To our knowledge, this is the first time a model has been shown to express calibrated uncertainty about its own answers in natural language. To test calibration, we introduce the CalibratedMath suite of tasks. We compare the calibration of uncertainty expressed in words (“verbalized probability”) with uncertainty extracted from model logits. Both kinds of uncertainty are capable of generalizing calibration under distribution shift. We also provide evidence that GPT-3’s ability to generalize calibration depends on pre-trained latent representations that correlate with epistemic uncertainty over its own answers.
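
To make the notion of “well calibrated” concrete, the following is a minimal sketch (not the paper’s implementation) of how verbalized confidences could be scored: parsed confidence values are bucketed and each bucket’s mean stated confidence is compared with its empirical accuracy, i.e. a standard expected calibration error. The function name and the toy records are hypothetical and for illustration only.

```python
# Minimal sketch (assumed setup, not the paper's code): score how well
# verbalized confidences, parsed into probabilities, match empirical accuracy
# using expected calibration error (ECE) over confidence buckets.
from typing import List, Tuple


def expected_calibration_error(
    records: List[Tuple[float, bool]], n_buckets: int = 10
) -> float:
    """Average |mean stated confidence - accuracy| per bucket, weighted by bucket size."""
    buckets: List[List[Tuple[float, bool]]] = [[] for _ in range(n_buckets)]
    for confidence, correct in records:
        idx = min(int(confidence * n_buckets), n_buckets - 1)
        buckets[idx].append((confidence, correct))

    total = len(records)
    ece = 0.0
    for bucket in buckets:
        if not bucket:
            continue
        mean_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_conf - accuracy)
    return ece


# Hypothetical outputs: (verbalized confidence parsed to a probability, answer correct?)
records = [(0.9, True), (0.9, True), (0.9, False), (0.6, True), (0.6, False)]
print(f"ECE = {expected_calibration_error(records):.3f}")
```

Under this kind of metric, a model is well calibrated if answers labeled “90% confidence” are in fact correct about 90% of the time, and so on across confidence levels.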