
Python is a dynamically typed language. Therefore, you can create variables without explicitly specifying the data type. And you can always assign a completely different value to the same variable. While this makes things easier for beginners, it also makes it easier to create invalid objects in your Python application.
You can create data classes that let you define fields with type hints, but they do not offer direct support for validating data. Enter Pydantic, a popular data validation and serialization library. Pydantic offers out-of-the-box support for data validation and serialization, which means you can:
- Take advantage of Python type hints to validate fields,
- Use the custom fields and built-in validators offered by Pydantic, and
- Define custom validators as needed.
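To make the earlier point about data classes concrete, here is a minimal sketch using only the standard library. The two-field Employee class below is a simplified stand-in for illustration, not the model built later in this tutorial. It shows that type hints on a data class are not enforced at runtime:

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    age: int


# The type hint says int, but nothing checks it at runtime,
# so this object is created without any error.
emp = Employee(name="John Doe", age="thirty")
print(type(emp.age).__name__)
```

The object is created even though age holds a string, which is exactly the kind of invalid object a validation library is meant to catch.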
In this tutorial, we will model a simple 'Employee' class and validate the values of different fields using Pydantic's data validation functionality. Let us begin!
If you have Python 3.8 or later, you can install Pydantic using pip:
$ pip install pydantic
If you need email validation in your app, you can install the optional email validator dependency when installing Pydantic like this:
$ pip install pydantic[email]
Alternatively, you can run the following command to install the email validator:
$ pip install email-validator
Note: Our example uses email validation, so install the dependency if you want to code along.
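If you are not sure whether the optional dependency is already present, a quick check along these lines may help. This is a small sketch using only the standard library; the helper name has_email_validator is made up for this example, and it assumes the email-validator package is importable as email_validator:

```python
import importlib.util


def has_email_validator() -> bool:
    """Return True if the email_validator module can be found."""
    return importlib.util.find_spec("email_validator") is not None


print("email-validator installed:", has_email_validator())
```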
Now let's create a simple Employee class. First, we create a class that inherits from Pydantic's BaseModel class. The various fields and expected types are specified as shown:
# main.py
from pydantic import BaseModel, EmailStr

class Employee(BaseModel):
    name: str
    age: int
    email: EmailStr
    department: str
    employee_id: str
Note that we have specified that the email should be of the EmailStr type that Pydantic supports, instead of a regular Python string. This is because not all valid strings are valid emails.
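To see what the type hints on a model buy you, here is a minimal sketch. It uses a hypothetical two-field Person model rather than the Employee class above, so that it runs without the email validator dependency:

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


# A numeric string is coerced to an int by default.
ok = Person(name="John Doe", age="30")
print(ok.age)

# A string that cannot be read as an integer fails validation.
try:
    Person(name="Jane Smith", age="thirty")
except ValidationError as e:
    failed_fields = [err["loc"] for err in e.errors()]
    print(failed_fields)
```

Unlike a plain data class, the model refuses to create an object whose age cannot be interpreted as an integer.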
Because the Employee class is simple, let's add validation for the following fields:
- email: must be a valid email. Specifying the EmailStr type takes care of this, and we run into errors when creating objects with an invalid email.
- employee_id: must be a valid employee ID. We will implement custom validation for this field.
Custom validation implementation
For this example, let's say that employee_id must be a string of length 6 containing only alphanumeric characters.
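Before wiring this rule into the model, it can help to look at it as plain Python. The sketch below uses a hypothetical helper, is_valid_employee_id, that exists only for illustration, and tries the rule on a few sample IDs:

```python
def is_valid_employee_id(value: str) -> bool:
    """Check the rule: exactly 6 characters, all alphanumeric."""
    return value.isalnum() and len(value) == 6


# "EMP0034" is too long; "EMP-01" has the right length but contains a hyphen.
for candidate in ["EMP001", "EMP0034", "EMP-01"]:
    print(candidate, is_valid_employee_id(candidate))
```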
We can use the @validator decorator with the employee_id field as the argument and define the validate_employee_id method as shown:
# main.py
from pydantic import BaseModel, EmailStr, validator
...
    @validator("employee_id")
    def validate_employee_id(cls, v):
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v
This method now checks whether the employee_id is valid for the Employee objects we try to create.
At this point your script should look like this:
# main.py
from pydantic import BaseModel, EmailStr, validator

class Employee(BaseModel):
    name: str
    age: int
    email: EmailStr
    department: str
    employee_id: str

    @validator("employee_id")
    def validate_employee_id(cls, v):
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v
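A note on versions: in Pydantic v2, the validator decorator still works but is deprecated in favor of field_validator. If you are on v2, the same check could be written roughly as follows. This is a sketch that assumes Pydantic v2 is installed, and it trims the model to two fields so that it runs without the email validator dependency:

```python
from pydantic import BaseModel, ValidationError, field_validator


class Employee(BaseModel):
    name: str
    employee_id: str

    @field_validator("employee_id")
    @classmethod
    def validate_employee_id(cls, v: str) -> str:
        # Same rule as above: exactly 6 alphanumeric characters.
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v


try:
    Employee(name="Alice Brown", employee_id="EMP0034")
except ValidationError as e:
    print(e.errors()[0]["msg"])
```

The rest of this tutorial keeps the validator decorator as originally written.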
In practice, it is very common to parse JSON API responses into data structures such as Python dictionaries. Let's say we have a file 'employees.json' (in the current directory) with the following records:
# employees.json
[
    {
        "name": "John Doe",
        "age": 30,
        "email": "john.doe@example.com",
        "department": "Engineering",
        "employee_id": "EMP001"
    },
    {
        "name": "Jane Smith",
        "age": 25,
        "email": "jane.smith@example.com",
        "department": "Marketing",
        "employee_id": "EMP002"
    },
    {
        "name": "Alice Brown",
        "age": 35,
        "email": "invalid-email",
        "department": "Finance",
        "employee_id": "EMP0034"
    },
    {
        "name": "Dave West",
        "age": 40,
        "email": "dave.west@example.com",
        "department": "HR",
        "employee_id": "EMP005"
    }
]
We can see that in the third record, corresponding to 'Alice Brown', two fields are invalid: the email and the employee_id.
Because we have specified that the email must be of type EmailStr, the email string will be validated automatically. We have also added the validate_employee_id class method to check whether objects have a valid employee ID.
Now let's add the code to parse the JSON file and create employee objects (we will use the built-in json module for this). We also import the ValidationError class from Pydantic. In essence, we try to create objects, handle ValidationError exceptions when data validation fails, and print the errors:
# main.py
import json
from pydantic import BaseModel, EmailStr, ValidationError, validator
...
# Load and parse the JSON data
with open("employees.json", "r") as f:
    data = json.load(f)

# Validate each employee record
for record in data:
    try:
        employee = Employee(**record)
        print(f"Valid employee record: {employee.name}")
    except ValidationError as e:
        print(f"Invalid employee record: {record['name']}")
        print(f"Errors: {e.errors()}")
When you run the script, you should see output similar to this:
Output >>>
Valid employee record: John Doe
Valid employee record: Jane Smith
Invalid employee record: Alice Brown
Errors: [{'type': 'value_error', 'loc': ('email',), 'msg': 'value is not a valid email address: The email address is not valid. It must have exactly one @-sign.', 'input': 'invalid-email', 'ctx': {'reason': 'The email address is not valid. It must have exactly one @-sign.'}}, {'type': 'value_error', 'loc': ('employee_id',), 'msg': 'Value error, Employee ID must be exactly 6 alphanumeric characters', 'input': 'EMP0034', 'ctx': {'error': ValueError('Employee ID must be exactly 6 alphanumeric characters')}, 'url': 'https://errors.pydantic.dev/2.6/v/value_error'}]
Valid employee record: Dave West
As expected, only the record corresponding to 'Alice Brown' is not a valid employee object. Zooming in on the relevant part of the output, you can see a detailed message explaining why the email and employee_id fields are invalid.
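The raw list of error dictionaries is dense. One way to make it easier to read is to keep only the field location and the message for each error. The sketch below uses a hypothetical helper, summarize_errors, and a hand-copied subset of the output above as sample data, so it runs without Pydantic:

```python
def summarize_errors(errors):
    """Turn a list of error dicts into 'field: message' strings."""
    return [
        f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
        for err in errors
    ]


# Sample data copied from the output above, keeping only 'loc' and 'msg'.
sample_errors = [
    {
        "loc": ("email",),
        "msg": "value is not a valid email address: The email address is not valid. It must have exactly one @-sign.",
    },
    {
        "loc": ("employee_id",),
        "msg": "Value error, Employee ID must be exactly 6 alphanumeric characters",
    },
]

for line in summarize_errors(sample_errors):
    print(line)
```

In the script, you could pass e.errors() to such a helper inside the except block instead of printing the full list.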
Here is the complete code:
# main.py
import json
from pydantic import BaseModel, EmailStr, ValidationError, validator

class Employee(BaseModel):
    name: str
    age: int
    email: EmailStr
    department: str
    employee_id: str

    @validator("employee_id")
    def validate_employee_id(cls, v):
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v

# Load and parse the JSON data
with open("employees.json", "r") as f:
    data = json.load(f)

# Validate each employee record
for record in data:
    try:
        employee = Employee(**record)
        print(f"Valid employee record: {employee.name}")
    except ValidationError as e:
        print(f"Invalid employee record: {record['name']}")
        print(f"Errors: {e.errors()}")
That's all for this tutorial! This is an introductory tutorial to Pydantic. I hope you have learned the basics of modeling your data and using the built-in and custom validations that Pydantic offers. All code used in this tutorial is on GitHub.
Next, you can try using Pydantic in your Python projects and also explore the serialization capabilities. Happy coding!
Bala Priya C. is a developer and technical writer from India. He enjoys working at the intersection of mathematics, programming, data science, and content creation. His areas of interest and expertise include DevOps, data science, and natural language processing. He likes to read, write, code and drink coffee! Currently, he is working to learn and share his knowledge with the developer community by creating tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource descriptions and coding tutorials.