Profiling Python code is useful for understanding how the code works and identifying opportunities for optimization. You've probably profiled your Python scripts for time-related metrics, measuring the execution times of specific sections of code.
But memory profiling, which helps you understand memory allocation and deallocation during execution, is equally important: it can help you identify memory leaks, inefficient resource utilization, and potential scaling issues.
In this tutorial, we'll explore profiling Python code for memory usage using the Python package memory-profiler.
Let's start by installing the memory profiler Python package using pip:
pip3 install memory-profiler
Note: It's best to install memory-profiler in a dedicated virtual environment for the project rather than in your global environment. We'll also use the plotting capabilities of memory-profiler to plot memory usage over time, which requires matplotlib. So make sure you also have matplotlib installed in the project's virtual environment.
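If matplotlib isn't already available in the environment, you can install it with pip as well:
pip3 install matplotlib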
Let's create a Python script (say, main.py) with a function process_strs that does the following:
- The function creates two very long Python strings, str1 and str2, and concatenates them.
- The keyword argument reps controls the number of times the hardcoded strings are repeated to create str1 and str2. We give it a default value of 10**6, which is used if the caller does not specify a value for reps.
- We then explicitly delete str2.
- The function returns the concatenated string str3.
# main.py
from memory_profiler import profile

@profile
def process_strs(reps=10**6):
    # Build two long strings by repeating the literals reps times
    str1 = 'python' * reps
    str2 = 'programmer' * reps
    # Concatenate them into a third string
    str3 = str1 + str2
    # Explicitly delete str2 to free some memory
    del str2
    return str3

process_strs(reps=10**7)
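You can run the script with the Python interpreter as usual; since the script imports the profile decorator itself, no extra flags are needed (this assumes a python3 executable on your path):
python3 main.py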
Running the script prints a line-by-line profiling table for process_strs, with columns for the line number, the memory usage, the increment relative to the previous line, and the line contents.
As seen in the output, the memory used grows with each subsequent string creation, and the string deletion step frees some of the used memory.
Running the mprof command
Instead of running the Python script as shown above, you can also run the mprof command like this:
mprof run --python main.py
When you run this command, you should see a .dat file containing the memory usage data. You'll get a new .dat file, identified by a timestamp in its name, every time you run the mprof command.
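mprof also ships helper subcommands for managing these files: mprof list shows the recorded .dat files, and mprof clean removes them. For example:
mprof list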
Plotting memory usage
Sometimes it's easier to analyze memory usage from a plot rather than looking at the numbers. Recall that matplotlib is a required dependency for these plotting capabilities.
You can use the mprof plot command to plot the data in the .dat file and save it to an image file (here, output.png):
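mprof plot -o output.png
The -o flag writes the plot to the given image file instead of displaying it interactively.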
By default, mprof plot uses the data from the most recent run of the mprof command. You can also see the run's timestamp mentioned in the plot.
Logging the memory usage profile to a log file
Alternatively, you can log the memory usage statistics to a preferred log file in the working directory. Here, we create a file handler mem_logs for the log file and set the stream argument of the @profile decorator to this file handler:
# main.py
from memory_profiler import profile

# Open the log file in append mode so each run adds a new table
mem_logs = open('mem_profile.log', 'a')

@profile(stream=mem_logs)
def process_strs(reps=10**6):
    str1 = 'python' * reps
    str2 = 'programmer' * reps
    str3 = str1 + str2
    del str2
    return str3

process_strs(reps=10**7)
When you run the script, you should see the mem_profile.log file in your working directory, containing the line-by-line profiling table for process_strs.
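Since we open mem_profile.log in append mode, each run of the script appends a new profiling table to the file. You can take a quick look at it from the terminal:
cat mem_profile.log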
You can also use the memory_usage() function to understand the resources required for a specific function to execute, sampled at regular time intervals.
The memory_usage function takes the function to profile, together with its positional and keyword arguments, as a tuple.
Here, we'd like to find the memory usage of the process_strs function with the keyword argument reps set to 10**7. We also set the sampling interval to 0.1 seconds:
# main.py
from memory_profiler import memory_usage

def process_strs(reps=10**6):
    str1 = 'python' * reps
    str2 = 'programmer' * reps
    str3 = str1 + str2
    del str2
    return str3

# memory_usage calls process_strs for us, sampling memory every 0.1 s
mem_used = memory_usage((process_strs, (), {'reps': 10**7}), interval=0.1)
print(mem_used)
Here is the corresponding result:
Output >>>
[21.21875, 21.71875, 147.34375, 277.84375, 173.93359375]
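The values in the list are memory usage samples in MiB. If you're only interested in the peak, you can take max(mem_used) from the list above; memory_usage() also accepts a max_usage parameter for this. Here's a minimal sketch, reusing the process_strs function defined above (depending on the package version, the result is a single float or a one-element list):

from memory_profiler import memory_usage

# Return only the maximum memory sample instead of the full list
peak = memory_usage((process_strs, (), {'reps': 10**7}), interval=0.1, max_usage=True)
print(peak)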
You can also adjust the sampling interval depending on how often you want memory usage to be captured. As an example, let's set the interval to 0.01 seconds, which gives us a more granular view of the memory used:
# main.py
from memory_profiler import memory_usage

def process_strs(reps=10**6):
    str1 = 'python' * reps
    str2 = 'programmer' * reps
    str3 = str1 + str2
    del str2
    return str3

# Sample memory usage every 0.01 s for a more granular view
mem_used = memory_usage((process_strs, (), {'reps': 10**7}), interval=0.01)
print(mem_used)
You should be able to see a similar result:
Output >>>
[21.40234375, 21.90234375, 33.90234375, 46.40234375, 59.77734375, 72.90234375, 85.65234375, 98.40234375, 112.65234375, 127.02734375, 141.27734375, 155.65234375, 169.77734375, 184.02734375, 198.27734375, 212.52734375, 226.65234375, 240.40234375, 253.77734375, 266.52734375, 279.90234375, 293.65234375, 307.40234375, 321.27734375, 227.71875, 174.1171875]
In this tutorial, we learned how to get started with profiling Python scripts for memory usage.
Specifically, we learned how to do this using the memory-profiler package. We used the @profile decorator and the memory_usage() function to get the memory usage of a sample Python script, and we learned how to use capabilities such as plotting memory usage and capturing the statistics in a log file.
If you are interested in profiling your Python script for runtimes, consider reading Profiling Python Code Using timeit and cProfile.
Bala Priya C is a developer and technical writer from India. She enjoys working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She likes reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more.