This is a relatively short article. In it, I will use a real-world scenario to show how Numexpr expressions can be applied to multidimensional NumPy arrays to achieve substantial performance improvements.
There aren't many articles explaining how to use Numexpr and its expressions on multidimensional NumPy arrays, so I hope this one helps you.
Recently, while reviewing some of my previous work, I came across this code snippet:
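import numpy as np

def predict(X, w, b):
    # sigmoid is a helper defined elsewhere in the original project
    z = np.dot(X, w)
    y_hat = sigmoid(z)
    y_pred = np.zeros((y_hat.shape[0], 1))

    # Threshold each probability at 0.5, one row at a time
    for i in range(y_hat.shape[0]):
        if y_hat[i, 0] < 0.5:
            y_pred[i, 0] = 0
        else:
            y_pred[i, 0] = 1
    return y_pred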
This code converts the predicted probabilities of a logistic regression model into class labels: 0 when the probability is below 0.5, and 1 otherwise.
But jeez, who would use a for loop to iterate over a NumPy ndarray?
It is easy to predict that once the data reaches a certain size, this code will not only occupy a lot of memory but also run slowly.
That’s right, the person who wrote this code was me when I was younger.
With a sense of responsibility, I plan to rewrite this code with the Numexpr library today.
Along the way, I'll show you how to use Numexpr and its where expression on multidimensional NumPy arrays to achieve significant performance improvements.
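To give a rough idea of where this is heading, here is a minimal sketch of the same thresholding step written with Numexpr's where expression. The function name `threshold_with_numexpr` and the sample probabilities are assumptions made for illustration, not the final code of this article.

```python
import numexpr as ne
import numpy as np


def threshold_with_numexpr(y_hat):
    """Map probabilities to 0.0 / 1.0 labels without a Python loop.

    Hypothetical helper: mirrors the loop in predict(), where values
    below 0.5 become 0 and everything else becomes 1.
    """
    # where(condition, value_if_true, value_if_false) is evaluated
    # element-wise, so the (n, 1) shape of y_hat is preserved.
    return ne.evaluate(
        "where(y_hat < 0.5, 0.0, 1.0)",
        local_dict={"y_hat": y_hat},
    )


# Made-up probabilities standing in for the output of sigmoid(z)
sample = np.array([[0.1], [0.5], [0.9]])
labels = threshold_with_numexpr(sample)
print(labels.shape, labels.ravel())
```

Passing `local_dict` explicitly is a design choice here: it makes clear which array the expression string refers to, instead of relying on Numexpr looking up variables in the calling frame.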
If you are not familiar with the basic usage of Numexpr, you can refer to this article: