Deep learning is widely used in many areas of AI research and has contributed to major technological advancements. For example, text generation, facial recognition, and speech synthesis applications are all based on deep learning research.
One of the most widely used deep learning packages is PyTorch. It is an open-source package created by Meta AI in 2016 and has since been adopted by many practitioners.
PyTorch has many advantages, including:
- Flexible model architecture
- Native support for CUDA (can use GPU)
- Based on Python
- Lower-level control, which is useful for research and many use cases
- Active development by the maintainers and the community
This article will explore PyTorch to help you get started.
Preparation
Visit the PyTorch installation page and select the configuration that suits your environment. The code below is an example of a CPU-only installation.
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
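After installation, you can run a quick sanity check to confirm that the package imports correctly and to see whether a GPU is available:

import torch

# Show the installed PyTorch version
print(torch.__version__)

# True if a CUDA-capable GPU is visible to PyTorch
print(torch.cuda.is_available())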
With PyTorch ready, let's move on to the core part.
PyTorch Tensor
The tensor is the basic building block of PyTorch. It is similar to a NumPy array but can also live on a GPU. We can create a PyTorch tensor with the following code:
import torch

a = torch.tensor([2, 4, 5])
print(a)
Output>>
tensor([2, 4, 5])
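Because tensors can live on a GPU, we can move them between devices. The snippet below is a minimal sketch that falls back to the CPU when no CUDA device is available:

# Pick the GPU if one is available, otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Move the tensor to the selected device
a = a.to(device)
print(a.device)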
Like a NumPy array, a tensor supports element-wise operations (with broadcasting).
e = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
f = torch.tensor([7, 8, 9])
print(e * f)
Output>>
tensor([[ 7, 16, 27],
        [28, 40, 54]])
It is also possible to perform matrix multiplication.
g = torch.randn(2, 3)
h = torch.randn(3, 2)
print(g @ h)
Output>>
tensor([[-0.8357,  0.0583],
        [-2.7121,  2.1980]])
We can access the Tensor information using the code below.
x = torch.rand(3, 4)
print("Shape:", x.shape)
print("Data type:", x.dtype)
print("Device:", x.device)
Output>>
Shape: torch.Size([3, 4])
Data type: torch.float32
Device: cpu
Training Neural Networks with PyTorch
We can develop a simple model by defining a neural network using the nn.Module class. Let's try it with the code below.
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = torch.nn.functional.relu(self.fc1(x))
        x = self.fc2(x)
        return x

inp = 10
hid = 10
outp = 2

model = SimpleNet(inp, hid, outp)
print(model)
Output>>
SimpleNet(
(fc1): Linear(in_features=10, out_features=10, bias=True)
(fc2): Linear(in_features=10, out_features=2, bias=True)
)
The code above defines a SimpleNet class that inherits from nn.Module and configures the layers. We use nn.Linear for the layers and relu as the activation function.
We could add more layers or use different layer types, such as Conv2d for convolutional neural networks, but we won't use them in this article.
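For illustration only, here is how a hypothetical convolutional layer could be defined; it is not part of the model we train in this article:

# Hypothetical example: a 2D convolution with 3 input channels,
# 16 output channels, and a 3x3 kernel (not used in our model)
conv = torch.nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)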
Next, we will train the SimpleNet model we just developed, using sample tensor data.
import torch

inputs = torch.randn(100, 10)
targets = torch.randint(0, 2, (100,))

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

num_epochs = 100
batch_size = 10

for epoch in range(num_epochs):
    model.train()
    for i in range(0, inputs.size(0), batch_size):
        batch_inputs = inputs[i:i+batch_size]
        batch_targets = targets[i:i+batch_size]

        outputs = model(batch_inputs)
        loss = criterion(outputs, batch_targets)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {round(loss.item(), 4)}')
In the training above, we used random tensor data and initialized the loss function, CrossEntropyLoss. In addition, we used the SGD optimizer to update the model parameters and minimize the loss.
The training loop runs for the given number of epochs, performing the optimization step in each pass. This is the usual deep learning training process.
We can add more steps to the training loop to improve it, such as early stopping, learning rate scheduling, and other techniques.
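As a sketch of one such technique, a learning rate scheduler can be attached to the optimizer we created earlier; StepLR below is just one of the schedulers PyTorch provides, and the batch loop body is elided:

# Decay the learning rate by a factor of 0.1 every 30 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(num_epochs):
    # ... batch loop from the training code above ...
    scheduler.step()  # update the learning rate once per epoch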
Finally, we can evaluate the model we have trained on unseen data. The following code allows us to do so.
from sklearn.metrics import classification_report
model.eval()
test_inputs = torch.randn(20, 10)
test_targets = torch.randint(0, 2, (20,))
with torch.no_grad():
    test_outputs = model(test_inputs)
    _, predicted = torch.max(test_outputs, 1)
print(classification_report(test_targets, predicted))
What happens above is that we switch the model to evaluation mode, which puts layers such as batch normalization and dropout into inference behavior. We also disable gradient calculation to speed up the process.
You can visit the PyTorch Documentation to learn more about what you can do.
Conclusion
In this article, we went over the basics of PyTorch, from creating tensors and performing tensor operations to developing a simple neural network model. The article is introductory, and any beginner should be able to follow it quickly.
Cornellius Yudha Wijaya is a Data Science Assistant Manager and Data Writer. While working full-time at Allianz Indonesia, he loves sharing Python and data tips through social media and writing. Cornellius writes on a variety of AI and machine learning topics.