Development & AI | Alper Akgun

Neural networks using Python, Pytorch & MNIST

September, 2023

In this post, I explore how a neural network can be trained to recognize handwritten digits. My goal is to gain practical knowledge: understanding and implementing a small neural network with PyTorch, trained on the MNIST dataset.

Importing the PyTorch libraries:


import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
            

Creating the neural net using PyTorch; we have one input layer, one hidden layer with a ReLU activation, and one output layer that produces one score per digit class.


class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
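
As a quick sanity check of this architecture, the sketch below builds the same network and passes a batch of random values through it. The batch of 4 random inputs is made up for illustration; the layer sizes (784, 128, 10) are the ones used in the training code in this post.

```python
import torch
import torch.nn as nn


class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))


model = NeuralNet(input_size=28 * 28, hidden_size=128, num_classes=10)

# A fake batch of 4 flattened "images" (random values, illustration only)
dummy_batch = torch.rand(4, 28 * 28)
logits = model(dummy_batch)
output_shape = tuple(logits.shape)  # one row per image, one score per digit

# Count trainable parameters: weights and biases of both linear layers
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(output_shape)  # (4, 10)
print(num_params)    # 784*128 + 128 + 128*10 + 10 = 101770
```

Most of the roughly 100k parameters sit in the first linear layer, which connects every pixel to every hidden unit.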
            

Training; we download the MNIST training data from the torchvision datasets. We use the cross-entropy loss and the Adam optimizer with a learning rate of 0.001.


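# Hyperparameters
learning_rate = 0.001
input_size = 28 * 28  # each MNIST image is 28x28 pixels, flattened to 784 values
hidden_size = 128
num_classes = 10  # digits 0-9
# Assumed values: the code below uses batch_size and num_epochs without defining them
batch_size = 64
num_epochs = 5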

train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)

# Initialize the model
model = NeuralNet(input_size, hidden_size, num_classes)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Forward pass
        images = images.reshape(-1, 28 * 28)
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print(f'Epoch [{epoch + 1}/{num_epochs}], Step [{i + 1}/{total_step}], Loss: {loss.item():.4f}')
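
To make the loss less of a black box: `nn.CrossEntropyLoss` takes the raw output scores (logits) and integer labels, applies log-softmax internally, and averages the negative log-probability of the correct class. The sketch below checks this on made-up logits and labels (they are not MNIST data).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for 3 samples and 4 classes, with made-up target labels
logits = torch.tensor([[2.0, 0.5, -1.0, 0.0],
                       [0.1, 0.2, 3.0, -0.5],
                       [1.0, 1.0, 1.0, 1.0]])
labels = torch.tensor([0, 2, 3])

# Built-in loss, as used in the training loop
builtin_loss = nn.CrossEntropyLoss()(logits, labels)

# Manual version: log-softmax, pick the log-probability of the true class, average
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(len(labels)), labels].mean()

print(builtin_loss.item(), manual_loss.item())
```

The two values should agree up to floating-point error. This is also why the network has no softmax layer at the end: the loss expects unnormalized scores.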

Testing; we evaluate the trained model on the MNIST test set and report its accuracy.


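test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transforms.ToTensor())
# Batch the test set so evaluation handles many images at once (batch size of 1000 is an assumed value)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1000, shuffle=False)

model.eval()  # switch to evaluation mode
with torch.no_grad():  # gradients are not needed for evaluation
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28 * 28)
        outputs = model(images)
        # The predicted class is the index of the largest output score
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()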

    accuracy = 100 * correct / total
    print(f'Test Accuracy: {accuracy:.2f}%')
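
The accuracy computation boils down to taking the index of the largest score in each row and comparing it with the label. Here is a small sketch on hand-written scores (hypothetical values, with three classes instead of ten):

```python
import torch

# Hypothetical scores for 4 samples and 3 classes
outputs = torch.tensor([[0.1, 2.0, 0.3],
                        [1.5, 0.2, 0.1],
                        [0.0, 0.4, 0.9],
                        [0.7, 0.6, 0.1]])
labels = torch.tensor([1, 0, 2, 1])

# torch.max along dim 1 returns (largest values, their indices)
_, predicted = torch.max(outputs, 1)

correct = (predicted == labels).sum().item()
accuracy = 100 * correct / labels.size(0)

print(predicted.tolist())  # [1, 0, 2, 0]
print(f'{accuracy:.2f}%')  # 75.00%
```

The last sample is the only miss: its largest score is at index 0 while its label is 1.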

We can also run inference on a single image and draw it using matplotlib. The example below picks a random image from the test set and predicts its digit.


import random
import matplotlib.pyplot as plt

# Choose a random digit image from the test dataset
random_idx = random.randint(0, len(test_dataset) - 1)
image, label = test_dataset[random_idx]

# Display the chosen digit image
plt.imshow(image.squeeze().numpy(), cmap='gray')
plt.title(f"True Label: {label}")
plt.show()

# Perform inference using the model
image = image.reshape(-1, 28 * 28)
model.eval()
with torch.no_grad():
    output = model(image)
    _, predicted = torch.max(output.data, 1)

print(f"Predicted Digit: {predicted.item()}")
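
The model outputs raw scores rather than probabilities. If a confidence value is wanted alongside the predicted digit, one option is to apply softmax to the output. The sketch below uses a made-up score vector in place of a real model output, so that it runs on its own.

```python
import torch

# Made-up scores for the 10 digit classes, shaped (1, 10) like a single-image model output
output = torch.tensor([[0.2, -1.0, 0.5, 4.0, 0.0, 1.0, -0.5, 0.3, 0.8, -2.0]])

# Softmax turns the scores into probabilities that sum to 1
probabilities = torch.softmax(output, dim=1)
confidence, predicted = torch.max(probabilities, 1)

print(f"Predicted Digit: {predicted.item()} (confidence {confidence.item():.2f})")
```

Softmax does not change which class has the largest value, so the predicted digit is the same as with the raw scores.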