Getting started with PyTorch for Deep Learning (Part 3.5: PyTorch Sequential)

This is Part 3.5 of the tutorial series. Please also see the other parts (Part 1, Part 2, Part 3).

In Part 3 of this series we built a convolutional neural network to classify MNIST digits by defining a new class called Net that extended nn.Module. We then defined the different components of our network in the initializer and connected them together by chaining functions in the forward() method. This is the most common way of defining a network in PyTorch, and it also offers the greatest flexibility, since ordinary Tensor operations can be included as well.
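
As a reminder, the Part 3 approach looked roughly like this (a condensed sketch of that class, with layer sizes matching the Sequential version below):

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # the components are defined in the initializer...
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d(0.25)
        self.fc1 = nn.Linear(320, 100)
        self.fc2 = nn.Linear(100, 10)

    def forward(self, x):
        # ...and chained together in forward()
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = self.conv2_drop(x)
        x = x.view(x.size(0), -1)  # flatten all but the batch dimension
        x = self.fc1(x)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)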

However, PyTorch offers an easier, more convenient way of creating feed-forward networks with its nn.Sequential class. This is the “cleanest” way of creating a network in PyTorch, and it is reminiscent of other neural net frameworks such as Keras. You can also define your own layers (as we will below) and simply add them to the Sequential chain. For more detailed information on the nn.Sequential class, see this.

Even though most of the functions we will need for our network are already contained in the torch.nn library (for a full list see here), as far as I can tell there is no function to flatten a tensor (which most other frameworks do offer). Luckily, it is simple enough to write our own Flatten class that does exactly that. Below is all the code needed to replace the Net class from Part 3.

import torch.nn as nn

class Flatten(nn.Module):
    # flattens all dimensions except the batch dimension
    def forward(self, input):
        return input.view(input.size(0), -1)

kernel_sz = 5  # 5x5 convolution kernels, as in Part 3

model = nn.Sequential(
        nn.Conv2d(1, 10, kernel_size=kernel_sz),   # 1x28x28 -> 10x24x24
        nn.MaxPool2d(2, padding=0),                # -> 10x12x12
        nn.ReLU(),
        nn.Conv2d(10, 20, kernel_size=kernel_sz),  # -> 20x8x8
        nn.MaxPool2d(2, padding=0),                # -> 20x4x4
        nn.ReLU(),
        nn.Dropout2d(0.25),
        Flatten(),                                 # -> 320 features
        nn.Linear(320, 100),
        nn.Linear(100, 10),
        nn.LogSoftmax(dim=1)                       # log-probabilities over the 10 digits
        )

model.cuda()  # move the model's parameters to the GPU

Very simple! (Full source here)
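
To quickly sanity-check the model, you can push a dummy batch of MNIST-sized images through it (a small sketch; the batch size of 64 is arbitrary):

import torch

# a dummy batch of 64 single-channel 28x28 images, on the GPU
x = torch.randn(64, 1, 28, 28).cuda()
out = model(x)
print(out.shape)  # torch.Size([64, 10]) -- one log-probability per digit class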

Note on performance

I did not run thorough benchmarks on this method compared to the one in Part 3, but in simple tests, using nn.Sequential causes roughly a 10% performance hit: a single epoch with the Part 3 method takes on average about 4.55s on my GPU, while the Sequential method takes about 4.95s per epoch, which is significant enough to note. I have to assume this is simply due to increased overhead.
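
If you want to reproduce this kind of rough timing, something like the following works (a minimal sketch; train_one_epoch is a hypothetical stand-in for whatever training loop you use):

import time
import torch

start = time.time()
train_one_epoch(model)    # hypothetical helper: your training loop from Part 3
torch.cuda.synchronize()  # wait for queued GPU work to finish before reading the clock
print('epoch time: {:.2f}s'.format(time.time() - start))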

Conclusion

In this part we considered an alternative way to define a network in PyTorch using the nn.Sequential class. We saw that it is a cleaner way to define the network, but that it can cause a small performance decrease. The full source code can be found here.
