This is Part 3.5 of the tutorial series. Please also see the other parts (Part 1, Part 2, Part 3).
In Part 3 of this series we built a convolutional neural network to classify MNIST digits by defining a new class called Net that extended nn.Module. We then defined the different components of our network in our initializer function and connected the network together by chaining functions in our forward() function. This is the most common way of defining a network in PyTorch, and it also offers the greatest flexibility, since ordinary Tensor operations can be included as well.
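For comparison, the nn.Module approach from Part 3 looks roughly like this (a minimal sketch, not the exact Part 3 code; the layer sizes here are just illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # components are defined in the initializer
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.fc1 = nn.Linear(10 * 12 * 12, 10)

    def forward(self, x):
        # the network is connected by chaining functions,
        # and ordinary Tensor operations can be mixed in freely
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = x.view(x.size(0), -1)  # a plain Tensor op (flattening)
        return F.log_softmax(self.fc1(x), dim=1)
```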
However, PyTorch offers an easier, more convenient way of creating feed-forward networks with its nn.Sequential class. This is the "cleanest" way of creating a network in PyTorch, and is reminiscent of other neural net frameworks out there such as Keras. You can also define your own layers (as we will below) and simply add them to the Sequential chain. For more detailed information on the nn.Sequential class, see this.
Even though most of the functions we will need for our network are already contained in the torch.nn library (for a full list see here), as far as I can tell there is no function to flatten the output (which most other frameworks do offer). Luckily it is simple enough to write our own Flatten class that does exactly that. Below is all the code that should replace the Net class in Part 3.
class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

kernel_sz = 5  # as in Part 3

model = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=kernel_sz),
    nn.MaxPool2d(2, padding=0),
    nn.ReLU(),
    nn.Conv2d(10, 20, kernel_size=kernel_sz),
    nn.MaxPool2d(2, padding=0),
    nn.ReLU(),
    nn.Dropout2d(0.25),
    Flatten(),
    nn.Linear(320, 100),
    nn.Linear(100, 10),
    nn.LogSoftmax(dim=1)
)
model.cuda()
Very simple! (Full source here)
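If you are wondering where the 320 input features of the first Linear layer come from, a quick shape check with a dummy batch makes it clear (a standalone sketch of just the convolutional part of the model above, run on CPU):

```python
import torch
import torch.nn as nn

class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

# The convolutional stages of the model above: after two rounds of
# conv (kernel 5) + maxpool (2), a 28x28 MNIST image becomes
# 20 channels of 4x4, i.e. 20 * 4 * 4 = 320 features.
features = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=5),
    nn.MaxPool2d(2, padding=0),
    nn.ReLU(),
    nn.Conv2d(10, 20, kernel_size=5),
    nn.MaxPool2d(2, padding=0),
    nn.ReLU(),
    Flatten(),
)

x = torch.randn(4, 1, 28, 28)  # a dummy batch of 4 MNIST-sized images
print(features(x).shape)  # torch.Size([4, 320])
```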
Note on performance
I did not run thorough benchmarks comparing this method to the one in Part 3, but from simple tests, using nn.Sequential causes about a 10% performance hit. A single epoch for the Part 3 method takes on average about 4.55s on my GPU, while the Sequential method takes 4.95s per epoch, which is significant enough to note. I have to assume this is simply due to increased overhead.
Conclusion
In this part we considered an alternative approach to defining a network in PyTorch using the nn.Sequential class. We saw that it is a cleaner way to define the network, but that it can cause a slight performance decrease. The full source code can be found here.
Hi Rensu
I suspect you are mixing up information between PyTorch and Torch. The link regarding :add points to Torch (in Lua), and I don't see an :add method in nn under PyTorch.
Hi Chris,
Thank you for catching that, honestly it must have been late when I added that since it isn’t even Python syntax.
I haven’t tested this, but it looks like the add_module() function does something similar.
https://pytorch.org/docs/master/nn.html#torch.nn.Module.add_module
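For what it's worth, a minimal sketch of how add_module() can be used to build up a Sequential incrementally (the layer names here are made up for illustration):

```python
import torch
import torch.nn as nn

# add_module(name, module) registers a child module under a given
# name, which is what nn.Sequential does internally for each layer.
model = nn.Sequential()
model.add_module("conv1", nn.Conv2d(1, 10, kernel_size=5))
model.add_module("relu1", nn.ReLU())

# the named child is accessible as an attribute
print(model.conv1)
```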