Vishwak S

Post 1 : My interactions with Open-source

How did I start open-source work?

It was curiosity and experimentation that led me to start contributing to open-source software, especially PyTorch.

Some background

In mid-2017, I visited the University of Tokyo (UoT), where I worked with Dr. Yusuke Iwasawa on deep generative models - GANs and VAEs to be specific. A major task that I was assigned was to implement popular models in a framework of my choice. Until then, I had absolutely no experience using a deep learning framework.

Now, I had roughly 18 days to get some models up and running. I didn’t stumble across PyTorch randomly, the way prospectors stumble on gold; I had learned of its existence through hearsay, and my memory fortunately didn’t fail me at UoT. I promptly told the lab I would implement the models in PyTorch.

I spent some nights trying to figure out the basics - the Autograd engine, nn.Module and simple layers - using documentation, examples and tutorials. This helped me build a DCGAN, and then a few others. You can find these here.

Now that I had these working as expected, I wanted to try them on some new, rather small datasets. Dr. Iwasawa suggested CUB200, but this wasn’t available in TorchVision. Anyone who has used CUB200 will be familiar with the problems that follow.

This is where the fun begins.

What could have been my first contributions, and maybe also my last.

I decided to create a Dataset object with all the downloading, extracting and iterating capabilities of the ones you find in TorchVision - MNIST and CIFAR10, for example. The idea was that others would benefit from the addition, since I would not be the only one using this dataset with PyTorch. I proposed a naive solution of resizing the images to a standard height and width, which is not a great idea since information is lost. The pull request eventually grew stale, and I gave up on providing a viable solution.
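For illustration, here is a minimal sketch of the pattern TorchVision’s built-in datasets follow. In real PyTorch code the class would subclass torch.utils.data.Dataset; torch is left out so the sketch stays self-contained, and the class name, file names and labels below are hypothetical:

```python
class CUB200Dataset:
    """Sketch of the pattern behind TorchVision's built-in datasets
    (MNIST, CIFAR10). A real version would subclass
    torch.utils.data.Dataset."""

    def __init__(self, root, samples=None):
        self.root = root
        # A real implementation would download and extract the archive
        # into `root` when it is missing, then index (path, label) pairs.
        self.samples = samples if samples is not None else []

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # A real implementation would load the image file and apply
        # any transforms before returning (image, label).
        return self.samples[index]


# Hypothetical file names and labels, for demonstration only.
ds = CUB200Dataset("/tmp/cub200",
                   samples=[("img_0001.jpg", 0), ("img_0002.jpg", 1)])
```

The two dunder methods are all a PyTorch DataLoader needs in order to iterate over a dataset in batches.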

Given this failure, I decided to visit the core repository housing PyTorch. I dug in, trying to fix issues within my reach. I opened a few feature requests and their corresponding implementations, but these were minor, too trivial and/or badly implemented, and so they were closed.

I could have stopped there, demotivated by the fact that my first four pull requests to open-source software had failed. I didn’t. And it is important that you don’t.

Failures are the stepping stones to success. - someone said this, and so did my parents, teachers and friends.

So I didn’t….

I actively searched once more and found a feature request asking for the expm1 function, buried in the last few pages of GitHub issues. For those who don’t know, expm1 computes f(x) = e^x - 1 more accurately around 0 than the naive implementation exp(x) - 1. I was still a novice; however, I was able to implement it using grep and an earlier commit implementing log1p (the inverse of expm1).
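To see the accuracy gap concretely, here is a small sketch using Python’s standard math module, which exposes the same expm1/log1p pair:

```python
import math

x = 1e-12

# Naive form: exp(x) rounds to a double extremely close to 1.0, so
# subtracting 1.0 cancels almost all significant digits.
naive = math.exp(x) - 1.0

# Dedicated routine: computes e**x - 1 directly, staying accurate near 0.
accurate = math.expm1(x)

# For this x the true value is x + x**2/2 + ..., i.e. x up to ~5e-13
# relative, so x itself is a good reference.
rel_err_naive = abs(naive - x) / x     # large: cancellation destroys digits
rel_err_expm1 = abs(accurate - x) / x  # tiny: only the true x**2/2 term remains
```

The same pairing works in reverse: log1p(expm1(x)) recovers x even for very small x, which is why the earlier log1p commit was such a useful template.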

This was it: my first contribution. When was this? December 28, 2017 - PyTorch was still in its pre-1.0 (0.3.x) days then.

Was that all?


But this was not the peak; I wanted to remain consistent. I continued to track feature requests and bug reports, and helped fix as many of them as I could, via code and/or code review.

This was not just in core PyTorch or TorchVision. I found issues that I could tackle in a few repositories belonging to the PyTorch ecosystem - Pyro, GPyTorch and Ignite - and tried to provide assistance there as well.

I also had a brief stint with Google’s JAX recently, but I had a lot on my plate with PyTorch, so I decided to set it aside for a while. I still look at it regularly to keep myself updated on developments.

What do I do now?

I mostly focus on maintaining the linear algebra backend of PyTorch. Linear algebra backends such as LAPACK, BLAS and MAGMA fascinate me because of their efficiency. It is not sufficient to know the significance of eigendecomposition or singular value decomposition only in theory; you also need implementations that demonstrate these algorithms scalably on real-life problems. I also take turns fixing bugs and implementing useful feature requests - some more important than others - along with a whole lot of other engineers and scientists at Facebook and other organizations.
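As a toy illustration of the gap between a textbook definition and a working routine, here is power iteration - the simplest eigenvalue algorithm - in pure Python. This is a sketch only; the real backends rely on far more sophisticated and numerically robust LAPACK/MAGMA routines:

```python
def power_iteration(A, num_iters=200):
    """Approximate the dominant eigenvalue and eigenvector of a square
    matrix A, given as a list of rows. Toy sketch, not production code."""
    n = len(A)
    v = [1.0] * n
    for _ in range(num_iters):
        # Multiply: w = A @ v
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Rescale each step so repeated multiplication cannot overflow.
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    # Rayleigh quotient v.T @ A @ v / (v.T @ v) estimates the eigenvalue.
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    num = sum(Av[i] * v[i] for i in range(n))
    den = sum(v[i] * v[i] for i in range(n))
    return num / den, v


# [[2, 1], [1, 2]] has eigenvalues 3 and 1, with [1, 1] the dominant direction.
eigval, eigvec = power_iteration([[2.0, 1.0], [1.0, 2.0]])
```

Even this toy version needs the per-step normalization to stay stable - exactly the kind of numerical care the production backends take to an extreme.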

What keeps you motivated?

I have only a couple of reasons.

Starting steps for the excited reader

If the text above has motivated you to contribute to open-source software, here are a few basic tips I have to offer. Most of you perhaps know these already, in which case I would say you are good to go.

If you still have any specific questions, please feel free to contact me.