Feature Complete

Getting involved with pandas a couple of years ago turned out to be one of the most fruitful choices I've made. I've learned so much from the other contributors (the ones who really do a ton of work, unlike me) like Jeff, Stephan, Joris, Masaaki, and others.

That said, if I could offer just a tiny bit of advice to people looking to get into open source: start small. At the time of this writing pandas has 1,422 open issues and 72 open pull requests. It's a lot of work just to keep up with those, and I can't overstate my appreciation for all that Jeff and the others do to manage that. Let's be clear: there aren't roughly 1,500 open items because we're lazy and are ignoring the tracker, or because we're terrible programmers with an above-average number of bugs per line of code. There are roughly 1,500 open items because pandas is a complex library solving an ambitious task. Just try to do a plot with Timedeltas on the x-axis; or don't, since you'll get a MemoryError.[1]

One of the things I've most enjoyed about engarde is that I can reasonably claim it's feature complete. Sure, I can still tinker with it (including a complete rewrite in unnecessary enterprise OOP style), and there are some minor improvements like better error reporting that could be added. But engarde has a small scope. The core of the library is very simple: make assertions about DataFrames returned from functions. That scope is pretty well satisfied.
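To give a feel for how small that core idea is, here's a minimal sketch of a decorator that checks a DataFrame on its way out of a function. This is not engarde's actual code or API; the names `none_missing` and `load_prices` are made up for illustration, and the only pandas calls used are `DataFrame.isnull` and `DataFrame.any`.

```python
import functools

import pandas as pd


def none_missing(func):
    """Hypothetical check: fail if the returned DataFrame has any missing values."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # isnull() gives a boolean frame; any().any() collapses it to one bool.
        assert not result.isnull().any().any(), "unexpected missing values"
        return result
    return wrapper


@none_missing
def load_prices():
    # Stand-in for a real loading or cleaning step.
    return pd.DataFrame({"price": [1.0, 2.5, 4.0]})


df = load_prices()  # passes the check and returns the frame unchanged
```

The whole trick is that the function's result passes through untouched when the assertion holds, so a check like this can be dropped onto any step of a pipeline without changing what that step returns.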

Back to Pandas

This is actually something we've been struggling with for pandas. There are regular requests for adding a new method here, a new IO format there. Taken individually, each request might make for a genuinely useful feature. But taken together, the combined load (both cognitive load for users and maintenance load for maintainers) might make the library worse off.[2]

None of this should be taken as me dissuading you from contributing. Even in large projects you'll find little corners to work on, and you'll hopefully (certainly at pandas) find supportive maintainers to help you out. But if you have wanted to contribute and were intimidated, or have contributed in the past and are burned out, take it from me: working on small, feature-complete libraries is a joy, and looking back on a completed project is refreshing.

  1. I pick that example because it's my fault that's not fixed yet. I'm working on it, but like I said, it's complex. 

  2. This is the ulterior motive behind our pipe method. "Want a new method? Write your own library and use .pipe!"
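
For anyone who hasn't used it, here's a rough sketch of what that looks like. `DataFrame.pipe` is a real pandas method; the `add_margin` function and the column names are hypothetical, standing in for something that would live in your own library rather than in pandas.

```python
import pandas as pd


def add_margin(df, rate):
    # Hypothetical helper defined outside pandas: adds a derived column.
    return df.assign(margin=df["price"] * rate)


prices = pd.DataFrame({"price": [10.0, 20.0]})

# pipe passes the DataFrame as the first argument to add_margin,
# so the call chains like a method without pandas growing a new one.
result = prices.pipe(add_margin, rate=0.5)
```

The appeal is that `prices.pipe(add_margin, rate=0.5)` reads like a method call in a chain, while the function itself stays in your code.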