We founded Ursa Labs in 2018 to build a team to enable Apache Arrow to be a robust and reliable next-generation computational foundation for in-memory analytics and data science. To make this possible, we partnered with RStudio, Two Sigma, and several other corporate sponsors to secure financial support for a group of full-time open source developers. In part through our efforts, Arrow is now installed more than ten million times per month by Python and R users and has quickly become one of the most important technologies for working with large tabular datasets. We have additionally helped the project reach its critical “1.0” release milestone, the first one offering formal cross-version stability guarantees.
But it’s not sufficient to write fast code. The success of a software project depends just as much, or more, on the community that forms around it. We believe that a large, active, and diverse contributor base, coupled with open and transparent governance, will yield an open source project with a long lifespan and high real-world impact. Consequently, Ursa Labs works not only to add functionality to Arrow but also to cultivate its developer community.
We are continually encouraged by the growth and vibrance of the community, and we are proud to work alongside so many talented individuals who share our commitment. Apache Arrow counts over 500 unique contributors from diverse backgrounds across 6 continents. Within the community, discourse is highly collegial, and decision-making is consensus-driven, in keeping with our Code of Conduct and with the ideals of the Apache Software Foundation.
That said, as we reflect on the project’s new stage of maturity following the 1.0 release, we recognize that we can do more to promote diversity, equity, and inclusion. Open source software is well known to be no more diverse (and often less so) than the broader tech community. There are many factors contributing to the lack of diversity, many of them rooted in the fact that too often open source contributions are unpaid and undertaken as a “side project”. Many individuals’ work and family situations do not permit them the time and space to contribute free labor to an open source project.
We aspire to be supportive and welcoming of everyone who joins the mailing list or files a bug report. Frequently we encourage someone who tweets about an Arrow problem to open a Jira issue, where we can discuss and debug in more detail; in many cases, we can even persuade the reporter to submit a pull request to fix the issue or improve the documentation. Yet we know that there is a much larger potential community out there that may feel too intimidated to speak up or overwhelmed by the complexity of the project. We want to actively seek out these potential collaborators and give them the support necessary to become active members of the community.
To that end, we are very excited to announce the establishment of an open source apprenticeship program at Ursa Labs, supported by a $215,000 grant from the Chan Zuckerberg Initiative. Our program aims to promote the long-term sustainability and diversity of the Arrow project by recruiting developers from underrepresented groups (especially non-males and people of color) for a one-year fellowship and training them to be open source software maintainers. Apprentices will gain general experience in open source project maintenance and community support, as well as specific, deep knowledge of the Arrow libraries and of testing and packaging C++, Python, and R libraries on a range of platforms. Following the year-long program, apprentices will be prepared for software engineering jobs in the big data ecosystem, and we hope that they will continue to engage with the Arrow community and other open source projects into the future.
We are grateful to the Chan Zuckerberg Initiative for their support of Apache Arrow and of diversity and inclusion in open source software. In the coming weeks, we’ll be providing more details on the Ursa Labs apprenticeship program, as well as the application process.