How Developers Can Manage and Contribute to Successful Open-Source Projects

A new generation of open source is set to influence 2023’s top technologies for AI/ML, data-driven apps and cloud-native development. Databricks’ Denny Lee tells IDN how developers can engage with open source – and benefit both their companies and their careers.

Denny Lee
Senior Staff Developer Advocate
Databricks

When it comes to open-source projects, there is often tension between what the community needs and the business objectives of the companies that support the project.

 

An open-source project cannot thrive without its community of contributors - those who not only use the code but actively contribute to its upkeep and help to improve it. 

 

On the other hand, successful open-source projects often depend on companies that employ community members and contributors to ensure the project’s success.

 

One way to think about how successful a project is comes from the classifications defined in Nadia Eghbal’s excellent book Working in Public:

Toys (small projects with little to no contributor or user growth)
Clubs (lower user growth, high contributor growth)
Stadiums (high user growth, lower contributor growth)
Federations (high user and contributor growth)
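The four categories above form a simple two-by-two grid over user growth and contributor growth. As a minimal sketch only, the grid could be written as follows. The names `ProjectType` and `classify_project` are hypothetical, and what counts as "high" growth is an assumption left to the caller; the book does not define the categories this way.

```python
from enum import Enum


class ProjectType(Enum):
    """The four project categories listed above (hypothetical enum)."""
    TOY = "toy"
    CLUB = "club"
    STADIUM = "stadium"
    FEDERATION = "federation"


def classify_project(high_user_growth: bool, high_contributor_growth: bool) -> ProjectType:
    """Map the two growth signals onto the two-by-two grid.

    Deciding whether growth is "high" is left to the caller; any
    numeric threshold would be an assumption, not part of the source.
    """
    if high_user_growth and high_contributor_growth:
        return ProjectType.FEDERATION
    if high_user_growth:
        return ProjectType.STADIUM
    if high_contributor_growth:
        return ProjectType.CLUB
    return ProjectType.TOY
```

For example, a project with many new users but few new contributors would land in the stadium category under this sketch.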

Regardless of how big or small your open source project is, it is crucial to remember that behind the all-important classifications and associated metrics are individual developers, people who are defining their career paths by using and contributing to your project.

It’s All About the Developers

I’ve had a great deal of experience working on many successful open source projects, including Apache Hadoop, Apache Spark™, MLflow, and Delta Lake. Before that, I was on Microsoft’s SQL Server engineering team working on the proprietary SQL Server database engine and Analysis Services (and subsequently Power BI) engines.

 

This was back when Microsoft’s then-CEO declared that “Linux is a cancer.” [But not to worry, this is not a knock on my previous employer. Since those earlier days, Microsoft has come to embrace open source and Linux.]

 

I bring up this comment from former Microsoft CEO Steve Ballmer because he was also famous for saying, “developers, developers, developers.” And he definitely got that right - the most important part of any community is the developer.

They are the heart of the user and contributor community and crucial for its success. And building communities is more than just the domain of open-source. When I was at Microsoft, I helped to create an active SQL Server community, from building SQL Server meetups (before meetup.com) like SQL Saturday to trending #sqlkilt during the annual SQLPASS conference. What made this work was that we were all helping each other to develop (pun intended) our individual career paths while solving complex technical problems. 

Why Contribute to Open Source?

As more developers work together on an open-source project, early adopters often determine which tools and platforms become the de facto standards in the community and commercial marketplace, as happened with Apache Spark during the initial Big Data hype.

 

Many developers also contribute to a project because they can lean on the community for better design, code standards, and more thorough testing.  Even better, by building on their expertise, developers also improve their technical resumes (e.g., creating and resolving GitHub issues and pull requests).

 

In this context of hands-on work with open source, a developer’s “real-world resume” can often be more influential than a traditional CV. That’s simply because of the visibility one gains from engagement with a public GitHub repository.

 

In fact, a developer’s contributions on GitHub will showcase:

Communication skills: Writing skills are extremely important within and across engineering teams. Amazon is famous for its writing culture and many organizations recreate a version of this approach. This is especially important today as strong communication skills are required for remote teams to be effective.

 

Collaboration skills: Having an engineering team work well together across different time zones isn’t easy. It is even more complicated when the engineering team is made up of individuals who work for different (and often competing) organizations. If you can collaborate well in communities, you can collaborate well in most organizations.

 

Technical stewardship and mentorship: One of the keys to a successful open-source project is that its members are actively helping others in the community to learn and grow. 

In addition to these tangible benefits, there is an equally valuable intangible impact on a developer’s career when working with an open source community. 

 

As developers become part of this growing and nurturing community, there is an important network effect. You are connected to more people and organizations who want to use and contribute to your project - opening up more promotion and job opportunities.

When To Manage, What To Manage 

Successful open-source projects allow developers to collaborate and innovate together, so what is there to manage? Companies often build a project with their own business objectives in mind, open-source it, and then imagine the community will grow on its own.

 

But left to its own devices, the project may only achieve toy or club status.   

 

To help a project achieve stadium and, finally, federation status, it is important for the community of developers to communicate positively and continuously with each other in many forms (e.g., blogs, documentation, meetups, virtual events, and conferences). You need positivity in order to build trust; even when competing interests exist, we can find ways to collaborate. We might be competing now, but we often work together in some capacity down the road.

 

As managers of open-source projects, our job is to establish trust while nurturing and encouraging the community. Specifically, we must build processes and provide metrics to answer community questions, publish high-quality blogs and documentation, ensure high-quality code standards, and more. Establishing and developing processes is required to enable strong engineering teams, and we must keep in mind that these processes are being applied to developers across different organizations and time zones, with different (and sometimes competing) business goals.

How To Choose an Open Source Project

So, now we have explored how a developer benefits from becoming involved with an open source project. We also have a basic framework for how to contribute and engage – with the code and the community.

 

Now comes the question of how to choose an appropriate open source project to engage with – one that fits your interests, skills and future goals.

 

Such a choice often depends not just on the goal of the project or the quality of the code, but also on the health and vitality of the community.

 

In my experience, one of the most important signs of whether an open-source project is worth a developer's commitment is how active and helpful the community is. Taking the time to answer questions, providing high-quality feedback on your code and design, treating everyone with respect, collaborating to help solve your and your company's scenarios, and building a positive environment are some of the hallmarks of an open-source project worthy of your commitment.

 

Remember: when committing to a project, you are often unknowingly choosing to have that project define your career path. Make sure that you surround yourself with a community that wants to see you succeed.

Final Thoughts

Developers may start using and contributing to an open-source project because of technical curiosity and the low barrier to entry for viewing the code base.

 

To make an open-source community truly successful, managers must create a positive and encouraging network while ensuring high standards for both code and supporting collateral such as blogs and documentation.

 


Denny Lee is a developer advocate at Databricks, where he is a Delta Lake committer and contributor to Apache Spark and MLflow. He previously built enterprise DW/BI and big data systems at Microsoft as part of the Azure Cosmos DB, Project Isotope (HDInsight), and SQL Server teams. Denny holds a master's degree in Biomedical Informatics from Oregon Health Sciences University.

 



