Blog

Considerations When Open Sourcing a Project

One of the most frequent questions I receive not only from engineers, but from other lawyers is “we are thinking of making a project available under an open source license – what do we need to know?” Since I get this question a lot, I’ll share with you the list of information I usually collect from clients to help them assess whether an open source license is right for them, which one they should use, how they should communicate licensing info to others, and whether they should consider creating codes of conduct, contributor guidelines, or contributor agreements (or DCOs) for their projects:

Outbound Open source checklist

Background Info

  • What is the name of your project and the first version that will be publicly released?
  • Please describe what your project does and the format in which it is made available. (Will you be providing just source or binaries, too? Will it be packaged as a virtual machine, container, etc.?)
  • Where do you plan to make the open source project available? (ex.: Github, NPM, rubygems.org, etc.)
  • Does the project have any use outside the context of using it in conjunction with other Company products or services?
  • Do you have any reason to believe that there are Company patents on the technology you want to open source?

Technical Info

  • Does Security see any issues in open sourcing the proposed project? Have you ensured there are no keys or credentials in the source code?
  • Does your project contain any code licensed from a vendor (whether for free or for fee)?
  • Does your project contain any integrations with any third-party products or services?
  • Does the project have any “phone home” features? If so, please identify with specificity the data collected and why.
  • Please submit an open source code scan of the project and its dependencies (including transitive dependencies) for review.
    • The scan needs to be done before a main license can be chosen for the project because the main license of the project should be compatible with the licenses of the third-party code the project uses or pulls in. For example, if you want to license something out as GPL 2, you can’t include Apache 2.0 dependencies and vice versa.
    • You will need to remove dependencies/components under commercial licenses or under licenses that are incompatible with the chosen license for the project as a whole.

Goals

  • Explain why you want to open source this project. What are you trying to accomplish? Why is this preferable over a proprietary license or a proprietary license that gives access to source code?

License

  • Have you thought about the sort of license you’d like to release this under? If so, what is it and why?
  • If the project is used by a competitor or a provided as a service by a big cloud services provider, does that pose issues for the business?
  • Can you foresee any reasons why you might later regret open sourcing this project?
  • Do you expect proprietary Company products to incorporate this code in the future?
  • Do you expect that this project might grow into something you might want to commercially license in the future?

Developer/User Expectations

  • Do you intend to grow a community around this project or are you just looking for an easy way to get code into the hands of existing customers and partners? If you want to grow a community:
    • Who will be responsible for managing the community?
    • Are you committing to responding to pull requests and issues on a prompt basis?
  • Are you going to provide any formal support for users of the project?

Once this information is collected, I can provide clients with more specific guidance with respect to marking their source code, providing licensing info in their chosen distribution channels, how to set up a contributor agreement intake process (if necessary), how to set user/developer expectations appropriately, and how to attribute third party open source packages used in the project. However, sometimes we all discover that the necessary ingredients for creating an open source project aren’t there and we can walk through alternatives or the steps necessary to be ready in the future. I hope you find this useful!

“Ask Me Anything” Open Source Webinar with Tidelift

I recently sat down with Luis Villa, the co-founder of Tidelift, to record an “Ask Me Anything” (AMA) Webinar on open source licenses. TideLift is a startup that provides companies with a well-curated catalog of proactively maintained open source projects, and in turn provides maintainers with a plethora of resources and a revenue stream. As an open source expert himself (notably, he is the author the Mozilla Public License 2.0), Luis has run a series of open source AMAs and I encourage developers and young open source projects and companies to check them out. We touched on containers, API licenses (including Google v. Oracle), and compliance strategies. A recap of the AMA is available here, and the full audio recording can be downloaded here.

My Comments in “A New Wave of IP Risks”

FOSSA and Above the Law recently partnered to write a white paper titled, “A New Wave of IP Risks: How Open Source is Changing IP Risk in the Software Supply Chain.” I was interviewed for this white paper along with other attorneys including Heather Meeker and Mark Radcliffe, and my comments appear throughout the paper, so I hope you take a look!

The paper provides an overview of the legal risks posed by using open source software including those related to copyright infringement, patent infringement, trade secret misappropriation, as well as those related to reputation and partner/customer relationships. It’s a useful primer for anyone new to the area of open source compliance trying to get a handle on the scope of risk involved and the types of risks that should be considered.

If you would like a copy of the white paper without submitting contact info to Above the Law, please email me.

The Great Open Source Shake-up

The past year and a half has seen a number of open source and open source-ish companies like Elastic, Confluent, and MongoDB change licenses on certain products, moving away from traditional open source licenses such as Apache 2.0 and towards proprietary, source-available, or ultra copyleft licenses. These companies, let’s call them “middleware companies,” were responding mainly to competition from AWS, which used their technologies to deliver AWS services and presented a significant competitive challenge because these companies do not have the same extensive cloud services (physical data centers, physical security, and custom equipment is expensive) or a pre-existing customer base to whom they can cross-sell. Their goal was essentially to prevent AWS from monetizing their technologies without paying them money. 

From a high level view, it’s important to understand that most of these companies had a mix of a licensing models: offering some things under open source licenses and others under proprietary licenses. Broadly speaking, they shifted some of what had previously been open source into the proprietary bucket. 

Critics were quick to choose sides: defending the middleware companies against AWS’s (anti?)competitive practices or taking offense at the middleware companies’ unilateral license changes without input from the community,  or in certain cases, without concern for whether those new licenses would be incorrectly perceived as “open source” licenses. These criticisms are valid.

The Open Source Bet of the Middleware Companies

The middleware companies mostly chose their licensing strategies before AWS came into its own and in some cases, they built companies around pre-existing open source projects whose license was already determined. Open source has been an extremely attractive development model, creating a nearly frictionless collaboration environment that can span the globe. It’s valuable because of the amount and quality of code that can be produced this way. But it’s also valuable because the contributions are free; fewer engineers need to be hired and paid. It’s a great recruitment tool. And, perhaps most importantly, open source functions as a powerful marketing tool, allowing potential customers to sample the goods freely and ideally luring them into a support contract, proprietary license, and/or locking them into a specific ecosystem. All in all, it’s hard to blame these companies for choosing an open source strategy. 

However, even with AWS still in its nascent stages (it launched in 2006), most experts in the open source community already understood that open source licenses are at best orthogonal to SaaS. Because the obligations of the vast majority of open source licenses only kick in upon distribution, there are no obligations on a company using open source to power SaaS to attribute the copyright holders of the software, to make the source code for the service available, or open source their own modifications to the software. And obviously, there’s also no obligation on such a company to reach any sort of monetary arrangement with the authors of such open source software. That understanding was the genesis for the GNU Affero General Public License 3 (AGPL 3) published in 2007: it was meant to solve for this exact problem by adding language to the GPL 3 which required companies to provide source code for software that users interacted with via a computer network if the software was modified AGPL 3 code (or a derivative work of such modified AGPL 3 code). But even the AGPL’s obligations can be avoided by simply not modifying the AGPL 3 code, which there is often no reason to do, or by building layers between the AGPL 3 code and proprietary code. That’s why a lot of these middleware companies didn’t choose to relicense to AGPL 3 and why MongoDB, who was already using AGPL 3, chose to revise the AGPL 3 to expand the circumstances under which services running on AGPL’ed code must open source the previously proprietary parts of those services. 

The middleware companies were or should have been aware of the risk of using traditional open source licenses. They likely miscalculated that risk, not foreseeing the rise of public cloud computing generally, customers’ willingness to forego on-prem software, or the specific dominance of AWS and to a lesser extent the likes of Google Cloud and Microsoft Azure. Maybe they thought that AWS would buy them, kick them some money, or form some sort of partnership with them in the long run, but AWS had no legal obligation to do so and it simply became large and sophisticated enough to support and, where necessary, fork those projects without help from the middleware companies. From a business perspective, it became obvious to the middleware companies that they needed to change their licensing strategies. The customers getting their products from AWS were never going to pay the middleware companies for them directly and no money was going to flow to them from AWS either.

No Alternative Licensing Choices

The middleware companies would have obviously preferred to have an easy go-to alternative like the AGPL 3 which might solve their problems. Perhaps if MongoDB’s update of the GPL 3 (maybe with some practical modifications) had been christened by organizations like the Open Source Initiative or the Free Software Foundation, more of them would have gone down that route. Fundamentally, OSI and FSF are correct in saying that the traditional notion of open source means there should be no field of use discrimination: essentially that it would be antithetical to notion of “open source” to create a license which discriminated against a specific type of use for the software or a specific type of company from using the software. However, their failure to approve or create a standardized license to serve the needs of these middleware companies led to each company choosing its own arcane approach, in some cases moving code to proprietary licensing entirely where a more user-friendly approach might have been viable. 

It’s important to understand that the definition of “open source” and certainly the movement as a whole, was rooted in a very specific vision of the computing future. When RMS first tinkered with a printer driver, he and many of his contemporaries thought that more tinkering was the future. They believed that future generations would become computer-literate, each of us empowered to create and build what we needed when it suited us, and sharing and exchanging our projects with each other freely. In other words, the computer would inspire all to code in the way the printing press inspired all to write.

That world didn’t come to pass. First, cloud computing became far cheaper and easier (think: broadband, fiber optics, Wi-Fi) than anyone could have imagined, which obviated the need for installed software. In most cases, it relieved individuals from spending time managing a server, worrying about security, dealing with redundancy, etc. and gave them time back to work on things other than “computer housekeeping.” Second, the path of technological innovation in computing went in a different direction than expected, in part thanks to the success of open source itself. The original expectation was that people were going to spend most of their time creating end user applications and that most of what they were going to code would be new and unique. But, in fact, most software created today is probably 90% open source and only 10% brand new code. (That disconnect is also what makes traditional open source attribution requirements obsolete: we thought projects would attribute a handful of other projects and credit would be properly bestowed; instead new software products use thousands of open source projects and the “attribution” files for them are nearly worthless.) The success of dependency managers and the innovation of images and containers has cemented this trajectory. 

Third, the wild success of mobile devices really killed the need or desire for most people to learn to code or tinker. It was a platform that was easy to lock down technologically and that lock-down created a simplicity that many craved. It was a platform that was physically difficult to hack on, a good amount of lockdown made sense for safety and security, and it was actually a platform for the masses – even Grandma eventually got a phone even if she never saw a use for (or could afford) a desktop computer, and that’s especially true in developing countries where the first computer most people have is a cell phone. People could use computers without ever interacting with directories or files. In essence, people thought the world of open source would be populated by individuals, but that world was supplemented with oceans of highly specialized companies and the rights and freedoms that the OSI and FSF imagined it was securing for the end user are mostly enforced by corporations instead. 

The principles enshrined around open source make sense in the context from which they spring, but the computing world has changed and moved on while open source hasn’t. There’s a certain elegance in defending freedom from discrimination in licensing, but I think the middleware companies that face death or relicensing don’t feel all that “free” and certainly end users don’t feel all the empowered. End users never get source code to the SaaS applications they’re using and generally have very little transparency into how those applications work, what data those applications hoover, and where that data goes and how it’s used. For a long time, many thought the world would pivot yet again to decentralized computing, mesh networks, a backlash to storing personal information on “other people’s computers,” etc. But that hasn’t come to pass and may never come to pass except under extreme circumstances. Now more than ever there’s reason to believe we’re not going back. Without reevaluating the goals open source was meant to accomplish and writing new licenses, we may well find ourselves in a place where open source will remain pure but no one is using it anymore. 

The Role of the Contributor

Many people criticized the middleware companies for relicensing despite not having an OSI or FSF-approved alternative to move to. Even with strong contributor license agreements or contributor assignment agreements that in some cases gave the companies the full right to relicense, the relicensing left a sour taste in the mouths of contributors who thought they were working on an open, collaborative project, only to find it proprietized, without input or discussion overnight. It may be fair to call contributors to corporate projects naive in thinking that the corporations were ever going to take any action that didn’t maximize shareholder value, but feelings and expectations can’t be contracted away and there is a real tension between the original vision for community that corporations give lip service to and the reality of complete corporate dominion over the open source projects they manage. These contributors were the real victims in this fight; AWS makes money off of the middleware companies, but the middleware companies make money off of unpaid contributors, and if their new strategies succeed, they may be scooping up some of those AWS customers after all. Ethically, it would have behooved the middleware companies to compensate those contributors and to share with them the newfound rewards relicensing would reap, even if it were legally and financially difficult.

Solutions

An immediate solution to the “middleware problem” is the creation of source-available licenses that still allow end users and other developers to access source code, but which place limitations on how that code can be commercialized by others. The many different approaches we’ve seen to date create uncertainty and confusion because they all work differently, remain untested in the courts, and provide varying levels of information on how those approaches are supposed to work in the details and in practice. Standardization is a good first step and that’s already happening with the Polyform Project, which I’m proudly part of. 

 Collectively, if we can’t or won’t solve the “middleware problem,” via new licenses, then we must consider other avenues. It is glaringly obvious that the “middleware problem” isn’t just one of licensing, it exists in large part because the US has taken a very lax attitude towards anticompetitive practices. It’s indisputable that we have more industry consolidation, horizontally and vertically, than we have since Standard Oil and that for the most part, not only is no enforcement action taken against anticompetitive behavior, but mergers continue to be rubber stamped without regard for their effects on innovation generally. Perhaps more stringent enforcement of the laws as well as new laws would have yielded more than just a small handful of cloud computing companies, some of whom may have been better open source citizens, and in general may have yielded the decentralized alternatives many in the open source community have clamored for.

Perhaps we need to rethink corporate structures entirely to make room for a special open source-focused entity. There is a lot of uncharted ground to be had in co-op structures and other ways of giving contributors a stake in a project’s financial success such that if a project doesn’t succeed, then nothing is lost, but if a project is relicensed in the proprietary direction, all contributors are better off. This is undoubtedly a jarring way to think about open source for a generation of engineers who worked on open source not because of money but because of their passion for technology and their belief in openness and collaboration, but maybe this approach can fix the pain the “middleware problem” has created for contributors while other GDPR-like laws deal with user transparency and accountability. 

Part 2: The GPL Cooperation Commitment – Coincidentally, Great for Lawyers

This post is Part 2 of a two part blog post, Part 1 of which was titled “The Dark Practice of Free & Open Source Software Law.”

In April, I attended Free Software Foundation Europe’s Legal and Licensing Workshop in Barcelona, the world’s largest and probably most important gathering of open source attorneys. One of the presentations at the conference was about the GPL Cooperation Commitment and was presented by a number of companies who had entered into the Commitment. More can be read about the Commitment here, but in summary, the Commitment is as follows: a number of companies promised that if they enforce compliance with the GPL 2, they will give the non-compliant entity a 30 day cure period to come into compliance before terminating the license. In other words, they’re promising to treat GPL 2 enforcement just like the GPL 3, which has an explicit 30 day cure provision that the GPL 2 lacks.

On the face of it, this is a good step forward because it allows people to use the GPL 2 without fear of instantly and possibly irrevocably losing their license. This, the theory goes, encourages more companies to use the GPL 2, and perhaps to participate in and contribute to the open source ecosystem. It also underscores the view that enforcement should be about obtaining compliance, rather than using enforcement as a way to attack competitors or eke out monetary damages.

At first glance, I was on board with this seemingly benign and frankly, somewhat boring, “innovation.” I say that because 30 days is really not enough time for anyone to get GPL compliant given the product release cycle (a point the presenters openly conceded) and many of the companies signing on to the commitment either don’t produce much GPL’ed code (I’m looking at you, Etsy) or have no incentive to enforce the GPL. So, to the extent the Commitment has any impact at all, that impact is fairly marginal. But, then the presenters started to discuss future projects where they hoped to “document existing norms and establish new ones.” An audience member suggested that maybe there should also be a “commitment” around the idea that source code doesn’t actually have to be delivered with the GPL’ed software; it would be acceptable to simply post a link to it and point requesters to the same. Other suggestions came rolling in for other “commitments” that would essentially update or upgrade the GPL 2 to more closely match industry practice and modern technologies.

But the problem with the GPL Cooperation Commitment and other suggested “commitments” is that it makes the practice of open source law even more opaque than it already is. As I wrote in Part 1 of this blog series, understanding how an open source license should be applied in any particular situation requires a multilayer analysis that goes well beyond the text of the license itself.

With respect to the GPL in particular, one would have to understand the open source movement, one would have to understand the tome which we call the GPL 2, one would have to know what parts of software package were GPL 2 (and which parts are “GPL 2 only” v. “GPL 2 or later”) and if any exceptions applied, one would have to know what programming languages were involved and what the software architecture looked like, one would have to know whether they’re dealing in user space or kernel space and/or what connection there was between the two, one would have to read the GPL FAQ, one would have to know who the relevant copyright holder(s) are and what their intentions might be, and a number of other disparate sources on the GPL and its application, depending on the exact scenario they’re trying to analyze. In some cases, companies even post additional terms on their websites intended to convey their own, idiosyncratic interpretation of the GPL. Many GPL-related analyses that are widely accepted across the open source community, are based on nothing more formal or authoritative than an email from Linus Torvalds.

With the GPL Cooperation Agreement, one must not only know all of the above to do a complete analysis of GPL compliance, but now they also have to look at a list of Commitment participants on a random Github page to understand the applicable termination provision for the package they’re analyzing. A list that, by the way, isn’t acknowledged anywhere in the package itself. You just have to know it’s out there.

And it gets worse, because a project can have multiple copyright owners capable of enforcing the GPL, some of whom have signed onto the Commitment and some of whom haven’t. So now, if the Commitment actually matters to you and might be a consideration in your decision-making (the goal of the Commitment, right?) you actually need to know all the contributors to a particular project before you really know what the applicable termination provision to the project is. This is notoriously hard since contributors don’t necessarily mark their code with their copyright notices, not all projects track their contributors, and even when they do, it may not be clear if the contribution was really by the individual listed or their employer or whether the copyright in that contribution has been assigned to someone else (e.g. one company bought anothersince the contribution was memorialized. Now imagine there are multiple “commitments” like this and companies who signed onto one of them may or may not sign onto the others.

This yields a dream scenario for open source compliance attorneys who will now bill more hours completing their work and open source compliance software vendors, but a nightmare for companies just trying to do their best to comply with their open source license obligations. It also makes it that much harder for new attorneys to master this field of law. It’s a move that cements incumbent advantage for existing open source attorneys and existing tech companies, profitable and sophisticated enough to deal with an additional layer of compliance complexity that can only really be dealt with by investing money and time into automated compliance systems. It reeks to me of an anticompetitive move under the guise of open source altruism.

As a community, we should be trying to make open source compliance easier, not harder. Our goal should be to make open source licenses easily understandable to the people using them. We should be writing new licenses that fix the errors of old licenses, clarify ambiguities, and clearly spell out all the “hidden” understandings that experts divulge for a fee. In particular, the fact that it was necessary to invent the “Commitment” hack is a sign that it’s time to write new licenses that can easily be upgraded over time. After all, the Commitment was necessary because Linux is under “GPL 2,” not “GPL 2 or later,” meaning that it’s impossible to move Linux to GPL 3 without getting permission from every Linux contributor to do so since presumably they made their contributions under GPL 2 as well. It’s also a sign that we should think a lot harder about the terms under which projects accept contributions (like Developer Certificates of Origin or Contributor License Agreements) and perhaps accept that promising contributors that the license of a project will never change might be detrimental to the long-term success of open source in general if it means that fundamental technologies like Linux remain under stone age licenses and force many Linux-adjacent technologies to also live under outdated licenses in order to maintain license compatibility.

These are undoubtedly large issues to tackle, but necessary ones as the open source movement matures and we need a wider perspective that encompasses a license’s entire lifecycle and the inevitable changes the future will necessitate because of new laws, new legal interpretations, and new technologies. The open source legal community should set its sights on these larger issues rather than putting their fingers in the dike. It’s not sustainable to keep using the same old licenses and letting a self-selected group of companies declare they mean something that they don’t at their whim. That’s a path for turning the GPL 2 into a religious text instead of a legal document.