I gave a presentation today during the Open Source Initiative’s “Practical Open Source Information” event on OSS in procurement. The presentation focuses specifically on the OSS-related provisions you may need to add when purchasing licenses to third-party software, whether from consultants/contractors or off the shelf. You can download my slides, which include a number of sample provisions, here.
One of the most frequent questions I receive, not only from engineers but also from other lawyers, is “We are thinking of making a project available under an open source license. What do we need to know?” Since I get this question a lot, I’ll share the list of information I usually collect from clients to help them assess whether an open source license is right for them, which one they should use, how they should communicate licensing info to others, and whether they should consider creating codes of conduct, contributor guidelines, or contributor agreements (or DCOs) for their projects:
Outbound Open source checklist
- What is the name of your project and the first version that will be publicly released?
- Please describe what your project does and the format in which it is made available. (Will you be providing just source or binaries, too? Will it be packaged as a virtual machine, container, etc.?)
- Where do you plan to make the open source project available? (ex.: Github, NPM, rubygems.org, etc.)
- Does the project have any use outside the context of using it in conjunction with other Company products or services?
- Do you have any reason to believe that there are Company patents on the technology you want to open source?
- Does Security see any issues in open sourcing the proposed project? Have you ensured there are no keys or credentials in the source code?
- Does your project contain any code licensed from a vendor (whether for free or for fee)?
- Does your project contain any integrations with any third-party products or services?
- Does the project have any “phone home” features? If so, please identify with specificity the data collected and why.
- Please submit an open source code scan of the project and its dependencies (including transitive dependencies) for review.
  - The scan needs to be done before a main license can be chosen for the project because the main license of the project should be compatible with the licenses of the third-party code the project uses or pulls in. For example, if you want to license something out as GPL 2, you can’t include Apache 2.0 dependencies and vice versa.
  - You will need to remove dependencies/components under commercial licenses or under licenses that are incompatible with the chosen license for the project as a whole.
- Explain why you want to open source this project. What are you trying to accomplish? Why is this preferable over a proprietary license or a proprietary license that gives access to source code?
- Have you thought about the sort of license you’d like to release this under? If so, what is it and why?
- If the project is used by a competitor or provided as a service by a big cloud services provider, does that pose issues for the business?
- Can you foresee any reasons why you might later regret open sourcing this project?
- Do you expect proprietary Company products to incorporate this code in the future?
- Do you expect that this project might grow into something you might want to commercially license in the future?
- Do you intend to grow a community around this project or are you just looking for an easy way to get code into the hands of existing customers and partners? If you want to grow a community:
  - Who will be responsible for managing the community?
  - Are you committing to responding to pull requests and issues on a prompt basis?
  - Are you going to provide any formal support for users of the project?
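The scan-review items in the checklist (check each dependency's license against the proposed outbound license, then remove anything commercial or incompatible) can be sketched roughly as follows. This is a minimal illustration, not a real compliance tool: the function name, the `Commercial` label, and the sample scan data are hypothetical, and the incompatibility table contains only the GPL 2 / Apache 2.0 example mentioned in the checklist. A real review would rely on an actual scanner's output and a lawyer's compatibility analysis.

```python
# Hypothetical sketch: flag dependencies that conflict with a proposed outbound license.

# (outbound license, dependency license) pairs treated as incompatible.
# Only the GPL 2 / Apache 2.0 example from the checklist is encoded here;
# a real table would be much larger and reviewed by counsel.
INCOMPATIBLE = {
    ("GPL-2.0-only", "Apache-2.0"),
    ("Apache-2.0", "GPL-2.0-only"),
}

# Assumed label a scan might use for vendor code under a commercial license.
COMMERCIAL = "Commercial"


def flag_dependencies(outbound, dependencies):
    """Return the sorted names of dependencies to remove or replace.

    `dependencies` maps a dependency name to its license identifier and
    should include transitive dependencies.
    """
    flagged = []
    for name, license_id in dependencies.items():
        if license_id == COMMERCIAL or (outbound, license_id) in INCOMPATIBLE:
            flagged.append(name)
    return sorted(flagged)


# Made-up scan results, for illustration only.
scan = {
    "libfoo": "Apache-2.0",
    "libbar": "MIT",
    "vendor-sdk": "Commercial",
}

print(flag_dependencies("GPL-2.0-only", scan))  # ['libfoo', 'vendor-sdk']
```

Running the same check against several candidate outbound licenses shows why the scan has to come first: the set of dependencies that must be removed changes with the license chosen.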
Once this information is collected, I can provide clients with more specific guidance with respect to marking their source code, providing licensing info in their chosen distribution channels, how to set up a contributor agreement intake process (if necessary), how to set user/developer expectations appropriately, and how to attribute third party open source packages used in the project. However, sometimes we all discover that the necessary ingredients for creating an open source project aren’t there and we can walk through alternatives or the steps necessary to be ready in the future. I hope you find this useful!
I recently sat down with Luis Villa, the co-founder of Tidelift, to record an “Ask Me Anything” (AMA) webinar on open source licenses. Tidelift is a startup that provides companies with a well-curated catalog of proactively maintained open source projects, and in turn provides maintainers with a plethora of resources and a revenue stream. As an open source expert himself (notably, he is the author of the Mozilla Public License 2.0), Luis has run a series of open source AMAs, and I encourage developers and young open source projects and companies to check them out. We touched on containers, API licenses (including Google v. Oracle), and compliance strategies. A recap of the AMA is available here, and the full audio recording can be downloaded here.
FOSSA and Above the Law recently partnered to write a white paper titled, “A New Wave of IP Risks: How Open Source is Changing IP Risk in the Software Supply Chain.” I was interviewed for this white paper along with other attorneys including Heather Meeker and Mark Radcliffe, and my comments appear throughout the paper, so I hope you take a look!
The paper provides an overview of the legal risks posed by using open source software, including those related to copyright infringement, patent infringement, and trade secret misappropriation, as well as those related to reputation and partner/customer relationships. It’s a useful primer for anyone new to open source compliance who is trying to get a handle on the scope of risk involved and the types of risks that should be considered.
If you would like a copy of the white paper without submitting contact info to Above the Law, please email me.
The past year and a half has seen a number of open source and open source-ish companies like Elastic, Confluent, and MongoDB change licenses on certain products, moving away from traditional open source licenses such as Apache 2.0 and towards proprietary, source-available, or ultra copyleft licenses. These companies, let’s call them “middleware companies,” were responding mainly to competition from AWS, which used their technologies to deliver AWS services and presented a significant competitive challenge because these companies do not have the same extensive cloud services (physical data centers, physical security, and custom equipment are expensive) or a pre-existing customer base to whom they can cross-sell. Their goal was essentially to prevent AWS from monetizing their technologies without paying them money.
From a high-level view, it’s important to understand that most of these companies had a mix of licensing models: offering some things under open source licenses and others under proprietary licenses. Broadly speaking, they shifted some of what had previously been open source into the proprietary bucket.
Critics were quick to choose sides: defending the middleware companies against AWS’s (anti?)competitive practices or taking offense at the middleware companies’ unilateral license changes without input from the community, or in certain cases, without concern for whether those new licenses would be incorrectly perceived as “open source” licenses. These criticisms are valid.
The Open Source Bet of the Middleware Companies
The middleware companies mostly chose their licensing strategies before AWS came into its own and in some cases, they built companies around pre-existing open source projects whose license was already determined. Open source has been an extremely attractive development model, creating a nearly frictionless collaboration environment that can span the globe. It’s valuable because of the amount and quality of code that can be produced this way. But it’s also valuable because the contributions are free; fewer engineers need to be hired and paid. It’s a great recruitment tool. And, perhaps most importantly, open source functions as a powerful marketing tool, allowing potential customers to sample the goods freely and ideally luring them into a support contract, proprietary license, and/or locking them into a specific ecosystem. All in all, it’s hard to blame these companies for choosing an open source strategy.
However, even with AWS still in its nascent stages (it launched in 2006), most experts in the open source community already understood that open source licenses are at best orthogonal to SaaS. Because the obligations of the vast majority of open source licenses only kick in upon distribution, there are no obligations on a company using open source to power SaaS to attribute the copyright holders of the software, to make the source code for the service available, or open source their own modifications to the software. And obviously, there’s also no obligation on such a company to reach any sort of monetary arrangement with the authors of such open source software. That understanding was the genesis for the GNU Affero General Public License 3 (AGPL 3) published in 2007: it was meant to solve for this exact problem by adding language to the GPL 3 which required companies to provide source code for software that users interacted with via a computer network if the software was modified AGPL 3 code (or a derivative work of such modified AGPL 3 code). But even the AGPL’s obligations can be avoided by simply not modifying the AGPL 3 code, which there is often no reason to do, or by building layers between the AGPL 3 code and proprietary code. That’s why a lot of these middleware companies didn’t choose to relicense to AGPL 3 and why MongoDB, who was already using AGPL 3, chose to revise the AGPL 3 to expand the circumstances under which services running on AGPL’ed code must open source the previously proprietary parts of those services.
The middleware companies were or should have been aware of the risk of using traditional open source licenses. They likely miscalculated that risk, not foreseeing the rise of public cloud computing generally, customers’ willingness to forgo on-prem software, or the specific dominance of AWS and to a lesser extent the likes of Google Cloud and Microsoft Azure. Maybe they thought that AWS would buy them, kick them some money, or form some sort of partnership with them in the long run, but AWS had no legal obligation to do so and it simply became large and sophisticated enough to support and, where necessary, fork those projects without help from the middleware companies. From a business perspective, it became obvious to the middleware companies that they needed to change their licensing strategies. The customers getting their products from AWS were never going to pay the middleware companies for them directly and no money was going to flow to them from AWS either.
No Alternative Licensing Choices
The middleware companies would obviously have preferred to have an easy go-to alternative like the AGPL 3 which might solve their problems. Perhaps if MongoDB’s update of the AGPL 3 (maybe with some practical modifications) had been blessed by organizations like the Open Source Initiative or the Free Software Foundation, more of them would have gone down that route. Fundamentally, OSI and FSF are correct in saying that the traditional notion of open source means there should be no field of use discrimination: essentially, that it would be antithetical to the notion of “open source” to create a license which discriminated against a specific type of use for the software or prevented a specific type of company from using the software. However, their failure to approve or create a standardized license to serve the needs of these middleware companies led to each company choosing its own arcane approach, in some cases moving code to proprietary licensing entirely where a more user-friendly approach might have been viable.
It’s important to understand that the definition of “open source” and certainly the movement as a whole, was rooted in a very specific vision of the computing future. When RMS first tinkered with a printer driver, he and many of his contemporaries thought that more tinkering was the future. They believed that future generations would become computer-literate, each of us empowered to create and build what we needed when it suited us, and sharing and exchanging our projects with each other freely. In other words, the computer would inspire all to code in the way the printing press inspired all to write.
That world didn’t come to pass. First, cloud computing became far cheaper and easier (think: broadband, fiber optics, Wi-Fi) than anyone could have imagined, which obviated the need for installed software. In most cases, it relieved individuals from spending time managing a server, worrying about security, dealing with redundancy, etc. and gave them time back to work on things other than “computer housekeeping.” Second, the path of technological innovation in computing went in a different direction than expected, in part thanks to the success of open source itself. The original expectation was that people were going to spend most of their time creating end user applications and that most of what they were going to code would be new and unique. But, in fact, most software created today is probably 90% open source and only 10% brand new code. (That disconnect is also what makes traditional open source attribution requirements obsolete: we thought projects would attribute a handful of other projects and credit would be properly bestowed; instead new software products use thousands of open source projects and the “attribution” files for them are nearly worthless.) The success of dependency managers and the innovation of images and containers have cemented this trajectory.
Third, the wild success of mobile devices really killed the need or desire for most people to learn to code or tinker. It was a platform that was easy to lock down technologically and that lock-down created a simplicity that many craved. It was a platform that was physically difficult to hack on, a good amount of lockdown made sense for safety and security, and it was actually a platform for the masses: even Grandma eventually got a phone even if she never saw a use for (or could afford) a desktop computer, and that’s especially true in developing countries where the first computer most people have is a cell phone. People could use computers without ever interacting with directories or files. In essence, people thought the world of open source would be populated by individuals, but that world was supplemented with oceans of highly specialized companies, and the rights and freedoms that the OSI and FSF imagined they were securing for the end user are mostly enforced by corporations instead.
The principles enshrined around open source make sense in the context from which they spring, but the computing world has changed and moved on while open source hasn’t. There’s a certain elegance in defending freedom from discrimination in licensing, but I think the middleware companies that face death or relicensing don’t feel all that “free” and certainly end users don’t feel all that empowered. End users never get source code to the SaaS applications they’re using and generally have very little transparency into how those applications work, what data those applications hoover up, and where that data goes and how it’s used. For a long time, many thought the world would pivot yet again to decentralized computing, mesh networks, a backlash to storing personal information on “other people’s computers,” etc. But that hasn’t come to pass and may never come to pass except under extreme circumstances. Now more than ever there’s reason to believe we’re not going back. Without reevaluating the goals open source was meant to accomplish and writing new licenses, we may well find ourselves in a place where open source will remain pure but no one is using it anymore.
The Role of the Contributor
Many people criticized the middleware companies for relicensing despite not having an OSI or FSF-approved alternative to move to. Even with strong contributor license agreements or contributor assignment agreements that in some cases gave the companies the full right to relicense, the relicensing left a sour taste in the mouths of contributors who thought they were working on an open, collaborative project, only to find it proprietized overnight, without input or discussion. It may be fair to call contributors to corporate projects naive in thinking that the corporations were ever going to take any action that didn’t maximize shareholder value, but feelings and expectations can’t be contracted away and there is a real tension between the original vision for community that corporations give lip service to and the reality of complete corporate dominion over the open source projects they manage. These contributors were the real victims in this fight; AWS makes money off of the middleware companies, but the middleware companies make money off of unpaid contributors, and if their new strategies succeed, they may be scooping up some of those AWS customers after all. Ethically, it would have behooved the middleware companies to compensate those contributors and to share with them the newfound rewards relicensing would reap, even if it were legally and financially difficult.
An immediate solution to the “middleware problem” is the creation of source-available licenses that still allow end users and other developers to access source code, but which place limitations on how that code can be commercialized by others. The many different approaches we’ve seen to date create uncertainty and confusion because they all work differently, remain untested in the courts, and provide varying levels of information on how those approaches are supposed to work in the details and in practice. Standardization is a good first step and that’s already happening with the Polyform Project, which I’m proudly part of.
Collectively, if we can’t or won’t solve the “middleware problem” via new licenses, then we must consider other avenues. It is glaringly obvious that the “middleware problem” isn’t just one of licensing; it exists in large part because the US has taken a very lax attitude towards anticompetitive practices. It’s indisputable that we have more industry consolidation, horizontally and vertically, than we have had since Standard Oil and that for the most part, not only is no enforcement action taken against anticompetitive behavior, but mergers continue to be rubber stamped without regard for their effects on innovation generally. Perhaps more stringent enforcement of the laws as well as new laws would have yielded more than just a small handful of cloud computing companies, some of whom may have been better open source citizens, and in general may have yielded the decentralized alternatives many in the open source community have clamored for.
Perhaps we need to rethink corporate structures entirely to make room for a special open source-focused entity. There is a lot of uncharted ground to explore in co-op structures and other ways of giving contributors a stake in a project’s financial success such that if a project doesn’t succeed, then nothing is lost, but if a project is relicensed in the proprietary direction, all contributors are better off. This is undoubtedly a jarring way to think about open source for a generation of engineers who worked on open source not because of money but because of their passion for technology and their belief in openness and collaboration, but maybe this approach can fix the pain the “middleware problem” has created for contributors while other GDPR-like laws deal with user transparency and accountability.