I recently sat down with Luis Villa, the co-founder of Tidelift, to record an “Ask Me Anything” (AMA) webinar on open source licenses. Tidelift is a startup that provides companies with a well-curated catalog of proactively maintained open source projects, and in turn provides maintainers with a plethora of resources and a revenue stream. As an open source expert himself (notably, he is the author of the Mozilla Public License 2.0), Luis has run a series of open source AMAs and I encourage developers and young open source projects and companies to check them out. We touched on containers, API licenses (including Google v. Oracle), and compliance strategies. A recap of the AMA is available here, and the full audio recording can be downloaded here.
FOSSA and Above the Law recently partnered to write a white paper titled, “A New Wave of IP Risks: How Open Source is Changing IP Risk in the Software Supply Chain.” I was interviewed for this white paper along with other attorneys including Heather Meeker and Mark Radcliffe, and my comments appear throughout the paper, so I hope you take a look!
The paper provides an overview of the legal risks posed by using open source software including those related to copyright infringement, patent infringement, trade secret misappropriation, as well as those related to reputation and partner/customer relationships. It’s a useful primer for anyone new to the area of open source compliance trying to get a handle on the scope of risk involved and the types of risks that should be considered.
If you would like a copy of the white paper without submitting contact info to Above the Law, please email me.
The past year and a half has seen a number of open source and open source-ish companies like Elastic, Confluent, and MongoDB change licenses on certain products, moving away from traditional open source licenses such as Apache 2.0 and towards proprietary, source-available, or ultra copyleft licenses. These companies, let’s call them “middleware companies,” were responding mainly to competition from AWS, which used their technologies to deliver AWS services and presented a significant competitive challenge because these companies do not have the same extensive cloud services (physical data centers, physical security, and custom equipment are expensive) or a pre-existing customer base to whom they can cross-sell. Their goal was essentially to prevent AWS from monetizing their technologies without paying them money.
From a high-level view, it’s important to understand that most of these companies had a mix of licensing models: offering some things under open source licenses and others under proprietary licenses. Broadly speaking, they shifted some of what had previously been open source into the proprietary bucket.
Critics were quick to choose sides: defending the middleware companies against AWS’s (anti?)competitive practices or taking offense at the middleware companies’ unilateral license changes without input from the community, or in certain cases, without concern for whether those new licenses would be incorrectly perceived as “open source” licenses. These criticisms are valid.
The Open Source Bet of the Middleware Companies
The middleware companies mostly chose their licensing strategies before AWS came into its own and in some cases, they built companies around pre-existing open source projects whose license was already determined. Open source has been an extremely attractive development model, creating a nearly frictionless collaboration environment that can span the globe. It’s valuable because of the amount and quality of code that can be produced this way. But it’s also valuable because the contributions are free; fewer engineers need to be hired and paid. It’s a great recruitment tool. And, perhaps most importantly, open source functions as a powerful marketing tool, allowing potential customers to sample the goods freely and ideally luring them into a support contract, proprietary license, and/or locking them into a specific ecosystem. All in all, it’s hard to blame these companies for choosing an open source strategy.
However, even with AWS still in its nascent stages (it launched in 2006), most experts in the open source community already understood that open source licenses are at best orthogonal to SaaS. Because the obligations of the vast majority of open source licenses only kick in upon distribution, there are no obligations on a company using open source to power SaaS to attribute the copyright holders of the software, to make the source code for the service available, or open source their own modifications to the software. And obviously, there’s also no obligation on such a company to reach any sort of monetary arrangement with the authors of such open source software. That understanding was the genesis for the GNU Affero General Public License 3 (AGPL 3) published in 2007: it was meant to solve for this exact problem by adding language to the GPL 3 which required companies to provide source code for software that users interacted with via a computer network if the software was modified AGPL 3 code (or a derivative work of such modified AGPL 3 code). But even the AGPL’s obligations can be avoided by simply not modifying the AGPL 3 code, which there is often no reason to do, or by building layers between the AGPL 3 code and proprietary code. That’s why a lot of these middleware companies didn’t choose to relicense to AGPL 3 and why MongoDB, who was already using AGPL 3, chose to revise the AGPL 3 to expand the circumstances under which services running on AGPL’ed code must open source the previously proprietary parts of those services.
The middleware companies were or should have been aware of the risk of using traditional open source licenses. They likely miscalculated that risk, not foreseeing the rise of public cloud computing generally, customers’ willingness to forgo on-prem software, or the specific dominance of AWS and to a lesser extent the likes of Google Cloud and Microsoft Azure. Maybe they thought that AWS would buy them, kick them some money, or form some sort of partnership with them in the long run, but AWS had no legal obligation to do so and it simply became large and sophisticated enough to support and, where necessary, fork those projects without help from the middleware companies. From a business perspective, it became obvious to the middleware companies that they needed to change their licensing strategies. The customers getting their products from AWS were never going to pay the middleware companies for them directly and no money was going to flow to them from AWS either.
No Alternative Licensing Choices
The middleware companies would have obviously preferred to have an easy go-to alternative like the AGPL 3 which might solve their problems. Perhaps if MongoDB’s update of the AGPL 3 (maybe with some practical modifications) had been christened by organizations like the Open Source Initiative or the Free Software Foundation, more of them would have gone down that route. Fundamentally, OSI and FSF are correct in saying that the traditional notion of open source means there should be no field of use discrimination: essentially that it would be antithetical to the notion of “open source” to create a license which discriminated against a specific type of use for the software or barred a specific type of company from using the software. However, their failure to approve or create a standardized license to serve the needs of these middleware companies led to each company choosing its own arcane approach, in some cases moving code to proprietary licensing entirely where a more user-friendly approach might have been viable.
It’s important to understand that the definition of “open source,” and certainly the movement as a whole, was rooted in a very specific vision of the computing future. When RMS first tinkered with a printer driver, he and many of his contemporaries thought that more tinkering was the future. They believed that future generations would become computer-literate, each of us empowered to create and build what we needed when it suited us, and sharing and exchanging our projects with each other freely. In other words, the computer would inspire all to code in the way the printing press inspired all to write.
That world didn’t come to pass. First, cloud computing became far cheaper and easier (think: broadband, fiber optics, Wi-Fi) than anyone could have imagined, which obviated the need for installed software. In most cases, it relieved individuals from spending time managing a server, worrying about security, dealing with redundancy, etc. and gave them time back to work on things other than “computer housekeeping.” Second, the path of technological innovation in computing went in a different direction than expected, in part thanks to the success of open source itself. The original expectation was that people were going to spend most of their time creating end user applications and that most of what they were going to code would be new and unique. But, in fact, most software created today is probably 90% open source and only 10% brand new code. (That disconnect is also what makes traditional open source attribution requirements obsolete: we thought projects would attribute a handful of other projects and credit would be properly bestowed; instead new software products use thousands of open source projects and the “attribution” files for them are nearly worthless.) The success of dependency managers and the innovation of images and containers have cemented this trajectory.
Third, the wild success of mobile devices really killed the need or desire for most people to learn to code or tinker. It was a platform that was easy to lock down technologically and that lockdown created a simplicity that many craved. It was a platform that was physically difficult to hack on, where a good amount of lockdown made sense for safety and security, and it was actually a platform for the masses – even Grandma eventually got a phone even if she never saw a use for (or could afford) a desktop computer, and that’s especially true in developing countries where the first computer most people have is a cell phone. People could use computers without ever interacting with directories or files. In essence, people thought the world of open source would be populated by individuals, but that world was supplemented with oceans of highly specialized companies and the rights and freedoms that the OSI and FSF imagined they were securing for the end user are mostly enforced by corporations instead.
The principles enshrined around open source make sense in the context from which they spring, but the computing world has changed and moved on while open source hasn’t. There’s a certain elegance in defending freedom from discrimination in licensing, but I think the middleware companies that face death or relicensing don’t feel all that “free” and certainly end users don’t feel all that empowered. End users never get source code to the SaaS applications they’re using and generally have very little transparency into how those applications work, what data those applications hoover, and where that data goes and how it’s used. For a long time, many thought the world would pivot yet again to decentralized computing, mesh networks, a backlash to storing personal information on “other people’s computers,” etc. But that hasn’t come to pass and may never come to pass except under extreme circumstances. Now more than ever there’s reason to believe we’re not going back. Without reevaluating the goals open source was meant to accomplish and writing new licenses, we may well find ourselves in a place where open source will remain pure but no one is using it anymore.
The Role of the Contributor
Many people criticized the middleware companies for relicensing despite not having an OSI or FSF-approved alternative to move to. Even with strong contributor license agreements or contributor assignment agreements that in some cases gave the companies the full right to relicense, the relicensing left a sour taste in the mouths of contributors who thought they were working on an open, collaborative project, only to find it proprietized overnight, without input or discussion. It may be fair to call contributors to corporate projects naive in thinking that the corporations were ever going to take any action that didn’t maximize shareholder value, but feelings and expectations can’t be contracted away and there is a real tension between the original vision for community that corporations give lip service to and the reality of complete corporate dominion over the open source projects they manage. These contributors were the real victims in this fight; AWS makes money off of the middleware companies, but the middleware companies make money off of unpaid contributors, and if their new strategies succeed, they may be scooping up some of those AWS customers after all. Ethically, it would have behooved the middleware companies to compensate those contributors and to share with them the newfound rewards relicensing would reap, even if it were legally and financially difficult.
An immediate solution to the “middleware problem” is the creation of source-available licenses that still allow end users and other developers to access source code, but which place limitations on how that code can be commercialized by others. The many different approaches we’ve seen to date create uncertainty and confusion because they all work differently, remain untested in the courts, and provide varying levels of information on how those approaches are supposed to work in the details and in practice. Standardization is a good first step and that’s already happening with the Polyform Project, which I’m proudly part of.
Collectively, if we can’t or won’t solve the “middleware problem” via new licenses, then we must consider other avenues. It is glaringly obvious that the “middleware problem” isn’t just one of licensing; it exists in large part because the US has taken a very lax attitude towards anticompetitive practices. It’s indisputable that we have more industry consolidation, horizontally and vertically, than we have had since Standard Oil and that for the most part, not only is no enforcement action taken against anticompetitive behavior, but mergers continue to be rubber stamped without regard for their effects on innovation generally. Perhaps more stringent enforcement of the laws as well as new laws would have yielded more than just a small handful of cloud computing companies, some of whom may have been better open source citizens, and in general may have yielded the decentralized alternatives many in the open source community have clamored for.
Perhaps we need to rethink corporate structures entirely to make room for a special open source-focused entity. There is a lot of uncharted ground to be had in co-op structures and other ways of giving contributors a stake in a project’s financial success such that if a project doesn’t succeed, then nothing is lost, but if a project is relicensed in the proprietary direction, all contributors are better off. This is undoubtedly a jarring way to think about open source for a generation of engineers who worked on open source not because of money but because of their passion for technology and their belief in openness and collaboration, but maybe this approach can fix the pain the “middleware problem” has created for contributors while other GDPR-like laws deal with user transparency and accountability.
This post is Part 2 of a two part blog post, Part 1 of which was titled “The Dark Practice of Free & Open Source Software Law.”
In April, I attended Free Software Foundation Europe’s Legal and Licensing Workshop in Barcelona, the world’s largest and probably most important gathering of open source attorneys. One of the presentations at the conference was about the GPL Cooperation Commitment and was presented by a number of companies who had entered into the Commitment. More can be read about the Commitment here, but in summary, the Commitment is as follows: a number of companies promised that if they enforce compliance with the GPL 2, they will give the non-compliant entity a 30 day cure period to come into compliance before terminating the license. In other words, they’re promising to treat GPL 2 enforcement just like the GPL 3, which has an explicit 30 day cure provision that the GPL 2 lacks.
On the face of it, this is a good step forward because it allows people to use the GPL 2 without fear of instantly and possibly irrevocably losing their license. This, the theory goes, encourages more companies to use the GPL 2, and perhaps to participate in and contribute to the open source ecosystem. It also underscores the view that enforcement should be about obtaining compliance, rather than using enforcement as a way to attack competitors or eke out monetary damages.
At first glance, I was on board with this seemingly benign and, frankly, somewhat boring “innovation.” I say that because 30 days is really not enough time for anyone to get GPL compliant given the product release cycle (a point the presenters openly conceded) and many of the companies signing on to the Commitment either don’t produce much GPL’ed code (I’m looking at you, Etsy) or have no incentive to enforce the GPL. So, to the extent the Commitment has any impact at all, that impact is fairly marginal. But then the presenters started to discuss future projects where they hoped to “document existing norms and establish new ones.” An audience member suggested that maybe there should also be a “commitment” around the idea that source code doesn’t actually have to be delivered with the GPL’ed software; it would be acceptable to simply post a link to it and point requesters to the same. Other suggestions came rolling in for other “commitments” that would essentially update or upgrade the GPL 2 to more closely match industry practice and modern technologies.
But the problem with the GPL Cooperation Commitment and other suggested “commitments” is that they make the practice of open source law even more opaque than it already is. As I wrote in Part 1 of this blog series, understanding how an open source license should be applied in any particular situation requires a multilayer analysis that goes well beyond the text of the license itself.
With respect to the GPL in particular, one would have to understand the open source movement, one would have to understand the tome which we call the GPL 2, one would have to know what parts of a software package were GPL 2 (and which parts are “GPL 2 only” v. “GPL 2 or later”) and if any exceptions applied, one would have to know what programming languages were involved and what the software architecture looked like, one would have to know whether they’re dealing in user space or kernel space and/or what connection there was between the two, one would have to read the GPL FAQ, one would have to know who the relevant copyright holder(s) are and what their intentions might be, and one would have to consult a number of other disparate sources on the GPL and its application, depending on the exact scenario they’re trying to analyze. In some cases, companies even post additional terms on their websites intended to convey their own, idiosyncratic interpretation of the GPL. Many GPL-related analyses that are widely accepted across the open source community are based on nothing more formal or authoritative than an email from Linus Torvalds.
With the GPL Cooperation Commitment, one must not only know all of the above to do a complete analysis of GPL compliance, but now they also have to look at a list of Commitment participants on a random GitHub page to understand the applicable termination provision for the package they’re analyzing. A list that, by the way, isn’t acknowledged anywhere in the package itself. You just have to know it’s out there.
And it gets worse, because a project can have multiple copyright owners capable of enforcing the GPL, some of whom have signed onto the Commitment and some of whom haven’t. So now, if the Commitment actually matters to you and might be a consideration in your decision-making (the goal of the Commitment, right?) you actually need to know all the contributors to a particular project before you really know what the applicable termination provision to the project is. This is notoriously hard since contributors don’t necessarily mark their code with their copyright notices, not all projects track their contributors, and even when they do, it may not be clear if the contribution was really by the individual listed or their employer or whether the copyright in that contribution has been assigned to someone else (e.g. one company bought another) since the contribution was memorialized. Now imagine there are multiple “commitments” like this and companies who signed onto one of them may or may not sign onto the others.
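To make the bookkeeping problem concrete, here is a minimal Python sketch of the check described above. Everything in it is an assumption for illustration: the `Holder` record, the `cure_period_status` function, and the idea that you have somehow already assembled a list of copyright holders and looked each one up on the list of Commitment participants. It only models the tallying, not the legal analysis.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Holder:
    """Hypothetical record for one copyright holder in a GPL 2 package."""
    name: str
    # True/False if we know whether they signed the Commitment, None if unknown.
    signed_commitment: Optional[bool]


def cure_period_status(holders: List[Holder]) -> str:
    """Summarize what the known holders tell us about a 30 day cure period."""
    # An empty or incomplete contributor list means we cannot say anything.
    if not holders or any(h.signed_commitment is None for h in holders):
        return "unknown"
    flags = [h.signed_commitment for h in holders]
    if all(flags):
        return "all signed"
    if not any(flags):
        return "none signed"
    # Some holders promised a cure period, others did not.
    return "mixed"
```

For a package with one signer and one non-signer, `cure_period_status` returns `"mixed"`, and a single contributor of unknown status is enough to make the answer `"unknown"`, which is the practical point: the result is only as good as the contributor list, and that list is the hard part.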
This yields a dream scenario for open source compliance attorneys, who will now bill more hours completing their work, and for open source compliance software vendors, but a nightmare for companies just trying to do their best to comply with their open source license obligations. It also makes it that much harder for new attorneys to master this field of law. It’s a move that cements incumbent advantage for existing open source attorneys and existing tech companies, profitable and sophisticated enough to deal with an additional layer of compliance complexity that can only really be dealt with by investing money and time into automated compliance systems. It reeks to me of an anticompetitive move under the guise of open source altruism.
As a community, we should be trying to make open source compliance easier, not harder. Our goal should be to make open source licenses easily understandable to the people using them. We should be writing new licenses that fix the errors of old licenses, clarify ambiguities, and clearly spell out all the “hidden” understandings that experts divulge for a fee. In particular, the fact that it was necessary to invent the “Commitment” hack is a sign that it’s time to write new licenses that can easily be upgraded over time. After all, the Commitment was necessary because Linux is under “GPL 2,” not “GPL 2 or later,” meaning that it’s impossible to move Linux to GPL 3 without getting permission from every Linux contributor to do so since presumably they made their contributions under GPL 2 as well. It’s also a sign that we should think a lot harder about the terms under which projects accept contributions (like Developer Certificates of Origin or Contributor License Agreements) and perhaps accept that promising contributors that the license of a project will never change might be detrimental to the long-term success of open source in general if it means that fundamental technologies like Linux remain under stone age licenses and force many Linux-adjacent technologies to also live under outdated licenses in order to maintain license compatibility.
These are undoubtedly large issues to tackle, but necessary ones as the open source movement matures and we need a wider perspective that encompasses a license’s entire lifecycle and the inevitable changes the future will necessitate because of new laws, new legal interpretations, and new technologies. The open source legal community should set its sights on these larger issues rather than putting its fingers in the dike. It’s not sustainable to keep using the same old licenses and letting a self-selected group of companies declare they mean something that they don’t at their whim. That’s a path for turning the GPL 2 into a religious text instead of a legal document.
This will be part 1 of a 2 part post, the second of which will be titled “The GPL Cooperation Commitment: Coincidentally, Great for Lawyers.” I promise the connection between the two posts will be revealed!
As an open source attorney to a number of tech startups, I often get asked how an attorney can learn more about this field and start practicing within it. I think it’s important to understand the shape and nature of the field, though, before getting into a detailed how-to. In my experience, this is a dark and murky area of law, where success largely depends on one’s connections.
Let’s start with a basic fact about lawyering: law school teaches very little about the actual practice of law, and that’s especially true for transactional or corporate attorneys who do not have a litigation practice, which would include most open source attorneys. When I went to Columbia from 2007 to 2010, there were no classes specifically on open source or even related fields like licensing, contract drafting, or negotiation. Sure, we could learn about various legal decisions in the areas of contracts and IP, so we had an idea of how the courts interpreted various contractual clauses and statutes, but we only ever read and interpreted one or two contracts and we learned nothing about how to actually put a client’s goals into legally binding writing. At graduation, none of us could really be trusted to write anything more complicated than a bathroom hall pass.
Most lawyers learn the day-to-day business of lawyering on the job, by taking on increasingly complex tasks that are heavily supervised and directed by more senior attorneys. In many areas of the law, this is relatively straightforward because there are many statutes, regulations, court rulings, law review articles, and even treatises available. Clients ask questions, junior attorneys look up the answers, and senior attorneys make sure the junior attorneys looked at the right things and communicated their findings clearly. There might be issues that no one has squarely addressed, in which case the attorney would have to guess as to how it might come out based on the information they have, but there’s no difficulty in figuring out what body of resources they need to examine to make that guess.
The open source world, in contrast, is relatively devoid of resources. In the US, there have only been a handful of cases related to open source and they’ve barely scratched the surface with respect to open source licensing interpretation. Outside the US, there has been more open source litigation, but that litigation has mostly taken place in Germany (and is thus limited in applicability to Germany) and has mostly focused on procedural aspects of the cases that have little bearing on open source license interpretation. With limited judicial interpretation of open source licenses, there’s not much for lawyers to write law review articles, books, or treatises on.
So what on earth do lawyers use as resources in guiding their clients? Well, lawyers can guess as to how they think a license should be complied with based on a simple reading of the license, assuming there is just one applicable license. But, given that some licenses weren’t drafted by lawyers, some licenses get fairly technical, and that most commonly used licenses are so old that they predate pivotal concepts like software patents, containers, dependency managers, SaaS, and interpreted languages, simply reading the license leaves most lawyers with more questions than answers. Open source licenses as a whole are completely incomprehensible even to licensing attorneys if they’re unfamiliar with the culture and ethos of the open source movement.
In practice, even determining how any particular piece of software is licensed is not straightforward. Some software packages aren’t marked with any license at all. There might be licensing information on the project’s homepage instead. While many software packages have a COPYING or LICENSE file indicating the package’s license, many also have additional license information in other files in the package. It’s not uncommon to see packages with 10 or more licenses and finding all of them is either a long manual effort or involves automation. Additionally, it’s not uncommon to see a package licensed one way within the package itself but for there to be either additional or conflicting terms on the project homepage or an idiosyncratic explanation with regard to how that project interprets its chosen license.
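The “automation” mentioned above can start as something as small as the following Python sketch. It walks a package directory, collects files whose names look like license files, and picks up any `SPDX-License-Identifier:` tags near the top of other files. The filename list, the 2 KB read limit, and the function name `find_license_hints` are my own assumptions for illustration; real compliance tools also do full license-text matching, which this does not attempt, and nothing here will catch terms that live only on a project homepage.

```python
import os
import re

# Filenames that conventionally carry license text (assumed, non-exhaustive).
LICENSE_NAMES = ("LICENSE", "LICENCE", "COPYING", "NOTICE")
# SPDX short-form tag that some projects place in source file headers.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([\w.+\-() ]+)")


def find_license_hints(package_dir):
    """Return (license_files, spdx_ids) found anywhere under package_dir."""
    license_files = []
    spdx_ids = set()
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            path = os.path.join(root, name)
            # Matches LICENSE, LICENSE.txt, COPYING.LESSER, and so on.
            if name.upper().startswith(LICENSE_NAMES):
                license_files.append(os.path.relpath(path, package_dir))
                continue
            try:
                # Only the first 2 KB: header tags normally sit at the top.
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    head = fh.read(2048)
            except OSError:
                continue  # unreadable file, skip it
            for match in SPDX_TAG.finditer(head):
                spdx_ids.add(match.group(1).strip())
    return sorted(license_files), sorted(spdx_ids)
```

Calling `find_license_hints("path/to/package")` returns two sorted lists: the relative paths of the likely license files and the distinct SPDX expressions found in file headers. Even on a small package the two lists often disagree, which is exactly the situation described above.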
Next, lawyers look to the authors or stewards of the license for help in interpreting it just like litigators look to congressional records for help in interpreting statutes. The GPL, Eclipse, and Apache FAQs are nearly sacred in this world. Sometimes this may involve more digging; going through old email threads, blog posts, and bulletin board posts to dig into the history of how the license was drafted and why. Some of this is available online, but a lot of this is oral history, passed down from one lawyer to another. It can be hard to know where to look or who to ask, so it’s a field where knowing others with this expertise is critical. Without understanding the historical context for how a license was born, what its intended function was, and what, if any, industry consensus there is on how a license should be applied in a particular case, it can be nearly impossible to make heads or tails of what a license written many years or even decades ago might mean in 2019.
Ultimately, lawyers advise their clients based on a risk analysis: if we don’t get this right, what is the likelihood we will be sued and by whom? Here, lawyers start looking at who the entities enforcing open source licenses are, what types of non-compliance they are targeting, who they are targeting, and what their goals and motivations are. Some of this information has been memorialized in various writings, but knowing who is doing what today and what they may want to do in the future requires being plugged into the open source legal community and keeping abreast of various organizational politics within these entities. Some of these entities are license stewards or are formally organized as open source compliance organizations, but some of these entities are commercial players motivated by profits and the competitive landscape, and others are just trolls.
It’s also important for an open source lawyer to have some technical expertise. Understanding software architecture, the differences between various programming languages, and various concepts like user space v. kernel space, virtual appliances, APIs, sockets, linking, calling, etc. are all vital to giving clients good guidance. This information is easily understood by lawyers who are also engineers, but lawyers who aren’t engineers need to acquire this knowledge from disparate sources including engineers, other lawyers, and various websites and books. Because the software industry innovates so quickly, there’s no one resource for this information. Given the scale of open source software consumption today, an attorney’s technical expertise also has to extend to familiarity with a number of open source compliance tools, how to properly implement and configure them, and how to create practical workflows with them.
You’ll notice that the common theme here is more or less “you gotta know a guy/gal.” Should be easy enough… but the willingness to exchange legal thought around open source licenses is, perversely, inversely related to the number of court cases in this area. With very little substantive interpretation of open source licenses by the courts, lawyers are extremely reticent to share their understanding of open source licenses with each other. No one wants to reveal an interpretation that another attorney might dispute, and in so doing admit that what their client is doing fails to meet someone else’s understanding of what an open source license requires. That might lead to unwanted attention and litigation.
This leads us to the Free Software Foundation Europe’s Legal Network mailing list. This is probably the definitive mailing list for open source attorneys and the mailing list as a whole is invited to attend an annual conference in Europe called the Legal and Licensing Workshop, which is only open to list subscribers. This is the single largest gathering of open source attorneys in the world and it’s probably fair to say virtually all serious open source legal practitioners try to attend this conference. How does any US attorney know about this list? You gotta know a guy/gal. How does one get approved to join the list? Again, you gotta know a guy/gal. Someone somewhere has to vouch for you. The list itself can be a valuable resource, but also requires tricky maneuvering: most of the list consists of in-house attorneys who lurk but dare not speak (see above) and some folks on the list are from open source license enforcement organizations. It’s great for outside counsel like me who can ask questions without necessarily divulging on whose behalf I’m asking them, but it’s a “look, don’t touch” arrangement for in-house counsel who might accidentally reveal they’ve been doing it all wrong all along – directly to the people most likely to instigate an enforcement action.
In short, becoming an open source attorney is less about learning the law in a particular field and more about being plugged into a particular community. To give clients the best guidance you can, you have to understand copyright law and contract law (and patents, trademarks to an extent, etc.), be familiar with dozens of licenses, understand their history, understand a good bit about programming and computers, know who the license stewards are and how they interpret their licenses, understand the enforcement players, have a grasp of the competitive landscape, and learn about various compliance automation tools. This is truly a multidisciplinary field in which resources are scattered and hard to come by and where every analysis contains multiple layers.
In my opinion, becoming a real expert in the field takes years of working with a number of different technologies and licenses and getting to know others in the open source world personally or at least through their writing. It also helps to have experience working on both primarily proprietary and primarily open source products. Unsurprisingly, there are relatively few open source legal experts in the world – the last Legal and Licensing Workshop in Barcelona had fewer than 200 attendees. Breaking into this field is just plain hard and strongly depends on having the right connections.