License Hygiene for Patent-Sensitive AI Projects

> Disclaimer: I am not a lawyer. Nothing in this post is legal advice. The analysis below describes what license texts actually say - their plain language and the patent implications that follow from it. For any real IP decision, consult a qualified patent attorney.


The Conflict Nobody Checks Until It’s Too Late

Most engineers think about open-source licenses as a shipping question: can I include this in a commercial product? Can I keep my modifications private? The GPL/proprietary boundary gets most of the attention, and most teams have at least a rough policy about it.

The patent question is different - and almost nobody asks it before it matters.

The question is: does using this code constrain my ability to patent my own work, or to assert patents defensively later? This is not the same question as “can I ship it.” A codebase can be perfectly legal to ship under a commercial exception or dual-license arrangement while simultaneously creating a patent retaliation trap that activates the first time you file.

The stakes are concrete. GPL v3 code in your dependency tree means that asserting a patent against someone who uses the same GPL v3-covered functionality could terminate your own license to that code. You go from plaintiff to infringer. The mechanism is a patent retaliation clause - text that deliberately makes patent assertion legally expensive for anyone incorporating the protected code.

This sounds abstract until a patent attorney goes through your requirements.txt or your package-lock.json and starts asking which licenses are in the dependency tree. At that point you’re doing archaeology on decisions made two years ago by an engineer who was thinking about getting the code to run, not about what happens if a competitor ships a similar feature in 2027.

The check takes 30 seconds per dependency. Do it before the import statement.
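
For a Python project, one way to make that check routine is to read the license each installed package declares in its own metadata. The sketch below uses the standard library's `importlib.metadata`; the helper names `declared_license` and `scan_installed` are illustrative, not part of any tool. Declared metadata can be missing or wrong, so treat the output as a starting point and confirm against the package's LICENSE file.

```python
from email.message import Message
from importlib import metadata


def declared_license(meta: Message) -> str:
    """Best-effort license string from a package's core metadata."""
    # Newer packages may carry an SPDX expression directly.
    expr = meta.get("License-Expression")
    if expr:
        return expr
    # Otherwise fall back to trove classifiers such as
    # "License :: OSI Approved :: MIT License".
    classifiers = [
        c for c in (meta.get_all("Classifier") or [])
        if c.startswith("License ::")
    ]
    if classifiers:
        return "; ".join(c.split(" :: ")[-1] for c in classifiers)
    # The free-text License field sometimes holds the full license text,
    # so keep only its first line.
    raw = (meta.get("License") or "").strip()
    return raw.splitlines()[0] if raw else "UNKNOWN"


def scan_installed() -> dict:
    """Map each installed distribution name to its declared license."""
    return {
        dist.metadata.get("Name", "?"): declared_license(dist.metadata)
        for dist in metadata.distributions()
    }
```

Calling `scan_installed()` inside the project's virtual environment returns a dictionary of package names to declared licenses. Anything reported as `UNKNOWN` needs a manual look.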


The Core Mechanism: GPL v3 Sections 10 and 11

GPL v3’s patent retaliation works through two sections in combination. Section 10 imposes a condition on every licensee:

“you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.”

Violating this condition triggers Section 8, which provides for automatic termination of all rights under the license - including patent licenses granted under Section 11. Section 11 itself grants each contributor’s essential patent claims to every downstream user; Section 8 revokes those grants when the licensee violates Section 10 by initiating patent litigation against the program.

Unpack the sequence this creates:

  1. You incorporate GPL v3 code into your system - a library, a component, even a small utility function you copied and adapted.
  2. A competitor ships a product that you believe infringes a patent you hold on new functionality you developed.
  3. You file suit for patent infringement, alleging infringement of the program or a portion of it.
  4. Filing that suit violates the Section 10 condition.
  5. Under Section 8, the violation automatically terminates your rights under the license, including the patent licenses granted under Section 11.
  6. You are then distributing GPL v3 code without a license, which is its own infringement.

Step 3 is where it gets uncomfortable, because “any patent claim… infringed by making, using, selling… the Program or any portion of it” is not a precisely bounded concept. Patent claims are written broadly. The GPL code’s contribution to your system may be peripheral to your core invention. That doesn’t automatically insulate you - what matters is whether the patent you’re asserting can be characterized as one infringed by making or using the GPL v3-covered work. If opposing counsel can draw that line, the retaliation sequence fires.

AGPL v3 carries identical Section 10 and Section 11 language, with the additional reach of Section 13: providing the software as a network service - without distributing the binary - triggers the same source-disclosure obligations as distribution. For patent strategy, this means the full retaliation exposure applies and the network-use clause amplifies the surface area. If you’re building an AI inference service on an AGPL-licensed library, you are in the AGPL’s reach whether or not you ship a binary.

GPL v2 has no equivalent patent retaliation mechanism. Section 6 prohibits imposing “further restrictions” on recipients of GPL’d code, which can complicate certain downstream patent assertion strategies, but it does not contain the automatic license-termination trigger introduced in v3. This is a meaningful distinction - v2 and v3 are not equivalent on patent risk.


License Compatibility Table

| License | Explicit Patent Grant | Patent Retaliation | Safe for Patent Filer | Notes |
|---|---|---|---|---|
| MIT | No | No | Yes | No patent grant and no retaliation. Patent-neutral. |
| BSD-2-Clause | No | No | Yes | Same posture as MIT. |
| BSD-3-Clause | No | No | Yes | Same posture as MIT. |
| Apache-2.0 | Yes (§3) | Yes (§3) - patent licenses only | Yes* | Retaliation terminates the patent licenses granted to you for that work, not your underlying patent rights. See below. |
| GPL v2 | No | No | Yes (with caveat) | §6 “further restrictions” clause can complicate patent assertion strategies, but there is no automatic retaliation. |
| GPL v3 | Yes (§11) | Yes (§10 + §8) - license terminates | Risky | §10 prohibits patent litigation against the program; §8 auto-terminates all rights, including §11 patent grants, on violation. |
| AGPL v3 | Yes (§11) | Yes (§10 + §8) - license terminates | Risky | Same as GPL v3. §13 extends copyleft to network use: deployed inference services are covered. |
| LGPL v2.1 | No | No | Yes | No patent retaliation. Weaker copyleft than GPL. |
| LGPL v3 | Yes (§11) | Yes (§10 + §8) - license terminates | Risky | Incorporates GPL v3 by reference. Same §10/§8/§11 retaliation framework applies. |

Notes:

* Apache-2.0 retaliation terminates the patent licenses granted to you for the Apache-covered work, not your patents themselves. You lose the contributors’ patent grant for that work, but you do not lose your patent. This is materially less dangerous than GPL v3 retaliation, which leaves you distributing unlicensed GPL code - a position with its own significant legal exposure.

The GPL v2 caveat: §6’s prohibition on “further restrictions” means you cannot impose patent licensing terms on downstream recipients that go beyond what the GPL itself requires. In practice this constrains certain defensive patent strategies for code you release under GPL v2, but it does not create the automatic termination trigger that makes GPL v3 dangerous for patent filers.

LGPL v3 inherits the GPL v3 patent framework. The Lesser GPL is “lesser” in the sense of weaker copyleft (you can link against it without your application becoming GPL’d), but it is not lesser on patent retaliation. If you’re using LGPL v3 code and the library’s functionality is adjacent to your patent claims, the same exposure exists.
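
The table translates naturally into a lookup that a dependency check can consult. This is a minimal sketch: the category names (`safe`, `safe-with-caveat`, `risky`, `unreviewed`) and the function name `patent_posture` are illustrative, and folding the `-or-later` variants into the base version is a simplifying assumption, not a legal conclusion.

```python
# Patent posture per license, transcribed from the table above.
# Keys are SPDX base identifiers; the category names are illustrative.
POSTURE = {
    "MIT": "safe",
    "BSD-2-Clause": "safe",
    "BSD-3-Clause": "safe",
    "Apache-2.0": "safe",          # retaliation exists but is limited (see notes)
    "GPL-2.0": "safe-with-caveat",
    "LGPL-2.1": "safe",
    "GPL-3.0": "risky",
    "AGPL-3.0": "risky",
    "LGPL-3.0": "risky",
}


def patent_posture(spdx_id: str) -> str:
    """Classify an SPDX identifier; unknown licenses need manual review."""
    base = spdx_id.strip()
    # Assumption: treat "-only", "-or-later" and the legacy "+" suffix
    # the same as the base version for this first-pass triage.
    for suffix in ("-only", "-or-later", "+"):
        if base.endswith(suffix):
            base = base[: -len(suffix)]
            break
    return POSTURE.get(base, "unreviewed")
```

Anything that comes back `unreviewed` is outside the table and should be read by a person rather than defaulted to either bucket.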


Apache-2.0: The Patent Grant That Works in Your Favor

Apache-2.0’s Section 3 reads, in relevant part:

“Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable… patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work.”

This is an explicit, affirmative patent grant from every contributor to every user. For users of Apache-licensed code, this is a protection: contributors cannot later sue you for using the code to practice their patents, because they granted you a license when they contributed.

For the code author, releasing under Apache-2.0 means you are granting a patent license for patents your implementation practices. That is the price of the Apache grant - it cuts both ways. If your new method is embodied in Apache-licensed code you publish, you have granted a royalty-free patent license to everyone who uses that code. This is worth knowing before you push an Apache-2.0 release of code that implements something you intend to patent.

The retaliation clause in Apache-2.0 §3 says: if you institute patent litigation alleging that the Work, or a Contribution incorporated within it, infringes a patent, any patent licenses granted to you under the license for that Work terminate as of the date the litigation is filed. Note what this does and does not do. It terminates the contributors’ patent grants to you for that project. It does not revoke your patents, and the clause speaks only to patent licenses. You can still enforce those patents against other code. You lose the patent cover for the Apache-licensed project you sued over, which makes continuing to use or distribute it risky. This is a deterrent, not a trap of the same severity as GPL v3.

This asymmetry is why Apache-2.0 appears in the “safe” column above despite having a retaliation clause. The practical effect of losing your Apache patent grant is annoying but manageable. The practical effect of losing your GPL v3 license mid-distribution is being an infringer.


What “Using” GPL Code Actually Does to Your Patent Position

The danger is not theoretical and it is not confined to cases where you copied code wholesale. Three scenarios illustrate the gradient:

Scenario A: Peripheral dependency, non-overlapping claims. You use a GPL v3 data-loading utility that handles file parsing. Your patent covers a new optimization algorithm applied to the parsed data. The GPL code touches none of the claimed functionality. You assert the patent. No issue - you are not alleging that making or using the GPL program, or any portion of it, infringes your patent. The retaliation clause looks for exactly that allegation; if you’re not making it, the clause does not fire.

Scenario B: Adjacent functionality. You use a GPL v3 library that implements an approximation method. Your patent claims a system that uses approximation methods to achieve a result. You assert the patent against a competitor. Their counsel argues that your patent’s claimed functionality depends on or is intertwined with the GPL library’s behavior. Whether that argument succeeds is a legal question, but you are then paying attorneys to litigate it. “Adjacent” is determined by how your patent claims are written and how skilled opposing counsel is. This is the dangerous scenario because the risk is fuzzy and front-loaded: you cannot fully know at filing time whether a court will find the connection.

Scenario C: Incorporated and modified. You copied GPL v3 code into your codebase, modified it, and ship it as part of your system. Your patent claims cover a combined system that includes what the GPL code does. This is almost certainly a problem. You have incorporated the work; your claims arguably allege that the combined system (which is the work) does something covered by your patent.

Scenario B is the operationally important one. The safe operating rule is simple: no GPL v3, AGPL v3, or LGPL v3 code in any codebase where you intend to file or assert patents covering the same functional area. “Same functional area” should be interpreted conservatively - patent claims are written to be broad.


Mitigations When You Have Already Adopted GPL Code

Discovering a GPL v3 dependency late is not a death sentence, but it requires deliberate action.

Re-implementation. Re-implement the GPL’d functionality from scratch, using only the interface contract (what it does) and not the implementation (how it does it). This is a clean-room approach. The critical discipline: document that the re-implementation was done without reference to the GPL source. Commit messages, design docs, and code review records that show you wrote it independently matter later. The replacement has to actually be independent - a refactoring of code you internalized by studying the GPL version is not a clean-room implementation.

Network isolation. Wrap the GPL component as a separately-deployed service and call it over HTTP or another protocol. Your main system does not link to GPL code; it calls an API endpoint. GPL v3’s retaliation language targets patent claims against “the Program or any portion of it” - a service you call over the network is not incorporated into your system in the traditional sense under GPL v3 (though be careful: AGPL v3 §13 specifically addresses this pattern, extending copyleft to network use). For non-AGPL GPL v3 libraries, network isolation is often a workable mitigation. Confirm with counsel before relying on it.
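
Architecturally, the boundary looks like the sketch below: the main codebase imports nothing from the GPL library and only exchanges JSON with a separately deployed process. The URL, the `/approximate` endpoint, and the payload shape are all hypothetical. The code shows only the technical separation; whether that separation is legally sufficient is the question for counsel.

```python
import json
from urllib import request

# Hypothetical address of the separately deployed wrapper service.
SERVICE_URL = "http://localhost:8080/approximate"


def build_request(values, url=SERVICE_URL):
    """Build a JSON POST for the wrapper service (payload shape is assumed)."""
    body = json.dumps({"values": values}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_service(values, url=SERVICE_URL, timeout=5.0):
    """Send the request and decode the JSON reply from the service."""
    with request.urlopen(build_request(values, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the boundary easy to unit-test without the GPL service running.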

Find the license-compatible alternative. For most purposes in AI/ML work, there is a permissively-licensed alternative to any GPL library you might reach for. The river library (online machine learning, BSD-3-Clause) covers drift detection and incremental learning without GPL exposure. NumPy, PyTorch, scikit-learn, and the broader scientific Python stack are BSD-licensed. Hugging Face Transformers is Apache-2.0. If you find yourself about to use a GPL library, spend 20 minutes checking whether an MIT or Apache-2.0 alternative exists with acceptable feature parity.

Abandonment. If the GPL component is peripheral - a convenience utility, a data format parser, a visualization tool that does not touch your core system - dropping it may be the cleanest path. Carry the technical debt of removing it now rather than the legal debt of explaining it to a patent attorney three years from now.


Pre-Adoption Checklist

Before adding any dependency to a patent-sensitive codebase, run this check:

  • Identify the license. Check the package’s LICENSE file or look it up by SPDX identifier. This takes 30 seconds.
  • If the license is GPL v3, AGPL v3, or LGPL v3, ask whether the component touches any functional area adjacent to patent claims you hold or intend to file. If it does, use a permissively-licensed alternative instead, and document the decision and the alternative you selected. If it does not touch patented functionality, document that judgment explicitly, explain why, and revisit it if your claims are later broadened.
  • If the license is MIT, BSD, Apache-2.0, or ISC, you are in safe territory on patent retaliation. For Apache-2.0, carry one additional note: by contributing your own code to an Apache-licensed project, you are granting a patent license on any patents your implementation practices. Know this before pushing an Apache-2.0 release of code covering an invention you intend to protect.
  • Record the decision in a license ledger: package name, version, SPDX identifier, the date you checked, and a one-line rationale for why you included or excluded it.

The license ledger does not have to be elaborate - a markdown table in the repo is sufficient. What matters is that the decision was made deliberately and documented before adoption, not reconstructed after the fact.
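
The checklist and ledger can be sketched as a small helper that records each decision as one markdown table row. Everything here is illustrative: the names `LedgerEntry` and `review`, the decision labels, and the rationale wording are assumptions, and the `touches_claims` judgment still has to come from a person.

```python
from dataclasses import dataclass
from datetime import date

# License sets taken from the checklist; SPDX spellings are assumed.
COPYLEFT_V3 = {
    "GPL-3.0-only", "GPL-3.0-or-later",
    "AGPL-3.0-only", "AGPL-3.0-or-later",
    "LGPL-3.0-only", "LGPL-3.0-or-later",
}
PERMISSIVE = {"MIT", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0", "ISC"}


@dataclass
class LedgerEntry:
    package: str
    version: str
    spdx: str
    checked_on: date
    decision: str
    rationale: str

    def to_markdown_row(self) -> str:
        """Render the entry as one row of a markdown table."""
        return (
            f"| {self.package} | {self.version} | {self.spdx} | "
            f"{self.checked_on.isoformat()} | {self.decision} | {self.rationale} |"
        )


def review(package, version, spdx, touches_claims, checked_on):
    """Apply the checklist; touches_claims is a human judgment call."""
    if spdx in COPYLEFT_V3:
        if touches_claims:
            decision = "exclude"
            why = "v3 copyleft adjacent to patent claims; use a permissive alternative"
        else:
            decision = "include"
            why = "v3 copyleft outside claimed functionality; revisit if claims broaden"
    elif spdx in PERMISSIVE:
        decision = "include"
        why = "permissive license; safe territory on patent retaliation"
    else:
        decision = "escalate"
        why = "license not covered by the checklist; needs manual review"
    return LedgerEntry(package, version, spdx, checked_on, decision, why)
```

A ledger file would start with a header row such as `| Package | Version | SPDX | Checked | Decision | Rationale |` and append one `to_markdown_row()` line per reviewed dependency.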


The “Re-implement It Later” Problem

You can always remove GPL code later and replace it with a clean implementation. Courts have accepted this. But the record matters.

If your commit history shows 18 months of using a GPL v3 library, followed by a replacement committed 6 weeks before your patent filing, opposing counsel will ask two questions. First: was the replacement a genuine clean-room implementation, or a refactoring of code you deeply understood through the GPL version? Second: did you begin the patent drafting process while the GPL code was still in the codebase?

These questions do not have automatic answers. They are questions of fact that require your testimony, your commit history, your design documents, and your communications with your patent attorney as evidence. “We replaced it before filing” is a better starting position than “we never replaced it,” but it is not a free pass.

Start with the right dependency. The cost of a 20-minute license review before writing pip install is near zero. The cost of reconstructing a clean-room development history under adversarial scrutiny is not.


Closing

The practical good news for AI/ML engineers is that the permissive-license ecosystem covers nearly everything you would actually use. PyTorch is BSD-3-Clause. TensorFlow is Apache-2.0. NumPy, SciPy, scikit-learn, pandas - BSD-3-Clause. Hugging Face Transformers - Apache-2.0. The river library for online learning - BSD-3-Clause. The overwhelming majority of modern AI infrastructure was intentionally licensed to be enterprise- and patent-safe.

The GPL libraries you encounter in an AI codebase are usually legacy data tools, certain optimization libraries, or specialized components that predate the Apache/MIT-everything norm. They have permissively-licensed alternatives in almost every case. The work of finding those alternatives is small compared to the work of defending the decision not to.

You do not have to avoid open source. You have to know which open source. The license is in the repository. Read it before the import statement is in your codebase. Run the pre-adoption checklist above for every dependency that touches a functional area you intend to patent. That is the complete workflow. Do it before the import, not before the filing.