All the Patents

Why?

If a GPT-2-sized model can generate a valid set of claims, should anyone be able to monopolize the invention?

At their heart, patents are a temporary, sanctioned monopoly on an invention, enforced through the right to sue. This monopoly is justified by the public good of encouraging innovation and by the long-term benefit of that innovation eventually passing into the public domain.

Unfortunately, this worthy policy goal has been lost in the chaos and misuse of the patent system.

One of the most common sources of frustration is the granting of "obvious" patents. While some inventions are clearly novel and non-obvious, many are not - but they still slip through the examination process. These obvious-but-granted patents then loom over the market, creating a "thicket" that discourages use of, or subsequent invention in, the area covered by the patent. "Undoing" the grant of a patent is a costly and time-consuming process with possible negative consequences, and so many of these patents simply sit on the books as prior art, even if the patent holder knows they could never enforce them.

Congress and various stakeholders have discussed and proposed changes over time, including most recently the America Invents Act (AIA), but the problem of obvious patents persists.

But what if someone were to generate all the obvious inventions and make them public?

What if we shared the means of producing these obvious inventions so that everyone could help generate them on a normal CPU or consumer GPU?

And what if we could then make those obvious inventions easily searchable for anyone, including PTO examiners themselves, to use?

How it Works

We start with a small, GPT-2-sized language model - kl3m-170m - which was trained on a clean, copyright-free dataset. This helps ensure that generations do not include copyrighted text, which would allow third parties to interfere with the project via DMCA takedown requests.

Next, we fine-tune this model on two simultaneous tasks:

  1. Top-down drafting: We start from the most abstract parts of the patent - the title and abstract - and then generate the detailed claims. This is a traditional next-token prediction order.
# Patent

## Title
{title}

## Abstract
{abstract}

## Claims

1. {claim 1}

2. {claim 2}

...
  2. Bottom-up: We start from the most detailed part of the patent - the claims - and then generate the abstract and title. This reversed order can be thought of as similar to traditional extractive/abstractive summarization tasks.
# Patent

## Claims

1. {claim 1}

2. {claim 2}

...

## Abstract
{abstract}

## Title
{title}
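The two templates above can be produced mechanically from parsed patent records. As a rough illustration (the helper functions and record fields below are assumptions for this sketch, not the project's actual pipeline code), building one training example in each order might look like this in Python:

# Illustrative only: build training strings in both orders from a parsed
# patent record. These helpers and their inputs are hypothetical.

def format_top_down(title, abstract, claims):
    """Title and abstract first, claims last (standard next-token order)."""
    claim_text = "\n\n".join(f"{i}. {c}" for i, c in enumerate(claims, start=1))
    return (
        "# Patent\n\n"
        f"## Title\n{title}\n\n"
        f"## Abstract\n{abstract}\n\n"
        f"## Claims\n\n{claim_text}"
    )


def format_bottom_up(title, abstract, claims):
    """Claims first, then abstract and title (summarization-like order)."""
    claim_text = "\n\n".join(f"{i}. {c}" for i, c in enumerate(claims, start=1))
    return (
        "# Patent\n\n"
        f"## Claims\n\n{claim_text}\n\n"
        f"## Abstract\n{abstract}\n\n"
        f"## Title\n{title}"
    )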

Once this fine-tuning is complete, we can then generate new patents using either technique by prompting the model as follows:

  1. Top-down prompt: "# Patent\n\n## Title"

  2. Bottom-up prompt: "# Patent\n\n## Claims"

It's critical that generation occurs with sufficient randomness and diversity to ensure that the generated patents are not simply reproductions of the training data. This is a key area of ongoing research and development.
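As a concrete but purely illustrative sketch, sampled generation with either prompt might look like the following using Hugging Face transformers. The model path and the sampling values here are assumptions for demonstration, not the project's tuned settings:

# Illustrative sketch of sampled generation with Hugging Face transformers.
# The model path and sampling values are placeholders/assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "kl3m-170m"  # placeholder: substitute the actual model repository or local path

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

prompt = "# Patent\n\n## Claims"  # bottom-up; use "# Patent\n\n## Title" for top-down
inputs = tokenizer(prompt, return_tensors="pt")

# Sampling (rather than greedy decoding) with temperature/top_p keeps the
# outputs diverse instead of collapsing onto memorized training text.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,
    top_p=0.95,
    max_new_tokens=512,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))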

Much like the real process of invention, most of the "ideas" generated by this process will be either nonsense or otherwise unpatentable. Our goal is to estimate the "hit rate" of the model and to keep improving the efficiency and accessibility of the generation process so that the "cost per obvious invention" is as low as possible.
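As a purely hypothetical illustration of that arithmetic: if one generation in a thousand produced a plausible claim set and each generation cost a tenth of a cent in compute, the cost per obvious invention would be roughly one dollar - and it falls in direct proportion to improvements in either number.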

Current Status

This project is still in its infancy. We're doing R&D to develop prototype tools to demonstrate the possibility and cost of generating and sharing these obvious inventions. This R&D is currently focused on data collection, data curation, model training, and model evaluation.

Source and Model

The repository currently contains the following:

  • atp.py: a simple Streamlit app that automates the download of the model and allows for generation
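For orientation, a minimal Streamlit app along these lines could look like the sketch below. This is not the actual atp.py - the model path is a placeholder and the interface is simplified:

# Minimal sketch of a Streamlit generation app along these lines; this is
# not the actual atp.py, and the model path is a placeholder.
import streamlit as st
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "kl3m-170m"  # placeholder: substitute the actual model repository or local path

@st.cache_resource
def load_model():
    # Download (and cache) the tokenizer and model on first use.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    return tokenizer, model

st.title("All the Patents - generation demo")
mode = st.radio("Drafting order", ["Top-down", "Bottom-up"])
prompt = "# Patent\n\n## Title" if mode == "Top-down" else "# Patent\n\n## Claims"

if st.button("Generate"):
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, do_sample=True, top_p=0.95, max_new_tokens=512)
    st.text(tokenizer.decode(outputs[0], skip_special_tokens=True))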

Pending cleanup and commit:

  • training code to continue pretraining or fine-tuning
  • data collection and curation code for the USPTO Full-Text Database

Requirements:

  • kl3m-170m has extremely low resource requirements. It can be run on a GPU with approximately 1 GB of VRAM, on a CPU, or on Apple Silicon.
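For reference, device selection with PyTorch might look like the following sketch (an assumption about the runtime stack; any of these devices should handle a model this small):

# Illustrative device selection with PyTorch: prefer CUDA, then Apple
# Silicon (MPS), then fall back to CPU.
import torch

if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Moving the model and the tokenized inputs to this device is enough
# for generation with a model this small.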

Related Material
