Identify patent families based on their similarity
Similar Family Searching
What is Similar family search for patent data?
Similar family search is a method for identifying patent families, or related patents, based on their similarity to one another. The comparison may take into account technical specifications, patent claims, or other features.
A patent family includes all of the patents and patent applications that share the same priority application, which means they are related to the same invention. These related patents may be filed in different countries, but they share the same priority date, which is the date of the first patent application filed for that invention.
The purpose of this type of search is to identify other patents that may be relevant to the same technology or invention, which can be useful for patent analysis, patent landscaping, and competitive intelligence.
IMAGE 1 – PLACEHOLDER
Cipher uses a proprietary patent linguistic algorithm that has been tried and tested over the past 2 years across our Universal Technology Taxonomy (“UTT”) classification. It is more advanced than most other systems on the market and will therefore typically provide better results than other similarity search tools.
How Cipher Similar Family Searching works
Frequently Asked Questions
Why has Cipher’s ML suggested these results?
Similarity searching starts with vectorising every patent family in the universe (think of this like giving each patent family a unique fingerprint).
Each patent family can then have its vector (fingerprint) compared against the others to identify the vectors (patent families) that are closest to it. The closest results are returned based on the chosen sample size (50, 100, 1000, etc.).
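The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say which distance measure Cipher uses, so cosine similarity is assumed here, and the family IDs, the three-dimensional toy vectors, and the function names `cosine_similarity` and `most_similar` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (an assumed closeness measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def most_similar(source_id, vectors, sample_size):
    """Return the `sample_size` families whose vectors are closest to the source."""
    source = vectors[source_id]
    scored = [
        (family_id, cosine_similarity(source, vec))
        for family_id, vec in vectors.items()
        if family_id != source_id  # do not return the source family itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:sample_size]


# Hypothetical toy "fingerprints"; real vectors would have many more dimensions.
vectors = {
    "FAM-A": [1.0, 0.0, 0.0],
    "FAM-B": [0.9, 0.1, 0.0],
    "FAM-C": [0.0, 1.0, 0.0],
    "FAM-D": [0.7, 0.7, 0.0],
}

results = most_similar("FAM-A", vectors, sample_size=2)
for family_id, score in results:
    print(family_id, round(score, 3))
```

With these toy vectors, FAM-B and FAM-D are returned ahead of FAM-C because their vectors point in a direction closer to FAM-A's. A production system comparing every patent family would likely use an approximate nearest-neighbour index rather than this exhaustive loop.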
What parts of the patent are considered when finding a match?
Cipher’s deep learning model (“algorithm”) is specifically designed for patent linguistic tasks and uses the patent title, abstract and claims to generate a vector for each individual patent family. The process is broadly similar to the way the ChatGPT model operates.
What makes Cipher similarity searching better than competing systems?
The processing power, cost, and time required to vectorise all patents are significant, and are a barrier for suppliers looking to implement a robust similarity searching system. Cipher already uses its in-house vectorisation to classify every patent family in the world for our UTT, so the testing and heavy lifting have already been done.
Competitor differentiation & constraints
The cost to vectorise all patents is significant, and is typically a barrier for suppliers looking to implement a robust similarity searching system. Cipher has the advantage of having done much of this vectorisation for UTT, so the incremental workload is substantially lower than it would be for a new entrant developing this.
This algorithmic approach provides greater accuracy than the semantic search driven approaches that many competitor systems use.
Semantic search vs similar family search IMAGE
Where similarity searching is used with multiple source patents, these patents must be related, because the algorithm assumes you are providing it with patents about a similar topic. For example, imagine looking for “portable solar powered devices.” You can provide one solar powered fan example and one solar powered pump example, and it will “figure out” that the solar powered aspect (the overlap) is what matters.
The algorithm may struggle if you give it a solar powered fan and a lidar sensor patent example, because there is little or no overlap between them.
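One way to picture why overlap matters is the sketch below. It is not Cipher's method: the source does not describe how multiple source patents are combined, so simple component-wise averaging is assumed, cosine similarity is assumed as the closeness measure, and the four toy dimensions (solar power, fan, pump, lidar) and the helper names `combine_sources` and `min_pairwise_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (an assumed closeness measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combine_sources(source_vectors):
    """Average the source vectors component-wise (an assumed combination rule)."""
    count = len(source_vectors)
    return [sum(component) / count for component in zip(*source_vectors)]


def min_pairwise_similarity(source_vectors):
    """Lowest similarity between any two sources; a low value signals little overlap."""
    lowest = 1.0
    for i in range(len(source_vectors)):
        for j in range(i + 1, len(source_vectors)):
            score = cosine_similarity(source_vectors[i], source_vectors[j])
            lowest = min(lowest, score)
    return lowest


# Hypothetical toy dimensions: [solar power, fan, pump, lidar]
solar_fan = [1.0, 1.0, 0.0, 0.0]
solar_pump = [1.0, 0.0, 1.0, 0.0]
lidar_sensor = [0.0, 0.0, 0.0, 1.0]

# Related sources: the shared "solar power" dimension dominates the combined query.
related_query = combine_sources([solar_fan, solar_pump])
related_overlap = min_pairwise_similarity([solar_fan, solar_pump])

# Unrelated sources: nothing is shared, so the combined query has no clear focus.
unrelated_query = combine_sources([solar_fan, lidar_sensor])
unrelated_overlap = min_pairwise_similarity([solar_fan, lidar_sensor])

print(related_query, related_overlap)
print(unrelated_query, unrelated_overlap)
```

In the related case the averaged query weights the shared solar dimension twice as heavily as either the fan or the pump dimension, which mirrors the "figure out the overlap" behaviour described above. In the unrelated case no dimension stands out, which is one plausible reason results would degrade.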
How do we give clients confidence it works?
Cipher uses a sophisticated algorithm focused on patent linguistics that has been tried and tested with our Universal Technology Taxonomy classification. It is more advanced than most other systems on the market and will therefore provide results that are at least as good as, and more often better than, those of competitors.
The mathematics behind the comparison is relatively simple; the skill lies in our proprietary algorithm for vectorising patent families.