Uniting Patent Data Sources with BigQuery
May 2018
A lot of data – from a lot of sources
© 2017
• R&D Data – Lab Results
• Trials & Test Results
• Sales & Market Data
• Private Corporate Data
• Private Vendor Data
• Public Data Providers
• Collaborators & Partners
Patent Data on BigQuery
© 2018
Not really “search” – instead, SQL Query / Join
Free Database – Google/IFI Public Patents
Free Database – EBI Data
Free Database – USPTO PAIR/PEDS
Paid Table – IFI Data Enrichments & others
Private Personal Data
Private Corporate Data
Google BigQuery
Reports based on Connecting Data from Multiple Tables Together
What is BigQuery?
• Enterprise Cloud Data Warehouse
  • BigQuery is Google's fully managed, petabyte-scale, low-cost enterprise data warehouse.
  • Low cost – but not free.
• A powerful Big Data analytics platform
  • Analyze large datasets to find meaningful insights using familiar SQL
  • Join public, private, free and paid datasets – including Patent Data
Example – Full Text Search
IFI Global Patent Database
IFI CLAIMS Direct
“VEGF receptor kinase inhibitor”
(vascular endothelial growth factor)
2,294 Results
Assign a “Relevance Score” and load into BigQuery as a private table.
VEGF_Receptor.LCPatents – private table, ordered by my private relevance field
LCPatents (private)
lc_number            lc_score
US-20130053409-A1    112
WO-2009053737-A2     116
US-20030144298-A1    126
US-20030055006-A1    130
JP-2002536414-A      132
CN-103702990-A       137
WO-2008078091-A1     142
EP-2269603-A1        146
EP-2783686-B1        146
US-8410131-B2        146
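The step "assign a relevance score and load into BigQuery as a private table" needs a file with two columns first. The following is a minimal sketch of preparing that file, assuming the scored results are exported as CSV; the helper name `to_csv` is hypothetical, and the three rows are copied from the LCPatents table above. The resulting file could then be uploaded as a private table through the BigQuery console or a load job.

```python
import csv
import io

# Publication numbers and scores copied from the LCPatents table above.
SCORED_RESULTS = [
    ("US-20130053409-A1", 112),
    ("WO-2009053737-A2", 116),
    ("US-20030144298-A1", 126),
]


def to_csv(rows):
    """Return CSV text with the two LCPatents columns: lc_number, lc_score."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["lc_number", "lc_score"])
    # Highest score first, so the file reads like a ranked list.
    for number, score in sorted(rows, key=lambda r: r[1], reverse=True):
        writer.writerow([number, score])
    return buffer.getvalue()


csv_text = to_csv(SCORED_RESULTS)
print(csv_text)
```

The sort order in the file is only a convenience; the queries in this deck apply their own ORDER BY.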
LCPatents (private) – Public Data, ordered by my private relevance field
Sources: LCPatents (private), patents-public-data.patents.publications
Row  lc_number         lc_score  text
1    US-8778962-B2     148       Treatment of solid tumors with rapamycin derivatives
2    WO-2013014448-A1  148       2 - (2, 4, 5 - substituted -anilino) pyrimidine derivatives as egfr modulators useful for treating cancer
3    EP-2269604-A1     147       Treatment of solid tumours with rapamycin derivatives
4    RU-2325906-C2     147       Cancer medical treatment
5    EP-2269603-B1     146       Treatment of breast tumors with a rapamycin derivative in combination with exemestane
6    EP-2783686-A1     146       Combination of a rapamycin derivative and letrozole for treating breast cancer
7    EP-2269604-B1     146       Treatment of solid kidney tumours with a rapamycin derivative
8    US-8877771-B2     146       Treatment of solid tumors with rapamycin derivatives
9    EP-2269603-A1     146       Treatment of solid tumours with rapamycin derivatives
LCPatents (private) – Public Data,ordered by my private relevance field
SELECT lc.lc_number, lc.lc_score, ttl.text
FROM `patents-public-data.patents.publications` AS ppd,
UNNEST(title_localized) AS ttl
JOIN `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS lc
ON lc.lc_number = ppd.publication_number
WHERE ttl.language = "en"
ORDER BY lc.lc_score DESC
LCPatents - IFI Private Data: COUNT
Row  assignee                         Total
1    AstraZeneca AB                   102
2    Astex Therapeutics Ltd           84
3    Cancer Research Technology Ltd   33
4    NeuPharma Inc                    32
5    Novartis AG                      25
6    ForSight Vision4 Inc             22
7    Merck Sharp & Dohme Corp         21
8    Kinex Pharmaceuticals LLC        19
9    Eisai R&D Management Co Ltd      16
10   Novartis Pharma GmbH             15
11   University of Chicago            11
12   Medimmune Ltd                    10
Sources: LCPatents, IFI Data Enrichments
LCPatents - IFI Private Data: COUNT
SELECT assignee, COUNT(IFI.publication_number) AS Total
FROM `striking-joy-185312.IFIDataEnrichments.IFIDataEnrichments` AS IFI,
UNNEST(original_assignee) AS assignee
JOIN `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS lc
ON IFI.publication_number = lc.lc_number
GROUP BY assignee ORDER BY Total DESC
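In this query, UNNEST(original_assignee) expands each publication into one row per assignee before the rows are grouped and counted. A minimal Python sketch of the same logic follows. It runs on made-up rows: the publication numbers and assignee names are placeholders rather than real data, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical rows: (publication_number, list of original assignees).
ENRICHMENTS = [
    ("PUB-1", ["Assignee A"]),
    ("PUB-2", ["Assignee A", "Assignee B"]),
    ("PUB-3", ["Assignee A"]),
    ("PUB-4", ["Assignee C"]),
]

# Hypothetical private list standing in for LCPatents.lc_number.
LC_NUMBERS = {"PUB-1", "PUB-2", "PUB-3"}


def count_by_assignee(enrichments, lc_numbers):
    """Count publications per assignee, keeping only listed publications."""
    totals = Counter()
    for publication_number, assignees in enrichments:
        if publication_number not in lc_numbers:
            continue  # the inner JOIN drops publications not in the list
        for assignee in assignees:  # UNNEST: one row per assignee
            totals[assignee] += 1  # GROUP BY assignee + COUNT
    return totals.most_common()  # ORDER BY Total DESC


totals = count_by_assignee(ENRICHMENTS, LC_NUMBERS)
print(totals)
```

A publication with two assignees is counted once for each, which is also what the SQL does after UNNEST.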
LCPatents (private) – Public Data – IFI Paid Data
Row  lc_number         lc_score  family_id  priority_date  current_assignee                   legal_status
1    JP-2004525899-A   152       26245731   20010219                                          Granted
2    WO-2013014448-A1  148       46875901   20110727
3    US-8778962-B2     148       26245731   20010219       Novartis Pharmaceuticals Corp      Active
4    EP-2269604-A1     147       26245731   20010219       Novartis AG                        Granted
5    RU-2325906-C2     147       26245731   20010219
6    CA-2438504-A1     146       26245731   20010219       Novartis AG                        Granted
7    CA-2438504-C      146       26245731   20010219       Novartis AG                        Active
8    EP-2764865-A2     146       26245731   20010219       Novartis Pharma GmbH, Novartis AG  Withdrawn
9    EP-2762140-A1     146       26245731   20010219       Novartis AG                        Granted
Sources: LCPatents, patents-public-data, IFI Data Enrichments
LCPatents (private) – Public Data – IFI Paid Data
SELECT lc.lc_number, lc.lc_score, ppd.family_id, ppd.priority_date,
IFI.current_assignee, IFI.legal_status
FROM `striking-joy-185312.IFIDataEnrichments.IFIDataEnrichments` AS IFI
JOIN `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS lc
ON IFI.publication_number = lc.lc_number
JOIN `patents-public-data.patents.publications` AS ppd
ON lc.lc_number = ppd.publication_number
ORDER BY lc.lc_score DESC
SureChEMBL for LCPatents
Row  lc_number          schembl_id     smiles                          inchi_key                    field
1    US-20100092474-A1  SCHEMBL8755    COC1=CC=C(CN)C=C1               IDPURXSQCKYKIJ-UHFFFAOYSA-N  5
2    WO-2008044041-A1   SCHEMBL8755    COC1=CC=C(CN)C=C1               IDPURXSQCKYKIJ-UHFFFAOYSA-N  5
3    WO-2008044045-A1   SCHEMBL8755    COC1=CC=C(CN)C=C1               IDPURXSQCKYKIJ-UHFFFAOYSA-N  5
4    US-20090306079-A1  SCHEMBL8755    COC1=CC=C(CN)C=C1               IDPURXSQCKYKIJ-UHFFFAOYSA-N  5
5    US-20070021494-A1  SCHEMBL8755    COC1=CC=C(CN)C=C1               IDPURXSQCKYKIJ-UHFFFAOYSA-N  5
6    US-7329660-B2      SCHEMBL104340  CC(C)C1=CC(N)=CC=C1             XCCNRBCNYGWTQX-UHFFFAOYSA-N  5
7    WO-2008002674-A2   SCHEMBL133876  COC1=CC(O)=C(C=O)C=C1           WZUODJNEIXSNEU-UHFFFAOYSA-N  5
8    WO-2014037750-A1   SCHEMBL309636  CCOC1=CC(Br)=CC=C1[N+]([O-])=O  SVFZXFVVGNPTEF-UHFFFAOYSA-N  5
9    US-20100092474-A1  SCHEMBL383820  COC1=CC(C(O)=O)=C(C=C1)C(O)=O   JKZSIEDAEHZAHQ-UHFFFAOYSA-N  5
10   WO-2008044041-A1   SCHEMBL383820  COC1=CC(C(O)=O)=C(C=C1)C(O)=O   JKZSIEDAEHZAHQ-UHFFFAOYSA-N  5
Sources: LCPatents, ebi_surechembl
SureChEMBL for LCPatents
SELECT v.lc_number, ebi.schembl_id, ebi.smiles, ebi.inchi_key, ebi.field
FROM `patents-public-data.ebi_surechembl.map` AS ebi
JOIN `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS v
ON v.lc_number = ebi.patent_id
WHERE ebi.field = "5"
LIMIT 100
ChEMBL Compound, Site, Target
Row  standard_inchi_key           compound_name  site_name                                                       target_type     pref_name
61   HUMNYLRZRPPJDN-UHFFFAOYSA-N  Benzaldehyde   Tyrosinase, Tyrosinase domain                                   SINGLE PROTEIN  Tyrosinase
62   HUMNYLRZRPPJDN-UHFFFAOYSA-N  Benzaldehyde   Tyrosinase, Tyrosinase domain                                   SINGLE PROTEIN  Tyrosinase
63   HUMNYLRZRPPJDN-UHFFFAOYSA-N  Benzaldehyde   Tyrosinase, Tyrosinase domain                                   SINGLE PROTEIN  Tyrosinase
64   WGQKYBSKWIADBV-UHFFFAOYSA-N  Benzyl amine   Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain   SINGLE PROTEIN  Phenylethanolamine N-methyltransferase
65   WGQKYBSKWIADBV-UHFFFAOYSA-N  Benzyl amine   Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain   SINGLE PROTEIN  Phenylethanolamine N-methyltransferase
66   WGQKYBSKWIADBV-UHFFFAOYSA-N  Benzyl amine   Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain   SINGLE PROTEIN  Phenylethanolamine N-methyltransferase
67   WGQKYBSKWIADBV-UHFFFAOYSA-N  Benzyl amine   Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain   SINGLE PROTEIN  Phenylethanolamine N-methyltransferase
68   XKJCHHZQLQNZHY-UHFFFAOYSA-N  SID144208998   Monoamine oxidase A, Amino_oxidase domain                       SINGLE PROTEIN  Monoamine oxidase A
69   XKJCHHZQLQNZHY-UHFFFAOYSA-N  SID144208998   Monoamine oxidase B, Amino_oxidase domain                       SINGLE PROTEIN  Monoamine oxidase B
ChEMBL Compound, Site, Target
SELECT cs.standard_inchi_key, cr.compound_name, bs.site_name,
td.target_type, td.pref_name
FROM `patents-public-data.ebi_chembl.target_dictionary_23` AS td
JOIN `patents-public-data.ebi_chembl.binding_sites_23` AS bs
ON td.tid = bs.tid
JOIN `patents-public-data.ebi_chembl.predicted_binding_domains_23` AS pbd
ON bs.site_id = pbd.site_id
JOIN `patents-public-data.ebi_chembl.activities_23` AS act
ON pbd.activity_id = act.activity_id
JOIN `patents-public-data.ebi_chembl.compound_structures_23` AS cs
ON act.molregno = cs.molregno
JOIN `patents-public-data.ebi_chembl.compound_records_23` AS cr
ON cr.molregno = cs.molregno
WHERE cs.standard_inchi_key IN (
"JOXIMZWYDAKGHI-UHFFFAOYSA-N", "XKJCHHZQLQNZHY-UHFFFAOYSA-N",
"RMVRSNDYEFQCLF-UHFFFAOYSA-N", "WVDDGKGOMKODPV-UHFFFAOYSA-N",
"VODUKXHGDCJEOZ-YUMQZZPRSA-N", "LGRFSURHDFAFJT-UHFFFAOYSA-N",
"WGQKYBSKWIADBV-UHFFFAOYSA-N", "VOLRSQPSJGXRNJ-UHFFFAOYSA-N",
"WFQDTOYDVUWQMS-UHFFFAOYSA-N", "HUMNYLRZRPPJDN-UHFFFAOYSA-N",
"RWZYAGGXGHYGMB-UHFFFAOYSA-N", "DGJKKXAFDOWIQI-UHFFFAOYSA-N",
"KHBQMWCZKVMBLN-UHFFFAOYSA-N", "KWOLFJPFCHCOCG-UHFFFAOYSA-N")
EBI – European Bioinformatics Institute on BigQuery
ebi_chembl
• Activities
• Assays
• Components
• Compounds
• Drug Indications
• Drug Mechanisms
• Molecules
• Proteins
• Targets
ebi_surechembl
• Smiles
• Inchi_Key
• Patent_ID
BigQuery Console
LCPatents – USPTO OCE Office Actions
Sources: LCPatents (private), uspto_oce_office_actions
1 of 1495 rows shown
Row  publication_number  app_id    action_type                    claim_numbers
1    US-8298578-B2       13252942  103                            1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
2    US-9556120-B2       14554495  nonstatutory double patenting  6,7,8,9,10,11,12,13,14,15,16,17
3    US-9572800-B2       15086485  103                            1,2,3,4,5,6,7,8,9,10,11,12,13
4    US-9737544-B2       15000304  nonstatutory double patenting  1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
5    US-9707202-B2       15044424  nonstatutory double patenting  1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38
6    US-9616050-B2       14840342  nonstatutory double patenting  23,24,25,26,27,28,29,30
7    US-9707248-B2       15279361  nonstatutory double patenting  1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
8    US-8642610-B2       13366726  112                            9,11,12,13,14,15,16,17,18,19,20,21,35,48,49,50,51,52,53,112
9    US-8673906-B2       13765850  nonstatutory double patenting  1,2,3,4,5,6,7,8,9,10,11,12
10   US-8277830-B2       13252998  103                            1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
USPTO OCE Office Actions
SELECT patents.publication_number, oa.app_id, oa.action_type,
oa.claim_numbers
FROM `patents-public-data.uspto_oce_office_actions.rejections` AS oa
JOIN `patents-public-data.uspto_oce_office_actions.match_app` AS match
ON oa.app_id = match.app_id
JOIN `patents-public-data.patents.publications` AS patents
ON match.application_number = patents.application_number
JOIN `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS VEGF
ON VEGF.lc_number = patents.publication_number
USPTO PTAB Cases
SELECT ptab.PatentOwnerName, ptab.PatentNumber, ptab.PetitionerPartyName,
ptab.TrialNumber, ptab.ProsecutionStatus, IFI.original_assignee
FROM `patents-public-data.uspto_ptab.trials` AS ptab
JOIN `patents-public-data.uspto_ptab.match` AS ptab_match
ON ptab.ApplicationNumber = ptab_match.ApplicationNumber
JOIN `patents-public-data.patents.publications` AS PUBLIC
ON PUBLIC.application_number = ptab_match.application_number
JOIN `striking-joy-185312.IFIDataEnrichments.IFIDataEnrichments` AS IFI
ON IFI.publication_number = PUBLIC.publication_number
JOIN `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS VEGF
ON VEGF.lc_number = IFI.publication_number
Row  PatentOwnerName  PatentNumber  PetitionerPartyName              TrialNumber    ProsecutionStatus               original_assignee
1    Lane et al       8410131       Breckenridge Pharmaceutial, Inc.  IPR2017-01592  Notice OF Filing Date Accorded  Novartis Pharmaceuticals Corp
Patent Data on BigQuery Enhances your Search Results
Search Result, Portfolio, or Any List
JOIN
• Public Patent Data (free)
• SureChEMBL (free)
• IFI Data Enhancements (paid)
• USPTO Office Actions (free)
• USPTO PAIR, PTAB (free)
• Private, On-Premise Data (e.g., Docket)
A Better Search Report
Collaboration
Share with Google accounts or groups.
What this means to you
• BigQuery does not replace your text, semantic or structure based search tools
• BigQuery does let you make your search results more useful for:
  • Your Legal Team
  • Your Business Sponsors
  • Your Research Partners
SECRET Data Fields!
VEGF patents with a secret data code.
The secret code must not be visible to Google (even in the Google Enterprise Cloud).
Tableau Desktop: Local Excel to BigQuery
Tableau Desktop + BigQuery
BigQuery SQL Query + Excel file on Desktop, joined on pub number
The data join is created using publication_number. The “Secret Code” is never transmitted to Google.
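The idea of joining on the desktop can be sketched in plain Python. This is an illustrative stand-in for the Tableau join, not Tableau's own mechanism: the query results hold only public fields, the local rows hold the secret code, and the two are matched on publication_number after the results have been downloaded. The publication numbers and scores come from the earlier result table; the secret code value and the function name are hypothetical.

```python
# Rows as they might come back from a BigQuery query: public fields only.
bigquery_rows = [
    {"publication_number": "US-8778962-B2", "lc_score": 148},
    {"publication_number": "EP-2269604-A1", "lc_score": 147},
]

# Rows from a local spreadsheet; the secret code value is hypothetical.
local_rows = [
    {"publication_number": "US-8778962-B2", "secret_code": "ALPHA"},
]


def join_locally(remote, local):
    """Left-join remote rows to local rows on publication_number."""
    secret_by_pub = {row["publication_number"]: row["secret_code"] for row in local}
    # The lookup happens on the desktop; nothing from `local` is sent anywhere.
    return [
        {**row, "secret_code": secret_by_pub.get(row["publication_number"])}
        for row in remote
    ]


joined = join_locally(bigquery_rows, local_rows)
for row in joined:
    print(row)
```

Publications with no local entry keep a secret_code of None, which mirrors a left join.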
Tableau Visualization
Resources
• https://cloud.google.com/ - Google Cloud Platform > Launcher for Google Patents Public Data
• Google Announcement
• https://github.com/google/patents-public-data - GitHub Home for Google Patents
• Public Patent Data Now Available on Google BigQuery - IFI Blog Post on BigQuery, with examples
• IFI Data Enrichments – Information on IFI’s paid data enrichments
• W3 Schools SQL - SQL Reference
Support comes with an IFI Data Enrichments Subscription!
https://cloud.google.com/
https://cloud.google.com/blog/big-data/2017/10/google-patents-public-datasets-connecting-public-paid-and-private-patent-data
https://www.ificlaims.com/news/view/blog-posts/public-patent-data-now.htm
https://www.ificlaims.com/bigquery.htm
https://www.w3schools.com/sql/sql_quickref.asp
Thank You!
Janice Stevenson
EVP – Client Services
IFI CLAIMS Patent Services
[email protected]