Date post: 11-Jul-2020

TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories

Giannis Karamanolakis Columbia University

[email protected]

Jun Ma, Xin Luna Dong Amazon.com

{junmaa, lunadong}@amazon.com

Product Understanding for Search and Question Answering

“Alexa, which shampoos contain argan oil?”

Need to Store Structured Knowledge About Products

flavor: “chocolate”

“Alexa, which shampoos contain argan oil?”

ingredients: “biotin”, “argan oil”, …

Understanding Values for Product Attributes

Product catalog (rows: products; columns: product attributes):

Product ID    Brand           Flavor   Size      Ingredients
B00FZHEGGW    Fage            Plain    35.3 oz   …
B0725VRRLP    Ben & Jerry’s   ???      ???       …
…             …               …        …         …

flavor: ???

(-) Issue: catalog is missing attribute values for many products

Attribute Value Extraction from Product Profiles

• Goal: extract attribute values from product titles & descriptions

Example title: “Ben & Jerry's Strawberry Cheesecake Ice Cream 16 oz”
‣ Brand: “Ben & Jerry's”
‣ Flavor: “Strawberry Cheesecake”
‣ Size: “16 oz”

• Previous work: deep neural networks for sequence tagging
[Zheng et al., KDD’18] [Xu et al., ACL’19] [Rezk et al., ICDE’19]

BIOE Tagging Example (product title -> DNN -> one tag per token):

Tokens:  ben  &  jerry's  mint  chocolate  cookie  ice  cream  16  oz
Tags:    O    O  O        B     I          E       O    O      O   O

extracted flavor value: “mint chocolate cookie”
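The BIOE example above can be turned back into attribute values with a short decoding loop. This is a minimal sketch, assuming whitespace tokenization and one tag per token; the name `decode_bioe` is illustrative and does not come from the talk, and treating a lone B tag as a single-token value is an assumption.

```python
def decode_bioe(tokens, tags):
    """Collect token spans tagged B (begin), I (inside), E (end); O means outside."""
    values, span = [], []

    def flush():
        # Emit the current span (if any) as one value and reset it.
        if span:
            values.append(" ".join(span))
            span.clear()

    for token, tag in zip(tokens, tags):
        if tag == "B":
            flush()                 # a new value starts; close any open span
            span.append(token)
        elif tag in ("I", "E") and span:
            span.append(token)
            if tag == "E":
                flush()             # E closes the span
        else:
            flush()                 # O (or a stray I/E) ends any open span
    flush()
    return values


title = "ben & jerry's mint chocolate cookie ice cream 16 oz"
tags = "O O O B I E O O O O".split()
flavors = decode_bioe(title.split(), tags)
print(flavors)
```

With the tags from the slide, the only span returned is the three tokens marked B, I, E.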

• Limitations of previous work:

(-) designed for a single category [Zheng et al., KDD’18] [Xu et al., ACL’19]
‣ “Dog Food” product titles -> DNN -> “Dog Food” attribute values
‣ “Sports” product titles -> DNN -> “Sports” attribute values

(-) ignore product categories
‣ Product title -> DNN -> attribute values; the product category is not used
‣ Example (product category: Baby Clothes): from the title “Custom Boy & Girl Baby Bodysuit Red Blue Cranberry Funny Cotton Baby Clothes”, the DNN extracts flavor: “cranberry”

(-) hard to capture diversity of categories
‣ Digital Camera: flavor? Not applicable
‣ Vitamin: flavor: “fruit”
‣ Fruit: flavor: “fruit” Not valid

(-) hard to scale to large product taxonomies in e-Commerce
‣ >100M products
‣ >10K categories
‣ Products/categories continuously added

TXtract: Extraction for Thousands of Product Categories

• TXtract: a taxonomy-aware neural network for attribute value extraction

Product Title + Product Categories in Taxonomy -> TXtract -> Attribute Values

• Our Contributions:
1. Consider multiple categories efficiently with a single model
2. Extract category-specific attribute values using conditional self-attention
3. Scale up extraction to hierarchical taxonomies with thousands of categories
4. Increase robustness to wrong category assignments using multi-task training

[Figure: example product taxonomy rooted at Product. Nodes shown: Grocery, Health Product, Beauty Product, Baby Product, Fruit, Alcoholic Beverages, Beer, Lager, Ice Cream, Sports Nutrition, Protein Powder, Vitamins, Hair Care, Shampoos, Lotion, MakeUp, Infant Milk, …]

Outline

1. Attribute Value Extraction from Product Profiles

2. TXtract: Taxonomy-Aware Attribute Value Extraction

3. Experiments

4. Conclusions and Ongoing Work

Scaling to Thousands of Product Categories - Challenges

• Goal:
‣ Extract attribute values for products …
‣ … from thousands of diverse categories
‣ … organized in hierarchical taxonomies

• Approach 1: train a separate DNN for each category?
(-) expensive: store/orchestrate 1000+ models
(-) prone to overfitting: most categories have <<1000 labeled training data
[Plot: #categories vs. #labeled data]

• Approach 2: assume a single “flat” category?
(-) not effective: missing category-specific characteristics

TXtract: Taxonomy-Aware Attribute Value Extraction

Product Title + Product Category -> TXtract -> Attribute Values

• TXtract leverages the hierarchical product taxonomy

(+) efficient: single model for all categories
‣ “Small” categories: leverage products from related categories

(+) effective: extracts category-specific attribute values
‣ product category -> attribute applicability, valid attribute values

Leveraging Hierarchical Product Categories in TXtract

Previous models:
Product Text -> Tokens x1 … xT -> Product Encoder -> Token Embeddings h1 … hT -> CRF -> BIOE Tags y1 … yT

CRF: Conditional Random Field

TXtract:
Product Taxonomy -> Product Category c -> Category Encoder -> Category Embedding ec
Product Text -> Tokens x1 … xT -> Product Encoder -> Token Embeddings h1 … hT
h1 … hT + ec -> Conditional Self-Attention -> Taxonomy-Aware Token Embeddings h̃1 … h̃T -> CRF -> BIOE Tags y1 … yT
h1 … hT -> Attention -> Category CLF -> c′

• Category Encoder: generates category embeddings ec
• Conditional Self-Attention: uses ec as “query vector” in self-attention
• TXtract: extracts category-specific values

p(y1, …, yT ∣ x1, …, xT, c)
(y: values, x: text, c: category)
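To illustrate the idea of using the category embedding ec as the query vector, here is a minimal sketch in plain Python. It is not the model's actual formulation: the learned projection matrices are omitted, scaled dot-product scoring is assumed, and concatenating each token embedding with the attended context vector is one simple choice among many. The function and variable names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def category_conditioned_attention(token_embs, category_emb):
    """Attend over token embeddings h_1..h_T using the category embedding as query."""
    scale = math.sqrt(len(category_emb))
    scores = [dot(category_emb, h) / scale for h in token_embs]
    top = max(scores)                              # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]            # softmax over tokens
    dim = len(token_embs[0])
    context = [sum(w * h[d] for w, h in zip(weights, token_embs)) for d in range(dim)]
    # Assumption: a category-aware token embedding is h_t concatenated with the context.
    aware = [list(h) + context for h in token_embs]
    return weights, aware


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # toy h_1..h_3
category = [2.0, 0.0]                              # toy e_c
weights, aware = category_conditioned_attention(tokens, category)
print(weights)
```

Tokens whose embeddings align with the category embedding receive larger weights, so the same title is encoded differently under different categories.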

Improving Robustness Towards Wrong Category Assignments

TXtract extracts category-specific values:

p(y1, …, yT ∣ x1, …, xT, c)
(y: values, x: text, c: category)

(-) Issue: products may be assigned to wrong taxonomy nodes!
‣ Ethernet cable assigned under Hair Brushes
‣ Eyeshadow assigned under Travel Cases
(-) Conditioning on wrong categories -> wrong values

Main Task: extract category-specific values

p(y1, …, yT ∣ x1, …, xT, c)

Auxiliary Task: predict categories from text

p(c ∣ x1, …, xT)

Product Text -> x1 … xT -> Product Encoder -> h1 … hT -> Attention -> Product Embedding h -> Category Classifier -> Product Category c

“Taxonomy-aware” loss function: “Correctly guess category AND ancestors”
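One way to read “correctly guess category AND ancestors” is as a multi-label target that marks the assigned node and every node on its path to the root. The sketch below assumes that reading and scores predictions with binary cross-entropy; the parent links in `PARENT` are a hypothetical toy taxonomy, and the talk does not specify the exact loss.

```python
import math

# Hypothetical child -> parent links for a toy taxonomy (illustration only).
PARENT = {
    "Lager": "Beer",
    "Beer": "Alcoholic Beverages",
    "Alcoholic Beverages": "Grocery",
    "Grocery": "Product",
    "Shampoos": "Hair Care",
    "Hair Care": "Beauty Product",
    "Beauty Product": "Product",
}
NODES = sorted(set(PARENT) | set(PARENT.values()))


def with_ancestors(category):
    """Return the category followed by its ancestors up to the root."""
    path = [category]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def multi_hot_target(category):
    """1.0 for the category and each ancestor, 0.0 for every other node."""
    positives = set(with_ancestors(category))
    return [1.0 if node in positives else 0.0 for node in NODES]


def binary_cross_entropy(probs, target, eps=1e-9):
    """Mean binary cross-entropy over all taxonomy nodes."""
    terms = [t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
             for p, t in zip(probs, target)]
    return -sum(terms) / len(terms)


target = multi_hot_target("Lager")
good = [0.9 if t else 0.1 for t in target]   # confident and consistent with the path
bad = [0.1 if t else 0.9 for t in target]    # confident and wrong
print(with_ancestors("Lager"))
print(binary_cross_entropy(good, target), binary_cross_entropy(bad, target))
```

Under this target, predicting a sibling of the correct leaf still gets credit for the shared ancestors, which is one plausible source of robustness to noisy assignments.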

Multi-Task Training of TXtract

Main Task: Taxonomy-Aware Attribute Value Extraction
x1 … xT -> Product Encoder -> Conditional Self-Attention (with category c) -> CRF -> y1 … yT

Auxiliary Task: Taxonomy-Aware Category Prediction
x1 … xT -> Product Encoder -> Attention -> Category Classifier -> c
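A common way to train a main and an auxiliary task together is to minimize a weighted sum of their losses. The form below is a sketch under that assumption: the symbols $\mathcal{L}_{\text{extract}}$, $\mathcal{L}_{\text{category}}$, and the weight $\gamma$ are illustrative names, and the talk does not state how the two tasks are weighted.

```latex
\mathcal{L}
  = \underbrace{-\log p(y_1, \dots, y_T \mid x_1, \dots, x_T, c)}_{\mathcal{L}_{\text{extract}}}
  \;+\; \gamma \, \mathcal{L}_{\text{category}},
\qquad \gamma \ge 0
```

Here $\mathcal{L}_{\text{category}}$ stands for the taxonomy-aware loss of the category classifier $p(c \mid x_1, \dots, x_T)$; setting $\gamma = 0$ would recover training on the main task alone.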

Outline

1. Attribute Value Extraction from Product Profiles

2. TXtract: Taxonomy-Aware Attribute Value Extraction

3. Experiments: Taxonomy with 4,000 Product Categories

4. Conclusions and Ongoing Work

Experiments: Attribute Value Extraction

• Dataset:
‣ 2 million products (sampled from Amazon.com webpages)
‣ 4,000 categories (sampled from Amazon’s taxonomy)

• Attributes: brand, flavor, package size, ingredients

• Training: distant supervision for sequence tagging
‣ Catalog values -> flavor tags
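Distant supervision for sequence tagging can be sketched as matching a known catalog value against the title tokens and emitting BIOE tags for the match. This is a minimal illustration that assumes exact, case-insensitive token matching and tags only the first occurrence; the function name `distant_tags` and the catalog values used below are hypothetical, and the talk does not describe the matching rules.

```python
def distant_tags(tokens, catalog_value):
    """Tag the first occurrence of catalog_value in tokens with B/I/E; all else O."""
    value = catalog_value.lower().split()
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    n = len(value)
    if n == 0:
        return tags
    for start in range(len(tokens) - n + 1):
        if lowered[start:start + n] == value:
            tags[start] = "B"
            for i in range(start + 1, start + n - 1):
                tags[i] = "I"
            if n > 1:
                tags[start + n - 1] = "E"   # assumption: a one-token value gets B only
            break
    return tags


title = "Ben & Jerry's Strawberry Cheesecake Ice Cream 16 oz".split()
flavor_tags = distant_tags(title, "strawberry cheesecake")
print(list(zip(title, flavor_tags)))
```

Titles whose catalog value does not appear verbatim receive all-O tags under this scheme, which is one reason distant labels are noisy.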

TXtract Effectively Leverages Product Categories

Average performance across ALL categories & attributes:

Model                             Coverage (%)     Macro F1 (%)
OpenTag [Zheng et al., KDD’18]    73.0             46.6
TXtract                           81.6 (+11.7%)    49.7 (+10.4%)

OpenTag ignores categories; TXtract considers categories.

‣ TXtract outperforms OpenTag across 4,000 categories

‣ TXtract outperforms other category-aware approaches [Johnson et al., TACL’17] [Cho et al., EMNLP’14] [Ma et al., KDD’19]

See more results and ablation study in our paper!


Attribute Value Extraction - Scaling Up to Thousands of Product Categories

• E-commerce domain is challenging!
‣ Diverse categories
‣ Assignments to wrong categories (example shown: Hair Brush)

• TXtract: hierarchical taxonomies with thousands of categories

(+) Efficient:
‣ single model trained on all categories in parallel

(+) Effective:
‣ Leverages taxonomy using conditional self-attention & multi-task training
‣ Improves extraction quality (e.g., up to 15% higher coverage)

Towards Better, Large-Scale Product Understanding

TXtract -> flavor: “chocolate”; ingredients: “argan oil”, …

Building an “automatic” knowledge graph of products [Saldana et al., KDD’20]

“Alexa, which shampoos contain argan oil?”

[Saldana et al., KDD’20] AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types


Thank you!

Giannis Karamanolakis Columbia University

[email protected]

Jun Ma, Xin Luna Dong Amazon.com

{junmaa, lunadong}@amazon.com

https://gkaramanolakis.github.io