There are reasons for positing a word structure

Posted on 03-Feb-2016

Maybe in order to understand mankind, we have to look at the word itself: "Mankind". Basically, it's made up of two separate words - "mank" and "ind". What do these words mean? It's a mystery, and that's why so is mankind.

Jack Handey

There are at least three conditions on structuring a word w into x.y

x is a stem and y is a suffix

y selects x

x and y are relevant for the distribution of w

Arguments for x being a stem carry over to arguments that y is a suffix

If x is a stem then x has meaning

stem(x) → meaning(x)

word(x) → meaning(x)

Being a stem is translated into being a word

Pr(stem(x) | w = x.y) ~ Pr(word(x) | w = x.y)

$$\frac{|\{\, z \mid W(z.y) \wedge W(z) \,\}|}{|\{\, z \mid W(z.y) \,\}|}$$
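
The proportion of stems that are themselves words can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation: the function name `stem_word_counts` and the toy word list are hypothetical, and the slides do not say which corpus was used.

```python
def stem_word_counts(words, suffix):
    """For every word z+suffix, count whether the stem z is itself a word.

    Returns (positive, negative): stems that are words vs. stems that are not.
    """
    vocab = set(words)
    positive = negative = 0
    for w in vocab:
        # Only consider words that end in the suffix and leave a non-empty stem.
        if w.endswith(suffix) and len(w) > len(suffix):
            stem = w[: len(w) - len(suffix)]
            if stem in vocab:
                positive += 1
            else:
                negative += 1
    return positive, negative


# Hypothetical toy word list, for illustration only.
toy_words = ["hard", "hardness", "fond", "fondness", "readiness", "ready", "witness"]
pos, neg = stem_word_counts(toy_words, "ness")
proportion = pos / (pos + neg)
print(pos, neg, proportion)
```

With this toy list, "hard" and "fond" count as positive, while "readi" and "wit" count as negative because neither is in the list.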

A beta distribution is used for assigning a probability based on the proportion

beta(positive, negative)

The top ten list

Morph  Ratio  SD  Prob  Pos   Neg
less   95     1   94    368   18
'      93     1   92    1723  133
's     92     0   92    9783  857
ship   91     2   89    167   16
like   91     2   89    140   14
house  91     3   88    75    7
'll    91     3   88    105   11
head   88     4   85    61    8
fish   88     4   84    66    9
stone  87     4   83    66    10
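
The Ratio, SD and Prob columns can be reproduced from Pos and Neg with a short sketch. Two assumptions are made here that the slides do not state: the counts are used directly as the two beta shape parameters (no added prior), and Prob is the mean minus one standard deviation. Both are consistent with the tabulated rows. The function name `beta_summary` is hypothetical.

```python
import math


def beta_summary(positive, negative):
    """Mean and standard deviation of a Beta(positive, negative) distribution."""
    a, b = positive, negative
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


# Pos/Neg counts taken from the table above.
for morph, pos, neg in [("less", 368, 18), ("house", 75, 7)]:
    mean, sd = beta_summary(pos, neg)
    # Assumption: Prob = mean - sd, expressed as a rounded percentage.
    print(morph, round(100 * mean), round(100 * sd), round(100 * (mean - sd)))
# prints:
# less 95 1 94
# house 91 3 88
```

Subtracting the standard deviation penalizes morphs supported by few observations, so a high ratio from a small sample scores lower than the same ratio from a large one.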

Analyzing easiness

easi ness 78

eas iness 46

easines s 42

easin ess 6

easine ss 4

Analyzing termites

Suffix  Ratio  SD  Prob  Pos    Neg
s       42     0   42    18098  25001
ites    43     4   40    78     102
es      23     0   23    2094   6925
tes     19     1   17    211    927
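
The rows for s and ites show why the raw ratio alone is not enough. As a worked example, assume the Pos and Neg counts are the two parameters a and b of the beta distribution, and that Prob is the mean minus one standard deviation. The slides do not state this, but it matches the figures:

```latex
\mu = \frac{a}{a+b}, \qquad
\sigma = \sqrt{\frac{ab}{(a+b)^2\,(a+b+1)}}

\text{s:}\quad \mu = \frac{18098}{43099} \approx 0.420,\quad
\sigma \approx 0.002,\quad \mu - \sigma \approx 0.418 \;\Rightarrow\; 42

\text{ites:}\quad \mu = \frac{78}{180} \approx 0.433,\quad
\sigma \approx 0.037,\quad \mu - \sigma \approx 0.397 \;\Rightarrow\; 40
```

Although ites has the slightly higher ratio, it rests on only 180 observations, so its larger standard deviation leaves s with the higher Prob.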

The measure of meaning captures the first condition: x is a stem and y is a suffix

Selectional relation is treated as the predictive power of the stem and suffix

easiness

easi → easi.er, easi.ly

ness → readi.ness, fond.ness, hard.ness

eas → eas.ier, eas.ily, eas.ter, eas.ton

iness → read.iness

Combining the endings from the stem and the starts from the suffix results in a collection of possible words

The first hypothesis is easi.ness

easi → .er, .ly

ness → readi., fond., hard.

readi.er, readi.ly, fond.er, fond.ly, hard.er, hard.ly

5 positive, 1 negative, approx. 90%

The second hypothesis is eas.iness

eas → .ier, .ily, .ter, .ton

iness → read.

read.ier, read.ily, read.ter, read.ton

1 positive, 3 negative, 25%

easi.ness is best on both accounts and is the preferred analysis
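
The comparison of the two hypotheses can be sketched as follows. The lexicon is a hypothetical toy set: it assumes that "readier" is the one unattested form, which is consistent with both reported counts but is not stated in the slides. The function names are likewise illustrative.

```python
def predicted_words(stem_endings, suffix_starts):
    """Combine every start seen with the suffix with every ending seen with the stem."""
    return [start + ending for start in suffix_starts for ending in stem_endings]


def score(candidates, lexicon):
    """Return (positive, negative): candidates found / not found in the lexicon."""
    positive = sum(1 for c in candidates if c in lexicon)
    return positive, len(candidates) - positive


# Hypothetical lexicon; "readier" is assumed to be absent from the corpus.
lexicon = {"readily", "fonder", "fondly", "harder", "hardly"}

# Hypothesis 1: easi.ness
h1 = predicted_words(["er", "ly"], ["readi", "fond", "hard"])
# Hypothesis 2: eas.iness
h2 = predicted_words(["ier", "ily", "ter", "ton"], ["read"])

print(score(h1, lexicon))  # (5, 1)
print(score(h2, lexicon))  # (1, 3)
```

The sketch reports raw counts only. How the slides turn 5 positive and 1 negative into the stated approximate percentage is not specified.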