http://repository.osakafu-u.ac.jp/dspace/
Title A Document-Retrieval Method Using Dependency-Relations of Titles
Author(s) Takamatsu, Shinobu; Nishida, Fujio
Editor(s)
Citation Bulletin of University of Osaka Prefecture. Series A, Engineering and natural sciences. 1978, 27(1), p.29-43
Issue Date 1978-10-31
URL http://hdl.handle.net/10466/8299
Rights
A Document-Retrieval Method Using Dependency-Relations of Titles
Shinobu TAKAMATSU* and Fujio NISHIDA*
(Received June 15, 1978)
This paper presents a method of document retrieval using dependency-relations
among key-terms which appear in titles of documents written in English.
A noun phrase or a noun clause contained in a title is converted to a
function-expression and normalized so that semantically equivalent expressions
have a unique syntactic expression. The normal function-expression is recorded
in a tree-like file. A retrieval system is also designed in which, for a request
expressed as a noun phrase or a noun clause, almost all the documents implied by
the request are efficiently retrieved.
1. Introduction
This paper presents a new method of document retrieval using dependency-relations
between words in document titles written in English.
In order to designate the requested objects more clearly and precisely, it is necessary
to use a modified and delimited expression of objects such as is seen in titles of documents.
However, the method using roles and links that has been studied so far is well known to
have a considerably lower recall rate.5) The main reason is considered to be that the usual
role-link structure lacks a technique that unifies various equivalent expressions and also
identifies implication-relations.
In this paper, a title expressed by a noun phrase is reduced to a function-expression
by deterministic parsing. Most ambiguities of dependency-relations can be removed
by using a set of lower categories of verbal cases. The function-expression thus obtained
is transformed into a normal form in several steps, so that various dependency-expressions
which have the same meaning are given a unified expression as far as possible.
For further improvement of the recall rate, a retrieval method using an
implication-relation based on the concept of parts and wholes is introduced.
The construction of a file which contains dependency-relational data is also presented
for efficient and systematic retrieval.
2. Conversion of English Titles to Function-Expressions
2.1 Function-expressions of English titles
Most English titles are expressed by noun phrases and noun clauses. The subject
represented by a noun phrase or a clause is considered to denote a subset of objects,
attributes or events which are all restricted and modified by other objects, attributes and
events.
* Department of Electrical Engineering, College of Engineering.
This consideration leads to the concept that a noun phrase or a noun clause can be
generally represented as the following function-expression:
$$f(K_1 := t_1,\ K_2 := t_2,\ \ldots,\ K_n := t_n), \tag{1}$$
where $f$ is a function symbol consisting of a noun word called a governor, each $t_i$
($i = 1, 2, \ldots, n$) is a term which is a modifier called a dependant, and each $K_i$
($i = 1, 2, \ldots, n$) is a case-label which indicates a certain dependency-relation
between $f$ and $t_i$. The term $t_i$ consists of an adjective word, a noun word or a
string of words of a function-expression such as (1) or (2).
A function-expression of an adjective clause takes the form:
$$p(L_1 := s_1,\ L_2 := s_2,\ \ldots,\ L_m := s_m), \tag{2}$$
where $p$ is a word of a descriptive adjective or a verbal adjective which governs each
term $s_i$ ($i = 1, 2, \ldots, m$), and each $L_i$ ($i = 1, 2, \ldots, m$) is the case-label
of a dependant $s_i$ on the governor $p$.
A term $s_i$ consists of a noun word, an adverb word or a string of words of a
function-expression such as (1) or (2).
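The recursive shape of expressions (1) and (2) can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation (which was written in assembly language); the names `FunctionExpression`, `Term` and `render_term` are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class FunctionExpression:
    """A governor word with case-labelled dependants, as in forms (1) and (2)."""
    governor: str
    # Each dependant is a (case-label, term) pair; a term is a plain word
    # or another function-expression.
    dependants: List[Tuple[str, "Term"]] = field(default_factory=list)

    def render(self) -> str:
        if not self.dependants:
            return self.governor
        args = ", ".join(
            f"{label} := {render_term(term)}" for label, term in self.dependants
        )
        return f"{self.governor}({args})"


Term = Union[str, FunctionExpression]


def render_term(term: Term) -> str:
    """Render a term, which is either a word or a nested function-expression."""
    return term if isinstance(term, str) else term.render()


expr = FunctionExpression("implantation", [("OBJ", "ion"), ("GOAL", "silicon")])
print(expr.render())
```

Keeping the dependants as an ordered list of pairs, rather than a dictionary, preserves the written order of the case-labels, which the examples in the paper also keep.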
A case-label $L_i$ in a function-expression (2) originates from the cases of a predicate
word $p$.1) These cases are subdivided into the obligatory cases (a-1) to (a-3) and the
optional cases (a-4) to (a-13), all of which are listed in Table 1-A.
Table 1-A. Case-labels in the function-expressions.
| Category of governor | Number | Case-label | Category of dependant | Preposition of dependant | Example |
|---|---|---|---|---|---|
| EVENT | (a-1) | SUBJective | | by, of, in | Fregean definition |
| EVENT | (a-2) | OBJective | | of | Ion implantation |
| EVENT | (a-3) | COMPLement | | | Enabling (a device) to test |
| EVENT | (a-4) | INSTrument | OBJECT | | Measurement by probe |
| EVENT | (a-5) | MEANS | EVENT | with, by, on | Fabrication by sputtering |
| EVENT | (a-6) | PURPose | EVENT, OBJECT | for | Method for measurement |
| EVENT | (a-7) | SOURCE | OBJECT | from | Emission from silicon |
| EVENT | (a-8) | GOAL | OBJECT | to, into | Implantation into silicon |
| EVENT | (a-9) | CONDition | ATTRIBUTE | | (High) speed drive |
| EVENT | (a-10) | MANNer | DESCRIPTION | in, with | Efficient computation |
| EVENT | (a-11) | EXHibition | OBJECT, EVENT | in, of | Optimization in transistor |
| EVENT | (a-12) | LOCation | LOCATION | | Underwater connection |
| EVENT | (a-13) | TIME | TIME | in, on, at | Real-time operation |
Table 1-B. Case-labels in the function-expressions.
| Category of governor | Number | Case-label | Category of dependant | Preposition of dependant | Example |
|---|---|---|---|---|---|
| OBJECT, ATTRIBUTE | (b-1) | SUBJ^{-1} | EVENT | of, with, in | Superconducting junction |
| OBJECT, ATTRIBUTE | (b-2) | OBJ^{-1} | EVENT | of, with | Improved efficiency |
| OBJECT, ATTRIBUTE | (b-3) | INST^{-1} | EVENT | for, to | Probe for measurement |
| OBJECT | (b-4) | GOAL^{-1} | EVENT | | (Ion) implanted GaAs |
| OBJECT | (b-5) | PROCESS (PURP^{-1}) | EVENT | by, with | Diodes by sputtering |
| OBJECT | (b-6) | EXH^{-1} | EVENT | | System improved (in efficiency) |
| ATTRIBUTE | (b-7) | COND^{-1} | EVENT | in, of | Drift velocity |
| PARTS | (c-1) | COMPONENT | MATERIAL | with, from, of | Thermistor from germanium |
| DEVICE | (c-2) | COMPONENT | PARTS | with, of | Intensifier with photocathode |
| PARTS | (c-3) | COMPOSITE | DEVICE | | Inductance for filter |
| MATERIAL | (c-4) | COMPOSITE | PARTS | in, of, for | Film material |
| OBJECT | (c-5) | ATTRibute | ATTRIBUTE | of, with, in | (High) frequency capacitor |
| OBJECT | (c-6) | NUMber | QUANTITY | | Three systems |
| ATTRIBUTE | (c-7) | OBJECT | OBJECT | of, in | Conductivity in junction |
| ATTRIBUTE | (c-8) | VALue | QUANTITY | | 300 Ω·cm's resistance |
| THINGS | (c-9) | ARTicle | ARTICLE | | A simulator |
Obligatory cases and their categories are intrinsic to each predicate, and given to each
predicate in a word-dictionary, while optional cases, their categories and prepositions are
assumed here to be common to all the predicates. In this paper, the categories of nouns
are classified as shown in Table 2, where the categories of adjectives and adverbs are
assumed to be the categories of their nominalized words.
Table 2. Classification of categories.
Categories                      Examples
THINGS
  NON-EVENT
    OBJECT
      DEVICE .................. intensifier, rectifier, microscope, ...
      PARTS ................... film, diode, thermistor, ...
      MATERIAL ................ semiconductor, oxide, Ge, ...
      ENTITY .................. X-ray, pulse, wave, ...
    ATTRIBUTE ................. capacity, speed, conductivity, ...
    QUANTITY .................. two, three, some, 610°C, 0.36 eV, 300 Ω·cm, ...
    LOCATION .................. underwater, terrestrial, ...
    TIME ...................... real-time, interval, ...
  EVENT
    STATE
      POSSESSION .............. possession, composition, ...
      DESCRIPTION ............. efficiency, correctness, slowness, ...
    ACTION
      FUNCTION ................ rectification, amplification, ...
      PHENOMENON .............. scintillation, superconduction, ...
      METHOD .................. implantation, sputtering, removal, inspection, ...
      USE ..................... use, application, utilization, ...
      FORMATION ............... formation, production, fabrication, ...
      CAUSE ................... cause, ...
As seen from Table 2, the nominalized words of descriptive and verbal adjectives
belong to the category 'event' and the other noun words belong to the category 'non-event'.
Each case-label $K_i$ ($i = 1, 2, \ldots, n$) in the function-expression (1) is defined,
based on the cases of a predicate mentioned above, as follows:
[Case 1] When a governor $f$ belongs to the category 'event', the case-label $K_i$ of a
dependant $t_i$ on the governor $f$ is the same as that of a dependant $t_i$ on the predicate
word corresponding to the governor $f$.
[Case 2] When a governor $f$ belongs to the category 'non-event' and a dependant $t_i$
belongs to the category 'event', the case-label $L_i^{-1}$ is used. $L_i^{-1}$ means that the
case relation of $t_i$ to $f$ is inverse to that of $f$ to $t_i$ expressed by a case-label $L_i$.
These inverse case-labels are shown in (b-1) to (b-7) of Table 1-B.
[Case 3] When both a governor $f$ and a dependant $t_i$ belong to the category
'non-event', a certain functional verbal word such as 'contain', 'compose' or 'be' can be
considered to be omitted between $f$ and $t_i$.
The dependency-relations in this case are those between a composite and its
component, those between an object and its attribute, and others. The case-labels are given
in (c-1) to (c-9) of Table 1-B together with the categories of a
governor and its dependant.
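The three cases amount to a dispatch on the coarse categories of the governor and the dependant. A minimal sketch, assuming each word has already been classified as 'event' or 'non-event'; the function name and the returned strings are illustrative only:

```python
def label_group(governor_category: str, dependant_category: str) -> str:
    """Return which group of case-labels applies, following Cases 1 to 3."""
    if governor_category == "event":
        # Case 1: labels of the corresponding predicate word, rows (a-1) to (a-13).
        return "Table 1-A (a)"
    if dependant_category == "event":
        # Case 2: inverse case-labels, rows (b-1) to (b-7).
        return "Table 1-B (b)"
    # Case 3: both non-event, rows (c-1) to (c-9).
    return "Table 1-B (c)"


print(label_group("event", "non-event"))      # e.g. 'ion implantation'
print(label_group("non-event", "event"))      # e.g. 'improved efficiency'
print(label_group("non-event", "non-event"))  # e.g. 'film material'
```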
2.2 Conversion to function-expressions
There are several syntactic patterns of noun phrases and clauses corresponding to
their function-expressions (1) and (2). Table 3 shows the main part of the syntactic rules
of noun phrases and clauses, where square brackets [ ] mean that the enclosed symbols,
as well as the brackets, can be omitted in some cases. The conventional part-of-speech
label corresponding to each non-terminal symbol is shown in Table 4.
These syntactic rules have the following context-free-like form:
<R> ::= <D> <G>g | <G>g <D>,   (3)
where the subscript 'g' designates that the word reduced to the non-terminal symbol
carrying it is the governor of the words reduced to the other non-terminal
symbols appearing on the right side of a syntactic rule.
The parsing is mainly based on the bottom-up analysis of precedence grammars
and the usual categorical matching.2),3),8)
When a phrase or a clause is reduced to a non-terminal symbol <R> by using the
syntactic rule (3), the function-expression corresponding to <R>, namely,
$$g(K := d), \tag{4}$$
is constructed if $g$ is a word of <G> and if $d$ is a word of <D> or a form reduced
to <D>.
Table 3. Syntactic rules of noun phrases and clauses.
| Classification | Number | Syntactic rule |
|---|---|---|
| (A) Co-ordinate conjunction | (A1) | <NPC> ::= <NPC1> [and <NPC>] |
| | (A2) | \| <NPC1> or <NPC> |
| | (A3) | <NPC1> ::= <NC> \| <NP> |
| (B) Noun clauses and noun phrases | (B1) | <NC> ::= <NP>g <VAP1> |
| | (B2) | \| <NP>g <VAP2> |
| | (B3) | \| <NP>g <DAP> |
| | (B4) | <NP> ::= <NP>g [<PP>] |
| | (B5) | \| [<ART>] <NP1>g |
| | (B6) | \| <POSS> <NP1>g |
| | (B7) | \| <IQADJ> <NP1>g |
| | (B8) | <NP1> ::= [<DQADJ>] <NP2>g |
| | (B9) | <NP2> ::= [<VAP3>] <NP3>g |
| | (B10) | \| <VAP4> <NP3>g |
| | (B11) | \| <NADJ> <NP2>g |
| | (B12) | \| <DADJ> <NP2>g |
| | (B13) | <NP3> ::= [<NP2>] <N>g |
| (C) Adjective clauses and phrases | (C1) | <VAP1> ::= <VAP1>g [<ADVP>] |
| | (C2) | \| <ING>g <NPC> |
| | (C3) | <VAP2> ::= <VAP2>g [<ADVP>] |
| | (C4) | \| <EN>g <NPC> |
| | (C5) | <DAP> ::= <DADJ>g <PP> |
| | (C6) | <VAP3> ::= [<N>] <ING>g |
| | (C7) | <VAP4> ::= [<N>] <EN>g |
| (D) The others | (D1) | <ADVP> ::= <ADV> \| <PP> |
| | (D2) | <PP> ::= <PREP> <NPC> |
| | (D3) | <POSS> ::= <NP1>'s |
Table 4. Non-terminal symbols.
| Non-terminal symbol | Mnemonic |
|---|---|
| <NPC> | Noun Phrases or Clauses |
| <NC> | Noun Clause |
| <NP> | Noun Phrase |
| <VAP> | Verbal Adjective Phrase (or clause) |
| <DAP> | Descriptive Adjective Phrase (or clause) |
| <ADVP> | ADVerb Phrase |
| <PP> | Prepositional Phrase |
| <POSS> | POSSessive noun phrase |
| <ART> | ARTicle |
| <IQADJ> | Indefinite Quantitative ADJective |
| <DQADJ> | Definite Quantitative ADJective |
| <NADJ> | Nominal ADJective |
| <DADJ> | Descriptive ADJective |
| <ING> | present participle (verbal adjective) |
| <EN> | past participle (verbal adjective) |
| <N> | Noun |
| <ADV> | ADVerb |
| <PREP> | PREPosition |
If the form reduced to <G> already has a function form
$$g(K_1 := d_1,\ \ldots,\ K_m := d_m),$$
the function-expression
$$g(K_1 := d_1,\ \ldots,\ K_m := d_m,\ K := d)$$
is constructed.
The case-label $K$ in the function-expression (4) is determined as follows.
If either the governor $g$ or the dependant $d$ belongs to the category 'event', the
case-label $K$ is determined by the category of $d$ (or $g$), the category of a case of the
predicate corresponding to $g$ (or $d$), and the preposition of the dependant, by
referring to Table 1-A and (b-1) to (b-7) in Table 1-B.
If both the governor $g$ and the dependant $d$ belong to the category 'non-event',
the case-label $K$ is determined by the categories of both $g$ and $d$ and the
preposition of the dependant, by referring to (c-1) to (c-9) in Table 1-B.
[Example 1]
Let us consider a title
'SiO2 films formed by ion implantation into silicon'.
The categories of the words 'SiO2', 'films', 'ion', 'implantation' and 'silicon' involved
in the above are 'material', 'parts', 'object', 'event' and 'object' respectively. It is also
seen by referring to a word-dictionary that the case structure of both the predicates 'form'
and 'implant' is
SUBJ := human or device, OBJ := object thing.
Hence, by referring to the syntactic rules (B13), (B2), (C3) and (B4) in Table 3, the
above title is converted into the function-expression
'films(COMPONENT := SiO2, OBJ^{-1} := formed(MEANS := implantation
(OBJ := ion, GOAL := silicon)))'.
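The bottom-up construction in Example 1 can be mimicked by repeatedly attaching one labelled dependant to a governor, which extends g(K1 := d1, ..., Km := dm) to g(K1 := d1, ..., Km := dm, K := d). The sketch below assumes the case-labels have already been chosen from Tables 1-A and 1-B; `attach` and `render` are hypothetical helper names, and an inverse label is written as `OBJ^-1`.

```python
def attach(governor, label, dependant):
    """Return the governor extended with one more labelled dependant."""
    head, args = governor if isinstance(governor, tuple) else (governor, [])
    return (head, args + [(label, dependant)])


def render(term):
    """Print a term in the function-expression notation of the paper."""
    if isinstance(term, str):
        return term
    head, args = term
    return head + "(" + ", ".join(f"{k} := {render(v)}" for k, v in args) + ")"


# Reductions for 'SiO2 films formed by ion implantation into silicon'.
implantation = attach(attach("implantation", "OBJ", "ion"), "GOAL", "silicon")
formed = attach("formed", "MEANS", implantation)
films = attach(attach("films", "COMPONENT", "SiO2"), "OBJ^-1", formed)
print(render(films))
```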
The above parsing can also remove various ambiguities of dependency-relations.
However, there still remain some kinds of ambiguity that cannot be removed by the
simple categorical matching used in the above analysis.
Let us consider a case where a phrase or a clause that belongs to the category 'event'
can depend syntactically on either of two words. If both of these words belong to a
category taken by a case of the predicate word, simple categorical matching cannot
resolve which of the two words the predicate word depends on.
Such ambiguities can mostly be removed by using a set of lower, more precise
categories taken by the cases of a predicate word.
Given a predicate word $p$, there are generally several sets of lower categories.
Denote one of these categorical sets by
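$$[C_1,\ \ldots,\ C_k,\ \ldots,\ C_l,\ \ldots]. \tag{5}$$
Then the relation
$$(\forall t_1) \cdots (\forall t_k) \cdots (\forall t_l) \cdots \bigl[\, p(K_1 := t_1, \ldots, K_k := t_k, \ldots, K_l := t_l, \ldots) \rightarrow t_1 \in C_1 \wedge \cdots \wedge t_k \in C_k \wedge \cdots \wedge t_l \in C_l \wedge \cdots \bigr] \tag{5'}$$
holds.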
Using the above relation, the ambiguities of dependency-relations can generally be
removed. Suppose, for instance, a noun clause
'$t_l \cdots t \cdots p \cdots t_k \cdots$',
where '$p \cdots t_k \cdots$' is an adjective clause and it is ambiguous whether the
adjective clause depends on $t_l$ or $t$.
In such a situation, if there is a categorical set of $p$ such that $t_l$ belongs to $C_l$
and $t$ does not belong to any category of the categorical set (5) under the condition
$t_k \in C_k$, then $p$ is determined to depend on $t_l$ through the case-label $K_l^{-1}$.
[Example 2]
Let us consider a title
'Connectors of conductive rubber used for leadless electronic device'.
The three words 'connectors', 'rubber' and 'device' belong to the categories 'parts',
'material' and 'device'.
One of the case structures of 'use' is
use(OBJ := parts, GOAL := device).
Hence, the adjective clause 'used for leadless electronic device' can be determined to
depend on 'connectors' through the case-label OBJ^{-1}.
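The disambiguation of Example 2 can be sketched as a lookup against the categorical sets of the predicate. The dictionaries `CASE_SETS` and `CATEGORY` below are hypothetical stand-ins for the word-dictionary, holding only the entries quoted in the example, and `candidate_governors` is an illustrative name.

```python
# Hypothetical dictionary entries: each predicate lists its categorical sets,
# mapping a case-label to the lower category it accepts.
CASE_SETS = {"use": [{"OBJ": "parts", "GOAL": "device"}]}
CATEGORY = {"connectors": "parts", "rubber": "material", "device": "device"}


def candidate_governors(predicate, inner_label, inner_word, candidates):
    """Return (word, inverse case-label) pairs allowed by a categorical set."""
    matches = []
    for case_set in CASE_SETS[predicate]:
        # The condition t_k in C_k: the clause-internal dependant must fit.
        if case_set.get(inner_label) != CATEGORY[inner_word]:
            continue
        for label, category in case_set.items():
            if label == inner_label:
                continue
            for word in candidates:
                if CATEGORY[word] == category:
                    matches.append((word, label + "^-1"))
    return matches


# 'used for ... device' may attach to 'connectors' or to 'rubber'.
result = candidate_governors("use", "GOAL", "device", ["connectors", "rubber"])
print(result)
```

When exactly one pair is returned, as here, the attachment is decided; an empty or multi-element result would mean that this categorical set does not resolve the ambiguity.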
3. Normalization of Function-Expressions
Document retrieval generally requires an essential coincidence between the expression
of a request and that of the heading of a relevant document stored in a file. Hence, it is
desirable that different function-expressions which are equivalent in meaning be
transformed, in advance of retrieval, into a unique and concise expression called the
normal form.
The transformations to the normal form consist of [1] nominalization of verbal
words and removal of some nominalized words, and [2] equivalence-transformations in
some cases.
[1] Nominalization and removal of some kinds of verbal noun
All the predicate words consisting of verbal words and adjective words are replaced
with their nominalized words.
[Example 3]
The function-expression of Example 1 is nominalized into
'films(COMPONENT := SiO2, OBJ^{-1} := formation(MEANS := implantation
(OBJ := ion, GOAL := silicon)))'.
The nominalized function-expressions contain some expressions which can be made
simpler. They are expressions which contain verbal noun words $p'$ such as 'composition',
'formation' and 'application' in the following form:
$$f(L_j^{-1} := p'(L_i := t)). \tag{6}$$
As described in the preceding section, there is a corresponding compound noun
phrase expression which represents the dependency-relation between an object and its
composite or an event and its instrument, and the corresponding function-expression is
$$f(K := t). \tag{7}$$
Since expression (7) is more concise than expression (6) and more efficient for
retrieval, expression (6) is transformed to expression (7) if $p'$ in (6) is a functional verbal
noun of the kind shown above. The case-label $K$ in (7) is determined by the categories
of the governor $f$ and the dependant $t$ and by the case-label $L_i$ of $t$ for the verbal
noun word $p'$, by using Table 5.
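Table 5. Removal of verbal nouns.
| Verbal noun p' | Case-label L_i | Category of governor f | Category of dependant t | Case-label K |
|---|---|---|---|---|
| COMPOSITION | INST | OBJECT | OBJECT | COMPONENT |
| COMPOSITION | SUBJ | OBJECT | OBJECT | COMPOSITE |
| FORMATION | MEANS | OBJECT | EVENT | PROCESS |
| USE | OBJ | EVENT | OBJECT | INST |
| USE | OBJ | EVENT | EVENT | MEANS |
| USE | OBJ | OBJECT | OBJECT | COMPONENT |
| USE | PURP | OBJECT | EVENT | INST^{-1} |
| USE | PURP | EVENT | EVENT | PURP |
| USE | GOAL | OBJECT | OBJECT | COMPOSITE |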
If the new case-label of another term-word which depends on a removed functional
verbal word is not provided in Table 5, the original case-label on the removed functional
verbal word is used unchanged as the new case-label for convenience.
[Example 4]
In the function-expression of Example 3, 'formation' is a removable verbal noun,
'films' and 'implantation' belong to 'object' and 'event' respectively, and the case-label of
'implantation' on 'formation' is 'MEANS'.
Hence, from Table 5, the function-expression of Example 3 is transformed into
'films(COMPONENT := SiO2, PROCESS := implantation
(OBJ := ion, GOAL := silicon))'.
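The removal step of Example 4 can be sketched as a recursive rewrite driven by a lookup table. The sketch below encodes only three rows of Table 5 and a hand-written category dictionary (`COARSE`), both of which are simplifying assumptions; terms are `(head, [(label, term), ...])` tuples or plain strings.

```python
# (verbal noun, inner label L_i, governor category, dependant category) -> K
TABLE_5 = {
    ("formation", "MEANS", "object", "event"): "PROCESS",
    ("use", "OBJ", "event", "object"): "INST",
    ("use", "OBJ", "event", "event"): "MEANS",
}
REMOVABLE = {key[0] for key in TABLE_5}
# Assumed coarse categories for the words of the example.
COARSE = {"films": "object", "SiO2": "object", "ion": "object",
          "silicon": "object", "implantation": "event"}


def head_of(term):
    return term if isinstance(term, str) else term[0]


def remove_verbal_nouns(term):
    """Rewrite f(L^-1 := p'(L_i := t)) as f(K := t) for removable p'."""
    if isinstance(term, str):
        return term
    head, args = term
    new_args = []
    for label, dep in args:
        dep = remove_verbal_nouns(dep)
        if not isinstance(dep, str) and dep[0] in REMOVABLE:
            noun, inner_args = dep
            for inner_label, t in inner_args:
                key = (noun, inner_label, COARSE.get(head), COARSE.get(head_of(t)))
                # Fall back to the original inner label when Table 5 has no entry.
                new_args.append((TABLE_5.get(key, inner_label), t))
        else:
            new_args.append((label, dep))
    return (head, new_args)


implantation = ("implantation", [("OBJ", "ion"), ("GOAL", "silicon")])
example_3 = ("films", [("COMPONENT", "SiO2"),
                       ("OBJ^-1", ("formation", [("MEANS", implantation)]))])
print(remove_verbal_nouns(example_3))
```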
[2] Equivalence-transformations
The function-expressions thus obtained still contain some expressions which are
different from each other in their apparent forms of dependency-relations and equivalent
in their meanings. These expressions often appear in the following forms:
(a) An object $f$ modified by both an action $f'$ and a manner $f''$ of $f'$,
(b) An object $f$ modified by both an attribute $f'$ and a description $f''$ of $f'$,
(c) A manipulation $f''$ such as identification modified by both an object $f$ and
an attribute $f'$ of $f$.
The above expressions each have two dependency forms, as shown on the two
sides of the following equivalence-relations:
$$f(K_1 := f'(K'_1 := f''(s),\ t),\ r) = f(K_2 := f''(s,\ K'_2 := f'(t)),\ r), \tag{8}$$
$$f(K_1 := f'(K'_1 := f''(s),\ t),\ r) = f(K_2 := f'(t),\ K'_2 := f''(s),\ r), \tag{9}$$
$$f''(K_1 := f'(K'_1 := f(s),\ t),\ r) = f''(K_2 := f'(t),\ K'_2 := f(s),\ r), \tag{10}$$
where the symbol '=' denotes an equivalence-relation, $s$, $t$ and $r$ represent null or some
terms prefixed by case-labels, and $K_1$, $K'_1$, $K_2$, $K'_2$ are the case-labels shown in Table 6.
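Table 6. Sets of cases for equivalence-relations.
| Classification | Category of f | Category of f' | Category of f'' | Expression's number | K_1 | K'_1 | K_2 | K'_2 |
|---|---|---|---|---|---|---|---|---|
| (a) | OBJECT | ACTION | DESCRIPTION | (8) | SUBJ^{-1} | MANN | EXH^{-1} | SUBJ |
| (a) | OBJECT | ACTION | DESCRIPTION | (9) | SUBJ^{-1} | MANN | SUBJ^{-1} | EXH^{-1} |
| (b) | OBJECT | ATTRIBUTE | DESCRIPTION | (8) | ATTR | SUBJ^{-1} | EXH^{-1} | SUBJ |
| (c) | OBJECT | ATTRIBUTE | METHOD | (8) | ATTR | OBJ^{-1} | EXH^{-1} | OBJ |
| (c) | OBJECT | ATTRIBUTE | METHOD | (10) | OBJ | OBJECT | OBJ | EXH |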
For convenience, the normal forms are assumed here to be defined as the expressions
of the left side of the respective equivalence-relations. Hence, if there are some
expressions of the right side of the equivalence-relation (8), (9) or (10), they are
transformed into those of the left side.
The function-expression obtained by the transformations described in [1] and [2]
is called a normal function-expression.
[Example 5]
(i) The function-expression of the noun phrase
'Optimum transistor of power amplification'
is
'transistor(SUBJ^{-1} := amplification(OBJ := power),
EXH^{-1} := optimum)'.
This is transformed by (a-9) in Table 6 into the normal form
'transistor(SUBJ^{-1} := amplification(OBJ := power, MANN := optimum))',
which is the function-expression of the noun phrase
'Transistor with optimum power amplification'.
(ii) The function-expression of the noun phrase
'Diodes improved in efficiency'
is
'diodes(EXH^{-1} := improvement(OBJ := efficiency))'.
This is transformed by (c-8) in Table 6 into the normal form
'diodes(ATTR := efficiency(OBJ^{-1} := improvement))',
which is the function-expression of the noun phrase
'Diodes with improved efficiency'.
(iii) The function-expression of the noun phrase
'Complex-permittivity measurement of liquid'
is
'measurement(OBJ := complex-permittivity, EXH := liquid)'.
This is transformed by (c-10) in Table 6 into the normal form
'measurement(OBJ := complex-permittivity(OBJECT := liquid))',
which is the function-expression of the noun phrase
'Measurement of complex-permittivity in liquid'.
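Case (i) of Example 5 can be sketched as a right-to-left application of equivalence-relation (9): two sibling dependants of the object are merged so that the description modifies the action instead. The sketch hard-codes the label set of row (a-9) as default arguments and skips the category checks on f, f' and f'' that a fuller treatment would need; the function name is an assumption.

```python
def normalize_by_relation_9(term, k2="SUBJ^-1", k2p="EXH^-1",
                            k1="SUBJ^-1", k1p="MANN"):
    """Rewrite f(K2 := f'(t), K2' := f''(s), r) as f(K1 := f'(t, K1' := f''(s)), r)."""
    head, args = term
    by_label = dict(args)
    if k2 not in by_label or k2p not in by_label:
        return term  # not of the right-hand form; leave unchanged
    action, description = by_label[k2], by_label[k2p]
    rest = [(label, dep) for label, dep in args if label not in (k2, k2p)]
    a_head, a_args = action if isinstance(action, tuple) else (action, [])
    merged = (a_head, a_args + [(k1p, description)])
    return (head, [(k1, merged)] + rest)


# 'Optimum transistor of power amplification'
before = ("transistor", [("SUBJ^-1", ("amplification", [("OBJ", "power")])),
                         ("EXH^-1", "optimum")])
after = normalize_by_relation_9(before)
print(after)
```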
4. Retrieval
A request for retrieval is input in the form of an English noun phrase or clause and
is transformed into a normal function-expression by the procedures described in the
preceding sections. The words of function-symbols contained in a normal
function-expression are called key-terms.
The key-terms are arranged hierarchically in an inverted file according to the
superior-inferior relations of their categories, as shown in Fig. 1.4),7) Each key-term is
followed by several pairs consisting of the identification number of a document which
contains the key-term and a position-symbol of the key-term.
The position-symbol of a key-term indicates the role-position of the key-term in the
normal function-expression corresponding to a title. The assignment of a
position-symbol is specified as follows:
(1) A null string is assigned to the position-symbol of a head key-term which does
not have a preceding key-term to modify.
[Figure: categorical key-term trees of the inverted file, e.g. 'object' (devices, parts, material) and 'event' (phenomenon, manipulation: addition, implantation, diffusion), each key-term followed by document numbers and position-symbols.]
Fig. 1. An illustration of the inverted file.
(2) A string '$\alpha \cdot K$' is assigned to the position-symbol of a key-term which
modifies, through the prefixed case-label $K$, the preceding key-term having a
position-symbol $\alpha$.
[Example 6]
In the normal function-expression
① 'films(COMPONENT := SiO2, PROCESS := implantation(OBJ := oxygen-ion,
GOAL := silicon))'
corresponding to the title of a document
'SiO2 films formed by oxygen-ion implantation into silicon',
the position-symbols of the key-terms 'films', 'SiO2', 'implantation' and 'silicon' are a
null string, 'COMPONENT', 'PROCESS' and 'PROCESS·GOAL' respectively.
In the normal function-expression
② 'magnetoresistance(OBJECT := films(COMPONENT := permalloy))'
corresponding to the title of a document
'Magnetoresistance in permalloy films',
the position-symbols of the key-terms 'magnetoresistance', 'films' and 'permalloy' are a
null string, 'OBJECT' and 'OBJECT·COMPONENT' respectively.
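The two assignment rules describe a walk over the normal function-expression that accumulates case-labels along the path from the head key-term. A minimal sketch, using `(head, [(label, term), ...])` tuples for expressions and an ASCII '.' in place of the '·' separator; the function name is illustrative.

```python
def position_symbols(term, prefix=""):
    """Return (key-term, position-symbol) pairs of a normal function-expression."""
    head, args = term if isinstance(term, tuple) else (term, [])
    pairs = [(head, prefix)]  # rule (1): the head key-term gets the null string
    for label, dep in args:
        # rule (2): extend the governor's position-symbol with the case-label
        child = label if not prefix else prefix + "." + label
        pairs.extend(position_symbols(dep, child))
    return pairs


doc_1 = ("films", [("COMPONENT", "SiO2"),
                   ("PROCESS", ("implantation", [("OBJ", "oxygen-ion"),
                                                 ("GOAL", "silicon")]))])
doc_2 = ("magnetoresistance", [("OBJECT", ("films", [("COMPONENT", "permalloy")]))])
for key_term, symbol in position_symbols(doc_1) + position_symbols(doc_2):
    print(repr(key_term), repr(symbol))
```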
Fig. 1 illustrates a part of an inverted file which records the information contained
in the titles of the documents ① and ② in the above example in a distributed form on
several categorical key-term trees.
The retrieval consists of Mode (1) and Mode (2).
Mode (1) Retrieval of documents which have key-terms equivalent or inferior to those
of a request, by means of the usual key-term matching from the inverted file.
Mode (2) Retrieval of documents implied contextually as well as semantically by a
request, from among the documents retrieved in Mode (1).
The retrieval in Mode (2) is divided into the following two cases (i) and (ii):
(i) Implication-relations based on both superior-inferior relations and qualifications
by key-terms
Let $|t|$ denote the set of objects, attributes or events expressed by a key-term $t$.
Then, if the formula
$$(|f| \supseteq |f'|) \wedge (\forall i)(\exists j)\,(K_i = K'_j \wedge |t_i| \supseteq |t'_j|)$$
holds for the following two function-expressions:
$$|t| \triangleq |f(K_1 := t_1,\ K_2 := t_2,\ \ldots,\ K_m := t_m)|,$$
$$|t'| \triangleq |f'(K'_1 := t'_1,\ K'_2 := t'_2,\ \ldots,\ K'_n := t'_n)|,$$
then
$$|t| \supseteq |t'| \tag{11}$$
holds.
Since the case-labels in a function-expression are represented as position-symbols in
the inverted file as mentioned above, the retrieval procedure based on the above
implication-relation (11) is described as follows:
[Procedure 1] Let $t_j$ and $\alpha_j$ ($j = 1, 2, \ldots, n$) be a key-term and its
position-symbol in a function-expression of a request respectively; then retrieve the
documents which have function-expressions containing at least $n$ pairs of a key-term
equivalent or inferior to $t_j$ and the position-symbol $\alpha_j$.
[Example 7]
The document ① is retrieved by Procedure 1 from the inverted file shown in Fig. 1
for a request such as
'films(COMPONENT := oxide, PROCESS := addition
(OBJ := impurities))'
or 'Oxide films formed by impurities addition'.
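Procedure 1 and Example 7 can be sketched as matching (key-term, position-symbol) pairs while walking up a superior-inferior hierarchy. Everything named below is an assumption made for illustration: the `PARENT` links stand in for the categorical trees of Fig. 1 (only 'implantation' under 'addition' is legible there), `DOCS` holds the position-symbol pairs of documents ① and ② with '.' as separator, and a flat dictionary replaces the tree-like inverted file.

```python
# Hypothetical superior-inferior links: inferior key-term -> superior key-term.
PARENT = {"SiO2": "oxide", "implantation": "addition",
          "oxygen-ion": "impurities", "permalloy": "magnetic material"}

DOCS = {
    1: [("films", ""), ("SiO2", "COMPONENT"), ("implantation", "PROCESS"),
        ("oxygen-ion", "PROCESS.OBJ"), ("silicon", "PROCESS.GOAL")],
    2: [("magnetoresistance", ""), ("films", "OBJECT"),
        ("permalloy", "OBJECT.COMPONENT")],
}


def equivalent_or_inferior(term, request_term):
    """True if term equals request_term or lies below it in the hierarchy."""
    while term is not None:
        if term == request_term:
            return True
        term = PARENT.get(term)
    return False


def procedure_1(request_pairs):
    """Return documents matching every (key-term, position-symbol) of the request."""
    hits = []
    for doc_id, pairs in DOCS.items():
        if all(any(pos == r_pos and equivalent_or_inferior(term, r_term)
                   for term, pos in pairs)
               for r_term, r_pos in request_pairs):
            hits.append(doc_id)
    return hits


request = [("films", ""), ("oxide", "COMPONENT"),
           ("addition", "PROCESS"), ("impurities", "PROCESS.OBJ")]
print(procedure_1(request))
```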
(ii) Implication-relations based on a pair of a part and a whole
The documents entitled
'[a study or report] on a whole $f'$ which contains a part $f$'
can be considered to imply semantically those entitled
'[a study or report] on a part $f$ contained in a whole $f'$'.
[Example 8]
'Film with magnetoresistance' implies 'Magnetoresistance in film'.
As shown in Table 7, there will be several case-pairs of a part and a whole, such as
a component and its composite, an attribute and its object, or means and their purpose.
Table 7. Case-pairs of a part and a whole.
| K_1 | K_2 |
|---|---|
| COMPOSITE | COMPONENT |
| OBJECT | ATTRibute |
| PURPose | MEANS |
The above implication-relation can be expressed in a symbolic notation:
$$f(t,\ K_1 := f'(s)) \sqsubseteq f'(s,\ K_2 := f(t)), \tag{12}$$
where both $t$ and $s$ represent sequences of terms prefixed by case-labels, and both $K_1$
and $K_2$ are case-labels that indicate modification-relations between a part and a whole.
As seen from the implication-relation (12), the key-term of the whole $f'$ on the left
side has the extra position-symbol $K_1$ compared with that on the right side, while
the key-term of the part $f$ on the right side has the extra position-symbol $K_2$
compared with that on the left side.
Based on the implication-relation (12), the retrieval procedure is described as
follows:
[Procedure 2] Let $t_j$ and $\alpha_j$ be a key-term and its position-symbol respectively
in a function-expression of a request.
Then retrieve the documents which have a key-term equivalent or inferior
to $t_j$ whose position-symbols are such that
(1) they consist of $\alpha_j$ with a case-label $K_1$ shown in the left column of
Table 7 added,
or
(2) they consist of $\alpha_j$ with a case-label $K_2$ shown in the right
column of Table 7 deleted.
[Example 9]
The document ② is retrieved by Procedures 1 and 2 from the inverted file shown
in Fig. 1 for a request such as
'film(COMPONENT := magnetic material)' or 'Film from magnetic material'.
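Procedure 2 and Example 9 can be sketched by shifting each request position-symbol before matching. The reading below, in which "adding" K_1 means prefixing it and "deleting" K_2 means dropping it from the front, is inferred from Example 9 and is an assumption, as are the `PARENT` link and all function names. Each key-term is shifted independently, which is looser than a full implementation would be.

```python
PARENT = {"permalloy": "magnetic material"}    # hypothetical hierarchy link
WHOLE_LABELS = ["COMPOSITE", "OBJECT", "PURP"]  # K_1, left column of Table 7
PART_LABELS = ["COMPONENT", "ATTR", "MEANS"]    # K_2, right column of Table 7


def equivalent_or_inferior(term, request_term):
    while term is not None:
        if term == request_term:
            return True
        term = PARENT.get(term)
    return False


def shifted_positions(position):
    """Position-symbols obtained by adding a K_1 or deleting a leading K_2."""
    shifted = {k1 if not position else k1 + "." + position for k1 in WHOLE_LABELS}
    first, _, rest = position.partition(".")
    if first in PART_LABELS:
        shifted.add(rest)
    return shifted


def procedure_2(request_pairs, doc_pairs):
    """True if every request key-term is matched at a shifted position."""
    return all(any(pos in shifted_positions(r_pos)
                   and equivalent_or_inferior(term, r_term)
                   for term, pos in doc_pairs)
               for r_term, r_pos in request_pairs)


doc_1 = [("films", ""), ("SiO2", "COMPONENT"), ("implantation", "PROCESS")]
doc_2 = [("magnetoresistance", ""), ("films", "OBJECT"),
         ("permalloy", "OBJECT.COMPONENT")]
request = [("films", ""), ("magnetic material", "COMPONENT")]
print(procedure_2(request, doc_2), procedure_2(request, doc_1))
```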
5. Simulation
A small-scale simulation of the language analysis of titles, the normalization and
the document retrieval was carried out on a FACOM U-200. The program was written in
an assembly language. The memory required for document retrieval and that for both
language analysis and normalization were about 3.4 KB and 12 KB respectively.
The following are some illustrative examples of inputs and their outputs.
Some of the input titles, prefixed by a document number, are
(37) 'Low noise MOS-FET utilizing molybdenum gate masked ion implantation',
(48) 'Modified triple diffused power-transistor',
(47) 'Thin SiO2 film formed by ion implantation into silicon',
(32) 'The growth of epitaxial layers for fabrication of I.A, composites',
(45) 'Fabrication of silicon Schottky-barrier-diodes by sputtering'.
They are transformed into their normal function-expressions respectively:
(37) 'MOS-FET(ATTR := noise(SUBJ^{-1} := lowness),
PROCESS := implantation(OBJ := ion,
MEANS := masking(OBJ := gate(COMPONENT := molybdenum))))',
(48) 'power-transistor(PROCESS := diffusion(MANN := triple),
OBJ^{-1} := modification)',
(47) 'film(SUBJ^{-1} := thinness, COMPONENT := SiO2,
PROCESS := implantation(OBJ := ion, GOAL := silicon))',
(32) 'growth(OBJ := layers(SUBJ^{-1} := epitaxial),
PURP := fabrication(OBJ := composites(COMPONENT := I.A,)))',
(45) 'fabrication(OBJ := Schottky-barrier-diodes
(COMPONENT := silicon), MEANS := sputtering)'.
These are recorded in a file which consists of several categorical trees as shown in
Fig. 1.
For some requests R1, R2 and R3, the numbers of the relevant documents are
retrieved as follows:
INPUT REQUESTS
R1: 'Transistor formed by addition',
'transistor(PROCESS := addition)',
R2: 'Thin dielectric film',
'film(SUBJ^{-1} := thinness, COMPONENT := dielectric)',
R3: 'Fabrication of semiconductor parts',
'fabrication(OBJ := parts(COMPONENT := semiconductor))'.
OUTPUT DOCUMENT NUMBERS
R1: (37) and (48),
R2: (47),
R3: (32) and (45).
The documents (37), (47), (48) and (45) were retrieved by using the
implication-relation (11) based on superior-inferior relations of categories and
qualifications.
The document (32) was retrieved by using both the implication-relations (11) and
(12), based on modification relations between a part and a whole as well as
superior-inferior relations.
6. Conclusion
It was found from a computer experiment that, although the system presented in this
paper is a higher-level retrieval system using dependency-relations between key-terms, its
retrieval speed and required memory are almost the same as those of the usual key-term
matching system.
As for the improvement of the recall rate, the equivalence and implication relations
introduced above are considered to play an essential role. More general and extensive
studies in this direction are very important and are left for future work.
References
1) B. Bruce, Artificial Intelligence, 6, 327 (1975).
2) Y. Wilks, Comm. ACM, 18, 264 (1975).
3) A. V. Gershman, The 5th International Joint Conference on AI, 1, 132 (1977).
4) S. E. Fahlman, MIT, AI memo 331 (1975).
5) F. W. Lancaster, Information Retrieval System, New York: John Wiley & Sons, (1968).
6) N. Abe, et al., Trans. IPS. Japan, 11, 699 (1970).
7) W. M. Turski, Inform. Stor. Retr., 7, 89 (1971).
8) F. Nishida, et al., Trans. IECE. Japan, E60, 290 (1977).