29/11/2007 Dutch-Belgian Database Day 2007
PAS: A Personal Alert System for Information Retrieval in CRISs
Germán Hurtado Martín (1, 2)
Chris Cornelis (2)
(1) Hogeschool Gent, (2) Universiteit Gent
Overview
- CRISs
- Fuzzy sets and rough sets
- PAS project
CRISs: Current Research Information Systems
Bring together information related to current research
Publications, project descriptions, programmes, researchers, organizations, patents…
Examples of CRISs
- USDA/CRIS: http://cris.csrees.usda.gov
- SICRIS: http://sicris.izum.si
- RIS: http://www.ris.is
- IWETO: http://www.iweto.be
- Degóis: http://www.degois.pt
- uniCRIS: http://www.unicris.com
- euroCRIS: http://www.eurocris.org
Information Retrieval in CRISs
[Figure labels: Fuzzy, Rough]
Fuzzy sets and rough sets
Traditional approach: crisp sets
Young people = {x ∈ People | 0 < age(x) < 27}
Fuzzy sets and rough sets
Fuzzy approach: fuzzy sets
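\[
\mathrm{Young}(x) =
\begin{cases}
0 & \text{if } \mathrm{age}(x) \ge 30 \\
1 & \text{if } \mathrm{age}(x) \le 20 \\
(30 - \mathrm{age}(x)) / 10 & \text{otherwise}
\end{cases}
\]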
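The membership function above can be sketched in a few lines of Python. The function name `young` and the sample ages are chosen here for illustration only.

```python
def young(age: float) -> float:
    """Membership degree of a person of the given age in the fuzzy set Young."""
    if age >= 30:
        return 0.0
    if age <= 20:
        return 1.0
    # Linear decrease from 1 at age 20 to 0 at age 30.
    return (30 - age) / 10


# Print the membership degree for a few sample ages.
for sample_age in (18, 25, 35):
    print(sample_age, young(sample_age))
```

Unlike the crisp set, which switches abruptly from "young" to "not young" at a fixed age, the degree here decreases gradually between 20 and 30.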
Fuzzy sets and rough sets
Rough approach: rough sets
- Information system: (X, A)
- Equivalence relation in X: R
- Equivalence class of x: Rx

Equivalence classes:
- {x1,x4}, {x2}, {x3}, {x5}, {x6} with P = {Organisation, Funding, Discipline}
- {x1,x4,x5}, {x2}, {x3}, {x6} with P = {Organisation, Discipline}

[Figure: information system table, labelled X and A]
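As an illustration of how such equivalence classes arise, here is a minimal sketch that groups objects by their values on a chosen attribute subset P. The attribute values in `TABLE` (O1, F1, D1, ...) are invented for this example, since the table on the slide is not reproduced here; they were picked only so that the resulting partitions match the classes listed above.

```python
from collections import defaultdict

# Hypothetical information system: values are invented so that the
# partitions below match the equivalence classes given on the slide.
TABLE = {
    "x1": {"Organisation": "O1", "Funding": "F1", "Discipline": "D1"},
    "x2": {"Organisation": "O1", "Funding": "F1", "Discipline": "D2"},
    "x3": {"Organisation": "O2", "Funding": "F1", "Discipline": "D1"},
    "x4": {"Organisation": "O1", "Funding": "F1", "Discipline": "D1"},
    "x5": {"Organisation": "O1", "Funding": "F2", "Discipline": "D1"},
    "x6": {"Organisation": "O2", "Funding": "F2", "Discipline": "D2"},
}


def equivalence_classes(table, attributes):
    """Group objects that agree on every attribute in `attributes`."""
    groups = defaultdict(set)
    for obj, values in table.items():
        key = tuple(values[a] for a in attributes)
        groups[key].add(obj)
    # Sort classes by their smallest member for a stable, readable order.
    return sorted(groups.values(), key=min)


print(equivalence_classes(TABLE, ["Organisation", "Funding", "Discipline"]))
print(equivalence_classes(TABLE, ["Organisation", "Discipline"]))
```

Dropping Funding from P makes x5 indistinguishable from x1 and x4, which is why the second partition is coarser.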
Rough set: representation
Upper approx. R↑A = {y ∈ X | Ry ∩ A ≠ Ø}
Lower approx. R↓A = {y ∈ X | Ry ⊆ A}
[Figure: X partitioned into equivalence classes of R; A is the set of positive examples]
Rough set (R↓A, R↑A): example
Equivalence classes: {x1,x4}, {x2}, {x3}, {x5}, {x6} with P = {Organisation, Funding, Discipline}
[Figure: A with its lower approximation R↓A and upper approximation R↑A]
A = {x1, x2, x3}
R↓A = {x2, x3}
R↑A = {x1, x2, x3, x4}
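The example can be checked with a short sketch. The function names are chosen here for illustration; the classes and the set A are the ones given above.

```python
def lower_approximation(classes, target):
    """Union of the equivalence classes that lie entirely inside `target`."""
    return set().union(*(c for c in classes if c <= target))


def upper_approximation(classes, target):
    """Union of the equivalence classes that share an element with `target`."""
    return set().union(*(c for c in classes if c & target))


classes = [{"x1", "x4"}, {"x2"}, {"x3"}, {"x5"}, {"x6"}]
A = {"x1", "x2", "x3"}

print(sorted(lower_approximation(classes, A)))  # ['x2', 'x3']
print(sorted(upper_approximation(classes, A)))  # ['x1', 'x2', 'x3', 'x4']
```

x1 is in A but falls outside the lower approximation because its class also contains x4, which is not in A; for the same reason x4 enters the upper approximation.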
Fuzzy rough sets
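Fuzzy approach on rough sets
- Fuzzy set A
- Fuzzy relation R, with degrees R(x,y)

Upper approximation:
\[ (R{\uparrow}A)(y) = \sup_{x \in X} T(R(x,y), A(x)) \]
Lower approximation:
\[ (R{\downarrow}A)(y) = \inf_{x \in X} I(R(x,y), A(x)) \]
where T is a t-norm and I is an implicator.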
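A minimal sketch of these two approximations on a finite universe follows. The slides do not say which t-norm or implicator is used, so the minimum t-norm and the Kleene-Dienes implicator max(1 − a, b) are assumptions made here, and the three-element relation and fuzzy set are invented sample data.

```python
def t_norm(a, b):
    """Minimum t-norm (an assumed choice for T)."""
    return min(a, b)


def implicator(a, b):
    """Kleene-Dienes implicator (an assumed choice for I)."""
    return max(1.0 - a, b)


def upper_approximation(R, A):
    """(R↑A)(y) = sup over x of T(R(x,y), A(x)); the universe is the key set of A."""
    return {y: max(t_norm(R[x][y], A[x]) for x in A) for y in A}


def lower_approximation(R, A):
    """(R↓A)(y) = inf over x of I(R(x,y), A(x)); the universe is the key set of A."""
    return {y: min(implicator(R[x][y], A[x]) for x in A) for y in A}


# Invented sample data: a reflexive, symmetric fuzzy relation and a fuzzy set.
R = {
    "a": {"a": 1.0, "b": 0.7, "c": 0.0},
    "b": {"a": 0.7, "b": 1.0, "c": 0.2},
    "c": {"a": 0.0, "b": 0.2, "c": 1.0},
}
A = {"a": 1.0, "b": 0.3, "c": 0.0}

print(upper_approximation(R, A))
print(lower_approximation(R, A))
```

With a reflexive R, the lower approximation never exceeds A and the upper approximation never falls below it, mirroring the crisp case.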
Fuzzy rough sets: application
Query expansion: allows more results by using R↑A
R            Programming  Hardware  C++  Java  Laptop  Algorithm
Programming  1.0          -         0.8  0.8   -       0.6
Hardware     -            1.0       -    -     0.4     -
C++          0.8          -         1.0  0.7   -       0.2
Java         0.8          -         0.7  1.0   -       0.2
Laptop       -            0.4       -    -     1.0     -
Algorithm    0.6          -         0.2  0.2   -       1.0
- Query: “Programming”
- Expanded query: {(“Programming”,1.0), (“C++”,0.8), (“Java”,0.8), (“Algorithm”,0.6)}
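The expansion above can be reproduced with a small sketch that takes the upper approximation of the query under the term relation R. Two assumptions are made here: empty cells of the table are treated as degree 0, and the minimum is used as t-norm. The helper names `related` and `expand` are illustrative.

```python
TERMS = ["Programming", "Hardware", "C++", "Java", "Laptop", "Algorithm"]

# Non-empty off-diagonal cells of the (symmetric) relation R from the table.
# Assumption: cells left empty in the table count as 0.
R = {
    ("Programming", "C++"): 0.8,
    ("Programming", "Java"): 0.8,
    ("Programming", "Algorithm"): 0.6,
    ("Hardware", "Laptop"): 0.4,
    ("C++", "Java"): 0.7,
    ("C++", "Algorithm"): 0.2,
    ("Java", "Algorithm"): 0.2,
}


def related(x, y):
    """Degree to which terms x and y are related (1.0 on the diagonal)."""
    if x == y:
        return 1.0
    return R.get((x, y), R.get((y, x), 0.0))


def expand(query):
    """Upper approximation of a weighted query {term: weight}, with T = min.

    Terms that end up with degree 0 are left out of the expanded query.
    """
    expanded = {}
    for y in TERMS:
        degree = max((min(related(x, y), w) for x, w in query.items()), default=0.0)
        if degree > 0:
            expanded[y] = degree
    return expanded


print(expand({"Programming": 1.0}))
# {'Programming': 1.0, 'C++': 0.8, 'Java': 0.8, 'Algorithm': 0.6}
```

"Hardware" and "Laptop" stay out of the expanded query because the table relates neither of them to "Programming".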
PAS-project
What is the PAS-project?
- Personal Alert System (HoGent)
- Goal: to draw the researcher’s attention to funding possibilities that match his/her profile
- Information: about researchers, projects, funding possibilities (grants etc.) → matching/collaboration
- Automation and intelligence
PAS – How does it work?
-Name
-Staff number
-Department(s)
-Group
-Date of creation of the profile
-Last update of the profile
-Percentage research time
-Skills description
-Diplomas
-Publications
-IWETO-keywords
-Free keywords
[Figure: User fills in the profile; labelled sources: IWETO-taxonomy, Thesaurus 1]
PAS – How does it work?
-Reference
-Title
-Content
-Attachment(s)
-Level
-Duration
-Institution
-Deadline
-Address
-Contact person
-IWETO-keywords
-Free keywords
[Figure labels: IWETO-taxonomy, Messages]
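The profile and message fields listed on these two slides could be captured in two record types, sketched below. The class names, attribute names, types and sample values are all assumptions made for illustration; the slides only give the field labels.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Set


@dataclass
class Profile:
    """Researcher profile, one attribute per field on the profile slide."""
    name: str
    staff_number: str
    departments: List[str]
    group: str
    created: date
    last_updated: date
    research_time_pct: float
    skills_description: str = ""
    diplomas: List[str] = field(default_factory=list)
    publications: List[str] = field(default_factory=list)
    iweto_keywords: List[str] = field(default_factory=list)
    free_keywords: List[str] = field(default_factory=list)

    def keywords(self) -> Set[str]:
        """All keywords of the profile, IWETO and free combined."""
        return set(self.iweto_keywords) | set(self.free_keywords)


@dataclass
class Message:
    """Funding message, one attribute per field on the message slide."""
    reference: str
    title: str
    content: str
    level: str
    duration: str
    institution: str
    deadline: date
    address: str
    contact_person: str
    attachments: List[str] = field(default_factory=list)
    iweto_keywords: List[str] = field(default_factory=list)
    free_keywords: List[str] = field(default_factory=list)

    def keywords(self) -> Set[str]:
        """All keywords of the message, IWETO and free combined."""
        return set(self.iweto_keywords) | set(self.free_keywords)


# Placeholder values, for illustration only.
profile = Profile(
    name="Example Researcher", staff_number="0000",
    departments=["Example department"], group="Example group",
    created=date(2007, 11, 29), last_updated=date(2007, 11, 29),
    research_time_pct=50.0, iweto_keywords=["k1"], free_keywords=["k2"],
)
print(sorted(profile.keywords()))
```

Both records carry the same two keyword fields, which is what makes matching a message against a profile possible.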
PAS – How does it work?
The IWETO-classification has 641 research fields:
5 at the 1st level, 31 at the 2nd level, 605 at the 3rd level
[Figure: classification tree with levels 1, 2 and 3]
PAS – How does it work?
By adding “free keywords” we can refine the classification
[Figure: classification levels 1 to 3 extended with free keywords, with weights 0.6, 0.7 and 0.8]
PAS – How does it work?
Query: A = {k3}
Expanded query: R↑A = {(k1,0.8), (k3,1.0), …}
M1 → R2
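One way to read this slide is that message M1, annotated with k3, reaches researcher R2 because the expanded query also covers k1. The sketch below follows that reading only; the profile contents, the scoring rule (highest expanded weight among shared keywords) and the threshold are hypothetical and are not taken from the slides.

```python
# Expanded query for message M1; further entries are elided on the slide
# and are therefore left out here.
expanded_m1 = {"k1": 0.8, "k3": 1.0}

# Hypothetical profile keywords for two researchers.
profiles = {
    "R1": {"k7"},
    "R2": {"k1"},
}


def match_score(expanded_query, profile_keywords):
    """Highest expanded-query weight among the keywords the profile contains."""
    return max(
        (w for k, w in expanded_query.items() if k in profile_keywords),
        default=0.0,
    )


def recipients(expanded_query, profiles, threshold=0.5):
    """Researchers whose match score reaches the (hypothetical) threshold."""
    return [
        researcher
        for researcher, keywords in profiles.items()
        if match_score(expanded_query, keywords) >= threshold
    ]


print(recipients(expanded_m1, profiles))  # ['R2']
```

Without expansion the query {k3} would match neither profile, so R2 would not have been alerted.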
PAS – How does it work?
[Figure: weights 0.6, 0.7, 0.8 and 0.7]
PAS – Current implementation
- Prototype developed as a master’s thesis at the Hogeschool Gent
- Basic algorithm using weights and their products, and basic fuzzy rough query expansion [1]
- Basic profiles and messages
- Manual processing of feedback
- Skeleton for the final system

[1] P. Srinivasan, M. E. Ruiz, D. H. Kraft, J. Chen: Vocabulary mining for information retrieval: rough sets and fuzzy sets. Information Processing and Management 37(1) (2001) 15-38.
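The slides do not spell out how the prototype combines weights. One plausible reading of "weights and their products" is that the relatedness of a keyword to one of its ancestors in the taxonomy is the product of the link weights along the path between them. The sketch below illustrates that reading only; the tree, the node names and the weights are hypothetical.

```python
import math

# Hypothetical weighted taxonomy: child -> (parent, weight of the link).
PARENT = {
    "free keyword": ("level 3 field", 0.8),
    "level 3 field": ("level 2 field", 0.7),
    "level 2 field": ("level 1 field", 0.6),
}


def weight_to_ancestor(node, ancestor):
    """Product of the link weights on the path from `node` up to `ancestor`.

    Returns 0.0 when `ancestor` does not lie above `node` in the tree.
    """
    weight = 1.0
    while node != ancestor:
        if node not in PARENT:
            return 0.0
        node, link_weight = PARENT[node]
        weight *= link_weight
    return weight


# Two links are crossed here, so the result is 0.8 * 0.7.
w = weight_to_ancestor("free keyword", "level 2 field")
print(math.isclose(w, 0.8 * 0.7))  # True
```

Under this reading, keywords that are further apart in the taxonomy receive lower degrees, since every extra link multiplies in a weight below 1.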
PAS – Future work
- Richer representation of profiles and messages
- Automation of the feedback mechanism
- Dealing with imprecision and words from different thesauri
- Dealing with ambiguity and incomplete profiles
- Tracking research activities for collaboration
- Automatic extraction of information from text files
Thank you