Online Dictionary and Thesaurus
Presented by: M. Khalid Akhtar
Overview
• World Wide Web: How efficient?
• Objective
• Perl: In brief
• Apache: Advantages
• MySQL: Advantages
• Database Schema
• Some common Perl syntax used in the program
• The Approach
• Search Form
• Subroutines
• Dictionary/Thesaurus Result page
• References
World Wide Web: How Efficient?
• Time consuming!
• Difficult to find information!
• Stale or missing links and web pages!
• Irritating advertisements/banners
• Sometimes special software is required to view information!
Objective:
Create an Online Dictionary and Thesaurus
Using:
• HTML
• Perl & CGI Scripts
• MySQL Database
• Apache Web Server
Perl: In brief
• Perl has advanced language features
• Highly portable
• Higher level constructs than UNIX shell scripts
• More functionality than awk/sed
• Strong support for pattern matching
• Strong CGI support
• Strong HTML parsing capability
• Strong support for writing Web agents
Apache: Advantages
The main advantage of using the Apache web server is speed. Using the mod_perl module, Apache can serve CGI requests on average 2-3 times faster than web servers that do not have mod_perl technology. It takes advantage of a powerful API.
It's FREE!
The Apache web server is compiled with processor-specific optimization to take advantage of the power of the new processor generation, giving it 5-30% more performance than any other web server.
Apache provides a robust, commercial-grade reference implementation of the HTTP protocol.
MySQL: Advantages
MySQL is extremely good:
• For logging.
• When you open many connections; it connects very fast.
• When SELECT and INSERT are used at the same time.
• When updates are not combined with selects that take a long time.
• When most selects/updates use unique keys.
• When many tables are used without long conflicting locks.
• When you have big tables (MySQL uses a very compact table format).
Database Schema
Legend: # - Foreign Key; Red Fields - Primary Key
• WEBSITES: SITEID #, SITENAME
• WORDMEANING: WORDID #, MEANING1, MEANING2, MEANING3, ..., MEANING5, SITEID #
• WORDS: WORDID #, WORD
• SITEREQ: SITEID #, WORDSREQUESTED, WORDSRETURNED
• WORDCONCEPTS: WORDID #, CONCEPT, SITEID #, CONCEPTDETAIL1, CONCEPTDETAIL2, ..., CONCEPTDETAIL9
Some common Perl syntax used in the program...
To connect to MySQL:
$dbh = DBI->connect("DBI:mysql:$dbname", $user)
    or die "Can't connect: " . $DBI::errstr;
To store values in an ARRAY:
push @siteids, $siteId;
To process ARRAY elements:
foreach $siteId (@siteids) {
    # ...
}
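The two array idioms above can be tried in isolation. The following is a minimal sketch under `use strict`, with made-up site IDs standing in for values that the real program would read from the database. It also shows `$#array`, the last index of an array, which is -1 for an empty array; that is the property the `$#meanings < 0` test relies on.

```perl
use strict;
use warnings;

# Build the list one element at a time with push.
my @siteids;
for my $siteId (3, 1, 2) {          # made-up site IDs, for illustration only
    push @siteids, $siteId;
}

# Walk the array; "my" scopes the loop variable to the loop body.
my @lines;
foreach my $siteId (@siteids) {
    push @lines, "site $siteId";
}

print "$_\n" for @lines;
print scalar(@siteids), " sites, last index $#siteids\n";
```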
The Approach
Perl modules:
• seachword.pl
• dbInterface.pl
• seachword.pl is the driving program.
• Once invoked, it queries the database for the submitted word's dictionary meaning or thesaurus concepts, using routines from dbInterface.pl.
The Approach contd...
• Spawns user agents to fetch meanings/concepts from web sites.
• Inserts meanings/concepts in the database.
• Updates frequency stats.
• Displays the result in the browser.
• Prints the search form in the browser to allow new searches.
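The steps above amount to a look-up with a local cache in front of remote sources. The sketch below illustrates that control flow only and is not the program's actual code: the hashes `%word_meanings` and `%site_stats` are hypothetical in-memory stand-ins for the MySQL tables, `fetch_from_site` is a hypothetical stub in place of a real user agent, and the `requested`/`returned` counters are an assumed reading of the WORDSREQUESTED and WORDSRETURNED columns.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the database tables.
my %word_meanings;    # word    => [ meanings ]
my %site_stats;       # site id => { requested => n, returned => n }

# Hypothetical stub for a user agent that would fetch and parse a web site.
sub fetch_from_site {
    my ($site_id, $word) = @_;
    my %remote = (1 => { perl => ['a programming language'] });
    return @{ $remote{$site_id}{$word} || [] };
}

sub lookup_word {
    my ($word, @site_ids) = @_;
    return () unless defined $word && length $word;

    # 1. Check the local store first.
    return @{ $word_meanings{$word} } if exists $word_meanings{$word};

    # 2. Otherwise ask each site in turn and record the attempt.
    for my $site_id (@site_ids) {
        my @found = fetch_from_site($site_id, $word);
        $site_stats{$site_id}{requested}++;
        next unless @found;

        # 3. Store the result locally and update the frequency stats.
        $site_stats{$site_id}{returned}++;
        $word_meanings{$word} = [@found];
        return @found;
    }
    return ();    # no site knew the word
}

print join('; ', lookup_word('perl', 2, 1)), "\n";
```

A second call for the same word is answered from `%word_meanings` without touching any site, which is the point of checking the local database first.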
Search Form
Subroutines:
Check the meaning first in the local database using:
dbGetWordMeaning($dbh, $word)
dbGetWordId($dbh, $word)
dbGetSiteIdByFreq($dbh, $word)

if (length($word) != 0) {
    # First fetch the meaning from the database
    @meanings = &dbGetWordMeaning($dbh, $word);
    if ($#meanings < 0) {
        # Get the meaning from dictionary.com
        if (&parse_dictionary_com($word)) {
            # Fetch the meaning from the database
            @meanings = &dbGetWordMeaning($dbh, $word);
        }
    }
}

Insert fetched meanings and update site frequency stats using:
dbInsertWordMeaning($dbh, $siteId, $word, @meanings);
dbUpdateSiteFreq($dbh, $siteId, 1);
dbUpdateSiteFreq($dbh, $siteId, 0);
The result page: Dictionary
The result page: Thesaurus
References
1. Advanced Perl Programming by Sriram Srinivasan
2. CGI Programming with Perl by Scott Guelich, Shishir Gundavaram and Gunther Birznieks
3. The Complete HTML Reference by Thomas A. Powell
4. Perl in a Nutshell by Stephen Spainhour, Nathan Patwardhan
5. Apache Web-Server by Lars Eilebrecht
6. Apache Server Unleashed by Richard Bowen, Ken A. L. Coar, Rich Bowen, Patrik Grip-Jansson, Matthew Marlowe, Mohan Chinnapan
7. http://www.apache.org
8. http://addy.com/dc/html/writing_cgi_scripts.html
9. http://www.cs.tcd.ie/research_groups/aig/iag/toplevel2.html
10. http://www.mysql.com
Questions!