Advance copy – on going updating to reflect J-ISIS latest release
J-ISIS Quick Tutorial
The latest J-ISIS Distribution zip file can be downloaded from: http://kenai.com/projects/j-isis/downloads
JCD
07/11/2011
This is a quick J-ISIS tutorial with many screen shots. CDS/ISIS for Windows Reference Manual (Version 1.5) has been used for developing J-ISIS and remains a reference for the Print Formatting Language.
1. Installation
  1.1 New Installation
  1.2 Updating
  1.3 UNIX/Linux Installation
2. J-ISIS theoretical limits
3. Starting J-ISIS
4. J-ISIS Browser Component
5. J-ISIS Presentation – Client/Server Application
6. When you start J-ISIS, the first thing to do is to open a connection to the Database Server
7. Opening Databases for Further Processing
8. Visualizing Database Content
  8.1 Data Viewer
  8.2 DB Browser
  8.3 Dictionary Browsing
9. Searching
  9.1 Guided Search
  9.2 Expert Search
    9.2.1 Terms
    9.2.2 Fields
    9.2.3 Term Modifiers
    9.2.3 Boolean Operators
    9.2.4 Grouping
      Field Grouping
    9.2.5 Escaping Special Characters
    Expert Search Example
WARNING ON A GENERAL EDITING ISSUE
10. Importing
  10.1 Importing ISO 2709 files
  10.2 Importing MARC files
11. Exporting
12. PFT Manager
  12.1 Presentation
  12.2 Re-Using Plain Old WinISIS PFTs
    Problems you may be faced when using old PFTs
13. J-ISIS Print Formatting Language
  13.1 Differences with WinISIS
  13.2 ISIS Formatting Language – J-ISIS implementation
  13.3 Including an external format
  13.4 Format exits (Call from the PFT to external functions)
  13.5 Some features have not been implemented because they can be replaced by XHTML, CSS or JavaScript
14. J-ISIS Groovy Console
15. Groovy Programming Language
  15.1 Classes & Scripts
  15.2 Groovy Tutorial
16. Using Groovy to write Format exits (Call from the PFT to external functions)
17. Database Creation
18. Data Entry
  Subfielded fields
  Repeatable fields
  CREATE (CRUD)
19. New Advanced Worksheet Editor
20. New Advanced Data Entry
21. Sorting and Printing
  21.1 Quickly Printing All/Or a Selected Range of Records
  21.2 Sorting the Records before Printing
22. Multilingual UNICODE Databases
  22.1 Windows
  22.2 Full fonts:
  22.3 Configuring a J-ISIS database to use a special font.
23. Client Z39.50
Annex 1
Installing JDK 1.6
Annex 2
The Field Select Table (FST)
  A. FST parameters
    1. Data extraction format
    2. Indexing Techniques
    3. Field identifier
Annex 3
How to use jisis core library in Groovy scripts or other Web Applications
  1 j-isis Core Library Application Programming Interface (API)
  2 Code Snippets:
    Establishing a connection, opening a database and Browsing the database record by record:
    Exploring the record data
    Processing Specific Fields – Example 1
    Processing Specific Fields – Example 2
  3 The API
  4 Writing a Groovy Application to produce a pdf catalogue
1. Installation
1.1 New Installation
You should first install the JDK 1.6 if it is not yet installed. Please see Annex 1 for downloading and installing the JDK 1.6.
To install J-ISIS from the Kenai J-ISIS Distribution zip file, first create a directory (testjisis for example) on your hard disk and unzip the files into this directory. Once installed, J-ISIS will consist of the following directory layout:
Take note of the directory where J-ISIS is installed. [It will now be referred to as: $JISIS_HOME]
In this example, it is “C:\testjisis15”
1.2 Updating
If you have already installed a previous version of J-ISIS, the best strategy is as follows:
a) Install the new release in a new folder and NOT ON TOP OF THE PREVIOUS ONE
b) Copy your working databases from the \jisis_suite\home_example_db directory of the previous J-ISIS installation to the new one
c) Make a backup of the \jisis_suite\home_example_db directory of the previous J-ISIS installation, just in case
d) Delete the previous J-ISIS installation.
Your working databases are under the \jisis_suite\home_example_db directory of the previous J-ISIS installation, and it may be necessary to save them. There is one parent directory per database. Thus, if you have a database named Ernesto_DB, you will have a directory called Ernesto_DB under the \jisis_suite\home_example_db directory.
jisis_suite-------+------------bin
                  |
                  +------------conf
                  |
                  +------------etc
                  |
                  +------------home_example_db--------+-------------ASFAEX
                                                      |
                                                      +-------------Ernesto_DB
1.3 UNIX/Linux Installation
The zip distribution contains both the Windows launcher and the shell script to be used under Unix/Linux:
** J-ISIS is started from an executable launcher under Windows (jisis_suite.exe)
** J-ISIS is started from a shell script under Unix (jisis_suite)
Both files are located in the jisis_suite/bin directory.
On Linux, at least on Ubuntu, make sure you have done something like this:
export MOZILLA_FIVE_HOME=/usr/lib/mozilla
export LD_LIBRARY_PATH=$MOZILLA_FIVE_HOME
You should also set the DEF_HOME variable in the “dbhome.conf” file as follows:
DEF_HOME=../home_example_db (Please note the leading “..”)
2. J-ISIS theoretical limits
A single database managed by Berkeley DB can be up to 2^48 bytes, or 256 terabytes.
Berkeley DB scales in terms of the amount of data it manages, the capabilities of the devices on which it runs, and the distance over which applications distribute data. The largest installations may reach petabytes.
Size of a field: In theory, 2^31 - 1 = 2147483647 bytes (~2 gigabytes). In practice, limited by the heap size and the available virtual memory.
Size of a record: up to two gigabytes. Frequently used data is cached in memory.
Number of occurrences of a field: In theory, 2^31 - 1 = 2147483647 (about 2.1 billion). In practice, limited by the available virtual memory.
Number of fields of a record: In theory, 2^31 - 1 = 2147483647 (about 2.1 billion). In practice, limited by the available virtual memory.
Number of lines of a FST: In theory, 2^31 - 1 = 2147483647 (about 2.1 billion). In practice, limited by the available virtual memory.
Number of records in a database: 2,147,483,647, the largest number a four-byte signed integer can hold.
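The figures quoted in this section follow from simple powers of two. The short Python check below is only an illustration of that arithmetic (the variable names are ours, not part of J-ISIS):

```python
# Sanity-check the numeric limits quoted in this section.

TIB = 2 ** 40  # bytes in one binary terabyte

berkeley_db_max_bytes = 2 ** 48   # maximum size of a single database
signed_32_bit_max = 2 ** 31 - 1   # largest four-byte signed integer

print(berkeley_db_max_bytes // TIB)  # database limit expressed in terabytes
print(signed_32_bit_max)             # limit on record count and field size
```

The first value printed is 256 and the second is 2147483647, matching the limits listed above.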
3. Starting J-ISIS
Double click on “jisis_suite.exe”, which is located in the “$JISIS_HOME/jisis_suite/bin” directory, to start J-ISIS.
You can also create a shortcut to “jisis_suite.exe” and drag it to the desktop.
4. J-ISIS Browser Component
J-ISIS uses the DJNativeSwing library (http://djproject.sourceforge.net/ns/) to integrate the Web browser installed on the machine and to display the print-formatted records, which are formatted in HTML.
Thanks to Geertjan Wielenga from Sun (now Oracle), who wrote a step-by-step tutorial (http://netbeansdzone.com/how-to-nb-djnative-swing).
Furthermore, this new J-ISIS release works on the following 6 platforms:
Windows32,
Windows64,
Linux32,
Linux64,
MacOSX32,
MacOSX64
5. J-ISIS Presentation – Client/Server Application
J-ISIS is a Client/Server application that works both as a database server and as a client. When you start J-ISIS, you in fact start a J-ISIS database server listening on port 1111 by default.
As a client, you can connect either to the database server on the local machine (localhost) or to another machine, which must have J-ISIS running and will then play the role of the J-ISIS database server. In that case, provide the IP address of that machine as “Host Name”, for example “192.168.0.13”. You can get this address by typing “ipconfig” in a command window on the server machine.
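Before opening a connection to a remote server, it can help to check that the server port is reachable from the client machine. The sketch below is a generic TCP check written in Python, not a J-ISIS tool; the function name can_connect is hypothetical, and the default port 1111 is the one mentioned above.

```python
import socket


def can_connect(host, port=1111, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

For example, can_connect("192.168.0.13") should return True only if something is listening on port 1111 of that machine. A successful check shows that the port is open, not that the listener is actually J-ISIS.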
6. When you start J-ISIS, the first thing to do is to open a connection to the Database Server
Clicking on “Open Connection” will open the dialog below. By default, the database server is the “localhost” machine, which means that the database server and the desktop client application are on the same machine.
You can connect to another machine running J-ISIS by providing the IP address of that machine.
The databases and all related files are stored on the server machine.
Enter “admin” and “admin” as “User” and “Password” respectively.
User administration is not yet implemented, so you should keep the default “admin”/“admin” values.
7. Opening Databases for Further Processing
Clicking on “Open Database…” will open the dialog below.
This dialog displays the list of databases defined under the root directory
“C:\testjisis15\jisis_suite\home_example_db”
You may define several root directories for the databases. Each root directory is defined by a “DEF_HOME” variable stored in the “$JISIS_HOME/jisis_suite/conf/dbhome.conf” file ($JISIS_HOME being the J-ISIS install directory).
Every database is stored in a sub-directory of a root directory.
$JISIS_HOME/jisis_suite/conf/dbhome.conf
Content:
DEF_HOME=./home_example_db
#DEF_HOME2=D:\MyDb
The database root directory is defined as:
$JISIS_HOME/jisis_suite/home_example_db
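To make the structure of “dbhome.conf” concrete, here is a minimal Python sketch that lists the active root directories in such a file. It assumes, based only on the sample content above, that each entry is a NAME=value line whose name starts with DEF_HOME and that a leading “#” disables a line; the function parse_db_homes is hypothetical, and the parser used by J-ISIS itself may behave differently.

```python
def parse_db_homes(conf_text):
    """Return {variable_name: path} for every active DEF_HOME* line."""
    homes = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out entries
        name, separator, value = line.partition("=")
        if separator and name.strip().startswith("DEF_HOME"):
            homes[name.strip()] = value.strip()
    return homes


# The sample content shown above, with its second entry commented out
sample = "DEF_HOME=./home_example_db\n#DEF_HOME2=D:\\MyDb\n"
print(parse_db_homes(sample))
```

With the sample content, only DEF_HOME is reported; removing the “#” in front of DEF_HOME2 would add a second root directory.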
Thirteen J-ISIS databases are provided with this distribution (ASFAEX, AUTOR, etc.).
Select the “ASFAEX” database, for example, and click on the “Finish” button.
You can see the databases that are opened by putting the mouse cursor on the “Databases Pool” tab on the
left.
You can also see the connections that are established by putting the mouse cursor on the “Connections
Pool” tab on the left.
When you open a database, J-ISIS checks several properties of the database, such as the FST and the indexes. Thus it may take a couple of seconds for big databases (more than 50 000 records) before control returns to the user.
Messages are also displayed in the Output window, for example about the FST and the validity of the print formats used in the FST. The Output window is quite useful for debugging and for informing the user.
8. Visualizing Database Content
8.1 Data Viewer
J-ISIS records are displayed according to print format definitions written in the ISIS formatting language.
The J-ISIS Data Viewer uses a web browser component that is XHTML, JavaScript and CSS compliant.
The “RAW” format is always available and is used by default.
You can change the format, to “egbert” for example:
If you have Internet access, you can click on the links:
The print formats can be edited with the “Pft Manager”, available from the “Tools” menu.
8.2 DB Browser
The Database Browser provides a tabular view of the database, with records along the rows and fields along the columns.
o The first column contains the Master File Number (MFN) and is frozen horizontally.
o Cells can be enlarged by dragging the vertical separator line.
o The cell height allows 3 lines to be displayed.
o Clicking on a cell whose text is longer than what can be displayed provides a vertical scroll bar to view the whole cell.
o Columns (except the MFN) can be rearranged by drag and drop.
Further development and Improvements under consideration:
o Selection of Fields/column to display
o Filtering
o Searching with highlighting
o Sorting
o Changing the display font
o Changing the display format (1 record per cell with RAW format)
o Printing/ Print Preview
Text in Arabic can be switched between right-to-left and left-to-right alignment by pressing Ctrl/Asterisk, using the asterisk key on the numeric pad.
8.3 Dictionary Browsing
This option displays the indexed terms and lets you browse them. J-ISIS displays information about the index and the indexed terms in a table with 4 columns that contain, respectively, the term index, field tag, term value and frequency.
Terms are sorted by field tag and term value.
A selected term value can be copied by clicking the right mouse button and choosing Copy (Ctrl/C). It can then be pasted in the Search Form.
The “Quick Search” fields provide a way of quickly finding a term by typing its first characters in the Query field:
Typing “AU=”, for example, will display only the terms that begin with “AU=”.
You can refine the search by typing more characters: adding the letter “a” will display only the terms that begin with “AU=a”.
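The effect of the Quick Search filter can be pictured as a prefix lookup in a sorted list of terms. The Python sketch below illustrates the idea only and is not the J-ISIS implementation; the function terms_with_prefix and the sample terms are hypothetical, and the comparison is assumed to be case-sensitive.

```python
from bisect import bisect_left


def terms_with_prefix(sorted_terms, prefix):
    """Return the terms of a sorted list that start with the given prefix."""
    # In a sorted list, all terms sharing a prefix are adjacent, so a binary
    # search finds the first candidate and a short scan collects the rest.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


# Hypothetical dictionary terms, for illustration only
terms = sorted(["AU=adams", "AU=brown", "PY=1969", "TI=tides"])
print(terms_with_prefix(terms, "AU="))
print(terms_with_prefix(terms, "AU=a"))
```

Each extra character typed narrows the prefix and therefore the list of matching terms.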
9. Searching
J-ISIS provides two searching methods: Guided Search, and Expert Search, which gives access to all Lucene capabilities.
Guided Search is the simpler of the two and only supports Boolean combinations of terms. The structure of the search is constrained by the user interface, making it difficult to enter incorrect queries. Expert Search permits a wider range of searching functions, including proximity searching and searching for repeatable fields.
Fields that are indexed for searching are specified in the Field Select Table (FST). Fields that are not indexed cannot be searched unless a free text search is used, which scans the entire contents of the records.
9.1 Guided Search
Guided Search is selected by default. The new Guided Search module uses autocomplete user interface features that provide users with suggested queries or results as they type their query in the search box. This is also commonly called autosuggest or incremental search. The J-ISIS autocomplete implementation is very fast: even on large indices, suggestions are returned within a few milliseconds, so that users see results pop up as they type.
Typing “T” or “t” on the ASFAEX database pops up the following results:
“700” is the field tag, and the number between square brackets is the number of occurrences of the term.
To select a term from the pop-up list, just click on it. For example, if we click on “tables”, the term is completed in the query field.
Now, if we click on the “Search” button, we get:
We can connect several terms by “AND” or “OR” using the “Match all of the following” (AND) and
“Match any of the following” (OR) radio buttons.
For example, we select the following term from the dictionary:
We also enable the “Match any of the following” radio button, and we click on the “+” button to add a new term field.
Then we enter “PY=”:
We select “PY=1969” from the popup list:
The query is therefore: “TI=The tides in James Bay” OR “PY=1969”
Then we click on the “Search” button:
And we get the following results:
The format can be changed:
Another Example:
9.2 Expert Search
The “Guided Search” checkbox is checked by default and must be unchecked to use Expert Search. Queries use the Lucene query syntax, with the field tags as field names:
http://lucene.apache.org/java/2_1_0/queryparsersyntax.pdf
9.2.1 Terms
A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.
A Single Term is a single word such as "test" or "hello".
A Phrase is a group of words surrounded by double quotes such as "hello dolly".
Multiple terms can be combined together with Boolean operators to form a more complex query (see
below).
9.2.2 Fields
Lucene supports fielded data. When performing a search you can either specify a field, or use the default
field. In J-ISIS, the field names are the field tags.
You can search any field by typing the field tag followed by a colon ":" and then the term you are looking
for.
As an example, let's assume a Lucene index contains two fields, title (tag 10) and text (tag 20), and text is the default field. If you want to find the document entitled "The Right Way" which contains the text "don't go this way", you can enter:
10:"The Right Way" AND 20:go
or
10:"The Right Way" AND go
Since tag 20 (text) is the default field, the field indicator is not required.
Note: The field tag is only valid for the term that it directly precedes, so the query:
10:Do it right
will only find "Do" in the title field. It will find "it" and "right" in the default field (in this case the text field (20)).
9.2.3 Term Modifiers
Lucene supports modifying query terms to provide a wide range of searching options.
Wildcard Searches
Lucene supports single and multiple character wildcard searches:
o To perform a single character wildcard search use the "?" symbol.
o To perform a multiple character wildcard search use the "*" symbol.
The single character wildcard search looks for terms that match the pattern with the "?" replaced by any single character. For example, to search for "text" or "test" you can use the search:
te?t
Multiple character wildcard searches look for 0 or more characters. For example, to search for test, tests or tester, you can use the search:
test*
You can also use the wildcard searches in the middle of a term.
te*t
Note: You cannot use a * or ? symbol as the first character of a search.
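The behaviour of "?" and "*" can be tried out with Python's standard fnmatch module, which uses the same two symbols with the same meaning. This is only an approximation for experimenting with patterns: it does not go through Lucene, and fnmatch also gives a special meaning to square brackets, which Lucene wildcards do not. The helper matching_terms and the word list are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_terms(terms, pattern):
    """Return the terms that match a pattern using ? and * wildcards."""
    return [term for term in terms if fnmatchcase(term, pattern)]


# Hypothetical index terms, for illustration only
terms = ["test", "tests", "tester", "text", "tent", "toast"]

print(matching_terms(terms, "te?t"))   # exactly one character between "te" and "t"
print(matching_terms(terms, "test*"))  # "test" followed by zero or more characters
print(matching_terms(terms, "te*t"))   # wildcard in the middle of a term
```

Here "te?t" selects test, text and tent, while "test*" selects test, tests and tester.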
Fuzzy Searches
Lucene supports fuzzy searches based on the Levenshtein Distance, or Edit Distance algorithm. To do a
fuzzy search use the tilde, "~", symbol at the end of a Single word Term. For example to search for a term
similar in spelling to "roam" use the fuzzy search:
roam~
This search will find terms like foam and roams.
An additional (optional) parameter can specify the required similarity. The value is between 0 and 1; with a value closer to 1, only terms with a higher similarity will be matched. For example:
roam~0.8
The default that is used if the parameter is not given is 0.5.
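To give an idea of what the Levenshtein (edit) distance measures, here is a small, self-contained Python implementation of the classic dynamic-programming algorithm. It is an illustration only: it is not Lucene's code, and it does not reproduce how Lucene converts a distance into the similarity value between 0 and 1.

```python
def edit_distance(a, b):
    """Number of single-character insertions, deletions or substitutions
    needed to turn string a into string b (Levenshtein distance)."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("roam", "foam"))   # one substitution
print(edit_distance("roam", "roams"))  # one insertion
```

Both "foam" and "roams" are at distance 1 from "roam", which is why a fuzzy search for roam~ can find them.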
Proximity Searches
Lucene supports finding words that are within a specific distance of each other. To do a proximity search use the tilde, "~", symbol at the end of a Phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document use the search:
"jakarta apache"~10
Range Searches
Range Queries allow one to match documents whose field(s) values are between the lower and upper bound
specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds.
Sorting is done lexicographically.
mod_date:[20020101 TO 20030101]
This will find documents whose mod_date fields have values between 20020101 and 20030101, inclusive.
Note that Range Queries are not reserved for date fields. You could also use range queries with non-date
fields:
title:{Aida TO Carmen}
This will find all documents whose titles are between Aida and Carmen, but not including Aida and Carmen.
Inclusive range queries are denoted by square brackets. Exclusive range queries are denoted by curly
brackets.
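Because range bounds are compared lexicographically, that is character by character as text, numeric values only behave as expected when they all have the same number of digits. The Python sketch below mimics this comparison with ordinary string comparison; in_range is a hypothetical helper, not a Lucene or J-ISIS function.

```python
def in_range(value, lower, upper, inclusive=True):
    """Lexicographic range test, comparing the values as plain strings."""
    if inclusive:
        return lower <= value <= upper   # like [lower TO upper]
    return lower < value < upper         # like {lower TO upper}


print(in_range("20020615", "20020101", "20030101"))       # date inside the range
print(in_range("Carmen", "Aida", "Carmen", inclusive=False))  # upper bound excluded
print(in_range("100", "10", "20"))                         # True: text order, not numeric order
```

The last line shows the pitfall: as text, "100" sorts between "10" and "20". Storing numbers with a fixed width (for example 010, 020, 100) avoids this.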
9.2.3 Boolean Operators
Boolean operators allow terms to be combined through logic operators. Lucene supports
AND, "+", OR, NOT and "-" as Boolean operators.
Note: Boolean operators must be written in upper case.
OR
The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.
To search for documents that contain either "jakarta apache" or just "jakarta" use the query:
"jakarta apache" jakarta
or
"jakarta apache" OR jakarta
AND
The AND operator matches documents where both terms exist anywhere in the text of a single document.
This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.
To search for documents that contain "jakarta apache" and "Apache Lucene" use the query:
"jakarta apache" AND "Apache Lucene"
+ (required operator)
The "+" or required operator requires that the term after the "+" symbol exists somewhere in a field of a single document.
To search for documents that must contain "jakarta" and may contain "lucene" use the query:
+jakarta lucene
NOT
The NOT operator excludes documents that contain the term after NOT. This is equivalent to
a difference using sets. The symbol ! can be used in place of the word NOT.
To search for documents that contain "jakarta apache" but not "Apache Lucene" use the query:
"jakarta apache" NOT "Apache Lucene"
Note: The NOT operator cannot be used with just one term. For example, the following search will return no
results:
NOT "jakarta apache"
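The set analogy used for OR, AND and NOT can be made concrete with Python sets. In this sketch each set holds the MFNs of the records that contain a term; the MFN values are invented for illustration and do not come from any J-ISIS database.

```python
# Hypothetical MFNs of the records containing each term
jakarta = {1, 2, 3, 5}
lucene = {2, 3, 8}

either = jakarta | lucene        # jakarta OR lucene  -> union
both = jakarta & lucene          # jakarta AND lucene -> intersection
only_jakarta = jakarta - lucene  # jakarta NOT lucene -> difference

print(sorted(either))
print(sorted(both))
print(sorted(only_jakarta))
```

The difference needs a left-hand set to subtract from, which mirrors the note above: a NOT clause on its own has nothing to exclude records from.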
- (prohibit operator)
The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.
To search for documents that contain "jakarta apache" but not "Apache Lucene" use the query:
"jakarta apache" -"Apache Lucene"
9.2.4 Grouping
Lucene supports using parentheses to group clauses to form sub queries. This can be very useful if you want
to control the boolean logic for a query.
To search for either "jakarta" or "apache" and "website" use the query:
(jakarta OR apache) AND website
This eliminates any confusion and makes sure that website must exist and that either term jakarta or apache may exist.
Field Grouping
Lucene supports using parentheses to group multiple clauses to a single field.
To search for a title that contains both the word "return" and the phrase "pink panther" use the query:
10:(+return +"pink panther")
9.2.5 Escaping Special Characters
Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these characters, use the \ before the character. For example, to search for (1+1):2 use the query:
\(1\+1\)\:2
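When query text comes from user input, the escaping can be automated. The Python sketch below puts a backslash in front of each single-character symbol from the list above; the function escape_query_text is hypothetical, and as a simplification the two-character operators && and || are left untouched.

```python
import re

# Single-character symbols from the list above; && and || are not handled here
_SPECIAL_CHARACTERS = re.compile(r'([+\-!(){}\[\]^"~*?:\\])')


def escape_query_text(text):
    """Prefix every special query character in text with a backslash."""
    return _SPECIAL_CHARACTERS.sub(r"\\\1", text)


print(escape_query_text("(1+1):2"))
```

The call prints \(1\+1\)\:2, the same escaped form as in the example above.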
Expert Search Example
The prefixed term “SD=AQUACULTURE DEVELOPMENT” has 3 occurrences in field 960
The prefixed term “SD=AQUACULTURE ECONOMICS” has 1 occurrence in field 960
The prefixed term “SD=AQUACULTURE SYSTEMS” has 3 occurrences in field 960
The prefixed term “SD=AQUACULTURE TECHNIQUES” has 1 occurrence in field 960
Expert Search on all terms:
First, we uncheck the “Guided Search” checkbox and enter the query as follows:
960:"SD=AQUACULTURE DEVELOPMENT" OR 960:"SD=AQUACULTURE ECONOMICS" OR
960:"SD=AQUACULTURE SYSTEMS" OR 960:"SD=AQUACULTURE TECHNIQUES"
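When the list of terms is long, a query like the one above can be assembled programmatically. The Java sketch below shows the idea; ExpertQueryBuilder and anyOf are hypothetical names, and the code assumes that the terms contain no double quotes (otherwise they would have to be escaped first).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of J-ISIS): builds an Expert Search query
// that matches any of the given terms in one field.
class ExpertQueryBuilder {
    static String anyOf(int fieldTag, List<String> terms) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms) {
            // Each clause has the form tag:"term"
            clauses.add(fieldTag + ":\"" + term + "\"");
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        terms.add("SD=AQUACULTURE DEVELOPMENT");
        terms.add("SD=AQUACULTURE ECONOMICS");
        System.out.println(anyOf(960, terms));
    }
}
```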
Wildcard Search: 960:SD=AQUACULTURE*
WARNING ON A GENERAL EDITING ISSUE
In many places in the J-ISIS GUI, data is presented in a table and a cell can be edited by clicking on it.
These include the FDT Editor, the FST Editor and many others.
For example, in the FST manager:
Suppose you start modifying the format by clicking on the cell and typing directly in it ('new text' literal
added) as follows:
The modification is not taken into account (i.e. physically saved) until you press “ENTER” or click on
another cell. Even if the change is displayed, it may not have been saved!
10. Importing
10.1 Importing ISO 2709 files
You should establish a database server connection before importing. The examples below
use the WinISIS cds example database, which has been exported to two ISO 2709 files, cds0 and cds80,
available in “testjisis15\jisis_suite\Test DB\WinISIS cds”. The cds0 file contains one
record per line, whereas in the cds80 file the records are split into lines of 80 characters.
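To illustrate the difference between the two layouts, the Java sketch below unwraps a file whose records are split over several lines. It rests on assumptions that are not taken from J-ISIS: the line breaks carry no data and can simply be dropped, and each record ends with a terminator character supplied by the caller (the ISO 2709 standard value, ASCII 29, is provided as a constant). IsoLineJoiner and records are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of J-ISIS): rebuilds whole records from
// a file whose records were wrapped into fixed-length lines.
class IsoLineJoiner {
    // Standard ISO 2709 record separator: ASCII 29 (hexadecimal 1D).
    static final char RECORD_TERMINATOR = 0x1D;

    static List<String> records(List<String> lines, char terminator) {
        // Assumption: line breaks are pure layout, so lines are concatenated as-is.
        StringBuilder all = new StringBuilder();
        for (String line : lines) {
            all.append(line);
        }
        List<String> result = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < all.length(); i++) {
            if (all.charAt(i) == terminator) {
                result.add(all.substring(start, i));
                start = i + 1;
            }
        }
        return result;   // text after the last terminator is ignored
    }
}
```

Passing the terminator as a parameter keeps the sketch usable for files exported with a non-standard record separator.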
The import of databases in ISO 2709 format has been extensively tested. WinISIS databases in
ISO 2709 format with CP850, CP1256 (Arabic Windows) and UTF-8 encodings have been successfully
imported. Big databases with more than 170 000 (Louvre DB), 370 000 (MARC DB) and 1 800 000
(Index Translationum) records have been successfully imported.
Please note that, for performance reasons, indexing is not performed when importing; it should be done
afterwards through the “Re Index Database” menu item of the “Database” menu.
Step 1: Select External File
Select the appropriate format, encoding and external file. Please note that for the CDS WinISIS
database we use code page 850 (CP850), which was used in Western Europe under DOS.
The default encoding is ISO-8859-1 (Latin-1), which is close to the encoding used by Western European
versions of Windows, so the encoding must be changed to CP850.
Then click on “Next”.
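Why the encoding choice in Step 1 matters can be seen with a small Java experiment. The class EncodingDemo is hypothetical and independent of J-ISIS; it only uses the standard java.nio.charset API. It assumes that the running JDK ships the CP850 charset under the name "IBM850", which is true for regular JDK distributions but may not hold for stripped-down runtimes.

```java
import java.nio.charset.Charset;

// Hypothetical demo (not part of J-ISIS): the same bytes give different
// text depending on the encoding used to decode them.
class EncodingDemo {
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // In code page 850, byte 0x82 is the letter e with acute accent (U+00E9).
        byte[] raw = { 'c', 'a', 'f', (byte) 0x82 };
        String asCp850 = decode(raw, "IBM850");       // ends with U+00E9
        String asLatin1 = decode(raw, "ISO-8859-1");  // ends with control character U+0082
        System.out.println(asCp850.equals(asLatin1)); // prints false
    }
}
```

Decoding CP850 data as ISO-8859-1 therefore silently corrupts accented characters, which is why the encoding has to be set before the import starts.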
Step 2: Select the Import Option
The available options are similar to those available in WinISIS
Click on “Next” as we want to create a new Database
Step 3: Database
o Provide the database name,
o Click on “Create a Database from Existing Plain Old FDT and FST”
o Provide the FDT and FST paths:
Step 4: Parameters
Change the default parameters if needed
Click on “Finish”, then you will see the following dialog:
Click on “OK”, check the parameters and click on OK if they are correct
The import then starts and you can follow its status at the bottom right of the window.
When the import is finished, you will get the following dialog:
Click on “OK”; you can now browse the database (“Browse”->“DB Browser”):
DON'T FORGET TO INDEX THE DATABASE!
J-ISIS uses Lucene to index the database records. Terms are generated from the formats provided in the
FST.
The index can be rebuilt at any time for the current DB through the “Re Index Database” menu item of the
“Database” menu.
All WinISIS indexing techniques are implemented.
Wait until the progress indicator disappears and you can see:
You can now check the index by browsing the dictionary:
10.2 Importing MARC files
The difference with an ISO 2709 import is that you can import the record leader information into 30XX fields
by checking the check box in step 4, and that you can re-use MARC FDT and FST templates. Please note that
the information stored in 30XX fields (if any) will be moved back into the record leader when exporting.
There is one small MARC file named “summerland.mrc” and an extract from ABCD called
“marc-ABCD.iso”, both located in “\jisis_suite\Test DB\marc”.
Step 2 is identical. In step 3, you select the MARC template FDT and FST:
And in step 4:
Don't forget to change the input line length to “0” and to
check the “Move Leader Info into 30XX fields” checkbox.
If you look at the database in the “Data Viewer”, you will see:
11. Exporting
The "Export database" menu item allows you to
extract all data of a data base or a portion thereof
normally for transmitting it to other users. You may
also use this command to perform some reformatting
of the records of a data base and then use the import
function to store the reformatted data into the
original or a different data base.
You can export in the following MARC formats: ISO2709, MarcXML and MODS. The structure of all MARC
records is based on an exchange format for bibliographic records as specified in the ANSI/NISO Z39.2 and
ISO 2709:1996 standards.
If the database contains leader fields 3000:3004, these fields will be placed in the record leader when
exporting.
Let's try to export the simple marc database (summerland) that we just imported.
Keeping "ISO2709" selected and clicking on the "Next" button will pop up the dialog below:
Name of output ISO file: Enter the name of the output file. Extension will be ".iso" for ISO2709,
".mrcxml" for MarcXML and ".mod" for MODS.
Output Directory: You can select the output directory where the output file will be stored. The "work"
directory is used by default.
Field separator: This field defines the field separator character to be used in the output file. The standard
field separator defined in ISO 2709 is the ASCII character 30 (hexadecimal 1E).
Record separator: This field defines the record separator character to be used in the output file. The
standard record separator defined in ISO 2709 is the ASCII character 29 (hexadecimal 1D).
Subfield delimiter: This field defines the separator character for subfields to be used in the output file.
J-ISIS usually uses the character ^. However, many bibliographic standards use the character $. The default is “^”. You
may also specify any ASCII character as field separator, by using the following notation:
12. PFT Manager
12.1 Presentation
The PFT manager offers a workspace for creating, editing, testing, converting and
deleting PFTs.
The PFT editor offers syntax highlighting, copy/paste, undo/redo, syntax checking,
PFT generation, etc.
12.2 Re-Using Plain Old WinISIS PFTs
You can copy the old WinISIS PFTs into the /ipft directory of the J-ISIS database.
After copying the PFTs, closing all databases and re-opening the PFT Manager, you will have access to
these PFTs.
Problems you may face when using old PFTs
a) WinISIS formats may be split arbitrarily into lines of 80 characters, as in this example:
Clicking on the “Syntax” button will give an error as shown above. The solution is to rework the format so
that line splitting doesn't occur in the middle of an expression.
b) Strange characters are displayed:
J-ISIS is UNICODE and all data is stored using UNICODE encoding, including data stored on file. The solution is to change the encoding to UNICODE by clicking on the “Convert” button:
Click on “Save” to keep the new encoding.
13. J-ISIS Print Formatting Language
More than 90% of the WinISIS print formatting language has been implemented.
13.1 Differences with WinISIS
There are some differences in the print formatting language syntax between WinISIS and J-ISIS. J-ISIS
uses a grammar to define the syntax and to generate the syntax analyser (parser). The grammar was
designed from the WinISIS Reference Manual and is stricter than WinISIS.
For example:
%/ is not accepted and should be replaced by %#
V07 should be replaced by V7
A “conditional literal” should be followed by a field
13.2 ISIS Formatting Language – J-ISIS implementation
This language is also used for indexing, sorting, printing, reformatting, validating, exporting and importing
records. The formatting language has a strict syntax and semantics and formats entered by the user are
parsed before being accepted by the system.
13.3 Including an external format
You may include an external format in a format by using the @name function, where name is the name of
the format to be included. This format must be in the ipft directory of the database.
13.4 Format exits (Call from the PFT to external functions)
In a format you may invoke external Groovy methods you have written to perform special formatting
functions required by a particular application, which could not otherwise be obtained by using the formatting
language.
These programs are called Format exits. As format exits are developed to satisfy specific needs, their
description is beyond the scope of the formatting language.
CDS/ISIS provides, however, a normalized way to interface format exits with the formatting language.
From the point of view of the formatting language a Format exit is a string function with a format argument.
The argument is first executed and its output is passed to the function. A format exit returns a character
string which CDS/ISIS handles as if it was a field in the record being formatted.
From the point of view of J-ISIS a Format exit is a Groovy Method that can be written with the Groovy
Console.
13.5 Some features have not been implemented because they can be replaced by XHTML, CSS or JavaScript
FONTS
COLS
Paragraph Formatting
  Indentation        M(indent,indent)
  Tabs               TAB or TAB(value)
  Centering          QC
  Justification      QJ
  Right alignment    QR
  Frame              BOX
  New page           NP
  Image insertion    PICT
Character Formatting
  Bold               b
  Italic             i
  Underline          ul
  Font choice        fn
  Font size          fsn
  Text color         fln
  Escape             |
LINK Commands
  OPENFILE
  CMD
  GOTO
  LOGOTO term
  LAGOTO/xxx term
  GOBACK
  FORMAT format
  BROWSE base,mfn,format
  TEXTBOX format
  TEXTBOXCHILD format
  TEXTBOXRCHILD format
  TEXTBOXLOAD format
  TEXTBOXRCHILDLOAD format
  TEXTBOXIMG format
  TEXTBOXCHILDIMG format
  TextBOXRCHILDIMG format
  PROMPT TEXTBOX…
  VIEW base, mfn, format
14. J-ISIS Groovy Console
J-ISIS provides a Groovy Console tab that you can open through the “Tools”->“Groovy Console” menu
item:
Notes:
1. The Groovy Console has its own menu bar and toolbar, which are embedded in the Groovy Console
window. You create, edit, save, load or execute Groovy scripts through them.
2. There is a bug when you load a Groovy file through the “File”->“Open” menu item: the cursor exists
but is hidden. A workaround to make the cursor visible is to minimize and then maximize J-ISIS. Sorry for
the inconvenience; I spent a lot of time trying to fix this issue without success, and hope to be able to
fix it for the next release.
15. Groovy Programming Language
Groovy is a dynamic language for the Java™ Virtual Machine (JVM). It offers full object-orientation,
scripting, optional typing, operator customization, lexical declarations for the most common data types,
advanced concepts like closures and ranges, compact property syntax and seamless Java™ integration.
From Groovy, you can call any Java code just as you would from Java; it's identical. You can also call
Groovy code from Java.
15.1 Classes & Scripts
A Groovy class declaration looks like a Java one. The default visibility modifier is public:
class MyClass {
void myMethod(String argument) {
}
}
When a .groovy file or any other source of Groovy code contains code that is not enclosed in a class
declaration, then this code is considered a Script, e.g.
println "Hello World"
Scripts differ from classes in that they have a Binding that serves as a container for undeclared references
(that are not allowed in classes).
println text // expected in Binding
result = 1 // is put into Binding
Methods may have parameters with or without default value and may return an
expression:
def someMethod(para1, para2 = 0, para3 = 0) {
    // Method code goes here; para2 and para3 default to 0 when omitted
    return para1 + para2 + para3   // any expression can be returned
}
15.2 Groovy Tutorial
http://groovy.codehaus.org/Beginners+Tutorial
16. Using Groovy to write Format exits (Call from the PFT to external functions)
The TestFunc, SimpleTestFunc and pdfCatalogue groovy scripts are provided with this distribution in the
jisis-suite\work directory and the PFTs SimpleTestFunc, TestFunc are defined in the ASFAEX database.
For example, you can create the following Simple Groovy Function:
You can test the function by clicking on the Execute Groovy Script Toolbar button:
Save it in the work directory with the name SimpleTestFunc.groovy.
Then you can create a SimpleTestFunc PFT that calls the SimpleTestFunc Groovy script.
Save it by clicking on “Save”.
And now if you choose the SimpleTestFunc PFT in the Data Viewer:
You will see:
A Format exit is invoked as follows: &Name(format)
Where:
& identifies this as a Format exit invocation;
Name is the name of the Groovy Script to be executed;
The full J-ISIS core library API is available for developing Groovy Scripts. It means that you can access the
current record, the current database, the format, or even other databases, FSTs, PFTs, parsing and
execution of PFT, including searching, the index dictionary, iText Open Source PDF library, etc.
Below is an example that accesses a record:
Please note the statement:
IRecord rec = binding.getVariable("record");
This statement allows the script to access the current record which is provided through a &Name(format)
command in a PFT. The current Database and the format can be accessed through the following statements:
IDatabase db = binding.getVariable("db")
String format = binding.getVariable("format")
If you select TestFunc in the “Data Viewer”, you will see:
17. Database Creation
To create a database in J-ISIS, a database definition wizard is used. This consists of a sequence of dialogs
that prompt the user for input to create four core database elements:
o Field definition table (FDT),
o Data entry worksheet (WKS),
o Default print format definition (PFT),
o Field selection table (FST).
The field definition table defines the tag, name and type of fields in the database. Data entry worksheets
create data entry interfaces that include only those fields that the user selects. Print format definitions are
written in the ISIS formatting language and define the appearance of records. The field selection table
selects fields to index and a corresponding indexing method.
Clicking on next will provide a Wizard panel that displays the Field Definition Table Editor.
Field Definition Table (FDT) – Database Structure
The Field Definition Table (FDT) provides information on the contents of the master records in a given data
base. In particular it defines the various fields which may be present and a number of parameters for each
field.
The FDT is used to control the creation of data entry worksheets for the data base and to validate the
contents of fields.
A field is created or updated by providing data in the fields of the upper line:
Each line of the FDT defines one field of the Master file record and contains 7 parameters: the field tag,
name, type, presence of indicators (Marc21), repeatability, first subfield, and subfield delimiters or
pattern. These are described below.
Field Tag - The tag is a unique numeric value identifying the field. As in CDS/ISIS, you will use the tag of
the field each time you want J-ISIS to perform a given operation on the field. The tag is stored in the master
record and is associated with the contents of the corresponding field.
Field Name - The field name is a descriptive name you assign to the field. It is normally used in data entry
worksheets to label the field on the screen. You may consider that this is the name of the field as you know
it, whereas the tag is the name by which the field is known to J-ISIS.
Field type - The field type indicates possible restrictions on the data characters which may be stored in the
field. The field type may be one of the following:
Indicators – Indicates if the field has indicators as defined in bibliographic formats such as Marc21. If this
check box is checked, the advanced worksheet editor will automatically generate data entry fields for the
indicators.
First Subfield – Indicates if the first subfield of a subfielded field has a subfield delimiter
Note that the first subfield of a subfielded field need not have a subfield delimiter, provided that it is always
present. For example, if in a title field you wanted to use a subfield for the subtitle, the title part of the field,
which will obviously always be present, need not have an explicit delimiter. Thus the following entry for
this field would be possible:
Il nome della rosa^bUn manoscritto
If this box is checked, the advanced worksheet editor will automatically generate a data entry element for
this implicit subfield.
Repeatability - This parameter defines whether the field is repeatable (i.e. it may occur more than once in
any given record) or not.
Subfields/Pattern - Depending on the type of field defined, this entry defines either the set of subfields, if
any, allowed in the field, or the pattern (for type PATTERN).
Subfields - If the field contains subfields, the allowed subfield identifiers are defined here, in the order in
which they must appear. Note that the caret (^) identifying the subfield delimiter is not entered. For
example, if a field may contain the subfields ^a, ^b and ^c, these are defined in the FDT as abc (and not ^a ^b
^c).
Let's create a field with tag 10, “title” as name and no subfields.
We first define the tag, name and subfields in the line editor, and then click on the “Add/Update”
button to create it in the FDT table below.
Let's create a second field with tag 20, “Authors” as name and subfields “ab”, which is repeatable.
Clicking on next will provide a Wizard panel that displays the Worksheet Editor.
Data Entry Worksheet
Clicking on the 2 arrows down button will create worksheet fields from the FDT fields as above
Clicking on next button will provide a Wizard panel that displays the Field Selection Table
Editor.
Field Selection Table (FST)
Fields can be moved from the FDT into the FST or removed from the FST by clicking on the down and
up arrows respectively.
Clicking on “Finish” button will create the Database
18. Data Entry
Data is entered manually through a data entry interface specified by the user through a worksheet definition.
Subfields and repeatable fields are permitted. Existing records can be modified or deleted through the same
interface. Records are stored in a Berkeley DB. There is only one Berkeley DB for a database. A Lucene
index and a field selection table (FST) are associated with a database. The field selection table defines the print
format to be applied to a record for extracting the terms to index.
The basic data entry facilities called CRUD (Create, Retrieve, Update & Delete using a user worksheet) are
implemented, and the index is updated each time a record is saved or deleted.
The “Dictionary Browser”, “DB Browser” and “Data Viewer” are synchronized with the “Data Entry” when
they are opened simultaneously in the application.
Subfielded fields
When you enter a field containing subfields you must key in the required subfield delimiters in front of each
subfield. A subfield delimiter is a 2-character code preceding and identifying a variable length subfield
within a field. It consists of the character ^ followed by an alphabetic or numeric character, e.g.
^a
If the subfield code is alphabetic, you may enter it in either upper or lower case: J-ISIS makes no difference
between ^a and ^A. You may therefore use the most convenient form.
Do not insert spaces or punctuation marks either before or after the subfield delimiter, unless you have been
specifically instructed to do so. Entering spaces or punctuation may adversely affect the printing of the field
later on.
Here is an example of a field with three subfields:
^aUnesco^bParis^c1985
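The subfield conventions described above (a ^ followed by a one-character code, codes treated as case-insensitive, and an optional implicit first subfield without delimiter) can be sketched in Java as follows. SubfieldParser is a hypothetical class written for this tutorial, not the parser used inside J-ISIS; it stores the implicit first subfield under an empty key, and a repeated subfield code overwrites the earlier value.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of J-ISIS): splits a field value into subfields.
class SubfieldParser {
    static Map<String, String> parse(String field) {
        Map<String, String> subfields = new LinkedHashMap<>();
        String[] parts = field.split("\\^", -1);
        // Text before the first ^ is the implicit first subfield, stored under "".
        if (!parts[0].isEmpty()) {
            subfields.put("", parts[0]);
        }
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].isEmpty()) {
                continue;   // a ^ with nothing after it is skipped
            }
            // ^a and ^A are the same subfield, so the code is lower-cased.
            String code = parts[i].substring(0, 1).toLowerCase(Locale.ROOT);
            subfields.put(code, parts[i].substring(1));
        }
        return subfields;
    }

    public static void main(String[] args) {
        System.out.println(parse("^aUnesco^bParis^c1985"));   // prints {a=Unesco, b=Paris, c=1985}
    }
}
```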
Repeatable fields
If the field you are entering is repeatable and you need to enter more than one occurrence, enter each
occurrence separately, and click on the repeatable field icon (“+”) for each new occurrence to be added.
Copy/Paste (with Ctrl/C, Ctrl/V) and Undo/Redo (Ctrl/Z, Ctrl/Y) can be used during data entry.
Let's create a new record for the book “C: A Reference Manual” (C: ARMS, ISBN 0-13-089592X) written by
two authors: Samuel P. Harbison and Guy L. Steele.
When selecting “Data Entry” from the “Edit” menu, we get a data entry display with the default worksheet
field names and no data (and MFN equals 0). The database record content can be displayed using the MFN
field or the navigation icons.
CREATE (CRUD)
Click on the Icon of the fields to add a new Field/Occurrence
Enter the Title and create two occurrences to enter the authors
Click on the diskette Icon to save the record
Check the Database with the DB browser
Check the index with the Dictionary browser
Let's create a new record for the book “Rich Client Programming, PLUGGING INTO THE NETBEANS
PLATFORM” written by three authors: Tim Boudreau, Jaroslav Tulach and Geertjan Wielenga.
Go back to the Data Entry tab and click on the icon to create a new record.
A data entry panel is displayed; click on the icon in the Title panel:
A Field/Occurrence data entry area is displayed
We enter the book title:
Then we enter the authors by creating three occurrences
Clicking on the “+” icon as indicated in the above screen shot will add a new occurrence to enter the second
author. Clicking again on the “+” icon will allow entering the third author, and finally we will have:
We can now save the record, clicking on the diskette icon
Click on the DB Browser Tab to see the database content
Please note that occurrences are separated by a percent (%) sign.
Click on the Dictionary Tab to see the terms indexed.
19. New Advanced Worksheet Editor
The new advanced worksheet editor uses a Tree-Table
layout and allows defining a worksheet that
goes down to the subfield level, defines repetitive subfields
and may contain field indicators and implicit
subfields.
THE NEW ADVANCED WORKSHEET EDITOR
BUILDS WORKSHEETS THAT ARE INTENDED
TO BE USED BY THE NEW ADVANCED DATA
ENTRY MODULE. HOWEVER, WORKSHEETS
REMAIN COMPATIBLE WITH THE
STANDARD DATA ENTRY MODULE.
In our case, the default worksheet defined at database creation using the standard worksheet editor is
displayed. It doesn't include the subfields. We want to improve it, so we first click on the “Remove All”
button.
After clicking on the button, we get the following:
Now, we click on the “Add All” button to build our advanced worksheet definition. The bottom table
now looks like this:
Please note that the field with tag 20 (Authors) now has a “+” node that can be expanded by clicking
on it:
We now have worksheet entries for the subfields; we can change the default prompt and indicate whether the
subfield is repetitive (the type can also be changed).
Double-clicking on the “v20^a” cell allows editing the prompt:
We can now save the worksheet.
The new worksheet definition, stored in XML format, contains more information but remains compatible
with the standard data entry module.
We have used the “Add All” button to move all fields, but fields can also be selected individually and inserted
anywhere in the bottom Tree-Table: just select the node (or tree root) after which you want to
insert the field, and the sub-nodes will be created automatically.
It's quite easy to define template worksheets for Marc21 bibliographic records, authority records and
Unimarc bibliographic records.
Example of a Marc21 bibliographic worksheet:
Example of a Marc21 authority control worksheet:
20. New Advanced Data Entry
The new advanced data entry editor uses a Tree-Table
layout and worksheets defined with the new
advanced worksheet editor.
It also provides interactivity for the basic
functionality of entering, editing, viewing or
deleting records, that is, CRUD (Create, Read,
Update, Delete).
When selecting “Advanced Data Entry” from the “Edit” menu, we get a data entry display with the default
worksheet field and subfield names and no data (and MFN equals 0). The database record content can be
displayed using the MFN field or the navigation icons.
Expanding the tag 20 node will look like this:
Data can be entered by clicking on a cell in the data column
Dark pink cells cannot be edited
Clicking on the pencil will provide a dialog with an editor.
Clicking on the diskette icon will save the record. Cell editing should be validated by pressing the
“Enter” key.
Clicking on the icon will display the previous record
Clicking again on the icon will display the first record
Records can be updated and saved
Next Record
Last Record
Previous Record
First Record
Create New Record
Delete Record
21. Sorting and Printing
Sorting and printing are done on an open database. Printing output is always directed to a disk file.
Printing can be done without sorting the database record, i.e.
the records will be printed in the master file number (MFN)
sequence order.
This command allows you to print all the records or to print a
selected range of records. You may sort the records by virtually any
combination of fields and subfields.
The GUI offers two tabs called respectively “Print” and “Sort”.
21.1 Quickly Printing All/Or a Selected Range of Records
1. Specify which records you want to print. You may print the whole data base or a specific range of
records. You can enter a list of MFNs and/or MFN ranges separated by commas. For example:
1,10,100-150, 50.
2. Select the print format which defines which fields must be printed and how they should be
formatted
3. Give a name to the output file and select the directory where it should be stored.
HTML is for the moment the only output format supported, but you can send the output to a PDF file when
printing if you have installed a PDF printer driver such as PDFCreator:
http://sourceforge.net/projects/pdfcreator/
4. Click on the “Print” button.
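The record selection entered in step 1 is a comma-separated list of single MFNs and MFN ranges. The Java sketch below shows how such a list can be expanded; MfnSelection is a hypothetical class, not the J-ISIS implementation, and returning the MFNs in ascending order without duplicates is an assumption made for the example.

```java
import java.util.TreeSet;

// Hypothetical helper (not part of J-ISIS): expands a selection such as
// "1,10,100-150, 50" into the individual MFNs it designates.
class MfnSelection {
    static TreeSet<Long> parse(String spec) {
        TreeSet<Long> mfns = new TreeSet<>();   // sorted, no duplicates (assumption)
        for (String item : spec.split(",")) {
            String part = item.trim();
            if (part.isEmpty()) {
                continue;
            }
            int dash = part.indexOf('-');
            if (dash < 0) {
                mfns.add(Long.parseLong(part));             // single MFN
            } else {
                long first = Long.parseLong(part.substring(0, dash).trim());
                long last = Long.parseLong(part.substring(dash + 1).trim());
                for (long mfn = first; mfn <= last; mfn++) {
                    mfns.add(mfn);                          // every MFN in the range
                }
            }
        }
        return mfns;
    }

    public static void main(String[] args) {
        System.out.println(parse("1,10,100-150, 50").size());   // prints 54
    }
}
```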
Please note the “SORTING” radio buttons where you choose between “Don’t Use Sorting”, “Use the
selected Hit Sort File for driving the output” and “Sort the records according to keys defined in the
Sort Tab”.
J-ISIS displays the following message once printing on a disk file is done.
Open the output file in your favorite browser:
21.2 Sorting the Records before Printing
You will need to click on the “Sort” tab to specify the sorting parameters. As a first example, we will sort the
ASFAEX records on field 543 “Date of publication”:
FSTs are discussed in detail in the CDS/ISIS Reference Manual. You may either supply the name of a
pre-defined FST or enter one directly. If you want to use a pre-defined FST, enter the name preceded by an at
sign (@). The @ sign tells CDS/ISIS that this is a name, rather than an actual FST. To provide an actual
FST, you must enter the three components separated by a space in the following order: field identifier,
indexing technique, and format. In case you need to enter a multi-line FST, separate each line with a + sign
surrounded by spaces. Here are two sample entries: the first one instructs CDS/ISIS to use the pre-defined
FST called AUTHOR; the second instructs the system to create a sort key from field 10 and a sort key from
each descriptor in field 20.
@AUTHOR
1 0 V10 + 1 2 V20
Thus, we define one sorting key by checking the check box of the first sorting key, we keep a default length
of 15 characters and we provide the FST entry: “543 0 V543”
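The structure of such an FST entry (field identifier, indexing technique and format, with lines separated by a + sign surrounded by spaces) can be made explicit with a short Java sketch. FstEntry is a hypothetical class, not the J-ISIS parser. It does not handle the @name form, and it assumes that the format itself never contains a + surrounded by spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of J-ISIS): one line of an FST typed in the Sort tab.
class FstEntry {
    final int id;          // field identifier
    final int technique;   // indexing technique
    final String format;   // extraction format

    FstEntry(int id, int technique, String format) {
        this.id = id;
        this.technique = technique;
        this.format = format;
    }

    static List<FstEntry> parse(String fst) {
        List<FstEntry> entries = new ArrayList<>();
        // Lines of a multi-line FST are separated by " + ".
        for (String line : fst.split(" \\+ ")) {
            // Split into at most 3 parts so that the format may contain spaces.
            String[] parts = line.trim().split("\\s+", 3);
            if (parts.length < 3) {
                throw new IllegalArgumentException("Expected 'id technique format' but got: " + line);
            }
            entries.add(new FstEntry(Integer.parseInt(parts[0]),
                                     Integer.parseInt(parts[1]),
                                     parts[2]));
        }
        return entries;
    }
}
```

With this sketch, "1 0 V10 + 1 2 V20" yields two entries and "543 0 V543" yields a single entry with identifier 543, technique 0 and format V543.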
Coming back to the Print tab, we:
1) Check the “Sort the records according to keys defined in the Sort Tab” radio button
2) Give a name to the output file (“SortOnDate”)
3) Select the SortOnDate PFT:
4) Click on the “Print” button
You should see something like the above screen shot. Please note that records with a blank or missing date
appear at the beginning.
22. Multilingual UNICODE Databases
J-ISIS uses Unicode throughout for text storage and indexing. If you are unable to read some Unicode characters
in your browser, it may be because your system is not properly configured. Here are some basic instructions
for configuring it. There are two basic steps:
1) Install fonts that cover the characters you need.
2) Configure J-ISIS to use them.
22.1 Windows
On Windows XP, install additional languages as follows:
Start > Settings > Control Panel > Regional and Language Options.
In the Languages tab, check the Supplemental language support option(s) you want. Checking both options
installs all optional fonts, together with system support for these languages.
22.2 Full fonts
If you have Microsoft Office 2000 or a newer version, you can get the Arial Unicode MS font, which is
the most complete. To get it, insert the Office CD and do a custom install. Choose Add or Remove Features.
Click the (+) next to Office Tools, then International Support, then the Universal Font icon, and choose the
installation option you want.
22.3 Configuring a J-ISIS database to use a special font.
1) Select the database:
2) Select the font for the database
Arial Unicode MS is the best choice, as it allows you to mix languages, alphabets and scripts:
23. Client Z39.50
The Z39.50 client works with Marc21 and Unimarc records. You can access Z39.50 servers with a user ID and
password, and run a parallel search on several servers. Records are converted to Unicode. You can select the
records that you wish to export from the retrieved records, and export them to ISO2709, XML, MarcXML, Text or a
J-ISIS database.
Templates for Marc21 and Unimarc (FDT, FST and worksheet for each format) will also be provided, so that it
will be very easy to create a new database based on these formats and to retrieve bibliographic records
with Z39.50.
Annex 1
Installing JDK 1.6
1.1 Downloading JDK 1.6
You can download the latest JDK 1.6 from
http://java.sun.com/javase/downloads/index.jsp.
Choose JDK 6 Update 13 or the latest update:
The Windows download file for JDK 6 Update 13 is named jdk-6u13-windows-i586-p.exe.
NOTE: new versions or updates may be available. If you download a new
version or an update, the file name may be slightly different from
jdk-6-windows-i586.exe.
1.2 Installing JDK 1.6 on Windows
Follow the steps below to install JDK 1.6:
1. Double click jdk-6-windows-i586.exe to run the installation program. You will see the JDK License
dialog displayed.
2. Click Accept to display the JDK Custom Setup dialog.
3. You may install the JDK in a custom directory. For simplicity, don't change the directory. Click Next to
install the JDK. After a while, the JRE Custom Setup dialog is displayed.
4. You may install the JRE in a custom directory. For simplicity, don't change the directory. Click Next to
install the JRE.
5. After the installation has completed, the Complete dialog is displayed. Click Finish to close the dialog.
Annex 2
The Field Select Table (FST)
A Field Select Table (FST) defines criteria for extracting one or more elements from a J-ISIS database
record. Depending on the context in which an FST is being used, these elements may then be used to create
the Lucene index entries for the record from which they were extracted, for sorting records in the desired
sequence before producing a printed report, or to reformat records during an import or export operation.
An element can be generally defined as a fragment of a record resulting from a particular process. Although
in many cases elements will be actual data elements, i.e. a field or a subfield, in other cases they may be
words, phrases, or any other piece of data which has a particular meaning to a specific application.
FSTs are created or modified by means of the FST editor under Tools > Fst Manager.
A sample FST is displayed below:
Field ID   Technique   Data extraction format
24         4           mhl,v24
69         2           v69
70         0           mhl,v70|%|
26         0           "PLACE=",v26^a
26         0           "PUBL=",v26^b
An FST consists of one or more lines each defining three parameters:
1. A field identifier (column labelled ID);
2. An indexing technique (column labelled IT); and
3. A data extraction format coded using the CDS/ISIS formatting language.
Whenever J-ISIS is requested to extract elements using an FST, it will read the relevant database records and
carry out, for each record and for each FST entry, the following process:
1. Execute the format to extract from the record the corresponding data;
2. Apply the specified indexing technique to the data produced by the format; and
3. Assign to each element thus produced the specified field identifier.
The process described above is strictly mechanical and is performed exactly as described. There is no
transmission of knowledge between one step and the next, only of data, although all steps co-operate in
achieving the desired result. For example, the fact that a particular field was extracted during step 1 is not
known to step 2: step 1 uses the full power of the formatting language to produce a string of characters and
pass it on to step 2. This step operates on this character string according to the specified indexing
technique. Indexing techniques are defined as processes on character strings, not on records or fields. It is
because of this generalized design that FSTs may be used for such different purposes as defining the
contents of the Inverted file or specifying the sorting requirements of a printed listing, which might appear,
at first sight, totally unrelated.
In the most general terms, you may think of an FST as a device able to produce elements of data required to
perform a certain task.
A. FST parameters
The three parameters of an FST line are described below in the order they are processed (when editing an
FST with the line editor, they are entered in the reverse order).
1. Data extraction format
This is coded using the J-ISIS (CDS/ISIS) formatting language described under "The Formatting Language".
Because the data produced by this format is not meant to be displayed, but further processed, J-ISIS does
not restrict the line width to any particular value and, consequently, it will never split data between lines.
The concept of lines, however, may be relevant to a particular indexing technique applied to the output
produced by the format. In this case CDS/ISIS will guarantee that lines will only be created in response to
explicit new line commands you specify in the format.
Because of this, certain formatting commands such as the C, the indentation or the escape sequence
commands would normally be irrelevant in a data extraction format and may, in some cases, produce
unexpected results. They should therefore be avoided, unless they are necessary to achieve the intended
result.
On the other hand, the mode (see "Mode command") selected to output certain fields may be instrumental to
the correct functioning of a particular indexing technique: certain techniques require in fact a specific mode
(this is indicated under each indexing technique discussed below). It is your responsibility to insert the
appropriate mode command(s) in the data extraction format, if necessary.
Also note that requesting upper case translation may adversely affect further processing applied to the
data produced by the FST. As a general rule you should not request upper case translation (use modes mpl,
mhl or mdl as applicable, rather than mpu, mhu or mdu), unless you are sure it is needed and will not have
any side effects. J-ISIS will automatically perform upper case translation whenever needed. For example, all
elements generated by the Inverted file FST will be translated to upper case before they are stored in the
dictionary, even when the FST produces them in lower case.
2. Indexing Techniques
An indexing technique specifies a particular processing to be performed on the data produced by the format
in order to identify the specific elements to be created. There are nine indexing techniques which you may
use. They are given a numeric code from 0 to 8 as explained below.
a. Indexing technique 0
Build an element from each line extracted by the format. This technique is normally used to index whole
fields or subfields. Note, however, that J-ISIS will build elements from lines, not from fields. This is
because J-ISIS looks upon the output of the format as a string of characters where fields are no longer
identifiable. It is therefore your responsibility to produce the correct data through the format, especially
when you are indexing repeatable fields and/or more than one field. In other words, when using this
technique, your data extraction format should output one line for each element to be indexed.
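As a rough illustration of indexing technique 0, the sketch below treats the format output as a plain string and builds one element per line. It is not the J-ISIS implementation: the class name TechniqueZeroSketch is hypothetical, and trimming each line and skipping empty lines are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of indexing technique 0 (not J-ISIS code).
class TechniqueZeroSketch {

    /** Builds one element from each line of the format output. */
    static List<String> elements(String formatOutput) {
        List<String> result = new ArrayList<>();
        for (String line : formatOutput.split("\\R")) {
            String element = line.trim();      // assumption: surrounding blanks are dropped
            if (!element.isEmpty()) {          // assumption: empty lines produce no element
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Output of a format that emits one line per subfield,
        // such as mhl,v26^a/v26^b/v26^c
        System.out.println(elements("Paris\nUnesco\n1965"));
    }
}
```

The same function applied to a single line such as "Paris, Unesco, 1965" returns one element only, which is why the format must output one line per element.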
b. Indexing technique 1
Build an element from each subfield or line extracted by the format. As CDS/ISIS will search the output of
the format for subfield delimiter codes, for this technique to work correctly your format must specify proof
mode (or no mode at all, as this is the default mode), because it is the only mode preserving the subfield
delimiter codes on output (remember that heading and data mode replace subfield delimiters by punctuation
marks). Note that indexing technique 1 is in fact a shortcut to using indexing technique 0. For example:
Record content:
^aParis^bUnesco^c1965

FST                          Format output            Elements produced
1 1 mpl,v26                  ^aParis^bUnesco^c1965    Paris
                                                      Unesco
                                                      1965
1 0 mhl,v26^a/v26^b/v26^c    Paris                    Paris
                             Unesco                   Unesco
                             1965                     1965
1 1 mdl,v26                  Paris, Unesco, 1965      Paris, Unesco, 1965
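The behaviour shown in the table can be approximated with the following sketch, which splits each line of the format output on subfield delimiter codes (a ^ followed by one character). The class name TechniqueOneSketch is hypothetical and the code is an illustration under that assumption, not the J-ISIS implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of indexing technique 1 (not J-ISIS code).
class TechniqueOneSketch {

    /** Builds one element from each subfield, or from the whole line if it has none. */
    static List<String> elements(String formatOutput) {
        List<String> result = new ArrayList<>();
        for (String line : formatOutput.split("\\R")) {
            // A subfield delimiter code is '^' followed by a single character
            for (String piece : line.split("\\^.")) {
                String element = piece.trim();
                if (!element.isEmpty()) {
                    result.add(element);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Proof mode keeps the delimiters, so three elements are produced
        System.out.println(elements("^aParis^bUnesco^c1965"));
        // Data mode has already replaced the delimiters, so only one element is produced
        System.out.println(elements("Paris, Unesco, 1965"));
    }
}
```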
c. Indexing technique 2
Builds an element from each term or phrase enclosed in triangular brackets (<...>). Any text outside brackets
is not indexed. Note that this technique requires proof mode, because the other modes delete the brackets.
The advantages of using triangular brackets over using slashes (Indexing technique 3), are discussed under
“Search term delimiters”.
A field containing “Mission report describing a <university course> in
<documentation training> at an East African <library school>” will produce the
following elements when indexed with this technique:
university course
documentation training
library school
d. Indexing technique 3
Does the same processing as indexing technique 2 except that terms or phrases are enclosed in slashes (/../).
For example the following text:
Mission report describing a /university course/ in /documentation
training/ at an East African /library school/
will produce the following elements when indexed with this technique:
university course
documentation training
library school
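Indexing techniques 2 and 3 differ only in the delimiters they look for, so a single sketch can illustrate both. The class DelimitedTermsSketch and its method are hypothetical names; the scan below is a simplified stand-in for the real processing and assumes that terms are not nested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of indexing techniques 2 and 3 (not J-ISIS code).
class DelimitedTermsSketch {

    /** Returns the terms enclosed between 'open' and 'close'; other text is ignored. */
    static List<String> terms(String text, char open, char close) {
        List<String> result = new ArrayList<>();
        int start = text.indexOf(open);
        while (start >= 0) {
            int end = text.indexOf(close, start + 1);
            if (end < 0) {
                break;                         // unmatched delimiter: nothing more to extract
            }
            String term = text.substring(start + 1, end).trim();
            if (!term.isEmpty()) {
                result.add(term);
            }
            start = text.indexOf(open, end + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // Technique 2: triangular brackets
        System.out.println(terms("a <university course> in <documentation training>", '<', '>'));
        // Technique 3: slashes
        System.out.println(terms("a /university course/ in /documentation training/", '/', '/'));
    }
}
```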
e. Indexing technique 4
Build an element from each word in the text extracted by the format. A word is any sequence of contiguous
alphabetic characters (see note 1).
When you use this indexing technique, you may prevent certain non-significant words from being indexed
by defining them in a special file called the Stopword file (see under “Creating a stopword file” for details
on how to build a stopword file).
Note: when this technique is used to index an entire field containing subfield delimiters, you must specify
heading or data mode (mhl or mdl) in the corresponding data extraction format so that subfield delimiter
replacement will take place before indexing, otherwise alphabetic subfield delimiter codes will be
considered part of a word. It is also advisable to use heading or data mode if the field being indexed contains
filing information, so that only the display form of the field is indexed and any data required for sorting the
field is ignored (see under “Filing information”).
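The following sketch illustrates word indexing with a stopword list. It is an approximation only: the class TechniqueFourSketch is a hypothetical name, and "alphabetic character" is taken to mean any Unicode letter, whereas the actual definition may be customized as explained in note 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of indexing technique 4 (not J-ISIS code).
class TechniqueFourSketch {

    // Assumption: a word is a run of Unicode letters
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    /** Builds one element per word, skipping words found in the stopword set. */
    static List<String> words(String text, Set<String> stopwords) {
        List<String> result = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            String word = matcher.group();
            // Stopwords are stored in capital letters, so compare in upper case
            if (!stopwords.contains(word.toUpperCase(Locale.ROOT))) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new java.util.HashSet<String>(java.util.Arrays.asList("ON", "THE"));
        System.out.println(words("Report on the library school", stopwords));
    }
}
```

Applied to proof mode output such as "^aParis", the sketch returns the word "aParis": the alphabetic subfield code becomes part of the word, which is the problem the note above warns about.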
f. Indexing techniques 5 to 8
The following 4 indexing techniques will allow specifying a prefix for search terms extracted with indexing
techniques 1, 2, 3 and 4. These techniques are numbered 5, 6, 7 and 8 respectively. The prefix is specified in
the data extraction format as an unconditional literal as follows:
'dp...pd', [format]
Where:
'd' is a delimiter of your choice (which does not occur in the prefix itself)
p...p is the actual prefix
For example: 1 8 '/TI=/',v24
This will index each word of field 24 and prefix each term with TI=.
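To make the prefix convention concrete, here is a small sketch that reads the leading literal of a data extraction format and applies the prefix to a list of terms. The class PrefixTechniqueSketch and its methods are hypothetical; the parsing only covers the 'dp...pd' form described above and is not taken from the J-ISIS source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of indexing techniques 5 to 8 (not J-ISIS code).
class PrefixTechniqueSketch {

    /** Techniques 5, 6, 7 and 8 correspond to techniques 1, 2, 3 and 4 plus a prefix. */
    static int baseTechnique(int technique) {
        if (technique < 5 || technique > 8) {
            throw new IllegalArgumentException("Not a prefix technique: " + technique);
        }
        return technique - 4;
    }

    /** Extracts p...p from a format that starts with the literal 'dp...pd'. */
    static String prefixOf(String format) {
        String f = format.trim();
        int closingQuote = f.indexOf('\'', 1);
        if (f.isEmpty() || f.charAt(0) != '\'' || closingQuote < 3) {
            throw new IllegalArgumentException("Format must start with 'dp...pd': " + format);
        }
        String literal = f.substring(1, closingQuote);   // e.g. /TI=/
        char delimiter = literal.charAt(0);
        if (literal.charAt(literal.length() - 1) != delimiter) {
            throw new IllegalArgumentException("Prefix delimiters do not match: " + literal);
        }
        return literal.substring(1, literal.length() - 1);
    }

    /** Puts the prefix in front of every extracted term. */
    static List<String> prefixed(String prefix, List<String> terms) {
        List<String> result = new ArrayList<>();
        for (String term : terms) {
            result.add(prefix + term);
        }
        return result;
    }

    public static void main(String[] args) {
        String prefix = prefixOf("'/TI=/',v24");
        System.out.println(baseTechnique(8) + " " + prefix);
    }
}
```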
Note 1: The definition of alphabetic characters may be customized at each user installation through the system table ISISAC.TAB (see under “Alphabetic characters
table (ISISAC.TAB)”).
3. Field identifier
The field identifier is a number (in the range 1-32767) which is assigned to each element created during the
indexing step. The meaning of the field identifier depends on the purpose the FST is being used for, as
explained below.
Inverted file FST: the field identifier is the qualifier to be used during searching (see under “Operand
qualifier”);
Sorting FST: the field identifier is the field tag to be used in a user-supplied heading format (see
“Heading format”);
Reformatting FST: the field identifier is the ISO tag to be assigned to an exported field (see under
“Reformatting FST”), or the CDS/ISIS tag to be assigned to an imported field (see
under “Reformatting FST”).
You may find additional information on FSTs used for a specific purpose under “System sort worksheet”,
“Export worksheet”, and “Import worksheet”.
Indexing technique 4 signifies that each word in the field will be indexed separately (except stopwords, see
Section 4.7). If the field is divided into subfields, you must specify mode mhl or mdl in the
extraction format. To prevent certain words from
being indexed you have to make a text file which lists these words and which is
known as a stopword file.
4.7 STOPWORD LIST
If you are indexing a field by separate words (indexing technique 4) you may want to prevent
common, non-informative words such as 'an' or 'the' from being indexed. This can be
achieved by setting up a stopword list for the database. Words on the stopword list will not be
indexed using indexing technique 4 (though they may still appear as part of phrases
produced with other indexing techniques). Note that there can only be one stopword list for a
given database, not different lists for different fields.
The stopword file needs to be set up outside CDS/ISIS using a text editor or word processor
(see Section 2.9.8). It must have the same name as the database and the file extension stw
(e.g. books.stw for the database BOOKS). It must reside in the same folder as the FDT file
for the database.
The file must contain one stopword on each line with no preceding spaces, and the words
must be in capital letters. An example is shown below.
A
AN
AND
BY
FOR
FROM
IS
IT
NOT
TO
THE
WITH
_______________________________________________________
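A stopword file in the layout described above is simple to read programmatically. The sketch below is an illustration only: StopwordFileSketch is a hypothetical class, the lower-case file name is inferred from the books.stw example, and converting each word to upper case while loading is a defensive assumption rather than documented behaviour.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of reading a stopword file (not J-ISIS code).
class StopwordFileSketch {

    /** Assumption based on the example: database BOOKS uses the file books.stw. */
    static String fileNameFor(String dbName) {
        return dbName.toLowerCase(Locale.ROOT) + ".stw";
    }

    /** Reads one stopword per line and returns the words in file order. */
    static Set<String> load(Reader source) {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word.toUpperCase(Locale.ROOT));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the file so the example runs anywhere
        System.out.println(fileNameFor("BOOKS") + " -> " + load(new StringReader("A\nAN\nAND\n")));
    }
}
```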
Although you may be able to think of hundreds of words that are not useful as search terms, it
Annex 3
How to use jisis core library in Groovy scripts or other Web Applications
1 j-isis Core Library Application Programming Interface (API)
J-ISIS (like CDS/ISIS) is not a relational database system. Records are variable-length and are
identified by a unique ID called the master file number (mfn). Records are made of variable-length fields
identified by a tag. A field can be repeatable, in which case it can have several occurrences. A non-repeatable
field has a single occurrence. An occurrence can contain several subfields.
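The occurrence/subfield level of this model can be pictured with a small stand-alone sketch that splits the text of one occurrence into subfields. The class OccurrenceSketch is hypothetical and does not use the J-ISIS classes; the dummy code '*' for data that carries no subfield delimiter follows the comment in the record exploration snippet later in this annex.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the occurrence/subfield structure (not J-ISIS code).
class OccurrenceSketch {

    /** A subfield: a one-character code and its data. */
    static final class Sub {
        final char code;
        final String data;

        Sub(char code, String data) {
            this.code = code;
            this.data = data;
        }
    }

    /** Splits an occurrence such as "^aParis^bUnesco^c1965" into its subfields. */
    static List<Sub> subfields(String occurrence) {
        List<Sub> result = new ArrayList<>();
        int i = occurrence.indexOf('^');
        // Data before the first delimiter (or the whole text) gets the dummy code '*'
        String head = i < 0 ? occurrence : occurrence.substring(0, i);
        if (!head.isEmpty()) {
            result.add(new Sub('*', head));
        }
        while (i >= 0 && i + 1 < occurrence.length()) {
            char code = occurrence.charAt(i + 1);
            int next = occurrence.indexOf('^', i + 2);
            String data = next < 0 ? occurrence.substring(i + 2)
                                   : occurrence.substring(i + 2, next);
            result.add(new Sub(code, data));
            i = next;
        }
        return result;
    }

    public static void main(String[] args) {
        for (Sub sub : subfields("^aParis^bUnesco^c1965")) {
            System.out.println(sub.code + " = " + sub.data);
        }
    }
}
```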
The classes and objects that come naturally from the CDS/ISIS application are connections, database server,
databases, records, fields, occurrences, subfields, field definition table, indexes, field selection table, queries,
etc.
The Java programming language provides a mechanism for defining a type that permits multiple
implementations: interfaces. Interfaces cleanly separate the API from the implementation. By convention, in
J-ISIS, interface names begin with the letter “I”. Interfaces provide the method signatures that class
implementations must provide.
The figure below summarizes the main J-ISIS interfaces.
IConnection (Key = hostName, port)
  1 : m  IDatabase (Key = dbHome, dbName)
    1 : m  IRecord (Key = mfn)
      1 : m  IField (Key = tag)
        1 : m  IOccurrence (Key = sequence index)
          1 : m  ISubfield (Key = subfield code)
The figure below summarizes the classes that implement the main J-ISIS interfaces on the client side.
ConnectionNIO (Key = hostName, port)
  1 : m  ClientDbProxy (Key = dbHome, dbName)
    1 : m  Record (Key = mfn)
      1 : m  Field (Key = tag)
        1 : m  StringOccurrence (Key = sequence index)
          1 : m  Subfield (Key = subfield code)
Java Packages:
import org.unesco.jisis.corelib.common.IConnection
import org.unesco.jisis.corelib.client.ConnectionNIO
import org.unesco.jisis.corelib.client.ClientDbProxy
import org.unesco.jisis.corelib.common.IDatabase
import org.unesco.jisis.corelib.record.IRecord
import org.unesco.jisis.corelib.record.IField
import org.unesco.jisis.corelib.record.StringOccurrence
import org.unesco.jisis.corelib.record.Subfield
2 Code Snippets:
Establishing a connection, opening a database and Browsing the database record by record:
// Initialize the server parameters
def username = "admin";
def password = "admin";
def port = "1111";
def hostname = "localhost";
// Establish a connection to the server
def connection = ConnectionNIO.connect(hostname, Integer.valueOf(port),
                                       username, password);
// Create a database object bound to this server
ClientDbProxy db = new ClientDbProxy(connection)
// Let's use the ASFAEX database under the root defined by DEF_HOME
def dbHome = "DEF_HOME";
def dbName = "ASFAEX"
// Open the database
db.getDatabase(dbHome, dbName)
// Get the first record
IRecord rec = db.getFirst();
// Iterate over the records in the database until there are no more
while (rec != null) {
    // Process the record
    // ...
    // Get the next sequential record in mfn order
    rec = db.getNext();
}
// Close the database
db.close();
Exploring the record data
// Get the number of fields in the record
int nfields = rec.getFieldCount();
// Iterate over all fields
for (int i=0; i<nfields; i++) {
    // Get the i-th field
    IField field = rec.getFieldByIndex(i);
    // Get the number of occurrences
    int nocc = field.getOccurrenceCount();
    if (nocc>0) {
        // Iterate over the occurrences
        for (int j=0; j<nocc; j++) {
            // Get the j-th occurrence of the i-th field
            StringOccurrence occ = field.getOccurrence(j);
            // Get the subfields; an occurrence has at least one subfield.
            // Data without a subfield delimiter code has the dummy
            // subfield code char '*'
            Subfield[] subfields = occ.getSubfields();
            // Iterate over the subfields
            for (int k=0; k<subfields.length; k++) {
                // Get the data of the k-th subfield
                String data = subfields[k].getData()
                // Get the subfield delimiter code
                char code = subfields[k].getSubfieldCode();
                // Process the data
            }
        }
    }
}
Processing Specific Fields – Example 1
// Get the Monographic Level Authors (tag 200)
field = rec.getField(200);
// Process the field if it exists
if (field != null) {
// Get the number of occurrences
int nocc = field.getOccurrenceCount();
if (nocc>0) {
// Output a title if we have occurrences
chapter.add (new Paragraph ("Monographic Level Authors:", h1Font));
// Build a list from the occurrences
List list = new List (false, 30);
for (int i=0; i<nocc; i++) {
list.add (new ListItem (field.getStringOccurrence(i)));
}
// Output the list
chapter.add (list);
}
}
Processing Specific Fields – Example 2
// Get the Corporate Authors (tag 210)
field = rec.getField(210);
if (field != null) {
// A field has at least one occurrence
int nocc = field.getOccurrenceCount();
if (nocc>0) {
// Output a title if we have occurrences
chapter.add (new Paragraph ("Corporate Authors:", h1Font));
// Build a list from the subfields in the occurrences
List list = new List (false, 30);
// Iterate over the occurrences
for (int i=0; i<nocc; i++) {
StringOccurrence occ = field.getOccurrence(i);
Subfield[] subfields = occ.getSubfields();
// Iterate over the subfields
for (int j=0; j<subfields.length; j++) {
// Add the subfield data to the list
list.add (new ListItem (subfields[j].getData()));
}
}
// Output the list
chapter.add (list);
}
}
3 The API
public interface IRecord extends Serializable {
// Get the type of record
public int getRecordType();
// Get field with tag “tag”
public IField getField(int tag) throws DbException;
// Get the field with index “index”
public IField getFieldByIndex(int index) throws DbException;
// Get the number of fields
public int getFieldCount() throws DbException;
// Get MFN
public long getMfn();
// Get a vector of fields
public Vector getFields() throws DbException;
// Set the MFN
public void setMfn(long mfn);
// Get an html representation
public String toHtml();
// Get a serialized value
public byte[] toBytes() throws IOException;
}
public interface IField extends Serializable {
public int getTag();
public int getType();
public boolean hasOccurrences();
public boolean hasSubfields();
public Object getFieldValue();
public String getStringFieldValue();
public Object getOccurrenceValue(int occur);
public IOccurrence getOccurrence(int occur);
public String getStringOccurrence(int occur);
public String getSubfield(int occur, String subfield);
public int getOccurrenceCount();
public void setFieldValue(Object value) throws DbException;
public void setOccurrence(int occur, Object value) throws DbException;
public void removeOccurrence(int occur) throws DbException;
public void setType(int type);
public byte[] toBytesEx() throws IOException;
public int fromBytes(byte[] buf, int pos);
}
public interface IOccurrence extends Serializable {
// Returns true if this occurrence has subfields
public boolean hasSubfields();
// Returns the occurrence value
public Object getValue();
// Returns the subfield with subfieldTag
public String getSubfield(String subfieldTag);
// Sets the occurrence value
public void setValue(Object value);
// Get a bytes representation
public byte[] toBytesEx() throws IOException;
// Build the occurrence from bytes
public int fromBytes(byte[] buf, int pos);
}
public interface ISubfield extends Serializable {
// Returns the Subfield code that identifies the data element.
public char getSubfieldCode();
// Sets the data element identifier.
public void setSubfieldCode(char code);
// Returns the data element.
public String getData();
// Sets the data element.
public void setData(String data);
// Returns true if the given regular expression matches a subsequence of the
public boolean find(String pattern);
}
public interface IDatabase {
// Open a database
public void getDatabase(String dbHome, String dbName) throws DbException;
// Create a database
public boolean createDatabase(CreateDbParams createDbParam) throws DbException;
// Get the number of records in the database
public long getRecordsCount() throws DbException;
// Get the server connection for this database
public IConnection getConnection();
// Get the home string
public String getDbHome();
// Get the database name
public String getDbName();
// Close database
public boolean close() throws DbException;
/**************************************************
* Management of observers that will be notified
* when a change in the database occurs
**************************************************/
// Add an observer for this database
public void addObserver(Observer newObserver);
// Delete the observer for this database
public void deleteObserver(Observer observer);
/**************************************************
* Create, read, update and delete (CRUD)
* The four basic functions of persistent storage
**************************************************/
// Create a new empty record, IRecord contains the mfn allocated
public IRecord addNewRecord() throws DbException;
// Add a record to the database without updating the index
public Record addRecord(Record record) throws Exception;
// Read record with key "mfn"
public IRecord getRecord(long mfn) throws DbException;
// Read record with key "mfn" using the cursor method
public IRecord getRecordCursor(long mfn) throws DbException;
// Update an existing record or create a new record if mfn=0
public Record updateRecord(Record record) throws Exception;
// Delete record with key "mfn"
boolean deleteRecord(long mfn) throws DbException;
public long getLastMfn() throws DbException;
/************************************
* Record iteration
************************************/
public IRecord getFirst() throws DbException;
public IRecord getLast() throws DbException;
public IRecord getNext() throws DbException;
public IRecord getPrev() throws DbException;
public IRecord getCurrent() throws DbException;
/***************************************
* Reading a chunk of records
***************************************/
public Vector<Record> getRecordChunck(int from, int to) throws DbException;
public Vector<Record> getRecordChunck(long fromMfn, int nRecords)
throws DbException;
public Vector<Record> getRecordChunck(long[] mfnChunck) throws DbException;
/*******************************
* Field Definition Table
*******************************/
public FieldDefinitionTable getFieldDefinitionTable() throws DbException;
public boolean saveFieldDefinitionTable(FieldDefinitionTable fdt)
throws DbException;
/*******************************
* Field Selection Tables
*******************************/
public FieldSelectionTable getFieldSelectionTable() throws DbException;
public boolean saveFieldSelectionTable(FieldSelectionTable fst) throws DbException;
public String[] getFstNames() throws Exception;
public boolean saveFst(String name, FieldSelectionTable fst) throws Exception;
public FieldSelectionTable getFst(String name) throws DbException;
public boolean removeFst(String name) throws DbException;
public String getDefaultFstName() throws DbException;
/*******************************
* Print Formats
*******************************/
public String getDefaultPrintFormat() throws DbException;
public String getDefaultPrintFormatName() throws DbException;
public String getPrintFormat(String name) throws DbException;
public String getPrintFormatAnsi(String name) throws DbException;
public String[] getPrintFormatNames() throws DbException;
public boolean removePrintFormat(String name) throws DbException;
public void saveDefaultPrintFormat(String format) throws DbException;
public boolean savePrintFormat(String name, String format) throws Exception;
/*******************************
* Worksheets
******************************/
public WorksheetDef getWorksheetDef(String name) throws DbException;
public String[] getWorksheetNames() throws DbException;
public boolean removeWorksheetDef(String worksheetName) throws DbException;
public boolean saveWorksheetDef(WorksheetDef wkDef) throws Exception;
/*******************************
* Index
******************************/
public boolean buildIndex() throws DbException;
public boolean clearIndex() throws DbException;
public boolean reIndex() throws DbException;
public IndexInfo getIndexInfo() throws DbException;
/***************************************
* Reading a chunk of dictionary terms
***************************************/
public Vector<DictionaryTerm> getDictionaryTermsChunck(int from, int to)
throws DbException;
public Vector<DictionaryTerm> getDictionaryTermsChunckEx(String from, int n)
throws DbException;
public Vector<DictionaryTerm> getSortedDictionaryTermsChunck(int from, int to)
throws DbException;
public long getDictionaryTermsCount() throws DbException;
/*******************************
* Search
******************************/
public long[] search(String query) throws DbException;
public long[] searchLucene(String query) throws DbException;
}
public interface IConnection {
public void close() throws DbException;
//public void createDatabase(CreateDbParams createDbParam) throws DbException;
public void echo() throws DbException;
//public RemoteDatabase getDatabase(String dbHome, String dbName)
//        throws DbException;
public String[] getDbHomes() throws DbException;
public Vector getDbNames(String dbHome) throws DbException;
public UserInfo getUserInfo();
public String getServer();
public int getPort();
}
4 Writing a Groovy Application to produce a pdf catalogue
Suppose that we want to produce a catalogue of the records which are in the ASFAEX example database.
We will use the j-isis core library to extract the records and the iText open source library to format the
catalogue.
Generating a document in pdf, rtf or html with iText involves the following five steps:
Step 1: Create a Document.
Step 2: Get a DocWriter instance (in this case, a PdfWriter instance)
Step 3: Open the Document.
Step 4: Add content to the Document.
Step 5: Close the Document.
A document is created as follows:
Document doc = new Document(PageSize.A4)
By default, the orientation is Portrait. You can change this to Landscape by invoking the rotate method:
Document doc = new Document(PageSize.A4.rotate())
The Document class describes a document's page size (Letter, Legal, A4, and so on), margins, and other
important attributes. It is also a container for a document's chapters, sections, images, paragraphs, and other
content.
The Groovy code is provided in the pdfCatalogue.groovy file:
import org.unesco.jisis.corelib.client.ClientDbProxy;
import org.unesco.jisis.corelib.client.ConnectionPool;
import org.unesco.jisis.corelib.common.Global;
import org.unesco.jisis.corelib.common.IDatabase;
import org.unesco.jisis.corelib.record.IRecord;
import org.unesco.jisis.corelib.record.IField;
import org.unesco.jisis.corelib.record.StringOccurrence;
import org.unesco.jisis.corelib.record.Subfield;
import org.unesco.jisis.corelib.client.ConnectionNIO;
import org.unesco.jisis.corelib.common.IConnection;
import java.awt.Color;
import java.io.*;
import com.lowagie.text.*;
import com.lowagie.text.pdf.*;
class pdfCatalogue {
def bf = BaseFont.createFont (BaseFont.HELVETICA,
BaseFont.CP1252,
BaseFont.NOT_EMBEDDED);
// Establish a title font for all record titles.
def titleFont = new Font (Font.HELVETICA, 18, Font.BOLD,
new Color (0, 0, 128));
def h1Font = new Font(Font.HELVETICA, 12, Font.BOLD,
new Color(0, 0, 128));
    def process() {
        // Create an instance of the Document class
        Document doc = new Document();
        PdfWriter writer;
        writer = PdfWriter.getInstance(doc,
                                       new FileOutputStream("asfaex.pdf"));
        writer.setViewerPreferences(PdfWriter.PageModeUseOutlines);
        doc.open();
        initDocument(doc, writer);
        def username = "admin";
        def password = "admin";
        def port = "1111";
        def hostname = "localhost";
        // Establish a connection to the server
        def connection_ = ConnectionNIO.connect(hostname, Integer.valueOf(port),
                                                username, password);
        // Create a Database object bound to this server
        ClientDbProxy db_ = new ClientDbProxy(connection_)
        // Let's use DB ASFAEX defined on root DEF_HOME
        def dbHome = "DEF_HOME";
        def dbName = "ASFAEX"
        // Open the database
        db_.getDatabase(dbHome, dbName)
        // Get first record
        IRecord rec = db_.getFirst();
        // Iterate over the records
        while (rec != null) {
            // Create a record chapter.
            doc.add(recordChapter(rec));
            rec = db_.getNext();
        }
        // Close
        doc.close();
        writer.close();
    }
    def initDocument(doc, writer) {
        // Establish a footer that shows the page number between a pair of dashes.
        HeaderFooter footer = new HeaderFooter(new Phrase("- "), new Phrase(" -"));
        footer.setAlignment(Element.ALIGN_CENTER);
        doc.setFooter(footer);
        // Create the title page.
        PdfContentByte cb = writer.getDirectContent();
        cb.rectangle(doc.left(), doc.bottom(), (float)(doc.right() - doc.left()),
                     (float)(doc.top() - doc.bottom()));
        cb.stroke();
        cb.beginText();
        cb.setFontAndSize(bf, 34);
        cb.showTextAligned(PdfContentByte.ALIGN_CENTER, "ASFA",
                           (float)((doc.right() - doc.left()) / 2 + doc.leftMargin()),
                           (float)((doc.top() - doc.bottom()) / 2 + doc.topMargin()),
                           0);
        cb.setFontAndSize(bf, 12);
        cb.showTextAligned(PdfContentByte.ALIGN_CENTER,
                           "The Aquatic Sciences and Fisheries Abstracts (ASFA) Bibliographic Database",
                           (float)((doc.right() - doc.left()) / 2 + doc.leftMargin()),
                           (float)((doc.top() - doc.bottom()) / 2 + doc.topMargin() - 18),
                           0);
        cb.endText();
        // Create the Introduction chapter.
        Paragraph title = new Paragraph("Introduction", titleFont);
        title.setAlignment(Element.ALIGN_CENTER);
        title.setSpacingAfter(18.0f);
        Chapter chapter = new Chapter(title, 0);
        chapter.setNumberDepth(0);
        // Each string fragment ends with a space so that the words do not run together.
        Paragraph p = new Paragraph(
            "The Aquatic Sciences and Fisheries Abstracts (ASFA) Bibliographic Database is the " +
            "principal information product produced through the cooperative efforts of the international " +
            "network of ASFA Partners (http://www.fao.org/fishery/asfa/1,1/en) and FAO. " +
            "The database contains more than 1 million bibliographic references (or records) to the world's " +
            "aquatic science literature accessioned since 1971.");
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        chapter.add(p);
        doc.add(chapter);
    }
    def recordChapter(rec) {
        // Create a record chapter.
        Paragraph title = new Paragraph("Record " + rec.getMfn(), titleFont);
        title.setAlignment(Element.ALIGN_CENTER);
        title.setSpacingAfter(18.0f);
        Chapter chapter = new Chapter(title, 1);
        chapter.setNumberDepth(0);
        chapter.setBookmarkOpen(false);
        chapter.setBookmarkTitle("Record " + rec.getMfn());
        // For the four text fields below, the safe-navigation (?.) and Elvis (?:)
        // operators fall back to an empty string if the record has no such field.
        // Get the English Title (tag 220)
        IField field = rec.getField(220);
        chapter.add(new Paragraph("English Title:", h1Font));
        Paragraph p = new Paragraph(field?.getStringFieldValue() ?: "");
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        chapter.add(p);
        // Get the Original Title (tag 224)
        field = rec.getField(224);
        chapter.add(new Paragraph("Original Title:", h1Font));
        p = new Paragraph(field?.getStringFieldValue() ?: "");
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        chapter.add(p);
        // Get the Serial Title (tag 324)
        field = rec.getField(324);
        chapter.add(new Paragraph("Serial Title:", h1Font));
        p = new Paragraph(field?.getStringFieldValue() ?: "");
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        chapter.add(p);
        // Get the Abstract (tag 700)
        field = rec.getField(700);
        chapter.add(new Paragraph("Abstract:", h1Font));
        p = new Paragraph(field?.getStringFieldValue() ?: "");
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        chapter.add(p);
        /*
        Image image = Image.getInstance("mercury.gif");
        image.setAlignment(Image.ALIGN_MIDDLE);
        chapter.add(image);
        */
        // Get the Monographic Level Authors (tag 200)
        field = rec.getField(200);
        if (field != null) {
            int nocc = field.getOccurrenceCount();
            if (nocc > 0) {
                chapter.add(new Paragraph("Monographic Level Authors:", h1Font));
                List list = new List(false, 30);
                for (int i = 0; i < nocc; i++) {
                    list.add(new ListItem(field.getStringOccurrence(i)));
                }
                chapter.add(list);
            }
        }
        // Get the Corporate Authors (tag 210)
        field = rec.getField(210);
        if (field != null) {
            // A field has at least one occurrence
            int nocc = field.getOccurrenceCount();
            if (nocc > 0) {
                chapter.add(new Paragraph("Corporate Authors:", h1Font));
                List list = new List(false, 30);
                for (int i = 0; i < nocc; i++) {
                    StringOccurrence occ = field.getOccurrence(i);
                    Subfield[] subfields = occ.getSubfields();
                    for (int j = 0; j < subfields.length; j++) {
                        list.add(new ListItem(subfields[j].getData()));
                    }
                }
                chapter.add(list);
            }
        }
        return chapter;
    }
}
We create an instance of the pdfCatalogue class and we call the process method:
def catalogue = new pdfCatalogue()
catalogue.process()
Click on the “Execute Groovy Script” toolbar button to execute the pdfCatalogue script.
During execution, you should see the following dialog:
And when the dialog disappears, you should see:
If you don't provide a full path, the output file “asfaex.pdf” will be stored in the j-isis root folder. It should look like this: