
Annals of Library Science and Documentation 1987, 34(3), 90-96

CREATION OF A NATIONAL MINERAL DATABASE - AN EXPERIMENT PART I

Describes the different aspects of a national mineral database. The first part deals with the theoretical aspects of the database and studies various methods employed for codifying and formatting the data. The second part deals with the actual implementation and its use in answering various queries. For demonstration, DMS-II, a Burroughs package for database management, was used.

INTRODUCTION

During the last decade computer technology has assumed an important role in the earth sciences. There has been increasing recognition of the fact that electronic data processing enables earth scientists to utilize, to the fullest extent, the large resources of data currently and potentially available to them.

Once data have been acquired in computer-processable form, the geologist can take full advantage of the benefits offered by the computer. Data can be retrieved more efficiently and in a more meaningful form for the solution of a specific problem. Further advantages of electronic data processing can be realised in the exchange and publication of mineral data. The exchange of data in a standardized computer-processable form is an important factor in improving communication amongst earth scientists. When data are readily available in this form, the need to publish large quantities of data in tabular form is eliminated, which reduces cost.

Part I of this study examines the standards which must be considered in developing a format for recording, storing, and retrieving data in computer-processable files and proposes a framework for linking all such data. This part does not consider input formats or other detailed specifications required by specific computers and programs [6].

P K ROY
INSDOC
New Delhi 110067

DATA FILES ON MINERALS

Before discussing the concept of a national system, which would consist of a great number of data files, it is necessary to consider the requirements for an individual computer-processable file.

Assembling a data file on minerals is not without difficulties. If, however, keywords, the system of classification and the classification terminology are standardized for use within an individual file, they may be treated as data and the difficulties are thus eliminated.

The requirements of an individual computer-processable file are as follows:

1. Data within the file must be recorded in a consistent manner.

2. The file must use a reference numbering system that uniquely identifies each item or block of data, and provides an internal link between items of data.

If an individual file is to be used for more than one purpose, or if a number of individual files are to be utilised for one or more applications, the standards used to define and record data must be recognised by all potential users. Extension of these requirements to all mineral data files in India forms the basis of the concept of the national system.

NATIONAL SYSTEM OF MINERAL DATA

Concept and requirements: In order to achieve the desired flexibility and compatibility with other environmental sciences, the system must consist of a framework of principles rather than detailed specifications and formats, which may not stand the test of time and in any case are not fundamental to the system. Therefore, the system is defined according to the following principles:

1) The system will consist of files held by individual organisations dealing with mineral data.

2) Files within the system will be computer-based and will be designed to cater to the needs of the geologist who collects and uses the data.

3) Files within the system will be linked by entries for reference numbering and geographic location, and by common methods of coding.

4) The national index will serve as the key to the contents and location of data files.

5) Standards will be established for defining and recording data. However, the specific document for recording the data will be based on the user's requirements [8].

National Index

One of the primary functions of the national system is the selective retrieval and interchange of data. This benefit cannot be achieved without a national index to the mineral data available in India.

The following are the main points of the national index:

1) The national index is a computer-assisted index to the existing mineral data contained in unpublished or published documents.

2) The national index is a necessary component of the national system; however, it can exist independently of the system.

3) A central agency is essential to co-ordinate indexing.

4) Any organization wishing to participate in the national system will be responsible for the indexing of its own data, according to standard indexing procedures.

5) All organisations and individuals will have access to the index regardless of whether or not they choose to index their own documents.

National Indexing Procedures

The essential steps in compiling a national index to mineral deposit data are:

i) Indexing: The indexing of any document involves the description of the document and its owner, and the recognition of its significant data content.

a) The title and other bibliographic data which are sufficient to identify a document, and to show where it is stored and by whom.

b) Type of output control required for the preparation of indexes. 'Output Control' is a device in the system that enables the total 'data store' to be split into any number of discrete indexes.

c) Concepts describing geographic location and kind of data. The development of skill and consistency in indexing will depend on the establishment and maintenance of an indexing manual.

ii) Vocabulary Control: For effective indexing, it is mandatory that a system of vocabulary control be established through a thesaurus.

When a group of more specific concepts is approved, as well as the generic term with a broader meaning, reference is made to the broader and narrower terms as well as related terms to indicate the choice available. The choice made by the indexer will be a matter of his judgment.

iii) Preparation and distribution of the index: Documents being indexed will provide the input to a computer-assisted information retrieval system that will store each entry on storage media. Printed editions of the national index or part of it would be derived from these media as required, by exercising the output control. The national index can appear in several volumes. Each printed index produced will provide an alphabetic listing of each concept that has been entered into the system up to that time. The concepts recognized would appear as headings in alphabetic order, followed by numeric concepts.

A search for data would begin with a definition of concepts that refer to the kind of data sought. Because the searcher would likely be unfamiliar with the policy adopted in defining concepts in the index, he would be wise to consult the thesaurus to check usage and reduce the chance of missing an index concept that may be useful in his task.

iv) Index revision: It is inevitable that users of the index will discover inconsistencies or find data that, in their opinion, have not been properly indexed or have been missed entirely. It will be essential to provide a formal means for drawing this to the attention of the index policy group. Users should always be encouraged to suggest these improvements, including deletion of documents, by means of a special form designed for this purpose.
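The indexing steps above can be pictured with a small sketch. It is only an illustration, not the system the paper describes: the entries, document identifiers, output-control tags and thesaurus fragment are all invented for the example, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical index entries: (concept, document id, output-control tag).
# None of these values come from the paper; they only illustrate the idea.
ENTRIES = [
    ("copper", "DOC-001", "metallic"),
    ("Bihar", "DOC-001", "metallic"),
    ("limestone", "DOC-002", "non-metallic"),
    ("copper", "DOC-003", "metallic"),
    ("1975", "DOC-003", "metallic"),
]

# Hypothetical thesaurus fragment with broader, narrower and related terms.
THESAURUS = {
    "copper": {
        "broader": ["base metals"],
        "narrower": ["chalcopyrite"],
        "related": ["copper ore"],
    },
}


def build_index(entries, output_control=None):
    """Group document ids under each concept.

    When output_control is given, only entries carrying that tag are kept,
    which splits the total data store into one discrete index.
    """
    index = defaultdict(list)
    for concept, doc_id, tag in entries:
        if output_control is None or tag == output_control:
            index[concept].append(doc_id)
    return dict(index)


def printed_order(index):
    """Return headings in alphabetic order, followed by numeric concepts."""
    return sorted(index, key=lambda c: (c[0].isdigit(), c.lower()))


def suggest_terms(concept):
    """List the thesaurus alternatives a searcher might also try."""
    entry = THESAURUS.get(concept, {})
    return sorted(t for terms in entry.values() for t in terms)


if __name__ == "__main__":
    metallic = build_index(ENTRIES, output_control="metallic")
    for heading in printed_order(metallic):
        print(heading, metallic[heading])
    print(suggest_terms("copper"))
```

Filtering on the tag before grouping keeps each printed volume independent of the others, which is one plausible reading of how output control divides the data store.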

REFERENCE NUMBER

A reference number is used in documents for the identification and organisation of various items of data in a file into compartments suitable for retrieval.

There is a strong compulsion to make further use of the reference number by including data for such purposes as identifying the organization, the decade and year of assignment, the project for which it was assigned, the class of data to which it refers and even the geographic location of observed data. In minerals, a sample number is often analogous to a reference number, and to many a reference number has much in common with a library call number. Both the library and sample numbers serve to classify units as well as identify them.

For computer-processed data, the reference number may be best described as a tag which accompanies the item of data and serves to identify it in the computer's memory. It may also serve to link items of fact relating to a single observation, point or sample.

Recommended Reference Numbering Format

1) The 'file code' would identify the organization and enable the interchange of data within the context of a national system. To avoid duplication the 'file code' would be assigned by a central organization.

2) The 'project number' would be designated by the organization in accordance with internal policy or operational requirement.

3) The 'year' would be recorded, for example, as '80' for 1980.

4) The 'accession number' would be recorded as defined.
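A minimal sketch of how the four fields could be packed into, and read back from, one fixed-width reference number. The paper gives the order of the fields and a total of fourteen character positions, but not the width of each field, so the widths below (3, 4, 2 and 5) are assumptions chosen only to add up to fourteen. The class and field names are likewise hypothetical.

```python
from dataclasses import dataclass

# ASSUMED field widths. Only the field order and the 14-position total
# are taken from the paper; this split is illustrative.
FILE_CODE_WIDTH = 3
PROJECT_WIDTH = 4
YEAR_WIDTH = 2
ACCESSION_WIDTH = 5
TOTAL_WIDTH = FILE_CODE_WIDTH + PROJECT_WIDTH + YEAR_WIDTH + ACCESSION_WIDTH


@dataclass(frozen=True)
class ReferenceNumber:
    file_code: str      # assigned centrally, identifies the organization
    project_no: int     # assigned by the organization itself
    year: int           # two-digit year, e.g. 80 for 1980
    accession_no: int   # running number within the project and year

    def format(self) -> str:
        """Render the reference number as one fixed-width string."""
        if len(self.file_code) != FILE_CODE_WIDTH:
            raise ValueError("file code must be exactly 3 characters")
        if not 0 <= self.year <= 99:
            raise ValueError("year must be recorded as two digits")
        text = (
            f"{self.file_code}"
            f"{self.project_no:0{PROJECT_WIDTH}d}"
            f"{self.year:0{YEAR_WIDTH}d}"
            f"{self.accession_no:0{ACCESSION_WIDTH}d}"
        )
        if len(text) != TOTAL_WIDTH:
            raise ValueError("a field overflows its assumed width")
        return text

    @classmethod
    def parse(cls, text: str) -> "ReferenceNumber":
        """Split a fixed-width string back into its four fields."""
        if len(text) != TOTAL_WIDTH:
            raise ValueError(f"expected {TOTAL_WIDTH} characters")
        a = FILE_CODE_WIDTH
        b = a + PROJECT_WIDTH
        c = b + YEAR_WIDTH
        return cls(text[:a], int(text[a:b]), int(text[b:c]), int(text[c:]))


if __name__ == "__main__":
    ref = ReferenceNumber("GSI", 12, 80, 345)
    print(ref.format())
    print(ReferenceNumber.parse(ref.format()))
```

Keeping every field at a fixed width means the string can be split by position alone, which suits the card-oriented record layouts of the period.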

In general, a reference number should satisfy the following conditions:

i) Uniqueness

ii) Consistency

iii) Meaningfulness to the participant but without usefulness as data in a system. Where the reference number carries useful information, that information must also be recorded elsewhere as data. It is unlikely that reference numbers in the suggested context will be used for sorting or retrieval, as this is most commonly done on geographic location, class, or discipline, all of which are considered data and hence are independent of the reference number.

iv) A prefix or 'Key' should precede the reference number where an interchange of data is to occur. This prefix, assigned by a central organization, would designate the name of the participating organization and would facilitate the interchange of data.

[Figure: Recommended Reference Numbering Format. Four fields in order: 1 File code, 2 Project No., 3 Year, 4 Accession No., occupying character positions 1 to 14.]

GEOGRAPHIC COORDINATES

Location is one major factor that is common to virtually all geographical data and observations. It may also be the most frequently used reference in retrieval of data from files, and as such, must be common to files in widely divergent disciplines.

Other factors that are important in the selection of a system for defining location in data processing are that it should:

i) be national and preferably international in scope;

ii) be capable of defining both a point and an area;

iii) contain a minimum of discontinuities between individual coordinate grids and thus be useful for areal and regional computations;

iv) be amenable to use on automatic plotters;

v) be readily usable for measurement of coordinates from points plotted on a map, both by the use of a simple romer and by a digitizer;

vi) be available on current maps of India;

vii) be widely known and understood;

viii) facilitate reoccupation of a point where a sample was taken or an observation made.

GEOGRAPHIC COORDINATES (LATITUDE AND LONGITUDE)

It is assumed that these geographic or angular coordinates are so well known that a description is unnecessary. They are rarely printed as a grid on maps but appear in a graticule around the edge of maps, with intersections plotted as crosses in the body of the maps. Neatlines (inner margins) of 1:250,000 maps are divided into degrees and units of 15 minutes, whereas neatlines of 1:25,000 maps are in degrees and units of 5 minutes.

THE UNIVERSAL TRANSVERSE MERCATOR GRID

This is a world-wide system of zones, each zone being 6 degrees of longitude wide and each extending from the equator to the 80th parallel of latitude, north and south. Within each of the 60 zones there is a rectangular grid in metres. The ordinates of this grid are parallel to the central meridian of each zone and the abscissae are normal to it. Thus all squares (1,000 metre on 1:250,000 maps) are the same size throughout each zone [8].
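As a small worked illustration of the zone scheme, the sketch below finds the zone number and central meridian for a given longitude. It assumes the usual UTM convention, which the text does not spell out, that zone 1 begins at 180 degrees west and zones are numbered eastward. It ignores the special zone boundaries used around Norway and Svalbard, and the function names are hypothetical.

```python
import math


def utm_zone(longitude_deg: float) -> int:
    """Return the UTM zone number (1 to 60) for a longitude in degrees.

    Assumes zone 1 starts at 180 W and each zone is 6 degrees wide.
    """
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must lie between -180 and 180 degrees")
    zone = int(math.floor((longitude_deg + 180.0) / 6.0)) + 1
    # Longitude 180 E would otherwise fall into a non-existent zone 61.
    return min(zone, 60)


def central_meridian(zone: int) -> int:
    """Return the central meridian of a zone in degrees east of Greenwich."""
    if not 1 <= zone <= 60:
        raise ValueError("zone must lie between 1 and 60")
    return zone * 6 - 183


if __name__ == "__main__":
    lon = 77.2  # approximate longitude of New Delhi
    zone = utm_zone(lon)
    print(zone, central_meridian(zone))
```

A longitude of about 77.2 degrees east falls in zone 43, whose central meridian is 75 degrees east.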

CODING OF MINERAL NAMES AND TERMS

Coding is the arbitrary assignment of symbols, numbers or letters to ordinary written language for some particular purpose such as shortening word lengths or facilitating computer processing. During the early days of computer technology, coding was an effective means for coping with problems of limited storage capacity, slow processing speed, and the constraints imposed by an 80-column card format. Although such coding was generally justified from the computer point of view, it tended to discourage the user from both entering and retrieving information. Fortunately, coding is no longer a major hindrance. Recent developments in computer technology, including greatly increased storage capacity and processing speed, have effectively eliminated the need, so far as the computer is concerned, for coding at the input and output stages. The differences in processing numbers as against letters, or a long word as against a short word, have become vanishingly small. The advantages obtained by saving computer time and storage space through entering data in unfamiliar codes are mostly offset by factors affecting the cost and accuracy of recording the data in the first place. Moreover, the prime objective of any system should be to maximize effectiveness for the user, and the use of familiar uncoded language does this in most cases.


Factors in Coding

The essential consideration in selecting a set of either coded or uncoded names and terms that are to be distinguished by a computer is symbolic uniqueness for each name and term. Computers operate at the symbolic (syntactical) level and cannot make distinctions between symbolically identical terms (e.g. formation and formation), even though the geologist recognizes a profound semantic distinction (a body of rocks vs. a process). Conversely, a computer will distinguish between two semantically identical terms if their syntax differs. Within a given context, any coding system must provide for a set of unique symbols, letters or numbers. A desirable characteristic of a coding system for general use is that it should be mnemonic (helping, or meant to help, identification of the word being coded). This trait is common in ordinary abbreviations (codes), for example, 'ft' (feet) or 'Dev' (Devonian). Codes completely lacking mnemonic qualities generally require constant reliance on a dictionary and are therefore awkward and undesirable for the geologist [5].
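The two failure modes described here, one symbol carrying two meanings and one meaning carried by two symbols, can be checked mechanically. The sketch below is an illustration only: the code table is made up from the examples in the text plus one invented abbreviation ('fm'), and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative code table of (symbol, meaning) pairs. 'fm' is an invented
# abbreviation added to show the synonym case.
CODE_TABLE = [
    ("ft", "feet"),
    ("Dev", "Devonian"),
    ("formation", "body of rocks"),
    ("formation", "process"),
    ("fm", "body of rocks"),
]


def find_collisions(code_table):
    """Return symbols that stand for more than one meaning.

    A computer cannot tell these apart, since it compares symbols only.
    """
    meanings = defaultdict(set)
    for symbol, meaning in code_table:
        meanings[symbol].add(meaning)
    return {s: sorted(m) for s, m in meanings.items() if len(m) > 1}


def find_synonyms(code_table):
    """Return meanings written with more than one symbol.

    A computer treats these as different terms although they mean the same.
    """
    symbols = defaultdict(set)
    for symbol, meaning in code_table:
        symbols[meaning].add(symbol)
    return {m: sorted(s) for m, s in symbols.items() if len(s) > 1}


if __name__ == "__main__":
    print(find_collisions(CODE_TABLE))
    print(find_synonyms(CODE_TABLE))
```

Running both checks whenever a term is added to a file is one simple way to keep a vocabulary symbolically unique within a given context.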

Coding in the National System

A national system would consist of files held in different organizations and in various fields of minerals. This implies a minimum of centralized control and presents the undesirable possibility of different coding systems being applied to similar files across the country. Two methods of approaching this problem are possible.

1) Assigned codes for certain large, specialized classes of terms:

Certain fields of minerals use large numbers of terms involving many subtle but important semantic distinctions. Coding of these terms may be common practice because they are used repeatedly; unfortunately, the coding is often inconsistent. In such situations, the best approach usually is to have a committee of experts propose a set of assigned codes. Users of these codes would require a dictionary authorized by the committee for encoding and decoding. A good example is the 'Well Data Glossary' prepared by the Subcommittee of Well Data Retrieval Systems of the American Petroleum Institute (1966). This Subcommittee classified and coded approximately 1,300 terms. The codes recommended are more or less mnemonic, but because of the large number of symbolically and semantically closely related terms, it was impossible to use any particular systematic coding method and still maintain uniqueness for each of the terms.

2) Derived codes for smaller classes of terms common to many files:

A derived coding method generates a code directly from the full term according to some set of rules, and therefore has the advantage of allowing codes to be generated without central control (i.e. without an authorised dictionary). If the population of words being coded is reasonably small, the method will produce only a few duplicates, although some cannot be avoided. This approach would be ideal for many purposes in the national system, as any term can be independently encoded. However, it may break down at some point, when the number of duplicates becomes so large that it must evolve into an assigned system in order to maintain uniqueness. Nevertheless, it appears that a derived coding system would have many useful applications in the national system, for example, in the coding of rock names or geographic locations. Another useful application of a derived coding method is in abbreviating words in tables or other output displays where there are space limitations. The use of a standard method of abbreviation in these situations would of course improve communication.
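To make the idea of a derived code concrete, the sketch below applies one simple rule: keep the first letter, drop later vowels, and cut to four characters. This rule is an assumption chosen for illustration and is not the method recommended in the paper, and the function names are hypothetical. The example also shows how duplicates arise even in a small vocabulary.

```python
from collections import defaultdict

VOWELS = set("AEIOU")


def derive_code(term: str, length: int = 4) -> str:
    """Derive a mnemonic code from a term by a fixed, illustrative rule.

    Rule (an assumption, not taken from the paper): keep the first letter,
    drop vowels from the rest, and truncate to `length` characters.
    """
    letters = [ch for ch in term.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("term contains no letters")
    head, tail = letters[0], letters[1:]
    consonants = [ch for ch in tail if ch not in VOWELS]
    return (head + "".join(consonants))[:length]


def find_duplicates(terms):
    """Group terms whose derived codes coincide."""
    by_code = defaultdict(list)
    for term in terms:
        by_code[derive_code(term)].append(term)
    return {code: group for code, group in by_code.items() if len(group) > 1}


if __name__ == "__main__":
    names = ["granite", "basalt", "garnet", "gneiss"]
    for name in names:
        print(name, derive_code(name))
    print(find_duplicates(names))
```

Here 'granite' and 'garnet' both reduce to the same code, which is the kind of duplicate that would eventually force part of the vocabulary into an assigned scheme.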

Numeric Code for Standard Geological Time Terms

In view of the common use of geological time terms in many types of data files, there may be some advantage in having standard numeric codes available for these terms. Various numeric codes have been published (e.g. Buller, 1964, p. 882-885; Ontario Dept. E.R.M. 1965, p. 14-15) and still others probably exist in private files. This coding system has the following advantages:

i) The coding system follows the hierarchical structure of the eons, eras, and periods.

ii) The system is 'open-ended' at the lower (older) end of the time scale, which will allow for easy modification and addition to the older time terms, where future changes are most likely to occur.

iii) The three-digit codes can be expanded to four digits or more to accommodate finer subdivisions for individual purposes without prejudice to the standard aspect of the first three digits.

iv) The relative numeric values of the codes are parallel to the interval units of the time scale. Thus, the higher the code number, the older the time unit. This will allow a simple numeric sort to effect a geological time unit sort [9].
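The properties listed above can be shown with a short sketch. The code values are hypothetical placeholders and are not the published codes; only their shape follows the text: three digits, hierarchical, and larger for older units. One detail the sketch makes explicit is that locally extended four-digit codes should be compared at a common width, otherwise a plain numeric sort would place every four-digit code after every three-digit one.

```python
# HYPOTHETICAL three-digit codes, invented for illustration only.
# First digit: era; second digit: period within the era.
TIME_CODES = {
    "Cenozoic": 100,
    "Quaternary": 110,
    "Tertiary": 120,
    "Mesozoic": 200,
    "Cretaceous": 210,
    "Jurassic": 220,
    "Triassic": 230,
    "Paleozoic": 300,
}


def normalize(code: int, width: int = 4) -> int:
    """Pad a code with trailing zeros so all codes compare at one width."""
    text = str(code)
    if len(text) > width:
        raise ValueError("code is longer than the comparison width")
    return int(text.ljust(width, "0"))


def standard_part(code: int) -> int:
    """Return the first three digits, the standard part of any code."""
    return int(str(code)[:3])


def sort_youngest_first(records):
    """Sort (sample, code) records from youngest to oldest time unit."""
    return sorted(records, key=lambda rec: normalize(rec[1]))


if __name__ == "__main__":
    # 2201 is a local four-digit extension of the hypothetical Jurassic code.
    records = [("S1", 220), ("S2", 110), ("S3", 2201), ("S4", 300)]
    print([name for name, _ in sort_youngest_first(records)])
    print(standard_part(2201) == TIME_CODES["Jurassic"])
```

Because the extension only appends digits, any file that ignores the fourth digit still reads the standard three-digit unit correctly.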

SUGGESTIONS

1) Codes for geological names and terms should be used only when it is in the best interests of the geologists. Considerations involving subsequent computer operations should not influence the choice. In general, names and terms should be used in the form and context which is most familiar and convenient to the geologist.

2) Each name and term entered as data in a given context, whether coded or uncoded, must be symbolically unique if that item is to be recognized and retrieved from a computer file.

3) Two approaches to coding for a given file are possible:

(a) Assigned codes, prepared by a committee of experts, and controlled by an authorized dictionary.

(b) Derived codes, generated by a standard set of rules, requiring essentially no central control.

4) All codes, whether assigned or derived, should have high mnemonic qualities.

5) A standard numeric code for the major geological time units is suggested for use where a numeric code is desired.


CONCLUSION

This study was conducted to take the necessary steps towards developing a national system for the recording, storage and retrieval of mineral data in computer-processable form. The immediate need for such a system results from the current expansion in the volume of mineral data and the increased availability and use of computers for data storage and treatment. Standards for computer-processable files are an urgent requirement if Indian geologists are to take full advantage of the large resources of data available to them.

The system is defined by the following principles:

1) The system will consist of data files held and controlled by individual organizations.

2) Files within the system will be computer-based but oriented to the requirements of the users; computer requirements will be of secondary importance.

3) Files within the system will be linked by the use of standard methods of recording reference numbers, geographic location, and coding.

4) The index to the contents and location of data files in India, within and outside of the system, will be a computer-assisted National Index.

5) Data in files within the system will be recorded according to certain minimum standards; however, the standards for individual files may exceed these minimums depending on the user's needs.

Several steps have been taken towards developing the system. Procedures and standards with respect to reference number, geographic location and coding are suggested.

ACKNOWLEDGEMENT

I would like to record my gratitude to the authorities of the Geological Survey of India and INSDOC for providing me with the necessary help in completing this work. I am indebted to Shri R Satyanarayana, INSDOC, for his guidance and keen interest in the work.


REFERENCES

1. Bayer R & McCreight E: Organization and maintenance of large ordered indexes. Acta Informatica 1971, 1, 173-189.

2. Buller J V: A computer oriented system for the storage and retrieval of well information. Bull Canadian Petroleum Geology 1964, 12(4), 847-91.

3. Hubaux A, ed: Geological data files: survey of international activity. CODATA Bull 1972, 8.

4. Hubaux A, ed: A new geological tool - The data. Earth Science Review 1973, 9(2), 159-196.

5. Hermer M, Lenci M & Lesage M T: SIGMI: a user oriented file-processing system. Geoscience 1976, 1, 187-193.

6. Murthy M V N: National earth sciences data centre in G.S.I. Paper presented at the U.S. - India Seminar on Information Resources in Energy, Environment and National Resources, Washington D.C., 1976, Oct, 13-16.

7. Murthy M V N: Earth Sciences in India, problems and opportunities with special reference to development programmes. Paper presented at the U.S. - India Seminar, 1976, Oct, 13-16.

8. Robinson S C: Interim report of the committee on storage and retrieval of geological data in Canada.

9. Talpatra A K: Geological documentation for computer based studies. GSI News 1975, 6(5/6).
