In: Pellikka, P., J. Ylhäisi & B. Clark (eds.) Taita Hills and Kenya, 2004 – seminar, reports and journal of a field excursion to Kenya. Expedition reports of the Department of Geography, University of Helsinki 40, 47-52. Helsinki 2004, ISBN 952-10-2077-6, 148 pp.

Geodatabase over Taita Hills, Kenya

Anna Broberg & Antero Keskinen

Abstract

This article introduces the basics of geographical information systems (GIS) and explains how the Taita Hills project can benefit from them. A GIS is used to store and manipulate geographical data, which it holds in databases in either raster or vector format. Databases are widely used because of their flexibility in managing data. At this stage the Taita Hills database contains only a limited number of themes, such as roads and administrative area boundaries, but it will be extended as the need arises and as more data is collected and produced.

What is a geographical information system?

Geographical information systems (hereafter GIS) are computer-based systems used to store and manipulate geographic information. A GIS is designed for the collection, storage and analysis of objects and phenomena where geographic location is an important characteristic or critical to the analysis (Aronoff 1995:1). Most information is spatial in some way, so it is important to be able to place it in its geographical context. The revolution in information technology has made it possible to build computer systems that handle geographical information. GIS technology provides relatively instant access to the information via computer terminals, which allow locational data to be displayed and updated directly on the screen. When geographical data is in digital form, it can be displayed and analysed in ways that are often much quicker and more effective than manual techniques allow (Jones 1999:6).

According to several sources (Aronoff 1995:39; Jones 1999:7; Kraak & Ormeling 1996:9-10), a computer system must provide a certain set of capabilities for handling georeferenced data in order to qualify as a true GIS. Through data input, existing georeferenced datasets such as digital maps, aerial photographs and satellite images are added to the system. Data can also be added by digitising maps and documents.

The added data is usually managed using the database approach. The core of a GIS, however, is the manipulation and analysis of data. Examples of these capabilities include data processing; interpreting and classifying surveyed data; searching by location, class or attribute; and modelling real-life processes or the characteristics of a dataset. Analysis and modelling turn data into information: the data is no longer just a set of values in a dataset but tells us something more than the raw data alone. Through data output (usually a computer display or printed maps) the information is passed on to the user or to a wider audience. In this communication process the visualisation of the information is a very important factor. GIS are indeed widely used to create maps, either conventional paper maps or digital maps, e.g. for Internet use. Data output is also significant for exploring data: producing several visualisations of a dataset can reveal a lot of new things about it.

Why use GIS in the Taita project?

The advantages of using GIS instead of traditional maps are considerable. GIS data is maintained in digital format and is therefore more physically compact than paper maps, tabulations or other conventional forms (Aronoff 1995:43). A very large amount of data can be stored for each location, whereas on conventional maps the scale always limits the amount of information that can be presented.

In addition, the ability to manipulate spatial data and the corresponding attribute information, and to integrate different types of data in a single analysis, is unmatched by any manual method (Aronoff 1995:43). With conventional maps, only visual analysis by overlaying several maps offers similar capabilities. GIS also keeps the storage of data separate from its visualisation, which makes it possible to produce different kinds of maps from the same data for diverse uses or user groups.

The Taita project involves many varied assignments with tasks relating to remote sensing, geoinformatics or map production. Here a common geographical database can be of great help. It will act as reference data for several datasets produced during the project, such as classifications of the satellite imagery and aerial photographs. It must therefore be accurately georeferenced and placed in a coordinate system so that further datasets can be added to it. In addition, visualisations and maps for different purposes can easily be produced during the project from the datasets already made.

Making the database: technical solutions and contents

Modelling a database

A spatial database can be thought of as an integrated set of geographic information on a particular subject and area (Longley et al. 2002:226). A database consists of a digital representation of the Earth's surface or near surface, built to serve some problem-solving or scientific purpose (Longley et al. 2002:18). A database can be built for one major project, such as the Taita project, or it can be continuously maintained to serve day-to-day services and transactions. Depending on the purpose, a database can vary in size from a megabyte to several petabytes (Longley et al. 2002:18). The geodatabase over Taita Hills is still quite small, but it will be extended and maintained over the course of the project as more information is acquired.

A database is not only a collection of information about things; it is also, crucially, a representation of their relationships to each other. The objective in collecting and maintaining information in a database is to relate facts and situations that were previously separate (Aronoff 1995:151). In a database all the data is handled in the same context, which allows far more comprehensive analysis and manipulation of the data. In general, the database approach to storing geographic data offers a number of advantages over traditional file-based datasets. For instance, collecting and organising data at a single location reduces redundancy and duplication and lowers maintenance costs. Furthermore, applications become independent of the data, so that they can evolve separately over time while using the same data. Moreover, a database remains constant, which makes it easier to transfer user knowledge between applications (Date 1995; Longley et al. 2002:226). The database over Taita Hills is an important means of storing, sharing and using the data collected about the area during the project.

When modelling a database, three successive stages, each at a different level of abstraction, can be distinguished: conceptual, logical and physical modelling (see Longley et al. 2002:233; Jones 1999:163). In conceptual modelling, object types (or classes, such as roads and forests) and their relationships are defined, and a suitable geographic representation is selected for them (Longley et al. 2002:234). Then a logical model, that is, the type and structure of the database, is chosen. Of these the relational structure is the most widely applied. It is based on a simple and powerful idea whereby information is represented by reasonably simple tables, which are subjected to various relational operators (Jones 1999:175). With the aid of mechanisms such as primary keys and relational joins, the relational structure often operates very effectively. Information from various sources can thus be linked together and analysed in the same context, enabling a number of operations that would be impractical without the relational structure.
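As a rough illustration of primary keys and relational joins, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names and values are hypothetical and are not taken from the Taita Hills database; they only show how two simple tables can be linked through a key.

```python
import sqlite3

# Hypothetical schema and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE division (
        division_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE sublocation (
        area_number INTEGER PRIMARY KEY,
        area_name   TEXT NOT NULL,
        division_id INTEGER NOT NULL REFERENCES division(division_id)
    );
    """
)
conn.executemany(
    "INSERT INTO division VALUES (?, ?)",
    [(1, "Division A"), (2, "Division B")],
)
conn.executemany(
    "INSERT INTO sublocation VALUES (?, ?, ?)",
    [(101, "Area 101", 1), (102, "Area 102", 1), (201, "Area 201", 2)],
)


def sublocations_in(division_name):
    """Return (area_number, area_name) rows that belong to one division."""
    query = """
        SELECT s.area_number, s.area_name
        FROM sublocation AS s
        JOIN division AS d ON d.division_id = s.division_id
        WHERE d.name = ?
        ORDER BY s.area_number
    """
    return conn.execute(query, (division_name,)).fetchall()


print(sublocations_in("Division A"))
```

The join links each sub-location row to its division through the shared division_id value, so the two tables can be maintained separately and still be analysed together.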
In its simplest form, a relational database can be built using, for example, ArcView shapefiles. The organisation of the data files can be described in terms of records, fields and keys. Each entity (for example, each polygon representing an administrative area) has its own row, or record, in a shapefile. A record is divided into fields, each of which contains an item of data (e.g. area name, area number). One or more of the fields is designated as a key, a field by which the record is retrieved from the data file. In this case the key could be the area number. Fields that are not designated as key fields, and that contain any other kind of data connected to the entity, are termed attribute fields (Aronoff 1995:155).

The final stage in designing a database is physical modelling, that is, defining how the data will be loaded into the database. It is the definition of the actual physical database schema that will hold the data values. Several definition languages, such as SQL, are used to build the physical model of a database. It is at this level that data items are defined in terms of parameters such as the number of bytes of storage they occupy and the address at which they are to be found in physical memory (Jones 1999:164).

Choosing between raster and vector data

Choosing the types of geographical representation is a critical database design task. Two spatial data types, raster and vector, are used to represent aspects of the real world (Figure 1). The raster data model is associated with continuous, field-type phenomena such as elevation, and is based on an array of cells, each with its own value. The raster data model has the great virtue of simplicity, but it can produce very large files. The accuracy with which a raster layer can represent spatial data is related to the pixel size: the smaller the pixel, the greater the accuracy. However, finer grids contain more cells and therefore need more space for storage and data handling.

The vector data model represents objects in three forms: points, lines and polygons. It is usually associated with phenomena that are discrete objects by nature. All objects are built from points, whose coordinates are stored, and from more complex combinations of these. Vector modelling represents discrete objects precisely, is efficient in storage and offers functional tools for a variety of operations (Longley et al. 2002:189).
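The trade-off between pixel size, accuracy and storage can be sketched with a small example. The code below is only an illustration under assumed values: a hypothetical rectangular feature inside a 100 x 100 unit area is stored once as a vector polygon (four vertices) and then as rasters with three different cell sizes, using a simple cell-centre rule that is an assumption of this sketch.

```python
# Hypothetical rectangular feature, stored as a vector polygon (4 vertices).
polygon = [(22, 18), (58, 18), (58, 47), (22, 47)]


def polygon_area(vertices):
    """Exact polygon area from its vertex coordinates (shoelace formula)."""
    total = 0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def rasterize(xmin, ymin, xmax, ymax, extent, cell_size):
    """Grid of 0/1 cells; a cell is 1 if its centre lies inside the rectangle."""
    n = extent // cell_size
    grid = []
    for row in range(n):
        y = (row + 0.5) * cell_size
        grid.append([
            1 if xmin <= (col + 0.5) * cell_size <= xmax and ymin <= y <= ymax else 0
            for col in range(n)
        ])
    return grid


def raster_area(grid, cell_size):
    """Area covered by the cells flagged 1."""
    return sum(sum(row) for row in grid) * cell_size ** 2


exact = polygon_area(polygon)
for cell_size in (10, 5, 2):
    grid = rasterize(22, 18, 58, 47, extent=100, cell_size=cell_size)
    cells = sum(len(row) for row in grid)
    area = raster_area(grid, cell_size)
    print(cell_size, cells, area, abs(area - exact))
```

In this sketch the number of stored cells grows from 100 to 2500 as the cell size shrinks from 10 to 2 units, while the error in the rasterised area falls; the vector version stores only four coordinate pairs and reproduces the area exactly.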

Contents of the Taita database

The information in the database over Taita Hills is based mainly on maps, satellite images and aerial photographs acquired from different sources. The maps of the area are at a scale of 1:50 000 and date from 1991. The SPOT XS satellite image from 1987 is presented in Figure 2. The maps were converted to digital format by scanning. To make them spatially functional, they were then registered to the Transverse Mercator projection used in the area, which is also the projection of the original paper maps.

Digital aerial photography was acquired in spring 2003. It produced a large number of separate images, which will be mosaicked to give a wider view of the area. A digital elevation model can also be generated from the images using terrain distortion. All these processed data will be used to extend and update the database. Classifications will be made from the images, which will provide more accurate information about the area for features such as settlements, new roads, forests and cultivated land.

The database over Taita Hills will include data in both raster and vector formats. The vector data model is used to represent discrete objects such as roads, rivers and administrative area boundaries. Railways, lakes, water points and airstrips are also built using the vector data model. The vector shapefiles are constructed using ArcView software. After conversion, the files are also suitable for use in other software, such as MapInfo. The raster data model will be used at a later stage to include image files produced with software packages for remotely sensed images or aerial photographs, such as Erdas Imagine.

We digitised several feature classes from the scanned maps. Roads (digitised as lines) were classified into five categories, and the numbers expressing the classes were saved as an attribute. This enables the creation of visualisations based on the attributes (Figure 3).
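The idea of storing a class number as an attribute and deriving the visualisation from it can be sketched as follows. The road records, the meaning of each class number and the drawing styles are all hypothetical; the article only states that five classes were used and saved as an attribute.

```python
from collections import defaultdict

# Hypothetical drawing style for each of the five class numbers.
ROAD_STYLES = {
    1: {"width": 3.0, "dashed": False},
    2: {"width": 2.5, "dashed": False},
    3: {"width": 2.0, "dashed": False},
    4: {"width": 1.5, "dashed": True},
    5: {"width": 1.0, "dashed": True},
}

# Hypothetical digitised road lines with the class saved as an attribute.
roads = [
    {"id": 1, "road_class": 1, "coords": [(0, 0), (5, 2)]},
    {"id": 2, "road_class": 3, "coords": [(5, 2), (7, 6)]},
    {"id": 3, "road_class": 5, "coords": [(7, 6), (8, 9)]},
    {"id": 4, "road_class": 3, "coords": [(5, 2), (9, 1)]},
]


def group_by_class(features, attribute="road_class"):
    """Group feature ids by the value of a classification attribute."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[attribute]].append(feature["id"])
    return dict(groups)


def style_for(feature):
    """Look up the drawing style from the stored class number."""
    return ROAD_STYLES[feature["road_class"]]


print(group_by_class(roads))
print(style_for(roads[0]))
```

Because the style is looked up from the attribute rather than stored with the geometry, the same road data can be drawn differently for another map simply by swapping the style table.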

Figure 1. Comparison of vector and raster data models. Panels: feature in raster format, feature in vector format, feature in real life.

Figure 2. SPOT XS satellite image from January 1987 showing the Taita Hills on the left and Sagala Hills on the right.

Figure 3. A simple visualisation showing the different road classes.

Figure 4. Every sub-location has its own number and relates to one of the divisions.

Later it might be useful to build topology for the road network. A topologically structured format is designed to encode geographic information in a form better suited to spatial analysis and other geographic studies (Aronoff 1995:113). Vector-based models enable the determination of the spatial relationships between the components. Connectivity and adjacency are examples of topological relationships (Jones 1999:32). This means that, for example, analyses of movement along the road network could be made.
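A minimal sketch of what such a network analysis could look like is given below. It assumes the road lines have already been split into segments that meet at shared nodes; the node names and segment lengths are hypothetical. Connectivity is stored as an adjacency list, and the shortest route is found with Dijkstra's algorithm, which is one common choice and not a method named in the article.

```python
import heapq
from collections import defaultdict

# Hypothetical road segments: (from_node, to_node, length_km).
segments = [
    ("A", "B", 4.0),
    ("B", "C", 3.0),
    ("A", "C", 9.0),
    ("C", "D", 2.0),
]


def build_topology(segments):
    """Record which nodes are connected, and by what length, in both directions."""
    adjacency = defaultdict(list)
    for start, end, length in segments:
        adjacency[start].append((end, length))
        adjacency[end].append((start, length))
    return adjacency


def shortest_route(adjacency, origin, destination):
    """Dijkstra's algorithm: return (total_length, node_path) or None."""
    queue = [(0.0, origin, [origin])]
    visited = set()
    while queue:
        distance, node, path = heapq.heappop(queue)
        if node == destination:
            return distance, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in adjacency[node]:
            if neighbour not in visited:
                heapq.heappush(queue, (distance + length, neighbour, path + [neighbour]))
    return None


network = build_topology(segments)
print(shortest_route(network, "A", "D"))
```

Without the stored connectivity the segments would be unrelated lines; with it, the route from A to D through B and C can be compared with the direct but longer segment from A to C.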

Administrative area boundaries were digitised as polygons, first creating the highest-level boundaries and then splitting these into smaller units using the lower-level boundaries. For many purposes, meanings are standardised in the form of classifications consisting of a set of categories. Categories are often grouped or classified into successively higher levels, which enables phenomena to be referred to at different levels of abstraction (Jones 1999:18). We used this hierarchical structure for the administrative regions, which were classified at three levels. The administrative regions can thus be examined at various levels of detail depending on the scale preferred (Figure 4). In addition, Tsavo National Park, airstrips and some lakes are included in the database as polygon shapefiles. All the shapefiles were registered to the correct projection.
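The benefit of the hierarchical classification can be sketched by aggregating an attribute from the lowest level up to the higher ones. The records below are hypothetical, and the level names used here (division, location, sub-location) are an assumption; the article states only that three levels were used and that each sub-location relates to a division.

```python
# Hypothetical sub-location records with their parent units and an area value.
sublocations = [
    {"number": 1, "location": "L1", "division": "D1", "area_km2": 12.0},
    {"number": 2, "location": "L1", "division": "D1", "area_km2": 8.0},
    {"number": 3, "location": "L2", "division": "D1", "area_km2": 10.0},
    {"number": 4, "location": "L3", "division": "D2", "area_km2": 7.5},
]


def aggregate(records, level, value="area_km2"):
    """Sum a numeric attribute over all records sharing the same parent unit."""
    totals = {}
    for record in records:
        key = record[level]
        totals[key] = totals.get(key, 0.0) + record[value]
    return totals


print(aggregate(sublocations, "location"))
print(aggregate(sublocations, "division"))
```

Because every sub-location carries the identifiers of its parent units, the same records can be summarised at whichever level suits the scale of the map or analysis.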

In this phase of the project only some point features (waterholes) were included in the database. More themes can be added as the need arises. We also found that it is of little use to digitise very many features, because many of the things of interest are obtained more easily and accurately from the classifications carried out using Erdas Imagine. In conclusion, we would like to emphasise that this stage of database construction was carried out only to create a reference dataset to support further research. As information is added to the database, the possibilities for analysis also increase.

References

Aronoff, S. (1995). Geographic information systems: a management perspective. 4th printing. 294 p. WDL Publications, Ottawa.

Date, C.J. (1995). Introduction to Database Systems. 6th edition. 839 p. Addison-Wesley, Reading, Massachusetts.

Jones, C. (1999). Geographical Information Systems and Computer Cartography. 3rd printing. 319 p. Addison Wesley Longman Limited, Singapore.

Kraak, M.-J. & F. Ormeling (1996). Cartography: visualization of spatial data. 222 p. Longman, Harlow.

Longley, P.A., M.F. Goodchild, D.J. Maguire & D.W. Rhind (2002). Geographic information systems and science. 3nd edition. 454 p. John Wiley & Sons, Chichester, England.

