Hosted by the University of Regina Library, December 1999

DLI Training Workshop

Chuck Humphrey

DLI Workshop --

Dec 1999 2

Day 1 : A.M.

• Review data service models within

the framework of:

access and dissemination

aggregate data and microdata

statistics versus data


Day 1 : A.M.

Hands-on work with aggregate data

• CANSIM

• E-STAT

• Census ’96

• Health Indicators Database


Day 1 : P.M.

• Microdata retrieval systems

LANDRU (UC)

• introduction with hands-on experience

ISLAND (UBC)

• introduction with hands-on experience


Day 2 : A.M.

Update to DLI since 1997

• experiences with new web

services

Spatial Data Retrieval: GEODE


Day 2 : P.M.

Other data extractors

Discussion about possible COPPUL

data access projects

Roundtable discussion about

introducing data to our reference

colleagues


Data Service Models

• Begin by discussing data service

models within the framework of

three topics:

access and dissemination

aggregate data and microdata

statistics versus data


Data Service Models

• Models were presented as a continuum

during the 1997 DLI workshop

“Order & Pass-through” Service

Install Data and Provide Access

Treat as a Collection and Provide Reference


Data Service Models

• Choose a model that matches your

staff and computing resources


Acquisition

Fill a Request

Locate data

Order data & documentation

Collection Development

Select & Locate data

Order data & documentation

Catalogue data & documentation

Install & Store (data & documentation)

Reference

Search for data

Interpret documentation

Retrieve or download data

Process data

change formats

subset cases or variables

aggregate cases

merge files

analyze data


Find a referral partner on campus


1. Access and dissemination


The Inventory Model

• In the traditional inventory model,

roughly half of the support goes to

putting items on the shelf, while the

other half goes to finding and getting

the items off the shelf.

Source: Darlene Fichter


The Access Model

• With the access model, support is

split between getting information into

a deliverable state and finding

appropriate ways of retrieving and

disseminating the information.


Access/Dissemination Issues

• managing vendor licenses

are the license conditions realistic?

what type of identification or

authentication is required?


Access/Dissemination Issues

• matching products with technology

is the product dependent on a

specific operating system?

is the product software dependent?


Access/Dissemination Issues

• determining access methods

stand-alone, LAN or WAN?

what are the finding tools?


Access/Dissemination Issues

• determining dissemination options

what are the output formats?

does the output require special

storage considerations?


The Access Model

• These issues and others about

access and dissemination will

underlie our discussions over the

next two days.


2. Aggregate data and microdata


Data Types

In the 1997 DLI workshop, time was

spent discussing differences

between aggregate data and

microdata.

Each type has an impact on data

access models.


Aggregate Data

Aggregate data consist of statistical summaries derived from original data collections and organized in tables according to the following properties:

• socio-economic phenomena

• spatial representation

• time
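As a rough illustration of how such a summary table is derived from an original data collection, the sketch below tallies individual records into counts for each combination of place, year and category. The records, the variable choice (region, year, sex) and the function name are all invented for illustration; they are not taken from any Statistics Canada product.

```python
from collections import Counter

# Hypothetical individual-level records: (region, year, sex).
# These values are invented for illustration only.
records = [
    ("Saskatchewan", 1996, "F"),
    ("Saskatchewan", 1996, "M"),
    ("Saskatchewan", 1996, "F"),
    ("Alberta", 1996, "M"),
    ("Alberta", 1991, "F"),
]


def aggregate_counts(rows):
    """Count how many records fall into each (region, year, sex) cell."""
    return Counter(rows)


table = aggregate_counts(records)

# Print one line per cell of the aggregate table.
for (region, year, sex), count in sorted(table.items()):
    print(region, year, sex, count)
```

Each key of `table` combines the three organizing properties (a phenomenon, a place and a time), and each value is the statistical summary for that cell.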


Aggregate Data

Statistical summaries

• these summaries take the form of

counts, totals, sums, averages or

percentages


Age and Sex are displayed

Spatial representation and Time are fixed

Cells contain counts


Spatial representation and Age are fixed

Year and Sex are displayed


Geography and Sex are displayed

Age and Time are fixed
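The three table views above can be thought of as slices of one multi-dimensional table: some dimensions are held fixed and two are displayed as rows and columns. The sketch below is one possible way to express that idea, assuming a small table keyed by geography, year, age group and sex. All place names, categories and counts are invented, and `slice_table` is a hypothetical helper, not part of any retrieval software discussed in this workshop.

```python
# Order of the dimensions in each key of the table.
DIMS = ("geography", "year", "age", "sex")

# Invented counts, for illustration only.
cube = {
    ("Regina", 1996, "0-14", "F"): 10,
    ("Regina", 1996, "0-14", "M"): 12,
    ("Regina", 1996, "15-64", "F"): 30,
    ("Regina", 1996, "15-64", "M"): 28,
    ("Regina", 1991, "0-14", "F"): 9,
    ("Regina", 1991, "0-14", "M"): 11,
    ("Saskatoon", 1996, "0-14", "F"): 14,
    ("Saskatoon", 1996, "0-14", "M"): 15,
}


def slice_table(cube, fixed, row_dim, col_dim):
    """Return {row_value: {col_value: count}} for cells matching `fixed`.

    `fixed` maps dimension names to the single value each is held at.
    """
    r, c = DIMS.index(row_dim), DIMS.index(col_dim)
    table = {}
    for key, count in cube.items():
        if all(key[DIMS.index(dim)] == value for dim, value in fixed.items()):
            table.setdefault(key[r], {})[key[c]] = count
    return table


# Age and sex displayed; geography and time fixed.
age_by_sex = slice_table(cube, {"geography": "Regina", "year": 1996}, "age", "sex")

# Year and sex displayed; geography and age fixed.
year_by_sex = slice_table(cube, {"geography": "Regina", "age": "0-14"}, "year", "sex")

# Geography and sex displayed; age and time fixed.
geo_by_sex = slice_table(cube, {"age": "0-14", "year": 1996}, "geography", "sex")
```

The same cells feed all three views; only the choice of fixed and displayed dimensions changes.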


Aggregate Data

Aggregate data products

• usually stored as a series of related

tables in some type of database

structure requiring special retrieval

software (examples from STC include

C86, C91, CBP, CANSIM, etc.)


Microdata

Microdata are

• usually anonymised records of actual respondents from a survey

• unsummarized, i.e., observations in the form in which the data were collected

• in a raw format requiring some form of processing, typically a flat ASCII file

Microdata: Cases 3 & 4 from the GSS 2 Main File

0000312141100119820012122222210020982001212222224011

21111241112121112205020197111971021212222225211026121

2043001409557204113130221119999019787878797022214112

7141240031500061661123222222222111117262616221222266

6666636212000000020320222224222000022204141101101102

1111111221110000002100000000021000000000100000000002

00000423300200200100000100200

0000411001100111011021222222210020092002122222220211

11111231212111211208120193811938044122222221111052201

203901007504721031191012233520406058787870304221303

4207083004000014200071112221222117215756565655555556

66666656565000555500210222111111110000001111100001101

1122121221110110101100001101011000000000000000000000

00000000000000000000000000000

Microdata: First 14 Cases from the GSS 2 Episode File

000041144504000800024010000000012518733

000041144308000900006011222220012518733

000041141709000930003031222220012518733

000041141709301100009031222220012518733

000041141211001330015011222220012518733

000041149113301630018011222220012518733

000041141216301800009011222220012518733

000041143018002000012031222220012518733

000041147920002015001541222220012518733

000041143720152130007531222220012518733

000041147921302145001542221220012518733

000041144321452200001512221220012518733

000041147522002300006012221220012518733

000041144523002800030010000000012518733
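A flat ASCII file like this only becomes readable once each record is cut into fields at known column positions. The sketch below shows the general technique on the first episode record listed above. The column positions and field names are guesses made by eye from the sample (the digits look like a case number followed later by a start time, an end time and a duration in minutes); they are not taken from the GSS codebook, which is the only authoritative source for the real layout.

```python
# Hypothetical layout: (field name, start, end) as 0-based slice positions.
# Inferred by eye from the sample records, NOT from the official codebook.
LAYOUT = [
    ("case_id", 0, 5),
    ("start_time", 10, 14),
    ("end_time", 14, 18),
    ("minutes", 18, 22),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of string fields."""
    return {name: line[start:end] for name, start, end in layout}


# First record of the episode file sample.
sample = "000041144504000800024010000000012518733"
fields = parse_record(sample)
print(fields)
```

Keeping the layout in a separate table mirrors how the documentation is used in practice: the record description says where each variable sits, and the program simply follows it.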


Impact on Data Access

Aggregate data have been processed and organized in a database structure

• must locate the table with desired data

• must deal with each database structure

• must deal with accompanying retrieval software


Impact on Data Access

Microdata must be processed or subset for subsequent analysis

• must identify desired variables and cases (data documentation)

• must deal with the raw data file structure

• must address the issue of desired formats
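A minimal sketch of these three steps, assuming the raw file has already been read into one record per case: pick the desired variables and cases, then write the subset out in a different format (here, comma-separated text). The variable names, values and helper functions are hypothetical and serve only to illustrate the idea.

```python
import csv
import io

# Hypothetical parsed records; variable names and values are invented.
records = [
    {"case_id": "00001", "age": 34, "sex": "F", "province": "SK"},
    {"case_id": "00002", "age": 71, "sex": "M", "province": "AB"},
    {"case_id": "00003", "age": 52, "sex": "F", "province": "SK"},
]


def subset(rows, keep_vars, case_filter):
    """Keep only the cases passing `case_filter` and the variables in `keep_vars`."""
    return [{var: row[var] for var in keep_vars} for row in rows if case_filter(row)]


def to_csv(rows, fieldnames):
    """Write the rows as comma-separated text and return it as a string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Subset cases (Saskatchewan only) and variables (case_id and age).
sk_rows = subset(records, ["case_id", "age"], lambda row: row["province"] == "SK")
csv_text = to_csv(sk_rows, ["case_id", "age"])
print(csv_text)
```

In a real service the list of variables and the case selection would come from the data documentation and the user's request.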


Impact on Data Access

• These and other differences between aggregate data and microdata will also be part of our discussions about data access.


3. Statistics versus data


Statistics versus Data

The term statistics is commonly used

to describe the numeric summaries,

such as counts, totals, sums and

averages, that people use to make a

point in a study or report.


Statistics versus Data

The term data refers to numeric files containing a collection of raw information with many observations that can be analyzed from a variety of perspectives.


Statistics versus Data

Typically, generalizations are drawn from analyses of a data file.

For example, the information provided by all of the individuals in a survey is considered to be data, while the percent of respondents in a survey with a university degree is a statistic.
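The distinction can be made concrete with a small sketch: the list of individual responses is the data, and the single number computed from it is the statistic. The four responses and the helper name below are invented for illustration, and a real survey estimate would normally involve more than a simple unweighted percentage.

```python
# The data: one record per respondent (invented for illustration).
respondents = [
    {"id": 1, "education": "university degree"},
    {"id": 2, "education": "high school"},
    {"id": 3, "education": "university degree"},
    {"id": 4, "education": "college diploma"},
]


def percent_with(rows, variable, value):
    """Return the percentage of rows whose `variable` equals `value`."""
    matching = sum(1 for row in rows if row[variable] == value)
    return 100.0 * matching / len(rows)


# The statistic: a single summary number drawn from the data.
pct = percent_with(respondents, "education", "university degree")
print(pct)
```

The same data file could yield many other statistics, which is what it means for data to be analyzable from a variety of perspectives.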


Blurring Statistics and Data

In the print world, statistical information is usually found in statistical abstracts, census monographs and serial publications by government agencies.


Blurring Statistics and Data

In the digital world this numeric information is now appearing with electronic table access on CD-ROM, the Internet, or in electronic journals.

Many aggregate data products now

fall in this category.


Blurring Statistics and Data

In other instances, the responses in the microdata file of a survey may provide the answer to a statistical question.

• For example, the percentage of the population in Canada with high blood pressure may be determined from the National Population Health Survey.


Impact on Data Access

• The use of aggregate data products and microdata files to answer statistical questions will also contribute to our discussions about data access.


Context for Aggregate Data

• Simplifying access to aggregate data is partially driven by a desire to use these products to answer general statistics questions.

• The demand for facts and figures at the reference desk remains constant or steadily increases.


Aggregate Data Challenges

• The challenges of creating access to aggregate data were summarized earlier.

finding a table with the desired statistics

dealing with each database structure

coping with a variety of retrieval software


DLI Aggregate Data Sources

• Four major DLI aggregate data products have been chosen for this workshop.


DLI Aggregate Data Sources

• CANSIM is a major source for economic and social data that are organized in time series.


DLI Aggregate Data Sources

• E-STAT offers a simplified interface to a selected subset of CANSIM and some additional aggregate data that have been identified as useful in teaching.


DLI Aggregate Data Sources

• The 1996 Census aggregate data files are a particularly important collection because these electronic tables contain more statistical information than is available for the Census in print.


DLI Aggregate Data Sources

• The Health Indicators Database is a compilation of tables from several sources to provide a single-access tool about health status in Canada.


DLI Aggregate Data Sources

• These four aggregate data products are good candidates for every DLI member to have.

Now let’s turn to the comparative worksheet on these four products.