This PDF is a selection from an out-of-print volume from the National Bureau of Economic Research

Volume Title: Annals of Economic and Social Measurement, Volume 1, number 2

Volume Author/Editor: Sanford V. Berg, editor

Volume Publisher: NBER

Volume URL: http://www.nber.org/books/aesm72-2

Publication Date: April 1972

Chapter Title: Social Science Computer at the University of Wisconsin: SIMS and SEOSYS

Chapter Author: Max E. Ellis

Chapter URL: http://www.nber.org/chapters/c9197

Chapter pages in book: (p. 237 - 248)

SOCIAL SCIENCE COMPUTING AT THE UNIVERSITY OF WISCONSIN: SIMS AND SEOSYS

BY MAX E. ELLIS

INTRODUCTION

For the past three years, the Data and Computation Center for the Social Sciences (DACC) at the University of Wisconsin has been engaged in developing software for social science applications. The main effort has been research and development of systems for describing and processing hierarchical data files. Emphasis has been placed on the design of user languages for describing data already in machine readable form and on the development of efficient algorithms and systems for retrieval and editing of large data files. Two such systems are described in this paper. SIMS, a Social Science Information Management System, is now under development and is our ultimate goal in providing the social scientist with a complete modular and transportable system for processing complex structured files. SEOSYS, the Survey of Economic Opportunity System, is a system developed specifically for retrieval of information from the Survey of Economic Opportunity data files and has been used as a model for the design and implementation of SIMS.

The University of Wisconsin has a Univac 1108 system with batch terminals at remote sites throughout the University. DACC has a Univac 9200 computer serving as an Input/Output terminal to the 1108. The 9200 communicates with the 1108 via coaxial cable and provides card I/O and printing at the social science building site. Magnetic tape files are stored at the central 1108 site and are accessible to all remote terminals. The 1108 hardware configuration consists of a central processing unit, 4 memory units of 65K 36 bit words each, 2 Fastrand II drum storage devices consisting of 22 million words each, 4 flying head drums consisting of 262K words each, 10 tape drives, a printer, card reader, punch and the communication devices to handle the more than 10 remote batch terminals.

The minimum computer system configuration in which SIMS can operate must have the following attributes:

--A multi-processing capability with facility for creation and execution of a job control stream from a user program.

--An ANSI Fortran IV or a comparable Fortran compiler which, through Fortran system routines or special routines called from a Fortran program, allows I/O to a random access device such as drum or disk. Also needed are I/O functions comparable to the UNIVAC or CDC Fortran BUFFERIN, BUFFEROUT, DECODE and ENCODE [3, 5].

--Provides users with an equivalent of 50K 36 bit words or greater core for the program and common block and at least an equivalent of one million 36 bit words of random storage.

--Allows collection or mapping of precompiled relocatable routines, routines compiled at execution and labelled common blocks.

--A compiler for ANSI Cobol.

--At least 3 tape units are required for certain processing functions.


SEOSYS, described in the last section, is written in Fortran and only requires the hardware normally made available to standard Fortran programs. Since all SEOSYS I/O is tape, no use is made of the random storage device. The size of SEOSYS is well within the limitation of 65K words set by the Fortran compilers.

SIMS: A SOCIAL SCIENCE INFORMATION MANAGEMENT SYSTEM

SIMS incorporates a number of integrated processing functions for the complete processing of simple and complex data files consisting of fixed length data items. Facilities exist for describing hierarchical structured files which are already in machine readable form and for the complete editing of such files [2]. These two basic functions are complemented by a series of analytical functions such as cross-tabulation, correlation, etc. The modular construction of the system enables additional analytical routines to be added, including user supplied Fortran subroutines. The user oriented command language of SIMS provides the social science researcher with an interface to the system which is familiar to him. The syntax and semantics of this language may be easily altered by a programmer to handle any idiosyncrasies in the terminology used by a particular class of users, or to change the user interface entirely to conform to users other than the social scientist.

Figure 1 is a sample SIMS request with explanations of the input statements. It provides a general feel for the system and some properties of the language. This example combines a number of different processing functions in one request or job. The user has survey data on cards and is using SIMS to "familiarize" himself with his data. Assume this is the first time the data has been processed by the computer. In a single SIMS run, the researcher can describe the data (*DESCRIPTION), validate and perform consistency checks on data items (*EDIT) and produce some preliminary cross-tabulations (*CROSSTABS).

An input request may be catalogued and retrieved at a later date for updating or execution. The file description may be entered into the SIMS library and stored in machine readable form. When the file described is referenced in subsequent runs (using the *INPUT statement) the file's description is automatically retrieved and made available to the SIMS retrieval and analytical routines.

Initially SIMS will be limited in its statistical analysis capability since this type of processing is readily available via other systems or generalized routines and the file handling features of SIMS provide for complete editing, reformatting and extracting of data for such statistical programs. The main objective of SIMS is to provide a researcher with a file processing tool that he can use without the aid of a programmer. Figure 2 is a list of the commands for the first SIMS system. Details on the parameters of each statement are not given but the brief descriptions of each should serve to summarize the features of SIMS.

The first version of SIMS is scheduled for release by the end of 1972. This version will be batch operational and will run under the EXEC 8 operating system of the Univac 1108. Most routines have been written in ANSI Fortran IV or Cobol with additional DACC Fortran coding standards applied [1].

A generalized system for implementing applications software systems has been developed for the implementation of SIMS. LENS (Language intErface with Natural Semantics) [4] is a system which writes or generates programs from input

Figure 1. Sample SIMS request with explanation of statements. Passages that cannot be read in the scanned figure are marked [illegible].

SAMPLE SIMS REQUEST

    *BEGIN USER=SMITH, ACCOUNT=2908, MODE=PROD, RUN-ID=1971-SURVEY-TABLES
    *TITLE ANALYSIS OF SURVEY DATA
    *INPUT 1971-SURVEY
    *EDIT, TYPE=OBSERVATIONS, MAX-ERRORS=100
      VALIDATE VARIABLES SEX [illegible], AGE, OCCUPATION (HEAD)
      CHECK, IF (SEX (HEAD) .IS. MALE .AND. AGE (HEAD) .GT. 21 .AND. [illegible] 2000)
      CHECK, IF (SEX (HEAD) .IS. MALE .AND. V6 .EQ. 1 .AND. SEX (SPOUSE) .IS. FEMALE)
    *CROSSTABS CELLS ARE FREQUENCIES, ROW-PERCENT, COLUMN-PERCENT
      TABLE ROW=OCCUPATION, COLUMN=SEX (HEAD)
      TABLE, ROW=OCCUPATION, COLUMN=WORK-CODE, PAGE=SEX (HEAD), CELLS ARE FREQUENCIES
    *DESCRIPTION, FILE-NAME = 1971-SURVEY
      ABSTRACT: 1971 SURVEY OF HEADS OF HOUSEHOLDS IN S.E. WISCONSIN [illegible]. DATA OBTAINED FROM DEPT. OF WELFARE.
      STORAGE-DESCRIPTION: STORAGE-DEVICE = CARDS
      RECORD-IDENTIFICATION = CARD-NO
      OBSERVATION-IDENTIFICATION [illegible] HEAD-NUMBER
      RECORD-DESCRIPTION: NAME = HEAD, CARD-NO = 1
        VARIABLE 1: NAME = HEAD-NUMBER, FORMAT = 1/I4
        VARIABLE 2: NAME = CARD-NO, FORMAT = 5/I1
          BOUND 1: NAME = HEAD-CD, VALUE = 1
          BOUND 2: NAME = SPOUSE-CD, VALUE = 2
          BOUND 3: NAME = HEAD-CASH, VALUE = 3
        VARIABLE 3: NAME = SEX, FORMAT = 6/I1
          BOUND 1: NAME = MALE, VALUE = 1
          BOUND 2: NAME = FEMALE, VALUE = 2
        VARIABLE 4: NAME = AGE, FORMAT = 7/I2
          DETAIL = 00 IMPLIES NO AGE GIVEN
        VARIABLE 5: NAME = RACE, FORMAT = 9/A5
        VARIABLE 6: NAME = MARITAL-STAT, FORMAT = [illegible]/I1
          BOUND 1: NAME = MARRIED, VALUE = 1
          BOUND 2: NAME = SINGLE, VALUE = 2
        VARIABLE 7: NAME = WORK-CODE, FORMAT = 15/I1
          BOUND 1: NAME = NOT-WORKING, VALUE = 0
          BOUND 2: NAME = WORKING, VALUE = 1
        VARIABLE 8: NAME = OCCUPATION, FORMAT = 16/I2
          DETAIL = NOT ALL OCCUPATIONS ARE GIVEN
          BOUND 1: NAME = BRICKLAYER, VALUE = 1
          BOUND 2: NAME = CARPENTER, VALUE = 2
          BOUND 3: NAME = OTHER, VALUE = 3-98
          BOUND 4: NAME = MISSING, VALUE = 99

EXPLANATION OF STATEMENTS

*BEGIN: This is a production run for SMITH; the input stream is catalogued under the given account and the run identification 1971-SURVEY-TABLES.

*TITLE: This title appears on all pages of printer output.

*INPUT: Input is the 1971-SURVEY file described under *DESCRIPTION.

*EDIT: Validate the codes for the variables listed and perform the consistency checks stated. Continue until MAX-ERRORS = 100. Check each entry or observation and print an error message if the expression is false.

*CROSSTABS: Produce the following two contingency tables giving frequencies of occurrence (or counts) and percentages. The second table is 3-dimensional. For table 2 the global parameters of the *CROSSTABS statement are overridden by CELLS ARE FREQUENCIES.

*DESCRIPTION: The survey file is on cards with 1 to 3 cards per observation or entry depending on whether a spouse is present and if the head worked. Cards are identified by CARD-NO, and observations by HEAD-NUMBER. Card 1 is HEAD info., Card 2 is SPOUSE and Card 3 is income info. of HEAD. The statements between *DESCRIPTION and *DATA are substatements of the data description.

The FORMAT is the "starting column"/"Fortran format". The BOUND is the code or value of a variable or item. The HEAD-NUMBER appears in Cols. 1-4 of every card or record. The CARD-NO is in Col. 5 of every card. VALUES may be referenced by their name, e.g. SEX .IS. MALE. VARIABLES may be referenced by their 12 char. name or unique number. A detailed description of a variable may be given and continued on additional cards if necessary (e.g. AGE on the left).

MARITAL-STAT indicates if a SPOUSE card should be present. If SPOUSE is present and this VALUE = 2 then a validation error will be indicated. WORK-CODE indicates if a HEAD-CASH card is present. Only one SPOUSE card and one HEAD-CASH card may appear for a HEAD. This is stated in the STRUCTURE-DESCRIPTION. If OCCUPATION was not given a MISSING VALUE of 99 was assigned.


Figure 1 (Continued). Passages that cannot be read in the scanned figure are marked [illegible].

SAMPLE SIMS REQUEST

      RECORD-DESCRIPTION: SPOUSE 2 (SAME AS HEAD RECORD VARIABLES [illegible])
        VARIABLE 9: HEAD-NUMBER 1/I4 WHO SPOUSE BELONGS TO
        VARIABLE 10: CARD-NO 5/I1 CARD/RECORD IDENTIFICATION
          BOUND 1: HEAD-CD 1 HEAD DEMOGRAPHIC INFO.
          BOUND 2: SPOUSE-CD 2 SPOUSE'S DEMOGRAPHIC INFO.
          BOUND 3: HEAD-CASH 3 MONEY VALUES IF HEAD WORKED
        VARIABLE 11: SEX 6/I1
          BOUND 1: MALE 1
          BOUND 2: FEMALE 2
        VARIABLE 12: AGE 7/I2 AGE IN YEARS
        VARIABLE 13: RACE 9/A5 [illegible]
          BOUND 1: 1 WHITE
          BOUND 2: 2 BLACK
          BOUND 3: 3 OTHER
      RECORD-DESCRIPTION: HEAD-CASH, 3
        VARIABLE 14: INCOME [illegible]/F10.2 GROSS INC/YEAR
        VARIABLE 15: ASSETS [illegible]/F10.2 TOTAL ASSETS
        VARIABLE 16: LIABILITIES [illegible] TOTAL LIABILITIES
      STRUCTURE-DESCRIPTION:
        HEAD RECORD IS FOLLOWED BY HEAD-CASH RECORD IF WORK-CODE IS WORKING
          ELSE IS FOLLOWED BY SPOUSE RECORD IF MARITAL-STAT EQUALS 1.
        SPOUSE RECORD IS FOLLOWED BY HEAD RECORD.
        HEAD-CASH RECORD IS FOLLOWED BY SPOUSE RECORD IF MARITAL-STAT EQUALS 1
          ELSE IS FOLLOWED BY HEAD RECORD.
    *DATA, FILE-NAME = 1971-SURVEY
      (Data cards for 1971 survey file)
    *END

EXPLANATION OF STATEMENTS

The SPOUSE record description appears with [illegible] missing and without the detail description [illegible]. SPOUSE cards are coded "2" in the record ID code or CARD-NO. This value, 2, is stated on the RECORD-DESCRIPTION statement. Delimiters are optional as only "blanks" are required.

Note (VARIABLE 13) that BOUND names are numerals and the [illegible] coded values are names.

Bounds need not be specified, as can be seen from the continuous money values.

The STRUCTURE-DESCRIPTION describes the logical relation among the 3 card or record types. Note that a HEAD card can be followed by any one of the 3 card types, a SPOUSE card can be followed only by a HEAD card, and a HEAD-CASH card by a HEAD or a SPOUSE card depending on marital status.

A data card file must be preceded by a *DATA statement. The file name must agree with that given in the *INPUT and *DESCRIPTION statements.

*END: End of SIMS input request.


Figure 2. Input Statements

Control Statements

*BEGIN
    This statement precedes each SIMS request and identifies the user and job.

*END
    A SIMS request is terminated by *END. More than one request may be submitted and is identified by a beginning *BEGIN and an ending *END.

*LIST and *NOLIST
    These statements, if embedded in the input request, either turn on or turn off a listing of the input request cards.

*REMOVE
    All input requests are catalogued under the RUN-ID of the *BEGIN statement (if present) and are removed or uncatalogued with the *REMOVE statement.

*USER
    User's Fortran subroutines must be preceded by this statement.

*DATA
    Data on cards submitted as part of the SIMS input request are preceded by this statement.

SIMS Statements

*INPUT or INPUT
    This statement identifies the file that is to be the input to the processing function or functions specified. The major statement, *INPUT, is global to all processing functions unless a major processing function statement (e.g. *CROSSTAB, *EDIT etc.) is followed by an INPUT statement (no asterisk); then the file listed on the INPUT statement will be used as input to the function requested.

*SELECT or SELECT, *OMIT or OMIT
    These statements are used to select or omit observations and/or variables for processing. The same global relation as explained for *INPUT and INPUT applies to these statements.

*DESCRIPTION
    This statement and its 12 substatements (not listed here) represent the Data Description Language (DDL) of SIMS.

[statement name illegible]
    This statement and its 4 substatements (not listed here) represent the SIMS variable redefinition capability. The statements are used to recode variables or compute new variables, and define and assign values and value names to variables.

*STRUCTURE
    This statement is analogous to the STRUCTURE statement of the DDL (see the sample request). It enables the logical structure of the file to be respecified at execution time, thereby increasing the retrieval efficiency.


Figure 2 (Continued)

SIMS Statements (continued)

*OUTPUT or OUTPUT
    This statement identifies an output file. The same global relation as explained for *INPUT and INPUT applies to this statement.

*TITLE
    The title specified appears on every page of output.

*EDIT
    This statement and its 2 substatements (not listed here but appearing on the sample) represent the SIMS file validation and consistency checking capabilities. Edit operations on the input file specified include validation of 1) observation structure, 2) variable format, and 3) variable codes or values, and consistency checking among variables.

[statement name illegible]
    This statement, accompanied by update transaction cards, provides a means for deleting, adding or correcting observations or variables of the input file specified.

*DUMP
    Records of the input file specified are dumped or printed in readable form in a format dependent on the recording mode of the file and options specified by the user.

[statement name illegible]
    The input file specified is copied in the same format.

[statement name illegible]
    This statement specifies conditional extraction of observations or variables, producing subpopulations of the input file specified.

*SORT
    Version 1 of SIMS assumes serial or sequential processing of data. Sorting of a specified input file is specified using the *SORT statement.

*MERGE
    Two files of the same logical structure may be merged. The merge criteria and "hit-miss" options are specified on this statement.

*SAMPLE
    A random sample or a sample which includes rare occurring values for variables is produced from the input file specified. The variables used as the sampling criteria are listed as part of the statement.

*MARGINALS
    One-dimensional or marginal frequency distributions on variables are produced from the options and variable list of this statement.

*CROSSTABS
    N-dimensional tables of frequencies, means, sums, standard deviations, row percentages or column percentages are specified using this statement, as well as associated statistics such as chi-square, variance, standard deviation etc.

*MOMENTS
    Raw moment matrices or matrices of selected variables' sums and sums of cross-products are produced from the options and variables listed in this statement.

*CORRELATIONS
    Correlation matrices on specified variables are produced from the variables in this statement.

[statement name illegible]
    In batch mode, sections of the SIMS machine readable documentation will be printed according to the problem areas the user has indicated as part of this statement. In interactive or on-line mode this statement initiates an interactive teaching function in which the user answers questions relevant to his problem. The interactive function will not be available in version 1 of SIMS.


describing the source language (SIMS statements) and the target language (the generated or precompiled SIMS job stream which is to be executed). Rules are given to LENS for the mapping of the source language to the target language. In the case of SIMS, the rules are the complete description of the SIMS command language. For some other general applications program the rules would be the description of the resultant program's control cards and control card processing. During the mapping process detailed error messages are printed as statements of the source language are checked for syntax and order. Statements of the target language are stored on a random access device for later execution. This then completes the LENS processing.
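
The mapping step described above can be sketched roughly as follows. This is only an illustration in Python, not the LENS implementation: the rule table, the target routine names (`INIT_JOB`, `CALL EDIT`, and so on) and the ordering ranks are assumptions made up for the example; only the statement keywords come from the paper.

```python
import re

# Hypothetical rule table: source keyword -> (order rank, target routine).
# The ranks and routine names are illustrative assumptions, not LENS rules.
RULES = {
    "*BEGIN": (0, "INIT_JOB"),
    "*INPUT": (1, "OPEN_INPUT"),
    "*EDIT": (2, "CALL EDIT"),
    "*CROSSTABS": (2, "CALL XTAB"),
    "*END": (3, "CLOSE_JOB"),
}


def translate(source_lines):
    """Map source statements to target statements, checking syntax and order."""
    target, errors = [], []
    last_rank = -1
    for number, line in enumerate(source_lines, start=1):
        match = re.match(r"(\*[A-Z]+)[,\s]*(.*)", line.strip())
        if match is None or match.group(1) not in RULES:
            errors.append(f"line {number}: unrecognized statement: {line.strip()!r}")
            continue
        keyword, params = match.groups()
        rank, routine = RULES[keyword]
        if rank < last_rank:
            errors.append(f"line {number}: {keyword} is out of order")
            continue
        last_rank = rank
        # The generated statement would be stored for later execution.
        target.append(f"{routine} {params}".strip())
    return target, errors


request = [
    "*BEGIN USER=SMITH",
    "*INPUT 1971-SURVEY",
    "*EDIT, MAX-ERRORS=100",
    "*FOO",
    "*END",
]
job_stream, messages = translate(request)
```

Keeping the rules in a table separate from the translation loop mirrors the point made in the text: changing the user language means changing the rules, not the translator.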

In summary, the SIMS system is composed of generalized relocatable routines such as an EDIT print routine, a cross-tabulation subroutine etc., and LENS macros and nets which describe the source and target languages or SIMS statements and generated job stream, respectively. Each user has access to the entire system and as such can create his own data base of files, file descriptions and library of SIMS requests unique to his application. If he so chooses he may produce his own version of the SIMS request language and associated generated output. This can be done through alteration of the LENS input. The modular construction of LENS and other SIMS routines, plus the paging capability of LENS and the host operating system, allows many SIMS users to run SIMS simultaneously. Finally, SIMS provides both a novice and experienced computer user with a tool for processing simple, complex, large or small jobs in a manner familiar to him.
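
As an illustration of the kind of work an edit routine performs (code validation, consistency checks, and a MAX-ERRORS cutoff, as in the sample request), here is a minimal Python sketch. The field names, the dictionary layout of an observation and the function name `edit` are assumptions chosen for the example; SIMS itself is described as Fortran and Cobol.

```python
def edit(observations, valid_codes, checks, max_errors=100):
    """Validate codes and run consistency checks; stop at max_errors messages."""
    errors = []
    for index, obs in enumerate(observations, start=1):
        # Code validation: each listed variable must hold one of its valid codes.
        for name, allowed in valid_codes.items():
            if obs.get(name) not in allowed:
                errors.append(
                    f"observation {index}: invalid code for {name}: {obs.get(name)!r}"
                )
        # Consistency checks: report every expression that is false.
        for label, predicate in checks:
            if not predicate(obs):
                errors.append(f"observation {index}: failed check: {label}")
        if len(errors) >= max_errors:
            break
    return errors[:max_errors]


# Hypothetical variables loosely modelled on the sample request.
valid_codes = {"SEX": {1, 2}, "WORK_CODE": {0, 1}}
checks = [
    ("spouse card requires married head",
     lambda o: not o["HAS_SPOUSE"] or o["MARITAL_STAT"] == 1),
]
survey = [
    {"SEX": 1, "WORK_CODE": 1, "MARITAL_STAT": 1, "HAS_SPOUSE": True},
    {"SEX": 3, "WORK_CODE": 0, "MARITAL_STAT": 2, "HAS_SPOUSE": True},
]
report = edit(survey, valid_codes, checks)
```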

SEOSYS: A GENERALIZED SYSTEM FOR EXTRACTION FROM AND ANALYSIS OF THE 1966-1967 SURVEY OF ECONOMIC OPPORTUNITY DATA FILES

1. Logical Structure of the 1966-1967 SEO Data Files

The 1966 and 1967 "Surveys of Economic Opportunity" were conducted by the Bureau of the Census at the request of the Office of Economic Opportunity in order to augment the information regularly collected in the Current Population Surveys (CPS) for February and March of each year. In addition to a number of items common to both surveys (such as age, family status, work experience and income), the SEO also provides information on other characteristics such as housing, marital history, training, assets and liabilities. The main purpose in collecting this information was to provide a base for micro-analytic research in exploring the causes and correlates of poverty. The files have been specially designed, edited and documented to this end.

The 1966 SEO sample consisted of about 30,000 households and was made up of two parts: (1) a national sample (about 18,000) drawn in the same way as the Current Population Survey sample and (2) a supplementary sample (about 12,000) in areas with a large concentration of nonwhites. The sample was designed in this way to improve estimates of the characteristics of the poor, in particular, the nonwhite poor.

The 1967 SEO sample consisted of reinterviews of the same addresses included in the 1966 SEO. Most of the questions asked in 1966 were asked again in 1967, making some measures of change possible for persons interviewed in both years.


interview unit, and "adult" information for some of these people. It may be useful to think of the information for each SEO household or address as organized within a 4-level structure with the segments of information for the household connected by a simple "tree structure" as illustrated opposite. The four levels implicit in the structure each contain one of the four segment types within the file.

2. Physical Characteristics of the 1966-1967 SEO Data Files

Although the tree structure is conceptually useful for describing the organization of the file, the organization of the file on magnetic tape is sequential. Segments for each household appear on the tape file in "left list" order, i.e., that sequence in which they occur when the tree structure is traversed from left to right along its branches. For the above example, the segments would appear on magnetic tape in the following order:

Segment   Level   Content
HHLD      1       Household data
INTV      2       Interview Unit No. 1 data
PERS      3       Person 1 data
ADLT      4       Adult 1 data
PERS      3       Person 2 data
ADLT      4       Adult 2 data
PERS      3       Person 3 data
INTV      2       Interview Unit No. 2 data
PERS      3       Person 1 data
ADLT      4       Adult 1 data
PERS      3       Person 2 data

Segments describing a given household are contiguous on the file. Non-interview households are represented by a household segment only.

Input to SEOSYS must be either the 1966 or 1967 SEO as produced by the Data and Computation Center. These versions of the SEO files contain fixed length physical records or blocks with a record being 9 numeric BCD (Binary Coded Decimal) characters. Blocks contain 30 records each and records of a household may continue over more than one block.

3. Data Retrieval

SEOSYS provides an efficient means for retrieval, extraction and analysis of information from the SEO data files. Most standard statistical programs or systems are not capable of directly processing files with complex structures such as SEO. They usually require the data to be of a "rectangular" or matrix structure, in which the columns of the matrix are the variables and the rows the observations. Most often an observation is synonymous with a tape record or card. SEOSYS bridges this gap by retrieving information from the hierarchical tree structure (as illustrated in the sample household) and creating a rectangular file for analysis. This reformatting or structure change may be combined simultaneously with


analysis, or may be done separately by producing an extract or work file which is a subpopulation of SEO to be analyzed at a later date.

Pertinent physical characteristics of the SEO tapes are provided to SEOSYS via an abbreviated machine readable version of the SEO codebook. Using this information and "knowing" the possible tree structures of households in the file, SEOSYS is capable of retrieving attributes from any of the four levels: household, interview, person or adult. The user specifies at what level his analysis will be. SEOSYS then searches a "household tree," "remembering" at what level the analysis will be based, and retrieves the attributes or variables to be selected from any level. A fixed length observation vector containing these data items from any level is then created, one observation for the level of analysis.
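
Because segments are stored sequentially in "left list" order, a household tree can be rebuilt from the stream by tracking the most recent open segment at each level. The Python sketch below shows one way this could be done; it is an assumption-laden illustration, not the SEOSYS algorithm (which is written in Fortran). The four segment names and levels are taken from the paper, while the dictionary node layout and the function name are hypothetical.

```python
# Segment types and their levels, as given in the paper.
LEVELS = {"HHLD": 1, "INTV": 2, "PERS": 3, "ADLT": 4}


def build_households(segments):
    """Rebuild household trees from (segment_type, data) pairs in left-list order."""
    households = []
    path = []  # path[i] is the most recent open node at level i + 1
    for seg_type, data in segments:
        level = LEVELS[seg_type]
        node = {"type": seg_type, "data": data, "children": []}
        del path[level - 1:]  # close any open nodes at this level or deeper
        if level == 1:
            households.append(node)
        else:
            if len(path) != level - 1:
                raise ValueError(
                    f"{seg_type} segment has no parent at level {level - 1}"
                )
            path[-1]["children"].append(node)
        path.append(node)
    return households


# The segment order of the paper's example household.
tape = [
    ("HHLD", "Household"),
    ("INTV", "Interview Unit No. 1"),
    ("PERS", "Person 1"), ("ADLT", "Adult 1"),
    ("PERS", "Person 2"), ("ADLT", "Adult 2"),
    ("PERS", "Person 3"),
    ("INTV", "Interview Unit No. 2"),
    ("PERS", "Person 1"), ("ADLT", "Adult 1"),
    ("PERS", "Person 2"),
]
trees = build_households(tape)
```

A single pass with a short stack is enough because every segment's parent is always the nearest preceding segment one level up.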

Consider a study of all persons in the survey who are black, have incomes less than $3,000 and who live in multi-family dwellings. The unit of analysis or level of analysis in this case is the person. Therefore an observation possibly containing information from all levels (e.g., HOUSE SIZE from the HHLD, RACE from the INTV, AGE from the PERS and INCOME from the ADLT) would be created for every person who satisfies the selection criteria. SEOSYS, as it is traversing a household, "saves" attributes or variables from higher levels (e.g., HHLD and INTV) if need be and "looks ahead" for data from lower levels (e.g., ADLT). During this retrieval process, searching is terminated immediately if the data interrogated do not satisfy the selection criteria, thereby minimizing retrieval time. For the example request mentioned above, if the household being queried consisted of only one family, the attribute # FAM (number of families) of the household record or segment being equal to 1 would indicate to SEOSYS that persons in this household should not be included in the sample. Any further checking of race or income etc. would be omitted and SEOSYS would then search for the beginning of the next household.
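
The example request can be sketched in Python as a traversal that saves higher-level attributes, looks ahead to the ADLT segment, and abandons a household or interview unit as soon as a criterion fails. This is a hedged illustration only: the in-memory node layout, the key names (`NFAM`, `HOUSE_SIZE`, `RACE`, `AGE`, `INCOME`) and the decision to skip persons without an adult segment are assumptions for the example, not details confirmed by the paper.

```python
def node(seg_type, data, children=()):
    """Build one segment of a household tree (hypothetical in-memory layout)."""
    return {"type": seg_type, "data": data, "children": list(children)}


def person_observations(household, income_limit=3000):
    """Yield one fixed-length observation per selected person in a household."""
    if household["data"]["NFAM"] <= 1:
        return  # single-family household: skip it without looking any further
    for intv in household["children"]:
        if intv["data"]["RACE"] != "BLACK":
            continue  # no person in this interview unit can qualify
        for pers in intv["children"]:
            # "Look ahead" to the adult segment below the person, if any.
            adult = next((c for c in pers["children"] if c["type"] == "ADLT"), None)
            if adult is None or adult["data"]["INCOME"] >= income_limit:
                continue
            # Combine saved higher-level attributes with person and adult data.
            yield (
                household["data"]["HOUSE_SIZE"],
                intv["data"]["RACE"],
                pers["data"]["AGE"],
                adult["data"]["INCOME"],
            )


multi = node("HHLD", {"NFAM": 2, "HOUSE_SIZE": 4}, [
    node("INTV", {"RACE": "BLACK"}, [
        node("PERS", {"AGE": 30}, [node("ADLT", {"INCOME": 2500})]),
        node("PERS", {"AGE": 45}, [node("ADLT", {"INCOME": 5000})]),
        node("PERS", {"AGE": 9}),
    ]),
    node("INTV", {"RACE": "WHITE"}, [
        node("PERS", {"AGE": 50}, [node("ADLT", {"INCOME": 1000})]),
    ]),
])
single = node("HHLD", {"NFAM": 1, "HOUSE_SIZE": 2}, [
    node("INTV", {"RACE": "BLACK"}, [
        node("PERS", {"AGE": 28}, [node("ADLT", {"INCOME": 1500})]),
    ]),
])
rows = list(person_observations(multi)) + list(person_observations(single))
```

Each yielded tuple is one row of the "rectangular" file: the same four columns for every selected person, whatever level the values came from.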

Most analyses performed on survey type data files require some transformation of the data in the master file, creation of new variables or conditional extraction or selection of a sample population. The SEO files are no exception. Because of the extensive amount of information for a household and the complex structure of the files, users of the SEO data will almost always require some form of data transformation to create a subpopulation analysis. SEOSYS allows a user complete interaction with the system through user supplied Fortran subroutines. Such routines facilitate transgeneration of variables at all levels and selection of observations. A user may also perform his own analysis in these supplied routines.

SEOSYS has been developed specifically for the purpose of providing a researcher with a user-oriented system for accessing, extracting, and analyzing data of the Surveys of Economic Opportunity. Since SEOSYS has been custom designed for these data files, the retrieval algorithm in SEOSYS provides efficient access to the data while giving users a general system for processing the data. The general features of SEOSYS allow almost any request to be handled with minimal computer time and little or no programming time.

4. Documentation Available

The following documents are available free through the University of Wisconsin Data and Computation Center:


---1966 Survey of Economic Opportunity Codebook
---1967 Survey of Economic Opportunity Codebook
---1966 and 1967 Survey of Economic Opportunity Sample Design and Weighting
---The Comparison of Selected Economic and Demographic Characteristics from the 1966 and 1967 Surveys of Economic Opportunity and the Comparable Current Population Surveys
---1966 Survey of Economic Opportunity Unweighted Counts (Including weighted estimates of Income, Asset and Liability items)
---1967 Survey of Economic Opportunity Unweighted Counts (Including weighted estimates of Income, Asset and Liability items)
---1966 and 1967 Survey of Economic Opportunity Sample Variance Estimates
---1966 and 1967 Survey of Economic Opportunity Cross-Year Tabulations
---SEO Data Files--Fixed Length Format
---SEOSYS: A Generalized System for Extraction from and Analysis of the 1966-1967 Survey of Economic Opportunity Data Files--Users Manual

The documents listed above and others have been compiled by F. JoAn Olson into the Survey of Economic Opportunity Bibliography. The bibliography is in machine readable form and is printed by the computer via the indexing system, UWIS, developed by the Madison Academic Computing Center at the University of Wisconsin.

The list of documents is indexed by author and documents with more than one author appear once for each author. The entries of the bibliography have been assigned to one of the following categories:

(1) User Guide
(2) Thesis (B.A.)
(3) Thesis (M.A.)
(4) Thesis (PhD)
(5) Forthcoming
(6) Working Paper
(7) Published
(8) Conference
(9) Other

The category name appears on the listing. The bibliography has also been indexed by key title words.
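
The two indexes described above can be sketched in a few lines of Python. This is not the UWIS indexing system, whose workings are not described here. The entry layout, the function name `build_indexes`, the stop-word list and the sample entries (with placeholder authors and titles) are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical list of words too common to serve as key title words.
STOP_WORDS = {"a", "and", "for", "from", "of", "the"}

# Placeholder entries; these are not real bibliography records.
ENTRIES = [
    {"title": "Example Codebook", "authors": ["Author A", "Author B"],
     "category": "User Guide"},
    {"title": "Example Variance Estimates", "authors": ["Author B"],
     "category": "Working Paper"},
]


def build_indexes(entries: list[dict]) -> tuple[dict, dict]:
    """Return (author index, key-title-word index) as plain dictionaries."""
    by_author = defaultdict(list)
    by_keyword = defaultdict(list)
    for entry in entries:
        # A document with several authors is listed once under each author.
        for author in entry["authors"]:
            by_author[author].append(entry["title"])
        # Each distinct non-trivial title word points back to the document.
        words = {w.strip(".,:;()") for w in entry["title"].lower().split()}
        for word in sorted(words - STOP_WORDS):
            if word:
                by_keyword[word].append(entry["title"])
    return dict(by_author), dict(by_keyword)


author_index, keyword_index = build_indexes(ENTRIES)
```

In this sketch the jointly authored entry appears under both "Author A" and "Author B", matching the rule that a multi-author document is listed once per author.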

ACKNOWLEDGMENTS

The SIMS system has been funded entirely by the National Science Foundation, grant GS-1992, and has been under the faculty direction of Professor Dennis Aigner, with Max Ellis directing the system design and implementation. Significant contributions in development of the system have been made by the following senior staff members of DACC: William Katke, Kenneth Nelson, James Olson and Shou-chuan Yang. These persons with Max Ellis have designed the system, its user interface and programs.

The development of SEOSYS was funded by the Office of Economic Opportunity and the Institute for Research on Poverty. Programming of the system was done by Kenneth Nelson, Linda Werner, and Luise Cunliffe. Nancy Williamson and Ronald Sepanik contributed significantly to the design of the system and


assisted the programmers in the testing of SEOSYS. The portion of this paper pertaining to the Survey of Economic Opportunity includes contributions from Ronald Sepanik and David Richardson. Description of the SEO data files has been reproduced in part from The 1966 and 1967 Survey of Economic Opportunity Files and Related Software, Brookings Computer Center Memo #, June 30, 1969 by George Sadowsky and Marjorie Reed.


