Implementing Legacy Statistical Algorithms in a Spreadsheet Environment

Stephen W. Liddle
Information Systems Faculty
Rollins eBusiness Center

John S. Lawson
Department of Statistics

Brigham Young University
Provo, UT 84602

Overview
- Introduction
- Fundamentals of VBA in Excel
- Retargeting traditional algorithms to a spreadsheet environment
- Converting FORTRAN to VBA
- Conclusions

Why Convert FORTRAN Programs to Run in a Spreadsheet Environment?
- Useful code is available that is not implemented in standard statistical packages
- FORTRAN compilers are not usually available on a normal Windows workstation
- Many textbooks refer to published FORTRAN algorithms

Sources for Published FORTRAN Algorithms
- STATLIB (http://lib.stat.cmu.edu/)
  - General Archive
  - Applied Statistics Archive
  - Journal of Quality Technology Archive
  - JASA Software Archive
  - JCGS Archive

Advantages of Running Legacy FORTRAN Code in Excel
- Comfortable environment for practitioners
- More user-friendly input from the spreadsheet
- Output to the spreadsheet allows further graphical and computational analysis of results with Excel functions

[Screenshot: VDG input worksheet]

VDG Inputs
  Nickname:                       1-hybrid
  Runs:                           30
  Factors:                        6
  Model Order (1/2):              2
  Design Region (S/C):            s
  Weight by N (Y/N):              y
  Number of Radii (20-99):        20
  Scaling Unit (suggest 1):       1
  Design Radius/Region Radius:    1

Design (first 20 rows shown)
   X1   X2   X3   X4   X5   X6
    0    0    0    0    0    2.3094
   -1   -1   -1   -1   -1    0.57735
    1    1   -1   -1   -1    0.57735
    1   -1    1   -1   -1    0.57735
   -1    1    1   -1   -1    0.57735
    1   -1   -1    1   -1    0.57735
   -1    1   -1    1   -1    0.57735
   -1   -1    1    1   -1    0.57735
    1    1    1    1   -1    0.57735
    1   -1   -1   -1    1    0.57735
   -1    1   -1   -1    1    0.57735
   -1   -1    1   -1    1    0.57735
    1    1    1   -1    1    0.57735
   -1   -1   -1    1    1    0.57735
    1    1   -1    1    1    0.57735
    1   -1    1    1    1    0.57735
   -1    1    1    1    1    0.57735
    2    0    0    0    0   -1.1547
   -2    0    0    0    0   -1.1547
    0    2    0    0    0   -1.1547

[Button: Run VDG]

Proposed Methodology
- Understand the original FORTRAN program
- Choose suitable I/O methods
- Convert the original FORTRAN code to VBA
- Test and use the resulting Excel code

Visual Basic for Applications
- Built on ANSI BASIC
- Language engine of Microsoft Office
- Modern structured programming language
- Has a vast array of types, functions, and programming helps
- Powerful support environment (Office platform)
- Popular in business contexts

Excel Object Model
- Objects in Excel are addressable in VBA
- Each object has:
  - Properties
  - Methods

[Diagram of Excel objects: Application; Workbooks (Workbook); Worksheets (Worksheet); Range; Chart]
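
As a minimal sketch of what "properties" and "methods" look like in code, the routine below walks from the workbook to a worksheet to a range. The sheet name "Sheet1" and the cell addresses are assumptions made for illustration; they are not taken from the original workbook.

```vba
Option Explicit

' Sketch: addressing Excel objects from VBA.
' Assumes the workbook holding this code has a worksheet named "Sheet1" (hypothetical).
Sub ObjectModelDemo()
    Dim ws As Worksheet
    Dim rng As Range

    Set ws = ThisWorkbook.Worksheets("Sheet1")   ' Worksheets collection -> one Worksheet
    Set rng = ws.Range("A1:A3")                  ' Worksheet -> Range

    rng.Value = 1.5                 ' Value property: write to all three cells
    Debug.Print rng.Cells.Count     ' Count property of the range's cells
    Debug.Print rng.Address         ' Address property
    rng.ClearContents               ' ClearContents method: empty the cells again
End Sub
```

Properties are read or assigned like variables, while methods are called like procedures.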

[Screenshot: worksheet with a labeled Input Region and Output Region. Clicking the buttons on the sheet runs the ORPS1 and ORPS2 algorithms.]

Input/Output Methods
- Non-interactive
  - Files, databases
  - Worksheet cells
- Interactive
  - Message boxes
  - Input boxes
  - Custom GUI forms
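
The two families of I/O can be sketched as follows. The cell addresses (B1, B2, B4) and the prompt text are assumptions for illustration only, not the layout of the worksheets shown in this talk.

```vba
Option Explicit

' Sketch: worksheet cells as non-interactive input and output.
' Cells B1, B2 and B4 are hypothetical locations.
Sub CellIODemo()
    Dim ws As Worksheet
    Dim nRuns As Long
    Dim scaleUnit As Double

    Set ws = ActiveSheet                        ' assumes the active sheet is a worksheet
    nRuns = CLng(ws.Range("B1").Value)          ' read an integer input
    scaleUnit = CDbl(ws.Range("B2").Value)      ' read a real input

    ws.Range("B4").Value = nRuns * scaleUnit    ' write a result back to the sheet
End Sub

' Sketch: interactive input (InputBox) and output (MsgBox).
Sub InteractiveIODemo()
    Dim answer As String

    answer = InputBox("Number of runs?", "Input")
    If IsNumeric(answer) Then
        MsgBox "You entered " & CLng(answer) & " runs."
    Else
        MsgBox "No numeric value entered."
    End If
End Sub
```

Worksheet cells suit inputs that are edited and rerun often, such as a design matrix; dialog boxes suit one-off prompts.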

FORTRAN vs. VBA

  \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}

- VBA:     (-b+Sqr(b^2-4*a*c))/(2*a)
- FORTRAN: (-b+SQRT(b**2-4*a*c))/(2*a)
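
As a minimal sketch, the VBA expression can be wrapped in a function. The function name is hypothetical, and the code assumes a <> 0 and a non-negative discriminant; a production version would check both.

```vba
Option Explicit

' Sketch: the "+" root of a*x^2 + b*x + c = 0.
' Assumes a <> 0 and b^2 - 4ac >= 0 (no error handling here).
Function QuadraticRootPlus(ByVal a As Double, ByVal b As Double, _
                           ByVal c As Double) As Double
    QuadraticRootPlus = (-b + Sqr(b ^ 2 - 4 * a * c)) / (2 * a)
End Function
```

For example, QuadraticRootPlus(1, -3, 2) evaluates (3 + 1) / 2 and returns 2.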

More Operators

  FORTRAN   VBA
  .EQ.      =
  .NE.      <>
  .LT.      <
  .LE.      <=
  .GT.      >
  .GE.      >=
  .AND.     And
  .OR.      Or
  .NOT.     Not
  //        &    (string concatenation)
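
A short sketch of the operator mapping in use. The FORTRAN fragments in the comments and the function names are hypothetical illustrations, not code from a published algorithm.

```vba
Option Explicit

' FORTRAN:  INRNG = X .GE. LO .AND. X .LE. HI
Function InRange(ByVal x As Double, ByVal lo As Double, _
                 ByVal hi As Double) As Boolean
    InRange = (x >= lo) And (x <= hi)
End Function

' FORTRAN:  TITLE = 'Design ' // NICK
Function DesignTitle(ByVal nickname As String) As String
    DesignTitle = "Design " & nickname
End Function
```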

Data Types

  FORTRAN             VBA
  INTEGER             Byte, Integer, Long
  REAL                Single
  DOUBLE PRECISION    Double
  COMPLEX             Non-primitive in VBA
  LOGICAL             Boolean
  CHARACTER           String
  CHARACTER*length    String*length

- Other notable VBA types:
  - Currency, Decimal, Date, Variant
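
The type mapping might look like this in declarations. Because COMPLEX has no primitive counterpart, the sketch uses a user-defined type; the names Complex and ComplexAdd are assumptions chosen for illustration.

```vba
Option Explicit

' COMPLEX is not built in, so one option is a user-defined type (hypothetical name).
Type Complex
    Re As Double
    Im As Double
End Type

Function ComplexAdd(u As Complex, v As Complex) As Complex
    ComplexAdd.Re = u.Re + v.Re
    ComplexAdd.Im = u.Im + v.Im
End Function

Sub DataTypeDemo()
    Dim n As Long            ' INTEGER (Long is 32-bit; VBA Integer is only 16-bit)
    Dim x As Single          ' REAL
    Dim d As Double          ' DOUBLE PRECISION
    Dim ok As Boolean        ' LOGICAL
    Dim title As String      ' CHARACTER
    Dim code As String * 8   ' CHARACTER*8: fixed length, padded with spaces

    n = 30: x = 0.5: d = 0.57735: ok = True
    title = "1-hybrid"
    code = "VDG"
    Debug.Print n, x, d, ok, title, Len(code)
End Sub
```

Long is usually the safer target for FORTRAN INTEGER, since the 16-bit VBA Integer overflows beyond 32,767.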

Worksheet Functions
- ChiDist(x, deg_freedom): returns the one-tailed probability of the χ² distribution.
- Correl(array1, array2): returns the correlation coefficient of two cell ranges.
- Fisher(x): returns the Fisher transformation at a given x.
- Pearson(array1, array2): returns the Pearson product-moment correlation coefficient for two sets.
- Quartile(array, quart): returns the requested quartile of a data set.
- StDev(array): returns the sample standard deviation of a data set.
- ZTest(array, x, sigma): returns the one-tailed P-value of a z-test.
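
From VBA, these functions are reached through Application.WorksheetFunction. The sketch below calls three of them on small arrays filled in code; the data values are made up for illustration.

```vba
Option Explicit

' Sketch: calling Excel worksheet functions from VBA.
' The data are hypothetical: y is an exact linear function of x.
Sub WorksheetFunctionDemo()
    Dim x(1 To 5) As Double
    Dim y(1 To 5) As Double
    Dim i As Long

    For i = 1 To 5
        x(i) = i
        y(i) = 2 * i + 1
    Next i

    With Application.WorksheetFunction
        Debug.Print "Correl:  "; .Correl(x, y)       ' correlation of x and y
        Debug.Print "StDev:   "; .StDev(x)           ' sample standard deviation of x
        Debug.Print "ChiDist: "; .ChiDist(3.84, 1)   ' upper-tail chi-square probability
    End With
End Sub
```

Reusing the built-in functions avoids porting FORTRAN routines for distributions and summary statistics that Excel already supplies.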

Flow-Control Statements

               FORTRAN                  VBA
  Logical if   IF (expr) stmt           If expr Then stmt

  Block if     IF (expr1) THEN          If expr1 Then
                 stmt1                    stmt1
               ELSE IF (expr2) THEN     ElseIf expr2 Then
                 stmt2                    stmt2
               …                        …
               ELSE                     Else
                 stmtn                    stmtn
               END IF                   End If
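
A small sketch of a block if carried across line by line. The FORTRAN fragment in the comment and the function name SignOf are hypothetical.

```vba
Option Explicit

' FORTRAN original (hypothetical):
'       IF (X .LT. 0.0) THEN
'          ISIGN = -1
'       ELSE IF (X .EQ. 0.0) THEN
'          ISIGN = 0
'       ELSE
'          ISIGN = 1
'       END IF
Function SignOf(ByVal x As Double) As Long
    If x < 0 Then
        SignOf = -1
    ElseIf x = 0 Then
        SignOf = 0
    Else
        SignOf = 1
    End If
End Function
```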

Subtle Differences ("Gotchas")
- Implicit conversion of real to integer values
  - FORTRAN: truncate
  - VBA: round
  - Solution: use VBA's Fix(), which truncates
- Both languages allow implicit typing
  - This introduces ambiguity
  - Solution: supply explicit types everywhere
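
Both gotchas can be seen in a few lines. This is a sketch with made-up values; the helper name TruncateLikeFortran is hypothetical.

```vba
Option Explicit   ' every variable must be declared, which rules out implicit typing

' Sketch: truncate toward zero, as FORTRAN does when assigning a real to an INTEGER.
Function TruncateLikeFortran(ByVal x As Double) As Long
    TruncateLikeFortran = Fix(x)
End Function

Sub ConversionDemo()
    Dim n As Long

    n = 2.7
    Debug.Print n                          ' 3: implicit conversion rounds
    n = 2.5
    Debug.Print n                          ' 2: ties round to the nearest even integer
    Debug.Print TruncateLikeFortran(2.7)   ' 2
    Debug.Print TruncateLikeFortran(-2.7)  ' -2: Fix truncates toward zero
    Debug.Print Int(-2.7)                  ' -3: Int rounds down, so it is not a substitute
End Sub
```

Placing Option Explicit at the top of every module makes the VBA compiler reject undeclared names, which catches misspelled variables that implicit typing would silently accept.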

Eliminating Goto Statements
- Computer science accepts the axiom that goto is generally "considered harmful"
- We advocate rewriting algorithms to use structured programming techniques where feasible
  - Sine qua non is "make it work"
  - Moving to structured form is a good idea for maintainability and understandability

Eliminating Goto Statements

      DO 8 J=1,3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 IF(J.EQ.3) GO TO 8
      XK=BESTK-STEP
    8 CONTINUE

Eliminating Goto Statements

      For j=1 To 3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 IF(J.EQ.3) GO TO 8
      XK=BESTK-STEP
    8 Next j

Eliminating Goto Statements

      For j=1 To 3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 If j <> 3 Then
        xk = bestk - step
      End If
      Next j

Eliminating Goto Statements

      For j=1 To 3
        ...
        Do Until objfn > bestfn
          ...
        Loop
        If j <> 3 Then
          xk = bestk - step
        End If
      Next j
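
The same backward-jump pattern (a labeled statement, a conditional exit, and a GO TO back to the label) can be tried out on a small self-contained case. The sketch below is not the algorithm from the slides: the FORTRAN fragment in the comment, the function name NewtonSqrt, and the tolerance are all assumptions chosen so that the code runs on its own. It uses Exit Do because the test sits in the middle of the loop body.

```vba
Option Explicit

' FORTRAN pattern being replaced (hypothetical fragment):
'     6 XNEW = 0.5*(X + A/X)
'       IF (ABS(XNEW-X) .LE. TOL*XNEW) GO TO 7
'       X = XNEW
'       GO TO 6
'     7 CONTINUE
'
' Sketch: Newton iteration for the square root of a, assuming a > 0.
Function NewtonSqrt(ByVal a As Double, _
                    Optional ByVal tol As Double = 1E-12) As Double
    Dim x As Double
    Dim xNew As Double

    x = a
    Do
        xNew = 0.5 * (x + a / x)                       ' statements before the test
        If Abs(xNew - x) <= tol * xNew Then Exit Do    ' replaces GO TO 7
        x = xNew                                       ' statements after the test
    Loop                                               ' replaces GO TO 6
    NewtonSqrt = xNew
End Function
```

When the exit test comes first in the loop body, Do Until ... Loop is the closer match; when statements precede the test, Do ... Loop with Exit Do keeps the original statement order.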

Our Reasoning
- Digital assets are fragile
- FORTRAN is not universally available
- Excel is a ubiquitous, powerful platform
- VBA is a full-featured language capable of handling sophisticated statistical computations

Conclusions
- We recommend creating a Web-based repository of Excel/VBA implementations of classic statistical algorithms
- We can preserve our legacy algorithms in this modern spreadsheet environment
- E-mail us if you want a copy of our manuscript (liddle or [email protected])