Small team workflow in government analytics
Peter Ellis
Manager Sector Performance
18 March 2014
Today's talk
- Who are we and why is our experience important?
- What are data-intensive economic reports?
- The challenge
- The solution
- Reflections on analytics in government
The Sector Performance team
- 9-10 staff
- $5 million budget, mostly for outsourced data collection
- One of 3, 4 or 9 analytical teams in MBIE
  - Depending on definitions
  - But diverse approaches from different teams
- Variety of roles
  - Manage collection of tourism and science and innovation data
  - Analyse and publicly disseminate tourism data
  - Analyse data on all sectors for policy teams and Ministers
  - Support policy teams in other areas
- Midway through 5-year Tourism Data Improvement Programme
- Since MBIE's creation, now applying the tools, skills and techniques to a wider range of data
http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Capability building for an analytical team
- Five key areas needed
  1. Workflow, document management and teamwork
  2. Analytical techniques
  3. Tools
  4. Data reshaping and management
  5. Data storage
- Many programmes don't take all five into account
  - IT-led BI programmes may focus on only #3 and #5
  - Universities typically only teach #2
Data-intensive economic reports
http://www.mbie.govt.nz/what-we-do/business-growth-agenda
The challenge: update the draft overview Sectors Report
- Current version had evolved over 24 months: over 200 plots and 50 tables of data
- Not all the data sources fully defined
- Some of the Excel workbooks lost
- Some data was custom-cut by Statistics New Zealand
- Home-grown (and inconsistent) concordances to sector
- Some data hard-keyed in, and not clear what was original, what was analysis, and what was grooming/reshaping
- Tight timeframe
- High profile, and quality guarantee essential
This is just one worksheet of around 30, only 20 of which we could find
Principles for a solution
- Separate the data from the grooming and analysis
- Reproducibility
- Systemised constant teamwork and peer review, requiring:
  - Repository-based version control
  - Centralised and disciplined folder and file structure
  - Modular code with custom functions, palettes and themes
  - Frequent integration and continuous testing
- Cut the dependencies on externals
- Extreme code-based plot polishing
- And for our next project (Small Business Report):
  - Frequent iteration with the client (policy team and Minister)
  - Separate exploratory analysis from polishing
The toolkit
The folder structure
- raw_data
  - concordances
  - NZ.Stat
  - Infoshare
  - custom
- grooming_code
- data
- analysis_code
- output
  - Part I
  - Part II
  - dashboards
- R
- .git
Held together with key files in the project's root directory:
- integrate.r (in future to replace with makefile)
- sector_report.rproj
- .Rprofile
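As a rough illustration of what a one-step build script such as integrate.r might do, the sketch below runs every script in the grooming and analysis folders in order. The function name run_build, the alphabetical ordering and the fresh-environment convention are assumptions made for illustration, not a description of the team's actual file.

```r
# Hypothetical one-step build in the spirit of integrate.r.
# Stage folder names come from the folder structure; everything else
# (function name, ordering rule, environments) is an assumption.
run_build <- function(root = ".",
                      stages = c("grooming_code", "analysis_code")) {
  ran <- character(0)
  for (stage in stages) {
    # list.files() returns character(0) if the folder does not exist
    scripts <- sort(list.files(file.path(root, stage),
                               pattern = "\\.[rR]$", full.names = TRUE))
    for (script in scripts) {
      message("Running ", script)
      # Each script gets a fresh environment, so scripts cannot rely on
      # leftover objects and must pass results through files on disk.
      source(script, local = new.env(parent = globalenv()))
      ran <- c(ran, script)
    }
  }
  invisible(ran)  # paths of the scripts that ran, in order
}
```

Calling run_build() from the project root would then rebuild the whole report in one step; prefixing script names with numbers (01_, 02_, ...) is one simple way to control the order.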
Particular things that make this approach hum
- Git
- RStudio projects are a great way of organising
  - But Notepad++ users can still participate if they use R shortcuts in the root folder of the repo
- Clean, pared-back, modular scripts essential for readability
- Create your own palette, ggplot2 themes, font variables and functions for image dimensions and resolution
- Resource for oversight, coordination, ensuring the build works
- Manager needs to be technical enough to dive into the repo
  - You wouldn't have a policy manager who couldn't use Word
- Clear spec or ability to have agile iterative approach with client
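A minimal sketch of the kind of shared style file this implies, which could be sourced from .Rprofile so every script picks up the same look. All names and values here (team_palette, team_font, fig_pixels, team_theme, the colours) are illustrative assumptions rather than the team's actual choices.

```r
# Hypothetical shared style definitions; names and values are assumptions.
team_palette <- c(blue = "#1F77B4", orange = "#FF7F0E",
                  green = "#2CA02C", grey = "#7F7F7F")
team_font <- "sans"

# Convert a physical size in centimetres to whole pixels at a given
# resolution, so every plot is exported at consistent dimensions.
fig_pixels <- function(width_cm, height_cm, dpi = 300) {
  round(c(width = width_cm, height = height_cm) / 2.54 * dpi)
}

# A house ggplot2 theme built on theme_minimal(); requires ggplot2.
team_theme <- function(base_size = 11) {
  if (!requireNamespace("ggplot2", quietly = TRUE)) {
    stop("ggplot2 is required for team_theme()")
  }
  ggplot2::theme_minimal(base_size = base_size, base_family = team_font) +
    ggplot2::theme(panel.grid.minor = ggplot2::element_blank(),
                   legend.position = "bottom")
}
```

An analysis script would then add team_theme() to each plot, take colours from team_palette, and pass fig_pixels() values to its graphics device, so polishing decisions live in one place.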
Joel's 12-point test for software developer teams
1. Do you use version control for your code?*
2. Can you make a build in one step?
3. Do you make frequent builds (at least daily)?
4. Do you use an issues tracking system?*
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools available?*
10. Do you have testers? (not sure this one's relevant)
11. Do new candidates write code during their selection?
12. Do you do hallway usability testing?
Surprisingly relevant for analytics teams too
Tweaked (*) from http://www.joelonsoftware.com/articles/fog0000000043.html
Five things needed for successful capability building
- External demand
- Sustained management commitment
- Resourcing for trialling, experiments and intensive customised training
- Supportive IT team and environment
- Preparedness for the process to take years rather than months
Different needs and roles
Some particular issues in government
- Demand from Ministers and senior management essential
  - Courage required to raise the expectations
  - Need to push some boundaries
- Work with, not against, your ICT team
  - Common goals
  - Recognise where ICT projects are needed and when to use BAU
  - Balance of waterfall v. agile and beyond
- But - be prepared to use personal machines as a trial environment for new tools and techniques
  - Only way to know what you want to invest in: high costs in packaging up new software for locked-down networks
- A significant-sized team essential to build momentum
  - Recent developments only possible for us with the creation of MBIE