CSCW 2016 - How to Hackathon: Socio-technical Tradeoffs in Brief, Intensive Collocation

  How and When Do Hackathons for Scientific Software Work? Insights from a Multiple-Case Study

    ABSTRACT Scientific communities are experimenting with hackathons, short-term intense software development events, in order to advance the technical and social infrastructure that supports science. We know little empirically, however, about the outcomes of these hackathons and even less about how to plan them to maximize the likelihood of success. This paper aims to fill that gap by presenting a multiple-case study of the stages a hackathon goes through as it evolves and how variations in how stages are conducted affect outcomes. We identify practices across the preparation, execution, and follow-through stages of a hackathon that meet the specialized needs of scientific software. Differences in the kinds of disciplines included, classes of users, and team formation strategies reveal tradeoffs among technical progress, surfacing user needs, and building community. Our findings have implications for future empirical studies, the kinds of technology support that need to be in place for hackathons, and funding policy.

    Author Keywords Scientific software; hackathons; multiple-case study; qualitative methods.

    ACM Classification Keywords H.5.3. [Information interfaces and presentation (e.g., HCI)]: Group and Organization Interfaces – Computer-supported cooperative work.

    INTRODUCTION Software is of central importance to modern scientific practice [14–16,40]. Scientists often write their own software, from small scripts that process data and create figures for publications to large workbench applications that integrate visualization, simulation, and analysis. Scientists quite often replicate each other's efforts, however, because they do not realize they have common needs. Some proportion of this software is an invisible resource that scientists might be willing to share [15]. Because software is easily replicated and distributed, it can in theory be collectively enhanced and maintained.

    In practice, however, even when qualified and motivated people are available, a lack of human infrastructure [24] and users' unfamiliarity with the codebase [3,34] present barriers to contribution. To overcome these barriers, scientific communities are experimenting with hackathons, short-term intense events where teams of scientists from academia and industry, postdocs, graduate students, and software developers collaborate face-to-face to share and develop software. Prior research suggests that hackathons may be effective ways to attract and train new contributors, learn about the technical details of users' needs, and create and enhance ad hoc teams [18,22,25,39]. Other than informal evidence from scientific communities who have held hackathons, CSCW researchers know little empirically about the immediate outputs of these hackathons, and we know even less about how to plan them to maximize the likelihood of success.

    A key challenge for design is that hackathons vary along many dimensions (e.g., duration, size, goals, agendas). Prior research, however, has identified underlying practices that seem to be common across different instances [25,39]. For example, soliciting use cases from participants provides an occasion to identify community needs, which is important even if the rest of the hackathon is unsuccessful. These practices influence downstream hackathon activities, such as defining technical objectives, deciding who will work on what and with whom, shaping the outputs of the hackathon, and determining what lasting impact, if any, the hackathon will have. Therefore, to provide useful, evidence-based design guidance, it seems desirable to understand the entire lifecycle of a hackathon.

    We therefore ask the following questions:

    (1) What are the stages a hackathon goes through as it evolves?

    (2) How do variations in how stages are conducted affect outcomes?

    To answer these questions, we conducted a multiple-case study [42] of three hackathons applied to scientific software. We collected multiple sources of evidence, including online documentation to understand hackathon planning practices and task progress, 71 hours of on-site observations to understand event dynamics, and 23 semi-structured interviews to understand in more detail the kinds of interactions we observed and the reasons for them. We conducted pre- and post-surveys to understand participants' preparation activities, their perceived outcomes, their satisfaction with outcomes, and their reasons for being satisfied or dissatisfied. Finally, we extracted committed changes and issues from hackathon source-code repositories to triangulate on our qualitative data and compare outputs from each hackathon.

    Our findings group hackathon activities under three stages: a preparation stage marked by idea brainstorming, learning about tools and research profiles, and preparing tools and datasets; an execution stage marked by team formation, building solutions, knowledge sharing, and building social ties; and a follow-through stage marked by reification of ideas, stimulation of user engagement, and maintenance of social ties. Differences in the kinds of disciplines included, classes of users, and styles of team formation reveal tradeoffs among technical progress, surfacing user needs, and building community. In the following sections we review related research, describe our study, present our results, and discuss the implications of our findings.

    BACKGROUND

    What is a Hackathon? A hackathon can be defined as a short-term event where computer programmers and others involved in software development collaborate intensively on software projects. The term is a portmanteau of the words "hack" and "marathon." In this context, the word "hack" refers to computer programming in the exploratory, not criminal, sense.

    A hackathon typically begins with presentations about the event, as well as about the subject of the hackathon if any. Attendees then suggest ideas and form teams based on individual interests and skills. The main work then begins, generally lasting from one day up to one week. At the conclusion of the hackathon, there are presentations where teams demonstrate their results. If the hackathon is a competition, a panel of judges selects winning teams and awards prizes.

    What is the Point of a Hackathon? Briscoe and Mulligan [2] loosely group hackathons as being either tech-centric or focus-centric. Tech-centric hackathons aim at software development with a specific application or technology. Focus-centric hackathons, in contrast, apply software development to address a social issue or business objective.

    Technology companies such as Google and Yahoo! tend to hold tech-centric hackathons to grow their user base and create demand for their products. For instance, Open Hack Day, a hackathon run publicly by Yahoo! since 2006, has focused on having third-party developers learn about and use Yahoo! APIs (e.g., Flickr) to build novel software applications and win prizes [11].

    Open-source software projects like PyPy, OpenBSD, and Linux also put on tech-centric hackathons to rapidly advance work on specific development issues. For these projects, incorporating new developers to replace ones who leave is a key concern as well. As such, during hackathons, core developers work in pairs with newcomers to help them learn project conventions and details of the codebase [32].

    National and local government agencies tend to hold focus-centric hackathons in order to address social issues such as crisis management, open government, and health. For instance, in 2014 the British Government ran DementiaHack, a hackathon dedicated to improving the lives of people living with dementia. First prize went to CareUmbrella, an app that lets users create tags for things around the house that, when tapped, activate audio recordings explaining how they work or the story behind them [9].

    In contrast, technology companies like Google, Facebook, and Yahoo! put on focus-centric hackathons to encourage new product innovation. For instance, Facebook's Like button was conceived at one of the company's hackathons [19]. Hackathons also complement routine software development, addressing the need to explore ideas that involve high market and technical uncertainties [30].

    While hackathons have been commonplace among professional developers for some time [1,28], hackathons organized by and for students and scientists are now surging in size, scale, and frequency for networking, recruiting, pitching, and learning. For instance, in 2014, there were some 40 intercollegiate hackathons. This year, more than 150 are expected [23]. A large reason for this is that, in comparison to a job fair, the chaotic environment of a hackathon allows recruiters to more easily identify students who are likely to thrive in a technology career. Likewise, students can test-drive the experience of working in a technology company before committing to a job.

    Scientists are using hackathons to advance the ecosystem of software that supports research, encourage collaboration between communities working on related problems, and train scientists in software development. For example, Mozilla Science Lab's 2014 global hackathon brought together scientists in more than 22 cities over two days, from Auckland and Melbourne to Paris, London, New York, and San Francisco. Products included tutorials and other learning materials on topics like data analysis and software development, and tools for reproducibility in science, such as extracting scientific facts from publications [38].

    The Tension Between Needs of Scientific Software and Open-Source Software in General The open-source software development model is widely held up as an approach for scientific software developers to follow. In both cases, communities who are geographically and organizationally dispersed write and maintain the software. Both depend on new contributors [20:179], but newcomers face both technical and social barriers to contribution. For instance, developers placing their first contribution into an open-source software project often have trouble using a project's libraries and frameworks, as well as finding able and willing mentors [3,34].

    However, directly applying the open-source model to scientific software development neglects important differences between scientific software and open-source software in general. First, scientists who build tools are serving their own, possibly idiosyncratic short-term needs [21]. As a result, there is little opportunity for other scientists to learn what tools are available. Second, even if the tools could be of much more value to a community if built in a particular way (for instance, using popular data structures and the newest versions of libraries and frameworks), the scientists building them often lack knowledge of these needs as well as the incentives to meet them [14,15].

    In contrast, reputation is an effective incentive for contribution in open-source software [31], where the number of followers a developer has is a signal of social status [7]. Third, unlike open-source software, much scientific software only remains active for the span of the research grants supporting it. Even if scientists understand the larger community's needs and work to serve them, the time scale of their available resources may not match the community's needs. Fourth, scientists who would be willing to invest the time to adapt, extend, and maintain these tools may be deterred by the lack of human infrastructure [24] and unfamiliarity with the codebase. It is daunting to learn about a code base to make useful modifications, especially when one has little or no connection with the code's authors. Furthermore, most scientists are never formally taught how to build, validate, and share software well [10,29].

    Fortunately, informal evidence suggests that hackathons may be a good fit for both the specialized problems of scientific software and open-source software in general. Interactions with other attendees and tutorials expose participants to new tools [25,39]. Setting the agenda of the hackathon provides an occasion to discuss and prioritize community issues and needs [18,22,39]. Incentives are built in; to do their work, scientists need tools, and will invest the time and resources to create them. Hackathons can be repeated, potentially bringing in new recruits [25,32] and renewing the human infrastructure over time. Finally, people new to a code base can get a gentle introduction, with hands-on experience and mentoring [5,22,25,32,39]. However, there is as yet little empirical support for these claims.

    Benefits of Face-to-Face Interaction Temporary collocation can speed up software development work that is normally coordinated remotely [27,37]. When team members are collocated, they can move visible artifacts, mark them to reflect mutually agreed-on changes, and easily consider issues and alternatives [27]. In a field study of an automobile company using radical collocation, an extreme form of collocation where all team members work a few feet from one another in the same room, Teasley et al. [37] found that the ability to overhear allowed team members to have informal training sessions and meetings around project artifacts. Radically collocated teams doubled their productivity compared with the previous company baseline.

    Organizational research shows that face-to-face meetings in the life of distributed teams can create and enhance social ties among team members. Starting a project with a face-to-face meeting may jumpstart this process because team members can develop a shared understanding of the work [13]. The literature recommends face-to-face meetings throughout a project's lifetime to maintain the social ties underlying professional relationships [26]. Hinds and Cramton [12] found that the effects of site visits, where team members travel to the location of their coworkers to spend time working and socializing with them, can be long-lasting. After returning home, globally distributed team members tended to be more responsive, communicate more, and disclose more information to one another.

    To summarize, the possible benefits of hackathons are: (1) using the affordances of temporary collocation to rapidly advance technical work; (2) using formal and informal communication to create awareness of community needs and thus facilitate extra work [40] outside the hackathon; and (3) using face-to-face interactions to build durable social ties. We aim to contribute to this body of knowledge by understanding whether and how hackathons achieve these benefits, and how to manage various aspects of design throughout the hackathon timeline to increase the likelihood of reaching desired outcomes.

    METHOD To address our research questions, we conducted a multiple-case study [42] of three hackathons applied to scientific software. Several considerations led us to this set.

    Our first criterion was to pick a clearly single-disciplinary hackathon and an interdisciplinary hackathon to see how design considerations could facilitate or hinder achieving balance between software developers' interests and scientists' needs, which is a key challenge in Science of Team Science1 research (e.g., [36]). For instance, we expected to see different mechanisms for achieving common ground between communities than when only one community was present.

    We found two hackathons meeting this criterion in OpenBio Codefest 2014 (referred to hereafter as OBC) and the 2014 NSF Polar DataVis Hackathon (referred to hereafter as PDV). OBC was a two-day hackathon aimed at giving developers of open-source bioinformatics software libraries such as Biopython and scientific workflow platforms like Galaxy a chance to be fully focused on their projects. Attendance was informal, with about 45 participants on the first day and 35 on the second. PDV was a two-day hackathon aimed at fostering collaboration between data visualization experts and polar scientists. Expected outcomes were novel, high-impact prototypes and visualizations. Attendance was more stable than at OBC, with the same 39 participants on both days. Because of the difference in collaborative orientations, PDV served as a theoretical replication [42] of OBC.

    1 Science of Team Science (SciTS) is a field aimed at understanding and improving the processes and outcomes of collaborative, team-based research [8].

    We sought a third case that would contrast with our original pair on other dimensions, serving as another theoretical replication. We selected the 2015 R PopGen Hackathon (referred to hereafter as RPG), a hackathon that aimed to foster an interoperating ecosystem of tools and resources for population genetics data analysis using the popular R platform. Whereas PDV comprised two different disciplines working on related problems, RPG comprised a single discipline. We therefore expected to see fewer mechanisms for developing common ground. Although both OBC and RPG comprised participants from a single discipline, OBC had primarily developers while RPG had different classes of users, including end users contributing use cases, end users with some programming experience wanting to learn how to develop reusable packages (the unit of code distribution in R), and pure method developers. We expected this contrast in roles and programming experience to be helpful in forming theory about the kinds of knowledge exchanged, and how it is exchanged during a hackathon. It should also reveal differences in how awareness of common needs arises compared with when only developers are present. Attendance for RPG was 28 participants, and in contrast to OBC and PDV, it was five days long.

    Data Collection We collected multiple sources of evidence, including event documentation (e.g., mailing list discussions, agendas, announcements, idea lists, and team progress reports) to understand planning practices, 71 hours of on-site observations (OBC = 17 hours; PDV = 17 hours; RPG = 37 hours) to understand event dynamics (e.g., how teams form around tasks), and 23 semi-structured interviews (Table 1) to understand in more detail the interactions we observed and the reasons behind them. At each hackathon we captured photographs of the event space, daily team stand-up reports, work breaks, technical sessions, and team meetings. The organizers of OBC and RPG allowed us to video record participant introductions, stand-up reports, and final demonstrations. We were unable to video record any portion of PDV due to legal and insurance requirements associated with the venue.

    ID   Team(s)                                         Role

    OBC
    P1   Seven Bridges                                   D
    P2   ADAM                                            D
    P3   Arvados, CloudBioLinux                          D
    P4   Arvados                                         M
    P5   Khmer, Galaxy                                   D
    P6   Arvados                                         D
    P7   ADAM                                            D

    PDV
    P8   Crawl Polar Data                                D, O
    P9   Visual Story, Temporal Vis                      U
    P10  Crawl Polar Data, Temporal Vis                  U
    P11  TangeloHub, GISCube                             D
    P12  Crawl Polar Data, Event Metrics                 D, O
    P13  Visual Story, Crawl Polar Data, Polar Imagery   D
    P14  Visual Story, Polar Imagery                     D
    P15  Crawl Polar Data, PolarHub                      D

    RPG
    P16  Community website                               D-U, O
    P17  Community website                               D-U
    P18  Streamline VCF data flow                        D
    P19  Outliers in multi-variate stats                 D-U
    P20  Simulation                                      U
    P21  Simulation                                      D
    P22  Estimating population size                      D-U
    P23  Outliers in multi-variate stats                 U

    Table 1. Summary of interview participants (D = developer; U = end user, often little to no development expertise; D-U = end user with moderate development expertise; O = organizing team; M = manager).

    In selecting interviewees we aimed for coverage across hackathon teams. For RPG, we looked across the spectrum of participant roles (Table 1), aiming to see examples of teaching and learning as well as end user feedback. Our on-site observational notes helped us develop probes around the motivations for concrete interactions, how they happened, and their results. We solicited participants by e-mail and interviewed them using either Skype or Google Hangouts. We interviewed one participant by phone. Interviews typically lasted just under an hour. A professional transcription services firm transcribed all interviews.

    We designed a pre-survey to understand participant expectations (i.e., "What would the ideal outcome of this hackathon be to you?"), tasks participants desired to work on (i.e., "Please specify one or more tasks you want to accomplish at the hackathon"), and preparation for those tasks (i.e., "What preparation did you do for the above tasks? Select all that apply." [list]). One week before each hackathon, the organizers e-mailed a link to our survey to all registered participants.

    We created a post-survey to assess whether, how, and why or why not outcomes matched expectations. The survey asked questions about participants' satisfaction with their team's work (i.e., "To what extent were you satisfied or dissatisfied with the work completed in your team?" from "Very dissatisfied" to "Very satisfied"), reasons for this (i.e., "What were the reasons for the extent to which you were satisfied or dissatisfied with the work completed in your team?"), perceived outcomes (i.e., "In your opinion, what were your most important outcomes of the event?"), and whether outcomes matched expectations (i.e., "Think about what your ideal outcome coming into the event was. To what extent was this outcome achieved?" from "Not at all" to "Perfectly"). On the last day of OBC, the organizer e-mailed a link to our post-survey to participants, resulting in a response rate of 68%. To achieve a higher response rate for PDV we handed out and collected paper copies of the post-survey after final demonstrations, but before participants left. This resulted in a 100% response rate. We tried this approach for RPG as well, but only a few participants returned paper copies to us before leaving the venue. We therefore immediately sent e-mails to participants with the link to the post-survey, and then sent reminders a few days later. The final response rate was 75%.

    Finally, we obtained work artifacts (e.g., presentation slides, committed source-code changes) throughout each hackathon in order to triangulate on our qualitative data and compare outputs from each hackathon.

    Data Analysis We applied standard qualitative analysis techniques [4] to our interview transcripts, observational notes, and event documentation. We first imported these materials into the Dedoose qualitative data analysis software [33]. Three of the authors independently conducted open coding on the text about activities before, during, and after each hackathon, differences among them, and hackathon outputs.

    In the next phase of analysis we wrote, shared, and discussed descriptive memos about emerging themes in the data. We used the video recordings to corroborate and augment our observational notes. We met weekly to unify, refine, and collapse codes where there was commonality, using themes from our memos as support. We applied the resulting set of codes to the remaining data, adding codes when necessary. We continued this process until theoretical saturation.

    RESULTS We group hackathon activities under three primary stages: preparation, execution, and follow-through.

    Preparation

    Idea Brainstorming Participant engagement begins a few weeks before the hackathon. Participants brainstorm ideas for hackathon tasks using information and communication technologies (ICTs) that organizers have provided. Common among them is the ability to communicate asynchronously via text in a way that is publicly viewable. For instance, OBC attendees used a shared Google Document to indicate their interest in different ideas. Because most attendees were affiliated with software projects, they had identified tasks in the usual ways, e.g., directly from customers and the issue tracker. As such, the content on this page tended to be scant, with only a few phrases documenting the idea. In contrast, PDV and RPG attendees used GitHubs Issues feature to propose and discuss ideas. GitHub issues are a way to keep track of tasks, enhancements, and bugs. After someone opens an issue, a linear discussion view allows anyone to comment on the issue or reply to other comments.

    The general process of evolving an idea is as follows. First, a participant posts a textual description of their idea, often in the form of a use case. The participant may also say something about their general strategy for implementation of the idea, as well as provide source-code and hyperlinks to supporting datasets and technologies.

    After the idea is proposed, other participants start to ask for clarifications about the use case provided and details of the datasets and technologies suggested. People from different disciplines are involved here, typically in suggesting potentially useful technologies with which they have familiarity (P8, P9, P11). For instance, a computer scientist may reply asking for a description of what each column represents in a dataset that a domain scientist provided. Participants tend to begin making use of social networking functionality to direct attention to questions, e.g., "@[name1] ^^ please see @name2 question above" (P8), and to notify others who may have relevant expertise but are not yet participating in the discussion (P8, P11). As proposers clarify their ideas, other participants begin to understand their needs and their skillsets (P8, P19). For example, after a polar scientist (P10) clarified their dataset, a computer scientist was able to write a script to transform it into a format that visualization tools could readily consume.
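    As an illustration, such a transformation script might look like the following minimal R sketch. The input file, column names, and output format are hypothetical, since the paper does not describe the actual dataset.

      # Minimal sketch (hypothetical file and column names): reshape a raw CSV
      # shared by a domain scientist into a tidy format that visualization
      # tools can ingest directly.
      raw <- read.csv("polar_observations.csv", stringsAsFactors = FALSE)

      # Keep only the columns needed for plotting and give them clear names.
      tidy <- data.frame(
        station  = raw$stn_id,
        date     = as.Date(raw$obs_date, format = "%Y%m%d"),
        variable = "sea_ice_extent",
        value    = as.numeric(raw$ice_km2)
      )

      # Drop rows with missing measurements and write the cleaned file.
      tidy <- tidy[!is.na(tidy$value), ]
      write.csv(tidy, "polar_observations_tidy.csv", row.names = FALSE)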

    As the idea becomes clearer, support begins to build. Participants may begin making positive comments in support of the idea (e.g., P8, P9, P19), but some critiquing goes on as well. For instance, participants point out that additional use cases should be considered (P8, P9, P23), or that more effective technical solutions exist than those proposed, e.g., more useful visualization layouts (P9). Participants, however, do not critique at length. Unless it is clear to proposers that addressing the critique will benefit a larger number of participants, they will not significantly alter the idea (e.g., merge it or remove it). Instead, participants defer to the team formation stage to make a decision (e.g., P8).

    While evaluating proposed ideas, other participants use hyperlinks to cross-reference related ideas previously posted and compare and contrast objectives (P11, P17). If the idea stems from a personal research or work need, proposers do tend to feel ownership and will advocate for it as a separate idea (P8, P19). In other cases, the idea may stem from the proposer's desire to learn about a particular technique or technology (P9) or to provide an obvious community service, e.g., a tool demonstration or a reference document of available software for doing analyses (P11, P17). With agreement from other participants, these ideas are merged with related ones.

    Brainstorming comes to a halt a few days before the hackathon. Proposed ideas are in various states. A few do not have any comments. Many others have been clarified, have participant support, and have suggested enhancements. Few ideas have concrete development objectives and task assignments because they are still just collections of use cases, technologies, and datasets. No ideas have been ruled out. The work of translating ideas into tasks begins on the first day of the hackathon.

    Learning about Tools, Datasets, and Research Profiles While participants are brainstorming ideas, they are simultaneously learning about tools and datasets that may address their own needs (P3, P9, P10), as well as the needs of others (P11, P20, P22). They use the @mention notation to bring potentially useful tools and datasets to others' attention. In some cases, references to these resources resolve proposed ideas outright because an existing solution is already in place (e.g., P9).

    Our results suggest that this process can help people from different disciplines characterize one another. For instance, P11, a developer of a scientific visualization tool who was looking to learn about open problems in the polar science community where his tool could contribute, told us:

    So that GitHub pre-meeting activity was helpful to me to orient, learn the problems that are interesting, to learn who the sum of the profiles of the researchers, Ah, this is a very visionary person who wants to do this. This is somebody who's providing data specific to this community. This is somebody who is doing experiments and using this. (P11)

    A practice unique to RPG was that organizers encouraged participants to introduce themselves using a mailing list set up for the event (P16). We found some evidence that this helps participants engage others when they arrive at the hackathon (e.g., P22, P23). For instance, even though P22 did not contribute to the task brainstorming discussions, he reached out to another participant doing related research, with whom he would eventually work at the hackathon.

    Alignment: Preparing Tools and Datasets In the days leading up to the hackathon, participants prepare tools (P4, P11, P14, P15) and datasets, as well as install software (P20) to be used during the hackathon. This generally involves making sure documentation is available and the code is in a buildable state (P4, P11, P14, P15).

    The ideas discussed during brainstorming serve as important inputs to this process. Developers modify their tools to address additional use cases brought up in discussions (P11). For instance, P11 added new features to demonstrate a workflow that used a dataset discussed in another idea page. Domain scientists ensure that their datasets are in formats that can be easily understood, queried, and processed at the hackathon (P9, P10). For instance, after receiving questions about the formatting standards used in her dataset, P10 updated her readme file to explain the different formats.

    One useful practice used in both PDV and RPG was the creation of a list of software that should be installed in advance to ensure efficient completion of the proposed tasks (P12, P18). The organizers of PDV provisioned an Amazon machine for which participants could obtain login credentials and on which they could install any software they wanted, which eliminated the barrier of having to install unfamiliar libraries and frameworks on one's own machine (P12).
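    For an R-focused event like RPG, such an advance-installation list might be distributed as a short setup script; the following is a minimal sketch in which the package names are placeholders, not the actual RPG list.

      # Minimal sketch of an advance-setup script (package names are
      # placeholders): install any required package that is missing, then
      # confirm that everything loads before the event starts.
      required <- c("devtools", "knitr", "testthat")

      missing <- setdiff(required, rownames(installed.packages()))
      if (length(missing) > 0) {
        install.packages(missing, repos = "https://cloud.r-project.org")
      }

      invisible(lapply(required, library, character.only = TRUE))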

    In general, social networking functionality such as @mentions allowed participants to quickly ask directed questions of other participants, receive updates about idea clarifications, and bring participants with relevant interests and expertise into the discussion. The ability to cross-reference other ideas helped participants identify and merge related ideas. However, there were some difficulties in using the ICTs provided. Some of the computer scientists at PDV suspected that the domain scientists did not propose ideas because they could not figure out how to use GitHub (P9, P12). This is likely to be a problem whenever multiple disciplines are brought together at a hackathon. Different disciplines are familiar with different tools, and some members of one discipline may be unintentionally excluded from participation by the medium selected.

    Execution

    Team Formation Ideas identified in the preparation stage seed team formation. Team formation is the very first activity on day one of the hackathon after brief introductions from organizers and participants.

    Our observations revealed three distinct team formation strategies. In the open shepherding style of OBC, most participants came to the hackathon already associated with a project (and therefore a team) since OBC's objective was to give these developers focused time on their projects. As a result, most participants by default sat with their colleagues. There were, however, free agents, attendees not associated with these projects. During individual introductions, the event organizer suggested matches between free agents and existing teams, and between teams.

    In contrast, PDV and RPG used project pitches, short presentations made by individuals to the group describing ideas intended for wider adoption. Most were based on ideas discussed in the preparation stage. Pitches were followed by questions and group interaction with the proposers, after which participants picked teams. In the selection by organizer style of PDV, participants indicated their interest in ideas by writing their names on flip charts (one flip chart per idea). The organizers selected a few ideas with high interest to work on first. Other high-interest ideas were reserved for later in order, according to the organizers, to balance them against ideas with lower interest. On the second day, participants were encouraged to work on different ideas to disperse participants' enthusiasm and energy across ideas (P8). Periodically the organizers walked around and determined which teams were complete and which needed more time. When teams were complete, new teams formed around remaining ideas.

    In the selection by attraction style of RPG, ideas that people got behind were de facto selected. Participants wrote down ideas they thought would be interesting, one idea per sheet. Participants then discussed their ideas with others sitting at their table, and each table was asked to pitch the most important idea. The organizer wrote this idea down on a chart and attached the relevant post-it notes. Each table used notes of a different color. This was repeated in round-robin fashion. If ideas from other tables were similar, the post-its were attached to the same chart (see Figure 1). Volunteers were then asked to stand next to the flip charts, and everyone else was free to wander around the room, discussing pitches, offering suggestions, and deciding how to fit in. In contrast to the selection by organizer style, teams in the selection by attraction style stayed together for the duration of the hackathon.

    Building Solutions As soon as team formation concludes, teams spend focused time on their projects for the remainder of the hackathon. Event spaces are configured to seat all participants in a single room, and to accommodate multiple teams (see Figure 2). According to survey responses, the average size of a team in OBC (n=31) was 4 (sd=1.8, low=1, high=8), in PDV (n=32) it was 7 (sd=2.5, low=2, high=14), and in RPG (n=19) it was 6 (sd=1.06, low=4, high=8).

    How Did Teams Work? Different working styles could be observed. Teams of developers with a priori development targets (e.g., addressing the backlog of issues in the project tracker) did not generally need input from participants outside of their teams. Therefore, they often worked independently seated with their colleagues. Bursts of independent work were followed by team discussions of the code they were writing (P2, P7). These face-to-face discussions supplemented the more formal code review that occurs in open-source software development:

    We have a bit of informal discussion at the table. Normally we mark up each others' [source-code changes] pretty heavy. But since everybody was sitting right around the table it was a lot easier just to say, Oh, I would change this. Oh, I would change that, instead of actually commenting directly on the [source-code changes] (P2).

    Another common pattern was pairs of individuals communicating while their other team members worked independently (P3, P18, P20, P22, P23). Here we saw team members talking through their ideas with one another, and showing each other errors and successes on their screens.

    What Was the Role of End Users? We observed a generative approach to requirements gathering. To take advantage of the expertise in the room, developers on teams would implement a series of new features, ask attendees in other teams to try them out, and have those same attendees approach them throughout the event to address issues and bugs they encountered in use (P3, P5). We also observed that some teams had end users who would approach other teams in the hackathon to clarify use cases or needs (P20, P21). In team discussions about design of the tools, end users and developers realized that certain use cases were unclear (e.g., what format the data is in when the tool reads it). Team members decided that while developers wrote code, end users should initiate these conversations with other teams. Other teams expected end users to approach them with such questions because the teams looking to clarify needs would announce that they needed input during their daily progress reports to all participants.

    Figure 1. An idea with high interest from participants (left) and low interest (right).

    Figure 2. One of the event spaces, with separate tables for each team.

    Our observations revealed end users working to keep task boundaries clear. This happens because of the short-term nature of the hackathon; participants need something to show by the end of it. For example, one participant working on a project to help users build simulations of population genetics data told us about her role in staving off developers' desire to build a user interface:

    That was sort of, I guess, at least two to three times a day there would be us talking about [a user interface] and like, oh, yeah, yeah, I could have done that, and I would just say, ok, so does that help with our immediate task? (P20)

    How Did ICTs Support Collaboration? Certain technology needs to be in place to support technical work. For instance, participants need version control to capture and merge contributions. They need shared documents (e.g., wikis) to keep track of individual assignments and progress, and shared repositories to store these documents, as well as datasets, relevant publications, and documentation.

    Overall, GitHub worked well because it provided integration of these technologies. Even developers, however, talked about the learning curve associated with the GitHub workflow (e.g., P12, P13, P18, P23). For instance, P18 and P23 recalled that their team members would occasionally overwrite each other's code changes while working. However, participants initially unfamiliar with GitHub (e.g., P23) acknowledged that using it in the preparation stage to brainstorm ideas helped lower the barrier to use during execution. Participants used many of the same features in both stages (e.g., posting issues, authoring on the wiki). This suggests that using consistent ICTs in preparation and execution is an important design consideration for advancing technical work.

    Scientists developing software that analyzes large datasets (on the order of gigabytes) require storage support beyond what is currently provided by state-of-the-art version control. Members of the RPG "Outliers in multi-variate stats" team, for instance, found that storing datasets on GitHub slowed down their package. Moving the datasets to a shared Google Drive folder eliminated this problem (P19, P23).
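    A minimal R sketch of the workaround this team describes: instead of committing a large dataset to the Git repository, the analysis code downloads it on first use from shared storage. The URL and file name below are hypothetical.

      # Minimal sketch (hypothetical URL and file name): fetch a large dataset
      # from shared storage on first use instead of committing it to the repo.
      fetch_dataset <- function(dest = "genotypes.vcf.gz",
                                url  = "https://drive.google.com/uc?export=download&id=FILE_ID") {
        if (!file.exists(dest)) {
          download.file(url, destfile = dest, mode = "wb")
        }
        dest
      }

      # Later, in the analysis code:
      path <- fetch_dataset()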

    Types of and Satisfaction with Outputs. In contrast to OBC and RPG teams, we found that PDV teams generated many discussions of collaboration plans but few novel software prototypes (Figure 3). PDV teams were also less satisfied with their technical output. Only 66% (21/32) of PDV participants were satisfied or very satisfied with what was achieved in their team. In contrast, 81% (25/31) of OBC and 86% (18/21) of RPG participants were satisfied or very satisfied with technical work achieved in their team.

    Interviews and open-ended survey responses revealed that this stemmed from not having clear team objectives, and was exacerbated by the team formation strategy chosen. Because the inputs needed from each discipline were not clear, polar scientists and data visualization developers were uncertain how they could concretely contribute (P9, P13, P14). As a result, participants joined teams based primarily on their interests; polar scientists joined teams working on ideas proposed by other polar scientists, and data visualization developers joined teams working on ideas proposed by other developers. This led to relatively homogeneous teams. When some teams dissolved mid-day, continuing teams had to rehash previous discussions for newcomers (P9), leaving little time to take ideas from concept to realization.

    Figure 3. Average number of source-code commits (left) and discussions (right) per day two weeks before, during, and after each hackathon. For PDV and RPG we show unique comments posted to the GitHub issue tracker. In contrast, because OBC participants used a shared Google Document, we show unique edits to that document extracted from the revision history.

    Practices                  Knowledge Exchanged

    Bootcamps                  - How to download, install, configure, and use software
                               - Programming conventions and practices

    Tool demonstrations        - Datasets and tools under development
                               - How to download, install, configure, and use software
                               - How to construct workflows
                               - User needs

    Watching others code       - Data structures
                               - Programming conventions and practices

    Round-robin discussions    - Datasets and tools under development
                               - User needs

    Table 2. Knowledge sharing practices and kinds of knowledge exchanged.

    Knowledge Sharing We found seven different types of knowledge shared, and four supporting practices (Table 2). One important type of knowledge that we expected to find, and did find, was knowledge about user needs. For instance, while demonstrating a metadata search tool for polar data, the developer of the tool learned:

    [Polar scientists] want to be able to search on [a single meta data attribute on a file] and they want to be able to say like give me all the files from this specific data set or this slice of the data set that have cumulus clouds and over this region, and the cumulus clouds attribute is right down nested within the data set. It's not like top level, it's not explicit, it's very implicit within the data set. So that was one thing that fall that was kind of a shortcoming that we've managed to address now (P12)

    A knowledge sharing practice that we observed only at RPG was the bootcamp, an interactive tutorial designed to get participants up to speed on a particular technology or codebase. Unlike OBC, where all participants were developers, and unlike PDV, where domain scientists were not necessarily expected to write code, all RPG participants were expected to collaboratively write and share software. Yet because of the spectrum of users to developers, the organizers anticipated that some participants would not have much familiarity with popular tools for software development (P16). As a result, the organizer of RPG ran a bootcamp where the topic was GitHub. Participants sat around the organizer with their laptops and the organizer projected his laptop on a screen. The organizer went through the basics of setting up a Git repository, transferring it to GitHub, and then adding files to the repository. Participants followed along on their computers. Anyone with questions would shout out, and either the organizer or someone else with experience would answer the question, sometimes coming around to the asker's screen to examine the issue (Figure 4, left). The close proximity of bootcamps to other teams allowed team members to move freely in and out, depending on their interest and expertise in the topic (P17, P18, P19, P22).

    Bootcamps and tool demonstrations are similar in that they both teach how to download, install, configure, and use software, but there are important differences between them. Bootcamps focus on well-known tools that have gained widespread adoption (e.g., GitHub, R), teaching broad skills that will benefit most participants as they work. As a result they are typically held before teams start work. In contrast, tool demonstrations focus on tools developed within research labs to do specialized analyses. They are thus likely of interest to fewer scientists and occur within teams. They often use specific datasets and use cases that scientists provide during brainstorming activities in the preparation stage. Scientists need these resources to do their work. As a result, during tool demonstrations, scientists learn about datasets relevant to their work, and how to construct workflows using the tools. Scientists provide datasets and use cases that developers of the tools may not have originally anticipated, allowing developers (and end users to some extent) to learn more about user needs.

    Figure 4. Two different knowledge sharing practices: bootcamps (left) and watching others code (right).

    Watching others code allows participants to understand data structures and programming conventions and practices (P7, P10, P17, P22, P23) more effectively than learning on their own. For example, two team members using their own computers to go through a coding tutorial would look over each other's shoulders occasionally to see if their output matched and discuss errors (P23). To learn how to use particular frameworks and data structures, participants would go over to team members who were using those frameworks and watch over their shoulders as they coded. The more experienced team members would vocalize what they were doing and why they were doing it. Watching experts code worked so well because it was more effective than copying coding examples without having sufficient context to understand why they were useful, and at the same time it did not burden the experts to the point where they could not get anything done.

    Building Social Ties Participants build community both inside and outside of working on technical tasks. Intense teamwork under pressure allows participants to learn more about their collaborators' personalities, see how they react to problems along the way, and develop strong connections with them (P2, P5, P7, P18). For example, in the hours before final demonstrations, participants rush to integrate code they have been writing independently. Often, they must solve errors together, such as missing dependencies or each other's overwritten changes in the code repository. Participants told us that this intense collaboration lowers the barrier to future collaboration (P5, P6, P9, P13, P18, P22):

    I had some big asks about [Galaxy] workflows, how they do their workflows, and I knew it was just going to be more productive for me to start to build that working relationship in person, and that's exactly what happened. We understand each other's personalities and perspectives and what motivates us, and we can drop each other notes. (P5)

    Spending time together outside of the intense technical work (e.g., during coffee breaks, meals, and bus rides to the hackathon venue from the hotel) allows participants to learn about each other's interests and reflect on opportunities for collaboration. These informal discussions lead to collaboration plans for writing grant proposals (P9, P13), working on manuscripts (P15, P20, P22), and collaborating on source-code projects outside the purview of the hackathon (P2, P7, P18, P20, P21). For example, these discussions resulted in the creation of a new Working Group to create a tool definition and workflow language to make workflows portable across different platforms (e.g., Galaxy), so that users can easily move their workflows among these platforms and share them with other scientists.

    This objective was not identified a priori as a priority of any one team. Instead, it emerged from informal discussions among participants from multiple teams in a kitchen area adjacent to the main hackathon space.

    Responses to our surveys provide additional support for building upon existing social ties. We found that 68% (23/34) of OBC participants had previously worked remotely with other participants. Of these participants, 35% (8/23) responded that these relationships were now "a little better" and 57% (13/23) responded "much better", with only 9% (2/23) saying the relationships had not changed. Similarly, 63% (20/32) of PDV participants had previously worked together remotely. Of these 20 participants, 10% (2/20) described these relationships as "a little better" and 90% (18/20) described them as "much better". In contrast, only 43% (9/21) of RPG participants had previously worked remotely with others. All 9 described their relationships with these participants as "much better".

    The results for OBC seem obvious when considering that most participants at the hackathon worked on their primary open-source software projects; they would therefore naturally already have prior experience with their teammates. The number of PDV participants who had previously collaborated remotely initially surprised us, since the objective of the hackathon was to bring together two disjoint communities. From looking at participants' institutional affiliations and speaking with participants in interviews, it seems that some proportion of participants within each community had worked together previously. Because teams ended up being relatively homogeneous, participants strengthened these existing ties. The organizers of RPG, in contrast, sought to diversify participants along demographics and expertise within the same community (P16). Although participants described knowing of one another, e.g., from package documentation and mailing lists (P18, P20, P22), they had not worked together before, even remotely.

    Follow-through

    Reification of Ideas Teams often have a good idea about what the next steps are regarding their tasks because there are naturally some objectives that are incomplete and feedback that needs to be addressed. For instance, developers who give demonstrations of their tools can incorporate feedback given to them about important use cases (P11, P12, P15).

    OBC and RPG participants seemed quite motivated to continue working on hackathon tasks, and for these participants there were obvious motivations to continue the work. OBC participants had selected tasks that they knew would provide value to users. These included fixing reported bugs and implementing features requested by potential customers (P1, P2, P7). To do their work, RPG participants need tools. Making these tools interoperate more effectively, e.g., readily using the output of one tool as input for another or enhancing a tool to support multiple data formats, would reduce the extra work required in their daily use.

    Figure 3 provides evidence that these participants followed through, continuing to commit source code to their teams' repositories. Team members who had collaborated previously outside of the hackathon (e.g., P2 and P7, P18) were confident about being able to wrap up development work on their tasks fairly quickly. Several participants mentioned having regularly scheduled teleconferences with their team members to finish hackathon tasks (P1, P5, P6, P17, P20, P21), and arranging to meet at other hackathons to continue the work (P20, P21). Participants mentioned targeting source-code packages and manuscripts.

    Most notably, the working group that emerged from OBC is still meeting every two weeks via Google Hangouts (P1, P5, P6). To coordinate work and make decisions they use a Google Group. Since OBC, the group has grown to 99 members, and is still active 10 months after the event, averaging 41 posts per month. They have produced two drafts of a tool definition language for formally describing tools used in scientific workflows, as well as a reference implementation. They are also working on a publication.

    Reactions from PDV participants were mixed. On the one hand, the computer scientists who had demonstrated their tools were able to get some feedback that could address use cases that domain scientists provided (P8, P15). On the other hand, many teams comprising domain scientists never got to the point where concrete development tasks were proposed. As a result, there was nothing for the few computer scientists in these teams to follow up on, and no obvious incentive to do so:

    So [P9] thinks that he's going to get a proposal in second day's hackathon and I expect he will run that by me and I will have input to that. Other than that, I have no plans to follow up or maybe to see people in the future. (P13)

    Stimulation of User Engagement In the weeks following the hackathon, developers, especially those who gave tool demonstrations in the execution stage, reach out to potential users to see if and how they are using the tools (e.g., P7, P11, P12, P14). This generally involves addressing issues that users submitted during and after the hackathon and notifying them of the fixes (P12), or working with users to help them make their own source-code contributions (P5, P7, P12).

    Multiple developers expressed a desire to meet users face-to-face at future events to facilitate this process as well as to describe their project's roadmap going forward (P2, P7, P12, P14, P18), though we do not have much evidence for whether this actually happened. An exception was that, in a follow-up e-mail to us two months after the hackathon, P12 mentioned that at a recent research conference he met with some data archive managers to whom he had demonstrated his tool. The data managers were still using his tool, and using the additional feedback he received there, he was able to add more features to it.

    Maintenance of Social Ties At the same time that developers are reaching out to end users, attendees may be working to maintain and follow up on the relationships they built at the hackathon. This included continuing to give each other feedback on ideas for development features (P5, P21), exchanging resources like scripts and datasets (P10, P18), hashing out plans for manuscripts (P9, P15, P19, P20), and making site visits to give research talks and explore possibilities of future collaborations (P2, P7, P18).

Stages (left to right): Preparation | Execution | Follow-through

Knowledge Sharing: 1. Research profiles; 2. Datasets & tools; 3. Use, configure, install software; 4. Construct workflows; 5. User needs; 6. Programming conventions & practices; 7. Data structures

Technical Work: 1. Brainstorming tasks; 2. Building solutions

Community Building: 1. Establishing ties; 2. Maintaining ties

Table 3. Summary timeline of hackathon outcomes and component activities. We use ovals rather than straight lines to indicate that start and end times and extent of overlap with other activities are approximations.

DISCUSSION
Table 3 summarizes outcomes and component activities within the hackathon lifecycle. Surprisingly, some, such as learning about users' needs, actually begin to take shape before the face-to-face portion of the hackathon begins. Similarly, technical work and community building start in the preparation stage, but also continue into follow-through. Below, we discuss how differences in the kinds of disciplines included, team formation strategies, and classes of users mean there are likely tradeoffs among these outcomes.

Mixing Domain Scientists with Computer Scientists
In selecting our cases for this study, we expected to see differences in how stages were conducted when domain scientists were included in addition to computer scientists. At OBC, open-source software teams had mostly identified their tasks ahead of time, jotting them down in a shared document but holding no public discussion. Because participants had clear goals and expertise, they were able to make rapid progress on their technical work. However, not including domain scientists in brainstorming discussions or at the event meant that there were fewer opportunities to gain awareness of the needs of end-user domain scientists.

The interdisciplinary nature of PDV and the team formation strategy selected seem to have combined to produce many collaboration plans but limited technical progress. Compared with OBC, computer scientists were better able to learn about domain scientists' needs, since visualization developers learned about polar scientists' research questions and datasets during preparation. During execution, the organizers selected which tasks would happen when, without specifying the contributions needed from each discipline and without ensuring that teams included a mix of people from each. Moreover, when other teams stopped, continuing teams had to rehash discussions for the newcomers who joined them, cutting into the time available to develop software.

Including Different Classes of Users
By selecting RPG, we expected to see how including participants from across the spectrum from end users to developers would influence the types of knowledge exchanged, how it was exchanged, and how common needs would arise. We found knowledge sharing practices unique to RPG, such as bootcamps and watching others code, that can help incorporate newcomers and build community.

Including only developers at OBC seems to have advanced technical work, but to have come at the cost of not incorporating newcomers and not learning about users' needs. Although there was more training than at OBC, RPG participants were also able to make quite a bit of technical progress. Why was this the case? Watching others code resulted in participants learning about the programming conventions and practices needed for their work without burdening developers. The longer duration of this hackathon may also have helped offset any losses in productivity due to experts mentoring less experienced programmers. In general, however, there seem to be tradeoffs between technical progress, building community, and awareness of user needs.

Comparisons to Other Engagements
How do hackathons compare with other forms of engagement and professional knowledge exchange? On the one hand, they seem quite similar to other events. The formal exchange of knowledge in bootcamps is similar to tutorial workshops, such as Software Carpentry [41], which focuses on teaching researchers about software program design, version control, and testing. The close ties that participants form with their teammates resemble abbreviated versions of the bonds that form between Summer of Code students and mentors working remotely from one another [39]. The hands-on training participants receive is similar to the experiences of newcomers to Agile Sprints [32]. The informal conversations among the large number of participants at coffee breaks and dinner, some of which may lead to collaboration plans, resemble those of academic conferences.

On the other hand, hackathons seem quite different. In particular, the nature of the social ties that hackathon participants develop seems a level deeper than what can be expected from a tutorial, workshop, or academic conference. Our findings have shown that hackathon participants discuss their work with one another, observe each other working, socialize, and exchange knowledge. The pressure to produce before a deadline forces collaboration that leads to understanding people's personalities and working styles, resembling some of the antecedents to situated coworker familiarity [12]. At the same time participants are prototyping tools, they seem to be prototyping working relationships with their peers.

Future Work
We can think of several avenues for future work. One way to strengthen confidence in our findings would be to add a series of replications to our case study. For instance, we suspect that PDV's failure to deliver new prototypes was due, in part, to the team formation strategy chosen. However, a rival explanation might be that all interdisciplinary hackathons face this challenge: less common ground might have necessitated more discussion, which took time away from technical work. To rule out this rival explanation we might perform one or more theoretical replications of PDV with multi-disciplinary teams, but with different team formation strategies. A survey could also be used to augment our qualitative data on the prevalence of hackathons and the practices they use (i.e., planning, knowledge sharing, working styles), the reported experiences of participants, the perceived outcomes, and the relationships among these variables.

We also see a need for longitudinal studies that follow hackathon participants for months afterward, perhaps administering surveys, to assess the longer-term impact of the event. The impact may be more interesting in terms of the social relationships than in the maturation of hackathon prototypes into production-quality software. The reason is that participants are busy professionals with their own objectives, priorities, and responsibilities; it seems unrealistic to expect them to continue working toward hackathon goals when they are no longer supported to do so. In contrast, we found evidence that hackathon participants developed collaboration plans with their team members and with other attendees outside their teams. To follow up on our findings that hackathons seem to increase familiarity and build social ties, it seems useful to explore quantitative measurements of these constructs in surveys (e.g., [6,12]) and in co-authorship of contributions to scientific software source-code repositories, both before and after hackathons.
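As a concrete illustration of the repository-based measurement we have in mind, co-contribution ties could be approximated from commit histories in windows before and after an event. The sketch below is a minimal, hypothetical example: the repository path, date windows, and the heuristic of counting author pairs who edited the same file are our illustrative assumptions, not a procedure used in this study.

# Hypothetical sketch (not part of this study's method): approximate
# co-contribution ties from a git history before and after a hackathon.
# Repository path, dates, and the shared-file heuristic are assumptions.
import subprocess
from collections import defaultdict
from itertools import combinations

def commits(repo, since, until):
    """Yield (author, [files touched]) for each commit in a time window."""
    log = subprocess.run(
        ["git", "-C", repo, "log", "--since=" + since, "--until=" + until,
         "--name-only", "--pretty=format:@%an"],
        capture_output=True, text=True, check=True).stdout
    author, files = None, []
    for line in log.splitlines():
        if line.startswith("@"):          # author line produced by our format
            if author is not None:
                yield author, files
            author, files = line[1:], []
        elif line.strip():                # non-empty lines are changed files
            files.append(line.strip())
    if author is not None:
        yield author, files

def co_contribution_ties(repo, since, until):
    """Count author pairs that edited at least one common file in the window."""
    editors = defaultdict(set)            # file -> authors who touched it
    for author, files in commits(repo, since, until):
        for f in files:
            editors[f].add(author)
    ties = defaultdict(int)               # (author_a, author_b) -> shared files
    for authors in editors.values():
        for a, b in combinations(sorted(authors), 2):
            ties[(a, b)] += 1
    return ties

if __name__ == "__main__":
    event_date = "2014-08-01"             # hypothetical hackathon date
    before = co_contribution_ties("path/to/repo", "2014-02-01", event_date)
    after = co_contribution_ties("path/to/repo", event_date, "2015-02-01")
    print(len(before), "ties before;", len(after), "ties after")

Comparing the set of ties in the two windows, and correlating changes with survey measures of familiarity, is one way such data could complement interviews.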

Implications for Design
Our results have important implications for the kinds of technology support needed for hackathons. In the preparation phase, social networking functionality should be in place to allow participants to quickly ask directed questions of other participants, receive updates about idea clarifications, and bring participants with relevant interests and expertise to bear on the discussion. It should provide transparency that allows participants to find ideas, merge related ones, and suggest modifications to ideas they care about. Especially if the hackathon is multi-disciplinary, the functionality should provide transparency that allows interested parties to find out about others' research interests and the technologies and datasets they use.
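To make this preparation-stage functionality more concrete, the fragment below sketches one possible idea registry supporting tagging, merging of related ideas, and matching questions to interested participants. The class and method names are hypothetical illustrations, not a description of any existing platform.

# Hypothetical sketch of preparation-stage support: an idea registry that lets
# participants post ideas, discover related ones by tag, and merge duplicates.
from dataclasses import dataclass, field

@dataclass
class Idea:
    title: str
    proposer: str
    tags: set = field(default_factory=set)       # e.g., {"visualization", "netCDF"}
    followers: set = field(default_factory=set)   # participants who want updates
    comments: list = field(default_factory=list)

class IdeaBoard:
    def __init__(self):
        self.ideas = []

    def post(self, idea):
        self.ideas.append(idea)
        return idea

    def related(self, idea):
        """Surface ideas sharing at least one tag, to encourage merging."""
        return [other for other in self.ideas
                if other is not idea and idea.tags & other.tags]

    def merge(self, target, duplicate):
        """Fold a duplicate idea into a target, keeping followers and comments."""
        target.tags |= duplicate.tags
        target.followers |= duplicate.followers | {duplicate.proposer}
        target.comments += duplicate.comments
        self.ideas.remove(duplicate)

    def interested_participants(self, expertise_tags):
        """Direct a question to participants following ideas with matching tags."""
        return {f for i in self.ideas if i.tags & expertise_tags
                for f in i.followers}

In a design along these lines, transparency would come largely from making the related-ideas view public, so participants can discover overlapping proposals and decide whether to merge them before the event.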

    In the execution phase, participants need version control to capture and merge contributions. They need shared documents (e.g., wikis) to keep track of individual assignments and progress, and shared repositories to store these documents, as well as datasets, relevant publications, and documentation. When possible, the technology should be consistent across preparation and execution so that participants do not have to learn a new technology or tool. A platform that integrates social networking functionality with software development seems ideal. When using a hackathon to link multiple disciplines, however, organizers should keep in mind that different disciplines have familiarity with different tools, and members of one discipline may be unintentionally excluded from participation. One potential solution here may be to align two different technologies in the same system by creating an application programming interface (API) that allows them to interoperate.
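As a rough illustration of this interoperability idea, one lightweight option is a thin bridge service that relays activity from a code-hosting platform into the discussion channel another discipline already uses. The sketch below assumes a GitHub-style issues webhook and a plain SMTP mailing list; the endpoint, payload fields, and addresses are placeholders rather than a recommendation of specific products.

# Speculative sketch: relay newly opened issues from a code-hosting webhook to
# a team mailing list so that discussion-oriented participants see development
# activity. Endpoint, payload shape (GitHub-style), and addresses are assumptions.
import smtplib
from email.message import EmailMessage
from flask import Flask, request

app = Flask(__name__)
MAILING_LIST = "hackathon-team@example.org"   # placeholder address

@app.route("/webhook/issues", methods=["POST"])
def relay_issue():
    event = request.get_json(force=True)
    if event.get("action") != "opened":       # only relay newly opened issues
        return ("ignored", 200)
    issue = event.get("issue", {})
    msg = EmailMessage()
    msg["Subject"] = "[hackathon] New issue: " + issue.get("title", "(untitled)")
    msg["From"] = "bridge@example.org"
    msg["To"] = MAILING_LIST
    msg.set_content("A new issue was opened:\n\n" + issue.get("html_url", ""))
    with smtplib.SMTP("localhost") as smtp:    # assumes a local mail relay
        smtp.send_message(msg)
    return ("relayed", 200)

if __name__ == "__main__":
    app.run(port=8080)

The point of such a bridge is not the specific tools but the design choice: rather than forcing every discipline onto one platform, a small amount of glue code can keep each group in its familiar environment while preserving shared awareness.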

Existing ICTs are likely to fall short when considering hackathons that include remote participants, whether individuals or whole teams. As others (e.g., [37]) have pointed out, the real benefits of collocation come from people being at hand. For hackathons, remote individuals would still need to be included in one of the teams. Current solutions for communication via cameras and microphones (e.g., Google Hangouts) are not ideal, as sharing documents (e.g., sketches, whiteboards) and brainstorming require the ability to fluidly shift one's visual attention. Knowledge shared via watching others code, overhearing group discussions, and tutorials is also likely to be hampered. Future studies of such hackathons are needed.

Implications for Funding Agencies
We view this paper as part of a growing body of work (e.g., [17,40]) on scientific collaboration that has important implications for funding policy. In particular, we focus on software written by scientists. Unless this software is maintained, it soon becomes useless. Yet as we noted previously, funding is generally limited to a specific research project. Even software that has generated much interest across multiple scientific communities can founder if the primary maintainers find it impossible to meet the evolving needs of users.

It is clear that funding agencies are interested in producing and sustaining software that advances scientific knowledge [35]. Beyond developing better indicators for tracking software usage (e.g., a Scientific Software Map), hackathons seem like a potential strategy. Indeed, the National Science Foundation sponsored PDV, suggesting genuine interest in the hackathon model. In the longer term we see opportunities for producing policy prescriptions on incorporating hackathons as an element of scientific software sustainability. For instance, hackathons could be included in the software maintenance plans of proposal applications, and policy prescriptions could help funders design and evaluate these plans. More careful empirical study, especially of design considerations and lasting impact, will be needed in the meantime.

CONCLUSION
In this paper we examined the stages a hackathon goes through as it evolves and how variations in how stages are conducted relate to outcomes. We identified practices across the preparation, execution, and follow-through stages of a hackathon that meet the specialized needs of scientific software. Differences in the kinds of disciplines included, classes of users, and team formation strategies suggest tradeoffs among technical progress, surfacing user needs, and building community. Surprisingly, quite a few activities begin to take shape before the face-to-face portion of the hackathon begins, and these have implications for the kinds of technology that need to be in place. Our hope is that, in addition to informing future empirical studies, our results bring attention to the hackathon model and raise the level of discussion in the scientific community about planning and conducting successful engagements.

REFERENCES
1. Mariva H. Aviram. 2015. JavaOne's Palm-sized winner: How 3Com stole the show, palms down. Retrieved May 11, 2015 from http://www.javaworld.com/article/2076473/mobile-java/javaone-s-palm-sized-winner.html

2. Gerard Briscoe and Catherine Mulligan. 2014. Digital Innovation: The Hackathon Phenomenon. Retrieved August 4, 2014 from http://www.creativeworkslondon.org.uk/wp-content/uploads/2013/11/Digital-Innovation-The-Hackathon-Phenomenon1.pdf

3. Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, and Sebastiano Panichella. 2012. Who is Going to Mentor Newcomers in Open Source Projects? Proceedings of the ACM SIGSOFT International Symposium on the Foundations of Software Engineering, ACM Press, 44:1–44:11. http://doi.org/10.1145/2393596.2393647

    4. Juliet Corbin and Anselm Strauss. 2014. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. SAGE Publications, Inc., Thousand Oaks, CA.

    5. Michael R. Crusoe and C. Titus Brown. 2014. Channeling Community Contributions to Scientific Software: A Sprint Experience.

6. Jonathon N. Cummings and Sara Kiesler. 2008. Who Collaborates Successfully? Prior Experience Reduces Collaboration Barriers in Distributed Interdisciplinary Research. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 437–446. http://doi.org/10.1145/1460563.1460633

7. Laura Dabbish, Colleen Stuart, Jason Tsay, and Jim Herbsleb. 2012. Social Coding in GitHub: Transparency and Collaboration in an Open Software Repository. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 1277–1286. http://doi.org/10.1145/2145204.2145396

8. Holly J. Falk-Krzesinski, Noshir Contractor, Stephen M. Fiore, et al. 2011. Mapping a research agenda for the science of team science. Research Evaluation 20, 2, 145–158. http://doi.org/10.3152/095820211X12941371876580

    9. HackerNest. 2014. DementiaHack TORONTO by the British Govt & HackerNest. Retrieved May 11, 2015 from http://www.eventbrite.com/e/dementiahack-toronto-by-the-british-govt-hackernest-tickets-12349265987?aff=estw

10. Jo Erskine Hannay, Carolyn MacLeod, Janice Singer, Hans Petter Langtangen, Dietmar Pfahl, and Greg Wilson. 2009. How Do Scientists Develop and Use Scientific Software? Workshop on Software Engineering for Computational Science and Engineering, IEEE Computer Society, 1–8. http://doi.org/10.1109/SECSE.2009.5069155

    11. Ross Harmes. 2008. Open! Hack! Day! Retrieved May 13, 2015 from http://code.flickr.net/2008/09/03/open-hack-day/

12. Pamela J. Hinds and Catherine Durnell Cramton. 2014. Situated Coworker Familiarity: How Site Visits Transform Relationships Among Distributed Workers. Organization Science 25, 3, 794–814.

13. Pamela J. Hinds and Suzanne P. Weisband. 2003. Knowledge Sharing and Shared Understanding in Virtual Teams. In Virtual Teams that Work: Creating Conditions for Virtual Team Effectiveness, Cristina B. Gibson and Susan G. Cohen (eds.). Jossey-Bass, San Francisco, CA, 21–36.

    14. James Howison and James D. Herbsleb. 2011. Scientific Software Production: Incentives and Collaboration. Proceedings of the ACM Conference on Computer-Supported Cooperative Work. http://doi.org/10.1145/1958824.1958904

15. James Howison and James D. Herbsleb. 2013. Incentives and Integration in Scientific Software Production. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 459–470. http://doi.org/10.1145/2441776.2441828

16. Xing Huang, Xianghua Ding, Charlotte P. Lee, Tun Lu, Ning Gu, and Sieg Hall. 2013. Meanings and Boundaries of Scientific Software Sharing. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 423–434. http://doi.org/10.1145/2441776.2441825

17. Steven J. Jackson, Stephanie B. Steinhardt, and Ayse Buyuktur. 2013. Why CSCW Needs Science Policy (and Vice Versa). Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 1113–1124. http://doi.org/10.1145/2441776.2441902

    18. Toshiaki Katayama, Mark D. Wilkinson, Kiyoko F. Aoki-Kinoshita, et al. 2014. BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains. Journal of Biomedical Semantics 5, 5, 5. http://doi.org/10.1186/2041-1480-5-5

    19. Pedram Keyani. 2012. Stay focused and keep hacking. Retrieved May 11, 2015 from https://www.facebook.com/notes/facebook-engineering/stay-focused-and-keep-hacking/10150842676418920/

    20. Robert E. Kraut and Paul Resnick. 2011. Building Successful Online Communities: Evidence-Based Social Design. MIT Press, Cambridge, MA.

21. Grace de la Flor, Marina Jirotka, Paul Luff, John Pybus, and Ruth Kirkham. 2010. Transforming Scholarly Practice: Embedding Technological Interventions to Support the Collaborative Analysis of Ancient Texts. Computer Supported Cooperative Work (CSCW) 19, 3-4, 309–334. http://doi.org/10.1007/s10606-010-9111-1

22. Hilmar Lapp, Sendu Bala, James P. Balhoff, et al. 2007. The 2006 NESCent Phyloinformatics Hackathon: A Field Report. Evolutionary Bioinformatics 3, 287–296.

23. Steven Leckart. 2015. The Hackathon Fast Track, From Campus to Silicon Valley. The New York Times. Retrieved May 11, 2015 from http://nyti.ms/1CawQxH

24. Charlotte P. Lee, Paul Dourish, and Gloria Mark. 2006. The Human Infrastructure of Cyberinfrastructure. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 483–492. http://doi.org/10.1145/1180875.1180950

25. Steffen Möller, Enis Afgan, Michael Banck, et al. 2014. Community-driven development for computational biology at Sprints, Hackathons and Codefests. BMC Bioinformatics 15, Suppl 14, S7. http://doi.org/10.1186/1471-2105-15-S14-S7

26. Bonnie A. Nardi and Steve Whittaker. 2002. The Place of Face-to-Face Communication in Distributed Work. In Distributed Work, Pamela J. Hinds and Sara Kiesler (eds.). MIT Press, Cambridge, MA, 83–110.

27. Judith S. Olson, Stephanie Teasley, Lisa Covi, and Gary Olson. 2002. The (Currently) Unique Advantages of Collocated Work. In Distributed Work, Pamela J. Hinds and Sara Kiesler (eds.). MIT Press, Cambridge, MA, 113–135.

    28. OpenBSD. Hackathons. Retrieved May 11, 2015 from http://www.openbsd.org/hackathons.html

29. Prakash Prabhu, Yun Zhang, Soumyadeep Ghosh, et al. 2011. A Survey of the Practice of Computational Science. State of the Practice Reports, 1–12. http://doi.org/10.1145/2063348.2063374

30. Mikko Raatikainen, Marko Komssi, Vittorio Dal Bianco, Klas Kindström, and Janne Järvinen. 2013. Industrial Experiences of Organizing a Hackathon to Assess a Device-Centric Cloud Ecosystem. Proceedings of the IEEE Annual Computer Software and Applications Conference, IEEE Computer Society, 790–799. http://doi.org/10.1109/COMPSAC.2013.130

31. Jeffrey A. Roberts, Il-Horn Hann, Sandra A. Slaughter, and John F. Donahue. 2006. Understanding the Motivations, Participation, and Performance of Open Source Software Developers: A Longitudinal Study of the Apache Projects. Management Science 52, 7, 984–999. Retrieved May 9, 2014 from http://pubsonline.informs.org/doi/abs/10.1287/mnsc.1060.0554

32. Anders Sigfridsson, Gabriela Avram, Anne Sheehan, and Daniel K. Sullivan. 2007. Sprint-driven development: working, learning and the process of enculturation in the PyPy community. In Open Source Development, Adoption and Innovation, J. Feller, B. Fitzgerald, Walt Scacchi and A. Sillitti (eds.). Springer US, 133–146. Retrieved May 28, 2014 from http://cs.anu.edu.au/iojs/index.php/ifip/article/view/11308

33. SocioCultural Research Consultants, LLC. 2014. Dedoose Version 5.0.11, web application for managing, analyzing, and presenting qualitative and mixed method research data. Retrieved from http://www.dedoose.com

34. Igor Steinmacher, Marco Aurélio Gerosa, and David F. Redmiles. 2015. Social Barriers Faced by Newcomers Placing Their First Contribution in Open Source Software Projects. Proceedings of the ACM Conference on Computer-Supported Cooperative Work & Social Computing, ACM Press, 1379–1392. http://doi.org/10.1145/2675133.2675215

    35. Craig A. Stewart, Guy T. Almes, and Bradley C. Wheeler. 2010. Cyberinfrastructure Software Sustainability and Reusability: Report from an NSF-funded Workshop. Indiana University, Bloomington, IN. Retrieved from http://hdl.handle.net/2022/6701

36. Daniel Stokols, Kara L. Hall, Brandie K. Taylor, and Richard P. Moser. 2008. The Science of Team Science: Overview of the Field and Introduction to the Supplement. American Journal of Preventive Medicine 35, 2 Suppl, S77–89. http://doi.org/10.1016/j.amepre.2008.05.002

37. Stephanie Teasley, Lisa Covi, M.S. Krishnan, and Judith S. Olson. 2000. How Does Radical Collocation Help a Team Succeed? Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 339–346. http://doi.org/10.1145/358916.359005

    38. Kaitlin Thaney. 2014. The #mozsprint heard round the world. Retrieved May 11, 2015 from http://www.mozillascience.org/the-mozsprint-heard-round-the-world/

39. Erik H. Trainer, Chalalai Chaihirunkarn, Arun Kalyanasundaram, and James D. Herbsleb. 2014. Community Code Engagements: Summer of Code & Hackathons for Community Building in Scientific Software. Proceedings of the ACM Conference on Supporting Group Work, ACM Press, 111–121. http://doi.org/10.1145/2660398.2660420

40. Erik H. Trainer, Chalalai Chaihirunkarn, Arun Kalyanasundaram, and James D. Herbsleb. 2015. From Personal Tool to Community Resource: What's the Extra Work and Who Will Do It? Proceedings of the ACM Conference on Computer-Supported Cooperative Work & Social Computing, ACM Press, 417–430. http://doi.org/10.1145/2675133.2675172

41. Greg Wilson. 2006. Software Carpentry: Getting Scientists to Write Better Code by Making them More Productive. Computing in Science & Engineering 8, 66–69.

    42. Robert K. Yin. 2014. Case Study Research. SAGE Publications, Inc., Thousand Oaks, CA.

