The Design and Development of a Cloud-based Platform Supporting Team-oriented Evidence-based Reasoning: SWARM Systems Paper

R.O. Sinnott, C. Bayliss, C. Guest, G. Jayaputera, G. Karami, J. Kim, Y. Pan, R. Susanto, D. Vu, I. Widjaja, Z. Zhao
School of Computing and Information Systems, University of Melbourne, Australia

R. de Rozario, E. Silver, S. Thomman, T. van Gelder, School of Biosciences, University of Melbourne, Australia
Y. Ades, T. Dwyer, K. Marriott, M. Schwarz, Faculty of IT, Monash University, Melbourne, Australia

Contact author email: [email protected]

Abstract

The Smartly-assembled Wiki-style Argument Marshalling (SWARM) project commenced in 2017 as part of the US Intelligence Advanced Research Projects Activity (IARPA) funded Crowdsourcing Evidence, Argumentation, Thinking and Evaluation (CREATE) Program. The SWARM project has developed an online platform allowing groups to produce evidence-based reasoning. This paper provides a summary of the core requirements and rationale that have driven the SWARM platform implementation. We present the technical architecture and associated design implementation. We also introduce the core capabilities designed to encourage user interaction and social acceptance of the platform by its crowds of users.

1. Introduction

The Internet is awash with data and opinions. This data is increasing exponentially and will likely continue to do so for the foreseeable future [1]. There have been enormous advances in the ability to harness and utilize such data through advanced computational techniques and access to large-scale infrastructures leveraging, for example, high-performance computing and Cloud resources. The term “big data” has now entered the mainstream vernacular as the ability to derive understanding and knowledge from this data deluge [2]. This is often through advanced data mining, machine learning and information retrieval involving multiple terabytes of data. However, whilst the technologies around big data processing have seen fundamental advances in computational abilities, the ability to support reasoning and understanding has not significantly progressed.

The IARPA-funded CREATE program seeks to address this issue. Specifically, the CREATE program focuses on the development and evaluation of platforms that use crowdsourcing and structured analytic techniques to improve analytical reasoning, with specific focus on the demands of the Intelligence Community. The CREATE program commenced in 2017. It involves four funded projects that are developing systems targeted specifically at supporting improvements in evidence-based reasoning. The four teams are: TRACE – Trackable Reasoning and Analysis for Collaboration and Evaluation – which focuses on the development of a web-based application that uses crowdsourcing to overcome common shortcomings in intelligence work by improving the division of labor and reducing both the systematic and random errors individuals generate, while promoting communication and interaction among teams; Co-Arg – Cogent Argumentation System with Crowd Elicitation – which focuses on the development of a software-based cognitive assistant for intelligence analysts that tests hypotheses, evaluates evidence, sorts facts from deception and provides intelligent reasoning about potentially rapidly evolving situations; BARD – Bayesian Argumentation via Delphi – which uses causal Bayesian networks as the underlying structured representations of argument analysis and augments this with automated Delphi methods to bring groups of analysts to a consensus-based analysis; and SWARM – Smartly-assembled Wiki-style Argument Marshalling – which focuses on the development of a user-oriented, web- and Cloud-based platform that supports crowd-based reasoning, with specific focus on supporting end users and achieving improved, collective reasoning.

The CREATE program itself represents a 4.5-year effort comprising three phases. Phase 1 of the program (January 2017 - September 2018) focuses on the development of the core platforms and their evaluation by independent teams of users. The questions (problems) posed in Phase 1 are typically smaller-scale, constrained problems, i.e. where all of the information needed is provided in the problems that are set. The projects are now entering the final part of Phase 1. The ultimate aim of the platforms in Phase 1 is to clearly demonstrate that they provide a

Proceedings of the 52nd Hawaii International Conference on System Sciences | 2019

URI: https://hdl.handle.net/10125/59481
ISBN: 978-0-9981331-2-6 (CC BY-NC-ND 4.0)

significant improvement over other approaches, e.g. the use of Google Docs and Google Hangouts for collaborative editing and communication. Those CREATE platforms that meet these criteria will proceed to Phase 2, where the problems posed are expected to be more complex and potentially include unconstrained problems, e.g. involving external resources and data. This paper focuses on the rationale that has driven the design and development of SWARM.

2. Related Work

There has been an extensive body of research undertaken into improving evidence-based reasoning with specific focus on the needs and demands of the Intelligence Community [3]. [4] identifies a range of techniques that can be used to tackle issues faced by intelligence analysts in their daily reasoning activities. These include approaches to challenge judgments, identify mental mindsets, stimulate creativity and manage uncertainty. However, whilst [4] identifies a range of techniques that can be used by analysts, there is a dearth of actual evidence that any given structured technique actually improves on-the-job reasoning. Whilst some techniques have been shown to promote good thinking in other contexts, e.g. there is evidence that argument mapping can improve critical-thinking skills [5], the adoption of any specific tool or approach by the intelligence community has not materialized. There are many potential explanations for this: the suitability of any given tool for the problem at hand; the demands required to master particular tools; the fact that tools may not be suited to the particular workflow or process that reflects the actual daily needs of analysts; and the requirement for tools that are general purpose and can be used for different scenarios depending on the problem at hand. For many analysts, the workflow around their reasoning and collaboration is primarily based upon drafting documents, e.g. Word documents, and iterating with peers/experts for feedback/comments. Such an approach neither lends itself to improvements in reasoning nor benefits from the input of other experts.

There are, however, many examples of systems that have evolved to become de facto places for expert opinions and answers [6]. These are not based upon any given expert or analyst, but upon garnering collective expertise through crowdsourcing [7] [8]. Wikipedia is one of the major web-based platforms that has leveraged the global community [9]. Whilst not specifically focused on reasoning, it is clear that Wikipedia has established resources that have shaped global knowledge based on the wisdom and knowledge of (global) crowds. The wisdom of crowds has been studied by numerous researchers [10] [11] [7] [12] [13], yet the translation into mainstream software solutions that can be adopted directly by the Intelligence Community has failed to occur. Because platforms such as Wikipedia are focused largely around factual information, they do not directly lend themselves to the typical problem-based requirements facing the intelligence community.

Platforms such as StackOverflow provide many features that are highly desirable when leveraging the wisdom of the crowd, and these are more closely aligned with the needs of the Intelligence Community. Within StackOverflow, any user can post any technical question to the platform, e.g. how to tackle a specific software problem. It is noted that the StackOverflow model has now been rolled out to support many other disciplines. Following a given posting, any user can post a potential answer to the question, and all users are able to comment and vote on each answer. Depending on the ratings of all users, the best (most highly rated) answer floats to the top and in that sense constitutes the crowd's answer to the question. This approach is highly successful, and StackOverflow is heavily used by many software developers globally as the place to have software issues tackled. This depends greatly on the reputation system, which reflects author contributions and ensures that the community self-monitors any potential gaming of the system, e.g. users rating each other to increase their ranking on the platform [14]. For many technical developers, their ranking on StackOverflow can be used both as a token of esteem and to directly aid career development. These are many desirable features for a platform tackling potentially arbitrary questions from the community at large. The fact that such solutions have been adopted by mainstream software developers is testament to their success. Such recognition has driven the core SWARM user requirements; however, there are some significant differences, which we enumerate below.
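The "floats to the top" mechanism described above can be sketched in a few lines. This is a minimal illustration of vote-based ranking, not StackOverflow's (or SWARM's) actual scoring code; the data structure and field names are assumptions.

```python
# Minimal sketch of crowd-voting-based answer ranking: the answer with the
# highest net vote count "floats to the top" and becomes the crowd's answer.
# Field names here are illustrative assumptions.

def rank_answers(answers):
    """Return answers sorted by net score (upvotes - downvotes), best first."""
    return sorted(answers, key=lambda a: a["upvotes"] - a["downvotes"], reverse=True)

answers = [
    {"id": "a1", "upvotes": 3, "downvotes": 1},
    {"id": "a2", "upvotes": 10, "downvotes": 2},
    {"id": "a3", "upvotes": 5, "downvotes": 0},
]
best = rank_answers(answers)[0]  # "a2" has the highest net score (8)
```

Real platforms layer reputation weighting and anti-gaming measures on top of this simple ordering, as the discussion of the reputation system above notes.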

3. SWARM Requirements

The SWARM platform has been developed through a highly agile software development methodology. The technical team has implemented numerous versions of the platform that have been iteratively developed, with feedback and ideas shaped continually by the domain scientists and experimental teams working on SWARM. The earliest prototype focused largely on argument mapping, allowing users to tag their inputs as hypotheses, evidence, arguments for/against, assumptions etc. This version of the platform was

delivered within the first few months of the project. It was rapidly identified that this approach had major issues in how teams perform and interact. The project thus pivoted in its core approach. Ultimately, it was recognised that the success of SWARM would be based on the development and delivery of a platform that encourages users to engage, rather than forcing them down an argument-mapping route. The platform should support their activities, encourage teamwork and minimise the technological demands that too often place a hurdle in the adoption of a given technology. The specific criteria for SWARM included:

Contribute reports and collectively select the best report. SWARM users should initially draft their proposed responses to a given problem independently of any other user's work. The individual drafting process should allow users to get their thoughts in order and develop their argument before opening their work up for review. Once a user publishes their response to a given question, other users can comment and propose improvements to it. This iterative team-based process underpins SWARM. At the end of the lifecycle of a given problem that is set, the highest-rated report is selected as the team's submission. Users should also be able to contribute snippets (resources) that aid the team's work, e.g. diagrams or analysis, that may not form part of the final report directly but serve as an aid to improving the team effort.

Feedback and review. SWARM's rating system needs to be designed so that users can give fine-grained feedback to the authors of any contribution. This helps contributors figure out which aspects of their reports need more work. Users can comment directly on reports, or discuss them informally via chat. These features encourage users to improve their work by incorporating suggestions and feedback. Users should not be able to rate their own contributions.

Engagement. SWARM needs to be designed to support users who are solving difficult and often dry, fictitious problems. Given this, boosting participant engagement and motivation is a major design requirement. As such, the SWARM platform is required to make the user experience as smooth as possible. This should include a clean and intuitive user interface and solutions that encourage engagement at any time.

Social warmth. For team-based activities, SWARM recognises that there is a clear need to support social interactions between users to foster social warmth and encourage engagement. Chat and comments both promote social interaction. Users should be able to tag each other in chat and send/receive notifications. If they wish to share credit when another user has provided helpful feedback, they can nominate other users as co-authors of their reports. Sharing credit promotes the spirit of collaboration.

Few constraints on users. SWARM should not be designed to replace or automate human reasoning. It should not ask users to use any particular structured technique, which might constrain them to only solve a subset of possible problems or dissuade them from engaging due to the effort required to learn to use any given tool. Rather, SWARM should be designed to support, encourage and motivate people to work hard on a wide range of problems. Users should be able to contribute their work in the format with which they are most comfortable. SWARM should provide users with a rich text-editing environment, supporting the ability to drag and drop images and allowing them to incorporate work from potentially many diverse tools.

Access to support for structured analytic techniques and other problem-solving tools. While SWARM users should not be constrained to use any particular technique, they should be supported in the use of a variety of different techniques. A wide array of “lenses”, i.e. ways of viewing and approaching problems, should be offered. Users should be encouraged to use the tool that best fits the problem at hand without any mandate to adopt a given solution. The SWARM platform should incorporate some core supporting tools, but also provide a rich compendium (Lens Kit) of tools that users might find beneficial. This should include basic training and education. These Lens Kit tools are external to the core SWARM platform, but the results of applying them can be included in the platform, e.g. as charts or graphs.

Anonymity. Users should be provided with pseudonyms and avatars to allow them to express opinions more freely, and to rate each other's work without being prejudiced by their opinion of the individual. Ratings should also be kept anonymous; users can see how many other users rated their work, but cannot see who did so, allowing users to rate without fear of retaliation.

Support multiple large team sizes. SWARM should be designed for numerous, potentially large teams/crowds with no adverse impact on the performance of the platform. Through the container-based approach that has been taken to deliver and deploy the components, the system has been designed to scale horizontally across the Cloud [15]. Supporting multiple teams, and not just a single large crowd, makes it possible to compare platform use for improved reasoning across teams and to compare the results against control conditions, e.g. teams just using Google Docs.

[Figure 1 shows the overall system architecture: the web Portal and native Android/iOS applications interact (via credentials/JWT) with the SWARM Core (BI-Logic, Data Access, Security Engine, Rating Engine), which integrates an external IdP, Messaging Services, Analytics and Reporting Services, and Event and Mobile Notification Services; mobile notifications are delivered via Firebase Cloud Messaging and Apple Push Notification using registered device IDs.]

Figure 1. Overall SWARM System Architecture

4. SWARM Architecture

The SWARM ecosystem is built around the concept of micro-services, as opposed to the development of a single monolithic application. Such an approach makes it possible to containerize the system components and hence promote system scalability. This is especially important with large numbers of users, and lends itself to Cloud infrastructures. In this section we identify the key SWARM components and their associated functionality. Figure 1 presents a high-level overview of the (current) SWARM system architecture.

As shown in Figure 1, SWARM is comprised of a number of key components. We enumerate these and outline their core functionality in the following sections.

The SWARM Portal is the main component that supports user interactions and the user experience as a whole on the SWARM platform. Users are grouped into teams, where each team can have its own set of problems. Given this, there is a clear need for authentication and authorisation capabilities in the platform through an identity provider (IdP). Users can discuss the posted problems with their peers as well as construct their own responses to the problems that are posted. As well as posting their own responses, users can comment on and rate other team members' responses. The Portal provides access to a range of other tools that have been developed. This includes an Argument Mapping / Graphical Tool, access to a Lens Kit of concepts and techniques that can help and guide users to reason and thus produce quality responses, as well as capabilities for probabilistic reasoning. All users are assigned an Australian-oriented, animal-based pseudonym on the platform, e.g. Dingo47, as well as an associated avatar. Users are discouraged from disclosing their actual identities to avoid any offline discussions. It is expected that all interactions and communications occur on the platform.

Given the ubiquity of mobile devices for accessing web content, SWARM also provides users with native mobile applications (Android® and iOS®). These mobile apps offer similar functionality to the Portal, but in a manner tailored to the limitations of the screen space of mobile devices. The development and delivery of mobile applications encourages engagement with the platform, e.g. users can access the content, be notified of updates or chat with other users at any time.

The SWARM Core provides the main subsystem of the SWARM ecosystem. This subsystem is comprised of several key components, including the Business Intelligence Logic, Data Access, Security Engine and Rating Engine. The SWARM Business Intelligence Logic component defines and executes the business rules that shape the behaviour of the SWARM platform, using JSON Web Tokens (JWT). There are a number of business rules governing user interaction within SWARM. A given problem lifecycle typically consists of a number of states. When a problem is posted by an administrator, it can be in one of three states: Frozen, Open or Closed. When a problem is in the Frozen state, users can only read it and construct their private responses, i.e. their responses cannot be made public until the state of the problem is changed to Open by the problem owner (the administrator). A problem does not remain active/open forever. Typically, problems are open for a certain period of time in which the users/analysts are encouraged to interact with one another to produce responses. Once the problem state is changed to Closed, users are unable to add responses or comments to the problem. Users can, however, still view closed problems. Events and notifications that are activated within SWARM are also governed by a set of business rules that are defined within the Business Intelligence Logic component. The granularity and frequency of these events and notifications are fully configurable.
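The Frozen/Open/Closed lifecycle described above amounts to a small state machine. The following is a minimal sketch under the rules stated in the text; the function and claim names are illustrative assumptions, not SWARM's actual code.

```python
from enum import Enum

class ProblemState(Enum):
    FROZEN = "frozen"   # responses may be drafted privately, not published
    OPEN = "open"       # responses, comments and ratings are public
    CLOSED = "closed"   # read-only; no new responses or comments

def can_publish_response(state):
    # Only an Open problem accepts public responses, per the business
    # rules described in the text.
    return state is ProblemState.OPEN

def can_view(state):
    # Problems remain viewable in every state, including Closed.
    return True
```

The transitions themselves (Frozen to Open to Closed) are triggered by the problem owner or by the configured open period elapsing.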

The SWARM Data Access component provides the data persistence layer of the system. This component is responsible for the marshalling/unmarshalling of objects prior to transmission to/from the database service. It provides a separation between the objects and the actual process of storing/retrieving data. Such a layer provides a degree of transparency over the underlying database and hence reduces the coupling between the system logic and the need for data serialization.
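A marshalling layer of this kind can be sketched as follows. This is an illustrative round-trip through a serialized form only; the Response type, its fields, and the in-memory store are assumptions standing in for SWARM's actual objects and database service.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Response:
    response_id: str
    author: str
    body: str

class ResponseStore:
    """Illustrative data-access layer: callers work with objects, and only
    this class touches the serialized (here, JSON) storage form."""

    def __init__(self):
        self._db = {}  # stand-in for the real database service

    def save(self, response):
        # Marshal: object -> serialized record
        self._db[response.response_id] = json.dumps(asdict(response))

    def load(self, response_id):
        # Unmarshal: serialized record -> object
        return Response(**json.loads(self._db[response_id]))
```

Because the rest of the system only sees `Response` objects, the underlying database or wire format can change without touching the system logic, which is the decoupling the text describes.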

The SWARM Security Engine is responsible for

access control, authentication and authorization. The Security Engine authenticates users through a local database and third-party identity providers such as Google®. For external identity providers, the OAuth2 framework [16] is used. Once authenticated, users are checked with regard to their associated permissions, e.g. to determine which group the user belongs to as well as their role. Users with admin roles have higher privileges compared to ordinary users. Such admin roles are able to publish new problems and define the duration of problems, for instance. Other users are typically only able to analyse, rate, chat and otherwise contribute to the production of responses to the problems that are set within their own team.
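The role- and group-based authorization just described can be sketched as a check over a decoded token payload. The claim names ("role", "group") and action names are assumptions for illustration; SWARM's actual JWT claims are not specified in the text.

```python
# Illustrative authorization check in the spirit of the Security Engine:
# after authentication, the user's role and group (read from the token
# payload) determine what they may do. Claim and action names are assumed.

ADMIN_ACTIONS = {"publish_problem", "set_problem_duration"}
USER_ACTIONS = {"respond", "rate", "chat", "comment"}

def is_authorized(token_payload, action, problem_group):
    if token_payload.get("role") == "admin":
        return action in ADMIN_ACTIONS | USER_ACTIONS
    # Ordinary users may only act within their own team's problems.
    return (action in USER_ACTIONS
            and token_payload.get("group") == problem_group)
```

In practice the payload would come from a verified JWT rather than a bare dictionary; signature verification is omitted here for brevity.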

The SWARM Rating Engine supports and manages the evaluation aspects of user contributions. It provides users with the ability to (privately) rate other users' responses to a given problem. Various forms of rating, such as a simple vote (like) or scale-based (0-100) ratings, are supported by the engine on different dimensions of the responses, as shown in Figure 2. More complex and detailed evaluation-based ratings leveraging more flexible, problem-related rubrics are also supported. The rubric can provide user insights on various issues of reasoning and argumentation. The Rating Engine also provides a means to reconcile and aggregate all user ratings for all responses to a particular problem. It supports wide-ranging modes of aggregation, from simple or weighted averages (over all ratings related to a specific response) to complex matrix-based aggregates (across all users and all responses within a problem). The result of this aggregation is used to select the best response for the problem of interest. The SWARM Rating Engine utilizes a number of strategies to rate and score responses, including: BandRating, RatingScore and RatingCount. Here, BandRating classifies responses based on the number of ratings a response receives: a response is classified in a higher band if the number of ratings it receives exceeds a certain threshold. The RatingCount, on the other hand, is the total number of ratings a response receives. The RatingScore is the overall score given to a response.
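The three measures can be illustrated with a simple aggregation over a list of 0-100 ratings. The band thresholds below are assumptions for illustration; the text does not specify SWARM's actual band boundaries, and SWARM also supports richer weighted and matrix-based aggregates not shown here.

```python
# Illustrative aggregation in the spirit of the Rating Engine:
#   RatingCount - how many ratings a response received,
#   RatingScore - here, a simple average of the 0-100 ratings,
#   BandRating  - a band derived from how many ratings were received.
# The band thresholds are assumed values, not SWARM's actual ones.

def aggregate(ratings, band_thresholds=(5, 20)):
    count = len(ratings)
    score = sum(ratings) / count if count else 0.0
    band = sum(1 for t in band_thresholds if count >= t)  # 0, 1 or 2
    return {"RatingCount": count, "RatingScore": score, "BandRating": band}
```

For example, a response with ratings [80, 90, 100, 70, 60] has a RatingCount of 5, a RatingScore of 80.0 and, under these assumed thresholds, a BandRating of 1.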

The SWARM Messaging subsystem is the backbone of the communication (messaging) functionality available in SWARM. With this functionality, users are able to discuss various aspects of a problem via an online chat facility. As a crowd-based application, this messaging service complements SWARM as it promotes interaction between users in the crowd. The messaging service is available to both the Portal and the mobile applications. The SWARM messaging service is an XMPP (Extensible Messaging and Presence Protocol)-based application server which enables near

Figure 2. SWARM Rating Score

real-time message exchange between client devices over the network.

Users in SWARM are grouped into teams, and thus the Messaging Service must also provide services that are aligned with this grouping. Users from a given group should not be able to see discussions/chats occurring in different groups. To achieve this, the Messaging Service relies on the authorization service provided by the Security Engine. When a user logs into the SWARM system, the access token provided by the Security Engine contains authorization information in its payload. Using this information, the Messaging Service is able to control message visibility and hence preserve the privacy of groups.
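The group-scoped visibility rule can be sketched as a filter over the token's group claim. The message and payload field names are assumptions for illustration; the real service enforces this at the XMPP server rather than as a list filter.

```python
# Illustrative group-scoped message visibility: the Messaging Service reads
# the team from the access token's payload and exposes only that team's
# chat. Field names ("group", "text") are assumed for this sketch.

def visible_messages(messages, token_payload):
    team = token_payload.get("group")
    return [m for m in messages if m["group"] == team]

messages = [
    {"group": "team-a", "text": "Any thoughts on hypothesis 2?"},
    {"group": "team-b", "text": "Draft report is up."},
]
token_payload = {"user": "Dingo47", "group": "team-a"}
# visible_messages(messages, token_payload) yields only team-a's chat
```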

The SWARM Reporting Service is charged with generating the final report. As a crowd-based reasoning platform, SWARM does not depend on an individual to produce the best report. Instead, it automatically selects the best response, as contributed to and rated by the team, to produce the final report. The process of selection is done by sorting responses based on their BandRating, RatingScore, RatingCount and, finally, their timestamp. Responses with a higher BandRating will be selected first, regardless of whether their RatingScore is lower than that of responses in a lower band. Such a strategy is used because it is believed that a response that has been rated, even lowly, by hundreds of people (i.e. with a higher BandRating) is likely to be more accurate and realistic than a report that has only been rated by a single individual, even if its RatingScore is high. If there is more than one response with an equally high BandRating, then the process of selection continues. In this second stage, the response with the highest RatingScore will be selected. If that still does not produce a single result, then the response with the highest RatingCount will be selected. Finally, if the previous process of elimination does not produce a result, then the time of the update to

a response will be used as the basis for deciding on the final report that is actually submitted.
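This selection cascade amounts to a lexicographic comparison, which can be sketched as a single sort key. The field names are illustrative, and the direction in which the timestamp tie-break is taken is an assumption, since the text does not specify it.

```python
# The selection cascade described above as one lexicographic key: higher
# BandRating wins, then higher RatingScore, then higher RatingCount, and
# finally the update timestamp breaks any remaining tie (taking the later
# update here is an assumption). Field names are illustrative.

def select_final_report(responses):
    return max(
        responses,
        key=lambda r: (r["BandRating"], r["RatingScore"],
                       r["RatingCount"], r["timestamp"]),
    )

responses = [
    {"id": "r1", "BandRating": 2, "RatingScore": 55, "RatingCount": 120, "timestamp": 3},
    {"id": "r2", "BandRating": 1, "RatingScore": 95, "RatingCount": 1,   "timestamp": 5},
]
# r1 is selected: its higher band outranks r2's higher score, matching the
# rationale that many ratings are more trustworthy than one high rating.
```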

The SWARM Event Notification Service supports the capture and preservation of events that have occurred in the system. Capturing and preserving events is important because it allows data analytics to be applied to gain insight from individual and team user behaviour when interacting with the system. This also allows for the identification of features requiring further development in the system to improve the overall user experience of SWARM.

Events in SWARM are typically generated by the core subsystem, which directly interacts with users via the Portal; events thus represent user activities. For certain user activities, e.g. when a user comments on a response or creates a response, an event is triggered (generated). The events generated are sent to the Event Notification Service, where they are preserved. The Event Notification Service also produces notifications on certain events. These notifications are sent to the relevant users, either via the web browser or via the mobile applications using push notification technology. For example, when a user posts a comment on another user's response, the latter will be notified immediately. Similarly, if a response is posted, the members of the related group will be notified. This encourages such members to analyze the responses and rate them accordingly. In order to accommodate Portal-based users who are not online, the Event Notification Service buffers notifications that could not be delivered. Re-delivery is attempted when the target users come back online and reconnect to the SWARM Portal. The Event Notification Service maintains a list of the browsers each user has used to connect to SWARM. When a notification needs to be sent to a particular user, a simple look-up extracts the list of browsers where the user is active, and the notification is then sent to the user. The Mobile Notification Service complements the Event Notification Service. Its main task is to manage the notification delivery service required for mobile devices. Unlike the Event Notification Service, which maintains a list of browsers associated with users, this subsystem maintains the unique IDs of mobile user devices. Thus, a single notification can be sent to multiple devices belonging to a single user. When a mobile application is used for the first time (after installation), it connects to the SWARM Core and registers itself by sending the device ID of the mobile application to the core. This device ID is then persisted in the database. Upon receiving a request from the Event Notification Service to send a notification to a user, this component performs a simple look-up to retrieve a list of devices associated with that user. The

information returned also contains the type of devices,either iOS or Android-based. Appropriate mobilemessages are then constructed accordingly accordingto the device type. These messages are finally set toeither Google’s FCM (Firebase Cloud Messaging) orApple’s APN (Apple Push Notification) platforms forfinal delivery to the mobile devices. It should be notedthough that the Mobile Notification Service does notbuffer notifications that failed during delivery, rather itrelies on the service of FCM and APN to redeliver thenotifications.
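The offline-buffering and look-up behaviour described above can be sketched as follows. This is a minimal illustration; the class and method names are invented, not taken from the SWARM codebase:

```python
from collections import defaultdict

class EventNotificationService:
    def __init__(self):
        self.online = defaultdict(set)   # user -> connected browser sessions
        self.buffer = defaultdict(list)  # user -> undelivered notifications

    def connect(self, user, browser):
        """Register a browser session and re-deliver buffered notifications."""
        self.online[user].add(browser)
        pending, self.buffer[user] = self.buffer[user], []
        for note in pending:
            self.notify(user, note)

    def notify(self, user, note):
        """Deliver to every active browser, or buffer until reconnection."""
        if self.online[user]:
            for browser in self.online[user]:
                self.push(browser, note)
        else:
            self.buffer[user].append(note)

    def push(self, browser, note):
        print(f"deliver to {browser}: {note}")

svc = EventNotificationService()
svc.notify("ann", "bob commented on your response")  # ann offline: buffered
svc.connect("ann", "firefox-1")  # buffered notification is now delivered
```

The same shape applies to the mobile side, with device IDs in place of browser sessions and delivery delegated to FCM/APNs instead of being buffered locally.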

The SWARM Analytics subsystem provides insights into user behaviour within SWARM based on the analysis of user activities. This can include the creation of responses; usage of the argument mapping tools; use of the Lens kit; comments posted on other team members' responses; events; message contents; usage of the web and mobile applications, as well as any demographic information. By analysing the interactions between users in the platform and in the chat facility, the SWARM platform is able to identify which users are most active in a group and how teams interact more generally. Such information can be used to tackle attrition, e.g. for users that have not engaged for some time, notifications can be sent to encourage them to re-engage.
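As a concrete illustration of the attrition use case, the sketch below flags users with no recorded activity inside a chosen window; the seven-day threshold and the function name are assumptions for illustration:

```python
from datetime import datetime, timedelta

def inactive_users(events, users, now, threshold=timedelta(days=7)):
    """Return users whose latest recorded activity is older than `threshold`."""
    last_seen = {}
    for user, timestamp in events:
        if user not in last_seen or timestamp > last_seen[user]:
            last_seen[user] = timestamp
    return [u for u in users
            if u not in last_seen or now - last_seen[u] > threshold]

now = datetime(2018, 6, 15)
events = [("ann", datetime(2018, 6, 14)), ("bob", datetime(2018, 6, 1))]
print(inactive_users(events, ["ann", "bob", "cat"], now))  # ['bob', 'cat']
```

Users returned by such a query would then be targeted with re-engagement notifications via the services described above.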

All of the above components and services are deployed as containers using Docker. This approach allows SWARM to be easily deployed and scaled horizontally across Cloud infrastructures. Currently SWARM has been deployed on two different Cloud infrastructures: the National eResearch Collaboration Tools and Resources (NeCTAR - www.nectar.org.au) Research Cloud, the national cloud for academics/researchers in Australia, and Amazon Web Services (AWS - aws.amazon.com).

5. SWARM Realization

The SWARM system has been designed to support groups of people working collaboratively on activities related to evidence-based reasoning. The typical workflow is that a problem is defined and released to multiple separate groups to work on. SWARM embraces the fact that each individual in a group will have their own preferred approach or style for supporting high quality reasoning, and it allows various work styles. For example, users can discuss a problem in the chat forum or submit a proposed response immediately. Supporting the idea of contending analyses, users are also encouraged to try and compare different analytic approaches. To support this, SWARM provides a Lens kit, where a logical lens can be seen as any kind of tool, concept, pitfall, checklist or technique that can help structure or guide users' thinking. This compendium of techniques is available to users, but there is no expectation that users must adopt any particular lens.

Figure 3. SWARM Portal

Figure 3 shows the main screen of the SWARM portal after the user has navigated to a particular problem. The screen provides a multi-panel interface. The left side shows the problem; this is typically always accessible to users when they are working on (responding to) a given problem that has been set. The middle panel shows the group responses, together with the comments and ratings related to those responses. The right side shows the chat panel. This split-panel approach enables a seamless workflow where the user can draft a response while consulting the problem description, or discuss the problem in the chat with collaborators. The panels can be enlarged or minimized as desired.

The middle panel of Figure 3 is the main work area. Here users can submit responses, either as fully fleshed out answers (reports) to a problem or as useful resources, where a resource is anything that could be helpful to the team, e.g. a diagram or a list of alternative hypotheses. The panel itself is based on a simple web-based editor (CKEditor - ckeditor.com) that supports text entry and formatting. This has been extended in various ways to meet the editing, graphical and analytical needs of SWARM.

A report is typically a more structured, formal answer which covers various aspects of the problem. Users are able to work on their contributions in isolation in a private, draft state (indicated by a yellow background). When they feel that they are ready for scrutiny by their group members, they can publish them. Since SWARM has been designed to cope with large crowds, the system provides features to expand, collapse and filter the information in the middle panel. Responses can be sorted in a variety of ways: by report submission order, by average rating, by creation date, by recent changes or by author. Users can react to responses, either by commenting on them or by rating them. Resources and comments can be rated for their usefulness with a simple thumbs up/thumbs down. Reports can be rated in more detail, as shown in Figure 2, based on their readiness to become the final report for the team. The readiness rating includes aspects of reasoning such as the completeness, correctness, logic, evidence or alternatives used, as well as aspects of communication such as clarity or format. Ultimately, the rating of the reports determines which report is to be used as the basis for the team's final report. It is noted that teams will typically identify the best report at a given point in time before the closing date of the problem and then iteratively work on refinements and improvements to this single report before the final submission deadline.

By default the left side panel shows the problem description along with general information about its due date and its current state (frozen, open or closed). There can be many questions listed, both open and closed. Selection of a given question automatically opens the second navigation tab, which shows a list of responses and their ratings. On the right side of Figure 3 is the chat panel. This shows the discussions occurring amongst the group members, giving them the chance to exchange thoughts about the problem in a more spontaneous manner. Users can also address other group members by tagging them. It is important to note that the chat panel encourages informality, e.g. the chat room can be used for general discussions that are potentially outside the remit of the problem focus. This encourages social warmth and collaboration.

It is noted that the portal front end has undergone a range of experiments and evaluations with regard to its interface. This has included video recording and monitoring of individuals using the platform, with specific focus on users that have never seen or heard of SWARM, and users that are not savvy with web-based systems. This has directly shaped the user experience aspects of the platform, including the layout, content and naming conventions used. The current version reflects the general consensus from the users involved in these experiments on what the system should offer to be as user-friendly as possible, e.g. more complicated features of the platform are suppressed and only available after opening up sub-menus. A key yardstick for the platform is minimising the need for training.

In addition to the core features of the portal, the SWARM platform also provides several targeted tools, including a Probability Calculator and a Graphical Tool. As noted, the SWARM design philosophy encourages a diversity of analytical approaches to solve problems. This is a deliberate choice to encourage participation: the idea is that autonomy in analytical approach will make users feel less constrained and therefore more willing to engage with the system. However, there are tools and approaches that are especially suited to certain kinds of problems. For example, in situations where problems deal with uncertainties, it is useful to have tools for dealing with probability calculations. The design principle of "latitude" in analytical approaches, versus the need for specifically relevant tools, led to the idea of a "non-intruding" simple tool for probability calculations. That is, rather than constraining users to approach the analysis of uncertainty in a particular way, SWARM allows users to use any tool, but also makes available a simple "probability calculator". The probability calculator is embedded into the text editor and can be accessed by simply typing a probability formula inside backticks. A "calculate" button is available on the editor toolbar to make the system evaluate the probability formula and render a result. Formulas may be written in structured English, or in an abbreviated mathematical form. For example, "probability of Rain is 20%" is acceptable, as is its abbreviated form "pr Rain=0.2". A probabilistic problem may be formulated with probability assignment statements like these, as well as conditional probabilities like "chance of SprinklerOn given Rain is 1%". Once the problem has been formulated, calculations can be made, such as the "probability of WetGrass given Rain" shown in Figure 4 (left).
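To illustrate, the two example statements above could be normalised with patterns along the following lines; these regular expressions are a simplified assumption for illustration, not SWARM's actual grammar:

```python
import re

# Two hypothetical forms: structured English and the abbreviated notation.
PATTERNS = [
    re.compile(r"probability of (\w+) is (\d+(?:\.\d+)?)%", re.IGNORECASE),
    re.compile(r"pr (\w+)\s*=\s*(\d*\.?\d+)", re.IGNORECASE),
]

def parse(formula):
    """Return (variable, probability) for a simple assignment statement."""
    for i, pattern in enumerate(PATTERNS):
        match = pattern.match(formula.strip())
        if match:
            value = float(match.group(2))
            # The structured-English form expresses the value as a percentage.
            return match.group(1), value / 100 if i == 0 else value
    raise ValueError(f"unrecognised formula: {formula}")

print(parse("probability of Rain is 20%"))  # ('Rain', 0.2)
print(parse("pr Rain=0.2"))                 # ('Rain', 0.2)
```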

In essence, the probability calculator can infer probabilities from simple Bayes networks represented as structured English formulas. In this way, the platform provides users with a simple (and optional) tool to reason about probabilities. The tool attempts as much as possible to shield the user from the technicalities of the probability calculations themselves. The design principle reflects the kind of approach inherent in pocket calculators, whereby a user need not know how a square root or compound interest is calculated, but merely how such calculations are used and what parameters are needed.

Figure 4. In-built Platform Features: Probability Calculator and Graphical Editor

Technically, the calculator uses a simulation approach to solve collections of probability formulas. That is, each probability variable is represented as a bit array that is randomly assigned in proportion to the given probability for that variable. Relational calculations such as the "probability of X given Y and Z" are then computed through Boolean operations on the bit arrays. Some limitations have been found with this approach, notably the timeliness of convergence of results, especially on problems with a moderate number of variables, or with very small or large probabilities. Design work is underway to address these limitations while keeping the simple user interface. It is also noted that the Lens kit identifies a range of richer (and more complex) tools and environments for performing Bayes-type analytics off platform.
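The following sketch illustrates this simulation approach using the Rain/Sprinkler example from above, with bit arrays represented as Boolean lists. Note that only the Rain and SprinklerOn-given-Rain figures come from the text; the 40% sprinkler-without-rain value and the WetGrass probabilities are invented here to complete the network:

```python
import random

N = 200_000  # samples per variable; more samples give tighter convergence
random.seed(1)

def assign(p):
    """A Boolean array whose True-frequency approximates probability p."""
    return [random.random() < p for _ in range(N)]

# "pr Rain=0.2" (from the text)
rain = assign(0.2)

# "chance of SprinklerOn given Rain is 1%" (from the text);
# the 40% no-rain case is an invented figure
sprinkler = [a if r else b
             for r, a, b in zip(rain, assign(0.01), assign(0.4))]

# WetGrass conditional probabilities are invented for illustration
wet_grass = [a if (r or s) else b
             for r, s, a, b in zip(rain, sprinkler, assign(0.9), assign(0.05))]

def probability_given(event, condition):
    """'probability of X given Y' via Boolean operations on the bit arrays."""
    joint = sum(e and c for e, c in zip(event, condition))
    return joint / sum(condition)

print(f"{probability_given(wet_grass, rain):.2f}")  # close to 0.90
```

The convergence limitation mentioned above is visible here: a rare event (say p = 0.001) conditioned on another rare event leaves very few matching samples, so the estimate fluctuates unless N is made very large.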

In addition to the probability calculator, SWARM recognized that various analytical approaches involve a visual aspect. Often a line of argumentation can be communicated more effectively when illustrated by a visual representation, e.g. a causal loop diagram. Many critical thinking skills can also be improved through argument mapping. While users are free to use any graph tool they are familiar with and simply upload images to the platform, SWARM has also developed an in-built graphical editor. This tool not only allows users to create any number of diagrams, but lets them profit from close coupling with the rest of the portal. The graphical editor can be opened from the response panel, which turns the chat panel into a drawing canvas where diagrams can be constructed. Once finished, the diagram can be saved back to the current cursor position as part of the response. It is also possible to copy and paste text from other parts of the interface into the graphical editor, e.g. parts of the problem description or other responses. To support argument mapping, marked up text in the user's own response can be imported and included in the graphics. The editor can also read in probabilistic expressions from the corresponding responses and automatically create simple Bayesian diagrams based on the calculations, as shown in Figure 4 (right).
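A sketch of how conditional-probability statements might be turned into diagram edges for such an automatically created Bayesian diagram; the pattern and function names are illustrative assumptions rather than the editor's actual implementation:

```python
import re

COND = re.compile(r"chance of (\w+) given (\w+)", re.IGNORECASE)

def diagram_edges(expressions):
    """Map 'chance of X given Y' statements to (parent, child) edges."""
    edges = []
    for expr in expressions:
        match = COND.search(expr)
        if match:
            child, parent = match.group(1), match.group(2)
            edges.append((parent, child))  # parent node points at child node
    return edges

print(diagram_edges(["chance of SprinklerOn given Rain is 1%"]))
# [('Rain', 'SprinklerOn')]
```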

The graphical editor offers a range of basic diagramming features. The size, shape and colour of nodes can be changed, and text in nodes and on arrows can be marked up. Nodes can be arranged on the canvas either manually or using a force-directed automatic layout. Images can be dropped onto the canvas to become part of an existing node or to create a new node. Reverting changes and delete functionality make it easy to try out different approaches. The graphical editor was designed to be useful for an array of different use cases, e.g. from drawing a simple diagram to automatically creating a structure based on marked up text in the corresponding response. Further refinements and extensions to the graphical editor are under consideration, whilst acknowledging that many users will often want to use a graphical editor of their choice outside of the SWARM platform and simply incorporate the resulting images.

Figure 5. Mobile application

In addition to the portal-based offerings, SWARM recognised that there are several fundamental challenges that need to be addressed for the platform to be successful in improving evidence-based reasoning. Arguably the most important of these is avoiding attrition of users. There is no mechanism that can be applied to make users use the system; rather, the system needs to be both easy to use and, ideally, to encourage users to want to use it. Nowadays, individuals typically expect to access web content on their phone, with ubiquitous access to most if not all of the features that exist within the SWARM portal. To tackle this, targeted SWARM mobile apps have been designed and delivered through the AppStore (iPhone/iPad) and Google Play (Android). Examples of the features of the mobile apps are shown in Figure 5. The left pane shows the open/closed questions and the profile of the users. The middle pane shows the basic work panel related to a particular problem, whilst the right panel shows the chat room. Unlike the portal, which has all three panels available at all times, the mobile apps require users to navigate across these panels (due to the obvious limitations of screen space).

These apps enable users to access SWARM content at any time. Users can receive notifications on the phone, e.g. when being mentioned in the chat room or when a problem is nearing the end of its time window and requires rating. This latter point is important since the quality of reports depends on the crowds engaging. Even if individuals in the team have not had the chance to directly engage in the development of the responses, they can all read and rank the responses. This gentle encouragement is key to the adoption and success of SWARM. The apps can be downloaded from the AppStore/Google Play by anyone; however, only those that have a valid account on the SWARM platform are able to use the apps. This requires users to enter the credentials that they have been assigned.

6. Preliminary Results

The SWARM platform is currently in the process of being formally evaluated as part of the CREATE program by a Test & Evaluation (T&E) team, with a crowd selected and managed by them; hence official results and feedback have not yet been produced. It is noted that the CREATE crowd across all platforms is expected to exceed 4000 users, with different numbers assigned to each platform and with different team sizes. The SWARM platform has been deployed to the Amazon Cloud and integrated into a web-based front end (www.createbetterreasoning.com) together with all other CREATE-funded platforms for this official evaluation. Unofficial experiments have however been conducted with SWARM over the last year, using multiple releases of the platform on the NeCTAR Research Cloud. This involved multiple problems and multiple teams. The problems that have been set are representative of the kinds of problems that are to be set by the T&E team. These problems have an associated set of rubrics that are used as the basis for deciding on the quality of the reports that are produced, and hence on the quality of the reasoning that has gone into those reports. The SWARM experimental team are responsible for the creation and subsequent assessment of the reports produced through the SWARM deployment on NeCTAR. The teams themselves have ranged from groups of 12-30 individuals with a variety of backgrounds. These individuals were recruited through a SWARM-specific social media campaign (over 4500 users signed up to use the SWARM platform on Facebook, with over 530 actually included in teams and using the platform).

Figure 6. Visualizing Group Dynamics for a Given Team

While we cannot generalize about the evidence-based reasoning from the SWARM trials, we can show that different group dynamics develop depending on the problem. Figure 6 shows the dynamic that developed within one group discussing a problem in the first week (left) and the second week (right). For both graphs we collected the number of contacts between group members, either by tagging each other in the chat or by commenting on each other's responses. The size of the nodes indicates the social connectivity of team members (the number of contacts made), while the colour shows their level of engagement with the system (the number of responses, comments, chat messages and ratings they contributed). Both graphs show that there seems to be a correlation between the two measures: large nodes are usually darker, meaning that team members who contribute more also interact more with each other. As seen, a group of five people seem to be the most active players in solving the problem in week 1, whereas one core individual stands out in the second week. This difference might be explained by the fact that the problem posed in the second week required an understanding of Bayesian logic, and the group relied heavily on a team member who seemed to have the necessary background.
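The two measures plotted in Figure 6 can be sketched as simple counts over the captured events; the data structures below are illustrative assumptions:

```python
from collections import Counter

def group_metrics(interactions, contributions):
    """
    interactions:  (from_user, to_user) pairs from chat tags and comments;
                   both parties gain a contact (node size in Figure 6).
    contributions: (user, kind) records of responses, comments, chat
                   messages and ratings (node colour in Figure 6).
    """
    contacts = Counter()
    for a, b in interactions:
        contacts[a] += 1
        contacts[b] += 1
    engagement = Counter(user for user, _ in contributions)
    return contacts, engagement

contacts, engagement = group_metrics(
    [("ann", "bob"), ("ann", "cat"), ("bob", "ann")],
    [("ann", "response"), ("ann", "chat"), ("bob", "rating")],
)
print(contacts["ann"], engagement["ann"])  # 3 2
```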

7. Conclusions and Future Work

In this paper, we have presented the rationale, design and development of the SWARM platform. The platform has been used extensively over the last year and the results appear very promising. The official assessment by the CREATE T&E teams is currently ongoing, with plans for adjudication of the platform in the final quarter of 2018. Feedback on the platform from end users has been positive and, most importantly, the reports produced are typically of high quality. Indeed, the reports produced are often better than the official answers prepared by the experimental teams.

Further work includes the development of dashboards for individual/team analytics to better understand how and why teams interact to produce improved reports. Work on scaling the platform to deal with much larger teams is currently in focus, as well as the challenges of web-based collaboration involving potentially hundreds of contributors. This will largely depend upon the successful evaluation of the SWARM platform.

Acknowledgments: This research is based upon work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), under Contract [2017-16122000002]. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.

References

[1] M. Li, "Internet of people," Concurrency and Computation: Practice and Experience, vol. 29, no. 3, 2017.

[2] S. John Walker, "Big data: A revolution that will transform how we live, work, and think," 2014.

[3] A. Barnes, "Making intelligence analysis more intelligent: Using numeric probabilities," Intelligence and National Security, vol. 31, no. 3, pp. 327–344, 2013.

[4] "A tradecraft primer: Structured analytic techniques for improving intelligence analysis," 2009.

[5] T. van Gelder et al., "Cultivating expertise in informal reasoning," Canadian Journal of Experimental Psychology, vol. 58, pp. 142–152, 2004.

[6] M. Burgman, "Trusting judgements: How to get the best out of experts," 2016.

[7] A. Mannes et al., "The wisdom of select crowds," Journal of Personality and Social Psychology, vol. 107, no. 2, pp. 276–299, 2014.

[8] D. Brabham, “Crowdsourcing,” 2013.

[9] Y. Suzuki, "Assessing the quality of Wikipedia editors through crowdsourcing," Wiki Workshop 2016, Montreal, Canada, 2016.

[10] L. Hong and S. Page, "Groups of diverse problem solvers can outperform groups of high-ability problem solvers," Proceedings of the National Academy of Sciences of the USA, pp. 16385–16389, 2004.

[11] D. Budescu et al., "Identifying expertise to extract the wisdom of crowds," Management Science, vol. 61, no. 2, pp. 267–280, 2015.

[12] C. Davis-Stober et al., "When is a crowd wise?," Collective Intelligence, vol. 1, no. 2, 2014.

[13] "Motivations for participation in a crowdsourcing application to improve public engagement in transit planning,"

[14] A. Bosu et al., "Building reputation in StackOverflow: an empirical investigation," in Proceedings of the 10th Working Conference on Mining Software Repositories, pp. 89–92, IEEE Press, 2013.

[15] Z. Kozhirbayev et al., "A performance comparison of container-based technologies for the cloud," Future Generation Computer Systems, vol. 68, pp. 175–182, 2017.

[16] D. Hardt, "The OAuth 2.0 Authorization Framework," IETF RFC 6749, October 2012. https://tools.ietf.org/html/rfc6749.


