Date post: 23-Dec-2015
TSD: a Secure and Scalable Service for Sensitive Data and eBiobanks Gard Thomassen, PhD Head of Research Support Services Group University Center for Information Technology (USIT) University of Oslo

TSD: a Secure and Scalable Service for Sensitive Data and eBiobanks

Gard Thomassen, PhD

Head of Research Support Services Group

University Center for Information Technology (USIT)

University of Oslo


Outline

• Sensitive Data
• TSD setup, solutions, demo, status and future
• How to get on board
• Some discussion topics
• Q&A


What is sensitive data?

Norway: Personal Data Act §2, point 8: data on race or ethnicity; political opinions; philosophical and religious beliefs; the fact that a person has been suspected of, charged with, indicted for or convicted of a criminal act; health; sex life; and trade-union membership


Who has sensitive data

Almost everyone


TSD launch in Computerworld 16/5-14


Norsk KreftGenom Konsortium

Compared with the hardware we used until the move to TSD, which can probably be characterized as a moderately usable server machine with 64 cores, we can achieve a theoretical speed-up of 30X with TSD. In addition, we have optimized our analysis pipeline by parallelizing several steps. Previously, a sequencing analysis of 48 tumor/normal pairs would have resulted in a run time of two to three months at minimum. This week we ran the same analysis on TSD in two days and a few hours. To put it mildly, a dramatic improvement.

Prof Eivind Hovig, NCGC


Teknisk ukeblad & e24, 5/5-14


Uniforum


TSD

Pilot 2009 - 2012


System requirements

• Security, isolation and access control as given by law
• Large storage capacity
• Multi tenant (multiple users)
• High performance computing (HPC) resource
• High bandwidth
• Easy to maintain and operate
• Easy to use and “practical” (also for audio and video)
• Some freedom within a confined user space
• Accessible from anywhere through proper mechanisms
• A variety of software and public data-sources must be available
• Windows and Linux support (server/host-side)
• Data collection services
• Data sharing services


Setup, solutions and status


System outline

[Diagram: system outline. Components shown: Internet, Gateway, VM-server, HPC (Colossus), Storage, and a secure encrypted network to special high volume data production sites. Relationship labels: 1 (project), 1 (storage area), n, 1.]


TSD demo

Check out the login help pages. From 15/8-15, all projects can use PCoIP and ssh+RDP to access Windows.

ThinLinc (HTML5-based) for access to Linux machines in TSD will be enabled by August 2015.

Help pages : http://www.uio.no/tjenester/it/forskning/sensitiv/hjelp/brukermanual/index.html


Data import and export using TSD

[Diagram: data import and export, with steps numbered 1 to 4. Components shown: file lock server, virtual file lock server, virtual project-server, file lock HD, project HD, NFS mount, TSD. Annotation: "Data copied here by sftp (2-factor authentication); data encrypted if sensitive".]


Data collection using TSD

[Diagram: data collection. Components shown: “Nettskjema-minID”, Nettskjema homepage, minID, encrypted XML (PGP), file lock, project VM, project disk, TSD.]


Security details

• OATH TOTP 2-factor authentication
– Smart phones or programmable hardware tokens
– 2-factor absolutely everywhere
• Import/export is under strict control
• No open connection to the internet
• All administration happens from the inside
• Strong separation between projects
• Hardened FreeBSD gateway and firewall
• Encrypted backup, one key per project
• Sys-admins are single users (traceability)
• Sys-admins have to use same authentication process
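The slide names OATH TOTP but gives no implementation details. As a rough illustration of what a TOTP second factor computes, the sketch below follows the public algorithm from RFC 6238 (built on HOTP, RFC 4226). The 30-second time step, 6-digit codes, HMAC-SHA1 and the one-step verification window are common defaults assumed here, not documented TSD settings, and the function names are hypothetical.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over the 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from Unix time."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, candidate: str, now: Optional[float] = None,
           step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept codes from the current time step and `window` steps either side
    (assumed tolerance for clock drift between token and server)."""
    if now is None:
        now = time.time()
    counter = int(now // step)
    for delta in range(-window, window + 1):
        if counter + delta < 0:
            continue  # counters cannot be negative
        expected = hotp(secret, counter + delta, digits)
        if hmac.compare_digest(expected, candidate):
            return True
    return False
```

The shared secret is what a smart phone app or programmable hardware token would hold; the server runs the same computation and compares the result in constant time.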


Homepage

http://www.uio.no/tjenester/it/forskning/sensitiv/

Risk evaluation etc : http://www.uio.no/tjenester/it/forskning/sensitiv/mer-om/systembeskrivelse/


TSD status

• > 90 research projects
• > 350 users
• Secure storage (> 1 PiB on disk)
• Secure data analysis
• Linux or Windows hosts (> 250 VMs)
• Secure import and export
• Web-based data harvesting
• HPC cluster (> 1500 cores)
• Postgres DBs
• Video and sound display


Capabilities enabled by TSD

• Large scale NGS research on human genomes
• Large scale medical imaging studies
• Large scale studies with web-based data collection
• Cross border data-collection
• Off-site analysis of sensitive data
• Secure storage for verification of published research
• Electronic consent, soon
• Free storage from Norstore, soon free CPU?


Nordic collaboration opportunities

• Laws are fairly similar (Norway very strict)
• Difficult to exchange sensitive data for research
• One should learn from each other, as these systems demand very special IT knowledge
• Services development and system-administration know-how is non-sensitive and may be shared
• Building TSD addressed many novel security questions in a University setting that others can learn from
• Large DBs/registries of health data may enable very interesting research in the future
• TSD is involved in the NeIC-based Tryggve project
• We are happy to collaborate!


Main collaborators on TSD

Collaborators

• Norwegian Storage Infrastructure (NorStore)
• Norwegian Genetics Analysis Platform (GenAp)
• Norwegian Dietary Registry (Medical Faculty)
• Institute of Psychology (Faculty of Social Sciences)
• Norwegian Cancer Sequencing Consortium (NCGC)

Reference group

Oslo University Hospital, NorStore, Regional Ethical Committee, National Institute of Public Health, Norwegian Cancer Registry, Research Network at OUS, Elixir Norway, NCGC, GenAP, Institute of Psychology.


Future of TSD - main topics

• More on how to handle video and sound
– harvesting
– management
– metadata
– analysis
• Journal system for Psychologists (Univ of Umeå collaboration)
• Biobanks computing resource ?
• Thinlinc (August 2015) and PCoIP (15/8-15)
• VMware and VDI infrastructure for video
• Galaxy inside TSD
• Elixir helpdesk connected to TSD
• Hosting docker containers
• Invariant storage of research data (connected with Cristin ?)
• National eInfrastructure investment in TSD ??


Use of direct identification inside TSD

Disclaimer / instruction: “The project leader is responsible for ensuring that directly identifiable data is used as little as possible inside TSD. When using Nettskjema + minID/BankID, national identity numbers must be scrubbed and the linkage key stored so that it is available to as few project participants as possible.”

http://www.uio.no/tjenester/it/forskning/sensitiv/hjelp/secure-nettskjema/index.html
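The instruction on this slide (remove national identity numbers and keep a linkage key that few people can read) can be illustrated with a small pseudonymization sketch. This is not TSD's or Nettskjema's actual procedure: the field name `personnummer`, the function name, and the use of random hex pseudonyms are all assumptions made for the example.

```python
import secrets


def scrub_identifiers(records, id_field="personnummer"):
    """Replace the direct identifier in each record with a random pseudonym.

    Returns (scrubbed_records, linkage_key). The linkage key maps each
    pseudonym back to the original identifier and is meant to be stored
    separately, readable by as few project members as possible.
    """
    pseudonym_for = {}  # identifier -> pseudonym, so repeat entries link up
    linkage_key = {}    # pseudonym -> identifier (the sensitive part)
    scrubbed = []
    for record in records:
        identifier = record[id_field]
        if identifier not in pseudonym_for:
            pseudonym = secrets.token_hex(8)
            while pseudonym in linkage_key:  # regenerate on the rare collision
                pseudonym = secrets.token_hex(8)
            pseudonym_for[identifier] = pseudonym
            linkage_key[pseudonym] = identifier
        # Copy every field except the direct identifier.
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["pseudonym"] = pseudonym_for[identifier]
        scrubbed.append(clean)
    return scrubbed, linkage_key
```

Random pseudonyms are used rather than a hash of the identifier because national identity numbers come from a small, structured space and an unsalted hash could be reversed by enumeration.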


Unencrypted large datasets -> TSD

• Controlled end-to-end networks using ACLs
• SFTP with 2-factor
• No open internet connections in any endpoint
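As a minimal sketch of what an ACL check on such a controlled network amounts to, the function below tests whether a source address falls inside an allow-list of networks. The addresses are reserved documentation ranges and the function name is hypothetical; the real ACLs live in network equipment and are not described in the slides.

```python
import ipaddress


def is_permitted(source_ip, allowed_networks):
    """Return True if source_ip lies inside any network on the allow-list."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


# Hypothetical allow-list for one data production site (documentation ranges).
ACL = ["192.0.2.0/24", "198.51.100.7/32"]
```

With this list, `is_permitted("192.0.2.15", ACL)` is accepted while any address outside the two listed networks is rejected, which mirrors the default-deny stance the slide describes.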


PCoIP (encrypted and 2-factor)

TSD then needs port 22 for sftp/ssh and port 4172 for TCP and UDP

Directional control of copy and paste (c&p)
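The port requirement on this slide can be written down as a small allow-list. The sketch assumes port 22 is TCP only (ssh and sftp run over TCP) and takes the 4172 TCP and UDP entries directly from the slide; the names are hypothetical and this is not an actual firewall configuration.

```python
# (port, protocol) pairs that must be reachable for PCoIP access.
REQUIRED_OPENINGS = frozenset({
    (22, "tcp"),    # sftp/ssh
    (4172, "tcp"),  # PCoIP
    (4172, "udp"),  # PCoIP
})


def is_required(port, protocol):
    """Return True if this port/protocol pair is one of the listed openings."""
    return (port, protocol.lower()) in REQUIRED_OPENINGS
```

Anything not in the set, for example port 80 over TCP, stays closed under this model.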


Thinlinc

• Linux Remote Desktop protocol over HTML5 (https) with 2-factor authentication
• BSD firewall in front
• Same in principle as PCoIP setup:
– “Login server”
– “Connection broker”
– “Virtual project machines / servers”
– Can have directional control of c&p


Thanks to

• tsd-core@usit
• virt-core@usit
• storage-core@usit
• postgres-core@usit
• network-core@usit
• hpc-core@usit
• windows-core@usit
• unix-core@usit
• IT-security@usit

Project group / developers

• IT-dir Lars Oftedal
• Hans A. Eide
• Märtha Felton
• Reference group

Administration / associated

