Date posted: 21-Jan-2016
Category: Documents
Uploaded by: phillip-lester
Digital Vault
Kick-off 02/12/2015
© Trust1Team 2015
Intro Digital Vault Engine: our understanding
• Fast & scalable object-level storage
• Secure content persistence
• Secure bi-directional content sharing
• Secure content provenance
• Shared content libraries
• Private-hosted file synchronization
• Built upon open source components
• Basic security model with optional extensions (consumer-driven security enforcement)
• Vault-in-vault concept
Concept: assumptions
• Policies for user storage quota available
• API Engine available
• On-premise solution
• User OAuth2 consent available in existing authorization infrastructure
• User authentication available in existing authentication infrastructure
Concept: secured content at the center of the solution
Concept: security standards
Content encryption standards
• X.509 private keys for desktop clients
• PBKDF2 session keys using RSA encryption
• AES-256/CBC encryption for data transfer
• ISO 32000 AES-256 encryption for PDF documents
• ISO/IEC 9899:1999
• Digital signing of shared documents using ETSI AdES and ASiC
Adaptable feature-rich security model
• Optional password protection on shared link
• Optional expiration time on shared link
• Optional signing for content integrity
• Optional X.509 public key signing for content transfer
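The optional password protection and expiration time on a shared link could be modelled roughly as follows. This is a minimal sketch, not the Digital Vault Engine implementation: the `ShareLink` class, the `create_share_link` helper, the PBKDF2-HMAC-SHA256 variant and the iteration count are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import secrets
import time
from dataclasses import dataclass
from typing import Optional

PBKDF2_ITERATIONS = 200_000  # assumed work factor, not taken from the slides


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte verifier from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


@dataclass
class ShareLink:
    """Hypothetical share link record with optional protections."""
    token: str
    expires_at: Optional[float] = None     # Unix time; None means no expiry
    salt: Optional[bytes] = None
    password_hash: Optional[bytes] = None  # None means no password required

    def is_accessible(self, password: Optional[str] = None,
                      now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if self.expires_at is not None and now >= self.expires_at:
            return False  # link has expired
        if self.password_hash is None:
            return True   # no password protection configured
        if password is None:
            return False
        candidate = hash_password(password, self.salt)
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(candidate, self.password_hash)


def create_share_link(password: Optional[str] = None,
                      ttl_seconds: Optional[float] = None,
                      now: Optional[float] = None) -> ShareLink:
    """Create a link; both protections are optional, as on the slide."""
    now = time.time() if now is None else now
    link = ShareLink(token=secrets.token_urlsafe(16))
    if ttl_seconds is not None:
        link.expires_at = now + ttl_seconds
    if password is not None:
        link.salt = os.urandom(16)
        link.password_hash = hash_password(password, link.salt)
    return link
```

A link created with `create_share_link(password="s3cret", ttl_seconds=60)` is accessible only with the right password and only during the first minute; a link created without arguments is always accessible.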
Concept: security at all levels
Concept: security at all levels
Vault Security
• Secured local storage
• Secured cloud storage
• Secured content transfer
• Trusted list of sync devices
• Secured token distribution
• Content provenance and content integrity
Vault Archiving
• Content retention using CMIS (Apache Chemistry)
• Content retention to private cloud distributed storage
Concept: design principles
• Micro-service design
• Stateless services design
• API-first design
• Behavior-driven development
• User-centric
• Semantic Versioning
Resilient, elastic, stateless, responsive
Concept: wireframes
In scope – Demo DV application
• OAuth2 enabled
• AngularJS
• To test all endpoints with user actions:
– upload, share, download,…
Synchronization client
• cf. Seafile clients available
• OSX, Windows, Linux, terminal based
• Mobile Android
• Mobile iOS
Concept: wireframes
Mobile applications (Android & iOS)
Concept: architecture component model
Assumption: existing search engine
Concept: architecture component model
• Client side (front-end)
– 3rd party web applications for a variety of devices
– Demo DV application made within the scope of the project
– Desktop synchronization clients
– Mobile synchronization clients
• Server side (back-end)
– Digital Vault Engine
– Integration with API Engine
– Integration with Search Engine
• Server side (storage)
– Storage and storage replication (quota storage policy)
– Archiving to private distributed cloud storage
– Archiving to ECM via Apache Chemistry layer
Concept: proof of concept
• Basic version of the DV Demo application
• Connects directly to the micro-service API
• Implements the following user stories:
– 1) upload file from DV Demo app into existing DV folder
– 2) share file from DV Demo app => mail to user with link
– 3) user downloads file using the link from the received mail
Technology: file system design
Technology: file system design
Files are organized into Libraries – designed for synchronization
• Network/storage deduplication
• No upload/download limit
• Fast upload (back-end daemons)
Data model and sync similar to Git (Repo, Branch, Commit, FS, Block)
Selective sync library to devices
Sync with existing folder
Sync client-side end-to-end data encryption
Full platform support: Win, OSX, Linux, mobile
Share to a person or a group
Share specific content or a folder
Read-write and read-only share
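The block-level deduplication mentioned for Libraries can be illustrated with a small content-addressed store, in which a block is keyed by the hash of its bytes and therefore stored only once. This is a sketch under simplifying assumptions: the `BlockStore` class is hypothetical, SHA-256 is chosen for illustration, and blocks have a fixed size here, whereas an LBFS-style design uses content-defined block boundaries.

```python
import hashlib

# Tiny block size so the example is easy to follow; an assumption for the demo only.
BLOCK_SIZE = 4


class BlockStore:
    """Hypothetical content-addressed block store."""

    def __init__(self):
        self.blocks = {}  # block id (hex digest) -> block bytes

    def put_file(self, data: bytes) -> list:
        """Split data into blocks, store unseen blocks, return the block id list."""
        block_ids = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            block_id = hashlib.sha256(block).hexdigest()
            # A block that already exists is neither stored nor transferred again.
            self.blocks.setdefault(block_id, block)
            block_ids.append(block_id)
        return block_ids

    def get_file(self, block_ids) -> bytes:
        """Reassemble a file from its ordered list of block ids."""
        return b"".join(self.blocks[block_id] for block_id in block_ids)
```

Storing `b"AAAABBBBCCCC"` and then `b"AAAABBBBDDDD"` leaves four blocks in the store rather than six, because the first two blocks of both files are identical.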
Technology: deduplication
Technology: high-level architecture
Technology stack
Seafile
• C, C++
• OpenSSL
Java EE
• JAX-RS, CDI
• Maven
• Bouncy Castle Crypto API
Sync desktop clients
• Qt4/5
• C++
Sync mobile clients
• Android
• iOS
Technology stack: innovative features in the solution
Different from cloud storage solutions for personal use
• Content Integrity and Content Provenance
• Archiving to cloud storage
• Archiving to ECM platforms
• Basic security on all levels
• Customizable security
Open API security: every application can enforce strong security
Approach: sprint planning with monthly releases
• Digipolis and T1T agree on list of detailed product requirements
• T1T creates product backlog based on product requirements
• Sprints of 2 weeks
• Sprint demo
• Transparency via JIRA project
• Regular sync meetings with Digipolis stakeholder
Approach: milestones part 1
sprint 1-2
• password, AES folder
• storage
• Account Mgmt
• synch
• Token Distribution
• Content Sharing
sprint 3-4
• security features
• key store management
• zip creation & encryption
• pdf encryption
sprint 5-6
• content provenance
• archiving to ECM
• integration with search engine
Milestones: POC 0.0.1, Version 0.0.5
Approach: milestones part 2
sprint 7-8
• archiving to personal cloud storage
• trusted devices list
• bug fixing
sprint 9
• bug fixing
• move to Acceptance
sprint 10
• move to production
Milestones: Version 0.5.0, Version 1.0.0
Approach: deliverables and project closing
Deliverables
• Source code
• Builds
• Technical documentation
• User documentation
Project closing
• Hand-over to technical team
• User training
Duration of the project is approx. 4 months
Thank you for your kind attention
Do you have any questions?
Annex 1: Sync algorithm
A typical synchronization workflow consists of the following steps:
• Seafile client daemon detects changes in the worktree (via inotify etc.).
• The daemon commits the changes to the local branch.
• Download new changes from the master branch on the server (if any).
• Merge the downloaded branch into the local branch (also check out changes to the worktree).
• Fast-forward upload local branch to server's master branch.
Custom merge algorithm
• Auto-sync Git is unreliable
• Merge after file write-protection releases lock
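The five workflow steps can be walked through with a toy in-memory model. This is a sketch only: the `Server` and `Client` classes are hypothetical stand-ins, change detection is a plain dictionary comparison rather than OS notifications, and the merge simply re-applies local changes on top of the server state without any conflict handling.

```python
class Server:
    """Toy stand-in for the server's master branch."""

    def __init__(self):
        self.version = 0
        self.files = {}  # path -> content

    def fetch(self):
        return self.version, dict(self.files)

    def fast_forward(self, base_version, files):
        """Accept an upload only if it builds on the current master."""
        if base_version != self.version:
            return False
        self.files = dict(files)
        self.version += 1
        return True


class Client:
    def __init__(self, server):
        self.server = server
        self.worktree = {}     # files as the user sees them
        self.committed = {}    # content of the latest local commit
        self.base_version = 0  # master version the local branch builds on
        self.pending = set()   # locally changed paths not yet on master

    def sync(self):
        # Step 1: detect changes in the worktree.
        paths = self.worktree.keys() | self.committed.keys()
        changed = {p for p in paths
                   if self.worktree.get(p) != self.committed.get(p)}
        # Step 2: commit the changes to the local branch.
        self.pending |= changed
        self.committed = dict(self.worktree)
        # Step 3: download new changes from master, if any.
        version, remote = self.server.fetch()
        if version != self.base_version:
            # Step 4: merge, then check out the result to the worktree.
            merged = remote
            for path in self.pending:
                if path in self.committed:
                    merged[path] = self.committed[path]
                else:
                    merged.pop(path, None)  # path was deleted locally
            self.committed = merged
            self.worktree = dict(merged)
            self.base_version = version
        # Step 5: fast-forward upload of the local branch.
        if self.pending:
            if not self.server.fast_forward(self.base_version, self.committed):
                return False  # master moved in the meantime; sync again later
            self.base_version = self.server.version
            self.pending.clear()
        return True
```

With two clients on one server, each calling `sync()` after editing a different file, both worktrees converge on the union of the two changes after one further `sync()` on the first client.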
Annex 2: Git approach – why?
Synchronization may be interrupted at any point by shutting down the program or the computer. After a reboot, all notifications from the OS are lost. We need a reliable and efficient way to determine which files in the worktree have been changed (even after reboots).
Git's index file is used to do this. It caches the timestamp of every file in the worktree at the time the last commit is generated. We can therefore easily and reliably detect files changed in the worktree since the latest commit by comparing timestamps.
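The idea of comparing cached timestamps against the worktree can be sketched as follows. This is not Git's actual index format: the index here is a plain dictionary from relative path to modification time, and the function names `snapshot_index` and `changed_since` are hypothetical. Persisting the dictionary to disk is what would let detection survive a reboot.

```python
import os


def snapshot_index(root):
    """Record the modification time (in ns) of every file under root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[os.path.relpath(path, root)] = os.stat(path).st_mtime_ns
    return index


def changed_since(root, index):
    """Return paths added, deleted, or modified since the index was taken."""
    current = snapshot_index(root)
    modified = {p for p, mtime in current.items()
                if p in index and index[p] != mtime}
    added = set(current) - set(index)
    deleted = set(index) - set(current)
    return modified | added | deleted
```

Calling `snapshot_index` at commit time and `changed_since` at the next sync yields the set of files to examine, without relying on any notification received in between.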
Another notable case is what happens if two clients try to upload to the server simultaneously. The commit procedure on the server ensures atomicity, so only one client will update the master branch successfully, while the other will fail.
The failing client will restart the sync workflow later: it will first merge the changes from the successful client, then upload again.
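One common way to obtain this kind of atomicity is a compare-and-swap on the branch head: an upload is accepted only if the client's view of the head is still current. The sketch below assumes that technique; the `MasterBranch` class and `upload_with_retry` helper are hypothetical, and the merge step is reduced to re-reading the head.

```python
import threading


class MasterBranch:
    """Toy master branch whose head is updated atomically."""

    def __init__(self):
        self._lock = threading.Lock()
        self.head = 0   # version counter standing in for a commit id
        self.log = []   # accepted changes, in order

    def try_update(self, expected_head, change):
        """Apply the change only if nobody else has moved the head."""
        with self._lock:
            if self.head != expected_head:
                return False  # another client won the race
            self.log.append(change)
            self.head += 1
            return True


def upload_with_retry(branch, change, max_attempts=10):
    """Restart the workflow after a failed upload; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        expected = branch.head  # stand-in for "download and merge master"
        if branch.try_update(expected, change):
            return attempt
    raise RuntimeError("gave up after repeated concurrent updates")
```

If two clients read the same head and both call `try_update`, the first call succeeds and the second returns `False`; the second client then goes through `upload_with_retry`, which picks up the new head and succeeds.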
Annex 3: Low-bandwidth Network File System
Description of LBFS: https://trust1t.atlassian.net/wiki/display/DVE/Digipolis+-+Digital+Vault+Engine+-+Presentation?preview=/51707943/52199434/lbfs.pdf
Annex 4: Backlog
Sprint planning and milestones in Jira: https://trust1t.atlassian.net/secure/RapidBoard.jspa?rapidView=69&projectKey=DIGIDV&view=planning
Annex 5: Why Seafile?
• https://github.com/haiwen/seafile - 3200+ stars
• Estimated at least 200K users worldwide, mostly in Europe
• Open Source Software (AGPLv2)
• Available Open Source sync clients for desktop and mobile
• Git approach but enhanced for auto-sync and handling large files
• Custom merge algorithm
• Basic privacy protection
• Efficient network transfer (LBFS-based)
• "Only does what it should do best" approach
Annex 6: What are the differences between Seafile and Git?
• Automatic synchronization
• Clients do not store file history, thus they avoid the overhead of storing data twice. Git is not efficient for larger files such as images.
• Files are further divided into blocks for more efficient network transfer and storage usage.
• File transfer can be paused and resumed.
• Support for different storage backends on the server side.
• Support for downloading from multiple block servers to accelerate file transfer.
• More user-friendly file conflict handling. (Seafile adds the user's name as a suffix to conflicting files.)
• Graceful handling of files the user modifies while auto-sync is running. Git is not designed to work in these cases.
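The conflict handling described above, where the user's name is added as a suffix to the conflicting file, might look like the following. The exact naming pattern is an assumption made for illustration and is not Seafile's actual format; `conflict_name` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def conflict_name(path: str, user: str) -> str:
    """Build a sibling file name that carries the user's name as a suffix.

    The "(conflict <user>)" pattern is illustrative only.
    """
    p = PurePosixPath(path)
    # Insert the suffix before the extension so the file still opens normally.
    return str(p.with_name(f"{p.stem} (conflict {user}){p.suffix}"))
```

For example, `conflict_name("docs/report.txt", "alice")` returns `"docs/report (conflict alice).txt"`, so both versions of the file remain visible side by side in the same folder.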