Two Examples of Open Source Software Developed at CERN: Invenio and Indico
RMLL visits at CERN – July 2012

Invenio: Digital Library Software
http://invenio-software.org/

What is it used for?
• Depositing
• Archiving
• Organizing
• Disseminating
• Any type of document
~350 GB of PDFs at CERN
~20 TB of images and videos
1M records

What is Invenio?
‣ Integrated Digital Library / Repository software
‣ A platform of choice for managing documents in HEP
‣ also adopted in other fields (medium to big repositories)
‣ Web application
‣ Open-source GPL-2 project
‣ LAMP stack: Python (mostly), MySQL and Apache
‣ Based on open standards: MARCXML, OAI-PMH, OpenURL, OpenSearch, etc.
‣ Flexible, scriptable
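To make the open-standards point concrete, here is a minimal sketch of how a client might harvest from such a repository: it builds an OAI-PMH `ListRecords` request URL and extracts titles from a MARCXML response. The base URL, the `marcxml` metadata prefix, the helper names and the sample record are assumptions for illustration only; a real server advertises its own endpoint and supported formats.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; a real installation documents its own OAI-PMH URL.
BASE_URL = "https://repository.example.org/oai2d"

# Namespace used by the MARCXML ("MARC21 slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def list_records_url(base_url, metadata_prefix="marcxml", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (prefix name is an assumption)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def titles(marcxml_text):
    """Return the main titles (MARC field 245, subfield a) in a MARCXML string."""
    root = ET.fromstring(marcxml_text)
    xpath = ".//marc:datafield[@tag='245']/marc:subfield[@code='a']"
    return [node.text for node in root.iterfind(xpath, MARC_NS)]


# Made-up record, only to exercise the parser.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">1</controlfield>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">An illustrative preprint title</subfield>
    </datafield>
  </record>
</collection>"""

print(list_records_url(BASE_URL))
print(titles(SAMPLE))
```

Because both the protocol and the record format are standardized, the same client code should work against any compliant repository, not only an Invenio one.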

Invenio’s gears
• Lots of Python, with a sprinkle of C and Lisp(!)
• 630K lines of Python code
• MySQL (MyISAM) for storing data
• Native indexing engine
• Apache + mod_wsgi + mod_xsendfile
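For readers unfamiliar with the Apache + mod_wsgi pairing, the sketch below shows the kind of Python callable a WSGI server hands requests to. It is not Invenio code: the echo behaviour, the `call_app` helper and the `/record/1` path are assumptions chosen only to show the interface. By default mod_wsgi looks for a callable named `application`.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Smallest useful WSGI application: echo the requested path as plain text."""
    body = ("You requested " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path):
    """Invoke a WSGI app in-process, roughly as a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "/record/1"))
```

Keeping the application a plain callable is what lets the same code run under Apache in production and be called directly in tests.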

Invenio’s History
1954 CERN library starts paper dissemination of preprints (early Open Access)
1965 First computers at CERN library to help with cataloging
1990 Electronic distribution of preprints via FTP
1993 CERN Preprint Server, web front-end of electronic preprint catalogue. Institutional repository
1996 CERN Library Server (weblib): added books, periodicals and "other material"
2000 CERN Document Server: multimedia material, internal notes
2002 First public release of the software under GNU GPL. Worldwide installations and collaborations

Open Access at CERN
• “Consistent with the stated position of the Collaborations and the General Conditions applicable to Experiments at CERN, every effort will be made to publish papers under Open Access conditions, as defined by the SCOAP3 initiative. As at the date of this document, the Creative Commons Attribution ("cc by") license meets these conditions.”
• OA at CERN has a long history; the CERN Convention of 1953 states: "...the results of its experimental and theoretical work shall be published or otherwise made generally available".

Our Development Environment
• Git distributed version control system
• Trac for ticket tracking
• VirtualBox + Vagrant for testing deployment
• We develop on SLC5/6 (based on RHEL5/6), on Ubuntu, on Debian…

Quality Assurance
• Coding standards
  • E.g. PEP8 (Style Guide for Python), etc.
• Documentation
  • "If the code and the comments disagree, then both are probably wrong." – attributed to Norm Schryer
• Test suite
  • ~1,000 unit/regression/web tests
• Security
  • XSS, CSRF, SQL injection, etc.
• Code review
• Kwalitee check: "measuring" quality
  • "It looks like quality, it sounds like quality, but it’s not quite quality." – CPAN Testing Service (quoting Michael Schwern)
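Two of the security items above, SQL injection and XSS, can be illustrated with a short standard-library sketch. It is not taken from Invenio: `sqlite3` stands in for MySQL so the example runs without a server, and the table, column and function names are hypothetical. Placeholder syntax differs between database drivers.

```python
import html
import sqlite3


def find_titles(conn, author):
    """Look up titles with a bound parameter, never by string formatting."""
    rows = conn.execute("SELECT title FROM records WHERE author = ?", (author,))
    return [title for (title,) in rows]


def render_title(title):
    """Escape user-supplied text before placing it into HTML."""
    return "<h1>" + html.escape(title) + "</h1>"


# In-memory database with one deliberately hostile title.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (author TEXT, title TEXT)")
conn.execute(
    "INSERT INTO records VALUES (?, ?)", ("Doe", "<script>alert(1)</script>")
)

# A classic injection attempt: with a bound parameter it is treated as data.
hostile = "x' OR '1'='1"
print(find_titles(conn, hostile))
print(render_title(find_titles(conn, "Doe")[0]))
```

Regression tests around helpers like these are one way a test suite can keep such protections from silently disappearing.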

Our community
• 30 institutions worldwide
• CERN + DESY + Fermilab + SLAC
• EPFL …
• ADS and arXiv joining forces
• Translated so far into 26 languages
• 45 committers (in the last year)
• Free + Paid support

An example installation
• 1 Load balancer (HAProxy + Apache mod_proxy + mod_evasive)
• 5 Worker nodes:
  • 2 VMs for static files
  • 3 Real machines for Python handled requests
• 2 DB nodes (MySQL master + MySQL replica)
• AFS distributed FS for backups and file storage
• Sustained recent Higgs announcement load (230 requests per second with peaks of 800 req/s)
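As a rough back-of-the-envelope reading of these figures, the sketch below divides the quoted request rates across the three Python worker machines. It assumes a perfectly even split and that every request reaches a Python worker; since static files are served by separate VMs, the real per-worker numbers would be lower. The function name is hypothetical.

```python
WORKERS = 3          # machines serving Python-handled requests (from the slide)
SUSTAINED_RPS = 230  # sustained requests per second (from the slide)
PEAK_RPS = 800       # peak requests per second (from the slide)


def per_worker(total_rps, workers):
    """Requests per second per worker, assuming a perfectly even split."""
    return total_rps / workers


# Upper bounds on per-worker load under the stated assumptions.
print(round(per_worker(SUSTAINED_RPS, WORKERS), 1))
print(round(per_worker(PEAK_RPS, WORKERS), 1))

# How much headroom the peak demands relative to the sustained rate.
print(round(PEAK_RPS / SUSTAINED_RPS, 2))
```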

What’s next?
• Werkzeug/Flask + Jinja2 + WTForms for the web framework
• SQLAlchemy for DB abstraction
• Twitter Bootstrap + jQuery for the style
• Optional Solr indexing

Indico: Conference Management Software
http://indico-software.org
• History and Features
• Technologies
• Development

What is Indico?
• Web-based event organization
• Archive of events metadata and related documents (minutes, slides, etc.)
• Booking service and collaboration hub
  • Rooms
  • Videoconference
  • Webcast

What is Indico?
• Started as a European project in 2002
• First used in 2004
• In production at CERN: http://indico.cern.ch
• And in >100 institutions around the world
  • GSI, DESY, Fermilab,…
  • http://indico-software.org/wiki/IndicoWorldWide
• Free and Open Source

Indico @ CERN
• > 170,000 events
• > 700,000 presentations
• > 900,000 files

Event Management with Indico
• All kinds of events

Managing Simple Events

Managing Meetings

Managing Conferences
• Full Lifecycle

Collaboration Hub
• Room Booking
• Collaboration service requests: videoconference, webcast, recording

Technology
• Python >2.6 + WSGI
  • babel, webassets, pytz, zope.index, zope.interface, simplejson, suds, lxml, zc.queue, python-dateutil, pypdf, pyatom, reportlab, etc.
• Mako 0.4.1+ as template engine
• ZODB as underlying database (http://www.zodb.org/)
• Web frameworks:
  • jQuery
  • Backbone.js

Infrastructure

Compatibility
• Compatible with many browsers: IE8+, FF3.6+, Google Chrome, Safari, etc.
• Working on mobile version

Development Tools
• Git as version control system
• ~ Eclipse + PyDev
• Unit and Selenium tests + Jenkins (continuous integration server)
• Sphinx for documentation
• Trac as project site
• GitHub: http://github.com/indico
• Transifex for i18n: https://www.transifex.com/projects/p/indico/

What’s Next?
• Enhance the software: v1.0 end of 2012
• Enlarge the community: more advertising

Questions?