OpenStack for Research
and
Research for OpenStack
the INFN – OpenStack joint venture/adventure
Doina Cristina Aiftimiei (INFN CNAF), on behalf of the INFN Cloud WG
OpenStack Summit – Tokyo 2015
This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License
Outline
• INFN – National Institute for Nuclear Physics, Italy
• OpenStack for Research: national & international experiences
• Research for OpenStack: development & improvements for OpenStack
INFN = Istituto Nazionale di Fisica Nucleare
(National Institute for Nuclear Physics)
The Italian research agency dedicated to the study of the fundamental constituents of matter and the laws that govern them, under the supervision of the Ministry of Education, Universities and Research (MIUR).
It conducts theoretical and experimental research in the fields of subnuclear, nuclear and astroparticle physics, in close collaboration with Italian universities.
o Strong experience and know-how also in cutting-edge technology and instruments, including long-standing experience with HPC, distributed storage and computing (Grids and Clouds).
• Strong focus also on technology transfer programs: transfer to Italian and European companies of technologies and know-how developed within INFN scientific programs.
• INFN sites: about 30, among full INFN branches and collaboration groups hosted in university departments
4 national laboratories: Catania, Frascati, Gran Sasso, Legnaro
3 national centers:
o CNAF, National Center for Research and Development in Informatics and Telematics, Bologna
o GSSI, Gran Sasso Science Institute, L'Aquila
o TIFPA, Trento Institute for Fundamental Physics and Applications, Trento
INFN – lines of research & projects
• Projects addressing "next generation computing infrastructure"
DataGrid (2001 – 2004)
o Research and Technological Development for an International Data Grid
o EC: 9.6 M€ – INFN 9.69%
DataTAG (2002 – 2004)
o Research & Technological Development for a TransAtlantic Data Grid
o EC: 4.1 M€ – INFN 15.29%
EGEE I, II, III (2004 – 2010)
o Enabling Grids for E-sciencE
o EC: 100.8 M€ – INFN 11%
EMI (2010 – 2013)
o European Middleware Initiative
o EC: 12 M€ – INFN 16%
EGI-InSPIRE (2010 – 2014)
o Integrated Sustainable Pan-European Infrastructure for Researchers in Europe
o EC: 25 M€ – INFN 6.9%
INFN – lines of research & projects
• NEW projects addressing "next generation computing infrastructure"
National Projects
o PRISMA (2012 – 2015): PiattafoRme cloud Interoperabili per SMArt-government
• MIUR/EC: 27.4/20.3 M€ – INFN 8.14%
o OCP (2014 – 2016): OpenCityPlatform – Smart Cities and Communities and Social Innovation
• MIUR: 11.9 M€ – INFN 13.2%
International Projects
o EGI-Engage (2015 – 2017): Engaging the Research Community towards an Open Science Commons
• EC: 8 M€ – INFN 7.5%
o INDIGO – DataCloud (2015 – 2017): INtegrating Distributed data Infrastructures for Global ExplOitation
• EC: 11 M€ – INFN ~18.7%
INFN needs … OpenStack
• Heavily distributed nature
Local computing groups
o day-to-day support to users and local services
Centrally coordinated teams
o central services – mainly CNAF & LNF
• Areas
Local computing services at INFN sites (e.g. local networks, mailing, support for local users, …)
Basic services and central services (e.g. network services, HA infrastructure, Authentication and Authorization Infrastructure, web servers, …)
Computing for administrative services (Sistema Informativo)
Big computing centres (Tier-1 and Tier-2s) and distributed computing infrastructure (Grid)
Computing in the experiments and in national and international projects
=> All of these have largely benefited from advancements in technology, especially in the fields of distributed computing and virtualization.
The INFN infrastructure must be, by its nature, inclusive, allowing the use of as many resources as possible, including heterogeneous ones.
Federation and orchestration mechanisms give access to the different resources at the IaaS level.
INFN Corporate Cloud
• The INFN Corporate Cloud (INFN-CC) working group has been planning and testing possible architectural designs for the implementation of a distributed private cloud infrastructure, and has realized a prototype
• Overview
INFN-CC is a multi-regional OpenStack installation
o some services are centrally managed and common to all regions, while other services are local and associated with a single region
o based on a small number of core sites (3)
The main goal: standard IaaS interfaces to a homogeneous but distributed cloud environment, focused on the deployment of highly available, distributed network services and applications
INFN Corporate Cloud (2)
• Highlights
Common, distributed Identity Service backed by the INFN AAI LDAP
Common, distributed Swift Object Storage for:
o VM image/snapshot repository
o block device backup
o personal data …
Common, distributed Image Service backed by the above Object Storage
DNS HA + cloud.infn.it DNS domain
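Backing the Image Service with the common Swift store takes only a few options in glance-api.conf. The sketch below uses Juno-era option names; the auth endpoint, account and key are placeholders, not INFN-CC's actual values:

```ini
[DEFAULT]
# Store images in the common, distributed Swift object store
default_store = swift
swift_store_auth_address = https://keystone.example.org:5000/v2.0/
swift_store_user = services:glance          # tenant:user (placeholder)
swift_store_key = GLANCE_SWIFT_PASSWORD     # placeholder secret
swift_store_container = glance
swift_store_create_container_on_put = True
```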
Working on:
• Log collection and analysis
• Infrastructure automation
• Infrastructure management
• Infrastructure monitoring (Nagios → Zabbix)
• Deployment of use cases
INFN-CC
Regional Cloud Infrastructures
Bari: New Data Center
BARI – IaaS
• Hardware: HUAWEI CE 12800
Hosts directly connected to the core switch at 10 Gb
Multi-disk hosts used for cloud storage
• Provisioning: Foreman
• Deployment tool: Puppet
• OpenStack deployment
Private cloud to provide local services
Public cloud to support several projects
• Deployment details
MySQL backend consolidated with a Percona/Galera cluster
RabbitMQ AMQP backend running in cluster mode
HAProxy as endpoint frontend
VLAN tenant networks
Linux bridge (no Neutron L3 agent)
Ceph RBD used as backend for Glance, Cinder and Nova
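The "HAProxy as endpoint frontend" pattern can be sketched as a minimal haproxy.cfg stanza; the virtual IP and controller addresses below are placeholders, not the Bari deployment's real ones:

```
listen keystone_api
    bind 192.0.2.10:5000            # virtual IP (placeholder)
    balance roundrobin
    option httpchk
    server controller-1 192.0.2.11:5000 check inter 2000 rise 2 fall 5
    server controller-2 192.0.2.12:5000 check inter 2000 rise 2 fall 5
```

One such stanza per OpenStack API port lets clients see a single, stable endpoint while requests are spread across the clustered controllers.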
CNAF – Data Center & Cloud Activities
• The main INFN computing centre, providing computing and storage services to ~30 scientific collaborations
• Tier-1 for the LHC experiments (ATLAS, CMS, ALICE and LHCb)
• Provision and evolution of a Cloud infrastructure to provide virtual services (CPU, storage, database, networking, …) for various use cases
Investigation of how to better integrate Grid-based and Cloud-based services
Cloud Partition: dynamically assign nodes belonging to the Grid-enabled Tier-1 farm to a Cloud infrastructure
The partition director uses a dynamic partitioning mechanism similar to the one deployed at the Tier-1 for the provisioning of multi-core resources
Havana deployment
o 1 Controller Node: Keystone, Glance (LVM), Heat, Horizon, Ceilometer, MySQL, QPID
2x8 HT (32) Intel Xeon E5-2450 @ 2.10 GHz, 64 GB RAM
o 1 Network Node: Neutron with OVS + VLAN
2x6 HT (24) Intel Xeon E5-2450 @ 2.10 GHz, 64 GB RAM
o 4 Compute Nodes: Nova with KVM/QEMU
2 x 8-core AMD with 64 GB RAM per node (total: 64 cores and 256 GB RAM)
o Shared Storage (PowerVault + 2 GPFS servers): 16 TB on GPFS for the Nova backend
o 1 Web Proxy for the dashboard
Juno deployment
o 2 Controller Nodes – HA active/active: Keystone, Heat, Horizon, Ceilometer, Neutron server, Trove
HAProxy & Keepalived for the APIs
Glance & Cinder
o 2 Network Nodes – HA partially active/active (DHCP agents active/active, L3 agents in hot standby): Neutron with Linux bridge + VLAN
2x6 HT (24) Intel Xeon E5-2450 @ 2.10 GHz, 64 GB RAM
o 13 Compute Nodes = 208 CPUs + ~814 GB RAM: Nova with KVM/QEMU, Linux bridge agent
2 x 8-core AMD Opteron 6320 @ 2.8 GHz with 64 GB RAM
o Shared Storage (PowerVault + 2 GPFS servers): 16 TB on GPFS for the Nova (instances), Glance (images) and Cinder (volumes) backends
o 1 Web Proxy for the Dashboard
o 3x Percona XtraDB, RabbitMQ, MongoDB, ZooKeeper
o 2x HAProxy for Percona (failover, no round-robin!)
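In an HAProxy & Keepalived pairing like the one above, Keepalived keeps a virtual IP alive across the two controllers while HAProxy behind it balances the API traffic. A minimal VRRP stanza might look like the following; the interface name, router id and VIP are placeholders:

```
vrrp_instance VI_API {
    state MASTER                # BACKUP on the second controller
    interface eth0              # placeholder interface
    virtual_router_id 51
    priority 101                # lower priority on the backup node
    advert_int 1
    virtual_ipaddress {
        192.0.2.10              # placeholder VIP fronted by HAProxy
    }
}
```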
Cloud Area Padovana
OpenStack Icehouse – HA setup
• 21 compute nodes (728 HT cores); new compute nodes (~300 cores) to be added in a few months
• ~100 TB for OpenStack services and user data
• Used by several experiments for different use cases
• Cloud Area Padovana – home-made developments
Identity Management
o to manage integration with external identity services (in particular the INFN IdP)
o to manage registrations of users and projects
RMLab – Single-region OpenStack distributed datacenter
• Key features:
Distributed layer-3 private network
10 Gb / low-latency geographical connection
Security by isolation
Highly automated infrastructure
Distributed datacenter management
OpenStack RMLab
• How it works:
Automate all you can
Everything replicable anywhere
The wiki is the bible
Git is the vault
EGI Federated Cloud (FedCloud)
• More than one year in production
• Resources
21 providers from 14 NGIs
17 sites interested in joining, from 7 new NGIs
Average availability/reliability: 82%/83%
• Usage
~700K VMs
~9M CPU hours of wall time
3 Cloud flavours: OpenStack (13 sites), OpenNebula (7), Synnefo (1)
INFN & EGI FedCloud
3 INFN OpenStack (Juno) sites are part of the EGI FedCloud:
• Bari, Catania and Padova
• Resources (total): > 500 cores and 63 TB
• Many scientific communities supported: CERN LHC experiments, INSTRUCT, WeNMR, MoBrain, Biomed, DRIHM, Chipster, …
• Interoperability based on OCCI, but evolving towards 'de facto' standards (e.g. the Nova API)
National Cloud Projects
• PRISMA (PiattafoRme cloud Interoperabili per SMArt-government), industrial research project financed by MIUR – “PON Ricerca e Competitività”
• OCP (Open City Platform), Smart Cities and Communities and Social Innovation – financed by MIUR
PRISMA Project
• PRISMA is an innovative open source cloud computing solution designed to meet the specific needs of the Public Administration
• "[…] development of an innovative cloud computing platform, open and interoperable, for e-government services, which produces models and innovative reference implementations in the processes that involve the urban and metropolitan aspects of Public Local Administration (PAL), and the realization of a set of vertical, scalable applications, accessible according to a 'self-service' model."
• Key Features
On-demand computing resources
On-demand storage resources
Aggregate monitoring and alarming for hardware and services
Support for identity federation for users
PRISMA IaaS technical specs & specific functionalities
• Open source technologies:
IaaS Cloud platform: OpenStack
Storage: GlusterFS, Ceph
Network: pfSense, SDN
Monitoring: Zabbix
Virtualization: KVM, Hyper-V (Cloudbase), VMware
Configuration: Puppet
• Technical specifications
Scale-out
o high-performance computing
o high-performance storage
All data in local storage at each site are replicated, to provide high reliability for all storage tiers
o block, object and ephemeral storage
All IaaS components are based on open source technologies
• Specific functionalities:
VPN between geographically distributed sites
Geographical high availability for data and services
IaaS-level orchestration
Aggregated monitoring (Ceilometer + Zabbix)
Interoperability between open source and proprietary virtualizers
External authentication systems (SAML2, X.509, LDAP, Active Directory)
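Hooking Keystone to an external LDAP or Active Directory, one of the authentication systems listed above, takes a handful of options in keystone.conf. This sketch uses Juno-era option names; the server URL and DNs are placeholders:

```ini
[identity]
driver = keystone.identity.backends.ldap.Identity

[ldap]
url = ldap://ldap.example.org               # placeholder server
suffix = dc=example,dc=org
user_tree_dn = ou=People,dc=example,dc=org  # placeholder subtree
user_objectclass = inetOrgPerson
user_id_attribute = uid
```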
Open City Platform Project
• Objectives:
to research, develop and test new open and interoperable technological solutions that can be used on demand within Cloud Computing
research, development and testing of new organizational models, sustainable in time for the PA, to innovate, with new standards and technological solutions, the provision of services by Regional and Local PA to citizens, businesses and other administrations
• Scientific and technological challenges
Federated management of heterogeneous cloud platforms
Integrated monitoring and support for billing systems
Design and re-engineering of cloud applications
Disaster Recovery as a Service
Integration of PaaS components, in particular PaaS for eGov
Open Data and Open Services, and their integration in business models
Federated identity management and related trust relationships
OCP Project & OpenStack
• Architectural model
• IaaS layer
OpenStack Juno on Ubuntu 14.04
Automatic installation tool
o HA deployment with puppet-openstack
o Foreman integration
Working on:
o Cloud Management Interface
o MaaS
o CFaaS (Heat)
o DRaaS
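CFaaS on Heat means users submit orchestration templates rather than individual API calls. A minimal HOT template for a Juno-era Heat could look like the following; the image and flavor names are placeholders:

```yaml
heat_template_version: 2014-10-16

description: Minimal single-server stack (sketch)

resources:
  server:
    type: OS::Nova::Server
    properties:
      image: ubuntu-14.04        # placeholder image name
      flavor: m1.small           # placeholder flavor

outputs:
  server_ip:
    description: First IP address of the server
    value: { get_attr: [server, first_address] }
```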
International Projects
The INDIGO – DataCloud project
INtegrating Distributed data Infrastructures for Global ExplOitation
• An H2020 project approved in January 2015 in the EINFRA-1-2014 call: 11.1 M€, 30 months (until September 2017), led by INFN
• Who: 26 European partners in 11 European countries
Including developers of distributed software, industrial partners, research institutes, universities and e-infrastructures: e.g. INFN, CSIC, UPV, CERN, KIT, EGI.eu and many others (see the project website)
• What: develop a general-purpose open source Cloud platform for scientific computing and data, based on standards and widely used technologies supported by large communities
Offering a low learning curve: existing software suites like ROOT, Octave/Matlab, Mathematica or RStudio will be supported and offered in a transparent way
• For: multi-disciplinary scientific communities
E.g. structural biology, earth science, physics, bioinformatics, cultural heritage, astrophysics, life science, climatology
The gathered use cases were distilled into 29 ranked computational, storage and infrastructural requirements
• Where: deployable on hybrid (public or private) Cloud infrastructures
The role of private companies is crucial to exploit hybrid cloud platforms
Reference platforms will be OpenStack and OpenNebula
• Why: requirements coming from ~11 different communities
INDIGO – DataCloud: General Architecture Overview
INDIGO – DataCloud
• Improving IaaS resources to accommodate scientific applications
Providing support for containers
o develop/extend container support
o integration of trusted repositories for containers
o extend relevant IaaS standard interfaces
Improve the existing cloud schedulers
o fair-share scheduling
o spot instances
Integration of container execution in batch systems; explore access to InfiniBand and GPGPUs
Provide local IaaS site orchestration using standards (i.e. TOSCA)
INDIGO – DataCloud & OpenStack
• Containers
Container execution support
o existing nova-docker driver; follow up its development
o evaluate the Magnum project: deploying containers on top of OpenStack (still under development; implies an API change)
Integration of container repositories
o automatic sync with the INDIGO container repository (DockerHub)
o same baseline of images in all resource providers
o integration with OpenStack Glance
• Schedulers
Two complementary mechanisms: fair sharing and spot (preemptible) instances
o fair sharing – Synergy
o spot instances: aiming to introduce basic support for spot instances into the OpenStack core (blueprint submitted)
• Orchestration
Provide IaaS orchestration using the OASIS TOSCA language (used at different levels within INDIGO)
OpenStack Orchestration (Heat) as the IaaS orchestration engine
TOSCA translator at the CLI level
Aiming to make Heat accept TOSCA requests directly
o discussions with OpenStack devs; blueprint being drafted
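For comparison, a single-server request in the OASIS TOSCA Simple Profile in YAML, the kind of input a TOSCA translator would turn into a HOT template, could be sketched as follows (the property values are illustrative):

```yaml
tosca_definitions_version: tosca_simple_yaml_1_0

topology_template:
  node_templates:
    server:
      type: tosca.nodes.Compute
      capabilities:
        host:
          properties:
            num_cpus: 1        # illustrative sizing
            mem_size: 2 GB
            disk_size: 10 GB
```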
INDIGO – DataCloud: advanced scheduling in OpenStack
• Implementation of a new advanced scheduling service, Synergy
Adoption of a resource-provisioning model based on a fair-share algorithm, to maximize resource usage while guaranteeing that resources are fairly distributed among users and groups
Persistent priority queuing mechanism for handling user requests that cannot be immediately fulfilled
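The slides do not spell out Synergy's exact formula, but a common fair-share formulation (the SLURM-style decay factor, shown here purely as an illustration, not as Synergy's implementation) maps a group's assigned share and its historical usage to a priority factor:

```python
def fairshare_factor(norm_share: float, norm_usage: float) -> float:
    """SLURM-style fair-share factor, shown as an illustration only.

    norm_share: fraction of the infrastructure assigned to a group.
    norm_usage: fraction of historical usage consumed by that group.
    Returns 1.0 for an idle group, 0.5 when usage matches the share,
    and decays towards 0 as the group over-consumes.
    """
    if norm_share <= 0.0:
        return 0.0
    return 2.0 ** (-norm_usage / norm_share)


# Hypothetical groups: equal shares, very different past usage.
groups = {"theory": (0.5, 0.10), "cms": (0.5, 0.80)}
# Requests from under-served groups float to the front of the queue.
ranked = sorted(groups, key=lambda g: fairshare_factor(*groups[g]), reverse=True)
```

Queued requests re-ranked this way realize the goal above: heavy past consumers sink in priority, so unused share flows back to the groups that are behind.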
• Where we are now: integrating Synergy as an external OpenStack project.