Page 1

A Hybrid Cloud Architecture for a Social Science Research Computing Data Center

June 30, 2014 @ DCPerf 2014

Len Wisniewski
Director, Research Technology Services
Institute for Quantitative Social Science, Harvard University

Joint work with Steve Abramson and Bill Horka

Page 2

Social Science Research Problems

• Examples
  • Statistical analysis (EdX)
  • Social network analysis (Twitter)
  • Text analysis (PDF scraping)
  • Geographic analysis (WorldMap)
  • Qualitative analysis (survey data)

Collect Data

Store Data

Clean Data

Analyze Data

Archive & Share Data

Page 3

Social Science Research requirements

• Easy to use
  • Availability of GUIs for familiar applications

• Scalable analysis
  • Scalable data
  • Scalable computation

• Secure storage
  • Confidential data
  • Harvard Level 3, 4, 5 data (see security.harvard.edu)

Page 4

IQSS Research Technology Services

Research Technology Consulting

Infrastructure Research & Development

Infrastructure Operations

Page 5

RCE in detail

• RCE = Research Computing Environment
• Used to run common statistical applications
  – R, GAUSS, Mathematica, MATLAB, Octave, SAS, S-PLUS, Stata
• RCE has three types of nodes
  – Login nodes
    • User logs in via NX (similar to VNC) and gets a desktop session
    • User can launch an application directly from the desktop
  – Compute-on-demand nodes
    • User has special “RCE Powered Applications” menu to launch applications on machines with large memory resources (up to 250 GB)
  – Batch nodes
    • Used typically for non-interactive, long-running, scalable jobs
    • Most jobs use R

Page 6

RCE architecture and configuration

Diagram labels: User; Secure login via NX; Interactive nodes; Resource Manager (allocate and manage resources); Batch nodes; Local; Remote

Key Applications: R, GAUSS, Mathematica, MATLAB, Octave, SAS, S-PLUS, Stata (SE and MP)

Page 7

RCE batch resource usage

Page 8

Batch nodes and R

Page 9

The case for RCE and the cloud

• To advance research computing technologies, we need to focus less on commodity services

• External vendors manage large commodity clusters more efficiently than any in-house operation

• Embarrassingly parallel queries, the bulk of social science data analysis, are ideal workloads to benefit from cloud resources

• The RCE’s structure allows a gradual transition and hybrid infrastructure

• Clouds will expand the range of hardware and platform support for all researchers
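To illustrate what "embarrassingly parallel" means here, the following is a minimal sketch, not the RCE's actual job setup. It splits a bootstrap of a sample mean into chunks that share no state, so each chunk could run on any node. The function names, chunk sizes, and the use of a local thread pool as a stand-in for separate batch jobs are all assumptions made for illustration.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def bootstrap_chunk(data, n_resamples, seed):
    """Return resampled means for one independent chunk of the bootstrap."""
    # A per-chunk generator keeps chunks independent and reproducible.
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]


def parallel_bootstrap(data, n_chunks=4, resamples_per_chunk=250, workers=4):
    """Run the chunks concurrently and merge their results in chunk order."""
    # Hypothetical stand-in: on a cluster, each chunk would be its own batch job.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(bootstrap_chunk, data, resamples_per_chunk, seed)
            for seed in range(n_chunks)
        ]
        means = []
        for future in futures:
            means.extend(future.result())
    return means
```

Because no chunk depends on another, adding nodes (local or remote) shortens the wall-clock time without changing the code of each chunk.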

Page 10

The hybrid model: RCE and the cloud

Diagram labels: User; Secure login via NX; Interactive; Resource Manager (allocate and manage resources); Batch nodes; Local; Remote …

Key Applications: R, GAUSS, Mathematica, MATLAB, Octave, SAS (COD only), S-PLUS, Stata (SE and MP)

Further application lists in the diagram:
• R, GAUSS, Mathematica, MATLAB, Octave, SAS, S-PLUS, Stata (SE and MP)
• R, Octave
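One way to picture the hybrid model is as a placement decision made by the resource manager. The sketch below is hypothetical and is not the RCE's scheduler. It assumes that the shorter application list on the slide (R, Octave) is what remote cloud nodes offer, that interactive sessions stay local, and that batch jobs overflow to remote nodes only when no local slot is free. The names `Job`, `place`, and `local_slots_free` are invented for the example.

```python
from dataclasses import dataclass

# Assumption: the full list runs locally, the short list runs on remote nodes.
LOCAL_APPS = {"R", "GAUSS", "Mathematica", "MATLAB", "Octave", "SAS", "S-PLUS", "Stata"}
REMOTE_APPS = {"R", "Octave"}


@dataclass
class Job:
    app: str
    interactive: bool = False


def place(job, local_slots_free):
    """Return "local" or "remote" for a job under the assumed policy."""
    if job.app not in LOCAL_APPS:
        raise ValueError(f"unsupported application: {job.app}")
    # Interactive sessions and apps absent from remote nodes must stay local.
    if job.interactive or job.app not in REMOTE_APPS:
        return "local"
    # Eligible batch jobs overflow to the cloud only when local nodes are full.
    return "local" if local_slots_free > 0 else "remote"
```

For example, `place(Job("R"), local_slots_free=0)` sends the job to a remote node, while a Stata job waits for local capacity under this policy.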

Page 11

Cloud advantages

• Elasticity
  – Avoids over- and under-resourcing
  – Eliminates need to pay for resources not in use
  – Accesses much larger set of resources when needed

• Increased research computing focus
  – Offloads hardware maintenance to the “experts”
  – Focuses local staff on working with researchers to develop the next generation of social science computing tools

• Customized user environments
  – Sets up each cloud OS image with only the software needed

• More direct accounting of usage
  – Reduces divisional upfront commitment
  – Charges project for specific time / resources used
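The accounting point can be made concrete with a little arithmetic. The sketch below is an illustration under assumed inputs: the record layout (project, nodes, hours), the flat rate per node-hour, and both function names are hypothetical, and real cloud pricing is usually more detailed.

```python
from collections import defaultdict


def charge_projects(usage_records, rate_per_node_hour):
    """Sum per-project charges from (project, nodes, hours) records."""
    charges = defaultdict(float)
    for project, nodes, hours in usage_records:
        # Each project pays only for the node-hours it actually consumed.
        charges[project] += nodes * hours * rate_per_node_hour
    return dict(charges)


def break_even_node_hours(fixed_monthly_cost, rate_per_node_hour):
    """Node-hours per month at which pay-per-use equals a fixed commitment."""
    if rate_per_node_hour <= 0:
        raise ValueError("rate_per_node_hour must be positive")
    return fixed_monthly_cost / rate_per_node_hour
```

Below the break-even number of node-hours, paying per use costs less than the fixed commitment; above it, the fixed commitment is cheaper.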

Page 12

Moving exclusively to the cloud

Diagram labels: User; Secure login via NX; Interactive; Resource Manager (allocate and manage resources); Batch nodes; Local; Remote …; callout numbers 1 to 5

Challenges
1. Managing number of nodes in cluster
2. Securing communication between local and remote resources
3. Syncing local and remote data
4. Managing cost for high-memory nodes
5. License management and connection issues for interactive apps
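Challenge 1, managing the number of nodes in the cluster, is often approached with a sizing rule driven by the job queue. The sketch below shows one such rule as an assumption, not the approach taken in this work: request enough remote nodes to cover the queued jobs, clamped between a floor and a budget-driven ceiling. The function name and all parameter names and defaults are hypothetical.

```python
import math


def desired_remote_nodes(queued_jobs, jobs_per_node, min_nodes=0, max_nodes=20):
    """Return how many remote nodes to run for the current queue length."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    # Round up so a partially filled node is still provisioned.
    needed = math.ceil(queued_jobs / jobs_per_node)
    # Clamp to the allowed range; max_nodes acts as a cost ceiling.
    return max(min_nodes, min(max_nodes, needed))
```

With 9 queued jobs and 4 jobs per node, the rule asks for 3 nodes; an empty queue scales back to `min_nodes`.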

Page 13

Implementation

Page 14

Future work

• Simulation
• Expanding to other clouds
• Distributed file systems
• Securely isolating jobs
• Hierarchical databases

