The HDF Group
www.hdfgroup.org
Milestone 5.1: Initial POSIX Function Shipping Demonstration
Jerome Soumagne, Quincey Koziol
09/24/2013
© 2013 The HDF Group
Overview – Mercury
• Mercury "Function Shipper": RPC layer that supports
  • Non-blocking transfers
  • Large data arguments (w/RMA)
  • Native transport protocols of HPC systems
• Mercury serves as a basis for higher-level frameworks that need to operate on/store/access data remotely
  • HDF5 IOD virtual object plugin
  • IOFSL I/O forwarding scalability layer
  • Storage systems
  • Analysis frameworks
Overview – Mercury
• Already largely presented in previous milestones
• No major modification of Mercury for this deliverable in order to support POSIX calls
• But Mercury is still being improved:
  • Performance tuning on InfiniBand cluster
  • Support for additional network transports is being added (TCP / ibverbs / SSM)
• Paper submitted at end of Q4 now accepted and being presented at IEEE Cluster 2013:
  • J. Soumagne, D. Kimpe, J. Zounmevo, M. Chaarawi, Q. Koziol, A. Afsahi, and R. Ross, "Mercury: Enabling Remote Procedure Call for High-Performance Computing", IEEE International Conference on Cluster Computing, Sep 2013
Fast Forward Stack – Function Shipping
[Diagram: Fast Forward stack with function shipping. Boxes: HDF5 API, VOL, Native (H5), IOD VOL, Mercury (Client), Network, Mercury (Server), IOD VOL, VFL, …]
POSIX Function Shipping (Example)
[Diagram: POSIX function shipping example. Boxes: HDF5 API, VOL, Native (H5), IOD VOL, VFL, sec2, Mercury (Client), Network, Mercury (Server), File System; the label "POSIX I/O" appears three times.]
Mercury POSIX
• Support POSIX I/O routines through Mercury
• Completely separate package built on top of Mercury, called Mercury POSIX (lightweight library + server)
• Key design points:
  • Support 32/64 bit platforms and large files
  • No modification of original source code that uses POSIX I/O (e.g., HDF5 sec2 driver)
  • Redirects I/O to Mercury server with dynamic linking
• Can make use of all the transports available through Mercury (although MPI dynamic connection is not very flexible and is not always available)
• Code for supporting POSIX routines is automatically generated inside Mercury POSIX by using Boost preprocessor macros
Mercury POSIX – Stub Generation
• Most routines are generated with a one-line macro
• Built on top of existing Mercury/Boost macros
• However, supporting variable-argument routines requires some extra lines to create encoding / decoding routines that check argument flags, etc.
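As an illustration of why variable-argument routines need extra handling, the sketch below packs the arguments of an open()-style call. The third argument exists only when O_CREAT is set, so the flags have to be inspected before it is read. The names open_in_sketch_t and pack_open_args are hypothetical and are not part of Mercury POSIX; this is only one plausible way to perform the flag check.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical input structure for a shipped open() call. */
typedef struct {
    const char *path;
    int flags;
    mode_t mode;   /* only meaningful when has_mode is nonzero */
    int has_mode;
} open_in_sketch_t;

/* Pack the arguments of an open()-style call. The optional third
 * argument is read only when O_CREAT is set in flags. */
open_in_sketch_t
pack_open_args(const char *path, int flags, ...)
{
    open_in_sketch_t in;

    assert(path != NULL);
    in.path = path;
    in.flags = flags;
    in.mode = 0;
    in.has_mode = 0;

    if (flags & O_CREAT) {
        va_list ap;

        va_start(ap, flags);
        /* mode_t may be narrower than int and is promoted when passed
         * through "...", so it is fetched as int and converted back. */
        in.mode = (mode_t) va_arg(ap, int);
        va_end(ap);
        in.has_mode = 1;
    }
    return in;
}
```

An encoding routine built on such a structure could then send the mode field only when has_mode is set, which is the kind of flag check the bullet above refers to.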
Mercury POSIX – Stub Generation
• Two main macros:
    /* Non-bulk routines */
    MERCURY_POSIX_GEN_STUB(func_name, ret_type, in_types, out_types)

    /* Bulk routines */
    MERCURY_POSIX_GEN_BULK_STUB(func_name, ret_type, in_types, out_types,
        bulk_read) /* 1/0 if reading/writing bulk data */
Mercury POSIX – Stub Generation
• Example, showing results of the following macro:
    /* off_t lseek(int fildes, off_t offset, int whence) */
    MERCURY_POSIX_GEN_STUB(lseek,
        hg_off_t,
        (hg_int32_t)(hg_off_t)(hg_int32_t),
    )
Mercury POSIX – Stub Generation
• Generate input structure
    typedef struct
    {
        hg_int32_t in_param_0;
        hg_off_t in_param_1;
        hg_int32_t in_param_2;
    } lseek_in_t;
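To give a feel for how a macro can emit structures of this shape, here is a minimal sketch that uses plain C macros with a fixed argument count. Mercury POSIX itself relies on Boost preprocessor sequences such as (hg_int32_t)(hg_off_t)(hg_int32_t), which handle any number of arguments; the fixed-arity macros below are a simplified stand-in. All SKETCH_* and sketch_* names are hypothetical, and the type widths are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Mercury types named on the slides (assumed widths). */
typedef int32_t sketch_int32_t;
typedef int64_t sketch_off_t;

/* Hypothetical fixed-arity generator: emits <name>_in_t with three
 * members named in_param_0..in_param_2. */
#define SKETCH_GEN_IN_STRUCT3(name, t0, t1, t2) \
    typedef struct {                            \
        t0 in_param_0;                          \
        t1 in_param_1;                          \
        t2 in_param_2;                          \
    } name##_in_t;

/* Hypothetical generator for the output structure: emits <name>_out_t. */
#define SKETCH_GEN_OUT_STRUCT(name, ret_type) \
    typedef struct {                          \
        ret_type ret;                         \
    } name##_out_t;

/* One line per structure produces lseek_in_t and lseek_out_t. */
SKETCH_GEN_IN_STRUCT3(lseek, sketch_int32_t, sketch_off_t, sketch_int32_t)
SKETCH_GEN_OUT_STRUCT(lseek, sketch_off_t)

/* Compile-time check that the offset member can hold large-file offsets. */
static_assert(sizeof(((lseek_in_t *) 0)->in_param_1) == 8,
              "offset member must be 64 bits wide");

/* Small helper that fills the generated input structure. */
lseek_in_t
make_lseek_in(sketch_int32_t fd, sketch_off_t offset, sketch_int32_t whence)
{
    lseek_in_t in;

    in.in_param_0 = fd;
    in.in_param_1 = offset;
    in.in_param_2 = whence;
    return in;
}
```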
Mercury POSIX – Stub Generation
• Generate proc routine for input structure
    static __inline__ int
    hg_proc_lseek_in_t(hg_proc_t proc, void *data)
    {
        int ret = HG_SUCCESS; /* error checking omitted */
        lseek_in_t *struct_data = (lseek_in_t *) data;

        hg_proc_hg_int32_t(proc, &struct_data->in_param_0);
        hg_proc_hg_off_t(proc, &struct_data->in_param_1);
        hg_proc_hg_int32_t(proc, &struct_data->in_param_2);

        return ret;
    }
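The sketch below illustrates what a proc routine of this kind does, assuming the usual XDR-style convention in which one routine either encodes or decodes depending on a mode stored in the processor. It is a toy: every sketch_* name is hypothetical, the buffer handling is far simpler than Mercury's, and no byte-order conversion is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { SKETCH_ENCODE, SKETCH_DECODE } sketch_op_t;

/* Hypothetical processor: a cursor over a flat buffer plus a direction. */
typedef struct {
    sketch_op_t op;
    unsigned char *buf;
    size_t size;
    size_t pos;
} sketch_proc_t;

/* Copy nbytes between the buffer and *data according to proc->op.
 * Returns 0 on success, -1 if the buffer has too little room left. */
int
sketch_proc_raw(sketch_proc_t *proc, void *data, size_t nbytes)
{
    assert(proc != NULL && data != NULL);
    if (proc->size - proc->pos < nbytes)
        return -1;
    if (proc->op == SKETCH_ENCODE)
        memcpy(proc->buf + proc->pos, data, nbytes);
    else
        memcpy(data, proc->buf + proc->pos, nbytes);
    proc->pos += nbytes;
    return 0;
}

int
sketch_proc_int32(sketch_proc_t *proc, int32_t *value)
{
    return sketch_proc_raw(proc, value, sizeof *value);
}

int
sketch_proc_int64(sketch_proc_t *proc, int64_t *value)
{
    return sketch_proc_raw(proc, value, sizeof *value);
}

typedef struct {
    int32_t in_param_0;
    int64_t in_param_1;
    int32_t in_param_2;
} sketch_lseek_in_t;

/* One routine serves both directions: it visits the members in a fixed
 * order and lets the per-type routines encode or decode each one. */
int
sketch_proc_lseek_in(sketch_proc_t *proc, sketch_lseek_in_t *in)
{
    if (sketch_proc_int32(proc, &in->in_param_0) != 0) return -1;
    if (sketch_proc_int64(proc, &in->in_param_1) != 0) return -1;
    if (sketch_proc_int32(proc, &in->in_param_2) != 0) return -1;
    return 0;
}
```

Running the routine once in encode mode and once in decode mode over the same buffer should reproduce the original member values, which is why a single generated routine per structure is enough for both client and server.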
Mercury POSIX – Stub Generation
• Generate output structure
    typedef struct
    {
        hg_off_t ret;
    } lseek_out_t;
Mercury POSIX – Stub Generation
• Generate proc routine for output structure
    static __inline__ int
    hg_proc_lseek_out_t(hg_proc_t proc, void *data)
    {
        int ret = HG_SUCCESS; /* error checking omitted */
        lseek_out_t *struct_data = (lseek_out_t *) data;

        hg_proc_hg_off_t(proc, &struct_data->ret);

        return ret;
    }
Mercury POSIX – Stub Generation
• Generate client stub (simplified version)
    hg_off_t
    lseek(hg_int32_t in_param_0, hg_off_t in_param_1, hg_int32_t in_param_2)
    {
        lseek_in_t in_struct;
        lseek_out_t out_struct;
        hg_off_t ret;

        /* Initialization */
        ...

        /* Register function if not registered */
        MERCURY_REGISTER("lseek", lseek_in_t, lseek_out_t);

        /* Fill input structure */
        in_struct.in_param_0 = in_param_0;
        in_struct.in_param_1 = in_param_1;
        in_struct.in_param_2 = in_param_2;

        /* Forward call to remote addr and get a new request */
        HG_Forward(addr, id, &in_struct, &out_struct, &request);

        /* Wait for call to be executed */
        HG_Wait(request, HG_MAX_IDLE_TIME, &status);

        /* Get output parameters */
        ret = out_struct.ret;

        return ret;
    }
Mercury POSIX – Stub Generation
• Generate server stub (simplified version)
    static int
    lseek_cb(hg_handle_t handle)
    {
        lseek_in_t in_struct;
        lseek_out_t out_struct;

        hg_int32_t in_param_0;
        hg_off_t in_param_1;
        hg_int32_t in_param_2;

        hg_off_t ret;

        /* Get input buffer */
        HG_Handler_get_input(handle, &in_struct);

        /* Get parameters */
        in_param_0 = in_struct.in_param_0;
        in_param_1 = in_struct.in_param_1;
        in_param_2 = in_struct.in_param_2;

        /* Call function */
        ret = lseek(in_param_0, in_param_1, in_param_2);

        /* Fill output structure */
        out_struct.ret = ret;

        /* Free handle and send response back */
        HG_Handler_start_output(handle, &out_struct);

        return HG_SUCCESS; /* error checking omitted */
    }
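The client and server stubs can be exercised together in a single process if the network is replaced by a direct function call. The sketch below does exactly that: the "client" fills an input structure, a stand-in forward step hands it to the "server" callback, and the callback runs the real lseek() and fills the output structure. Every sketch_* name is hypothetical, and the in-process sketch_forward merely takes the place of the forward-and-wait step; no Mercury call is used.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>      /* SEEK_SET, SEEK_CUR, SEEK_END */
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    int32_t in_param_0;
    int64_t in_param_1;
    int32_t in_param_2;
} sketch_lseek_in_t;

typedef struct {
    int64_t ret;
} sketch_lseek_out_t;

typedef int (*sketch_handler_t)(const sketch_lseek_in_t *,
                                sketch_lseek_out_t *);

/* "Server" side: unpack the parameters, call the real routine, and
 * store the result in the output structure. */
int
sketch_lseek_cb(const sketch_lseek_in_t *in, sketch_lseek_out_t *out)
{
    assert(in != NULL && out != NULL);
    out->ret = (int64_t) lseek(in->in_param_0, (off_t) in->in_param_1,
                               in->in_param_2);
    return 0;
}

/* Hypothetical in-process transport: stands in for forwarding the
 * request over the network and waiting for its completion. */
int
sketch_forward(sketch_handler_t handler, const sketch_lseek_in_t *in,
               sketch_lseek_out_t *out)
{
    return handler(in, out);
}

/* "Client" side: same argument list as lseek(), but the work is done
 * by whichever handler receives the request. */
int64_t
sketch_shipped_lseek(int32_t fd, int64_t offset, int32_t whence)
{
    sketch_lseek_in_t in_struct;
    sketch_lseek_out_t out_struct;

    in_struct.in_param_0 = fd;
    in_struct.in_param_1 = offset;
    in_struct.in_param_2 = whence;
    out_struct.ret = -1;

    if (sketch_forward(sketch_lseek_cb, &in_struct, &out_struct) != 0)
        return -1;
    return out_struct.ret;
}
```

In a real deployment the file descriptor would refer to a file opened on the server side, so descriptors returned by a shipped open() are only meaningful to other shipped calls.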
Mercury POSIX
• Routines currently supported:
    access    fdatasync   mkdir      truncate
    chdir     fpathconf   mkfifo     umask
    chmod     fstat       mknod      unlink
    chown     fsync       open       write
    creat     ftruncate   pathconf
    close     getcwd      read
    dup       lchown      readlink
    dup2      link        rmdir
    fchdir    lockf       stat
    fchmod    lseek       symlink
    fchown    lstat       sync
• Plus large file versions: creat64, ftruncate64, lseek64, open64, etc.
Mercury POSIX
• Routines not yet supported:
    closedir   pipe     readdir
    fcntl      pread    rewinddir
    opendir    pwrite   utime
• Plus large file versions: ?
Mercury POSIX - Configuration
• Environment variables required:
  • MERCURY_NA_PLUGIN: Underlying network transport method used to forward calls to remote servers.
    • e.g., "bmi"
  • MERCURY_PORT_NAME: Port name information (IP/port) specific to the network transport chosen; used to establish a connection with a remote server.
    • e.g., "tcp://72.36.68.242:22222"
  • LD_PRELOAD: Path to Mercury POSIX shared library.
    • e.g., "/usr/local/lib/libmercury_posix.so"
• Setting LD_PRELOAD redirects all POSIX calls to the Mercury server (can be an issue with local scripts, etc. that make use of POSIX I/O)
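As a rough illustration of how a client library might consume these variables, the sketch below reads MERCURY_NA_PLUGIN and MERCURY_PORT_NAME with getenv() and splits a port name of the form shown in the example ("tcp://72.36.68.242:22222") into protocol, host, and port. The parsing rules, the structure, and all sketch_* names are assumptions made for this example; the actual format accepted by a given Mercury transport may differ, and IPv6 addresses are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of a "<protocol>://<host>:<port>" string. */
typedef struct {
    char protocol[16];
    char host[64];
    int port;
} sketch_port_name_t;

/* Parse a port name. Returns 0 on success, -1 if the string is NULL,
 * malformed, has trailing characters, or has an out-of-range port. */
int
sketch_parse_port_name(const char *s, sketch_port_name_t *out)
{
    int consumed = 0;

    assert(out != NULL);
    if (s == NULL)
        return -1;
    if (sscanf(s, "%15[^:]://%63[^:]:%d%n",
               out->protocol, out->host, &out->port, &consumed) != 3)
        return -1;
    if (s[consumed] != '\0' || out->port <= 0 || out->port > 65535)
        return -1;
    return 0;
}

/* Read the two Mercury variables named on the slide. Returns 0 if both
 * are set, the plugin name is non-empty, and the port name parses. */
int
sketch_load_config(const char **plugin, sketch_port_name_t *port_name)
{
    const char *p = getenv("MERCURY_NA_PLUGIN");
    const char *n = getenv("MERCURY_PORT_NAME");

    assert(plugin != NULL);
    if (p == NULL || n == NULL || strlen(p) == 0)
        return -1;
    *plugin = p;
    return sketch_parse_port_name(n, port_name);
}
```

Failing early when a variable is missing is a deliberate choice here: with LD_PRELOAD active, a misconfigured client would otherwise fail on its first I/O call, far from the cause.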
Mercury POSIX - Testing
• Integrated regression tests (limited POSIX test suite)
• HDF5 sec2 driver (demo)
• Lustre POSIX test suite
  • However: framework issues, needs to be modified; may also need to support fdopen and FILE* routines
Demo – Mercury POSIX and HDF5 tools
Client:
    $ pwd
    ~jsoumagne/demo
    $ ls *.h5
    ls: *.h5: No such file or directory
    $ export MERCURY_NA_PLUGIN="bmi"
    $ export MERCURY_PORT_NAME="tcp://127.0.0.1:22222"
    $ export LD_PRELOAD=/path/to/libmercury_posix.so

Server:
    $ pwd
    ~jsoumagne/demo_server
    $ ls
    coord.h5
    $ mercury_posix_server bmi
    Waiting for client...
Demo – Mercury POSIX and HDF5 tools
Client:
    $ h5dump -H coord.h5
    HDF5 "coord.h5" {
    GROUP "/" {
       DATASET "multiple_ends_dset" {
          DATATYPE H5T_STD_I32LE
          DATASPACE SIMPLE { ( 4, 5, 3, 4, 2, 3, 6, 2 ) / ( 4, 5, 3, 4, 2, 3, 6, 2 ) }
       }
       DATASET "multiple_ends_dset_chunked" {
          DATATYPE H5T_STD_I32LE
          DATASPACE SIMPLE { ( 4, 5, 3, 4, 2, 3, 6, 2 ) / ( 4, 5, 3, 4, 2, 3, 6, 2 ) }
       }
       DATASET "single_end_dset" {
          DATATYPE H5T_STD_I32LE
          DATASPACE SIMPLE { ( 2, 3, 6, 2 ) / ( 2, 3, 6, 2 ) }
    ... (skip)

Server:
    $ mercury_posix_server bmi
    Waiting for client...
    Thu, 19 Sep 13 17:31:00 CDT: Executing open64
    Thu, 19 Sep 13 17:31:00 CDT: Executing __fxstat64
    Thu, 19 Sep 13 17:31:00 CDT: Executing lseek64
    Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read
    Thu, 19 Sep 13 17:31:00 CDT: Executing lseek64
    Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read
    Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read
    Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read
    Thu, 19 Sep 13 17:31:00 CDT: Executing getcwd
    ... (skip)
Demo – Mercury POSIX and HDF5 tools
$ h5copy -i coord.h5 -s single_end_dset -o coord_simple.h5 -d simple
Thu, 19 Sep 13 17:33:51 CDT: Executing open64
Thu, 19 Sep 13 17:33:51 CDT: Executing open64
... (skip)
Thu, 19 Sep 13 17:33:51 CDT: Executing __fxstat64
Thu, 19 Sep 13 17:33:51 CDT: Executing lseek64
Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_read
Thu, 19 Sep 13 17:33:51 CDT: Executing lseek64
Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_read
... (skip)
Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_write
Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_write
Thu, 19 Sep 13 17:33:51 CDT: Executing close
Demo – Mercury POSIX and HDF5 tools
$ h5dump coord_simple.h5
HDF5 "coord_simple.h5" {
GROUP "/" {
   DATASET "simple" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 2, 3, 6, 2 ) / ( 2, 3, 6, 2 ) }
      DATA {
      (0,0,0,0): 0, 1,
      (0,0,1,0): 1, 2,
      ... (skip)
      (1,2,2,0): 122, 123,
      (1,2,3,0): 123, 124,
      (1,2,4,0): 124, 125,
      (1,2,5,0): 125, 126
      }
   }
}
}

Thu, 19 Sep 13 17:36:57 CDT: Executing open64
Thu, 19 Sep 13 17:36:57 CDT: Executing __fxstat64
Thu, 19 Sep 13 17:36:57 CDT: Executing lseek64
Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read
Thu, 19 Sep 13 17:36:57 CDT: Executing lseek64
Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read
Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read
Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read
... (skip)
Thu, 19 Sep 13 17:36:57 CDT: Executing lseek64
Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read
Thu, 19 Sep 13 17:36:57 CDT: Executing close
Conclusion – Future Work
• Very easy to forward POSIX I/O calls, with no modification of existing tools / code required
• Mercury POSIX can be easily extended to support additional system / library calls
• Can directly take advantage of updates to Mercury (network transports, etc.)
• Next Quarter:
  • Support remaining POSIX routines
  • Test with MPI I/O (ROMIO driver)
  • Test with Lustre POSIX test suite
    • If framework issues are solved
Questions