Coprocessors – Uses, Abuses, Solutions
• 26 // SEPTEMBER // 2016
COPYRIGHT 2016 BLOOMBERG FINANCE L.P. ALL RIGHTS RESERVED.
Esther Kundin (with guest appearance by Clay Baenziger)
Coprocessors
What is a coprocessor?
– Custom jar loaded into HBase daemon process
– Endpoint – like a stored procedure
– Observer – like a trigger
Observers
– Region Observer
  • preGet, postGet
  • prePut, postPut
– WAL Observer
– Master Observer
  • runs in HBase master
  • Create, Delete, Modify table
Why use a coprocessor?
– Simple filter or aggregation run on your data
– Reduces amount of data being sent to the client
– NOT for complex data analysis
– Ex: Apache Phoenix (“We put the SQL back in NoSQL”)
PORT – A sample use case
Post-Get example (postGet runs in the RegionServer)
Table Representation:
Key Col1 Col2 Col3 Col4 Col5
Key1Abc 1 4 5
Key1Def 2 2 2
Key1Xyz 10 11 12
Coprocessor Result:
Key1 Abc-col1 Def-col2 Abc-col3 Abc-col4 Xyz-col5
Key1 1 2 4 5 12
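The post-get example collapses several stored rows that share a key prefix into one logical row. A minimal plain-Java sketch of that idea follows. It is not HBase code: the class name, the `merge` helper, and the per-column "which source wins" map are all hypothetical, and only the cells whose column can be read off the result row are filled in, because the column positions of the remaining values did not survive in the slide text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not an HBase class. */
class PostGetMergeSketch {

    /**
     * Builds one logical row from several stored rows.
     * rows: source suffix (e.g. "Abc") -> (column -> value)
     * pick: column -> source suffix whose value is taken (assumed to be known)
     */
    static Map<String, Integer> merge(Map<String, Map<String, Integer>> rows,
                                      Map<String, String> pick) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : pick.entrySet()) {
            Map<String, Integer> source = rows.get(e.getValue());
            if (source != null && source.containsKey(e.getKey())) {
                merged.put(e.getKey(), source.get(e.getKey()));
            }
        }
        return merged;
    }

    /** Rebuilds the slide's result row from the cells that are recoverable. */
    static Map<String, Integer> slideExample() {
        Map<String, Integer> abc = new LinkedHashMap<>();
        abc.put("Col1", 1);
        abc.put("Col3", 4);
        abc.put("Col4", 5);
        Map<String, Integer> def = new LinkedHashMap<>();
        def.put("Col2", 2);
        Map<String, Integer> xyz = new LinkedHashMap<>();
        xyz.put("Col5", 12);

        Map<String, Map<String, Integer>> rows = new LinkedHashMap<>();
        rows.put("Abc", abc);
        rows.put("Def", def);
        rows.put("Xyz", xyz);

        // Column -> winning source, read off the result header on the slide.
        Map<String, String> pick = new LinkedHashMap<>();
        pick.put("Col1", "Abc");
        pick.put("Col2", "Def");
        pick.put("Col3", "Abc");
        pick.put("Col4", "Abc");
        pick.put("Col5", "Xyz");

        return merge(rows, pick);
    }

    public static void main(String[] args) {
        System.out.println("Key1 -> " + slideExample());
    }
}
```

Doing this merge on the server side means the client receives one row instead of three, which is the data-reduction benefit listed under "Why use a coprocessor?".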
Problems and Solutions
Coprocessors crash RegionServers
– Exceptions (other than IOExceptions) in the coprocessor bring down the RegionServer
– In other cases, the coprocessor silently unloads
Solution – catch all exceptions
    public final void prePut(...) throws IOException {
      try {
        prePutImpl(…);
      } catch (IOException ex) {
        // Allow IOExceptions to propagate
        // They won't cause an unload
        throw ex;
      } catch (Throwable ex) {
        // Wrap other exceptions as IOException
        LOG.error("prePut: caught ", ex);
        throw new IOException(ex);
      }
    }
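The catch-all pattern can be tried outside HBase. The sketch below is a self-contained stand-in under stated assumptions: `GuardedHook`, `GuardedHookDemo`, and the `String row` parameter are hypothetical names chosen for illustration, and `System.err` replaces the `LOG` object from the slide. It only demonstrates that an unchecked exception thrown by the implementation leaves the final entry point as an IOException with the original cause attached.

```java
import java.io.IOException;

/** Hypothetical stand-in for an observer hook; not an HBase class. */
abstract class GuardedHook {

    /** The real work; may throw anything, including unchecked exceptions. */
    protected abstract void prePutImpl(String row) throws IOException;

    /** Final entry point: only IOException can escape to the caller. */
    public final void prePut(String row) throws IOException {
        try {
            prePutImpl(row);
        } catch (IOException ex) {
            // Allowed to propagate unchanged.
            throw ex;
        } catch (Throwable ex) {
            // Everything else is logged and wrapped.
            System.err.println("prePut: caught " + ex);
            throw new IOException(ex);
        }
    }
}

class GuardedHookDemo {

    /** Returns whatever escapes prePut when the implementation throws the given failure. */
    static Throwable escapedFrom(final RuntimeException failure) {
        GuardedHook hook = new GuardedHook() {
            @Override
            protected void prePutImpl(String row) {
                throw failure;
            }
        };
        try {
            hook.prePut("row-1");
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        Throwable escaped = escapedFrom(new IllegalStateException("boom"));
        System.out.println("escaped: " + escaped);
    }
}
```

Keeping the public method `final` and routing subclasses through `prePutImpl` means the wrapping cannot be bypassed by accident.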
Coprocessors can hog memory
– Memory is shared between the RegionServer and the coprocessor
– Memory hogging slows RegionServer performance
Solutions – defensive Java code
– Profile all coprocessor code for memory usage
  • Use a generic profiler with a driver for your coprocessor
– Use common Java tricks for limiting memory usage
  • Use primitive types and underlying arrays where possible
  • Use immutable objects
  • StringBuilder vs String concatenation
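Two of the tricks above, primitive arrays and StringBuilder, are easy to show in isolation. The sketch below is illustrative only: the class and method names are hypothetical and nothing in it is HBase-specific.

```java
/** Hypothetical helpers illustrating two allocation-light idioms. */
class MemoryTricks {

    /** Sums values held in a primitive array: no boxing and no per-element objects. */
    static long sum(long[] values) {
        long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    /** Joins key parts with a single StringBuilder instead of repeated String concatenation in a loop. */
    static String joinKey(String[] parts, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sum(new long[] {1L, 2L, 3L}));
        System.out.println(joinKey(new String[] {"Key1", "Abc"}, '-'));
    }
}
```

A `List<Long>` would allocate one object per element, and `+=` on a String inside a loop creates a new String on every pass; both add garbage to the heap that the coprocessor shares with the RegionServer.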
Problems with deployment
– Manual Deployment
  • disable table
  • assign new coprocessor
  • enable table
– Rollout of non-backward-compatible coprocessor difficult
Solutions
– HBASE-7639 – online schema update is enabled, perhaps it will work
– Hard-code jar path in hbase-site.xml
  • Used by Apache Phoenix
  • Not the best approach for user-defined coprocessors
Logging and metrics tips
– Update the log4j.properties file with a separate log parameter for coprocessors
– Use MDC context to pass parameters to all parts of the coprocessor (http://www.slf4j.org/api/org/slf4j/MDC.html)
– Create an extra column in a Result to pass back an object populated with metrics
Unsolved issues
– Bad request can bring down the whole cluster
(Preventing) Abuses
Load Failures
– Missing jar will bring down the RegionServer
    ERROR org.apache.hadoop.hbase.coprocessor.CoprocessorHost: The coprocessor fooCoprocessor threw java.io.FileNotFoundException: File does not exist: /path/to/coprocessor.jar
    java.io.FileNotFoundException: File does not exist: /path/to/coprocessor.jar
– Affects all region servers – one at a time
– HTable descriptors contain coprocessor class: clean-up can be messy
  • HBASE-14190 - Assign system tables ahead of user region assignment
– Set table property: hbase.coprocessor.abortonerror to false
    2016-09-24 02:32:07,366 ERROR org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Failed to load coprocessor net.clayb.hbase.coprocessor.RegionObserver
    java.io.FileNotFoundException: File does not exist: hdfs://Test/user/foo/clayCoprocessor.jar
  (Region server stays alive; only the table stays disabled)
Handler Failure
– Affects some operations and not others (e.g. scan works, get does not)
– RPC starvation is a simple but non-obvious failure:
    public class RegionObserverInfinity extends BaseRegionObserver {
      public void preGetOp(…) throws IOException {
        for (;;) {
          LOG.trace("Off I go…");
        }
      }
    }
– Use jstack to see what is up in a region server:
    clay@hbase-regionserver:~$ sudo jstack 3990
    […]
    net.clayb.RegionObserverInfinity.preGetOp(…) @bci=12, line=28 (Compiled frame; information may be imprecise)
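One defensive option against a hook that spins forever is to give its loop a time budget and fail with an IOException once the budget is spent, so the RPC handler is released. The sketch below is an assumption-laden illustration, not something the slides prescribe: `BoundedWork`, `runWithBudget`, and the `BooleanSupplier` step are hypothetical, and the guard only helps when each individual step returns promptly.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper that bounds how long a polling loop may run. */
class BoundedWork {

    /**
     * Calls step until it returns true (meaning "done") or the budget runs out.
     * Returns the number of calls made; throws IOException on timeout.
     */
    static int runWithBudget(BooleanSupplier step, long budgetMillis) throws IOException {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        int calls = 0;
        while (true) {
            calls++;
            if (step.getAsBoolean()) {
                return calls;
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline > 0) {
                throw new IOException("hook exceeded its " + budgetMillis
                        + " ms budget after " + calls + " calls");
            }
        }
    }

    public static void main(String[] args) {
        try {
            // A step that never finishes, like the infinite loop on the slide.
            runWithBudget(() -> false, 50L);
        } catch (IOException e) {
            System.out.println("released handler: " + e.getMessage());
        }
    }
}
```

Throwing an IOException rather than an unchecked exception fits the earlier catch-all advice, since IOExceptions propagate to the client without unloading the coprocessor.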
Coprocessor Whitelisting
– Coprocessors are key to HBase operation:
  • AccessController
  • TokenProvider
  • SecureBulkLoadEndpoint
  • MultiRowMutationEndpoint
– hbase.coprocessor.user.enabled – setting it to false disables all user coprocessors (e.g. Apache Phoenix)
– HBASE-16700 – “Allow for coprocessor whitelisting” or abuse HBASE-15686
Recap
– Coprocessors are dangerous:
    “Coprocessors are an advanced feature of HBase and are intended to be used by system developers only.” – HBase Book
– Write defensive code!
– Needed from the community
  • Story for coprocessor deployment
  • Process isolation
  • JMX metrics
Thank you!