Saturday, March 3, 2012

Use JDB to debug beeswax

I needed to figure out who is called HiveMetaStore.get_partition_with_auth because it scans each partition one by one, and will be a performance issue when you have thousands partitions.
12/02/29 12:39:14 INFO ppd.OpProcFactory:  ((local_dt = '2011-11-01') and (local_dt < '2011-11-08'))
12/02/29 12:39:14 INFO ppd.OpProcFactory: Processing for TS(293)
12/02/29 12:39:14 INFO ppd.OpProcFactory: Pushdown Predicates of TS For Alias : lz.lz_omniture_hit
12/02/29 12:39:14 INFO ppd.OpProcFactory:  ((local_dt = '2011-11-01') and (local_dt < '2011-11-08'))
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_names : db=lz tbl=lz_omniture_hit
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_names : db=lz tbl=lz_omniture_hit 
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-01-01,AT]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-01-01,AT] 
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AR]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AR] 
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AT]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AT] 
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AU]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AU] 
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,BE]
Unfortunately, I cannot use Eclipse on the machine where I run beeswax server because X window is not installed. JDB is the only choice, but it really not a good tool for debugging. Here is how I use JDB to debug Beeswax server:
  1. Start Beeswax server in debug mode, add this in apps/beeswax/beeswax_server.sh before hadoop jar is called
    export HADOOP_OPTS="-Dlog4j.configuration=log4j.properties -Xdebug -Xrunjdwp:transport=dt_socket,address=4444,server=y,suspend=n"
    ...
    echo Executing $HADOOP_HOME/bin/hadoop jar $BEESWAX_JAR "$@"
    exec $HADOOP_HOME/bin/hadoop jar $BEESWAX_JAR "$@"
    
  2. Start JDB using jdb -attach 4444
  3. Add source file paths
    > use /etc/hive/conf:$HOME/git/hue/apps/beeswax/src/beeswax/../../../../desktop/conf:/usr/lib/hadoop/conf:/usr/lib/hadoop:$HOME/git/hue/apps/beeswax/java/src/main:$HOME/hadoop/hive-0.7.1-cdh3u1/src/ql:$HOME/hadoop/hive-0.7.1-cdh3u1/src/metastore:$HOME/hadoop/hive-0.7.1-cdh3u1/src/common
    
  4. List all threads: threads
    Beeswax-8[1] threads
    Group system:
      (java.lang.ref.Reference$ReferenceHandler)0xec0 Reference Handler                                                          cond. waiting
      (java.lang.ref.Finalizer$FinalizerThread)0xebf  Finalizer                                                                  cond. waiting
      (java.lang.Thread)0xebe                         Signal Dispatcher                                                          running
    Group main:
      (java.lang.Thread)0xec1                         main                                                                       running
      (java.util.TimerThread)0xebd                    Timer thread for monitoring ugi                                            cond. waiting
      (java.lang.Thread)0xebc                         MetaServerThread                                                           running
      (java.lang.Thread)0xebb                         Evicter                                                                    sleeping
      (java.lang.Thread)0xeba                         pool-2-thread-1                                                            running
      (org.apache.hadoop.util.Daemon)0xeb9            LeaseChecker                                                               sleeping
      (java.lang.Thread)0xeb8                         pool-3-thread-1                                                            running
      (org.apache.hadoop.util.Daemon)0xeb7            LeaseChecker                                                               sleeping
      (org.apache.hadoop.util.Daemon)0xeb6            LeaseChecker                                                               sleeping
      (org.apache.hadoop.util.Daemon)0xeb5            LeaseChecker                                                               sleeping
      (org.apache.hadoop.util.Daemon)0xed1            LeaseChecker                                                               sleeping
      (java.lang.Thread)0xef6                         Beeswax-8                                                                  running (at breakpoint)
      (org.apache.hadoop.ipc.Client$Connection)0xef9  IPC Client (47) connection to chelhadedw002/10.184.39.97:8020 from platdev cond. waiting
      (java.lang.Thread)0xefa                         sendParams-5                                                               cond. waiting
    
  5. Go to a thread: thread 0xebc
  6. Add breakpoints:
    • Stop at the specified line:
      stop at com.cloudera.beeswax.BeeswaxServiceImpl:689
    • Stop in a method:
      stop in com.cloudera.beeswax.BeeswaxServiceImpl.query
    • Stop in a method of nested class:
      stop in org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth
Here is what I found:
MetaServerThread[1] stop in org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth
Set breakpoint org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth
MetaServerThread[1] cont
>
Breakpoint hit: "thread=Beeswax-8", org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(), line=1,463 bci=0

Beeswax-8[1] where
  [1] org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth (HiveMetaStore.java:1,463)
  [2] org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo (HiveMetaStoreClient.java:657)
  [3] org.apache.hadoop.hive.ql.metadata.Hive.getPartition (Hive.java:1,293)
  [4] org.apache.hadoop.hive.ql.metadata.Hive.getPartition (Hive.java:1,258)
  [5] org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune (PartitionPruner.java:229)
  [6] org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan (GenMapRedUtils.java:551)
  [7] org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan (GenMapRedUtils.java:514)
  [8] org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.initPlan (GenMapRedUtils.java:125)
  [9] org.apache.hadoop.hive.ql.optimizer.GenMRRedSink1.process (GenMRRedSink1.java:76)
  [10] org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch (DefaultRuleDispatcher.java:89)
  [11] org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch (DefaultGraphWalker.java:88)
My Hive version is hadoop-hive-cdh3u1. The code of PartitionPruner is not good enough and it was replaced in hadoop-hive-cdh4b1.