Wednesday, June 5, 2013

Impala build steps on CentOS 6.3

  • Build boost-1.42.0
  • Before building impala
    • Change be/CMakeLists.txt. I removed all boost RPMs and built the boost libraries from source, so boost_date_time is installed as /usr/local/lib/libboost_date_time-mt.* and the build fails without this change. If you have the boost 1.41 RPMs installed, you may not need this change, but the build will then fail with other issues.
      diff --git a/be/CMakeLists.txt b/be/CMakeLists.txt
      index c14bd31..cd5abac 100644
      --- a/be/CMakeLists.txt
      +++ b/be/CMakeLists.txt
      @@ -224,7 +224,7 @@ set (IMPALA_LINK_LIBS
         ${LIBZ}
         ${LIBBZ2}
         ${AVRO_STATIC_LIB}
      -  -lrt -lboost_date_time
      +  -lrt -lboost_date_time-mt
       )
       
       if ("${CMAKE_BUILD_TYPE}" STREQUAL "CODE_COVERAGE")
      
    • Change build_public.sh so that it builds the release version and builds thirdparty by default, without having to pass -build_thirdparty on the command line:
      diff --git a/build_public.sh b/build_public.sh
      index 6ea491b..28b445a 100755
      --- a/build_public.sh
      +++ b/build_public.sh
      @@ -23,8 +23,8 @@ set -e
       # Exit on reference to uninitialized variable
       set -u
       
      -BUILD_THIRDPARTY=0
      -TARGET_BUILD_TYPE=Debug
      +BUILD_THIRDPARTY=1
      +TARGET_BUILD_TYPE=Release
       
       for ARG in $*
       do
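
Whether the be/CMakeLists.txt change above is needed depends on how the installed boost names its date_time library. A minimal sketch for checking this before building; the function name boost_date_time_flag is made up for illustration and is not part of the Impala build scripts:

```shell
#!/usr/bin/env bash
# Print the linker flag that matches the boost_date_time library found in a
# directory. boost_date_time_flag is a hypothetical helper, not an Impala script.
boost_date_time_flag() {
  local libdir="${1:-/usr/local/lib}"
  local f
  # Libraries installed with the -mt suffix: libboost_date_time-mt.*
  for f in "$libdir"/libboost_date_time-mt.*; do
    if [ -e "$f" ]; then
      echo "-lboost_date_time-mt"
      return 0
    fi
  done
  # Libraries installed without the suffix: libboost_date_time.*
  for f in "$libdir"/libboost_date_time.*; do
    if [ -e "$f" ]; then
      echo "-lboost_date_time"
      return 0
    fi
  done
  echo "no boost_date_time library in $libdir" >&2
  return 1
}
```

Calling boost_date_time_flag /usr/local/lib prints the name to use in IMPALA_LINK_LIBS, or fails if neither variant is present in that directory.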
      
  • After building, run shell/make_shell_tarball.sh. This generates a shell/build directory containing all the files for impala-shell.
  • Prepare the hadoop, hbase and hive config files; they can be copied from /var/run/cloudera-scm-agent/process.
  • Change bin/set-classpath.sh like this. The four fe/src/test and fe/target directory entries from the original script are dropped; prefixing them with # does not work inside a backslash-continued assignment, because the # ends up in the middle of the value instead of starting a comment.
    CLASSPATH=\
    $HOME/hadoop/hadoop-conf:\
    $HOME/hadoop/hbase-conf:\
    $HOME/hadoop/hive-conf:\
    $IMPALA_HOME/fe/target/impala-frontend-0.1-SNAPSHOT.jar:\
    ${HIVE_HOME}/lib/datanucleus-core-2.0.3.jar:\
    ${HIVE_HOME}/lib/datanucleus-enhancer-2.0.3.jar:\
    ${HIVE_HOME}/lib/datanucleus-rdbms-2.0.3.jar:\
    ${HIVE_HOME}/lib/datanucleus-connectionpool-2.0.3.jar:
    
    for jar in `ls ${IMPALA_HOME}/fe/target/dependency/*.jar`; do
      CLASSPATH=${CLASSPATH}:$jar
    done
    
    export CLASSPATH
    
    If impala-frontend.jar is not included, you might see:
    Exception in thread "main" java.lang.NoClassDefFoundError: com/cloudera/impala/common/JniUtil
    Caused by: java.lang.ClassNotFoundException: com.cloudera.impala.common.JniUtil
    
    Or this, if hadoop-conf is not in the classpath:
    E0605 09:32:23.236434  5272 impala-server.cc:377] Unsupported file system. Impala only supports DistributedFileSystem but the LocalFileSystem was found. fs.defaultFS(file:///) might be set incorrectly
    E0605 09:32:23.236655  5272 impala-server.cc:379] Impala is aborted due to improper configurations.
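
Both errors above come from classpath entries that are absent. A minimal sketch for spotting entries that are listed in CLASSPATH but do not exist on disk; the function name missing_classpath_entries is made up for illustration, and the check cannot detect an entry that was never listed at all:

```shell
#!/usr/bin/env bash
# Print every classpath entry that does not exist on disk.
# missing_classpath_entries is a hypothetical helper name.
# Returns 0 if all entries exist, 1 otherwise.
missing_classpath_entries() {
  local cp="${1:-${CLASSPATH:-}}"
  local entries=() entry status=0
  # Split on ':' without glob expansion
  IFS=':' read -r -a entries <<< "$cp"
  for entry in "${entries[@]}"; do
    if [ -z "$entry" ]; then
      continue   # skip empty entries produced by "::"
    fi
    if [ ! -e "$entry" ]; then
      echo "missing: $entry"
      status=1
    fi
  done
  return $status
}
```

After sourcing bin/set-classpath.sh, running missing_classpath_entries with no argument checks the exported CLASSPATH.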
    
  • Impalad flags (saved to a file and passed with --flagfile when starting impalad):
    -beeswax_port=21001
    -fe_port=21001
    -be_port=22001
    -hs2_port=21051
    -enable_webserver=true
    -mem_limit=-1
    -webserver_port=25001
    -state_store_subscriber_port=23001
    -default_query_options
    -log_filename=impalad
    -use_statestore=false
    -nn=5K04.corp.pivotlink.com
    -nn_port=8020
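
To see at a glance which ports the flags above refer to, the flag file can be summarized per port. A minimal sketch, assuming one flag per line in the form -name=value; the function name list_flag_ports is made up for illustration:

```shell
#!/usr/bin/env bash
# Print "port flag[,flag...]" for every *_port flag in a flag file,
# sorted by port number. list_flag_ports is a hypothetical helper name.
list_flag_ports() {
  local flagfile="$1"
  # Keep only lines such as -be_port=22001 or --be_port=22001,
  # rewritten as "22001 be_port"
  sed -n 's/^[[:space:]]*-\{1,2\}\([a-z0-9_]*_port\)=\([0-9][0-9]*\)[[:space:]]*$/\2 \1/p' "$flagfile" |
    sort -n |
    awk '{ flags[$1] = ($1 in flags) ? flags[$1] "," $2 : $2 }
         !seen[$1]++ { order[++n] = $1 }
         END { for (i = 1; i <= n; i++) print order[i], flags[order[i]] }'
}
```

Flags that share a value are printed on one line; with the flags listed above, beeswax_port and fe_port are both reported under 21001.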
    
  • Create a tarball of the Impala build by hand, because the open-source tree has no script for this.
    tar zcvf impala.tar.gz --exclude="*.class" --exclude="*.o" --exclude="impala/thirdparty" --exclude="impala/.git" --exclude="*.java" --exclude="*.cpp" --exclude="*.h" --exclude="expr-test" impala
  • Start impalad
    cd impala_home
    export IMPALA_HOME=$PWD
    bin/start-impalad.sh -build_type=release --flagfile=impalad_flags_path
    
  • Start impala-shell
    cd impala_home
    export IMPALA_HOME=$PWD
    export IMPALA_SHELL_HOME=$PWD/shell/build/impala-shell-1.0.1
    $IMPALA_SHELL_HOME/impala-shell -i impalad-host:21001
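
Since the tarball is built with a long list of excludes, it can help to confirm that the files used in the start steps are still there after unpacking. A minimal sketch; the function name check_impala_layout is made up for illustration, and the paths are the ones used in the steps above:

```shell
#!/usr/bin/env bash
# Check that the files used by the start steps exist under an Impala home.
# check_impala_layout is a hypothetical helper name.
# Usage: check_impala_layout <impala_home> <flag_file>
check_impala_layout() {
  local home="$1" flagfile="$2"
  local status=0 path
  for path in \
    "$home/bin/start-impalad.sh" \
    "$home/bin/set-classpath.sh" \
    "$home/shell/build/impala-shell-1.0.1/impala-shell" \
    "$flagfile"; do
    if [ ! -e "$path" ]; then
      echo "not found: $path" >&2
      status=1
    fi
  done
  return $status
}
```

Running check_impala_layout "$IMPALA_HOME" impalad_flags_path before bin/start-impalad.sh reports each missing file on stderr and returns non-zero if any is absent.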
    
