hadoop

Apache Beam Spark Runner example using Maven

In this post I will show you how to create an Apache Beam Spark Runner project using Maven. Tools/frameworks used: Java 8, Apache Spark, Maven, IntelliJ, Apache Beam. Add the Cloudera repository in your Maven settings.xml:
<repository> <id>cloudera</id> <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url> </repository>
Full settings.xml file:
<settings> <profiles> <profile> <id>cld</id> <repositories> <repository> <id>cloudera</id>…
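The excerpt is cut off before any project code appears, so here is only a rough sketch, not taken from the post: a minimal Java 8 pipeline hard-wired to the Spark runner. The class name, sample values and output path are invented for illustration, and it assumes the beam-sdks-java-core and beam-runners-spark artifacts are declared in the pom.

import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

public class BeamSparkRunnerExample {  // hypothetical class name
    public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        // Pin the pipeline to the Spark runner instead of the default DirectRunner.
        options.setRunner(SparkRunner.class);

        Pipeline p = Pipeline.create(options);
        p.apply(Create.of("hello", "beam", "on", "spark"))   // in-memory sample input
         .apply(MapElements.into(TypeDescriptors.strings())
                           .via((String word) -> word.toUpperCase()))
         .apply(TextIO.write().to("/tmp/beam-spark-output"));  // assumed output path

        p.run().waitUntilFinish();
    }
}

The same class can also be left runner-agnostic and launched with --runner=SparkRunner on the command line, which is usually the more flexible choice.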

Launch HBase MapReduce Job

Add all the required jars to a variable, separated by commas (,), like jar=a,b. In my case I want to supply all the HBase jars to the job:
jar=/home/hadoop/Desktop/techsquids/repos/techsquids/map-reduce/target/map-reduce-1.0-SNAPSHOT.jar
classpath=$jar
for f in $HBASE_HOME/lib/*.jar; do
  classpath="${classpath},$f"
done
Next prepare…
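The excerpt stops before the actual launch command and the job class itself, so the following is only a hedged sketch of the usual counterpart of such a script, not code from the post: a driver that goes through ToolRunner, so the comma-separated jar list can be passed as -libjars and GenericOptionsParser will ship the HBase jars with the job. The class, table and mapper names are invented for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HBaseMrDriver extends Configured implements Tool {  // hypothetical driver

    // Placeholder mapper: scans every row and does nothing with it.
    public static class NoOpMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context) {
            // real per-row logic would go here
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(getConf());
        Job job = Job.getInstance(conf, "hbase-mr-example");
        job.setJarByClass(HBaseMrDriver.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // larger scanner caching for full-table MR scans
        scan.setCacheBlocks(false);  // keep MR scans out of the region server block cache

        TableMapReduceUtil.initTableMapperJob(
                "my_table",          // assumed table name
                scan,
                NoOpMapper.class,
                NullWritable.class,
                NullWritable.class,
                job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);  // map-only, no file output

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner's GenericOptionsParser consumes -libjars, so the $classpath
        // list built in the script above reaches the task classpath.
        System.exit(ToolRunner.run(HBaseConfiguration.create(), new HBaseMrDriver(), args));
    }
}

With a driver like this, the launch would typically look something like hadoop jar $jar HBaseMrDriver -libjars $classpath (again, an assumed command line rather than the one from the post).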