Steps to Run:
- Compile "PiEstimatorKrb" using Maven:
- Make sure mvn is on your PATH.
- cd PiEstimatorKrb
- Edit the pom.xml and set the Hadoop dependencies to match the version of Hadoop you are using.
- mvn clean install
- Copy target/PiEstimatorKrb-1.0.jar to pi_load_test/job/lib.
- Put the "pi_load_test" directory into HDFS: "hadoop fs -put pi_load_test /some/path".
- Edit "launch_jobs.sh" and set the following variables at the top of the script:
- HDFS_BASE: Set to the location where you put pi_load_test. Note that HDFS_BASE itself should not contain "pi_load_test".
- LAUNCH_HOME: Set to the location where you extracted this archive.
- MAPS: Set to the number of Pi maps you want to run.
- SAMPLES: Set to the number of Pi samples you want to run.
- SLEEP_TIME: Set to the number of seconds to wait between spawning each workflow.
- TOTAL_JOBS: Set to the total number of workflows you want to spawn.
- Run "./launch_jobs.sh" to start spawning jobs.
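Assuming a standard layout, the steps above boil down to the following command sequence (the HDFS path /user/me/loadtest is a placeholder; substitute your own):

```shell
# Build the custom PiEstimator jar and stage the workflow.
cd PiEstimatorKrb
mvn clean install
cp target/PiEstimatorKrb-1.0.jar ../pi_load_test/job/lib/
cd ..

# Upload the workflow directory to HDFS (placeholder path).
hadoop fs -put pi_load_test /user/me/loadtest

# Edit the variables at the top of launch_jobs.sh
# (HDFS_BASE=/user/me/loadtest, LAUNCH_HOME, MAPS, SAMPLES,
#  SLEEP_TIME, TOTAL_JOBS), then start spawning jobs:
./launch_jobs.sh
```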
------- File Descriptions -------------
config.sh - Gathers info from hadoop/conf, such as the NameNode and JobTracker URLs and whether Kerberos is enabled. Called from run.sh, status.sh, and jobs_list.sh.
jobid.txt - Generated by run.sh; not of much use, but it contains the job id of the most recently spawned workflow.
jobs_list.sh - Lists all Oozie jobs. To filter, run "jobs_list.sh RUNNING" (or any other status).
launch_jobs.sh - Loops, spawning Pi jobs.
PiEstimatorKrbSrc - Source code for the custom PiEstimator class used in the workflow.
pi_load_test - The workflow being spawned. It runs a single java action to launch PiEstimator.
run.sh - Called by launch_jobs.sh to kick off the workflow.
status.sh - Checks the status of a running workflow: "status.sh <workflow_id>".
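The way the variables configured in launch_jobs.sh drive its spawn loop can be sketched as follows. This is an assumed structure, not the script's actual contents: the variable names match the ones configured at the top of the real script, but the spawn_workflows helper is hypothetical, and the real loop calls run.sh directly.

```shell
#!/bin/sh
# Hypothetical sketch of launch_jobs.sh's main loop (assumed structure).
# Defaults here are illustrative; the real script sets these at the top.
MAPS=${MAPS:-10}
SAMPLES=${SAMPLES:-1000}
SLEEP_TIME=${SLEEP_TIME:-5}
TOTAL_JOBS=${TOTAL_JOBS:-20}

spawn_workflows() {
    # "$@" is the command that submits one workflow; in the real script
    # this would be ./run.sh, which in turn submits the Oozie job.
    n=0
    while [ "$n" -lt "$TOTAL_JOBS" ]; do
        "$@"                 # e.g. ./run.sh "$MAPS" "$SAMPLES"
        n=$((n + 1))
        sleep "$SLEEP_TIME"  # wait between spawning each workflow
    done
    echo "$n"                # number of workflows spawned
}
```

With SLEEP_TIME and TOTAL_JOBS set, spawn_workflows fires off one workflow every SLEEP_TIME seconds until TOTAL_JOBS workflows have been submitted.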