Some Little Bits: Efficient uploading of Jar files to your Hadoop cluster

Thursday, February 27, 2014

Copying fat Jar files up to your Hadoop cluster, so you can run jobs on production-sized data sets and find bottlenecks, can be painful when you want a quick turnaround whilst debugging.

Sometimes local mode just doesn't cut it.

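A good solution is rsync, whose delta-transfer algorithm sends only the parts of a file that differ from the copy already on the server, so re-uploading a rebuilt Jar can be much quicker than a full copy.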

Command:
rsync -avz /your_source_directory/somejob-0.0.1.jar login@servername:/target_directory/somejob-0.0.1.jar

Options:
-a archive mode (recursive; preserves permissions, timestamps and symlinks)
-v verbose mode (lists the files being transferred)
-z compress file data during the transfer
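
If you upload after every rebuild, the command above can be wrapped in a small script. This is only a rough sketch: the JAR and DEST values are the same placeholder paths as in the command above, and the upload_jar function and the DRY_RUN switch are hypothetical names introduced for illustration. It also adds rsync's --stats flag, which prints a transfer summary at the end.

```shell
#!/bin/sh
# Sketch of a re-upload helper. JAR and DEST are placeholders; override them
# from the environment to match your own build output and server.
JAR="${JAR:-/your_source_directory/somejob-0.0.1.jar}"
DEST="${DEST:-login@servername:/target_directory/somejob-0.0.1.jar}"

# upload_jar (hypothetical helper): sync the Jar to the server.
# DRY_RUN defaults to 1 in this sketch, so it only prints the command;
# run with DRY_RUN=0 to actually transfer the file.
upload_jar() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "rsync -avz --stats $JAR $DEST"
  else
    rsync -avz --stats "$JAR" "$DEST"
  fi
}

upload_jar
```

When run for real, the --stats summary includes "Literal data" (bytes actually sent) and "Matched data" (bytes reused from the copy already on the server), which is a quick way to check how much the incremental transfer is saving you.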
