
Saturday, May 24, 2014

Hadoop DistCp



DistCp (distributed copy) is a tool for copying large amounts of data between clusters or within a single cluster. It uses MapReduce to distribute the copy work.

To copy between clusters, run DistCp with one or more source URIs followed by the destination URI:
$ hadoop distcp hdfs://nn1:8020/basePath hdfs://nn2:8020/basePath
$ hadoop distcp hdfs://nn1:8020/basePath1 hdfs://nn1:8020/basePath2 hdfs://nn2:8020/basePath
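$ hadoop distcp -f hdfs://nn1:8020/srclist hdfs://nn2:8020/basePath
Here srclist is a file that lists the source URIs, one per line: hdfs://nn1:8020/basePath1 and hdfs://nn1:8020/basePath2. The -f option tells DistCp to read its sources from that file.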
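
As a rough sketch of how such a source list might be prepared, the script below writes the two source URIs to a local file and then prints the upload and copy commands. The local file name ./srclist and the DRY_RUN switch are assumptions made for illustration; the hostnames and paths are the placeholders used in this post.

```shell
#!/bin/sh
# Sketch: build a source list for `hadoop distcp -f`.
# SRCLIST_LOCAL is a hypothetical local file name chosen for this example.
SRCLIST_LOCAL="${SRCLIST_LOCAL:-./srclist}"

# One fully qualified source URI per line.
cat > "$SRCLIST_LOCAL" <<'EOF'
hdfs://nn1:8020/basePath1
hdfs://nn1:8020/basePath2
EOF

# With DRY_RUN=1 (the default in this sketch) commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Upload the list to the source cluster, then pass it to DistCp with -f.
run hadoop fs -put "$SRCLIST_LOCAL" hdfs://nn1:8020/srclist
run hadoop distcp -f hdfs://nn1:8020/srclist hdfs://nn2:8020/basePath
```

Setting DRY_RUN=0 would execute the two Hadoop commands instead of printing them, which requires a configured hadoop client on the PATH.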

The first command expands the namespace under /basePath on nn1 into a temporary file, partitions its contents among a set of map tasks, and starts a copy on each TaskTracker from nn1 to nn2.
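
Because the copy is split across map tasks, the number of simultaneous copies can be capped with DistCp's -m option. The following is only a sketch: the value 20 is an arbitrary illustration, and the MAPS and RUN_DISTCP variables are names invented for this example.

```shell
#!/bin/sh
# Sketch: cap the number of map tasks DistCp uses for a copy.
MAPS="${MAPS:-20}"    # illustrative value, not a recommendation
SRC="hdfs://nn1:8020/basePath"
DST="hdfs://nn2:8020/basePath"

DISTCP_CMD="hadoop distcp -m $MAPS $SRC $DST"
echo "$DISTCP_CMD"

# Run the copy only when explicitly requested and a hadoop client is on PATH.
if [ "${RUN_DISTCP:-0}" = "1" ] && command -v hadoop >/dev/null 2>&1; then
    $DISTCP_CMD
fi
```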

To copy between different versions of Hadoop, run DistCp on the destination cluster only and read the source over HFTP, which is a read-only file system:

$ hadoop distcp hftp://cdh3-namenode:50070/ hdfs://cdh4-namenode/

To copy only a specific path, such as /hbase, append it to both URIs:
$ hadoop distcp hftp://cdh3-namenode:50070/basePath hdfs://cdh4-namenode/basePath

Here cdh3-namenode is the source NameNode hostname, as set in fs.default.name, and 50070 is the source NameNode HTTP port, as set in dfs.http.address. cdh4-namenode is the destination NameNode, as set in fs.defaultFS; you can also use the destination nameservice ID instead. basePath in both URIs is the directory to copy.
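
To make the mapping from configuration values to URIs concrete, here is a minimal sketch that assembles both URIs from their parts and prints the resulting command. The variable names are invented for this example, and the values are the placeholders from this post; substitute the values from your own cluster configuration.

```shell
#!/bin/sh
# Sketch: assemble the source and destination URIs for a cross-version copy.
SRC_HOST="cdh3-namenode"        # host from fs.default.name on the source cluster
SRC_HTTP_PORT="50070"           # port from dfs.http.address on the source cluster
DST_FS="hdfs://cdh4-namenode"   # fs.defaultFS on the destination cluster
COPY_PATH="/basePath"           # directory to copy

SRC_URI="hftp://${SRC_HOST}:${SRC_HTTP_PORT}${COPY_PATH}"
DST_URI="${DST_FS}${COPY_PATH}"

# Print the command to run on the destination cluster.
echo "hadoop distcp $SRC_URI $DST_URI"
```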

