hadoop - How could I relate Amazon EC2, S3 and my HDFS?


I am learning Hadoop in pseudo-distributed mode, so I am not very familiar with clusters. When I read about clusters, I found that S3 is a data storage service and EC2 is a computing service, but I couldn't understand the real use of them. Is HDFS available in S3? If yes: while learning Hive I came across moving data from HDFS to S3, and it was mentioned as archival logic.

hadoop distcp /data/log_messages/2011/12/02 s3n://ourbucket/logs/2011/12/02 

If my HDFS data has landed on S3, how is that beneficial? This might be a silly question, but if someone could give me an overview it would be very helpful.

S3 is just storage; no computation is allowed on it. You can think of S3 as a bucket that holds data, which you can retrieve using its API. If you are using AWS/EC2, your Hadoop cluster runs on the EC2 machines, which is a different thing from S3. HDFS is just the file system in Hadoop, designed to maximize input/output performance.

The command you shared is a distributed copy. It copies data from your HDFS to S3. In short, EC2 will have HDFS as the default file system in the Hadoop environment, and you can move archived or unused data to S3, because S3 storage is cheaper than EC2 machines.
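As a rough illustration of that archival step, here is a minimal sketch that only builds the distcp command for one day of logs and prints it, so you can review it before running anything. The function name `archive_cmd` is made up for this example; the paths and the bucket name `ourbucket` are taken from the command in the question. Note that newer Hadoop releases use the `s3a://` scheme in place of `s3n://`, so adjust the scheme to whatever your installation supports.

```shell
#!/bin/sh
# Build (but do not run) the distcp command that archives one day of
# logs from HDFS to S3. archive_cmd is a hypothetical helper; the paths
# and bucket name follow the example in the question.
archive_cmd() {
    day="$1"                            # date partition as YYYY/MM/DD
    src="/data/log_messages/${day}"     # source directory in HDFS
    dst="s3n://ourbucket/logs/${day}"   # destination prefix in S3
    echo "hadoop distcp ${src} ${dst}"
}

# Print the command for review only; nothing is copied here.
archive_cmd 2011/12/02
```

Swapping the two paths in the printed command gives the reverse copy, which brings archived data back from S3 into HDFS when you need to process it again.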

