Tag Archives: hoop

HttpFS for CDH3 – The Apache Hadoop FileSystem over HTTP

Categories: CDH General HDFS

HttpFS is an HTTP gateway/proxy for Apache Hadoop FileSystem implementations. HttpFS comes with CDH4 and replaces HdfsProxy (which only provided read access). Its REST API is compatible with WebHDFS (which is included in CDH4 and the upcoming CDH3u5).

HttpFS is a proxy, so unlike WebHDFS it does not require clients to be able to reach every machine in the cluster. This allows clients to access a cluster that is behind a firewall via the WebHDFS REST API.
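
As a rough illustration of what a client request through the gateway might look like, the sketch below only builds WebHDFS-style URLs and does not contact any server. The host name, user, and file path are hypothetical; port 14000 is assumed to be the HttpFS listening port, and `user.name` is assumed to be the pseudo-authentication query parameter in use on the cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, user, **params):
    """Build a WebHDFS-style REST URL for a file system path.

    host, port, and user are placeholders chosen for this sketch;
    they are not values taken from the original post.
    """
    # Operation and user are passed as query parameters.
    query = urlencode({"op": op, "user.name": user, **params})
    # The file system path is appended after the /webhdfs/v1 prefix.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical gateway host; the client only ever talks to this one machine.
read_url = webhdfs_url("httpfs.example.com", 14000,
                       "/user/alice/data.txt", "OPEN", "alice")
list_url = webhdfs_url("httpfs.example.com", 14000,
                       "/user/alice", "LISTSTATUS", "alice")
print(read_url)
print(list_url)
```

Because every URL points at the same gateway host, only that host needs to be reachable from outside the firewall.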


Hoop – Hadoop HDFS over HTTP

Categories: Community HDFS

What is Hoop?

Hoop provides access to all Hadoop Distributed File System (HDFS) operations (read and write) over HTTP/S.

Hoop can be used to:

  • Access HDFS using HTTP REST.
  • Transfer data between clusters running different versions of Hadoop (thereby overcoming RPC versioning issues).
  • Access data in an HDFS cluster behind a firewall. The Hoop server acts as a gateway and is the only system allowed to pass through the firewall.
