From A to Z in 10 minutes! If you have previous experience setting up, sizing, and deploying a distributed search engine service, it is hard to believe that this is possible. Imagine how many times IT has lost valuable time spending hours trying to understand Apache Solr application requirements and map them into how to […]
Background: Apache ZooKeeper is a core infrastructure component in the Apache Hadoop stack and is also widely used by many companies for service discovery, configuration management, and so on. Previously, ZooKeeper did not support authentication and authorization of servers participating in the leader election and quorum forming process; ZooKeeper assumed that every server that […]
Thanks to Karthik Vadla, Abhi Basu, and Monica Martinez-Canales of Intel Corp. for the following guest post about using CDH for cost-effective processing/indexing of DICOM (medical) images. Medical imaging has rapidly become the best non-invasive method to evaluate a patient and determine whether a medical condition exists. Imaging is used to assist in the diagnosis […]
Apache ZooKeeper is a client/server system for distributed coordination that exposes an interface similar to a filesystem, where each node (called a znode) may contain data and a set of children. Each znode has a name and can be identified using a filesystem-like path (for example, /root-znode/sub-znode/my-znode). In Apache HBase, ZooKeeper coordinates, communicates, and shares state […]
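The znode model described in this excerpt can be pictured with a small in-memory sketch. This is a toy illustration only, not the ZooKeeper client API: the `ZNode` class and the `create` and `get` helpers are hypothetical names invented here, and unlike a real ZooKeeper server this sketch silently creates missing parent nodes.

```python
class ZNode:
    """Toy stand-in for a znode: a name, a data payload, and named children."""

    def __init__(self, name, data=b""):
        self.name = name
        self.data = data
        self.children = {}  # child name -> ZNode


def _parts(path):
    # Split a filesystem-like path into its non-empty components.
    return [p for p in path.split("/") if p]


def create(root, path, data=b""):
    """Store data at `path`, creating missing parents with empty data.

    Assumption for illustration: parents are created implicitly here.
    """
    node = root
    for part in _parts(path):
        node = node.children.setdefault(part, ZNode(part))
    node.data = data
    return node


def get(root, path):
    """Return the node at `path`; raises KeyError if any component is missing."""
    node = root
    for part in _parts(path):
        node = node.children[part]
    return node


root = ZNode("")
create(root, "/root-znode/sub-znode/my-znode", b"hello")
leaf = get(root, "/root-znode/sub-znode/my-znode")
```

Here every node on the path can hold both data and children, which is the main way the model differs from a conventional filesystem, where a directory holds no payload of its own.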
In Part 1 of this series about Apache HBase snapshots, you learned how to use the new Snapshots feature and a bit of the theory behind its implementation. Now it’s time to dive more deeply into the technical details. What is a Table? An HBase table comprises a set of metadata and a set […]