At Cloudera, we’re always working to make it easier for you to work with Hadoop and integrate Hadoop-based systems with your existing data sources. One example of how we accomplish this is Sqoop, a database import tool developed at Cloudera that allows you to easily copy data between databases and HDFS. We originally announced this tool in June, but we’ve been steadily improving it since then. It can now talk with several more databases than before.
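To give a feel for what a Sqoop import looks like, here is a minimal sketch of pulling a table from a relational database into HDFS. The connection string, username, table, and target directory are all hypothetical placeholders, and the exact flags may vary between Sqoop releases:

```shell
# Import the hypothetical "orders" table from a MySQL database into HDFS.
# All names here (host, database, user, paths) are illustrative only.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username reporting \
  --table orders \
  --target-dir /user/hadoop/orders
```

Sqoop inspects the table’s schema via JDBC and runs the copy as a MapReduce job, so the import parallelizes across the cluster rather than streaming through a single client.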
Today’s Hadoop World video comes from Ed Capriolo, and goes into details about how to effectively monitor Hadoop in production environments. Thanks Ed, and stay tuned for more!
Every day, we hear about people doing amazing things with Apache Hadoop. The variety of applications across industries is clear evidence that Hadoop is radically changing the way data is processed at scale. To drive that point home, we’re excited to host a guest blog post from the University of Maryland’s Michael Schatz. Michael and his team have built a system using Hadoop that drives the cost of analyzing a human genome below $100 —
Update (May 1, 2013): The post below, which is based on an outdated VM, is deprecated. Please see the Cloudera QuickStart VM instead, which runs on VirtualBox, VMware, and KVM.
Cloudera’s Training VM is one of the most popular resources on our website. It was created with VMware Workstation, and plays nicely with the VMware Player for Windows, Linux, and Mac. But VMware isn’t for everyone. Thomas Lockney has managed to get our VM image running on VirtualBox,
Disclaimer: Cloudera no longer approves of the recommendations in this post. Please see this documentation for configuration recommendations.
One of the things we get a lot of questions about is how to make Hadoop highly available. There is still a lot of work to be done on this front, but we wanted to take a moment to share the best practices from one of our customers. Check out what Paul George has to say about how they keep their NameNode up at ContextWeb.