Cloudera Developer Blog · Community Posts
Strata Conference + Hadoop World 2013 (Oct. 28-30 in New York City) approaches (register here for an automatic 20% discount), and that means it’s time to get your meetup schedule sorted out!
There are a variety of them planned across the week (something for everyone!), onsite at the conference hotel as well as offsite. Use the links below to RSVP.
Welcome to our second edition of “This Month in the Ecosystem.” (See the inaugural edition here.) Although August was not as busy as July, there are some very notable highlights to report.
Today, I thought it would be helpful to highlight some features that will help you get the most out of this new service:
Strata Conference + Hadoop World 2013 is looming on the horizon and is on pace to be the largest gathering of Big Data professionals on the globe. As co-hosts with O’Reilly, we have seen the conference thrive and grow, and we are excited about the upcoming Oct. 28-30 event!
The ecosystem is evolving at a rapid pace, so rapidly that important developments often pass out of public attention too quickly. Thus, we think it might be helpful to bring you a digest (by no means complete!) of our favorite highlights on a regular basis. (This effort, by the way, has different goals than the fine Hadoop Weekly newsletter, which has a more expansive view, and which you should subscribe to immediately, as far as we’re concerned.)
Find the first installment below. Although the time period reflected here is obviously more than a month long, we have some catching up to do before we can move to a truly monthly cadence.
Cloudera Impala has made huge progress since its initial announcement – and there’s even more good news on the roadmap. To learn more, plan to attend an Impala meetup hosted by Cloudera in its San Francisco offices on the evening of Aug. 20:
We’re very happy to re-publish the following post from Twitter analytics infrastructure engineering manager Dmitriy Ryaboy (@squarecog).
OSCON 2013 is already receding in the rear-view mirror, but we had a great time. Cloudera’s sessions were very well attended, with Tom Wheeler taking the prize (well over 200 attendees for his “Introduction to Apache Hadoop” tutorial), but best of all was the opportunity to meet and mingle with people in the broader open source community. If you visited us at Booth 420, we hope that you will download and install the QuickStart VM now that you have seen it in our demo, and that your questions were adequately answered (most popular question: “Can you tell me more about Cloudera Impala?”).
In my biased opinion, the crowning achievement was our ability not only to distribute a couple hundred “Data is the New Bacon” T-shirts within a 36-hour period, but also to clean ourselves out of the meat-free version shortly thereafter:
This is a great day for technical end users: developers, admins, analysts, and data scientists alike. Starting now, Cloudera complements its traditional mailing lists with new, feature-rich community forums intended for users of Cloudera’s Platform for Big Data! (Log in using your existing credentials or click the link to register.)
Although mailing lists have long been a standard for user interaction, and will undoubtedly continue to be, they have flaws. For example, they lack structure or taxonomy, which makes consumption difficult. Search functionality is often less than stellar, and users are unable to build reputations that span an appreciable period of time. For these reasons, although they’re easy to create and manage, mailing lists inherently limit access to knowledge and hence limit adoption.
Continuing the fine tradition of Clouderans contributing books to the Apache Hadoop ecosystem, Apache Sqoop Committers/PMC Members Kathleen Ting and Jarek Jarcec Cecho have officially joined the book author community: their Apache Sqoop Cookbook is now available from O’Reilly Media (with a pelican as the assigned cover beast).
The book arrives at an ideal time. Hadoop has quickly become the standard for processing and analyzing Big Data, and in order to integrate a new Hadoop deployment into your existing environment, you will very likely need to transfer data stored in legacy relational databases into your new cluster.