Installing Scribe For Log Collection

Categories: Data Ingestion

Scribe is a newly released log collection tool that ships log files from the nodes in a cluster to Scribe servers, where the logs are stored for further use. Facebook describes their usage of Scribe by saying, “[Scribe] runs on thousands of machines and reliably delivers tens of billions of messages a day.” It turns out that Scribe is rather difficult to install, so this post aims to help those of you attempting it. The first step is to get the dependencies installed.

Scribe has several dependencies that must be installed before Scribe itself will build properly:

  • ruby (ruby and ruby-dev)
  • python (python and python-dev)
  • libevent (libevent and libevent-dev)
  • boost v1.36
  • Thrift
  • fb303 (included in Thrift in contrib/fb303)

The order in which these are installed is important. First install libevent, then libevent-dev, then boost, then Thrift, and finally fb303. I installed libevent and libevent-dev from RPMs, and built boost, Thrift, and fb303 from source. I was unable to get Scribe, Thrift, and fb303 to locate the boost libraries and includes with the default boost install directories, so I installed boost under /usr/local/boost, with its binaries, libraries, and headers in /usr/local/boost/bin, /usr/local/boost/lib, and /usr/local/boost/include. Run ‘./configure --help’ when configuring boost to see how to specify these directories.
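As a rough sketch of the boost step, the snippet below assembles a configure command for the /usr/local/boost layout described above. The flag names (--prefix, --exec-prefix, --libdir, --includedir) are my assumption about what the boost 1.36 configure script accepts, so check each one against the output of ./configure --help before relying on it.

```shell
# Assumed install prefix, matching the layout described above.
BOOST_PREFIX=/usr/local/boost

# Flag names are assumptions; confirm each one against `./configure --help`.
BOOST_CONFIGURE_ARGS="--prefix=$BOOST_PREFIX --exec-prefix=$BOOST_PREFIX --libdir=$BOOST_PREFIX/lib --includedir=$BOOST_PREFIX/include"

# Print the command for review before running it from the boost source root.
echo "./configure $BOOST_CONFIGURE_ARGS"
```

Printing the command first makes it easy to compare the flags with the help output before anything is built or installed.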

When configuring Thrift and fb303, you must specify the location of boost with the “--with-boost=/path/to/boost/root” option and also set the BOOST_ROOT environment variable.
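A minimal sketch of that step follows, assuming boost lives in /usr/local/boost as described earlier. The helper name configure_with_boost and the source paths in the usage comments are hypothetical; only the --with-boost option and the BOOST_ROOT variable come from the text above.

```shell
# Assumed boost root, matching the install location used earlier in this post.
export BOOST_ROOT=/usr/local/boost

# Hypothetical helper: run ./configure inside a source directory,
# pointing it at the boost root. The subshell keeps the caller's
# working directory unchanged.
configure_with_boost() {
    ( cd "$1" && ./configure --with-boost="$BOOST_ROOT" )
}

# Example usage (placeholder paths for wherever the sources were unpacked):
#   configure_with_boost /path/to/thrift
#   configure_with_boost /path/to/thrift/contrib/fb303
```

Thrift goes first and fb303 second, following the install order given above.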

Finally, you must make sure your LD_LIBRARY_PATH environment variable contains the lib folders that house the Thrift, fb303, boost, and libevent C++ libraries. LD_LIBRARY_PATH follows the same pattern as the PATH variable: directories that contain libraries are separated by colons. If you forget to set LD_LIBRARY_PATH, you’ll get the following error when running scribed:

scribed: error while loading shared libraries: cannot open shared object file: No such file or directory
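One way to build up LD_LIBRARY_PATH without adding duplicate entries is sketched below. The helper name add_lib_dir is hypothetical, and the two directories are assumptions: /usr/local/boost/lib matches the boost location used in this post, while /usr/local/lib is only a guess at where a from-source Thrift and fb303 install would land.

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH
# unless it is already listed.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1" ;;
    esac
    export LD_LIBRARY_PATH
}

# Assumed library locations; adjust them to your own install prefixes.
add_lib_dir /usr/local/boost/lib
add_lib_dir /usr/local/lib
```

Running ldd "$(command -v scribed)" afterwards lists the shared libraries the binary needs and marks any that still cannot be found.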

Install Scribe
Once you’ve successfully installed all dependencies, installing Scribe is easy. Scribe ships with a fairly comprehensive README file, but its instructions for boost’s configuration are slightly incorrect. I needed to pass the option as “--with-boost=/usr/local/boost”, with an equals sign, while configuring, whereas the README says to use “--with-boost /usr/local/boost”. Here is my full configure statement:

./configure --with-boost=/usr/local/boost --with-boost-system=boost_system-gcc41-mt-1_36 --with-boost-filesystem=boost_filesystem-gcc41-mt-1_36
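The values passed to --with-boost-system and --with-boost-filesystem look like the library file names with the leading "lib" and the file extension removed. Assuming that reading is right, a small helper can derive them from whatever is in your boost lib directory. The function name boost_link_name and the example file name are hypothetical.

```shell
# Hypothetical helper: turn a library file path into the bare name
# by dropping the directory, the "lib" prefix, and the .so/.a suffix.
boost_link_name() {
    name=$(basename "$1")
    name=${name#lib}      # drop the leading "lib"
    name=${name%%.so*}    # drop ".so" and any version suffix after it
    name=${name%.a}       # drop ".a" for static libraries
    echo "$name"
}

# Example with an assumed file name:
boost_link_name /usr/local/boost/lib/libboost_system-gcc41-mt-1_36.so
# prints: boost_system-gcc41-mt-1_36
```

The gcc41 and 1_36 parts vary with the compiler and boost version, so list /usr/local/boost/lib to see the names on your own machine.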

We installed Scribe from trunk instead of releases/scribe-2.0.

Configure Scribe
Scribe ships with a few good examples in $SCRIBE_ROOT/examples; just take a look at the README in the examples directory, and you should be ready to rock.  However, the README doesn’t document that the scribe_ctrl and scribe_cat programs are in the examples directory.

I hope this tutorial was helpful!  Send us an email if you have any issues.  We’ll make a follow-up post later talking about more in-depth Scribe configurations and benchmarking.

