Posts

Showing posts from August, 2013

An impatient start with Cascading

Over the last couple of years of working with Hadoop, I have heard in many blogs and at conferences about Cascading, a framework on top of Hadoop for implementing ETL. In our lean start-up project, however, we decided to use Pig and built our data flow on it. Recently I got a new book from O'Reilly Media, "Enterprise Data Workflows with Cascading", and finished two interesting chapters in one breath. This weekend I managed to find a couple of hours to try some of the examples from the book. The author, Paco Nathan, explains very nicely why and when you should use Cascading instead of Pig or Hive, and he also gives examples to try at home. Today's post is about my first impression of Cascading. All the examples from the book can be found on GitHub. I cloned the project from GitHub and was ready to run the examples. The Impatient project is built with Gradle. I ran gradle clean jar and got stuck with the following errors: Could not resolve all