Sunday, March 9, 2014

Fast queries over large data

This past week at  I talked with Armando Fox about projects, MOOCs, and the connection and engagement of teams (including, but not limited to, students).

On the data front, this post from a little more than a year ago:

talks about Google's Dremel, where Armando said:
‘Before Dremel, no one had really done a system that was that big and that fast. Usually, you have to do one or the other. The more you do one, the more you have to give up on the other. But with Dremel, they did both.’

The OpenDremel system, which the Wired article references, has been merged with Apache Drill, which has reached milestone-1.

While Drill is for querying across HBase, Mongo, and Cassandra, Apache Spark is about significantly speeding up Map-Reduce-style processing compared to simply using Hadoop.
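
For anyone who hasn't seen the Map-Reduce pattern that Spark aims to speed up, here is a minimal sketch in plain Python. It uses neither Hadoop nor Spark; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are made up for illustration, and a real cluster would spread each phase across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group the emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single count."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}


def word_count(lines):
    """Hypothetical driver: run map, shuffle, and reduce in sequence."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    # Illustrative input only, not real data.
    sample = ["big and fast", "big data fast queries"]
    print(word_count(sample))
```

Broadly speaking, Hadoop's Map-Reduce writes intermediate results to disk between jobs, while Spark can keep working data in memory across steps, which is where much of the speedup is generally said to come from.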
