big data cloud image?

Thu Jan 2 14:41:42 UTC 2014

On 12/17/2013 04:25 PM, Sam Kottler wrote:
>
>
> ----- Original Message -----
>> From: "Matthew Miller" <mattdm at fedoraproject.org>
>> To: bigdata at lists.fedoraproject.org
>> Sent: Tuesday, December 17, 2013 4:19:19 PM
>> Subject: big data cloud image?
>>
>> Hello, Big Data SIG. As we're looking at use cases for the future of Fedora
>> Cloud, this is one thing that came up. Would it be useful for us to provide
>> an image preinstalled with Hadoop or some other of the Big Data software,
>> possibly with some orchestration layer for putting together a cluster built
>> from these images?
>>
>> If this seems like a good idea in general, what do you think it would look
>> like in specific?
>
> Seems like one of the most obvious things that could be included is a
> well-tuned JVM; one of the hardest things to get right as someone who
> is just starting out with Hadoop/Storm/Kafka/HBase. That being said,
> the characteristics for each of those tools might be different so
> that's another issue.

it would definitely be useful to provide an image w/ big data software 
pre-installed. even installing the jvm ahead of time can significantly 
reduce the time it takes to get the software up and running, not to 
mention how big the hadoop dep graph is!

we're hoping to include ambari in f21 (just trying to pass some upstream 
+ fedora hurdles right now), which will help w/ orchestration, but 
that's only one path. ambari will use puppet recipes, but there's also 
chef to consider.

it would be ideal if we could have the big data software delivered to 
users via cloud images (including nocloud for virt-install and vagrant) 
as well as containers through the docker index.

best,

matt