Big Data SIG Meeting Minutes: 2013-03-07

Robyn Bergeron robyn.bergeron at gmail.com
Thu Mar 7 18:40:32 UTC 2013


Howdy,

Minutes from today's meeting follow below.

Had a really interesting conversation - though I think we didn't quite
get through much of the agenda :) We did discuss some categories,
things already packaged, etc. - I'm on duty to make a sub-wiki-page so
we can collectively document some of these things.

Also had some talk around packaging Apache Bigtop as well as the
intersections with that community - as many of their folks apparently
use Fedora (YAY).

Anyway: Keep your intros, thoughts coming - I suspect we'll be
gathering more humans in the coming weeks.

Thanks for coming!

-Robyn


Minutes: http://meetbot.fedoraproject.org/fedora-meeting-1/2013-03-07/big_data_sig.2013-03-07-16.59.html
Full logs: http://meetbot.fedoraproject.org/fedora-meeting-1/2013-03-07/big_data_sig.2013-03-07-16.59.log.html

===============================
#fedora-meeting-1: Big Data SIG
===============================


Meeting started by rbergeron at 16:59:44 UTC. The full logs are
available at
http://meetbot.fedoraproject.org/fedora-meeting-1/2013-03-07/big_data_sig.2013-03-07-16.59.log.html
.

Meeting summary
---------------
* Who's around for fun?  (rbergeron, 17:00:05)
  * present: rbergeron, tflink  (rbergeron, 17:01:15)
  * present: witlessb  (rbergeron, 17:01:45)
  * present: threebean, zoglesby, samkottler, jsmith  (rbergeron,
    17:02:18)

* Agenda for today's first meeting :D  (rbergeron, 17:03:49)
  * LINK:
    http://lists.fedoraproject.org/pipermail/bigdata/2013-March/000003.html
    (rbergeron, 17:04:32)
  * Agenda looks like: What this is all about, what do we have, what
    don't we have, what is anyone here interested in doing :)
    (rbergeron, 17:05:09)

* What's the Big Data SIG all about?  (rbergeron, 17:05:57)
  * loosely quoting from o'reilly: "If the size of your data is part of
    the problem, it's Big Data."  (rbergeron, 17:07:57)
  * IDEA: one part of it is getting a decent setup; other part is
    understanding tools and approaches needed  (rbergeron, 17:11:42)
  * IDEA: two pieces you hear most about in Big Data seem to be massive
    storage, & parallel computing (hadoop, column databases, etc)
    (rbergeron, 17:12:16)
  * IDEA: another component is online processing or online analysis -
    predicting what is trending before it's had time to hit disk
    (rbergeron, 17:13:29)
  * IDEA: for ex. - idea that google can predict flu outbreaks faster
    than public health agencies by watching search terms; the concept
    applies to financial tools as well  (rbergeron, 17:15:49)
  * IDEA: another ex. of stream processing is twitter analytics -
    looking for emerging topics in twitter streams  (rbergeron,
    17:16:52)

* What are the buckets or categories, and what do we have?  (rbergeron,
  17:18:21)
  * IDEA: orchestration, batch processing, stream processing are
    categories that come to mind - orch (zookeeper), batch (hadoop,
    disco), stream (storm)  (rbergeron, 17:21:23)
  * IDEA: storage is another category  (rbergeron, 17:22:25)
  * IDEA: full hadoop stack seems to be thought of as useful foundation
    layer for some types of work, but HDFS is getting attention as weak
    spot, with nosql dbs and gluster being used as alternatives
    (rbergeron, 17:23:00)
  * hdfs is part of the hadoop project; one can install hdfs without
    using the mapreduce part  (rbergeron, 17:28:05)
  * lots of java libraries, servers like tomcat (used by solr, oozie)
    are already in  (rbergeron, 17:44:01)
  * IDEA: we have pandas (not the animal, http://pandas.pydata.org) -
    useful for data analysis, not really big data  (rbergeron, 17:49:40)
  * ACTION: rbergeron to add a sub-page of packages we have (unless
    someone beats me to it)  (rbergeron, 17:50:18)
  * IDEA: interest in disco - seems to be more python-friendly than
    hadoop (though we are aware that there are python wrappers for
    hadoop)  (rbergeron, 18:08:03)
  * spring is packaged - could package spring-hadoop  (rbergeron,
    18:11:24)
  * ACTION: bmahe to expound on openjdk/bug filing, as well as the wide
    world of bigtop packaging, as time permits :)  (rbergeron, 18:12:51)

* Operation Agenda: Yeah...  (rbergeron, 18:15:50)
  * ACTION: rbergeron to prod in meeting notes to get people to talk re:
    what would we like to do (we==they)  (rbergeron, 18:23:54)

Meeting ended at 18:26:13 UTC.


Action Items
------------
* rbergeron to add a sub-page of packages we have (unless someone beats
  me to it)
* bmahe to expound on openjdk/bug filing, as well as the wide world of
  bigtop packaging, as time permits :)
* rbergeron to prod in meeting notes to get people to talk re: what
  would we like to do (we==they)


Action Items, by person
-----------------------
* bmahe
  * bmahe to expound on openjdk/bug filing, as well as the wide world of
    bigtop packaging, as time permits :)
* rbergeron
  * rbergeron to add a sub-page of packages we have (unless someone beats
    me to it)
  * rbergeron to prod in meeting notes to get people to talk re: what
    would we like to do (we==they)
* **UNASSIGNED**
  * (none)

People Present (lines said)
---------------------------
* rbergeron (163)
* bmahe (59)
* tflink (36)
* ctyler (27)
* threebean (8)
* zodbot (5)
* witlessb (4)
* samkottler (1)
* zoglesby (1)
* jsmith (1)

Generated by `MeetBot`_ 0.1.4

.. _`MeetBot`: http://wiki.debian.org/MeetBot

