Big Data SIG Meeting Minutes: 2013-03-07
by Robyn Bergeron
Howdy,
Minutes from today's meeting follow below.
Had a really interesting conversation - though I think we didn't quite
get through much of the agenda :) We did discuss some categories,
things already packaged, etc. - I'm on duty to make a sub-wiki-page so
we can collectively document some of these things.
Also had some talk around packaging Apache Bigtop as well as the
intersections with that community - as many of their folks apparently
use Fedora (YAY).
Anyway: Keep your intros and thoughts coming - I suspect we'll be
gathering more humans in the coming weeks.
Thanks for coming!
-Robyn
Minutes: http://meetbot.fedoraproject.org/fedora-meeting-1/2013-03-07/big_data_sig...
Full logs: http://meetbot.fedoraproject.org/fedora-meeting-1/2013-03-07/big_data_sig...
===============================
#fedora-meeting-1: Big Data SIG
===============================
Meeting started by rbergeron at 16:59:44 UTC. The full logs are
available at
http://meetbot.fedoraproject.org/fedora-meeting-1/2013-03-07/big_data_sig...
.
Meeting summary
---------------
* Who's around for fun? (rbergeron, 17:00:05)
* present: rbergero, tflink (rbergeron, 17:01:15)
* present: witlessb (rbergeron, 17:01:45)
* present: threebean, zoglesby, samkottler, jsmith (rbergeron,
17:02:18)
* Agenda for today's first meeting :D (rbergeron, 17:03:49)
* LINK:
http://lists.fedoraproject.org/pipermail/bigdata/2013-March/000003.html
(rbergeron, 17:04:32)
* Agenda looks like: What this is all about, what do we have, what
don't we have, what is anyone here interested in doing :)
(rbergeron, 17:05:09)
* What's the Big Data SIG all about? (rbergeron, 17:05:57)
* loosely quoting from o'reilly: "If the size of your data is part of
the problem, it's Big Data." (rbergeron, 17:07:57)
* IDEA: one part of it is getting a decent setup; other part is
understanding tools and approaches needed (rbergeron, 17:11:42)
* IDEA: two pieces you hear most about in Big Data seem to be massive
storage, & parallel computing (hadoop, column databases, etc)
(rbergeron, 17:12:16)
* IDEA: another component is online processing or online analysis -
predicting what is trending before it's had time to hit disk
(rbergeron, 17:13:29)
* IDEA: for ex. - idea that google can predict flu outbreaks faster
than public health agencies by watching search terms; the same
concept applies to financial tools as well (rbergeron, 17:15:49)
* IDEA: another ex. of stream processing is twitter analytics -
looking for emerging topics in twitter streams (rbergeron,
17:16:52)
* What are the buckets or categories, and what do we have? (rbergeron,
17:18:21)
* IDEA: orchestration, batch processing, stream processing are
categories that come to mind - orch (zookeeper), batch (hadoop,
disco), stream (storm) (rbergeron, 17:21:23)
* IDEA: storage is another category (rbergeron, 17:22:25)
* IDEA: full hadoop stack seems to be thought of as useful foundation
layer for some types of work, but HDFS is getting attention as weak
spot, with nosql dbs and gluster being used as alternatives
(rbergeron, 17:23:00)
* hdfs is part of the hadoop project; one can install hdfs without
using the mapreduce part (rbergeron, 17:28:05)
* lots of java libraries, servers like tomcat (used by solr, oozie)
are already in (rbergeron, 17:44:01)
* IDEA: we have pandas (not the animal, http://pandas.pydata.org) -
useful for data analysis, not really big data (rbergeron, 17:49:40)
* ACTION: rbergeron to add a sub-page of packages we have (unless
someone beats me to it) (rbergeron, 17:50:18)
* IDEA: interest in disco - seems to be more python-friendly than
hadoop (though we are aware that there are python wrappers for
hadoop) (rbergeron, 18:08:03)
* spring is packaged - could package spring-hadoop (rbergeron,
18:11:24)
* ACTION: bmahe to expound on openjdk/bug filing, as well as the wide
world of bigtop packaging, as time permits :) (rbergeron, 18:12:51)
* Operation Agenda: Yeah... (rbergeron, 18:15:50)
* ACTION: rbergeron to prod in meeting notes to get people to talk re:
what would we like to do (we==they) (rbergeron, 18:23:54)
Meeting ended at 18:26:13 UTC.
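
The stream-processing idea from the summary above (looking for emerging topics in a stream of messages) can be sketched in a few lines of Python. This is only a toy illustration, not anything discussed or packaged by the SIG: the `TrendDetector` class, its `window` and `min_ratio` parameters, and the sample messages are all hypothetical, and a real deployment would more likely sit on a tool such as Storm. It keeps term counts for two adjacent sliding windows and flags terms that show up much more often in the newer one.

```python
from collections import Counter, deque


class TrendDetector:
    """Toy emerging-topic detector (hypothetical, for illustration only).

    Keeps term counts for the newest `window` messages and for the
    `window` messages before those, and compares the two.
    """

    def __init__(self, window=3, min_ratio=2.0):
        self.window = window
        self.min_ratio = min_ratio
        self.recent = deque()          # newest `window` messages
        self.older = deque()           # the `window` messages before those
        self.recent_counts = Counter()
        self.older_counts = Counter()

    def add(self, message):
        """Feed one message from the stream into the detector."""
        terms = message.lower().split()
        self.recent.append(terms)
        self.recent_counts.update(terms)
        # Slide the oldest "recent" message into the "older" window.
        if len(self.recent) > self.window:
            moved = self.recent.popleft()
            self.recent_counts.subtract(moved)
            self.older.append(moved)
            self.older_counts.update(moved)
        # Forget messages that fall out of the "older" window.
        if len(self.older) > self.window:
            dropped = self.older.popleft()
            self.older_counts.subtract(dropped)

    def trending(self):
        """Return sorted terms that are much more frequent recently."""
        result = []
        for term, count in self.recent_counts.items():
            if count <= 0:
                continue
            baseline = self.older_counts.get(term, 0)
            # Add 1 to the baseline so unseen terms do not divide by zero.
            if count / (baseline + 1) >= self.min_ratio:
                result.append(term)
        return sorted(result)


detector = TrendDetector(window=3, min_ratio=2.0)
for msg in ["rain today", "rain again", "more rain",
            "flu season", "flu shots", "flu flu everywhere"]:
    detector.add(msg)
print(detector.trending())
```

Everything stays in memory and each message is touched only as it arrives, which is the "before it's had time to hit disk" part; the thresholds are arbitrary and would need tuning on real data.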
Action Items
------------
* rbergeron to add a sub-page of packages we have (unless someone beats
me to it)
* bmahe to expound on openjdk/bug filing, as well as the wide world of
bigtop packaging, as time permits :)
* rbergeron to prod in meeting notes to get people to talk re: what
would we like to do (we==they)
Action Items, by person
-----------------------
* bmahe
* bmahe to expound on openjdk/bug filing, as well as the wide world of
bigtop packaging, as time permits :)
* rbergeron
* rbergeron to add a sub-page of packages we have (unless someone beats
me to it)
* rbergeron to prod in meeting notes to get people to talk re: what
would we like to do (we==they)
* **UNASSIGNED**
* (none)
People Present (lines said)
---------------------------
* rbergeron (163)
* bmahe (59)
* tflink (36)
* ctyler (27)
* threebean (8)
* zodbot (5)
* witlessb (4)
* samkottler (1)
* zoglesby (1)
* jsmith (1)
Generated by `MeetBot`_ 0.1.4
.. _`MeetBot`: http://wiki.debian.org/MeetBot
RE: Intros and such - why are you here?
by John Dulaney
> Date: Thu, 7 Mar 2013 04:09:04 -0700
> From: Robyn Bergeron <robyn.bergeron(a)gmail.com>
> To: bigdata(a)lists.fedoraproject.org
> Subject: Intros and such - why are you here?
>
> Hi everyone,
>
> Since this is a new SIG and all that, I thought it would be lovely to
> perhaps introduce ourselves, and say a bit about why you're here and
> interested. Whether you're already doing things in this area, you'd
> like to learn about it, you want to do things, or you're just here to
> observe, or any other reason - any reason is a great reason :)
>
> And I guess that means: I get to go first. Hah.
>
> So, a bit about me and my interests here:
>
> I'm Robyn, I'm the Fedora Project Leader, and I like to make new
> things happen. :)
>
> Why big data? A few reasons:
>
> #1: I've always had a fascination with data and how it can be used as
> part of a decision-making process. I believe that agility is one of
> the most important differentiating factors for organizations. The
> ability to do things quickly, identify key trends and data points, and
> make decisions and act upon knowledge, enables organizations to move
> more intelligently. And by intelligently I mean this: (a) Predicting
> the right thing to do based upon patterns, (b) being able to detect
> signs that you're doing the wrong thing - so that you can fix that
> faster.
>
> The cloud brings us the ability to utilize, deploy, orchestrate
> infrastructure more rapidly; having lots of data points, and the
> ability to analyze that information, comes through big data. Putting
> those two things together gets you to the point where you can analyze
> faster, or on a more ad-hoc basis, or deliver the capability to
> analyze random things more rapidly to the person who wants to act upon
> information.
>
> #2: SCIENCE! I think it's interconnected hugely here. I like to think
> that what we do in Fedora can help to change the world. There is more
> information than ever about every little bit of the universe we live
> in, and helping people to sort through that leads to making the world
> a better place. I'll be passing out banjos and marshmallows for the
> campfire at the end of our meeting. Kumbaya :)
>
> #3: I think that people like to tinker with and learn about new stuff,
> and that Fedora is a great place to do that - "features" and "first"
> are two of our awesome foundations. But I think that people are more
> interested in "the new stuff" than they are interested in "what it
> runs on" - so I hope that in bringing some interesting tools to
> Fedora, and making them work well, we can inspire some new people to
> use Fedora. And hopefully inspire them to also become contributors,
> broaden the set of tools that we offer, give feedback about what we're
> doing, and encourage them to share *what* and *how* they did things.
>
> And that was long-winded. So I'll stop there. :D
>
> Anyone else?
>
I'm John, and I'm a Fedora User.
/me waits for the murmurs of "Hi, John" to die down.
I'm here for several reasons. First, I like shiny new stuff, and I like to take
that shiny new stuff and do things to it that the creators did not have in mind
at all. Secondly, as a member of the QA team, I may be able to suggest test
days and the like. Thirdly, it's good to just stay informed about what is coming
down the pipe in Fedora.
John Dulaney.
PS: If Robyn is passing out banjos, I'll take one of these:
http://www.stellingbanjo.com/custom2.htm
Boing Boing big data book review today
by Matthew Miller
http://boingboing.net/2013/03/08/big-data-a-revolutio.html
Big Data is a new book from Viktor Mayer-Schonberger, a respected
Internet governance theorist; and Kenneth Cukier, a long-time technology
journalist who's been with the Economist for many years. As the title and
pedigree imply, this is a business-oriented book about "Big Data," a
computational approach to business, regulation, science and entertainment
that uses data-mining applied to massive, Internet-connected data-sets to
learn things that previous generations weren't able to see because their
data was too thin and diffuse.
Big Data is an eminently practical and sensible book, but it's also an
exciting and excitable text, one that conveys enormous enthusiasm for the
field and its fruits.
[and so on]
--
Matthew Miller ☁☁☁ Fedora Cloud Architect ☁☁☁ <mattdm(a)fedoraproject.org>
What we have right now in Fedora is: ..... ??
by Robyn Bergeron
Hi everyone,
Per the meeting today - and me volunteering to add a page where we can
add the things we know of that already exist in Fedora - well, I did
that. It's also linked to the main Big Data SIG page.
https://fedoraproject.org/wiki/Big_data_packaging
It's also got a bit of a wish-list section, not sure if we want to
jump that far ahead yet :)
As far as the categories/buckets - they're not set in stone, just
suggestions from today; it might be helpful if someone knowledgeable
(read: probably not me, haha) wanted to elaborate a bit on what they
mean and/or provide some examples. If you don't know what bucket
something goes in, don't worry - just add it as a bullet point above
the categories, and we'll sort it out as we go along.
I listed the categories here:
https://fedoraproject.org/wiki/Big_data_SIG#Categories
As someone famous once said, "It's a wiki, be bold!" - so feel free to
add/subtract content as you see fit. :D
-Robyn
Intros and such - why are you here?
by Robyn Bergeron
Hi everyone,
Since this is a new SIG and all that, I thought it would be lovely to
perhaps introduce ourselves, and say a bit about why you're here and
interested. Whether you're already doing things in this area, you'd
like to learn about it, you want to do things, or you're just here to
observe, or any other reason - any reason is a great reason :)
And I guess that means: I get to go first. Hah.
So, a bit about me and my interests here:
I'm Robyn, I'm the Fedora Project Leader, and I like to make new
things happen. :)
Why big data? A few reasons:
#1: I've always had a fascination with data and how it can be used as
part of a decision-making process. I believe that agility is one of
the most important differentiating factors for organizations. The
ability to do things quickly, identify key trends and data points, and
make decisions and act upon knowledge, enables organizations to move
more intelligently. And by intelligently I mean this: (a) Predicting
the right thing to do based upon patterns, (b) being able to detect
signs that you're doing the wrong thing - so that you can fix that
faster.
The cloud brings us the ability to utilize, deploy, orchestrate
infrastructure more rapidly; having lots of data points, and the
ability to analyze that information, comes through big data. Putting
those two things together gets you to the point where you can analyze
faster, or on a more ad-hoc basis, or deliver the capability to
analyze random things more rapidly to the person who wants to act upon
information.
#2: SCIENCE! I think it's interconnected hugely here. I like to think
that what we do in Fedora can help to change the world. There is more
information than ever about every little bit of the universe we live
in, and helping people to sort through that leads to making the world
a better place. I'll be passing out banjos and marshmallows for the
campfire at the end of our meeting. Kumbaya :)
#3: I think that people like to tinker with and learn about new stuff,
and that Fedora is a great place to do that - "features" and "first"
are two of our awesome foundations. But I think that people are more
interested in "the new stuff" than they are interested in "what it
runs on" - so I hope that in bringing some interesting tools to
Fedora, and making them work well, we can inspire some new people to
use Fedora. And hopefully inspire them to also become contributors,
broaden the set of tools that we offer, give feedback about what we're
doing, and encourage them to share *what* and *how* they did things.
And that was long-winded. So I'll stop there. :D
Anyone else?
Meeting reminder: Big Data SIG! Thursday, 17:00 UTC, 2013-03-07
by Robyn Bergeron
Just a quick reminder - Dan was kind enough to send one out earlier,
but I figure it can't hurt to do another with "meeting reminder" in
the subject to catch your attention :D once more:
The debut meeting of the big data SIG is *today* - Thursday, March 7.
When: 17:00 UTC
Time converter: bit.ly/14wW3Ci
Where: On IRC, irc.freenode.net, in #fedora-meeting-1 (note that -1 on the end!)
Everyone is, of course, welcome to join us!
Agendas for a first meeting are always a bit odd to come up with, but
I think this is what we'll roll with for the meeting:
Agenda:
* Quick overview of "what this group is all about"
* Identify things we already have that are useful
* Identify things we don't have - so that folks who might be
interested in something might be compelled to take it on
* Identify a few loose goals - based on what people present might want to do.
Other items are totally welcome - we can adjust as needed :)
If you have any good references for "what is big data," what the
important tools are to be targeting, or likely groups of potential
users/collaborators, I totally encourage you to bring that info along
to the meeting, or feel free to share before or after. :D
Cheers, and see you at 17:00!
-Robyn
First post :)
by Robyn Bergeron
I guess I will invoke the spirit of other lovely mailing lists, such as the Fedora Cloud SIG mailing list, and do the traditional first post.
We'll do good stuff here. As Fedora always does.
-Robyn