Anyone running Ceph?

Brown, David M JR david.brown at pnnl.gov
Sat Jan 24 00:10:12 UTC 2015


Okay then,

This is probably in need of a presentation somewhere and some slides, but
here goes…

We got into Ceph early on because its features fit the performance-metric
gathering we do on our HPC systems. We used to collect metrics with Ganglia
or Collectl and aggregate them into a PostgreSQL server, a big one. The
problem with that system was that queries tended to take a long time, and we
had a very specific set of patterns for accessing the data. That's not a
knock on PostgreSQL; we abused it quite heavily and it did exactly what it
was supposed to do. Still, a few things drove us away from it. First, we
didn't need the transactional nature of SQL. Most of the time we process the
performance data after a job ends to produce graphs for later use; users at
our site don't do much interactive performance monitoring with these
metrics. Second was scalability: we had been tracking 2,000 nodes, each with
about 180 metrics collected every minute, and we wanted to retain that data
for 5-6 years. With that much data you're going to need a cluster of
systems, so what's easier for us to deal with? Ceph.
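To give a rough sense of the scale those numbers imply, here is a
back-of-the-envelope sketch. The per-sample size is an assumption I'm
making for illustration, not a figure from our deployment:

```python
# Back-of-the-envelope data volume for the workload described above:
# 2,000 nodes x 180 metrics, sampled every minute, retained 6 years.
nodes = 2_000
metrics_per_node = 180
minutes_per_year = 60 * 24 * 365          # 525,600

samples_per_minute = nodes * metrics_per_node
samples_per_year = samples_per_minute * minutes_per_year
bytes_per_sample = 16                     # assumed: timestamp + value + overhead
raw_tb_for_6_years = samples_per_year * 6 * bytes_per_sample / 1e12

print(f"{samples_per_minute:,} samples/minute")
print(f"{samples_per_year:,} samples/year")
print(f"~{raw_tb_for_6_years:.0f} TB raw over 6 years")
```

Even before indexes or replication, that's hundreds of billions of samples
per year, which is why a scale-out object store starts to look attractive.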

The current capture pipeline starts with Collectd. We then funnel the data
into Ceph through a piece of custom software, NWPerf. A data formatter then
aggregates the data into static files that can be copied locally or served
over HTTP; it can generate files for individual jobs or for the entire
cluster. Our visualization software (CView) reads those static files and
renders the output in an OpenGL context, so application performance
bottlenecks can be spotted more easily.
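The formatter step above can be sketched roughly as follows. This is a
hypothetical, simplified illustration only; the function name, the tuple
layout, and the output structure are my inventions, and the real NWPerf
formatter works differently:

```python
from collections import defaultdict

def format_job(samples, job_nodes):
    """Aggregate raw (node, metric, minute, value) samples into per-job,
    per-metric time series of the kind a CView-style viewer could read.
    job_nodes: the set of nodes the job ran on; other nodes are ignored.
    Returns {metric: {minute: [value per contributing node]}}."""
    series = defaultdict(lambda: defaultdict(list))
    for node, metric, minute, value in samples:
        if node in job_nodes:
            series[metric][minute].append(value)
    return series

samples = [
    ("n001", "cpu_user", 0, 80.0),
    ("n002", "cpu_user", 0, 75.0),
    ("n001", "cpu_user", 1, 90.0),
    ("n099", "cpu_user", 0, 10.0),   # node not part of this job
]
series = format_job(samples, job_nodes={"n001", "n002"})
print(series["cpu_user"][0])   # values from both job nodes at minute 0
```

The real pipeline writes the aggregated series out as static files, which
is what makes sharing them over HTTP or copying them locally cheap.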

We've also used the performance metrics to make more educated decisions
about future procurements. This information is invaluable when evaluating
whether we need more storage, faster storage, more network bandwidth, a
lower-latency network, and so on. That process is more manual, since we only
ask these questions every 4-5 years.

Ceph is the backbone storage for this data because it's scalable. We can
approach both performance gathering and extracting information from that
data in a fault-tolerant, parallel way.

Furthermore, Ceph is the backend storage for our OpenStack cluster, a small
12-node micro-cloud from SuperMicro. We've extended Ceph onto those nodes
and run one OSD per disk. Cinder and Swift are both backed by Ceph and
integrate quite nicely with a QEMU/KVM built against Ceph's librbd layer.
We are still figuring out how to use OpenStack at our site and what benefits
it will provide us; there are several pushes to deploy OpenStack clusters at
PNNL in both the operations and research areas.
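For anyone curious what backing Cinder with Ceph looks like, it's a small
configuration change on the Cinder side. Something along these lines (pool
name, user, and secret UUID are placeholders; check the Ceph and OpenStack
docs for your release):

```ini
# cinder.conf (illustrative fragment; values are placeholders)
[DEFAULT]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = <libvirt-secret-uuid>
```

With QEMU/KVM built against librbd, volumes created this way attach to
guests without any intermediate iSCSI layer.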

We are very interested in what Red Hat is planning to do with Inktank and
its involvement with Ceph, as we depend on it quite a lot.

Thanks, hopefully this was helpful.

- David Brown

________________________________________
From: server-bounces at lists.fedoraproject.org
[server-bounces at lists.fedoraproject.org] on behalf of Patrick McGarry
[pmcgarry at redhat.com]
Sent: Friday, January 16, 2015 11:06 AM
To: server at lists.fedoraproject.org
Subject: Re: Anyone running Ceph?

David,

Awesome, this is great information. If you'd be interested in writing
up some of the details of your work and experience I'd be happy to
post it under your name (or mine with attribution). If you wouldn't
have time for this I'd love to get a brief outline from you and put
together some questions that I could use to write a piece. We love
featuring our users and their awesome work. Let me know what you might
have time/appetite for. Thanks so much!


Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph


On Thu, Jan 15, 2015 at 1:10 PM, Brown, David M JR <david.brown at pnnl.gov>
wrote:
> Patrick,
>
> I'm from PNNL and we've been running Ceph for 5-7 years. We are a
> Red Hat shop and have used Ceph for two applications quite successfully
> during that time. We run several HPC systems using CentOS/SL and capture
> performance data into Ceph for permanent storage for the life of the
> system. Over the last 2 years we've been running a small/medium OpenStack
> cluster backed by Ceph as well.
>
> All of the usage I'm aware of is US DOE open research, so we are
> perfectly willing to have a more in-depth conversation about this if anyone
> is interested. However, I'll leave it up to the fedora list as to
> whether they want to have it on the list or not.
>
> Please feel free to email me directly about this; we are happy to share
> whatever information people are interested in about Ceph and OpenStack.
>
> Thanks,
> - David Brown
>
> On 1/15/15, 7:22 AM, "Patrick McGarry" <pmcgarry at redhat.com> wrote:
>
>>Hey all,
>>
>>As the Ceph team dives deeper and deeper into the multi-headed distro
>>world I'm looking to gather anecdotal evidence of what the Ceph user
>>experience is like on various distros. If anyone here is running (or
>>has run) Ceph on some flavor of Fedora, I'd love to hear about it.
>>
>>My hope is that we can work with the Fedora community to polish off
>>any difficulties or awkwardness that may still exist. Thanks!
>>
>>
>>Best Regards,
>>
>>Patrick McGarry
>>Director Ceph Community || Red Hat
>>http://ceph.com  ||  http://community.redhat.com
>>@scuttlemonkey || @ceph
>>_______________________________________________
>>server mailing list
>>server at lists.fedoraproject.org
>>https://admin.fedoraproject.org/mailman/listinfo/server