An Interview on Fedora 11's enhanced Audio Control with Lennart Poettering
Where would we be without sound? It's the most primitive of
communication methods, and yet it has spawned so much technology around
it. Whether you're a musician, a DJ, riding a bus to work, or even just
stuck in a cubicle listening to the radio somewhere, sound has become an
integral part of your daily experiences. When Fedora 11 lands, along
with it will land a number of enhancements to the sound subsystem,
including unified volume control, per-stream and per-device monitoring,
and proper Bluetooth audio support. I recently caught up with Lennart
Poettering, Red Hat Desktop Team Engineer and resident audio guru.
Here's what he had to say about the upcoming improvements and what the
*1. Please introduce yourself and give us a brief intro to how you
started working on the upcoming audio improvement in F11.*
I am Lennart Poettering and have been working for Red Hat in the Desktop
Group for two years now this month. I live in Berlin, Germany.
PA (PulseAudio) has been part of Fedora since F8. Since then we have
shipped two volume control applications: the GNOME volume control and a
PA-specific tool (pavucontrol). The latter was mostly a showcase of what
can be done with PA, and I wrote it mostly as a demo, not because I
thought it was any good as a UI.
Of course having these two volume control UIs in Fedora was a situation
that badly needed fixing. Especially since both UIs exposed too many
unnecessary options: the GNOME volume control exposed a lot of low-level
hardware-specific features that only a tiny minority of people actually
really understood, and the PA volume control exposed a lot of low-level
software features that only a slightly larger minority of people
actually understood.
Now during the last year we reached a point where the feature set of PA
for volume controls became very complete (with such things as arbitrary
metadata on every stream/device, per-stream and per-device monitoring,
hardware volume range extension, "flat" volumes and lots of other stuff)
and Jon McCann, with help from Bastien Nocera, finally took up the work to
fix the UI situation.
They basically designed the new UI from scratch with input from
usability experts. It implements many of the features the old
pavucontrol tool did, but in a much nicer, streamlined way. Also it
integrates sound theme/event sound control with general audio
configuration and volume control in a single UI tool.
*2. Can you give us some background on the upcoming changes to the audio
subsystem in the Fedora 11 Release?*
If you want to know more about the Volume Control, I'd just refer to the
We moved PA 0.9.15 into F11. You can find a nice overview of the new
features here:
However, that overview is a bit out of date. There are quite a few
additional features that went into 0.9.15, most prominently full
Bluetooth audio support. Together with Bastien Nocera and the BlueZ guys
I worked to make Bluetooth audio easily accessible -- the Bluetooth
applet now exposes an easy dialog that allows you to pair and activate a
Bluetooth headset. After that is done it will automatically appear in
PulseAudio. If you need to reactivate it later, you can do that with a
simple click in the applet menu. It works surprisingly well. It even
works fine for lip-synced video, which is kind of magic, given that
Bluetooth audio doesn't actually offer any timing interfaces, so syncing
up audio with video is not really possible. I spent a lot of time making
sure it does work nonetheless, and it seems to work on the majority of
headphones, although I cannot say for sure whether it does for all of them.
*3. Where did the ideas to change all this stuff come from? Didn't audio
always work in Fedora?*
Depends what you mean by 'work'. Sure, basic audio output worked. But in
many ways what we had on Linux was not comparable to what MacOS or
Windows supported. And it still isn't in many ways. However, in other
ways we have now surpassed those competitors.
A lot of the changes we introduced with PA are not directly visible to
the user. For example, the so-called 'glitch-free' logic in PA is very
important for a modern audio stack; however, the normal user will never
notice it -- except maybe because when we introduced it initially a lot
of driver bugs got exposed that people were not aware of before because
that driver functionality (usually timing related) was not really
depended on by any application. In fact even now many of the older
drivers expose broken timing that makes usage with PA not as much fun as
it could be.
You can find a more detailed explanation of this 'glitch-free' logic here:
Both Windows Vista and MacOS X have similar g-f logic in their audio
stacks; with PA, however, we took it a step further. For example, we
implemented this logic in a zero-copy fashion and with arbitrary sample
types. This allows us to pass PCM data through our pipelines without
ever having to copy/convert it unless we really have to.
So yes, as you might have noticed, I spend a lot of time getting low-level
internals right. And I like to speak about it, even though most people
are not aware of all those technical details and how awesome this all
is. ;-) That said, this stuff isn't perfect yet and could need more
But it's not all just in the low-level details. Also on higher levels we
got inspired by how our competitors do things. For example the new
"flat" volume logic was pioneered in Vista, and we have now adopted a
similar logic in PA. It's a great way to reduce the complexities of
volume control by 'merging' a few of the sliders in the pipeline. It
thus goes some way toward solving the "So which slider is now causing my
volume to be too low?" problem. But here too, there's more work to be done.
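
As a rough illustration of what 'merging' sliders can mean, here is a minimal Python sketch. It is not PulseAudio's code: the function name, the stream names and the rule that the device slider follows the loudest stream are all assumptions made for this example.

```python
from typing import Dict, Tuple


def flat_volumes(stream_volumes: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Merge per-stream volumes into one device volume plus per-stream scales.

    stream_volumes maps a (hypothetical) stream name to the volume the user
    asked for, as a linear factor between 0.0 and 1.0.
    """
    if not stream_volumes:
        return 0.0, {}
    # Assumption for this sketch: the device slider follows the loudest stream.
    device = max(stream_volumes.values())
    if device == 0.0:
        # Everything is muted; avoid dividing by zero below.
        return 0.0, {name: 0.0 for name in stream_volumes}
    # Each stream is attenuated in software relative to the device volume.
    scales = {name: vol / device for name, vol in stream_volumes.items()}
    return device, scales


device, scales = flat_volumes({"music": 0.8, "chat": 0.4})
print(device, scales)
```

With this rule the user only ever sees one slider per stream, and the device slider is derived from them rather than being a separate knob to hunt for.
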
It's not all just getting inspired by our competitors. There are a lot
of genuinely new features in PA that none of them have (at least to my
knowledge). For example, in PA we have 'spatial' event sounds. That is,
if an event sound is triggered by a mouse click/dialog at the left side
of the screen, the sound comes more from the left speakers, and
similarly for the right side. This is of course mostly a toy. But I
think a useful one ;-) .
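
The spatial idea above can be sketched in a few lines of Python. This is only an illustration, not how PA implements it: the function name is hypothetical, and the constant-power panning curve is an assumption chosen because it keeps perceived loudness roughly even across the screen.

```python
import math
from typing import Tuple


def pan_gains(x: float, screen_width: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for an event at horizontal pixel x."""
    if screen_width <= 0:
        raise ValueError("screen_width must be positive")
    # Normalize the position to 0.0 (far left) .. 1.0 (far right).
    pos = min(max(x / screen_width, 0.0), 1.0)
    # Constant-power pan: left**2 + right**2 stays equal to 1.
    angle = pos * math.pi / 2
    return math.cos(angle), math.sin(angle)


for x in (0, 960, 1920):
    left, right = pan_gains(x, 1920)
    print(f"x={x}: left={left:.2f} right={right:.2f}")
```

A click at the far left plays only on the left speaker, a click in the middle plays equally on both, and a click at the far right plays only on the right.
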
Listing all the fancy features PA has would certainly be a bit too much
for this interview. So I'll leave it with this... ;-)
Generally, we get inspiration from everywhere. And sure, as long as the
most basic music playback was enough for you, audio did always work in
Fedora. But OTOH, when we started with the integration of all of these
new audio features into Fedora two years ago the audio stack was still
at a point of what was modern in the 90's. With the new features of the
new volume control and PA we are working on bringing Linux audio to what
is modern today.
*4. Can you also give us a comparison of our new audio framework in
reference to other audio frameworks and audio subsystem models that are
There are many frameworks out there. On Free Software systems PA doesn't
really have any competitor. Some people think that JACK is one, but it
actually is not. JACK is clearly focussed on audio production and not
very useful on the desktop otherwise. For example, it is strictly
designed to provide very low latency at the price of power consumption.
This is the right thing to do for audio production but not on the
general desktop. Logic like 'glitch-free' (see above) makes a lot of
sense for the usual desktop audio since it allows flexible adjusting of
the latency to what is needed. If used properly it can be used to
decrease the interrupt rate to 1/s, while still allowing instant
reaction to user input. Since most PCs these days are laptops, these
kinds of power-consumption-related features are very important.
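
To put numbers on the trade-off between latency and wakeups, here is a small Python sketch of the arithmetic. It assumes a simplified model in which the audio daemon fills a buffer and sleeps until a safety margin remains; the function name, parameters and figures are illustrative and not taken from PA.

```python
from typing import Tuple


def wakeup_plan(rate_hz: int, channels: int, bytes_per_sample: int,
                buffer_seconds: float, margin_seconds: float) -> Tuple[int, float]:
    """Return (buffer size in bytes, wakeups per second) for a playback buffer."""
    if not 0 <= margin_seconds < buffer_seconds:
        raise ValueError("margin must be non-negative and smaller than the buffer")
    # Bytes needed to hold buffer_seconds of PCM audio.
    buffer_bytes = round(rate_hz * channels * bytes_per_sample * buffer_seconds)
    # Sleep until only the safety margin is left, then refill.
    sleep_seconds = buffer_seconds - margin_seconds
    return buffer_bytes, 1.0 / sleep_seconds


# Music playback: a large buffer, so the CPU wakes up about once per second.
print(wakeup_plan(44100, 2, 2, buffer_seconds=1.25, margin_seconds=0.25))
# Low-latency use: a tiny buffer, so the CPU wakes up many times per second.
print(wakeup_plan(44100, 2, 2, buffer_seconds=0.02, margin_seconds=0.005))
```

The first case corresponds to the 1/s interrupt rate mentioned above; the second shows why a fixed low-latency setup costs more power.
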
One of the current weaker points of Audio on Linux is that we have this
clear separation of JACK for audio production and PA for
desktop/embedded. Other operating systems have managed to make this a
bit smoother by having a single stack for both. This however actually
has both advantages and disadvantages.
To improve the situation for now we focussed on making PA and JACK
cooperate better. In F11 when JACK needs low-level access to an audio
device it will tell PA so and PA will comply and release the device.
This should make switching between the two sound systems easier, though
of course this is no perfect solution. Given the lack of manpower,
further integration is unlikely to happen anytime soon -- though both
the JACK guys and I seem not generally opposed to something like that.
Now, if you compare our audio stacks with those of the other big
operating systems (Windows and MacOS X), then besides the fact that they
usually integrate desktop audio and audio production better than we do
(as mentioned) there are many things we are better at and many they are
better at. We certainly have more flexibility: depending on your
application you can access audio on a lot of different levels: you can
access ALSA directly if you need very low-level control, or via PA for
desktop level control. You have APIs like GStreamer for media streaming
and so on.
This flexibility however translates to more complexity in many ways, and
a hodgepodge of API styles. (OTOH Apple's CoreAudio actually isn't as
streamlined as many MacOS proponents would like us to think.) The
documentation for our APIs is usually much worse than theirs. We really
need some improvements in that area. Feature-wise, PA usually has better
networking-related features than those counterparts. But there are a lot
of features they have right now that we lack.
Other Unixes, such as FreeBSD and OpenSolaris are still stuck with OSS
(Open Sound System) audio. In F11 we finally switched OSS off by default
(though you can still reenable it via some minor hackery). OSS was the
predecessor of ALSA. Thankfully it is now fully obsolete on Linux. OSS
is mostly a design from the early nineties. It has received only minor
updating since then. It is in no way comparable to what we now have on
Linux or even what MacOS or Windows provide. (Although it has some very
vocal fans who like to write me hate mail because I say things like this.)
*5. This work all started in earlier releases dating all the way to even
Fedora 8, if I am correct. How has all this stuff progressed and evolved
from then? What was done in previous releases that enabled building upon
for this release?*
Fedora 8 was the first release where we integrated PA. In Fedora 9 we
stabilized PA support. In F10 we integrated the 'glitch-free' logic
which turned out to be quite a bumpy ride given that it exposed a lot of
timing related driver bugs. In F11 g-f has now been made more robust and
most of the more modern audio drivers should now be fixed. Also we have
now started to push PA support more into the UI, like with this new
*6. What are the plans for the future, if any, in this particular space
in the distro?*
I am working on multiple things for F12. Firstly, there will be a couple
more low-level changes to PA. The core will be made more threaded.
Right now, we run most things in one 'main' thread and do low-level
audio I/O in one thread for each audio card. My plan for F12 is to split
that one 'main' thread up into as many threads as possible. This should
make PA more robust for a couple of operations, and make latencies more
Then, I am working on considerably beefing up PA's usage of the
low-level hardware volume controls. For example, many cards have
separate low-level volume sliders for "Speaker", "Master",
more) that are in the line from the PCM data we stream to the speakers.
PA currently exposes only one of those sliders (usually "Master"). My
plan is to 'multiply' those sliders and create a single 'product'
virtual slider from them that has a better granularity and a larger
range. This rework will also introduce input/output switching and
What has already landed in PA's git repository is support for UPnP A/V.
When used in conjunction with Zeeshan Ali's Rygel UPnP MediaServer
implementation this allows streaming any application's audio to any
UPnP MediaRenderer (including PS3/Xboxes and all those 'Internet Radio'
devices). This is actually pretty neat. Later on we hope to make PA a
MediaRenderer as well as a MediaServer. This nicely complements our
current Apple RAOP support.
And there's a lot of other things planned. We'll see how much of that
will be ready for F12. I don't like to talk too much about upcoming
features and planned code if I don't have anything to show yet, so I'll
leave it at this.
And then there's always a little project of mine that is called
'libsydney' that is intended to be a portable, modern and friendly PCM
API. During the last months I focussed more on PA itself though.
*7. Do you feel that work like this helps enhance the desktop experience
on Linux in general and strengthens the cause of the Linux Desktop, or
is it more all in a day's work?*
I think that PA is the way forward for audio on the Linux desktop. It
may have its deficiencies -- but everything does. We still have some way
to go, but I believe that a modern audio layer is really important for
the Linux Desktop to succeed.
And no, it doesn't feel like all in a day's work at all. It is always a
great feeling to see how PA has been incorporated into so many
distributions and how it is now used by so many people. I am pretty sure
that you only get this if you hack on Linux software.
*8. Speaking of all in a day's work, what things do you usually work
on? What do you most enjoy doing outside of work?*
Red Hat basically hired me to help improve audio on Linux. So that's
what I am doing during work.
Outside of work I spend my time on photography. And I am trying my best
to travel to interesting places as much as I can and my time off allows.
Thank you Lennart for an excellent interview, ideas and insight. We look
forward to hearing more from you. Get it--hearing more, he works on
sound, okay I give up.