Slowness getting back on network from resume

Dan Williams dcbw at redhat.com
Tue Jul 15 15:40:25 UTC 2008


On Mon, 2008-07-14 at 10:51 -0400, Jesse Keating wrote:
> This likely isn't any desktop component that is causing this, but I'd
> like to know what it is that makes getting back on wireless network
> after resuming from suspend take so long.  I've been comparing with Vista
> on my laptop just for giggles, and Vista is back on the wireless and FF
> even refreshes my gmail before I can get my password typed into the
> screen lock.  This is all very very very fast.  F9 on the other hand
> takes longer to get to the password dialog and then once unlocked it can
> still be another minute+ before NM starts the connection dance, another
> 30 seconds after that before wireless picks up again.  Where is the
> delay?

NM throws away the current access point list on suspend/hibernate.  On
resume from sleep/hibernate, it needs to do a scan or two to find the
network to associate with, and after that, do the whole connection
process (including DHCP) again.  There is a considerable amount of lag
here, and there's certainly room for improvement.

First, we need better kernel interfaces for wireless.  Finishing
nl80211/cfg80211 and ensuring that they have a more request/response
oriented API [1] is the way to go.

Second, Mac OS X and Windows indicate connection differently from
NetworkManager.  Mac OS X will show that Airport is "connected" way
before it's successfully completed DHCP or IP assignment.  That's a
completely useless status since you can't actually do anything in that
state, so NetworkManager considers the "connected" state to be once all
IP configuration is complete and successful.

Third, drivers still suck for scanning.  Ideally, we'd just fire off a
quick scan right after NM wakes back up, get back a nice list of access
points, and immediately connect to one of them.  For whatever reason, a
single scan is still not reliable enough to get an adequate picture of
the network around you.  And since scans can take quite a bit of time to
complete [2] you get unacceptable latency here.

So how to fix this?

a) Get drivers to support specific SSID scanning (unfortunately older
drivers probably won't be able to do this, but all mac80211 drivers
can), and when an SSID scan is requested, make that jump to the start of
the scan queue.  Have the driver/mac80211 provide a specific response to
that scan request so that NetworkManager knows the result.

b) When coming out of sleep, have NetworkManager SSID scan the last
connected AP.  If that AP doesn't show up in scan results (it's not
there, the driver sucks, etc), continue as normal.  If that AP does show
up in scan results, try to associate with that AP.

c) Since NM controls the DHCP client, NM is capable of providing
different lease files to dhclient for each network you've connected to.
So for 'my-home-wireless' NM could try to re-acquire the previous lease
before falling back to a complete DHCP transaction.

I don't really know how much time this would save, since lots of the
latency is in the wpa_supplicant <-> driver communication during
authentication/association because the WEXT API is so bad at reporting
its status.  But that can be improved independently of the NM bits, and
the stuff above is definitely a win.
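
A rough sketch of (b) and (c), just to make the control flow concrete.
This is not NetworkManager code: scan_ssid and full_scan are
hypothetical callables standing in for whatever the driver interface
ends up looking like, and the lease directory and file naming are made
up for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, Optional


def lease_file_for(ssid: str, base_dir: str = "/var/lib/dhclient") -> Path:
    """Return a per-network lease file path (hypothetical naming scheme).

    Hashing the SSID keeps odd characters out of the file name.
    """
    digest = hashlib.sha1(ssid.encode("utf-8")).hexdigest()[:12]
    return Path(base_dir) / f"dhclient-{digest}.lease"


def pick_ap_on_resume(
    last_ssid: Optional[str],
    scan_ssid: Callable[[str], Iterable[str]],
    full_scan: Callable[[], Iterable[str]],
) -> Optional[str]:
    """Choose which network to try first after resume.

    Step (b): do a targeted scan for the last connected SSID. If it
    shows up, use it right away. Otherwise continue as normal with a
    full scan and take the first candidate, if any.
    """
    if last_ssid is not None and last_ssid in scan_ssid(last_ssid):
        return last_ssid
    candidates = list(full_scan())
    return candidates[0] if candidates else None
```

The chosen SSID would then be handed to lease_file_for() so the DHCP
client can try the previous lease for that network first, as in (c).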

Try resuming while plugged into a cable, and NM will be REALLY FAST.
It's just when you throw wireless into the mix that stuff gets more
complicated and therefore quite a bit laggier.

Dan

[1] i.e. a way to track the result of specific requests.  With WEXT you
say "associate to X" and some time later the driver may or may not send
association status back, but there's no ID/cookie to tie that result
back to the original request that was made.  Would be nice to be able to
track the progress of asynchronous operations through the driver/kernel.
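
To illustrate the ID/cookie idea, here is a minimal sketch of the
bookkeeping a caller could do if every request came back with a
cookie.  The RequestTracker class and its method names are invented
for this example; nothing here is an existing kernel or
wpa_supplicant interface.

```python
import itertools


class RequestTracker:
    """Tie asynchronous results back to the request that caused them."""

    def __init__(self):
        self._next_cookie = itertools.count(1)
        self._pending = {}

    def submit(self, description):
        """Record a request and return the cookie that identifies it."""
        cookie = next(self._next_cookie)
        self._pending[cookie] = description
        return cookie

    def complete(self, cookie, status):
        """Match a result to its request; raises KeyError if unknown."""
        description = self._pending.pop(cookie)
        return description, status

    def pending(self):
        """Return the cookies of requests still awaiting a result."""
        return sorted(self._pending)


tracker = RequestTracker()
assoc = tracker.submit("associate to X")
scan = tracker.submit("scan for SSID Y")
# Results may arrive in any order; the cookie says which is which.
scan_result = tracker.complete(scan, "done")
```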

[2] historically up to 10s with madwifi to scan both A and B/G bands;
you have up to 60 channels with at least a 100ms passive scan dwell time
on each channel, and maybe 50ms channel switch lag, not including
firmware/driver command latency.  SSID scanning (called an "active scan"
in 802.11) helps because you send out the probe request and only have to
wait 20 or 30ms before giving up, instead of the minimum 100ms a passive
scan requires.
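
The arithmetic behind those numbers, as a back-of-the-envelope
estimate.  It assumes every one of the 60 channels is visited, that
the 50ms switch lag applies to both kinds of scan, and it takes 30ms
as the active scan wait; firmware/driver command latency is ignored.

```python
def scan_time_ms(channels, dwell_ms, switch_ms):
    """Estimate total scan time: dwell plus switch lag on each channel."""
    return channels * (dwell_ms + switch_ms)


# Passive scan: at least 100ms dwell per channel.
passive_ms = scan_time_ms(channels=60, dwell_ms=100, switch_ms=50)

# Active (SSID) scan: give up after about 30ms per channel.
active_ms = scan_time_ms(channels=60, dwell_ms=30, switch_ms=50)

print(f"passive: {passive_ms / 1000:.1f}s, active: {active_ms / 1000:.1f}s")
```

Under those assumptions the passive scan comes to 9.0s, in line with
the roughly 10s figure above, and the active scan to 4.8s.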



