On Tue, Mar 13, 2012 at 05:16:26PM +1100, Virgil wrote:
I have a strange issue someone, hopefully, can advise me on.
I have two external connections into a host: em1 is behind a firewall machine
which is connected to an internet backbone link, while em2 is an adsl all-u-
can-eat deal with a cheap router.
Both are bridged. So there's pem1 and pem2 which are connected to the em1 and
em2 bridges respectively.
There is a domU router which nats (and is connected to both bridges).
Some domU servers (a variety of fc6 to fc16, 32- and 64-bit) default route to the
domU router and go out the all-u-can-eat link.
The issue is, I recently upgraded the host from Fedora8 to Fedora16, and now the
domU machines that once happily used the all-u-can-eat link, no longer can.
However, in an act of desperation, I moved one of the domUs to a backup
machine. Nothing else changed. Bingo. That domU works. All resources are still
on the original machine (i.e. the DB servers, the router, etc. etc.).
It seems that the router doesn't nat domUs on the same host. Move the domU
off to another host, with comms going via em1, and it works.
Iptraf tracing seems to indicate that the TCP connection is set up. The first
packet goes off and is acked. Then everything stops and eventually the tcp
connection closes on the router, but the domU and the remote computer think
the link is still up (at least that's what I think is happening). The domU's
send queue ends up with lots of data in it according to 'netstat -ant'.
Only nat'd tcp connections are affected. Connections to machines on the same
bridge work fine.
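
One way to pick out the stuck connections described above is to filter the 'netstat -ant' output for sockets with a non-zero Send-Q. What follows is only a minimal sketch in Python, assuming the usual Linux net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function name, the threshold parameter and the sample output are made up for illustration and are not taken from the affected machines.

```python
def stalled_connections(netstat_output, min_send_q=1):
    """Return (local, remote, send_q, state) for TCP connections with queued send data.

    Assumes 'netstat -ant' style rows: Proto Recv-Q Send-Q Local Foreign State.
    """
    stalled = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner and header lines; data rows start with tcp or tcp6.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Skip listening sockets; only connections carry unsent data.
        if fields[5] == "LISTEN":
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue
        if send_q >= min_send_q:
            stalled.append((fields[3], fields[4], send_q, fields[5]))
    return stalled


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0  14480 10.0.0.5:43210          192.0.2.10:80           ESTABLISHED
tcp        0      0 10.0.0.5:51514          10.0.0.7:5432           ESTABLISHED
"""

for local, remote, send_q, state in stalled_connections(SAMPLE):
    print(local, remote, send_q, state)
```

On a live domU the text could come from subprocess.check_output(["netstat", "-ant"], text=True). A Send-Q that stays large across repeated runs would match the stall reported here, although a briefly non-zero value is normal on a busy link.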
Does this sound familiar to anyone?
How big is the packet? There is a bug we found in netback where a specific
length of a packet causes netback to stall. Patches will be visible soon
once we have run through all the regression tests.
Thanks in advance
xen mailing list