Hi all,
first post to this list, hopefully on topic!
I have a KVM-based VPS running openSUSE Leap 15.0 with firewalld 0.5.5 (the openSUSE-packaged version). It had been working fine for a long time.
After recent hardware issues on the host system, the provider had to "hard reboot" all VPSes on that host, including mine, unfortunately several times, in some sort of "reboot loop".
I thought this should be no issue for journaled file systems. However, since these hard reboots "something" is messed up in my network config, although all relevant config files I'm aware of are identical on the VPS and in the latest backup.
Symptoms:
- firewalld could be started by systemd, without errors. But iptables -L -n does not show anything.
- firewall-cmd runs into a DBUS error:
# firewall-cmd --state
ERROR:dbus.proxies:Introspect error on :1.63:/org/fedoraproject/FirewallD1: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
The same happens for all flags I've tested, such as --get-default-zone.
Currently this issue also prevents the NIC eth0 from coming up. That only works if firewalld is stopped and started again later, and the same issue is back after the next reboot.
Any ideas in which direction I should search to solve this painful issue? As said, I compared all config files below /etc/firewalld, /etc/wicked (wickedd is used in openSUSE if NetworkManager is not used), and /etc/sysconfig/network. I couldn't find a single difference: all sizes and timestamps are the same, and spot checks of the file contents also match.
journalctl -b shows no really helpful information; I can provide it if wanted.
Any hints highly appreciated!
Regards, Michael
On Wed, Feb 20, 2019 at 09:44:29PM +0000, Michael Tufar wrote:
[...]
This sounds like firewalld is stuck waiting on something and therefore not able to service dbus messages. That could be many things: dbus, ebtables/iptables locks, filesystem, etc.
I vaguely recall a report a while back about a stale ebtables lock causing an issue like this. Check whether /run/ebtables.lock exists; if so, removing it will probably fix the issue. This seems likely given your "hard reboots" description. https://bugzilla.redhat.com/show_bug.cgi?id=1495893
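A minimal sketch of that check, for illustration only. The function name find_ebtables_locks is hypothetical; /run/ebtables.lock is the only path named above, and any other path you pass in is your own assumption about where a given build keeps its lock.

```shell
#!/bin/sh
# Report which of the given candidate lock files exist.
# find_ebtables_locks is a hypothetical helper, not part of firewalld or ebtables.
find_ebtables_locks() {
    for lock in "$@"; do
        if [ -e "$lock" ]; then
            printf 'found: %s\n' "$lock"
        fi
    done
}

# The path mentioned in this message; prints nothing if the file is absent.
find_ebtables_locks /run/ebtables.lock
```

If the function prints a path while no ebtables process is running, that file is a candidate for a stale lock.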
Here is a list of things you can try to hunt it down.
1) First increase the logging to max by adding "--debug=9" to /etc/sysconfig/firewalld.
2) Check if firewalld is stuck calling ebtables/iptables. systemctl can tell you.
# systemctl status firewalld
3) Manually run firewalld in the foreground with strace. You may need to install strace. This will produce lots of output. The last dozen or so lines are the most relevant. In a normal scenario firewalld will sit in poll() waiting for dbus messages.
# systemctl stop firewalld
# strace firewalld --nofork
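Since strace produces a lot of output, step 3 may be easier to work with if the trace goes to a file. The following is a sketch under assumptions: the helper names and the log path /tmp/firewalld.strace are made up for illustration, and -f is added on the assumption that you also want to see what child processes such as ebtables or iptables are doing.

```shell
#!/bin/sh
# Assumed log location; override by setting TRACE_LOG beforehand.
TRACE_LOG=${TRACE_LOG:-/tmp/firewalld.strace}

# Hypothetical helper: run firewalld in the foreground under strace (needs root).
# -f follows child processes, -o writes the trace to a file instead of stderr.
trace_firewalld() {
    systemctl stop firewalld
    strace -f -o "$TRACE_LOG" firewalld --nofork
}

# Hypothetical helper: print the last N lines (default 12) of a trace file.
last_syscalls() {
    tail -n "${2:-12}" "$1"
}
```

As root, call trace_firewalld, interrupt it with Ctrl-C once it appears to hang, then run last_syscalls "$TRACE_LOG" to see where it stopped.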
Eric,
thank you so very much, my issue is SOLVED!!!
I did what you suggested, step by step:
- no /run/ebtables.lock on my machine, not even any *.lock
- increased debug level to 9
- logfile contains just a few debug lines, nothing meaningful
- installed strace and ran firewalld in the foreground, as advised.
Result: the strace log shows that firewalld wanted to access/create/whatever something below an abbreviated path "/var/lib/e...". I found the directory /var/lib/ebtables, containing a single file named "lock" with a timestamp from the last "hard reboot".
DELETED this file, DONE!
Not sure, but probably this file /var/lib/ebtables/lock is the openSUSE "flavour" of the /run/ebtables.lock you mentioned.
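For anyone repeating this, a slightly more cautious version of the deletion could look like the sketch below. It is not what was actually run here. The function name remove_stale_lock is hypothetical, and matching the process name "ebtables" exactly is an assumption that may not hold where the binary has a different name.

```shell
#!/bin/sh
# Hypothetical helper: remove a lock file only if no process with the given
# name (default: ebtables) is currently running.
remove_stale_lock() {
    lock=$1
    proc=${2:-ebtables}
    if [ ! -e "$lock" ]; then
        echo "no lock at $lock"
        return 0
    fi
    if pgrep -x "$proc" >/dev/null 2>&1; then
        echo "$proc is running; leaving $lock alone"
        return 1
    fi
    rm -f -- "$lock" && echo "removed $lock"
}

# Usage on the affected system, as root, with the path from this message:
#   systemctl stop firewalld
#   remove_stale_lock /var/lib/ebtables/lock
#   systemctl start firewalld
```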
Thanks again, great help!
Kind regards, Michael
On 20.02.19 at 23:40, Eric Garver wrote:
[...]
firewalld-users mailing list -- firewalld-users@lists.fedorahosted.org To unsubscribe send an email to firewalld-users-leave@lists.fedorahosted.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedorahosted.org/archives/list/firewalld-users@lists.fedorahos...
On Thu, Feb 21, 2019 at 10:30:33AM +0100, michaelof@rocketmail.com wrote:
> Not sure, but probably this file /var/lib/ebtables/lock is the openSUSE "flavour" of the /run/ebtables.lock you mentioned.
That seems to be the case. I think it's a compile-time option.
FWIW, locking in ebtables was improved recently to help avoid this specific issue. You may want to file a ticket with SUSE to get that fix.
Upstream ebtables fixes:
6a826591878d ("Use flock() for --concurrent option")
068ba959c09b ("Fix locking if LOCKDIR does not exist")
As far as I can tell, these fixes are not in any official releases.
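To illustrate why a flock()-based lock helps after a crash, here is a small sketch using the flock(1) utility from util-linux. It is not the ebtables code, and the helper names are hypothetical. The point is that a lock based on a file merely existing survives a hard reboot, while a flock() lock is tied to an open file descriptor and disappears when the holding process does.

```shell
#!/bin/sh
# Existence-based locking: the lock counts as held whenever the file exists.
existence_says_locked() {
    [ -e "$1" ]
}

# flock-based locking: try to take the lock without blocking (-n).
# This succeeds whenever no live process holds it, even if the file exists.
flock_says_free() {
    flock -n "$1" true
}

# A leftover file stands in for the lock file after a hard reboot.
lock=$(mktemp)
existence_says_locked "$lock" && echo "existence check: locked"
flock_says_free "$lock" && echo "flock check: free"
rm -f "$lock"
```

With the leftover file in place and no process holding it, the existence check reports it as locked, whereas the flock check can still acquire the lock.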
> Thanks again, great help!
Glad I could help.
Eric.