Hi,
just copying from: https://bugzilla.redhat.com/show_bug.cgi?id=1607872#c8
On a Raspberry Pi 3B+, booting from a MicroSD card in a USB adapter works fine. With the same MicroSD card in the Raspberry's own MicroSD slot it boots fine, but at the point where it should display the text login prompt it prints:
[ 70.246299] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
And it is mostly dead (it pings but no sshd works anymore, NumLock works but no login prompt etc.). Raspbian boots fine there even directly from MicroSD.
kernel-5.1.11-300.fc30.aarch64 kernel-5.2.0-0.rc3.git3.1.fc31.aarch64
I do not see this problem mentioned anywhere on this list. Is really nobody using MicroSD directly?
Jan
Hi Jan,
On 21.06.19 12:59, Jan Kratochvil wrote:
Hi,
just copying from: https://bugzilla.redhat.com/show_bug.cgi?id=1607872#c8
this issue isn't related to your problem. This problem occurs only on a specific command.
On Raspberry Pi 3B+ when I boot from MicroSD from USB adapter all works fine. When I use the same MicroSD into the MicroSD slot of Raspberry it boots fine and when it should be display text login prompt it prints:
[ 70.246299] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
I've seen a lot of reports about this issue. Unfortunately this was only reproducible with specific MicroSD cards.
Are you able to reproduce this issue with multiple MicroSDs? What type of MicroSD produce this issue?
And it is mostly dead (it pings but no sshd works anymore, NumLock works but no login prompt etc.). Raspbian boots fine there even directly from MicroSD.
kernel-5.1.11-300.fc30.aarch64 kernel-5.2.0-0.rc3.git3.1.fc31.aarch64
I see nowhere on this list mentioned this problem, really nobody is using MicroSD directly?
I only test with MicroSD cards directly.
Stefan
Jan _______________________________________________ arm mailing list -- arm@lists.fedoraproject.org To unsubscribe send an email to arm-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/arm@lists.fedoraproject.org
On Fri, Jun 21, 2019 at 12:48 PM Jan Kratochvil jan.kratochvil@redhat.com wrote:
Hi,
just copying from: https://bugzilla.redhat.com/show_bug.cgi?id=1607872#c8
On Raspberry Pi 3B+ when I boot from MicroSD from USB adapter all works fine. When I use the same MicroSD into the MicroSD slot of Raspberry it boots fine and when it should be display text login prompt it prints:
[ 70.246299] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
And it is mostly dead (it pings but no sshd works anymore, NumLock works but no login prompt etc.). Raspbian boots fine there even directly from MicroSD.
What make/model of mSD card are you using? I've seen it reported but never seen it myself.
kernel-5.1.11-300.fc30.aarch64 kernel-5.2.0-0.rc3.git3.1.fc31.aarch64
I see nowhere on this list mentioned this problem, really nobody is using MicroSD directly?
I primarily use mSD cards directly, I currently have 12 RPi of different models from the RPi2, original 3 (a couple of gens), the 3B+, 3A+ and even a CM3 based device running various Fedora images on both aarch64 and ARMv7 with no issues. I use two types of card, either Sandisk Ultra or Samsung EVO.
Also, it's possible that underpowered PSUs might cause something like this. I've had reports of issues with SD cards that have gone away with different PSUs. What is the rating of your PSU?
Hello,
further discussion with Peter Robinson was at: https://bugzilla.redhat.com/show_bug.cgi?id=1607872#c11
On Fri, 21 Jun 2019 13:55:39 +0200, Stefan Wahren wrote:
I've seen a lot of reports about this issue. Unfortunately this was only reproducible with specific MicroSD cards.
Are you able to reproduce this issue with multiple MicroSDs? What type of MicroSD produce this issue?
From the Bug:
# that does not matter, the important part is that it works with Raspbian
# while it does not with Fedora.
On Fri, 21 Jun 2019 15:26:11 +0200, Peter Robinson wrote:
I see nowhere on this list mentioned this problem, really nobody is using MicroSD directly?
I primarily use mSD cards directly, I currently have 12 RPi of different models from the RPi2, original 3 (a couple of gens), the 3B+, 3A+ and even a CM3 based device running various Fedora images on both aarch64 and ARMv7 with no issues.
One should also use a wider range of MicroSD cards.
From the Bug:
# I have verified now that it is a regression
# since (as this kernel boots fine directly from my Kingston MicroSD):
# Fedora-Server-29-1.2.aarch64.raw.xz = kernel-4.18.16-300.fc29.aarch64
Also it's possible under powered PSUs might cause something like this, I've had reports of issues with SD cards that have gone away with different PSUs, what is the rating of your PSU?
From the Bug: # Original Raspberry PSU 2.5A
Regards, Jan Kratochvil
Hi Jan,
Am 22.06.19 um 09:04 schrieb Jan Kratochvil:
Hello,
further discussion with Peter Robinson was at: https://bugzilla.redhat.com/show_bug.cgi?id=1607872#c11
On Fri, 21 Jun 2019 13:55:39 +0200, Stefan Wahren wrote:
I've seen a lot of reports about this issue. Unfortunately this was only reproducible with specific MicroSD cards.
Are you able to reproduce this issue with multiple MicroSDs? What type of MicroSD produce this issue?
From the Bug: # that does not matter, the important part is that it works with Raspbian # while it does not with Fedora.
It does matter, because there is zero free documentation about the sdhost controller on the BCM2835. So everything depends on helpful users and a lot of testing. Raspbian uses its downstream sdhost driver bcm2835-sdhost, while Fedora uses the mainline driver bcm2835. This one was derived from the downstream one around kernel 4.12. Thanks for letting me know it's a Kingston card. I have a few Kingston cards and never had a problem, so I wouldn't think there is a general issue with them.
On Fri, 21 Jun 2019 15:26:11 +0200, Peter Robinson wrote:
I see nowhere on this list mentioned this problem, really nobody is using MicroSD directly?
I primarily use mSD cards directly, I currently have 12 RPi of different models from the RPi2, original 3 (a couple of gens), the 3B+, 3A+ and even a CM3 based device running various Fedora images on both aarch64 and ARMv7 with no issues.
One should also use more wide range of MicroSD cards.
From the Bug: # I have verified now that it is a regression # since (as this kernel boots fine directly from my Kingston MicroSD): # Fedora-Server-29-1.2.aarch64.raw.xz = kernel-4.18.16-300.fc29.aarch64
Thanks this is very helpful. Could you please test Fedora kernel 4.19 and 5.0 for aarch64? This would narrow down the issue much more.
Also it's possible under powered PSUs might cause something like this, I've had reports of issues with SD cards that have gone away with different PSUs, what is the rating of your PSU?
From the Bug: # Original Raspberry PSU 2.5A
I don't think this is power related.
Thanks Stefan
Regards, Jan Kratochvil
On Sat, 22 Jun 2019 10:59:29 +0200, Stefan Wahren wrote:
it does matter, because there is zero free documentation about the sdhost controller on the BCM2835.
"Kingston C16G JAPAN, SDC2", bought in 2009.
Raspbian uses their downstream sdhost driver bcm2835-sdhost, while Fedora the mainline driver bcm2835. This one was derived from the downstream one around kernel 4.12.
So why not update the upstream kernel from the downstream one again...
# I have verified now that it is a regression # since (as this kernel boots fine directly from my Kingston MicroSD): # Fedora-Server-29-1.2.aarch64.raw.xz = kernel-4.18.16-300.fc29.aarch64
Thanks this is very helpful. Could you please test Fedora kernel 4.19 and 5.0 for aarch64? This would narrow down the issue much more.
I have tried to put the kernel-4.18.16-300.fc29 driver drivers/mmc/host/bcm2835.c into kernel-5.1.12-300.fc30:
https://people.redhat.com/jkratoch/bcmbug.diff
https://koji.fedoraproject.org/koji/taskinfo?taskID=35710809
And it booted without any error - normally it fails in 60-100 seconds after boot like:
[ 70.246299] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
But then with my custom built kernel I ran 'dnf distro-sync' and it failed again, just later:
[ 1059.688583] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
[ 1059.793528] sdhost-bcm2835 3f202000.mmc: bcm2835_read_wait_sdcmd: timeout (100 ms)
...
[ 1060.341279] sdhost-bcm2835 3f202000.mmc: bcm2835_read_wait_sdcmd: timeout (100 ms)
[ 1060.346235] sdhost-bcm2835 3f202000.mmc: previous command never completed.
[ 1060.350784] print_req_error: I/O error, dev mmcblk0, sector 9965568 flags 1
[ 1060.350832] mmc0: card 495c removed
[ 1060.355420] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:318: I/O error 10 writing to inode 8907 (offset 0 size 2572288 starting block 1245986)
[ 1060.366554] Buffer I/O error on device mmcblk0p3, logical block 932129
OK, I see. Maybe I will give up and just buy a newer MicroSD card...
Thanks, Jan Kratochvil
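The time-to-failure figures quoted above (60-100 seconds normally, about 1059 seconds with the custom kernel) can be pulled out of a saved dmesg capture automatically. The following is a minimal sketch, assuming the log text has been saved as a string; the function name and the sample constant are hypothetical and only the log lines themselves come from the message.

```python
import re

# Matches a dmesg line such as:
# [ 1059.688583] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+sdhost-bcm2835\s+\S+:\s+"
    r"timeout waiting for hardware interrupt"
)


def first_sdhost_timeout(log_text):
    """Return seconds since boot of the first sdhost timeout, or None."""
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            return float(match.group("ts"))
    return None


# Log excerpt taken from the message above (shortened).
SAMPLE = """\
[ 1059.688583] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
[ 1059.793528] sdhost-bcm2835 3f202000.mmc: bcm2835_read_wait_sdcmd: timeout (100 ms)
[ 1060.346235] sdhost-bcm2835 3f202000.mmc: previous command never completed.
[ 1060.350832] mmc0: card 495c removed
"""

if __name__ == "__main__":
    print(first_sdhost_timeout(SAMPLE))
```

Only the first "timeout waiting for hardware interrupt" line is reported, since the later bcm2835_read_wait_sdcmd and I/O errors appear to be follow-on effects of that first timeout.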
Hi Jan,
Am 22.06.19 um 20:40 schrieb Jan Kratochvil:
On Sat, 22 Jun 2019 10:59:29 +0200, Stefan Wahren wrote:
it does matter, because there is zero free documentation about the sdhost controller on the BCM2835.
"Kingston C16G JAPAN, SDC2", bought in 2009.
Raspbian uses their downstream sdhost driver bcm2835-sdhost, while Fedora the mainline driver bcm2835. This one was derived from the downstream one around kernel 4.12.
So why not to update upstream kernel from the downstream one again...
This won't work. During the upstreaming process there have been a lot of changes to get this driver into mainline, so we can't simply merge them again. We need to find the offending upstream commit and revert it.
# I have verified now that it is a regression # since (as this kernel boots fine directly from my Kingston MicroSD): # Fedora-Server-29-1.2.aarch64.raw.xz = kernel-4.18.16-300.fc29.aarch64
Thanks this is very helpful. Could you please test Fedora kernel 4.19 and 5.0 for aarch64? This would narrow down the issue much more.
I have tried to put the kernel-4.18.16-300.fc29 driver drivers/mmc/host/bcm2835.c into kernel-5.1.12-300.fc30:
https://people.redhat.com/jkratoch/bcmbug.diff
https://koji.fedoraproject.org/koji/taskinfo?taskID=35710809
And it booted without any error - normally it fails in 60-100 seconds after boot like:
[ 70.246299] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
But then with my custom built kernel I ran 'dnf distro-sync' and it failed again, just later:
[ 1059.688583] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
[ 1059.793528] sdhost-bcm2835 3f202000.mmc: bcm2835_read_wait_sdcmd: timeout (100 ms)
...
[ 1060.341279] sdhost-bcm2835 3f202000.mmc: bcm2835_read_wait_sdcmd: timeout (100 ms)
[ 1060.346235] sdhost-bcm2835 3f202000.mmc: previous command never completed.
[ 1060.350784] print_req_error: I/O error, dev mmcblk0, sector 9965568 flags 1
[ 1060.350832] mmc0: card 495c removed
[ 1060.355420] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:318: I/O error 10 writing to inode 8907 (offset 0 size 2572288 starting block 1245986)
[ 1060.366554] Buffer I/O error on device mmcblk0p3, logical block 932129
OK, I see maybe I will give up and just buy a newer MicroSD card...
It would be helpful if you could send me this card, and I'll try to fix this.
Stefan
Thanks, Jan Kratochvil
Hi Stefan,
On Sun, 23 Jun 2019 10:36:57 +0200, Stefan Wahren wrote:
It would be helpful to send me this card and i'll try to fix this.
I have tracked it down so far to a regression between:
kernel-5.0.0-300.fc30.aarch64
kernel-5.0.17-300.fc30.aarch64
And drivers/mmc/host/bcm2835.c is the same in both versions.
I will contact you off-list, it would be great if you can fix that.
Thanks, Jan Kratochvil
Hi,
Am 23.06.19 um 10:43 schrieb Jan Kratochvil:
Hi Stefan,
On Sun, 23 Jun 2019 10:36:57 +0200, Stefan Wahren wrote:
It would be helpful to send me this card and i'll try to fix this.
I have tracked it down so far to a regression between: kernel-5.0.0-300.fc30.aarch64 kernel-5.0.17-300.fc30.aarch64
And drivers/mmc/host/bcm2835.c is the same in both versions.
At least the major kernel version makes sense, because 5.0 had a lot of changes to this driver. It might be some timing-critical issue which is triggered by a kernel config change or a change in the mmc core.
Can you please point me to the kernel config of 5.0.17? Are there any additional changes to the mainline kernel?
Regards
I will contact you off-list, it would be great if you can fix that.
Thanks, Jan Kratochvil
On Sun, 23 Jun 2019 11:00:20 +0200, Stefan Wahren wrote:
Am 23.06.19 um 10:43 schrieb Jan Kratochvil:
kernel-5.0.0-300.fc30.aarch64
https://koji.fedoraproject.org/koji/buildinfo?buildID=1219993 wget -O - https://kojipkgs.fedoraproject.org//packages/kernel/5.0.0/300.fc30/aarch64/k... -i --to-stdout ./lib/modules/5.0.0-300.fc30.aarch64/config >config-5.0.0-300.fc30.aarch64
kernel-5.0.17-300.fc30.aarch64
https://koji.fedoraproject.org/koji/buildinfo?buildID=1269307 wget -O - https://kojipkgs.fedoraproject.org//packages/kernel/5.0.17/300.fc30/aarch64/... -i --to-stdout ./lib/modules/5.0.17-300.fc30.aarch64/config >config-5.0.17-300.fc30.aarch64
Can you please point me to the kernel config of 5.0.17?
^
Are there any additional changes to the mainline kernel?
If you mean Fedora patches then yes, there are many.
https://src.fedoraproject.org/rpms/kernel/tree/f30
fedpkg clone -a -b f30 kernel kernel-f30
git clone -b f30 https://src.fedoraproject.org/rpms/kernel.git kernel-f30
I can also try 'vanilla' builds to verify it still happens there.
Thanks, Jan
On Sun, 23 Jun 2019 10:36:57 +0200, Stefan Wahren wrote:
It would be helpful to send me this card and i'll try to fix this.
I have tracked it down so far to a regression between: kernel-5.0.0-300.fc30.aarch64 kernel-5.0.17-300.fc30.aarch64
And drivers/mmc/host/bcm2835.c is the same in both versions.
at least the major kernel version makes sense, because 5.0 had a lot of changes to this driver. It might be some timing critical issue, which is triggered by a kernel config change or a change in the mmc core.
Can you please point me to the kernel config of 5.0.17? Are there any additional changes to the mainline kernel?
There are a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
The patches we had in 5.0.x for RPi were the patch to add support for the 3A+ and the DT improvement series [1].
The ARMv7 and aarch64 configs for 5.0.17 can be found here [2][3], but our RPi configs didn't change much in the 4.19 - 5.0 range. I think the most recent change was the addition of CONFIG_BCM2835_POWER when those changes landed upstream, which I think was the 5.1 merge window.
[1] https://www.spinics.net/lists/arm-kernel/msg699583.html
[2] https://pbrobinson.fedorapeople.org/config-5.0.17-300.fc30.aarch64
[3] https://pbrobinson.fedorapeople.org/config-5.0.17-300.fc30.armv7hl
Regards
I will contact you off-list, it would be great if you can fix that.
Thanks, Jan Kratochvil
On Sun, 23 Jun 2019 12:32:14 +0200, Peter Robinson wrote:
There's a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
kernel-5.0.0-300.fc30.aarch64          PASS
kernel-5.0.0-300.vanilla.fc30.aarch64  PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752827
kernel-5.0.17-300.vanilla.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752457
kernel-5.0.17-300.fc30.aarch64         FAIL
So it looks like a Fedora-specific regression somewhere in the 5.0.x series.
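The reasoning behind that conclusion can be written out as a small check over the four results. This is only a sketch of the inference, with hypothetical names; it assumes each PASS/FAIL is reliable, which may not hold for a timing-dependent failure where a PASS can simply mean the timeout did not occur during the test run.

```python
# Results as reported in the table above; True means PASS.
RESULTS = {
    ("5.0.0", "fedora"): True,
    ("5.0.0", "vanilla"): True,
    ("5.0.17", "vanilla"): True,
    ("5.0.17", "fedora"): False,
}


def classify(results, old, new):
    """Classify a regression between two versions from a 2x2 result matrix."""
    vanilla_regressed = results[(old, "vanilla")] and not results[(new, "vanilla")]
    fedora_regressed = results[(old, "fedora")] and not results[(new, "fedora")]
    if vanilla_regressed:
        # The unpatched kernel already breaks, so upstream changes suffice.
        return "upstream regression"
    if fedora_regressed:
        # Only the patched build breaks: Fedora patches or config are involved,
        # possibly in combination with upstream changes.
        return "fedora-specific regression"
    return "no regression observed"


if __name__ == "__main__":
    print(classify(RESULTS, "5.0.0", "5.0.17"))
```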
I can continue bisecting but a hint is welcome to reduce the number of steps.
Maybe Stefan Wahren is not interested if it is a Fedora specific regression?
(Then sure my testing may have fuzzy results etc. but I haven't noticed anything like that yet.)
Jan
The vanilla kernels I have built with:
https://people.redhat.com/jkratoch/aarch64-vanilla.patch
As the official --with vanilla kernel.spec flag is buggy now:
Error: rpmbuild -bs --with vanilla kernel.spec
https://bugzilla.redhat.com/show_bug.cgi?id=1547553
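To give a feel for the number of bisection steps mentioned above, here is a minimal sketch of a binary search over point releases. It assumes that every release from 5.0.1 to 5.0.17 could be built and tested and that the failure is deterministic once introduced; neither is confirmed in the thread, and the culprit version used in the usage example is purely hypothetical.

```python
def first_bad(candidates, is_bad):
    """Binary search for the first bad entry.

    candidates must be ordered oldest to newest, with the last entry
    known to be bad and everything before the first bad entry good.
    Returns (first_bad_candidate, number_of_tests_run).
    """
    lo, hi = 0, len(candidates) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(candidates[mid]):
            hi = mid          # first bad entry is at mid or earlier
        else:
            lo = mid + 1      # first bad entry is after mid
    return candidates[lo], tests


if __name__ == "__main__":
    # 5.0.0 is known good and 5.0.17 known bad, leaving 17 candidates.
    versions = ["5.0.%d" % n for n in range(1, 18)]
    # Hypothetical culprit, for illustration only.
    culprit, tests = first_bad(versions, lambda v: int(v.split(".")[2]) >= 9)
    print(culprit, tests)
```

With 17 candidates the loop never needs more than five test builds, which is why a hint that rules out part of the range saves comparatively little unless it points straight at a suspect patch.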
Hi Jan,
Am 23.06.19 um 21:22 schrieb Jan Kratochvil:
On Sun, 23 Jun 2019 12:32:14 +0200, Peter Robinson wrote:
There's a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
kernel-5.0.0-300.fc30.aarch64          PASS
kernel-5.0.0-300.vanilla.fc30.aarch64  PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752827
kernel-5.0.17-300.vanilla.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752457
kernel-5.0.17-300.fc30.aarch64         FAIL
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
I can continue bisecting but a hint is welcome to reduce the number of steps.
Maybe Stefan Wahren is not interested if it is a Fedora specific regression?
in case this is only reproducible with Fedora, i suggest that you try to narrow down this issue first.
But I think it is triggered by some kind of timing issue.
Could you please try to revert these patches step by step, starting with kernel-5.0.17-300.fc30.aarch64:
f6000a4eb34e6462bc0dd39809c1bb99f9633269 mmc: bcm2835: reset host on timeout
07d405769afea5718529fc9e341f0b13b3189b6f mmc: bcm2835: Recover from MMC_SEND_EXT_CSD
af19b7ce76ba220f358c82b0a5e7d68909a23aa5 mmc: bcm2835: Avoid possible races on data requests
37fefadee8bb665ae337a15aa635dabff9f66ade mmc: bcm2835: Terminate timeout work synchronously
6dc6f2619017109e45550accc120f823fdc31c3e mmc: bcm2835: Refactor dma_map_sg handling
2f5da678351f0d504966fab113968202aa5713fb mmc: bcm2835: Properly handle dmaengine_prep_slave_sg
f7da7782aba92593f7b82f03d2409a1c5f4db91b dmaengine: bcm2835: Fix interrupt race on RT
9e528c799d17a4ac37d788c81440b50377dd592d dmaengine: bcm2835: Fix abort of transactions
3e05ada043828c5880c88789c824e3d40d6830cb dmaengine: bcm2835: Return void from abort of transactions
Maybe one of them is causing the issue.
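One way to organise the step-by-step revert is to generate one command per step and rebuild and retest after each. The sketch below only prints the commands; it assumes a git checkout of the kernel source in which these hashes are reachable and that the commits are reverted in the order listed, neither of which the message states explicitly. Reverts may also conflict and need manual resolution.

```python
# Commits exactly as listed in the message above, in the same order.
SUSPECT_COMMITS = [
    ("f6000a4eb34e6462bc0dd39809c1bb99f9633269", "mmc: bcm2835: reset host on timeout"),
    ("07d405769afea5718529fc9e341f0b13b3189b6f", "mmc: bcm2835: Recover from MMC_SEND_EXT_CSD"),
    ("af19b7ce76ba220f358c82b0a5e7d68909a23aa5", "mmc: bcm2835: Avoid possible races on data requests"),
    ("37fefadee8bb665ae337a15aa635dabff9f66ade", "mmc: bcm2835: Terminate timeout work synchronously"),
    ("6dc6f2619017109e45550accc120f823fdc31c3e", "mmc: bcm2835: Refactor dma_map_sg handling"),
    ("2f5da678351f0d504966fab113968202aa5713fb", "mmc: bcm2835: Properly handle dmaengine_prep_slave_sg"),
    ("f7da7782aba92593f7b82f03d2409a1c5f4db91b", "dmaengine: bcm2835: Fix interrupt race on RT"),
    ("9e528c799d17a4ac37d788c81440b50377dd592d", "dmaengine: bcm2835: Fix abort of transactions"),
    ("3e05ada043828c5880c88789c824e3d40d6830cb", "dmaengine: bcm2835: Return void from abort of transactions"),
]


def revert_steps(commits):
    """Return one 'git revert' command per step, each with its subject as a comment."""
    return [f"git revert --no-edit {sha}  # {subject}" for sha, subject in commits]


if __name__ == "__main__":
    for number, command in enumerate(revert_steps(SUSPECT_COMMITS), start=1):
        # Build and boot-test the kernel after each step before continuing.
        print(f"step {number}: {command}")
```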
Thanks Stefan
(Then sure my testing may have fuzzy results etc. but I haven't noticed anything like that yet.)
Jan
The vanilla kernels I have built with: https://people.redhat.com/jkratoch/aarch64-vanilla.patch As the official --with vanilla kernel.spec flag is buggy now: Error: rpmbuild -bs --with vanilla kernel.spec https://bugzilla.redhat.com/show_bug.cgi?id=1547553
On Sun, 23 Jun 2019 12:32:14 +0200, Peter Robinson wrote:
There's a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
kernel-5.0.0-300.fc30.aarch64          PASS
kernel-5.0.0-300.vanilla.fc30.aarch64  PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752827
kernel-5.0.17-300.vanilla.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752457
kernel-5.0.17-300.fc30.aarch64         FAIL
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
I can continue bisecting but a hint is welcome to reduce the number of steps.
Maybe Stefan Wahren is not interested if it is a Fedora specific regression?
(Then sure my testing may have fuzzy results etc. but I haven't noticed anything like that yet.)
Have you tested 5.1.x like .12 or the latest 5.2rc5 releases?
Jan
The vanilla kernels I have built with: https://people.redhat.com/jkratoch/aarch64-vanilla.patch As the official --with vanilla kernel.spec flag is buggy now: Error: rpmbuild -bs --with vanilla kernel.spec https://bugzilla.redhat.com/show_bug.cgi?id=1547553
On Sun, 23 Jun 2019 21:53:07 +0200, Peter Robinson wrote:
Have you tested 5.1.x like .12 or the latest 5.2rc5 releases?
In my OP I have tested (as failing): kernel-5.2.0-0.rc3.git3.1.fc31.aarch64
I have also tested (as failing): kernel-5.1.11-300.fc30.aarch64 (not yet tested 5.1.12)
Jan
Am 23.06.19 um 21:22 schrieb Jan Kratochvil:
On Sun, 23 Jun 2019 12:32:14 +0200, Peter Robinson wrote:
There's a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
kernel-5.0.0-300.fc30.aarch64          PASS
kernel-5.0.0-300.vanilla.fc30.aarch64  PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752827
kernel-5.0.17-300.vanilla.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752457
kernel-5.0.17-300.fc30.aarch64         FAIL
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
Please wait, i see that fc30 still uses this experimental cpufreq patch, which is very dangerous.
https://src.fedoraproject.org/rpms/kernel/blob/f30/f/bcm2835-cpufreq-add-CPU...
Please try to remove this patch at first
I can continue bisecting but a hint is welcome to reduce the number of steps.
Maybe Stefan Wahren is not interested if it is a Fedora specific regression?
(Then sure my testing may have fuzzy results etc. but I haven't noticed anything like that yet.)
Jan
The vanilla kernels I have built with: https://people.redhat.com/jkratoch/aarch64-vanilla.patch As the official --with vanilla kernel.spec flag is buggy now: Error: rpmbuild -bs --with vanilla kernel.spec https://bugzilla.redhat.com/show_bug.cgi?id=1547553
Am 23.06.19 um 21:22 schrieb Jan Kratochvil:
On Sun, 23 Jun 2019 12:32:14 +0200, Peter Robinson wrote:
There's a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
kernel-5.0.0-300.fc30.aarch64          PASS
kernel-5.0.0-300.vanilla.fc30.aarch64  PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752827
kernel-5.0.17-300.vanilla.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752457
kernel-5.0.17-300.fc30.aarch64         FAIL
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
Please wait, i see that fc30 still uses this experimental cpufreq patch, which is very dangerous.
https://src.fedoraproject.org/rpms/kernel/blob/f30/f/bcm2835-cpufreq-add-CPU...
Please try to remove this patch at first
That hasn't changed for several kernel releases; we've had it there for some time, in F-29 too.
I can continue bisecting but a hint is welcome to reduce the number of steps.
Maybe Stefan Wahren is not interested if it is a Fedora specific regression?
(Then sure my testing may have fuzzy results etc. but I haven't noticed anything like that yet.)
Jan
The vanilla kernels I have built with: https://people.redhat.com/jkratoch/aarch64-vanilla.patch As the official --with vanilla kernel.spec flag is buggy now: Error: rpmbuild -bs --with vanilla kernel.spec https://bugzilla.redhat.com/show_bug.cgi?id=1547553
Am 23.06.19 um 22:14 schrieb Peter Robinson:
Am 23.06.19 um 21:22 schrieb Jan Kratochvil:
On Sun, 23 Jun 2019 12:32:14 +0200, Peter Robinson wrote:
There's a few Fedora patches but nothing specific to mmc or the RPi mmc/sdhci modules.
kernel-5.0.0-300.fc30.aarch64          PASS
kernel-5.0.0-300.vanilla.fc30.aarch64  PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752827
kernel-5.0.17-300.vanilla.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35752457
kernel-5.0.17-300.fc30.aarch64         FAIL
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
Please wait, i see that fc30 still uses this experimental cpufreq patch, which is very dangerous.
https://src.fedoraproject.org/rpms/kernel/blob/f30/f/bcm2835-cpufreq-add-CPU...
Please try to remove this patch at first
That hasn't changed in some kernels, we've had that there for some time, in F-29 too
But it is still dangerous and has negative side effects on the sdhost interface. At the very least we should test whether it has any influence on this issue.
I can continue bisecting but a hint is welcome to reduce the number of steps.
Maybe Stefan Wahren is not interested if it is a Fedora specific regression?
(Then sure my testing may have fuzzy results etc. but I haven't noticed anything like that yet.)
Jan
The vanilla kernels I have built with: https://people.redhat.com/jkratoch/aarch64-vanilla.patch As the official --with vanilla kernel.spec flag is buggy now: Error: rpmbuild -bs --with vanilla kernel.spec https://bugzilla.redhat.com/show_bug.cgi?id=1547553
On Sun, 23 Jun 2019 22:11:55 +0200, Stefan Wahren wrote:
Am 23.06.19 um 21:22 schrieb Jan Kratochvil:
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
Please wait, i see that fc30 still uses this experimental cpufreq patch, which is very dangerous.
https://src.fedoraproject.org/rpms/kernel/blob/f30/f/bcm2835-cpufreq-add-CPU...
Please try to remove this patch at first
That's it!
e9086bdbaaa1f966291adc784f375cc3a24c5762
kernel-5.1.12-300.nocpufreq.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35768074
Only now do I understand why it proceeds successfully through the whole boot and breaks with that error only after it becomes idle at the login screen:
sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
I have now seen the discussion at https://github.com/lategoodbye/rpi-zero/issues/32. There seems to be a fundamental problem in that all drivers need an update for that frequency scaling, so maybe sdhost-bcm2835 also still needs an update.
Thanks, Jan
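If the idle-time failure is indeed tied to frequency scaling, as suspected above, it can help to confirm whether a cpufreq driver is active on the running kernel at all. The sketch below reads the standard Linux cpufreq sysfs files for cpu0; the function name is hypothetical, and on a kernel built without the cpufreq patch the directory is expected to be absent, in which case the function returns None.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location for the first CPU.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq(base=CPUFREQ_DIR):
    """Return the governor and current frequency in kHz, or None if unavailable."""
    try:
        governor = (base / "scaling_governor").read_text().strip()
        cur_khz = int((base / "scaling_cur_freq").read_text().strip())
    except (OSError, ValueError):
        # No cpufreq driver loaded, or the files could not be parsed.
        return None
    return {"governor": governor, "cur_khz": cur_khz}


if __name__ == "__main__":
    state = read_cpufreq()
    if state is None:
        print("cpufreq not active on this kernel")
    else:
        print(f"governor={state['governor']} current={state['cur_khz']} kHz")
```

Sampling this once under load and once at the idle login prompt would show whether the clock actually drops around the time the timeout appears.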
On Mon, Jun 24, 2019 at 7:47 AM Jan Kratochvil jan.kratochvil@redhat.com wrote:
On Sun, 23 Jun 2019 22:11:55 +0200, Stefan Wahren wrote:
Am 23.06.19 um 21:22 schrieb Jan Kratochvil:
So it looks as a Fedora-specific regression somewhere in the 5.0.x series.
Please wait, i see that fc30 still uses this experimental cpufreq patch, which is very dangerous.
https://src.fedoraproject.org/rpms/kernel/blob/f30/f/bcm2835-cpufreq-add-CPU...
Please try to remove this patch at first
That's it! e9086bdbaaa1f966291adc784f375cc3a24c5762 kernel-5.1.12-300.nocpufreq.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35768074
Only now I understand why it successfully proceeds with the whole boot and it breaks by that error only after it becomes idle at the login screen. sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
I have seen now a discussion https://github.com/lategoodbye/rpi-zero/issues/32 there is some principial problem all drivers need an update for that frequency scaling, so maybe sdhost-bcm2835 also still needs an update.
It's weird it only started causing issues late in the 5.0.x cycle as that driver hasn't changed for some time and we've shipped it for a couple of releases now (since June 2018).
There's a different driver in the 5.2 rc series from 5.2.0-0.rc4.git2.1 and later builds, you might want to try rc5
https://koji.fedoraproject.org/koji/buildinfo?buildID=1288824
On Mon, 24 Jun 2019 09:12:40 +0200, Peter Robinson wrote:
There's a different driver in the 5.2 rc series from 5.2.0-0.rc4.git2.1 and later builds, you might want to try rc5
https://koji.fedoraproject.org/koji/buildinfo?buildID=1288824
Yes, that works. Could you backport that to the latest stable Fedora release? Thanks.
Jan
On Mon, 24 Jun 2019 10:22:33 +0200, Jan Kratochvil wrote:
On Mon, 24 Jun 2019 09:12:40 +0200, Peter Robinson wrote:
There's a different driver in the 5.2 rc series from 5.2.0-0.rc4.git2.1 and later builds, you might want to try rc5
https://koji.fedoraproject.org/koji/buildinfo?buildID=1288824
Yes, that works. Could you backport that to the latest stable Fedora release?
BTW after ~6 hours of uptime kernel-5.2.0-0.rc5.git0.1.fc31.aarch64 died again with:
[22937.548815] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
It was probably heavily swapping to the MicroSD card (tried running Gnome3+Firefox there).
I will try running it with kernel-5.1.12-300.nocpufreq.fc30.aarch64 instead.
Jan
On Tue, 25 Jun 2019 23:05:09 +0200, Jan Kratochvil wrote:
BTW after ~6 hours of uptime kernel-5.2.0-0.rc5.git0.1.fc31.aarch64 died again with: [22937.548815] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
It was probably heavily swapping to the MicroSD card (tried running Gnome3+Firefox there).
I will try running it with kernel-5.1.12-300.nocpufreq.fc30.aarch64 instead.
And in 1.5 hours today it happened even with kernel-5.1.12-300.nocpufreq.fc30.aarch64. So it is only a matter of how often it happens (with this MicroSD card; I still haven't tried buying another one).
It is a bit curious that the controller is not restartable/resettable, but then yes, it is Free software, so I can try to code it that way myself.
Jan
Hi Jan,
Am 26.06.19 um 16:58 schrieb Jan Kratochvil:
On Tue, 25 Jun 2019 23:05:09 +0200, Jan Kratochvil wrote:
BTW after ~6 hours of uptime kernel-5.2.0-0.rc5.git0.1.fc31.aarch64 died again with: [22937.548815] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
It was probably heavily swapping to the MicroSD card (tried running Gnome3+Firefox there).
I will try running it with kernel-5.1.12-300.nocpufreq.fc30.aarch64 instead.
And in 1.5 hours today it happened even with kernel-5.1.12-300.nocpufreq.fc30.aarch64. So it only matters how often (with this MicroSD card but I still haven't tried to buy any other one).
you still have the option to send me the card.
A bit curious the controller is not restartable/resettable but then yes, it is Free software so I can try to code it that way myself.
I'm not sure this is really a problem with the controller.
Stefan
Jan
On Wed, 26 Jun 2019 16:58:32 +0200, Jan Kratochvil wrote:
And in 1.5 hours today it happened even with kernel-5.1.12-300.nocpufreq.fc30.aarch64. So it only matters how often
As a summary: after some time I returned to the Raspberry to check it further, and it all works perfectly now with updated F-30 running:
kernel-5.2.8-200.fc30.aarch64
It even has very fast SD card writes (7.8MiB/s) compared to old kernels.
When I tried to test the original kernel:
kernel-5.1.11-300.fc30.aarch64
it does not even boot, reporting again:
sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
With the new kernel this message is no longer printed, even when still running on that 10-year-old 16GB MicroSD card.
Regards, Jan
(That NoIR Raspberry camera module locks up the kernel, but that is off-topic here.)
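For anyone wanting to compare SD card write speed across kernels, as in the 7.8 MiB/s figure above, here is a minimal sketch. It is not how that number was measured in this thread; the function name `measure_write_mib_per_s`, the default sizes, and the choice to fsync before stopping the clock are all assumptions made for illustration.

```python
import os
import tempfile
import time


def measure_write_mib_per_s(directory, total_bytes=16 * 1024 * 1024,
                            block_size=1024 * 1024):
    """Write total_bytes to a temporary file in `directory` and return MiB/s.

    Hypothetical helper: the file is fsync'ed before timing stops so that
    data still sitting in the page cache is not counted as written.
    """
    block = os.urandom(block_size)  # incompressible data
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        written = 0
        with os.fdopen(fd, "wb") as f:
            while written < total_bytes:
                chunk = block[: min(block_size, total_bytes - written)]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.monotonic() - start
    finally:
        os.unlink(path)  # always remove the test file
    return written / (1024 * 1024) / max(elapsed, 1e-9)
```

Calling it with a directory that lives on the MicroSD card and a `total_bytes` well above the default should give a rough sequential-write figure; small sizes mostly measure overhead.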
Hi Jan,
On 22.08.19 11:31, Jan Kratochvil wrote:
On Wed, 26 Jun 2019 16:58:32 +0200, Jan Kratochvil wrote:
And today, within 1.5 hours, it happened even with kernel-5.1.12-300.nocpufreq.fc30.aarch64. So the kernel only affects how often it happens
To summarize: after some time I returned to the Raspberry Pi to check it further, and it all works perfectly now with updated F-30 running kernel-5.2.8-200.fc30.aarch64. It even has very fast SD card writes (7.8 MiB/s) compared to old kernels.
Those must have been very old kernels (before the introduction of the sdhost driver), or DMA wasn't activated.
When I tried to test the original kernel, kernel-5.1.11-300.fc30.aarch64, it does not even boot, again reporting: sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
With the new kernel this message is no longer printed, even though it is still running on that 10-year-old 16GB MicroSD card.
Hard to say whether this issue is really fixed now.
Regards, Jan
(That NoIR Raspberry camera module locks up the kernel, but that is off-topic here.)
This issue is new to me, please report to the list.
Regards Stefan
On Thu, Aug 22, 2019 at 10:59 AM Stefan Wahren stefan.wahren@i2se.com wrote:
(That NoIR Raspberry camera module locks up the kernel, but that is off-topic here.)
This issue is new to me, please report to the list.
News to me too, I have a standard camera module attached to a 3A without issue. Let's start a different thread on this.
Hi Jan,
On 24.06.2019 at 08:47, Jan Kratochvil wrote:
On Sun, 23 Jun 2019 22:11:55 +0200, Stefan Wahren wrote:
On 23.06.19 at 21:22, Jan Kratochvil wrote:
So it looks like a Fedora-specific regression somewhere in the 5.0.x series.
Please wait, I see that fc30 still uses this experimental cpufreq patch, which is very dangerous.
https://src.fedoraproject.org/rpms/kernel/blob/f30/f/bcm2835-cpufreq-add-CPU...
Please try removing this patch first.
That's it! e9086bdbaaa1f966291adc784f375cc3a24c5762 kernel-5.1.12-300.nocpufreq.fc30.aarch64 PASS https://koji.fedoraproject.org/koji/taskinfo?taskID=35768074
Only now do I understand why it gets through the whole boot successfully and breaks with that error only after it becomes idle at the login screen: sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt.
I have now seen the discussion at https://github.com/lategoodbye/rpi-zero/issues/32; there is a fundamental problem in that all drivers need an update for that frequency scaling, so maybe sdhost-bcm2835 also still needs an update.
This problem only exists with Peter's patch. The latest approach, V4 by Nicolas, only touches the ARM clock, so sdhost-bcm2835 doesn't need the downstream workaround.
Stefan
Thanks, Jan
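For anyone checking their own logs for this failure, a minimal sketch of a scanner for the timeout message follows. The message text and the `[seconds.micros]` timestamp prefix are taken from the lines quoted in this thread; the function name `find_sdhost_timeouts` and the assumption that every log line carries that prefix are illustrative only.

```python
import re

# Pattern modelled on the lines quoted in this thread, e.g.
# "[ 70.246299] sdhost-bcm2835 3f202000.mmc: timeout waiting for hardware interrupt."
TIMEOUT_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+sdhost-bcm2835\s+(?P<dev>\S+):\s+"
    r"timeout waiting for hardware interrupt"
)


def find_sdhost_timeouts(log_text):
    """Return (uptime_seconds, device) for each sdhost timeout line found.

    Hypothetical helper: lines without the bracketed timestamp prefix
    are ignored.
    """
    hits = []
    for line in log_text.splitlines():
        match = TIMEOUT_RE.match(line.strip())
        if match:
            hits.append((float(match.group("ts")), match.group("dev")))
    return hits
```

Feeding it saved kernel log output from a few boots shows how long after boot each timeout occurred, which helps when comparing how often different kernels hit the problem.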
Yes, and we've already moved to the other driver in rawhide, given that it is going upstream. I wanted it to bake there for some time. The other driver, while known to be far from perfect, has worked fine without a single reported issue until now. We've carried that patch since the 4.17 RC series, and it has saved me from the complaint "The Raspberry Pi is so slow on Fedora", which I used to get anywhere from 30 to 100 times a week (yes, I used to log them), along with verbal abuse and, once, physical abuse, so I'm not going to apologise for a single problem on a 10-year-old SD card which enabled me to work on other things in my spare time.
We will move the stable releases to the new driver too. I'm as yet undecided whether this will be on the 5.1.x series or when Fedora rebases to 5.2, at around the 5.2.3 time frame; that driver is quite new and has only been in rawhide for 10 days.
Peter