Apparently, a few ipq40xx devices have sporadic problems when reading the
flash over SPI. When that happens, the result of the faulty SPI read is
cached and it isn't re-attempted. Depending on when it happens, the router
either panics and reboots or is left in a partially broken state (an
application wont start).
The data on the flash is alright.
This wasn't the case with Openwrt with Linux < 5.x but I wasn't able to
work out which software change was responsible.
Github user karlpip created a patch for testing that disabled the cache
entirely and added logs. Typically, only one or two SPI operations fail at
a time:
[689200.631152] spi-nor spi0.0: SPI transfer failed: -110
[689200.631280] spi_master spi0: failed to transfer one message from queue
[689200.635369] jffs2: Write of 68 bytes at 0x00ffccf4 failed. returned -110, retlen 0
[689200.642014] jffs2: Not marking the space at 0x00ffccf4 as dirty because the flash driver returned retlen zero
Because reads aren't re-attempted, squashfs can't recover:
[3171844.279235] SQUASHFS error: Failed to read block 0x2bb912: -5
[3171844.279284] SQUASHFS error: Unable to read fragment cache entry [2bb912]
[3171844.283980] SQUASHFS error: Unable to read page, block 2bb912, size 14e6c
[3171844.291650] SQUASHFS error: Unable to read fragment cache entry [2bb912]
[3171844.297831] SQUASHFS error: Unable to read page, block 2bb912, size 14e6c
I assume there to be some kind of underlying electrical problem because,
in my experience, this happens a lot more when PoE is used.
NoTengoBattery has made an in-depth investigation:
https://forum.openwrt.org/t/patch-squashfs-data-probably-corrupt/70480
.. and created a patch that evicts the page cache and retries reading:
https://github.com/NoTengoBattery/openwrt/blob/linksys-ea6350v3-mastertrack/target/linux/ipq40xx/patches-5.4/9996-fs_squashfs_improve_squashfs_error_resistance.patch
The patch also works well with the WPJ428 but NoTengoBattery didn't try to
upstream it ("This is not the solution that should be used").
In 2020, I tried and failed to create a working patch that prevents faulty pages to
be cached in the first place. Because I needed a solution, I backported
"squashfs: add option to panic on errors " (10dde05b89980ef)
which has since become available in Openwrt.
The 'error=panic' option has been tested on a fleet of multiple hundred
WPJ428s over multiple years. Without this patch, devices regularly went
into 'limbo' on reboot or update and required a manual reboot.
Devices with this patch don't. I was initially concerned that the kernel
panic would leave devices with a real corrupted data but I haven't seen a
case of actual corruption since (outside of people turning off the power
during upgrades).
The WPJ428 is the only device I tested this patch on - others might also
benefit.
Reviewed-by: Robert Marko <robimarko@gmail.com>
Signed-off-by: Leon M. Busch-George <leon@georgemail.eu>
Convert IPQ40xx boards to DSA setup.
Signed-off-by: Leon M. George <leon@georgemail.eu>
Signed-off-by: Lech Perczak <lech.perczak@gmail.com>
Signed-off-by: Nick Hainke <vincent@systemli.org>
Signed-off-by: ChunAm See <z1250747241@gmail.com>
Signed-off-by: Jeff Kletsky <git-commits@allycomm.com>
Signed-off-by: Andrew Sim <andrewsimz@gmail.com>
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
moves extraction entries out of 11-ath10k-caldata and into
the individual board's device-tree.
Some notes:
- mmc could work as well (not tested)
- devices that pass the partitions via mtdparts
bootargs are kept as is
- gl-b2200 has a weird pcie wifi device
(vendor claims 9886 wave 2. But firmware-extraction
was for a wave 1 device?!)
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Like in the previous patches for ath79 and ramips, this will remove
the "devicename" from LED labels in ipq40xx.
The devicename is removed in DTS files and 01_leds, and a migration
script is added. While at it, also harmonize capitalization of
wlan2G/wlan5G vs. wlan2g/wlan5g.
Signed-off-by: Adrian Schmutzler <freifunk@adrianschmutzler.de>
The DTS files in files-4.19 and files-5.4 are exactly identical
except for one file (qcom-ipq4018-emr3500.dts), which is only
present for 5.4.
Since there is no point in maintaining all these identical files
twice, this patch moves them to the "files" directory.
If there ever was a new kernel with substantial DTS changes, a
new folder would need to be introduced anyway and could easily be
done.
Signed-off-by: Adrian Schmutzler <freifunk@adrianschmutzler.de>