Hi!
I've put a fuller explanation of the bug in a bugzilla comment[1], and I've
included the text below. Please let us know if you have any additional
questions.
============================================================
The problem was discovered when, after the cache had been initialized,
a user issued a command to add an additional data device to the pool.
This resulted in the following assertion failure:
stratisd[6378]: thread 'stratis-wt-6' panicked at 'assertion failed: `(left == right)`
stratisd[6378]:   left: `Device { major: 252, minor: 7 }`,
stratisd[6378]:  right: `Device { major: 252, minor: 0 }`', src/engine/strat_engine/thinpool/thinpool.rs:516:9
The assertion that failed was:
assert_eq!(
    backstore.device().expect(
        "thinpool exists and has been allocated to, so backstore must have a cap device"
    ),
    self.backstore_device
);
This assertion checks that the device from which the thinpool allocates
its component devices is the same as the one that the backstore
considers to be its cap device, that is, the device to be allocated from.
The assertion itself was quite correct, and effective in identifying the bug.
The assertion failed because the backstore's device was the cache
device, but the thinpool was still allocating from the cap device it
had been using before the cache was initialized, and was therefore
bypassing the cache. There is no risk of data corruption in this case;
the problem is that the cache is unused and so no performance benefit
is gained.
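
To make the invariant concrete, here is a minimal sketch of the
relationship the assertion checks. It is a simplified model written for
this explanation only: the Backstore and ThinPool structs, init_cache,
is_consistent and sync_with are hypothetical names, not stratisd's
actual types or methods. The device numbers are the ones used in the
example in this message.

```rust
// Hypothetical, simplified model; these are NOT stratisd's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Device {
    major: u32,
    minor: u32,
}

struct Backstore {
    cap: Device,
    cache: Option<Device>,
}

impl Backstore {
    // The device that upper layers should allocate from: the cache
    // device if a cache has been initialized, otherwise the cap device.
    fn device(&self) -> Device {
        self.cache.unwrap_or(self.cap)
    }

    // Assumed stand-in for cache initialization.
    fn init_cache(&mut self, cache: Device) {
        self.cache = Some(cache);
    }
}

struct ThinPool {
    // The thinpool's remembered copy of the device it allocates from.
    backstore_device: Device,
}

impl ThinPool {
    fn new(backstore: &Backstore) -> ThinPool {
        ThinPool { backstore_device: backstore.device() }
    }

    // The condition the failing assertion checks.
    fn is_consistent(&self, backstore: &Backstore) -> bool {
        self.backstore_device == backstore.device()
    }

    // Re-point the thinpool at the backstore's current device.
    // Returns true if the remembered device had to change.
    fn sync_with(&mut self, backstore: &Backstore) -> bool {
        let current = backstore.device();
        let changed = self.backstore_device != current;
        self.backstore_device = current;
        changed
    }
}

fn main() {
    let mut backstore = Backstore {
        cap: Device { major: 253, minor: 0 },
        cache: None,
    };
    let mut pool = ThinPool::new(&backstore);
    assert!(pool.is_consistent(&backstore));

    // Initializing the cache changes what the backstore reports, but
    // the thinpool still remembers the old cap device.
    backstore.init_cache(Device { major: 253, minor: 7 });
    assert!(!pool.is_consistent(&backstore));

    // Updating the thinpool restores the invariant.
    assert!(pool.sync_with(&backstore));
    assert!(pool.is_consistent(&backstore));
    println!("thinpool allocates from {:?}", pool.backstore_device);
}
```

In this model the bug corresponds to skipping the sync_with step after
init_cache: nothing is corrupted, but the thinpool keeps using the cap
device until something brings the two views back into agreement.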
In the example above, where the operation is correct, before the cache
is initialized the thinpool devices, 253:1 and 253:2, are allocated
from the cap device, 253:0. After the cache is initialized, the
thinpool devices are allocated from the cache device, 253:7, instead.
This is the correct configuration.
Previously, 253:1 and 253:2 would continue to allocate from 253:0 even
though the cache device, 253:7, was constructed properly.
Because the pool metadata had been written properly, any action that
destroyed and rebuilt the device stack would cause the thinpool devices
to be set up properly to allocate from the cache device. A reboot, for
example, would certainly cause the device stack to be reconstructed
correctly.
Because of the particular nature of the code defect that caused the
bug, adding an additional device to the cache would also cause the
thinpool devices to be allocated properly from the cache device.
=====================================================================
- mulhern
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2007018#c4
On Mon, Nov 15, 2021 at 11:28 PM Ryan Gonzalez <rymg19(a)gmail.com> wrote:
> stratisd was not immediately updating the devicemapper device stack when a cache was
> initialized with the result that the cache was not immediately put in use
Out of curiosity, what were the practical effects of this issue? Was it just degraded
performance in some cases?
-- Ryan
https://refi64.com/
On Nov 15, 2021, 8:25 PM -0600, the Mulhern <amulhern(a)redhat.com>, wrote:
Hi!
Stratis 3.0.0, which includes new versions of stratisd and stratis-cli
has been released.
Please see the blog post[1] for details of the release.
Thanks for your continued interest in the Stratis project.
- mulhern
[1] https://stratis-storage.github.io/stratis-release-notes-3-0-0/
_______________________________________________
stratis-devel mailing list -- stratis-devel(a)lists.fedorahosted.org