diff mbox series

core: util: add retry for mount to avoid udev timing

Message ID PA4PR01MB76467B0DF885133EEC19AFD388652@PA4PR01MB7646.eurprd01.prod.exchangelabs.com
State Changes Requested
Headers show
Series core: util: add retry for mount to avoid udev timing | expand

Commit Message

Daniel Matt Sept. 13, 2024, 8:41 a.m. UTC
After partitioning, there are cases in which swupdate is faster than
udev can create the /dev/disk symbolic links. If these symlinks are used
in the sw-description, the devices will not be available at that time,
causing the update to fail. Implementing a retry mechanism resolves this
issue.

Signed-off-by: Daniel Matt <daniel.matt@kuka.com>
---
core/util.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

{
#if defined(__linux__)
-              return mount(device, dir, fstype, 0, NULL);
+             int ret;
+
+             for (int i = 0; i < MAX_AMOUNT_MOUNT_TRIES; i++) {
+                             ret = mount(device, dir, fstype, 0, NULL);
+
+                             if (ret == -1 && errno == ENOENT) {
+                                             WARN("Device %s not found, retry after delay", device);
+                                             sleep(1);
+                             } else {
+                                             break;
+                             }
+             }
+
+             return ret;
#elif defined(__FreeBSD__)
               int iovlen = 8;
               struct iovec iov[iovlen];
--
2.39.2



Internal

Comments

Stefano Babic Sept. 13, 2024, 10:02 a.m. UTC | #1
Hi Daniel,

On 13.09.24 10:41, 'Daniel Matt' via swupdate wrote:
> After partitioning, there are cases in which swupdate is faster than
>
> udev can create the /dev/disk symbolic links. If these symlinks are used
>
> in the sw-description,
> the devices will not be available at that time,
>
> causing the update to fail. Implementing a retry mechanism resolves this
>
> issue.

This is a specific use case.

>
> Signed-off-by: Daniel Matt <daniel.matt@kuka.com>
>
> ---
>
> core/util.c | 17 ++++++++++++++++-
>
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/core/util.c b/core/util.c
>
> index 2316a57..75810a3 100644
>
> --- a/core/util.c
>
> +++ b/core/util.c
>
> @@ -52,6 +52,8 @@ struct decryption_key {
>
>                 unsigned char ivt[AES_BLK_SIZE];
>
> };
>
> +#define MAX_AMOUNT_MOUNT_TRIES            2

It seems to me a sort of hack / work-around to fix the issue on just
your project. The first question is why is 2 a good number (apart it
fixes your project), and what happens on different projects where it
takes maybe longer.

And this is a second work-around to solve this.

So, when a new partition table is generated, kernel must reread it. This
is already forced by SWUpdate, but of course the new table should be
propagated to all processes like udev that are using it.

The cause then is in the new partition table, not in the mount command.

https://github.com/sbabic/swupdate/blob/master/handlers/diskpart_handler.c#L1467

Your patch is just doing the same, adding an additional second but
somewhere else in code (the additional sleep is what fixes your issue).
It is then clear that a "default" sleep cannot be set, and I can imagine
projects then where even your patch is not enough. Then we can have
another solution and we can make the sleep programmable and set
explicitely the time into sw-description. Something like this:


partitions: (
{
	type = "diskpart";
	device = <....>;
	properties: {
	  labeltype = "gpt";
		partition-1 = ["size=2G", ....];
		partition-2 = ["name=OEM",....];
		wait = "3"; // Time for system to reload part table
	}

This should solve your issue and can solve any other project that
requires more time instead of hard-coding as it is now.

What do you think ?

Best regards,
Stefano Babic

>
> +
>
> static struct decryption_key *aes_key = NULL;
>
>   /*
>
> @@ -757,7 +759,20 @@ bool strtobool(const char *s)
>
> int swupdate_mount(const char *device, const char *dir, const char *fstype)
>
> {
>
> #if defined(__linux__)
>
> -              return mount(device, dir, fstype, 0, NULL);
>
> +             int ret;
>
> +
>
> +             for (int i = 0; i < MAX_AMOUNT_MOUNT_TRIES; i++) {
>
> +                             ret = mount(device, dir, fstype, 0, NULL);
>
> +
>
> +                             if (ret == -1 && errno == ENOENT) {
>
> +                                             WARN("Device %s not found,
> retry after delay", device);
>
> +                                             sleep(1);
>
> +                             } else {
>
> +                                             break;
>
> +                             }
>
> +             }
>
> +
>
> +             return ret;
>
> #elif defined(__FreeBSD__)
>
>                 int iovlen = 8;
>
> struct iovec iov[iovlen];
>
> --
>
> 2.39.2
>
>
> Internal
>
> --
> You received this message because you are subscribed to the Google
> Groups "swupdate" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to swupdate+unsubscribe@googlegroups.com
> <mailto:swupdate+unsubscribe@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/swupdate/PA4PR01MB76467B0DF885133EEC19AFD388652%40PA4PR01MB7646.eurprd01.prod.exchangelabs.com <https://groups.google.com/d/msgid/swupdate/PA4PR01MB76467B0DF885133EEC19AFD388652%40PA4PR01MB7646.eurprd01.prod.exchangelabs.com?utm_medium=email&utm_source=footer>.
diff mbox series

Patch

diff --git a/core/util.c b/core/util.c
index 2316a57..75810a3 100644
--- a/core/util.c
+++ b/core/util.c
@@ -52,6 +52,8 @@  struct decryption_key {
               unsigned char ivt[AES_BLK_SIZE];
};
+#define MAX_AMOUNT_MOUNT_TRIES            2
+
static struct decryption_key *aes_key = NULL;
 /*
@@ -757,7 +759,20 @@  bool strtobool(const char *s)
int swupdate_mount(const char *device, const char *dir, const char *fstype)