mbox series

[v2,0/1] xilinx: Add option to load environment from outside of

Message ID 20240603164738.437834-1-vassilisamir@gmail.com
Headers show
Series xilinx: Add option to load environment from outside of | expand

Message

Vasileios Amoiridis June 3, 2024, 4:47 p.m. UTC
From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>

Changes in v2:
	- Remove duplication of custom hardcoded env_locations[] code.
	- Add implementation with general arch_env_get_location(op, prio)

v1: https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/

Vasileios Amoiridis (1):
  xilinx: Add option to load environment from outside of boot media

 board/xilinx/versal-net/board.c | 47 ++++++++++++++--------------
 board/xilinx/versal/board.c     | 47 ++++++++++++++--------------
 board/xilinx/zynq/board.c       | 49 +++++++++++++++--------------
 board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++----------------
 4 files changed, 101 insertions(+), 97 deletions(-)

Comments

Michal Simek June 6, 2024, 12:44 p.m. UTC | #1
On 6/3/24 18:47, Vasileios Amoiridis wrote:
> From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
> 
> Changes in v2:
> 	- Remove duplication of custom hardcoded env_locations[] code.
> 	- Add implementation with general arch_env_get_location(op, prio)
> 
> v1: https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
> 
> Vasileios Amoiridis (1):
>    xilinx: Add option to load environment from outside of boot media
> 
>   board/xilinx/versal-net/board.c | 47 ++++++++++++++--------------
>   board/xilinx/versal/board.c     | 47 ++++++++++++++--------------
>   board/xilinx/zynq/board.c       | 49 +++++++++++++++--------------
>   board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++----------------
>   4 files changed, 101 insertions(+), 97 deletions(-)
> 

Applied.
M
Michal Simek June 12, 2024, 1:42 p.m. UTC | #2
Hi Vasileios,

út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
<vassilisamir@gmail.com> napsal:
>
> From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
>
> Changes in v2:
>         - Remove duplication of custom hardcoded env_locations[] code.
>         - Add implementation with general arch_env_get_location(op, prio)
>
> v1: https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
>
> Vasileios Amoiridis (1):
>   xilinx: Add option to load environment from outside of boot media
>
>  board/xilinx/versal-net/board.c | 47 ++++++++++++++--------------
>  board/xilinx/versal/board.c     | 47 ++++++++++++++--------------
>  board/xilinx/zynq/board.c       | 49 +++++++++++++++--------------
>  board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++----------------
>  4 files changed, 101 insertions(+), 97 deletions(-)
>
> --
> 2.34.1
>

I have to remove this patch from my queue because it is actually
breaking current behavior.
CI is reporting an issue with it and I did a closer look and what is
happening is that if one location has no valid
data it goes and looks at another location from the list.

On Zynq it behaves like this.

MMC:   mmc@e0100000: 0
prio 0
Loading Environment from SPIFlash... SF: Detected n25q128a13 with page
size 256 Bytes, erase size 64 KiB, total 16 MiB
*** Warning - bad CRC, using default environment

prio 1
Loading Environment from NAND... *** Error - No Valid Environment Area found
*** Warning - bad env area, using default environment

prio 2
Loading Environment from SPIFlash... *** Warning - bad CRC, using
default environment

prio 3
Loading Environment from nowhere... OK

prio is my print to show where code is. Qemu boots out in QSPI boot
mode and SPI is tried first and because
this is xilinx_zynq_virt defconfig drivers/env locations for other
devices are present too. That's why it goes over the list
and it always ends in nowhere which never fails.

If this runs on real HW then the same behavior is visible and I don't
think it is right.
I think this solution should be rethought.
In product current behavior from our code is not the best and current
U-Boot implementation is not allowing
flexibility too. We are enabling redundant variables which can be only
on the same device but not that you have
one copy in QSPI and second in EMMC.
That's why I think location should be pretty much read from DT.
There is options/u-boot node where location of variables should be
described and this should replace
env_get_location().
It is a lot of work that's why I think second solution is more
reasonable for you which is simply create new Kconfig symbol
to disable board env_get_location() implementation.

Thanks,
Michal
Vasileios Amoiridis June 12, 2024, 1:57 p.m. UTC | #3
On Wed, 2024-06-12 at 15:42 +0200, Michal Simek wrote:
> Hi Vasileios,
> 
> út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
> <vassilisamir@gmail.com> napsal:
> > 
> > From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
> > 
> > Changes in v2:
> >         - Remove duplication of custom hardcoded env_locations[]
> > code.
> >         - Add implementation with general arch_env_get_location(op,
> > prio)
> > 
> > v1:
> > https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
> > 
> > Vasileios Amoiridis (1):
> >   xilinx: Add option to load environment from outside of boot media
> > 
> >  board/xilinx/versal-net/board.c | 47 ++++++++++++++--------------
> >  board/xilinx/versal/board.c     | 47 ++++++++++++++--------------
> >  board/xilinx/zynq/board.c       | 49 +++++++++++++++--------------
> >  board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++------------
> > ----
> >  4 files changed, 101 insertions(+), 97 deletions(-)
> > 
> > --
> > 2.34.1
> > 
> 
> I have to remove this patch from my queue because it is actually
> breaking current behavior.
> CI is reporting an issue with it and I did a closer look and what is
> happening is that if one location has no valid
> data it goes and looks at another location from the list.
> 

But that is the desired behavior, if the environment is not in one
location to check to the others.

> On Zynq it behaves like this.
> 
> MMC:   mmc@e0100000: 0
> prio 0
> Loading Environment from SPIFlash... SF: Detected n25q128a13 with
> page
> size 256 Bytes, erase size 64 KiB, total 16 MiB
> *** Warning - bad CRC, using default environment
> 
> prio 1
> Loading Environment from NAND... *** Error - No Valid Environment
> Area found
> *** Warning - bad env area, using default environment
> 
> prio 2
> Loading Environment from SPIFlash... *** Warning - bad CRC, using
> default environment
> 
> prio 3
> Loading Environment from nowhere... OK
> 
> 

This is the message that we get as well when this patch is not added.

> prio is my print to show where code is. Qemu boots out in QSPI boot
> mode and SPI is tried first and because
> this is xilinx_zynq_virt defconfig drivers/env locations for other
> devices are present too. That's why it goes over the list
> and it always ends in nowhere which never fails.
> 
> If this runs on real HW then the same behavior is visible and I don't
> think it is right.
> I think this solution should be rethought.
> In product current behavior from our code is not the best and current
> U-Boot implementation is not allowing
> flexibility too. We are enabling redundant variables which can be
> only
> on the same device but not that you have
> one copy in QSPI and second in EMMC.
> That's why I think location should be pretty much read from DT.
> There is options/u-boot node where location of variables should be
> described and this should replace
> env_get_location().

You mean that there should be some type of new U-Boot node that
describes where the environment should be located and go and
search in this device?

> It is a lot of work that's why I think second solution is more
> reasonable for you which is simply create new Kconfig symbol
> to disable board env_get_location() implementation.
> 
> Thanks,
> Michal
> 

You mean to basically disable the board env_get_location() that
exists in board/xilinx/zynqmp/zynqmp.c for example?

Cheers,
Vasilis
Michal Simek June 12, 2024, 2:47 p.m. UTC | #4
On 6/12/24 15:57, Vasileios Amoiridis wrote:
> On Wed, 2024-06-12 at 15:42 +0200, Michal Simek wrote:
>> Hi Vasileios,
>>
>> út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
>> <vassilisamir@gmail.com> napsal:
>>>
>>> From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
>>>
>>> Changes in v2:
>>>          - Remove duplication of custom hardcoded env_locations[]
>>> code.
>>>          - Add implementation with general arch_env_get_location(op,
>>> prio)
>>>
>>> v1:
>>> https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
>>>
>>> Vasileios Amoiridis (1):
>>>    xilinx: Add option to load environment from outside of boot media
>>>
>>>   board/xilinx/versal-net/board.c | 47 ++++++++++++++--------------
>>>   board/xilinx/versal/board.c     | 47 ++++++++++++++--------------
>>>   board/xilinx/zynq/board.c       | 49 +++++++++++++++--------------
>>>   board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++------------
>>> ----
>>>   4 files changed, 101 insertions(+), 97 deletions(-)
>>>
>>> --
>>> 2.34.1
>>>
>>
>> I have to remove this patch from my queue because it is actually
>> breaking current behavior.
>> CI is reporting an issue with it and I did a closer look and what is
>> happening is that if one location has no valid
>> data it goes and looks at another location from the list.
>>
> 
> But that is the desired behavior, if the environment is not in one
> location to check to the others.

That's how u-boot is written but if device doesn't exist or not accessible by 
u-boot it is clear that it shouldn't be used and then behavior is correct.

If device exists and simple varibles are not yet saved there going to different 
device is IMHO problematic.

> 
>> On Zynq it behaves like this.
>>
>> MMC:   mmc@e0100000: 0
>> prio 0
>> Loading Environment from SPIFlash... SF: Detected n25q128a13 with
>> page
>> size 256 Bytes, erase size 64 KiB, total 16 MiB
>> *** Warning - bad CRC, using default environment
>>
>> prio 1
>> Loading Environment from NAND... *** Error - No Valid Environment
>> Area found
>> *** Warning - bad env area, using default environment
>>
>> prio 2
>> Loading Environment from SPIFlash... *** Warning - bad CRC, using
>> default environment
>>
>> prio 3
>> Loading Environment from nowhere... OK
>>
>>
> 
> This is the message that we get as well when this patch is not added.

It means you are booting out of QSPI but qspi is not accessible by u-boot. Correct?


>> prio is my print to show where code is. Qemu boots out in QSPI boot
>> mode and SPI is tried first and because
>> this is xilinx_zynq_virt defconfig drivers/env locations for other
>> devices are present too. That's why it goes over the list
>> and it always ends in nowhere which never fails.
>>
>> If this runs on real HW then the same behavior is visible and I don't
>> think it is right.
>> I think this solution should be rethought.
>> In product current behavior from our code is not the best and current
>> U-Boot implementation is not allowing
>> flexibility too. We are enabling redundant variables which can be
>> only
>> on the same device but not that you have
>> one copy in QSPI and second in EMMC.
>> That's why I think location should be pretty much read from DT.
>> There is options/u-boot node where location of variables should be
>> described and this should replace
>> env_get_location().
> 
> You mean that there should be some type of new U-Boot node that
> describes where the environment should be located and go and
> search in this device?

Not sure if node or just dt property which points where that variables are stored.


>> It is a lot of work that's why I think second solution is more
>> reasonable for you which is simply create new Kconfig symbol
>> to disable board env_get_location() implementation.
>>
>> Thanks,
>> Michal
>>
> 
> You mean to basically disable the board env_get_location() that
> exists in board/xilinx/zynqmp/zynqmp.c for example?

yes. Something like this.

config BOARD_ENV_LOCATION
	bool "Use board defined ENV location"
	default y
	...

if defined (BOARD_ENV_SETUP/LOCATION)
enum env_location env_get_location(enum env_operation op, int prio)
{
...
}
endif

M
Vasileios Amoiridis June 12, 2024, 3:11 p.m. UTC | #5
On Wed, 2024-06-12 at 16:47 +0200, Michal Simek wrote:
> 
> 
> On 6/12/24 15:57, Vasileios Amoiridis wrote:
> > On Wed, 2024-06-12 at 15:42 +0200, Michal Simek wrote:
> > > Hi Vasileios,
> > > 
> > > út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
> > > <vassilisamir@gmail.com> napsal:
> > > > 
> > > > From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
> > > > 
> > > > Changes in v2:
> > > >          - Remove duplication of custom hardcoded
> > > > env_locations[]
> > > > code.
> > > >          - Add implementation with general
> > > > arch_env_get_location(op,
> > > > prio)
> > > > 
> > > > v1:
> > > > https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
> > > > 
> > > > Vasileios Amoiridis (1):
> > > >    xilinx: Add option to load environment from outside of boot
> > > > media
> > > > 
> > > >   board/xilinx/versal-net/board.c | 47 ++++++++++++++----------
> > > > ----
> > > >   board/xilinx/versal/board.c     | 47 ++++++++++++++----------
> > > > ----
> > > >   board/xilinx/zynq/board.c       | 49 +++++++++++++++---------
> > > > -----
> > > >   board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++-------
> > > > -----
> > > > ----
> > > >   4 files changed, 101 insertions(+), 97 deletions(-)
> > > > 
> > > > --
> > > > 2.34.1
> > > > 
> > > 
> > > I have to remove this patch from my queue because it is actually
> > > breaking current behavior.
> > > CI is reporting an issue with it and I did a closer look and what
> > > is
> > > happening is that if one location has no valid
> > > data it goes and looks at another location from the list.
> > > 
> > 
> > But that is the desired behavior, if the environment is not in one
> > location to check to the others.
> 
> That's how u-boot is written but if device doesn't exist or not
> accessible by 
> u-boot it is clear that it shouldn't be used and then behavior is
> correct.
> 

Hmmm, how can a device be not-accessible by u-boot but u-boot still
tries to use it? How could that happen?

Also, I think that this is a problem of the user, no? If for example
you have configure ENV_IS_IN_NAND but there is no NAND device, then
with the patch you get a visible error that U-Boot tries to access
NAND which is not there.

> If device exists and simple varibles are not yet saved there going to
> different 
> device is IMHO problematic.
> 

Maybe I am still a bit confused, but I think that if you have defined
ENV_IS_IN_QSPI and ENV_IS_IN_NAND but the environment is not there then
the behavior that we see is actually correct. U-Boot tries:

a) Get env from QSPI but fails because env is not there,
b) Get env from NAND but fails because NAND is not there,
c) Get env from nowhere.

To me, this looks like the correct behavior. Isn't it?


> > 
> > > On Zynq it behaves like this.
> > > 
> > > MMC:   mmc@e0100000: 0
> > > prio 0
> > > Loading Environment from SPIFlash... SF: Detected n25q128a13 with
> > > page
> > > size 256 Bytes, erase size 64 KiB, total 16 MiB
> > > *** Warning - bad CRC, using default environment
> > > 
> > > prio 1
> > > Loading Environment from NAND... *** Error - No Valid Environment
> > > Area found
> > > *** Warning - bad env area, using default environment
> > > 
> > > prio 2
> > > Loading Environment from SPIFlash... *** Warning - bad CRC, using
> > > default environment
> > > 
> > > prio 3
> > > Loading Environment from nowhere... OK
> > > 
> > > 
> > 
> > This is the message that we get as well when this patch is not
> > added.
> 
> It means you are booting out of QSPI but qspi is not accessible by u-
> boot. Correct?
> 
> 

Well, we are booting from QSPI but environment is in eMMC. In our
config, we have defined ENV_IS_IN_FAT and ENV_IS_NOWHERE but we have
not defined ENV_IS_IN_QSPI. 

> > > prio is my print to show where code is. Qemu boots out in QSPI
> > > boot
> > > mode and SPI is tried first and because
> > > this is xilinx_zynq_virt defconfig drivers/env locations for
> > > other
> > > devices are present too. That's why it goes over the list
> > > and it always ends in nowhere which never fails.
> > > 
> > > If this runs on real HW then the same behavior is visible and I
> > > don't
> > > think it is right.
> > > I think this solution should be rethought.
> > > In product current behavior from our code is not the best and
> > > current
> > > U-Boot implementation is not allowing
> > > flexibility too. We are enabling redundant variables which can be
> > > only
> > > on the same device but not that you have
> > > one copy in QSPI and second in EMMC.
> > > That's why I think location should be pretty much read from DT.
> > > There is options/u-boot node where location of variables should
> > > be
> > > described and this should replace
> > > env_get_location().
> > 
> > You mean that there should be some type of new U-Boot node that
> > describes where the environment should be located and go and
> > search in this device?
> 
> Not sure if node or just dt property which points where that
> variables are stored.
> 
> 

This doesn't sound too difficult to do. Still, even if this was done,
what would the priority be for looking for the environment? For now,
the code is written in a way to check for the environment only in the
bootmedia. My patch added a functionality that if the environment is
not in the bootmedia to check somewhere else. This should be the order
also if we had this u-boot node? Check first in the bootmedia and then
check if there is a dt property?

> > > It is a lot of work that's why I think second solution is more
> > > reasonable for you which is simply create new Kconfig symbol
> > > to disable board env_get_location() implementation.
> > > 
> > > Thanks,
> > > Michal
> > > 
> > 
> > You mean to basically disable the board env_get_location() that
> > exists in board/xilinx/zynqmp/zynqmp.c for example?
> 
> yes. Something like this.
> 
> config BOARD_ENV_LOCATION
>         bool "Use board defined ENV location"
>         default y
>         ...
> 
> if defined (BOARD_ENV_SETUP/LOCATION)
> enum env_location env_get_location(enum env_operation op, int prio)
> {
> ...
> }
> endif
> 
> M
> 
> 

This will be accepted by you in upstream?

Cheers,
Vasilis
Michal Simek June 12, 2024, 3:32 p.m. UTC | #6
On 6/12/24 17:11, Vasileios Amoiridis wrote:
> On Wed, 2024-06-12 at 16:47 +0200, Michal Simek wrote:
>>
>>
>> On 6/12/24 15:57, Vasileios Amoiridis wrote:
>>> On Wed, 2024-06-12 at 15:42 +0200, Michal Simek wrote:
>>>> Hi Vasileios,
>>>>
>>>> út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
>>>> <vassilisamir@gmail.com> napsal:
>>>>>
>>>>> From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
>>>>>
>>>>> Changes in v2:
>>>>>           - Remove duplication of custom hardcoded
>>>>> env_locations[]
>>>>> code.
>>>>>           - Add implementation with general
>>>>> arch_env_get_location(op,
>>>>> prio)
>>>>>
>>>>> v1:
>>>>> https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
>>>>>
>>>>> Vasileios Amoiridis (1):
>>>>>     xilinx: Add option to load environment from outside of boot
>>>>> media
>>>>>
>>>>>    board/xilinx/versal-net/board.c | 47 ++++++++++++++----------
>>>>> ----
>>>>>    board/xilinx/versal/board.c     | 47 ++++++++++++++----------
>>>>> ----
>>>>>    board/xilinx/zynq/board.c       | 49 +++++++++++++++---------
>>>>> -----
>>>>>    board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++-------
>>>>> -----
>>>>> ----
>>>>>    4 files changed, 101 insertions(+), 97 deletions(-)
>>>>>
>>>>> --
>>>>> 2.34.1
>>>>>
>>>>
>>>> I have to remove this patch from my queue because it is actually
>>>> breaking current behavior.
>>>> CI is reporting an issue with it and I did a closer look and what
>>>> is
>>>> happening is that if one location has no valid
>>>> data it goes and looks at another location from the list.
>>>>
>>>
>>> But that is the desired behavior, if the environment is not in one
>>> location to check to the others.
>>
>> That's how u-boot is written but if device doesn't exist or not
>> accessible by
>> u-boot it is clear that it shouldn't be used and then behavior is
>> correct.
>>
> 
> Hmmm, how can a device be not-accessible by u-boot but u-boot still
> tries to use it? How could that happen?

That Xilinx virtual defconfig are setup in a way that all drivers are enabled 
via Kconfig but they are not present on all boards.
If you look below in defconfig we are have that variables can be in NAND but 
there is no NAND device on zc702 but still u-boot tries to reach nand and read 
variables from it.

> 
> Also, I think that this is a problem of the user, no? If for example
> you have configure ENV_IS_IN_NAND but there is no NAND device, then
> with the patch you get a visible error that U-Boot tries to access
> NAND which is not there.

For our platforms where you can burn boot.bin to any boot medium we had no 
choice that's why code was written like that that variables are saved all the 
time to boot medium.
Also that I don't want to maintain other defconfigs for slightly different 
configurations.
On products customers should look at our all in one defconfig and disable things 
which are not needed. And where variables are saved is definitely one of them.

> 
>> If device exists and simple varibles are not yet saved there going to
>> different
>> device is IMHO problematic.
>>
> 
> Maybe I am still a bit confused, but I think that if you have defined
> ENV_IS_IN_QSPI and ENV_IS_IN_NAND but the environment is not there then
> the behavior that we see is actually correct. U-Boot tries:
> 
> a) Get env from QSPI but fails because env is not there,
> b) Get env from NAND but fails because NAND is not there,
> c) Get env from nowhere.
> 
> To me, this looks like the correct behavior. Isn't it?

This sequence is correct but what it is not correct is when you reach nowhere 
and you run saveenv you can't really save variables even they can be saved to 
QSPI. They can't be saved to NAND because it is not present on the board.

The issue is that none write initial variable to QSPI that's why there will be 
all the time bad CRC but location is correct.


>>>
>>>> On Zynq it behaves like this.
>>>>
>>>> MMC:   mmc@e0100000: 0
>>>> prio 0
>>>> Loading Environment from SPIFlash... SF: Detected n25q128a13 with
>>>> page
>>>> size 256 Bytes, erase size 64 KiB, total 16 MiB
>>>> *** Warning - bad CRC, using default environment
>>>>
>>>> prio 1
>>>> Loading Environment from NAND... *** Error - No Valid Environment
>>>> Area found
>>>> *** Warning - bad env area, using default environment
>>>>
>>>> prio 2
>>>> Loading Environment from SPIFlash... *** Warning - bad CRC, using
>>>> default environment
>>>>
>>>> prio 3
>>>> Loading Environment from nowhere... OK
>>>>
>>>>
>>>
>>> This is the message that we get as well when this patch is not
>>> added.
>>
>> It means you are booting out of QSPI but qspi is not accessible by u-
>> boot. Correct?
>>
>>
> 
> Well, we are booting from QSPI but environment is in eMMC. In our
> config, we have defined ENV_IS_IN_FAT and ENV_IS_NOWHERE but we have
> not defined ENV_IS_IN_QSPI.

ok.


>>>> prio is my print to show where code is. Qemu boots out in QSPI
>>>> boot
>>>> mode and SPI is tried first and because
>>>> this is xilinx_zynq_virt defconfig drivers/env locations for
>>>> other
>>>> devices are present too. That's why it goes over the list
>>>> and it always ends in nowhere which never fails.
>>>>
>>>> If this runs on real HW then the same behavior is visible and I
>>>> don't
>>>> think it is right.
>>>> I think this solution should be rethought.
>>>> In product current behavior from our code is not the best and
>>>> current
>>>> U-Boot implementation is not allowing
>>>> flexibility too. We are enabling redundant variables which can be
>>>> only
>>>> on the same device but not that you have
>>>> one copy in QSPI and second in EMMC.
>>>> That's why I think location should be pretty much read from DT.
>>>> There is options/u-boot node where location of variables should
>>>> be
>>>> described and this should replace
>>>> env_get_location().
>>>
>>> You mean that there should be some type of new U-Boot node that
>>> describes where the environment should be located and go and
>>> search in this device?
>>
>> Not sure if node or just dt property which points where that
>> variables are stored.
>>
>>
> 
> This doesn't sound too difficult to do. Still, even if this was done,
> what would the priority be for looking for the environment? For now,
> the code is written in a way to check for the environment only in the
> bootmedia. My patch added a functionality that if the environment is
> not in the bootmedia to check somewhere else. This should be the order
> also if we had this u-boot node? Check first in the bootmedia and then
> check if there is a dt property?

I don't think this is going to be that simple. Especially to support redundant 
variables saved to two different devices.


>>>> It is a lot of work that's why I think second solution is more
>>>> reasonable for you which is simply create new Kconfig symbol
>>>> to disable board env_get_location() implementation.
>>>>
>>>> Thanks,
>>>> Michal
>>>>
>>>
>>> You mean to basically disable the board env_get_location() that
>>> exists in board/xilinx/zynqmp/zynqmp.c for example?
>>
>> yes. Something like this.
>>
>> config BOARD_ENV_LOCATION
>>          bool "Use board defined ENV location"
>>          default y
>>          ...
>>
>> if defined (BOARD_ENV_SETUP/LOCATION)
>> enum env_location env_get_location(enum env_operation op, int prio)
>> {
>> ...
>> }
>> endif
>>
>> M
>>
>>
> 
> This will be accepted by you in upstream?

I think it is reasonable compromise to pretty much disable bootmode media 
selection for storing variables.

Thanks,
Michal
Vasileios Amoiridis June 12, 2024, 3:49 p.m. UTC | #7
On Wed, 2024-06-12 at 17:32 +0200, Michal Simek wrote:
> 
> 
> On 6/12/24 17:11, Vasileios Amoiridis wrote:
> > On Wed, 2024-06-12 at 16:47 +0200, Michal Simek wrote:
> > > 
> > > 
> > > On 6/12/24 15:57, Vasileios Amoiridis wrote:
> > > > On Wed, 2024-06-12 at 15:42 +0200, Michal Simek wrote:
> > > > > Hi Vasileios,
> > > > > 
> > > > > út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
> > > > > <vassilisamir@gmail.com> napsal:
> > > > > > 
> > > > > > From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
> > > > > > 
> > > > > > Changes in v2:
> > > > > >           - Remove duplication of custom hardcoded
> > > > > > env_locations[]
> > > > > > code.
> > > > > >           - Add implementation with general
> > > > > > arch_env_get_location(op,
> > > > > > prio)
> > > > > > 
> > > > > > v1:
> > > > > > https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
> > > > > > 
> > > > > > Vasileios Amoiridis (1):
> > > > > >     xilinx: Add option to load environment from outside of
> > > > > > boot
> > > > > > media
> > > > > > 
> > > > > >    board/xilinx/versal-net/board.c | 47 ++++++++++++++-----
> > > > > > -----
> > > > > > ----
> > > > > >    board/xilinx/versal/board.c     | 47 ++++++++++++++-----
> > > > > > -----
> > > > > > ----
> > > > > >    board/xilinx/zynq/board.c       | 49 +++++++++++++++----
> > > > > > -----
> > > > > > -----
> > > > > >    board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++--
> > > > > > -----
> > > > > > -----
> > > > > > ----
> > > > > >    4 files changed, 101 insertions(+), 97 deletions(-)
> > > > > > 
> > > > > > --
> > > > > > 2.34.1
> > > > > > 
> > > > > 
> > > > > I have to remove this patch from my queue because it is
> > > > > actually
> > > > > breaking current behavior.
> > > > > CI is reporting an issue with it and I did a closer look and
> > > > > what
> > > > > is
> > > > > happening is that if one location has no valid
> > > > > data it goes and looks at another location from the list.
> > > > > 
> > > > 
> > > > But that is the desired behavior, if the environment is not in
> > > > one
> > > > location to check to the others.
> > > 
> > > That's how u-boot is written but if device doesn't exist or not
> > > accessible by
> > > u-boot it is clear that it shouldn't be used and then behavior is
> > > correct.
> > > 
> > 
> > Hmmm, how can a device be not-accessible by u-boot but u-boot still
> > tries to use it? How could that happen?
> 
> That Xilinx virtual defconfig are setup in a way that all drivers are
> enabled 
> via Kconfig but they are not present on all boards.

Ok, now I understand how that happens.
> If you look below in defconfig we are have that variables can be in
> NAND but 
> there is no NAND device on zc702 but still u-boot tries to reach nand
> and read 
> variables from it.
> 

So, the ideal scenario would be to U-Boot never try to access the
device? Because from what I see, is that since the device is not there,
U-Boot throws an error, and moves on.

> > 
> > Also, I think that this is a problem of the user, no? If for
> > example
> > you have configure ENV_IS_IN_NAND but there is no NAND device, then
> > with the patch you get a visible error that U-Boot tries to access
> > NAND which is not there.
> 
> For our platforms where you can burn boot.bin to any boot medium we
> had no 
> choice that's why code was written like that that variables are saved
> all the 
> time to boot medium.
> Also that I don't want to maintain other defconfigs for slightly
> different 
> configurations.
> On products customers should look at our all in one defconfig and
> disable things 
> which are not needed. And where variables are saved is definitely one
> of them.
> 

Well, that makes sense.

> > 
> > > If device exists and simple varibles are not yet saved there
> > > going to
> > > different
> > > device is IMHO problematic.
> > > 
> > 
> > Maybe I am still a bit confused, but I think that if you have
> > defined
> > ENV_IS_IN_QSPI and ENV_IS_IN_NAND but the environment is not there
> > then
> > the behavior that we see is actually correct. U-Boot tries:
> > 
> > a) Get env from QSPI but fails because env is not there,
> > b) Get env from NAND but fails because NAND is not there,
> > c) Get env from nowhere.
> > 
> > To me, this looks like the correct behavior. Isn't it?
> 
> This sequence is correct but what it is not correct is when you reach
> nowhere 
> and you run saveenv you can't really save variables even they can be
> saved to 
> QSPI. They can't be saved to NAND because it is not present on the
> board.
> 

This can be solved by using the env_select command (cmd/nvedit.c) no?

> The issue is that none write initial variable to QSPI that's why
> there will be 
> all the time bad CRC but location is correct.
> 
> 
> > > > 
> > > > > On Zynq it behaves like this.
> > > > > 
> > > > > MMC:   mmc@e0100000: 0
> > > > > prio 0
> > > > > Loading Environment from SPIFlash... SF: Detected n25q128a13
> > > > > with
> > > > > page
> > > > > size 256 Bytes, erase size 64 KiB, total 16 MiB
> > > > > *** Warning - bad CRC, using default environment
> > > > > 
> > > > > prio 1
> > > > > Loading Environment from NAND... *** Error - No Valid
> > > > > Environment
> > > > > Area found
> > > > > *** Warning - bad env area, using default environment
> > > > > 
> > > > > prio 2
> > > > > Loading Environment from SPIFlash... *** Warning - bad CRC,
> > > > > using
> > > > > default environment
> > > > > 
> > > > > prio 3
> > > > > Loading Environment from nowhere... OK
> > > > > 
> > > > > 
> > > > 
> > > > This is the message that we get as well when this patch is not
> > > > added.
> > > 
> > > It means you are booting out of QSPI but qspi is not accessible
> > > by u-
> > > boot. Correct?
> > > 
> > > 
> > 
> > Well, we are booting from QSPI but environment is in eMMC. In our
> > config, we have defined ENV_IS_IN_FAT and ENV_IS_NOWHERE but we
> > have
> > not defined ENV_IS_IN_QSPI.
> 
> ok.
> 
> 
> > > > > prio is my print to show where code is. Qemu boots out in
> > > > > QSPI
> > > > > boot
> > > > > mode and SPI is tried first and because
> > > > > this is xilinx_zynq_virt defconfig drivers/env locations for
> > > > > other
> > > > > devices are present too. That's why it goes over the list
> > > > > and it always ends in nowhere which never fails.
> > > > > 
> > > > > If this runs on real HW then the same behavior is visible and
> > > > > I
> > > > > don't
> > > > > think it is right.
> > > > > I think this solution should be rethought.
> > > > > In product current behavior from our code is not the best and
> > > > > current
> > > > > U-Boot implementation is not allowing
> > > > > flexibility too. We are enabling redundant variables which
> > > > > can be
> > > > > only
> > > > > on the same device but not that you have
> > > > > one copy in QSPI and second in EMMC.
> > > > > That's why I think location should be pretty much read from
> > > > > DT.
> > > > > There is options/u-boot node where location of variables
> > > > > should
> > > > > be
> > > > > described and this should replace
> > > > > env_get_location().
> > > > 
> > > > You mean that there should be some type of new U-Boot node that
> > > > describes where the environment should be located and go and
> > > > search in this device?
> > > 
> > > Not sure if node or just dt property which points where that
> > > variables are stored.
> > > 
> > > 
> > 
> > This doesn't sound too difficult to do. Still, even if this was
> > done,
> > what would the priority be for looking for the environment? For
> > now,
> > the code is written in a way to check for the environment only in
> > the
> > bootmedia. My patch added a functionality that if the environment
> > is
> > not in the bootmedia to check somewhere else. This should be the
> > order
> > also if we had this u-boot node? Check first in the bootmedia and
> > then
> > check if there is a dt property?
> 
> I don't think this is going to be that simple. Especially to support
> redundant 
> variables saved to two different devices.
> 
> 

To my understanding, I think it is enough to have a dt property that
says "emmc" or "nand" etc.. that can be read when the
env_get_location() is executed and the correct env_location is
selected. What am I missing that makes you think that it is not so
trivial?

I don't have that much experience with the U-Boot sources themselves so
my question might be stupid, I know.

> > > > > It is a lot of work that's why I think second solution is
> > > > > more
> > > > > reasonable for you which is simply create new Kconfig symbol
> > > > > to disable board env_get_location() implementation.
> > > > > 
> > > > > Thanks,
> > > > > Michal
> > > > > 
> > > > 
> > > > You mean to basically disable the board env_get_location() that
> > > > exists in board/xilinx/zynqmp/zynqmp.c for example?
> > > 
> > > yes. Something like this.
> > > 
> > > config BOARD_ENV_LOCATION
> > >          bool "Use board defined ENV location"
> > >          default y
> > >          ...
> > > 
> > > if defined (BOARD_ENV_SETUP/LOCATION)
> > > enum env_location env_get_location(enum env_operation op, int
> > > prio)
> > > {
> > > ...
> > > }
> > > endif
> > > 
> > > M
> > > 
> > > 
> > 
> > This will be accepted by you in upstream?
> 
> I think it is reasonable compromise to pretty much disable bootmode
> media 
> selection for storing variables.
> 
> Thanks,
> Michal

Ok, I can submit a v3 then.

Cheers,
Vasilis
Michal Simek June 14, 2024, 11:13 a.m. UTC | #8
On 6/12/24 17:49, Vasileios Amoiridis wrote:
> On Wed, 2024-06-12 at 17:32 +0200, Michal Simek wrote:
>>
>>
>> On 6/12/24 17:11, Vasileios Amoiridis wrote:
>>> On Wed, 2024-06-12 at 16:47 +0200, Michal Simek wrote:
>>>>
>>>>
>>>> On 6/12/24 15:57, Vasileios Amoiridis wrote:
>>>>> On Wed, 2024-06-12 at 15:42 +0200, Michal Simek wrote:
>>>>>> Hi Vasileios,
>>>>>>
>>>>>> út 4. 6. 2024 v 15:21 odesílatel Vasileios Amoiridis
>>>>>> <vassilisamir@gmail.com> napsal:
>>>>>>>
>>>>>>> From: Vasileios Amoiridis <vasileios.amoiridis@cern.ch>
>>>>>>>
>>>>>>> Changes in v2:
>>>>>>>            - Remove duplication of custom hardcoded
>>>>>>> env_locations[]
>>>>>>> code.
>>>>>>>            - Add implementation with general
>>>>>>> arch_env_get_location(op,
>>>>>>> prio)
>>>>>>>
>>>>>>> v1:
>>>>>>> https://lore.kernel.org/u-boot/20240522174738.73522-1-vassilisamir@gmail.com/
>>>>>>>
>>>>>>> Vasileios Amoiridis (1):
>>>>>>>      xilinx: Add option to load environment from outside of
>>>>>>> boot
>>>>>>> media
>>>>>>>
>>>>>>>     board/xilinx/versal-net/board.c | 47 ++++++++++++++-----
>>>>>>> -----
>>>>>>> ----
>>>>>>>     board/xilinx/versal/board.c     | 47 ++++++++++++++-----
>>>>>>> -----
>>>>>>> ----
>>>>>>>     board/xilinx/zynq/board.c       | 49 +++++++++++++++----
>>>>>>> -----
>>>>>>> -----
>>>>>>>     board/xilinx/zynqmp/zynqmp.c    | 55 +++++++++++++++++--
>>>>>>> -----
>>>>>>> -----
>>>>>>> ----
>>>>>>>     4 files changed, 101 insertions(+), 97 deletions(-)
>>>>>>>
>>>>>>> --
>>>>>>> 2.34.1
>>>>>>>
>>>>>>
>>>>>> I have to remove this patch from my queue because it is
>>>>>> actually
>>>>>> breaking current behavior.
>>>>>> CI is reporting an issue with it and I did a closer look and
>>>>>> what
>>>>>> is
>>>>>> happening is that if one location has no valid
>>>>>> data it goes and looks at another location from the list.
>>>>>>
>>>>>
>>>>> But that is the desired behavior, if the environment is not in
>>>>> one
>>>>> location to check to the others.
>>>>
>>>> That's how u-boot is written but if device doesn't exist or not
>>>> accessible by
>>>> u-boot it is clear that it shouldn't be used and then behavior is
>>>> correct.
>>>>
>>>
>>> Hmmm, how can a device be not-accessible by u-boot but u-boot still
>>> tries to use it? How could that happen?
>>
>> That Xilinx virtual defconfig are setup in a way that all drivers are
>> enabled
>> via Kconfig but they are not present on all boards.
> 
> Ok, now I understand how that happens.
>> If you look below in defconfig we are have that variables can be in
>> NAND but
>> there is no NAND device on zc702 but still u-boot tries to reach nand
>> and read
>> variables from it.
>>
> 
> So, the ideal scenario would be to U-Boot never try to access the
> device? Because from what I see, is that since the device is not there,
> U-Boot throws an error, and moves on.
> 
>>>
>>> Also, I think that this is a problem of the user, no? If for
>>> example
>>> you have configure ENV_IS_IN_NAND but there is no NAND device, then
>>> with the patch you get a visible error that U-Boot tries to access
>>> NAND which is not there.
>>
>> For our platforms where you can burn boot.bin to any boot medium we
>> had no
>> choice that's why code was written like that that variables are saved
>> all the
>> time to boot medium.
>> Also that I don't want to maintain other defconfigs for slightly
>> different
>> configurations.
>> On products customers should look at our all in one defconfig and
>> disable things
>> which are not needed. And where variables are saved is definitely one
>> of them.
>>
> 
> Well, that makes sense.
> 
>>>
>>>> If device exists and simple varibles are not yet saved there
>>>> going to
>>>> different
>>>> device is IMHO problematic.
>>>>
>>>
>>> Maybe I am still a bit confused, but I think that if you have
>>> defined
>>> ENV_IS_IN_QSPI and ENV_IS_IN_NAND but the environment is not there
>>> then
>>> the behavior that we see is actually correct. U-Boot tries:
>>>
>>> a) Get env from QSPI but fails because env is not there,
>>> b) Get env from NAND but fails because NAND is not there,
>>> c) Get env from nowhere.
>>>
>>> To me, this looks like the correct behavior. Isn't it?
>>
>> This sequence is correct but what it is not correct is when you reach
>> nowhere
>> and you run saveenv you can't really save variables even they can be
>> saved to
>> QSPI. They can't be saved to NAND because it is not present on the
>> board.
>>
> 
> This can be solved by using the env_select command (cmd/nvedit.c) no?

I looked at it and it is interesting option.
We don't have it enabled by default (and maybe should be enabled). CI test is 
not handling it too.
Love: can you please take a look at it?


>> The issue is that none write initial variable to QSPI that's why
>> there will be
>> all the time bad CRC but location is correct.
>>
>>
>>>>>
>>>>>> On Zynq it behaves like this.
>>>>>>
>>>>>> MMC:   mmc@e0100000: 0
>>>>>> prio 0
>>>>>> Loading Environment from SPIFlash... SF: Detected n25q128a13
>>>>>> with
>>>>>> page
>>>>>> size 256 Bytes, erase size 64 KiB, total 16 MiB
>>>>>> *** Warning - bad CRC, using default environment
>>>>>>
>>>>>> prio 1
>>>>>> Loading Environment from NAND... *** Error - No Valid
>>>>>> Environment
>>>>>> Area found
>>>>>> *** Warning - bad env area, using default environment
>>>>>>
>>>>>> prio 2
>>>>>> Loading Environment from SPIFlash... *** Warning - bad CRC,
>>>>>> using
>>>>>> default environment
>>>>>>
>>>>>> prio 3
>>>>>> Loading Environment from nowhere... OK
>>>>>>
>>>>>>
>>>>>
>>>>> This is the message that we get as well when this patch is not
>>>>> added.
>>>>
>>>> It means you are booting out of QSPI but qspi is not accessible
>>>> by u-
>>>> boot. Correct?
>>>>
>>>>
>>>
>>> Well, we are booting from QSPI but environment is in eMMC. In our
>>> config, we have defined ENV_IS_IN_FAT and ENV_IS_NOWHERE but we
>>> have
>>> not defined ENV_IS_IN_QSPI.
>>
>> ok.
>>
>>
>>>>>> prio is my print to show where code is. Qemu boots out in
>>>>>> QSPI
>>>>>> boot
>>>>>> mode and SPI is tried first and because
>>>>>> this is xilinx_zynq_virt defconfig drivers/env locations for
>>>>>> other
>>>>>> devices are present too. That's why it goes over the list
>>>>>> and it always ends in nowhere which never fails.
>>>>>>
>>>>>> If this runs on real HW then the same behavior is visible and
>>>>>> I
>>>>>> don't
>>>>>> think it is right.
>>>>>> I think this solution should be rethought.
>>>>>> In product current behavior from our code is not the best and
>>>>>> current
>>>>>> U-Boot implementation is not allowing
>>>>>> flexibility too. We are enabling redundant variables which
>>>>>> can be
>>>>>> only
>>>>>> on the same device but not that you have
>>>>>> one copy in QSPI and second in EMMC.
>>>>>> That's why I think location should be pretty much read from
>>>>>> DT.
>>>>>> There is options/u-boot node where location of variables
>>>>>> should
>>>>>> be
>>>>>> described and this should replace
>>>>>> env_get_location().
>>>>>
>>>>> You mean that there should be some type of new U-Boot node that
>>>>> describes where the environment should be located and go and
>>>>> search in this device?
>>>>
>>>> Not sure if node or just dt property which points where that
>>>> variables are stored.
>>>>
>>>>
>>>
>>> This doesn't sound too difficult to do. Still, even if this was
>>> done,
>>> what would the priority be for looking for the environment? For
>>> now,
>>> the code is written in a way to check for the environment only in
>>> the
>>> bootmedia. My patch added a functionality that if the environment
>>> is
>>> not in the bootmedia to check somewhere else. This should be the
>>> order
>>> also if we had this u-boot node? Check first in the bootmedia and
>>> then
>>> check if there is a dt property?
>>
>> I don't think this is going to be that simple. Especially to support
>> redundant
>> variables saved to two different devices.
>>
>>
> 
> To my understanding, I think it is enough to have a dt property that
> says "emmc" or "nand" etc.. that can be read when the
> env_get_location() is executed and the correct env_location is
> selected. What am I missing that makes you think that it is not so
> trivial?
> 
> I don't have that much experience with the U-Boot sources themselves so
> my question might be stupid, I know.

To do it generic it is not just about saying qspi. It is also necessary to say 
where exactly it is for both copies no to be hardcoded via Kconfig.

Something like this.
options {
	u-boot {
		compatible = "u-boot,config";
		env = <&qspi base size>, <&qspi base_redunt size_redunt>;
	};
};

above description should work properly for raw based devices.

Then there are MTD based devices where description can be like this
env = <&mtd_qspi_1>, <&mtd_qspi_2>;
or
env = "partition_name1", "partition_name2";

And also filesystem based
env = <&sdhci 1>, <&sdhci 2>;  // controller + partition ID or GUID.


And I think you should be able to define for example this

env = <&sdhci 1>, <&qspi base size>, <&mtd_qspi_1>;

Which means 3 locations for variables which should have the same content.

For doing it you need to touch all files in env/ folder and change the code 
around CONFIG_SYS_REDUNDAND_ENVIRONMENT.

Thanks,
Michal