[v3,0/9] Enabling DCD emulation support in Qemu

Message ID 20231107180907.553451-1-nifan.cxl@gmail.com

Fan Ni Nov. 7, 2023, 6:07 p.m. UTC
From: Fan Ni <nifan.cxl@gmail.com>


This patch series is based on Jonathan's branch cxl-2023-09-26.

The main changes include:
1. Update cxl_find_dc_region to detect the case where the range of an extent
    crosses multiple DC regions.
2. Add comments to explain the checks performed in function
    cxl_detect_malformed_extent_list. (Jonathan)
3. Minimize the checks in cmd_dcd_add_dyn_cap_rsp. (Jonathan)
4. Update total_extent_count in add/release dynamic capacity response function.
    (Ira and Jorgen Hansen).
5. Fix the logic issue in test_bits and rename it to
    test_any_bits_set to clarify its function.
6. Add pending extent list for dc extent add event.
7. When add extent response is received, use the pending-to-add list to
    verify the extents are valid.
8. Add test_any_bits_set and cxl_insert_extent_to_extent_list declarations to
    cxl_device.h so they can be used in different files.
9. Update ct3d_qmp_cxl_event_log_enc to include the dynamic capacity event
    log type.
10. Extract the functionality to delete extent from extent list to a helper
    function.
11. Move the update of the bitmap which reflects which blocks are backed with
    dc extents from the moment when a dc extent is offered to the moment when
    it is accepted by the host.
12. Free dc_name after calling address_space_init to avoid memory leak when
    returning early. (Nathan)
13. Add code to detect and reject QMP requests without any extents. (Jonathan)
14. Add code to detect and reject QMP requests where the extent len is 0.
15. Change the QMP interface to move the region-id out of the extents, so
    each command now only handles extent add/release requests in a single
    region. (Jonathan)
16. Change the region bitmap length from decode_len to len.
17. Rename "dpa" to "offset" in the add/release dc extent qmp interface.
    (Jonathan)
18. Block any dc extent release command if the exact extent is not already in
    the extent list of the device.
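
For readers who want a concrete picture of items 1 and 14, here is a minimal
sketch; it is not the code from the patches. The DemoDCRegion type and the
demo_find_dc_region name are made up for illustration, and the real
cxl_find_dc_region works on the device state and may differ in signature and
return type. The sketch only shows the idea: an extent is accepted when its
whole range falls inside one region, and rejected when its length is 0 or its
range crosses a region boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a DC region descriptor. */
typedef struct DemoDCRegion {
    uint64_t base; /* starting DPA of the region */
    uint64_t len;  /* region length in bytes */
} DemoDCRegion;

/*
 * Return the index of the region that fully contains [dpa, dpa + len),
 * or -1 if len is 0, the range wraps around, the range starts outside
 * every region, or the range crosses a region boundary.
 */
static int demo_find_dc_region(const DemoDCRegion *regions, size_t nregions,
                               uint64_t dpa, uint64_t len)
{
    assert(regions != NULL || nregions == 0);

    if (len == 0 || dpa + len < dpa) {
        return -1;
    }
    for (size_t i = 0; i < nregions; i++) {
        const DemoDCRegion *r = &regions[i];

        if (dpa >= r->base && dpa < r->base + r->len) {
            /* The start is inside region i; the end must be as well. */
            return (dpa + len <= r->base + r->len) ? (int)i : -1;
        }
    }
    return -1;
}
```

With two adjacent 256-byte regions, an extent at offset 192 with length 128
starts in the first region and ends in the second, so the lookup returns -1.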

The code is tested together with Ira's kernel DCD support:
https://github.com/weiny2/linux-kernel/tree/dcd-v3-2023-10-30

Cover letter from v2 is here:
https://lore.kernel.org/linux-cxl/20230724162313.34196-1-fan.ni@samsung.com/T/#m63039621087023691c9749a0af1212deb5549ddf

Last version (v2) is here:
https://lore.kernel.org/linux-cxl/20230725183939.2741025-1-fan.ni@samsung.com/

More DCD related discussions are here:
https://lore.kernel.org/linux-cxl/650cc29ab3f64_50d07294e7@iweiny-mobl.notmuch/



Fan Ni (9):
  hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
    payload of identify memory device command
  hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
    and mailbox command support
  include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
    type3 memory devices
  hw/mem/cxl_type3: Add support to create DC regions to type3 memory
    devices
  hw/mem/cxl_type3: Add host backend and address space handling for DC
    regions
  hw/mem/cxl_type3: Add DC extent list representative and get DC extent
    list mailbox support
  hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release
    dynamic capacity response
  hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
    extents
  hw/mem/cxl_type3: Add dpa range validation for accesses to dc regions

 hw/cxl/cxl-mailbox-utils.c  | 469 +++++++++++++++++++++++++++++-
 hw/mem/cxl_type3.c          | 548 +++++++++++++++++++++++++++++++++---
 hw/mem/cxl_type3_stubs.c    |  14 +
 include/hw/cxl/cxl_device.h |  64 ++++-
 include/hw/cxl/cxl_events.h |  15 +
 qapi/cxl.json               |  60 +++-
 6 files changed, 1123 insertions(+), 47 deletions(-)

Ira Weiny Nov. 17, 2023, 12:09 a.m. UTC | #1
nifan.cxl@ wrote:
> From: Fan Ni <nifan.cxl@gmail.com>
> 
> 
> The patch series are based on Jonathan's branch cxl-2023-09-26.

I am finally getting around to trying this new series, and it does not seem
to apply on top of this branch.

Just to verify, is this the top commit this work was based on?

   d4edf131bbac [jonathan/cxl-2023-09-26] cxl/vendor: SK hynix Niagara Multi-Headed SLD Device

I seem to have found some issue with CDAT checksumming[1] which I'm not quite
sure about.

I went ahead and pulled your latest work from:

    https://github.com/moking/qemu-jic-clone.git dcd-dev

    abe893944bb3  hw/mem/cxl_type3: Add dpa range validation for accesses to dc regions

It still has this same problem.

Before I dig into this, is this the latest dcd branch?

Has anything changed in how you specify DCD devices on the qemu command line
with this latest work?  Here is what I have:

...
-device cxl-type3,bus=hb0rp0,memdev=cxl-mem0,num-dc-regions=2,nonvolatile-dc-memdev=cxl-dc-mem0,id=cxl-dev0,lsa=cxl-lsa0,sn=0
-device cxl-type3,bus=hb0rp1,memdev=cxl-mem1,num-dc-regions=2,nonvolatile-dc-memdev=cxl-dc-mem1,id=cxl-dev1,lsa=cxl-lsa1,sn=1
-device cxl-type3,bus=hb1rp0,memdev=cxl-mem2,num-dc-regions=2,nonvolatile-dc-memdev=cxl-dc-mem2,id=cxl-dev2,lsa=cxl-lsa2,sn=2
-device cxl-type3,bus=hb1rp1,memdev=cxl-mem3,num-dc-regions=2,nonvolatile-dc-memdev=cxl-dc-mem3,id=cxl-dev3,lsa=cxl-lsa3,sn=3
...


Ira

[1] https://lore.kernel.org/all/20231116-fix-cdat-devm-free-v1-1-b148b40707d7@intel.com/

 
Fan Ni Feb. 13, 2024, 6:18 p.m. UTC | #2

Hi Jonathan,

I have updated the patch set based on your feedback and aligned the code
to cxl spec r3.1.

Here is the new code:
https://github.com/moking/qemu/tree/dcd-v4

I plan to send it out for review early next week. That leaves time to see
whether there is any kernel-side update for dcd this week so I can test more.

If the plan needs to be adjusted to align with the merge window, please
let me know.

v3[1]->v4: 

The code is rebased on mainstream QEMU with the following patch series:

[PATCH 00/12 qemu] CXL emulation fixes and minor cleanup.
[PATCH 0/5 qemu] hw/cxl: Update CXL emulation to reflect and reference r3.1
hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference
hw/cxl/mailbox: interface to add CCI commands to an existing CCI

Main changes include:

1. Updated the specification references to align with cxl spec r3.1.
2. Add extra elements to the get dc region configuration output payload and
process them accordingly in mailbox command 4800h.
3. Removed the unwanted space.
4. Refactored ct3_build_cdat_entries_for_mr and extract it as a separate patch.
5. Updated cxl_create_dc_regions function to derive region len from host
backend size.
6. Changed the logic for creating DC regions when host backend and address
space processing is introduced; cxl_create_dc_regions is now called only
when a host backend exists.
7. Updated the name of the definitions related to DC extents for consistency.
8. Updated dynamic capacity event record definition to align with spec r3.1.
9. Changed the dynamic capacity request processing logic: for a release
request, extra checks are done against the pending list to remove extents
that are yet to be added.
10. Changed the return value of cxl_create_dc_regions so the return can be used
to remove the extent from the list if needed.
11. Offset and size in the qmp interface are now byte-wise; they were
originally MiB-wise.
12. Fixed bugs in handling bitmap for dpa range existence.
13. NOTE: in previous version DC is set to non-volatile, while in this version
we change it to volatile per Jonathan's suggestion.
14. Updated the doc in qapi/cxl.json.
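
To illustrate the kind of bitmap bookkeeping that item 12 refers to, below is
a minimal sketch under stated assumptions; it is not taken from the patches.
DemoRegionMap, the 64-byte block size, and the demo_* helpers are all
hypothetical, and a single 64-bit word stands in for the real per-region
bitmap. The point is only to show how a byte-wise DPA range maps to block
bits, and the difference between "any block backed" and "all blocks backed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BLOCK_SIZE 64u /* hypothetical block size in bytes */
#define DEMO_NBLOCKS    64u /* hypothetical number of blocks per region */

/* Hypothetical per-region map: bit i set means block i is backed. */
typedef struct DemoRegionMap {
    uint64_t base; /* starting DPA of the region */
    uint64_t bits; /* one bit per block */
} DemoRegionMap;

/* Build a mask covering every block touched by [dpa, dpa + len). */
static uint64_t demo_range_mask(const DemoRegionMap *m,
                                uint64_t dpa, uint64_t len)
{
    assert(len > 0 && dpa >= m->base);

    uint64_t first = (dpa - m->base) / DEMO_BLOCK_SIZE;
    uint64_t last = (dpa - m->base + len - 1) / DEMO_BLOCK_SIZE;
    uint64_t nbits = last - first + 1;

    assert(last < DEMO_NBLOCKS);
    /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
    uint64_t mask = (nbits >= 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << nbits) - 1);
    return mask << first;
}

/* True if at least one block in the range is already backed. */
static bool demo_any_bits_set(const DemoRegionMap *m,
                              uint64_t dpa, uint64_t len)
{
    return (m->bits & demo_range_mask(m, dpa, len)) != 0;
}

/* True only if every block in the range is backed. */
static bool demo_all_bits_set(const DemoRegionMap *m,
                              uint64_t dpa, uint64_t len)
{
    uint64_t mask = demo_range_mask(m, dpa, len);

    return (m->bits & mask) == mask;
}

/* Mark the range as backed, e.g. once the host accepts an extent. */
static void demo_set_range(DemoRegionMap *m, uint64_t dpa, uint64_t len)
{
    m->bits |= demo_range_mask(m, dpa, len);
}

/* Mark the range as no longer backed, e.g. after a release. */
static void demo_clear_range(DemoRegionMap *m, uint64_t dpa, uint64_t len)
{
    m->bits &= ~demo_range_mask(m, dpa, len);
}
```

In this sketch an "any bits set" check would guard against adding an extent
that overlaps capacity that is already backed, while an "all bits set" check
would guard an access that must fall entirely within backed capacity.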

Thanks to Jonathan for the detailed review of the last version[1].

The code is tested with Ira's last kernel DCD patch set [2] with some minor
bug fixes[3]. Tested operations include:
1. Create DC region.
2. Add/release DC extents.
3. Convert DC capacity into system RAM.


v3: 
[1] https://lore.kernel.org/linux-cxl/20231107180907.553451-1-nifan.cxl@gmail.com/T/#t
[2] https://github.com/weiny2/linux-kernel/tree/dcd-v3-2023-10-30
[3] https://github.com/moking/linux-dcd/commit/9d24fa6e5d39f934623220953caecc080f93e964