Skip to content

pci_core/cxl_spec: support for partial-DWORD config space accesses#3751

Open
jackschefer-msft wants to merge 7 commits into
microsoft:mainfrom
jackschefer-msft:pci-cfg-partial-wip2
Open

pci_core/cxl_spec: support for partial-DWORD config space accesses#3751
jackschefer-msft wants to merge 7 commits into
microsoft:mainfrom
jackschefer-msft:pci-cfg-partial-wip2

Conversation

@jackschefer-msft

Copy link
Copy Markdown
Contributor

This change is the first in a series to support partial-DWORD config space accesses throughout PCIe an VPCI. This change:

  • introduces new primitives PciConfigAddress, PciConfigByteEnable, and ByteEnabledDword to the chipset_device crate. These primitives are based on how partial accesses are handled in the PCI spec
  • enhances the pci_core and cxl_spec crates to support these partial-DWORD accesses throughout config space and capability emulation helpers

For now, these changes should be a no-op, the bus implementations (VPCI and PCIe) still read-modify-write on full DWORD accesses. Subsequent changes will update config space routing in busses and devices to switch over to this new model (modifying the PciConfigSpace trait itself).

Copilot AI review requested due to automatic review settings June 16, 2026 16:12
@jackschefer-msft jackschefer-msft requested a review from a team as a code owner June 16, 2026 16:12
@smalis-msft

Copy link
Copy Markdown
Contributor

Will this affect normal PCI too?

@jackschefer-msft

Copy link
Copy Markdown
Contributor Author

Will this affect normal PCI too?

This change should not, but subsequent changes that modify PciConfigSpace would need to be adopted in the legacy PCI busses and devices, yes, since that trait is shared

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Ready to approve

Only a minor consistency nit was found; functional changes appear coherent and are backed by new unit tests for partial-access semantics.

Note: this review does not count toward required approvals for merging.

Pull request overview

This PR lays groundwork for PCI partial-DWORD configuration space accesses by introducing byte-enable-aware primitives in chipset_device and threading them through PCI config-space and capability emulation (including CXL DVSECs). It aims to be behavior-preserving for now (buses still do full-DWORD RMW), while enabling follow-on changes to route partial accesses end-to-end.

Changes:

  • Add PciConfigAddress, PciConfigByteEnable, and ByteEnabledDword primitives to model PCI-spec byte enables.
  • Update pci_core config-space emulator and capability/extended-capability traits + implementations to use byte-enabled reads/writes, adding focused unit tests for partial writes.
  • Update VFIO MSI-X handling and CXL DVSEC extended capabilities to work with the new byte-enabled access model.
File summaries
File Description
vm/devices/pci/vfio_assigned_device/src/lib.rs Switch MSI-X capability interception to byte-enable-aware read/write helpers.
vm/devices/pci/pci_core/src/test_helpers/mod.rs Add helpers for reading/writing capability DWORDs via ByteEnabledDword in tests.
vm/devices/pci/pci_core/src/cfg_space_emu.rs Plumb PciConfigAddress + ByteEnabledDword through cfg-space emulation; add partial-write tests.
vm/devices/pci/pci_core/src/capabilities/read_only.rs Migrate read-only capability to byte-enabled read/write API.
vm/devices/pci/pci_core/src/capabilities/pci_express.rs Make PCIe capability byte-enable-aware and add tests covering partial writes preserving RO/RW1C fields.
vm/devices/pci/pci_core/src/capabilities/msix.rs Make MSI-X capability byte-enable-aware via merge-on-write semantics.
vm/devices/pci/pci_core/src/capabilities/msi_cap.rs Make MSI capability byte-enable-aware via merge-on-write semantics.
vm/devices/pci/pci_core/src/capabilities/mod.rs Update PciCapability trait to read/write ByteEnabledDword.
vm/devices/pci/pci_core/src/capabilities/extended/mod.rs Update PciExtendedCapability trait + header contract test for ByteEnabledDword.
vm/devices/pci/pci_core/src/capabilities/extended/acs.rs Make ACS extended capability byte-enable-aware and update tests.
vm/devices/cxl/src/pci_registers/register_locator_dvsec.rs Update CXL Register Locator DVSEC to byte-enabled extended-capability API.
vm/devices/cxl/src/pci_registers/flex_bus_port_dvsec.rs Update Flex Bus Port DVSEC to byte-enabled writes; add tests for disabled-lane RW1CS behavior.
vm/devices/cxl/src/pci_registers/cxl_port_dvsec.rs Update CXL Port DVSEC to byte-enabled writes; add tests for disabled-lane RW1C behavior.
vm/devices/cxl/src/pci_registers/cxl_device_dvsec.rs Update CXL Device DVSEC to byte-enabled writes; add tests for disabled-lane RW1C behavior.
vm/chipset_device/src/pci.rs Introduce the core byte-enable/config-address primitives + unit tests.

Copilot's findings

  • Files reviewed: 15/15 changed files
  • Comments generated: 1

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread vm/devices/pci/pci_core/src/cfg_space_emu.rs
pub device_function: u8,
/// Aligned DWORD register number in configuration space.
pub dword_number: u16,
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PciConfigAddress notably does not contain a segment number, because those don't exist in the TLP definitions, but it does seem like including the segment would be useful so that devices can capture it along with the bus number if needed

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a device should ever need to know its segment number. Any time the device needs its RID (e.g., to tell us which function a DMA is from), the thing the device is passing the RID to should know what segment we're on.

There's no such thing as a device that can span multiple segments, so this should be fine.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the end of the day, the segment number is just a software/firmware convention, after all.


impl PciConfigAddress {
/// Create a new PCI configuration-space request.
pub const fn new(bus: u8, device_function: u8, dword_number: u16) -> Self {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concern that callers might accidentally pass a byte offset instead of a dword_number to this constructor, but I like that passing DWORD numbers around (a) aligns with the TLP definition, and (b) removes implicit "offset must be 32-bit aligned" throughout

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me.

Is there value in something like with_bdf(self, bus, function) so that re-routing is easy?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, also, should we validate here that dword_number is < 1024? That might be a useful invariant to establish (you'd have to remove the pub from dword_number to maintain that invariant, of course).

@smalis-msft

Copy link
Copy Markdown
Contributor

Will this affect normal PCI too?

This change should not, but subsequent changes that modify PciConfigSpace would need to be adopted in the legacy PCI busses and devices, yes, since that trait is shared

Sounds good to me. Just be prepared to encounter some slightly gnarly code I've written in the past to deal with unaligned or fewer-than-4-byte accesses haha. Will be good to make that all cleaner.

Copilot AI review requested due to automatic review settings June 16, 2026 17:12

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

A confirmed partial-write semantics bug in cxl_device_dvsec base-low register handling can incorrectly clear writable bits when byte lanes are disabled.

Copilot's findings
  • Files reviewed: 16/16 changed files
  • Comments generated: 2

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread vm/devices/cxl/src/pci_registers/cxl_device_dvsec.rs
Comment thread vm/devices/cxl/src/pci_registers/cxl_device_dvsec.rs
@jackschefer-msft

Copy link
Copy Markdown
Contributor Author

Will this affect normal PCI too?

This change should not, but subsequent changes that modify PciConfigSpace would need to be adopted in the legacy PCI busses and devices, yes, since that trait is shared

Sounds good to me. Just be prepared to encounter some slightly gnarly code I've written in the past to deal with unaligned or fewer-than-4-byte accesses haha. Will be good to make that all cleaner.

For the legacy PCI bus, it might make sense to just continue doing 4byte read-modify-write for compat. I am not sure whether legacy PCI IO ports support byte enabled accesses in the way that PCIe TLPs do, I'll have to consult a spec

@github-actions

Copy link
Copy Markdown

Copilot AI review requested due to automatic review settings June 17, 2026 00:41

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

One added PCIe capability test should be adjusted to actually exercise status-byte-lane RW1C behavior under partial writes (and a small robustness improvement was noted in MSI-X capability reads).

Copilot's findings
  • Files reviewed: 16/16 changed files
  • Comments generated: 2

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread vm/devices/pci/pci_core/src/capabilities/pci_express.rs
Comment thread vm/devices/pci/pci_core/src/capabilities/msix.rs
@github-actions

Copy link
Copy Markdown

Copilot AI review requested due to automatic review settings June 19, 2026 12:49

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Human review recommended

It refactors core PCI config-space/capability access semantics across multiple device stacks, so a human should validate subtle correctness/compatibility risks despite the added tests.

Copilot's findings
  • Files reviewed: 16/16 changed files
  • Comments generated: 0 new

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

@github-actions

Copy link
Copy Markdown


/// A DWORD value with byte enables for PCI configuration space access.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteEnabledDword {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're sharing this between the read and write paths, and I'm not sure if that's right. In particular, for the read path, passing val: &mut ByteEnabledDword makes it possible to do *val = ByteEnabledDword::with_all_bytes_enabled(v), which is not only wrong but might cause the caller to do the wrong thing if the caller intends to reuse the byte_enable portion of the value (e.g., by calling merge, as is done in the VFIO code).

I think we should probably have two structs: struct ByteEnabledDwordWrite(u32, PciConfigByteEnable) and struct ByteEnabledDwordRead<'a>(&'a mut u32, PciConfigByteEnable). Alternatively, we could just always pass PciConfigByteEnable as an additional parameter.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, after looking through the code, I do feel strongly that these should be split between the read and write paths. But I like the helper methods that we get by having a type, so I don't think you should just pass PciConfigByteEnable around.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, it does get a little weird--if you want to pass the ByteEnabledDwordRead to something that might update it and then use it again, you'll have to add some kind of method like reborrow(&'a mut self) -> ByteEnabledDwordRead<'a>. But this actually is useful anyway--because I think you want a method that scopes down the read to something smaller, e.g. restrict(&'a mut self, bytes: PciConfigByteEnable) -> Self<'a>, and then reborrow (or whatever you call it) is just a special case of that.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(You can't do the restrict thing with your &mut ByteEnabledDword approach)


if offset == CxlDeviceDvsecRegisterOffset::DVSEC_LOCK_CAPABILITY2 {
let requested_lock = (value as u16) & 0x1;
let requested_lock = value.extract() & 0x1;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm--this isn't quite right, is it? We are updating the lock state even if the access didn't include the lock bit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind, we are not unlocking based on this, so it should be OK.

}

let clear_mask = ((value >> 16) as u16) & CXL_DEVICE_DVSEC_STATUS_RW1C_MASK;
let clear_mask = (value.extract() >> 16) as u16 & CXL_DEVICE_DVSEC_STATUS_RW1C_MASK;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to have methods to extract the high and low words, since 16-bit registers are so common?

CxlPortDvsecRegisterOffset::DVSEC_HEADER2_PORT_EXTENSION_STATUS => {
let mut clear_mask = ((value >> 16) as u16) & CXL_PORT_DVSEC_STATUS_RW1C_MASK;
let mut clear_mask =
((value.extract() >> 16) as u16) & CXL_PORT_DVSEC_STATUS_RW1C_MASK;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it really feels like we'd benefit from value.extract_high()

Comment on lines +700 to +702
let mut result = value.extract();
assert!(result & 0xff00 == 0);
result |= next << 8;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For cap offset 0, I wonder if, instead of extracting and setting the value like this, we should instead split the access so that the capability is responsible for the low word and we're responsible for the high dword. That seems more in the spirit of the change.

}

/// Merge enabled byte lanes from this value into `current_value`.
pub const fn merge(self, current_value: u32) -> u32 {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be convenient to also have merge_into(self, current_value: &mut u32).

self.alt_prefetch_mem_limit_high = value;
}
CxlPortDvsecRegisterOffset::DVSEC_CXL_RCRB_BASE => {
let value = value.merge(self.read_dvsec_u32(offset));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm--should we really go through the read path explicitly for these, vs. just reading the specific register?

use crate::io::IoResult;

/// Byte enables for the four lanes of a PCI configuration DWORD.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want a custom Debug impl for this to make it easy to see in traces what's going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants