[2/2] tpm: add backend for mssim

Message ID 20221215180125.24632-3-jejb@linux.ibm.com
State New
Series tpm: add mssim backend

Commit Message

James Bottomley Dec. 15, 2022, 6:01 p.m. UTC
From: James Bottomley <James.Bottomley@HansenPartnership.com>

The Microsoft Simulator (mssim) is the reference emulation platform
for the TCG TPM 2.0 specification.

https://github.com/Microsoft/ms-tpm-20-ref.git

It exports a fairly simple network socket-based protocol on two
sockets, one for command (default 2321) and one for control (default
2322).  This patch adds a simple backend that can speak the mssim
protocol over the network.  It also allows the host and the two ports
to be specified on the qemu command line.  The benefits are twofold:
firstly, it gives us a backend that actually speaks a standard TPM
emulation protocol instead of the Linux-specific TPM driver format of
the current emulated TPM backend; and secondly, using the Microsoft
protocol, the end point of the emulator can be anywhere on the
network, facilitating the cloud use case where a central TPM service
can be used over a control network.
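
As a rough illustration of what "fairly simple" means on the command
socket, the sketch below frames a raw TPM 2.0 command and unpacks a
reply.  The constant TPM_SEND_COMMAND, the locality byte and the
trailing 4-byte acknowledgement are assumptions taken from the
reference simulator's TpmTcpProtocol.h, not from this patch, and the
helper names are made up for the example.

```python
import struct

# Assumed values from the reference simulator's TpmTcpProtocol.h;
# check them against the simulator you actually run.
TPM_SEND_COMMAND = 8
DEFAULT_COMMAND_PORT = 2321


def frame_command(tpm_command: bytes, locality: int = 0) -> bytes:
    """Wrap a raw TPM 2.0 command for the mssim command socket.

    Layout (big-endian): 4-byte opcode, 1-byte locality,
    4-byte command length, then the command bytes.
    """
    header = struct.pack(">IBI", TPM_SEND_COMMAND, locality, len(tpm_command))
    return header + tpm_command


def parse_response(reply: bytes) -> bytes:
    """Return the TPM response from a reply: 4-byte length, body, 4-byte ack."""
    if len(reply) < 8:
        raise ValueError("reply too short")
    (length,) = struct.unpack_from(">I", reply, 0)
    if len(reply) != 4 + length + 4:
        raise ValueError("reply length does not match its header")
    (ack,) = struct.unpack_from(">I", reply, 4 + length)
    if ack != 0:
        raise ValueError("simulator did not acknowledge the command")
    return reply[4:4 + length]


# TPM2_Startup(TPM_SU_CLEAR): tag 0x8001, size 12, command code 0x144, SU 0.
STARTUP_CLEAR = bytes.fromhex("80010000000c000001440000")
framed = frame_command(STARTUP_CLEAR)
```

The framed bytes would be written to the command socket and the reply
read back from the same connection.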

The implementation does basic control commands like power off/on, but
doesn't implement cancellation or startup.  The former because
cancellation is pretty much useless on a fast operating TPM emulator
and the latter because this emulator is designed to be used with OVMF
which itself does TPM startup and I wanted to validate that.
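
For illustration only, a power off/on sequence on the control socket
could be framed as below.  The signal numbers are assumptions taken
from the reference simulator's TpmTcpProtocol.h, and the helper names
are hypothetical; the sketch does not claim to mirror the exact
sequence this patch sends.

```python
import struct

# Assumed signal numbers from the reference simulator's TpmTcpProtocol.h.
TPM_SIGNAL_POWER_ON = 1
TPM_SIGNAL_POWER_OFF = 2
DEFAULT_CONTROL_PORT = 2322


def frame_signal(signal: int) -> bytes:
    """A control request is a single big-endian 32-bit signal number."""
    return struct.pack(">I", signal)


def signal_acknowledged(reply: bytes) -> bool:
    """The simulator is assumed to answer each signal with a 32-bit zero."""
    return len(reply) == 4 and struct.unpack(">I", reply)[0] == 0


# Frames for a power cycle, in the order they would be written.
POWER_CYCLE = [
    frame_signal(TPM_SIGNAL_POWER_OFF),
    frame_signal(TPM_SIGNAL_POWER_ON),
]
```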

To run this, simply download an emulator based on the MS specification
(package ibmswtpm2 on openSUSE) and run it, then add these two lines
to the qemu command and it will use the emulator.

    -tpmdev mssim,id=tpm0 \
    -device tpm-crb,tpmdev=tpm0 \

To use a remote emulator, replace the first line with:

    -tpmdev "{'type':'mssim','id':'tpm0','command':{'type':'inet','host':'remote','port':'2321'}}"

tpm-tis also works as the backend.
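
Hand-quoting the JSON form is error prone, so a small generator can
help.  This is a sketch: the function name and defaults are made up
for the example, and it only sets the 'command' address shown above.
It emits standard double-quoted JSON, which QEMU's parser accepts as
well as the single-quoted form.

```python
import json


def mssim_tpmdev_arg(host: str, port: int = 2321, dev_id: str = "tpm0") -> str:
    """Build the JSON value for -tpmdev pointing at a remote emulator."""
    opts = {
        "type": "mssim",
        "id": dev_id,
        # The port is passed as a string, as in the example above.
        "command": {"type": "inet", "host": host, "port": str(port)},
    }
    return json.dumps(opts, separators=(",", ":"))


arg = mssim_tpmdev_arg("remote")
```

When passing the result on a shell command line, wrap it in single
quotes so the double quotes survive.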

Signed-off-by: James Bottomley <jejb@linux.ibm.com>

---

v2: convert to SocketAddr json and use qio_channel_socket_connect_sync()
---
 MAINTAINERS              |   5 +
 backends/tpm/Kconfig     |   5 +
 backends/tpm/meson.build |   1 +
 backends/tpm/tpm_mssim.c | 251 +++++++++++++++++++++++++++++++++++++++
 backends/tpm/tpm_mssim.h |  43 +++++++
 monitor/hmp-cmds.c       |   7 ++
 qapi/tpm.json            |  25 +++-
 7 files changed, 334 insertions(+), 3 deletions(-)
 create mode 100644 backends/tpm/tpm_mssim.c
 create mode 100644 backends/tpm/tpm_mssim.h

Comments

Stefan Berger Dec. 15, 2022, 6:46 p.m. UTC | #1
On 12/15/22 13:01, James Bottomley wrote:
> [...]

Since this device does not properly support migration you have to register a migration blocker.

    Stefan
James Bottomley Dec. 15, 2022, 7:22 p.m. UTC | #2
On Thu, 2022-12-15 at 13:46 -0500, Stefan Berger wrote:
> 
> 
> On 12/15/22 13:01, James Bottomley wrote:
> > [...]
> 
> Since this device does not properly support migration you have to
> register a migration blocker.

Actually it seems to support migration just fine.  Currently the PCR's
get zero'd which is my fault for doing a TPM power off/on, but
switching that based on state should be an easy fix.

James
Stefan Berger Dec. 15, 2022, 7:35 p.m. UTC | #3
On 12/15/22 14:22, James Bottomley wrote:
> On Thu, 2022-12-15 at 13:46 -0500, Stefan Berger wrote:
>>
>>
>> On 12/15/22 13:01, James Bottomley wrote:
>>> [...]
>>
>> Since this device does not properly support migration you have to
>> register a migration blocker.
> 
> Actually it seems to support migration just fine.  Currently the PCR's
> get zero'd which is my fault for doing a TPM power off/on, but
> switching that based on state should be an easy fix.

How do you handle virsh save  -> host reboot -> virsh restore?

You should also add a description to docs/specs/tpm.rst.

     Stefan

> 
> James
>
James Bottomley Dec. 15, 2022, 7:40 p.m. UTC | #4
On Thu, 2022-12-15 at 14:35 -0500, Stefan Berger wrote:
> 
> 
> On 12/15/22 14:22, James Bottomley wrote:
> > On Thu, 2022-12-15 at 13:46 -0500, Stefan Berger wrote:
> > > 
> > > 
> > > On 12/15/22 13:01, James Bottomley wrote:
> > > > [...]
> > > 
> > > Since this device does not properly support migration you have to
> > > register a migration blocker.
> > 
> > Actually it seems to support migration just fine.  Currently the
> > PCR's
> > get zero'd which is my fault for doing a TPM power off/on, but
> > switching that based on state should be an easy fix.
> 
> How do you handle virsh save  -> host reboot -> virsh restore?

I didn't.  I just pulled out the TPM power state changes and followed
the guide here using the migrate "exec:gzip -c > STATEFILE.gz" recipe:

https://www.linux-kvm.org/page/Migration

and verified the TPM pcrs and the null name were unchanged.

> You should also add a description to docs/specs/tpm.rst.

Description of what?  It functions exactly like passthrough on
migration.  Since the TPM state is retained in the server a
reconnection just brings everything back to where it was.

James
Stefan Berger Dec. 15, 2022, 7:57 p.m. UTC | #5
On 12/15/22 14:40, James Bottomley wrote:
> On Thu, 2022-12-15 at 14:35 -0500, Stefan Berger wrote:
>>
>>
>> On 12/15/22 14:22, James Bottomley wrote:
>>> On Thu, 2022-12-15 at 13:46 -0500, Stefan Berger wrote:
>>>>
>>>>
>>>> On 12/15/22 13:01, James Bottomley wrote:
>>>>> [...]
>>>>
>>>> Since this device does not properly support migration you have to
>>>> register a migration blocker.
>>>
>>> Actually it seems to support migration just fine.  Currently the
>>> PCR's
>>> get zero'd which is my fault for doing a TPM power off/on, but
>>> switching that based on state should be an easy fix.
>>
>> How do you handle virsh save  -> host reboot -> virsh restore?
> 
> I didn't.  I just pulled out the TPM power state changes and followed
> the guide here using the migrate "exec:gzip -c > STATEFILE.gz" recipe:
> 
> https://www.linux-kvm.org/page/Migration
> 
> and verified the TPM pcrs and the null name were unchanged.

> 
>> You should also add a description to docs/specs/tpm.rst.
> 
> Description of what?  It functions exactly like passthrough on

Please describe all the scenarios so that someone else can repeat them when trying out **your** device.

There are sections describing how things work for swtpm, and you should add how things work for the mssim TPM.

https://github.com/qemu/qemu/blob/master/docs/specs/tpm.rst#the-qemu-tpm-emulator-device
https://github.com/qemu/qemu/blob/master/docs/specs/tpm.rst#migration-with-the-tpm-emulator


> migration.  Since the TPM state is retained in the server a
> reconnection just brings everything back to where it was.

So it's remote. And the ports are always open and someone can just connect to the open ports and power cycle the device?

This may not be the most important scenario but nevertheless I wouldn't want to deal with bug reports if someone does 'VM snapshotting' -- how this is correctly handled would be of interest.

    Stefan

> 
> James
>
James Bottomley Dec. 15, 2022, 8:07 p.m. UTC | #6
On Thu, 2022-12-15 at 14:57 -0500, Stefan Berger wrote:
> On 12/15/22 14:40, James Bottomley wrote:
> > On Thu, 2022-12-15 at 14:35 -0500, Stefan Berger wrote:
[...]
> > > You should also add a description to docs/specs/tpm.rst.
> > 
> > Description of what?  It functions exactly like passthrough on
> 
> Please describe all the scenarios so that someone else can repeat
> them when trying out **your** device.
> 
> There are sections describing how things work for swtpm, and you should
> add how things work for the mssim TPM.
> 
> https://github.com/qemu/qemu/blob/master/docs/specs/tpm.rst#the-qemu-tpm-emulator-device
> https://github.com/qemu/qemu/blob/master/docs/specs/tpm.rst#migration-with-the-tpm-emulator

The passthrough snapshot/restore isn't described there either.  This
behaves exactly the same in that it's caveat emptor.  If something
happens in the interim to upset the TPM state then the restore will
have unexpected effects due to the externally changed TPM state.  This
is actually a feature: I'm checking our interposer defences by doing
external state manipulation.

> > migration.  Since the TPM state is retained in the server a
> > reconnection just brings everything back to where it was.
> 
> So it's remote. And the ports are always open and someone can just
> connect to the open ports and power cycle the device?

in the same way as you can power off the hardware and have issues with
a passthrough TPM on vm restore, yes.

> This may not be the most important scenario but nevertheless I
> wouldn't want to deal with bug reports if someone does 'VM
> snapshotting' -- how this is correctly handled would be of interest.

I'd rather say nothing, like passthrough, then there are no
expectations beyond it might work if you know what you're doing.  I
don't really have much interest in the migration use case, but I knew
it should work like the passthrough case, so that's what I tested.

James
Stefan Berger Dec. 15, 2022, 8:22 p.m. UTC | #7
On 12/15/22 15:07, James Bottomley wrote:
> On Thu, 2022-12-15 at 14:57 -0500, Stefan Berger wrote:
>> On 12/15/22 14:40, James Bottomley wrote:
>>> On Thu, 2022-12-15 at 14:35 -0500, Stefan Berger wrote:
> [...]
>>>> You should also add a description to docs/specs/tpm.rst.
>>>
>>> Description of what?  It functions exactly like passthrough on
>>
>> Please describe all the scenarios so that someone else can repeat
>> them when trying out **your** device.
>>
>> There are sections describing how things work for swtpm, and you should
>> add how things work for the mssim TPM.
>>
>> https://github.com/qemu/qemu/blob/master/docs/specs/tpm.rst#the-qemu-tpm-emulator-device
>> https://github.com/qemu/qemu/blob/master/docs/specs/tpm.rst#migration-with-the-tpm-emulator
> 
> The passthrough snapshot/restore isn't described there either.  This

Forget about passthrough, rather compare it to swtpm.

> behaves exactly the same in that it's caveat emptor.  If something
> happens in the interim to upset the TPM state then the restore will
> have unexpected effects due to the externally changed TPM state.  This
> is actually a feature: I'm checking our interposer defences by doing
> external state manipulation.
> 
>>> migration.  Since the TPM state is retained in the server a
>>> reconnection just brings everything back to where it was.
>>
>> So it's remote. And the ports are always open and someone can just
>> connect to the open ports and power cycle the device?
> 
> in the same way as you can power off the hardware and have issues with
> a passthrough TPM on vm restore, yes.

I don't think you should compare the mssim TPM with passthrough, but rather with the swtpm emulator + tpm_emulator backend. That's a much better comparison.

> 
>> This may not be the most important scenario but nevertheless I
>> wouldn't want to deal with bug reports if someone does 'VM
>> snapshotting' -- how this is correctly handled would be of interest.
> 
> I'd rather say nothing, like passthrough, then there are no
> expectations beyond it might work if you know what you're doing.  I

Why do we need this device then if it doesn't handle migration scenarios in the same or better way than swtpm + tpm_emulator backends already do?

> don't really have much interest in the migration use case, but I knew
> it should work like the passthrough case, so that's what I tested.

I think your device needs to block migrations since it doesn't handle all migration scenarios correctly.

    Stefan

> 
> James
>
James Bottomley Dec. 15, 2022, 8:30 p.m. UTC | #8
On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
> On 12/15/22 15:07, James Bottomley wrote:
[...]
> > don't really have much interest in the migration use case, but I
> > knew it should work like the passthrough case, so that's what I
> > tested.
> 
> I think your device needs to block migrations since it doesn't handle
> all migration scenarios correctly.

Passthrough doesn't block migrations either, presumably because it can
also be made to work if you know what you're doing.  I might not be
particularly interested in migrations, but that's not really a good
reason to prevent anyone from ever using them, particularly when the
experiment says they do work.

James
Stefan Berger Dec. 15, 2022, 8:53 p.m. UTC | #9
On 12/15/22 15:30, James Bottomley wrote:
> On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
>> On 12/15/22 15:07, James Bottomley wrote:
> [...]
>>> don't really have much interest in the migration use case, but I
>>> knew it should work like the passthrough case, so that's what I
>>> tested.
>>
>> I think your device needs to block migrations since it doesn't handle
>> all migration scenarios correctly.
> 
> Passthrough doesn't block migrations either, presumably because it can
> also be made to work if you know what you're doing.  I might not be

Don't compare it to passthrough, compare it to swtpm. It should have at least the same features as swtpm or be better, otherwise I don't see why we need to have the backend device in the upstream repo.

     Stefan
Daniel P. Berrangé Dec. 16, 2022, 10:27 a.m. UTC | #10
On Thu, Dec 15, 2022 at 03:53:43PM -0500, Stefan Berger wrote:
> 
> 
> On 12/15/22 15:30, James Bottomley wrote:
> > On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
> > > On 12/15/22 15:07, James Bottomley wrote:
> > [...]
> > > > don't really have much interest in the migration use case, but I
> > > > knew it should work like the passthrough case, so that's what I
> > > > tested.
> > > 
> > > I think your device needs to block migrations since it doesn't handle
> > > all migration scenarios correctly.
> > 
> > Passthrough doesn't block migrations either, presumably because it can
> > also be made to work if you know what you're doing.  I might not be
> 
> Don't compare it to passthrough, compare it to swtpm. It should
> have at least the same features as swtpm or be better, otherwise
> I don't see why we need to have the backend device in the upstream
> repo.

James has explained multiple times that mssim is a beneficial
thing to support, given that it is the reference implementation
of TPM2. Requiring the same or greater features than swtpm is
an unreasonable thing to demand.

With regards,
Daniel
Stefan Berger Dec. 16, 2022, 12:28 p.m. UTC | #11
On 12/16/22 05:27, Daniel P. Berrangé wrote:
> On Thu, Dec 15, 2022 at 03:53:43PM -0500, Stefan Berger wrote:
>>
>>
>> On 12/15/22 15:30, James Bottomley wrote:
>>> On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
>>>> On 12/15/22 15:07, James Bottomley wrote:
>>> [...]
>>>>> don't really have much interest in the migration use case, but I
>>>>> knew it should work like the passthrough case, so that's what I
>>>>> tested.
>>>>
>>>> I think your device needs to block migrations since it doesn't handle
>>>> all migration scenarios correctly.
>>>
>>> Passthrough doesn't block migrations either, presumably because it can
>>> also be made to work if you know what you're doing.  I might not be
>>
>> Don't compare it to passthrough, compare it to swtpm. It should
>> have at least the same features as swtpm or be better, otherwise
>> I don't see why we need to have the backend device in the upstream
>> repo.
> 
> James has explained multiple times that mssim is a beneficial
> thing to support, given that it is the reference implementation
> of TPM2. Requiring the same or greater features than swtpm is
> an unreasonable thing to demand.

Nevertheless, it needs documentation and has to either block migration or handle all migration scenarios correctly. Since it's supposed to be a TPM running remotely, you had asked for TLS support, IIRC.

   Stefan

> 
> With regards,
> Daniel
Daniel P. Berrangé Dec. 16, 2022, 12:54 p.m. UTC | #12
On Fri, Dec 16, 2022 at 07:28:59AM -0500, Stefan Berger wrote:
> 
> 
> On 12/16/22 05:27, Daniel P. Berrangé wrote:
> > On Thu, Dec 15, 2022 at 03:53:43PM -0500, Stefan Berger wrote:
> > > 
> > > 
> > > On 12/15/22 15:30, James Bottomley wrote:
> > > > On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
> > > > > On 12/15/22 15:07, James Bottomley wrote:
> > > > [...]
> > > > > > don't really have much interest in the migration use case, but I
> > > > > > knew it should work like the passthrough case, so that's what I
> > > > > > tested.
> > > > > 
> > > > > I think your device needs to block migrations since it doesn't handle
> > > > > all migration scenarios correctly.
> > > > 
> > > > Passthrough doesn't block migrations either, presumably because it can
> > > > also be made to work if you know what you're doing.  I might not be
> > > 
> > > Don't compare it to passthrough, compare it to swtpm. It should
> > > have at least the same features as swtpm or be better, otherwise
> > > I don't see why we need to have the backend device in the upstream
> > > repo.
> > 
> > James has explained multiple times that mssim is a beneficial
> > thing to support, given that it is the reference implementation
> > of TPM2. Requiring the same or greater features than swtpm is
> > an unreasonable thing to demand.
> 
> Nevertheless it needs documentation and has to handle migration
> scenarios either via a blocker or it has to handle them all
> correctly. Since it's supposed to be a TPM running remote you
> had asked for TLS support iirc.

If the mssim implementation doesn't provide TLS itself, then I don't
consider that a blocker on the QEMU side, merely a nice-to-have.

With swtpm the control channel is being used to load and store state
during the migration dance. This makes the use of an external process
largely transparent to the user, since QEMU handles all the state
save/load as part of its migration data stream.

With mssim there is no state save/load co-ordination with QEMU. Instead,
whoever/whatever is managing the mssim instance is responsible for
ensuring it is running with the correct state at the time QEMU does
a vmstate load. If doing a live migration, this co-ordination is trivial
if you just use the same mssim instance for both src/dst to connect to.

If doing save/restore to disk, the user needs to be able to save the mssim
state and load it again later. If doing snapshots and reverting to old
snapshots, then again whoever manages mssim needs to keep saved
TPM state corresponding to each QEMU snapshot saved, and pick the
right one when restoring to old snapshots.
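
A minimal sketch of that bookkeeping, assuming the simulator keeps its
state in a single on-disk file that can be copied while the simulator
is stopped.  The class, the file names and the directory layout are
hypothetical, and the copy only captures what the simulator has
actually written to that file; anything held only in the simulator's
memory is not covered.

```python
import shutil
import tempfile
from pathlib import Path


class TpmStateStore:
    """Hypothetical helper: one saved copy of the state file per VM snapshot."""

    def __init__(self, state_file: Path, store_dir: Path):
        self.state_file = state_file
        self.store_dir = store_dir
        self.store_dir.mkdir(parents=True, exist_ok=True)

    def _slot(self, snapshot: str) -> Path:
        return self.store_dir / f"{snapshot}.state"

    def save(self, snapshot: str) -> None:
        # Call with the simulator stopped, right after the VM snapshot is taken.
        shutil.copy2(self.state_file, self._slot(snapshot))

    def restore(self, snapshot: str) -> None:
        # Call with the simulator stopped, before QEMU loads the VM snapshot.
        slot = self._slot(snapshot)
        if not slot.exists():
            raise KeyError(f"no TPM state saved for snapshot {snapshot!r}")
        shutil.copy2(slot, self.state_file)


# Demonstration with a stand-in state file in a temporary directory.
workdir = Path(tempfile.mkdtemp())
state = workdir / "state-file"
state.write_bytes(b"state-at-snap1")
store = TpmStateStore(state, workdir / "saved")
store.save("snap1")
state.write_bytes(b"state-after-more-guest-activity")
store.restore("snap1")
restored = state.read_bytes()
```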

QEMU exposes enough functionality to enable a mgmt app / admin user to
achieve all of this.

This is not as seamlessly integrated as swtpm is, but it is still
technically possible to do the right thing with migration from QEMU's
POV. Whether or not the app/person managing the mssim instance actually
does the right thing in practice is not a concern of QEMU. I don't
see a need for a migration blocker here.

With regards,
Daniel
Stefan Berger Dec. 16, 2022, 1:32 p.m. UTC | #13
On 12/16/22 07:54, Daniel P. Berrangé wrote:
> On Fri, Dec 16, 2022 at 07:28:59AM -0500, Stefan Berger wrote:
>>
>>
>> On 12/16/22 05:27, Daniel P. Berrangé wrote:
>>> On Thu, Dec 15, 2022 at 03:53:43PM -0500, Stefan Berger wrote:
>>>>
>>>>
>>>> On 12/15/22 15:30, James Bottomley wrote:
>>>>> On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
>>>>>> On 12/15/22 15:07, James Bottomley wrote:
>>>>> [...]
>>>>>>> don't really have much interest in the migration use case, but I
>>>>>>> knew it should work like the passthrough case, so that's what I
>>>>>>> tested.
>>>>>>
>>>>>> I think your device needs to block migrations since it doesn't handle
>>>>>> all migration scenarios correctly.
>>>>>
>>>>> Passthrough doesn't block migrations either, presumably because it can
>>>>> also be made to work if you know what you're doing.  I might not be
>>>>
>>>> Don't compare it to passthrough, compare it to swtpm. It should
>>>> have at least the same features as swtpm or be better, otherwise
>>>> I don't see why we need to have the backend device in the upstream
>>>> repo.
>>>
>>> James has explained multiple times that mssim is a beneficial
>>> thing to support, given that it is the reference implementation
>>> of TPM2. Requiring the same or greater features than swtpm is
>>> an unreasonable thing to demand.
>>
>> Nevertheless it needs documentation and has to handle migration
>> scenarios either via a blocker or it has to handle them all
>> correctly. Since it's supposed to be a TPM running remote you
>> had asked for TLS support iirc.
> 
> If the mssim implementation doesn't provide TLS itself, then I don't
> consider that a blocker on the QEMU side, merely a nice-to-have.
> 
> With swtpm the control channel is being used to load and store state
> during the migration dance. This makes the use of an external process
> largely transparent to the user, since QEMU handles all the state
> save/load as part of its migration data stream.
> 
> With mssim there is no state save/load co-ordination with QEMU. Instead
> whomever/whatever is managing the mssim instance, is responsible for
> ensuring it is running with the correct state at the time QEMU does
> a vmstate load. If doing a live migration this co-ordination is trivial
> if you just use the same mssim instance for both src/dst to connect to.
> 
> If doing save/store to disk, the user needs to be able to save the mssim
> state and load it again later. If doing snapshots and reverting to old

There is no way for storing and loading the *volatile state* of the mssim device.

> snapshots, then again whomever manages mssim needs to be keeping saved
> TPM state corresponding to each QEMU snapshot saved, and picking the
> right one when restoring to old snapshots.

This doesn't work.
Either way, if it's possible it can be documented and shown how this works.

> 
> QEMU exposes enough functionality to enable a mgmt app / admin user to
> achieve all of this.

How do you store the volatile state of this device, like the current state of the PCRs, loaded sessions etc? It doesn't support this.

> 
> This is not as seamlessly integrated as swtpm is, but it is still
> technically possible to do the right thing with migration from QEMU's
> POV. Whether or not the app/person managing the mssim instance actually
> does the right thing in practice is not a concern of QEMU. I don't
> see a need for a migration blocker here.

I do see it because the *volatile state* cannot be extracted from this device. The state of the PCRs is going to be lost.


Regards,
    Stefan

> 
> With regards,
> Daniel
James Bottomley Dec. 16, 2022, 1:53 p.m. UTC | #14
On Fri, 2022-12-16 at 08:32 -0500, Stefan Berger wrote:
> On 12/16/22 07:54, Daniel P. Berrangé wrote:
> > On Fri, Dec 16, 2022 at 07:28:59AM -0500, Stefan Berger wrote:
[...]
> > > Nevertheless it needs documentation and has to handle migration
> > > scenarios either via a blocker or it has to handle them all
> > > correctly. Since it's supposed to be a TPM running remote you
> > > had asked for TLS support iirc.
> > 
> > If the mssim implementation doesn't provide TLS itself, then I don't
> > consider that a blocker on the QEMU side, merely a nice-to-have.
> > 
> > With swtpm the control channel is being used to load and store
> > state during the migration dance. This makes the use of an external
> > process largely transparent to the user, since QEMU handles all the
> > state save/load as part of its migration data stream.
> > 
> > With mssim there is no state save/load co-ordination with QEMU.
> > Instead whoever/whatever is managing the mssim instance is
> > responsible for ensuring it is running with the correct state at
> > the time QEMU does a vmstate load. If doing a live migration this
> > co-ordination is trivial if you just use the same mssim instance
> > for both src/dst to connect to.
> > 
> > If doing save/store to disk, the user needs to be able to save the
> > mssim state and load it again later. If doing snapshots and
> > reverting to old
> 
> There is no way for storing and loading the *volatile state* of the
> mssim device.

Well, yes there is, it saves internal TPM state to an NVChip file:

https://github.com/microsoft/ms-tpm-20-ref/blob/main/TPMCmd/Platform/src/NVMem.c

However, if I were running this as a service, I'd condition saving and
restoring state on a connection protocol, which would mean QEMU
wouldn't have to worry about it.  The simplest approach, of course, is
just to keep the service running even when the VM is suspended so the
state is kept internally.
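
For the file-based route, here is a minimal sketch of what an external
manager might do to keep one NVChip copy per QEMU snapshot.  It assumes
the simulator is stopped while the file is copied and that its state
file is called NVChip in its working directory; the function names and
the "NVChip.<tag>" naming are hypothetical.  It only preserves whatever
the simulator writes to that file, so it does not settle the question
raised in this thread about volatile state such as PCRs and sessions.

```python
import shutil
from pathlib import Path


def save_tpm_state(nvchip: Path, store: Path, tag: str) -> Path:
    """Copy the simulator's NVChip file into `store` under a snapshot tag."""
    store.mkdir(parents=True, exist_ok=True)
    dest = store / f"NVChip.{tag}"
    shutil.copy2(nvchip, dest)
    return dest


def restore_tpm_state(nvchip: Path, store: Path, tag: str) -> None:
    """Put the NVChip copy saved under `tag` back in place."""
    src = store / f"NVChip.{tag}"
    if not src.is_file():
        raise FileNotFoundError(f"no saved TPM state for snapshot {tag!r}")
    shutil.copy2(src, nvchip)
```

The tag would be chosen to match the QEMU snapshot name, so that the
manager can pick the matching file before the simulator is restarted.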

> > snapshots, then again whomever manages mssim needs to be keeping
> > saved TPM state corresponding to each QEMU snapshot saved, and
> > picking the right one when restoring to old snapshots.
> 
> This doesn't work.

I already told you I tested this and it does work.  I'll actually add
the migration state check to the power on/off path because I need that
for testing S3 anyway.

> Either way, if it's possible it can be documented and shown how this
> works.

I could do a blog post, but I really don't think you want this in
official documentation because that creates support expectations.
> 
> > QEMU exposes enough functionality to enable a mgmt app / admin user
> > to achieve all of this.
> 
> How do you store the volatile state of this device, like the current
> state of the PCRs, loaded sessions etc? It doesn't support this.

That's not the only way of doing migration.  This precise problem
exists for VFIO and PCI pass through devices as well: external state is
stored in the card and that state must be matched in some way for the
card to work on resume.  Pretty much any external device coupled to the
VM has this problem.  As I keep saying, you're thinking about this in
the wrong way: it's not a system directly slaved to QEMU; it's an
independent daemon which must be managed separately.  The design is for
it to function like a passthrough.

> > This is not as seamlessly integrated as swtpm is, but it is still
> > technically possible to do the right thing with migration from
> > QEMU's POV. Whether or not the app/person managing mssim instance
> > actually does the right thing in practice is not a concern of QEMU.
> > I don't see a need for a migration blocker here.
> 
> I do see it because the *volatile state* cannot be extracted from
> this device. The state of the PCRs is going to be lost.

Installing a migration blocker would prevent me from exercising the S3
paths, which I want to test.

James
Stefan Berger Dec. 16, 2022, 2:01 p.m. UTC | #15
On 12/16/22 08:53, James Bottomley wrote:
> On Fri, 2022-12-16 at 08:32 -0500, Stefan Berger wrote:
>> On 12/16/22 07:54, Daniel P. Berrangé wrote:
>>> On Fri, Dec 16, 2022 at 07:28:59AM -0500, Stefan Berger wrote:
> [...]
>>>> Nevertheless it needs documentation and has to handle migration
>>>> scenarios either via a blocker or it has to handle them all
>>>> correctly. Since it's supposed to be a TPM running remote you
>>>> had asked for TLS support iirc.
>>>
>>> If the mssim implementation doesn't provide TLS itself, then I don't
>>> consider that a blocker on the QEMU side, merely a nice-to-have.
>>>
>>> With swtpm the control channel is being used to load and store
>>> state during the migration dance. This makes the use of an external
>>> process largely transparent to the user, since QEMU handles all the
>>> state save/load as part of its migration data stream.
>>>
>>> With mssim there is no state save/load co-ordination with QEMU.
>>> Instead whoever/whatever is managing the mssim instance is
>>> responsible for ensuring it is running with the correct state at
>>> the time QEMU does a vmstate load. If doing a live migration this
>>> co-ordination is trivial if you just use the same mssim instance
>>> for both src/dst to connect to.
>>>
>>> If doing save/store to disk, the user needs to be able to save the
>>> mssim state and load it again later. If doing snapshots and
>>> reverting to old
>>
>> There is no way for storing and loading the *volatile state* of the
>> mssim device.
> 
> Well, yes there is, it saves internal TPM state to an NVChip file:
> 
> https://github.com/microsoft/ms-tpm-20-ref/blob/main/TPMCmd/Platform/src/NVMem.c
> 
> However, if I were running this as a service, I'd condition saving and
> restoring state on a connection protocol, which would mean QEMU
> wouldn't have to worry about it.  The simplest approach, of course, is
> just to keep the service running even when the VM is suspended so the
> state is kept internally.
> 
>>> snapshots, then again whomever manages mssim needs to be keeping
>>> saved TPM state corresponding to each QEMU snapshot saved, and
>>> picking the right one when restoring to old snapshots.
>>
>> This doesn't work.
> 
> I already told you I tested this and it does work.  I'll actually add
> the migration state check to the power on/off path because I need that
> for testing S3 anyway.


Please document how this needs to be done.
> 
>> Either way, if it's possible it can be documented and shown how this
>> works.
> 
> I could do a blog post, but I really don't think you want this in
> official documentation because that creates support expectations.

We have documentation for passthrough and tpm_emulator. If you don't want to add documentation for it to QEMU then please add the driver in as 'unsupported'.

diff --git a/MAINTAINERS b/MAINTAINERS
index 1729c0901c..32fa2eb282 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3017,6 +3017,7 @@ F: include/hw/acpi/tpm.h
  F: include/sysemu/tpm*
  F: qapi/tpm.json
  F: backends/tpm/
+X: backends/tpm/tpm_mssim.*
  F: tests/qtest/*tpm*
  T: git https://github.com/stefanberger/qemu-tpm.git tpm-next

    Stefan
Daniel P. Berrangé Dec. 16, 2022, 2:29 p.m. UTC | #16
On Fri, Dec 16, 2022 at 08:32:44AM -0500, Stefan Berger wrote:
> 
> 
> On 12/16/22 07:54, Daniel P. Berrangé wrote:
> > On Fri, Dec 16, 2022 at 07:28:59AM -0500, Stefan Berger wrote:
> > > 
> > > 
> > > On 12/16/22 05:27, Daniel P. Berrangé wrote:
> > > > On Thu, Dec 15, 2022 at 03:53:43PM -0500, Stefan Berger wrote:
> > > > > 
> > > > > 
> > > > > On 12/15/22 15:30, James Bottomley wrote:
> > > > > > On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
> > > > > > > On 12/15/22 15:07, James Bottomley wrote:
> > > > > > [...]
> > > > > > > > don't really have much interest in the migration use case, but I
> > > > > > > > knew it should work like the passthrough case, so that's what I
> > > > > > > > tested.
> > > > > > > 
> > > > > > > I think your device needs to block migrations since it doesn't handle
> > > > > > > all migration scenarios correctly.
> > > > > > 
> > > > > > Passthrough doesn't block migrations either, presumably because it can
> > > > > > also be made to work if you know what you're doing.  I might not be
> > > > > 
> > > > > Don't compare it to passthrough, compare it to swtpm. It should
> > > > > have at least the same features as swtpm or be better, otherwise
> > > > > I don't see why we need to have the backend device in the upstream
> > > > > repo.
> > > > 
> > > > James has explained multiple times that mssim is a beneficial
> > > > thing to support, given that it is the reference implementation
> > > > of TPM2. Requiring the same or greater features than swtpm is
> > > > an unreasonable thing to demand.
> > > 
> > > Nevertheless it needs documentation and has to handle migration
> > > scenarios either via a blocker or it has to handle them all
> > > correctly. Since it's supposed to be a TPM running remote you
> > > had asked for TLS support iirc.
> > 
> > If the mssim implementation doesn't provide TLS itself, then I don't
> > consider that a blocker on the QEMU side, merely a nice-to-have.
> > 
> > With swtpm the control channel is being used to load and store state
> > during the migration dance. This makes the use of an external process
> > largely transparent to the user, since QEMU handles all the state
> > save/load as part of its migration data stream.
> > 
> > With mssim there is no state save/load co-ordination with QEMU. Instead
> > whoever/whatever is managing the mssim instance is responsible for
> > ensuring it is running with the correct state at the time QEMU does
> > a vmstate load. If doing a live migration this co-ordination is trivial
> > if you just use the same mssim instance for both src/dst to connect to.
> > 
> > If doing save/store to disk, the user needs to be able to save the mssim
> > state and load it again later. If doing snapshots and reverting to old
> 
> There is no way for storing and loading the *volatile state* of the
> mssim device.
> 
> > snapshots, then again whomever manages mssim needs to be keeping saved
> > TPM state corresponding to each QEMU snapshot saved, and picking the
> > right one when restoring to old snapshots.
> 
> This doesn't work.
> Either way, if it's possible it can be documented and shown how this works.
> 
> > 
> > QEMU exposes enough functionality to enable a mgmt app / admin user to
> > achieve all of this.
> 
> How do you store the volatile state of this device, like the current
> state of the PCRs, loaded sessions etc? It doesn't support this.
> 
> > 
> > This is not as seamlessly integrated as swtpm is, but it is still
> > technically possible to do the right thing with migration from QEMU's
> > POV. Whether or not the app/person managing mssim instance actually
> > does the right thing in practice is not a concern of QEMU. I don't
> > see a need for a migration blocker here.
> 
> I do see it because the *volatile state* cannot be extracted from
> this device. The state of the PCRs is going to be lost.

All the objections you're raising are related to the current
specifics of the implementation of the mssim remote server.
While valid, this is of no concern to QEMU when deciding whether
to require a migration blocker on the client side. This is a 3rd
party remote service that should be considered a black box from
QEMU's POV. It is possible to write a remote server that supports
the mssim network protocol, and has the ability to serialize
its state. Whether such an impl exists today or not is separate.

With regards,
Daniel
Stefan Berger Dec. 16, 2022, 2:55 p.m. UTC | #17
On 12/16/22 09:29, Daniel P. Berrangé wrote:

> 
> All the objections you're raising are related to the current
> specifics of the implementation of the mssim remote server.
> While valid, this is of no concern to QEMU when deciding whether
> to require a migration blocker on the client side. This is 3rd
> party remote service that should be considered a black box from
> QEMU's POV. It is possible to write a remote server that supports
> the mssim network protocol, and has the ability to serialize
> its state. Whether such an impl exists today or not is separate.

Then let's document the scenarios so someone can repeat them; I think this is only fair. James said he tested state migration scenarios and they work, so let's enable others to do so as well. I am open to someone maintaining just this driver and the dynamics that may develop around it.

Regards,
    Stefan

> With regards,
> Daniel
James Bottomley Dec. 16, 2022, 3:48 p.m. UTC | #18
On Fri, 2022-12-16 at 09:55 -0500, Stefan Berger wrote:
> 
> 
> On 12/16/22 09:29, Daniel P. Berrangé wrote:
> 
> > 
> > All the objections you're raising are related to the current
> > specifics of the implementation of the mssim remote server.
> > While valid, this is of no concern to QEMU when deciding whether
> > to require a migration blocker on the client side. This is 3rd
> > party remote service that should be considered a black box from
> > QEMU's POV. It is possible to write a remote server that supports
> > the mssim network protocol, and has the ability to serialize
> > its state. Whether such an impl exists today or not is separate.
> 
> Then let's document the scenarios so someone can repeat them, I think
> this is just fair. James said he tested state migration scenarios and
> it works, so let's enable others to do it as well. I am open to
> someone maintaining just this driver and the dynamics that may
> develop around it.

Well, OK, this is what I think would be appropriate ... I'll fold it in
to the second patch.

James

---

diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index 535912a92b..985d0775a0 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -270,6 +270,38 @@ available as a module (assuming a TPM 2 is passed through):
   /sys/devices/LNXSYSTEM:00/LNXSYBUS:00/MSFT0101:00/tpm/tpm0/pcr-sha256/9
   ...
 
+The QEMU TPM Microsoft Simulator Device
+---------------------------------------
+
+The TCG provides a reference implementation for TPM 2.0 written by
+Microsoft (See `ms-tpm-20-ref`_ on github).  The reference implementation
+starts a network server and listens for TPM commands on port 2321 and
+TPM Platform control commands on port 2322, although these can be
+altered.  The QEMU mssim TPM backend talks to this implementation.  By
+default it connects to the default ports on localhost:
+
+.. code-block:: console
+
+  qemu-system-x86_64 <qemu-options> \
+    -tpmdev mssim,id=tpm0 \
+    -device tpm-crb,tpmdev=tpm0
+
+
+It can also communicate with a remote host, which must be
+specified as a SocketAddress via JSON on the command line for each of
+the command and control ports:
+
+.. code-block:: console
+
+  qemu-system-x86_64 <qemu-options> \
+    -tpmdev "{'type':'mssim','id':'tpm0','command':{'type':'inet','host':'remote','port':'2321'},'control':{'type':'inet','host':'remote','port':'2322'}}" \
+    -device tpm-crb,tpmdev=tpm0
+
+
+The mssim backend supports snapshotting and migration, but the state
+of the Microsoft Simulator server must be preserved (or the server
+kept running) outside of QEMU for restore to be successful.
+
 The QEMU TPM emulator device
 ----------------------------
 
@@ -526,3 +558,6 @@ the following:
 
 .. _SWTPM protocol:
    https://github.com/stefanberger/swtpm/blob/master/man/man3/swtpm_ioctls.pod
+
+.. _ms-tpm-20-ref:
+   https://github.com/microsoft/ms-tpm-20-ref
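
Since the JSON form is easy to get wrong by hand, a small helper can
generate it.  This is only a sketch: the function name and defaults are
hypothetical, the 'command' and 'control' keys are the ones used in the
documentation text above, and it assumes QEMU accepts standard
double-quoted JSON for -tpmdev in the same way as the single-quoted
form.

```python
import json
import shlex


def mssim_tpmdev_arg(host: str, command_port: int = 2321,
                     control_port: int = 2322, dev_id: str = "tpm0") -> str:
    """Return the JSON value for -tpmdev pointing at a remote simulator."""
    def inet(port: int) -> dict:
        # The documented examples pass the port as a string.
        return {"type": "inet", "host": host, "port": str(port)}

    return json.dumps({
        "type": "mssim",
        "id": dev_id,
        "command": inet(command_port),
        "control": inet(control_port),
    })


if __name__ == "__main__":
    arg = mssim_tpmdev_arg("remote")
    # shlex.quote keeps the JSON intact when pasted into a shell.
    print("-tpmdev", shlex.quote(arg), "-device tpm-crb,tpmdev=tpm0")
```

Generating the argument avoids unquoted values such as a bare inet
slipping into the command line.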
Stefan Berger Dec. 16, 2022, 4:08 p.m. UTC | #19
On 12/16/22 10:48, James Bottomley wrote:
> On Fri, 2022-12-16 at 09:55 -0500, Stefan Berger wrote:
>>
>>
>> On 12/16/22 09:29, Daniel P. Berrangé wrote:
>>
>>>
>>> All the objections you're raising are related to the current
>>> specifics of the implementation of the mssim remote server.
>>> While valid, this is of no concern to QEMU when deciding whether
>>> to require a migration blocker on the client side. This is 3rd
>>> party remote service that should be considered a black box from
>>> QEMU's POV. It is possible to write a remote server that supports
>>> the mssim network protocol, and has the ability to serialize
>>> its state. Whether such an impl exists today or not is separate.
>>
>> Then let's document the scenarios so someone can repeat them, I think
>> this is just fair. James said he tested state migration scenarios and
>> it works, so let's enable others to do it as well. I am open to
>> someone maintaining just this driver and the dynamics that may
>> develop around it.
> 
> Well, OK, this is what I think would be appropriate ... I'll fold it in
> to the second patch.
> 
> James
> 
> ---
> 
> diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
> index 535912a92b..985d0775a0 100644
> --- a/docs/specs/tpm.rst
> +++ b/docs/specs/tpm.rst
> @@ -270,6 +270,38 @@ available as a module (assuming a TPM 2 is passed through):
>     /sys/devices/LNXSYSTEM:00/LNXSYBUS:00/MSFT0101:00/tpm/tpm0/pcr-sha256/9
>     ...
>   
> +The QEMU TPM Microsoft Simulator Device
> +---------------------------------------
> +
> +The TCG provides a reference implementation for TPM 2.0 written by
> +Microsoft (See `ms-tpm-20-ref`_ on github).  The reference implementation
> +starts a network server and listens for TPM commands on port 2321 and
> +TPM Platform control commands on port 2322, although these can be
> +altered.  The QEMU mssim TPM backend talks to this implementation.  By
> +default it connects to the default ports on localhost:
> +
> +.. code-block:: console
> +
> +  qemu-system-x86_64 <qemu-options> \
> +    -tpmdev mssim,id=tpm0 \
> +    -device tpm-crb,tpmdev=tpm0
> +
> +
> +Although it can also communicate with a remote host, which must be
> +specified as a SocketAddress via json on the command line for each of
> +the command and control ports:
> +
> +.. code-block:: console
> +
> +  qemu-system-x86_64 <qemu-options> \
> +    -tpmdev "{'type':'mssim','id':'tpm0','command':{'type':'inet','host':'remote','port':'2321'},'control':{'type':'inet','host':'remote','port':'2322'}}" \
> +    -device tpm-crb,tpmdev=tpm0
> +
> +
> +The mssim backend supports snapshotting and migration, but the state
> +of the Microsoft Simulator server must be preserved (or the server
> +kept running) outside of QEMU for restore to be successful.

You said you tested it. Can you show how to set it up with command lines? I want to try out at least suspend and resume.



    Stefan

> +
>   The QEMU TPM emulator device
>   ----------------------------
>   
> @@ -526,3 +558,6 @@ the following:
>   
>   .. _SWTPM protocol:
>      https://github.com/stefanberger/swtpm/blob/master/man/man3/swtpm_ioctls.pod
> +
> +.. _ms-tpm-20-ref:
> +   https://github.com/microsoft/ms-tpm-20-ref
>
James Bottomley Dec. 16, 2022, 4:13 p.m. UTC | #20
On Fri, 2022-12-16 at 11:08 -0500, Stefan Berger wrote:
> On 12/16/22 10:48, James Bottomley wrote:
[...]
> > +The mssim backend supports snapshotting and migration, but the
> > state
> > +of the Microsoft Simulator server must be preserved (or the server
> > +kept running) outside of QEMU for restore to be successful.
> 
> You said you tested it. Can you show how to set it up with command
> lines? I want to try out at least suspend and resume .

I already did here:

https://lore.kernel.org/qemu-devel/77bc5a11fcb7b06deba1c54b1ef2de28e0c53fb1.camel@linux.ibm.com/

But to recap, it's

stop                                                               
migrate "exec:gzip -c > STATEFILE.gz"                              
quit

Followed by a restart with

<qemu-command-line> -incoming "exec: gzip -c -d STATEFILE.gz"
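
For anyone scripting this, the two halves of the recap can be generated
as below.  It is a sketch only: the function names are hypothetical, the
state file name is assumed to contain no spaces or shell
metacharacters, and how the monitor commands reach QEMU (stdio monitor,
socket, and so on) is left to the caller.

```python
def save_commands(statefile: str) -> list[str]:
    """Monitor commands that stop the guest and write its state out."""
    return ["stop", f'migrate "exec:gzip -c > {statefile}"', "quit"]


def incoming_args(statefile: str) -> list[str]:
    """Extra QEMU arguments for restarting from the saved state."""
    return ["-incoming", f"exec: gzip -c -d {statefile}"]
```

The list from incoming_args() is meant to be appended to the same QEMU
command line that started the guest, including the -tpmdev and -device
options.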

James
Stefan Berger Dec. 16, 2022, 4:21 p.m. UTC | #21
On 12/16/22 11:13, James Bottomley wrote:
> On Fri, 2022-12-16 at 11:08 -0500, Stefan Berger wrote:
>> On 12/16/22 10:48, James Bottomley wrote:
> [...]
>>> +The mssim backend supports snapshotting and migration, but the
>>> state
>>> +of the Microsoft Simulator server must be preserved (or the server
>>> +kept running) outside of QEMU for restore to be successful.
>>
>> You said you tested it. Can you show how to set it up with command
>> lines? I want to try out at least suspend and resume .
> 
> I already did here:
> 
> https://lore.kernel.org/qemu-devel/77bc5a11fcb7b06deba1c54b1ef2de28e0c53fb1.camel@linux.ibm.com/
> 
> But to recap, it's
> 
> stop
> migrate "exec:gzip -c > STATEFILE.gz"
> quit
> 
> Followed by a restart with
> 
> <qemu-command-line> -incoming "exec: gzip -c -d STATEFILE.gz"

Good, you can put it into the documentation. Can I do a reboot of the host in between or does the TPM have to keep on running?

    Stefan

> 
> James
>
Stefan Berger Dec. 19, 2022, 11:49 a.m. UTC | #22
On 12/16/22 08:53, James Bottomley wrote:

> 
> I could do a blog post, but I really don't think you want this in
> official documentation because that creates support expectations.

We get support expectations if we don't mention it as not being supported. So, since this driver is not supported, the documentation for QEMU should state something along the lines of 'this driver is for experimental or testing purposes and is otherwise unsupported.' That's fair to the user and maintainer. Nevertheless, if the documentation (or, as a matter of fact, the code) were to claim that VM / TPM state migration scenarios, such as VM snapshotting, are working, then users should be able to ask someone 'how' this can be done with the mssim protocol **today**. Since I cannot answer that question, you may need to find a way to address this concern.

Regards,
    Stefan
James Bottomley Dec. 19, 2022, 1:02 p.m. UTC | #23
On Mon, 2022-12-19 at 06:49 -0500, Stefan Berger wrote:
> 
> 
> On 12/16/22 08:53, James Bottomley wrote:
> 
> > 
> > I could do a blog post, but I really don't think you want this in
> > official documentation because that creates support expectations.
> 
> We get support expectations if we don't mention it as not being
> supported. So, since this driver is not supported the documentation
> for QEMU should state something along the lines of 'this driver is
> for experimental or testing purposes and is otherwise unsupported.'
> That's fair to the user and maintainer.

Open source projects don't provide support.  I already added a
Maintainer entry for it, so I'll maintain it.

>  Nevertheless, if the documentation (or as a matter of fact the code)
> was to claim that VM / TPM state migration scenarios, such as VM
> snapshotting, are working then users should be able to ask someone
> 'how' this can be done with the mssim protocol **today**. Since I
> cannot answer that question you may need to find a way for how to
> address this concern.

I already proposed all of this ... you were the one wanting to document
migration.  The current wording is:

   The mssim backend supports snapshotting and migration, but the state
   of the Microsoft Simulator server must be preserved (or the server
   kept running) outside of QEMU for restore to be successful.

James
Stefan Berger Dec. 19, 2022, 2:01 p.m. UTC | #24
On 12/19/22 08:02, James Bottomley wrote:
> On Mon, 2022-12-19 at 06:49 -0500, Stefan Berger wrote:
>>
>>
>> On 12/16/22 08:53, James Bottomley wrote:
>>
>>>
>>> I could do a blog post, but I really don't think you want this in
>>> official documentation because that creates support expectations.
>>
>> We get support expectations if we don't mention it as not being
>> supported. So, since this driver is not supported the documentation
>> for QEMU should state something along the lines of 'this driver is
>> for experimental or testing purposes and is otherwise unsupported.'
>> That's fair to the user and maintainer.
> 
> Open source project don't provide support.  I already added a
> Maintainer entry for it, so I'll maintain it.

Support for me means reacting to user questions and addressing issues. Good that you maintain this now.

> 
>>   Nevertheless, if the documentation (or as a matter of fact the code)
>> was to claim that VM / TPM state migration scenarios, such as VM
>> snapshotting, are working then users should be able to ask someone
>> 'how' this can be done with the mssim protocol **today**. Since I
>> cannot answer that question you may need to find a way for how to
>> address this concern.
> 
> I already proposed all of this ... you were the one wanting to document
> migration.  The current wording is:

By 'documenting' I meant that I wanted to see how users need to provide command lines for the mssim TPM.

> 
>     The mssim backend supports snapshotting and migration, but the state
>     of the Microsoft Simulator server must be preserved (or the server
>     kept running) outside of QEMU for restore to be successful.

VM snapshotting is basically VM suspend / resume on steroids, requiring permanent and volatile state to be saved and restorable from possibly very different points in time with possibly different seeds, NVRAM locations etc. How the mssim protocol does this is non-obvious to me, and how one coordinates the saving and restoring of the TPM's state without direct coordination by QEMU is also non-obvious.

    Stefan

> 
> James
> 
>
Dr. David Alan Gilbert Jan. 9, 2023, 4:59 p.m. UTC | #25
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Fri, Dec 16, 2022 at 08:32:44AM -0500, Stefan Berger wrote:
> > 
> > 
> > On 12/16/22 07:54, Daniel P. Berrangé wrote:
> > > On Fri, Dec 16, 2022 at 07:28:59AM -0500, Stefan Berger wrote:
> > > > 
> > > > 
> > > > On 12/16/22 05:27, Daniel P. Berrangé wrote:
> > > > > On Thu, Dec 15, 2022 at 03:53:43PM -0500, Stefan Berger wrote:
> > > > > > 
> > > > > > 
> > > > > > On 12/15/22 15:30, James Bottomley wrote:
> > > > > > > On Thu, 2022-12-15 at 15:22 -0500, Stefan Berger wrote:
> > > > > > > > On 12/15/22 15:07, James Bottomley wrote:
> > > > > > > [...]
> > > > > > > > > don't really have much interest in the migration use case, but I
> > > > > > > > > knew it should work like the passthrough case, so that's what I
> > > > > > > > > tested.
> > > > > > > > 
> > > > > > > > I think your device needs to block migrations since it doesn't handle
> > > > > > > > all migration scenarios correctly.
> > > > > > > 
> > > > > > > Passthrough doesn't block migrations either, presumably because it can
> > > > > > > also be made to work if you know what you're doing.  I might not be
> > > > > > 
> > > > > > Don't compare it to passthrough, compare it to swtpm. It should
> > > > > > have at least the same features as swtpm or be better, otherwise
> > > > > > I don't see why we need to have the backend device in the upstream
> > > > > > repo.
> > > > > 
> > > > > James has explained multiple times that mssim is a beneficial
> > > > > thing to support, given that it is the reference implementation
> > > > > of TPM2. Requiring the same or greater features than swtpm is
> > > > > an unreasonable thing to demand.
> > > > 
> > > > Nevertheless it needs documentation and has to handle migration
> > > > scenarios either via a blocker or it has to handle them all
> > > > correctly. Since it's supposed to be a TPM running remote you
> > > > had asked for TLS support iirc.
> > > 
> > > If the mssim implementation doesn't provide TLS itself, then I don't
> > > consider that a blocker on the QEMU side, merely a nice-to-have.
> > > 
> > > With swtpm the control channel is being used to load and store state
> > > during the migration dance. This makes the use of an external process
> > > largely transparent to the user, since QEMU handles all the state
> > > save/load as part of its migration data stream.
> > > 
> > > With mssim there is no state save/load co-ordination with QEMU. Instead
> > > whoever/whatever is managing the mssim instance is responsible for
> > > ensuring it is running with the correct state at the time QEMU does
> > > a vmstate load. If doing a live migration this co-ordination is trivial
> > > if you just use the same mssim instance for both src/dst to connect to.
> > > 
> > > If doing save/store to disk, the user needs to be able to save the mssim
> > > state and load it again later. If doing snapshots and reverting to old
> > 
> > There is no way for storing and loading the *volatile state* of the
> > mssim device.
> > 
> > > snapshots, then again whomever manages mssim needs to be keeping saved
> > > TPM state corresponding to each QEMU snapshot saved, and picking the
> > > right one when restoring to old snapshots.
> > 
> > This doesn't work.
> > Either way, if it's possible it can be documented and shown how this works.
> > 
> > > 
> > > QEMU exposes enough functionality to enable a mgmt app / admin user to
> > > achieve all of this.
> > 
> > How do you store the volatile state of this device, like the current
> > state of the PCRs, loaded sessions etc? It doesn't support this.
> > 
> > > 
> > > This is not as seamlessly integrated as swtpm is, but it is still
> > > technically possible to do the right thing with migration from QEMU's
> > > POV. Whether or not the app/person managing mssim instance actually
> > > does the right thing in practice is not a concern of QEMU. I don't
> > > see a need for a migration blocker here.
> > 
> > I do see it because the *volatile state* cannot be extracted from
> > this device. The state of the PCRs is going to be lost.
> 
> All the objections you're raising are related to the current
> specifics of the implementation of the mssim remote server.
> While valid, this is of no concern to QEMU when deciding whether
> to require a migration blocker on the client side. This is 3rd
> party remote service that should be considered a black box from
> QEMU's POV. It is possible to write a remote server that supports
> the mssim network protocol, and has the ability to serialize
> its state. Whether such an impl exists today or not is separate.

We would normally want an example of a working implementation though
wouldn't we?

So I think it's fair to at least want some documentation; if it can be
documented and works, fine; if it doesn't work, then it needs a blocker.

Dave

> With regards,
> Daniel
> 
>
James Bottomley Jan. 9, 2023, 5:43 p.m. UTC | #26
On Mon, 2023-01-09 at 16:59 +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Fri, Dec 16, 2022 at 08:32:44AM -0500, Stefan Berger wrote:
[...]
> > > I do see it because the *volatile state* cannot be extracted from
> > > this device. The state of the PCRs is going to be lost.
> > 
> > All the objections you're raising are related to the current
> > specifics of the implementation of the mssim remote server.
> > While valid, this is of no concern to QEMU when deciding whether
> > to require a migration blocker on the client side. This is 3rd
> > party remote service that should be considered a black box from
> > QEMU's POV. It is possible to write a remote server that supports
> > the mssim network protocol, and has the ability to serialize
> > its state. Whether such an impl exists today or not is separate.
> 
> We would normally want an example of a working implementation though
> wouldn't we?
> 
> So I think it's fair to at least want some documentation; if it can
> be documented and works, fine; if it doesn't work, then it needs a
> blocker.

It works under limited circumstances ... in fact similar circumstances
passthrough migration works under, which is also not documented.  The
external MSSIM TPM emulator has to be kept running to preserve the
state.  If you restart it, the migration will fail.
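
A management script could at least check that something is still
listening on the command port before starting the incoming side.  This
is a sketch under assumptions: the function name is hypothetical, the
default port is the simulator's default command port, and it assumes
the simulator tolerates a connection that is opened and closed without
sending anything.  A successful connect does not prove the simulator
was never restarted, only that it is up now.

```python
import socket


def simulator_reachable(host: str = "localhost", port: int = 2321,
                        timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```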

James
Dr. David Alan Gilbert Jan. 9, 2023, 5:52 p.m. UTC | #27
* James Bottomley (jejb@linux.ibm.com) wrote:
> On Mon, 2023-01-09 at 16:59 +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Fri, Dec 16, 2022 at 08:32:44AM -0500, Stefan Berger wrote:
> [...]
> > > > I do see it because the *volatile state* cannot be extracted from
> > > > this device. The state of the PCRs is going to be lost.
> > > 
> > > All the objections you're raising are related to the current
> > > specifics of the implementation of the mssim remote server.
> > > While valid, this is of no concern to QEMU when deciding whether
> > > to require a migration blocker on the client side. This is 3rd
> > > party remote service that should be considered a black box from
> > > QEMU's POV. It is possible to write a remote server that supports
> > > the mssim network protocol, and has the ability to serialize
> > > its state. Whether such an impl exists today or not is separate.
> > 
> > We would normally want an example of a working implementation though
> > wouldn't we?
> > 
> > So I think it's fair to at least want some documentation; if it can
> > be documented and works, fine; if it doesn't work, then it needs a
> > blocker.
> 
> It works under limited circumstances ... in fact similar circumstances
> passthrough migration works under,

Well, not that similar - people expect passthrough migration to fail
because, being nailed to a physical server's hardware, it's not likely to
migrate; whereas you're creating a new virtual thing which people might
imagine is similar to the existing swtpm.  Their imagination might be
wrong and thus you need to say why.

> which is also not documented.  The

Inductive proof that we should have no good documentation doesn't get us
anywhere.

> external MSSIM TPM emulator has to be kept running to preserve the
> state.  If you restart it, the migration will fail.

Document that and we're getting there.

Dave

> James
>
James Bottomley Jan. 9, 2023, 5:55 p.m. UTC | #28
On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert wrote:
> * James Bottomley (jejb@linux.ibm.com) wrote:
[...]
> > external MSSIM TPM emulator has to be kept running to preserve the
> > state.  If you restart it, the migration will fail.
> 
> Document that and we're getting there.


The documentation in the current patch series says

----
The mssim backend supports snapshotting and migration, but the state
of the Microsoft Simulator server must be preserved (or the server
kept running) outside of QEMU for restore to be successful.
----

What, beyond this would you want to see?

James
Stefan Berger Jan. 9, 2023, 6:34 p.m. UTC | #29
On 1/9/23 12:55, James Bottomley wrote:
> On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert wrote:
>> * James Bottomley (jejb@linux.ibm.com) wrote:
> [...]
>>> external MSSIM TPM emulator has to be kept running to preserve the
>>> state.  If you restart it, the migration will fail.
>>
>> Document that and we're getting there.
> 
> 
> The documentation in the current patch series says
> 
> ----
> The mssim backend supports snapshotting and migration, but the state
> of the Microsoft Simulator server must be preserved (or the server
> kept running) outside of QEMU for restore to be successful.
> ----
> 
> What, beyond this would you want to see?

mssim today lacks the functionality of marshalling and unmarshalling the permanent and volatile state of the TPM 2, which are both needed for snapshot support. How does this work with mssim?

    Stefan

> 
> James
>
James Bottomley Jan. 9, 2023, 6:51 p.m. UTC | #30
On Mon, 2023-01-09 at 13:34 -0500, Stefan Berger wrote:
> 
> 
> On 1/9/23 12:55, James Bottomley wrote:
> > On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert wrote:
> > > * James Bottomley (jejb@linux.ibm.com) wrote:
> > [...]
> > > > external MSSIM TPM emulator has to be kept running to preserve
> > > > the state.  If you restart it, the migration will fail.
> > > 
> > > Document that and we're getting there.
> > 
> > 
> > The documentation in the current patch series says
> > 
> > ----
> > The mssim backend supports snapshotting and migration, but the
> > state of the Microsoft Simulator server must be preserved (or the
> > server kept running) outside of QEMU for restore to be successful.
> > ----
> > 
> > What, beyond this would you want to see?
> 
> mssim today lacks the functionality of marshalling and unmarshalling
> the permanent and volatile state of the TPM 2, which are both needed
> for snapshot support. How does this work with mssim?

You preserve the state by keeping the simulator running as the above
says.  As long as you can preserve the state, there's no maximum time
between snapshots.  There's no need of marshal/unmarshal if you do
this.

James
Dr. David Alan Gilbert Jan. 9, 2023, 6:54 p.m. UTC | #31
* James Bottomley (jejb@linux.ibm.com) wrote:
> On Mon, 2023-01-09 at 13:34 -0500, Stefan Berger wrote:
> > 
> > 
> > On 1/9/23 12:55, James Bottomley wrote:
> > > On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert wrote:
> > > > * James Bottomley (jejb@linux.ibm.com) wrote:
> > > [...]
> > > > > external MSSIM TPM emulator has to be kept running to preserve
> > > > > the state.  If you restart it, the migration will fail.
> > > > 
> > > > Document that and we're getting there.
> > > 
> > > 
> > > The documentation in the current patch series says
> > > 
> > > ----
> > > The mssim backend supports snapshotting and migration, but the
> > > state of the Microsoft Simulator server must be preserved (or the
> > > server kept running) outside of QEMU for restore to be successful.
> > > ----
> > > 
> > > What, beyond this would you want to see?
> > 
> > mssim today lacks the functionality of marshalling and unmarshalling
> > the permanent and volatile state of the TPM 2, which are both needed
> > for snapshot support. How does this work with mssim?
> 
> You preserve the state by keeping the simulator running as the above
> says.  As long as you can preserve the state, there's no maximum time
> between snapshots.  There's no need of marshal/unmarshal if you do
> this.

So I think I can understand how that works with a suspend/resume; I'm
less sure about a live migration.

In a live migration, you normally start up the destination VM
qemu process and other processes attached to it, prior to the inwards
live migration of state.  Then you live migrate the state, then kill the
source.

With this mssim setup, will the start up of the destination attempt
to change the vtpm state during the initialisation?

Dave

> James
>
James Bottomley Jan. 9, 2023, 6:59 p.m. UTC | #32
On Mon, 2023-01-09 at 18:54 +0000, Dr. David Alan Gilbert wrote:
> * James Bottomley (jejb@linux.ibm.com) wrote:
> > On Mon, 2023-01-09 at 13:34 -0500, Stefan Berger wrote:
> > > 
> > > 
> > > On 1/9/23 12:55, James Bottomley wrote:
> > > > On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert
> > > > wrote:
> > > > > * James Bottomley (jejb@linux.ibm.com) wrote:
> > > > [...]
> > > > > > external MSSIM TPM emulator has to be kept running to
> > > > > > preserve
> > > > > > the state.  If you restart it, the migration will fail.
> > > > > 
> > > > > Document that and we're getting there.
> > > > 
> > > > 
> > > > The documentation in the current patch series says
> > > > 
> > > > ----
> > > > The mssim backend supports snapshotting and migration, but the
> > > > state of the Microsoft Simulator server must be preserved (or
> > > > the
> > > > server kept running) outside of QEMU for restore to be
> > > > successful.
> > > > ----
> > > > 
> > > > What, beyond this would you want to see?
> > > 
> > > mssim today lacks the functionality of marshalling and
> > > unmarshalling
> > > the permanent and volatile state of the TPM 2, which are both
> > > needed
> > > for snapshot support. How does this work with mssim?
> > 
> > You preserve the state by keeping the simulator running as the
> > above
> > says.  As long as you can preserve the state, there's no maximum
> > time
> > between snapshots.  There's no need of marshal/unmarshal if you do
> > this.
> 
> So I think I can understand how that works with a suspend/resume; I'm
> less sure about a live migration.
> 
> In a live migration, you normally start up the destination VM
> qemu process and other processes attached to it, prior to the inwards
> live migration of state.  Then you live migrate the state, then kill
> the source.
> 
> With this mssim setup, will the start up of the destination attempt
> to change the vtpm state during the initialisation?

The backend driver contains state checks to prevent this, so if you
follow the standard migration in

https://www.qemu.org/docs/master/devel/migration.html

it detects that you have done a migration on shutdown and simply closes
the TPM socket.  On start up it sees you're in migrate and doesn't do
the power on reset of the TPM.
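
The two checks described above can be sketched roughly as follows.  This
is only an illustration of the decision logic, not the code in the patch:
the enum and both function names are hypothetical stand-ins for whatever
run state the backend actually consults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VM run state the backend would consult. */
enum vm_phase {
    VM_PHASE_NORMAL,            /* ordinary start up or shutdown */
    VM_PHASE_INCOMING_MIGRATE,  /* destination waiting for migrated state */
    VM_PHASE_MIGRATED_AWAY,     /* source after a completed migration */
};

/* Reject values that are not one of the three phases above. */
static void check_phase(enum vm_phase phase)
{
    assert(phase >= VM_PHASE_NORMAL && phase <= VM_PHASE_MIGRATED_AWAY);
}

/*
 * Start up: power cycle the external TPM unless the VM is about to
 * receive migrated state, in which case the TPM must be left untouched.
 */
static bool should_power_cycle_on_startup(enum vm_phase phase)
{
    check_phase(phase);
    return phase != VM_PHASE_INCOMING_MIGRATE;
}

/*
 * Shutdown: send power off unless the VM has migrated away, in which
 * case only the socket is closed so the TPM keeps its state.
 */
static bool should_power_off_on_shutdown(enum vm_phase phase)
{
    check_phase(phase);
    return phase != VM_PHASE_MIGRATED_AWAY;
}
```

With this shape the source side never powers off the TPM after a
migration and the destination side never resets it, which is the
property the external simulator relies on to keep its state.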

James
Stefan Berger Jan. 9, 2023, 7:01 p.m. UTC | #33
On 1/9/23 13:51, James Bottomley wrote:
> On Mon, 2023-01-09 at 13:34 -0500, Stefan Berger wrote:
>>
>>
>> On 1/9/23 12:55, James Bottomley wrote:
>>> On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert wrote:
>>>> * James Bottomley (jejb@linux.ibm.com) wrote:
>>> [...]
>>>>> external MSSIM TPM emulator has to be kept running to preserve
>>>>> the state.  If you restart it, the migration will fail.
>>>>
>>>> Document that and we're getting there.
>>>
>>>
>>> The documentation in the current patch series says
>>>
>>> ----
>>> The mssim backend supports snapshotting and migration, but the
>>> state of the Microsoft Simulator server must be preserved (or the
>>> server kept running) outside of QEMU for restore to be successful.
>>> ----
>>>
>>> What, beyond this would you want to see?
>>
>> mssim today lacks the functionality of marshalling and unmarshalling
>> the permanent and volatile state of the TPM 2, which are both needed
>> for snapshot support. How does this work with mssim?
> 
> You preserve the state by keeping the simulator running as the above
> says.  As long as you can preserve the state, there's no maximum time
> between snapshots.  There's no need of marshal/unmarshal if you do
> this

From https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg03146.html

"VM snapshotting is basically VM suspend / resume on steroids requiring
permanent and volatile state to be saved and restoreable from possible very
different points in time with possibly different seeds, NVRAM locations etc.
How the mssim protocol does this is non-obvious to me and how one coordinates
the restoring and saving of the TPM's state without direct coordination by QEMU
is also non-obvious."


    Stefan
> 
> James
>
Stefan Berger Jan. 9, 2023, 9:06 p.m. UTC | #34
On 1/9/23 14:01, Stefan Berger wrote:
> 
> 
> On 1/9/23 13:51, James Bottomley wrote:
>> On Mon, 2023-01-09 at 13:34 -0500, Stefan Berger wrote:
>>> 
>>> 
>>> On 1/9/23 12:55, James Bottomley wrote:
>>>> On Mon, 2023-01-09 at 17:52 +0000, Dr. David Alan Gilbert 
>>>> wrote:
>>>>> * James Bottomley (jejb@linux.ibm.com) wrote:
>>>> [...]
>>>>>> external MSSIM TPM emulator has to be kept running to 
>>>>>> preserve the state.  If you restart it, the migration will 
>>>>>> fail.
>>>>> 
>>>>> Document that and we're getting there.
>>>> 
>>>> 
>>>> The documentation in the current patch series says
>>>> 
>>>> ---- The mssim backend supports snapshotting and migration,
>>>> but the state of the Microsoft Simulator server must be
>>>> preserved (or the server kept running) outside of QEMU for
>>>> restore to be successful. ----
>>>> 
>>>> What, beyond this would you want to see?
>>> 
>>> mssim today lacks the functionality of marshalling and 
>>> unmarshalling the permanent and volatile state of the TPM 2, 
>>> which are both needed for snapshot support. How does this work 
>>> with mssim?
>> 
>> You preserve the state by keeping the simulator running as the 
>> above says.  As long as you can preserve the state, there's no 
>> maximum time between snapshots.  There's no need of 
>> marshal/unmarshal if you do this
> 
> From 
> https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg03146.html
> 
> "VM snapshotting is basically VM suspend / resume on steroids 
> requiring permanent and volatile state to be saved and restoreable 
> from possible very different points in time with possibly different 
> seeds, NVRAM locations etc. How the mssim protocol does this is 
> non-obvious to me and how one coordinates the restoring and saving
> of the TPM's state without direct coordination by QEMU is also 
> non-obvious."

One thing, though: I am aware of the issues that may arise due to
support for TPM state migration. However, whether TPM state migration becomes an issue
depends on how you use the TPM 2.

If the use case is to use the TPM 2 as a local crypto device then state migration
is likely not an issue. You may have different keys in the TPM 2 at
different points in time and even snapshotting may not be an issue but possibly
quite a welcome feature to have along with support of scenarios of VM suspend + host
upgrade + host reboot + VM resume.

If you use TPM 2 for attestation then certain TPM 2 state migration scenarios
may become problematic. One could construct a scenario where attestation precedes
some action that requires trust to have been established in the system in the
preceding attestation step. Support for snapshotting the state of the TPM 2
could then become an issue if I were to wait for the attestation to have concluded
and then quickly restart a different snapshot that is not trustworthy, while the client
proceeds thinking that the system is trustworthy (maybe a few SYNs from the client
went into the void).

Eliminating TPM 2 state migration is probably not a good idea, because environments
where attestation may occur may also support VM suspend/resume along with upgrading
a host and rebooting the host or VM migration for some sort of host evacuation
before upgrade.


When it comes to snapshotting and using the TPM 2 as a crypto device, just saying that
VM snapshot is supported by leaving the TPM 2 running and not touching it doesn't make
this function correctly for all scenarios where the TPM 2 may have had different keys
loaded. It is an even worse idea for attestation, where I could construct a snapshot A,
wait until the attestation has passed, and then resume with a snapshot A' that runs
untrustworthy software but uses the state of the TPM 2 from the time of snapshot A and remains
happy to quote the state of the PCRs from before. If launching a snapshot also restores
the state of the PCRs that goes along with the state of the system at that time, then
that at least allows for quotes to have valid PCR contents that reflect the
system state at snapshot A'.

Kexec also comes to mind in this context, where I could quickly start a new system
post attestation. So a physical system could possibly be used for fooling clients as well.

A solution for how to resolve this may involve some sort of protocol and a connection
that may not be broken *while* the system needs to be in a trusted state. The protocol
would have to help detect substantial changes of state such as resume of some
snapshot or kexec into a system. Repeated attestation (with correctly restored TPM 2 state)
may also help resolve the issue.

Cheers!
   Stefan




> 
> 
> Stefan .
>> 
>> James
>>

>
James Bottomley Jan. 10, 2023, 2:14 p.m. UTC | #35
On Mon, 2023-01-09 at 16:06 -0500, Stefan Berger wrote:
> On 1/9/23 14:01, Stefan Berger wrote:
[...]
> If you use TPM 2 for attestation then certain TPM 2 state migration
> scenarios may become problematic. One could construct a scenario
> where attestation precedes some action that requires trust to have
> been established in the system in the preceding attestation step and
> support for snapshotting the state of the TPM 2 could become an issue
> if I was to wait for the attestation to have been concluded and then
> I quickly restart a different snapshot that is not trustworthy and
> the client proceeds thinking that the system is trustworthy (maybe a
> few SYNs from the client went into the void)

You're over thinking this.  For a non-confidential VM, Migration gives
you a saved image you can always replay from (this is seen as a feature
for fast starts) and if you use the tpm_simulator the TPM state is
stored in the migration image, so you can always roll it back if you
have access to the migration file.  Saving the image state is also a
huge problem because the TPM seeds are in the clear if the migration
image isn't encrypted.  The other big problem is that an external
software TPM is always going to give up its state to the service
provider, regardless of migration, so you have to have some trust in
the provider and thus you'd also have to trust them with the migration
replay policy.  For Confidential VMs, this is a bit different because
the vTPM runs in a secure ring inside the confidential enclave and the
secure migration agent ensures that either migration and startup happen
or migration doesn't happen at all, so for them you don't have to worry
about rollback.

Provided you can trust the vTPM provider, having external state not
stored in the migration image has the potential actually to solve the
rollback problem because you could keep the TPM clock running and
potentially increase the reset count, so migrations would show up in
TPM quotes and you don't have control of the state of the vTPM to
replay it.
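
As a rough sketch of what "show up in TPM quotes" could mean for a
verifier: a TPM 2.0 quote carries the TPM's clock information, so two
successive quotes can be compared.  The struct below mirrors three
fields of TPMS_CLOCK_INFO from the TPM 2.0 specification; the struct
name, the function name and the policy itself are assumptions made for
illustration, and nothing in this patch implements them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of TPMS_CLOCK_INFO as reported inside a TPM 2.0 quote. */
struct clock_info {
    uint64_t clock;          /* time in ms the TPM has been powered */
    uint32_t reset_count;    /* number of TPM resets */
    uint32_t restart_count;  /* number of restarts/resumes since reset */
};

/*
 * Hypothetical verifier policy: two quotes belong to one uninterrupted
 * run only if neither counter changed and the clock did not go back.
 */
static bool quotes_are_continuous(const struct clock_info *prev,
                                  const struct clock_info *cur)
{
    assert(prev != NULL && cur != NULL);
    return cur->reset_count == prev->reset_count &&
           cur->restart_count == prev->restart_count &&
           cur->clock >= prev->clock;
}
```

If the vTPM provider bumped the reset count on every migration, a
verifier applying a check like this would see the discontinuity and
could ask for a fresh attestation.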

James
Stefan Berger Jan. 10, 2023, 2:47 p.m. UTC | #36
On 1/10/23 09:14, James Bottomley wrote:
> On Mon, 2023-01-09 at 16:06 -0500, Stefan Berger wrote:
>> On 1/9/23 14:01, Stefan Berger wrote:
> [...]
>> If you use TPM 2 for attestation then certain TPM 2 state migration
>> scenarios may become problematic. One could construct a scenario
>> where attestation precedes some action that requires trust to have
>> been established in the system in the preceding attestation step and
>> support for snapshotting the state of the TPM 2 could become an issue
>> if I was to wait for the attestation to have been concluded and then
>> I quickly restart a different snapshot that is not trustworthy and
>> the client proceeds thinking that the system is trustworthy (maybe a
>> few SYNs from the client went into the void)
> 
> You're over thinking this.  For a non-confidential VM, Migration gives
> you a saved image you can always replay from (this is seen as a feature
> for fast starts) and if you use the tpm_simulator the TPM state is
> stored in the migration image, so you can always roll it back if you

'How' is it stored in the migration image? Does tpm_simulator marshal and unmarshal the state so
that it is carried inside the save image? For the tpm_emulator backend this particular code is
here:
- https://github.com/qemu/qemu/blob/master/backends/tpm/tpm_emulator.c#L758
- https://github.com/qemu/qemu/blob/master/backends/tpm/tpm_emulator.c#L792

> have access to the migration file.  Saving the image state is also a
> huge problem because the TPM seeds are in the clear if the migration
> image isn't encrypted.  The other big problem is that an external

True. DAC protection of the file versus protection via encryption. Neither really helps against malicious root.

> software TPM is always going to give up its state to the service
> provider, regardless of migration, so you have to have some trust in
> the provider and thus you'd also have to trust them with the migration
> replay policy.  For Confidential VMs, this is a bit different because
> the vTPM runs in a secure ring inside the confidential enclave and the
> secure migration agent ensures that either migration and startup happen
> or migration doesn't happen at all, so for them you don't have to worry
> about rollback.

what is the enclave here? Is it an SGX enclave or is it running somewhere inside the address space of the VM?

> 
> Provided you can trust the vTPM provider, having external state not
> stored in the migration image has the potential actually to solve the
> rollback problem because you could keep the TPM clock running and
> potentially increase the reset count, so migrations would show up in
> TPM quotes and you don't have control of the state of the vTPM to
> replay it.

I just don't see how you do that and prevent scenarios where VM A is suspended and then the tpm_simulator just sits there with
the state and one resumes VM B with the state.

   Stefan

> 
> James
>
James Bottomley Jan. 10, 2023, 2:55 p.m. UTC | #37
On Tue, 2023-01-10 at 09:47 -0500, Stefan Berger wrote:
> On 1/10/23 09:14, James Bottomley wrote:
> > On Mon, 2023-01-09 at 16:06 -0500, Stefan Berger wrote:
> > > On 1/9/23 14:01, Stefan Berger wrote:
> > [...]
> > > If you use TPM 2 for attestation then certain TPM 2 state
> > > migration scenarios may become problematic. One could construct a
> > > scenario where attestation precedes some action that requires
> > > trust to have been established in the system in the preceding
> > > attestation step and support for snapshotting the state of the
> > > TPM 2 could become an issue if I was to wait for the attestation
> > > to have been concluded and then I quickly restart a different
> > > snapshot that is not trustworthy and the client proceeds thinking
> > > that the system is trustworthy (maybe a few SYNs from the client
> > > went into the void)
> > 
> > You're over thinking this.  For a non-confidential VM, Migration
> > gives you a saved image you can always replay from (this is seen as
> > a feature for fast starts) and if you use the tpm_simulator the TPM
> > state is stored in the migration image, so you can always roll it
> > back if you
> 
> 'How' is it stored in the migration image? Does tpm_simulator marshal
> and unmarshal the state so that it is carried inside the save image?
> For the tpm_emulator backend this particular code is here:
> -
> https://github.com/qemu/qemu/blob/master/backends/tpm/tpm_emulator.c#L758
> -
> https://github.com/qemu/qemu/blob/master/backends/tpm/tpm_emulator.c#L792

We seem to be going around in circles: your TPM simulator stores the
TPM state in the migration image, mine keeps it in the external TPM. 
The above paragraph is referring to your simulator.

> > have access to the migration file.  Saving the image state is also
> > a huge problem because the TPM seeds are in the clear if the
> > migration image isn't encrypted.  The other big problem is that an
> > external
> 
> True. DAC protection of the file versus protection via encryption.
> Neither really helps against malicious root.
> 
> > software TPM is always going to give up its state to the service
> > provider, regardless of migration, so you have to have some trust
> > in the provider and thus you'd also have to trust them with the
> > migration replay policy.  For Confidential VMs, this is a bit
> > different because the vTPM runs in a secure ring inside the
> > confidential enclave and the secure migration agent ensures that
> > either migration and startup happen or migration doesn't happen at
> > all, so for them you don't have to worry about rollback.
> 
> what is the enclave here? Is it an SGX enclave or is it running
> somewhere inside the address space of the VM?

The only current one we're playing with is the SEV-SNP SVSM vTPM which
runs the TPM in VMPL0.

> > 
> > Provided you can trust the vTPM provider, having external state not
> > stored in the migration image has the potential actually to solve
> > the rollback problem because you could keep the TPM clock running
> > and potentially increase the reset count, so migrations would show
> > up in TPM quotes and you don't have control of the state of the
> > vTPM to replay it.
> 
> I just don't see how you do that and prevent scenarios where VM A is
> suspended and then the tpm_simulator just sits there with
> the state and one resumes VM B with the state.

You can't with your TPM simulator because it stores state in the image.
If the state is external (not stored in the image) then rolling back
the image doesn't roll back the TPM state.

James
Stefan Berger Jan. 10, 2023, 3 p.m. UTC | #38
On 1/10/23 09:55, James Bottomley wrote:
> On Tue, 2023-01-10 at 09:47 -0500, Stefan Berger wrote:
>> On 1/10/23 09:14, James Bottomley wrote:
>>> On Mon, 2023-01-09 at 16:06 -0500, Stefan Berger wrote:
>>>> On 1/9/23 14:01, Stefan Berger wrote:
>>> [...]
>>>> If you use TPM 2 for attestation then certain TPM 2 state
>>>> migration scenarios may become problematic. One could construct a
>>>> scenario where attestation precedes some action that requires
>>>> trust to have been established in the system in the preceding
>>>> attestation step and support for snapshotting the state of the
>>>> TPM 2 could become an issue if I was to wait for the attestation
>>>> to have been concluded and then I quickly restart a different
>>>> snapshot that is not trustworthy and the client proceeds thinking
>>>> that the system is trustworthy (maybe a few SYNs from the client
>>>> went into the void)
>>>
>>> You're over thinking this.  For a non-confidential VM, Migration
>>> gives you a saved image you can always replay from (this is seen as
>>> a feature for fast starts) and if you use the tpm_simulator the TPM
>>> state is stored in the migration image, so you can always roll it
>>> back if you
>>
>> 'How' is it stored in the migration image? Does tpm_simulator marshal
>> and unmarshal the state so that it is carried inside the save image?
>> For the tpm_emulator backend this particular code is here:
>> -
>> https://github.com/qemu/qemu/blob/master/backends/tpm/tpm_emulator.c#L758
>> -
>> https://github.com/qemu/qemu/blob/master/backends/tpm/tpm_emulator.c#L792
> 
> We seem to be going around in circles: your TPM simulator stores the
> TPM state in the migration image, mine keeps it in the external TPM.
> The above paragraph is referring to your simulator.

My simulator is typically called 'swtpm'.


> 
>>> have access to the migration file.  Saving the image state is also
>>> a huge problem because the TPM seeds are in the clear if the
>>> migration image isn't encrypted.  The other big problem is that an
>>> external
>>
>> True. DAC protection of the file versus protection via encryption.
>> Neither really helps against malicious root.
>>
>>> software TPM is always going to give up its state to the service
>>> provider, regardless of migration, so you have to have some trust
>>> in the provider and thus you'd also have to trust them with the
>>> migration replay policy.  For Confidential VMs, this is a bit
>>> different because the vTPM runs in a secure ring inside the
>>> confidential enclave and the secure migration agent ensures that
>>> either migration and startup happen or migration doesn't happen at
>>> all, so for them you don't have to worry about rollback.
>>
>> what is the enclave here? Is it an SGX enclave or is it running
>> somewhere inside the address space of the VM?
> 
> The only current one we're playing with is the SEV-SNP SVSM vTPM which
> runs the TPM in VMPL0.

And how is this related to this PR?

> 
>>>
>>> Provided you can trust the vTPM provider, having external state not
>>> stored in the migration image has the potential actually to solve
>>> the rollback problem because you could keep the TPM clock running
>>> and potentially increase the reset count, so migrations would show
>>> up in TPM quotes and you don't have control of the state of the
>>> vTPM to replay it.
>>
>> I just don't see how you do that and prevent scenarios where VM A is
>> suspended and then the tpm_simulator just sits there with
>> the state and one resumes VM B with the state.
> 
> You can't with your TPM simulator because it stores state in the image.
> If the state is external (not stored in the image) then rolling back
> the image doesn't roll back the TPM state.

And resuming VM B with the TPM state of suspended VM A is considered 'good'?

    Stefan

> 
> James
>

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 6966490c94..a4a3bf9ab4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3046,6 +3046,11 @@  F: backends/tpm/
 F: tests/qtest/*tpm*
 T: git https://github.com/stefanberger/qemu-tpm.git tpm-next
 
+MSSIM TPM Backend
+M: James Bottomley <jejb@linux.ibm.com>
+S: Maintained
+F: backends/tpm/tpm_mssim.*
+
 Checkpatch
 S: Odd Fixes
 F: scripts/checkpatch.pl
diff --git a/backends/tpm/Kconfig b/backends/tpm/Kconfig
index 5d91eb89c2..d6d6fa53e9 100644
--- a/backends/tpm/Kconfig
+++ b/backends/tpm/Kconfig
@@ -12,3 +12,8 @@  config TPM_EMULATOR
     bool
     default y
     depends on TPM_BACKEND
+
+config TPM_MSSIM
+    bool
+    default y
+    depends on TPM_BACKEND
diff --git a/backends/tpm/meson.build b/backends/tpm/meson.build
index 7f2503f84e..c7c3c79125 100644
--- a/backends/tpm/meson.build
+++ b/backends/tpm/meson.build
@@ -3,4 +3,5 @@  if have_tpm
   softmmu_ss.add(files('tpm_util.c'))
   softmmu_ss.add(when: 'CONFIG_TPM_PASSTHROUGH', if_true: files('tpm_passthrough.c'))
   softmmu_ss.add(when: 'CONFIG_TPM_EMULATOR', if_true: files('tpm_emulator.c'))
+  softmmu_ss.add(when: 'CONFIG_TPM_MSSIM', if_true: files('tpm_mssim.c'))
 endif
diff --git a/backends/tpm/tpm_mssim.c b/backends/tpm/tpm_mssim.c
new file mode 100644
index 0000000000..7c10ce2944
--- /dev/null
+++ b/backends/tpm/tpm_mssim.c
@@ -0,0 +1,251 @@ 
+/*
+ * Emulator TPM driver which connects over the mssim protocol
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Copyright (c) 2022
+ * Author: James Bottomley <jejb@linux.ibm.com>
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/sockets.h"
+
+#include "qapi/clone-visitor.h"
+#include "qapi/qapi-visit-tpm.h"
+
+#include "io/channel-socket.h"
+
+#include "sysemu/tpm_backend.h"
+#include "sysemu/tpm_util.h"
+
+#include "qom/object.h"
+
+#include "tpm_int.h"
+#include "tpm_mssim.h"
+
+#define ERROR_PREFIX "TPM mssim Emulator: "
+
+#define TYPE_TPM_MSSIM "tpm-mssim"
+OBJECT_DECLARE_SIMPLE_TYPE(TPMmssim, TPM_MSSIM)
+
+struct TPMmssim {
+    TPMBackend parent;
+
+    TpmTypeOptions *opts;
+
+    QIOChannelSocket *cmd_qc, *ctrl_qc;
+};
+
+static int tpm_send_ctrl(TPMmssim *t, uint32_t cmd, Error **errp)
+{
+    int ret;
+
+    cmd = htonl(cmd);
+    ret = qio_channel_write_all(QIO_CHANNEL(t->ctrl_qc), (char *)&cmd, sizeof(cmd), errp);
+    if (ret != 0)
+        return ret;
+    ret = qio_channel_read_all(QIO_CHANNEL(t->ctrl_qc), (char *)&cmd, sizeof(cmd), errp);
+    if (ret != 0)
+        return ret;
+    if (cmd != 0) {
+        error_setg(errp, ERROR_PREFIX "Incorrect ACK received on control channel 0x%x", cmd);
+        return -1;
+    }
+    return 0;
+}
+
+static void tpm_mssim_instance_init(Object *obj)
+{
+}
+
+static void tpm_mssim_instance_finalize(Object *obj)
+{
+    TPMmssim *t = TPM_MSSIM(obj);
+
+    if (t->ctrl_qc)
+        tpm_send_ctrl(t, TPM_SIGNAL_POWER_OFF, NULL);
+
+    object_unref(OBJECT(t->ctrl_qc));
+    object_unref(OBJECT(t->cmd_qc));
+}
+
+static void tpm_mssim_cancel_cmd(TPMBackend *tb)
+{
+    return;
+}
+
+static TPMVersion tpm_mssim_get_version(TPMBackend *tb)
+{
+    return TPM_VERSION_2_0;
+}
+
+static size_t tpm_mssim_get_buffer_size(TPMBackend *tb)
+{
+    /* TCG standard profile max buffer size */
+    return 4096;
+}
+
+static TpmTypeOptions *tpm_mssim_get_opts(TPMBackend *tb)
+{
+    TPMmssim *t = TPM_MSSIM(tb);
+    TpmTypeOptions *opts;
+
+    opts = QAPI_CLONE(TpmTypeOptions, t->opts);
+
+    return opts;
+}
+
+static void tpm_mssim_handle_request(TPMBackend *tb, TPMBackendCmd *cmd,
+                                     Error **errp)
+{
+    TPMmssim *t = TPM_MSSIM(tb);
+    uint32_t header, len;
+    uint8_t locality = cmd->locty;
+    struct iovec iov[4];
+    int ret;
+
+    header = htonl(TPM_SEND_COMMAND);
+    len = htonl(cmd->in_len);
+
+    iov[0].iov_base = &header;
+    iov[0].iov_len = sizeof(header);
+    iov[1].iov_base = &locality;
+    iov[1].iov_len = sizeof(locality);
+    iov[2].iov_base = &len;
+    iov[2].iov_len = sizeof(len);
+    iov[3].iov_base = (void *)cmd->in;
+    iov[3].iov_len = cmd->in_len;
+
+    ret = qio_channel_writev_all(QIO_CHANNEL(t->cmd_qc), iov, 4, errp);
+    if (ret != 0)
+        goto fail;
+
+    ret = qio_channel_read_all(QIO_CHANNEL(t->cmd_qc), (char *)&len, sizeof(len), errp);
+    if (ret != 0)
+        goto fail;
+    len = ntohl(len);
+    if (len > cmd->out_len) {
+        error_setg(errp, "receive size is too large");
+        goto fail;
+    }
+    ret = qio_channel_read_all(QIO_CHANNEL(t->cmd_qc), (char *)cmd->out, len, errp);
+    if (ret != 0)
+        goto fail;
+    /* ACK packet */
+    ret = qio_channel_read_all(QIO_CHANNEL(t->cmd_qc), (char *)&header, sizeof(header), errp);
+    if (ret != 0)
+        goto fail;
+    if (header != 0) {
+        error_setg(errp, "incorrect ACK received on command channel 0x%x", header);
+        goto fail;
+    }
+
+    return;
+
+ fail:
+    error_prepend(errp, ERROR_PREFIX);
+    tpm_util_write_fatal_error_response(cmd->out, cmd->out_len);
+}
+
+static TPMBackend *tpm_mssim_create(TpmTypeOptions *opts)
+{
+    TPMBackend *be = TPM_BACKEND(object_new(TYPE_TPM_MSSIM));
+    TPMmssim *t = TPM_MSSIM(be);
+    int sock;
+    Error *errp = NULL;
+    TPMmssimOptions *mo = &opts->u.mssim;
+
+    t->opts = opts;
+    if (!mo->has_command) {
+        mo->has_command = true;
+        mo->command = g_new0(SocketAddress, 1);
+        mo->command->type = SOCKET_ADDRESS_TYPE_INET;
+        mo->command->u.inet.host = g_strdup("localhost");
+        mo->command->u.inet.port = g_strdup("2321");
+    }
+    if (!mo->has_control) {
+        mo->has_control = true;
+        mo->control = g_new0(SocketAddress, 1);
+        mo->control->type = SOCKET_ADDRESS_TYPE_INET;
+        mo->control->u.inet.host = g_strdup(mo->command->u.inet.host);
+        mo->control->u.inet.port = g_strdup("2322");
+    }
+
+    t->cmd_qc = qio_channel_socket_new();
+    t->ctrl_qc = qio_channel_socket_new();
+
+    if (qio_channel_socket_connect_sync(t->cmd_qc, mo->command, &errp) < 0)
+        goto fail;
+
+    if (qio_channel_socket_connect_sync(t->ctrl_qc, mo->control, &errp) < 0)
+        goto fail;
+
+    /* reset the TPM using a power cycle sequence, in case someone
+     * has previously powered it up */
+    sock = tpm_send_ctrl(t, TPM_SIGNAL_POWER_OFF, &errp);
+    if (sock != 0)
+        goto fail;
+    sock = tpm_send_ctrl(t, TPM_SIGNAL_POWER_ON, &errp);
+    if (sock != 0)
+        goto fail;
+    sock = tpm_send_ctrl(t, TPM_SIGNAL_NV_ON, &errp);
+    if (sock != 0)
+        goto fail;
+
+    return be;
+
+ fail:
+    /* drop ctrl_qc so finalize skips POWER_OFF; finalize unrefs cmd_qc */
+    object_unref(OBJECT(t->ctrl_qc));
+    t->ctrl_qc = NULL;
+    error_prepend(&errp, ERROR_PREFIX);
+    error_report_err(errp);
+    object_unref(OBJECT(be));
+
+    return NULL;
+}
+
+static const QemuOptDesc tpm_mssim_cmdline_opts[] = {
+    TPM_STANDARD_CMDLINE_OPTS,
+    {
+        .name = "command",
+        .type = QEMU_OPT_STRING,
+        .help = "command socket (default localhost:2321)",
+    },
+    {
+        .name = "control",
+        .type = QEMU_OPT_STRING,
+        .help = "control socket (default localhost:2322)",
+    },
+};
+
+static void tpm_mssim_class_init(ObjectClass *klass, void *data)
+{
+    TPMBackendClass *cl = TPM_BACKEND_CLASS(klass);
+
+    cl->type = TPM_TYPE_MSSIM;
+    cl->opts = tpm_mssim_cmdline_opts;
+    cl->desc = "TPM mssim emulator backend driver";
+    cl->create = tpm_mssim_create;
+    cl->cancel_cmd = tpm_mssim_cancel_cmd;
+    cl->get_tpm_version = tpm_mssim_get_version;
+    cl->get_buffer_size = tpm_mssim_get_buffer_size;
+    cl->get_tpm_options = tpm_mssim_get_opts;
+    cl->handle_request = tpm_mssim_handle_request;
+}
+
+static const TypeInfo tpm_mssim_info = {
+    .name = TYPE_TPM_MSSIM,
+    .parent = TYPE_TPM_BACKEND,
+    .instance_size = sizeof(TPMmssim),
+    .class_init = tpm_mssim_class_init,
+    .instance_init = tpm_mssim_instance_init,
+    .instance_finalize = tpm_mssim_instance_finalize,
+};
+
+static void tpm_mssim_register(void)
+{
+    type_register_static(&tpm_mssim_info);
+}
+
+type_init(tpm_mssim_register)
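
For reviewers who want to check the wire format without a running QEMU, the framing that tpm_mssim_handle_request() builds with its four iovec entries can be sketched as a standalone function. This is only an illustration under assumptions: mssim_frame_command() is a hypothetical helper that is not part of this patch, and it copies into a flat buffer instead of using qio_channel_writev_all(). The field order (big-endian command code, one locality byte, big-endian length, payload) is taken from the code above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TPM_SEND_COMMAND 8 /* value from tpm_mssim.h in this patch */

/*
 * Hypothetical helper, not part of the patch: serialise one command
 * frame into buf. Returns the number of bytes written, or 0 if buf
 * is too small to hold the whole frame.
 */
static size_t mssim_frame_command(uint8_t *buf, size_t buf_len,
                                  uint8_t locality,
                                  const uint8_t *in, uint32_t in_len)
{
    uint32_t header = htonl(TPM_SEND_COMMAND);
    uint32_t len = htonl(in_len);
    size_t total = sizeof(header) + sizeof(locality) + sizeof(len) + in_len;
    size_t off = 0;

    assert(buf != NULL);
    if (buf_len < total) {
        return 0;
    }

    memcpy(buf + off, &header, sizeof(header)); /* command code */
    off += sizeof(header);
    buf[off++] = locality;                      /* single locality byte */
    memcpy(buf + off, &len, sizeof(len));       /* payload length */
    off += sizeof(len);
    if (in_len) {
        memcpy(buf + off, in, in_len);          /* raw TPM command */
        off += in_len;
    }
    return off;
}
```

A two-byte payload at locality 3 therefore produces an 11-byte frame: four bytes of command code, one locality byte, four bytes of length, then the payload.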
diff --git a/backends/tpm/tpm_mssim.h b/backends/tpm/tpm_mssim.h
new file mode 100644
index 0000000000..04a270338a
--- /dev/null
+++ b/backends/tpm/tpm_mssim.h
@@ -0,0 +1,43 @@ 
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * The code below is copied from the Microsoft/TCG Reference implementation
+ *
+ *  https://github.com/Microsoft/ms-tpm-20-ref.git
+ *
+ * In file TPMCmd/Simulator/include/TpmTcpProtocol.h
+ */
+
+#define TPM_SIGNAL_POWER_ON         1
+#define TPM_SIGNAL_POWER_OFF        2
+#define TPM_SIGNAL_PHYS_PRES_ON     3
+#define TPM_SIGNAL_PHYS_PRES_OFF    4
+#define TPM_SIGNAL_HASH_START       5
+#define TPM_SIGNAL_HASH_DATA        6
+        // {uint32_t BufferSize, uint8_t[BufferSize] Buffer}
+#define TPM_SIGNAL_HASH_END         7
+#define TPM_SEND_COMMAND            8
+        // {uint8_t Locality, uint32_t InBufferSize, uint8_t[InBufferSize] InBuffer} ->
+        //     {uint32_t OutBufferSize, uint8_t[OutBufferSize] OutBuffer}
+
+#define TPM_SIGNAL_CANCEL_ON        9
+#define TPM_SIGNAL_CANCEL_OFF       10
+#define TPM_SIGNAL_NV_ON            11
+#define TPM_SIGNAL_NV_OFF           12
+#define TPM_SIGNAL_KEY_CACHE_ON     13
+#define TPM_SIGNAL_KEY_CACHE_OFF    14
+
+#define TPM_REMOTE_HANDSHAKE        15
+#define TPM_SET_ALTERNATIVE_RESULT  16
+
+#define TPM_SIGNAL_RESET            17
+#define TPM_SIGNAL_RESTART          18
+
+#define TPM_SESSION_END             20
+#define TPM_STOP                    21
+
+#define TPM_GET_COMMAND_RESPONSE_SIZES  25
+
+#define TPM_ACT_GET_SIGNALED        26
+
+#define TPM_TEST_FAILURE_MODE       30
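
The reply side of TPM_SEND_COMMAND, as documented in the header comment above and as read by tpm_mssim_handle_request(), is a big-endian length, the response bytes, and a trailing 32-bit ACK that must be zero. A minimal sketch of that parsing, assuming the whole reply is already in memory, might look like the following. mssim_parse_response() is a hypothetical name used for illustration only; the patch itself reads the three parts directly from the socket.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: parse a reply laid out as
 * {uint32_t OutBufferSize, uint8_t[OutBufferSize] OutBuffer, uint32_t ack}.
 * Copies the payload into out and returns its length, or -1 if the reply
 * is truncated, too large for out, or carries a non-zero ACK.
 */
static long mssim_parse_response(const uint8_t *buf, size_t buf_len,
                                 uint8_t *out, size_t out_len)
{
    uint32_t len, ack;

    assert(buf != NULL && out != NULL);
    if (buf_len < sizeof(len)) {
        return -1;
    }
    memcpy(&len, buf, sizeof(len));
    len = ntohl(len);

    /* same bound the backend applies against cmd->out_len */
    if (len > out_len) {
        return -1;
    }
    if (buf_len - sizeof(len) < len) {
        return -1;
    }
    if (buf_len - sizeof(len) - len < sizeof(ack)) {
        return -1;
    }

    memcpy(out, buf + sizeof(len), len);
    memcpy(&ack, buf + sizeof(len) + len, sizeof(ack));
    if (ack != 0) {
        return -1; /* zero in any byte order means success */
    }
    return (long)len;
}
```

The length check comes before any copy, so an oversized or truncated reply is rejected without touching the output buffer.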
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index e99447ad68..319f9eeeb6 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -841,6 +841,7 @@  void hmp_info_tpm(Monitor *mon, const QDict *qdict)
     unsigned int c = 0;
     TPMPassthroughOptions *tpo;
     TPMEmulatorOptions *teo;
+    TPMmssimOptions *tmo;
 
     info_list = qmp_query_tpm(&err);
     if (err) {
@@ -874,6 +875,12 @@  void hmp_info_tpm(Monitor *mon, const QDict *qdict)
             teo = &ti->options->u.emulator;
             monitor_printf(mon, ",chardev=%s", teo->chardev);
             break;
+        case TPM_TYPE_MSSIM:
+            tmo = &ti->options->u.mssim;
+            monitor_printf(mon, ",command=%s:%s,control=%s:%s",
+                           tmo->command->u.inet.host, tmo->command->u.inet.port,
+                           tmo->control->u.inet.host, tmo->control->u.inet.port);
+            break;
         case TPM_TYPE__MAX:
             break;
         }
diff --git a/qapi/tpm.json b/qapi/tpm.json
index d8cbd5ea0e..b773bde2ff 100644
--- a/qapi/tpm.json
+++ b/qapi/tpm.json
@@ -5,6 +5,7 @@ 
 ##
 # = TPM (trusted platform module) devices
 ##
+{ 'include': 'sockets.json' }
 
 ##
 # @TpmModel:
@@ -49,7 +50,7 @@ 
 #
 # Since: 1.5
 ##
-{ 'enum': 'TpmType', 'data': [ 'passthrough', 'emulator' ],
+{ 'enum': 'TpmType', 'data': [ 'passthrough', 'emulator', 'mssim' ],
   'if': 'CONFIG_TPM' }
 
 ##
@@ -64,7 +65,7 @@ 
 # Example:
 #
 # -> { "execute": "query-tpm-types" }
-# <- { "return": [ "passthrough", "emulator" ] }
+# <- { "return": [ "passthrough", "emulator", "mssim" ] }
 #
 ##
 { 'command': 'query-tpm-types', 'returns': ['TpmType'],
@@ -99,6 +100,22 @@ 
 { 'struct': 'TPMEmulatorOptions', 'data': { 'chardev' : 'str' },
   'if': 'CONFIG_TPM' }
 
+##
+# @TPMmssimOptions:
+#
+# Information for the mssim emulator connection
+#
+# @command: command socket for the TPM emulator (default: localhost:2321)
+# @control: control socket for the TPM emulator (default: command host, port 2322)
+#
+# Since: 8.0
+##
+{ 'struct': 'TPMmssimOptions',
+  'data': {
+      '*command': 'SocketAddress',
+      '*control': 'SocketAddress' },
+  'if': 'CONFIG_TPM' }
+
 ##
 # @TpmTypeOptions:
 #
@@ -107,6 +124,7 @@ 
 # @id: identifier of the backend
 # @type: - 'passthrough' The configuration options for the TPM passthrough type
 #        - 'emulator' The configuration options for TPM emulator backend type
+#        - 'mssim' The configuration options for the mssim TPM backend type
 #
 # Since: 1.5
 ##
@@ -115,7 +133,8 @@ 
             'id': 'str' },
   'discriminator': 'type',
   'data': { 'passthrough' : 'TPMPassthroughOptions',
-            'emulator': 'TPMEmulatorOptions' },
+            'emulator': 'TPMEmulatorOptions',
+            'mssim': 'TPMmssimOptions' },
   'if': 'CONFIG_TPM' }
 
 ##