Message ID | 20160212183415.GA4289@obsidianresearch.com |
---|---|
State | New |
Headers | show |
Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote on 02/12/2016 01:34:15 PM: > > On Fri, Feb 12, 2016 at 01:13:39PM -0500, Stefan Berger wrote: > > Stefan Berger/Watson/IBM@IBMUS wrote on 02/11/2016 10:56:47 PM: > > > > > > Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote on 02/11/ > > > 2016 06:56:11 PM: > > > > > > > > > > > > > > On Thu, Feb 11, 2016 at 05:26:24PM -0500, Stefan Berger wrote: > > > > > > > > > > What is the point of tpmm_chip_dev? > > > > > So that the usage model of the chip is the same. We get this > > in the > > > > > tpm-vtpm.c with tpm_alloc_chip + tpmm_chip_dev while all > > others can > > > > > call tpmm_chip_alloc, which combines the two. > > > > > > > > No need, just don't use devm in vtpm, that is even better. The > > > > standard devm idiom is a with and without version. > > > > > > Updated the branch. Are you going to upstream your patch? Otherwise > > > I would just add your Signed-off-by to it if that's ok ? > > > > > > [1]https://github.com/stefanberger/linux/tree/vtpm-driver.v3 > > I converted tpm-chip.c to use IDA as well. > > A test for 4096 vTPM instances: > > for ((i = 0; i < 4096; i++)); do ./vtpmctrl & done > > Yeah, that looks good, thanks for doing this > With the IDR I ran into the problem that TPM core driver and backend now share the ID and create their device names with it. What can happen is that the core driver gives up ID 123, while the vTPM driver still has the device name vtpms123 registered with sysfs. Now the next device is created, ID 123 is recycled by the core driver, and it bombs while the vTPM driver again tries to register vtpm123 that still hasn't been unregistered. One can trigger this problem with lots of concurrency (see below) and then the whole system even locks up. I don't know how to solve this ID issue in an easier way than having an IDR in the core driver and an IDA in the vTPM driver and so handling the IDs independently. This in turn makes the split of tpmm_chip_alloc / tpm_chip_alloc unnecessary since we don't need the ID from the chip anymore. Now the below test runs stable. for pid in $(ps aux | grep vtpm | gawk '{print $2}'); do kill -9 $pid; done ; for ((i =0; i< 8192; i++)); do ./vtpmctrl &>/dev/null & done Stefan ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
Stefan Berger/Watson/IBM@IBMUS wrote on 02/14/2016 01:37:08 AM: > > Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote on 02/12/ > 2016 01:34:15 PM: > > > > > On Fri, Feb 12, 2016 at 01:13:39PM -0500, Stefan Berger wrote: > > > Stefan Berger/Watson/IBM@IBMUS wrote on 02/11/2016 10:56:47 PM: > > > > > > > > Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote on 02/11/ > > > > 2016 06:56:11 PM: > > > > > > > > > > > > > > > > > > On Thu, Feb 11, 2016 at 05:26:24PM -0500, Stefan Berger wrote: > > > > > > > > > > > > What is the point of tpmm_chip_dev? > > > > > > So that the usage model of the chip is the same. We get this > > > in the > > > > > > tpm-vtpm.c with tpm_alloc_chip + tpmm_chip_dev while all > > > others can > > > > > > call tpmm_chip_alloc, which combines the two. > > > > > > > > > > No need, just don't use devm in vtpm, that is even better. The > > > > > standard devm idiom is a with and without version. > > > > > > > > Updated the branch. Are you going to upstream your patch? Otherwise > > > > I would just add your Signed-off-by to it if that's ok ? > > > > > > > > [1]https://github.com/stefanberger/linux/tree/vtpm-driver.v3 > > > I converted tpm-chip.c to use IDA as well. > > > A test for 4096 vTPM instances: > > > for ((i = 0; i < 4096; i++)); do ./vtpmctrl & done > > > > Yeah, that looks good, thanks for doing this > > > > With the IDR I ran into the problem that TPM core driver and backend > now share the ID and create their device names with it. What can > happen is that the core driver gives up ID 123, while the vTPM > driver still has the device name vtpms123 registered with sysfs. Now > the next device is created, ID 123 is recycled by the core driver, > and it bombs while the vTPM driver again tries to register vtpm123 > that still hasn't been unregistered. One can trigger this problem > with lots of concurrency (see below) and then the whole system even > locks up. I don't know how to solve this ID issue in an easier way > than having an IDR in the core driver and an IDA in the vTPM driver > and so handling the IDs independently. This in turn makes the split > of tpmm_chip_alloc / tpm_chip_alloc unnecessary since we don't need > the ID from the chip anymore. Now the below test runs stable. Scratch that. The easier way to fix this was to release the tpm_chip after the vtpm_dev. Stefan > > > for pid in $(ps aux | grep vtpm | gawk '{print $2}'); do kill -9 > $pid; done ; for ((i =0; i< 8192; i++)); do ./vtpmctrl &>/dev/null & done > > Stefan > ------------------------------------------------------------------------------ > Site24x7 APM Insight: Get Deep Visibility into Application Performance > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > Monitor end-to-end web transactions and take corrective actions now > Troubleshoot faster and improve end-user experience. Signup Now! > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 > _______________________________________________ > tpmdd-devel mailing list > tpmdd-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/tpmdd-devel ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
On Tue, Feb 16, 2016 at 11:44:06AM -0500, Stefan Berger wrote: > Scratch that. The easier way to fix this was to release the tpm_chip > after the vtpm_dev. Right, that is nicer Did you solve the lock ups when register fails? One other thing you can look at is ditching the vtpm struct device. With the last patches I sent the tpm core would only need two more lines to work with a null parent device, which would be even simpler for vtpm. Jason ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote on 02/16/2016 01:00:28 PM: > > On Tue, Feb 16, 2016 at 11:44:06AM -0500, Stefan Berger wrote: > > > Scratch that. The easier way to fix this was to release the tpm_chip > > after the vtpm_dev. > > Right, that is nicer > > Did you solve the lock ups when register fails? The re-ordering of the tpm_chip and vtpm_dev release actually solved that problem. [My feeling is, something outside of the vtpm driver doesn't unroll properly and locks up if there's lots of concurrency]. > > One other thing you can look at is ditching the vtpm struct > device. With the last patches I sent the tpm core would only need two > more lines to work with a null parent device, which would be even > simpler for vtpm. Not having a parent device isn't really necessary, except for not having it appear in sysfs maybe? Stefan > > Jason > ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
On Tue, Feb 16, 2016 at 02:26:56PM -0500, Stefan Berger wrote: > The re-ordering of the tpm_chip and vtpm_dev release actually solved that > problem. ? You mean you don't see the lock up because there is no error unwind? > [My feeling is, something outside of the vtpm driver doesn't unroll properly > and locks up if there's lots of concurrency]. Hum, I'd be surprised if this is not caused by something outside tpm/vtpm.. It would be very nice to know, but perhaps not essential. I didn't see anything obvious in vtpm either. > > One other thing you can look at is ditching the vtpm struct > > device. With the last patches I sent the tpm core would only need two > > more lines to work with a null parent device, which would be even > > simpler for vtpm. > > Not having a parent device isn't really necessary, except for not having it > appear in sysfs maybe? Dropping the dummy parent out of vtpm makes it work a little more more like the rest of the virtual stuff and slightly changes the sysfs paths visble to user space. If this is the right way to go it would be great to do that at the start. It wasn't worth bringing up before, but now that I wrote the parent removal patch it is pretty trivial. Jason ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote on 02/16/2016 11:47:50 PM: > > On Tue, Feb 16, 2016 at 02:26:56PM -0500, Stefan Berger wrote: > > The re-ordering of the tpm_chip and vtpm_dev release actually solved that > > problem. > > ? You mean you don't see the lock up because there is no error unwind? > > > [My feeling is, something outside of the vtpm driver doesn't unroll properly > > and locks up if there's lots of concurrency]. > > Hum, I'd be surprised if this is not caused by something outside tpm/vtpm.. > It would be very nice to know, but perhaps not essential. I didn't see > anything obvious in vtpm either. > > > > One other thing you can look at is ditching the vtpm struct > > > device. With the last patches I sent the tpm core would only need two > > > more lines to work with a null parent device, which would be even > > > simpler for vtpm. > > > > Not having a parent device isn't really necessary, except for not having it > > appear in sysfs maybe? > > Dropping the dummy parent out of vtpm makes it work a little more more > like the rest of the virtual stuff and slightly changes the sysfs > paths visble to user space. If this is the right way to go it would be > great to do that at the start. It wasn't worth bringing up before, but > now that I wrote the parent removal patch it is pretty trivial. Beside the code to lock the module with the parent device, the chip->dev.parent is used in sysfs API calls. Using the chip->dev.kobj there in case chip->dev.parent is NULL also seems a way to hang the system. Stefan ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c index 8134003b023c..c07e397bfddd 100644 --- a/drivers/char/tpm/tpm-chip.c +++ b/drivers/char/tpm/tpm-chip.c @@ -88,6 +88,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *dev, const struct tpm_class_ops *ops) { struct tpm_chip *chip; + int err; chip = kzalloc(sizeof(*chip), GFP_KERNEL); if (chip == NULL) @@ -111,8 +112,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *dev, set_bit(chip->dev_num, dev_mask); device_initialize(&chip->dev); - scnprintf(chip->devname, sizeof(chip->devname), "tpm%d", chip->dev_num); - chip->pdev = dev; chip->dev.class = tpm_class; @@ -127,12 +126,18 @@ struct tpm_chip *tpm_chip_alloc(struct device *dev, else chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num); - dev_set_name(&chip->dev, "%s", chip->devname); + err = dev_set_name(&chip->dev, "tpm%d", chip->dev_num); + if (err) + goto out; cdev_init(&chip->cdev, &tpm_fops); chip->cdev.kobj.parent = &chip->dev.kobj; return chip; + +out: + device_put(&chip->dev); + return PTR_ERR(err); } EXPORT_SYMBOL_GPL(tpm_chip_alloc); @@ -174,7 +179,7 @@ static int tpm_dev_add_device(struct tpm_chip *chip) if (rc) { dev_err(&chip->dev, "unable to cdev_add() %s, major %d, minor %d, err=%d\n", - chip->devname, MAJOR(chip->dev.devt), + dev_name(&chip->dev), MAJOR(chip->dev.devt), MINOR(chip->dev.devt), rc); device_unregister(&chip->dev); @@ -185,7 +190,7 @@ static int tpm_dev_add_device(struct tpm_chip *chip) if (rc) { dev_err(&chip->dev, "unable to device_register() %s, major %d, minor %d, err=%d\n", - chip->devname, MAJOR(chip->dev.devt), + dev_name(&chip->dev), MAJOR(chip->dev.devt), MINOR(chip->dev.devt), rc); return rc; @@ -211,7 +216,7 @@ static int tpm1_chip_register(struct tpm_chip *chip) if (rc) return rc; - chip->bios_dir = tpm_bios_log_setup(chip->devname); + chip->bios_dir = tpm_bios_log_setup(dev_name(&chip->dev)); return 0; } diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h index 3a5452f09b96..d51ac7f0da01 100644 --- a/drivers/char/tpm/tpm.h +++ b/drivers/char/tpm/tpm.h @@ -174,7 +174,6 @@ struct tpm_chip { unsigned int flags; int dev_num; /* /dev/tpm# */ - char devname[7]; unsigned long is_open; /* only one allowed */ int time_expired; diff --git a/drivers/char/tpm/tpm_eventlog.c b/drivers/char/tpm/tpm_eventlog.c index bd72fb04225e..49e50976efc8 100644 --- a/drivers/char/tpm/tpm_eventlog.c +++ b/drivers/char/tpm/tpm_eventlog.c @@ -397,7 +397,7 @@ static int is_bad(void *p) return 0; } -struct dentry **tpm_bios_log_setup(char *name) +struct dentry **tpm_bios_log_setup(const char *name) { struct dentry **ret = NULL, *tpm_dir, *bin_file, *ascii_file; diff --git a/drivers/char/tpm/tpm_eventlog.h b/drivers/char/tpm/tpm_eventlog.h index 267bfbd1b7bb..f072a1a1d5cc 100644 --- a/drivers/char/tpm/tpm_eventlog.h +++ b/drivers/char/tpm/tpm_eventlog.h @@ -77,7 +77,7 @@ int read_log(struct tpm_bios_log *log); #if defined(CONFIG_TCG_IBMVTPM) || defined(CONFIG_TCG_IBMVTPM_MODULE) || \ defined(CONFIG_ACPI) -extern struct dentry **tpm_bios_log_setup(char *); +extern struct dentry **tpm_bios_log_setup(const char *name); extern void tpm_bios_log_teardown(struct dentry **); #else static inline struct dentry **tpm_bios_log_setup(char *name) diff --git a/drivers/char/tpm/tpm_i2c_nuvoton.c b/drivers/char/tpm/tpm_i2c_nuvoton.c index 847f1597fe9b..02f2b35ad753 100644 --- a/drivers/char/tpm/tpm_i2c_nuvoton.c +++ b/drivers/char/tpm/tpm_i2c_nuvoton.c @@ -560,7 +560,7 @@ static int i2c_nuvoton_probe(struct i2c_client *client, rc = devm_request_irq(dev, chip->vendor.irq, i2c_nuvoton_int_handler, IRQF_TRIGGER_LOW, - chip->devname, + dev_name(&chip->dev), chip); if (rc) { dev_err(dev, "%s() Unable to request irq: %d for use\n", diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 12aa96a69b6c..39bc0e908e38 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -578,7 +578,7 @@ static int tpm_tis_probe_irq_single(struct tpm_chip *chip, u32 intmask, u8 original_int_vec; if (devm_request_irq(chip->pdev, irq, tis_int_handler, flags, - chip->devname, chip) != 0) { + dev_name(&chip->dev), chip) != 0) { dev_info(chip->pdev, "Unable to request irq: %d for probe\n", irq); return -1;