mbox series

[printk,v1,00/10] printk: introduce atomic consoles and sync mode

Message ID 20210803131301.5588-1-john.ogness@linutronix.de (mailing list archive)
Headers show
Series printk: introduce atomic consoles and sync mode | expand

Message

John Ogness Aug. 3, 2021, 1:12 p.m. UTC
Hi,

This is the next part of our printk-rework effort (points 3 and
4 of the LPC 2019 summary [0]).

Here the concept of "atomic consoles" is introduced through  a
new (optional) write_atomic() callback for console drivers. This
callback must be implemented as an NMI-safe variant of the
write() callback, meaning that it can function from any context
without relying on questionable tactics such as ignoring locking
and also without relying on the synchronization of console
semaphore.

As an example of how such an atomic console can look like, this
series implements write_atomic() for the 8250 UART driver.

This series also introduces a new console printing mode called
"sync mode" that is only activated when the kernel is about to
end (such as panic, oops, shutdown, reboot). Sync mode can only
be activated if atomic consoles are available. A system without
registered atomic consoles will be unaffected by this series.

When in sync mode, the console printing behavior becomes:

- only consoles implementing write_atomic() will be called

- printing occurs within vprintk_store() instead of
  console_unlock(), since the console semaphore is irrelevant
  for atomic consoles

For systems that have registered atomic consoles, this series
improves the reliability of seeing crash messages by using new
locking techniques rather than "ignoring locks and hoping for
the best". In particular, atomic consoles rely on the
CPU-reentrant spinlock (i.e. the printk cpulock) for
synchronizing console output.

John Ogness

[0] https://lore.kernel.org/lkml/87k1acz5rx.fsf@linutronix.de/

John Ogness (10):
  printk: relocate printk cpulock functions
  printk: rename printk cpulock API and always disable interrupts
  kgdb: delay roundup if holding printk cpulock
  printk: relocate printk_delay()
  printk: call boot_delay_msec() in printk_delay()
  printk: use seqcount_latch for console_seq
  console: add write_atomic interface
  printk: introduce kernel sync mode
  kdb: if available, only use atomic consoles for output mirroring
  serial: 8250: implement write_atomic

 arch/powerpc/include/asm/smp.h         |   1 +
 arch/powerpc/kernel/kgdb.c             |  10 +-
 arch/powerpc/kernel/smp.c              |   5 +
 arch/x86/kernel/kgdb.c                 |   9 +-
 drivers/tty/serial/8250/8250.h         |  47 ++-
 drivers/tty/serial/8250/8250_core.c    |  17 +-
 drivers/tty/serial/8250/8250_fsl.c     |   9 +
 drivers/tty/serial/8250/8250_ingenic.c |   7 +
 drivers/tty/serial/8250/8250_mtk.c     |  29 +-
 drivers/tty/serial/8250/8250_port.c    |  92 ++--
 drivers/tty/serial/8250/Kconfig        |   1 +
 include/linux/console.h                |  32 ++
 include/linux/kgdb.h                   |   3 +
 include/linux/printk.h                 |  57 +--
 include/linux/serial_8250.h            |   5 +
 kernel/debug/debug_core.c              |  45 +-
 kernel/debug/kdb/kdb_io.c              |  16 +
 kernel/printk/printk.c                 | 554 +++++++++++++++++--------
 lib/Kconfig.debug                      |   3 +
 lib/dump_stack.c                       |   4 +-
 lib/nmi_backtrace.c                    |   4 +-
 21 files changed, 684 insertions(+), 266 deletions(-)


base-commit: 23d8adcf8022b9483605531d8985f5b77533cb3a

Comments

Andy Shevchenko Aug. 3, 2021, 1:52 p.m. UTC | #1
On Tue, Aug 03, 2021 at 03:18:51PM +0206, John Ogness wrote:
> Hi,
> 
> This is the next part of our printk-rework effort (points 3 and
> 4 of the LPC 2019 summary [0]).
> 
> Here the concept of "atomic consoles" is introduced through  a
> new (optional) write_atomic() callback for console drivers. This
> callback must be implemented as an NMI-safe variant of the
> write() callback, meaning that it can function from any context
> without relying on questionable tactics such as ignoring locking
> and also without relying on the synchronization of console
> semaphore.
> 
> As an example of how such an atomic console can look like, this
> series implements write_atomic() for the 8250 UART driver.
> 
> This series also introduces a new console printing mode called
> "sync mode" that is only activated when the kernel is about to
> end (such as panic, oops, shutdown, reboot). Sync mode can only
> be activated if atomic consoles are available. A system without
> registered atomic consoles will be unaffected by this series.
> 
> When in sync mode, the console printing behavior becomes:
> 
> - only consoles implementing write_atomic() will be called
> 
> - printing occurs within vprintk_store() instead of
>   console_unlock(), since the console semaphore is irrelevant
>   for atomic consoles
> 
> For systems that have registered atomic consoles, this series
> improves the reliability of seeing crash messages by using new
> locking techniques rather than "ignoring locks and hoping for
> the best". In particular, atomic consoles rely on the
> CPU-reentrant spinlock (i.e. the printk cpulock) for
> synchronizing console output.

If console is runtime suspended, who will bring it up?
Does it mean that this callback can't be implemented on the consoles that
do runtime suspend (some of 8250 currently, for example)?
Petr Mladek Aug. 5, 2021, 3:47 p.m. UTC | #2
On Tue 2021-08-03 15:18:51, John Ogness wrote:
> Hi,
> 
> This is the next part of our printk-rework effort (points 3 and
> 4 of the LPC 2019 summary [0]).
> 
> Here the concept of "atomic consoles" is introduced through  a
> new (optional) write_atomic() callback for console drivers. This
> callback must be implemented as an NMI-safe variant of the
> write() callback, meaning that it can function from any context
> without relying on questionable tactics such as ignoring locking
> and also without relying on the synchronization of console
> semaphore.
> 
> As an example of how such an atomic console can look like, this
> series implements write_atomic() for the 8250 UART driver.
> 
> This series also introduces a new console printing mode called
> "sync mode" that is only activated when the kernel is about to
> end (such as panic, oops, shutdown, reboot). Sync mode can only
> be activated if atomic consoles are available. A system without
> registered atomic consoles will be unaffected by this series.
>
> When in sync mode, the console printing behavior becomes:
> 
> - only consoles implementing write_atomic() will be called
> 
> - printing occurs within vprintk_store() instead of
>   console_unlock(), since the console semaphore is irrelevant
>   for atomic consoles

I am fine with the new behavior at this stage. It is a quite clear
win when (only) the atomic console is used. And it does not make any
difference when atomic consoles are disabled.

But I am not sure about the proposed terms and implementation.
I want to be sure that we are on the right way for introducing
console kthreads.

Let me try to compare the behavior:

1. before this patchset():

	/* printk: store immediately; try all consoles immediately */
	int printk(...)
	{
		vprintk_store();
		if (console_try_lock()) {
			/* flush pending messages to the consoles */
			console_unlock();
		}
	}

	/* panic: try hard to flush messages to the consoles and avoid deadlock */
	void panic()
	{
		/* Ignore locks in console drivers */
		bust_spinlocks(1);

		printk("Kernel panic ...);
		dump_stack();

		smp_send_stop();
		/* ignore console lock */
		console_flush_on_panic();
	}


2. after this patchset():

   + same as before in normal mode or when there is no atomic console

   + in panic with atomic console; it modifies the behavior:

	/*
	 * printk: store immediately; immediately flush atomic consoles;
	 *         unsafe consoles are not used anymore;
	 */
	int printk(...)
	{
		vprintk_store();
		flush_atomic_consoles();
	}

	/* panic: no hacks; only atomic consoles are used */
	void panic()
	{
		printk("Kernel panic ...);
		dump_stack();
	}


3. After introducing console kthread(s):

	int printk(...)
	{
		vprintk_store();
		wake_consoles_via_irqwork();
	}

	+ in panic:

	    + with atomic console like after this patchset?
	    + without atomic consoles?

	+ during early boot?


I guess that we will need another sync mode for the early boot,
panic, suspend, kexec, etc.. It must be posible to debug these states
even wihtout atomic console and working kthreads.

Best Regards,
Petr
Sergey Senozhatsky Aug. 31, 2021, 12:33 a.m. UTC | #3
On (21/08/05 17:47), Petr Mladek wrote:
[..]
> 3. After introducing console kthread(s):
> 
> 	int printk(...)
> 	{
> 		vprintk_store();
> 		wake_consoles_via_irqwork();
> 	}
> 
> 	+ in panic:
> 
> 	    + with atomic console like after this patchset?
> 	    + without atomic consoles?
> 
> 	+ during early boot?

I guess I'd also add netconsole to the list.