Message ID | 1303993705-sup-5213@think |
---|---|
State | Not Applicable, archived |
Headers | show |
On Thu, 2011-04-28 at 08:29 -0400, Chris Mason wrote: > Excerpts from Colin Ian King's message of 2011-04-28 07:36:30 -0400: > > One more data point to add, I've been looking at an identical issue when > > copying large amounts of data. I bisected this - and the lockups occur > > with commit > > 3e7d344970673c5334cf7b5bb27c8c0942b06126 - before that I don't see the > > issue. With this commit, my file copy test locks up after ~8-10 > > iterations, before this commit I can copy > 100 times and don't see the > > lockup. > > Well, that's really interesting. I tried with compaction on here and > couldn't trigger it, but this (very very lightly) tested patch might > help. > Thanks Chris, I've given this a soak test but I still see the same lockup. > It moves the writeout throttle before the goto restart, and also makes > sure we do at least one cond_resched before we loop. > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 6771ea7..cb08b41 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1934,12 +1934,14 @@ restart: > if (inactive_anon_is_low(zone, sc)) > shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0); > > + throttle_vm_writeout(sc->gfp_mask); > + > /* reclaim/compaction might need reclaim to continue */ > if (should_continue_reclaim(zone, nr_reclaimed, > - sc->nr_scanned - nr_scanned, sc)) > + sc->nr_scanned - nr_scanned, sc)) { > + cond_resched(); > goto restart; > - > - throttle_vm_writeout(sc->gfp_mask); > + } > } > > /* -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Excerpts from Colin Ian King's message of 2011-04-28 09:42:20 -0400: > > On Thu, 2011-04-28 at 08:29 -0400, Chris Mason wrote: > > Excerpts from Colin Ian King's message of 2011-04-28 07:36:30 -0400: > > > One more data point to add, I've been looking at an identical issue when > > > copying large amounts of data. I bisected this - and the lockups occur > > > with commit > > > 3e7d344970673c5334cf7b5bb27c8c0942b06126 - before that I don't see the > > > issue. With this commit, my file copy test locks up after ~8-10 > > > iterations, before this commit I can copy > 100 times and don't see the > > > lockup. > > > > Well, that's really interesting. I tried with compaction on here and > > couldn't trigger it, but this (very very lightly) tested patch might > > help. > > > Thanks Chris, > > I've given this a soak test but I still see the same lockup. Could you post the soft lockups you're seeing? -chris -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Could you post the soft lockups you're seeing? As requested, attached Colin > > -chris > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/
Excerpts from Colin Ian King's message of 2011-04-28 10:01:22 -0400: > > > Could you post the soft lockups you're seeing? > > As requested, attached These are not good, but they aren't the lockup James was seeing. Were these messages with my patch? If yes, please post the messages from without my patch. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu 28-04-11 15:01:22, Colin Ian King wrote: > > > Could you post the soft lockups you're seeing? > > As requested, attached Hum, what keeps puzzling me is that in all the cases of hangs I've seen so far, we are stuck waiting for IO to finish for a long time - e.g. in the traces below kjournald waits for PageWriteback bit to get cleared. Also we are stuck waiting for page locks which might be because those pages are being read in? All in all it seems that the IO is just incredibly slow. But it's not clear to me what pushes us into that situation (especially since ext4 refuses to do any IO from ->writepage (i.e. kswapd) when the underlying blocks are not already allocated. Honza [ 287.088371] INFO: task rs:main Q:Reg:749 blocked for more than 30 seconds. [ 287.088374] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 287.088376] rs:main Q:Reg D 0000000000000000 0 749 1 0x00000000 [ 287.088381] ffff880072c17b68 0000000000000082 ffff880072c17fd8 ffff880072c16000 [ 287.088392] 0000000000013d00 ffff88003591b178 ffff880072c17fd8 0000000000013d00 [ 287.088396] ffffffff81a0b020 ffff88003591adc0 ffff88001fffc3e8 ffff88001fc13d00 [ 287.088400] Call Trace: [ 287.088404] [<ffffffff8110c070>] ? sync_page+0x0/0x50 [ 287.088408] [<ffffffff815c0990>] io_schedule+0x70/0xc0 [ 287.088411] [<ffffffff8110c0b0>] sync_page+0x40/0x50 [ 287.088414] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 [ 287.088418] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 [ 287.088421] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 [ 287.088425] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 [ 287.088428] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 [ 287.088431] [<ffffffff8112d144>] __do_fault+0x54/0x520 [ 287.088434] [<ffffffff81134a43>] ? unmap_region+0x113/0x170 [ 287.088437] [<ffffffff812ded90>] ? prio_tree_insert+0x150/0x1c0 [ 287.088440] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 [ 287.088442] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 [ 287.088446] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 [ 287.088448] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 [ 287.088451] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 [ 287.088454] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 [ 287.088457] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 [ 287.088460] [<ffffffff81137127>] ? sys_mmap_pgoff+0x167/0x230 [ 287.088463] [<ffffffff815c34d5>] page_fault+0x25/0x30 [ 287.088466] INFO: task NetworkManager:764 blocked for more than 30 seconds. [ 287.088468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 287.088470] NetworkManager D 0000000000000002 0 764 1 0x00000000 [ 287.088473] ffff880074ffbb68 0000000000000082 ffff880074ffbfd8 ffff880074ffa000 [ 287.088477] 0000000000013d00 ffff880036051a98 ffff880074ffbfd8 0000000000013d00 [ 287.088481] ffff8801005badc0 ffff8800360516e0 ffff88001ffef128 ffff88001fc53d00 [ 287.088484] Call Trace: [ 287.088488] [<ffffffff8110c070>] ? sync_page+0x0/0x50 [ 287.088491] [<ffffffff815c0990>] io_schedule+0x70/0xc0 [ 287.088494] [<ffffffff8110c0b0>] sync_page+0x40/0x50 [ 287.088497] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 [ 287.088500] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 [ 287.088503] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 [ 287.088506] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 [ 287.088509] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 [ 287.088513] [<ffffffff81177110>] ? pollwake+0x0/0x60 [ 287.088516] [<ffffffff8112d144>] __do_fault+0x54/0x520 [ 287.088519] [<ffffffff81177110>] ? pollwake+0x0/0x60 [ 287.088522] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 [ 287.088525] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 [ 287.088527] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 [ 287.088530] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 [ 287.088533] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 [ 287.088537] [<ffffffff81013859>] ? read_tsc+0x9/0x20 [ 287.088540] [<ffffffff81092eb1>] ? ktime_get_ts+0xb1/0xf0 [ 287.088543] [<ffffffff811776d2>] ? poll_select_set_timeout+0x82/0x90 [ 287.088546] [<ffffffff815c34d5>] page_fault+0x25/0x30 [ 287.088559] INFO: task unity-panel-ser:1521 blocked for more than 30 seconds. [ 287.088561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 287.088562] unity-panel-ser D 0000000000000000 0 1521 1 0x00000000 [ 287.088566] ffff880061f37b68 0000000000000082 ffff880061f37fd8 ffff880061f36000 [ 287.088570] 0000000000013d00 ffff880068c7c858 ffff880061f37fd8 0000000000013d00 [ 287.088573] ffff88003591c4a0 ffff880068c7c4a0 ffff88001fff0c88 ffff88001fc13d00 [ 287.088577] Call Trace: [ 287.088581] [<ffffffff8110c070>] ? sync_page+0x0/0x50 [ 287.088583] [<ffffffff815c0990>] io_schedule+0x70/0xc0 [ 287.088587] [<ffffffff8110c0b0>] sync_page+0x40/0x50 [ 287.088589] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 [ 287.088593] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 [ 287.088596] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 [ 287.088599] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 [ 287.088602] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 [ 287.088605] [<ffffffff8112d144>] __do_fault+0x54/0x520 [ 287.088608] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 [ 287.088610] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 [ 287.088613] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 [ 287.088616] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 [ 287.088619] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 [ 287.088622] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 [ 287.088625] [<ffffffff815c34d5>] page_fault+0x25/0x30 [ 287.088629] INFO: task jbd2/sda4-8:1845 blocked for more than 30 seconds. [ 287.088630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 287.088632] jbd2/sda4-8 D 0000000000000000 0 1845 2 0x00000000 [ 287.088636] ffff880068f6baf0 0000000000000046 ffff880068f6bfd8 ffff880068f6a000 [ 287.088639] 0000000000013d00 ffff880061d603b8 ffff880068f6bfd8 0000000000013d00 [ 287.088643] ffff88003591c4a0 ffff880061d60000 ffff88001fff8548 ffff88001fc13d00 [ 287.088647] Call Trace: [ 287.088650] [<ffffffff8110c070>] ? sync_page+0x0/0x50 [ 287.088653] [<ffffffff815c0990>] io_schedule+0x70/0xc0 [ 287.088656] [<ffffffff8110c0b0>] sync_page+0x40/0x50 [ 287.088659] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 [ 287.088662] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 [ 287.088665] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 [ 287.088668] [<ffffffff8110c41d>] filemap_fdatawait_range+0xfd/0x190 [ 287.088672] [<ffffffff8110c4db>] filemap_fdatawait+0x2b/0x30 [ 287.088675] [<ffffffff81242a93>] journal_finish_inode_data_buffers+0x63/0x170 [ 287.088678] [<ffffffff81243284>] jbd2_journal_commit_transaction+0x6e4/0x1190 [ 287.088682] [<ffffffff81076185>] ? try_to_del_timer_sync+0x85/0xe0 [ 287.088685] [<ffffffff81247e9b>] kjournald2+0xbb/0x220 [ 287.088688] [<ffffffff81087f30>] ? autoremove_wake_function+0x0/0x40 [ 287.088691] [<ffffffff81247de0>] ? kjournald2+0x0/0x220 [ 287.088694] [<ffffffff810877e6>] kthread+0x96/0xa0 [ 287.088697] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 [ 287.088700] [<ffffffff81087750>] ? kthread+0x0/0xa0 [ 287.088703] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 [ 287.088705] INFO: task dirname:5969 blocked for more than 30 seconds. [ 287.088707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 287.088709] dirname D 0000000000000002 0 5969 5214 0x00000000 [ 287.088712] ffff88005bd9d8b8 0000000000000086 ffff88005bd9dfd8 ffff88005bd9c000 [ 287.088716] 0000000000013d00 ffff88005d65b178 ffff88005bd9dfd8 0000000000013d00 [ 287.088720] ffff8801005e5b80 ffff88005d65adc0 ffff88001ffe5228 ffff88001fc53d00 [ 287.088723] Call Trace: [ 287.088726] [<ffffffff8110c070>] ? sync_page+0x0/0x50 [ 287.088729] [<ffffffff815c0990>] io_schedule+0x70/0xc0 [ 287.088732] [<ffffffff8110c0b0>] sync_page+0x40/0x50 [ 287.088735] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 [ 287.088738] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 [ 287.088741] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 [ 287.088744] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 [ 287.088747] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 [ 287.088750] [<ffffffff8112d144>] __do_fault+0x54/0x520 [ 287.088753] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 [ 287.088756] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 [ 287.088759] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 [ 287.088761] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 [ 287.088764] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 [ 287.088767] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 [ 287.088770] [<ffffffff81136947>] ? mmap_region+0x1f7/0x500 [ 287.088773] [<ffffffff8112db06>] ? free_pgd_range+0x356/0x4a0 [ 287.088776] [<ffffffff815c34d5>] page_fault+0x25/0x30 [ 287.088779] [<ffffffff812e6d5f>] ? __clear_user+0x3f/0x70 [ 287.088782] [<ffffffff812e6d41>] ? __clear_user+0x21/0x70 [ 287.088786] [<ffffffff812e6dc6>] clear_user+0x36/0x40 [ 287.088788] [<ffffffff811b0b6d>] padzero+0x2d/0x40 [ 287.088791] [<ffffffff811b2c7a>] load_elf_binary+0x95a/0xe00 [ 287.088794] [<ffffffff8116aa8a>] search_binary_handler+0xda/0x300 [ 287.088797] [<ffffffff811b2320>] ? load_elf_binary+0x0/0xe00 [ 287.088800] [<ffffffff8116c49c>] do_execve+0x24c/0x2d0 [ 287.088802] [<ffffffff8101521a>] sys_execve+0x4a/0x80 [ 287.088805] [<ffffffff8100c45c>] stub_execve+0x6c/0xc0
On Thu 28-04-11 16:25:51, Jan Kara wrote: > On Thu 28-04-11 15:01:22, Colin Ian King wrote: > > > > > Could you post the soft lockups you're seeing? > > > > As requested, attached > Hum, what keeps puzzling me is that in all the cases of hangs I've seen > so far, we are stuck waiting for IO to finish for a long time - e.g. in the > traces below kjournald waits for PageWriteback bit to get cleared. Also we > are stuck waiting for page locks which might be because those pages are > being read in? All in all it seems that the IO is just incredibly slow. > > But it's not clear to me what pushes us into that situation (especially > since ext4 refuses to do any IO from ->writepage (i.e. kswapd) when the > underlying blocks are not already allocated. Hmm, maybe because the system is under memory pressure (and kswapd is not able to get rid of dirty pages), we page out clean pages. Thus also pages of executables which need to be paged in soon anyway thus putting heavy read load on the system which makes writes crawl? I'm not sure why compaction should make this any worse but maybe it can. James, Colin, can you capture output of 'vmstat 1' while you do the copying? Thanks. Honza > [ 287.088371] INFO: task rs:main Q:Reg:749 blocked for more than 30 seconds. > [ 287.088374] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 287.088376] rs:main Q:Reg D 0000000000000000 0 749 1 0x00000000 > [ 287.088381] ffff880072c17b68 0000000000000082 ffff880072c17fd8 ffff880072c16000 > [ 287.088392] 0000000000013d00 ffff88003591b178 ffff880072c17fd8 0000000000013d00 > [ 287.088396] ffffffff81a0b020 ffff88003591adc0 ffff88001fffc3e8 ffff88001fc13d00 > [ 287.088400] Call Trace: > [ 287.088404] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > [ 287.088408] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > [ 287.088411] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > [ 287.088414] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > [ 287.088418] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > [ 287.088421] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > [ 287.088425] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > [ 287.088428] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > [ 287.088431] [<ffffffff8112d144>] __do_fault+0x54/0x520 > [ 287.088434] [<ffffffff81134a43>] ? unmap_region+0x113/0x170 > [ 287.088437] [<ffffffff812ded90>] ? prio_tree_insert+0x150/0x1c0 > [ 287.088440] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > [ 287.088442] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > [ 287.088446] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > [ 287.088448] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > [ 287.088451] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > [ 287.088454] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > [ 287.088457] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > [ 287.088460] [<ffffffff81137127>] ? sys_mmap_pgoff+0x167/0x230 > [ 287.088463] [<ffffffff815c34d5>] page_fault+0x25/0x30 > [ 287.088466] INFO: task NetworkManager:764 blocked for more than 30 seconds. > [ 287.088468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 287.088470] NetworkManager D 0000000000000002 0 764 1 0x00000000 > [ 287.088473] ffff880074ffbb68 0000000000000082 ffff880074ffbfd8 ffff880074ffa000 > [ 287.088477] 0000000000013d00 ffff880036051a98 ffff880074ffbfd8 0000000000013d00 > [ 287.088481] ffff8801005badc0 ffff8800360516e0 ffff88001ffef128 ffff88001fc53d00 > [ 287.088484] Call Trace: > [ 287.088488] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > [ 287.088491] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > [ 287.088494] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > [ 287.088497] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > [ 287.088500] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > [ 287.088503] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > [ 287.088506] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > [ 287.088509] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > [ 287.088513] [<ffffffff81177110>] ? pollwake+0x0/0x60 > [ 287.088516] [<ffffffff8112d144>] __do_fault+0x54/0x520 > [ 287.088519] [<ffffffff81177110>] ? pollwake+0x0/0x60 > [ 287.088522] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > [ 287.088525] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > [ 287.088527] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > [ 287.088530] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > [ 287.088533] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > [ 287.088537] [<ffffffff81013859>] ? read_tsc+0x9/0x20 > [ 287.088540] [<ffffffff81092eb1>] ? ktime_get_ts+0xb1/0xf0 > [ 287.088543] [<ffffffff811776d2>] ? poll_select_set_timeout+0x82/0x90 > [ 287.088546] [<ffffffff815c34d5>] page_fault+0x25/0x30 > [ 287.088559] INFO: task unity-panel-ser:1521 blocked for more than 30 seconds. > [ 287.088561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 287.088562] unity-panel-ser D 0000000000000000 0 1521 1 0x00000000 > [ 287.088566] ffff880061f37b68 0000000000000082 ffff880061f37fd8 ffff880061f36000 > [ 287.088570] 0000000000013d00 ffff880068c7c858 ffff880061f37fd8 0000000000013d00 > [ 287.088573] ffff88003591c4a0 ffff880068c7c4a0 ffff88001fff0c88 ffff88001fc13d00 > [ 287.088577] Call Trace: > [ 287.088581] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > [ 287.088583] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > [ 287.088587] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > [ 287.088589] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > [ 287.088593] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > [ 287.088596] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > [ 287.088599] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > [ 287.088602] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > [ 287.088605] [<ffffffff8112d144>] __do_fault+0x54/0x520 > [ 287.088608] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > [ 287.088610] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > [ 287.088613] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > [ 287.088616] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > [ 287.088619] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > [ 287.088622] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > [ 287.088625] [<ffffffff815c34d5>] page_fault+0x25/0x30 > [ 287.088629] INFO: task jbd2/sda4-8:1845 blocked for more than 30 seconds. > [ 287.088630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 287.088632] jbd2/sda4-8 D 0000000000000000 0 1845 2 0x00000000 > [ 287.088636] ffff880068f6baf0 0000000000000046 ffff880068f6bfd8 ffff880068f6a000 > [ 287.088639] 0000000000013d00 ffff880061d603b8 ffff880068f6bfd8 0000000000013d00 > [ 287.088643] ffff88003591c4a0 ffff880061d60000 ffff88001fff8548 ffff88001fc13d00 > [ 287.088647] Call Trace: > [ 287.088650] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > [ 287.088653] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > [ 287.088656] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > [ 287.088659] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > [ 287.088662] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > [ 287.088665] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > [ 287.088668] [<ffffffff8110c41d>] filemap_fdatawait_range+0xfd/0x190 > [ 287.088672] [<ffffffff8110c4db>] filemap_fdatawait+0x2b/0x30 > [ 287.088675] [<ffffffff81242a93>] journal_finish_inode_data_buffers+0x63/0x170 > [ 287.088678] [<ffffffff81243284>] jbd2_journal_commit_transaction+0x6e4/0x1190 > [ 287.088682] [<ffffffff81076185>] ? try_to_del_timer_sync+0x85/0xe0 > [ 287.088685] [<ffffffff81247e9b>] kjournald2+0xbb/0x220 > [ 287.088688] [<ffffffff81087f30>] ? autoremove_wake_function+0x0/0x40 > [ 287.088691] [<ffffffff81247de0>] ? kjournald2+0x0/0x220 > [ 287.088694] [<ffffffff810877e6>] kthread+0x96/0xa0 > [ 287.088697] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 > [ 287.088700] [<ffffffff81087750>] ? kthread+0x0/0xa0 > [ 287.088703] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 > [ 287.088705] INFO: task dirname:5969 blocked for more than 30 seconds. > [ 287.088707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 287.088709] dirname D 0000000000000002 0 5969 5214 0x00000000 > [ 287.088712] ffff88005bd9d8b8 0000000000000086 ffff88005bd9dfd8 ffff88005bd9c000 > [ 287.088716] 0000000000013d00 ffff88005d65b178 ffff88005bd9dfd8 0000000000013d00 > [ 287.088720] ffff8801005e5b80 ffff88005d65adc0 ffff88001ffe5228 ffff88001fc53d00 > [ 287.088723] Call Trace: > [ 287.088726] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > [ 287.088729] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > [ 287.088732] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > [ 287.088735] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > [ 287.088738] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > [ 287.088741] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > [ 287.088744] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > [ 287.088747] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > [ 287.088750] [<ffffffff8112d144>] __do_fault+0x54/0x520 > [ 287.088753] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > [ 287.088756] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > [ 287.088759] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > [ 287.088761] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > [ 287.088764] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > [ 287.088767] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > [ 287.088770] [<ffffffff81136947>] ? mmap_region+0x1f7/0x500 > [ 287.088773] [<ffffffff8112db06>] ? free_pgd_range+0x356/0x4a0 > [ 287.088776] [<ffffffff815c34d5>] page_fault+0x25/0x30 > [ 287.088779] [<ffffffff812e6d5f>] ? __clear_user+0x3f/0x70 > [ 287.088782] [<ffffffff812e6d41>] ? __clear_user+0x21/0x70 > [ 287.088786] [<ffffffff812e6dc6>] clear_user+0x36/0x40 > [ 287.088788] [<ffffffff811b0b6d>] padzero+0x2d/0x40 > [ 287.088791] [<ffffffff811b2c7a>] load_elf_binary+0x95a/0xe00 > [ 287.088794] [<ffffffff8116aa8a>] search_binary_handler+0xda/0x300 > [ 287.088797] [<ffffffff811b2320>] ? load_elf_binary+0x0/0xe00 > [ 287.088800] [<ffffffff8116c49c>] do_execve+0x24c/0x2d0 > [ 287.088802] [<ffffffff8101521a>] sys_execve+0x4a/0x80 > [ 287.088805] [<ffffffff8100c45c>] stub_execve+0x6c/0xc0 > -- > Jan Kara <jack@suse.cz> > SUSE Labs, CR > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2011-04-28 at 08:29 -0400, Chris Mason wrote: > Excerpts from Colin Ian King's message of 2011-04-28 07:36:30 -0400: > > One more data point to add, I've been looking at an identical issue when > > copying large amounts of data. I bisected this - and the lockups occur > > with commit > > 3e7d344970673c5334cf7b5bb27c8c0942b06126 - before that I don't see the > > issue. With this commit, my file copy test locks up after ~8-10 > > iterations, before this commit I can copy > 100 times and don't see the > > lockup. > > Well, that's really interesting. I tried with compaction on here and > couldn't trigger it, but this (very very lightly) tested patch might > help. > > It moves the writeout throttle before the goto restart, and also makes > sure we do at least one cond_resched before we loop. It seems to take longer, but with a PREEMPT kernel, kswapd eventually shoots up to 99% during the tar. James -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2011-04-28 at 16:33 +0200, Jan Kara wrote: > On Thu 28-04-11 16:25:51, Jan Kara wrote: > > On Thu 28-04-11 15:01:22, Colin Ian King wrote: > > > > > > > Could you post the soft lockups you're seeing? > > > > > > As requested, attached > > Hum, what keeps puzzling me is that in all the cases of hangs I've seen > > so far, we are stuck waiting for IO to finish for a long time - e.g. in the > > traces below kjournald waits for PageWriteback bit to get cleared. Also we > > are stuck waiting for page locks which might be because those pages are > > being read in? All in all it seems that the IO is just incredibly slow. > > > > But it's not clear to me what pushes us into that situation (especially > > since ext4 refuses to do any IO from ->writepage (i.e. kswapd) when the > > underlying blocks are not already allocated. > Hmm, maybe because the system is under memory pressure (and kswapd is not > able to get rid of dirty pages), we page out clean pages. Thus also pages > of executables which need to be paged in soon anyway thus putting heavy > read load on the system which makes writes crawl? I'm not sure why > compaction should make this any worse but maybe it can. > > James, Colin, can you capture output of 'vmstat 1' while you do the > copying? Thanks. Attached. > > Honza > > > [ 287.088371] INFO: task rs:main Q:Reg:749 blocked for more than 30 seconds. > > [ 287.088374] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > [ 287.088376] rs:main Q:Reg D 0000000000000000 0 749 1 0x00000000 > > [ 287.088381] ffff880072c17b68 0000000000000082 ffff880072c17fd8 ffff880072c16000 > > [ 287.088392] 0000000000013d00 ffff88003591b178 ffff880072c17fd8 0000000000013d00 > > [ 287.088396] ffffffff81a0b020 ffff88003591adc0 ffff88001fffc3e8 ffff88001fc13d00 > > [ 287.088400] Call Trace: > > [ 287.088404] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > [ 287.088408] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > [ 287.088411] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > [ 287.088414] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > [ 287.088418] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > [ 287.088421] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > [ 287.088425] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > [ 287.088428] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > [ 287.088431] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > [ 287.088434] [<ffffffff81134a43>] ? unmap_region+0x113/0x170 > > [ 287.088437] [<ffffffff812ded90>] ? prio_tree_insert+0x150/0x1c0 > > [ 287.088440] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > [ 287.088442] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > > [ 287.088446] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > > [ 287.088448] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > > [ 287.088451] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > [ 287.088454] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > [ 287.088457] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > > [ 287.088460] [<ffffffff81137127>] ? sys_mmap_pgoff+0x167/0x230 > > [ 287.088463] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > [ 287.088466] INFO: task NetworkManager:764 blocked for more than 30 seconds. > > [ 287.088468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > [ 287.088470] NetworkManager D 0000000000000002 0 764 1 0x00000000 > > [ 287.088473] ffff880074ffbb68 0000000000000082 ffff880074ffbfd8 ffff880074ffa000 > > [ 287.088477] 0000000000013d00 ffff880036051a98 ffff880074ffbfd8 0000000000013d00 > > [ 287.088481] ffff8801005badc0 ffff8800360516e0 ffff88001ffef128 ffff88001fc53d00 > > [ 287.088484] Call Trace: > > [ 287.088488] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > [ 287.088491] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > [ 287.088494] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > [ 287.088497] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > [ 287.088500] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > [ 287.088503] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > [ 287.088506] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > [ 287.088509] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > [ 287.088513] [<ffffffff81177110>] ? pollwake+0x0/0x60 > > [ 287.088516] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > [ 287.088519] [<ffffffff81177110>] ? pollwake+0x0/0x60 > > [ 287.088522] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > [ 287.088525] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > > [ 287.088527] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > > [ 287.088530] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > [ 287.088533] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > [ 287.088537] [<ffffffff81013859>] ? read_tsc+0x9/0x20 > > [ 287.088540] [<ffffffff81092eb1>] ? ktime_get_ts+0xb1/0xf0 > > [ 287.088543] [<ffffffff811776d2>] ? poll_select_set_timeout+0x82/0x90 > > [ 287.088546] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > [ 287.088559] INFO: task unity-panel-ser:1521 blocked for more than 30 seconds. > > [ 287.088561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > [ 287.088562] unity-panel-ser D 0000000000000000 0 1521 1 0x00000000 > > [ 287.088566] ffff880061f37b68 0000000000000082 ffff880061f37fd8 ffff880061f36000 > > [ 287.088570] 0000000000013d00 ffff880068c7c858 ffff880061f37fd8 0000000000013d00 > > [ 287.088573] ffff88003591c4a0 ffff880068c7c4a0 ffff88001fff0c88 ffff88001fc13d00 > > [ 287.088577] Call Trace: > > [ 287.088581] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > [ 287.088583] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > [ 287.088587] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > [ 287.088589] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > [ 287.088593] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > [ 287.088596] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > [ 287.088599] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > [ 287.088602] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > [ 287.088605] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > [ 287.088608] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > [ 287.088610] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > > [ 287.088613] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > > [ 287.088616] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > [ 287.088619] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > [ 287.088622] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > > [ 287.088625] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > [ 287.088629] INFO: task jbd2/sda4-8:1845 blocked for more than 30 seconds. > > [ 287.088630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > [ 287.088632] jbd2/sda4-8 D 0000000000000000 0 1845 2 0x00000000 > > [ 287.088636] ffff880068f6baf0 0000000000000046 ffff880068f6bfd8 ffff880068f6a000 > > [ 287.088639] 0000000000013d00 ffff880061d603b8 ffff880068f6bfd8 0000000000013d00 > > [ 287.088643] ffff88003591c4a0 ffff880061d60000 ffff88001fff8548 ffff88001fc13d00 > > [ 287.088647] Call Trace: > > [ 287.088650] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > [ 287.088653] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > [ 287.088656] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > [ 287.088659] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > [ 287.088662] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > [ 287.088665] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > [ 287.088668] [<ffffffff8110c41d>] filemap_fdatawait_range+0xfd/0x190 > > [ 287.088672] [<ffffffff8110c4db>] filemap_fdatawait+0x2b/0x30 > > [ 287.088675] [<ffffffff81242a93>] journal_finish_inode_data_buffers+0x63/0x170 > > [ 287.088678] [<ffffffff81243284>] jbd2_journal_commit_transaction+0x6e4/0x1190 > > [ 287.088682] [<ffffffff81076185>] ? try_to_del_timer_sync+0x85/0xe0 > > [ 287.088685] [<ffffffff81247e9b>] kjournald2+0xbb/0x220 > > [ 287.088688] [<ffffffff81087f30>] ? autoremove_wake_function+0x0/0x40 > > [ 287.088691] [<ffffffff81247de0>] ? kjournald2+0x0/0x220 > > [ 287.088694] [<ffffffff810877e6>] kthread+0x96/0xa0 > > [ 287.088697] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 > > [ 287.088700] [<ffffffff81087750>] ? kthread+0x0/0xa0 > > [ 287.088703] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 > > [ 287.088705] INFO: task dirname:5969 blocked for more than 30 seconds. > > [ 287.088707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > [ 287.088709] dirname D 0000000000000002 0 5969 5214 0x00000000 > > [ 287.088712] ffff88005bd9d8b8 0000000000000086 ffff88005bd9dfd8 ffff88005bd9c000 > > [ 287.088716] 0000000000013d00 ffff88005d65b178 ffff88005bd9dfd8 0000000000013d00 > > [ 287.088720] ffff8801005e5b80 ffff88005d65adc0 ffff88001ffe5228 ffff88001fc53d00 > > [ 287.088723] Call Trace: > > [ 287.088726] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > [ 287.088729] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > [ 287.088732] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > [ 287.088735] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > [ 287.088738] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > [ 287.088741] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > [ 287.088744] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > [ 287.088747] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > [ 287.088750] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > [ 287.088753] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > [ 287.088756] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > > [ 287.088759] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > > [ 287.088761] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > > [ 287.088764] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > [ 287.088767] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > [ 287.088770] [<ffffffff81136947>] ? mmap_region+0x1f7/0x500 > > [ 287.088773] [<ffffffff8112db06>] ? free_pgd_range+0x356/0x4a0 > > [ 287.088776] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > [ 287.088779] [<ffffffff812e6d5f>] ? __clear_user+0x3f/0x70 > > [ 287.088782] [<ffffffff812e6d41>] ? __clear_user+0x21/0x70 > > [ 287.088786] [<ffffffff812e6dc6>] clear_user+0x36/0x40 > > [ 287.088788] [<ffffffff811b0b6d>] padzero+0x2d/0x40 > > [ 287.088791] [<ffffffff811b2c7a>] load_elf_binary+0x95a/0xe00 > > [ 287.088794] [<ffffffff8116aa8a>] search_binary_handler+0xda/0x300 > > [ 287.088797] [<ffffffff811b2320>] ? load_elf_binary+0x0/0xe00 > > [ 287.088800] [<ffffffff8116c49c>] do_execve+0x24c/0x2d0 > > [ 287.088802] [<ffffffff8101521a>] sys_execve+0x4a/0x80 > > [ 287.088805] [<ffffffff8100c45c>] stub_execve+0x6c/0xc0 > > -- > > Jan Kara <jack@suse.cz> > > SUSE Labs, CR > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2011-04-28 at 10:04 -0400, Chris Mason wrote: > Excerpts from Colin Ian King's message of 2011-04-28 10:01:22 -0400: > > > > > Could you post the soft lockups you're seeing? > > > > As requested, attached > > These are not good, but they aren't the lockup James was seeing. Were > these messages with my patch? If yes, please post the messages from > without my patch. Attached are the messages without your patch. Colin > > -chris
On Thu, 2011-04-28 at 16:33 +0200, Jan Kara wrote: > James, Colin, can you capture output of 'vmstat 1' while you do the > copying? Thanks. Sure; this is with a PREEMPT kernel so the tar goes to completion without locking the system. I started using a memory balloon to clear out the page cache. procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 98708 1676100 4112 39556 0 45 1566 1879 392 168 14 17 59 10 0 0 0 98708 1676100 4112 39604 0 0 0 0 37 35 0 0 100 0 0 0 0 98708 1676100 4112 39652 0 0 0 0 31 35 0 0 100 0 0 0 0 98708 1676100 4112 39652 0 0 0 0 26 27 0 0 100 0 0 0 0 98708 1676100 4112 39652 32 0 32 0 50 48 0 0 100 0 0 1 1 98708 1673984 4616 40576 192 0 1812 0 286 117 0 0 92 8 0 0 1 98708 1590540 4792 124236 0 0 41972 0 901 822 0 4 67 29 0 0 1 98708 1499524 4808 214820 0 0 45328 0 917 890 0 4 77 20 0 0 1 98708 1404532 4808 308316 64 0 46784 0 965 1033 0 3 77 20 0 0 1 98708 1312020 5612 399008 0 0 44836 0 1274 1075 0 6 77 18 0 0 1 98708 1304540 5628 406324 0 0 3600 47260 289 115 0 2 68 30 0 0 1 98708 1230760 5672 479020 0 0 36260 712 914 854 0 3 79 18 0 0 1 98708 1208696 5676 500016 0 0 10500 93888 421 251 0 3 67 31 0 0 1 98708 1157632 5676 551264 0 0 25600 0 658 518 0 2 75 23 0 0 1 98708 1052356 5680 655344 0 0 51972 0 1112 1112 0 4 78 19 0 0 2 98708 1024960 5680 679080 0 0 11904 87304 477 277 0 3 54 43 0 1 1 98708 1027288 5684 679432 0 0 128 2896 195 42 0 0 85 14 0 0 1 98708 947124 5688 758528 0 0 39552 4 932 899 0 4 68 28 0 0 1 98708 849792 5692 854272 0 0 47876 4468 1045 936 0 4 77 20 0 0 1 98708 813000 5820 890492 0 0 18176 18300 512 386 0 3 66 32 0 1 2 98708 794304 5820 908132 0 0 8832 95640 379 324 0 2 77 21 0 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 1 98708 794272 5824 908128 0 0 0 39456 212 58 0 1 73 26 0 0 1 98708 792344 5832 910452 0 0 1156 44440 193 62 0 1 74 25 0 0 1 98708 764392 5836 938288 0 0 13956 0 441 374 0 1 79 19 0 0 1 98708 660828 5840 1040232 0 0 50948 0 1141 992 0 6 65 29 0 0 1 98708 555156 5840 1144216 0 0 51968 0 1072 1008 0 4 76 20 0 0 1 98708 458692 5856 1239212 0 0 47496 120 995 1046 0 3 78 19 0 0 1 98708 411336 5856 1285896 0 0 23296 17764 536 463 0 3 71 26 0 1 1 98708 367716 5856 1328904 0 0 21504 22272 541 463 0 3 78 20 0 0 1 98708 330048 5864 1366044 0 0 18568 17524 529 515 0 2 78 20 0 0 1 98708 301620 5864 1393912 0 0 13952 28148 382 309 0 2 68 30 0 1 1 98708 245140 5872 1449576 0 0 27776 13224 610 567 0 3 73 24 0 0 2 98708 226868 5872 1467000 0 0 8704 114804 375 322 0 2 77 21 0 0 2 98708 226268 5876 1467552 0 0 260 51720 204 40 0 1 75 24 0 0 1 98708 226548 5880 1467536 0 0 4 13180 149 29 0 1 70 30 0 2 0 98708 159792 5880 1533844 0 0 33152 0 721 747 0 2 81 17 0 0 1 98708 55252 5888 1636692 0 0 51456 104 1021 977 0 5 70 25 0 1 1 98708 22804 2680 1671344 0 0 53632 0 1806 1188 0 20 59 21 0 2 1 98708 22832 2672 1671372 0 0 40704 11196 1967 1126 0 22 60 17 0 1 1 98708 22908 2676 1671196 0 0 24580 23040 1621 705 0 35 36 30 0 1 1 98708 22968 2676 1670796 0 0 21504 22212 1576 674 0 22 60 18 0 1 1 98708 22672 2684 1670996 0 0 17920 16968 1450 687 0 27 51 23 0 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 1 98708 22904 2452 1671344 0 0 24832 25780 1632 714 0 27 50 23 0 1 2 98708 22764 2452 1670724 0 0 15488 96304 1459 531 0 37 33 30 0 2 2 98708 22640 2452 1670724 0 0 0 60300 1219 356 0 20 45 35 0 1 1 98708 22744 2452 1670688 0 0 768 29312 1200 253 0 29 36 34 0 2 0 98708 22964 2460 1671040 0 0 10752 68 1345 445 0 23 56 21 0 1 1 98708 22652 2536 1671560 0 0 51540 0 2113 1246 0 28 49 23 0 1 1 98708 22676 2528 1671768 0 0 118276 14604 4875 2681 0 27 51 22 0 1 1 98708 22644 2528 1671804 0 0 23680 18172 1539 774 0 29 44 27 0 1 1 98708 22720 2536 1671464 0 0 21888 24664 1494 651 0 24 55 22 0 2 1 98708 22672 2496 1671592 0 0 22528 18144 1493 643 0 30 42 28 0 1 1 98708 22732 2488 1671508 0 0 22528 25192 1537 726 0 23 55 21 0 1 2 98708 22884 2488 1670504 0 0 18816 89440 1464 581 0 31 37 32 0 1 2 98708 22740 2488 1670764 0 0 768 75012 1196 254 0 22 55 23 0 1 2 98708 22848 2492 1670600 0 0 256 15512 1225 358 0 29 27 44 0 1 1 98708 22620 2500 1671628 0 0 19076 11408 1490 575 0 23 54 23 0 1 1 98708 22988 2500 1671108 0 0 55040 0 2118 1203 0 27 51 22 0 1 1 98708 22876 2488 1671044 0 0 53760 0 2138 1323 0 28 49 23 0 1 1 98708 22904 2500 1671012 0 0 48012 5528 2049 1097 0 32 43 25 0 1 1 98708 22856 2508 1670876 0 0 21504 18536 1500 646 0 23 55 22 0 2 0 98708 22684 2504 1670972 0 0 23680 23552 1537 784 0 29 44 26 0 1 1 98708 22960 2500 1670520 0 0 23808 25552 1529 677 0 24 55 21 0 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 1 98708 22820 2504 1670804 0 0 21764 25104 1507 647 0 30 43 26 0 1 1 98708 22880 2504 1670812 0 0 24192 18900 1541 768 0 23 55 22 0 2 3 98708 22716 2512 1670068 0 0 2816 132592 1257 303 0 30 30 40 0 1 1 98708 22944 2512 1669940 0 0 512 40400 1165 247 0 23 57 20 0 1 1 98708 22672 2512 1670624 0 0 512 0 1186 349 0 28 44 29 0 1 1 98708 22788 2504 1670416 0 0 51200 0 2130 1158 0 25 53 22 0 1 1 98708 22660 2508 1670688 0 0 50564 0 1972 1161 0 31 43 26 0 1 1 98708 22780 2516 1670088 0 0 50176 64 2028 1204 0 24 56 20 0 1 1 98708 22856 2508 1669884 0 0 32640 19040 1723 819 0 31 43 26 0 3 0 98708 22808 2512 1669812 0 0 23812 19176 1535 670 0 23 55 22 0 1 1 98708 22744 2504 1669728 0 0 21888 24872 1535 653 0 30 43 27 0 1 1 98708 22696 2504 1669636 0 0 22784 25032 1529 647 0 23 55 21 0 1 1 98708 22632 2512 1669596 0 0 21896 19228 1515 647 0 30 43 27 0 1 2 98708 22676 2512 1668992 0 0 9600 74484 1366 418 0 24 43 33 0 1 2 98708 22780 2512 1668772 0 0 256 107120 1211 245 0 29 42 30 0 1 1 98708 23196 2512 1668920 0 0 256 40 1158 228 0 23 57 20 0 2 1 98708 22968 2512 1669092 0 0 47360 0 2023 1097 0 26 49 25 0 1 1 98708 22864 2704 1668784 0 0 39992 116 1878 994 0 27 49 24 0 1 1 98708 22704 2708 1668872 0 0 53388 0 2166 1206 0 28 49 23 0 1 1 98708 22796 2712 1668472 0 0 39940 12784 1873 966 0 27 52 21 0 2 1 98708 22624 2712 1668508 0 0 21888 21904 1498 645 0 30 43 27 0 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 1 98708 22700 2704 1668688 0 0 24320 26240 1548 685 0 23 55 22 0 1 1 98708 22776 2712 1668284 0 0 18048 17532 1438 586 0 30 43 27 0 1 1 98708 22868 2712 1668336 0 0 24192 21764 1552 661 0 23 55 23 0 2 2 98708 22772 2716 1667696 0 0 12420 88772 1433 494 0 30 41 28 0 1 2 98708 23424 2720 1667584 0 0 4 9092 1181 246 0 23 40 37 0 1 1 98708 23120 2724 1667584 0 0 4 86380 1262 269 0 29 42 29 0 1 0 98708 23408 2984 1667708 0 0 364 0 1187 249 0 22 60 18 0 2 0 98708 23416 2992 1667772 0 0 0 120 1045 246 0 25 74 1 0 1 0 98708 23408 2992 1667720 0 0 0 0 1025 221 0 26 74 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1030 225 0 28 72 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1024 224 0 23 77 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1034 226 0 28 72 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1022 221 0 23 77 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1032 227 0 28 72 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1024 221 0 23 77 0 0 1 0 98708 23408 2992 1667720 0 0 0 0 1034 228 0 28 72 0 0 kswapd goes up to 99% at nearly the end of the second page. I ^C on the tar but kswapd stays at 99% for the end. James -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu 28-04-11 15:58:21, Colin Ian King wrote: > On Thu, 2011-04-28 at 16:33 +0200, Jan Kara wrote: > > On Thu 28-04-11 16:25:51, Jan Kara wrote: > > > On Thu 28-04-11 15:01:22, Colin Ian King wrote: > > > > > > > > > Could you post the soft lockups you're seeing? > > > > > > > > As requested, attached > > > Hum, what keeps puzzling me is that in all the cases of hangs I've seen > > > so far, we are stuck waiting for IO to finish for a long time - e.g. in the > > > traces below kjournald waits for PageWriteback bit to get cleared. Also we > > > are stuck waiting for page locks which might be because those pages are > > > being read in? All in all it seems that the IO is just incredibly slow. > > > > > > But it's not clear to me what pushes us into that situation (especially > > > since ext4 refuses to do any IO from ->writepage (i.e. kswapd) when the > > > underlying blocks are not already allocated. > > Hmm, maybe because the system is under memory pressure (and kswapd is not > > able to get rid of dirty pages), we page out clean pages. Thus also pages > > of executables which need to be paged in soon anyway thus putting heavy > > read load on the system which makes writes crawl? I'm not sure why > > compaction should make this any worse but maybe it can. > > > > James, Colin, can you capture output of 'vmstat 1' while you do the > > copying? Thanks. > > Attached. Thanks. So I there are a few interesting points in the vmstat output: For first 30 seconds, we are happily copying data - relatively steady read throughput (about 20-40 MB/s) and occasional peak from flusher thread flushing dirty data. During this time free memory drops from about 1.4 GB to about 22!!! MB - mm seems to like to really use the machine ;). Then things get interesting: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 1 0 83372 5228 1450776 0 0 39684 90132 450 918 0 4 74 22 0 1 0 22416 5228 1509864 0 0 29452 48492 403 869 1 2 80 18 2 0 0 20056 5384 1513996 0 0 2248 2116 434 1191 4 4 71 21 0 1 0 19800 5932 1514600 0 0 644 104 454 1166 8 3 64 24 1 0 0 21848 5940 1515244 0 0 292 380 468 1775 15 6 75 3 1 0 0 20936 5940 1515876 0 0 296 296 496 1846 18 8 74 0 1 0 0 17792 5940 1516580 0 0 356 356 484 1862 23 8 69 0 1 0 0 17544 5940 1517364 0 0 412 412 482 1812 16 7 77 0 4 0 0 18148 5948 1517968 0 0 288 344 436 1749 19 9 69 3 0 2 220 137528 1616 1402468 0 220 74708 2164 849 1806 3 6 63 28 0 3 224 36184 1628 1499648 0 4 50820 86204 532 1272 0 4 64 32 0 2 19680 53688 1628 1484388 32 19456 6080 62972 242 287 0 2 63 34 0 2 36928 1407432 1356 150980 0 17252 1564 17276 368 764 1 3 73 22 So when free memory reached about 20 MB, both read and write activity almost stalled for 7 s (probably everybody waits for free memory). Then mm manages to free 100 MB from page cache, things move on for two seconds, then we swap out! about 36 MB and page reclaim also finally decides it maybe has too much of page cache and reaps most of it (1.3 GB in one go). Then things get going again, although there are still occasional stalls such as this (about 30s later): 1 3 36688 753192 1208 792344 0 0 35948 32768 435 6625 0 6 61 33 0 2 36668 754996 1344 792760 0 0 252 58736 2832 16239 0 1 60 39 0 2 36668 750132 1388 796688 0 0 2508 1524 325 959 1 3 68 28 1 0 36668 751160 1400 797968 0 0 620 620 460 1470 6 6 50 38 1 0 36668 750516 1400 798520 0 0 300 300 412 1764 17 8 75 1 1 0 36668 750648 1408 799108 0 0 280 340 423 1816 18 6 73 3 1 0 36668 748856 1408 799752 0 0 336 328 409 1788 18 8 75 0 1 0 36668 748120 1416 800604 0 0 428 452 407 1723 14 10 75 2 1 0 36668 750048 1416 801176 0 0 296 296 405 1779 18 7 75 1 0 1 36668 650428 1420 897252 0 0 48100 556 658 1718 10 3 71 15 0 2 36668 505444 1424 1037012 0 0 69888 90272 686 1491 1 4 68 27 0 1 36668 479264 1428 1063372 0 0 12984 40896 324 674 1 1 76 23 ... I'm not sure what we were blocked on here since there is still plenty of free memory (750 MB). These stalls repeat once in a while but things go on. Then at about 350s, things just stop: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 3 1 75280 73564 1844 1503848 0 0 43396 81976 627 1061 0 25 42 32 3 3 75280 73344 1852 1504256 0 0 256 20 240 149 0 26 25 49 3 3 75280 73344 1852 1504268 0 0 0 0 265 140 0 29 13 58 3 3 75280 73468 1852 1504232 0 0 0 0 265 132 0 22 34 44 3 3 75280 73468 1852 1504232 0 0 0 0 339 283 0 25 26 49 3 3 75280 73468 1852 1504232 0 0 0 0 362 327 0 25 25 50 3 3 75280 73468 1852 1504232 0 0 0 0 317 320 0 26 25 49 3 3 75280 73468 1852 1504232 0 0 0 0 361 343 0 26 25 50 and nothing really happens for 150 s, except more and more tasks blocking in D state (second column). 3 6 75272 73416 1872 1503872 0 0 0 0 445 700 0 25 25 50 0 7 75264 67940 1884 1509008 64 0 5056 16 481 876 0 22 23 55 Then suddently things get going again: 0 2 75104 76808 1892 1502552 352 0 14292 40456 459 14865 0 2 39 59 0 1 75104 75704 1900 1503412 0 0 820 32 405 788 1 1 72 27 1 0 75104 76512 1904 1505576 0 0 1068 1072 454 1586 8 7 74 11 I guess this 150 s stall is when kernel barfs the "task blocked for more than 30 seconds" messages. And from the traces we know that everyone is waiting for PageWriteback or page lock during this time. Also James's vmstat report shows that IO is stalled when kswapd is spinning. Really strange. James in the meantime identified that cgroups are somehow involved. Are you using systemd by any chance? Maybe cgroup IO throttling screws us? Honza > > > [ 287.088371] INFO: task rs:main Q:Reg:749 blocked for more than 30 seconds. > > > [ 287.088374] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > [ 287.088376] rs:main Q:Reg D 0000000000000000 0 749 1 0x00000000 > > > [ 287.088381] ffff880072c17b68 0000000000000082 ffff880072c17fd8 ffff880072c16000 > > > [ 287.088392] 0000000000013d00 ffff88003591b178 ffff880072c17fd8 0000000000013d00 > > > [ 287.088396] ffffffff81a0b020 ffff88003591adc0 ffff88001fffc3e8 ffff88001fc13d00 > > > [ 287.088400] Call Trace: > > > [ 287.088404] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > [ 287.088408] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > [ 287.088411] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > [ 287.088414] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > [ 287.088418] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > [ 287.088421] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > [ 287.088425] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > [ 287.088428] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > [ 287.088431] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > [ 287.088434] [<ffffffff81134a43>] ? unmap_region+0x113/0x170 > > > [ 287.088437] [<ffffffff812ded90>] ? prio_tree_insert+0x150/0x1c0 > > > [ 287.088440] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > [ 287.088442] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > > > [ 287.088446] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > > > [ 287.088448] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > > > [ 287.088451] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > [ 287.088454] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > [ 287.088457] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > > > [ 287.088460] [<ffffffff81137127>] ? sys_mmap_pgoff+0x167/0x230 > > > [ 287.088463] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > [ 287.088466] INFO: task NetworkManager:764 blocked for more than 30 seconds. > > > [ 287.088468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > [ 287.088470] NetworkManager D 0000000000000002 0 764 1 0x00000000 > > > [ 287.088473] ffff880074ffbb68 0000000000000082 ffff880074ffbfd8 ffff880074ffa000 > > > [ 287.088477] 0000000000013d00 ffff880036051a98 ffff880074ffbfd8 0000000000013d00 > > > [ 287.088481] ffff8801005badc0 ffff8800360516e0 ffff88001ffef128 ffff88001fc53d00 > > > [ 287.088484] Call Trace: > > > [ 287.088488] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > [ 287.088491] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > [ 287.088494] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > [ 287.088497] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > [ 287.088500] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > [ 287.088503] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > [ 287.088506] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > [ 287.088509] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > [ 287.088513] [<ffffffff81177110>] ? pollwake+0x0/0x60 > > > [ 287.088516] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > [ 287.088519] [<ffffffff81177110>] ? pollwake+0x0/0x60 > > > [ 287.088522] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > [ 287.088525] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > > > [ 287.088527] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > > > [ 287.088530] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > [ 287.088533] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > [ 287.088537] [<ffffffff81013859>] ? read_tsc+0x9/0x20 > > > [ 287.088540] [<ffffffff81092eb1>] ? ktime_get_ts+0xb1/0xf0 > > > [ 287.088543] [<ffffffff811776d2>] ? poll_select_set_timeout+0x82/0x90 > > > [ 287.088546] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > [ 287.088559] INFO: task unity-panel-ser:1521 blocked for more than 30 seconds. > > > [ 287.088561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > [ 287.088562] unity-panel-ser D 0000000000000000 0 1521 1 0x00000000 > > > [ 287.088566] ffff880061f37b68 0000000000000082 ffff880061f37fd8 ffff880061f36000 > > > [ 287.088570] 0000000000013d00 ffff880068c7c858 ffff880061f37fd8 0000000000013d00 > > > [ 287.088573] ffff88003591c4a0 ffff880068c7c4a0 ffff88001fff0c88 ffff88001fc13d00 > > > [ 287.088577] Call Trace: > > > [ 287.088581] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > [ 287.088583] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > [ 287.088587] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > [ 287.088589] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > [ 287.088593] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > [ 287.088596] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > [ 287.088599] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > [ 287.088602] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > [ 287.088605] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > [ 287.088608] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > [ 287.088610] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > > > [ 287.088613] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > > > [ 287.088616] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > [ 287.088619] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > [ 287.088622] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > > > [ 287.088625] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > [ 287.088629] INFO: task jbd2/sda4-8:1845 blocked for more than 30 seconds. > > > [ 287.088630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > [ 287.088632] jbd2/sda4-8 D 0000000000000000 0 1845 2 0x00000000 > > > [ 287.088636] ffff880068f6baf0 0000000000000046 ffff880068f6bfd8 ffff880068f6a000 > > > [ 287.088639] 0000000000013d00 ffff880061d603b8 ffff880068f6bfd8 0000000000013d00 > > > [ 287.088643] ffff88003591c4a0 ffff880061d60000 ffff88001fff8548 ffff88001fc13d00 > > > [ 287.088647] Call Trace: > > > [ 287.088650] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > [ 287.088653] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > [ 287.088656] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > [ 287.088659] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > [ 287.088662] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > [ 287.088665] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > [ 287.088668] [<ffffffff8110c41d>] filemap_fdatawait_range+0xfd/0x190 > > > [ 287.088672] [<ffffffff8110c4db>] filemap_fdatawait+0x2b/0x30 > > > [ 287.088675] [<ffffffff81242a93>] journal_finish_inode_data_buffers+0x63/0x170 > > > [ 287.088678] [<ffffffff81243284>] jbd2_journal_commit_transaction+0x6e4/0x1190 > > > [ 287.088682] [<ffffffff81076185>] ? try_to_del_timer_sync+0x85/0xe0 > > > [ 287.088685] [<ffffffff81247e9b>] kjournald2+0xbb/0x220 > > > [ 287.088688] [<ffffffff81087f30>] ? autoremove_wake_function+0x0/0x40 > > > [ 287.088691] [<ffffffff81247de0>] ? kjournald2+0x0/0x220 > > > [ 287.088694] [<ffffffff810877e6>] kthread+0x96/0xa0 > > > [ 287.088697] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 > > > [ 287.088700] [<ffffffff81087750>] ? kthread+0x0/0xa0 > > > [ 287.088703] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 > > > [ 287.088705] INFO: task dirname:5969 blocked for more than 30 seconds. > > > [ 287.088707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > [ 287.088709] dirname D 0000000000000002 0 5969 5214 0x00000000 > > > [ 287.088712] ffff88005bd9d8b8 0000000000000086 ffff88005bd9dfd8 ffff88005bd9c000 > > > [ 287.088716] 0000000000013d00 ffff88005d65b178 ffff88005bd9dfd8 0000000000013d00 > > > [ 287.088720] ffff8801005e5b80 ffff88005d65adc0 ffff88001ffe5228 ffff88001fc53d00 > > > [ 287.088723] Call Trace: > > > [ 287.088726] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > [ 287.088729] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > [ 287.088732] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > [ 287.088735] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > [ 287.088738] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > [ 287.088741] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > [ 287.088744] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > [ 287.088747] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > [ 287.088750] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > [ 287.088753] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > [ 287.088756] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > > > [ 287.088759] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > > > [ 287.088761] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > > > [ 287.088764] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > [ 287.088767] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > [ 287.088770] [<ffffffff81136947>] ? mmap_region+0x1f7/0x500 > > > [ 287.088773] [<ffffffff8112db06>] ? free_pgd_range+0x356/0x4a0 > > > [ 287.088776] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > [ 287.088779] [<ffffffff812e6d5f>] ? __clear_user+0x3f/0x70 > > > [ 287.088782] [<ffffffff812e6d41>] ? __clear_user+0x21/0x70 > > > [ 287.088786] [<ffffffff812e6dc6>] clear_user+0x36/0x40 > > > [ 287.088788] [<ffffffff811b0b6d>] padzero+0x2d/0x40 > > > [ 287.088791] [<ffffffff811b2c7a>] load_elf_binary+0x95a/0xe00 > > > [ 287.088794] [<ffffffff8116aa8a>] search_binary_handler+0xda/0x300 > > > [ 287.088797] [<ffffffff811b2320>] ? load_elf_binary+0x0/0xe00 > > > [ 287.088800] [<ffffffff8116c49c>] do_execve+0x24c/0x2d0 > > > [ 287.088802] [<ffffffff8101521a>] sys_execve+0x4a/0x80 > > > [ 287.088805] [<ffffffff8100c45c>] stub_execve+0x6c/0xc0 > > > -- > > > Jan Kara <jack@suse.cz> > > > SUSE Labs, CR > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >
On Fri, 2011-04-29 at 00:40 +0200, Jan Kara wrote: > James in the meantime identified that cgroups are somehow involved. > Are you > using systemd by any chance? Maybe cgroup IO throttling screws us? Yes, this is a FC15 system with systemd as the default. See related post: it looks like disabling the memory controller is what makes this go away. James -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2011-04-29 at 00:40 +0200, Jan Kara wrote: > On Thu 28-04-11 15:58:21, Colin Ian King wrote: > > On Thu, 2011-04-28 at 16:33 +0200, Jan Kara wrote: > > > On Thu 28-04-11 16:25:51, Jan Kara wrote: > > > > On Thu 28-04-11 15:01:22, Colin Ian King wrote: > > > > > > > > > > > Could you post the soft lockups you're seeing? > > > > > > > > > > As requested, attached > > > > Hum, what keeps puzzling me is that in all the cases of hangs I've seen > > > > so far, we are stuck waiting for IO to finish for a long time - e.g. in the > > > > traces below kjournald waits for PageWriteback bit to get cleared. Also we > > > > are stuck waiting for page locks which might be because those pages are > > > > being read in? All in all it seems that the IO is just incredibly slow. > > > > > > > > But it's not clear to me what pushes us into that situation (especially > > > > since ext4 refuses to do any IO from ->writepage (i.e. kswapd) when the > > > > underlying blocks are not already allocated. > > > Hmm, maybe because the system is under memory pressure (and kswapd is not > > > able to get rid of dirty pages), we page out clean pages. Thus also pages > > > of executables which need to be paged in soon anyway thus putting heavy > > > read load on the system which makes writes crawl? I'm not sure why > > > compaction should make this any worse but maybe it can. > > > > > > James, Colin, can you capture output of 'vmstat 1' while you do the > > > copying? Thanks. > > > > Attached. > Thanks. So I there are a few interesting points in the vmstat output: > For first 30 seconds, we are happily copying data - relatively steady read > throughput (about 20-40 MB/s) and occasional peak from flusher thread > flushing dirty data. During this time free memory drops from about 1.4 GB > to about 22!!! MB - mm seems to like to really use the machine ;). Then > things get interesting: > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- > r b swpd free buff cache si so bi bo in cs us sy id wa > 0 1 0 83372 5228 1450776 0 0 39684 90132 450 918 0 4 74 22 > 0 1 0 22416 5228 1509864 0 0 29452 48492 403 869 1 2 80 18 > 2 0 0 20056 5384 1513996 0 0 2248 2116 434 1191 4 4 71 21 > 0 1 0 19800 5932 1514600 0 0 644 104 454 1166 8 3 64 24 > 1 0 0 21848 5940 1515244 0 0 292 380 468 1775 15 6 75 3 > 1 0 0 20936 5940 1515876 0 0 296 296 496 1846 18 8 74 0 > 1 0 0 17792 5940 1516580 0 0 356 356 484 1862 23 8 69 0 > 1 0 0 17544 5940 1517364 0 0 412 412 482 1812 16 7 77 0 > 4 0 0 18148 5948 1517968 0 0 288 344 436 1749 19 9 69 3 > 0 2 220 137528 1616 1402468 0 220 74708 2164 849 1806 3 6 63 28 > 0 3 224 36184 1628 1499648 0 4 50820 86204 532 1272 0 4 64 32 > 0 2 19680 53688 1628 1484388 32 19456 6080 62972 242 287 0 2 63 34 > 0 2 36928 1407432 1356 150980 0 17252 1564 17276 368 764 1 3 73 22 > > So when free memory reached about 20 MB, both read and write activity > almost stalled for 7 s (probably everybody waits for free memory). Then > mm manages to free 100 MB from page cache, things move on for two seconds, > then we swap out! about 36 MB and page reclaim also finally decides it > maybe has too much of page cache and reaps most of it (1.3 GB in one go). > Then things get going again, although there are still occasional stalls > such as this (about 30s later): > 1 3 36688 753192 1208 792344 0 0 35948 32768 435 6625 0 6 61 33 > 0 2 36668 754996 1344 792760 0 0 252 58736 2832 16239 0 1 60 39 > 0 2 36668 750132 1388 796688 0 0 2508 1524 325 959 1 3 68 28 > 1 0 36668 751160 1400 797968 0 0 620 620 460 1470 6 6 50 38 > 1 0 36668 750516 1400 798520 0 0 300 300 412 1764 17 8 75 1 > 1 0 36668 750648 1408 799108 0 0 280 340 423 1816 18 6 73 3 > 1 0 36668 748856 1408 799752 0 0 336 328 409 1788 18 8 75 0 > 1 0 36668 748120 1416 800604 0 0 428 452 407 1723 14 10 75 2 > 1 0 36668 750048 1416 801176 0 0 296 296 405 1779 18 7 75 1 > 0 1 36668 650428 1420 897252 0 0 48100 556 658 1718 10 3 71 15 > 0 2 36668 505444 1424 1037012 0 0 69888 90272 686 1491 1 4 68 27 > 0 1 36668 479264 1428 1063372 0 0 12984 40896 324 674 1 1 76 23 > ... > I'm not sure what we were blocked on here since there is still plenty of > free memory (750 MB). These stalls repeat once in a while but things go on. > Then at about 350s, things just stop: > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- > r b swpd free buff cache si so bi bo in cs us sy id wa > 3 1 75280 73564 1844 1503848 0 0 43396 81976 627 1061 0 25 42 32 > 3 3 75280 73344 1852 1504256 0 0 256 20 240 149 0 26 25 49 > 3 3 75280 73344 1852 1504268 0 0 0 0 265 140 0 29 13 58 > 3 3 75280 73468 1852 1504232 0 0 0 0 265 132 0 22 34 44 > 3 3 75280 73468 1852 1504232 0 0 0 0 339 283 0 25 26 49 > 3 3 75280 73468 1852 1504232 0 0 0 0 362 327 0 25 25 50 > 3 3 75280 73468 1852 1504232 0 0 0 0 317 320 0 26 25 49 > 3 3 75280 73468 1852 1504232 0 0 0 0 361 343 0 26 25 50 > > and nothing really happens for 150 s, except more and more tasks blocking > in D state (second column). > 3 6 75272 73416 1872 1503872 0 0 0 0 445 700 0 25 25 50 > 0 7 75264 67940 1884 1509008 64 0 5056 16 481 876 0 22 23 55 > Then suddently things get going again: > 0 2 75104 76808 1892 1502552 352 0 14292 40456 459 14865 0 2 39 59 > 0 1 75104 75704 1900 1503412 0 0 820 32 405 788 1 1 72 27 > 1 0 75104 76512 1904 1505576 0 0 1068 1072 454 1586 8 7 74 11 > > I guess this 150 s stall is when kernel barfs the "task blocked for more > than 30 seconds" messages. And from the traces we know that everyone is > waiting for PageWriteback or page lock during this time. Also James's vmstat > report shows that IO is stalled when kswapd is spinning. Really strange. Just to add, this machine has relatively low amount of memory (1GB). I've re-run the tests today with cgroups disabled and it ran for 47 'copy' cycles, 27 'copy' cycles and then 35 'copy' cycles. One extra data point, with maxcpus=1 I get a lockup after 2 'copy' cycles every time, so it's more predictable than the default 4 processor configuration. > > James in the meantime identified that cgroups are somehow involved. Are you > using systemd by any chance? No, I'm using upstart. > Maybe cgroup IO throttling screws us? > > Honza > > > > > [ 287.088371] INFO: task rs:main Q:Reg:749 blocked for more than 30 seconds. > > > > [ 287.088374] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > [ 287.088376] rs:main Q:Reg D 0000000000000000 0 749 1 0x00000000 > > > > [ 287.088381] ffff880072c17b68 0000000000000082 ffff880072c17fd8 ffff880072c16000 > > > > [ 287.088392] 0000000000013d00 ffff88003591b178 ffff880072c17fd8 0000000000013d00 > > > > [ 287.088396] ffffffff81a0b020 ffff88003591adc0 ffff88001fffc3e8 ffff88001fc13d00 > > > > [ 287.088400] Call Trace: > > > > [ 287.088404] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > > [ 287.088408] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > > [ 287.088411] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > > [ 287.088414] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > > [ 287.088418] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > > [ 287.088421] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > > [ 287.088425] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > > [ 287.088428] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > > [ 287.088431] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > > [ 287.088434] [<ffffffff81134a43>] ? unmap_region+0x113/0x170 > > > > [ 287.088437] [<ffffffff812ded90>] ? prio_tree_insert+0x150/0x1c0 > > > > [ 287.088440] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > > [ 287.088442] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > > > > [ 287.088446] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > > > > [ 287.088448] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > > > > [ 287.088451] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > > [ 287.088454] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > > [ 287.088457] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > > > > [ 287.088460] [<ffffffff81137127>] ? sys_mmap_pgoff+0x167/0x230 > > > > [ 287.088463] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > > [ 287.088466] INFO: task NetworkManager:764 blocked for more than 30 seconds. > > > > [ 287.088468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > [ 287.088470] NetworkManager D 0000000000000002 0 764 1 0x00000000 > > > > [ 287.088473] ffff880074ffbb68 0000000000000082 ffff880074ffbfd8 ffff880074ffa000 > > > > [ 287.088477] 0000000000013d00 ffff880036051a98 ffff880074ffbfd8 0000000000013d00 > > > > [ 287.088481] ffff8801005badc0 ffff8800360516e0 ffff88001ffef128 ffff88001fc53d00 > > > > [ 287.088484] Call Trace: > > > > [ 287.088488] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > > [ 287.088491] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > > [ 287.088494] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > > [ 287.088497] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > > [ 287.088500] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > > [ 287.088503] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > > [ 287.088506] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > > [ 287.088509] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > > [ 287.088513] [<ffffffff81177110>] ? pollwake+0x0/0x60 > > > > [ 287.088516] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > > [ 287.088519] [<ffffffff81177110>] ? pollwake+0x0/0x60 > > > > [ 287.088522] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > > [ 287.088525] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > > > > [ 287.088527] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > > > > [ 287.088530] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > > [ 287.088533] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > > [ 287.088537] [<ffffffff81013859>] ? read_tsc+0x9/0x20 > > > > [ 287.088540] [<ffffffff81092eb1>] ? ktime_get_ts+0xb1/0xf0 > > > > [ 287.088543] [<ffffffff811776d2>] ? poll_select_set_timeout+0x82/0x90 > > > > [ 287.088546] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > > [ 287.088559] INFO: task unity-panel-ser:1521 blocked for more than 30 seconds. > > > > [ 287.088561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > [ 287.088562] unity-panel-ser D 0000000000000000 0 1521 1 0x00000000 > > > > [ 287.088566] ffff880061f37b68 0000000000000082 ffff880061f37fd8 ffff880061f36000 > > > > [ 287.088570] 0000000000013d00 ffff880068c7c858 ffff880061f37fd8 0000000000013d00 > > > > [ 287.088573] ffff88003591c4a0 ffff880068c7c4a0 ffff88001fff0c88 ffff88001fc13d00 > > > > [ 287.088577] Call Trace: > > > > [ 287.088581] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > > [ 287.088583] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > > [ 287.088587] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > > [ 287.088589] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > > [ 287.088593] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > > [ 287.088596] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > > [ 287.088599] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > > [ 287.088602] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > > [ 287.088605] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > > [ 287.088608] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > > [ 287.088610] [<ffffffff8111561d>] ? __free_pages+0x2d/0x40 > > > > [ 287.088613] [<ffffffff8112de4f>] ? __pte_alloc+0xdf/0x100 > > > > [ 287.088616] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > > [ 287.088619] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > > [ 287.088622] [<ffffffff81136f85>] ? do_mmap_pgoff+0x335/0x370 > > > > [ 287.088625] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > > [ 287.088629] INFO: task jbd2/sda4-8:1845 blocked for more than 30 seconds. > > > > [ 287.088630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > [ 287.088632] jbd2/sda4-8 D 0000000000000000 0 1845 2 0x00000000 > > > > [ 287.088636] ffff880068f6baf0 0000000000000046 ffff880068f6bfd8 ffff880068f6a000 > > > > [ 287.088639] 0000000000013d00 ffff880061d603b8 ffff880068f6bfd8 0000000000013d00 > > > > [ 287.088643] ffff88003591c4a0 ffff880061d60000 ffff88001fff8548 ffff88001fc13d00 > > > > [ 287.088647] Call Trace: > > > > [ 287.088650] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > > [ 287.088653] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > > [ 287.088656] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > > [ 287.088659] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > > [ 287.088662] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > > [ 287.088665] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > > [ 287.088668] [<ffffffff8110c41d>] filemap_fdatawait_range+0xfd/0x190 > > > > [ 287.088672] [<ffffffff8110c4db>] filemap_fdatawait+0x2b/0x30 > > > > [ 287.088675] [<ffffffff81242a93>] journal_finish_inode_data_buffers+0x63/0x170 > > > > [ 287.088678] [<ffffffff81243284>] jbd2_journal_commit_transaction+0x6e4/0x1190 > > > > [ 287.088682] [<ffffffff81076185>] ? try_to_del_timer_sync+0x85/0xe0 > > > > [ 287.088685] [<ffffffff81247e9b>] kjournald2+0xbb/0x220 > > > > [ 287.088688] [<ffffffff81087f30>] ? autoremove_wake_function+0x0/0x40 > > > > [ 287.088691] [<ffffffff81247de0>] ? kjournald2+0x0/0x220 > > > > [ 287.088694] [<ffffffff810877e6>] kthread+0x96/0xa0 > > > > [ 287.088697] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 > > > > [ 287.088700] [<ffffffff81087750>] ? kthread+0x0/0xa0 > > > > [ 287.088703] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 > > > > [ 287.088705] INFO: task dirname:5969 blocked for more than 30 seconds. > > > > [ 287.088707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > [ 287.088709] dirname D 0000000000000002 0 5969 5214 0x00000000 > > > > [ 287.088712] ffff88005bd9d8b8 0000000000000086 ffff88005bd9dfd8 ffff88005bd9c000 > > > > [ 287.088716] 0000000000013d00 ffff88005d65b178 ffff88005bd9dfd8 0000000000013d00 > > > > [ 287.088720] ffff8801005e5b80 ffff88005d65adc0 ffff88001ffe5228 ffff88001fc53d00 > > > > [ 287.088723] Call Trace: > > > > [ 287.088726] [<ffffffff8110c070>] ? sync_page+0x0/0x50 > > > > [ 287.088729] [<ffffffff815c0990>] io_schedule+0x70/0xc0 > > > > [ 287.088732] [<ffffffff8110c0b0>] sync_page+0x40/0x50 > > > > [ 287.088735] [<ffffffff815c130f>] __wait_on_bit+0x5f/0x90 > > > > [ 287.088738] [<ffffffff8110c278>] wait_on_page_bit+0x78/0x80 > > > > [ 287.088741] [<ffffffff81087f70>] ? wake_bit_function+0x0/0x50 > > > > [ 287.088744] [<ffffffff8110dffd>] __lock_page_or_retry+0x3d/0x70 > > > > [ 287.088747] [<ffffffff8110e3c7>] filemap_fault+0x397/0x4a0 > > > > [ 287.088750] [<ffffffff8112d144>] __do_fault+0x54/0x520 > > > > [ 287.088753] [<ffffffff811309da>] handle_pte_fault+0xfa/0x210 > > > > [ 287.088756] [<ffffffff810442a7>] ? pte_alloc_one+0x37/0x50 > > > > [ 287.088759] [<ffffffff815c2cce>] ? _raw_spin_lock+0xe/0x20 > > > > [ 287.088761] [<ffffffff8112de25>] ? __pte_alloc+0xb5/0x100 > > > > [ 287.088764] [<ffffffff81131d5d>] handle_mm_fault+0x16d/0x250 > > > > [ 287.088767] [<ffffffff815c6a47>] do_page_fault+0x1a7/0x540 > > > > [ 287.088770] [<ffffffff81136947>] ? mmap_region+0x1f7/0x500 > > > > [ 287.088773] [<ffffffff8112db06>] ? free_pgd_range+0x356/0x4a0 > > > > [ 287.088776] [<ffffffff815c34d5>] page_fault+0x25/0x30 > > > > [ 287.088779] [<ffffffff812e6d5f>] ? __clear_user+0x3f/0x70 > > > > [ 287.088782] [<ffffffff812e6d41>] ? __clear_user+0x21/0x70 > > > > [ 287.088786] [<ffffffff812e6dc6>] clear_user+0x36/0x40 > > > > [ 287.088788] [<ffffffff811b0b6d>] padzero+0x2d/0x40 > > > > [ 287.088791] [<ffffffff811b2c7a>] load_elf_binary+0x95a/0xe00 > > > > [ 287.088794] [<ffffffff8116aa8a>] search_binary_handler+0xda/0x300 > > > > [ 287.088797] [<ffffffff811b2320>] ? load_elf_binary+0x0/0xe00 > > > > [ 287.088800] [<ffffffff8116c49c>] do_execve+0x24c/0x2d0 > > > > [ 287.088802] [<ffffffff8101521a>] sys_execve+0x4a/0x80 > > > > [ 287.088805] [<ffffffff8100c45c>] stub_execve+0x6c/0xc0 > > > > -- > > > > Jan Kara <jack@suse.cz> > > > > SUSE Labs, CR > > > > -- > > > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > > > > the body of a message to majordomo@vger.kernel.org > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/mm/vmscan.c b/mm/vmscan.c index 6771ea7..cb08b41 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1934,12 +1934,14 @@ restart: if (inactive_anon_is_low(zone, sc)) shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0); + throttle_vm_writeout(sc->gfp_mask); + /* reclaim/compaction might need reclaim to continue */ if (should_continue_reclaim(zone, nr_reclaimed, - sc->nr_scanned - nr_scanned, sc)) + sc->nr_scanned - nr_scanned, sc)) { + cond_resched(); goto restart; - - throttle_vm_writeout(sc->gfp_mask); + } } /*