Message ID | 20190724165803.87470-1-brianvv@google.com |
---|---|
Series | bpf: add BPF_MAP_DUMP command to dump more than one entry per call |
On Wed, Jul 24, 2019 at 10:09 AM Brian Vazquez <brianvv@google.com> wrote:
>
> This introduces a new command to retrieve multiple number of entries
> from a bpf map.
>
> This new command can be executed from the existing BPF syscall as
> follows:
>
> err = bpf(BPF_MAP_DUMP, union bpf_attr *attr, u32 size)
> using attr->dump.map_fd, attr->dump.prev_key, attr->dump.buf,
> attr->dump.buf_len
> returns zero or negative error, and populates buf and buf_len on
> success
>
> This implementation is wrapping the existing bpf methods:
> map_get_next_key and map_lookup_elem
>
> Note that this implementation can be extended later to do dump and
> delete by extending map_lookup_and_delete_elem (currently it only works
> for bpf queue/stack maps) and either use a new flag in map_dump or a new
> command map_dump_and_delete.
>
> Results show that even with a 1-elem_size buffer, it runs ~40% faster

Why is the new command 40% faster with 1-elem_size buffer?

> than the current implementation, improvements of ~85% are reported when
> the buffer size is increased, although, after the buffer size is around
> 5% of the total number of entries there's no huge difference in
> increasing it.
>
> Tested:
> Tried different size buffers to handle case where the bulk is bigger, or
> the elements to retrieve are less than the existing ones, all runs read
> a map of 100K entries. Below are the results (in ns) from the different
> runs:
>
> buf_len_1:       69038725 entry-by-entry: 112384424 improvement 38.569134
> buf_len_2:       40897447 entry-by-entry: 111030546 improvement 63.165590
> buf_len_230:     13652714 entry-by-entry: 111694058 improvement 87.776687
> buf_len_5000:    13576271 entry-by-entry: 111101169 improvement 87.780263
> buf_len_73000:   14694343 entry-by-entry: 111740162 improvement 86.849542
> buf_len_100000:  13745969 entry-by-entry: 114151991 improvement 87.958187
> buf_len_234567:  14329834 entry-by-entry: 114427589 improvement 87.476941

It took me a while to figure out the meaning of 87.476941. It is probably
a good idea to say 87.5% instead.

Thanks,
Song
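
The improvement column looks like the relative reduction in run time, expressed as a percentage. The patch does not state the formula, so the sketch below assumes it is (1 - dump_ns / entry_by_entry_ns) * 100, which reproduces the reported figures. The helper name improvement_pct is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: percentage reduction in
 * run time of the bulk dump relative to the entry-by-entry walk.
 * The formula is an assumption inferred from the reported numbers.
 */
static double improvement_pct(uint64_t dump_ns, uint64_t entry_by_entry_ns)
{
	return (1.0 - (double)dump_ns / (double)entry_by_entry_ns) * 100.0;
}
```

For the buf_len_234567 run, improvement_pct(14329834, 114427589) comes out at roughly 87.5, which is the rounded form suggested in the review.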
On Wed, Jul 24, 2019 at 12:20 PM Song Liu <liu.song.a23@gmail.com> wrote:
>
> On Wed, Jul 24, 2019 at 10:09 AM Brian Vazquez <brianvv@google.com> wrote:
> >
> > This introduces a new command to retrieve multiple number of entries
> > from a bpf map.
> >
> > This new command can be executed from the existing BPF syscall as
> > follows:
> >
> > err = bpf(BPF_MAP_DUMP, union bpf_attr *attr, u32 size)
> > using attr->dump.map_fd, attr->dump.prev_key, attr->dump.buf,
> > attr->dump.buf_len
> > returns zero or negative error, and populates buf and buf_len on
> > success
> >
> > This implementation is wrapping the existing bpf methods:
> > map_get_next_key and map_lookup_elem
> >
> > Note that this implementation can be extended later to do dump and
> > delete by extending map_lookup_and_delete_elem (currently it only works
> > for bpf queue/stack maps) and either use a new flag in map_dump or a new
> > command map_dump_and_delete.
> >
> > Results show that even with a 1-elem_size buffer, it runs ~40% faster
>
> Why is the new command 40% faster with 1-elem_size buffer?

The test is using a really simple map structure: uint64_t key,val, which
makes the lookup_elem logic faster since it doesn't spend too much time
copying the value. My conclusion with the 40% was that this new
implementation only needs 1 syscall per elem, compared to the 2 syscalls
that we needed with the previous implementation, so in this particular
case the number of ops that we are doing is almost halved. I did one
experiment increasing the value_size (448*64B) and it was only 14% faster
with a 1-elem_size buffer.

> > than the current implementation, improvements of ~85% are reported when
> > the buffer size is increased, although, after the buffer size is around
> > 5% of the total number of entries there's no huge difference in
> > increasing it.
> >
> > Tested:
> > Tried different size buffers to handle case where the bulk is bigger, or
> > the elements to retrieve are less than the existing ones, all runs read
> > a map of 100K entries. Below are the results (in ns) from the different
> > runs:
> >
> > buf_len_1:       69038725 entry-by-entry: 112384424 improvement 38.569134
> > buf_len_2:       40897447 entry-by-entry: 111030546 improvement 63.165590
> > buf_len_230:     13652714 entry-by-entry: 111694058 improvement 87.776687
> > buf_len_5000:    13576271 entry-by-entry: 111101169 improvement 87.780263
> > buf_len_73000:   14694343 entry-by-entry: 111740162 improvement 86.849542
> > buf_len_100000:  13745969 entry-by-entry: 114151991 improvement 87.958187
> > buf_len_234567:  14329834 entry-by-entry: 114427589 improvement 87.476941
>
> It took me a while to figure out the meaning of 87.476941. It is probably
> a good idea to say 87.5% instead.

Right, will change it in the next version.
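
The syscall-count argument in the reply can be put into a small model. This is only a sketch under stated assumptions: the entry-by-entry walk issues one get_next_key and one lookup_elem call per element, each BPF_MAP_DUMP call returns up to buf_len entries, and any extra call that only detects the end of the map is ignored. Both function names are hypothetical, and the model counts calls only, not per-call copy cost.

```c
#include <stdint.h>

/*
 * Hypothetical model, not kernel code.
 * Entry-by-entry walk: one get_next_key plus one lookup_elem per element.
 */
static uint64_t syscalls_entry_by_entry(uint64_t n_entries)
{
	return 2 * n_entries;
}

/*
 * Proposed dump: each call fills up to buf_len entries, so the number of
 * calls is the ceiling of n_entries / buf_len. buf_len must be nonzero.
 */
static uint64_t syscalls_map_dump(uint64_t n_entries, uint64_t buf_len)
{
	return (n_entries + buf_len - 1) / buf_len;
}
```

For the 100K-entry map used in the test, the model gives 200000 calls for the entry-by-entry walk against 100000 calls with a buffer of one element, which is the "almost halved" case described above.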