From patchwork Thu Mar 23 16:17:37 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 1760418
From: Adhemerval Zanella Netto
To: libc-alpha@sourceware.org, Wilco Dijkstra, DJ Delorie
Subject: [PATCH v3] malloc: Use C11 atomics on memusage
Date: Thu, 23 Mar 2023 13:17:37 -0300
Message-Id: <20230323161737.2592579-1-adhemerval.zanella@linaro.org>

Checked on x86_64-linux-gnu.

Changes from previous version:
- Use stdatomic instead of glibc internal atomics and use relaxed MO instead of acquire.

Reviewed-by: DJ Delorie
---
 malloc/memusage.c | 193 ++++++++++++++++++++++++++--------------------
 1 file changed, 111 insertions(+), 82 deletions(-)

diff --git a/malloc/memusage.c b/malloc/memusage.c
index 6d71047154..2a3a508557 100644
--- a/malloc/memusage.c
+++ b/malloc/memusage.c
@@ -17,21 +17,16 @@ .
*/ #include -#include #include -#include #include -#include -#include -#include +#include #include #include #include -#include -#include -#include +#include #include #include +#include #include #include @@ -73,20 +68,20 @@ struct header #define MAGIC 0xfeedbeaf -static unsigned long int calls[idx_last]; -static unsigned long int failed[idx_last]; -static size_t total[idx_last]; -static size_t grand_total; -static unsigned long int histogram[65536 / 16]; -static unsigned long int large; -static unsigned long int calls_total; -static unsigned long int inplace; -static unsigned long int decreasing; -static unsigned long int realloc_free; -static unsigned long int inplace_mremap; -static unsigned long int decreasing_mremap; -static size_t current_heap; -static size_t peak_use[3]; +static _Atomic unsigned long int calls[idx_last]; +static _Atomic unsigned long int failed[idx_last]; +static _Atomic size_t total[idx_last]; +static _Atomic size_t grand_total; +static _Atomic unsigned long int histogram[65536 / 16]; +static _Atomic unsigned long int large; +static _Atomic unsigned long int calls_total; +static _Atomic unsigned long int inplace; +static _Atomic unsigned long int decreasing; +static _Atomic unsigned long int realloc_free; +static _Atomic unsigned long int inplace_mremap; +static _Atomic unsigned long int decreasing_mremap; +static _Atomic size_t current_heap; +static _Atomic size_t peak_use[3]; static __thread uintptr_t start_sp; /* A few macros to make the source more readable. */ @@ -113,7 +108,7 @@ struct entry }; static struct entry buffer[2 * DEFAULT_BUFFER_SIZE]; -static uint32_t buffer_cnt; +static _Atomic uint32_t buffer_cnt; static struct entry first; static void @@ -134,6 +129,19 @@ gettime (struct entry *e) #endif } +static inline void +peak_atomic_max (_Atomic size_t *peak, size_t val) +{ + size_t v; + do + { + v = atomic_load_explicit (peak, memory_order_relaxed); + if (v >= val) + break; + } + while (! 
atomic_compare_exchange_weak (peak, &v, val)); +} + /* Update the global data after a successful function call. */ static void update_data (struct header *result, size_t len, size_t old_len) @@ -148,8 +156,9 @@ update_data (struct header *result, size_t len, size_t old_len) /* Compute current heap usage and compare it with the maximum value. */ size_t heap - = catomic_exchange_and_add (&current_heap, len - old_len) + len - old_len; - catomic_max (&peak_heap, heap); + = atomic_fetch_add_explicit (&current_heap, len - old_len, + memory_order_relaxed) + len - old_len; + peak_atomic_max (&peak_heap, heap); /* Compute current stack usage and compare it with the maximum value. The base stack pointer might not be set if this is not @@ -172,15 +181,16 @@ update_data (struct header *result, size_t len, size_t old_len) start_sp = sp; size_t current_stack = start_sp - sp; #endif - catomic_max (&peak_stack, current_stack); + peak_atomic_max (&peak_stack, current_stack); /* Add up heap and stack usage and compare it with the maximum value. */ - catomic_max (&peak_total, heap + current_stack); + peak_atomic_max (&peak_total, heap + current_stack); /* Store the value only if we are writing to a file. */ if (fd != -1) { - uint32_t idx = catomic_exchange_and_add (&buffer_cnt, 1); + uint32_t idx = atomic_fetch_add_explicit (&buffer_cnt, 1, + memory_order_relaxed); if (idx + 1 >= 2 * buffer_size) { /* We try to reset the counter to the correct range. If @@ -188,7 +198,8 @@ update_data (struct header *result, size_t len, size_t old_len) counter it does not matter since that thread will take care of the correction. */ uint32_t reset = (idx + 1) % (2 * buffer_size); - catomic_compare_and_exchange_val_acq (&buffer_cnt, reset, idx + 1); + uint32_t expected = idx + 1; + atomic_compare_exchange_weak (&buffer_cnt, &expected, reset); if (idx >= 2 * buffer_size) idx = reset - 1; } @@ -362,24 +373,25 @@ malloc (size_t len) return (*mallocp)(len); /* Keep track of number of calls.
*/ - catomic_increment (&calls[idx_malloc]); + atomic_fetch_add_explicit (&calls[idx_malloc], 1, memory_order_relaxed); /* Keep track of total memory consumption for `malloc'. */ - catomic_add (&total[idx_malloc], len); + atomic_fetch_add_explicit (&total[idx_malloc], len, memory_order_relaxed); /* Keep track of total memory requirement. */ - catomic_add (&grand_total, len); + atomic_fetch_add_explicit (&grand_total, len, memory_order_relaxed); /* Remember the size of the request. */ if (len < 65536) - catomic_increment (&histogram[len / 16]); + atomic_fetch_add_explicit (&histogram[len / 16], 1, memory_order_relaxed); else - catomic_increment (&large); + atomic_fetch_add_explicit (&large, 1, memory_order_relaxed); /* Total number of calls of any of the functions. */ - catomic_increment (&calls_total); + atomic_fetch_add_explicit (&calls_total, 1, memory_order_relaxed); /* Do the real work. */ result = (struct header *) (*mallocp)(len + sizeof (struct header)); if (result == NULL) { - catomic_increment (&failed[idx_malloc]); + atomic_fetch_add_explicit (&failed[idx_malloc], 1, + memory_order_relaxed); return NULL; } @@ -430,21 +442,24 @@ realloc (void *old, size_t len) } /* Keep track of number of calls. */ - catomic_increment (&calls[idx_realloc]); + atomic_fetch_add_explicit (&calls[idx_realloc], 1, memory_order_relaxed); if (len > old_len) { /* Keep track of total memory consumption for `realloc'. */ - catomic_add (&total[idx_realloc], len - old_len); + atomic_fetch_add_explicit (&total[idx_realloc], len - old_len, + memory_order_relaxed); /* Keep track of total memory requirement. */ - catomic_add (&grand_total, len - old_len); + atomic_fetch_add_explicit (&grand_total, len - old_len, + memory_order_relaxed); } if (len == 0 && old != NULL) { /* Special case. */ - catomic_increment (&realloc_free); + atomic_fetch_add_explicit (&realloc_free, 1, memory_order_relaxed); /* Keep track of total memory freed using `free'. 
*/ - catomic_add (&total[idx_free], real->length); + atomic_fetch_add_explicit (&total[idx_free], real->length, + memory_order_relaxed); /* Update the allocation data and write out the records if necessary. */ update_data (NULL, 0, old_len); @@ -457,26 +472,27 @@ realloc (void *old, size_t len) /* Remember the size of the request. */ if (len < 65536) - catomic_increment (&histogram[len / 16]); + atomic_fetch_add_explicit (&histogram[len / 16], 1, memory_order_relaxed); else - catomic_increment (&large); + atomic_fetch_add_explicit (&large, 1, memory_order_relaxed); /* Total number of calls of any of the functions. */ - catomic_increment (&calls_total); + atomic_fetch_add_explicit (&calls_total, 1, memory_order_relaxed); /* Do the real work. */ result = (struct header *) (*reallocp)(real, len + sizeof (struct header)); if (result == NULL) { - catomic_increment (&failed[idx_realloc]); + atomic_fetch_add_explicit (&failed[idx_realloc], 1, + memory_order_relaxed); return NULL; } /* Record whether the reduction/increase happened in place. */ if (real == result) - catomic_increment (&inplace); + atomic_fetch_add_explicit (&inplace, 1, memory_order_relaxed); /* Was the buffer increased? */ if (old_len > len) - catomic_increment (&decreasing); + atomic_fetch_add_explicit (&decreasing, 1, memory_order_relaxed); /* Update the allocation data and write out the records if necessary. */ update_data (result, len, old_len); @@ -508,16 +524,17 @@ calloc (size_t n, size_t len) return (*callocp)(n, len); /* Keep track of number of calls. */ - catomic_increment (&calls[idx_calloc]); + atomic_fetch_add_explicit (&calls[idx_calloc], 1, memory_order_relaxed); /* Keep track of total memory consumption for `calloc'. */ - catomic_add (&total[idx_calloc], size); + atomic_fetch_add_explicit (&total[idx_calloc], size, memory_order_relaxed); /* Keep track of total memory requirement. 
*/ - catomic_add (&grand_total, size); + atomic_fetch_add_explicit (&grand_total, size, memory_order_relaxed); /* Remember the size of the request. */ if (size < 65536) - catomic_increment (&histogram[size / 16]); + atomic_fetch_add_explicit (&histogram[size / 16], 1, + memory_order_relaxed); else - catomic_increment (&large); + atomic_fetch_add_explicit (&large, 1, memory_order_relaxed); /* Total number of calls of any of the functions. */ ++calls_total; @@ -525,7 +542,8 @@ calloc (size_t n, size_t len) result = (struct header *) (*mallocp)(size + sizeof (struct header)); if (result == NULL) { - catomic_increment (&failed[idx_calloc]); + atomic_fetch_add_explicit (&failed[idx_calloc], 1, + memory_order_relaxed); return NULL; } @@ -563,7 +581,7 @@ free (void *ptr) /* `free (NULL)' has no effect. */ if (ptr == NULL) { - catomic_increment (&calls[idx_free]); + atomic_fetch_add_explicit (&calls[idx_free], 1, memory_order_relaxed); return; } @@ -577,9 +595,10 @@ free (void *ptr) } /* Keep track of number of calls. */ - catomic_increment (&calls[idx_free]); + atomic_fetch_add_explicit (&calls[idx_free], 1, memory_order_relaxed); /* Keep track of total memory freed using `free'. */ - catomic_add (&total[idx_free], real->length); + atomic_fetch_add_explicit (&total[idx_free], real->length, + memory_order_relaxed); /* Update the allocation data and write out the records if necessary. */ update_data (NULL, 0, real->length); @@ -614,22 +633,23 @@ mmap (void *start, size_t len, int prot, int flags, int fd, off_t offset) ? idx_mmap_a : prot & PROT_WRITE ? idx_mmap_w : idx_mmap_r); /* Keep track of number of calls. */ - catomic_increment (&calls[idx]); + atomic_fetch_add_explicit (&calls[idx], 1, memory_order_relaxed); /* Keep track of total memory consumption for `malloc'. */ - catomic_add (&total[idx], len); + atomic_fetch_add_explicit (&total[idx], len, memory_order_relaxed); /* Keep track of total memory requirement. 
*/ - catomic_add (&grand_total, len); + atomic_fetch_add_explicit (&grand_total, len, memory_order_relaxed); /* Remember the size of the request. */ if (len < 65536) - catomic_increment (&histogram[len / 16]); + atomic_fetch_add_explicit (&histogram[len / 16], 1, + memory_order_relaxed); else - catomic_increment (&large); + atomic_fetch_add_explicit (&large, 1, memory_order_relaxed); /* Total number of calls of any of the functions. */ - catomic_increment (&calls_total); + atomic_fetch_add_explicit (&calls_total, 1, memory_order_relaxed); /* Check for failures. */ if (result == NULL) - catomic_increment (&failed[idx]); + atomic_fetch_add_explicit (&failed[idx], 1, memory_order_relaxed); else if (idx == idx_mmap_w) /* Update the allocation data and write out the records if necessary. Note the first parameter is NULL which means @@ -667,22 +687,23 @@ mmap64 (void *start, size_t len, int prot, int flags, int fd, off64_t offset) ? idx_mmap_a : prot & PROT_WRITE ? idx_mmap_w : idx_mmap_r); /* Keep track of number of calls. */ - catomic_increment (&calls[idx]); + atomic_fetch_add_explicit (&calls[idx], 1, memory_order_relaxed); /* Keep track of total memory consumption for `malloc'. */ - catomic_add (&total[idx], len); + atomic_fetch_add_explicit (&total[idx], len, memory_order_relaxed); /* Keep track of total memory requirement. */ - catomic_add (&grand_total, len); + atomic_fetch_add_explicit (&grand_total, len, memory_order_relaxed); /* Remember the size of the request. */ if (len < 65536) - catomic_increment (&histogram[len / 16]); + atomic_fetch_add_explicit (&histogram[len / 16], 1, + memory_order_relaxed); else - catomic_increment (&large); + atomic_fetch_add_explicit (&large, 1, memory_order_relaxed); /* Total number of calls of any of the functions. */ - catomic_increment (&calls_total); + atomic_fetch_add_explicit (&calls_total, 1, memory_order_relaxed); /* Check for failures. 
*/ if (result == NULL) - catomic_increment (&failed[idx]); + atomic_fetch_add_explicit (&failed[idx], 1, memory_order_relaxed); else if (idx == idx_mmap_w) /* Update the allocation data and write out the records if necessary. Note the first parameter is NULL which means @@ -722,33 +743,39 @@ mremap (void *start, size_t old_len, size_t len, int flags, ...) if (!not_me && trace_mmap) { /* Keep track of number of calls. */ - catomic_increment (&calls[idx_mremap]); + atomic_fetch_add_explicit (&calls[idx_mremap], 1, memory_order_relaxed); if (len > old_len) { /* Keep track of total memory consumption for `malloc'. */ - catomic_add (&total[idx_mremap], len - old_len); + atomic_fetch_add_explicit (&total[idx_mremap], len - old_len, + memory_order_relaxed); /* Keep track of total memory requirement. */ - catomic_add (&grand_total, len - old_len); + atomic_fetch_add_explicit (&grand_total, len - old_len, + memory_order_relaxed); } /* Remember the size of the request. */ if (len < 65536) - catomic_increment (&histogram[len / 16]); + atomic_fetch_add_explicit (&histogram[len / 16], 1, + memory_order_relaxed); else - catomic_increment (&large); + atomic_fetch_add_explicit (&large, 1, memory_order_relaxed); /* Total number of calls of any of the functions. */ - catomic_increment (&calls_total); + atomic_fetch_add_explicit (&calls_total, 1, memory_order_relaxed); /* Check for failures. */ if (result == NULL) - catomic_increment (&failed[idx_mremap]); + atomic_fetch_add_explicit (&failed[idx_mremap], 1, + memory_order_relaxed); else { /* Record whether the reduction/increase happened in place. */ if (start == result) - catomic_increment (&inplace_mremap); + atomic_fetch_add_explicit (&inplace_mremap, 1, + memory_order_relaxed); /* Was the buffer increased? */ if (old_len > len) - catomic_increment (&decreasing_mremap); + atomic_fetch_add_explicit (&decreasing_mremap, 1, + memory_order_relaxed); /* Update the allocation data and write out the records if necessary. 
Note the first parameter is NULL which means @@ -783,19 +810,21 @@ munmap (void *start, size_t len) if (!not_me && trace_mmap) { /* Keep track of number of calls. */ - catomic_increment (&calls[idx_munmap]); + atomic_fetch_add_explicit (&calls[idx_munmap], 1, memory_order_relaxed); if (__glibc_likely (result == 0)) { /* Keep track of total memory freed using `free'. */ - catomic_add (&total[idx_munmap], len); + atomic_fetch_add_explicit (&total[idx_munmap], len, + memory_order_relaxed); /* Update the allocation data and write out the records if necessary. */ update_data (NULL, 0, len); } else - catomic_increment (&failed[idx_munmap]); + atomic_fetch_add_explicit (&failed[idx_munmap], 1, + memory_order_relaxed); } return result;
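
A note on the compare-and-swap calls in this patch: in C11, atomic_compare_exchange_weak without the _explicit suffix uses memory_order_seq_cst, so the CAS in peak_atomic_max and the buffer_cnt reset are stronger than the relaxed ordering the changelog mentions. That should be harmless for correctness. If relaxed ordering is wanted throughout, a standalone sketch could look like the one below. The names peak_max_relaxed and peak_demo are hypothetical and not part of the patch; this is only an illustration, not a proposed replacement.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper (not in the patch): raise *peak to val when val is
   larger, using relaxed ordering for the load and for both CAS outcomes.  */
static inline void
peak_max_relaxed (_Atomic size_t *peak, size_t val)
{
  size_t cur = atomic_load_explicit (peak, memory_order_relaxed);
  /* A failed CAS stores the value it observed into cur, so the loop
     re-checks cur < val without an explicit reload.  */
  while (cur < val
         && !atomic_compare_exchange_weak_explicit (peak, &cur, val,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed))
    ;
}

/* Hypothetical single-threaded walk-through of the helper.  */
static inline size_t
peak_demo (void)
{
  _Atomic size_t peak = 0;
  peak_max_relaxed (&peak, 16);   /* peak becomes 16 */
  peak_max_relaxed (&peak, 8);    /* smaller value, peak unchanged */
  peak_max_relaxed (&peak, 32);   /* peak becomes 32 */
  size_t result = atomic_load_explicit (&peak, memory_order_relaxed);
  assert (result == 32);
  return result;
}
```

The weak CAS may fail spuriously, which is why it sits in a loop; the loop exits as soon as another thread has already published a value at least as large as val.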