Message ID: 1447767527-27316-3-git-send-email-nhofmeyr@sysmocom.de
State:      Changes Requested
> On 17 Nov 2015, at 14:38, Neels Hofmeyr <nhofmeyr@sysmocom.de> wrote:
>
> +        /* The TEI numbers will simply wrap and be reused, which will work out
> +         * in practice. Problems would arise if one given peer maintained the
> +         * same TEI for a time long enough for the TEI nr map to wrap an entire
> +         * uint32_t; if a new TEI were mapped every second, this would take
> +         * more than 100 years (in which a single given TEI must not time out)
> +         * to cause a problem. */

Unless performance matters, I am not in favour of probabilistically correct
code, especially not in the beginning of a new piece of software. "Modern"
smartphones will open a PDP context very early, and it lives for a
potentially long time (until you reboot your phone or such).

So I would prefer to check whether a TEI is used before using it (and at
least the number should be random). So even if the assignment is not random
right now, I would still prefer to check whether it is assigned or not.

E.g. a unit test that exhausts the entire number space should be feasible.

holger
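To make the "check before use" suggestion concrete, here is a minimal
standalone sketch. The struct tei_entry list and the helper names are
invented for the illustration; they are not gtphub's actual data structures.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Invented example structure: a singly-linked list of TEIs in use. */
struct tei_entry {
        uint32_t tei;
        struct tei_entry *next;
};

static bool tei_in_use(const struct tei_entry *list, uint32_t tei)
{
        for (; list; list = list->next)
                if (list->tei == tei)
                        return true;
        return false;
}

/* Pick the next TEI, skipping any that are still mapped. Exhaustion
 * handling (all TEIs in use) is deliberately left out of the sketch. */
static uint32_t tei_pick_unused(const struct tei_entry *list, uint32_t last)
{
        uint32_t tei = last;
        do {
                tei = (tei == UINT32_MAX) ? 1 : tei + 1; /* 0 stays reserved */
        } while (tei_in_use(list, tei));
        return tei;
}

int main(void)
{
        struct tei_entry b = { .tei = 2, .next = NULL };
        struct tei_entry a = { .tei = 1, .next = &b };
        printf("next unused TEI: %u\n", (unsigned)tei_pick_unused(&a, 1)); /* 3 */
        return 0;
}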
On Wed, Nov 25, 2015 at 09:05:40PM +0100, Holger Freyther wrote:
> > +        /* The TEI numbers will simply wrap and be reused, which will work out
> > +         * in practice. Problems would arise if one given peer maintained the
> > +         * same TEI for a time long enough for the TEI nr map to wrap an entire
> > +         * uint32_t; if a new TEI were mapped every second, this would take
> > +         * more than 100 years (in which a single given TEI must not time out)
> > +         * to cause a problem. */
>
> Unless performance matters, I am not in favour of probabilistically
> correct code, especially not in the beginning of a new piece of software.
> "Modern" smartphones will open a PDP context very early, and it lives for
> a potentially long time (until you reboot your phone or such).

The TEI space will exhaust even more slowly, then :) The more *new* PDP
contexts are created, the faster the pool will wrap.

> So I would prefer to check whether a TEI is used before using it (and at
> least the number should be random). So even if the assignment is not
> random right now, I would still prefer to check whether it is assigned
> or not.

So I should iterate all tunnels to ensure a new TEI is unused. I'm anyway
iterating all tunnels to detect a peer reusing a TEI it had sent before;
also checking against the mapped TEI is easy to add.

> E.g. a unit test that exhausts the entire number space should be feasible.

Really? 4,294,967,295 TEI mappings?? That's 4*sizeof(mapping) gigabytes,
might need special memory model options to work, the testing machine needs
that many gigs of RAM, and it will take quite a while: O(n²)... Probably
easier to manually assign a given TEI and make the pool return that TEI
next. Is your suggestion to shrink the TEI space?

And next, what about sequence numbers? That space is much smaller (65,536
values) and expires after 30 seconds. To me, it is much more realistic that
we receive 66,000 messages with distinct sequence numbers within the space
of 30 seconds than that a given TEI remains in use for more than 100 years
in a place where a new subscriber shows up every single second. So
verifying that a *sequence number* is unused seems to be far more critical
than the TEI check.

Which touches the optimization topic. Currently gtphub will have to iterate
everything all the time. To optimize, one way would be to list tunnels at
the peers they associate with, but if gtphub is used as a proxy for a
single peer, that has exactly zero effect. So some kind of hash map might
be more useful. But gtphub wants several different views: sometimes it
wants to find an original TEI, sometimes a mapped TEI, sometimes from the
SGSN and sometimes from the GGSN side, and sometimes it wants to find all
tunnels for a given peer. So to optimize, we'd want a couple of separate
hash maps hashing one and the same list from different angles. I'd be glad
for any ideas, or hints at existing Osmocom API. But, unless told
otherwise, I simply won't bother with optimization for now...

~Neels
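As a rough illustration of the "several views onto one list" idea, a sketch
with two small hash tables over the same tunnel records. Table sizes,
probing strategy and all names are invented for the example; deletion and
full-table handling are left out.

#include <stdint.h>
#include <stdio.h>

#define NBUCKET 64

struct tunnel {
        uint32_t tei_orig;
        uint32_t tei_repl;
};

/* Two indexes pointing at the same tunnel objects. */
static struct tunnel *by_orig[NBUCKET];
static struct tunnel *by_repl[NBUCKET];

static void idx_add(struct tunnel **idx, uint32_t key, struct tunnel *t)
{
        unsigned int i = key % NBUCKET;
        while (idx[i])                          /* linear probing */
                i = (i + 1) % NBUCKET;
        idx[i] = t;
}

static struct tunnel *idx_find(struct tunnel **idx, uint32_t key,
                               int by_repl_key)
{
        unsigned int i = key % NBUCKET;
        while (idx[i]) {
                uint32_t k = by_repl_key ? idx[i]->tei_repl : idx[i]->tei_orig;
                if (k == key)
                        return idx[i];
                i = (i + 1) % NBUCKET;
        }
        return NULL;
}

int main(void)
{
        static struct tunnel t = { .tei_orig = 0x100, .tei_repl = 0x200 };
        idx_add(by_orig, t.tei_orig, &t);
        idx_add(by_repl, t.tei_repl, &t);
        printf("by orig: %p\n", (void *)idx_find(by_orig, 0x100, 0));
        printf("by repl: %p\n", (void *)idx_find(by_repl, 0x200, 1));
        return 0;
}

The point is only that one llist of tunnels can remain the single owner of
the data while each lookup angle gets its own cheap index.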
On Thu, Nov 26, 2015 at 05:51:52PM +0100, Neels Hofmeyr wrote:
> Which touches the optimization topic. Currently gtphub will have to
[...]
> of separate hash maps hashing one and the same list from different angles.

I'm thinking: if we want persistence as well as several optimized querying
angles, sqlite might do both at the same time.

~Neels
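A sketch of what that could look like with the sqlite3 C API: one table for
tunnel state plus one index per lookup angle gives persistence and several
query paths at once. The schema, file name and column names are invented
for the illustration.

#include <sqlite3.h>
#include <stdio.h>

int main(void)
{
        sqlite3 *db;
        char *err = NULL;

        if (sqlite3_open("gtphub_state.db", &db) != SQLITE_OK)
                return 1;

        const char *ddl =
                "CREATE TABLE IF NOT EXISTS tunnel ("
                "  tei_orig INTEGER, tei_repl INTEGER,"
                "  peer TEXT, expiry INTEGER);"
                "CREATE INDEX IF NOT EXISTS tunnel_by_orig ON tunnel(tei_orig);"
                "CREATE INDEX IF NOT EXISTS tunnel_by_repl ON tunnel(tei_repl);"
                "CREATE INDEX IF NOT EXISTS tunnel_by_peer ON tunnel(peer);";

        if (sqlite3_exec(db, ddl, NULL, NULL, &err) != SQLITE_OK) {
                fprintf(stderr, "sqlite error: %s\n", err);
                sqlite3_free(err);
        }

        sqlite3_close(db);
        return 0;
}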
On 26.11.2015 17:51, Neels Hofmeyr wrote:
> On Wed, Nov 25, 2015 at 09:05:40PM +0100, Holger Freyther wrote:
>> E.g. a unit test that exhausts the entire number space should be feasible.
>
> Really? 4,294,967,295 TEI mappings?? That's 4*sizeof(mapping) gigabytes,
> might need special memory model options to work, the testing machine
> needs that many gigs of RAM, and it will take quite a while: O(n²)...

I am fond of exhausting tests, too. But as far as I understand, it would be
enough to establish one mapping at first and then establish and release
further mappings either immediately (N = 1) or after e.g. N = 2^16 steps.
Thus you would have a persistent initial mapping plus a sliding window of N
mappings, and each new mapping could be checked against at least the first
one. If you still ran into memory problems, you would have found a bug that
way.

> Probably easier to manually assign a given TEI and make the pool return
> that TEI next.

This won't check the TEI generation, will it? If the allocation were
changed in the future (e.g. to be based on random numbers), this kind of
test wouldn't find bugs there.

Jacob
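A sketch of the test shape described here, against a stand-in allocator:
simple_next() is a naive wrapping counter invented for the example, not
nr_pool_next(), and the release/window bookkeeping is left out since the
stand-in keeps no map.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static uint32_t last = 0;

/* Stand-in allocator: a naive wrapping counter, NOT the gtphub code. */
static uint32_t simple_next(uint32_t min, uint32_t max)
{
        last = (last >= max) ? min : last + 1;
        return last;
}

int main(void)
{
        const uint32_t min = 1, max = 1 << 16;   /* artificially shrunk space */
        uint32_t persistent = simple_next(min, max);
        uint32_t i;

        /* Walk well past one full wrap of the shrunk space, checking every
         * new number against the persistent one. With this naive allocator
         * the assert fires on the first wrap -- exactly the bug such a test
         * is meant to expose; a collision-checking allocator would pass. */
        for (i = 0; i < 3 * max; i++) {
                uint32_t nr = simple_next(min, max);
                assert(nr != persistent);
        }

        printf("no collision with the persistent mapping\n");
        return 0;
}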
On Fri, Nov 27, 2015 at 10:00:08AM +0100, Jacob Erlbeck wrote:
> On 26.11.2015 17:51, Neels Hofmeyr wrote:
> > On Wed, Nov 25, 2015 at 09:05:40PM +0100, Holger Freyther wrote:
> >> E.g. a unit test that exhausts the entire number space should be
> >> feasible.
> >
> > Really? 4,294,967,295 TEI mappings?? That's 4*sizeof(mapping)
> > gigabytes, might need special memory model options to work, the testing
> > machine needs that many gigs of RAM, and it will take quite a while:
> > O(n²)...
>
> I am fond of exhausting tests, too.

I think it would be best to artificially limit the range to 1..16 to test
exhaustion. We wouldn't have to run through 2^32 (!) mappings then.

Let me outline the two number spaces again: there's the TEI space (32 bits
of space), random per PDP context, and the sequence number space (16 bits
of space), incremented per GTP request.

I really doubt that anyone will manage to exhaust the TEIs ("64 kilobytes
of RAM ought to be enough for anyone!"), but I accept the argument, and the
randomness, too.

So for the TEIs, I have added code that runs through all the tunnels and
ensures that a TEI is not mapped twice. If it is, the code tries to pick an
unused TEI outside the seen min..max range of TEIs. It can't yet reliably
find open gaps in the middle, which would require a sorted list. If no TEI
guaranteed to be unused is found that way, gtphub will abort(). (It could
just drop packets instead, but maybe abort() is better until we have a test
for it.) TEIs are so far picked sequentially, but should (TM) be picked
randomly. Again, a sorted list would solve that.

Sequence numbers are different because

- they time out after 30 seconds (I picked 30 seconds off the top of my
  head, I haven't really searched for a timeout in the specs yet), and
- they are specified to be sequential.

Exhausting 65,536 sequence numbers in 30 seconds sounds possible; note that
this is a general limitation of GTP. If a packet is resent again and again
with the same sequence number (a common strategy when no reply is
received), a given sequence mapping can linger for more than 30 seconds;
but the retries should in turn cease within another 30 seconds or so. So on
the one hand sequence nrs are more likely to wrap (the space is only 2^16),
on the other hand they are less likely to stick around. I'm not checking
sequence nr collisions yet.

Sequence numbers are supposed to be sequential, so for those, gtphub could
remember the range of used sequence nrs as head..tail, and advance head as
used ones are discarded. If a discarded number is != head, store it in a
list; once head is released, advance head and empty the list as appropriate
(because of the sequential nature, that list would usually stay pretty
short). If the tail of the sequence numbers catches up with head, the GTP
space of pending responses is exhausted, and gtphub should probably drop
any further requests to that peer until head has advanced (and ensure its
timeout is valid or something).

Any thoughts are welcome!

BTW, the same situations can arise in osmo-sgsn and openggsn. While
obsessing about this in gtphub, we should probably visit the others as
well...? ;)

~Neels
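A sketch of the head..tail bookkeeping described above, with 16-bit
wrap-around handled by plain uint16_t arithmetic. The struct and function
names are invented for the illustration, and the 64 KB released[] bitmap
stands in for the short list mentioned in the mail.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct seq_window {
        uint16_t head;                 /* oldest still-pending seq */
        uint16_t tail;                 /* next seq to hand out */
        bool released[65536];          /* out-of-order releases, parked */
};

/* Returns false when tail would catch up with head: space exhausted. */
static bool seq_alloc(struct seq_window *w, uint16_t *seq)
{
        if ((uint16_t)(w->tail + 1) == w->head)
                return false;
        *seq = w->tail++;
        return true;
}

static void seq_release(struct seq_window *w, uint16_t seq)
{
        w->released[seq] = true;
        /* Slide head forward over everything already released. */
        while (w->head != w->tail && w->released[w->head]) {
                w->released[w->head] = false;
                w->head++;
        }
}

int main(void)
{
        static struct seq_window w;
        uint16_t a, b, c;

        seq_alloc(&w, &a);             /* 0 */
        seq_alloc(&w, &b);             /* 1 */
        seq_alloc(&w, &c);             /* 2 */
        (void)c;                       /* c stays pending in this example */

        seq_release(&w, b);            /* out of order: head stays at 0 */
        printf("head=%u tail=%u\n", (unsigned)w.head, (unsigned)w.tail); /* 0 3 */
        seq_release(&w, a);            /* now head slides over 0 and 1 */
        printf("head=%u tail=%u\n", (unsigned)w.head, (unsigned)w.tail); /* 2 3 */
        return 0;
}

Replacing the bitmap with the short sorted list suggested in the mail would
keep the per-peer memory small while preserving the same head/tail logic.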
diff --git a/openbsc/include/openbsc/gtphub.h b/openbsc/include/openbsc/gtphub.h
index c43a328..77e43e4 100644
--- a/openbsc/include/openbsc/gtphub.h
+++ b/openbsc/include/openbsc/gtphub.h
@@ -256,7 +256,8 @@ typedef unsigned int nr_t;
  * If this becomes random, the tests need to be fixed. */
 struct nr_pool {
         nr_t last_nr;
-        /* TODO add min, max, for safe wrapping */
+        nr_t nr_min;
+        nr_t nr_max;
 };
 
 struct nr_mapping {
@@ -275,7 +276,7 @@ struct nr_map {
 };
 
 
-void nr_pool_init(struct nr_pool *pool);
+void nr_pool_init(struct nr_pool *pool, nr_t nr_min, nr_t nr_max);
 
 /* Return the next unused number from the nr_pool. */
 nr_t nr_pool_next(struct nr_pool *pool);
@@ -398,6 +399,12 @@ struct gtphub {
         /* pointers to an entry of to_ggsns[x].peers */
         struct gtphub_peer_port *ggsn_proxy[GTPH_PLANE_N];
 
+        /* The TEI numbers will simply wrap and be reused, which will work out
+         * in practice. Problems would arise if one given peer maintained the
+         * same TEI for a time long enough for the TEI nr map to wrap an entire
+         * uint32_t; if a new TEI were mapped every second, this would take
+         * more than 100 years (in which a single given TEI must not time out)
+         * to cause a problem. */
         struct nr_map tei_map[GTPH_PLANE_N];
         struct nr_pool tei_pool[GTPH_PLANE_N];
 
diff --git a/openbsc/src/gprs/gtphub.c b/openbsc/src/gprs/gtphub.c
index 7c7f693..155024b 100644
--- a/openbsc/src/gprs/gtphub.c
+++ b/openbsc/src/gprs/gtphub.c
@@ -577,18 +577,21 @@ void expiring_item_del(struct expiring_item *item)
 
 /* nr_map, nr_pool */
 
-void nr_pool_init(struct nr_pool *pool)
+void nr_pool_init(struct nr_pool *pool, nr_t nr_min, nr_t nr_max)
 {
-        *pool = (struct nr_pool){};
+        *pool = (struct nr_pool){
+                .nr_min = nr_min,
+                .nr_max = nr_max,
+                .last_nr = nr_max
+        };
 }
 
 nr_t nr_pool_next(struct nr_pool *pool)
 {
-        pool->last_nr ++;
-
-        OSMO_ASSERT(pool->last_nr > 0);
-        /* TODO: gracefully handle running out of TEIs. */
-        /* TODO: random TEIs. */
+        if (pool->last_nr >= pool->nr_max)
+                pool->last_nr = pool->nr_min;
+        else
+                pool->last_nr ++;
         return pool->last_nr;
 }
 
@@ -1806,7 +1809,7 @@ void gtphub_init(struct gtphub *hub)
 
         int plane_idx;
         for (plane_idx = 0; plane_idx < GTPH_PLANE_N; plane_idx++) {
-                nr_pool_init(&hub->tei_pool[plane_idx]);
+                nr_pool_init(&hub->tei_pool[plane_idx], 1, 0xffffffff);
                 nr_map_init(&hub->tei_map[plane_idx],
                             &hub->tei_pool[plane_idx],
                             &hub->expire_tei_maps);
@@ -1974,7 +1977,7 @@ static struct gtphub_peer *gtphub_peer_new(struct gtphub *hub,
 
         INIT_LLIST_HEAD(&peer->addresses);
 
-        nr_pool_init(&peer->seq_pool);
+        nr_pool_init(&peer->seq_pool, 0, 0xffff);
         nr_map_init(&peer->seq_map, &peer->seq_pool, &hub->expire_seq_maps);
 
         /* TODO use something random to pick the initial sequence nr.
diff --git a/openbsc/tests/gtphub/gtphub_test.c b/openbsc/tests/gtphub/gtphub_test.c
index 4e22ac7..b1ebb40 100644
--- a/openbsc/tests/gtphub/gtphub_test.c
+++ b/openbsc/tests/gtphub/gtphub_test.c
@@ -157,7 +157,7 @@ static void test_nr_map_basic(void)
         struct nr_map _map;
         struct nr_map *map = &_map;
 
-        nr_pool_init(pool);
+        nr_pool_init(pool, 1, 1000);
         nr_map_init(map, pool, NULL);
 
         OSMO_ASSERT(llist_empty(&map->mappings));
@@ -253,6 +253,50 @@ static int nr_map_is(struct nr_map *map, const char *str)
         return 1;
 }
 
+static int test_nr_map_wrap_with(nr_t nr_min, nr_t nr_max, nr_t repl_last,
+                                 nr_t orig_start, int orig_n,
+                                 const char *expect)
+{
+        struct nr_pool _pool;
+        struct nr_pool *pool = &_pool;
+        struct nr_map _map;
+        struct nr_map *map = &_map;
+
+        nr_pool_init(pool, nr_min, nr_max);
+        nr_map_init(map, pool, NULL);
+
+        pool->last_nr = repl_last;
+
+        void *origin = (void*)0x1234;
+
+        int i;
+        for (i = 0; i < orig_n; i++)
+                LVL2_ASSERT(nr_map_have(map, origin, orig_start + i, 0));
+
+        LVL2_ASSERT(nr_map_is(map, expect));
+
+        nr_map_clear(map);
+        return 1;
+}
+
+static void test_nr_map_wrap(void)
+{
+        OSMO_ASSERT(test_nr_map_wrap_with(
+                0, UINT_MAX, UINT_MAX - 2,
+                1, 5,
+                "(1->4294967294@0), "
+                "(2->4294967295@0), "
+                "(3->0@0), "
+                "(4->1@0), "
+                "(5->2@0), "
+                ));
+        OSMO_ASSERT(test_nr_map_wrap_with(
+                5, 10, 8,
+                1, 5,
+                "(1->9@0), (2->10@0), (3->5@0), (4->6@0), (5->7@0), "
+                ));
+}
+
 static void test_expiry(void)
 {
         struct expiry expiry;
@@ -261,7 +305,7 @@ static void test_expiry(void)
         int i;
 
         expiry_init(&expiry, 30);
-        nr_pool_init(&pool);
+        nr_pool_init(&pool, 1, 1000);
         nr_map_init(&map, &pool, &expiry);
 
         OSMO_ASSERT(nr_map_is(&map, ""));
@@ -947,6 +991,7 @@ int main(int argc, char **argv)
         osmo_gtphub_ctx = talloc_named_const(NULL, 0, "osmo_gtphub");
 
         test_nr_map_basic();
+        test_nr_map_wrap();
         test_expiry();
         test_echo();
         test_create_pdp_ctx();