Ticket #740 (closed Problem: fixed)

Opened 5 months ago

Last modified 5 months ago

Bug resulting in too many internal type references

Reported by: seth Owned by: jsiwek
Priority: Normal Milestone:
Component: Bro Version:
Keywords: Cc:

Description

There is a hard limit to how many references Bro will create for a type or value (INT_MAX) and we are using a function incorrectly leading to termination after long uptimes.

There are a few places (for example strings.bif:132) where the internal_type function is called to get a reference to the type for a Bro script level defined type. Unfortunately when internal_type is called, it in turn calls lookup_ID (Var.cc:486 leading to Scope.cc:115) which looks up the type, adds an additional reference to it and returns it. The code that called internal_type never Unrefs the value. Eventually Obj.h:207 causes a shutdown when INT_MAX references are created to the type.

I'm thinking we should just lookup the type a single time at startup along with all of the other Bro defined types in NetVar?.cc and reuse the value after that so we one reference it a single time. It's more consistent with how the internal_type function is called for other cases.

Change History

comment:1 Changed 5 months ago by seth

Here's a backtrace....

#0  0x286b7ea7 in kill () from /lib/libc.so.7
#0  0x286b7ea7 in kill () from /lib/libc.so.7
#1  0x286b7e06 in raise () from /lib/libc.so.7
#2  0x286b6a1a in abort () from /lib/libc.so.7
#3  0x08196a0a in Reporter::InternalError (this=) at /home/users/bro/bro-2.0-beta/src/Reporter.cc:118
#4  0x081bd912 in bad_ref (type=1) at /home/users/bro/bro-2.0-beta/src/Obj.cc:250
#5  0x081f29cd in lookup_ID (name=0x8261025 "string_array", curr_module=0x824ad2f "GLOBAL", no_global=false, same_module_only=false) at Obj.h:208
#6  0x0823bf50 in internal_type (name=0x8261025 "string_array") at /home/users/bro/bro-2.0-beta/src/Var.cc:486
#7  0x08165ba8 in do_split (str_val=0x30745980, re=0x28866f20, other_sep=) at strings.bif:219
#8  0x08166606 in BifFunc::bro_split1 (frame=0x30715c80, BiF_ARGS=0x316025a0) at strings.bif:386
#9  0x08150ce3 in BuiltinFunc::Call (this=0x2885bd30, args=0x316025a0, parent=0x30715c80) at /home/users/bro/bro-2.0-beta/src/Func.cc:476
#10 0x0812b2bb in CallExpr::Eval (this=0x28874a00, f=0x30715c80) at /home/users/bro/bro-2.0-beta/src/Expr.cc:4651
#11 0x0812a222 in AssignExpr::Eval (this=0x28875400, f=0x30715c80) at /home/users/bro/bro-2.0-beta/src/Expr.cc:2598
#12 0x0820e2aa in ExprStmt::Exec (this=0x28874b80, f=0x30715c80, flow=@0xbfbfe044) at /home/users/bro/bro-2.0-beta/src/Stmt.cc:367
#13 0x0820a8dc in StmtList::Exec (this=0x28869b00, f=0x30715c80, flow=@0xbfbfe044) at /home/users/bro/bro-2.0-beta/src/Stmt.cc:1402
#14 0x0815110f in BroFunc::Call (this=0x288953c0, args=0xbfbfe0ac, parent=0x0) at /home/users/bro/bro-2.0-beta/src/Func.cc:333
#15 0x0819e2fd in LogMgr::Write (this=0x2872c770, id=0x28d45480, columns=0x2f3b34e0) at /home/users/bro/bro-2.0-beta/src/LogMgr.cc:938
#16 0x081560fc in BifFunc::Log::bro___write (frame=0x305f6c40, BiF_ARGS=0x30143b70) at logging.bif:46
#17 0x08150ce3 in BuiltinFunc::Call (this=0x2885b7e0, args=0x30143b70, parent=0x305f6c40) at /home/users/bro/bro-2.0-beta/src/Func.cc:476
#18 0x0812b2bb in CallExpr::Eval (this=0x2889d340, f=0x305f6c40) at /home/users/bro/bro-2.0-beta/src/Expr.cc:4651
#19 0x0820a8dc in StmtList::Exec (this=0x2889a740, f=0x305f6c40, flow=@0xbfbfe314) at /home/users/bro/bro-2.0-beta/src/Stmt.cc:1402
#20 0x0815110f in BroFunc::Call (this=0x2889a800, args=0x2ffa2550, parent=0x305f3980) at /home/users/bro/bro-2.0-beta/src/Func.cc:333
#21 0x0812b2bb in CallExpr::Eval (this=0x28d47250, f=0x305f3980) at /home/users/bro/bro-2.0-beta/src/Expr.cc:4651
#22 0x0820e2aa in ExprStmt::Exec (this=0x28d472b0, f=0x305f3980, flow=@0xbfbfe474) at /home/users/bro/bro-2.0-beta/src/Stmt.cc:367
#23 0x0820a8dc in StmtList::Exec (this=0x28d45440, f=0x305f3980, flow=@0xbfbfe474) at /home/users/bro/bro-2.0-beta/src/Stmt.cc:1402
#24 0x0815110f in BroFunc::Call (this=0x287fb980, args=0x2e6a6f80, parent=0x0) at /home/users/bro/bro-2.0-beta/src/Func.cc:333
#25 0x08103836 in EventHandler::Call (this=0x287640d0, vl=0x2e6a6f80, no_remote=false) at /home/users/bro/bro-2.0-beta/src/EventHandler.cc:72
#26 0x081030cc in EventMgr::Dispatch (this=0x82d0ec0) at Event.h:46
#27 0x08103218 in EventMgr::Drain (this=0x82d0ec0) at /home/users/bro/bro-2.0-beta/src/Event.cc:117
#28 0x081b462c in net_packet_dispatch (t=1324679985.0004079, hdr=0x28706c38, pkt=0x293000be "", hdr_size=14, src_ps=0x28706c00, pkt_elem=0x0) at /home/users/bro/bro-2.0-beta/src/Net.cc:352
#29 0x081b4b59 in net_packet_arrival (t=1324679985.0004079, hdr=0x28706c38, pkt=0x293000be "", hdr_size=14, src_ps=0x28706c00) at /home/users/bro/bro-2.0-beta/src/Net.cc:414
#30 0x081c44c2 in PktSrc::Process (this=0x28706c00) at /home/users/bro/bro-2.0-beta/src/PktSrc.cc:273
#31 0x081b48c5 in net_run () at /home/users/bro/bro-2.0-beta/src/Net.cc:444
#32 0x080b50c6 in main (argc=) at /home/users/bro/bro-2.0-beta/src/main.cc:1015

comment:2 follow-up: ↓ 3 Changed 5 months ago by robin

This sounds like a plain ref counting bug to me. The caller should
unref once it doesn't need the type anymore. Were you able to
understand which caller isn't doing it? It doesn't need to be the on
the stack backtrace (and from a quick look, it isn't).

I'm thinking we should just lookup the type a single time at startup along with all of the other Bro defined types in NetVar??.cc and reuse the value after that so > we one reference it a single time.

That won't help as anybody using a a type would still need to ref it (and thus also unref).

It's more consistent with how the internal_type function is called for other cases.

Why would that be more consistent?

comment:2 follow-up: ↓ 3 Changed 5 months ago by seth

This sounds like a plain ref counting bug to me. The caller should
unref once it doesn't need the type anymore. Were you able to
understand which caller isn't doing it? It doesn't need to be the on
the stack backtrace (and from a quick look, it isn't).

There are only a few places outside of NetVar?.cc that even call internal_type which is why I made the consistency comment. You can find everywhere that causes this ref counting problem with a quick grep since there aren't that many, especially if you count out the stuff in NetVar?.cc.

And I caused this bug (/me ducks).

comment:3 in reply to: ↑ 2 Changed 5 months ago by robin

Replying to seth:

And I caused this bug (/me ducks).

So where's the unref missing?

comment:4 Changed 5 months ago by seth

So where's the unref missing?

In the do_split function. I'm never saving the returned ref'd pointer from the internal_type call so I couldn't unref it (I didn't realize it was adding a reference to the value). I think it should just be moved to NetVar?.cc though.

comment:5 Changed 5 months ago by robin

Ah, I didn't realize that TableVal? ref's the type internally as well (other methods often take an already ref'ed object).

Ok, we can move this to NetVar? (and there are more like this in strings.bif).

comment:6 Changed 5 months ago by jsiwek

  • Owner set to jsiwek
  • Status changed from new to assigned

comment:7 Changed 5 months ago by jsiwek

  • Status changed from assigned to closed
  • Resolution set to fixed

In [2348d794b6e0476a944b119e08319cae4bffc7b9/bro]:

Fix ref counting bug in BIFs that call internal_type. (fixes #740)

comment:8 Changed 5 months ago by robin

In [a2e8146e4f52e46401142970e8150ebcc2951cd2/bro]:

Merge remote-tracking branch 'origin/fastpath'

  • origin/fastpath: Fix ref counting bug in BIFs that call internal_type. (fixes #740)

comment:9 Changed 5 months ago by robin

  • Milestone Bro2.0 deleted

Milestone Bro2.0 deleted

Note: See TracTickets for help on using tickets.