Node.js, Cygwin and Socket.io walk into a bar... Node.js throws ENOBUFS and everyone dies
I'm hoping someone here can help me out, as I'm not having much luck figuring this out myself. I'm running Node.js version 0.3.1 on Cygwin, using Connect and Socket.io. I seem to be having some intermittent problems with DNS or something similar; I haven't quite figured it out. The end result is that the server runs fine, but when a browser connects to it, the initial HTTP request works, Socket.io connects, and then the server dies (output below).
I don't think it has anything to do with the HTTP request, because the server gets a lot of data posted to it, and it was receiving requests and responding right up until my connection killed it. I've googled around, and the closest thing I've found is DNS being set improperly. It's a network program meant to run only on an internal network, so I've set nameserver x.x.x.x in my /etc/resolv.conf to point at the internal DNS. I've also added nameserver 8.8.8.8 in addition. I'm not sure what else to check, but I would be grateful for any help.
In node.exe.stackdump:
Exception: STATUS_ACCESS_VIOLATION at eip=610C51B9
eax=00000000 ebx=00000001 ecx=00000000 edx=00000308 esi=00000000 edi=010FCCB0
ebp=010FCAEC esp=010FCAC4 program=\\?\E:\cygwin\usr\local\bin\node.exe, pid 3296, thread unknown (0xBEC)
cs=0023 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame Function Args
010FCAEC 610C51B9 (00000000, 00000000, 00000000, 00000000)
010FCBFC 610C5B55 (00000000, 00000000, 00000000, 00000000)
010FCCBC 610C693A (FFFFFFFF, FFFFFFFF, 750334F3, FFFFFFFE)
010FCD0C 61027CB2 (00000002, F4B994D5, 010FCE64, 00000002)
010FCD98 76306B59 (00000002, 010FCDD4, 763069A4, 00000002)
End of stack trace
Node Output:
node.js:50
throw e; // process.nextTick error, or 'error' event on first tick
^
Error: ENOBUFS, No buffer space available
at doConnect (net.js:642:19)
at net.js:803:9
at dns.js:166:30
at IOWatcher.callback (dns.js:48:15)
EDIT
I'm hitting an LDAP server using http.createClient immediately after a client connects, in order to get information, and that seems to be what is causing ENOBUFS. I've edited the source to include && errno != ENOBUFS, which now prevents the server from dying; however, the LDAP request now isn't working. I'm not sure what would cause that. As I mentioned, this is an internal-only application, so I set the DNS servers in /etc/resolv.conf to the DNS servers that are applied to the host machine. Could this be part of the issue?
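As a side note on the crash itself: the node.js:50 comment in the output above mentions an 'error' event, and in Node an 'error' event that has no listener is thrown as an exception. The following is a minimal sketch of that behaviour, assuming (this is not confirmed anywhere in the thread) that the ENOBUFS error reached the client as such an event. The `makeFakeClient` helper is hypothetical and only stands in for the object returned by http.createClient; it makes no network calls.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the HTTP client object; NOT the real
// http.createClient return value, just a plain EventEmitter.
function makeFakeClient() {
  return new EventEmitter();
}

// Emits an ENOBUFS-style error on the client and reports whether the
// emit call threw (no 'error' listener) or was handled by a listener.
function tryEmit(client) {
  const err = new Error('ENOBUFS, No buffer space available');
  try {
    client.emit('error', err);
    return 'handled';
  } catch (e) {
    return 'thrown: ' + e.message;
  }
}

const unguarded = makeFakeClient(); // no 'error' listener attached
const guarded = makeFakeClient();   // 'error' listener attached below
const seen = [];
guarded.on('error', (e) => seen.push(e.message));

const unguardedResult = tryEmit(unguarded);
const guardedResult = tryEmit(guarded);

console.log(unguardedResult);
console.log(guardedResult);
```

If that assumption holds, attaching an 'error' listener to the real client before issuing the LDAP request may keep the process alive without patching the Node source, although it would not by itself explain why the connect call returns ENOBUFS.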
EDIT 2
Here's some output from gdb --args ./node_g --debug ../myscript.js. I'm not sure whether this is related to ENOBUFS, however, as it seems to be disconnecting immediately after connecting with Socket.io.
[New thread 672.0x100]
Error: dll starting at 0x76e30000 not found.
Error: dll starting at 0x76250000 not found.
Error: dll starting at 0x76e30000 not found.
Error: dll starting at 0x76f50000 not found.
[New thread 672.0xc90]
[New thread 672.0x448]
debugger listening on port 5858
[New thread 672.0xbf4]
14 Jan 18:48:57 - socket.io ready - accepting connections
[New thread 672.0xed4]
[New thread 672.0xd68]
[New thread 672.0x1244]
[New thread 672.0xf14]
14 Jan 18:49:02 - Initializing client with transport "websocket"
assertion "b[1] == 0" failed: file "../src/node.cc", line 933, function: ssize_t
node::DecodeWrite(char*, size_t, v8::Handle<v8::Value>, node::encoding)
Program received signal SIGABRT, Aborted.
0x7724f861 in ntdll!RtlUpdateClonedSRWLock ()
from /cygdrive/c/Windows/system32/ntdll.dll
(gdb) backtrace
#0 0x7724f861 in ntdll!RtlUpdateClonedSRWLock ()
from /cygdrive/c/Windows/system32/ntdll.dll
#1 0x7724f861 in ntdll!RtlUpdateClonedSRWLock ()
from /cygdrive/c/Windows/system32/ntdll.dll
#2 0x75030816 in WaitForSingleObjectEx ()
from /cygdrive/c/Windows/syswow64/KernelBase.dll
#3 0x0000035c in ?? ()
#4 0x00000000 in ?? ()
(gdb)
Solution 1:
OK, I dug around a bit, and after your second edit I found this bug on the issue list.
It doesn't state whether it was encountered under Cygwin or not, but the error you are hitting leads down to this piece of code:
    uint16_t * twobytebuf = new uint16_t[buflen];
    str->Write(twobytebuf, 0, buflen, String::HINT_MANY_WRITES_EXPECTED);
    for (size_t i = 0; i < buflen; i++) {
        unsigned char *b = reinterpret_cast<unsigned char*>(&twobytebuf[i]);
        assert(b[1] == 0); // this assertion fails
        buf[i] = b[0];
    }
From what I can read (with my rusty C), it creates a new uint16_t array and writes the contents of the V8 string into it, then asserts that the high byte of every 16-bit value is zero, i.e. that no character in the string falls outside the range 0 - 255, and that's exactly what fails here.
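To see which input would trip that assertion, the same check can be approximated from JavaScript. This is only a diagnostic sketch that mirrors the C++ loop quoted above, not code from Node itself; `firstNonBinaryIndex` is a hypothetical helper name and the sample strings are made up for illustration.

```javascript
// Returns the index of the first UTF-16 code unit above 0xFF, or -1 if
// every code unit fits in a single byte. A string for which this returns
// anything other than -1 has a non-zero high byte at that position,
// which is the condition the quoted C++ assertion rejects.
function firstNonBinaryIndex(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) {
      return i;
    }
  }
  return -1;
}

// Illustrative inputs: plain ASCII, a Latin-1 character (U+00E9),
// and a character outside the single-byte range (U+20AC).
const samples = ['hello', 'caf\u00e9', 'price: \u20ac5'];
const results = samples.map(firstNonBinaryIndex);

console.log(results);
```

Logging this for each payload before it is written could show whether a message coming through Socket.io contains characters above 255, assuming the failing write is the one that goes through this code path.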
I couldn't find anything regarding whether this is a V8 issue or not.
Since the code was added in this commit, the only thing I can suggest is to try pulling the tree from a commit before the code was added, since all versions after that contain the crashing code.
If that works, I would recommend filing another bug report on the Node.js issue list, although I may do this myself later today.