Monday, May 14, 2018

Seemingly Random Network Deaths on an Industrial Single Board Computer

I'm not sure if this is the right place to ask, but I'm at my wits' end.

We have a device running Windows XP Embedded SP3 on a single-board computer (SBC) that randomly, maybe once a week, half-heartedly drops off the network. The network is pretty simple: just 5 or 6 devices connected via an unmanaged switch on a 192.168.1.0/24 subnet, with no other connectivity. When this happens, everything else on the network still works but can't talk to the SBC.

When it dies, I can still see broadcast traffic leaving the interface (mostly NBNS), but all active connections die, and I can't find anything in the Windows event log to explain why. While it is down, running 'arp -a' at the command line shows no MAC entries. Capturing ARP traffic in Wireshark from another device on the network shows the SBC sending ARP requests and other devices replying to them, but the SBC doesn't react and nothing gets added to its ARP table. If we try to open a TCP connection from the SBC to another device, we can see the ARP request go out and a reply come back, but then no handshake, nothing.
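For anyone who wants to catch the empty ARP table automatically rather than by eye, here is a minimal sketch in Python. It assumes a Python interpreter is available on the SBC image (not a given on XP Embedded) and that 'arp -a' prints rows in the usual Windows layout of IP address, dash-separated MAC, and type. The function names are made up for illustration.

```python
import re
import subprocess

# One row of the Windows `arp -a` table: IPv4 address, MAC written as six
# hex pairs separated by dashes, then the entry type (dynamic/static).
ARP_ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})\s+(\w+)\s*$"
)


def parse_arp_table(output):
    """Return a list of (ip, mac, type) tuples found in `arp -a` output."""
    entries = []
    for line in output.splitlines():
        match = ARP_ROW.match(line)
        if match:
            entries.append(match.groups())
    return entries


def arp_table_is_empty(output):
    """True when the output contains no address rows at all."""
    return not parse_arp_table(output)


def read_arp_output():
    """Run `arp -a` and return its text (hypothetical helper).

    Popen is used instead of check_output so that a non-zero exit code
    does not raise; only the printed text matters here.
    """
    process = subprocess.Popen(
        ["arp", "-a"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return process.communicate()[0]
```

On the SBC itself, `arp_table_is_empty(read_arp_output())` would be the check. An empty table alone is not proof of the fault, since entries also age out when the box is idle, so it probably makes sense to try reaching a known peer first and then look at the table.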

Fixing it temporarily is as simple as disabling the Ethernet adapter and enabling it again. It's happening on most of the SBCs we have of the same model, but we can't reproduce it on demand because it's so sporadic. I'm leaning toward a driver issue, but we've reached out to the manufacturer and so far they've got nothing, and I have no idea how to troubleshoot it any further.
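Until the root cause is found, the disable/enable workaround could in principle be automated with a small watchdog that also records when each failure happened, which might help correlate the drops with other events. The sketch below is only the polling loop. `is_healthy` and `reset_adapter` are hypothetical callables supplied by the caller, because which tool can actually toggle the adapter on this image is not something I know.

```python
import time


def watch(is_healthy, reset_adapter, interval_s=30.0, max_checks=None, log=print):
    """Poll is_healthy(); on failure, log a timestamp and call reset_adapter().

    is_healthy and reset_adapter are caller-supplied (hypothetical) callables.
    max_checks=None means run forever. Returns the number of resets performed.
    """
    resets = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if not is_healthy():
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log("%s network check failed, resetting adapter" % stamp)
            reset_adapter()
            resets += 1
        time.sleep(interval_s)
    return resets
```

Keeping the check and the reset as parameters means the loop can be tried out on a desktop with dummy functions before anything is deployed to the SBCs.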


