Saturday, January 26, 2019

Switch times out EAP, but sends Access-Accept to client

Hi,

I have a strange issue at a customer site and would like the opinion of the 802.1x wizards.

For several months now, some computers have been taking more than 30s to authenticate via EAP on 802.1x-enabled wired ports.

The issue arises after the computer boots.

Only one customer site is impacted. The other sites have never had the issue.

All of the computers are installed from the same Windows image.

All of the switches at all locations are the same hardware, run the same software release and use the same configuration template.

If you take one of the computers from this specific site to another site, you won't be able to trigger the issue.

If you take one of the computers from another site to this specific site, you will trigger the issue after a few tries.

From PCAP, we found that the delay is caused by the client, not the Radius server.

From EAPHost Windows events, we confirmed the client was taking an average of 50s to authenticate.
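For anyone who wants to repeat that kind of PCAP analysis, here is a minimal sketch of the idea: charge the gap before each packet to whoever sent that packet. The function name, the sender labels and the timestamps are all made up for illustration; they are not taken from our captures.

```python
from collections import defaultdict


def attribute_delays(events):
    """Sum the gap before each packet and charge it to that packet's sender.

    `events` is a time-ordered list of (timestamp_seconds, sender) tuples,
    as seen from the switch. The sender labels are hypothetical.
    """
    waits = defaultdict(float)
    for (prev_t, _), (t, sender) in zip(events, events[1:]):
        waits[sender] += t - prev_t
    return dict(waits)


# Illustrative timestamps only, NOT real capture data.
sample = [
    (0.0, "switch"),   # EAP-Request/Identity sent to the client
    (12.0, "client"),  # EAP-Response/Identity comes back late
    (12.1, "switch"),  # Access-Request relayed to the RADIUS server
    (12.2, "radius"),  # Access-Challenge
    (12.3, "switch"),  # EAP-Request relayed to the client
    (40.3, "client"),  # EAP-Response comes back late again
]

waits = attribute_delays(sample)
slowest = max(waits, key=waits.get)
print(slowest, waits)
```

With numbers shaped like these, almost all of the waiting time lands on the client, which is the pattern we saw.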

We have not yet been able to find out why their Windows machines behave like this.

I'm waiting for the customer to enable debug and analytics logs on EAP events to troubleshoot this further.

  1. So 802.1x Windows wizards, please let me know if you have any idea why Windows behaves like this.

As 30s is the default supplicant timeout configured on ALE OmniSwitch, the switches time out the client authentication and consider it failed.

But when they finally get an answer from the RADIUS server (which is an Access-Accept most of the time), they relay it to the client.

The client receives the Access-Accept, thinks it's authenticated and doesn't know the switch considers the authentication timed out.

The client then won't try to authenticate again and is stuck unauthenticated in the guest VLAN...
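To make the sequence above concrete, here is a toy model of it. This is only a sketch of the behavior as I described it, not the OmniSwitch implementation; the `relay_late_accept` flag and all names are hypothetical and only there to contrast what we observe with what we would expect.

```python
from dataclasses import dataclass

SUPPLICANT_TIMEOUT = 30.0  # seconds, the default supplicant timeout mentioned above


@dataclass
class Outcome:
    switch_port: str  # what the switch decided for the port
    client_view: str  # what the client believes about its own state

    @property
    def stuck(self):
        # The client is stuck when it believes it is authenticated
        # while the switch has put the port in the guest VLAN.
        return self.client_view == "authenticated" and self.switch_port == "guest"


def simulate(client_delay, relay_late_accept=True):
    """Toy model: the client needs `client_delay` seconds to finish EAP."""
    if client_delay <= SUPPLICANT_TIMEOUT:
        # Normal case: the exchange finishes before the switch gives up.
        return Outcome(switch_port="authorized", client_view="authenticated")
    if relay_late_accept:
        # Observed case: the switch has already timed out, yet still
        # passes the late success on to the client.
        return Outcome(switch_port="guest", client_view="authenticated")
    # Expected case (assumption): the client learns about the failure
    # and is free to start a new authentication.
    return Outcome(switch_port="guest", client_view="failed")


print(simulate(50.0))
print(simulate(50.0, relay_late_accept=False))
```

In the first call both sides disagree and nothing triggers a retry; in the second, the client at least knows it has to start over.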

We think the switch isn't doing its job here, as it should tell the client something went wrong.

We searched the RFCs to find out how the switch is supposed to behave on a supplicant timeout, but didn't find anything.

  1. So 802.1x RFC wizards, do you also think the switch isn't doing its job here?

  2. Can you point me to an RFC or resource to show to ALE support so I can request a behavior change in the software?

Thanks for reading.

PS: Yeah, I know I can change the supplicant timeout value. But I'm interested in fixing things, not working around them.


