Thursday, February 1, 2018

Do you make notes while troubleshooting?

I have recently been trying to narrow down a registration bug with our SIP phones and eventually sorted the problem out.

After a few earlier failed attempts at fixing the same thing, I decided this time to make a note of every change and attempt in chronological order. This made it easy to see what I'd already tried, and also to tie my changes in with the remote server log should I need to ask the provider.

The problem is, once I started making progress I completely forgot about the manual logging. My question is: do you make logs as you go and, if so, do you have a set structure?

As an example, here are some of my logs from the fix attempt:

20:34 - Disabled SIP ALG on gateway via system -> conntrack -> modules -> sip -> disable.
20:38 - Tested session kill from AAISP end, phones d/c and aren't re-registering.
20:41 - Re-enable SIP ALG and reboot router
20:47 - Reboot PoE switch (192.168.1.23)
20:50 - CORE-SW-001 uplinks appear to be p39, p41, p47, p48
20:50 - Add VLAN 999 'VoIPDMZ' to CORE-SW-001 uplink ports and port 11 untagged (to firebrick)
20:55 - Add tagged VLAN 999 to test port 13
20:57 - Change tagged VLAN 999 on port 13 to UNtagged VLAN 999 on port 13 because no IP address was assigned when tagged
21:04 - Add VLAN 999 to HP PoE switch as tagged VLAN on p50 and untagged on test port 27 (Brew Room phone), rebooted brew room phone
21:11 - Change port 27 to tagged VLAN 999 and tagged VLAN 40 (original VoIP VLAN)
21:12 - Factory reset test Brew room phone
21:16 - Add p21 on HP PoE switch to UNtagged VLAN 999 to test laptop DHCP lease
21:18 - Laptop not getting DHCP lease on test port - checking new VLAN is traversing switches
21:19 - first checking if laptop gets DHCP lease on CORE-SW-001 p13 which is still UNtagged VLAN 999
21:21 - laptop gets DHCP lease and public IP of PUBLICIP on p13 of CORE-SW-001
21:22 - hardcode brew room test phone to use VLAN 999 tag
21:23 - Added tagged VLAN 999 on CORE-SW-001 to p43, brew room phone got DHCP lease PUBLICIP
21:33 - Manually configured test brew room phone with SIP account, registered successfully.
21:33 - Killing AAISP session to test new connection method
21:37 - Brew room phone failed to re-register
21:38 - Change brew room phone to TCP from UDP and kill session at AAISP
21:41 - Brew room phone failed to re-register
21:42 - Revert brew room phone to UDP and enable STUN
21:42 - Brew room phone re-registers, killing session at AAISP again to see if it re-registers
21:44 - Session killed, no re-register from test phone yet
21:47 - Disabled STUN and switch to IP rather than FQDN on brew room phone
22:00 - Switch brew room phone to TCP while on IP without DNS and force re-register, trying kill session at AAISP
22:17 - Lowered all SIP session timers on brew room phone, tested UDP and TCP and killed session - still no re-register
22:17 - Found useful forum post with similar issue (https://www.3cx.com/community/threads/periodically-phones-lose-registration.47666/)
22:17 - Found newer version of Yealink firmware for T19P E2 models, trying now
22:23 - Upgraded brew room phone to firmware v. 53.82.0.20, set STUN and TCP, trying kill session at AAISP
22:45 - Brew room phone back to original VLAN 40 (tagged) - testing on Drug Right with new settings didn't work so updating firmware
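
One way to make the logging harder to forget might be to let a small script do the timestamping. The following is only a rough sketch, not something I used during the fix: it appends notes in the same "HH:MM - note" style as the entries above. The function names and the file name in the usage comment are made up for illustration.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def format_entry(note: str, when: datetime) -> str:
    """Return one log line in 'HH:MM - note' form, matching the entries above."""
    return f"{when:%H:%M} - {note.strip()}"


def append_entry(log_path: Path, note: str, when: Optional[datetime] = None) -> str:
    """Append a timestamped note to log_path and return the line that was written.

    If 'when' is not given, the current local time is used.
    """
    line = format_entry(note, when or datetime.now())
    # Open in append mode so earlier entries are never overwritten.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line


# Example usage (hypothetical file name):
#   append_entry(Path("sip-troubleshooting.log"), "Reboot PoE switch (192.168.1.23)")
```

Because the time comes from the local clock, it is worth checking that the clock agrees with the provider's logs (or noting the offset) before trying to line the two up.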

