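Our network environment is built entirely on Cisco switches such as the 2960, 4500, and 3850. The virtual environment runs on Citrix Xen and VMware products.

Starting a few months ago, after the Xen environment was upgraded to 7.2, we have been facing switch port err-disabled issues.

Several ports on different Cisco 2960S and 2960X switches keep going into err-disabled mode, which interrupts the attached servers. It has happened both on bonded ports and on access ports without NIC teaming or bonding.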

SW-MGMT1#sh int g0/12
GigabitEthernet0/12 is down, line protocol is down (err-disabled) 
  Hardware is Gigabit Ethernet, address is 189c.5d6b.3c0c (bia 189c.5d6b.3c0c)
  Description: l-TEST1-2 10.9.12.27
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Auto-duplex, Auto-speed, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported 
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 3w3d, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     1901183 packets input, 485681442 bytes, 0 no buffer
     Received 23 broadcasts (6 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 6 multicast, 0 pause input
     0 input packets with dribble condition detected
     4523326 packets output, 1258124208 bytes, 0 underruns
     0 output errors, 0 collisions, 4 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out


SW-MGMT1#sh ver
Cisco IOS Software, C2960S Software (C2960S-UNIVERSALK9-M), Version 12.2(55)SE7, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2013 by Cisco Systems, Inc.
Compiled Mon 28-Jan-13 10:28 by prod_rel_team
Image text-base: 0x00003000, data-base: 0x01B00000

ROM: Bootstrap program is Alpha board boot loader
BOOTLDR: C2960S Boot Loader (C2960S-HBOOT-M) Version 12.2(55r)SE, RELEASE SOFTWARE (fc1)

SW-TEST1-FWMGMT1 uptime is 13 weeks, 2 days, 21 hours, 31 minutes
System returned to ROM by power-on
System restarted at 15:18:10 EDT Fri Nov 3 2017
System image file is "flash:/c2960s-universalk9-mz.122-55.SE7/c2960s-universalk9-mz.122-55.SE7.bin"


This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco WS-C2960S-24TS-S (PowerPC) processor (revision J0) with 131072K bytes of memory.
Processor board ID FOC1647V1LN
Last reset from power-on
1 Virtual Ethernet interface
1 FastEthernet interface
26 Gigabit Ethernet interfaces
The password-recovery mechanism is enabled.

512K bytes of flash-simulated non-volatile configuration memory.
Base ethernet MAC Address       : 18:9C:5D:6B:3C:00
Motherboard assembly number     : 73-12423-09
Power supply part number        : 341-0328-03
Motherboard serial number       : FOC1647101J
Power supply serial number      : DCA1644M7WA
Model revision number           : J0
Motherboard revision number     : A0
Model number                    : WS-C2960S-24TS-S
Daughterboard assembly number   : 73-11933-04
Daughterboard serial number     : FOC16467T1G
System serial number            : FOC1647V1LN
Top Assembly Part Number        : 800-32448-04
Top Assembly Revision Number    : B0
Version ID                      : V04
CLEI Code Number                : COMGJ00ARD
Daughterboard revision number   : A0
Hardware Board Revision Number  : 0x01


Switch Ports Model              SW Version            SW Image                 
------ ----- -----              ----------            ----------               
*    1 26    WS-C2960S-24TS-S   12.2(55)SE7           C2960S-UNIVERSALK9-M     

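Two switch ports on this SW-MGMT 2960X connect to the Citrix Xen server. The other port seems fine; it is always g0/12 that ends up in the err-disabled state.

According to this Citrix post: //discussions.citrix.com/topic/391523-after-upgrade-70-to-72-load-cisco-switch-ports-shutdown-due-to-err-disabled/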

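"A loopback error occurs when the keepalive packet is looped back to the port that sent the keepalive. The switch sends keepalives out all interfaces by default.
A device can loop the packets back to the source interface, which usually occurs because there is a logical loop in the network that spanning tree has not blocked.
The source interface receives the keepalive packet that it sent out, and the switch disables the interface (errdisable).
This message occurs because the keepalive packet is looped back to the port that sent the keepalive:
%PM-4-ERR_DISABLE: loopback error detected on Gi4/1, putting Gi4/1 in err-disable state
Keepalives are sent on all interfaces by default."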

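A switch port can end up err-disabled if the software (IOS or CatOS) detects an error condition on the port. The port is effectively shut down until it is re-enabled, either manually or automatically once a recovery timer configured for that error condition expires.

One such error condition is the presence of a loopback on the port. The switch sends keepalive packets out all interfaces. If a keepalive packet is received on the same interface it was sent from, there is a loop that Spanning Tree Protocol has not blocked, and the switch generates the messages below.

When it happens, the log shows that a loopback error was detected.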


000305: Nov  5 21:23:59.177 EST: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/12.

000306: Nov  5 21:23:59.177 EST: %PM-4-ERR_DISABLE: loopback error detected on Gi0/12, putting Gi0/12 in err-disable state

000307: Nov  5 21:24:00.179 EST: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/12, changed state to down

000308: Nov  5 21:24:01.186 EST: %LINK-3-UPDOWN: Interface GigabitEthernet0/12, changed state to down
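
To see which ports are affected when the log is longer than a few lines, the messages above can be filtered with a short script. The following is only a sketch: the function name loopback_ports and the idea of feeding it saved log text are assumptions for illustration, and the pattern is based solely on the %PM-4-ERR_DISABLE line shown above.

```python
import re
from collections import Counter

# Pattern taken from the %PM-4-ERR_DISABLE loopback line in the log above.
# The interface name is whatever precedes the first comma (e.g. Gi0/12).
LOOPBACK_RE = re.compile(
    r"%PM-4-ERR_DISABLE: loopback error detected on (\S+?),"
)


def loopback_ports(log_text):
    """Count loopback err-disable events per interface (hypothetical helper)."""
    return Counter(LOOPBACK_RE.findall(log_text))


# Two of the log lines from this post, used as sample input.
sample = """\
000305: Nov  5 21:23:59.177 EST: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/12.
000306: Nov  5 21:23:59.177 EST: %PM-4-ERR_DISABLE: loopback error detected on Gi0/12, putting Gi0/12 in err-disable state
"""

for port, count in loopback_ports(sample).items():
    print(port, count)
```

Only the PM-4-ERR_DISABLE line is counted, so each event is counted once even though the switch also logs a matching ETHCNTR-3-LOOP_BACK_DETECTED line.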

Solution:

1. Quick/temporary fix
The quick fix is easy: just shut down the interface, then no shut it again.


SW-MGMT(config)#int g0/12
SW-MGMT(config-if)#shu
SW-MGMT(config-if)#
SW-MGMT(config-if)#no shu
SW-MGMT(config-if)#


005014: Feb  5 11:55:13.089 EST: %PARSER-5-CFGLOG_LOGGEDCMD: User:admin  logged command:interface GigabitEthernet0/12 
005015: Feb  5 11:55:18.594 EST: %PARSER-5-CFGLOG_LOGGEDCMD: User:admin  logged command:shutdown 
005016: Feb  5 11:55:19.863 EST: %PARSER-5-CFGLOG_LOGGEDCMD: User:admin  logged command:no shutdown 
005018: Feb  5 11:55:24.980 EST: %LINK-3-UPDOWN: Interface GigabitEthernet0/12, changed state to up
005019: Feb  5 11:55:25.982 EST: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/12, changed state to up


SW-MGMT#sh int g0/12
GigabitEthernet0/12 is up, line protocol is up (connected) 
  Hardware is Gigabit Ethernet, address is 189c.5d6b.3c0c (bia 189c.5d6b.3c0c)
  Description: l-TEST1-2 10.9.12.27
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported 
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 1000 bits/sec, 1 packets/sec
  5 minute output rate 1000 bits/sec, 1 packets/sec
     1901190 packets input, 485682616 bytes, 0 no buffer
     Received 24 broadcasts (6 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 6 multicast, 0 pause input
     0 input packets with dribble condition detected
     4523338 packets output, 1258126740 bytes, 0 underruns
     0 output errors, 0 collisions, 5 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out

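2. Permanent fix
So far there is no official patch/fix on the Citrix side. On the Cisco side, there are two different ways to address this issue:

2.1 Disable keepalive on the specific interface

Issue the no keepalive interface command to disable keepalive packets on those interfaces. Disabling keepalives prevents the interface from being err-disabled, but it does not remove the loop. The catch is that you need to know in advance which ports have been hit by the problem.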
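
If the list of affected ports grows, the per-interface commands can be generated rather than typed by hand. This is a minimal sketch: the helper name no_keepalive_config is hypothetical, the port list is something you supply yourself, and the output should be reviewed before it is pasted into configuration mode on a switch.

```python
def no_keepalive_config(interfaces):
    """Build IOS config lines that disable keepalives on each interface.

    `interfaces` is a list of interface names supplied by the caller;
    nothing here talks to a switch.
    """
    lines = []
    for name in interfaces:
        lines.append(f"interface {name}")
        lines.append(" no keepalive")
    lines.append("end")
    return "\n".join(lines)


# Example using the port from this post.
print(no_keepalive_config(["GigabitEthernet0/12"]))
```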

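2.2 Enable automatic recovery once this kind of err-disabled issue occurs
This way, you do not need to know which ports have the problem. These are global commands on the switch and affect all ports. The recovery interval is given in seconds, so with the value below an err-disabled port is re-enabled automatically after 600 seconds.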

errdisable recovery interval 600
errdisable recovery cause link-flap
errdisable recovery cause udld
errdisable recovery cause bpduguard
errdisable recovery cause loopback
errdisable recovery cause psecure-violation
errdisable recovery cause dcbx-error
errdisable recovery cause pause-rate-limit
errdisable recovery cause inline-power

By Jon.
