Can I debug network connectivity through two switches?
I am simplifying the real scenario here but:
Let's assume I have a machine M1 that is connected to a network switch SW1 which is itself connected to a second switch SW2 that is connected to machine M2.
M1-> (port1) SW1 (port2) -> (port1) SW2 (port2) -> M2
Both M1 and M2 are on the same IP network (e.g. 10.0.0.1/24 and 10.0.0.2/24).
Suppose the 3 cables are just fine but something goes bad in one of the two switches.
How can I check where the problem is?
Is there any way to "ping" them at some sort of low level?
Does this type of device have a network identity of some sort? or are they totally transparent and the only way to check which one is faulty is to try and replace them? (assuming I can't jump over them because of physical limitations - i.e. I can't connect M1 directly to SW2....)
Thanks
EDIT/UPDATE:
I will try to answer some of the questions raised in the answers.
The full scenario is a home network, but it's a big house (2 floors).
The phone line comes in on the first floor, where a filter splits it into ADSL-only and phone-only lines. The ADSL line goes up to my study on the second floor, where a modem/router is connected to it. Everything from there on is Ethernet. There is an additional router in that room to connect the multiple devices I have there, and a cable goes to a central point on floor 2 (this is SW2), where a simple switch sits in the wall. Cables go from it to all the rooms on the second floor, and an additional cable goes down to a central point on floor 1 and into another switch (this is SW1), which has cables to all the rooms on floor 1. Then, at the socket in the living room (floor 1), yet another router is connected to provide WiFi and connect several media devices, and a cable runs to another corner of the room, where yet another switch serves a computer (M1) and a printer.
It happens from time to time that I have no internet at M1 (floor 1) but do have internet in my study on the second floor. This is the most common scenario I am trying to debug.
I obviously do have physical access to all the involved devices, but due to distance limits I cannot connect M1 to SW1 directly (maybe I should buy or crimp a long cable to allow this, and so bypass one switch and one router).
When I say a cable is OK, I mean that I tested it with a simple Ethernet cable tester where the LEDs light up one by one, 1..8.
AFAIK my switches do not have IP addresses, but I would be glad to learn otherwise... how can I tell? @Peregrino69 says "Connect to the switch with a console cable". What is a console cable? Where can I buy one, or how should I wire one? Is it some wiring on an Ethernet cable? And what device am I supposed to connect on the other side of the wire? (Can a laptop be used as the console, and how?)
@user1686 wrote "Some cheap 5-port switches (e.g. a few of TP-Link's products) are branded as "unmanaged" but do speak IP and have a web interface, though a very minimal one, but this is generally an exception and not the rule." Well, at least one of my switches is a TP-Link (model: TL-SG1005D). How can I know if it has an IP address?
Does this type of device have a network identity of some sort? or are they totally transparent and the only way to check which one is faulty is to try and replace them?
"Managed" switches do act as hosts and have an IP address through which they can be configured. They can be pinged, monitored through SNMP, or configured through web interface (similar to routers).
(But even a managed switch is still a bridge, not a router/gateway, and will not show up in regular IP traceroute. The management interface is merely on the same subnet but doesn't handle the actual traffic.)
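Since switches won't appear in a traceroute, one option is to ping every device on the path that does have an IP address (routers, managed switches) in order and see where the replies stop. The following is only a rough sketch of that idea: the function names are made up for illustration, and the ping flags assume the stock Linux or Windows `ping` (macOS interprets `-W` differently, and on Windows the exit code is only an approximate signal).

```python
import platform
import subprocess


def ping_once(host, timeout_s=1):
    """Return True if the system ping reports a reply to one echo request."""
    if platform.system() == "Windows":
        # -n count, -w timeout in milliseconds
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # -c count, -W timeout in seconds (Linux iputils semantics)
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def first_unreachable(hops, probe=ping_once):
    """Walk (name, ip) pairs in path order; return the first that fails.

    Returns None when every hop answered. `probe` is injectable so the
    walk can be exercised without sending real packets.
    """
    for name, ip in hops:
        if not probe(ip):
            return (name, ip)
    return None
```

Called with the IP-speaking devices listed in order from the affected machine outward, for example `first_unreachable([("nearest router", "10.0.0.3"), ("modem/router", "10.0.0.1")])` with placeholder addresses, it returns the first device that stayed silent. The fault then lies somewhere between that device and the last one that answered, which may still include an unmanaged switch or a cable.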
Most managed switches will show you whether each port detects a link, what speed it has negotiated, and usually keep system logs which can tell you whether a specific port has been bouncing up/down in the last few days. Many of them even include a "cable test" feature that performs a basic TDR test on selected ports.
There is unfortunately no standard "layer-2 traceroute" that would be widely implemented, even in managed switches, so you'll still need to look at each port manually hop-by-hop.
"Unmanaged" switches generally don't offer any management or monitoring features and will not have an IP address either.
Some cheap 5-port switches (e.g. a few of TP-Link's products) are branded as "unmanaged" but do speak IP and have a web interface, though a very minimal one, but this is generally an exception and not the rule.
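One way to look for such a hidden management interface is to check which IP/MAC pairs a machine on the same subnet has learned, and see whether any address is left over once all known devices are accounted for. The sketch below assumes a Linux machine and parses the text printed by `ip neigh show`; the function names and sample addresses are illustrative only. A switch whose management address sits on a different subnet, or one that has not exchanged any traffic with this machine, would not show up this way.

```python
def parse_ip_neigh(output):
    """Map IP -> MAC from `ip neigh show` text, skipping entries without a MAC."""
    table = {}
    for line in output.splitlines():
        fields = line.split()
        if "lladdr" in fields:
            idx = fields.index("lladdr")
            if idx + 1 < len(fields):
                # First field is the neighbour's IP, the one after
                # "lladdr" is its link-layer (MAC) address.
                table[fields[0]] = fields[idx + 1].lower()
    return table


def unknown_hosts(table, known_ips):
    """Return neighbours whose IP is not among the devices you can account for."""
    return {ip: mac for ip, mac in table.items() if ip not in known_ips}
```

Any address that `unknown_hosts` returns is only a candidate: opening it in a browser, or looking up the vendor prefix of its MAC address, would be the next step to see whether it really is a switch.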
You gave the following parameters:
- All 3 cables are OK (have they been measured? "This works between these other devices" is an indication, but not a solid proof that the cable actually is OK)
- The switches do not have IP addresses
- You work on M1 and cannot connect it directly to SW2 for an unspecified reason
- Somewhere in this setup an unknown failure has happened - unspecified, undescribed
You tagged this as home-networking, but are also implying this is a real life scenario, simplified, which suggests a small, SoHo-type production network with more devices. There's really not much to go on, but unfortunately as there's no IP connectivity, troubleshooting requires physical access to the devices.
Firstly I'd find out what exactly is happening, and what is not happening. For example "When user in M2 logs in they don't get any access to network", "User in M2 has no access to network printer" and "Network in M2 is very slow" are different scenarios with different troubleshooting paths.
One could take the heavy-handed approach: power cycle all devices and see if the problem disappears. If not, start plugging cables into different ports. If it still doesn't disappear, replace cables, one by one. However this is not a very good approach in a production environment. It will cause downtime. If someone has changed the configuration of one device but forgot to save it, the change will be lost. Plugging a cable into another port could restore a physical link, but depending on the configuration might result in lost network access. And so on.
If for example a sudden power spike has put one of the network devices into an unresolvable condition, simply power cycling it could resolve the problems. Just looking at the devices might give a hint - do all devices have power in the first place? Are they running steady, or is one of them maybe rebooting every so often?
Do the LEDs look normal? On connected ports they usually blink fast but randomly, indicating traffic is passing through the port. LEDs of disconnected ports should be off. If an LED on one or more ports either stays lit steadily or blinks in a steady pattern, the problem can be in the switch, the link or the connected devices. If all LEDs are lit steadily, connected or not, power cycling the device is usually the only way out.
Now we don't know whether these are unmanaged or managed switches. If they're unmanaged, there's really no other way to troubleshoot than getting physical as above.
If these are "light-managed" switches with limited capabilities and a web interface, they might actually be factory configured to either have static IP addresses, or receive IPs from a DHCP server. Check the manual.
Assuming these are fully managed switches and a visual examination shows nothing out of the ordinary, my troubleshooting path here would be something like below.
Connect to the switch with a console cable. The CLI has multiple tools available for troubleshooting. With some devices (Cisco, H3C...) you will initially only get user-level access with a very limited toolset, and troubleshooting requires elevation to a privileged level. On Cisco IOS the command for that is "enable" ("configure terminal" then enters configuration mode, which is only needed for making changes); other vendors use different commands. Pressing tab or question mark often displays a list of available commands. Once on the CLI, I'd start with checking the configuration, and preferably saving it.
With modern devices all ports are usually set to Autonegotiation, and are running in 1000FDx mode (1000 Mbit/s, full duplex) - with older or really cheap devices maybe 100FDx. Is one or more running in 10HDx (10 Mbit/s, half duplex) instead? If yes, is it actually configured that way, or did the switch set 10HDx mode automatically?
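When the switch itself offers no CLI, the same speed/duplex question can at least be asked from the machine at the other end of the cable, since both ends negotiate the link together. As a minimal sketch, assuming a Linux host (which exposes `carrier`, `speed` and `duplex` under `/sys/class/net/<interface>/`), something like the following could flag a link that came up slower than expected. The function names and the 1000 Mbit/s default are assumptions for illustration.

```python
import os


def read_link(iface, sysfs_root="/sys/class/net"):
    """Return carrier/speed/duplex for a Linux interface, None where unreadable."""
    def read(name):
        try:
            with open(os.path.join(sysfs_root, iface, name)) as fh:
                return fh.read().strip()
        except OSError:
            # Missing interface, or the kernel refusing the read
            # (this can happen while the link is down).
            return None

    speed = read("speed")
    return {
        "carrier": read("carrier") == "1",
        "speed_mbps": int(speed) if speed and speed.lstrip("-").isdigit() else None,
        "duplex": read("duplex"),
    }


def looks_degraded(link, expected_mbps=1000):
    """Flag a link that is down, not full duplex, or slower than expected."""
    if not link["carrier"]:
        return True
    if link["duplex"] != "full":
        return True
    return link["speed_mbps"] is None or link["speed_mbps"] < expected_mbps
```

For example, `looks_degraded(read_link("eth0"))` on a gigabit setup would return True if the port had fallen back to 100 Mbit/s, which often points at a marginal cable or connector rather than the switch; `eth0` is a placeholder for the actual interface name.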
If the switch is running a L2 neighbor detection protocol such as LLDP or CDP, what do those tables show? Are all the neighbors that should be visible listed in the tables?
What's in the log? That will show for example if one or more ports are flickering up/down; or if a port suddenly was automatically set to 10HDx mode. One or more MAC addresses might be flapping. And a lot more.
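If the log is long, counting link transitions per port makes a flapping port stand out quickly. The sketch below assumes Cisco-style `%LINK-3-UPDOWN` messages saved to a text file; other vendors word these events differently, so the pattern would need adjusting, and the function name is just for illustration.

```python
import re
from collections import Counter

# Assumed message format (Cisco-style); adapt the pattern for other vendors.
LINK_EVENT = re.compile(
    r"%LINK-3-UPDOWN: Interface (\S+), changed state to (up|down)"
)


def count_flaps(log_lines):
    """Count physical link up/down transitions per interface."""
    flaps = Counter()
    for line in log_lines:
        match = LINK_EVENT.search(line)
        if match:
            flaps[match.group(1)] += 1
    return flaps
```

`count_flaps(open("switch.log")).most_common(3)` would then list the three busiest ports, with `switch.log` being a hypothetical file holding the saved log. A port with dozens of transitions over a few days is a good place to start looking.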
...and that's just the start, but would give enough to know where to look next. But again, without IP connectivity getting physical access is essential. It might be needed even with IP connectivity: if the physical link between SW1 and SW2 is down, there's no way to reach the switch on the far side remotely.
Like everything else, it's easy when you know what you're doing - a lot easier than trying to explain it. Especially how to troubleshoot an unclear situation like this.