Minecraft: weird lag issues for a certain player on my Vanilla 1.11.2 server
Basically I'm running a Vanilla server (1.11.2) on my computer and one player has suddenly started having a strange lag/packet loss issue.
Symptoms
- 10 or so second joining time
- The player can open chests and break/place blocks. When they interact, other players can see the interaction immediately, but the affected player does not see the intended action for 10-20 seconds or so.
- Sometimes, but not always, the player will leave the server but their player model remains on the server. When this occurs, I (as admin) can do the kick command and it says successfully kicked. However, the player remains on the server and they remain in the tab menu.
- On one occasion I attempted to kick the player when this was happening and instead the following appeared in the chat as if I had said it:
internal exception io.netty.handler.codec.decoderexception
- At one point the player attempted to join the server and got a similar error message which was repeated in the console (unfortunately I didn't note it)
- The player sees slightly worse tracert results to my IP when connected to the server
Other notes/attempted fixes:
- Both the player and I have generally stable internet connections (around 10-40ms ping, 5-15 Mbps down and 3-5 Mbps up)
- The player has a good connection on other Minecraft servers
- The player and I are located within 10kms of each other
- No other players experience lag of this nature
- We have tried creating Windows Firewall rules on both ends for inbound and outbound connections
- We have tried setting the firewall level to "lax" on the host router
- We have tried restarting the host router
- We have tried destroying all signs on the server (Googling the Java error gives results about this being the problem on 1.8~ Spigot servers but I doubt this is the problem)
- I have tried adding more RAM to the server (8 gigabytes in total now)
- The player used to have no lag issues, but I cannot think of anything that has changed since it started about a week or so ago
Solution 1:
I want to stress this point: the source of the network problem might not be the network.
10 or so second joining time
The player can open chests and break/place blocks. When they interact, other players can see the interaction immediately, but the affected player does not see the intended action for 10-20 seconds or so.
Sometimes, but not always, the player will leave the server but their player model remains on the server. When this occurs, I (as admin) can do the kick command and it says successfully kicked. However, the player remains on the server and they remain in the tab menu.
This is all "lag". It could be low bandwidth, high latency, etc. On its own, it does not really help narrow down what is going wrong.
On one occasion I attempted to kick the player when this was happening and instead the following appeared in the chat as if I had said it: internal exception
io.netty.handler.codec.decoderexception
The documentation on DecoderException suggests that it happens when the receiving party is unable to rebuild the data being sent.
This could be:
- Packets being dropped or exceeding a timeout
- The network buffer getting corrupted (memory failure)
- Packets getting corrupted on the network
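To make the "unable to rebuild the data" part concrete, here is a minimal sketch of a length-prefixed frame decoder. It assumes a simplified model in which every packet is preceded by its length encoded as a VarInt; it is not the server's actual code, and DecodeError, frame and split_frames are names made up for the illustration. A single flipped bit in a length prefix is enough to make the receiver fail (a real streaming decoder would more likely wait for the missing bytes and then misread whatever follows).

```python
class DecodeError(Exception):
    """Stand-in for the kind of error a frame decoder raises (hypothetical name)."""


def read_varint(data, pos=0):
    """Decode a VarInt starting at data[pos]; return (value, next_pos)."""
    value = 0
    for i in range(5):  # a 32-bit VarInt occupies at most 5 bytes
        if pos + i >= len(data):
            raise DecodeError("stream ended inside a VarInt")
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # high bit clear: this was the last byte
            return value, pos + i + 1
    raise DecodeError("VarInt is longer than 5 bytes")


def write_varint(value):
    """Encode a non-negative integer as a VarInt."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def frame(payload):
    """Prefix a payload with its length."""
    return write_varint(len(payload)) + payload


def split_frames(stream):
    """Split a complete byte stream into its length-prefixed frames."""
    frames, pos = [], 0
    while pos < len(stream):
        length, pos = read_varint(stream, pos)
        if pos + length > len(stream):
            raise DecodeError(
                "frame claims %d bytes, only %d left" % (length, len(stream) - pos)
            )
        frames.append(stream[pos:pos + length])
        pos += length
    return frames


if __name__ == "__main__":
    good = frame(b"hello") + frame(b"world")
    print(split_frames(good))

    damaged = bytearray(good)
    damaged[0] ^= 0x40  # flip one bit in the first length prefix
    try:
        split_frames(bytes(damaged))
    except DecodeError as exc:
        print("decode failed:", exc)
```

The point is only that the receiver has no way to recover once a length prefix or payload is damaged or incomplete, which fits all three causes listed above.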
At one point the player attempted to join the server and got a similar error message which was repeated in the console (unfortunately I didn't note it)
Ditto.
The player sees slightly worse tracert results to my IP when connected to the server
This sounds like the fastest route between you two cannot handle the traffic or is saturated, so the network found a route with more bandwidth but more latency.
Edit: this shouldn't be happening. It suggests that at some point along the network the connection is not good enough.
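If you want numbers to compare instead of eyeballing tracert, the player could run something like the following sketch a few times, both while connected to the server and while not. It only times TCP connection attempts to the server port. The address 203.0.113.10 is a placeholder, and 25565 is assumed because it is Minecraft's default port; replace both with your own values.

```python
import socket
import statistics
import time


def connect_times(host, port, attempts=10, timeout=3.0, pause=0.0):
    """Time TCP connects to host:port in milliseconds; None marks a failure."""
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append((time.perf_counter() - start) * 1000.0)
        except OSError:  # refused, unreachable, or timed out
            results.append(None)
        time.sleep(pause)
    return results


def summarize(samples):
    """Reduce a list of samples to a few comparable numbers."""
    ok = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failed": len(samples) - len(ok),
        "min_ms": min(ok) if ok else None,
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


if __name__ == "__main__":
    # Placeholder address and assumed default port: change these.
    samples = connect_times("203.0.113.10", 25565, attempts=20, pause=0.5)
    print(summarize(samples))
```

A large gap between the minimum and the maximum, or any failed attempts, would point to a congested or lossy hop rather than a slow server.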
Both the player and I have generally stable internet connections (around 10-40ms ping, 5-15 Mbps down and 3-5 Mbps up)
Does not seem to be the problem.
The player has a good connection on other Minecraft servers
This suggests that the problem is not the last mile on the client.
The player and I are located within 10kms of each other
This is odd, because if it is not the last mile, we would have to blame the metropolitan area network (MAN) for this. Edit: We have to start from the assumption that the MAN is installed correctly and is not failing.
No other players experience lag of this nature
How far from you are the other players?
We have tried creating Windows Firewall rules on both ends for inbound and outbound connections
We have tried setting the firewall level to "lax" on the host router
Shouldn't be a problem
We have tried restarting the host router
Ok.
We have tried destroying all signs on the server (Googling the Java error gives results about this being the problem on 1.8~ Spigot servers but I doubt this is the problem)
Spigot, from what I read, is a fork of Bukkit. You said you are "Basically" running Vanilla; are you using Spigot for that? If so, it might be worth investigating.
If you are not using Spigot, then this is not the problem.
I have tried adding more RAM to the server (8 gigabytes in total now)
Did you replace your old RAM, or just add a new module?
If you just added a new module, try without the old one. A RAM module can be defective in such a way that the OS is able to load, but you get memory corruption along the way.
Also, try Memtest (download the ISO, burn it, and boot from it). It will tell you if there is any non-obvious problem with your RAM. Note: Memtest is more thorough than Windows Memory Diagnostic.
Side note: In my experience with Minecraft, the hard disk tends to be the bottleneck.
Addendum: Giving the Minecraft server more RAM doesn't necessarily improve performance. It will allow the server to keep more chunks loaded, which in turn translates to more units being spawned, and more units eat more CPU time.
The player used to have no lag issues, but I cannot think of anything that has changed since it started about a week or so ago
I will speculate for you. Consider the following:
- It might not be the fault of the server, the ISP or the client. It could be that at the time the client connects, something else also happens. For example, other people on networks close to the client may have a schedule where they watch online movies or streaming content at the same time the client joins the game.
- It might be old hardware that is starting to fail. In this case, neither you nor the client made any intentional change that started the problem. Addendum: Old hardware is not likely to be affecting a single client, unless we are talking about a problem on some router along the route from the client to the server.
- It might be malware. For example, a botnet could be using the network. It might be worth looking for possible malware on both sides. Again, in this case you did not change anything intentionally. Addendum: A botnet that has compromised both client and server could be using the connection between them to move data around, eating bandwidth.
- It could be automatic updates. Either the change happened because an update installed a defective component or, more likely, it added a scheduled task that is adding latency. Addendum: this should not be affecting a single client. Yet, I do not know, bugs are bugs.
It would actually be easier to form theories about what the problem is if you had multiple clients with the problem, because then we could try to figure out what they have in common.
Here is what you can try:
- Diagnose the memory on the server (Memtest). Replace any module that reports errors.
- If the server is on a wireless network, try a wired connection.
- Identify and stop any other services that might be listening on the network on both server and client. Try Nmap to make sure that only the ports that correspond to Minecraft are being used. There is a GUI tool for Nmap, or if you have to use the terminal... Debian has a good introduction.
- Run a scan using your preferred antivirus software.
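If installing Nmap is not an option, a very rough stand-in for the port check is a few lines of Python. This sketch only tries a plain TCP connect to each port in a list you give it; it does nothing as thorough as Nmap (no UDP, no service detection). The port list in the example is an arbitrary assumption, with 25565 included because it is Minecraft's default port.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.3):
    """Return the ports from `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


if __name__ == "__main__":
    # Arbitrary example list: SSH, HTTP, HTTPS, RDP, Minecraft default.
    candidates = [22, 80, 443, 3389, 25565]
    print("accepting connections:", open_tcp_ports("127.0.0.1", candidates))
```

Anything in the output that you do not recognize is worth tracking down before blaming the network. Only scan machines you own or have permission to test.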
If your server is on Windows:
- Try using SysinternalsSuite. In particular:
  - Use Autoruns to identify what is running on Windows startup. Configure it to verify the signatures of the entries (to see if they have been tampered with) and to send samples to VirusTotal to see if any antivirus software identifies them as malware.
  - Also use Process Explorer and configure it to do the same with whatever software is running while the problems happen.
- Run sfc /scannow in a terminal with elevated privileges. It will check whether any system files have been tampered with and attempt to repair them. If it fails, move on to Dism /Online /Cleanup-Image /RestoreHealth. If that fails, go to Microsoft's support.
- Run Windows Updates.
If you are on Linux, follow the instructions in your distro's documentation to identify and fix broken packages.
With collaboration of the client:
- If the client is on a wireless network, try a wired connection.
- Have the client try to connect outside the usual times (if they often join at night, try in the morning, or vice versa).
- Make sure they are not running any other thing that is taking resources during the gaming session. A tool such as Razer Cortex would suffice. If the client is willing to, they can try nmap too.
- Try connecting via VPN. I have successfully used DynVPN for these purposes. This will also allow an easier way to measure traffic between server and client.
Advanced diagnostics:
If the options above did not work, we are in diminishing-returns territory. What I could suggest is to capture network traffic. However, remember that the culprit may not be under your control.
- Use Wireshark on both ends and capture a test gaming session (join the server, make some change to the world, leave the server). You might want to also try this on the VPN. If there are dropped packets, network latency, or corrupted packets, you will see it there. HowToGeek has a good starting guide on Wireshark.
- Use Scapy. It allows for more powerful analysis than Wireshark, but it is harder to use... because it is a Python library! If you are familiar with Python, follow the Scapy tutorial to get started.
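As a starting point for the Scapy route, here is a rough sketch that reads a capture saved from Wireshark and counts TCP segments whose sequence number was already seen on the same flow, which is a crude indicator of retransmissions (and therefore of loss). The file name session.pcap is hypothetical, the heuristic is my own simplification rather than anything Wireshark or Scapy defines, and it only looks at IPv4. Scapy is a third-party package (pip install scapy) and is imported only when a capture is actually read.

```python
from collections import Counter


def count_retransmissions(segments):
    """Count repeated data segments per flow.

    `segments` is an iterable of
    (src, sport, dst, dport, seq, payload_len) tuples.
    """
    seen = set()
    repeats = Counter()
    for src, sport, dst, dport, seq, payload_len in segments:
        if payload_len <= 0:
            continue  # bare ACKs legitimately repeat sequence numbers
        key = (src, sport, dst, dport, seq)
        if key in seen:
            repeats[(src, sport, dst, dport)] += 1
        else:
            seen.add(key)
    return dict(repeats)


def segments_from_pcap(path):
    """Yield segment tuples from a capture file (requires Scapy)."""
    from scapy.all import rdpcap, IP, TCP  # third-party import, done lazily

    for pkt in rdpcap(path):
        if IP in pkt and TCP in pkt:
            ip, tcp = pkt[IP], pkt[TCP]
            # Payload size from the headers, so link-layer padding is ignored.
            payload_len = ip.len - 4 * ip.ihl - 4 * tcp.dataofs
            yield (ip.src, tcp.sport, ip.dst, tcp.dport, tcp.seq, payload_len)


if __name__ == "__main__":
    # Hypothetical file name: export your own capture from Wireshark.
    result = count_retransmissions(segments_from_pcap("session.pcap"))
    for flow, count in result.items():
        print(flow, "repeated segments:", count)
```

Comparing the counts from the capture taken on the server with the one taken on the client shows in which direction segments are being lost. If the affected player's flow shows many repeats and the other players' flows show none, the problem is somewhere on the path to that player.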