For Minecraft 1.17 and newer, head to the new website

Bug: #1039 - Client time-out, server lag

ikabodo
Type: Member
Joined: 08/27/2014
Tracker ID:#1039
Type:Bug
MC Version:1.7.10
AE Version:rv1-beta-24
Created:
Message:
First of all, I'm not sure this is an applied energistics problem, if it's not. Please forgive me, I'm working hard to locate the issue, and trying all options. I saw some weird things going on in the ME-network, and figured that could be it. If you could point me to another mod, I'll gladly go there. **What is happening?** Well, the clients time out. I've noticed that when this happens several things seem to happen: * The server "avg-tick" freezes, so here is a HUGE lag-spike. * After the freeze is gone, a message of "can't keep up" is delivered. (with hundreds of lost ticks) * All players time out (usually only one or two players use the server) * If I'm in game when this happens, I noticed that it happens just when the ME-systems tries to apply all the channel (when the cables are lit up, I'm guessing that they are trying to find the shortest route. * Sometimes I get a "skipping BlockEntity with id" when the lag-spikes occur. (I fiddled around a lot with mods. It could be anything. Could it be I'm storing something that was deleted? Sometimes (I have not been able to tell when), the server just "get's it" and stops lagging, and we can play normally for a while. **Other interesting notes:** * I am not sure, but I think the problem gets worse when I am not around the base. (when the chunks aren't loaded? * The problem seem to come and go, it's really hard to pinpoint. Sometimes it just works. * The problem disappears when I turn of power. * The problem get worse if the me-system is LOW on power (turning on and off, and I'm guessing doing the "shortest" route calculation all the time) * The problem is not computer related as far as I can tell, I tried starting the server on two differant computers, and the server runs decently most of the time it's not lagging. * When the server get's in it's worst "loops", it freezes almost 1-3 times per minute, and the lag seems to get progressively worse (as if stuck in a loop). * Sometimes, the server freezes for several minutes at a time, and to quit it you have to force-shut the server or wait minutes. **What is expected?** Server not freezing. Client's not timing out. **How to reproduce?** I'll upload the world-save if you want to try it, but there are a lot of mods, so I'm not sure you want to take the time to look at it. (I'm using it privately, so I haven't gotten permission to post it publicly). Unfortunately, I'm not getting any relevant errors as far as I can tell, I got a ui error that caused a client-crash this very last time, but I have not gotten that error before, so I expect it's unrelated to the problems above. It's important to notice, that the problem seems to occur when the chunks in the main base are loaded by chunkloaders. But no player is there, so Latest client UI crash: http://pastebin.com/g6BdUs1R Server log when it is happening: http://pastebin.com/k0E8YCbP Server or Single Player?: Server (2-3 players, private) Screenshot?: Not needed, simply
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Small update: I managed to "get in" the server for a while no. It all worked nice when the ME-system was up and running. When I logged out (no player online anymore), the server spiked again. Wierd.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
The crash in your report was already fixed, you didn't pick the correct version on your report. I'm not sure what else to do about this, you can try profiling the server, but timeouts are not easily to diagnose they are most commonly caused by things sending packets that shouldn't be sending packets.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Thanks for taking the time! I understand these sort of "bug report" are probably hard to track down. Just double-checked the versions to be 24 (the world was created in an earlier version) in both client and server, they were correct. But perhaps that error-log was from before, I changed that very night. I will triple check later tonight, but I'm pretty sure it's unrelated. I got that error when I shift clicked a silicon circuit in the machine before it was complete. It caused a crash, and then I was unable to log in again without time outs, it was the first time I got this error. Could this sort of issue be caused by building a bad network like loops or nested p2p-tunnels? I get the FEELING the server gets frozen when it tries to build the channels in the network. I might be coincidental, and the error might be from any other mod, or perhaps a conflict I can't find. Are there any logs more than the regular server log that can be enabled to see what the ME-network does? More info: I noticed the lag disappears if I disconnect the power from the ME-network, but leave the rest of the power on.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
I'll probably have to write something to print out packets or something to diagnose the issue, just by turning off the power we can't really confirm what is sending packets, or what is causing the issue. I had numerous issues reported for AE1 like this, and they all ended up to be a mod with barrels sending packets when they were connected to a storage bus.. So it could be something like that, or maybe a unknown bug in AE2 thats causing lots of packets to be sent. I'll have to look into making some kinda of packet logging or something that we can use to try and track it down.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
That's great! You are a fantastic mod-author, things such as this is well beyond what can be expected from anyone's free time. (Sidenote: do you have Patreon?) I'll be willing to put time into logging and testing this if you need it, or just tell me if you want to have the world to fiddle around with.
Comment
thatsIch
Type: Moderator
Joined: 02/05/2013
Posted:
Comment:
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Ok, I wrote https://drone.io/github.com/AlgorithmX2/PacketLogger/files snag the non dev version and throw it on your client it will pretty much all the network traffic then reproduce the issue. once your have that, dropbox/pastebin it and we can see what it says, hopefully it will get us something useful.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Got it! I'll get back when I have some data! Also, I'm 95% certain now, that the issue starts when the network runs when I'm not near the base. I just left for the nether, and the issue came back within 1 min. So I think I can reproduce the issue.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Ok looks like some kind of S3FPacketCustomPayload, my logger didn't do a good job, i'll try to improve it.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Ok so fixed that, also I may have found the issue if it was indeed AE2, so go ahead and update AE2. if it persists try the new version of the logger and we will continue the search.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Great! I'll try it out as soon as I get home to the server tonight! Out of curiousity: Did it seem like a compability issue? You mentioned earlier that another mod could be sending packet where it shouldn't? If you figure out what mod it is, could you tell me? I figured I should mention it to that author so they can work on it. Also, I got a packet can only be 32bit in size error just after I posted the log. Not sure if it's related. Thanks again for taking the time to look into this issue.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Unfortunately, the new version didn't seem to remove the issue. I disconnected within one minute of logging in. Here is the new log: https://gist.github.com/Ikabodo/eff44f1a98341ff21ca2
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Oh, I think the text cut of. Here is the complete file: http://filer.ikabodo.se/latest.log
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
This is with rv1.beta build 25 ofcourse.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
AE2 sent 1702 tile entities Carpenters Blocks sent 595 beds ( tileentities ) Something sent 886 empty tile entities ( 208,84,48 or 226,80,45 ) are the location of two such tile entities. ProjRed Sent 27 packets Open Mods sent 64 packets CoFH sent 84 packets FMP Sent 183 packets Mek sent 2096 custom packets. I don't know what the contents of the Mek data is, it might be tile entity data, or it might be something else, they appear to be 34 bytes a-peace on average FMP Packets tend to be 60-1200 bytes pretty big range here AE2's packets tend to be 18-20 bytes, the unknown packets around a few bytes 4-5 ish. the beds are pretty size-able, I don't have a solid number on them however. This probably gives you a size-able world load of about 0.5 - 1.0mb of data if you combine it with other chunk data What is your internet connection? Maybe we should do one without the AE2 network online to see what the difference is?
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I'm on a 60 mbit/s cable service, but I doubt that is the issue, since all experiments are done on the local network. Also, I can actually SEE the server freezing the avg tick on the server screen (memory is 91% free, the avg tick is frozen), so something that should not be happening, is happening. * I'll try removing carpenters. (I ruled it out before, since it uses no power, and turning off power removed the issue earlier) * The unknown packages seem to be coming from big reactors, at least that is what is on the coordinates. Turning them off has made no difference so far, and they are not connected to the ME-network (bedsides from providing power via mekanism cables to the energy acceptor) I'll try to log in and disable to ME-network. If I'm able to turn it off in the few seconds I'm in game. I'll get back to you as soon as I'm able to. I'm getting timeout so often now I can barely walk 2 feet.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
The tick rate on the server is freezing? I must have missed that i thought we were talking about a packet overload issue this whole time... If thats the case you will want o profile the server instead. http://ae1.ae-mod.info/Profiling/ provides details on how to do that.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I managed to remove the power from the me-network. And the time-outs are gone. I can't get a log from a crash now though. Do you want me to send just a packetlog from regular play? Or do you want me to log and then put the power back on?
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Lets switch gears and profile the server as you login when it freezes, I've been approaching the issue wrong, I completely missed the "server is locking up" part of your initial report and thought we are talking about you getting "lagged out" because of too many packets.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I'm sorry if this is a stupid question, but should I run the profiler on the client computer, the server computer or both? Considering the client times out and the server freezes.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Server, since the server is the the thing freezing. the client just can't talk to it while that happens.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I'm on it. I'll get back to you in a few minutes. I can reproduce it now, when giving direct power to the AE2 network with a creative energy cell.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I think I managed to do it. Although it looked like the profiler almost froze itself. http://filer.ikabodo.se/profiling_1.nps
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Oups, posted the wrong file. Give me a moment.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I also got an error in the log when running the profiler the first time (I started the cpu-log from the profile-tab first. Not sure it makes any differance) http://filer.ikabodo.se/oj.txt
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Seems related to a number of AE2's internals interacting in a strange way, mostly it seems to be rebuilding the the storage array a number of times because a chunk keeps unloading. I'm pushing a new build with some of the code in question re-worked, so we'll see if that improves it, or not. If it doesn't I might need you to send me your world so I can inspect things closer. For now please test with the new build, if the issue persists repeat the profile; the sampler ( second one ) is the one I needed.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I didn't get the time to fully test it. I didn't time out when I switched to the nether this time. But I DID get a few big lagspikes. I'll get some serious testing tomorrow. But first impression is that you at the very least minimized the problem.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
Turns out that stuff I changed caused way more issues then it helped so I had to roll it back, I think you should send me your server so I can do more in depth testing and work on resolving the issue.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
In the mean time I tried a diffrent route of attacking the issue I saw in the profile, https://dl.dropboxusercontent.com/u/9117298/appliedenergistics2-rv1-beta-27b.jar I tested this with a setup I build specifically designed to try and reproduce the issue your seeing, it seemed to work well for my setup, so heres hoping it works for yours. As a side note, I'm not sure if its compatible with ExtraCells.. I'm not sure if you had that installed or not.. so just a heads up if you crash it might be cause of that.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I'll try the new version tonight. If i does not work, I'll upload the world for you. By the way, I'm not sure if this is related, probably a long shot. https://github.com/aidancbrady/Mekanism/issues/1762 But I've been having a separate issue with FPS drops from mekanism-machine, and I've interpreted it as unrelated. But in the last thread they mentioned they might be causing to many chunk-renderings. I've been using your storage monitors, could it be something with them? I've noticed that during lag they don't seem to update the correct amount of the item in the network. P.S.: I sent them the profiling from the client that you showed me how to do. I think I learned something usefull. :)
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Well, I've tried it out for a while now. It's definately getting better, only a few time-outs not, not at all like before. I'll do some more testing, to see if I get some more info. Right now I'm focusing on the stutter (from mekanism), I got an indication that something called sack overflow could be causing the lag. Another long-shot.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
I got some more time outs on the the moon. (there is a quantum link there, and a small ME-network) Using 27b http://filer.ikabodo.se/sampler_2.nps
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Also, it looks like the CPU is spiking strangely. http://filer.ikabodo.se/capture.jpg Or am I reading it wrong?
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
It looks like the issue is a huge chain of chunk loading, AE2 is basically gone now; better then I was hoping for honestly. So essentially whats happening, is the controller is changing its power state, this causes a neighbor update, this causes a mekanism cable to load and from the looks of of it; it then spirals outward loading your entire base or a least a pretty huge chunk of it; more then likely this then probably causes AE2 to get powered, then the chunks unload; then it loses power... then the cycle restarts. Thats at least what i'm kinda guessing is happening, why the controller is staying loaded, is uncertain. Two things could probably help; 1. Mekanism should probably not load chunks as it builds its network; this will cut it off before it spirals out of control. 2. You could chunkload your base so it stops unloading constantly. You're correct that your CPU is spiking, More than likely the CPU spikes you're seeing are related to the loading/unloading going on.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Woa, that's really specific. I'll go talk to the mekanism guys about this. Thanks for going above and beyond. You are my first Patreon. :) I do have a chunkloaded in the base during testing (didn't before) but apparently its not covering a big enough area. The base is pretty spread out, I'll a few small chunkloaders that covers the smaler parts of the base to see if that helps temporarily. Just to be clear, you mean the ME Controller is changing power-state right? Should it be doing this? Could a powerfluctiation from the powermod (big reactors/mekanism) cause the ME Controller to switch on an off, and in the end causing mekanism to spiral out? In that case, I could try a creative powercell near the ME-controller temporarily.
Comment
AlgorithmX2
Type: Admin
Joined: 12/31/2012
Posted:
Comment:
it should if it runs out of power, as long as its getting power steadily it shouldn't' happen, perhaps the power infrastructure is unloading? Keep in mind that most of my /theory/ is based on just the stack trace, I really have no idea how your base is setup, or what is chunk loaded or not. With that said I think I'm ready to close this issue since the existing issue with AE2 appears to be resolved.
Comment
ikabodo
Type: Member
Joined: 08/27/2014
Posted:
Comment:
Thanks again for all the help, chunkloading has worked so far, although I'm still going to look into the issues causing it to begin with. I chunkloaded before, but I'm thinking the issue was either a MFR-farm or a Mekanism Brine-farm that was a bit off from the main base. I'll look into it. Yeah, closing this was the right the decision. :D
Back to List