Performance Issue: Lobby Lag

Sapient
Inactive Developer
Posts: 4453
Joined: November 26th, 2005, 7:41 am

Performance Issue: Lobby Lag

Post by Sapient »

If anyone is interested in helping with the lobby lag issue in MP, I am posting here to draw attention to the issue and outline some ways you can help.

1) Benchmarking Lobby Lag
My testing method so far has been very unprofessional: I log into a busy lobby, click the quick replay checkbox repeatedly, and count the seconds during which it doesn't respond. What you are measuring is the lag in GUI response time, that is, the time between clicking and the checkmark appearing. The lag only happens when a change to the game/user list is received, so click in quick repetition until the lag occurs, then measure it.

When you report, I am not interested in the average lag so much as the peak (maximum). Also, please say what kind of connection you have (dial-up/high-speed) and roughly how many games/users were in the lobby when you measured.
If you can provide your OS (Windows/Linux), CPU, and memory, and any special flags you compiled with, that would also help.
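
For anyone who would rather not count seconds by hand, here is a minimal sketch of how the peak could be recorded in code. PeakLagTracker and its method names are hypothetical, not part of Wesnoth, and where to call them in the lobby code is left open.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper (not part of Wesnoth): records the worst-case delay
// between a click and the moment the GUI visibly responds.
class PeakLagTracker {
public:
    using clock = std::chrono::steady_clock;

    // Call when the input event (e.g. the checkbox click) is received.
    void on_click(clock::time_point now = clock::now()) {
        click_time_ = now;
        waiting_ = true;
    }

    // Call right after the frame that shows the result has been drawn.
    void on_response(clock::time_point now = clock::now()) {
        if (!waiting_) return;  // no click pending, nothing to measure
        const auto lag =
            std::chrono::duration_cast<std::chrono::milliseconds>(now - click_time_);
        peak_ms_ = std::max(peak_ms_, static_cast<long long>(lag.count()));
        waiting_ = false;
    }

    // Largest click-to-response delay seen so far, in milliseconds.
    long long peak_ms() const { return peak_ms_; }

private:
    clock::time_point click_time_{};
    bool waiting_ = false;
    long long peak_ms_ = 0;
};
```

Only the maximum is kept, which matches the request above for peak rather than average lag.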

2) Isolating Component Sources of Lag

From my own tests I estimate 3 seconds of row drawing lag and 1 second of network lag, for a combined maximum of 4 seconds in response time. One thing you can try to reduce row drawing lag is to remove the files named selection-border-left.png and selection2-border-left.png from your Wesnoth/images/misc directory. (Also, in trunk you can disable minimap drawing, but the trunk lobby is never busy.) If you do this, please report it with a before/after benchmark.

If you are a coder you can also hack the code to help isolate component sources of lag and provide a more accurate way for us to benchmark it.

3) Coding Help

A partial fix isn't really going to work; all the cumulative factors need to be alleviated. I estimate that 25% of my peak lag is network, and it stacks with the rest because it occurs at the same time. The entire menu is recalculated and redrawn whenever changes occur, both for the user list and for the maps. Drawing all those usernames is not as fast as you might think (remember how long it takes to draw the chat log). So there is going to have to be (1) a smarter menu that only redraws changed rows, (2) a smarter minimap calculator, maybe one that caches minimap images, and on top of that probably (3) a network receive thread.

I don't know much about the network code, the way Wesnoth handles threading, or the way it handles image caching, so the only one I would probably work on is the smarter menu.
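
To make point (2) concrete, here is a rough sketch of what a minimap cache could look like. It does not reflect how Wesnoth actually handles image caching; Minimap, MinimapCache and render() are made-up names, and the key is assumed to be the raw map data string.

```cpp
#include <map>
#include <string>

// Stand-in for a rendered minimap; a real client would hold an image surface.
struct Minimap {
    std::string pixels;
};

// Hypothetical cache keyed by the raw map data string (an assumption).
class MinimapCache {
public:
    // Returns the cached minimap for map_data, rendering it only on a miss.
    const Minimap& get(const std::string& map_data) {
        auto it = cache_.find(map_data);
        if (it == cache_.end()) {
            ++render_count_;
            it = cache_.emplace(map_data, render(map_data)).first;
        }
        return it->second;
    }

    // Number of times the expensive render step actually ran.
    int render_count() const { return render_count_; }

private:
    // Placeholder for the expensive step (terrain lookup, scaling, blitting).
    static Minimap render(const std::string& map_data) {
        return Minimap{"minimap:" + map_data};
    }

    std::map<std::string, Minimap> cache_;
    int render_count_ = 0;
};
```

A cache like this grows without bound, so a real version would need some eviction rule, for example dropping entries for games that have left the list.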

* * * *
If you would like to help, then please post here. Or if you have performed a benchmark as described, please post the information here. Any unhelpful responses such as whining or complaints will simply be deleted. Thanks.
Last edited by Sapient on July 4th, 2007, 2:51 am, edited 2 times in total.
http://www.wesnoth.org/wiki/User:Sapient... "Looks like your skills saved us again. Uh, well at least, they saved Soarin's apple pie."
Ardonik
Posts: 70
Joined: May 8th, 2005, 5:31 am
Location: Western United States

Re: Performance Issue: Lobby Lag

Post by Ardonik »

Sapient wrote:1) Benchmarking Lobby Lag
The way I've been testing it is very unprofessional. I was just logging into a busy lobby and clicking the quick replay checkbox repeatedly and then counting the seconds where it didn't respond. What you are measuring is the lag in GUI response time. That is, the time between clicking and the checkmark appearing. The lag only happens when there is a change received in the game/user list. So click in quick repetition until the lag occurs, then measure it.

When you report, I am not interested in the average lag, moreso the peak (maximum). Also, please tell what kind of connection you have (dial-up/high-speed), and about how many games/users were in the lobby at the time you measured.
If you can provide your OS (Windows/Linux), CPU, and memory, and any special flags you compiled with, that would also help.

I used Wesnoth 1.3 (i.e., the Subversion version), r15106. To join the multiplayer lobby for Wesnoth 1.2 (as opposed to the empty devsrv lobby), I hard-coded game_config::version to "1.2" in game_config.cpp and joined server.wesnoth.org:14999.

The peak lag time between checking the "Quick Replay" box and unchecking it with /usr/local/share/wesnoth/images/misc/{selection-border-left.png,selection2-border-left.png} present was around 4 seconds. Without the presence of these two files, the peak time was around 2 or 3 seconds (more like 3 than 2), but the difference wasn't really perceptible in my opinion.

The lobby lagged much more noticeably when people joined it, left it, created a game, or joined a game. At other times, the lobby was fairly responsive. There had to be nearly two dozen games and maybe 50 or so users on the server while I was experimenting.

My operating system is Mandriva Linux 2006.0, and I am on a residential DSL connection (1.5 Mb/s down, 384 Kb/s up). My CPU is a 2.4 GHz Pentium 4 with 512 MB of physical memory, and I didn't compile with any special flags. (In particular, I'm running an unoptimized debug build of last night's Wesnoth SVN.)

Sapient wrote: 2) Isolating Component Sources of Lag

From my own tests I estimate 3 seconds of row drawing lag and 1 second of network lag, combined for max 4 seconds in response time. One thing you can try to reduce row drawing lag is remove the files named selection-border-left.png and selection2-border-left.png from your Wesnoth/images/misc directory. (Also, in trunk you can disable minimap drawing, but the trunk lobby is never busy).

If you are a coder you can also hack the code to help isolate component sources of lag and provide a more accurate way for us to benchmark it.

Disabling the lobby minimap display (again, with a hacked Wesnoth 1.3 connected to the 1.2 lobby, and without those two PNG files) didn't really have any perceptible effect, which surprised me.

Sapient wrote: 3) Coding Help

A partial fix isn't really going to work. All the cumulative factors need to be alleviated. I estimate 25% of my peak-lag is network. It stacks since it occurs at the same time. the entire menu is re-calculated and redrawn when changes occur. both for userlist and for maps. drawing all those usernames is not as fast as you might think (remember how long it takes to draw the chat log). so there is going to have to be (1) a smarter menu, That only redraws changed rows, And (2) a smarter minimap calculator That caches minimap images maybe. And on top of that (3) a network receive thread probably.

I don't know much about the network code, Or the way wesnoth handles threading, Or the way it handles image caching. So the only one I would probably work on is the smarter menu.

That looks like the right way to fix the problem, but it's easy for me to say that without knowing the code that well.
Sapient
Inactive Developer
Posts: 4453
Joined: November 26th, 2005, 7:41 am

Post by Sapient »

I have posted a patch for MP Lobby Lag Profiling to patches.wesnoth.org

It seems that network is not the problem after all (by my test results). However, I'd be interested to see what other people's results show.
Ardonik
Posts: 70
Joined: May 8th, 2005, 5:31 am
Location: Western United States

Post by Ardonik »

Sapient wrote:I have posted a patch for MP Lobby Lag Profiling to patches.wesnoth.org

It seems that network is not the problem afterall (by my test results). However, I'd be interested to see what other people's results show.

https://gna.org/patch/index.php?666 (mp_lobbylag.patch)

It's easier to paste results if you print the report string to stderr.

// Report each stage's share of total_time as a percentage.
report << "MP Lobby Lag (" << total_time
       << " ms): Network "
       << (100 * (ui.network_rcv_lag - ui.lobby_timer_start) / total_time)
       << "%, Userlist "
       << (100 * (ui.userlist_lag - ui.network_rcv_lag) / total_time)
       << "%, Gamelist "
       << (100 * (ui.gamelist_lag - ui.userlist_lag) / total_time)
       << "%, Redraw "
       << (100 * (ui.redraw_lag - ui.gamelist_lag) / total_time) << "%";
ui.add_chat_message("report",0, report.str());
std::cerr << report.str() << "\n";

Apply the patch and just join the multiplayer lobby. After a few initial anomalies, these were my typical results:

MP Lobby Lag (852 ms): Network 0%, Userlist 28%, Gamelist 42%, Redraw 29%
MP Lobby Lag (909 ms): Network 0%, Userlist 26%, Gamelist 39%, Redraw 33%
MP Lobby Lag (945 ms): Network 0%, Userlist 27%, Gamelist 42%, Redraw 29%
MP Lobby Lag (946 ms): Network 0%, Userlist 25%, Gamelist 43%, Redraw 31%

Gamelist was always the largest and Network was always 0. Regardless of the actual peak lag time, the percentages were pretty consistent.
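
In case it helps with reading these reports, here is a standalone sketch of the same arithmetic. It assumes the timestamps are cumulative integers in milliseconds and that total_time is the last timestamp minus the first; the struct, the function names, and the sample timestamps in the usage note are all invented for illustration, and the zero-total guard is my own addition rather than something from the patch.

```cpp
#include <sstream>
#include <string>

// Cumulative timestamps in ms, mirroring the fields used in the snippet above.
// The struct and function names are illustrative only.
struct LobbyTimes {
    long start, network, userlist, gamelist, redraw;
};

// Share of the total spent between two timestamps, as a truncated integer
// percentage. Returns 0 when total is not positive to avoid dividing by zero.
long stage_percent(long from, long to, long total) {
    return total > 0 ? 100 * (to - from) / total : 0;
}

// Builds a report line in the same shape as the output quoted above.
std::string lag_report(const LobbyTimes& t) {
    const long total = t.redraw - t.start;
    std::ostringstream report;
    report << "MP Lobby Lag (" << total << " ms): Network "
           << stage_percent(t.start, t.network, total) << "%, Userlist "
           << stage_percent(t.network, t.userlist, total) << "%, Gamelist "
           << stage_percent(t.userlist, t.gamelist, total) << "%, Redraw "
           << stage_percent(t.gamelist, t.redraw, total) << "%";
    return report.str();
}
```

With invented timestamps of 0, 0, 240, 600 and 852 ms this produces the same line as the first result above. If the patch also uses integer division, that would explain why the percentages in each line add up to 98 or 99 rather than 100: every stage is rounded down.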
WildPenguin
Posts: 161
Joined: September 6th, 2005, 10:41 pm
Location: Australia

Post by WildPenguin »

Maximum Lag: MP Lobby Lag (1460 ms): Network 0%, Userlist 12%, Gamelist 29%, Redraw 57%
Users: Just under 50.
Games: 12.
Connection: Residential ADSL 512k/128k.
OS: Debian GNU/Linux.
CPU: Pentium III 800MHz.
Memory: 640MB.
Special flags: none.
Version: wesnoth-svn r15134 (1.2 server)

Redraw was always significantly higher and network was always 0.

I'll see if I can get around to testing again when the server is more active.
Sapient
Inactive Developer
Posts: 4453
Joined: November 26th, 2005, 7:41 am

Post by Sapient »

WP - so just to clarify, you were using 1.3 (on the 1.2 server)? Or were you using the mordante_terrain branch?
WildPenguin
Posts: 161
Joined: September 6th, 2005, 10:41 pm
Location: Australia

Post by WildPenguin »

Sapient wrote:WP - so just to clarify, you were using 1.3 (on the 1.2 server)? Or were you using the mordante_terrain branch?

Sorry, forgot to specify before - I was using 1.3.
Sapient
Inactive Developer
Posts: 4453
Joined: November 26th, 2005, 7:41 am

Post by Sapient »

I have an idea that would eliminate a lot of redrawing. Basically, when the menu is re-created, pass it a list of the rows it has already drawn and that are still visible; then, wherever that matches what is about to be drawn, skip the drawing. However, this wouldn't affect the gamelist and userlist build time, which is the larger component according to the analysis from patch 666. I have another idea for that: insert a bunch of checks for user input in the userlist and gamelist building routines; if any input is found, abort the entire build, process the user input, and then try to process the network data later.
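
The second idea could be sketched roughly like this. It is only an illustration under assumptions: build_rows and the input_pending hook are hypothetical names, and a real client would check its own event queue instead of a callback.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Builds display rows from entries, polling for pending input before each row.
// Returns true if the build finished, false if it was abandoned so the caller
// can handle the input first and retry later.
// input_pending is a hypothetical hook standing in for an event-queue check.
bool build_rows(const std::vector<std::string>& entries,
                const std::function<bool()>& input_pending,
                std::vector<std::string>& rows_out) {
    std::vector<std::string> rows;
    rows.reserve(entries.size());
    for (const std::string& entry : entries) {
        if (input_pending()) {
            return false;  // rows_out is untouched, so the old list stays usable
        }
        rows.push_back("[row] " + entry);  // placeholder for the costly formatting
    }
    rows_out = std::move(rows);  // publish the new list only when complete
    return true;
}
```

Building into a temporary list and swapping it in at the end means an aborted build never leaves a half-updated list on screen. The cost is that the pending network update has to be kept somewhere until the retry.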
Eleazar
Retired Terrain Art Director
Posts: 2481
Joined: July 16th, 2004, 1:47 am
Location: US Midwest

Post by Eleazar »

I don't do much MP, but i spent some time playing MP with 1.2.1 recently.
The lag made it hard to use.

A simple thing that would help is to reverse the sort order. I assume most people go to the lobby to join games. But the games which are accepting players are at the bottom of the list, which can require painful scrolling after each refresh. Putting the games that haven't started at the top should hide some of the unresponsiveness, but by all means a smarter redraw would be great too.
Feel free to PM me if you start a new terrain oriented thread. It's easy for me to miss them among all the other art threads.
Sapient
Inactive Developer
Posts: 4453
Joined: November 26th, 2005, 7:41 am

Post by Sapient »

update:
The gamelist_diff is being merged directly into the gamelist config in multiplay_ui.cpp. It will take extensive changes, but this information should be propagated further down; ideally it should reach all the way down to a smarter menu redraw logic.
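
As a rough picture of what the menu end of this could look like, here is a sketch that works out which rows need redrawing by comparing what is on screen with what is about to be shown. changed_rows is a hypothetical name and rows are assumed to be plain strings; if the diff reached the menu directly, the changed indices could come from the diff instead of a comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns the indices of rows whose text differs between what is already on
// screen (drawn) and what is about to be shown (next). Rows present in only
// one of the two lists count as changed.
std::vector<std::size_t> changed_rows(const std::vector<std::string>& drawn,
                                      const std::vector<std::string>& next) {
    std::vector<std::size_t> dirty;
    const std::size_t n = std::max(drawn.size(), next.size());
    for (std::size_t i = 0; i < n; ++i) {
        const bool same =
            i < drawn.size() && i < next.size() && drawn[i] == next[i];
        if (!same) dirty.push_back(i);
    }
    return dirty;
}
```

Comparing by position has an obvious weakness: a game inserted in the middle of the list shifts every row below it, so all of those rows are reported as changed. Matching rows by a game or user identifier would avoid that.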

Eleazar - I just changed that behavior in trunk. Menus can stay scrolled to the bottom now. (Also, try clicking on the menu and pressing the 'End' key to go to the bottom.)
SkeletonCrew
Inactive Developer
Posts: 787
Joined: March 31st, 2006, 6:55 am

Post by SkeletonCrew »

update:
maxy has posted 2 patches which seem to solve the problem. These are in the 1.2 branch and will be released in the upcoming 1.2.2.

If people want to test please do so.