Archive for the ‘Game Stuff’ Category

Hot patching is not reverse engineering

November 30th, 2008

So let’s say I have a binary on my system that runs and does stuff. Mkay? Well, let’s say I notice (from a quick debug session with gdb) that some of its functions aren’t optimized.

So I see a function that is something like engine_memcpy or so, and it looks like a basic memory copy routine. I see it being used a lot for gameservers. I notice it could be improved for speed, so I hot patch a new engine_memcpy function into the engine that uses SSE2 movnti (non-temporal stores) so the copy doesn’t pollute the L1 cache.

Reverse engineering is a grey area. Of course, it happens all the time (WiFi drivers for open source).

Code, Game Stuff

Userland FPS estimator

November 28th, 2008

This more or less displays scheduler latency with select(). Only 1.6 with a pingboost of 2 uses select() for frame timing. This shows how much FPS you can expect (with jitter) on a machine.

 
#include <string.h>
#include <stdio.h>
#include <stdlib.h>     /* atol() */
#include <time.h>
#include <sys/select.h> /* select() */
#include <sys/time.h>   /* gettimeofday() */
#include <sys/types.h>
#include <unistd.h>
 
/* Microseconds elapsed from *tv1 to *tv2 (tv2 must not be earlier than tv1). */
unsigned long long tv_diff(struct timeval *tv1, struct timeval *tv2) {
        unsigned long long ret;
 
        ret = ((unsigned long long)(tv2->tv_sec - tv1->tv_sec)) * 1000000ULL;
        if (tv2->tv_usec > tv1->tv_usec)
                ret += (tv2->tv_usec - tv1->tv_usec);
        else
                ret -= (tv1->tv_usec - tv2->tv_usec);
        return ret;
}
 
 
int main(int argc, char **argv) {
        struct timeval tv1, tv2, tv3, tv4, wait;
        int loops;
 
        if (argc > 1)
                loops = atol(argv[1]);
        else
                loops = 100000;
 
        memset(&tv1, 0, sizeof(tv1)); memset(&tv2, 0, sizeof(tv2));
        memset(&tv3, 0, sizeof(tv3)); memset(&tv4, 0, sizeof(tv4));
 
        printf("Date_sec.msec   select    loop  maxlat    nbmax\n");
        gettimeofday(&tv1, NULL);
        while (1) {
                int i, max, times;
 
                gettimeofday(&tv2, NULL);
                tv4 = tv2;
                max = 0; times = 0;
                do {
                        gettimeofday(&tv3, NULL);
                        i = tv_diff(&tv4, &tv3);
                        if (i > max)
                                max = i;
                        if (i > 10000)
                                times++;
                        tv4 = tv3;
                        i = tv_diff(&tv2, &tv3);
                } while (i < loops);
 
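                /* Casts keep the format portable: time_t and suseconds_t vary by platform. */
                printf("%lu.%03lu %7llu %7llu %7d %7d\n",
                        (unsigned long)tv3.tv_sec, (unsigned long)(tv3.tv_usec / 1000),
                        tv_diff(&tv1, &tv2), tv_diff(&tv2, &tv3),
                        max, times);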
 
                wait.tv_sec = 0;
                wait.tv_usec = 1000;
 
                gettimeofday(&tv1, NULL);
                select(0, NULL, NULL, NULL, &wait);
        }
}

Code, Game Stuff

The truth about VALVe

November 27th, 2008

Valve doesn’t give a shit about Linux code for their servers. They don’t listen to people who want AMD64 binaries. They don’t listen to people who complain about high CPU usage on servers. They don’t listen to people when they tell VALVe that their Linux code is expensive to call.

Game Stuff

SourceTV

November 26th, 2008

It’s my opinion that sourcetv is an utter pile of shit. HLTV’s were much simpler.

SourceTV relays require their own dataport, and the game engine will choose another port if it’s in use. Solution? Don’t run source tv’s.

Another thing that bugs me is the port behavior that source has. If a port is in use, it just goes to the next port down, instead of behaving like 1.6, where the bind would fail with EADDRINUSE and the server would return 1.

I think source is shitty, and it’s a directx9 version of 1.6… 1.6 has better behavior serverside, at least.

Game Stuff

Before you ask. It’s real.

November 23rd, 2008

Summit’s upcoming packages

November 13th, 2008

Over the past couple of months, I’ve decided to take a different approach to optimizing gameservers. First and foremost, I had to peek at their code. Considering VALVe’s code has its roots in Quake code, it was pretty simple to reverse engineer the game to see what functions are doing what. Using olly/ida with visual studio, I was able to reconstruct their sleep code for “Linux” and rewrite it to be more modern. I’ve created a module that is loaded during the game’s init in main() to override those functions and replace them with what I deem optimized.

- memcpy now uses SSE2 registers to save L1 cache.
- time is now serviced by clock_gettime()
Adaptive Sleeping code:
- usleep() is deprecated. We use clock_nanosleep all the way now.
- Adaptive TSC based sleep timers that can be switched on and off. This uses the processor’s TSC register to get ultra accurate times, and we can schedule a wakeup exactly, with about 200ns of error.
- A shared mmap’d file, /tmp/clock.mmap, that is read every few seconds to get time from the OS. This reduces expensive context switching into the kernel.

About 25 users tested it so far, and everyone LIKES it. There were 2 users who said they couldn’t tell a difference, but I think they said that because I was in the initial beta phases.

Overall this module FIXES busted code. The game’s timing is probably 1000x finer-grained now (using tv_nsec instead of tv_usec) and it gives you an idea of what really goes on.

Game Stuff

FPS

November 11th, 2008

FPS measures how accurate a couple of syscalls are.

It all depends on the hardware, the OS, and the kernel.

Most server processors don’t have power management stuff. Desktop CPUs do, and therefore have too much jitter (SMI interrupts, C3 halt states) to host high precision applications that require ultra-accurate nanosleep.

For example, here’s a measurement of usleep wakeup latency on a Linux machine:

0.000018
0.000009
0.000010
0.000015
0.000011
0.000011
0.000012
0.000010
0.000009

That’s a usleep(1000000); the results are never consistent at all. Anyone who says FPS doesn’t do this is full of shit.

Game Stuff

sv_fps on CoD4

November 16th, 2007

Running servers with sv_fps 30 is stupid, because sv_fps is used for physics, etc. Also, there are timing issues, with events no longer happening on exact frame boundaries but a frame too late.

Here’s my post on the CAL Forums.

You can’t fix a majority of the problems with scripting, you’d need engine source to properly fix it, instead of using bandaids (scripting)

When you raise sv_fps to 30, quite a few things happen, and there are more downsides to raising it than upsides. There are also more issues with it being anything other than 20; I just post the significant ones.

- Physics get wacky. Some/most/all physics that are hard coded to sv_fps 20 wake up too early/too late. They can be fixed, but they won’t be as accurate as they would be with it being 20, since 1000ms doesn’t divide evenly by 30. Timers are the most common thing to get broken. Any animations that are controlled server side can also be broken.

- G_RunFrame() is called at 33.3ms intervals on the server. Sometimes, it will get called a frame or 2 late, or 2 early, because of rounding errors.

- Prediction gets fubared, since the built-in engine lag of 50ms is reduced to 33ms, which means there isn’t as much data to predict/interpolate for. High pingers notice this more than low pingers (you can even debug prediction errors to see for yourself)

- Antilag is affected, but nobody uses it because it’s broken. I think IW used/uses a point release from ID with busted antilag.

I’ve always told people to stick with using sv_fps of 20, because things are hard coded around it, and fixing them gets messy.

Game Stuff

CS: Source ktrace output

August 14th, 2007
 
[root@bravo /tmp]# ktrace -p 86175 && sleep 10 && ktrace -C

Ktrace is a very good utility :)

 
[root@bravo /tmp]# linux_kdump | grep linux_nanosleep | wc -l
   19815
[root@bravo /tmp]# linux_kdump | grep gettimeofday | wc -l
  109332

gettimeofday() is used on source to measure the server FPS.
nanosleep() is used, for sleeping and waking up for a frame.

So, this means a couple of things. gettimeofday() is archaic (POSIX.1-2008 marks it obsolescent) and only has microsecond resolution; srcds should use clock_gettime() instead. Don’t trust rcon stats for FPS, because in reality it doesn’t mean anything. As long as the tickrate stays at the level the server says it is, then everything is fine. In the case of CS:Source, the tickrate and server FPS are completely separate, unlike HL 1.6.

#define gettimeofday(tv, tz) clock_gettime(CLOCK_REALTIME, tv)  /* idea only: tv would have to become a struct timespec */

Still reversing the game to see what high fps really does.

Code, Game Stuff