Archive for November, 2008

Linux Kernel 2.6.24+summit

November 30th, 2008

I have a field-tested Linux kernel that summit has to use for one of its HLTV machines. I was using it to provide some newer games (L4D) instead of upgrading my FreeBSD machines to include newer linuxolator stuff.

Anyways, I’ve modified it to suit gameservers. I field tested it and the FPS is pretty stable. If anyone wants it, just let me know (yes, I’m serious).

Features:
Realtime Scheduler
Fixed some bugs in the timer routines to not add 3 jiffies

Added some other enhancements
Attempted to reduce usleep() wakeup latency. This should fix the jitter in FPS stats on Source.
Mark the TSC as usable *only* on Intel processors. Use HPET on AMD gear.
Disable some stuff that doesn’t need to be enabled (cpufreq, ACPI processor stuff, etc.)

ToDo:

Add vsyscall support for x86. This will map a shared page into every process running and allow a syscall without the overhead of an interrupt. Should allow low overhead gettimeofday() calls.

Code, Game Stuff, uboost

Hot patching is not reverse engineering

November 30th, 2008

So let’s say I have a binary on my system that runs and does stuff. Mkay? Well, let’s say I notice (from a quick debug session with gdb) that some of its functions aren’t optimized.

So I see a function that is something like engine_memcpy or so, and it looks like a basic memory copy routine. I see it being used a lot for gameservers. I notice it could be improved for speed, so I hot patch a new engine_memcpy function into the engine so it uses SSE2 movnti (a non-temporal store) to avoid polluting the L1 cache.

Reverse engineering is a grey area. Of course, it happens all the time (open source WiFi drivers, for example).

Code, Game Stuff

Userland FPS estimator

November 28th, 2008

This more or less displays scheduler latency with select(). Only 1.6 with a pingboost of 2 uses select() for frame timing. This shows how much FPS you can expect (with jitter) on a machine.

 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>
 
/* Microseconds elapsed from tv1 to tv2 (tv2 is expected to be the later time). */
unsigned long long tv_diff(struct timeval *tv1, struct timeval *tv2) {
        unsigned long long ret;
 
        ret = ((unsigned long long)(tv2->tv_sec - tv1->tv_sec)) * 1000000ULL;
        if (tv2->tv_usec > tv1->tv_usec)
                ret += (tv2->tv_usec - tv1->tv_usec);
        else
                ret -= (tv1->tv_usec - tv2->tv_usec);
        return ret;
}
 
 
int main(int argc, char **argv) {
        struct timeval tv1, tv2, tv3, tv4, wait;
        long loops;     /* length of each busy loop, in microseconds */
 
        if (argc > 1)
                loops = atol(argv[1]);
        else
                loops = 100000;
 
        memset(&tv1, 0, sizeof(tv1)); memset(&tv2, 0, sizeof(tv2));
        memset(&tv3, 0, sizeof(tv3)); memset(&tv4, 0, sizeof(tv4));
 
        printf("Date_sec.msec   select    loop  maxlat    nbmax\n");
        gettimeofday(&tv1, NULL);
        while (1) {
                long long i, max;
                int times;
 
                /* tv1 was taken just before select(), so tv2 - tv1 is how long select() slept */
                gettimeofday(&tv2, NULL);
                tv4 = tv2;
                max = 0; times = 0;
                /* Spin on gettimeofday() and track the largest gap between two reads */
                do {
                        gettimeofday(&tv3, NULL);
                        i = tv_diff(&tv4, &tv3);
                        if (i > max)
                                max = i;
                        if (i > 10000)  /* count gaps longer than 10 ms */
                                times++;
                        tv4 = tv3;
                        i = tv_diff(&tv2, &tv3);
                } while (i < loops);
 
                printf("%lu.%03lu %7llu %7llu %7lld %7d\n",
                        (unsigned long)tv3.tv_sec, (unsigned long)(tv3.tv_usec/1000),
                        tv_diff(&tv1, &tv2), tv_diff(&tv2, &tv3),
                        max, times);
 
                /* Ask for a 1 ms sleep; the "select" column shows what we really got */
                wait.tv_sec = 0;
                wait.tv_usec = 1000;
 
                gettimeofday(&tv1, NULL);
                select(0, NULL, NULL, NULL, &wait);
        }
}

Code, Game Stuff

The truth about VALVe

November 27th, 2008

Valve doesn’t give a shit about Linux code for their servers. They don’t listen to people who want AMD64 binaries. They don’t listen to people who complain about high CPU usage on servers. They don’t listen to people when they tell VALVe that their Linux code is expensive to call.

Game Stuff

SourceTV

November 26th, 2008

It’s my opinion that SourceTV is an utter pile of shit. HLTVs were much simpler.

SourceTV relays require their own dataport, and the game engine will choose another port if it’s in use. Solution? Don’t run SourceTVs.

Another thing that bugs me is the port behavior that Source has. If a port is in use, it just goes to the next port down, instead of behaving like 1.6, where it would hit EADDRINUSE and return 1.

I think Source is shitty, and it’s a DirectX 9 version of 1.6… 1.6 has better behavior serverside, at least.

Game Stuff

Gameserver Ramblings

November 24th, 2008

Here’s a long post documenting everything as technically as I can. I am also rambling.
Most of the information is copyrighted (by me), so if you use any of this stuff at all to sell servers YOU MUST GIVE ME CREDIT.

VALVe games are based off of Quake. Even though Source and future engines may seem to run on their own, they still have their roots in Quake (remember Half-Life itself).

- FPS
There has been much argument about what VALVe’s serverside FPS is. One thing that is known: the higher the number, the lower the latency of nanosleep(), which is what you get when you crank HZ up to 1000. HZ is how many times a second the timer interrupt fires; raising it reduces latency and increases the accuracy of several timers like nanosleep/select/itimers. When the game runs a frame, it updates world objects and does a couple of other things. Higher FPS also *seems* to reduce prediction errors, which you can debug for yourself, since so many frames are being run on the server. Higher server FPS reduces the chance of late or missed client snapshots. A late snapshot sent from the client won’t show on any netgraph 1/2/3. Overall, FPS is more or less a System Call Precision Accuracy Tester. It reflects what HZ is set to on the system, more or less. There are more magic tricks going on with that many interrupts, but more or less that’s what it does.

- OS Caveats
The Linux kernel as of 2.6.18 (just this version) has support (via patch) for vsyscall+gettimeofday. A vsyscall is a shared page mapped into the address space of every running process, and it is updated atomically. It avoids a full context switch into the kernel, which makes calling system calls cheap. This only works with x86 games on an x86 kernel, not with an x86 game on an x86_64 kernel. On a 32-slot pub, I reduced CPU usage by 28% (averaged over 60 minutes with an MRTG+perl script monitoring its pid) with this patch. Older versions of Linux have superior low latency support compared to the newer ones. 2.6.18 and the 2.4 series are probably the best versions to use for gameservers, IMO. I don’t use Linux that much.

Another thing to consider about Linux is that most newer kernels have brain dead code to check whether the TSC skews or not. It’s called about every 10/HZ (IIRC), and can make the FPS code unhappy (which makes CAL Open kids cry). GLIBC is very slow and bloated with junk, so don’t expect magic from userland. If you want to speed up userland for games, try compiling glibc without frame pointers. This will prevent debugging, but it will remove a ton of shit from the stack and at least speed it up.

TSC vs HPET. I get asked this question quite a bit: should you use TSC or HPET for the timecounter? TSC is probably the fastest, but it *can* skew on SMP machines. On Intel CPUs the TSC updates at a fixed rate regardless of power management crap, so if you are using Intel, you can use it. I personally use HPET on my machines. It’s slower to read, but it’s better to use than the PIT/8254.

Does throwing more CPU power at it fix FPS problems? No. That just means you can put more servers on a machine. Another thing to keep in mind is that VALVe isn’t the world’s most talented group of coders. They make mistakes, just like kids do when they complain that a bullet missed someone’s head.

What about 64bit kernel/userland vs 32bit binaries? This is another age-old argument. If binaries are 32bit, then a 64bit kernel/userland doesn’t help them. The only reason you need 64bit anyway is to overcome the 2^32-byte (4 GB) limit of memory.

Filesystem? UFS? EXT/2/3? ZFS? RFS? Hammer? NTFS? Overall, the fastest one out of those is probably ext2. ZFS is a 128bit filesystem, and its maximum limits are so large (18.4 × 10^18) that they will never be encountered. I like ZFS over them all due to its geeky-ness. For more ZFS lingo, if you put 1 exabyte on a ZFS filesystem each year, it would take around 7.9 million years to fill it up. Another way to write it: if you wrote 1,000 files per sec (full DVD movies), it would take approx 284,012,568,000 seconds to fill it up (which is about 9000 years).

Windows? Microsoft’s operating systems provide the very basic interfaces. However, they suffer from interrupt latency, poor design, and little ability to change very basic system parameters (sleep latency, etc.). To be honest, it’s so much of a joke that I have a hard time finding out why a machine is lagging in a game. The user interfaces for 2008 are so terrible it takes me at least 15 minutes to analyze the event logger’s horrible logs for security violations. I feel sorry for people who have to contend with so many problems on this platform.

FreeBSD? FreeBSD has probably one of the best development platforms (stable, release, current). Linux used to follow this method (odd number kernels would be development and even would be production). However, there are a couple of bugs in the linux emulation code. It’s being worked on slowly, but overall FreeBSD is a pretty good platform. The downside is lack of hardware support and general slow release cycles. However the quality of code is really good except for a few minor issues.

- Misc stuff

Kids want the following: low latency, stable framerate, and, the most intrusive of them all, bullet registration. Bullet reg is something of a mystery, sort of like bigfoot. There is no way to prove it exists, and you’re unable to prove it doesn’t exist. So why assume it exists without evidence? Reverse feeble logic? Kids don’t understand that the netcode in most of VALVe’s games estimates things. Just as with MPG in a car, everything is either rounded up or rounded down, and the final value is estimated. So in math terms, you have the same odds as everyone else in terms of missing. Some people are just better at it than others; don’t expect to bowl a 300 game every time.

Main

Shaun Weston from Maxed Servers or whatever

November 24th, 2008

I would avoid doing business with this person… Why?

Here’s a ticket (yes, I’m aware of what I’m doing)

WHY THE FUCK YOU CLOSE TICKET BECAUSE YOU NO IAM RIGHT IF I NEW WHERE U LIVED U PISSED ME OFF SO FUCKING MUCH I WOULD SEROUISLY FUKIN KILL YOU ARE IF MY UNCLE AKA PROOF WAS STILL ALIVE I GET HIM IAM SURE U FUKIN HERD OF HIM DETROIT’Z MAYOR ! HE WOULD FUKIN GET YOU YOUR PISSING ME OFF SO MUCH I WOULD FUKIN CHOKE YOU TO DEATH

I don’t like him. No amount of apologies can make up for the constant harassment, the death threats, the insults towards me/alex/jon/brian/cody/etc.

Main

Before you ask. It’s real.

November 23rd, 2008

Ok, this is retarded.

November 21st, 2008

Welcome to HyperionServers.com! We have now been in industry for over a year and decided to make some much awaited changes. We are very excited to finally be releasing our industry first 12,000FPS servers. After a few hundred kernel rebuilds and a few hundred failed programs, we’ve finally done it. We are now dedicated to show you that we are the best. No-one can even come close to our performance and prices. I know what you are thinking, “I thought the max anyone could get to was 2,000”. That myth has now been busted wide open. Contact support and get your hour trial today. You will notice the difference. Our 12,000FPS servers are hosted on your own core. We have 100% dedicated Internap bandwidth. That means no-one will be fighting over box resources, your ping will always be low, and your competition can’t complain about lag. Our company owned Dual-Quad-Core Xeon servers are some of the best-equipped boxes in the business. The box you will be put on includes at a minimum: Two 500gb hard drives in raid-1, 8gb of DDR2-667 registered memory, redundant power, and our modified Linux operating system. Currently, we only have a Chicago location. We hope to change this very soon if people realize how amazing our server’s really are. Spread the word of the Hyperion Servers revolution!

.. ready? ………… HAHAHAHAHAHAHAHHhahahahahHAHAHAhAhAhAHHAHhausihdiah7e8yq

I was first. They were last.

High FPS on Test Machine

Drama

High CPU Priority? BUSTED.

November 19th, 2008

High CPU priority does nothing for gameservers. It doesn’t affect how accurate anything is, and (on a test Linux machine) it also doesn’t seem to affect scheduler latency at all.
People who sell this stuff don’t research what it’s supposed to do. They think it’s going to get served CPU time first and make FPS stable, which is completely ridiculous.

[root@unknown tmp]# ./wakeup_latency 
Testing usleep wakeup
Wakeup time: 11.0 uS
Wakeup time: 7.0 uS
Wakeup time: 8.0 uS
Wakeup time: 7.0 uS
Wakeup time: 9.0 uS
Wakeup time: 8.0 uS
Wakeup time: 10.0 uS
Wakeup time: 9.0 uS
Wakeup time: 9.0 uS
Wakeup time: 9.0 uS
Wakeup time: 9.0 uS
Wakeup time: 10.0 uS
Wakeup time: 10.0 uS
Wakeup time: 10.0 uS
Wakeup time: 10.0 uS
Wakeup time: 10.0 uS
Wakeup time: 8.0 uS
Wakeup time: 8.0 uS

With nice of -20

[root@unknown tmp]# nice -20 ./wakeup_latency
Testing usleep wakeup
Wakeup time: 17.0 uS
Wakeup time: 7.0 uS
Wakeup time: 8.0 uS
Wakeup time: 8.0 uS
Wakeup time: 8.0 uS
Wakeup time: 9.0 uS
Wakeup time: 8.0 uS
Wakeup time: 8.0 uS
Wakeup time: 10.0 uS
Wakeup time: 8.0 uS

Main