Serialization Round Four

Serialization is the conversion of an object into a magical stream of data that can later be deserialized back into an object.  At first glance you could probably be convinced that this is a totally unnecessary thing for objects to be able to do.  For most simple projects it never even comes up, so you’d be forgiven for figuring it’s useless.  For QubeKwest, however, serialization isn’t just needed, it’s needed a lot.

Consider for a moment all the data in QubeKwest.  Everything from the world you are walking around in to chat messages between players is just data at the end of the day.  If the objects in the game can be easily converted to data and then can be easily restored, you suddenly have lots of fun new options for how they can be used.  Pure data versions can be shuttled down a network wire or stored away in files on the disk.  Later when you need the objects again they can be easily loaded from the disk or reconstructed by the client on the other end of the network.

There are loads of different ways to serialize data.  Some are crazy flexible, some are practically automatic, some involve elaborate frameworks, and others are hard coded.  Some produce serialized data that is large and fluffy, while others produce data that is small and concise.  Due to the vastness of the worlds I’m trying to build with QubeKwest, it makes sense to make sure my serializer produces as little data as possible.  That loses me some flexibility, but it means I can store larger worlds in less space and use less bandwidth when you’re playing.

I set out to make my own serialization patterns and to say it’s been an evolutionary process isn’t giving it proper credit.  My first idea involved saving things into 32-bit integers.  I built a collection of functions for aligning data into the integers and started making the basic patterns of how my serialization would work.  Almost immediately I realized that serializing things into integers was a total pain in the butt, and really not worth the trouble.  With that, version 1 died a horrible death.

Later, while learning the finer points of NIO and LWJGL, I realized that both of those make rather heavy use of ByteBuffers.  My integer pattern could be adapted fairly easily to use bytes and that allowed me to throw away all the issues with aligning data in integers and all the masking and shifting that involves.  Even more conveniently, ByteBuffers already have methods for storing and retrieving every type of data I could possibly need.  They even let you add other ByteBuffers to them so objects that have other serializable objects in them are easier to deal with.  This was version 2 and it died when I tried to deserialize things without any direct knowledge of what I was deserializing.  
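
As a taste of why ByteBuffers are so convenient for this, here’s a minimal round trip for a made-up block record.  The field layout (three coordinates and a type id) is invented purely for illustration and isn’t QubeKwest’s actual format.

```java
import java.nio.ByteBuffer;

public class ByteBufferDemo {
    // Pack a hypothetical block position and type into a buffer.
    public static ByteBuffer serialize(int x, int y, int z, short type) {
        ByteBuffer buf = ByteBuffer.allocate(3 * Integer.BYTES + Short.BYTES);
        buf.putInt(x).putInt(y).putInt(z).putShort(type);
        buf.flip(); // switch from writing mode to reading mode
        return buf;
    }

    // Read the fields back out in the same order they were written.
    public static int[] deserialize(ByteBuffer buf) {
        return new int[] { buf.getInt(), buf.getInt(), buf.getInt(), buf.getShort() };
    }
}
```

The key discipline ByteBuffers impose is that reads must mirror writes exactly, in order and in type, which is precisely why some kind of header becomes necessary once you no longer know what you’re deserializing.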

The next version added the idea of a header to the serialized data.  I could use that to identify what type of data I was looking at while it was still in its serialized form.  The header data itself was only defined by a collection of constants and the way several helper methods dealt with it.  This wasn’t a great solution, but it lined up with the mess I’d been crafting.  This version also started to add the idea of variable-sized data objects.  I freely admit that it was wishful thinking to pretend that everything could be a fixed size, but I knew that variable-sized things were going to add a lot of complexity.

Before long, I realized that I would probably need multiple versions of a particular type of object over time.  I know I saw this coming, but for some reason I didn’t include a version number in the header data I was using.  The problem is that if I couldn’t identify which version of an object I’d serialized, I wouldn’t be able to load it properly.  With that, version 3 also passed on.

The evolution continued when I realized I didn’t just want versions, but also that I’d created an honest to goodness mess of things for serialization.  My interface had static methods on it to avoid creating dummy objects just for the sake of asking them questions about serialization.  The problem with that is that the static methods lived on the interface itself, and while an implementing class could declare its own versions, nothing enforced that.  Every time I implemented a new object, I had to remember to create those too.  As mentioned before, my header data was a twisted jumble of constants and helper methods.  To fix that I created actual serialization header classes.  That meant a bunch of my helper methods now had a proper place to live, and all the constants that told where things go could simply be removed.
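
For illustration, a header class along these lines might look like the following.  The field sizes and layout here (a two-byte type id and a one-byte version) are my guesses based on the description, not the real QubeKwest header.

```java
import java.nio.ByteBuffer;

// A self-contained header object replaces the loose constants and helper methods.
public class SerializationHeader {
    private final short typeId; // which class was serialized
    private final byte version; // which revision of that class's format

    public SerializationHeader(short typeId, byte version) {
        this.typeId = typeId;
        this.version = version;
    }

    public short getTypeId() { return typeId; }
    public byte getVersion() { return version; }

    // The header knows how to write itself ahead of the payload...
    public void writeTo(ByteBuffer buf) {
        buf.putShort(typeId);
        buf.put(version);
    }

    // ...and how to read itself back, so callers can dispatch on type and version.
    public static SerializationHeader readFrom(ByteBuffer buf) {
        return new SerializationHeader(buf.getShort(), buf.get());
    }
}
```

With this in place, deserialization can peek at the header first and pick the right loader for both the type and the version it finds.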

Attempt number 4 was born as I cleaned up all of the serialization patterns that were broken and sad.  It was a total refactor.  To make the use of the serialization more logical in other packages, I also pulled it into its own package.  All these changes were just the tip of a pretty serious iceberg, because after I started the refactor, all of the objects I had already written serialization functions for had to be revised to follow the new patterns I’d just established.  Now, two days later, I think I’ve finally finished, and I’m actually really happy with how it all works.  Hopefully this version will be the last one, but if I’m being honest, I sort of doubt it.

World Features

As I progress on the data structures and the tests for them, I keep expanding the list of things the engine will need to be able to handle and refining how those things will probably happen.  This results in a fair bit of thinking and refactoring as I go.  The most recent development is work around the concept of a Feature.  Features represent a way I came up with to simplify how generators for the world work.

At its core, the world is randomly generated with noise maps that are all combined to make decisions about how the world looks at a particular spot.  These maps include things like coarse height (general massive scale waves in the landscape like mountain ranges), fine height (smaller scale variances that help to define rolling plains or hills or valleys), variance height (turbulence in the landscape), moisture (how wet or dry an area is), temperature (how hot or cold an area is), age (how old or young an area is), vitality (how vitally alive or dead and crumbling an area is), and possibly others I haven’t defined in my head yet.
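
To make the combination idea concrete, here’s a toy sketch of sampling several maps and folding them into one decision per location.  The noise functions are trivial stand-ins (real generators would sample something like Perlin or simplex noise) and the thresholds and biome names are invented, so treat this as shape, not substance.

```java
// Toy illustration of combining noise maps into a terrain decision.
public class BiomePicker {
    // Stand-in "noise maps"; a real generator would use proper noise functions.
    static double coarseHeight(int x, int z) { return Math.sin(x * 0.01) * 64; }
    static double fineHeight(int x, int z)   { return Math.sin(x * 0.1) * 8; }
    static double moisture(int x, int z)     { return (Math.sin(z * 0.05) + 1) / 2; }
    static double temperature(int x, int z)  { return (Math.cos(x * 0.03) + 1) / 2; }

    // Every map gets sampled at the same spot and the results are combined.
    public static String biomeAt(int x, int z) {
        double height = coarseHeight(x, z) + fineHeight(x, z);
        if (height > 48) return "mountain";
        if (moisture(x, z) < 0.2 && temperature(x, z) > 0.7) return "desert";
        if (moisture(x, z) > 0.6) return "swamp";
        return "plains";
    }
}
```

The appeal of the pattern is that each map stays simple on its own; all the interesting variety comes from how they overlap.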

This pattern works very well for landscapes and biomes, but makes for rather complicated generators for things that aren’t part of a natural landscape.  That in turn brought me to the idea of a Feature.  In a nutshell, a Feature is something like a castle, mine, dungeon, or village.  Something large and non-landscape based that exists in the world that some piece of code will have to generate.

Imagine with me for a moment that the world you are walking around in is being created just a little ways out ahead of you as you walk.  By extension, if you are walking through totally uncharted areas, the same thing is happening out to the sides of you as well.  Now imagine that way off to your left is a castle, but it’s mostly on land that doesn’t exist yet because you weren’t close enough to it.  That means there is a little corner of a castle out there in the landscape, but the rest of the castle doesn’t exist yet because it doesn’t need to.  With that scenario nicely gelled in your brain, now try to imagine how you would make a generator that knows how to make just a corner of a castle.  It has to be able to do that without the context of the rest of the castle, and when you can finally see the rest of the castle because you walked over to it, the new parts have to line up properly with the old parts (walls, doors, rooms, decorative carpet, whatever).

This brings us back to the idea of a Feature.  A Feature holds the whole castle and lets you make it all at once.  This allows you to write your castle generator in a way that doesn’t need to care about building a small part of a castle at a time.  So now we have castles that can be built as one whole thing and the game engine just needs to be able to merge them into the landscape as it needs to.  This comes with a bit of complexity for the engine itself, but it should greatly simplify the task of making generators for anything that is man made.
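
A Feature in this spirit might be little more than a self-contained block volume the engine can copy from as terrain gets generated around it.  Everything below, from the names to the flattened index layout, is a hypothetical sketch rather than the actual QubeKwest class.

```java
// A Feature owns its whole structure at once; the engine copies out whatever
// part overlaps the terrain that currently exists.
public class Feature {
    private final int width, height, depth;
    private final byte[] blocks; // flattened block ids for the whole structure

    public Feature(int width, int height, int depth) {
        this.width = width; this.height = height; this.depth = depth;
        this.blocks = new byte[width * height * depth];
    }

    public void setBlock(int x, int y, int z, byte id) {
        blocks[(x * height + y) * depth + z] = id;
    }

    public byte getBlock(int x, int y, int z) {
        return blocks[(x * height + y) * depth + z];
    }

    // The engine asks only for the slice that intersects the chunk being
    // generated, translating world coordinates into the Feature's local space.
    public byte blockAtWorld(int wx, int wy, int wz, int originX, int originY, int originZ) {
        return getBlock(wx - originX, wy - originY, wz - originZ);
    }
}
```

The castle generator fills the whole array in one pass with full context; the engine then answers "what block goes here?" one corner at a time, and everything lines up by construction.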

Data Structures 101

The revision of my world geography objects inspired a lot of other thought about how my world was going to work, specifically how I was going to represent and store its important parts.  This in turn resulted in me digging in with a notebook and a pen and starting to figure out what I would need and how it would work.

Once I had my basics in place on paper, I started making all of the changes needed to make them possible.  The first was to start making loads of additional data structures.  After all the talk about clusters, I needed to finish creating the class that represents them.  I had additional ideas about world features as an external structure, which led me to the idea of a BitList for storing large numbers of booleans efficiently without all the extra features or auto-sizing that Java’s BitSet provides.  The idea of having lots of features meant that keeping some sort of index to help know when to use them would be handy, so the FeatureIndex and WorldVolume classes were born.  And so on…
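
A fixed-size BitList along those lines can be as simple as a long array plus some bit twiddling.  This is a minimal sketch of the idea, not the real class:

```java
// Minimal fixed-size bit list backed by a long[]. Unlike java.util.BitSet it
// never grows, so the memory cost is known up front: one bit per boolean.
public class BitList {
    private final long[] words;
    private final int size;

    public BitList(int size) {
        this.size = size;
        this.words = new long[(size + 63) / 64]; // round up to whole longs
    }

    public void set(int index, boolean value) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        if (value) words[index >> 6] |= 1L << (index & 63);
        else       words[index >> 6] &= ~(1L << (index & 63));
    }

    public boolean get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        return (words[index >> 6] & (1L << (index & 63))) != 0;
    }

    public int size() { return size; }
}
```

At one bit per boolean, a million flags costs about 122 KB, versus a megabyte for a boolean[] of the same length.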

Next, my current design of IByteBufferSerializable was missing the idea of data structures that are variable in size, so I added that.  I also wanted to remove the serialization header from the size the objects reported for their serialized data.  As it turned out, supporting variable-sized objects made it genuinely helpful to have two different functions for reporting the size of serialized objects: one for the size of the data alone, and one for the size of the data plus the header (because the header is a variable size too).  This means slightly different hoops to jump through when serializing and deserializing, but in the end I believe them to be simpler hoops than remembering to include the header in my reported size every time.  Several of my data classes didn’t even implement IByteBufferSerializable yet, and others needed to be revised to match the new patterns.
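
The two-size idea might look something like this.  Both the interface shape and the tiny example implementation are guesses based on the description above, not the actual QubeKwest code.

```java
import java.nio.ByteBuffer;

// One method reports the payload size, the other includes the header, so
// callers sizing a buffer never have to remember to add the header themselves.
public interface IByteBufferSerializable {
    int getDataSize();              // bytes of payload only
    int getSerializedSize();        // payload plus header bytes
    void serialize(ByteBuffer buf); // writes header then payload
}

// Trivial example: a hypothetical 3-byte header plus one int of payload.
class Counter implements IByteBufferSerializable {
    static final int HEADER_SIZE = 3;
    private final int count;

    Counter(int count) { this.count = count; }

    public int getDataSize() { return Integer.BYTES; }

    public int getSerializedSize() { return HEADER_SIZE + getDataSize(); }

    public void serialize(ByteBuffer buf) {
        buf.putShort((short) 1); // hypothetical type id
        buf.put((byte) 1);       // hypothetical version
        buf.putInt(count);
    }
}
```

A variable-sized class would compute getDataSize() from its contents instead of returning a constant, which is exactly where having the header kept separate pays off.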

Finally, I wanted to add piles and piles of tests to ensure all the various parts of my fancy data structures worked the way I expected.  It would be a shame to be writing a world generator later on and assume that something not appearing in the world correctly was because I screwed something up in the generator, when in fact the error is in the way the data structure is holding the data.  You could literally waste hours or days trying to figure out where you went wrong, all because you assumed your data structures were correct.  For that reason, similar to the massive numbers of math library tests, the data structure tests need to be extremely careful and complete.  If you can’t trust that the structures behave properly when you use them somewhere else, you don’t know where an actual problem you encounter is coming from.

Next I get to try to figure out how the heck things actually spatially line up in all these structures when considering the “real world” rendering of the places you will walk around.  My three dimensional chunk structure makes that frustratingly difficult to draw on paper, so it’s going to be a bit of mental gymnastics before I have the actual graphics pipeline working.

Rearranged World

The original layout of the geometry for QubeKwest was simply blocks in 16x16x16 block chunks in the world.  This was designed to accommodate bulk network transmissions.  After all, sending 4,096 blocks in a neat little package is a good deal more efficient than sending them one at a time.  At the moment, that means that a chunk is around 65 KB of data.  As I was working on the idea of how to store chunks in permanent storage on the server, I got to thinking about how obnoxious it would be to have vast numbers of 65 KB chunks all over the disk.
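
The arithmetic behind those numbers is straightforward.  Note that the 16-bytes-per-block figure below is inferred from the ~65 KB total rather than taken from the actual chunk format:

```java
// Back-of-envelope sizing for a 16x16x16-block chunk.
public class ChunkMath {
    public static final int CHUNK_EDGE = 16;
    public static final int BLOCKS_PER_CHUNK = CHUNK_EDGE * CHUNK_EDGE * CHUNK_EDGE; // 4,096

    // If each block serializes to bytesPerBlock bytes, a full chunk costs:
    public static int chunkBytes(int bytesPerBlock) {
        return BLOCKS_PER_CHUNK * bytesPerBlock;
    }
}
```

At 16 bytes per block that comes to 65,536 bytes, which matches the "around 65 KB" figure above.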

Instead of making an artificial storage construct for chunks on the disk drive, I got to thinking about how to best manage it and came up with an honest to goodness layer of physical space that can be used directly.  This may end up being handy for generation of the world as well since it gives a slightly more impressive physical space into which I can easily render advanced features.  The idea is that above chunks, we now have clusters.  These hold 8x8x8 chunks (512 chunks total) and are designed to be the way of storing the world on a hard disk.  These presently average something near 26 MB each in their uncompressed form, but they will likely be stored as compressed gzip streams so the world doesn’t take over your whole drive after 20 seconds of exploring.
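
Compressing a cluster on its way to disk can lean entirely on the standard library.  This is the stock java.util.zip approach, shown as a sketch rather than the actual storage code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Round-tripping a cluster's raw bytes through gzip.
public class ClusterCompression {
    public static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } // closing the stream finishes the gzip trailer
        return out.toByteArray();
    }

    public static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes(); // Java 9+
        }
    }
}
```

Block worlds tend to have long runs of identical blocks (air, stone, water), which is exactly the kind of redundancy gzip eats for breakfast, so the on-disk size should land well under the ~26 MB raw figure.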

These clusters are also designed to be a semi-sparse structure.  In other words, they have a hard list of 512 Chunks in them, but those chunks are allowed to be “EmptyChunk” objects instead of real ones.  This type of chunk takes approximately 4 bytes instead of 50,000.  After all, if there are parts of the cluster that haven’t been generated yet, then there is no need to store massive amounts of block data that is just waiting to be replaced by the generator later on.  This is especially likely in the vast depths of the underground.  If you haven’t dug down close to it, it really doesn’t need to have been generated yet.  If it hasn’t been generated yet, don’t waste the hard drive space yet either.
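
The storage win from empty chunks is easy to see with the rough per-chunk figures quoted above (both sizes are approximations from the post, not exact):

```java
// Sketch of the semi-sparse idea: a cluster has a fixed array of 512 chunk
// slots, but ungenerated slots cost almost nothing on disk.
public class SparseCluster {
    static final int CHUNKS_PER_CLUSTER = 8 * 8 * 8; // 512
    static final int FULL_CHUNK_BYTES = 50_000;      // rough figure from the post
    static final int EMPTY_CHUNK_BYTES = 4;          // an EmptyChunk placeholder

    private final boolean[] generated = new boolean[CHUNKS_PER_CLUSTER];

    public void markGenerated(int index) { generated[index] = true; }

    // Total bytes this cluster would occupy, uncompressed.
    public long storageBytes() {
        long total = 0;
        for (boolean g : generated) {
            total += g ? FULL_CHUNK_BYTES : EMPTY_CHUNK_BYTES;
        }
        return total;
    }
}
```

A completely untouched cluster costs about 2 KB instead of ~26 MB, which is why the deep underground can stay essentially free until someone digs into it.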

There were a few different trade-offs in the sizing of a cluster before I settled on 8x8x8 chunks.  My first theory was 16x16x16 chunks in a cluster (after all, I’d already done the math for storing 16x16x16 blocks in a chunk), but I soon realized that would mean way too much memory to load them off the disk to make them active on the server.  After that, I briefly toyed with the idea of getting rid of clusters altogether.  In the end, I decided they would still be useful to prevent the need for obscene numbers of individual chunk files.

Consider for a moment a world that is a small 128x128x32 blocks.  If it were nice and flat and you were running at 6 m/s, it would still take you over 20 seconds to walk from end to end of this tiny world.  If I got rid of the cluster concept, that volume of world data would be stored in 128 files.  With an 8x8x8 chunk cluster, that world fits neatly into a single file with loads of room to spare for more vertical exploring.  This is much friendlier to the storage system on the server, especially as the worlds get larger.
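
The file-count arithmetic from that example, spelled out using the 16-block chunks and 8-chunk clusters described earlier:

```java
// How many chunk files vs. cluster files a world of a given block size needs.
public class WorldFiles {
    // Chunks are 16 blocks per axis.
    static int chunksFor(int bx, int by, int bz) {
        return (bx / 16) * (by / 16) * (bz / 16);
    }

    // Clusters are 8 chunks = 128 blocks per axis; partially covered clusters
    // still need a whole file, hence the round-up.
    static int clustersFor(int bx, int by, int bz) {
        return ceilDiv(bx, 128) * ceilDiv(by, 128) * ceilDiv(bz, 128);
    }

    static int ceilDiv(int a, int b) { return (a + b - 1) / b; }
}
```

For the 128x128x32 world that works out to 8 x 8 x 2 = 128 chunk files, or exactly one cluster file.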

Network Diversion

In my head, QubeKwest is always a client/server program.  Meaning that even a single player local game is going to be the client program connecting to the server program over localhost.  The fun part there is that it means that the client and the server can be located anywhere.  On a powerful computer, they talk to each other over a localhost network with both parts running on the same computer.  On lesser computers they talk to each other over an actual network with the client and server each being their own computer.  And in the case of playing with your friends, they talk to each other over the internet.  If I code it right, the game not only “won’t care” but it actually won’t even know the difference.

Networking is a tricky beast, and starting with the server makes the most sense.  A server can be coded in tons of different ways I’m sure, but the two that I have personally coded are the threaded approach and the asynchronous approach.  The threaded approach uses the normal java.net.* classes, while the asynchronous approach uses java.nio.*.  If you’ve ever tried to code your own NIO version without help and pulled it off, you are better at coding than I am.  Simply put, I get the java.net.* stuff, but my brain skips a step and stumbles the rest of the way down the stairs to end in a heap at the bottom when I try to understand the java.nio.* stuff.  Ok, that’s not fully true anymore, but it certainly was for previous attempts.

For a little background on the differences between the two approaches, I present a couple of paragraphs that get a little deep.  In the threaded approach, your server has a full-blown operating system thread to handle every single incoming connection from a client, and another one to act as the server thread itself.  This is extremely heavy, not terribly scalable, and puts a pretty hard limit on the number of clients that can be connected at the same time.  Most modern operating systems support thousands of concurrent threads, but long before you get anywhere near that number, you start slamming into the overhead of switching between the threads you are using.  This approach is easy to code, easy to understand (if you understand how to play nicely with threads at least), and completely useless for large numbers of client connections.
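
A thread-per-client echo server in this style fits in a page.  This is a deliberately minimal toy (no protocol, no shutdown handling), not the QubeKwest server:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Classic java.net thread-per-client server: one thread accepts connections,
// and every accepted client gets its own dedicated OS thread.
public class ThreadedEchoServer {
    private final ServerSocket serverSocket;

    public ThreadedEchoServer(int port) throws IOException {
        serverSocket = new ServerSocket(port); // port 0 picks any free port
    }

    public int port() { return serverSocket.getLocalPort(); }

    public void start() {
        Thread acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket client = serverSocket.accept();          // blocks
                    new Thread(() -> handle(client)).start();       // one thread per client
                }
            } catch (IOException ignored) { /* socket closed */ }
        });
        acceptor.setDaemon(true);
        acceptor.start();
    }

    private void handle(Socket client) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line); // echo each line straight back
            }
        } catch (IOException ignored) {
        }
    }
}
```

The code is pleasantly readable, and the scaling problem is visible right in the accept loop: every client costs a whole thread for its entire lifetime.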

In the asynchronous approach, you create a semi-magical server with just a thread or two to handle all the work of responding to clients.  The magical part is that it doesn’t matter if there is one client connected or 1000, the number of threads the server uses to manage the whole thing doesn’t change.  For a most excellent explanation of how this works, go check out the Rox Nio Server Tutorial.  I used that tutorial to craft how my game server works and it is the best description of how to make Java NIO stuff work correctly that I’ve ever bumped into.  This approach is a little trickier to code, vastly more complicated to get your head around, and scales well since you don’t need tons of threads to make it work.
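
For contrast, here is a bare-bones single-threaded NIO echo server in the spirit of that pattern.  It is my own condensed toy rather than the Rox code, and note the blocking select() call: forgetting to block there is a classic way to peg a CPU core.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// One thread, one Selector, any number of clients.
public class NioEchoServer implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel serverChannel;

    public NioEchoServer(int port) throws IOException {
        selector = Selector.open();
        serverChannel = ServerSocketChannel.open();
        serverChannel.bind(new InetSocketAddress(port)); // port 0 picks a free port
        serverChannel.configureBlocking(false);
        serverChannel.register(selector, SelectionKey.OP_ACCEPT);
    }

    public int port() throws IOException {
        return ((InetSocketAddress) serverChannel.getLocalAddress()).getPort();
    }

    public void run() {
        try {
            while (true) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = serverChannel.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(1024);
                        int read = client.read(buf);
                        if (read == -1) { client.close(); continue; }
                        buf.flip();
                        client.write(buf); // echo straight back
                    }
                }
            }
        } catch (IOException ignored) {
        }
    }
}
```

Whether one client is connected or a thousand, the thread count never changes; only the number of keys the selector hands back each pass does.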

This brings me to my diversion.  In all of the refactoring, experimenting, and general blowing up of code that I did while writing an NIO server for QubeKwest, I managed to entirely lose track of how to use it, and then I managed to substantially break it.  After seven months without looking at the code, I came back to discover that my network code was well and truly busted and that I had no clue how to fix it.  I started trying to follow the Rox Tutorial again on the code I had, but quickly found that my code was so different from the original that even that wasn’t useful.

Then an idea popped into my head.  Start over.  I know, great idea…  but sometimes that’s just what you have to do.  In this case I decided to make a standalone server that wasn’t connected to the QubeKwest code at all.  This approach eliminated the QubeKwest package structure, the broken code, and the refactors I’d incorporated, and it allowed me to learn how the whole thing was supposed to work again.  All without further destroying the networking in QubeKwest.

It took me a couple of days to get it to work properly because I was taking my time and making sure I understood the code I was writing.  When I thought I was done, I found that I was exactly one line of code short of having it work properly.  The code I’d produced was pegging an entire CPU core to 100% (even without requests coming in) and didn’t respond to the client.  This is obviously not an ideal server since the goal would be to use essentially 0% of the CPU unless you are responding to something and of course to actually respond to things.

Once I found the one line of code I’d missed, added it in, recompiled, and fired it up again, everything was suddenly behaving correctly.  Now I just need to refactor the code a bit in the standalone version so it is a little cleaner, and then start integrating it into the QubeKwest network package.  It was an interesting couple of days, but at least I learned something and have functional code again.