Unorthodox index types for spatial containers
In this article I’d like to share some of my thoughts about using index types different from the common ones. We’ll explore the possibilities this change brings to the table
The most basic and ubiquituous data structure in the computing world is most certainly an array:
int arr[5];
It’s available in every programming language, and keeps elements in sequential order. What’s hidden in this definition is the access to it:
int value = arr[ix];
In the C programming language, we index arrays with unsigned integers starting at 0 and ending
at array size - 1. Thus, ix
could simply be of type unsigned
(unsigned
and unsigned int
are
exactly the same). It’s actually expressed as signed int
for various (bad) reasons, and you can kind
of get the idea about where we’re going with the remainder of this article.
In languages like Lua, it gets closer to something like a “natural” number, because indices, by convention, start at 1; however Lua has just one number type, and doesn’t enforce it either.
Many beginner game programmers thus choose this straightforward data structure to implement their game states, most notably boards. They exploit the ability that arrays can actually contain arrays inside, and so we get:
int board[size][size];
Bam! board[x][y]
gives us access to one particular field. Great, isn’t it?
Actually, it’s pretty terrible.
Let’s write a simple game.
enum Field {
Empty,
Player,
Monster,
Rock
};
Entity board [size][size];
int playerX, playerY;
void movePlayer(int keyPressed) {
switch (keyPressed) {
case LEFT_ARROW:
playerX -= 1; break;
// ...
}
}
Okey, so what’s wrong with that? Well, for one playerX
and playerY
look like something we could
refactor into a “point” (or perhaps “position” class):
struct Point {
int x, y;
}
Point playerPosition;
Instead of writing playerX
we now write playerPosition.x
, which is arguably much cooler. We can also
make our movement function pure:
Point getMovement(int keyPressed) {
switch (keyPressed) {
case LEFT_ARROW:
return Point (-1, 0);
// ...
}
}
Moving the player is now a bit more explicit:
playerPosition.add(getMovement(key));
And we can reuse this function for other things. So far, so good. However, one problem remains: how do we place a player onto the board?
placeOnTheBoard(Unit u) {
// act on u polymorphically
board[u.position.x][u.position.y] = u.type;
}
Most of the people will stop there, thinking it’s fine. And, for most of the time, it will. But it’s not going to be enough for us.
Point being, I always hated people writing code like so:
pX += dX;
pY += dY;
Every time I saw something like that I immediately had the proper solution in mind:
p += d;
If your language doesn’t support operator overloading, p.add(d);
is perfectly fine as well. So is add(p,d);
.
Both of them treat the X and Y coordinates equally, without exposing the dimensionality and details to the user.
The only problem with this approach is that arrays can’t take the point parameter, and people don’t notice how
their nice abstraction of space breaks there.
What if we could change this?
I’m going to show you two examples from my real-life codebases where I’ve managed to get around that issue. I hope they serve as an inspiration in the future endeavours:
Potato empires is a Haskell game that takes place on a regular 2D grid. I use an Array to store that.
Typically, you’ll see Array
used with (Int, Int)
index type in that scenario. That was what I begun with,
and that was what I ultimately deemed much less useful than:
data Point = Point Int Int deriving (Show, Eq, Ord)
The cool thing about Haskell is that Index types in it get their own special typeclass - Ix
.
instance Ix Point where
range (Point minX minY, Point maxX maxY) = [Point x y | x <- [minX .. maxX], y <- [minY .. maxY]]
inRange (Point minX minY, Point maxX maxY) (Point x y) = and [x >= minX, y >= minY, x <= maxX, y <= maxY]
-- implemented the same as default (a,b) Ix instance
index r@(Point l1 l2, Point u1 u2) p@(Point i1 i2) | inRange r p = index (l1,u1) i1 * rangeSize (l2,u2) + index (l2,u2) i2
| otherwise = error "Out of range"
My game map becomes simply
type GameMap = Array Point MapField
Because the second type parameter to Array
is anything that has an instance for Ix
, I can use my own custom type,
and I never have to think about conversions, despite using my own index type.
getUnitAt :: GameMap -> Point -> Maybe Unit
getUnitAt gmap p = (gmap ! p) ^. unit
I can place arbitrary restrictions on Point
type, and I’m not constrained by anything that my data structure imposes on me.
The minicraft’s case is arguably even more interesting and needs a bit more of explanation. This game used much more complicated
coordinate system - it was based on voxels, and to avoid loading the whole map at once, I did it in n
xn
xn
chunks.
This required establishing two coordinate systems; one that described chunks in the “chunk space”, and one that described voxels inside of the chunk. Actually, that made 3, because I wanted to be able to know the absolute coordinate of one voxel. That prompted code reuse that used a funny technique called tagging:
static const int size = 24;
struct WorldCoordTag;
struct OuterChunkCoordTag;
struct InnerChunkCoordTag;
template<typename Tag>
class Coord;
typedef Coord<WorldCoordTag> WorldCoord;
typedef Coord<OuterChunkCoordTag> OuterChunkCoord;
typedef Coord<InnerChunkCoordTag> InnerChunkCoord;
template<typename Tag>
class Coord {
public:
int x, y, z;
Coord()
: x(0), y(0), z(0) { }
Coord(int _x, int _y, int _z)
: x(_x), y(_y), z(_z) {
}
void operator+= (Coord const& other) {
x += other.x;
y += other.y;
z += other.z;
}
};
What followed were conversion routines that I’ll skip implementations of (they’re available here and are worth another article dedicated just to the maths describing them; it took me quite some time to perfect their behaviour so that the world can grow in both positive and negative dimensions without issues).
static OuterChunkCoord convertToOuter (WorldCoord const& wc);
static InnerChunkCoord convertToInner (WorldCoord const& wc);
static WorldCoord convertToWorld (InnerChunkCoord const& ic, OuterChunkCoord const& oc);
This is just the baseline, and the really interesting stuff is what you can build with them. Here’s the Chunk
:
class Chunk {
std::array<BlockType, size*size*size> data;
};
And finally, the main data structure:
std::unordered_map<OuterChunkCoord, Chunk, CoordHash> data;
I’ve decided to use the unordered
variant because there’s no obvious way to order voxels in 3D space. (Hell, there’s
no obvious ways to order fields of a grid; both row-major and column-major formats are present and widely used.) This
required a simple hashing routine:
class CoordHash {
template<typename Tag>
std::size_t operator()(Coord<Tag> const& p)
{
std::size_t seed = 0;
boost::hash_combine(seed, p.x);
boost::hash_combine(seed, p.y);
boost::hash_combine(seed, p.z);
return seed;
}
};
hash_combine
sadly isn’t a part of C++ standard library. It’s a pity, considering how useful this little helper is.
What this whole complication buys us might not be obvious at first, so let’s take a quick look at the list:
O(1)
- unordered_map
with such a hash should be able to avoid collisions
most of the time, and once it finds a chunk, accesing an array element is just addition and multiplication.