Roguelike column thumbnail ['@ Play' is a monthly column by John Harris which discusses the history, present and future of the Roguelike dungeon exploring genre. This time - a discussion of using Python to make Roguelike games.]

So let's talk a bit about Roguelike development languages.

Traditionally, the One True Roguelike Language has been C. All of the current "major roguelikes," Nethack, Angband, ADOM and Dungeon Crawl, are either written in it or in C++. (Nethack makes use of bison and lex, and Angband used to use Lua for scripting.)

The genre's origins on Unix systems, its ties with the curses console library, its reliance on a console in general, and the fact that when the genre got started C was basically it as far as serious programming languages go, all these things combined to identify the genre somewhat strongly with C.

This is not as much the case now. As this year's 7DRL competition demonstrated, new roguelikes are now written in all kinds of languages, ranging from Java to MOS6510 assembly. A few were even written in Python, a language that has historically been regarded as more of a scripting language. I mean, scoff scoff, what business does a "real game" have being written in something like Python?

[NOTE: Our column this time is concerned with matters of development and computer languages. If you're just interested in playing them, you might find this one to be somewhat dry at best, and confusing at worst. I apologize for this; try back next time and we should have a more interesting column for you.]

Well... Python has progressed a point of nearly-unmatched flexibility. (Perl may be more flexible, but it's harder to learn. Ruby is similar in many ways.) It is a semi-compiled language; upon running, it is automatically compiled into a bytecode that is then run in an interpreted fashion. However, it also has an "immediate mode," allowing someone to test out code efficiently. These attributes combine to give it more than a little similarity to the BASICs that used to come with 8-bit computers, on which a legion of programmers, including myself, learned to smash bits together. Writing in Python is fun in a sense that is not often attached to the idea of software development.

And yet, Python is serious news. It's available for a plethora of systems, and it's got more than one interface to SDL for high-level hardware utilization. If the Python virtual machine isn't to your liking, there are versions that have been rewritten to use the Java and .NET machines. If that's not enough, using the Psyco module you can even cause your Python program to be transparently compiled to machine code, providing amazing speed increases in most cases.

Some time back I spent a couple of months tinkering with a roguelike engine in Python. The contents of this column are my own observations concerning Python as a roguelike language. It offers a unique set of benefits, but also a couple of surprisingly drawbacks, for use in implementing these games. I am by no means a Python expert, but I have played around with it for a while. Please give whatever weight to my impressions you deem appropriate. But it's enough for me to strongly suggest, if you're looking to get your feet wet in roguelike development, to give Python a try.

Note: although intended for people unfamiliar with Python in general, this column is not intended to be a primer or tutorial for the language of Python. It's pretty much just an overview of interesting features and potential pitfalls. We don't even get into some of Python's more interesting general aspects, such as its enforced indentation scheme. We're focused pretty sharply on using Python for making roguelikes. There are plenty of good Python resources on the web for the Googling, though, and not a few books devoted to the subject.

Lists

One of the primary advantages of using a language like Python (or one of several other scripting languages) is the ubiquity of lists as a data type. If you're used to arrays from other languages, which tend to enforce all elements being of the same data type (even if those are frequently pointers) and being tricky to resize at runtime, this can seem amazing. These attributes make arrays ill-suited to being used to manage stuff like inventory and monster lists, since they typically impose a hard limit on size.

Because of these limitations, roguelike development in C tends to be loaded to the gills with linked lists to handle space contents, inventory, monster populations, and, heaven help us, monster inventory. Linked lists are not a native type to C; the programmer has to either create and manage them himself (and I can say, from personal experience at the very least, that beginners designing such code are prone to making maddeningly elusive errors) or use an external library to handle the mechanics for you.

If you avoid using a library, you'll end up writing a lot of utility code to maintain these lists, and expending energy to write that code, energy that comes at the cost of your enthusiasm for the project. The great majority of software projects never see completion, and part of that, I submit, is the need to write abundant utility code. C's strengths lie in its relative closeness to the metal, but that means it does little for the developer. Even common things like string handling are notoriously convoluted in C.

Python, as a "very high-level language," handles strings like a dream, uses an efficient garbage collector, and its lists are practically a godsend. There are some difficulties with learning to use them; we'll talk about those once we get into Python's drawbacks. But most of it is a substance remarkably like gravy. What is especially cool about lists is that, unlike C's arrays, you can put any data type you want into a list's cells; they don't all have to be the same size. If you wanted, you could store a string in the first spot, an integer in the second, another entire list in the third, and so on, using them as makeshift structs. Mind, however, that Python contains robust support for classes, so there's little reason to do this.... All this is possible because, behind the scenes, Python is actually doing all the pointer juggling itself. This is not perfectly efficient, of course, but for a roguelike game the difference in speed can usually be measured in milliseconds.

Further, lists can do lots of things easily that arrays cannot. They can easily be increased or decreased in size, arbitrary elements inserted and removed, used as stacks, operations performed on every item, copied, and reversed, and a surprising number of other things. The random module contains a method for easily shuffling the contents of a list in place. Of special interest is sorting: once your brain comes to properly grok list sorting, it'll start coming up with all kinds of nifty, unexpected uses.

Now, granted, the tradeoff is that lists are more computationally-expensive than arrays. Python's internal algorithms are well-optimized, but it simply has more to do beneath the surface than a C array lookup would, which is basically simple pointer arithmetic. It is worth noting that fast-action games have been written in Python (for instance, popular indie game Rom Check Fail is actually written in Python using a PyCap, an interface to PopCap's game development libraries), but it is still lacking for some purposes. Roguelikes, however, are mostly turn-based, and a lot of processing behind the scenes may not produce anything more than a tenth-of-a-second pause. This makes Python, and other very-high-level languages too, potent tools for a roguelike developer.

Dictionaries

Nearly as cool as lists is the dictionary data type, which is analogous to Perl and Ruby's hashes. If you're not familiar with the concept, try to imagine the following. Start with an array. Then, abandon the idea that the elements in the array are in some sort of definite order; you can iterate through all the elements in the dictionary with a for loop, but there are no guarantees what order they'll be in. To compensate for this, instead of indexing the dictionary by a boring ol' integer value, you can use any immutable type in Python. That is to say, you can have an "array" that you access, not with a number, but with a string, or a float, or--of particular use to us--a pair of coordinates. Since Python refers to everything using objects, and variables just contain references to them, you can have one list of all the monsters in the level, and alongside it a dictionary that refers to the same monster objects according to their coordinates. This would make getting all the monsters in a room, or in the range of an area-effect spell, a matter of checking all coordinates in a smal range, instead of the potentially much-more-expensive route of checking every monster on the level for proximity. More importantly, the code is more elegant; elegant code is easier to read and maintain, because your brain doesn't have to context shift to whatever devious technique you made up in prior weeks to get a feature working.

The idea of pointing these uses for hashes out is not to say you must do it this way, but to show that Python provides a strange and wonderful toolset for use. As you come to understand the things that Python makes both easy, and surprisingly fast, all kinds of clever ways to do things may present themselves to your mind.

For beginning programmers, however, I don't think any feature of Python is more useful than its interactive shell. It's what elevates Python to the realm of old 8-bit computer programming languages, and I mean that in a good way; with a quick command, you can test out almost any code Python will compile and make sure it does what you expect, just like the old days of sloped keyboard boxes and Microsoft BASIC. This above all helps make Python fun to work with by allowing for near-instantaneous testing of prospective code.

List Copy Troubles

I think Python is, overall, fairly well suited to the task of serving as a roguelike development language, but there are a few gotchas one must look out for when beginning to use it as such. I mention it here because I myself was bitten by this, and it took me quite some time to overcome the issue. One of the most intractable such problems comes from one of its greatest advantages, how it treats everything as an object, with all variables, behind the scenes, serving as merely references to the data.

This may not sound like much of a problem. Isn't that the entire point of a variable after all? But there is a secret gotcha here that will bite if you are unprepared for it! The trouble lies with Python's two classes of datatype, mutable and immutable. Simply, the data in a mutable type can be changed without changing the reference. Lists are mutable because you may change any of its contents but the identity of the list does not change: it's the same list, just modified. Imutable objects, conversely, cannot be changed without creating a new copy of the item. Python often takes care of that for you, so in many cases you don't notice there's a difference at all. In most cases, it just means that behind the scenes Python creates new copies of changed values and discards any old value that had been there. This is good because pretty much all numbers are immutable, and you never have difficulty using the += operator because Python just creates a new value to go in that spot. The most commonly encountered problems with immutable objects usually have to do with strings, which are immutable types in Python.

But mutable objects can pose difficulties, too. The problem comes in when you make a copy of a piece of data, producing two references to the same thing. If the data is immutable then that's not much of a problem, for when one of the values is changed it'll create a new value anyway. But if you assign the same list, a mutable type, to two different variables and then change one of the elements in that list, you'll find that both lists have changed. It is possible to be profoundly screwed over by this behaviour if you're not looking for it; it looks much like the kind of stray pointer bugs that C code sometimes spawns, with values changing unexpectedly, but it's not a bug. But if you're not careful, such reference copies can spread far throughout your program. It's difficult to cause Python to make an explicit copy of a mutable object, so difficult that Python has an entire module, copy, devoted to it to making it easy.

Yet even with this module, the problem is not always easily solved. Consider the case where you have a list of lists, a structure that roguelike authors, for reasons we'll shortly examine, often end up trying in Python. A list is just a sequence of references, referred to by number instead of, as with simple variables, name. So, a list of lists is really a list of references to lists of references! If you make an ordinary copy of such a list, you'll only end up with a shallow copy; The top-level list will be copied, but the interior list contents will all refer to the originals, and these "quantum entanglement" change bugs will persist. Heaven help you if you make a list of lists of lists, which my own code used. The copy module contains a special function for these structures, deepcopy, that ensures that every data item copied is new. The drawback is that deepcopy is relatively slow, since it does paranoid reference checking, and making lots of use of it can drag down your game.

Lack of Built-in Multi-Dimensional List

I mention that copy problems become bad if you make a list of lists. Why would one do such a thing? It sounds slightly obscene, doesn't it, an unholy rite of development that could conjure demon bugs. Yet Python has special need of such a data structure because... here it comes... the basic language does not contain an analogue for multi-dimensional arrays. Lists are a one-dimensional structures only. Need I remind you, roguelike dungeon levels are usually stored as two-dimensioned arrays of spaces. To simulate a grid, if you want to keep the syntax similar to C, you must use a list of lists. In code, this ends up looking like:

dungeon = [[wall, wall, wall, wall, wall],
[wall, stairsup, space, player, wall],
[wall, space, monster, space, wall],
[wall, space, treasure, stairsdown, wall],
[wall, wall, wall, wall, wall]]

Do you see what this list does? A single list is defined by square brackets; thus, square-brackets inside of square brackets define sublists. To read the contents of a space, for game logic purposes or to render the display, one can refer to it with:

dungeon[y][x]

But this scheme is vulnerable to all the copy problems mentioned above. I know of two ways to remedy this sanely: using a wrapper class around a list, and using a dictionary. The dictionary method I've described above; you just use a tuple (an immutable analogue for an array, using parentheses instead of square ones) for the coordinates. Like this:

dictarray = {}
# Assignment
dictarray[(3, 3)] = "floor"
# Recall
print dictarray[(3, 3)]

It's kind of a weird solution though; although it only uses up memory for the spaces you actually fill, it's a bit less efficient than an array, and if you try to access a spot that contains nothing it'll raise an exception. There are ways around this: you could access dictionaries using its get method, like so:

dictarray.get((0, 0), "wall")

That supplies a default value for unassigned spaces. You could also use a wrapping class like we're about to do with lists to handle default cases, and there's also a way to change Python's handling of dictionaries to avoid this problem. Those are beyond the scope of this column, but are definitely doable.

Possibly the best tradeoff between efficiency and sane syntax is to make a new class to represent your multidimensional array. The first thing we should realize is that a multi-dimensional array is simply a one-dimensional one remapped. A 5 x 5 array is really, to the runtime, a 25-cell array with a little syntactic sugar to make it seem like something else. If we know the size of the x dimension of the array, we can easily do this math ourselves.

listarray = ["topleft", "topmiddle", "topright",
"middleleft", "center", "middleright", "bottomleft",
"bottommiddle", "bottomright"]
xsize = 3
#
# Retrieve cell 1, 2
x = 1
y = 2
print listarray[x + y * xsize]

This code returns "bottommiddle" as its result. C actually makes this relationship a bit more explicit, and we can do some pointer math in a way that makes sense if a raw buffer were to be used exactly like an array. But usually the compiler does this math behind the scenes. We can do it back there too, by using a wrapper class:

import copy
from defines import *

# 2D array-faking helper functions
import copy

class TwoDArray(object):
"""
Implemented as a list simulating an array, with extra attributes for the
size.
"""
def __init__(self, xextent, yextent, initialstate = 0):
"""
To initialize, we simply store the extents and create a list of the appropriate size.
Most of the code here should be self-explanatory.
"""
self.xsize = xextent
self.ysize = yextent
arraybuild = []
for index in range(xextent * yextent):
arraybuild.append(copy.deepcopy(initialstate))
self.array = arraybuild

def checkbounds(self, x, y):
if (x >= self.xsize) or (x < 0) or (y >= self.ysize) or (y < 0):
return False
else:
return True

def get(self, x, y):
if self.checkbounds(x, y):
try:
return self.array[x + (y * self.xsize)]
except IndexError:
raise
else:
raise ValueError, "Index out of bounds:" + str(x) + "," + str(y)

def put(self, x, y, assign):
if self.checkbounds(x, y):
self.array[x + (y * self.xsize)] = assign
else:
raise ValueError, "Index out of bounds" + str(x) + "," + str(y)

An example of use:
dungeon = TwoDArray(5, 5, "wall")
for x in range(1, 4):
for y in range(1, 4):
TwoDArray.set(x, y, "floor")

To recall the contents of a cell, we'd use the get method instead, providing only the coordinates as arguments. If we wanted to, we could even subclass lists and make a new access method "official," with its own syntax, but that's beyond the scope of this column and my abilities at the moment. Heh.

There, that wasn't so bad, was it? Python is capable of performing miracles if used correctly. Notice, by the way, that many of the things I've mentioned here can also be done, in a similar way, with Ruby. It doesn't have as many built-in helper functions as Python, and those can come in handy at times, but to our perspective the languages aren't much different at all. The biggest change to this basic exposition lies in the fact that Ruby doesn't seem to have an explicit copy method, which sometimes leads to having to use strange solutions such as serializing the data with Marshal and unserializing it to make an unconnected copy.


In other news....

The developers of Dungeon Crawl Stone Soup have announced a tournament for the month of August under the most recent version of their game. The tournament will take place on the two semi-official Crawl public servers at and crawl.develz.org. Visit the Stone Soup home page for more information.

There is an up-and-coming commercial iPhone roguelike called 100 Rogues. We're going to try to have an interview with project lead Keith Burgun up in a few days, but in the meantime you might have a look at their Facebook page, which offers a little more information as well as an info video of an intriguing character class....