http://github.com/dschleck/edith
A Dota 2 replay (.dem) parser that understands packet entities. Tested on OS X and Ubuntu.
cd build
ccmake ../
make
./death_recording <a replay>
If you encounter errors compiling the protocol buffers, make sure the CMake variable PROTOBUF_INCLUDE_DIR points to the directory containing google/protobuf/descriptor.proto. The other PROTOBUF_ variables should be set as well.
This is a reverse engineering project and I stopped caring the moment this worked, sorry. The great news is others have used this as a reference to write this better!
Parsing packet entities requires several components of knowledge, luckily each replay contains everything needed to parse it.
The first thing to understand are the send properties (send props). The primary use for these is to describe a variable in a class. As a simple and imprecise example, suppose Dota had the following class they needed to synchronize:
class Ent {
float x;
int y;
};
To describe this class, they could have a send prop named x with type 1 (float) and a send prop named y with type 0 (int). These send props could be contained in a send table named Ent.
Beyond simply adding additional types (strings, int64s, arrays) send props can also point at other send tables whose props should also be included. The primary use of this is for inheritance, most send tables have a send prop named "baseclass" that points at the send table of the parent class.
The other big source of complexity comes from exclude props. Suppose Dota had some class that inherited from Ent, but no longer used Ent.y. To prevent wasting bandwidth, they could add an exclude prop that will prevent it from being transmitted. The name of this prop would be y and it would also have a field called datatable name set to Ent.
After we read in all of the send tables from the replay we have to flatten them. When a table is flattened, we recursively follow all the links (such as baseclass) to other send tables and copy all of their send props into the table being flattened.
Before we can parse the packet entities, there is one efficiency optimization. Suppose there's a class with two hundred integers and every instance sets those to zero. Rather than transferring two hundred zeroes every time a new instance of this class is created, it would be better to have a default instance of the class and then transfer only what is changed from the default when a new instance is created. The instancebaseline table accomplishes this, every class has an entry in the table with a default instance.
Finally we can parse our packet entities. When the server thinks an entity enters the client's potentially visible set (PVS) it tells the client to make a new entity. Making a new entity requires first parsing the entry in instancebaseline, and then updating the instance by parsing the data sent in the packet. Parsing both of these is exactly the same, first you read the list of properties that they contain and then you read each property.
Entities are updated using the same mechanics. First you read the list of properties in the update and then you read each property and put them in the entity.
Deleting an entity only requires transferring the ID of the entity.
examples/death_recording_visitor is an example of doing something useful with my code. This outputs a line whenever a hero dies. This was used to gather data for the tool that produced the image above.
src/entity describes an entity and stores its properties.
src/property handles the different types of send props and stores the correct data for each one.
src/state contains a bunch of data structures read from the replay and handles flattening send tables.
src/edith reads the replay, converts it into the internal representation used by the program, and runs the logic for everything but flattening send tables.
This seems to work on the replays that I have tried, however there are probably a million different bugs hiding. The read_float_coord_mp is almost certainly completely buggy.
I implemented just enough logic that all the replays I have parse successfully. In particular the following are not implemented:
- There are a million variations of float serialization. I only implemented the ones required for the send props in my replays.
- There are a bunch of string tables that I can't read. The ones I've seen appear to be related to precaching, but there may be more.
If you have a replay that works in the client but doesn't parse correctly, let me know!