Writing Dissectors
Pyreshark's dissectors are written in the form of a .py (e.g. my_protocol.py) file, containing a class named Protocol inheriting ProtocolBase.
In order to be loaded, this file should be placed in <Wireshark-dir>\python\protocols.
When Pyreshark is initialized, it creates an instance of this class and uses said instance to generate the new protocol (and its fields, trees, etc.) and register it in Wireshark.
Let's have a look at the sample protocol (you can find it in \python\protocols\sample_protocol.py):
    from cal.cal_types import ProtocolBase, FieldItem, PyFunctionItem, Subtree, TextItem
    from cal.ws_consts import FT_UINT16, BASE_HEX, FT_UINT8, FT_ETHER, FT_IPv4

    ETHERNET = 1
    IP = 0x0800
    ARPOP_REQUEST = 1
    ARPOP_REPLY = 2

    HW_TYPE_STRINGS = {ETHERNET : "Ethernet"}
    PROTO_TYPE_STRINGS = {IP : "IP"}
    OPCODE_STRINGS = {ARPOP_REQUEST: "request",
                      ARPOP_REPLY: "reply"}

    class Protocol(ProtocolBase):
        def __init__(self):
            self._name = "Pyreshark Sample Protocol (ARP)"
            self._filter_name = "pysample"
            self._short_name = "PYSAMPLE"
            self._items = [FieldItem("hw.type", FT_UINT16, "Hardware Type", strings = HW_TYPE_STRINGS),
                           FieldItem("proto.type", FT_UINT16, "Protocol Type", display = BASE_HEX, strings = PROTO_TYPE_STRINGS),
                           FieldItem("hw.size", FT_UINT8, "Hardware Size"),
                           FieldItem("proto.size", FT_UINT8, "Protocol Size"),
                           FieldItem("opcode", FT_UINT16, "Opcode", strings = OPCODE_STRINGS),
                           Subtree(TextItem("src", "Sender"), [PyFunctionItem(self.add_addresses, { "mac" : FieldItem("hw_mac", FT_ETHER, "Sender MAC Address"),
                                                                                                     "ip" : FieldItem("proto_ipv4", FT_IPv4, "Sender IP Address"),})]),
                           Subtree(TextItem("dst", "Target"), [PyFunctionItem(self.add_addresses, { "mac" : FieldItem("hw_mac", FT_ETHER, "Target MAC Address"),
                                                                                                     "ip" : FieldItem("proto_ipv4", FT_IPv4, "Target IP Address"),})]),
                           ]
            #self._register_under = { "ethertype": 0x0806} # UNCOMMENT THIS TO TEST THE PROTOCOL

        def add_addresses(self, packet):
            (hw_type, proto_type, hw_size, proto_size) = packet.unpack(">HHBB", 0)
            if hw_type == ETHERNET:
                packet.read_item("mac")
            else:
                packet.add_text("Unimplemented hardware type")
                packet.offset += hw_size
            if proto_type == IP:
                packet.read_item("ip")
            else:
                packet.add_text("Unimplemented protocol type")
                packet.offset += proto_size
This is a very thin and incomplete implementation of the ARP protocol. We shall now inspect the different parts of the code.
    from cal.cal_types import ProtocolBase, FieldItem, PyFunctionItem, Subtree, TextItem
    from cal.ws_consts import FT_UINT16, BASE_HEX, FT_UINT8, FT_ETHER, FT_IPv4
As you can see, two modules are imported, both from the cal package.
cal stands for C Abstraction Layer. It is the core of Pyreshark: it hides Wireshark's C API and provides you with its own "pythonic" one.
- cal_types holds Pyreshark's API, almost everything you'll need is there.
- ws_consts holds several constant values from Wireshark's C code, that are necessary for protocol writing.
    class Protocol(ProtocolBase):
        def __init__(self):
            self._name = "Pyreshark Sample Protocol (ARP)"
            self._filter_name = "pysample"
            self._short_name = "PYSAMPLE"
            self._items = [...]
            #self._register_under = { "ethertype": 0x0806}
The first thing you notice about this class is that it inherits from ProtocolBase (from cal.cal_types). Removing this may cause various exceptions and errors. Try your best to just leave this line as is.
Now let's have a look at the various members initialized in the constructor:
Variable | Description |
---|---|
_name | The full name of your protocol |
_filter_name | The name of your protocol in the filter box |
_short_name | The name of your protocol in the protocol column |
_items | This is where you state the structure of your protocol, we'll discuss this properly later on |
_register_under | Use this if you want your dissector to be called by an existing protocol, you can register it in the latter's table (find all available tables in the menu Internals->Dissector Tables) |
_hidden | If you don't want the protocol to be automatically added to the tree, set this to True |
Note that _register_under is commented out so it doesn't override the original ARP protocol. To test the sample protocol just uncomment this line, start Wireshark and inspect any ARP packet.
There's one more thing that can be set in the constructor that doesn't appear in the sample:
- set_next_dissector(dissector_name, length = REMAINING_LENGTH) - Used for dictating which protocol will be parsed after yours, and how many bytes it'll receive (omitting the second argument will pass all remaining bytes to the next dissector). For example:
self.set_next_dissector("tcp")
Calling this function in the constructor sets the default value for all dissected packets. Omitting this line will keep "data" as the default next protocol.
This is the fun part, where you actually write your protocol's structure (note that we're still in the constructor):
    self._items = [FieldItem("hw.type", FT_UINT16, "Hardware Type", strings = HW_TYPE_STRINGS),
                   FieldItem("proto.type", FT_UINT16, "Protocol Type", display = BASE_HEX, strings = PROTO_TYPE_STRINGS),
                   FieldItem("hw.size", FT_UINT8, "Hardware Size"),
                   FieldItem("proto.size", FT_UINT8, "Protocol Size"),
                   FieldItem("opcode", FT_UINT16, "Opcode", strings = OPCODE_STRINGS),
                   Subtree(TextItem("src", "Sender"), [PyFunctionItem(self.add_addresses, { "mac" : FieldItem("hw_mac", FT_ETHER, "Sender MAC Address"),
                                                                                             "ip" : FieldItem("proto_ipv4", FT_IPv4, "Sender IP Address"),})]),
                   Subtree(TextItem("dst", "Target"), [PyFunctionItem(self.add_addresses, { "mac" : FieldItem("hw_mac", FT_ETHER, "Target MAC Address"),
                                                                                             "ip" : FieldItem("proto_ipv4", FT_IPv4, "Target IP Address"),})]),
                   ]
As you can see, _items is a list of several objects (all of which reside happily in cal.cal_types). During dissection these items are processed sequentially, starting at the beginning of the packet and advancing the offset as needed.
Item | What happens during Dissection |
---|---|
FieldItem | Reads a regular field in the packet's bytes, like an integer or an IP address and adds it to the tree |
TextItem | Adds a custom textual field to the tree |
Subtree | Adds a sub-tree to the tree |
PyFunctionItem | Calls a python function to process the packet |
FieldItem's constructor accepts a myriad of parameters, most of which have very convenient defaults.
Parameter | Description | Default value |
---|---|---|
name | The name of the field. Used for generating the filter name. | - |
field_type | Any of the FT_* from ws_consts.py (also wireshark's ftypes.h). | - |
full_name | The name that'll be shown in the tree. If it is set to None, full_name=name. | None |
descr | A short description of the field. If it is set to None, descr=name. | None |
encoding | Encoding for reading the field. See ws_consts.py. If it is set to None, a default encoding is picked from FIELD_TYPES_DICT in cal_consts.py. | None |
mask | Bit mask. | NO_MASK=0 |
display | How the field's value will be displayed in the tree. See ws_consts.py. If it is set to None, a default display is picked from FIELD_TYPES_DICT in cal_consts.py. | None |
strings | A dictionary for translating the field's value into text. For boolean fields use True and False as keys; for integers use either the values directly or tuples of (min, max), but not both in the same dictionary! | None |
length | Length of the field in bytes. If it is set to None, a default length is picked from FIELD_TYPES_DICT in cal_consts.py. | None |
Note that None means python's None and not the English word "none".
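To make the two key styles of the strings dictionary concrete, here is a small stand-alone sketch in plain Python 3. It imports nothing from Pyreshark: translate_value and LENGTH_CLASS_STRINGS are hypothetical names made up for this illustration, and treating both bounds of a (min, max) tuple as inclusive is an assumption.

```python
# Stand-alone illustration; nothing here is imported from Pyreshark.
OPCODE_STRINGS = {1: "request", 2: "reply"}     # direct values as keys
LENGTH_CLASS_STRINGS = {(0, 63): "short",       # (min, max) tuples as keys
                        (64, 1500): "regular"}

def translate_value(value, strings):
    """Hypothetical helper: return the text for value, or None if nothing matches."""
    for key, text in strings.items():
        if isinstance(key, tuple):
            low, high = key
            if low <= value <= high:   # assumption: both bounds are inclusive
                return text
        elif key == value:
            return text
    return None

print(translate_value(2, OPCODE_STRINGS))
print(translate_value(100, LENGTH_CLASS_STRINGS))
```

Each dictionary sticks to one key style, which is what the strings parameter asks for.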
Useful tips:
- The item's filter name will be generated according to its position. If it's directly under the protocol root, it'll be named (protocol-filter-name).(item's name) (e.g. "pysample.opcode").
- If it's under a tree, the tree's parent item will join the filter name as well (e.g. "pysample.src.hw_mac").
- The offset is advanced after this item is dissected (according to the item's length).
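The naming rule from the tips above can be sketched in a few lines of plain Python 3. build_filter_name is a hypothetical helper written for this illustration only; Pyreshark generates the names for you, and its own implementation may differ.

```python
def build_filter_name(protocol_filter_name, parent_names, item_name):
    """Hypothetical helper: join the protocol, the enclosing tree parents and the item name."""
    return ".".join([protocol_filter_name] + list(parent_names) + [item_name])

# An item directly under the protocol root.
print(build_filter_name("pysample", [], "opcode"))        # pysample.opcode
# An item inside the subtree whose parent item is named "src".
print(build_filter_name("pysample", ["src"], "hw_mac"))   # pysample.src.hw_mac
```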
When you just need another line of text in the tree, TextItem is for you!
Parameter | Description | Default value |
---|---|---|
name | The name of the field. Used for generating the filter name. | - |
text | The text that will be added to the tree. | - |
length | Length of the field in bytes. | 0 |
Useful tips:
- Note that the offset is not advanced.
- Extremely handy as the parent item of a Subtree.
If you want to have a sub-tree in your protocol's tree, Subtree makes it easy and fun!
Parameter | Description | Default value |
---|---|---|
parent_item | The subtree's parent item. | - |
item_list | The subtree's children - A list of items. | - |
tree_name | Used by Wireshark for remembering which trees are expanded. Put AUTO_TREE for the name of parent_item. | AUTO_TREE |
- Nine times out of ten, you don't want to set tree_name.
- You'd usually want to use a TextItem as parent_item.
With the three items above we can create wonderful protocols, with a slight limitation: no dissection logic. That's where PyFunctionItem comes to the rescue. When this item is dissected, it calls a Python function of your choice, where you can happily program your protocol's logic in Python.
    PyFunctionItem(self.add_addresses, { "mac" : FieldItem("hw_mac", FT_ETHER, "Sender MAC Address"),
                                         "ip" : FieldItem("proto_ipv4", FT_IPv4, "Sender IP Address"),})
    .
    .
    .
    def add_addresses(self, packet):
        (hw_type, proto_type, hw_size, proto_size) = packet.unpack(">HHBB", 0)
        if hw_type == ETHERNET:
            packet.read_item("mac")
        else:
            packet.add_text("Unimplemented hardware type")
            packet.offset += hw_size
        if proto_type == IP:
            packet.read_item("ip")
        else:
            packet.add_text("Unimplemented protocol type")
            packet.offset += proto_size
Parameter | Description | Default value |
---|---|---|
dissection_func | A python function. It'll be called with a single parameter: a Packet instance. | - |
items_dict | A dictionary of all the items the function might read. The keys can be anything and will be used when the function calls packet.read_item(key). | - |
The Packet object your function receives contains your API for dissecting the packet.
- packet.id - The packet's position in the capture.
- packet.visited - Whether the packet was visited before, or it's our first time seeing it.
- packet.buffer - The packet's bytes as a string.
- packet.offset - The current offset in packet.buffer.
- packet.add_text(text, length, offset) - Used for adding a line of text to the tree.
Parameter | Description | Default value |
---|---|---|
text | The text to be added. | - |
length | The number of bytes that'll be marked when selecting the item. packet.offset is not advanced. | 0 |
offset | The beginning offset for the marked bytes. If set to None, offset=self.offset. | None |
- packet.set_column_text(col_id, text) - Used for setting a column's text.
Parameter | Description | Default value |
---|---|---|
col_id | The column's id (any COL_* from ws_consts.py). | - |
text | The new text of the column. | - |
- packet.read_item(item_key) - Used for adding any item from the aforementioned items_dict to the tree.
- Note that the offset is advanced according to the item read.
Parameter | Description | Default value |
---|---|---|
item_key | The key of the item in the items_dict. | - |
- packet.unpack(format, offset) - Used for reading values from the packet's bytes.
- Note that the offset is not affected.
Parameter | Description | Default value |
---|---|---|
format | A format string (see Python's documentation for the module struct). | - |
offset | The offset from which the values will be read. None will set it to the current offset. | None |
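Since the format string follows Python's struct module, the ">HHBB" call from the sample can be tried outside Wireshark. The sketch below is plain Python 3 and uses struct.unpack_from on a hand-built ARP header. It assumes packet.unpack reads values the same way, which the description above suggests but this page does not spell out.

```python
import struct

# First 8 bytes of an ARP request: hw type, proto type, hw size, proto size, opcode.
arp_header = bytes([0x00, 0x01, 0x08, 0x00, 0x06, 0x04, 0x00, 0x01])

# ">HHBB" = big-endian: two unsigned 16-bit values, then two unsigned 8-bit values.
hw_type, proto_type, hw_size, proto_size = struct.unpack_from(">HHBB", arp_header, 0)
print(hw_type, hex(proto_type), hw_size, proto_size)   # 1 0x800 6 4

# Reading never moves an offset, so the opcode at offset 6 is read explicitly.
(opcode,) = struct.unpack_from(">H", arp_header, 6)
print(opcode)   # 1
```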
There's another important function that can be called from here. It belongs to ProtocolBase and we have already met it:
- set_next_dissector(dissector_name, length = REMAINING_LENGTH) - When being called from a function, it'll only change the next dissector for the current packet.
Useful tip:
- The item after the PyFunctionItem (or the next dissector, if it is the last item) will be read beginning at packet.offset, so don't forget to set it to the right position if necessary.
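To see how the offset moves while a dissection function runs, it can help to exercise the function outside Wireshark. The sketch below is plain Python 3. FakePacket is a hypothetical stand-in written for this example, not Pyreshark's Packet class: it implements only the members used by add_addresses, with the offset behaviour described above (read_item advances the offset, unpack and add_text do not).

```python
import struct

ETHERNET = 1
IP = 0x0800

class FakePacket:
    """Hypothetical stand-in for Packet; records what would be added to the tree."""
    def __init__(self, buffer, offset, item_lengths):
        self.buffer = buffer
        self.offset = offset
        self.item_lengths = item_lengths   # item key -> field length in bytes
        self.tree = []

    def unpack(self, fmt, offset=None):
        # Reads values without touching self.offset.
        start = self.offset if offset is None else offset
        return struct.unpack_from(fmt, self.buffer, start)

    def read_item(self, key):
        # Adds the item, then advances the offset by the item's length.
        length = self.item_lengths[key]
        self.tree.append((key, self.buffer[self.offset:self.offset + length]))
        self.offset += length

    def add_text(self, text, length=0, offset=None):
        # Adds a line of text; the offset is not advanced.
        self.tree.append(("text", text))

def add_addresses(packet):
    hw_type, proto_type, hw_size, proto_size = packet.unpack(">HHBB", 0)
    if hw_type == ETHERNET:
        packet.read_item("mac")
    else:
        packet.add_text("Unimplemented hardware type")
        packet.offset += hw_size
    if proto_type == IP:
        packet.read_item("ip")
    else:
        packet.add_text("Unimplemented protocol type")
        packet.offset += proto_size

# ARP header (8 bytes) followed by the sender's MAC and IPv4 addresses.
data = bytes.fromhex("0001080006040001" "aabbccddeeff" "c0a80001")
packet = FakePacket(data, offset=8, item_lengths={"mac": 6, "ip": 4})
add_addresses(packet)
print(packet.offset)   # 18: the next item would start right after the sender's IP
```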
DissectorItem is an item that calls another dissector.
Parameter | Description | Default value |
---|---|---|
name | A protocol's name | - |
length | The number of bytes to be dissected | REMAINING_LENGTH |
- The item has a function set(name, length=REMAINING_LENGTH) that lets you change its parameters temporarily for the next time it is invoked. Only use it if you know what you're doing.
An item that advances the offset.
Parameter | Description | Default value |
---|---|---|
length | Number of bytes by which to advance the offset. | - |
encoding | One of ENC_*. Determines whether the length is read as big endian or little endian. | ENC_BIG_ENDIAN |
flags | Any of OFFSET_FLAGS_*. Useful for length-prefixed fields. | OFFSET_FLAGS_NONE |
The three flags available are:
Flag | What does it do? |
---|---|
OFFSET_FLAGS_NONE | The offset is advanced length bytes. |
OFFSET_FLAGS_READ_LENGTH | A uint of size length is read, and the offset is advanced by length + the uint's value |
OFFSET_FLAGS_READ_LENGTH_INCLUDING | A uint of size length is read, and the offset is advanced by the uint's value |
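The arithmetic behind the three flags can be checked with a short stand-alone sketch in plain Python 3. advance_offset is a hypothetical helper, and the numeric values given to the three constants are placeholders chosen for this example; the real constants come from Pyreshark.

```python
import struct

# Placeholder values for illustration only; the real constants are defined by Pyreshark.
OFFSET_FLAGS_NONE = 0
OFFSET_FLAGS_READ_LENGTH = 1
OFFSET_FLAGS_READ_LENGTH_INCLUDING = 2

UINT_FORMATS = {1: "B", 2: "H", 4: "I"}

def advance_offset(buffer, offset, length, flags=OFFSET_FLAGS_NONE, big_endian=True):
    """Hypothetical helper: return the new offset for each flag, following the table above."""
    if flags == OFFSET_FLAGS_NONE:
        return offset + length
    prefix = ">" if big_endian else "<"
    (value,) = struct.unpack_from(prefix + UINT_FORMATS[length], buffer, offset)
    if flags == OFFSET_FLAGS_READ_LENGTH:
        return offset + length + value   # skip the length field and the bytes it counts
    return offset + value                # the value already counts the length field itself

# A 2-byte big-endian length (5), then 5 payload bytes and one trailing byte.
data = bytes([0x00, 0x05]) + b"hello" + b"!"
print(advance_offset(data, 0, 2))                                      # 2
print(advance_offset(data, 0, 2, OFFSET_FLAGS_READ_LENGTH))            # 7
print(advance_offset(data, 0, 2, OFFSET_FLAGS_READ_LENGTH_INCLUDING))  # 5
```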
An item that changes a column's text.
Parameter | Description | Default value |
---|---|---|
col_id | The column's id (any COL_* from ws_consts.py). | - |
text | The new text of the column. | - |
An item that adds a new data source from which its sub-fields will be read. The source is created from a python string returned by a function passed as a parameter.
Parameter | Description | Default value |
---|---|---|
source_name | The name of the new source. | - |
create_data_func | A python function that returns the new source's bytes as a string. It'll be called with a single parameter: a Packet instance (See PyFunctionItem). | - |
items_list | A list of the items that will be read from the new source. | - |
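As a sketch of what a create_data_func might look like, the function below inflates a zlib-compressed payload that starts at the current offset. It is plain Python 3 and only an illustration: create_decompressed_source and the FakePacket stand-in are hypothetical names, and the compressed-payload scenario is an assumption, not something the sample protocol does.

```python
import zlib
from collections import namedtuple

# Hypothetical stand-in exposing only the two Packet attributes used below.
FakePacket = namedtuple("FakePacket", ["buffer", "offset"])

def create_decompressed_source(packet):
    """Return the new source's bytes: everything from the current offset, inflated."""
    return zlib.decompress(packet.buffer[packet.offset:])

# A 2-byte header followed by a zlib-compressed payload.
payload = b"\x00\x01\x08\x00"
packet = FakePacket(buffer=b"\xab\xcd" + zlib.compress(payload), offset=2)
print(create_decompressed_source(packet) == payload)   # True
```

The items in items_list would then be read from the returned bytes rather than from the original packet.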
- IMPORTANT: Even though you declare the items in Python, all items outside a PyFunctionItem will be dissected by C code! If you're worried about speed, avoid using PyFunctionItem unless necessary. Theoretically, if your protocol has no inner logic and contains no PyFunctionItems, there won't be any Python code running after Wireshark starts. I might write an explanation of how Pyreshark works later; in the meantime, have a look at the code.
- You can pass information between different functions by storing it in the Protocol object (accessible through self). Just make sure you reset your value when dissecting a new packet, as the same Protocol object is used for dissecting all packets.
- Don't make recursive dissectors (a dissector that contains a DissectorItem of itself). I'm not responsible for anything that might happen if you do.
That probably sums up the current possibilities and opportunities Pyreshark has to offer.
Good luck with your dissector(s)!