Writing Rules

Pavel Bansky edited this page Mar 14, 2017 · 28 revisions

Rules in DevSkim are fairly simple relative to other analysis engines. The detection logic is a regular expression written in JavaScript/C# RegEx syntax (the Sublime plugin has a translation layer that converts capture groups and other discrepancies for its Python-based engine). Long term, there will also be support for JavaScript-based lambdas for more sophisticated detection logic. The overall rule is JSON and carries the guidance, suggested fixes, and so on. Below is a sample rule, followed by an explanation of each key/value pair. An Atom-based UI is in the works to make creating and editing rules more approachable, but once you are familiar with the format it is fairly straightforward to add and edit the .json files directly.

Rule Example:

[
    {
        "id": "DS185832",
        "name": "Banned C function detected (strcpy)",
        "disabled": false,
        "tags": [
            "API.DangerousAPI.BannedFunction"
        ],
        "applies_to": [
            "c",
            "cpp",
            "objective-c"
        ],
        "severity": "important",
        "description": "strcpy is frequently dangerous, as it will cause a buffer overflow if the source is larger than the destination.",
        "replacement": "Use strcpy_s or strlcpy if possible. If no safe function is viable, strcpy/strncpy should be preceded by conditional checks to verify that the source string will fit in the destination with a null terminator.",
        "rule_info": "https://github.com/microsoft/devskim/guidance/DS185832.md",
        "patterns": [
            {
                "pattern": "\\bstrcpy\\(([^,]+),([^,\\)]+)\\)",
                "type": "regex",
                "subtype": [
                    "function-call"
                ]
            }
        ],
        "fix_it": [
            {
                "type": "regex-substitute",
                "name": "Change to strcpy_s (Recommended for VC++)",
                "search": "\\bstrcpy\\(([^,]+),([^,\\)]+)\\)",
                "replace": "strcpy_s(\\1, <size of \\1>, \\2)"
            },
            {
                "type": "regex-substitute",
                "name": "Change to strlcpy",
                "search": "\\bstrcpy\\(([^,]+),([^,\\)]+)\\)",
                "replace": "strlcpy(\\1, \\2, <size of \\1>)"
            }
        ]
    }
]
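
To make the relationship between `patterns` and `fix_it` concrete, here is a minimal Python sketch that loads a trimmed copy of the sample rule, searches a line of C code with the rule's pattern, and applies the first fix. It is not DevSkim's own engine: the helper names `find_matches` and `apply_fix` are hypothetical, and the sketch assumes that Python's `re` module treats this particular pattern and its `\1`/`\2` back-references the same way the real engine does.

```python
import json
import re

# Trimmed copy of the sample rule: only the keys this sketch needs.
# The raw string keeps the JSON escaping (\\b, \\1) exactly as in the .json file.
RULE_JSON = r'''
{
    "id": "DS185832",
    "patterns": [
        {"pattern": "\\bstrcpy\\(([^,]+),([^,\\)]+)\\)", "type": "regex"}
    ],
    "fix_it": [
        {
            "type": "regex-substitute",
            "name": "Change to strcpy_s (Recommended for VC++)",
            "search": "\\bstrcpy\\(([^,]+),([^,\\)]+)\\)",
            "replace": "strcpy_s(\\1, <size of \\1>, \\2)"
        }
    ]
}
'''

rule = json.loads(RULE_JSON)


def find_matches(rule, line):
    """Hypothetical helper: return the text of each regex pattern match in `line`."""
    hits = []
    for pattern in rule["patterns"]:
        if pattern["type"] == "regex":
            hits.extend(m.group(0) for m in re.finditer(pattern["pattern"], line))
    return hits


def apply_fix(fix, line):
    """Hypothetical helper: apply a regex-substitute fix_it entry to `line`."""
    if fix["type"] != "regex-substitute":
        return line  # other fix types are outside the scope of this sketch
    return re.sub(fix["search"], fix["replace"], line)


source_line = "strcpy(dest,src);"
print(find_matches(rule, source_line))          # ['strcpy(dest,src)']
print(apply_fix(rule["fix_it"][0], source_line))  # strcpy_s(dest, <size of dest>, src);
```

Note that the capture groups take everything between the delimiters, so a call written as `strcpy(dest, src)` carries the space after the comma into `\2`. The `<size of \1>` text is a placeholder the developer is expected to fill in by hand.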

Explanation of Rule Key/Value pairs

Each key/value pair is listed below as the key name, its {data type}, and an example, followed by details of the expected values.

#### id {string} example "id": "DS185832"

The id key must be a unique string (i.e. no other rule should have the same id). The convention in the existing security rules is to present the ID in the form "DS1XXXXX", where each X is a decimal digit. The idea behind that format is that an organization could write custom security rules for itself, label them "DS2XXXXX", and avoid accidental collisions with the default rules. Presumably, anyone who wanted to extend DevSkim beyond security-specific rules could increment the leading digit (e.g. "DS3XXXXX") for whatever area they are extending to.
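
The uniqueness requirement and the naming convention can be checked mechanically before rules are shipped. The following is a minimal sketch, not part of DevSkim: `check_ids` is a hypothetical helper, and the regular expression assumes the convention means "DS" followed by exactly six decimal digits, as in "DS185832".

```python
import re
from collections import Counter

# Assumed reading of the convention: "DS", a leading area digit,
# then five more decimal digits (six digits in total).
ID_FORMAT = re.compile(r"DS\d{6}")


def check_ids(rules):
    """Hypothetical helper: report ids that are duplicated or off-convention."""
    problems = []
    counts = Counter(rule["id"] for rule in rules)
    for rule_id, count in counts.items():
        if not ID_FORMAT.fullmatch(rule_id):
            problems.append(f"{rule_id}: does not follow the DS + six digits convention")
        if count > 1:
            problems.append(f"{rule_id}: used by {count} rules")
    return problems


rules = [
    {"id": "DS185832"},  # default security rule
    {"id": "DS200001"},  # organization-specific rule
    {"id": "DS185832"},  # accidental duplicate
]
for problem in check_ids(rules):
    print(problem)  # DS185832: used by 2 rules
```

Only the uniqueness check reflects a stated requirement; the format check enforces a convention and could be relaxed for rule sets that use a different scheme.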

#### name {string} example "name": "Banned C function detected (strcpy)"

A short, human-readable identifier for the rule. It does not need to be unique: several rules might all be titled "Weak/Broken Hash Algorithm" and essentially do the same thing, but use different regular expressions to find problems in different languages.

While organizations are free to use whatever spoken language they wish for their own rules, the rules published here have standardized on English-language identifiers. However, there is an effort to design a localization mechanism so that locale-specific translations can also be provided and shown to users; that will be forthcoming.