Add How-to guides on parsing Solidity for CLI/Rust/NPM #716

Merged
merged 5 commits into from
Dec 19, 2023
1 change: 1 addition & 0 deletions .cspell.json
@@ -11,6 +11,7 @@
"doxygen",
"ebnf",
"inheritdoc",
"instanceof",
"ipfs",
"mkdocs",
"napi",
27 changes: 27 additions & 0 deletions crates/solidity/outputs/npm/tests/src/tests/cst-cursor.ts
@@ -106,3 +106,30 @@ test("use cursor", () => {
  expectToken(cursor.node(), TokenKind.Semicolon, ";");
  expect(cursor.goToNext()).toBe(false);
});

test("cursor navigation", () => {
  const data = "contract Foo {} contract Bar {} contract Baz {}";

  const language = new Language("0.8.0");
  const parseTree = language.parse(RuleKind.SourceUnit, data);

  let contractNames = [];
  let cursor = parseTree.createTreeCursor();

  while (cursor.goToNextRuleWithKinds([RuleKind.ContractDefinition])) {
    // You have to make sure you return the cursor to original position
    cursor.goToFirstChild();
    cursor.goToNextTokenWithKinds([TokenKind.Identifier]);

    // The currently pointed-to node is the name of the contract
    let tokenNode = cursor.node();
    if (tokenNode.kind !== TokenKind.Identifier) {
      throw new Error("Expected identifier");
    }
    contractNames.push(tokenNode.text);

    cursor.goToParent();
  }

  expect(contractNames).toEqual(["Foo", "Bar", "Baz"]);
});
@@ -1,3 +1,116 @@
# How to parse a Solidity file

--8<-- "crates/solidity/inputs/language/snippets/under-construction.md"
In this guide, we'll walk you through the process of parsing a Solidity file using Slang. See [Installation](../#installation) for how to install Slang.

A file has to be parsed according to a specific Solidity [version](../../../solidity-specification/supported-versions/). The version has to be specified explicitly; it is not inferred from the source. If different parts of the source need different versions, e.g. when contracts from multiple files have been flattened into a single file, you have to split the source and parse each part yourself.
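
As a rough illustration of the manual step for flattened files, the sketch below splits a source string at each `pragma solidity` directive so that every chunk can then be parsed with its own version. The helper `splitByVersionPragma` and the `VersionedChunk` type are hypothetical and not part of Slang. The regular expression is deliberately naive: it ignores pragmas inside comments or strings and drops any text before the first pragma.

```ts
// Hypothetical helper, not part of Slang: splits a flattened source at each
// `pragma solidity ...;` directive.
interface VersionedChunk {
  versionPragma: string; // the version expression exactly as written, e.g. "^0.8.0"
  source: string; // text from this pragma up to (not including) the next one
}

function splitByVersionPragma(flattened: string): VersionedChunk[] {
  const pragma = /pragma\s+solidity\s+([^;]+);/g;
  const starts: { index: number; version: string }[] = [];

  // Record where each version pragma begins and what it says
  let match: RegExpExecArray | null;
  while ((match = pragma.exec(flattened)) !== null) {
    starts.push({ index: match.index, version: match[1].trim() });
  }

  // Each chunk runs from its pragma to the start of the next pragma (or the end)
  return starts.map((start, i) => {
    const end = i + 1 < starts.length ? starts[i + 1].index : flattened.length;
    return { versionPragma: start.version, source: flattened.slice(start.index, end) };
  });
}

const flattened = [
  "pragma solidity ^0.7.0;",
  "contract Old {}",
  "pragma solidity ^0.8.0;",
  "contract New {}",
].join("\n");

const chunks = splitByVersionPragma(flattened);
for (const chunk of chunks) {
  console.log(chunk.versionPragma, JSON.stringify(chunk.source));
}
```

Note that a pragma such as `^0.8.0` is a range, while the parser needs one concrete version, so the caller still has to choose a supported version that satisfies each range.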

## Using the NPM package

Start by adding the Slang package as a dependency to your project:

```bash
$ npm install "@nomicfoundation/slang"
```

Using the API directly gives us more fine-grained control over the parsing process; we can parse individual rules like contracts, various definitions or even expressions.

We start by creating a `Language` object with a given version. This is the entry point for our parser API.

```ts
import { Language } from "@nomicfoundation/slang/language";
import { RuleKind, TokenKind } from "@nomicfoundation/slang/kinds";
import { Cursor } from "@nomicfoundation/slang/cursor";

const source = "int256 constant z = 1 + 2;";
const language = new Language("0.8.11");

const parseOutput = language.parse(RuleKind.SourceUnit, source);
const cursor: Cursor = parseOutput.createTreeCursor();
```

The resulting `ParseOutput` object exposes these helpful functions:

- `errors()`/`isValid()`, which report the structured parse errors, if any,
- `tree()`, which gives us back a CST (partial if there were parse errors),
- `createTreeCursor()`, which creates a `Cursor` used to conveniently walk the parse tree.
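
Since `tree()` may return a partial CST, it can be worth checking for errors before using it. The sketch below shows one way to do that. It is written against a locally declared interface, `ParseOutputLike`, that mirrors only `errors()` and `tree()` from the list above. The interface, the assumption that `errors()` returns an array, the helper `treeOrThrow`, and the stand-in objects are all illustrative and are not Slang's actual types.

```ts
// Assumed shape for illustration: mirrors only two of the functions listed above.
interface ParseOutputLike<Tree, ParseError> {
  errors(): ParseError[]; // assumption: errors come back as an array
  tree(): Tree;
}

// Hypothetical helper: hands back the tree only if parsing produced no errors
function treeOrThrow<Tree, ParseError>(output: ParseOutputLike<Tree, ParseError>): Tree {
  const errors = output.errors();
  if (errors.length > 0) {
    throw new Error(`Parsing failed with ${errors.length} error(s)`);
  }
  return output.tree();
}

// Stand-in objects so the sketch runs without the real parser
const ok = { errors: () => [] as string[], tree: () => "SourceUnit" };
const broken = { errors: () => ["unexpected token"], tree: () => "partial SourceUnit" };

const tree = treeOrThrow(ok);

let failure = "";
try {
  treeOrThrow(broken);
} catch (e) {
  failure = (e as Error).message;
}
```

Whether to throw, log, or continue with the partial tree is a design choice that depends on the tool; a linter may prefer to keep going, while a code generator probably should not.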

### Example 1: Reconstruct the Solidity file

Let's try the same example, only now using the API directly.

We'll start with this file:

```solidity
// file: file.sol
pragma solidity ^0.8.0;
```

#### Step 1: Parse the Solidity file

Let's naively read the file and parse it, ignoring any errors:

```ts
import * as fs from "node:fs";
const data = fs.readFileSync("file.sol", "utf8");

let parseTree = language.parse(RuleKind.SourceUnit, data);
```

#### Step 2: Reconstruct the source code

The `Cursor` visits the tree nodes in a depth-first search (DFS) fashion. Since our CST is complete (includes trivia such as whitespace), it's enough to visit the `Token` nodes and concatenate their text to reconstruct the original source code.

Let's do that:

```ts
import { TokenNode } from "@nomicfoundation/slang/cst";

// Walk the tree we parsed in Step 1, not the one from the earlier snippet
const cursor = parseTree.createTreeCursor();

let output = "";
while (cursor.goToNext()) {
  const node = cursor.node();
  if (node instanceof TokenNode) {
    output += node.text;
  }
}

// Jest-style assertion for clarity
expect(output).toEqual("pragma solidity ^0.8.0;\n");
```
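
To see why concatenating token text is enough, here is a self-contained model of the same idea that runs without Slang. The `ToyNode` type and the kind names in `toyTree` are made up for illustration and do not match Slang's real node types or kinds. The traversal is a plain recursive depth-first walk rather than a cursor.

```ts
// Simplified stand-in for a CST; these are not Slang's actual node types.
type ToyNode =
  | { type: "rule"; kind: string; children: ToyNode[] }
  | { type: "token"; kind: string; text: string };

// Depth-first, left-to-right walk: tokens contribute their text,
// rules contribute the concatenation of their children.
function reconstruct(node: ToyNode): string {
  if (node.type === "token") {
    return node.text;
  }
  return node.children.map((child) => reconstruct(child)).join("");
}

// Illustrative tree for "pragma solidity ^0.8.0;\n", trivia included
const toyTree: ToyNode = {
  type: "rule",
  kind: "PragmaDirective",
  children: [
    { type: "token", kind: "PragmaKeyword", text: "pragma" },
    { type: "token", kind: "Whitespace", text: " " },
    {
      type: "rule",
      kind: "VersionPragma",
      children: [
        { type: "token", kind: "SolidityKeyword", text: "solidity" },
        { type: "token", kind: "Whitespace", text: " " },
        { type: "token", kind: "Caret", text: "^" },
        { type: "token", kind: "VersionValue", text: "0.8.0" },
      ],
    },
    { type: "token", kind: "Semicolon", text: ";" },
    { type: "token", kind: "EndOfLine", text: "\n" },
  ],
};

const rebuilt = reconstruct(toyTree);
console.log(JSON.stringify(rebuilt));
```

Because whitespace and line endings are tokens too, nothing is lost between the leaves, which is what makes the round trip exact.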

### Example 2: List the top-level contracts and their names

The `Cursor` type exposes more procedural-style functions that allow you to navigate the source in an imperative fashion. In addition to `goToNext`, we can go to the parent, first child, next sibling, etc., as well as nodes with a given kind.

To list the top-level contracts and their names, we need to visit the `ContractDefinition` rule nodes and then their `Identifier` children.

Let's do that:

```ts
import * as fs from "node:fs";
import { Language } from "@nomicfoundation/slang/language";
import { RuleKind, TokenKind } from "@nomicfoundation/slang/kinds";

// For this example, file.sol is assumed to contain:
// contract Foo {} contract Bar {} contract Baz {}
const data = fs.readFileSync("file.sol", "utf8");

const language = new Language("0.8.0");
const parseTree = language.parse(RuleKind.SourceUnit, data);

let contractNames = [];
let cursor = parseTree.createTreeCursor();

while (cursor.goToNextRuleWithKinds([RuleKind.ContractDefinition])) {
  // We descend into the contract here, so we have to return the cursor
  // to its original position afterwards (see goToParent below)
  cursor.goToFirstChild();
  cursor.goToNextTokenWithKinds([TokenKind.Identifier]);

  // The currently pointed-to node is the name of the contract
  let tokenNode = cursor.node();
  if (tokenNode.kind !== TokenKind.Identifier) {
    throw new Error("Expected identifier");
  }
  contractNames.push(tokenNode.text);

  cursor.goToParent();
}

expect(contractNames).toEqual(["Foo", "Bar", "Baz"]);
```