GoCrab is a Rust-to-Go transpiler written in Go. Why? It's my attempt to learn about compilers and transpilers. Who knows, I may love it?
This image alone is enough to make this project sound sooooo much cooler.
Writing a complete back end for a language can be a lot of work. If you have some existing generic IR to target, you could bolt your front end onto that. Otherwise, it seems like you’re stuck. But what if you treated some other source language as if it were an intermediate representation?
You write a front end for your language. Then, in the back end, instead of doing all the work to lower the semantics to some primitive target language, you produce a string of valid source code for some other language that’s about as high level as yours. Then, you use the existing compilation tools for that language as your escape route off the mountain and down to something you can execute.
The front end—scanner and parser—of a transpiler looks like other compilers. Then, if the source language is only a simple syntactic skin over the target language, it may skip analysis entirely and go straight to outputting the analogous syntax in the destination language.
If the two languages are more semantically different, you’ll see more of the typical phases of a full compiler including analysis and possibly even optimization. Then, when it comes to code generation, instead of outputting some binary language like machine code, you produce a string of grammatically correct source (well, destination) code in the target language.
Either way, you then run that resulting code through the output language’s existing compilation pipeline, and you’re good to go.
A scanner (or lexer) takes in the linear stream of characters and chunks them together into a series of something more akin to “words”. In programming languages, each of these words is called a token.
Here in our lexer package, a loop runs through the entire source and classifies each chunk as an identifier, a keyword, a string, a number, and so on.
The output for the following Rust code:
fn main() {
    let x = 10;
    let y = 20.5;
    let z = x + y - 3 * (5 / 2) % 2;
    let a = !x && y || z;
    let c = x > y && y <= z || x == 10;
    let result = if x != y { "unequal" } else { "equal" };
    let complex = (x << 2) + (y >> 1) - (a ^ c);
}
Very Cool!
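A tokenizing loop of the kind described above might look roughly like the sketch below. The names here (Token, TokenKind, Tokenize, the keyword and operator tables) are assumptions made for illustration and are not taken from GoCrab's actual lexer package. The sketch also simplifies number and string handling (no escapes, no validation of repeated dots).

```go
package main

import (
	"fmt"
	"unicode"
)

// TokenKind and Token are hypothetical names for this sketch.
type TokenKind string

const (
	Keyword    TokenKind = "KEYWORD"
	Identifier TokenKind = "IDENT"
	Number     TokenKind = "NUMBER"
	String     TokenKind = "STRING"
	Symbol     TokenKind = "SYMBOL"
)

type Token struct {
	Kind   TokenKind
	Lexeme string
}

// A deliberately tiny keyword table; real Rust has many more.
var keywords = map[string]bool{"fn": true, "let": true, "if": true, "else": true}

// Two-character operators are tried before single characters.
var twoCharOps = map[string]bool{
	"==": true, "!=": true, "<=": true, ">=": true,
	"&&": true, "||": true, "<<": true, ">>": true,
}

func isIdentPart(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_'
}

// Tokenize walks the source once and groups characters into tokens.
func Tokenize(src string) []Token {
	var tokens []Token
	runes := []rune(src)
	for i := 0; i < len(runes); {
		c := runes[i]
		start := i
		switch {
		case unicode.IsSpace(c):
			i++
		case unicode.IsLetter(c) || c == '_':
			for i < len(runes) && isIdentPart(runes[i]) {
				i++
			}
			word := string(runes[start:i])
			kind := Identifier
			if keywords[word] {
				kind = Keyword
			}
			tokens = append(tokens, Token{kind, word})
		case unicode.IsDigit(c):
			// Simplification: any run of digits and dots counts as a number.
			for i < len(runes) && (unicode.IsDigit(runes[i]) || runes[i] == '.') {
				i++
			}
			tokens = append(tokens, Token{Number, string(runes[start:i])})
		case c == '"':
			i++ // skip the opening quote
			for i < len(runes) && runes[i] != '"' {
				i++
			}
			if i < len(runes) {
				i++ // consume the closing quote
			}
			tokens = append(tokens, Token{String, string(runes[start:i])})
		case i+1 < len(runes) && twoCharOps[string(runes[i:i+2])]:
			tokens = append(tokens, Token{Symbol, string(runes[i : i+2])})
			i += 2
		default:
			tokens = append(tokens, Token{Symbol, string(c)})
			i++
		}
	}
	return tokens
}

func main() {
	for _, tok := range Tokenize(`let y = 20.5;`) {
		fmt.Printf("%-8s %q\n", tok.Kind, tok.Lexeme)
	}
}
```

Checking two-character operators before single characters is what keeps `<=` from being split into `<` and `=`.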
Initially, we will build an AST based on the following limited grammar:
expression → literal
           | unary
           | binary
           | grouping ;
literal    → NUMBER | STRING | "true" | "false" | "nil" ;
grouping   → "(" expression ")" ;
unary      → ( "-" | "!" ) expression ;
binary     → expression operator expression ;
operator   → "==" | "!=" | "<" | "<=" | ">" | ">="
           | "+" | "-" | "*" | "/" ;
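The grammar as written does not say which operator binds tighter, so a parser has to pick a precedence order. Below is a rough recursive-descent sketch that assumes the conventional order (equality, then comparison, then + and -, then * and /, then unary). To stay short it takes pre-split string tokens and returns a prefix-notation string instead of real AST nodes. The parser type and the Parse function are hypothetical names, not GoCrab's API.

```go
package main

import (
	"fmt"
	"strings"
)

// parser walks a pre-tokenized expression; tokens are plain strings for brevity.
type parser struct {
	toks []string
	pos  int
}

// peek returns the current token, or "" at end of input.
func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// next returns the current token and advances.
func (p *parser) next() string {
	t := p.peek()
	p.pos++
	return t
}

// binaryLevel parses a left-associative run of the given operators.
// ops is a space-delimited list; operand parses the next-higher precedence level.
func (p *parser) binaryLevel(ops string, operand func() string) string {
	left := operand()
	for p.peek() != "" && strings.Contains(ops, " "+p.peek()+" ") {
		op := p.next()
		right := operand()
		left = fmt.Sprintf("(%s %s %s)", op, left, right)
	}
	return left
}

// One method per precedence level, lowest first (assumed order).
func (p *parser) equality() string   { return p.binaryLevel(" == != ", p.comparison) }
func (p *parser) comparison() string { return p.binaryLevel(" < <= > >= ", p.term) }
func (p *parser) term() string       { return p.binaryLevel(" + - ", p.factor) }
func (p *parser) factor() string     { return p.binaryLevel(" * / ", p.unary) }

func (p *parser) unary() string {
	if t := p.peek(); t == "-" || t == "!" {
		p.next()
		return fmt.Sprintf("(%s %s)", t, p.unary())
	}
	return p.primary()
}

func (p *parser) primary() string {
	if p.peek() == "(" {
		p.next()
		inner := p.equality()
		p.next() // consume ")"; a real parser would report an error if it is missing
		return "(group " + inner + ")"
	}
	return p.next() // treat anything else as a literal
}

// Parse returns the expression in prefix form.
func Parse(tokens ...string) string {
	p := &parser{toks: tokens}
	return p.equality()
}

func main() {
	fmt.Println(Parse("1", "+", "2", "*", "3"))
}
```

Each precedence level only calls the level above it for its operands, which is what makes `*` bind tighter than `+` without any explicit precedence table.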
We can run the script to autogenerate the AST node types with go run .\scripts\main.go, which will generate the trees in expr.go.
Similarly, we can run the pretty printer with go run .\printer\main.go to show a basic AST output of (* (- 123) (group 45.67)).
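For readers who want to see how that prefix output can come out of a tree, here is a minimal sketch. The node types (Literal, Grouping, Unary, Binary) mirror the grammar above, but their exact shape is an assumption; the generated expr.go may look different.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is any expression node in this sketch (hypothetical, not the generated code).
type Expr interface{}

type Literal struct{ Value interface{} }

type Grouping struct{ Inner Expr }

type Unary struct {
	Op    string
	Right Expr
}

type Binary struct {
	Left  Expr
	Op    string
	Right Expr
}

// Print renders an expression in a Lisp-like prefix form.
func Print(e Expr) string {
	switch n := e.(type) {
	case Literal:
		return fmt.Sprint(n.Value)
	case Grouping:
		return parenthesize("group", n.Inner)
	case Unary:
		return parenthesize(n.Op, n.Right)
	case Binary:
		return parenthesize(n.Op, n.Left, n.Right)
	default:
		return "<unknown>"
	}
}

// parenthesize wraps a name and its printed sub-expressions in parentheses.
func parenthesize(name string, exprs ...Expr) string {
	var b strings.Builder
	b.WriteString("(" + name)
	for _, e := range exprs {
		b.WriteString(" " + Print(e))
	}
	b.WriteString(")")
	return b.String()
}

func main() {
	// Tree for: -123 * (45.67)
	tree := Binary{
		Left:  Unary{Op: "-", Right: Literal{Value: 123}},
		Op:    "*",
		Right: Grouping{Inner: Literal{Value: 45.67}},
	}
	fmt.Println(Print(tree)) // prints (* (- 123) (group 45.67))
}
```

A type switch is used here instead of the visitor pattern only to keep the sketch short; either works for walking the tree.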
Is this where we use LLVM IR or MLIR? I don't know yet.
I’d probably do an AST dump and then just do 1:1 substitutions for the language constructs wherever possible. You can pick version 1 of the language and ignore everything outside the core language.
I don’t think you need to reach the LLVM IR at all. LLVM IR is very low level (close to assembly), so you will lose information about loops and other programming constructs. If you attempt to “unparse” backwards from LLVM IR to source, your Go code will end up looking very different from the Rust source you began with.
It would be better to just transform one AST into the other AST and then “walk” the AST to get the final transpiled source. ASTs have far more source information, so you will end up with something closer to the source Rust program. The only trouble might be that I don’t know what kind of AST Rust compilers produce, and how easy it is to manipulate. Some compilers have immutable ASTs. I think LLVM has a Rust and a Go frontend, so maybe those two would work?
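As a rough illustration of the "walk the AST and do 1:1 substitutions" idea, the sketch below emits Go source for two Rust constructs. Everything in it is hypothetical: the Let, Println and Fn node types, the EmitGo function, and the choice to keep expressions as raw source text are assumptions made to keep the example small, not GoCrab's design.

```go
package main

import (
	"fmt"
	"strings"
)

// Stmt is any statement node in this sketch (hypothetical types).
type Stmt interface{}

// Let models `let name = value;` with the value kept as source text.
type Let struct {
	Name  string
	Value string
}

// Println models `println!("{}", arg);` with a single argument.
type Println struct{ Arg string }

// Fn models a function with no parameters and no return value.
type Fn struct {
	Name string
	Body []Stmt
}

// EmitGo walks a function node and writes the analogous Go source.
func EmitGo(fn Fn) string {
	var b strings.Builder
	b.WriteString("package main\n\nimport \"fmt\"\n\n")
	fmt.Fprintf(&b, "func %s() {\n", fn.Name)
	for _, s := range fn.Body {
		switch n := s.(type) {
		case Let:
			// Rust: let x = 10;   Go: x := 10
			fmt.Fprintf(&b, "\t%s := %s\n", n.Name, n.Value)
		case Println:
			// Rust: println!("{}", x);   Go: fmt.Println(x)
			fmt.Fprintf(&b, "\tfmt.Println(%s)\n", n.Arg)
		default:
			// Anything without a direct substitution is flagged, not guessed.
			fmt.Fprintf(&b, "\t// unsupported statement: %T\n", n)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	fn := Fn{Name: "main", Body: []Stmt{
		Let{Name: "x", Value: "10"},
		Println{Arg: "x"},
	}}
	fmt.Print(EmitGo(fn))
}
```

Note that even this tiny case hides a semantic gap: Go rejects unused variables, so a Rust `let` that is never read would need extra handling before the emitted code compiles.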
https://typeset.io/papers/reverse-compilation-techniques-123dy58xi2 You might find this useful.
Source-to-source compilation is a pretty hard problem to tackle, because programming patterns don't carry very well across languages. You will only ever be able to transpile trivial code, since anything beyond that will make significant use of stdlib/language intrinsics that cannot be directly replicated, e.g. async/await with an executor in Rust mapping to goroutines.
Literally my day job lmao, I do not recommend doing it 😂 Pain in the ass.
- Translates core Rust syntax to Go equivalents
- Handles basic data structures, functions, and modules
- Converts Rust's memory safety features to idiomatic Go patterns
- Provides CLI for easy file and directory transpilation
- Build the transpiler (to be elaborated)
- Write Proper tests
- Add Golint
- Add one click install (via docker maybe?)
- Have CI/CD for automated releases
- Build a frontend to showcase the compiler
- Write Blog
- Go 1.20 or higher
To install GoCrab, run the following command:
go install github.com/skysingh04/GoCrab@latest
After installing, you can start using GoCrab to transpile Rust code files to Go:
GoCrab path/to/rust/file.rs
This command will generate a .go file with equivalent Go code in the same directory.
If you prefer to use Docker to run GoCrab, you can pull the latest image from Docker Hub and use it to transpile Rust code to Go.
To get the latest GoCrab image from Docker Hub, run the following command:
docker pull skysingh04/gocrab:latest
You can use the Docker container to transpile your Rust files. Mount the directory containing your Rust file and specify the file as an argument:
docker run --rm -v $(pwd):/app skysingh04/gocrab:latest /app/path/to/rust/file.rs
This will generate the transpiled .go file in the same directory as your Rust file.
If you have a Rust file example.rs in your current directory, you can transpile it as follows:
docker run --rm -v $(pwd):/app skysingh04/gocrab:latest /app/example.rs
The output file example.go will be created in the same directory.
Contributions are welcome! Please read the CONTRIBUTING.md file for more details on how to get involved.
This project is licensed under the MIT License. See the LICENSE file for details.