Tool Developer Tutorial

David Wright edited this page Feb 7, 2023 · 6 revisions

This tutorial walks through a few basic ROSE-based tools to demonstrate the important parts of ROSE tool development.

Identity Translator

The IdentityTranslator is the most basic ROSE-based tool. Its primary purpose is to test ROSE and to demonstrate how ROSE-transformed code differs from the original source code. The code is shown below and is located in the ROSE repository at ${ROSE_SRC}/tutorials/identityTranslator.C.

#include "rose.h"

int main( int argc, char * argv[] ){
    // Initialize and check compatibility. See Rose::initialize
    ROSE_INITIALIZE;

    // Build the AST used by ROSE
    SgProject* project = frontend(argc,argv);
    ROSE_ASSERT (project != NULL);

    // Run internal consistency tests on AST
    AstTests::runAllTests(project);

    // Insert your own manipulation of the AST here...

    // Generate source code from AST and call the vendor's compiler
    return backend(project);
}

A few important things to notice about the code:

  • The include file rose.h is a large header file containing all ROSE includes.
  • The line SgProject* project = frontend(argc,argv); is what builds the entire AST with a SgProject node serving as root. Both argc and argv are passed to the call of frontend so that any command line arguments passed to the tool are relayed to the compiler.
  • The function AstTests::runAllTests(project); can be run before or after any transformations to ensure the AST remains valid.
  • When transforming code, the call backend(project) generates source code based on the AST and calls the backend compiler. If you are writing an analysis tool, this call is not necessary.

The IdentityTranslator can be used as a starting point for developing most tools by adding any transformation or analysis before the call to backend.

Print Scope Information

This analysis tool will traverse the AST and print out the scope information for nodes that have scope. The code can be seen below and is located in the ROSE repository at ${ROSE_SRC}/tutorials/scopeInformation.C.

#include "rose.h"

class visitorTraversal : public AstSimpleProcessing{
    public:
        virtual void visit(SgNode* n);
};

void visitorTraversal::visit(SgNode* n){
    // There are two types of IR nodes that can be queried for scope: SgStatement and SgInitializedName
    SgStatement* statement = isSgStatement(n);
    if (statement != NULL){
        SgScopeStatement* scope = statement->get_scope();
        ROSE_ASSERT(scope != NULL);
        printf ("SgStatement       = %12p = %30s has scope = %12p = %s (total number = %d) \n",
        statement,statement->class_name().c_str(),
        scope,scope->class_name().c_str(),(int)scope->numberOfNodes());
    }

    SgInitializedName* initializedName = isSgInitializedName(n);
    if (initializedName != NULL){
        SgScopeStatement* scope = initializedName->get_scope();
        ROSE_ASSERT(scope != NULL);
        printf ("SgInitializedName = %12p = %30s has scope = %12p = %s (total number = %d)\n",
        initializedName,initializedName->get_name().str(),
        scope,scope->class_name().c_str(),(int)scope->numberOfNodes());
    }
}

int main ( int argc, char* argv[] ){
    ROSE_INITIALIZE;

    SgProject* project = frontend(argc,argv);
    ROSE_ASSERT (project != NULL);

    // Build the traversal object
    visitorTraversal exampleTraversal;

    // Call the traversal starting at the project node of the AST
    exampleTraversal.traverseInputFiles(project,preorder);

    printf ("Number of scopes (SgScopeStatement) = %d \n",(int)SgScopeStatement::numberOfNodes());
    printf ("Number of scopes (SgBasicBlock)     = %d \n",(int)SgBasicBlock::numberOfNodes());

    return 0;
}

Scope Information

There are two kinds of nodes in the ROSE AST that contain scope information: SgStatement and SgInitializedName. A SgStatement node represents any of the different statements that may appear in the code. A SgInitializedName represents a declared variable.

Tree Traversal

This tool uses the tree traversal mechanism in ROSE to process the AST (see How to search for an AST node). To define what should be done during the traversal, we create our own subclass of AstSimpleProcessing and override the visit method.

class visitorTraversal : public AstSimpleProcessing{
    public:
        virtual void visit(SgNode* n);
};

When a traversal is performed on the tree, the visit method will be called on each node in the tree, which we can then use to perform our analysis.

void visitorTraversal::visit(SgNode* n){
    // There are two types of IR nodes that can be queried for scope: SgStatement and SgInitializedName
    SgStatement* statement = isSgStatement(n);
    if (statement != NULL){
        SgScopeStatement* scope = statement->get_scope();
        ROSE_ASSERT(scope != NULL);
        printf ("SgStatement       = %12p = %30s has scope = %12p = %s (total number = %d) \n",
        statement,statement->class_name().c_str(),
        scope,scope->class_name().c_str(),(int)scope->numberOfNodes());
    }
    ...
}

For each node, the visit method checks whether the node is a SgStatement by calling isSgStatement(SgNode* node). This function returns the pointer cast to SgStatement*, or NULL if the node is not a SgStatement. You can also cast ROSE AST nodes directly, but these functions are more efficient and easier to read. An equivalent casting function exists for every node type. The visitor then checks that the result is not NULL, meaning the node is a SgStatement, gets the statement's scope, and prints information about both the node and the scope. The same process is then repeated for SgInitializedName.
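The check-then-use pattern behind isSgStatement can be illustrated without ROSE. The sketch below is a toy model, not ROSE's implementation: the Node, Statement, and InitializedName classes and the asStatement, asInitializedName, and describe helpers are hypothetical names, and the sketch assumes dynamic_cast, whereas ROSE may implement its casting functions differently.

```cpp
#include <string>

// Toy node hierarchy (hypothetical; stands in for SgNode and two subclasses).
struct Node {
    virtual ~Node() = default;
};
struct Statement : Node {};
struct InitializedName : Node {};

// Checked downcasts in the style of isSgStatement(n): return the node as the
// requested type, or nullptr if the node is null or has a different type.
inline Statement* asStatement(Node* n) {
    return dynamic_cast<Statement*>(n);
}
inline InitializedName* asInitializedName(Node* n) {
    return dynamic_cast<InitializedName*>(n);
}

// Classify a node the way the visit method does: try each cast in turn and
// act only when the result is non-null.
inline std::string describe(Node* n) {
    if (asStatement(n) != nullptr) return "statement";
    if (asInitializedName(n) != nullptr) return "initialized name";
    return "other";
}
```

Because each helper returns a null pointer on a mismatch, the caller can test several node kinds one after another without risking an invalid cast.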

Begin Traversal

    visitorTraversal exampleTraversal;
    exampleTraversal.traverseInputFiles(project,preorder);

To begin the traversal, first, create an instance of the traversal object, then call its method traverseInputFiles. This method takes the AST to search and the order to search it in, preorder or postorder.
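The difference between the two orders is easiest to see on a small tree. The following is a minimal sketch of the idea only: Node, Order, SimpleTraversal, LabelCollector, and makeSampleTree are hypothetical stand-ins written for this example and are not part of ROSE.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy tree node (hypothetical; stands in for an AST node).
struct Node {
    std::string label;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string l) : label(std::move(l)) {}
};

enum class Order { Preorder, Postorder };

// Toy traversal base class: subclasses override visit, in the same spirit as
// subclassing AstSimpleProcessing.
class SimpleTraversal {
public:
    virtual ~SimpleTraversal() = default;
    virtual void visit(Node* n) = 0;

    void traverse(Node* n, Order order) {
        if (n == nullptr) return;
        if (order == Order::Preorder) visit(n);   // parent before its children
        for (auto& child : n->children) traverse(child.get(), order);
        if (order == Order::Postorder) visit(n);  // parent after its children
    }
};

// Example visitor that records the order in which nodes are visited.
class LabelCollector : public SimpleTraversal {
public:
    std::vector<std::string> seen;
    void visit(Node* n) override { seen.push_back(n->label); }
};

// Builds the tree  root -> { a -> { b }, c }  for demonstration.
inline std::unique_ptr<Node> makeSampleTree() {
    auto root = std::make_unique<Node>("root");
    auto a = std::make_unique<Node>("a");
    a->children.push_back(std::make_unique<Node>("b"));
    root->children.push_back(std::move(a));
    root->children.push_back(std::make_unique<Node>("c"));
    return root;
}
```

On this sample tree a preorder traversal visits root, a, b, c, while a postorder traversal visits b, a, c, root. Preorder suits analyses that need to see a parent before its children; postorder suits those that combine results from the children.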

Insert Additional Function Call

This transformation tool inserts a function call before the return statement, and can be found in the ROSE repository at ${ROSE_SRC}/tests/nonsmoke/functional/roseTests/astInterfaceTests/buildFunctionCalls.C.

#include "rose.h"
#include <iostream>
using namespace std;

int main (int argc, char *argv[]){
    SgProject *project = frontend (argc, argv);

    // insert header
    SgGlobal* globalscope = SageInterface::getFirstGlobalScope(project);
    SageInterface::insertHeader("inputbuildFunctionCalls.h",PreprocessingInfo::after,false,globalscope);

    // go to the function body
    SgFunctionDeclaration* mainFunc= SageInterface::findMain(project);
    SgBasicBlock* body= mainFunc->get_definition()->get_body();
    SageBuilder::pushScopeStack(body);

    // void foo(int p_sum)
    SgType* return_type = SageBuilder::buildVoidType();
    SgVarRefExp* arg1 = SageBuilder::buildVarRefExp(SgName("p_sum"));//type is inferred from symbol table
    SgExprListExp* arg_list = SageBuilder::buildExprListExp();
    SageInterface::appendExpression(arg_list,arg1);
    SgExprStatement* callStmt1 = SageBuilder::buildFunctionCallStmt(SgName("foo"),return_type, arg_list);

    // insert before the last return statement
    SgStatement* lastStmt = SageInterface::getLastStatement(SageBuilder::topScopeStack());
    SageInterface::insertStatement(lastStmt,callStmt1);
    SageBuilder::popScopeStack();

    AstPostProcessing(project);
    AstTests::runAllTests(project);

    return backend (project);
}

ROSE Namespaces

This tool uses the SageInterface namespace to locate elements and modify the AST, and the SageBuilder namespace to build new parts of the AST, including the section we will add. For this tutorial, we have made clear which namespace each function comes from, but in practice it is common to write using namespace SageInterface; and using namespace SageBuilder; instead.

Insert Header

SgGlobal* globalscope = SageInterface::getFirstGlobalScope(project);
SageInterface::insertHeader("inputbuildFunctionCalls.h",PreprocessingInfo::after,false,globalscope);

The first transformation inserts the header file that declares the function we are going to call. This is done with SageInterface: first SageInterface::getFirstGlobalScope finds the scope that the header should be inserted into, then SageInterface::insertHeader inserts the new header after the existing headers in that scope.

Locate function

SgFunctionDeclaration* mainFunc= SageInterface::findMain(project);
SgBasicBlock* body= mainFunc->get_definition()->get_body();
SageBuilder::pushScopeStack(body);

Before a new part of the AST can be built, the scope must be acquired. In this case, we use the body of the main function, but there are many other SgScopeStatements that can be used. We first use SageInterface::findMain to get the SgFunctionDeclaration for the main function, then use get_definition() to get the SgFunctionDefinition, then use get_body() to get the SgBasicBlock that is the body of the function. The body is then pushed onto the scope stack with SageBuilder::pushScopeStack. When SageBuilder builds new parts of the AST, it uses the top of the scope stack to determine the current context to build in. Pushing to the scope stack when we start and popping when we finish allows us to call other functions that build in their own context and then easily return to ours.
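The push/build/pop discipline described above can be modelled in a few lines. This is a toy sketch, not SageBuilder's implementation: ScopeStack, buildInCurrentScope, and buildInNestedScope are hypothetical names, and scopes are represented as plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy scope stack (hypothetical; models the idea, not SageBuilder's internals).
class ScopeStack {
public:
    void push(const std::string& scope) { scopes_.push_back(scope); }
    void pop() {
        assert(!scopes_.empty() && "pop without a matching push");
        scopes_.pop_back();
    }
    const std::string& top() const {
        assert(!scopes_.empty() && "no current scope");
        return scopes_.back();
    }
    bool empty() const { return scopes_.empty(); }

private:
    std::vector<std::string> scopes_;
};

// A builder that takes no explicit scope argument: it reads the current
// context from the top of the stack.
inline std::string buildInCurrentScope(const ScopeStack& stack,
                                       const std::string& what) {
    return what + " in " + stack.top();
}

// A helper that builds in its own context, then restores the caller's.
inline std::string buildInNestedScope(ScopeStack& stack) {
    stack.push("loop body");
    std::string result = buildInCurrentScope(stack, "assignment");
    stack.pop();  // the caller's scope is on top again
    return result;
}
```

After buildInNestedScope returns, the caller's scope is back on top of the stack, which is why every push should be paired with a pop.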

Build Function Call

SgType* return_type = SageBuilder::buildVoidType();
SgVarRefExp* arg1 = SageBuilder::buildVarRefExp(SgName("p_sum"));//type is inferred from symbol table
SgExprListExp* arg_list = SageBuilder::buildExprListExp();
SageInterface::appendExpression(arg_list,arg1);
SgExprStatement* callStmt1 = SageBuilder::buildFunctionCallStmt(SgName("foo"),return_type, arg_list);

The next step is to build the new subtree to insert into the AST. SageBuilder has functions that build the elements of the AST and link them appropriately for the scope at the top of the scope stack.
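The order of construction (argument list first, then the call that owns it) can be mirrored in a toy model. The names below (ArgList, CallStmt, makeArgList, appendArg, makeCallStmt, unparse) are hypothetical and deliberately differ from the SageBuilder functions; the sketch handles plain strings rather than AST nodes and ignores types and symbol lookup.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy argument list (hypothetical; plays the role of the expression list).
struct ArgList {
    std::vector<std::string> args;
};

// Toy call statement (hypothetical; plays the role of the call statement).
struct CallStmt {
    std::string callee;
    ArgList args;
};

// Step 1: start with an empty argument list.
inline ArgList makeArgList() { return ArgList{}; }

// Step 2: append each argument expression to the list.
inline void appendArg(ArgList& list, const std::string& expr) {
    list.args.push_back(expr);
}

// Step 3: combine the callee name and the argument list into a statement.
inline CallStmt makeCallStmt(const std::string& callee, const ArgList& args) {
    return CallStmt{callee, args};
}

// Render the statement as source text, e.g. "foo(p_sum);".
inline std::string unparse(const CallStmt& stmt) {
    std::string out = stmt.callee + "(";
    for (std::size_t i = 0; i < stmt.args.args.size(); ++i) {
        if (i > 0) out += ", ";
        out += stmt.args.args[i];
    }
    return out + ");";
}
```

Building the list for a single argument p_sum and a callee foo yields a statement that renders as foo(p_sum);, which is the call the tool inserts.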

Insert Function Call

SgStatement* lastStmt = SageInterface::getLastStatement(SageBuilder::topScopeStack());
SageInterface::insertStatement(lastStmt,callStmt1); 
SageBuilder::popScopeStack();

Now that the statement has been built, we use SageInterface::getLastStatement to find the last statement in the scope at the top of the scope stack, which marks where we want to insert our function call. We then use SageInterface::insertStatement to insert the new statement into the AST. Just as we pushed to the scope stack at the beginning, we now use SageBuilder::popScopeStack to pop the stack and return it to the context it had before we started.
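The effect of this step, placing the new call just before the final return, can be sketched on a plain list of statements. This is an illustration only: Block and insertBeforeLast are hypothetical names, statements are strings, and the sketch assumes the last statement of the block is the return.

```cpp
#include <string>
#include <vector>

// Toy basic block: an ordered list of statements (hypothetical model).
using Block = std::vector<std::string>;

// Inserts newStmt immediately before the last statement of the block.
// If the block is empty, newStmt becomes its only statement.
inline void insertBeforeLast(Block& block, const std::string& newStmt) {
    if (block.empty()) {
        block.push_back(newStmt);
        return;
    }
    block.insert(block.end() - 1, newStmt);
}
```

Applied to a block holding int p_sum = 0; followed by return 0;, inserting foo(p_sum); produces the declaration, then the call, then the return.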

Final Steps

SageInterface::AstPostProcessing(project);
AstTests::runAllTests(project);
return backend (project);

Now that the transformed AST has been built, we run the post-processing and the tests, which update attributes and check that the AST is still correct. The final step is to pass the transformed AST to backend, which writes the transformed source code to a file named rose_* and invokes the backend compiler on it.