Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translation engine catalog optimization #1012

Merged
merged 11 commits into from
Oct 20, 2021

Conversation

fineg74
Copy link
Contributor

@fineg74 fineg74 commented Oct 13, 2021

The purpose of this PR is to refactor the gaiat to eliminate duplication of code between llvm, gaiat and gaia, optimize interaction of gaiat with catalog to reduce number of DB transactions, make gaiat and Gaia introduced llvm code to use llvm-native ADT that per some evidence faster then STL ones, various performance optimizations.
There are no functional changes. The users are expected to see noticeable improvement in gaiat performance when used on large rulesets with complex DB schema.

@@ -610,19 +505,10 @@ bool table_navigation_t::generate_navigation_step(const string& source_table, co
return true;
}

void table_navigation_t::ensure_initialization()
llvm::SmallVector<string, 8> table_navigation_t::get_table_fields(const string& table)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can go a little higher than 8 to avoid heap allocation. You can easily expect a table to have more than 8 columns. (maybe 16?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -458,6 +459,44 @@ namespace llvm {
}
};

template <>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add documentation about what this is and why did you need to add it (what is the Gaia use-case that we are trying to solve by adding this class).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

void ASTContext::InitBuiltinType(CanQualType &R, BuiltinType::Kind K) {
auto *Ty = new (*this, TypeAlignment) BuiltinType(K);
R = CanQualType::CreateUnsafe(QualType(Ty, 0));
cacheEDCType(Ty);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you know this is an EDC type (it doesn't look so to me). Same for the following instances.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think we should find a better way than putting cacheEDCType(...); in so many methods within this calss. Maybe we can do it once when the translation engine is initialized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added verification by name.
Not sure if we can do anything else unless we want to make some shortcuts i.e. similar to what Wayne did (his solution doesn't take into account definition context that may impact the correctness of cached types)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

Copy link
Contributor

@simone-gaia simone-gaia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@simone-gaia simone-gaia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking comment (you can merge): were you able to measure a performance improvement after this refactoring?

@@ -15,4 +15,4 @@ ruleset test10
}
}

// CHECK: Multiple shortest paths exist between tables 'crop' and 'sensor'. Explicitly specify the navigation path between these two tables to resolve the ambiguity.
// CHECK: Multiple shortest paths exist between tables 'crop' and 'feeding'. Explicitly specify the navigation path between these two tables to resolve the ambiguity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity why this changed? Does this mean we had a bug earlier that was fixed by this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe different hashing function led to different order in which tables are iterated and reported. (The schema has multiple tables to which multiple shortest paths exist)

@fineg74
Copy link
Contributor Author

fineg74 commented Oct 15, 2021

Non-blocking comment (you can merge): were you able to measure a performance improvement after this refactoring?

It likely depends on your schema and ruleset. For my change Wayne reported improvement of around 30%. He reported much better improvement for his caching search improvement and since my change has the same caching but implemented slightly differently and hopefully more correctly the improvement should be comparable

@simone-gaia
Copy link
Contributor

Non-blocking comment (you can merge): were you able to measure a performance improvement after this refactoring?

It likely depends on your schema and ruleset. For my change Wayne reported improvement of around 30%. He reported much better improvement for his caching search improvement and since my change has the same caching but implemented slightly differently and hopefully more correctly the improvement should be comparable

Awesome, thanks!

@fineg74 fineg74 merged commit be687c3 into master Oct 20, 2021
@fineg74 fineg74 deleted the TranslationEngineCatalogOptimization branch October 20, 2021 02:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants