Add Hive Query language as an SQL dialect #880

tkluck-booking · 2018-03-07T17:51:33Z

This adds syntax support for Hive Query Language (HQL), an SQL dialect that is heavily used on Hadoop clusters. For reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual

This is a very naive attempt, extending the SQL lexer only by adding interpolation variables and by updating the list of reserved words/keywords. I'll be happy to hear your thoughts!

lib/rouge/lexers/hql.rb

tkluck-booking · 2018-04-16T20:44:25Z

Just found time to implement your suggestions and to fix the omissions you found. Thanks again for the review @dblessing ! Let me know what you think of these latest commits.

tkluck · 2018-07-26T11:09:55Z

Hi @dblessing , is there anything I can do to help this get merged?

(Commenting from my personal account -- hope that doesn't lead to confusion.)

zoidyzoidzoid · 2018-11-27T11:37:07Z

Any update on this?

vidarh

Hi, I'm trying to assist in triaging pull requests to see if we can get the backlog down. For the most part your pull request looks fine, but see the comments below. Thanks for your submission...

lib/rouge/lexers/sql.rb

pyrmont · 2019-08-05T10:05:55Z

@tkluck-booking @tkluck I've recently joined the maintainers group of Rouge and have been working my way chronologically backwards through the outstanding PRs. I'm sorry that's meant I'm only just getting to this now :(

Are you still able to submit changes to this? @vidarh raises a couple of good points that are outstanding and I'm personally a little unclear on why some of the keywords were removed from the SQL lexer. Just wanted to check you're still in a position to respond.

tkluck-booking · 2019-08-05T10:31:10Z

Hi @pyrmont congrats on stepping up as a maintainer! I'll be glad to update the PR with these comments. I'll aim for later this week.

pyrmont · 2019-08-05T11:54:03Z

@tkluck-booking Thanks for the quick response! Look forward to hearing more fully from you soon! :)

I added the distinction between keywords_type and keywords to HQL, which subclasses SQL. @dblessing makes the excellent suggestion to move that up to the SQL class, which might benefit from it too. This commit does that. That also means adding a list of type keywords to the SQL class. I opted to take MySQL's list, as I'm more familiar with MySQL than with other dialects; the choice of dialect is arbitrary anyway. This *mostly* adds keywords, but a few were moved from keywords to keywords_type. SET is a tricky one, as it's also a statement, but highlighting it as a type works for what's probably the more prevalent occurrence.
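To illustrate the keywords / keywords_type split described in this commit message, here is a minimal standalone sketch. It is not the code from the PR and does not use Rouge: the class name `SketchSQL`, the `classify` helper, and the tiny word lists are all assumptions chosen for illustration.

```ruby
require "set"

# Hypothetical stand-in for a SQL lexer class; not Rouge's actual code.
class SketchSQL
  # Ordinary reserved words (a tiny illustrative subset).
  def self.keywords
    @keywords ||= Set.new(%w[SELECT FROM WHERE UPDATE])
  end

  # Type names, kept in their own list so they can get their own token.
  def self.keywords_type
    @keywords_type ||= Set.new(%w[INT VARCHAR TIMESTAMP YEAR])
  end

  # Classify one bare word. Type names are checked first, so a word
  # that appeared in both lists would be reported as a type.
  def self.classify(word)
    upper = word.upcase
    return :type if keywords_type.include?(upper)
    return :keyword if keywords.include?(upper)

    :name
  end
end
```

With these assumed lists, `SketchSQL.classify("varchar")` gives `:type`, `SketchSQL.classify("select")` gives `:keyword`, and any other word falls through to `:name`. Checking the type list first is what makes a word like SET ambiguous: whichever list it is placed in decides how every occurrence is highlighted.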

As noticed by @vidarh, this better represents the existence of many different interpretations in different SQL dialects. The suggestion in the code review was to maybe make this an optional argument (e.g. `dialect`) to the lexer, but since we only have two dialects at the moment, I feel it's clearer to stick with subclassing as a way to define dialects for now.
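The subclassing approach preferred here over a `dialect` argument can be sketched roughly as follows. This is a plain-Ruby illustration under stated assumptions, not the PR's code: `BaseDialect`, `HiveDialect`, `keyword?`, and the short keyword lists are hypothetical names.

```ruby
require "set"

# Hypothetical base "dialect"; the names are illustrative, not Rouge's API.
class BaseDialect
  KEYWORDS = Set.new(%w[SELECT FROM WHERE]).freeze

  def self.keywords
    KEYWORDS
  end

  # Shared lookup logic lives in the base class and consults the
  # (possibly overridden) keyword list of the class it is called on.
  def self.keyword?(word)
    keywords.include?(word.upcase)
  end
end

# A dialect is a subclass that swaps in a larger keyword list.
class HiveDialect < BaseDialect
  KEYWORDS = (BaseDialect::KEYWORDS | Set.new(%w[LATERAL DISTRIBUTE CLUSTER])).freeze

  def self.keywords
    KEYWORDS
  end
end
```

In this sketch `HiveDialect.keyword?("lateral")` is true while `BaseDialect.keyword?("lateral")` is false, and both accept `select`. With only two dialects, each one stays a small, self-describing class, which is the trade-off argued for above.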

tkluck-booking · 2019-08-05T15:31:29Z

Just force-pushed a branch with the conflicts resolved and the suggestions implemented. Let me know if anything else is needed to merge this!

pyrmont

Let me know if anything is unclear!

lib/rouge/lexers/sql.rb

spec/visual/samples/hql

Both of these are (part of) type specifications as well, and so we have to choose where to put them. In f3311f1, I let the interpretation as a type always take precedence. This commit applies a subjective rule of least surprise instead:

- Most people will run into COLLATE as a function, not as part of a Unicode collation rule for a column
- Most people will run into SET as part of an UPDATE statement, not as part of a CHARACTER SET specification

I chose not to move YEAR and TIMESTAMP from that same commit. They are functions as well as types, but their use as a function is so closely related to the type that there's no confusion to be feared.

I didn't move NATIONAL (from NATIONAL CHAR), NCHAR and PRECISION. They were moved to be types in the same commit, but that's their only interpretation.

tkluck · 2019-08-13T21:40:47Z

Will fix and push the failing CI tomorrow (CEST).

pyrmont · 2019-08-14T00:21:03Z

Travis was failing because the lexer was using the SQL lexer's rules in the :double_string state and those tokenise " as Name::Variable. I added a rule to override this but also took the opportunity to change the tokens to Str::Double. Visually, these should almost always look the same, but it's more semantically correct. (If you think it's better to have single-quoted and double-quoted strings use the same token, then Str would be more correct.)
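As a rough illustration of giving double-quoted and single-quoted strings distinct tokens, here is a small standalone tokenizer built on Ruby's `StringScanner`. It is only a sketch: the function name `tokenize_strings` is hypothetical, the token names are plain strings that mirror the ones mentioned in the comment above, and it does not reproduce Rouge's state machine.

```ruby
require "strscan"

# Hypothetical mini-tokenizer; the token names mirror the discussion
# above, but this is not Rouge's lexer code.
def tokenize_strings(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if (text = scanner.scan(/"(?:[^"\\]|\\.)*"/))
      # Complete double-quoted string, backslash escapes allowed.
      tokens << ["Str::Double", text]
    elsif (text = scanner.scan(/'(?:[^'\\]|\\.)*'/))
      # Complete single-quoted string.
      tokens << ["Str::Single", text]
    elsif (text = scanner.scan(/[^'"]+/))
      # Anything that is not part of a quoted string.
      tokens << ["Text", text]
    else
      # A quote that never closes: emit it alone and keep going.
      tokens << ["Error", scanner.getch]
    end
  end
  tokens
end
```

Calling `tokenize_strings(%q{SELECT "a", 'b'})` yields a `Text` token for `SELECT `, a `Str::Double` token for `"a"`, a `Text` token for the comma and space, and a `Str::Single` token for `'b'`. Using one shared label for both quote styles would only require changing the two token names.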

pyrmont · 2019-08-14T03:34:08Z

@tkluck-booking I think this is good to merge. Let me know if you think anything is missing.

tkluck-booking · 2019-08-14T15:11:25Z

@pyrmont thanks for collaborating on this with me. I approve of & love your changes. Let's merge!

pyrmont · 2019-08-14T15:32:36Z

Thanks for all your work, @tkluck-booking! It's good to support an additional language and thank you for improving the SQL lexer :)

dblessing requested changes Apr 3, 2018

lib/rouge/lexers/hql.rb (resolved)

lib/rouge/lexers/hql.rb (outdated, resolved)

vidarh suggested changes Jan 11, 2019

lib/rouge/lexers/sql.rb (outdated, resolved)

lib/rouge/lexers/sql.rb (outdated, resolved)

vidarh mentioned this pull request Jan 17, 2019

Cleaning up pull requests - Reviewed lexer PRs #1063

Closed

pyrmont added the needs-review label ("The PR needs to be reviewed") Jul 18, 2019

pyrmont added the author-action label ("The PR has been reviewed but action by the author is needed") and removed the needs-review label Aug 5, 2019

tkluck-booking added 8 commits August 5, 2019 15:12

Add Hive Query language as an SQL dialect

fa4b3b9

HQL lexer: fix double quote string highlighting and the corresponding spec

54b4228

Add an HQL demo file

ba267a3

I don't understand the difference between the files in `lib/rouge/demos` and `spec/visual/samples`, but I'm just cargo-culting having the same file in both.

SQL/HQL interpretation of ": move to SQL base class

33b4d40

As discussed in rouge-ruby#880, we're guessing it wasn't a conscious decision to highlight a literal string as Name::Variable, so we may as well fix this in SQL as opposed to fixing it by an override in the subclass.

HQL: add built-in functions to the list of keywords

5e6456f

SQL/HQL lexer: Use %r to introduce regular expressions

5b71464

This is a change that has been made more generally in the SQL lexer since this feature branch was started. Applying it here for consistency.

tkluck-booking force-pushed the hive-hql branch from cfc3ceb to b39605e Compare August 5, 2019 15:30

pyrmont suggested changes Aug 7, 2019

lib/rouge/lexers/sql.rb (outdated, resolved)

lib/rouge/lexers/sql.rb (outdated, resolved)

spec/visual/samples/hql (outdated, resolved)

tkluck-booking added 3 commits August 7, 2019 11:08

SQL/HQL: one more place for distinguishing single/double quotes

b8ed3f1

This should have been part of b39605e. Thanks to @pyrmont's code review.

SQL: make types a Name::Builtin

6ce6486

As suggested by @pyrmont in rouge-ruby#880 (comment) and as is also done in Pygments and Chroma.

Fix tokens for double-quoted strings

f48e72f

Tweak visual sample

f408ef3

pyrmont merged commit 30e1e8c into rouge-ruby:master Aug 14, 2019

pyrmont removed the author-action label ("The PR has been reviewed but action by the author is needed") Aug 14, 2019
