-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support 'chevron' library for templating as jinja alternative #11617
Conversation
(chevron)[https://github.com/noahmorrison/chevron] is a python implementation of mustache.js, and seems much safer by scope than jinja could ever be. While the library is not super active lately, it appears to be feature complete and is easy to review from a security standpoint as it's a few short-ish modules. This PR adds support for the chevron library behind a feature flag.
Codecov Report
@@ Coverage Diff @@
## master #11617 +/- ##
==========================================
+ Coverage 62.26% 62.69% +0.42%
==========================================
Files 873 873
Lines 42238 43307 +1069
Branches 3959 4079 +120
==========================================
+ Hits 26301 27151 +850
- Misses 15757 15976 +219
Partials 180 180
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
superset/jinja_context.py
Outdated
@@ -338,6 +347,8 @@ def get_template_processor( | |||
template_processor = template_processors.get( | |||
database.backend, BaseTemplateProcessor | |||
) | |||
elif feature_flag_manager.is_feature_enabled("CHEVRON_TEMPLATE_PROCESSING"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to nest this conditional under if ENABLE_TEMPLATE_PROCESSING
, sort of as a sub option after the user turns tempate processing on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should rename ENABLE_TEMPLATE_PROCESSING
to JINJA_TEMPLATE_PROCESSING
instead (albeit a breaking change)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, clearly there's some cleanup to be done here, but also need to maintain backward compatibility. I think that's the easy portion of this PR, the real question is around the viability of chevron
as a secure templating backend.
As a solution to this particular topic though I'd like to add JINJA_TEMPLATE_PROCESSING
as True
by default and nest the condition under ENABLE_TEMPLATE_PROCESSING
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose calling the feature flag TEMPLATE_PROCESSING
, and then adding a config parameter TEMPLATE_PROCESSOR: Optional[TemplateProcessorType] = TemplateProcessorType.CHEVRON
. Similar to how other feature flagged options do (see e.g. THUMBNAIL_SELENIUM_USER
, THUMBNAIL_CACHE_CONFIG
etc).
This looks good upon cursory review. I'll look into the lib a bit more deeply. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and no harm in supporting multiple template processors. I'm slightly concerned by the staleness of the library, but that's a different topic (not a problem before chevron
is potentially enabled by default).
superset/jinja_context.py
Outdated
@@ -338,6 +347,8 @@ def get_template_processor( | |||
template_processor = template_processors.get( | |||
database.backend, BaseTemplateProcessor | |||
) | |||
elif feature_flag_manager.is_feature_enabled("CHEVRON_TEMPLATE_PROCESSING"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose calling the feature flag TEMPLATE_PROCESSING
, and then adding a config parameter TEMPLATE_PROCESSOR: Optional[TemplateProcessorType] = TemplateProcessorType.CHEVRON
. Similar to how other feature flagged options do (see e.g. THUMBNAIL_SELENIUM_USER
, THUMBNAIL_CACHE_CONFIG
etc).
FYI, offered help to maintain |
On another note, have you looked into JinjaSQL? I've used it in some Python scripts before, but not sure whether it is safer than the raw jinja. |
@ktmud Thanks for the suggestion! It seems with JinjaSQL "You can use the full power of Jinja templates," which is what we're trying to move away from (the ability to execute arbitrary python code in user-generated input). |
@ktmud didn't spend much time but that lib seems to be doing very little on top of jinja as highlighted in the single, very simple <200 LoC module here: |
It seems the only thing it does it to parameterize SQL queries, which at least reduces the risk of SQL injections (if there's ever such risk within Superset). |
Not sure if we really want to move away from Jinja. Other than the unintentional exposure of unsafe objects, what are other potential risks with Jinja? One thing we could to is to be more careful with what we expose to Jinja context---and the first thing we should probably do is to always expose raw functions instead of class objects (i.e., replace |
One concern with
While this can be mitigated since we control how |
@ktmud The intention is the prevent security issues like CVE-2020-13948. We should absolutely move away from exposing class objects to the Jinja context, but this doesn't prevent access to Python internals. We're relying heavily on Jijna's Sandboxing to prevent RCE, which feels very brittle. |
I think CVE-2020-13948 has the same root cause as the other security bug we had, which is exposing class objects or modules that allow unsafe chained access to things we don't want to expose. As long as we stop doing that, is there anything else we should worry about? A lot of OSS use Jinja as their template engine, including Airflow and dbt, is there anything we can learn from them to make Jinja more secure? |
@ktmud The difference here is that we're processing user-generated input, while Airflow and dbt are operating on templates defined in code. Unclear how the hosted versions of these apps (Astronomer and dbt Cloud) handle Jinja security. |
To be fair dbt is also very much user-generated input. Anyone can login to the dbt UI and write templated code in the IDE: https://docs.getdbt.com/docs/running-a-dbt-project/using-the-dbt-ide |
Out of curiosity we could look at DBT's code to see how they secure their sandbox. |
Clearly the compatibility with DBT AND Airflow through Jinja is a really important factor for many. People want to be able to iterate and go back and forth between these tools. |
I'll take a stab at making the jinja implementation more secure in a separate branch. |
Attempt at making Jinja more secure: #11704 |
Superseded by #11704 |
SUMMARY
chevron is a python implementation of mustache.js, and seems much safer by scope (project tagline is "Logic-less templates" !) than
jinja2
could ever be.While the library is not super active lately, it appears to be feature complete and is easy to review from a security standpoint as it's a few short-ish modules. Overall the whole library is <1k lines.
This PR adds support for the chevron library behind a feature flag.
SHORTCOMINGS
{{{ ... }}}
. I submitted a PR to enable turning this off in our context here allow bypassing html_escaping noahmorrison/chevron#81{{mylist.0}}
mustache feature, but works in chevron'smaster
, highlighting the fact that the lib hasn't released to PyPI in a long timeBEFORE/AFTER SCREENSHOTS OR ANIMATED GIF