Failed taking a screenshot Message: 'geckodriver' executable needs to be in PATH. #23057

Closed
hanyujunsir opened this issue Feb 10, 2023 · 7 comments
Labels
#bug Bug report

@hanyujunsir

import logging
import os
from datetime import timedelta
from typing import Optional

from cachelib.file import FileSystemCache
from celery.schedules import crontab

logger = logging.getLogger()

def get_env_variable(var_name: str, default: Optional[str] = None) -> str:
    """Get the environment variable or raise exception."""
    try:
        return os.environ[var_name]
    except KeyError:
        if default is not None:
            return default
        else:
            error_msg = "The environment variable {} was missing, abort...".format(
                var_name
            )
            raise EnvironmentError(error_msg)

DATABASE_DIALECT = get_env_variable("DATABASE_DIALECT")
DATABASE_USER = get_env_variable("DATABASE_USER")
DATABASE_PASSWORD = get_env_variable("DATABASE_PASSWORD")
DATABASE_HOST = get_env_variable("DATABASE_HOST")
DATABASE_PORT = get_env_variable("DATABASE_PORT")
DATABASE_DB = get_env_variable("DATABASE_DB")

# The SQLAlchemy connection string.
SQLALCHEMY_DATABASE_URI = "%s://%s:%s@%s:%s/%s" % (
    DATABASE_DIALECT,
    DATABASE_USER,
    DATABASE_PASSWORD,
    DATABASE_HOST,
    DATABASE_PORT,
    DATABASE_DB,
)

REDIS_HOST = get_env_variable("REDIS_HOST")
REDIS_PORT = get_env_variable("REDIS_PORT")
REDIS_CELERY_DB = get_env_variable("REDIS_CELERY_DB", "0")
REDIS_RESULTS_DB = get_env_variable("REDIS_RESULTS_DB", "1")

RESULTS_BACKEND = FileSystemCache("/app/superset_home/sqllab")

CACHE_CONFIG = {
    "CACHE_TYPE": "redis",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_HOST": REDIS_HOST,
    "CACHE_REDIS_PORT": REDIS_PORT,
    "CACHE_REDIS_DB": REDIS_RESULTS_DB,
}
DATA_CACHE_CONFIG = CACHE_CONFIG

class CeleryConfig(object):
    BROKER_URL = 'redis://%s:%s/0' % (REDIS_HOST, REDIS_PORT)
    CELERY_IMPORTS = ('superset.sql_lab', "superset.tasks", "superset.tasks.thumbnails", )
    CELERY_RESULT_BACKEND = 'redis://%s:%s/0' % (REDIS_HOST, REDIS_PORT)
    CELERYD_LOG_LEVEL = "DEBUG"
    CELERYD_PREFETCH_MULTIPLIER = 10
    CELERY_ACKS_LATE = False
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
            'time_limit': 600,
            'soft_time_limit': 600,
            'ignore_result': True,
        },
    }
    CELERYBEAT_SCHEDULE = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=0, hour=0),
        },
    }

CELERY_CONFIG = CeleryConfig
SCREENSHOT_LOCATE_WAIT = 1000
SCREENSHOT_LOAD_WAIT = 1200

# Slack configuration
SLACK_API_TOKEN = "xoxb-"

# Email configuration
SMTP_HOST = "smtp.126.com"  # change to your host
SMTP_PORT = 25  # your port, e.g. 587
SMTP_STARTTLS = True
SMTP_SSL_SERVER_AUTH = True  # If you're using an SMTP server with a valid certificate
SMTP_SSL = False
SMTP_USER = "####"  # use the empty string "" if using an unauthenticated SMTP server
SMTP_PASSWORD = "###"  # use the empty string "" if using an unauthenticated SMTP server
SMTP_MAIL_FROM = "###"
EMAIL_REPORTS_SUBJECT_PREFIX = "[Superset] "  # optional - overwrites default value in config.py of "[Report] "

FEATURE_FLAGS = {"ALERT_REPORTS": True}
ALERT_REPORTS_NOTIFICATION_DRY_RUN = False
WEBDRIVER_BASEURL = "http://superset:8088/"

# The base URL for the email report hyperlinks.
WEBDRIVER_BASEURL_USER_FRIENDLY = "http://####"

# WebDriver configuration
# If you use Firefox, you can stick with default values
# If you use Chrome, then add the following WEBDRIVER_TYPE and WEBDRIVER_OPTION_ARGS

SQLLAB_CTAS_NO_LIMIT = True

# Optionally import superset_config_docker.py (which will have been included on
# the PYTHONPATH) in order to allow for local settings to be overridden
try:
    import superset_config_docker
    from superset_config_docker import *  # noqa

    logger.info(
        f"Loaded your Docker configuration at " f"[{superset_config_docker.__file__}]"
    )
except ImportError:
    logger.info("Using default Docker config...")

I use the alerts and reports features, and I installed geckodriver in the container following the official documentation. I want to use screenshots, but this error still occurs. How can I solve it? Thank you.

I am using Superset 2.0.1, installed with:
TAG=2.0.1 docker-compose -f docker-compose-non-dev.yml up
Could you tell me what I did wrong or which step I missed?
Please let me know. Thank you.
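
The config above mentions `WEBDRIVER_TYPE` and `WEBDRIVER_OPTION_ARGS` for Chrome but never sets them. For readers who want to see what that comment refers to, here is a minimal sketch of such a fragment for superset_config.py. The flag list mirrors the Chrome example in the Superset alerts and reports documentation as best recalled, so treat the exact values as assumptions and check them against the docs for your version. Switching to Chrome does not remove the need for a driver binary: it only replaces the geckodriver requirement with a chromedriver requirement in the worker image.

```python
# Hypothetical addition to superset_config.py for a Chrome-based worker.
# Assumption: Chrome and a matching chromedriver are installed in the worker image.
WEBDRIVER_TYPE = "chrome"

# Assumption: flags as commonly shown in the Superset docs; adjust for your setup.
WEBDRIVER_OPTION_ARGS = [
    "--force-device-scale-factor=2.0",
    "--high-dpi-support=2.0",
    "--headless",
    "--disable-gpu",
    "--disable-dev-shm-usage",
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--disable-extensions",
]
```

With Firefox, which is what the error message in this issue is about, these two settings can be left at their defaults and the fix is to make geckodriver and Firefox available in the worker container.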

@hanyujunsir hanyujunsir added the #bug Bug report label Feb 10, 2023
@sfirke
Member

sfirke commented Feb 10, 2023

From https://superset.apache.org/docs/installation/alerts-reports/:

If you are running a non-dev docker image, e.g., a stable release like apache/superset:2.0.1, that image does not include a headless browser. Only the superset_worker container needs this headless browser to browse to the target chart or dashboard. You can either install and configure the headless browser - see "Custom Dockerfile" section below - or when deploying via docker-compose, modify your docker-compose.yml file to use a dev image for the worker container and a stable release image for the superset_app container.

Note: In this context, a "dev image" is the same application software as its corresponding non-dev image, just bundled with additional tools. So an image like 2.0.1-dev is identical to 2.0.1 when it comes to stability, functionality, and running in production. The actual "in-development" versions of Superset - cutting-edge and unstable - are not tagged with version numbers on Docker Hub and will display version 0.0.0-dev within the Superset UI.

Have you tried using the 2.0.1-dev image, either for all of your containers or at least for the worker containers?

@rusackas
Member

rusackas commented Mar 7, 2023

Closing due to lack of response, on the assumption that the suggested solution resolved the issue.

@rusackas rusackas closed this as completed Mar 7, 2023
@carys-cc

I have the same problem. Here is my Dockerfile

`#

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

######################################################################
# PY stage that simply does a pip install on our requirements
######################################################################
ARG PY_VER=3.8.12
FROM python:${PY_VER} AS superset-py

RUN mkdir /app \
    && apt-get update -y \
    && apt-get install -y --no-install-recommends \
        build-essential \
        default-libmysqlclient-dev \
        libpq-dev \
        libsasl2-dev \
        libecpg-dev \
    && rm -rf /var/lib/apt/lists/*

# First, we just wanna install requirements, which will allow us to utilize the cache
# in order to only build if and only if requirements change
COPY ./requirements/*.txt /app/requirements/
COPY setup.py MANIFEST.in README.md /app/
COPY superset-frontend/package.json /app/superset-frontend/
RUN cd /app \
    && mkdir -p superset/static \
    && touch superset/static/version_info.json \
    && pip install --no-cache -r requirements/local.txt

######################################################################
# Node stage to deal with static asset construction
######################################################################
FROM node:16 AS superset-node

ARG NPM_VER=7
RUN npm install -g npm@${NPM_VER}

ARG NPM_BUILD_CMD="build"
ENV BUILD_CMD=${NPM_BUILD_CMD}

# NPM ci first, as to NOT invalidate previous steps except for when package.json changes
RUN mkdir -p /app/superset-frontend
RUN mkdir -p /app/superset/assets
COPY ./docker/frontend-mem-nag.sh /
COPY ./superset-frontend /app/superset-frontend
RUN /frontend-mem-nag.sh \
    && cd /app/superset-frontend \
    && npm ci

# This seems to be the most expensive step
RUN cd /app/superset-frontend \
    && npm run ${BUILD_CMD} \
    && rm -rf node_modules

######################################################################
# Final lean image...
######################################################################
ARG PY_VER=3.8.12
FROM python:${PY_VER} AS lean

ENV LANG=C.UTF-8 \
    LC_ALL=C.UTF-8 \
    FLASK_ENV=production \
    FLASK_APP="superset.app:create_app()" \
    PYTHONPATH="/app/pythonpath" \
    SUPERSET_HOME="/app/superset_home" \
    SUPERSET_PORT=8088

RUN mkdir -p ${PYTHONPATH} \
    && useradd --user-group -d ${SUPERSET_HOME} -m --no-log-init --shell /bin/bash superset \
    && apt-get update -y \
    && apt-get install -y --no-install-recommends \
        build-essential \
        default-libmysqlclient-dev \
        libsasl2-modules-gssapi-mit \
        libpq-dev \
        libecpg-dev \
    && rm -rf /var/lib/apt/lists/*

COPY --from=superset-py /usr/local/lib/python3.8/site-packages/ /usr/local/lib/python3.8/site-packages/
# Copying site-packages doesn't move the CLIs, so let's copy them one by one
COPY --from=superset-py /usr/local/bin/gunicorn /usr/local/bin/celery /usr/local/bin/flask /usr/bin/
COPY --from=superset-node /app/superset/static/assets /app/superset/static/assets
COPY --from=superset-node /app/superset-frontend /app/superset-frontend

# Lastly, let's install superset itself
COPY superset /app/superset
COPY setup.py MANIFEST.in README.md /app/
RUN cd /app \
    && chown -R superset:superset * \
    && pip install -e . \
    && flask fab babel-compile --target superset/translations

COPY ./docker/run-server.sh /usr/bin/

RUN chmod a+x /usr/bin/run-server.sh

WORKDIR /app

USER superset

HEALTHCHECK CMD curl -f "http://localhost:$SUPERSET_PORT/health"

EXPOSE ${SUPERSET_PORT}

CMD /usr/bin/run-server.sh

######################################################################
# Dev image...
######################################################################
FROM lean AS dev
ARG GECKODRIVER_VERSION=v0.28.0
ARG FIREFOX_VERSION=88.0

COPY ./requirements/*.txt ./docker/requirements-*.txt/ /app/requirements/

USER root

# chrome
# RUN apt-get update && \
#     wget -q https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb && \
#     apt-get install -y --no-install-recommends ./google-chrome-stable_current_amd64.deb && \
#     rm -f google-chrome-stable_current_amd64.deb

# RUN export CHROMEDRIVER_VERSION=$(curl --silent https://chromedriver.storage.googleapis.com/LATEST_RELEASE) && \
#     wget -q https://chromedriver.storage.googleapis.com/${CHROMEDRIVER_VERSION}/chromedriver_linux64.zip && \
#     unzip chromedriver_linux64.zip -d /usr/bin && \
#     chmod 755 /usr/bin/chromedriver && \
#     rm -f chromedriver_linux64.zip

RUN pip install --no-cache gevent psycopg2 redis

# firefox
RUN apt-get update -y \
    && apt-get install -y --no-install-recommends libnss3 libdbus-glib-1-2 libgtk-3-0 libx11-xcb1

# # Install GeckoDriver WebDriver
# RUN wget https://github.com/mozilla/geckodriver/releases/download/${GECKODRIVER_VERSION}/geckodriver-${GECKODRIVER_VERSION}-linux64.tar.gz -O /tmp/geckodriver.tar.gz && \
#     tar xvfz /tmp/geckodriver.tar.gz -C /tmp && \
#     mv /tmp/geckodriver /usr/local/bin/geckodriver && \
#     rm /tmp/geckodriver.tar.gz

# # Install Firefox
# RUN wget https://download-installer.cdn.mozilla.net/pub/firefox/releases/${FIREFOX_VERSION}/linux-x86_64/en-US/firefox-${FIREFOX_VERSION}.tar.bz2 -O /opt/firefox.tar.bz2 && \
#     tar xvf /opt/firefox.tar.bz2 -C /opt && \
#     ln -s /opt/firefox/firefox /usr/local/bin/firefox

# firefox - added by PC from https://superset.apache.org/docs/installation/alerts-reports/
RUN apt-get update && \
    apt-get install --no-install-recommends -y firefox-esr

ENV GECKODRIVER_VERSION=0.29.0
RUN wget -q https://github.com/mozilla/geckodriver/releases/download/v${GECKODRIVER_VERSION}/geckodriver-v${GECKODRIVER_VERSION}-linux64.tar.gz && \
    tar -x geckodriver -zf geckodriver-v${GECKODRIVER_VERSION}-linux64.tar.gz -O > /usr/bin/geckodriver && \
    chmod 755 /usr/bin/geckodriver && \
    rm geckodriver-v${GECKODRIVER_VERSION}-linux64.tar.gz

RUN pip install --no-cache gevent psycopg2 redis

# Cache everything for dev purposes...
RUN cd /app \
    && pip install --no-cache -r requirements/docker.txt \
    && pip install --no-cache -r requirements/requirements-local.txt || true
USER superset

######################################################################
# CI image...
######################################################################
FROM lean AS ci

COPY --chown=superset ./docker/docker-bootstrap.sh /app/docker/
COPY --chown=superset ./docker/docker-init.sh /app/docker/
COPY --chown=superset ./docker/docker-ci.sh /app/docker/

RUN chmod a+x /app/docker/*.sh

CMD /app/docker/docker-ci.sh
`

@anantmulchandani

anantmulchandani commented Jan 2, 2025

If you are using the non-dev (prod) setup, follow the documentation to modify your Dockerfile:
Custom Dockerfile modification for using Firefox

Otherwise, as an easy alternative, here is what I did to fix the issue:

Change the FROM lean AS ci line in your Dockerfile to:

FROM dev AS ci

Then build the image again.

@sudhanshuagariya

sudhanshuagariya commented Jan 10, 2025

Hi @anantmulchandani, I am not able to get alerts and reports working on my local machine.
I installed Superset using Docker, and when I run it with sudo docker compose up --build, the version shown is 0.0.0-dev.

The only change I made is in the superset_config.py file. My updated superset_config.py looks like this:

''''
import logging
import os

from celery.schedules import crontab
from flask_caching.backends.filesystemcache import FileSystemCache

logger = logging.getLogger()

DATABASE_DIALECT = os.getenv("DATABASE_DIALECT")
DATABASE_USER = os.getenv("DATABASE_USER")
DATABASE_PASSWORD = os.getenv("DATABASE_PASSWORD")
DATABASE_HOST = os.getenv("DATABASE_HOST")
DATABASE_PORT = os.getenv("DATABASE_PORT")
DATABASE_DB = os.getenv("DATABASE_DB")

EXAMPLES_USER = os.getenv("EXAMPLES_USER")
EXAMPLES_PASSWORD = os.getenv("EXAMPLES_PASSWORD")
EXAMPLES_HOST = os.getenv("EXAMPLES_HOST")
EXAMPLES_PORT = os.getenv("EXAMPLES_PORT")
EXAMPLES_DB = os.getenv("EXAMPLES_DB")

SQLALCHEMY_DATABASE_URI = (
    f"{DATABASE_DIALECT}://"
    f"{DATABASE_USER}:{DATABASE_PASSWORD}@"
    f"{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
)

SQLALCHEMY_EXAMPLES_URI = (
    f"{DATABASE_DIALECT}://"
    f"{EXAMPLES_USER}:{EXAMPLES_PASSWORD}@"
    f"{EXAMPLES_HOST}:{EXAMPLES_PORT}/{EXAMPLES_DB}"
)

REDIS_HOST = os.getenv("REDIS_HOST", "redis")
REDIS_PORT = os.getenv("REDIS_PORT", "6379")
REDIS_CELERY_DB = os.getenv("REDIS_CELERY_DB", "0")
REDIS_RESULTS_DB = os.getenv("REDIS_RESULTS_DB", "1")

RESULTS_BACKEND = FileSystemCache("/app/superset_home/sqllab")

CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_HOST": REDIS_HOST,
    "CACHE_REDIS_PORT": REDIS_PORT,
    "CACHE_REDIS_DB": REDIS_RESULTS_DB,
}
DATA_CACHE_CONFIG = CACHE_CONFIG

class CeleryConfig:
    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    imports = (
        "superset.sql_lab",
        "superset.tasks.scheduler",
        "superset.tasks.thumbnails",
        "superset.tasks.cache",
    )
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}"
    worker_prefetch_multiplier = 1
    task_acks_late = False
    beat_schedule = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=0, hour=0),
        },
    }

CELERY_CONFIG = CeleryConfig
SCREENSHOT_LOCATE_WAIT = 100
SCREENSHOT_LOAD_WAIT = 600

SMTP_HOST = "xxxx" # change to your host
SMTP_PORT = xxx
SMTP_STARTTLS = True
SMTP_SSL_SERVER_AUTH = False  # If you're using an SMTP server with a valid certificate
SMTP_SSL = False
SMTP_USER = "xxx" # use the empty string "" if using an unauthenticated SMTP server
SMTP_PASSWORD = "xxx" # use the empty string "" if using an unauthenticated SMTP server
SMTP_MAIL_FROM = "[email protected]"
EMAIL_REPORTS_SUBJECT_PREFIX = "[Superset] "

FEATURE_FLAGS = {"ALERT_REPORTS": True}
ALERT_REPORTS_NOTIFICATION_DRY_RUN = False
WEBDRIVER_BASEURL = "http://superset:8088/"
WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL
SQLLAB_CTAS_NO_LIMIT = True

try:
    import superset_config_docker
    from superset_config_docker import *  # noqa

    logger.info(
        f"Loaded your Docker configuration at " f"[{superset_config_docker.__file__}]"
    )
except ImportError:
    logger.info("Using default Docker config...")

''''

After these changes, when I run sudo docker compose up --build again, I get this error:
superset.commands.report.exceptions.ReportScheduleScreenshotFailedError: Failed taking a screenshot Message: 'geckodriver' executable needs to be in PATH.

How can I resolve this?

@sudhanshuagariya

Also, @anantmulchandani, could you please write a blog post or make a YouTube video about this? It would be a great help.

@anantmulchandani

anantmulchandani commented Jan 10, 2025

Hi @sudhanshuagariya,
Since you are building your own image, make sure the build target in docker-compose.yml is set to dev:

x-common-build: &common-build
  context: .
  target: dev

Next, check that your Dockerfile installs geckodriver and Firefox in the dev stage of the build. If it does not, add this snippet:

# Install GeckoDriver WebDriver
ARG GECKODRIVER_VERSION=v0.34.0 \
    FIREFOX_VERSION=125.0.3

RUN apt-get update -qq \
    && apt-get install -yqq --no-install-recommends wget bzip2 \
    && wget -q https://github.com/mozilla/geckodriver/releases/download/${GECKODRIVER_VERSION}/geckodriver-${GECKODRIVER_VERSION}-linux64.tar.gz -O - | tar xfz - -C /usr/local/bin \
    # Install Firefox
    && wget -q https://download-installer.cdn.mozilla.net/pub/firefox/releases/${FIREFOX_VERSION}/linux-x86_64/en-US/firefox-${FIREFOX_VERSION}.tar.bz2 -O - | tar xfj - -C /opt \
    && ln -s /opt/firefox/firefox /usr/local/bin/firefox \
    && apt-get autoremove -yqq --purge wget bzip2 && rm -rf /var/[log,tmp]/* /tmp/* /var/lib/apt/lists/*

Once the build completes and the containers are up, open a shell in a running container:
docker exec -it superset_app bash
Inside the container, verify the geckodriver installation by running the commands below. As noted earlier in this thread, it is the superset_worker container that browses to the chart or dashboard, so run the same check there as well.

geckodriver --version
which geckodriver 

Make sure the directory where geckodriver is installed (usually /usr/local/bin) is on PATH.

This should let you send image and PDF reports, which require geckodriver and Firefox running in headless mode.

PS: Also, use a tagged Superset image (like 4.1.1) instead of 0.0.0-dev.
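
The in-container checks described in this comment (geckodriver --version and which geckodriver) can also be scripted, which is handy when you want to run the same check in several containers. The following is a minimal sketch using only the Python standard library; the helper names `locate_binaries` and `binary_version` are hypothetical and are not part of Superset or Selenium.

```python
import os
import shutil
import subprocess
from typing import Dict, Optional, Sequence


def locate_binaries(
    names: Sequence[str] = ("geckodriver", "firefox"),
    path: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Map each binary name to its location on PATH, or None if it is absent.

    `path` overrides the PATH string to search; None means the process PATH.
    """
    return {name: shutil.which(name, path=path) for name in names}


def binary_version(executable: str) -> Optional[str]:
    """Return the first line printed by `<executable> --version`, or None on failure."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    for name, location in locate_binaries().items():
        if location is None:
            print(f"{name}: NOT FOUND on PATH={os.environ.get('PATH', '')}")
        else:
            print(f"{name}: {location} ({binary_version(location)})")
```

Saved under a name of your choice and run with python inside a container, it prints one line per binary. A NOT FOUND line for geckodriver corresponds to the condition behind the "'geckodriver' executable needs to be in PATH" error in this issue.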
