JupyterLab 3.0 release survey #107

Merged on Nov 20, 2020 (31 commits)
Commits
1115a0c
2020 JupyterLab survey (final)
lresende Nov 4, 2020
7d32b06
Add git configuratio file
lresende Nov 4, 2020
e1dd6ae
Markdown version of the survey
lresende Nov 4, 2020
8402737
Remove pdf version of the survey
lresende Nov 4, 2020
b2429e5
Update description with timeframe/data availability
lresende Nov 9, 2020
4eb73fa
Change wording for heavy/moderate usage
lresende Nov 9, 2020
44afc8a
Add additional languages
lresende Nov 9, 2020
166f9f9
Sort tools alphabetically
lresende Nov 9, 2020
5ffe449
Update/Sort how you access Jupyter
lresende Nov 9, 2020
294d0db
Update order for tasks/tools
lresende Nov 9, 2020
a508fec
Update data formats
lresende Nov 9, 2020
e37da57
Update data problems
lresende Nov 9, 2020
0016894
Update type of analysis options
lresende Nov 9, 2020
8e8cede
Update dashboard tool options
lresende Nov 9, 2020
908edd3
Update dashboard problems
lresende Nov 9, 2020
eaaeb09
Sort list of scale options
lresende Nov 9, 2020
bb19744
Update problems with scale options
lresende Nov 9, 2020
d69b4f7
Reword question about collaboration
lresende Nov 9, 2020
ea6d43d
Update challenges with collaboration
lresende Nov 9, 2020
9397c6d
Update chalenges with the notebook ui
lresende Nov 9, 2020
9ebaa1f
Reword e-mail address optin question
lresende Nov 9, 2020
2555072
Update local envs as first choices
lresende Nov 9, 2020
e5450ce
Update official names
lresende Nov 9, 2020
c5eb36e
Update missing word
lresende Nov 19, 2020
df0df2f
Reword question 22
lresende Nov 19, 2020
55c3233
Update workload options
lresende Nov 19, 2020
43321bf
Update wording around how many options to choose
lresende Nov 19, 2020
439672b
Update options for tools in use
lresende Nov 19, 2020
e98ce97
Update order on how you access jupyter
lresende Nov 19, 2020
b6682a1
Update order for what tools you use
lresende Nov 19, 2020
1a0ca19
Reword dimensional data option
lresende Nov 19, 2020
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
.DS_Store

# PyCharm
.idea/
*.iml
371 changes: 371 additions & 0 deletions docs/surveys/2020-jupyterlab-survey.md
@@ -0,0 +1,371 @@
# JupyterLab

- Even if you don't use Jupyter, you can still take this survey. Just indicate this fact in the first question
and carry on as best as you can.
- Thank you. Your participation guides Jupyter's roadmap toward your real-life use cases by quantifiably
helping us prioritize the functionality that is important to our userbase.
- So that you know what to expect, the survey consists of 20 questions spread across the sections below.
As a fair heads up, Question 7 is the biggest one, but it provides critical information.
- Usage patterns.
- Data.
- Visualization.
- Scale.
- Collaboration.
- The aggregate survey data itself will be openly shared with the Jupyter community when polling closes in mid-December.
If you opt to provide your email address for a user interview, it will not be used for Jupyter's promotional purposes and
it will not be shared with a 3rd party.


### Usage Patterns

1. How frequently do you use Jupyter?
- Daily - heavy usage; 3+ hours per day.
- Daily - moderate usage; less than 3 hours per day.
- Weekly.
- Monthly.
- I no longer use Jupyter.
- I have never used Jupyter.

2. How long have you been using Jupyter?
- 2+ years.
- 1-2 years.
- 6-12 months.
- Less than 6 months (welcome =]).
- I don't use Jupyter.

3. What languages do you use in Jupyter? (pick up to 4)
- C (and derivatives)
- Go
- Groovy
- Java
- JavaScript
- Julia
- NodeJS
- Perl
- PHP
- Python
- R
- Ruby
- Rust
- Scala
- Spark SQL
- SQL
- TypeScript
- ❗I wrap/ use bindings for other languages.
- ❗My preferred language is not supported in Jupyter.
- Other (please specify)

4. What are your primary job roles when you are using Jupyter? (pick up to 2)
- Backend engineer.
- Business analyst.
- Data engineer.
- Data scientist.
- Database Admin (DBA).
- DevOps.
- Financial modeler/ analyst.
- Front end/ web development.
- Infrastructure engineer/ cloud architect.
- Scientist/ researcher.
- Student.
- Sysadmin.
- Teacher/ lecturer.
- Tutor/ teaching assistant.
- Other (please specify)

5. What are your go-to tools for performing data science, scientific computing,
and machine learning on your laptop/ desktop (non-cloud)? (pick up to 3)
- Atom.
- Emacs.
- IPython.
- Jupyter Notebook - Classic.
- JupyterLab.
- nteract.
- PyCharm.
- RStudio.
- Spyder.
- Sublime Text.
- Vim.
- VS Code.
- Zeppelin.
- Other (please specify).

6. How do you run and/ or access Jupyter? (pick up to 4)
- 🖥️ Run directly on local machine (e.g. laptop, desktop).
- Through a Python virtual environment (e.g. conda, virtualenv).
- Through Docker.
- HPC or on-premise server.
- Cloud server (e.g. AWS EC2).
- JupyterHub.
- BinderHub / MyBinder.
- Cloud service - AWS (e.g. EMR, SageMaker).
- Cloud service - Azure (e.g. Notebooks, ML Studio).
- Cloud service - Databricks.
- Cloud service - Google (e.g. AI Platform, Dataproc).
- Cloud service - IBM (e.g. Watson Studio).
- Google Colab.
- CoCalc.
- Mobile device (e.g. phone, tablet). Comments welcome.
- ❓Don’t know how, I just go to a URL.
- Other (please specify).

7. What tasks do you need to perform and what tools do you use to accomplish them?
- Writing and running tests for software.
- Writing a software package.
- Creating content (e.g. blogs, books, education materials).
- Cleaning and preparing data.
- Running pipelines, workflows, or ETL (extract, transform, load) jobs.
- Developing extensions/ plugins to solve my problems.
- Writing software documentation.
- Finding extensions/ plugins to solve my problems.
- Building a machine learning or statistical model.
- Documenting research (reports, scientific papers).
- Visualizing data in charts, plots, or dashboards.
- Other major use cases (please specify).

For each of the items above, provide additional information related to:
- How frequently do you perform this task?
- Never.
- Every few months.
- Monthly.
- Weekly.
- Daily.
- To what degree does Jupyter meet your expectations for this?
- Does not apply.
- No.
- Neutral.
- Yes.
- To what degree do alternative tools meet your expectations for this?
- Does not apply.
- No.
- Neutral.
- Yes.

### Data

8. What data sources are you primarily working with in your role? (pick up to 3)
- 🖥️ My local file system (e.g. files and folders on my local machine).
- File system (e.g. HPC, EBS/EFS, JupyterHub volumes).
- Cloud object storage (e.g. buckets, S3, Blob, GS).
- SQL (e.g. PostgreSQL, MySQL).
- SQL - embedded (e.g. SQLite).
- NoSQL - columnar store (e.g. Parquet, Arrow, HDFS, BigQuery).
- NoSQL - document store (e.g. MongoDB, Elasticsearch, DynamoDB).
- Graph database (e.g. Neo4j, TigerGraph).
- Time Series (e.g. InfluxDB).
- Pub/ sub (e.g. Apache Kafka, Druid).
- Key value (e.g. Redis, MemcacheDB).
- Google Sheets.
- ❗Industry or field specific APIs.
- Streaming.
- Other (please specify).

9. What data formats are you mostly working with? (pick up to 3)
- Tabular (e.g. csv, spreadsheet, SQL tables, Parquet).
- Images.
- Tensors (e.g. manually handling PyTorch, TensorFlow inputs).
- Nested (e.g. JSON, NoSQL document).
- Hierarchical Data Format (e.g. HDF5 or similar).
- Time series.
- Text.
- Audio.
- Video.
- 3D/ CAD.
- Graph (e.g. nodes, edges).
- Spatial/ geographic (e.g. coordinates, GIS).
- Game/ reinforcement simulation.
- ❗Industry-specific file formats.
- Other (please specify)

10. Do you experience these **problems with data** in Jupyter? (rate each on a scale of 0-4)
- No grid view for manipulating/ filtering dataframes and arrays.
- Can’t see a list of my current variables.
- Plaintext or environment variable management of database passwords/ keys/ secrets.
- Lost data during failure or restart of kernel/ server.
- Data is too big to fit into memory on my machine/ server.
- Poor MVC/ ORM integrations (e.g. Django, Flask).
- Managing database/ source connections and secrets.
- Other (please specify)

For each of the items above, specify:
- Not a problem for me.
- Trivial.
- Minor.
- Major.
- Critical.
- N/A - skip, don't know.

11. What type of analysis are you running? (pick up to 4)
- ❗I am not performing ML/statistical tasks.
- Regression; predict a numeric output.
- Classification; predict a categorical output.
- Generative/ auto-encode; create new data based on existing data.
- Reinforcement learning; actions that maximize a reward.
- Dimensionality reduction (e.g. PCA, t-SNE).
- Feature engineering (e.g. importance, extraction, selection, permutation).
- Natural language processing (NLP).
- Graph data science.
- Outlier detection.
- Other (please specify)


### Visualization

12. What tools does your team use to create dashboards? (pick up to 3)
- Dash-Plotly.
- Google Data Studio.
- Grafana.
- Kibana.
- Klipfolio.
- Looker.
- R Shiny.
- Spotfire.
- Tableau.
- Voilà.
- ❗I don't create dashboards.
- ❗I write my own in HTML & JS.
- Other (please specify).

13. Do you experience these problems with visualization in Jupyter?
- No built-in UI for creating charts.
- Can't publish my charts as web-based dashboards.
- Poor/ buggy support for my plotting tool.
- Difficulty displaying high-dimensional data (e.g. array of array of arrays, too many rows/ columns to fit on screen).
- Lacking templating support (Jinja2).

For each of the items above, specify:
- Not a problem for me.
- Trivial.
- Minor.
- Major.
- Critical.
- N/A - skip, don't know.

### Scale

14. How do you scale and schedule your workloads? (pick up to 4)
- 🖥️ They run just fine on my local machine.
- ❓I need to scale, but don't know how.
- Server - on premise HPC/ data center.
- Server - cloud (e.g. AWS EC2).
- Cloud ML/ AI (e.g. AWS SageMaker, IBM Watson Studio).
- Cluster - Spark and/ or Hadoop.
- Cluster - Dask.
- Cluster - Kubernetes (or similar e.g. Mesos, Swarm, Slurm).
- Cluster - Jupyter Enterprise Gateway.
- Jupyter BinderHub.
- Quantum (e.g. D-Wave).
- Horovod.
- Kubeflow.
- Elyra.
- Snakemake.
- Papermill.
- CWL, Nextflow, and/ or WDL.
- Apache Airflow.
- Prefect.
- Cloud pipelines (e.g. AWS Batch).
- Cloud queries (e.g. AWS Presto, AWS Athena).
- Other (please specify).

15. Do you experience these problems with scale in Jupyter?
- Figuring out how to schedule batch execution of notebook-based jobs.
- Don’t have the budget for more scalable environment/ cloud services.
- Haven’t divided longer notebooks into multiple, modular notebooks.
- Not persisting the outputs of a notebook.
- Machine learning training jobs take too long.
- Can't call code/ modules from other notebooks.
- Difficulty managing Spark dependencies (Java).

For each of the items above, specify:
- Not a problem for me.
- Trivial.
- Minor.
- Major.
- Critical.
- N/A - skip, don't know.

### Collaboration

16. When it comes to working on notebooks in a team setting, with how many other people are you collaborating?
- 0
- 10
- 20
- 30
- 40
- 50+

17. What is your reason for sharing a notebook with someone else? (pick up to 3)
- ❗I am not working with other people.
- Share knowledge.
- Feedback about my writing.
- Feedback about my code.
- Formal code review.
- Integrate my code/ data with their downstream or upstream processes.
- Edit/ contribute some of their own code.
- Edit/ contribute some of their own writing.
- Teach/ tutor them.
- Pair programming.
- Deploy my code/ model/ pipeline/ dashboard.
- Other (please specify)

18. What is the nature of your collaboration?
- Describe the collaboration:
- How long have you been working together?
- I am not collaborating.
- 2+ years.
- 1-2 years.
- 6-12 months.
- Less than 6 months.
- How frequently do you work together?
- I am not collaborating.
- 2+ times per week.
- Weekly.
- A few times a month.
- Monthly.
- Less than monthly.
- How do you divide the work?
- I am not collaborating.
- We work on different projects.
- We work on the same project, but different parts.
- We work on the same part of the same project together.
- Comments about collaboration:

19. Do you have challenges with collaboration in Jupyter?
- Don't know what dependencies (versions of language, packages, extensions) a notebook uses.
- Don't know/ have the data a notebook is supposed to use.
- Poor support for our version control (git) system.
- No built-in way to publish my notebook to a shared location.
- Not being able to comment on notebooks.
- No "track changes"; can't figure out what changed between notebook checkpoints/ versions.

For each of the items above, specify:
- Not a problem for me.
- Trivial.
- Minor.
- Major.
- Critical.
- N/A - skip, don't know.

20. Do you have challenges with the notebook UI?
- No progress bar for running long notebooks.
- No marketplace for Extensions (e.g. 5 star ratings, browsable categories).
- No global search.
- Can't collapse sections of a notebook hierarchically.
- Poor autocompletion (e.g. LSP, show methods/ attributes).
- No modes for editing other Jupyter documents (MyST, Jupyter Book).
- Can't see hidden (.) files in file browser.
- Don't know which cell failed in a long notebook.

For each of the items above, specify:
- Not a problem for me.
- Trivial.
- Minor.
- Major.
- Critical.
- N/A - skip, don't know.

### You did it - thank you!

21. Open feedback for problems/ pain points you didn't get to share.


22. Optional - Are you interested in giving qualitative feedback on JupyterLab, JupyterHub, and the JupyterLab developer experience? If we have permission to contact you for follow-up questions, please leave your email address below.