Coalesce multiple stdout/stderr Notebook outputs into single blocks #973

Closed
fbelotti opened this issue Sep 17, 2020 · 9 comments · Fixed by #1448 or cs3110/textbook#121
Labels
enhancement New feature or request 🏷️ execution

Comments

@fbelotti

fbelotti commented Sep 17, 2020

I am struggling to understand why a cell's output is split into multiple blocks when the book is rendered as HTML. The source of my stata.ipynb notebook looks like this:

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Operators\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "(1978 Automobile Data)\n",
      "\n",
      "\n",
      "      Source |       SS           df       MS      Number of obs   =        74\n",
      "-------------+----------------------------------   F(1, 72)        =     20.26\n",
      "       Model |   139449474         1   139449474   Prob > F        =    0.0000\n",
      "    Residual |   495615923        72  6883554.48   R-squared       =    0.2196\n",
      "-------------+----------------------------------   Adj R-squared   =    0.2087\n",
      "       Total |   635065396        73  8699525.97   Root MSE        =    2623.7\n",
      "\n",
      "------------------------------------------------------------------------------\n",
      "       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "-------------+----------------------------------------------------------------\n",
      "         mpg |  -238.8943   53.07669    -4.50   0.000    -344.7008   -133.0879\n",
      "       _cons |   11253.06   1170.813     9.61   0.000     8919.088    13587.03\n",
      "------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "sysuse auto\n",
    "reg price mpg"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Stata",
   "language": "stata",
   "name": "stata"
  },
  "language_info": {
   "codemirror_mode": "stata",
   "file_extension": ".do",
   "mimetype": "text/x-stata",
   "name": "stata",
   "version": "15.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}

After I built the book using

(base) Federicos-MacBook-Pro:books federico$ jupyter-book clean mybook/ --all

===============================================================================

Your _build directory has been removed

===============================================================================

(base) Federicos-MacBook-Pro:books federico$ jupyter-book build mybook/
Running Jupyter-Book v0.8.1
Source Folder: /Users/federico/Desktop/Dropbox/Teaching/jupyter/books/mybook
Config Path: /Users/federico/Desktop/Dropbox/Teaching/jupyter/books/mybook/_config.yml
Output Path: /Users/federico/Desktop/Dropbox/Teaching/jupyter/books/mybook/_build/html
Running Sphinx v2.4.0
making output directory... done
myst v0.12.9: MdParserConfig(renderer='sphinx', commonmark_only=False, dmath_enable=True, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, amsmath_enable=False, deflist_enable=False, update_mathjax=True, admonition_enable=False, figure_enable=False, disable_syntax=[], html_img_enable=False, url_schemes=['mailto', 'http', 'https'], heading_anchors=None)
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: [new config] 1 added, 0 changed, 0 removed
Executing: stata in: /Users/federico/Desktop/Dropbox/Teaching/jupyter/books/mybook                                                           

looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] stata                                                                                                               
generating indices...  genindexdone
writing additional pages...  searchdone
copying static files... ... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.

The HTML pages are in mybook/_build/html.

===============================================================================

Finished generating HTML for book.
Your book's HTML pages are here:
    mybook/_build/html/
You can look at your book by opening this file in a browser:
    mybook/_build/html/index.html
Or paste this line directly into your browser bar:
    file:///Users/federico/Desktop/Dropbox/Teaching/jupyter/books/mybook/_build/html/index.html            

===============================================================================

the notebook in the mybook/_build/jupyter_execute folder turns out to be different from my source and, as you can see below, the output of the "code" cell is split into many separate stream outputs of one or two lines each

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Operators\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "(1978 Automobile Data)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "      Source |       SS           df       MS      Number of obs   =        74\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-------------+----------------------------------   F(1, 72)        =     20.26\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       Model |   139449474         1   139449474   Prob > F        =    0.0000\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    Residual |   495615923        72  6883554.48   R-squared       =    0.2196\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-------------+----------------------------------   Adj R-squared   =    0.2087\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       Total |   635065396        73  8699525.97   Root MSE        =    2623.7\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "------------------------------------------------------------------------------\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-------------+----------------------------------------------------------------\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "         mpg |  -238.8943   53.07669    -4.50   0.000    -344.7008   -133.0879\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       _cons |   11253.06   1170.813     9.61   0.000     8919.088    13587.03\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "sysuse auto\n",
    "reg price mpg"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Stata",
   "language": "stata",
   "name": "stata"
  },
  "language_info": {
   "codemirror_mode": "stata",
   "file_extension": ".do",
   "mimetype": "text/x-stata",
   "name": "stata",
   "version": "15.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}

This is what the final HTML output looks like:

Screen Shot 2020-09-17 at 11 25 00

Do you have any idea why this is happening?

Many thanks!
Federico

Environment

  • Python v3.7.6
  • Jupyter-Book v0.8.1
  • Sphinx v2.4.0
  • myst v0.12.9
  • Operating System: macOS Catalina
@fbelotti fbelotti added the bug Something isn't working label Sep 17, 2020
@chrisjsewell chrisjsewell added enhancement New feature or request 🏷️ execution and removed bug Something isn't working labels Sep 17, 2020
@chrisjsewell
Contributor

chrisjsewell commented Sep 17, 2020

This is the expected behaviour of Jupyter notebook execution: stdout streams are output in (non-deterministic) chunks. The number of chunks just depends, I guess, on how much processing power happens to be available at the time. It would be better if you could find a way to emit the content as a text/plain output rather than as a stream.

That being said, we could look into adding an option in myst-nb to "coalesce streams", i.e. exactly what I do in pytest-notebook: https://github.com/chrisjsewell/pytest-notebook/blob/40381be0963a8866d1a46994595844bbfb4e66c1/pytest_notebook/post_processors.py#L80
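
To make the idea of "coalescing streams" concrete, here is a minimal sketch in Python. It is not the pytest-notebook or myst-nb implementation; the function names `coalesce_streams` and `_as_text` are made up for this illustration, and the sketch assumes the outputs are plain dictionaries as they appear in the .ipynb JSON above. It merges only consecutive stream outputs that share the same stream name, so the relative order of stdout and stderr is preserved.

```python
from copy import deepcopy


def _as_text(text):
    """Return stream text as one string (the .ipynb format allows a list of lines)."""
    return "".join(text) if isinstance(text, list) else text


def coalesce_streams(outputs):
    """Merge consecutive stream outputs with the same name (hypothetical helper).

    Returns a new list; the input list and its dictionaries are left untouched.
    """
    merged = []
    for output in outputs:
        previous = merged[-1] if merged else None
        is_stream = output.get("output_type") == "stream"
        if (
            is_stream
            and previous is not None
            and previous.get("output_type") == "stream"
            and previous.get("name") == output.get("name")
        ):
            # Same stream as the previous output: append the text to it.
            previous["text"] += _as_text(output["text"])
        else:
            new_output = deepcopy(output)
            if is_stream:
                new_output["text"] = _as_text(new_output["text"])
            merged.append(new_output)
    return merged


# Two stdout chunks taken from the executed notebook in this issue.
example = [
    {"name": "stdout", "output_type": "stream",
     "text": ["\n", "(1978 Automobile Data)\n"]},
    {"name": "stdout", "output_type": "stream",
     "text": ["\n", "\n"]},
]
merged = coalesce_streams(example)
print(len(merged))  # 1
```

Applied to every code cell of an executed notebook, a helper like this would turn the thirteen stdout chunks shown above into a single block.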

@fbelotti
Author

Thank you very much for your reply, Chris. I am new to notebooks, so your advice is very valuable. It would be great if you could look into adding an option that solves this issue. That would be really helpful.

If I may, what I do not understand is why stdout streams are sometimes rendered as a single block and sometimes (seemingly at random) as several blocks. Is that due to the Stata kernel, or can this also happen with Python or other kernels? And what do you mean by "find a way to output the content as a text/plain output rather than a stream"? Is that kernel-related behaviour?

@chrisjsewell
Contributor

Is that due to the Stata kernel? or may this happen also with python or other kernels?

Yes, it can happen with any kernel. With stdout/stderr streams, Jupyter flushes the output periodically. For example, if you had:

from time import sleep

print(1)
sleep(10)
print(2)

Then you don't want to have to wait 10 seconds to see 1: the output will update with 1 first and add 2 later.

The output you are creating is essentially doing:

print("      Source |       SS           df       MS      Number of obs   =        74\n")
print("-------------+----------------------------------   F(1, 72)        =     20.26\n")
...

and the number of chunks just depends on how fast each line reaches the client (which is non-deterministic) relative to the time interval between stream flushes.

Ideally, for non-interactive outputs you want to "package up" all the text and send it at the same time, which is what I mean by a text/plain output, e.g. with Python:

from IPython.display import display
display(
"      Source |       SS           df       MS      Number of obs   =        74\n"
"-------------+----------------------------------   F(1, 72)        =     20.26\n"
# ...
)

But I'm afraid I don't personally know how to do that with Stata.
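
For Python code that prints line by line, one way to "package up" the text is to capture stdout in a buffer and emit it once at the end. The sketch below uses only the standard library; `run_buffered` and `noisy_report` are hypothetical names for this illustration. It captures only writes made through Python's `sys.stdout`, so it would not help with output produced by a separate process such as Stata, and the final single `print` is still a stream output, just one that should normally arrive as one chunk.

```python
import io
from contextlib import redirect_stdout


def run_buffered(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def noisy_report():
    """Stand-in for code that prints its result one line at a time."""
    print("      Source |       SS")
    print("-------------+----------")
    print("       Model |   139449474")


text = run_buffered(noisy_report)
# A single write instead of three separate ones.
print(text, end="")
```

In a notebook, the captured `text` could also be handed to `display` as in the example above instead of being printed.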

@chrisjsewell chrisjsewell changed the title Jupyter nb output cells split in multiple lines when the book is rendered in HTML Coalesce Jupyter Notebook stdout/stderr outputs into single blocks Sep 17, 2020
@chrisjsewell chrisjsewell changed the title Coalesce Jupyter Notebook stdout/stderr outputs into single blocks Coalesce multiple stdout/stderr Notebook outputs into single blocks Sep 17, 2020
@fbelotti
Author

Got it, thanks. I'll look into the Stata kernel to see if I can package up all the output text from a single code cell and send it out at the same time as you suggest.

In any case, if this can happen with any kernel, then I think that an option to coalesce multiple stdout/stderr notebook outputs into single blocks would be really useful.

@fbelotti
Author

fbelotti commented Sep 17, 2020

For those who have the same issue, a "temporary" workaround is to execute the notebooks yourself, copy all the .ipynb files into the ./_build/jupyter_execute/ folder, and set execute_notebooks: off in the execute section of the book's _config.yml file:

execute:
  execute_notebooks: off 

Hope this helps.

@roblem

roblem commented May 12, 2021

FYI: in case anyone here is encountering similar issues with Stata 17's built-in %%stata magic, there is an undocumented config option:

from pystata import config
config.set_streaming_output_mode('off')

That condenses all output into a single code block. It still doesn't solve the problem seen with stata_kernel for Jupyter.

@fmaussion
Contributor

Hi! I'm expressing interest in this issue (sorry for the noise). I'm coming from executablebooks/sphinx-book-theme#281.

Is there any chance this issue will be fixed in the foreseeable future? I am well aware that this is a volunteer-led project: I'm asking now in order to decide whether we should implement the workaround mentioned above in our workflow or wait for a fix.

Thanks!

@chrisjsewell
Contributor

This is now available in executablebooks/MyST-NB@v0.13.0...v0.13.1 (as nb_merge_streams=True) 😄 and so will be incorporated into jupyter-book in the next release
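
For a plain Sphinx project that uses MyST-NB directly, the option named above would go in conf.py. The fragment below is a minimal sketch, assuming MyST-NB v0.13.1 or later is installed and that the rest of conf.py is already set up; Jupyter Book users would need to pass the same setting through their book configuration once a release includes it.

```python
# conf.py (fragment): Sphinx configuration for a project built with MyST-NB.
# Assumes MyST-NB >= 0.13.1; everything else in conf.py is left as it is.
extensions = ["myst_nb"]

# Merge the stdout/stderr chunks of each executed cell into single blocks.
nb_merge_streams = True
```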
