Daily Dose of Data Science – Day 9 – Publishing content from Jupyter Notebooks made easy using Notebook-as-PDF

Welcome to day 9 of the Daily Dose of Data Science series! Jupyter notebooks are inarguably one of the most widely used environments for programming in Python, particularly for data science and machine learning use cases. But for publishing content, converting Jupyter notebooks directly to PDF was not an easy option until the Notebook-as-PDF extension became available! Let’s dive in to learn more!

Jupyter notebooks provide a very easy interface for running code and seeing the immediate and intermediate output, and they have neat features for adding LaTeX-style descriptions and formulas, and even visualizations through images and GIFs. As notebooks were widely adopted by the community, more and more requests came in to convert them to PDF, so that developers could produce a book with well-written, structured content directly from Jupyter notebooks! But converting notebook files to a PDF document while maintaining the alignment and orientation was extremely challenging. As a workaround, you can download the notebook as an HTML file and then convert the HTML to a PDF, but some of the content alignment and orientation gets messed up quite easily. That’s where Notebook-as-PDF comes in: it converts notebook files to a PDF document with a single command.

Notebook-as-PDF is a Jupyter notebook extension that lets you save notebook files directly as PDF documents. It has some key new features compared to the official “save as PDF” extension, which make it much easier and more hassle-free to work with:

  1. it produces a PDF with the smallest number of page breaks;
  2. the original notebook is attached to the PDF; and
  3. the extension does not require LaTeX.

The created PDF will have as few pages as possible, in many cases only one, which is extremely useful for sharing with others to read and consume the content. Particularly for researchers and authors who want to share the contents of a book or a research paper with their peers or co-authors, this is a very effective option.

Every <h1> tag in the notebook is converted into an entry in the table of contents of the PDF. To make it easier to reproduce the contents of the PDF at a later date, the original notebook is attached to the PDF.
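To get a rough preview of which entries would end up in the table of contents, the sketch below lists the level-1 Markdown headings of a notebook, since a Markdown line starting with "# " renders as an <h1> tag. This is only an approximation and not how the extension itself works: it does not catch literal <h1> HTML or underline-style headings, and it would wrongly pick up "# " lines inside a fenced code sample in a Markdown cell. The function names are hypothetical.

```python
import json


def top_level_headings(notebook):
    """List the level-1 Markdown headings ("# Title") found in a notebook dict."""
    headings = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            if line.startswith("# "):
                headings.append(line[2:].strip())
    return headings


def headings_from_file(path):
    """Load an .ipynb file (plain JSON) and return its level-1 headings."""
    with open(path, encoding="utf-8") as handle:
        return top_level_headings(json.load(handle))
```

Calling headings_from_file("example.ipynb") before converting gives a quick idea of how the PDF outline is likely to look.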

Example samples from an input notebook can be found in my GitHub repository for the Daily Dose of Data Science series.

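The installation steps are straightforward and take two commands. The first installs the package, and the second downloads the Chromium browser that is used for the HTML-to-PDF conversion: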

python -m pip install -U notebook-as-pdf

pyppeteer-install
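If you want to confirm from Python that the package was installed, a minimal sketch using only the standard library could look like this. The helper name installed_version is hypothetical; the distribution name is the one passed to pip above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "notebook-as-pdf" is the name used in the pip command above.
    version = installed_version("notebook-as-pdf")
    print(version if version else "notebook-as-pdf is not installed")
```

Note that this only checks the Python package; it does not verify that the Chromium download has completed.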

The usage is also very simple. From a terminal or command prompt, point to the location of the notebook file and run the following nbconvert command:

jupyter-nbconvert --to pdfviahtml example.ipynb
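When there are many notebooks to publish, the same command can be scripted. The sketch below simply runs the command above once per notebook in a folder; it assumes jupyter-nbconvert is on your PATH, and the function names build_command and convert_folder are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(notebook, exporter="pdfviahtml"):
    """Build the nbconvert command line shown above for one notebook."""
    return ["jupyter-nbconvert", "--to", exporter, str(notebook)]


def convert_folder(folder, runner=subprocess.run):
    """Convert every .ipynb file directly inside `folder`; return the notebooks processed."""
    notebooks = sorted(Path(folder).glob("*.ipynb"))
    for notebook in notebooks:
        # check=True raises CalledProcessError if nbconvert exits with an error.
        runner(build_command(notebook), check=True)
    return notebooks
```

For example, convert_folder("chapters") would convert each notebook in a folder named chapters, one after another, and stop at the first failure.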

Alternatively, using the GUI, open your notebook and click “File -> Download As”. Then click the new menu entry called “PDF via HTML” to convert the notebook to a PDF on the fly and download it.

I highly recommend taking a look at the official GitHub project page to find out more: https://github.com/betatim/notebook-as-pdf. So, that’s it for today’s dose! Visit again for another daily dose of data science, and please feel free to like, share, comment and subscribe to my posts if you find them helpful!
