python-course.eu

3. PDF version of this site

By Bernd Klein. Last modified: 24 Jul 2021.

The Books for Download

In US letter format:

In A4 format:

Explanation of the PDF Book Development Process

By James Paden

Python PDF Books

I stumbled upon Bernd’s excellent online tutorial python-course.eu while attempting to write some Python. These courses are book-quality and should look that way in ebook format as well! I offered to help make some new PDF ebooks. Our work turned into an excellent tutorial that highlights the power of DocRaptor’s HTML-to-PDF API and the Prince generator we use.

The primary problem is that Bernd does not have full control over his HTML. Lack of control is a common problem faced by developers, especially at larger companies or with legacy systems. In Bernd’s case, the HTML is created by the Jupyter Notebook platform.

Jupyter simplifies Bernd’s content management, but we needed to turn it into a PDF with headers, footers, and a table of contents. Bernd also wanted to insert a title page into each PDF. Here’s a step-by-step walkthrough of how we accomplished that without changing any of his existing content.

Define Chapters

We start by defining the chapters of the ebook from the <h2> elements created by Jupyter:

  h2 {
    break-before: page;
    string-set: chapter content(); /* we'll use this string in the footer */
    font-size: 30px !important; /* override Jupyter's default styling */
    …more styling…
  }

First, we force a page break before each <h2>. Then we define chapter as a named string that holds the heading's text. The string-set property is supported by Prince and DocRaptor but not by web browsers, and it is particularly useful when creating documents from content you cannot modify. We'll use this string next, when we add it to the document footer.

The footer is inserted within the bottom margin we define for each document page, in this case 100px on both the top and bottom. We put the current chapter name (from the chapter named string we created above) on the left and the page number on the right. The named string is updated every time the generator reaches a new <h2> in the document.

  @page {
    margin: 100px 0 ;
    @bottom-left {
      content: string(chapter);
      …more styling… 
    }
    @bottom-right {
      content: counter(page);
      …more styling… 
    }
  }
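To make the behaviour of the chapter named string more concrete, here is a small JavaScript model of how the footer value could be resolved for each page. It is only an illustrative sketch, not Prince's implementation: the footerChapters function and the sample data are hypothetical, and it assumes the default behaviour of string(), where a page shows the first value set on that page and otherwise carries over the last value from earlier pages.

```javascript
// Illustrative model only (hypothetical helper, not part of Prince or DocRaptor).
// `pages` is an array of pages; each page lists the <h2> titles that start on it.
function footerChapters(pages) {
  let carried = ""; // value inherited from earlier pages
  return pages.map((headings) => {
    // Show the first heading set on this page, or the carried value if there is none.
    const shown = headings.length > 0 ? headings[0] : carried;
    // Later pages inherit the last heading seen so far.
    if (headings.length > 0) {
      carried = headings[headings.length - 1];
    }
    return shown;
  });
}

// Hypothetical four-page document: chapters start on pages 2 and 4.
const footers = footerChapters([[], ["Introduction"], [], ["Variables"]]);
console.log(footers); // [ '', 'Introduction', 'Introduction', 'Variables' ]
```

Because every <h2> starts a new page here, each chapter's first page already shows its own name, and the pages that follow keep it until the next chapter begins.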

Create the Title Page and Table of Contents

The table of contents is dynamically generated so we can use the same PDF generation script for all of Bernd’s courses. To create the table of contents, we loop through the heading tags in the document (the <h2> elements, in Bernd’s case) and generate a list of chapters. While Bernd’s courses have flat chapters in the navigation menu, this JavaScript supports a multi-level table of contents.

    var toc = "";
    var level = 0;
    var content = document.getElementById("content");
    // Rewrite every heading so it carries an anchor, collecting TOC entries as we go.
    content.innerHTML =
      content.innerHTML.replace(/<h(\d)[^>]*>(.*?)<\/h(\d)>/gi,
        function (str, openLevel, titleText, closeLevel) {
          if (openLevel != closeLevel) {
            return str; // mismatched tags: leave the markup untouched
          }
          openLevel = parseInt(openLevel, 10);
          // Strip all inline markup from the heading text.
          titleText = titleText.replace(/<[^>]*>/g, "");
          // Open or close nested lists to match the heading depth.
          if (openLevel > level) {
            toc += (new Array(openLevel - level + 1)).join("<ul>");
          } else if (openLevel < level) {
            toc += (new Array(level - openLevel + 1)).join("</ul>");
          }
          level = openLevel;
          var anchor = titleText.replace(/\W/g, "_");
          toc += "<li><a href=\"#" + anchor + "\">" + titleText +
            "</a></li>";
          return "<h" + openLevel + "><a name=\"" + anchor + "\">" +
            titleText + "</a></h" + closeLevel + ">";
        }
      );
    // Close any lists that are still open.
    if (level) {
      toc += (new Array(level + 1)).join("</ul>");
    }
    var tocObject = document.createElement("div");
    tocObject.id = "title-page";
    tocObject.innerHTML = toc;
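
For experimenting outside the browser, the same idea can be sketched as a pure function that takes an HTML string and returns both the rewritten HTML and the table-of-contents markup. This is a minimal sketch and not the script used for the ebooks: buildToc and the sample headings are hypothetical, and it assumes each heading sits on a single line with matching open and close tags.

```javascript
// Hypothetical helper: build a nested <ul> table of contents from an HTML string.
function buildToc(html) {
  let toc = "";
  let level = 0;
  const rewritten = html.replace(
    /<h(\d)[^>]*>(.*?)<\/h(\d)>/gi,
    (match, openLevel, titleText, closeLevel) => {
      if (openLevel !== closeLevel) {
        return match; // mismatched tags: leave the markup untouched
      }
      const depth = parseInt(openLevel, 10);
      const title = titleText.replace(/<[^>]*>/g, ""); // strip inline markup
      // Open or close nested lists to match the heading depth.
      if (depth > level) {
        toc += "<ul>".repeat(depth - level);
      } else if (depth < level) {
        toc += "</ul>".repeat(level - depth);
      }
      level = depth;
      const anchor = title.replace(/\W/g, "_");
      toc += `<li><a href="#${anchor}">${title}</a></li>`;
      return `<h${depth}><a name="${anchor}">${title}</a></h${depth}>`;
    }
  );
  toc += "</ul>".repeat(level); // close any lists still open
  return { html: rewritten, toc: toc };
}

// Sample input with one nested heading (made-up content).
const sample = '<h2 id="a">Data Types</h2><p>text</p><h3>Lists</h3><h2>Loops</h2>';
const result = buildToc(sample);
console.log(result.toc);
console.log(result.html);
```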

We also use more Prince-specific CSS to create perfect dot leaders on each line (this isn’t currently possible with standard CSS):

  #title-page a:after {
    content: leader('.') target-counter(attr(href), page);
  }

Lastly, we create the title header by simply reusing the document title, then insert everything onto the page:

    var titleObject = document.createElement("h1");
    titleObject.innerText = document.title;
    document.getElementById("content").insertAdjacentElement("afterbegin", tocObject);
    document.getElementById("title-page").insertAdjacentElement("afterbegin", titleObject);
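
The resulting structure can also be illustrated without a browser. The sketch below is a string-based approximation with a hypothetical buildTitlePage helper and made-up sample values. It only shows the order in which the pieces end up: the title heading first inside #title-page, then the table of contents, with the whole block placed ahead of the existing content.

```javascript
// Hypothetical helper: escape text so a document title is safe inside HTML,
// similar to what assigning innerText does in the browser.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Mirrors the effect of the two "afterbegin" insertions on plain strings.
function buildTitlePage(title, tocHtml, contentHtml) {
  const titlePage =
    `<div id="title-page"><h1>${escapeHtml(title)}</h1>${tocHtml}</div>`;
  return titlePage + contentHtml;
}

// Made-up sample values for illustration.
const page = buildTitlePage(
  "Python Tutorial",
  '<ul><li><a href="#Intro">Intro</a></li></ul>',
  '<h2><a name="Intro">Intro</a></h2>'
);
console.log(page);
```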

Summary

There are several methods for achieving this outcome, but by relying on CSS and JavaScript alone, we’ve ensured we can handle any content output by Jupyter, regardless of length or topic. You’re welcome to peruse the full source code or an example ebook. Let us know if you have any questions!