Testing Jupyter Notebooks

The more programming you do, the more you will hear about how you should test your code. You will hear about things like Extreme Programming and Test Driven Development (TDD). These are great ways to create quality code. But how does testing fit in with Jupyter? Frankly, it really doesn’t. If you want to test your code properly, you should write it outside of Jupyter and import it into cells as needed. That allows you to use Python’s unittest module or py.test to write tests for your code separately from Jupyter. It also lets you add test runners like nose or put your code into a Continuous Integration setup using something like Travis CI or Jenkins.

However, all is not lost. You can do some testing of your Jupyter Notebooks even though you won’t have the full flexibility that you would get by keeping your code separate. We will look at some ideas that you can use to do basic testing with Jupyter.


Execute and Check

One popular method of “testing” a Notebook is to run it from the command line and send its output to a file. Here is the example syntax that you could use if you wanted to do the execution on the command line:

jupyter-nbconvert --to notebook --execute --output output_file_path input_file_path
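If you would rather stay in Python, the same command can be launched with the standard library’s subprocess module. Here is a minimal sketch; the file names and the helper function are just placeholders:

```python
import subprocess

def nbconvert_execute(input_path, output_path):
    """Build the nbconvert command shown above as an argument list."""
    return ["jupyter-nbconvert", "--to", "notebook", "--execute",
            "--output", output_path, input_path]

cmd = nbconvert_execute("Testing.ipynb", "Testing_out.ipynb")
print(" ".join(cmd))
# Uncomment to actually run it (requires Jupyter on your PATH):
# subprocess.run(cmd, check=True)
```
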

Of course, we want to do this programmatically and we want to be able to capture errors. To do that, we will take our Notebook runner code from my exporting Jupyter Notebook article and re-use it. Here it is again for your convenience:

# notebook_runner.py

import nbformat
import os

from nbconvert.preprocessors import ExecutePreprocessor


def run_notebook(notebook_path):
    nb_name, _ = os.path.splitext(os.path.basename(notebook_path))
    dirname = os.path.dirname(notebook_path)
    
    with open(notebook_path) as f:
        nb = nbformat.read(f, as_version=4)
        
    proc = ExecutePreprocessor(timeout=600, kernel_name='python3')
    proc.allow_errors = True
    
    proc.preprocess(nb, {'metadata': {'path': '/'}})
    output_path = os.path.join(dirname, '{}_all_output.ipynb'.format(nb_name))
    
    with open(output_path, mode='wt') as f:
        nbformat.write(nb, f)

    errors = []
    for cell in nb.cells:
        if 'outputs' in cell:
            for output in cell['outputs']:
                if output.output_type == 'error':
                    errors.append(output)

    return nb, errors

if __name__ == '__main__':
    nb, errors = run_notebook('Testing.ipynb')
    print(errors)

You will note that I have updated the code to run a new Notebook. Let’s go ahead and create a Notebook that has two cells of code in it. After creating the Notebook, change the title to Testing and save it. That will cause Jupyter to save the file as Testing.ipynb. Now enter the following code in the first cell:

def add(a, b):
    return a + b

add(5, 6)

And enter the following code into cell #2:

1 / 0

Now you can run the Notebook runner code. When you do, you should get the following output:

[{'ename': 'ZeroDivisionError',
  'evalue': 'integer division or modulo by zero',
  'output_type': 'error',
  'traceback': ['\x1b[0;31m\x1b[0m',
                '\x1b[0;31mZeroDivisionError\x1b[0mTraceback (most recent call '
                'last)',
                '\x1b[0;32m\x1b[0m in '
                '\x1b[0;36m\x1b[0;34m()\x1b[0m\n'
                '\x1b[0;32m----> 1\x1b[0;31m \x1b[0;36m1\x1b[0m '
                '\x1b[0;34m/\x1b[0m '
                '\x1b[0;36m0\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'
                '\x1b[0m',
                '\x1b[0;31mZeroDivisionError\x1b[0m: integer division or '
                'modulo by zero']}]

This indicates that we have some code that outputs an error. In this case, we did expect that as this is a very contrived example. In your own code, you probably wouldn’t want any of your code to output an error. Regardless, this Notebook runner script isn’t enough to actually do a real test. You need to wrap this code with testing code. So let’s create a new file that we will save to the same location as our Notebook runner code. We will save this script with the name “test_runner.py”. Put the following code in your new script:

import unittest

import notebook_runner as runner


class TestNotebook(unittest.TestCase):
    
    def test_runner(self):
        nb, errors = runner.run_notebook('Testing.ipynb')
        self.assertEqual(errors, [])
        
        
if __name__ == '__main__':
    unittest.main()

This code uses Python’s unittest module. Here we create a testing class with a single test function inside of it called test_runner. This function calls our Notebook runner and asserts that the errors list should be empty. To run this code, open up a terminal and navigate to the folder that contains your code. Then run the following command:

python test_runner.py

When I ran this, I got the following output:

F
======================================================================
FAIL: test_runner (__main__.TestNotebook)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_runner.py", line 10, in test_runner
    self.assertEqual(errors, [])
AssertionError: Lists differ: [{'output_type': u'error', 'ev... != []

First list contains 1 additional elements.
First extra element 0:
{'ename': 'ZeroDivisionError',
 'evalue': 'integer division or modulo by zero',
 'output_type': 'error',
 'traceback': ['\x1b[0;31m---------------------------------------------------------------------------\x1b[0m',
               '\x1b[0;31mZeroDivisionError\x1b[0m                         '
               'Traceback (most recent call last)',
               '\x1b[0;32m\x1b[0m in '
               '\x1b[0;36m\x1b[0;34m()\x1b[0m\n'
               '\x1b[0;32m----> 1\x1b[0;31m \x1b[0;36m1\x1b[0m '
               '\x1b[0;34m/\x1b[0m \x1b[0;36m0\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'
               '\x1b[0m',
               '\x1b[0;31mZeroDivisionError\x1b[0m: integer division or modulo '
               'by zero']}

Diff is 677 characters long. Set self.maxDiff to None to see it.

----------------------------------------------------------------------
Ran 1 test in 1.463s

FAILED (failures=1)

This clearly shows that our code failed. If you remove the cell that has the divide by zero issue and re-run your test, you should get this:

.
----------------------------------------------------------------------
Ran 1 test in 1.324s

OK

By removing the cell (or just correcting the error in that cell), you can make your tests pass.
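As an aside, the errors list that run_notebook returns can also be used the other way around: instead of requiring zero errors, you can assert that a specific, expected error shows up. Here is a minimal sketch, using a hand-built dict that mimics the error outputs the runner collects:

```python
def has_error(errors, ename):
    """Return True if any collected error output has the given exception name."""
    return any(err['ename'] == ename for err in errors)

# Stand-in for the list that run_notebook('Testing.ipynb') would return:
errors = [{'output_type': 'error',
           'ename': 'ZeroDivisionError',
           'evalue': 'division by zero',
           'traceback': []}]

assert has_error(errors, 'ZeroDivisionError')  # the error we expect
assert not has_error(errors, 'NameError')      # no unrelated errors
```

You could drop a check like this into test_runner.py in place of the assertEqual call if a Notebook is supposed to demonstrate a failure.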


The py.test Plugin

I discovered a neat plugin that makes this workflow a bit easier: nbval, the py.test plugin for Jupyter Notebooks.

Basically it gives py.test the ability to recognize Jupyter Notebooks and to check that re-executing each cell reproduces the outputs stored in the Notebook, and that the Notebook runs without error. After installing the nbval package, you can run it with py.test like this (assuming you have py.test installed):

py.test --nbval

Frankly, you can also run plain py.test with no arguments against the test file we already created and it will use our test code as is. The main benefit of adding nbval is that you won’t necessarily need to write wrapper code around Jupyter at all.


Testing within the Notebook

Another way to run tests is to just include some tests in the Notebook itself. Let’s add a new cell to our Testing Notebook that contains the following code:

import unittest

class TestNotebook(unittest.TestCase):
    
    def test_add(self):
        self.assertEqual(add(2, 3), 5)

This will test the add function from the first cell. We could add a bunch of different tests here. For example, we might want to test what happens if we add a string and a None. You may have noticed, though, that if you run this cell, you get no output. The reason is that defining the test class doesn’t actually run the tests. We need to call unittest.main to do that. So while it’s good to run that cell to get the class into Jupyter’s memory, we actually need to add one more cell with the following code:

unittest.main(argv=[''], verbosity=2, exit=False)

This code should be put in the last cell of your Notebook so it can run all the tests that you have added. It tells unittest to run with a verbosity level of 2 and not to exit when it finishes; the empty argv keeps unittest from trying to parse the arguments that Jupyter itself was started with. When you run this code you should see the following output in your Notebook:

test_add (__main__.TestNotebook) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.003s

OK
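Following the idea above of testing what happens when you add a string and a None, you could sketch an extra test cell like this (the add function is repeated here so the snippet is self-contained; in the Notebook it would already be defined by the first cell):

```python
import unittest

def add(a, b):
    # Same add() as in the first Notebook cell
    return a + b

class TestAddEdgeCases(unittest.TestCase):

    def test_add_string_and_none(self):
        # Adding a str and None raises a TypeError in Python 3
        with self.assertRaises(TypeError):
            add('hello', None)

# As before, run the tests from the last cell without exiting
unittest.main(argv=[''], verbosity=2, exit=False)
```
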


You can do something similar with Python’s doctest module inside of Jupyter Notebooks as well.
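For example, you could move the checks for add into its docstring and have doctest run them from the last cell. A small sketch:

```python
import doctest

def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b

# In a Notebook, put this in the last cell; it finds the examples
# embedded in the docstrings defined so far and runs them.
results = doctest.testmod(verbose=False)
print(results)
```
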


Wrapping Up

As I mentioned at the beginning, while you can test your code in your Jupyter Notebooks, it is actually much better if you test your code outside of them. However there are workarounds, and since some people like to use Jupyter for documentation purposes, it is good to have a way to verify that their Notebooks are working correctly. In this chapter you learned how to run Notebooks programmatically and verify that the output was as you expected. You could also enhance that code to verify that certain errors are present if you wanted to.

You also learned how to use Python’s unittest module in your Notebook cells directly. This does offer some nice flexibility as you can now run your code all in one place. Use these tools wisely and they will serve you well.

