Skip to content

Jupyter integration#

PLPipes includes an IPython extension which exposes the framework functionality in Jupyter notebooks.

Initialization#

The extension is loaded adding the following lines at the beginning of your notebook:

%load_ext plpipes.jupyter
%plpipes {stem}

Where {stem} is the name used as the main key when looking for configuration files (defaults to jupyter).

In order to find the project configuration, the extension looks into the environment variable PLPIPES_ROOT_DIR. If that variable is not defined then it looks for a config directory in the current working directory of the IPython kernel (usually the directory from where jupyter-lab was launched) and walks up the file system until such directory is found.

Once the extension is loaded and initialized, the features described in the following sections can be used.

Variable, packages and method shortcuts#

The following variables and methods are made available in the session:

  • cfg: The configuration object

  • input_dir, work_dir and output_dir: libpath objects pointing to the input, work and output directories.

    For instance:

    df = pandas.read_csv(input_dir / "data001.csv")
    

  • db: a shortcut for plpipes.database

  • create_table and query: shortcuts for the functions of the same name in plpipes.database.

SQL integration#

Note: due to some incompatibility with recent versions of ipython-sql the SQL integration is currently disabled.

The IPython SQL extension (see https://pypi.org/project/ipython-sql/) is automatically loaded and the configured PLPipes work database set as the default one.

Other databases configured in PLPipes can be selected using a double at sign (@@) followed by the database name. For instance:

%%sql @@input
select * from customers
limit 100