Workflow
papermill.engines
Engines to perform different roles
-
class
papermill.engines.
Engine
¶ Bases:
object
Base class for engines.
Specific engine classes should inherit from this class and implement the execute_managed_notebook method.
Defines the execute_notebook method, which sets up the NotebookExecutionManager object that engines interact with.
-
classmethod
execute_managed_notebook
(nb_man, kernel_name, **kwargs)¶ An abstract method where implementation will be defined in a subclass.
-
classmethod
execute_notebook
(nb, kernel_name, output_path=None, progress_bar=True, log_output=False, autosave_cell_every=30, **kwargs)¶ A wrapper to handle notebook execution tasks.
Wraps the notebook object in a NotebookExecutionManager in order to track execution state in a uniform manner. This simplifies engine implementations: a developer can focus on iterating over the cells and executing their contents.
-
classmethod
nb_kernel_name
(nb, name=None)¶ Uses the default implementation to fetch the kernel name from the notebook object.
-
classmethod
nb_language
(nb, language=None)¶ Uses the default implementation to fetch the programming language from the notebook object.
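The division of labour described for Engine can be sketched as follows. This is a minimal illustration, not papermill's own engine: `UppercaseEngine` is a hypothetical class, its "execution" step is a stand-in that upper-cases the cell source, and skipping non-code cells is a choice made for this sketch. The fallback to `object` exists only so the sketch still runs where papermill is not installed.

```python
try:
    from papermill.engines import Engine
except ImportError:
    # Assumption for this sketch only: fall back to a plain base class
    # so the example still runs without papermill installed.
    Engine = object


class UppercaseEngine(Engine):
    """Hypothetical engine that 'executes' a code cell by upper-casing it."""

    @classmethod
    def execute_managed_notebook(cls, nb_man, kernel_name, **kwargs):
        # nb_man is the NotebookExecutionManager wrapping the notebook.
        for index, cell in enumerate(nb_man.nb.cells):
            if cell.cell_type != "code":
                continue  # sketch-level choice: only code cells are handled
            try:
                nb_man.cell_start(cell, index)
                # Stand-in for real execution; a real engine would send
                # the source to a kernel and collect its outputs here.
                cell.outputs = [
                    {
                        "output_type": "stream",
                        "name": "stdout",
                        "text": cell.source.upper(),
                    }
                ]
            except Exception:
                # Record where the failure happened, then propagate it.
                nb_man.cell_exception(cell, cell_index=index)
                raise
            finally:
                # Runs on success and on failure, so metadata is finalized
                # and the notebook is saved either way.
                nb_man.cell_complete(cell, cell_index=index)
```

An engine written this way would typically be registered under a name through PapermillEngines.register and then selected with the engine_name argument of execute_notebook.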
-
class
papermill.engines.
NBClientEngine
¶ Bases:
papermill.engines.Engine
A notebook engine representing an nbclient process.
This can execute a notebook document and update the nb_man.nb object with the results.
-
classmethod
execute_managed_notebook
(nb_man, kernel_name, log_output=False, stdout_file=None, stderr_file=None, start_timeout=60, execution_timeout=None, **kwargs)¶ Performs the actual execution of the parameterized notebook locally.
- Parameters
nb_man (NotebookExecutionManager) – Wrapper for execution state of a notebook.
kernel_name (str) – Name of kernel to execute the notebook against.
log_output (bool) – Flag for whether or not to write notebook output to the configured logger.
start_timeout (int) – Duration to wait for kernel start-up.
execution_timeout (int) – Duration to wait before failing execution (default: never).
-
class
papermill.engines.
NotebookExecutionManager
(nb, output_path=None, log_output=False, progress_bar=True, autosave_cell_every=30)¶ Bases:
object
Wrapper for execution state of a notebook.
This class is a wrapper for notebook objects to house execution state related to the notebook being run through an engine.
In particular the NotebookExecutionManager provides common update callbacks for use within engines to facilitate metadata and persistence actions in a shared manner.
-
COMPLETED
= 'completed'¶
-
FAILED
= 'failed'¶
-
PENDING
= 'pending'¶
-
RUNNING
= 'running'¶
-
autosave_cell
()¶ Saves the notebook if it’s been more than self.autosave_cell_every seconds since it was last saved.
-
cell_complete
(cell, cell_index=None, **kwargs)¶ Finalize metadata for a cell and save notebook.
Optionally called by engines during execution to finalize the metadata for a cell and save the notebook to the output path.
-
cell_exception
(cell, cell_index=None, **kwargs)¶ Set metadata when an exception is raised.
Called by engines when an exception is raised within a notebook to set the metadata on the notebook indicating the location of the failure.
-
cell_start
(cell, cell_index=None, **kwargs)¶ Set and save a cell’s start state.
Optionally called by engines during execution to initialize the metadata for a cell and save the notebook to the output path.
-
cleanup_pbar
()¶ Clean up a progress bar
-
complete_pbar
()¶ Refresh progress bar
-
get_cell_description
(cell, escape_str='papermill_description=')¶ Fetches cell description if present
-
notebook_complete
(**kwargs)¶ Finalize the metadata for a notebook and save the notebook to the output path.
Called by Engine when execution concludes, regardless of exceptions.
-
notebook_start
(**kwargs)¶ Initialize a notebook, clearing its metadata, and save it.
When starting a notebook, this initializes and clears the metadata for the notebook and its cells, and saves the notebook to the given output path.
Called by Engine when execution begins.
-
now
()¶ Helper to return current UTC time
-
save
(**kwargs)¶ Saves the wrapped notebook state.
If an output path is known, this triggers a save of the wrapped notebook state to the provided path.
Can be used outside of cell state changes if execution is taking a long time to conclude but the notebook object should be synced.
For example, you may want to save the notebook every 10 minutes during a 5-hour cell execution so that output messages are captured in the notebook.
-
set_timer
()¶ Initializes the execution timer for the notebook.
This is called automatically when a NotebookExecutionManager is constructed.
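The time-based throttling that autosave_cell describes (save only when more than autosave_cell_every seconds have passed since the last save) can be sketched with the standard library alone. `ThrottledSaver`, `maybe_save`, and the injectable `clock` are hypothetical names introduced for this illustration; they are not part of papermill.

```python
import time


class ThrottledSaver:
    """Hypothetical helper: call `save` at most once every `every` seconds."""

    def __init__(self, save, every=30, clock=time.monotonic):
        self._save = save
        self.every = every
        self._clock = clock
        # Treat construction time as the moment of the last save.
        self._last_saved = clock()

    def maybe_save(self):
        """Save only if more than `every` seconds passed since the last save.

        Returns True when a save was performed, False otherwise.
        """
        now = self._clock()
        if now - self._last_saved > self.every:
            self._save()
            self._last_saved = now
            return True
        return False
```

Passing the clock in as an argument keeps the helper easy to test, since a fake clock can be advanced without sleeping.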
-
-
class
papermill.engines.
PapermillEngines
¶ Bases:
object
The holder which houses any engine registered with the system.
This object is used in a singleton manner to save and load particular named Engine objects so they may be referenced externally.
-
execute_notebook_with_engine
(engine_name, nb, kernel_name, **kwargs)¶ Fetch a named engine and execute the nb object against it.
-
get_engine
(name=None)¶ Retrieves an engine by name.
-
nb_kernel_name
(engine_name, nb, name=None)¶ Fetch the kernel name from the document by dropping down into the provided engine.
-
nb_language
(engine_name, nb, language=None)¶ Fetch the language from the document by dropping down into the provided engine.
-
register
(name, engine)¶ Register a named engine
-
register_entry_points
()¶ Register entrypoints for an engine
Load handlers provided by other packages
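The holder pattern described for PapermillEngines (register an engine under a name, look it up later, and delegate execution to it) might look roughly like the sketch below. `EngineRegistry` and `EchoEngine` are hypothetical stand-ins written for this example, and raising `LookupError` for an unknown name is an assumption of the sketch, not a statement about papermill's behavior.

```python
class EngineRegistry:
    """Hypothetical stand-in for a name -> engine holder."""

    def __init__(self):
        self._engines = {}

    def register(self, name, engine):
        """Store an engine under the given name."""
        self._engines[name] = engine

    def get_engine(self, name=None):
        """Return the engine registered under `name`."""
        try:
            return self._engines[name]
        except KeyError:
            # Assumption: the sketch signals unknown names with LookupError.
            raise LookupError(f"No engine named {name!r} found") from None

    def execute_notebook_with_engine(self, engine_name, nb, kernel_name, **kwargs):
        """Fetch the named engine and run the notebook object against it."""
        engine = self.get_engine(engine_name)
        return engine.execute_notebook(nb, kernel_name, **kwargs)


class EchoEngine:
    """Hypothetical engine that reports what it was asked to run."""

    @classmethod
    def execute_notebook(cls, nb, kernel_name, **kwargs):
        return {"nb": nb, "kernel": kernel_name}


# A single shared instance plays the role of the singleton holder.
registry = EngineRegistry()
registry.register("echo", EchoEngine)
```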
-
-
papermill.engines.
catch_nb_assignment
(func)¶ Wrapper to catch nb keyword arguments
This helps catch nb keyword arguments and assign onto self when passed to the wrapped function.
Used for callback methods when the caller may optionally have a new copy of the originally wrapped nb object.
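The behavior described for catch_nb_assignment can be sketched as an ordinary method decorator. The names `assign_nb_kwarg` and `Manager` are hypothetical, and this is an illustration of the described behavior under that description alone, not a copy of papermill's implementation.

```python
import functools


def assign_nb_kwarg(func):
    """Hypothetical decorator: store an `nb` keyword argument on self.

    If the caller passes nb=..., the wrapped object's `nb` attribute is
    replaced before the method body runs.
    """

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        nb = kwargs.get("nb")
        if nb is not None:
            # The caller holds a newer copy of the notebook; adopt it.
            self.nb = nb
        return func(self, *args, **kwargs)

    return wrapper


class Manager:
    """Hypothetical holder of a notebook object."""

    def __init__(self, nb):
        self.nb = nb

    @assign_nb_kwarg
    def cell_complete(self, cell, cell_index=None, **kwargs):
        # By the time this body runs, self.nb is already up to date.
        return self.nb
```

A callback decorated this way keeps working when the caller never passes nb, and picks up the replacement object when it does.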
papermill.execute
-
papermill.execute.
execute_notebook
(input_path, output_path, parameters=None, engine_name=None, request_save_on_cell_execute=True, prepare_only=False, kernel_name=None, language=None, progress_bar=True, log_output=False, stdout_file=None, stderr_file=None, start_timeout=60, report_mode=False, cwd=None, **engine_kwargs)¶ Executes a single notebook locally.
- Parameters
input_path (str or Path or nbformat.NotebookNode) – Path to input notebook or NotebookNode object of notebook
output_path (str or Path or None) – Path to save executed notebook. If None, no file will be saved
parameters (dict, optional) – Arbitrary keyword arguments to pass to the notebook parameters
engine_name (str, optional) – Name of execution engine to use
request_save_on_cell_execute (bool, optional) – Request that the notebook be saved after each cell execution
autosave_cell_every (int, optional) – How often in seconds to save in the middle of long cell executions
prepare_only (bool, optional) – Flag to determine if execution should occur or not
kernel_name (str, optional) – Name of kernel to execute the notebook against
language (str, optional) – Programming language of the notebook
progress_bar (bool, optional) – Flag for whether or not to show the progress bar.
log_output (bool, optional) – Flag for whether or not to write notebook output to the configured logger
start_timeout (int, optional) – Duration in seconds to wait for kernel start-up
report_mode (bool, optional) – Flag for whether or not to hide input.
cwd (str or Path, optional) – Working directory to use when executing the notebook
**engine_kwargs – Arbitrary keyword arguments to pass to the notebook engine
- Returns
nb – Executed notebook object
- Return type
NotebookNode
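A typical call might look like the sketch below. The helper names `build_execute_kwargs` and `run_notebook` are hypothetical, as is the choice of values; `python3` is assumed to be an installed kernel. papermill is imported lazily so the helpers can be defined and inspected on a machine where it is not installed.

```python
def build_execute_kwargs(input_path, output_path, **parameters):
    """Collect keyword arguments for execute_notebook (hypothetical helper)."""
    return {
        "input_path": input_path,
        "output_path": output_path,
        # Values injected into the notebook's parameters.
        "parameters": parameters,
        # Assumption: a kernel named "python3" is installed.
        "kernel_name": "python3",
        "progress_bar": False,
        "log_output": True,
        # Allow a slow kernel up to two minutes to start.
        "start_timeout": 120,
    }


def run_notebook(input_path, output_path, **parameters):
    """Execute one notebook and return the executed NotebookNode."""
    import papermill as pm  # imported lazily; requires papermill installed

    return pm.execute_notebook(
        **build_execute_kwargs(input_path, output_path, **parameters)
    )
```

For instance, `run_notebook("input.ipynb", "output.ipynb", alpha=0.5)` would execute a notebook at the illustrative path `input.ipynb` with `alpha` set to 0.5 and write the result to `output.ipynb`.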
-
papermill.execute.
prepare_notebook_metadata
(nb, input_path, output_path, report_mode=False)¶ Prepare metadata associated with a notebook and its cells
-
papermill.execute.
raise_for_execution_errors
(nb, output_path)¶ Checks the executed notebook for error outputs and raises if any are found
- Parameters
nb (NotebookNode) – Executable notebook object
output_path (str) – Path to write executed notebook
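The kind of check that the name raise_for_execution_errors suggests can be sketched against the notebook JSON structure, where each code cell carries an `outputs` list and a failed cell contains an output whose `output_type` is `error`. The functions `first_error_output` and `check_notebook` are hypothetical, operate on plain dictionaries, and raise a generic `RuntimeError`; none of this is papermill's own implementation.

```python
def first_error_output(nb):
    """Return (cell_index, output) for the first error output, or None."""
    for index, cell in enumerate(nb.get("cells", [])):
        # Markdown cells have no "outputs" key, hence the default.
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                return index, output
    return None


def check_notebook(nb):
    """Raise RuntimeError if the notebook contains an error output."""
    found = first_error_output(nb)
    if found is not None:
        index, output = found
        raise RuntimeError(
            f"Cell {index} failed: {output.get('ename')}: {output.get('evalue')}"
        )
```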
-
papermill.execute.
remove_error_markers
(nb)¶
papermill.clientwrap
-
class
papermill.clientwrap.
PapermillNotebookClient
(**kwargs: Any)¶ Bases:
nbclient.client.NotebookClient
A notebook client that executes the code cells and updates their outputs
-
execute
(**kwargs)¶ Wraps the parent class process call slightly
-
log_output
¶ A boolean (True, False) trait.
-
log_output_message
(output)¶ Process a given output. May log it in the configured logger and/or write it into the configured stdout/stderr files.
- Parameters
output – nbformat.notebooknode.NotebookNode
- Returns
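The routing described for log_output_message (optionally log an output, and optionally write it to the configured stdout or stderr file) can be sketched for stream outputs as below. `route_output` is a hypothetical function working on plain dictionaries shaped like notebook stream outputs; it assumes only stream outputs are routed and is not papermill's implementation.

```python
def route_output(output, logger=None, stdout_file=None, stderr_file=None):
    """Send a stream output's text to the logger and/or the matching file."""
    if output.get("output_type") != "stream":
        return  # assumption of this sketch: only stream outputs are routed
    # Stream text may be a single string or a list of strings.
    text = "".join(output.get("text", ""))
    if logger is not None:
        logger.info(text)
    # Pick the file that matches the stream name, if one was configured.
    files = {"stdout": stdout_file, "stderr": stderr_file}
    target = files.get(output.get("name"))
    if target is not None:
        target.write(text)
        target.flush()
```

Any object with `write` and `flush` methods works as a target, so an open file, `sys.stdout`, or an in-memory `io.StringIO` buffer can be passed.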
-
papermill_execute_cells
()¶ This function replaces cell execution with its own wrapper.
We are doing this for the following reasons:
Notebooks will stop executing when they encounter a failure but not raise a CellException. This allows us to save the notebook with the traceback even though a CellExecutionError was encountered.
We want to write the notebook as cells are executed. We inject our logic for that here.
We want to include timing and execution status information with the metadata of each cell.
-
process_message
(*arg, **kwargs)¶ Processes a kernel message, updates cell state, and returns the resulting output object that was appended to cell.outputs.
The input argument cell is modified in-place.
- Parameters
- Returns
output – The execution output payload (or None for no output).
- Return type
NotebookNode
- Raises
CellExecutionComplete – Once a message arrives which indicates computation completeness.
-
stderr_file
¶ A trait whose value must be an instance of a specified class.
The value can also be an instance of a subclass of the specified class.
Subclasses can declare default classes by overriding the klass attribute
-
stdout_file
¶ A trait whose value must be an instance of a specified class.
The value can also be an instance of a subclass of the specified class.
Subclasses can declare default classes by overriding the klass attribute
-