Understanding Context and API Endpoint Notebook Cells
Overview
Jupyter notebooks are widely used for interactive and collaborative data analysis. In Data Lab Functions, they can also define RESTful API endpoints through two special kinds of cells: context cells and API endpoint cells.
Defining Context Cells in Jupyter Notebooks
Context cells are typically placed at the beginning of a Jupyter notebook and are executed only once in a new kernel. The purpose of context cells is to set up the environment and prepare the necessary configurations for the subsequent API endpoint cells to function effectively.
import pandas as pd
# Read data from a CSV file
data = pd.read_csv('data.csv')
In this example, the context cell sets up the data source by importing packages, reading data from a CSV file, and defining the global variable data. This context cell is executed only once, when the notebook kernel is started or restarted, and it prepares the environment for subsequent cells that use the data DataFrame.
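The run-once behavior can be pictured with a small simulation. This is only a sketch in plain Python, not how Data Lab Functions is implemented: a dictionary stands in for the kernel's global namespace, and the cell sources and the names kernel_globals, context_cell, and endpoint_cell are hypothetical.

```python
# Hypothetical simulation: a dict plays the role of the kernel's globals.
kernel_globals = {}

# Context cell source: runs once when the kernel starts.
context_cell = """
context_runs = globals().get('context_runs', 0) + 1
data = [3, 5, 8]  # stand-in for the DataFrame loaded from data.csv
"""

# Endpoint cell source: runs on every request and reuses `data`.
endpoint_cell = """
endpoint_runs = globals().get('endpoint_runs', 0) + 1
total = sum(data)
"""

exec(context_cell, kernel_globals)       # kernel start: context runs once
for _ in range(3):                       # three incoming requests
    exec(endpoint_cell, kernel_globals)  # each request reuses the same globals
```

After the loop, the context cell has run once and the endpoint cell three times, and every endpoint run saw the data defined by the context cell.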
Defining API Endpoints in Jupyter Notebooks
In Data Lab Functions, API endpoints are defined in Jupyter notebooks as special cells, which must appear after context cells to ensure proper execution. The endpoint definition is placed as a comment on the first line of the cell and consists of the HTTP method followed by the endpoint name. In other words, an endpoint cell is a notebook cell decorated with a top-line comment that resembles a RESTful endpoint. For example:
# GET /items
items = spy.search({
    'Path': 'Example >> Cooling Tower 1 >> Area A',
    'Name': 'Compressor'
})
items.to_json()
Here's a step-by-step explanation of what the code is doing:
# GET /items: This comment indicates that the API call uses the HTTP GET method to fetch data related to "items."
items = spy.search({...}): The search() function is called with the specified search criteria, and the result is stored in the variable items. The items variable now holds the items that match the search conditions.
items.to_json(): This line converts the items variable, which contains the search results in some structured format (e.g., DataFrame), into a JSON representation. JSON (JavaScript Object Notation) is a common data format for easy data interchange.
Allowed HTTP Methods
The API endpoints in Data Lab Functions support the following HTTP methods:
GET: Retrieve data or perform queries without altering the server state.
POST: Create new resources on the server.
PATCH: Partially update existing resources.
PUT: Update or replace the entire existing resource.
DELETE: Remove resources from the server.
Nested Paths in Endpoints
API endpoints can include nested paths, providing a way to organize and categorize related functionalities. For example:
# GET /items/area-a/compressor
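To make the shape of an endpoint comment concrete, here is a minimal sketch of how such a first line could be split into an HTTP method and nested path segments. It is an illustration only, not the parser Data Lab Functions uses; the function name parse_endpoint_comment and the set of characters accepted in a path segment are assumptions.

```python
import re

# The five HTTP methods listed in this document.
ALLOWED_METHODS = {"GET", "POST", "PATCH", "PUT", "DELETE"}

# "#", a method, whitespace, then one or more /segment parts.
# ASSUMPTION: segments contain only letters, digits, "_" and "-".
ENDPOINT_COMMENT = re.compile(r"^#\s*([A-Z]+)\s+((?:/[A-Za-z0-9_\-]+)+)\s*$")

def parse_endpoint_comment(first_line):
    """Return (method, path segments), or None if the line is not an endpoint comment."""
    match = ENDPOINT_COMMENT.match(first_line.strip())
    if match is None or match.group(1) not in ALLOWED_METHODS:
        return None
    method, path = match.groups()
    return method, path.strip("/").split("/")

nested = parse_endpoint_comment("# GET /items/area-a/compressor")
```

Here nested holds the method together with the three path segments, while a line such as "# FETCH /items" is rejected because FETCH is not one of the allowed methods.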
Query Parameters and Request Body
Data Lab Functions' API endpoints do not support path parameters. Instead, query parameters and a request body are available in the REQUEST object. The REQUEST object is instantiated in the kernel for every Data Lab Functions request, enabling users to pass relevant data and configuration options. See Reserved Variables for more information.
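As a rough sketch of how an endpoint cell might read these values, the snippet below imitates the REQUEST object with a plain dictionary so that it runs outside Data Lab. The keys args and body, the parameter names, and the helper read_search_criteria are assumptions made for illustration; the actual structure of REQUEST is described in Reserved Variables.

```python
# Stand-in for the REQUEST object that Data Lab Functions provides per request.
# ASSUMPTION: the "args" and "body" keys and their contents are hypothetical.
REQUEST = {
    "args": {"name": "Compressor"},
    "body": {"path": "Example >> Cooling Tower 1 >> Area A"},
}

def read_search_criteria(request):
    """Build search criteria from query parameters and the request body."""
    args = request.get("args") or {}
    body = request.get("body") or {}
    return {
        "Path": body.get("path", ""),
        "Name": args.get("name", ""),
    }

criteria = read_search_criteria(REQUEST)
```

Because path parameters are not supported, a value such as the item name travels as a query parameter or in the body rather than as part of the endpoint path.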
Unique Endpoint Cells
Each API endpoint cell must be unique, representing a distinct combination of the HTTP method and endpoint name. This uniqueness ensures clarity and consistency within the Jupyter notebook.
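One way to check a notebook for this before deploying it is sketched below. The helper find_duplicate_endpoints is hypothetical and works on plain cell source strings; it is not part of Data Lab Functions, and its rule for recognizing an endpoint comment is a simplification.

```python
from collections import Counter

def find_duplicate_endpoints(cell_sources):
    """Return (method, path) pairs that are declared by more than one cell."""
    declared = []
    for source in cell_sources:
        lines = source.strip().splitlines()
        first_line = lines[0] if lines else ""
        parts = first_line.lstrip("#").split()
        # Treat a cell as an endpoint only if its first line looks like "# METHOD /path".
        if first_line.startswith("#") and len(parts) == 2 and parts[1].startswith("/"):
            declared.append((parts[0], parts[1]))
    counts = Counter(declared)
    return sorted(pair for pair, count in counts.items() if count > 1)

cells = [
    "# GET /items\nitems = []",
    "# POST /items\ncreated = True",
    "# GET /items\nother = []",
]
duplicates = find_duplicate_endpoints(cells)
```

In this example GET /items is reported as a duplicate, while POST /items is accepted because the method differs even though the endpoint name is the same.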
Endpoint Cell Response Content-Types
API endpoint cells can produce responses in the following content-types:
application/json (default): The response is in JSON format. Data Lab Functions performs json.dumps() on the endpoint cell's output to serialize Python objects or literals into JSON.
image/*: Endpoints can also generate image responses.
text/plain: Responses can be in plain text format.
text/html: HTML content can be generated as part of the endpoint response.
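For the default application/json case, it helps to know what json.dumps() accepts. The sketch below uses only the standard library and hypothetical sample values; it shows json.dumps() itself, not how Data Lab Functions reacts when serialization fails.

```python
import json

# A dict of plain Python values (str, int, bool, list) serializes directly.
payload = {"count": 2, "names": ["Compressor A", "Compressor B"], "ok": True}
serialized = json.dumps(payload)

# A set is not JSON-serializable, so json.dumps raises TypeError.
try:
    json.dumps({"names": {"Compressor A"}})
    set_is_serializable = True
except TypeError:
    set_is_serializable = False
```

An endpoint cell whose output is built from dictionaries, lists, strings, numbers, booleans, and None is therefore the safest fit for the default content-type.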
Executing Data Lab Functions via API
Users and developers can now perform a variety of operations by sending HTTP requests to the custom API endpoints defined within Jupyter notebooks. These operations can include data analysis, visualization, model execution, and much more.
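A client call could be prepared along the following lines. This is a sketch using only the Python standard library; the base URL is a placeholder, not a real Data Lab Functions address, build_request is a hypothetical helper, and authentication is left out entirely.

```python
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: placeholder base URL; substitute the address of your own notebook's endpoints.
BASE_URL = "https://example.invalid/hypothetical/notebook/endpoints"

def build_request(method, endpoint, params=None):
    """Prepare (but do not send) an HTTP request for an endpoint cell."""
    url = f"{BASE_URL}{endpoint}"
    if params:
        url = f"{url}?{urlencode(params)}"  # query parameters, since path parameters are not supported
    return Request(url, method=method, headers={"Accept": "application/json"})

request = build_request("GET", "/items", {"name": "Compressor"})
# Sending it would be: urllib.request.urlopen(request)
```

The request is only constructed here, so the snippet runs without network access; the method and endpoint passed in correspond to the top-line comment of the target cell.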