Submit queries asynchronously

The asynchronous queries feature decouples query submission from the fetching of query results, enabling you to run queries in the background while you continue working in your Jupyter notebooks. With the asynchronous queries feature, you can:

  • Submit multiple queries (rate limited) to process asynchronously.
  • Monitor the progress of your queries using a JupyterHub extension (automatic status update every 10 minutes plus status refresh on demand).
  • Retrieve the results of your queries when they complete.
  • Stop a query before it completes.

This feature is available for Python only (not available for R).

Importing and using the async methods

To use this feature, you must import the following methods that allow you to execute queries and retrieve query results in the background. These methods can be used in a Python Jupyter notebook environment:

  • submit_async_query
  • poll_async_query
  • get_async_query_results
  • stop_async_query

The following example shows importing the methods:

from fbri.private.sql.query import submit_async_query, poll_async_query, get_async_query_results, stop_async_query

To submit an asynchronous query, use the submit_async_query method as shown in the following example:

# Insert your database and table names
sql = "SELECT * FROM <databasename>.<tablename> LIMIT 100000"
query_id = submit_async_query(sql)

# Outputs the query ID
print(query_id)

The query output consists of the query ID.

The next example shows using the query ID to check on the status of a query:

query_status = poll_async_query(query_id)
          
# Or you can insert the query ID directly as a string
query_status = poll_async_query('insert-query-id') 

The query output indicates whether the query is in progress, was successful, failed, or was canceled.

You can fetch the results of successful queries, with or without the following options:

  • Specify the number of line items to include
  • Name an output file to contain the results

If you do not specify an output file, the results are output directly to the dataframe.

This example shows both options:

get_async_query_results(query_id)
          
# Optionally specify an output file and/or item count
get_async_query_results(query_id, output_file="output.csv")
get_async_query_results(query_id, item_count=10)

If you attempt to fetch the results of an unfinished or failed query, you receive an error message.

You can stop a query before it has completed as shown in the following two-step example:

# Insert your database and table names
sql = "SELECT * FROM <databasename>.<tablename>"
query_id = submit_async_query(sql)
stop_async_query(query_id)

The output is a message indicating that the query was successfully canceled. If you attempt to stop a query that has already completed, you receive an error message.

Rate limiting

To ensure fair resource allocation, the number of concurrently running queries you can submit is controlled by a rate limit. If you attempt to submit a query after your rate limit has been reached, you receive a message to that effect and the last query is not submitted. The rate limit is applied on a per-user basis, regardless of the number of Jupyter notebooks a user has.

The message you receive when you attempt to submit a query that would surpass your rate limit looks similar to this:

If you are blocked by the rate limit, you can wait for some of your running queries to complete, or stop some of your running queries.

If you have questions or concerns about the rate limit, please contact us through Direct Support using the procedure in Use the Direct Support Tool to Get Help.

Automatic status update reporting

You will receive an automatic jobs status update pop-up window if there are any jobs with status changes to report since the previous automatic update. The pop-up window only includes jobs with status changes, and you can search within the pop-up window.