Get started

The following sections guide you through getting started with URL Shares:

Log in to the Secure Research Environment URL

Use one of the Amazon WorkSpaces Secure Browser portals to access URL Shares in Secure Research Environment. For the best user experience and platform performance, select the portal closest to your location:

For more information, see WorkSpaces Secure Browser in the Secure Research Environment user documentation.

Log in to the site using your Facebook credentials. This starts a JupyterHub server instance for your use in Secure Research Environment.

You will be offered the choice of a CPU or a GPU server. See GPU server to learn about the difference between the two. See Secure Research Environment for the complete documentation.

Using a Jupyter notebook

This section steps you through creating a new Jupyter notebook and using it to run queries. The examples below query the public environment.

Create a notebook

Click the blue + button and select Python 3. This creates a new Jupyter notebook in a new browser tab. You can optionally rename the notebook by right-clicking on the notebook in the left pane and selecting Rename.

Secure Research Environment also supports the R language if you prefer.

Examples in this documentation currently use Python.

Run a Python-based SQL query

To use our URL Shares SQL module, click in an empty notebook cell and enter the following code:

from fbri.public.sql.query import execute

database = "fbri_prod_public"
table = "erc_condor_url_attributes_dp_final_public_v3"

sql = f"""
SELECT *
FROM {database}.{table}
LIMIT 20
"""

result = execute(sql)

Some rows in the raw results (now in memory) contain negative values in the numeric columns. This is because Gaussian noise has been added to the original values to preserve privacy. Specifically, the noise implements user-level zero-Concentrated Differential Privacy (zCDP).
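To illustrate why negative values can appear, the following sketch adds zero-mean Gaussian noise to a few made-up counts. The counts, the noise scale `sigma`, and the helper `add_gaussian_noise` are hypothetical and do not reflect the parameters or code used to produce the URL Shares dataset.

```python
import random


def add_gaussian_noise(values, sigma, seed=0):
    """Return a copy of values with zero-mean Gaussian noise added to each."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Hypothetical true counts; small counts are the most likely to turn negative.
true_counts = [0, 1, 2, 3, 500]
noisy_counts = add_gaussian_noise(true_counts, sigma=10.0)

for true, noisy in zip(true_counts, noisy_counts):
    print(f"true={true:>4}  noisy={noisy:8.2f}")

# The mean of many noisy copies of one value approaches the true value,
# because the positive and negative noise terms tend to cancel out.
many = add_gaussian_noise([2] * 10_000, sigma=10.0, seed=1)
mean_of_many = sum(many) / len(many)
print(f"mean of 10,000 noisy copies of 2: {mean_of_many:.2f}")
```

Because the noise is centered on zero, clipping negative values to zero or dropping them before aggregating would bias sums and means upward. In this sketch, the negative values are kept for that reason.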

Using Pandas DataFrame

While the raw result is still in memory, you might find the data easier to manipulate if you load it into a pandas DataFrame. You can save the result to a .tsv file by passing a file name to the `execute()` function, and then read that file with the pandas library.

Create a tab-separated save file (.tsv)

To save the query result as a .tsv file and load it into a pandas DataFrame, click in an empty notebook cell and run the following code:

from fbri.public.sql.query import execute
import pandas as pd

database = "fbri_prod_public"
table = "erc_condor_url_attributes_dp_final_public_v3"

sql = f"""
SELECT *
FROM {database}.{table}
LIMIT 20
"""

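# Passing a file name as the second argument saves the result as a .tsv file.
result = execute(sql, "attributes.tsv")
# Load the saved file into a DataFrame and display it.
df = pd.read_csv("attributes.tsv", sep="\t")
df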
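Once the results are in a DataFrame, standard pandas operations apply. The sketch below uses a small in-memory tab-separated sample in place of the real `attributes.tsv`. The column names `url_id` and `shares` and all values are made up for illustration and are not the dataset's schema.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of a .tsv file; not real data.
sample_tsv = "url_id\tshares\na\t12.4\nb\t-3.1\nc\t7.0\n"

df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Count the rows whose noisy value is negative, leaving them in place.
negative_rows = int((df["shares"] < 0).sum())

# Sum the noisy column with the negative values included.
total_shares = df["shares"].sum()

print(df)
print(f"rows with negative values: {negative_rows}")
print(f"total: {total_shares:.1f}")
```

Reading from `io.StringIO` keeps the example independent of the research environment. In a notebook, you would pass the path of the saved .tsv file to `pd.read_csv` instead.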

Next steps

See the public and private environment guides for more examples and forms of analysis: