Get started

The following sections guide you through getting started with URL Shares:

OpenVPN setup
Using a Jupyter notebook
Using Pandas DataFrame
Next steps

OpenVPN setup

After January 7, 2025, access to URL Shares in Researcher Platform will require Amazon WorkSpaces Secure Browser. See WorkSpaces Secure Browser in the Researcher Platform documentation for instructions. You will not lose any data in this transition. With WorkSpaces Secure Browser, you do not need a VPN to access Researcher Platform. The Using a Jupyter notebook section on this page provides the URL so you can try it.

Until January 7, 2025, you can access URL Shares through a Virtual Private Network (VPN). This section shows you how to install and configure the OpenVPN client and connect to our VPN server. Once connected, you will be able to access URL Shares and perform queries.

Note that while you are connected to our VPN server, all of your internet traffic will be routed through it, so be sure to disconnect when you are finished.

Step 1: Install OpenVPN

Download and install the OpenVPN client.

Step 2: Generate credentials

Go to Meta Business Suite and log in with your Facebook account. Click Generate Credentials.

This will generate your OpenVPN credentials and download a file that contains your access certificate. Do not share this file with anybody!

Step 3: Launch and connect

Launch OpenVPN and import the file you just generated.

Check the Connect after import checkbox, then click Add to add the imported file to your profile.

This will connect to the VPN. OpenVPN displays a clock indicating that you are connected.

If all your other credentials are in order, you should be able to access URL Shares.

Troubleshooting VPN issues

It is possible for other certificate-based VPNs to interfere with certificate authentication. If you have trouble connecting, try disabling any other VPNs you have in use.

If you have any other issues connecting to the VPN, contact the support staff by creating a Jira Service Management case. See Get help for information on how to open a case.

Using a Jupyter notebook

This section steps you through creating a new Jupyter notebook and using it to run queries. The examples below query the public environment.

Step 1: Log in

While connected to the VPN, visit URL Shares by going to the webpage of the environment you want to access.

Environment URLs:

If you'd like to use the latest Amazon WorkSpaces Secure Browser version of Researcher Platform, you can follow these URLs without being connected to the VPN:

Log in to the site using your Facebook account.

The Researcher Platform offers a GPU server as an alternative to a CPU server. See GPU server for information about this feature.

Once you have logged in and selected your server type, Jupyter spins up an instance of the Jupyter notebook server for your use.

Step 2: Create a notebook

Click the blue + button and select Python 3. This creates a new Jupyter notebook in a new browser tab. You can optionally rename the notebook by right-clicking on the notebook in the left pane and selecting Rename.

Researcher Platform also supports the R language if you prefer.

Examples in this documentation currently use Python.

Step 3: Run a Python-based SQL query

To use our URL Shares SQL module, click in an empty notebook cell and enter the following code:

from fbri.public.sql.query import execute

database = "fbri_prod_public"
table = "erc_condor_url_attributes_dp_final_public_v3"

sql = f"""
SELECT *
FROM {database}.{table}
LIMIT 20
"""

result = execute(sql)

Some rows in the raw results (now in memory) contain negative values in the numeric columns. This is expected: Gaussian noise has been added to the original values in the data to preserve privacy, and for small original values the noise can push the result below zero. Specifically, the noise is an implementation of user-level zero-Concentrated Differential Privacy (zCDP).
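To see why added noise can produce negative values, here is a minimal sketch that uses only the Python standard library. It is not the mechanism used to produce the URL Shares data: the function name `add_gaussian_noise`, the sample counts, and the noise scale `sigma=10.0` are all assumptions chosen for illustration.

```python
import random


def add_gaussian_noise(values, sigma, seed=None):
    """Return a new list with independent zero-mean Gaussian noise added to each value.

    Illustrative only: sigma is a made-up noise scale, not the one used for URL Shares.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Made-up "true" counts; small counts are the ones most likely to go negative.
true_counts = [0, 1, 2, 50, 1000]
noisy_counts = add_gaussian_noise(true_counts, sigma=10.0, seed=0)

for true, noisy in zip(true_counts, noisy_counts):
    print(f"true={true:>5}  noisy={noisy:10.2f}")
```

Because the noise in this sketch has zero mean, it tends to cancel out when many noisy values are summed or averaged, which is why a negative value in a single row is not by itself a sign of a problem.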

Using Pandas DataFrame

While the raw result is still in memory, you might find the data easier to manipulate as a Pandas DataFrame. To do this, pass a file name as the second argument to the `execute()` function to save the result as a .tsv file, then load that file with the Pandas library.

Create a tab-separated save file (.tsv)

Import the Pandas library, save the query result to a .tsv file, and load it into a DataFrame by clicking in an empty notebook cell and running the following code:

import pandas as pd
from fbri.public.sql.query import execute

database = "fbri_prod_public"
table = "erc_condor_url_attributes_dp_final_public_v3"

sql = f"""
SELECT *
FROM {database}.{table}
LIMIT 20
"""

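# Passing a file name as the second argument saves the result as a .tsv file.
result = execute(sql, "attributes.tsv")

# Load the saved tab-separated file into a DataFrame.
df = pd.read_csv("attributes.tsv", delimiter="\t")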
df
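Once the data is in a DataFrame, a quick check such as the following sketch can show how many noisy values fall below zero in each numeric column. So that it runs on its own, it reads a small in-memory sample instead of attributes.tsv. The column names `url_rid`, `views`, and `shares` and all of the values are hypothetical placeholders, not the actual schema of the URL Shares table.

```python
import io

import pandas as pd

# Hypothetical stand-in for attributes.tsv; column names and values are made up.
sample_tsv = (
    "url_rid\tviews\tshares\n"
    "a1\t120.4\t-3.2\n"
    "b2\t-7.9\t15.0\n"
    "c3\t48.1\t2.6\n"
)

df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Keep only the numeric columns, then count the negative entries in each one.
numeric = df.select_dtypes(include="number")
negatives_per_column = (numeric < 0).sum()
print(negatives_per_column)
```

To run the same check on real query results, replace the `io.StringIO(sample_tsv)` argument with the path to the saved file, for example `"attributes.tsv"`.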

Next steps

See the public and private environment guides for more examples and forms of analysis: