Data visualization

You can perform some basic statistical data visualization using the seaborn library.

Data visualization examples

The following examples assume you have already loaded query results into a pandas DataFrame named df by running code like this:

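from fbri.public.sql.query import execute
import pandas as pd

database = "fbri_prod_public"
table = "erc_condor_url_attributes_dp_final_public_v3"

# Fetch a small sample of rows from the table
sql = f"""
SELECT *
FROM {database}.{table}
LIMIT 20
"""

# Run the query and save the results to a tab-separated file
result = execute(sql, "attributes.tsv")

# Load the saved results into a DataFrame and display it
df = pd.read_csv("attributes.tsv", delimiter="\t")
df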

Single value distribution plot

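In a new cell, insert the following code to import the seaborn library and create a distribution plot from the value counts of the parent_domain column.

sns.distplot() plots the distribution of a one-dimensional series of numeric values, in this case the number of rows for each parent domain. Note that distplot() has been deprecated since seaborn 0.11. Newer versions provide sns.histplot() and sns.displot() as replacements.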

import seaborn as sns
sns.distplot(df.parent_domain.value_counts())

The result is a histogram of the counts with a density curve overlaid.

Pair plot

To create an example pair plot, insert this code into a new cell:

import seaborn as sns  # if you have not already done so
sns.pairplot(df[['first_post_time_unix']])

In this example, sns.pairplot() creates a pair plot from the first_post_time_unix column of the df DataFrame. Because only one column is selected, the grid contains a single panel showing the distribution of that column.

Learn more