You can perform some basic statistical data visualization using the seaborn library.
The following examples assume you already have a pandas DataFrame named df,
created by running code like this:
from fbri.public.sql.query import execute
import pandas as pd

database = "fbri_prod_public"
table = "erc_condor_url_attributes_dp_final_public_v3"
sql = f"""
SELECT *
FROM {database}.{table}
LIMIT 20
"""
result = execute(sql, "attributes.tsv")
df = pd.read_csv('attributes.tsv', delimiter='\t')
df
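To try the plotting calls without running the query, a small stand-in DataFrame can be built by hand. This is only a sketch: the name sample_df and every value in it are hypothetical placeholders, and only the two column names come from the examples on this page. It also shows what value_counts() returns, which is the input to the distribution plot.

```python
import pandas as pd

# Hypothetical stand-in rows (made-up values, not real query results).
# Only the column names match the columns used in the examples on this page.
sample_df = pd.DataFrame(
    {
        "parent_domain": [
            "example.com", "example.com", "example.org",
            "example.net", "example.com", "example.org",
        ],
        "first_post_time_unix": [
            1546300800, 1546387200, 1548979200,
            1551398400, 1554076800, 1556668800,
        ],
    }
)

# value_counts() returns one integer per distinct domain, sorted in
# descending order. These integers are what the distribution plot summarizes.
domain_counts = sample_df["parent_domain"].value_counts()
print(domain_counts)
```

If you use a stand-in like this, substitute sample_df for df in the plotting calls.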
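sns.distplot() plots the distribution of a one-dimensional set of numeric values (seaborn 0.11 and later deprecate it in favor of histplot() and displot()).
In a new cell, insert the following code to import the seaborn library and create a distribution plot from the value counts of the parent_domain column, that is, the number of rows for each parent domain: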
import seaborn as sns
sns.distplot(df.parent_domain.value_counts())
The results look similar to this example:
To create an example pair plot, insert this code into a new cell:
import seaborn as sns  # if you have not already done so
sns.pairplot(df[['first_post_time_unix']])
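In this example, sns.pairplot() creates a pair plot using the first_post_time_unix column from the df DataFrame. Because only one column is passed, the plot contains a single panel showing that column's distribution. The result looks similar to this example: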