Private environment data tables

There is one accessible database in the private environment:

  • fbri_prod_private

The accessible tables within that database are:

  • erc_condor_url_attributes_dp_final
  • erc_condor_url_attributes_dp_final_v2
  • erc_condor_url_attributes_dp_final_v3
  • erc_condor_url_breakdowns_dp_clean_partitioned
  • erc_condor_url_breakdowns_dp_clean_partitioned_v2
  • erc_condor_user_reports_dp_final

Complete data is available in the latest partition, so there is no need to query older partitions. We recommend using the latest version of each table, as it gives you access to more columns or more data.
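As a rough illustration of the recommendation to use the latest version of each table, the sketch below picks the highest-versioned name from the list above. It assumes that a trailing `_vN` suffix marks version N and that a name without a suffix is version 1; that naming rule is inferred from the table names and is not stated in this document. The helper name `latest_versions` is hypothetical.

```python
import re

# Table names as listed in this section.
TABLES = [
    "erc_condor_url_attributes_dp_final",
    "erc_condor_url_attributes_dp_final_v2",
    "erc_condor_url_attributes_dp_final_v3",
    "erc_condor_url_breakdowns_dp_clean_partitioned",
    "erc_condor_url_breakdowns_dp_clean_partitioned_v2",
    "erc_condor_user_reports_dp_final",
]


def latest_versions(names):
    """Map each base table name to its highest-versioned table name.

    Assumption: a trailing "_vN" marks version N; no suffix means version 1.
    """
    best = {}
    for name in names:
        match = re.fullmatch(r"(.+)_v(\d+)", name)
        if match:
            base, version = match.group(1), int(match.group(2))
        else:
            base, version = name, 1
        if base not in best or version > best[base][0]:
            best[base] = (version, name)
    return {base: name for base, (_, name) in best.items()}


for base, name in latest_versions(TABLES).items():
    print(f"{base} -> fbri_prod_private.{name}")
```

Under that assumption, the attributes table resolves to its `_v3` variant and the breakdowns table to its `_v2` variant, while the user reports table has only one version.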

See the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

Variables

Aggregate statistics are stored in the tables whose names contain “dp”. Noise has been added to certain variables in these tables for differential privacy.

An artificial example data set with a few observations is available here; you might find it helpful in understanding the fields described below.

erc_condor_url_attributes_dp_final

Within erc_condor_url_attributes_dp_final the columns available are:

Column name

Data type

Description

url_rid

string

A unique URL ID created specifically for this data set.

clean_url

string

The web page URL after processing. This is the full URL, not just the domain. URLs that are no longer reachable persist in the data. The URLs have been processed in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private or sensitive data.

full_domain

string

The full domain name from the URL.

parent_domain

string

The parent domain name from the URL.

first_post_time

timestamp

The date and time when the URL was first posted by a user on Facebook. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2017-12-02 18:10:00.

first_post_time_unix

bigint

The first_post_time field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00. For example: 1449079800.

share_title

string

The title provided by the author of the URL's content, pulled from the og:title field in the original HTML if possible.

share_main_blurb

string

The description provided by the author of the URL's content, pulled from the og:description field in the original HTML if possible.

tpfc_rating

string

If the URL was sent to third-party fact-checkers (TPFC), indicates whether and how they rated it. For additional information on third-party fact-checking, see the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

tpfc_first_fact_check

timestamp

The date and time the article was first fact-checked. NULL indicates the article has not been fact-checked. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2017-12-02 18:10:00.

tpfc_first_fact_check_unix

bigint

The tpfc_first_fact_check field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00. For example: 1449079800.

spam_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as spam over the period from January 1, 2017 to July 31, 2019.

false_news_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as false news over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

hate_speech_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as hate speech over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

public_shares_top_country

string

URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it.
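The relationship between the timestamp columns and their `_unix` counterparts in the table above can be sketched as follows. The sketch assumes the timestamp strings are in UTC, which this document does not state; the function names are hypothetical. Under that assumption, the UNIX example value 1449079800 decodes to 2015-12-02 18:10:00, so the two examples in the table illustrate the formats rather than forming a matching pair.

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"  # the YYYY-MM-DD HH:MM:SS format described above
TEN_MINUTES = 600  # seconds


def to_unix(timestamp_text):
    """Parse a YYYY-MM-DD HH:MM:SS string as UTC (assumption) into UNIX seconds."""
    parsed = datetime.strptime(timestamp_text, FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def from_unix(seconds):
    """Format UNIX seconds as a YYYY-MM-DD HH:MM:SS string in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(FORMAT)


def truncate_to_ten_minutes(seconds):
    """Round UNIX seconds down to the start of the ten-minute increment."""
    return seconds - seconds % TEN_MINUTES


print(from_unix(1449079800))           # the UNIX example from the table
print(to_unix("2017-12-02 18:10:00"))  # the timestamp example from the table
```

Because values are truncated to ten-minute increments, every stored UNIX value should be a multiple of 600; `truncate_to_ten_minutes` shows how an untruncated time would be mapped to its increment.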

erc_condor_url_attributes_dp_final_v2

Within erc_condor_url_attributes_dp_final_v2 the columns available are:

Column name

Data type

Description

url_rid

string

A unique URL ID created specifically for this data set.

clean_url

string

The web page URL after processing. This is the full URL, not just the domain. URLs that are no longer reachable persist in the data. The URLs have been processed in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private or sensitive data.

parent_domain

string

The parent domain name from the URL.

full_domain

string

The full domain name from the URL.

first_post_time

timestamp

The date and time when the URL was first posted by a user on Facebook. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2017-12-02 18:10:00.

first_post_time_unix

bigint

The first_post_time field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00. For example: 1449079800.

share_title

string

The title provided by the author of the URL's content, pulled from the og:title field in the original HTML if possible.

share_main_blurb

string

The description provided by the author of the URL's content, pulled from the og:description field in the original HTML if possible.

tpfc_rating

string

If the URL was sent to third-party fact-checkers (TPFC), indicates whether and how they rated it. For additional information on third-party fact-checking, see the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

tpfc_first_fact_check

timestamp

The date and time the article was first fact-checked. NULL indicates the article has not been fact-checked. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2017-12-02 18:10:00.

tpfc_first_fact_check_unix

bigint

The tpfc_first_fact_check field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00. For example: 1449079800.

spam_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as spam over the period from January 1, 2017 to July 31, 2019.

false_news_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as false news over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

hate_speech_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as hate speech over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

public_shares_top_country

string

URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it.

erc_condor_url_attributes_dp_final_v3

Within erc_condor_url_attributes_dp_final_v3 the columns available are:

Column name

Data type

Description

url_rid

string

A unique URL ID created specifically for this data set.

clean_url

string

The web page URL after processing. This is the full URL, not just the domain. URLs that are no longer reachable persist in the data. The URLs have been processed in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private or sensitive data.

parent_domain

string

The parent domain name from the URL.

full_domain

string

The full domain name from the URL.

first_post_time

timestamp

The date and time when the URL was first posted by a user on Facebook. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2022-12-02 18:10:00.

first_post_time_unix

bigint

The first_post_time field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00. For example: 1449079800.

share_title

string

The title provided by the author of the URL's content, pulled from the og:title field in the original HTML if possible.

share_main_blurb

string

The description provided by the author of the URL's content, pulled from the og:description field in the original HTML if possible.

tpfc_rating

string

If the URL was sent to third-party fact-checkers (TPFC), indicates whether and how they rated it. For additional information on third-party fact-checking, see the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

tpfc_first_fact_check

timestamp

The date and time the article was first fact-checked. NULL indicates the article has not been fact-checked. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2017-12-02 18:10:00.

tpfc_first_fact_check_unix

bigint

The tpfc_first_fact_check field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00. For example: 1449079800.

spam_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as spam over the period from January 1, 2017 to July 31, 2019.

false_news_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as false news over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

hate_speech_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as hate speech over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

public_shares_top_country

string

URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it.
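The description of public_shares_top_country in the table above amounts to taking the country with the largest tally. A minimal sketch follows, using made-up tallies; the function name and the tie-breaking behavior (the first country encountered wins) are assumptions and are not specified in this document.

```python
def top_country(shares_by_country):
    """Return the ISO 3166-1 alpha-2 code with the largest tally, or None if empty.

    Assumption: ties are broken by dictionary order (first country listed).
    """
    if not shares_by_country:
        return None
    return max(shares_by_country, key=shares_by_country.get)


# Hypothetical differentially private tallies; noisy values need not be integers.
tallies = {"US": 120.4, "BR": 310.9, "DE": 95.0}
print(top_country(tallies))
```

The example also shows why the field is not indicative of all locations: the URL was shared in three countries here, but only one code is reported.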

erc_condor_url_breakdowns_dp_clean_partitioned

Within erc_condor_url_breakdowns_dp_clean_partitioned the columns available are:

Column name

Data type

Description

url_rid

string

A unique URL ID created specifically for this data set.

age_bracket

string

Age data from users' profiles. Brackets include 18-24, 25-34, 35-44, 45-54, 55-64, 65+, and NULL.

gender

string

Gender data from users' profiles. Values include male, female, and other.

political_page_affinity

integer, ordered

A measurement of users' connections to Pages whose audiences are similar to those of Pages representing politicians of known political affiliation/ideology, based on Barberá et al. [2015]. For additional information about our political page affinity model, see the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

views

bigint

Number of users who viewed a post containing the URL.

clicks

bigint

Number of users who clicked on the URL.

shares

bigint

Number of users who shared the URL in a post or reshared such a post.

likes

bigint

Number of users who "liked" posts containing the URL.

loves

bigint

Number of users who "loved" posts containing the URL.

hahas

bigint

Number of users who reacted with "haha" to posts containing the URL.

wows

bigint

Number of users who reacted with "wow" to posts containing the URL.

sorrys

bigint

Number of users who reacted with "sad" to posts containing the URL. Note that the official name for this reaction is "sad", but the column name in this dataset is "sorrys".

angers

bigint

Number of users who reacted with "angry" to posts containing the URL.

comments

bigint

Number of users who commented on posts containing the URL.

total_share_without_clicks

bigint

Number of users who shared a post containing the URL but did not actually click on the link themselves. Some users share articles without first clicking through to the actual content. This number might help identify articles that users are sharing without reading, or URLs used in organized campaigns to spread content.

c (country)

string

Country in which the actions recorded in this table occurred. This variable is stored in a column called c in the dataset. Data is partitioned on this variable and includes the countries needed to conduct analysis for research proposals already approved through Social Science One. For a list of included countries, see the latest release of the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

year_month

string

Year and month. Data is partitioned on this variable.

spam_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as spam over the period from January 1, 2017 to July 31, 2019.

false_news_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as false news over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

hate_speech_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as hate speech over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

public_shares_top_country

string

URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it.
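One way to use the total_share_without_clicks column described in the table above is to express it as a fraction of shares. The sketch below is illustrative only: the function name is hypothetical, and the choice to return None when shares is missing or not positive is an assumption. If noise has been added to either column, the ratio can fall outside the range 0 to 1, so it is best read as approximate.

```python
def share_without_click_rate(total_share_without_clicks, shares):
    """Fraction of sharers who did not click the link, or None if undefined.

    Assumption: a missing or non-positive shares value makes the rate undefined.
    """
    if total_share_without_clicks is None or shares is None or shares <= 0:
        return None
    return total_share_without_clicks / shares


# Hypothetical values for a single row.
print(share_without_click_rate(30, 120))
```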

erc_condor_url_breakdowns_dp_clean_partitioned_v2

Within erc_condor_url_breakdowns_dp_clean_partitioned_v2 the columns available are:

Column name

Data type

Description

url_rid

string

A unique URL ID created specifically for this data set.

age_bracket

string

Age data from users' profiles. Brackets include 18-24, 25-34, 35-44, 45-54, 55-64, 65+, and NULL.

gender

string

Gender data from users' profiles. Values include male, female, and other.

political_page_affinity

integer, ordered

A measurement of users' connections to Pages whose audiences are similar to those of Pages representing politicians of known political affiliation/ideology, based on Barberá et al. [2015]. For additional information about our political page affinity model, see the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

views

bigint

Number of users who viewed a post containing the URL.

clicks

bigint

Number of users who clicked on the URL.

shares

bigint

Number of users who shared the URL in a post or reshared such a post.

likes

bigint

Number of users who "liked" posts containing the URL.

loves

bigint

Number of users who "loved" posts containing the URL.

hahas

bigint

Number of users who reacted with "haha" to posts containing the URL.

wows

bigint

Number of users who reacted with "wow" to posts containing the URL.

sorrys

bigint

Number of users who reacted with "sad" to posts containing the URL. Note that the official name for this reaction is "sad", but the column name in this dataset is "sorrys".

angers

bigint

Number of users who reacted with "angry" to posts containing the URL.

comments

bigint

Number of users who commented on posts containing the URL.

total_share_without_clicks

bigint

Number of users who shared a post containing the URL but did not actually click on the link themselves. Some users share articles without first clicking through to the actual content. This number might help identify articles that users are sharing without reading, or URLs used in organized campaigns to spread content.

c (country)

string

Country in which the actions recorded in this table occurred. This variable is stored in a column called c in the dataset. Data is partitioned on this variable and includes the countries needed to conduct analysis for research proposals already approved through Social Science One. For a list of included countries, see the latest release of the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.

year_month

string

Year and month. Data is partitioned on this variable.

spam_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as spam over the period from January 1, 2017 to July 31, 2019.

false_news_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as false news over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

hate_speech_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as hate speech over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

public_shares_top_country

string

URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it.
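Because each row of the breakdowns table above covers one combination of URL, demographic bucket, country, and month, per-URL totals require filtering on the partition columns (c and year_month) and summing over the remaining buckets. The sketch below does this over in-memory rows; the rows are made up, the `"YYYY-MM"` form of year_month is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical rows shaped like the breakdowns table; all values are made up.
ROWS = [
    {"url_rid": "u1", "c": "US", "year_month": "2019-01", "age_bracket": "18-24",
     "gender": "female", "views": 40, "clicks": 5, "shares": 2},
    {"url_rid": "u1", "c": "US", "year_month": "2019-01", "age_bracket": "25-34",
     "gender": "male", "views": 60, "clicks": 7, "shares": 3},
    {"url_rid": "u1", "c": "BR", "year_month": "2019-01", "age_bracket": "18-24",
     "gender": "male", "views": 500, "clicks": 50, "shares": 20},
    {"url_rid": "u2", "c": "US", "year_month": "2019-01", "age_bracket": "65+",
     "gender": "other", "views": 10, "clicks": 1, "shares": 0},
    {"url_rid": "u1", "c": "US", "year_month": "2019-02", "age_bracket": "18-24",
     "gender": "female", "views": 99, "clicks": 9, "shares": 9},
]


def totals_by_url(rows, country, year_month, metrics=("views", "clicks", "shares")):
    """Sum the given metrics per url_rid within one country and one month."""
    totals = defaultdict(lambda: dict.fromkeys(metrics, 0))
    for row in rows:
        # Keep only the requested partition (country and month).
        if row["c"] != country or row["year_month"] != year_month:
            continue
        for metric in metrics:
            totals[row["url_rid"]][metric] += row[metric]
    return dict(totals)


print(totals_by_url(ROWS, "US", "2019-01"))
```

Restricting the sum to a single country and month keeps the totals within one partition. Since the columns count users, adding values across months may count the same user more than once.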

erc_condor_user_reports_dp_final

Within erc_condor_user_reports_dp_final the columns available are:

Column name

Data type

Description

url_rid

string

A unique URL ID created specifically for this data set.

year_month

string

Year and month. Data is partitioned on this variable.

spam_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as spam over the period from January 1, 2017 to July 31, 2019.

false_news_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as false news over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.

hate_speech_usr_feedback

bigint

The total number of unique users who reported posts containing the URL as hate speech over the period from January 1, 2017 to July 31, 2019. The User Reports Table contains monthly aggregations of the data in this column for months subsequent to July 2019.