There is one accessible database in the public environment:
The accessible tables within that database are:
There is also a fact-checked subset of the latest erc_condor_url_attributes_dp_final_public table (columns and datatypes are the same). For more information about Meta's fact checking program, see How Facebook’s third-party fact-checking program works.
These tables hold the columns of the URL attributes table that are approved for inclusion in the public environment. Complete data is available in the latest partition, so there is no need to query older partitions. We recommend using the latest table, erc_condor_url_attributes_dp_final_public_v3, as it gives you access to more columns of data.
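As a rough illustration of restricting a query to the latest partition, the sketch below uses an in-memory SQLite table as a stand-in for erc_condor_url_attributes_dp_final_public_v3. Only the table name and the column names (url_rid, parent_domain, and the ds partition column) come from this page. The SQL dialect of the actual environment, the YYYY-MM-DD format of ds, and the sample rows are assumptions made for the example.

```python
import sqlite3

TABLE = "erc_condor_url_attributes_dp_final_public_v3"

# Keep only rows from the most recent ds partition.
# MAX(ds) compares strings, which matches date order only if ds is
# zero-padded in year-month-day order (an assumption of this sketch).
LATEST_PARTITION_SQL = f"""
SELECT url_rid, parent_domain, ds
FROM {TABLE}
WHERE ds = (SELECT MAX(ds) FROM {TABLE})
ORDER BY url_rid
"""


def build_demo_db():
    """Create an in-memory stand-in table with made-up rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} (url_rid TEXT, parent_domain TEXT, ds TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {TABLE} VALUES (?, ?, ?)",
        [
            ("rid_a", "example.com", "2020-01-01"),  # older partition
            ("rid_a", "example.com", "2020-02-01"),  # latest partition
            ("rid_b", "example.org", "2020-02-01"),  # latest partition
        ],
    )
    return conn


def latest_partition_rows(conn):
    """Return the rows that belong to the latest ds partition."""
    return conn.execute(LATEST_PARTITION_SQL).fetchall()


rows = latest_partition_rows(build_demo_db())
print(rows)
```

In the real environment the same WHERE clause could likely be replaced by a literal date once the latest partition is known, which avoids the subquery.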
See the full codebook for the URL Shares dataset, which is hosted on the Social Science One website.
Aggregate statistics in the tables marked “DP” have noise added for differential privacy.
An artificial example data set with a few observations can be found here; it may be helpful in understanding the fields described below.
Within erc_condor_url_attributes_dp_final_public, the columns available are:
Column name | Data type | Description |
---|---|---|
url_rid | string | A unique URL ID created specifically for this data set. |
clean_url | string | The web page URL after processing. This is the full URL, not just the domain. URLs that are no longer reachable persist in the data. The URLs have been processed in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private or sensitive data. |
parent_domain | string | The parent domain name from the URL. |
full_domain | string | The full domain name from the URL. |
public_shares_top_country | string | URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it. |
Within erc_condor_url_attributes_dp_final_public_v2, the columns available are:
Column name | Data type | Description |
---|---|---|
url_rid | string | A unique URL ID created specifically for this data set. |
clean_url | string | The web page URL after processing. This is the full URL, not just the domain. URLs that are no longer reachable persist in the data. The URLs have been processed in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private or sensitive data. |
parent_domain | string | The parent domain name from the URL. |
full_domain | string | The full domain name from the URL. |
share_title | string | The title provided by the author of the URL's content, pulled from the og:title field in the original HTML if possible. |
share_main_blurb | string | The description provided by the author of the URL's content, pulled from the og:description field in the original HTML if possible. |
public_shares_top_country | string | URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it. |
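To make the share_title and share_main_blurb descriptions concrete, here is a minimal sketch of reading the og:title and og:description meta tags from a page's HTML with Python's standard html.parser module. It is not the pipeline used to build the dataset. The class name OpenGraphParser, the function extract_share_fields, the sample HTML, and the choice to return None when a tag is missing are all assumptions for illustration, since this page does not say what happens when the tags are absent.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect og:title and og:description from <meta property=... content=...> tags."""

    WANTED = ("og:title", "og:description")

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        # Keep the first occurrence of each wanted property.
        if prop in self.WANTED and content is not None and prop not in self.found:
            self.found[prop] = content


def extract_share_fields(html):
    """Map Open Graph tags to the column names used in the tables above."""
    parser = OpenGraphParser()
    parser.feed(html)
    return {
        "share_title": parser.found.get("og:title"),  # None if tag is absent
        "share_main_blurb": parser.found.get("og:description"),
    }


sample_html = """
<html><head>
<meta property="og:title" content="Example headline" />
<meta property="og:description" content="A short summary.">
</head><body></body></html>
"""
print(extract_share_fields(sample_html))
```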
Within erc_condor_url_attributes_dp_final_public_v3, the columns available are:
Column name | Data type | Description |
---|---|---|
url_rid | string | A unique URL ID created specifically for this data set. |
clean_url | string | The web page URL after processing. This is the full URL, not just the domain. URLs that are no longer reachable persist in the data. The URLs have been processed in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private or sensitive data. |
parent_domain | string | The parent domain name from the URL. |
full_domain | string | The full domain name from the URL. |
first_post_time | timestamp | The date and time when the URL was first posted by a user on Facebook. Date-times are truncated to ten-minute increments. The exact format is YYYY-MM-DD HH:MM:SS. For example: 2015-12-02 18:10:00. |
first_post_time_unix | bigint | The first_post_time field translated into UNIX time, which is the number of seconds since 1970-01-01 00:00:00 UTC. For example: 1449079800. |
share_title | string | The title provided by the author of the URL's content, pulled from the og:title field in the original HTML if possible. |
share_main_blurb | string | The description provided by the author of the URL's content, pulled from the og:description field in the original HTML if possible. |
public_shares_top_country | string | URL shares are tallied by country and the country with the most (differentially private) shares is provided as an ISO 3166-1 alpha-2 code. This field is not indicative of all locations where this article was posted. Rather, it is the top country among users who shared it. |
ds | string | Datestamp: year, month, and day. Data is partitioned on this variable. Filter on the latest date to get the most up-to-date data on public URL attributes. |
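The relationship between first_post_time and first_post_time_unix can be sketched as follows. The sketch assumes that first_post_time is expressed in UTC, which this page does not state explicitly. Under that assumption the UNIX value 1449079800 corresponds to 2015-12-02 18:10:00. The function names are hypothetical helpers, and the truncation function only illustrates what "ten-minute increments" means for an arbitrary timestamp.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"  # the YYYY-MM-DD HH:MM:SS format described above


def truncate_to_ten_minutes(dt):
    """Round a datetime down to the start of its ten-minute window."""
    return dt.replace(minute=dt.minute - dt.minute % 10, second=0, microsecond=0)


def to_unix(first_post_time):
    """Parse a first_post_time string as UTC (assumption) and return UNIX seconds."""
    dt = datetime.strptime(first_post_time, FMT).replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def from_unix(first_post_time_unix):
    """Format UNIX seconds as a UTC string in the first_post_time format."""
    dt = datetime.fromtimestamp(first_post_time_unix, tz=timezone.utc)
    return dt.strftime(FMT)


print(to_unix("2015-12-02 18:10:00"))
print(from_unix(1449079800))
print(truncate_to_ten_minutes(datetime(2015, 12, 2, 18, 17, 42)).strftime(FMT))
```

Because the values are already truncated, every first_post_time_unix should be a multiple of 600 if the UTC assumption holds.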