Overview of URL Shares data tables

This document is a reference for the names of the URL Breakdowns, URL Attributes, and User Reports tables. It describes the most up-to-date version of the tables that make up the URL Shares dataset. When we perform the twice-yearly data refresh (January and July), we add new months of data to these tables rather than creating new tables. We update the information below to reflect the latest data refresh.

The full codebook for the URL Shares dataset is hosted on the Social Science One website. It includes information about the scope, structure, fields, and privacy-preserving characteristics.

We do not update the codebook when we perform the twice-yearly data refresh. Instead, this Overview of URL Shares data tables page will always contain the most current dates.

Partitions store

Object function | Table | Environment | Partitions available
URL Breakdown | erc_condor_url_breakdowns_dp_clean_partitioned_v2 | Private | From 2017-01 to 2022-06
URL Attributes | erc_condor_url_attributes_dp_final_v3 | Private | 2022-07 (Will have complete data available in the latest partition of year_month)
User Reports | erc_condor_user_reports_dp_final | Private | From 2017-01 to 2022-06
URL Attributes Public | erc_condor_url_attributes_dp_final_public_v3 | Public | Will have complete data available in the latest partition (2022-07-12)
Fact Checked URL Attributes Public | erc_condor_url_attributes_public_fact_checked | Public | Will have complete data available in the latest partition (2022-07)

URL Breakdowns

The erc_condor_url_breakdowns_dp_clean_partitioned_v2 table contains data from January 2017 to June 2022. When new months of data are added, they are loaded into this same table. New data go into new partitions of the year_month column, so no previous data are overwritten.
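Because each month lives in its own year_month partition, a query can be restricted to a span of months. The following is a minimal sketch, not an official query: it assumes year_month is stored as a YYYY-MM string (so the values sort chronologically) and that the query environment accepts a standard SQL BETWEEN filter. The helper names year_month_range and build_range_query are hypothetical.

```python
TABLE = "erc_condor_url_breakdowns_dp_clean_partitioned_v2"


def _parse_year_month(value: str) -> int:
    """Convert 'YYYY-MM' to a running month index; raise ValueError if malformed."""
    year_text, month_text = value.split("-")
    year, month = int(year_text), int(month_text)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {value!r}")
    return year * 12 + (month - 1)


def year_month_range(start: str, end: str) -> list[str]:
    """Return every YYYY-MM partition value from start to end, inclusive."""
    first, last = _parse_year_month(start), _parse_year_month(end)
    if first > last:
        raise ValueError("start must not be later than end")
    return [f"{i // 12:04d}-{i % 12 + 1:02d}" for i in range(first, last + 1)]


def build_range_query(start: str, end: str) -> str:
    """Build a query limited to the given partitions (SQL dialect is an assumption)."""
    months = year_month_range(start, end)  # also validates both inputs
    return (
        f"SELECT * FROM {TABLE} "
        f"WHERE year_month BETWEEN '{months[0]}' AND '{months[-1]}'"
    )


partitions = year_month_range("2017-01", "2022-06")
print(len(partitions))  # 66 monthly partitions
print(build_range_query("2021-01", "2021-12"))
```

When a later refresh adds months, only the end value passed to the helper needs to change; the table name stays the same.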

URL Attributes

The erc_condor_url_attributes_dp_final_v3 table contains all of its data in the latest partition of the year_month column (the date format is YYYY-MM, e.g. January 2017 is written as 2017-01). We added the year_month partition column to erc_condor_url_attributes_dp_final_v3 so that we can update the data in future releases without having to create a new table with a different name. In practical terms, all the data are currently in the year_month partition 2022-07. If the next update to the dataset were to happen in August 2022, the data would appear in the 2022-08 partition of year_month. Complete data are available in the latest partition, so there is no need to query older partitions. Filter to the appropriate year_month partition in the WHERE clause of any query on this table.
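The latest-partition filter can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite table as a stand-in for the real table, so the SQL dialect, the url column, the example rows, and the 2022-08 partition are all assumptions made for illustration; only the table name and the year_month column come from this page.

```python
import sqlite3

TABLE = "erc_condor_url_attributes_dp_final_v3"

conn = sqlite3.connect(":memory:")
# Stand-in schema: "url" is a hypothetical column; year_month is the partition column.
conn.execute(f"CREATE TABLE {TABLE} (url TEXT, year_month TEXT)")
conn.executemany(
    f"INSERT INTO {TABLE} VALUES (?, ?)",
    [
        # Older partition: a complete snapshot as of the earlier refresh.
        ("https://example.com/a", "2022-07"),
        ("https://example.com/b", "2022-07"),
        # Newer (hypothetical) partition: a complete snapshot again.
        ("https://example.com/a", "2022-08"),
        ("https://example.com/b", "2022-08"),
        ("https://example.com/c", "2022-08"),
    ],
)

# YYYY-MM strings sort chronologically, so MAX() picks the newest partition.
latest = conn.execute(f"SELECT MAX(year_month) FROM {TABLE}").fetchone()[0]

# Filter on the partition column in the WHERE clause.
rows = conn.execute(
    f"SELECT url FROM {TABLE} WHERE year_month = ? ORDER BY url",
    (latest,),
).fetchall()

unfiltered_count = conn.execute(f"SELECT COUNT(*) FROM {TABLE}").fetchone()[0]

print(latest)            # 2022-08
print(len(rows))         # 3
print(unfiltered_count)  # 5
```

In this stand-in, omitting the filter returns five rows instead of three, because each partition holds a full copy of the data. That is why the partition filter matters on this table.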

User Reports

The erc_condor_user_reports_dp_final table contains data from January 2017 to June 2022. When new months of data are added, they are loaded into this same table. New data go into new partitions of the year_month column, so no previous data are overwritten.

URL Attributes Public

This table holds the columns of the URL Attributes table that are approved for inclusion in the public environment. Complete data are available in the latest partition, so there is no need to query older partitions.

Fact Checked URL Attributes Public

This table holds the columns of the URL Attributes table that are approved for inclusion in the public environment, for URLs that have been fact checked. Complete data are available in the latest year_month partition, so there is no need to query older partitions. Here, year_month is the year and month in which the refresh happened.