Download the COVIDcast data

How to download data from Carnegie Mellon's COVIDcast project for linear regression analysis using YSQL

Follow these steps:

  • Create a directory on the computer where you run ysqlsh to hold the files for this case study. Call it, for example, "covid-data-case-study".

  • Go to the COVIDcast site and select the “Export Data” tab. That will bring you to this screen:

[Screenshot: Download the COVIDcast Facebook Survey Data]

  • In Section #1, “Select Signal”, select “Facebook Survey Results” in the “Data Sources” list and select “People Wearing Masks” in the “Signals” list.

  • In Section #2, “Specify Parameters”, choose the “Date Range” that interests you. (This case study used "2020-09-13 - 2020-11-01".) Then select “States” for the “Geographic Level”.

  • In Section #3, “Get Data”, click the “CSV” button.

  • Repeat the export, leaving every choice unchanged except the one in the “Signals” list. This time, select “COVID-Like Symptoms”.

  • Repeat the export once more, again changing only the choice in the “Signals” list. This time, select “COVID-Like Symptoms in Community”.

    This will give you three files with names like these:

    covidcast-fb-survey-smoothed_wearing_mask-2020-09-13-to-2020-11-01.csv
    covidcast-fb-survey-smoothed_cli-2020-09-13-to-2020-11-01.csv
    covidcast-fb-survey-smoothed_hh_cmnty_cli-2020-09-13-to-2020-11-01.csv
    

    Each name encodes the data source (fb-survey), the signal (for example, smoothed_wearing_mask), and the date range that you chose.

  • Create a directory called "csv-files" in your "covid-data-case-study" directory and move the .csv files into it from your "downloads" directory. Because you will not edit these files, you might like to make them all read-only so that you cannot change them accidentally when you inspect them with a text editor or a spreadsheet app.
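
If you prefer to script the final step rather than do it by hand, the following is a minimal Python sketch of one way to do so. It is not part of the case study itself. The function names expected_names and stage_csv_files are hypothetical, the file-name pattern is inferred only from the three example names listed above, and the locations of your "downloads" and "covid-data-case-study" directories are assumptions that you must adjust.

```python
import shutil
import stat
from pathlib import Path

# Signal names exactly as they appear in the three example file names.
SIGNALS = ["smoothed_wearing_mask", "smoothed_cli", "smoothed_hh_cmnty_cli"]


def expected_names(start: str, end: str) -> list:
    """Build the three file names that the export should produce.

    Assumption: the pattern is inferred from the example names only.
    """
    return [f"covidcast-fb-survey-{s}-{start}-to-{end}.csv" for s in SIGNALS]


def stage_csv_files(downloads: Path, case_study: Path, start: str, end: str) -> list:
    """Move the exports into <case_study>/csv-files and make them read-only."""
    target = case_study / "csv-files"
    target.mkdir(parents=True, exist_ok=True)

    staged = []
    for name in expected_names(start, end):
        src = downloads / name
        if not src.is_file():
            raise FileNotFoundError(f"Expected download not found: {src}")
        dest = target / name
        shutil.move(str(src), str(dest))
        # Clear every write bit so that an editor cannot save accidental changes.
        write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
        dest.chmod(dest.stat().st_mode & ~write_bits)
        staged.append(dest)
    return staged


# Example call (hypothetical paths; adjust them for your own computer):
# stage_csv_files(Path.home() / "Downloads",
#                 Path.home() / "covid-data-case-study",
#                 "2020-09-13", "2020-11-01")
```

The sketch stops with an error if any of the three files is missing, so a forgotten export is noticed before you move on to loading the data.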