Ingesting, checking, and combining the COVIDcast data
This page documents the preview version (v2.23). Preview includes features under active development and is for development and testing only. For production, use the stable version (v2024.1). To learn more, see Versioning.
Ingest the .csv files, check the assumptions, and combine the interesting values into a single table
Here are the steps:
- Copy the data from each .csv file "as is" into a dedicated staging table, with effective primary key (state, survey_date). (The qualifier "effective" recognizes the fact that, as yet, these columns will have different names that reflect how they're named in the .csv files.)
- Check that the values from the .csv files do indeed conform to the stated rules.
- Project the columns of interest from the staging tables and join these into a single table, with primary key (state, survey_date), for analysis.
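The three steps can be sketched in SQL as follows. This is a minimal illustration, not the content of the actual scripts: the table names (staging_mask_wearers, staging_symptoms, covid_survey_results), the column names (geo_value, time_value, value), the .csv file name, and the range rule checked in step 2 are all assumptions made for the example.

```sql
-- Step 1: one staging table per .csv file, mirroring the file "as is".
-- All table and column names here are hypothetical.
drop table if exists staging_mask_wearers cascade;
create table staging_mask_wearers(
  geo_value   text    not null,  -- plays the role of "state"
  time_value  date    not null,  -- plays the role of "survey_date"
  value       numeric not null,
  constraint staging_mask_wearers_pk primary key(geo_value, time_value));

drop table if exists staging_symptoms cascade;
create table staging_symptoms(
  geo_value   text    not null,
  time_value  date    not null,
  value       numeric not null,
  constraint staging_symptoms_pk primary key(geo_value, time_value));

-- Each table would then be loaded from its file, for example with the
-- \copy meta-command (the file name is an assumption):
-- \copy staging_mask_wearers from 'mask-wearers.csv' with (format csv, header true)

-- Step 2: check an assumed rule (values are percentages in 0..100).
-- The block raises an error if any row breaks the rule.
do $body$
begin
  assert
    (select count(*) from staging_mask_wearers
     where value not between 0 and 100) = 0,
    'staging_mask_wearers: value outside 0..100';
  assert
    (select count(*) from staging_symptoms
     where value not between 0 and 100) = 0,
    'staging_symptoms: value outside 0..100';
end;
$body$;

-- Step 3: project the columns of interest and join on the effective key,
-- giving the columns their final names.
drop table if exists covid_survey_results cascade;
create table covid_survey_results(
  state             text    not null,
  survey_date       date    not null,
  mask_wearing_pct  numeric not null,
  symptoms_pct      numeric not null,
  constraint covid_survey_results_pk primary key(state, survey_date));

insert into covid_survey_results(state, survey_date, mask_wearing_pct, symptoms_pct)
select geo_value, time_value, m.value, s.value
from staging_mask_wearers as m
inner join staging_symptoms as s using (geo_value, time_value);
```

The inner join keeps only the (state, survey_date) pairs present in every staging table, which is why step 2 would normally also confirm that the staging tables cover the same set of keys.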
All of these steps are implemented by the ingest-the-data.sql script. It's designed so that you can run it, and re-run it, time and again. It will always finish silently each time you run it, provided that you first say set client_min_messages = warning;. It calls various other scripts. You will download these, along with ingest-the-data.sql, as you step through the sections in the order that the left-hand navigation menu presents.
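One common way to make a script re-runnable and silent is sketched below. It is an illustration of the general pattern only, not an excerpt from ingest-the-data.sql; the table name demo_staging and the script name in the comment are hypothetical.

```sql
-- "drop ... if exists" reports a NOTICE when the object is absent.
-- Raising the threshold to WARNING suppresses that notice, so the first
-- run and every later run both finish without output.
set client_min_messages = warning;

-- Drop and re-create, so that a re-run always starts from a clean state.
drop table if exists demo_staging cascade;
create table demo_staging(k int primary key);

-- A master script can invoke other scripts with the \i meta-command,
-- for example (hypothetical file name):
-- \i some-other-script.sql
```

Because every object is dropped (if it exists) before it is created, the outcome does not depend on what a previous run left behind.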