The following Python scripts were used to produce the charts. They are short and straightforward.
plotHmd.py
import sys, csv, datetime
fileName = sys.argv[1] # e.g. stmf.csv
countryCode = sys.argv[2] # e.g. USA
figureFieldName = sys.argv[3] # e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)
smoothingCount = int(sys.argv[4]) if len(sys.argv) >= 4 + 1 else 1 # e.g. 3
data = []
with open(fileName) as file1:
    file1.readline() # Two intro lines without data
    file1.readline()
    for line in csv.DictReader(file1):
        if line and line["CountryCode"] == countryCode and line["Sex"] == "b":
            deaths = int(float(line[figureFieldName]))
            date = line["Year"] + " " + line["Week"] + " 0"
            date = datetime.datetime.strptime(date, "%Y %U %w")
            data.append( (date, deaths) )
data.pop() # Drop last two weeks for too big registration delay effect
data.pop()
def movingAverage(values, itemCount):
    average = 0
    averages = []
    idx = -1
    for val in values:
        idx += 1
        average += val - values[idx - itemCount] if idx >= itemCount else val
        if idx >= itemCount - 1:
            averages.append(average / float(itemCount))
        else:
            averages.append(None)
    return averages
values = [v for k, v in data]
if smoothingCount > 1:
    values = movingAverage(values, smoothingCount)
sys.stdout.write("|x = " + ", ".join([k.strftime("%Y-%m-%d") for k, v in data]) + "\n")
sys.stdout.write("|y = " + ", ".join(["%.0f" % v if v is not None else "" for v in values]) + "\n")
Usage:
- plotHmd.py stmf.csv USA DTotal
- plotHmd.py stmf.csv USA D0_14
- plotHmd.py stmf.csv USA D0_14 3
To obtain stmf.csv, go to mpidr.shinyapps.io/stmortality and at the bottom of the left pane click on the icon to the right of "CSV".
Country codes: AUT (Austria), BEL (Belgium), BGR (Bulgaria), CHE (Switzerland), CZE (Czechia), DEUTNP (Germany), DNK (Denmark), ESP (Spain), EST (Estonia), FIN (Finland), FRATNP (France), GBRTENW (England and Wales), GBR_SCO (Scotland), HUN (Hungary), ISL (Iceland), ISR (Israel), ITA (Italy), KOR (South Korea), LTU (Lithuania), LUX (Luxembourg), LVA (Latvia), NLD (Netherlands), NOR (Norway), POL (Poland), PRT (Portugal), RUS (Russia), SVK (Slovakia), SVN (Slovenia), SWE (Sweden), USA (U.S.).
Field codes: D0_14 (deaths for 0-14y), D15_64 (deaths for 15-64y), D65_74 (deaths for 65-74y), D75_84 (deaths for 75-84y), D85p (deaths for 85+y), DTotal (deaths total).
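The codes actually present in a downloaded file can be checked directly. The following is a minimal sketch, assuming the same layout that plotHmd.py relies on (two intro lines, then a header row with a CountryCode column). The function name listCountryCodes and the miniature sample are made up for illustration; the numbers are not real data.

```python
import csv
import io

def listCountryCodes(csvFile):
    """Return the sorted distinct CountryCode values of an STMF-style file.

    listCountryCodes is a hypothetical helper, not part of the scripts above.
    """
    csvFile.readline()  # Two intro lines without data, as in plotHmd.py
    csvFile.readline()
    return sorted({row["CountryCode"] for row in csv.DictReader(csvFile)})

# Made-up miniature sample in the assumed layout; the figures are not real data.
sample = io.StringIO(
    "intro line 1\n"
    "intro line 2\n"
    "CountryCode,Year,Week,Sex,D0_14,DTotal\n"
    "USA,2020,1,b,10,100\n"
    "AUT,2020,1,b,1,10\n"
    "USA,2020,1,m,6,55\n"
)
codes = listCountryCodes(sample)
print(", ".join(codes))  # prints: AUT, USA
```

With a real download, the function would be called on the opened file, for example inside `with open("stmf.csv") as file1:`.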
The script drops the last two data points to limit the worst effects of registration delay; for some countries, the figures for the last two weeks were visibly far too low because of it.
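A note on the date conversion in plotHmd.py: the format "%Y %U %w" treats the week number as a Sunday-based week of the calendar year. If the Week column follows ISO 8601 numbering (an assumption worth checking against the STMF documentation), the plotted dates can be shifted by a week in some years. The sketch below compares the two conventions. The helper names are hypothetical, and datetime.date.fromisocalendar requires Python 3.8 or later.

```python
import datetime

def sundayOfSundayWeek(year, week):
    """Date as computed in plotHmd.py: the Sunday that starts %U week `week`."""
    text = "%i %i 0" % (year, week)
    return datetime.datetime.strptime(text, "%Y %U %w").date()

def sundayOfIsoWeek(year, week):
    """The Sunday (ISO weekday 7) that ends ISO 8601 week `week`."""
    return datetime.date.fromisocalendar(year, week, 7)

for year in (2020, 2021):
    print(year, sundayOfSundayWeek(year, 1), sundayOfIsoWeek(year, 1))
# prints:
# 2020 2020-01-05 2020-01-05
# 2021 2021-01-03 2021-01-10
```

For charts of weekly totals, a shift of one week only moves the x axis labels; the death counts themselves are unaffected.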
plotHmdPerYear.py
import sys, csv, datetime
fileName = sys.argv[1] # e.g. stmf.csv
countryCode = sys.argv[2] # e.g. USA
figureFieldName = sys.argv[3] # e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)
data = []
with open(fileName) as file1:
    file1.readline() # Two intro lines without data
    file1.readline()
    for line in csv.DictReader(file1):
        if line and line["CountryCode"] == countryCode and line["Sex"] == "b":
            deaths = int(float(line[figureFieldName]))
            data.append( (int(line["Year"]), int(line["Week"]), deaths) )
data.pop() # Drop last two weeks for too big registration delay effect
data.pop()
year2020Weeks = [week for year, week, deaths in data if year == 2020]
maxWeek = year2020Weeks[-1]
years = sorted(list({year for year, week, deaths in data}))
deathsUpToMaxWeek = []
for year in years:
    deathsUpToMaxWeek1 = 0
    for year1, week, deaths in data:
        if year1 == year and week <= maxWeek:
            deathsUpToMaxWeek1 += deaths
    deathsUpToMaxWeek.append(deathsUpToMaxWeek1)
yearsOut = ", ".join([str(year) for year in years])
deathsOut = ", ".join([str(deaths) for deaths in deathsUpToMaxWeek])
sys.stdout.write("Last week in 2020: %i\n" % maxWeek)
sys.stdout.write("|x = " + yearsOut + "\n")
sys.stdout.write("|y = " + deathsOut + "\n")
Usage: same as plotHmd.py, but without the smoothing argument:
- plotHmdPerYear.py stmf.csv USA DTotal
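To make the calculation in plotHmdPerYear.py easier to follow, here is the year-to-date sum as a small function applied to made-up numbers. It is a sketch only: the function name deathsPerYearUpToWeek and the sample tuples are hypothetical, and the sample is not real data.

```python
def deathsPerYearUpToWeek(data, maxWeek):
    """Sum deaths per year over weeks 1..maxWeek.

    data is a list of (year, week, deaths) tuples, as built in plotHmdPerYear.py.
    """
    totals = {year: 0 for year in sorted({year for year, week, deaths in data})}
    for year, week, deaths in data:
        if week <= maxWeek:
            totals[year] += deaths
    return totals

# Made-up sample: three weeks of 2019, two weeks of 2020.
sample = [(2019, 1, 10), (2019, 2, 12), (2019, 3, 11), (2020, 1, 9), (2020, 2, 14)]
maxWeek = [week for year, week, deaths in sample if year == 2020][-1]
totals = deathsPerYearUpToWeek(sample, maxWeek)
print(maxWeek, totals)  # prints: 2 {2019: 22, 2020: 23}
```

Week 3 of 2019 is left out because 2020 only reaches week 2 in the sample, which is what keeps the yearly totals comparable.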
plotHmdPerSeason.py
import sys, csv, datetime
# Plot deaths per season: week 40 of year before to week x of the year
fileName = sys.argv[1] # e.g. stmf.csv
countryCode = sys.argv[2] # e.g. USA
figureFieldName = sys.argv[3] # e.g. D0_14 (deaths for 0-14y), DTotal (total deaths)
seasonStartWeek = 40
data = []
with open(fileName) as file1:
    file1.readline() # Two intro lines without data
    file1.readline()
    for line in csv.DictReader(file1):
        if line and line["CountryCode"] == countryCode and line["Sex"] == "b":
            deaths = int(float(line[figureFieldName]))
            data.append( (int(line["Year"]), int(line["Week"]), deaths) )
data.pop() # Drop last two weeks for too big registration delay effect
data.pop()
year2020Weeks = [week for year, week, deaths in data if year == 2020]
maxWeek = year2020Weeks[-1]
years = sorted(list({year for year, week, deaths in data}))
deathsInSeason = []
for year in years[1:]:
    deathsInSeason1 = 0
    for year1, week, deaths in data:
        if year1 == year and week <= maxWeek:
            deathsInSeason1 += deaths
        if year1 == (year - 1) and week >= seasonStartWeek:
            deathsInSeason1 += deaths
    deathsInSeason.append(deathsInSeason1)
yearsOut = ", ".join([str(year) for year in years[1:]])
deathsOut = ", ".join([str(deaths) for deaths in deathsInSeason])
write = sys.stdout.write
write("Last week in 2020: %i\n" % maxWeek)
write("All-cause deaths in weeks %i+ year before and weeks 1-%i of the year, year by year:\n" % (seasonStartWeek, maxWeek))
write("|x = " + yearsOut + "\n")
write("|y = " + deathsOut + "\n")
Usage: same as plotHmd.py, but without the smoothing argument:
- plotHmdPerSeason.py stmf.csv USA DTotal
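The season sum in plotHmdPerSeason.py can be illustrated in the same way: each season is credited to the year in which it ends and combines weeks 40+ of the previous year with weeks 1 to maxWeek of that year. The function name deathsPerSeason and the sample below are hypothetical, and the numbers are made up.

```python
def deathsPerSeason(data, maxWeek, seasonStartWeek=40):
    """Sum deaths per season, keyed by the year in which the season ends.

    data is a list of (year, week, deaths) tuples. The first year in the data
    has no preceding autumn, so it gets no season of its own.
    """
    years = sorted({year for year, week, deaths in data})
    totals = {year: 0 for year in years[1:]}
    for year, week, deaths in data:
        if week <= maxWeek and year in totals:
            totals[year] += deaths      # early weeks of the season's end year
        if week >= seasonStartWeek and (year + 1) in totals:
            totals[year + 1] += deaths  # autumn weeks count toward next year
    return totals

# Made-up sample covering parts of 2018, 2019 and 2020.
sample = [
    (2018, 39, 5), (2018, 40, 7), (2018, 52, 8),
    (2019, 1, 10), (2019, 2, 11), (2019, 3, 12), (2019, 40, 6), (2019, 52, 9),
    (2020, 1, 13), (2020, 2, 14),
]
seasonTotals = deathsPerSeason(sample, maxWeek=2)
print(seasonTotals)  # prints: {2019: 36, 2020: 42}
```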
plotUsCdc.py
import sys, csv, datetime
fileName = sys.argv[1] # e.g. "Weekly_excess_deaths_Full_Data_data.csv"
data = []
with open(fileName) as file1:
    for line in csv.reader(file1):
        if "Observed Number" not in line[0]:
            date = datetime.datetime.strptime(line[2], "%B %d, %Y")
            deaths = line[0].replace(",", "")
            data.append( (date, deaths) )
data.sort(key=lambda x: x[0])
datesOut = ", ".join([k.strftime("%Y-%m-%d") for k, v in data])
deathsOut = ", ".join([v for k, v in data])
sys.stdout.write("|x =" + datesOut + "\n")
sys.stdout.write("|y =" + deathsOut + "\n")
Usage:
- plotUsCdc.py Weekly_excess_deaths_Full_Data_data.csv
To obtain Weekly_excess_deaths_Full_Data_data.csv:
- 1) Visit CDC.
- 2) Select the chosen state in "Select a jurisdiction" field.
- 3) In the chart below the field, click on the icon for Download.
- 4) In the popup window, choose "Data" as the format.
- 5) In the larger popup window, select the "Full Data" tab.
- 6) Click on "Download all rows as a text file" and save the csv.
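Before running plotUsCdc.py on a fresh download, it can help to check that the file still has the layout the script expects. The sketch below parses a made-up two-row sample under the same assumptions as the script: the death count is in the first column with thousands separators, the date is in the third column in a form like "January 4, 2020", and the header row contains "Observed Number". The function name parseCdcRows and the other column names in the sample are hypothetical.

```python
import csv
import datetime
import io

def parseCdcRows(csvFile):
    """Return (date, deaths) tuples sorted by date, skipping the header row."""
    rows = []
    for fields in csv.reader(csvFile):
        if not fields or "Observed Number" in fields[0]:
            continue  # blank line or header row
        date = datetime.datetime.strptime(fields[2], "%B %d, %Y").date()
        deaths = int(fields[0].replace(",", ""))  # "1,234" -> 1234
        rows.append((date, deaths))
    rows.sort(key=lambda row: row[0])
    return rows

# Made-up sample; only the column positions mirror what plotUsCdc.py reads.
sample = io.StringIO(
    'Observed Number,Some Other Column,Week Ending Date\n'
    '"1,234",x,"January 11, 2020"\n'
    '"1,100",x,"January 4, 2020"\n'
)
rows = parseCdcRows(sample)
for date, deaths in rows:
    print(date.isoformat(), deaths)
# prints:
# 2020-01-04 1100
# 2020-01-11 1234
```

Note that "%B" matches month names in the current locale, so the parsing assumes an English or default C locale.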
Moving average via awk
You can calculate the 7-day moving average using awk on Windows:
- echo 1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101 | awk -F, -vn=7 "{for(i=1;i<=NF; i++) {s+=i>n?$i-$(i-n):$i; if(i>=n){printf \"%.0f, \", s/n}else{printf \", \"}}}"
You can put the result into clipboard:
- echo 1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101 | awk -F, -vn=7 "{for(i=1;i<=NF; i++) {s+=i>n?$i-$(i-n):$i; if(i>=n){printf \"%.0f, \", s/n}else{printf \", \"}}}" | clip
You can do the calculation on Linux:
- echo 1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101 | awk -F, -vn=7 '{for(i=1;i<=NF; i++) {s+=i>n?$i-$(i-n):$i; if(i>=n){printf "%.0f, ", s/n}else{printf ", "}}}'
If you are on Linux or a modern Mac, you already have awk. On Windows, you can install awk from the ezwinports or GnuWin32 project.
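As a cross-check of the awk one-liner, the same trailing moving average can be computed in Python. This is a sketch that mirrors the movingAverage function of plotHmd.py in a slightly more compact form, applied to the sample numbers used above. The first n - 1 positions have no full window and are left empty.

```python
def movingAverage(values, itemCount):
    """Trailing moving average; None where fewer than itemCount values exist."""
    runningSum = 0
    averages = []
    for idx, val in enumerate(values):
        runningSum += val
        if idx >= itemCount:
            runningSum -= values[idx - itemCount]  # drop the value leaving the window
        averages.append(runningSum / itemCount if idx >= itemCount - 1 else None)
    return averages

values = [1, 0, 4, 5, 18, 15, 28, 26, 64, 77, 101]
averages = movingAverage(values, 7)
print(", ".join("%.0f" % a if a is not None else "" for a in averages))
# prints: , , , , , , 10, 14, 23, 33, 47
```

The running sum avoids re-adding the whole window at every step, which is the same trick the awk version uses with s+=$i-$(i-n).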