2021-04-12 Update: Since this work began, the Library of Parliament has made available data of much higher quality than the sources cited here provide. That data was used for the Parity Across Time project but has not yet been documented here. Many of the basic assumptions still apply, however.
Getting the raw data for Canadian federal elections is an exercise in frustration. It’s available from Elections Canada and the government’s Open Data Portal, but in several different formats, and the files sometimes contain format errors, multiple names for the same political party, and similar inconsistencies.
I’ve collected what’s available, cleaned it up, and put it all into a common format. The programs to do this are freely available on GitHub if others want to extend or improve upon them.
Finally, several other files are available with an overview of each election as well as the parties involved.
The really raw data is from:
To harmonize these sources, we drop the poll-by-poll detail and work only at the riding level.
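As a rough illustration of moving from poll-by-poll data to the riding level, the sketch below sums votes over polls for each candidate in each riding. It is not the code from the GitHub repository. The row layout, the riding and candidate names, and the function `to_riding_level` are all hypothetical, and it assumes riding totals are obtained by summing the poll results.

```python
from collections import defaultdict

# Hypothetical poll-level rows: (election, riding, poll, candidate, votes).
# The names and numbers are made up for illustration only.
poll_rows = [
    (39, "Riding A", "1", "Candidate X", 120),
    (39, "Riding A", "2", "Candidate X", 95),
    (39, "Riding A", "1", "Candidate Y", 80),
    (39, "Riding A", "2", "Candidate Y", 110),
    (39, "Riding B", "1", "Candidate Z", 200),
]


def to_riding_level(rows):
    """Sum votes over polls, keyed by (election, riding, candidate)."""
    totals = defaultdict(int)
    for election, riding, _poll, candidate, votes in rows:
        # The poll identifier is deliberately ignored: only riding totals are kept.
        totals[(election, riding, candidate)] += votes
    return dict(totals)


riding_totals = to_riding_level(poll_rows)
for key, votes in sorted(riding_totals.items()):
    print(key, votes)
```

Keying on the election number as well as the riding keeps results from different elections apart, since riding names can repeat from one election to the next.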
There is some overlap in the data sets. We prefer the first data set to the second in those cases.
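One way to express that preference is sketched below. It assumes the overlap is at the level of whole elections, so any election covered by the first data set is taken entirely from it and the second data set only fills in the rest. The row fields, the election numbers, and the function `combine_preferring_first` are hypothetical and are not taken from the actual programs.

```python
def combine_preferring_first(first, second):
    """Combine two lists of rows, taking overlapping elections from `first` only."""
    # Elections that the preferred data set already covers.
    covered = {row["election"] for row in first}
    # Keep every row of `first`, plus rows of `second` for elections not covered.
    return list(first) + [row for row in second if row["election"] not in covered]


# Hypothetical rows; the "source" field is only there to show which set won.
first_set = [
    {"election": 40, "riding": "Riding A", "source": "first"},
    {"election": 41, "riding": "Riding A", "source": "first"},
]
second_set = [
    {"election": 39, "riding": "Riding A", "source": "second"},
    {"election": 40, "riding": "Riding A", "source": "second"},
]

combined = combine_preferring_first(first_set, second_set)
for row in sorted(combined, key=lambda r: r["election"]):
    print(row["election"], row["source"])
```

Filtering by whole election, rather than row by row, avoids mixing two sources inside a single election, where party names or candidate spellings might differ.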
The Open Data Portal also includes a link to data for the 38th election. However, it’s in a different format and does not provide the same level of detail.
Since I worked on this, Semra Sevi has released a data set developed for her PhD research. She has done an impressive amount of manual work to identify individual candidates across elections and to record their gender (all candidates), birth year (elected MPs only), and occupation. She also started with data from ParlInfo rather than Elections Canada. There is a paper describing the work in more detail, but it is unfortunately behind a paywall.