Each CSV file contains one row for each candidate.
There is one file for each election since 1867 plus one file that contains all
of the candidates since 1867.
Each CSV file contains the following columns.
- election_id: A sequence number for the election: 1 for the first one in
1867, 2 for the second one in 1872, etc.
- election_date: The date the bulk of the election was held. Some early elections were
held on different days in different regions of the country.
- prov_code: A code such as “NB” or “ON” to indicate the candidate’s province.
- ed_id: A code for the electoral district (riding). In recent elections, it’s the same
code that Elections Canada uses; in older elections it’s simply a sequence number.
- ed_name: The name of the electoral district (riding).
- cand_id: A unique number assigned to each candidate in each election. I’d love to have
the same number cover all five times Harold Albrecht ran,
for example. But the hurdles of merging “Harold Albrecht” with “Harold Glenn Albrecht”,
or the three times Julian Ichim ran in Kitchener-Waterloo with the one time he ran in Kitchener-Centre,
are more than I can tackle right now. So the same person gets a new id number in each election they contest.
- cand_name: The name of the candidate.
- cand_raw_party_name: The candidate’s party name as recorded in the raw data.
- party_id: An id number for the “cleaned” parties. Please see the section on
“cleaning party names”, below. This id can be looked up in
to find the party name and a flag for whether it is “mainline” or not.
- party_name: The “cleaned” party name. See below.
- party_short_name: A shortened version of the party name; hopefully suitable for column names.
- mainstream: A boolean value; true if the party ever attained 5% or more of the popular vote
and false otherwise.
- votes: The number of votes the candidate received. Blank if the candidate was acclaimed.
- acclaimed: In early elections some candidates did not have opposition and
were acclaimed. In those cases, votes were not held and so the votes column
is blank/null. This column is true if the candidate was acclaimed and false otherwise.
- place: The rank of this candidate within their electoral district with
1 being the candidate with the most votes.
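
The column list above can be turned into typed values when reading a file. The following is only a minimal sketch using Python's standard `csv` module. The sample rows are invented for illustration, and the sketch assumes that booleans are written as `true`/`false` and that `election_id`, `cand_id` and `place` are plain integers; check the real files before relying on either assumption.

```python
import csv
import io

# Hypothetical rows with invented values; only the column names come from
# the list above.
SAMPLE_CSV = """\
election_id,election_date,prov_code,ed_id,ed_name,cand_id,cand_name,cand_raw_party_name,party_id,party_name,party_short_name,mainstream,votes,acclaimed,place
99,2099-01-01,ON,1,Example North,1,Candidate A,Party X,1,Party X,PX,true,1200,false,1
99,2099-01-01,ON,1,Example North,2,Candidate B,Party Y,2,Party Y,PY,true,800,false,2
99,2099-01-01,NB,2,Example South,3,Candidate C,Party X,1,Party X,PX,true,,true,1
"""


def parse_bool(text):
    # Assumption: booleans are stored as "true"/"false" (any letter case).
    return text.strip().lower() == "true"


def parse_optional_int(text):
    # Blank fields (for example, votes for an acclaimed candidate) become None.
    text = text.strip()
    return int(text) if text else None


def parse_row(row):
    """Convert one csv.DictReader row into typed Python values."""
    typed = dict(row)  # codes and names stay as strings
    typed["election_id"] = int(row["election_id"])
    typed["cand_id"] = int(row["cand_id"])
    typed["mainstream"] = parse_bool(row["mainstream"])
    typed["acclaimed"] = parse_bool(row["acclaimed"])
    typed["votes"] = parse_optional_int(row["votes"])
    typed["place"] = parse_optional_int(row["place"])
    return typed


def load_candidates(handle):
    """Read every candidate row from an open CSV file or file-like object."""
    return [parse_row(row) for row in csv.DictReader(handle)]


candidates = load_candidates(io.StringIO(SAMPLE_CSV))

# Skip acclaimed candidates when totalling votes, since their votes are None.
contested_votes = sum(c["votes"] for c in candidates if c["votes"] is not None)
print(contested_votes)
```

To read a real file, pass `open(path, newline="", encoding="utf-8")` to `load_candidates` in place of the `io.StringIO` wrapper; the encoding is also an assumption.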
Note that the last file has data for all candidates.
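
Because the combined file uses the same columns as the per-election files, a single election can be recovered from it by grouping on election_id. Here is a rough sketch of that idea. The excerpt is invented and shows only a few of the columns, and `split_by_election` is a hypothetical helper name, not something shipped with the data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of the combined file; values are invented and most
# columns are omitted to keep the example short.
COMBINED_CSV = """\
election_id,ed_id,cand_name,votes,place
98,1,Candidate A,500,1
98,1,Candidate B,300,2
99,1,Candidate C,700,1
"""


def split_by_election(handle):
    """Group candidate rows by election_id."""
    by_election = defaultdict(list)
    for row in csv.DictReader(handle):
        by_election[int(row["election_id"])].append(row)
    return dict(by_election)


elections = split_by_election(io.StringIO(COMBINED_CSV))

# Candidates who placed first in their district in (hypothetical) election 98.
winners = [r["cand_name"] for r in elections[98] if int(r["place"]) == 1]
print(sorted(elections), winners)
```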