Technical Guide to Protecting Voter Registration Databases

The US presidential election is a few months away, and although the coronavirus is currently occupying everyone’s attention, that attention will soon shift to the election. Among the chief concerns is election security and integrity.

We have previously covered the importance of audit trails for election security (prior to the mid-terms and the EU elections), but now we’d like to provide a more comprehensive technical guide on one particular aspect of election security – the protection of voter registration databases.

Every state must have one in electronic form, but that poses numerous risks, outlined in multiple reports.

All three typical information security risks apply:

  • Integrity – is the registration data intact, or has somebody tampered with it without detection, which can lead to voters being silently removed, or their data being edited so that they are denied the right to vote on election day?
  • Availability – is the voter registration data available when it’s needed, or is there a risk of data loss and chaos on election day when nobody knows who’s registered and who’s not?
  • Confidentiality – is the registration data kept secret, or are malicious actors able to obtain it and use it to micro-target voters in order to influence them?

The three points above are not in their usual order, as in this case a breach of confidentiality (which usually gets all the spotlight when data protection is concerned) seems to have a smaller negative effect on the election process – an adversary needs to take extra steps, which are themselves complicated, in order to abuse a list of voters at scale. And unfortunately, the basic data of most Americans has already been leaked through multiple data breaches.

Below is a detailed set of recommendations for protecting voter registration databases that should make sense to a wide range of stakeholders. We want to get as technical as possible but without being incomprehensible for non-technical people.

Protecting the integrity of voter registration databases

“Improving State Voter Registration Databases: Final Report (2010)” by the National Research Council states:

Insecure VRDs pose a number of dangers. Individual voters may be disenfranchised if records of their registration are improperly deleted from the VRD. Voter fraud may be possible if registration records are improperly added. A voter might fall victim to identity theft if sensitive personal information such as a Social Security number is compromised. And improper changes to a voter’s record might also effectively disenfranchise him or her (e.g., an altered address might cause the voter to go to the wrong polling place) and at the very least have the potential for creating confusion and difficulty for a voter.

The Brennan Center report adds that “An undetected change to the voter list could incorrectly show that a voter had already cast a ballot, or that she had recently moved”.

Below is a list of technical measures to address the issues of voter registration database integrity:

  • Have a full audit trail of all legitimate modifications, insertions and deletions of records. This allows per-voter integrity guarantees, not just “all or nothing” checks on the whole list. Election officials should be able to reconstruct the history of all modifications for each voter – when the record was initially created, when it was updated, and when and why it was deleted. And most importantly – who performed the action. Knowing all that makes it possible to investigate alterations due to leaked election official credentials and other attacks. An audit trail would also allow restoring the proper values in case of a direct database manipulation that somehow got past the audit log facility (which can be at the application level).
  • Audit trail on direct database modifications – for proper forensic investigation, the application audit trail should be augmented by a native database audit trail. In proper circumstances, direct updates to the database (via database management tools) must not happen. However, in case of a security incident, the application database credentials can be abused to modify the records. Many databases have native audit log functionality that has to be turned on and fine-tuned.
  • Protecting the audit trail integrity – an audit trail is only useful if its own integrity is protected. Cryptographic approaches exist to make sure the audit trail is not tampered with, and they must be employed in voter registration databases.
  • Regular integrity checks – the current state of data should regularly be compared to the audit trail in order to detect possible manipulation. Timely detection is crucial in election scenarios and so, during election day, these checks should be regularly running. Prior to election day, once a day may be sufficient. These checks can catch any inconsistencies between the current state of the data in the voter registration database and the expected value based on the audit trail.
  • Signing backups – regular backups are happening in almost all cases, however that’s not enough. Backups must be digitally signed by the election authority. Backups can have their signature logged in the audit trail facility and signature comparison must be conducted as part of the restore process.
  • Public proofs – proof that the database has not been tampered with can be made publicly available if the audit trail uses Merkle trees. A Merkle tree is a cryptographic data structure that allows external verification of the integrity of a set of records (Merkle trees are a core component of blockchain technology, for example). Trust is one of the most important factors when it comes to elections, and any tool that can help voters be confident that the voter registration database is intact should be employed. Merkle trees require solid technical knowledge, and a bunch of cryptographic proofs may not be convincing for non-technical people, but publishing them can still inspire confidence.
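
To make the Merkle tree idea more concrete, here is a minimal Python sketch of how a root hash can be computed over audit trail entries and how a single entry can be proven to be part of it. It is an illustration under assumptions, not a description of any particular product: the function names, the sample entries, and the choice of SHA-256 with leaf/node prefixes (in the style of RFC 6962) are all ours.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    # A distinct prefix keeps leaves from being confused with inner nodes.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(entries):
    """Return every tree level, from the leaf hashes up to the single root."""
    level = [leaf_hash(e) for e in entries]
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(node_hash(level[i], level[i + 1]))
            else:
                parents.append(level[i])  # odd node is promoted unchanged
        level = parents
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Sibling hashes needed to recompute the root from the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            # Store the sibling hash and whether it sits to the left.
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(entry, proof, root):
    """Recompute the root from one entry and its proof, then compare."""
    current = leaf_hash(entry)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            current = node_hash(sibling, current)
        else:
            current = node_hash(current, sibling)
    return current == root


# Hypothetical audit trail entries, one per recorded action.
entries = [
    b"2020-03-01 official=17 action=create voter=1001",
    b"2020-03-02 official=17 action=update voter=1001 field=address",
    b"2020-03-05 official=23 action=create voter=1002",
    b"2020-03-09 official=23 action=delete voter=0977 reason=moved",
    b"2020-03-10 official=17 action=create voter=1003",
]
levels = build_levels(entries)
root = levels[-1][0]  # the value that could be published
proof = inclusion_proof(levels, 3)
print(verify(entries[3], proof, root))
print(verify(b"2020-03-09 official=23 action=delete voter=0978", proof, root))
```

The first check succeeds and the second fails, because any change to an entry changes the recomputed root. Only the root needs to be published; a proof reveals sibling hashes, not the other entries.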

Protecting the availability of voter registration databases

The Congressional task force on election security writes in their report:

Almost all states make a daily, offline copy of the statewide voter registration database. In addition, states and counties each keep lists that can be used as backup for one another in the event of a breach

That’s a good start, but availability is about more than having regular offline backups. Below is a list of technical measures to address the issues of voter registration database availability:

  • Have a proper backup procedure – that includes the measures above (daily offline backups), but should also include incremental backups so that no piece of data is actually lost (e.g. an attacker with knowledge about certain procedures can attempt to destroy the live data after a day where election officials have automatically registered many voters, if the state law prescribes that). WORM (write once, read many) storage can also be used for storing backups.
  • Run a database cluster – one database instance can always fail, regardless of the cause (hardware failure or malicious actor). Databases should be run in clusters with replication, active-active if possible, and active-passive otherwise, so that another database node can still serve the data if one node fails.
  • Regularly test the restore procedure – database structures change, backup & restore tools get upgrades and it’s not always guaranteed that you will be able to restore a given backup. It may sound obvious, but many organizations skip the restore procedure testing (e.g. restoring a backup on a test database) and so the backup is practically unavailable when it’s actually needed.
  • DDoS protection – a sustained DDoS attack can make a public-facing system unusable. This can prevent voters from registering or election officials from accessing data that’s necessary for conducting the election. DDoS protection is hard, as there’s little you can do once the traffic reaches your network. You can use a DDoS protection service from a provider that can filter the traffic upstream, before it reaches your network (passing it through so-called scrubbing centers that clean off malicious requests). This can be expensive, so it’s best for internal communication to be done over a private network, if possible. Private networks may still have their gateway endpoints publicly visible, and so they can be DDoS’ed if known, so at least some level of DDoS protection should be available.
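
The restore test mentioned above can be automated. The sketch below uses SQLite from the Python standard library purely as a stand-in for whatever database a state actually runs; the `voters` table, its columns, and the helper names are assumptions made for illustration. The idea is to record a digest when the backup is taken, then periodically restore the backup into a scratch database and check that the digest still matches.

```python
import hashlib
import os
import sqlite3
import tempfile


def table_digest(conn):
    """Row count plus a SHA-256 over all voter rows in primary-key order."""
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute("SELECT id, name, county FROM voters ORDER BY id"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def take_backup(live, backup_path):
    """Copy the live database to a file and return the digest to record."""
    target = sqlite3.connect(backup_path)
    try:
        live.backup(target)  # sqlite3.Connection.backup copies the database
    finally:
        target.close()
    return table_digest(live)


def restore_test(backup_path, expected):
    """Restore into a scratch database and compare with the recorded digest."""
    source = sqlite3.connect(backup_path)
    scratch = sqlite3.connect(":memory:")
    try:
        source.backup(scratch)
        return table_digest(scratch) == expected
    except sqlite3.Error:
        return False  # unreadable backup or missing table
    finally:
        source.close()
        scratch.close()


# Hypothetical live database with a few sample rows.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE voters (id INTEGER PRIMARY KEY, name TEXT, county TEXT)")
live.executemany(
    "INSERT INTO voters VALUES (?, ?, ?)",
    [(1, "Ada Example", "Adams"), (2, "Ben Sample", "Baker"), (3, "Cy Test", "Clark")],
)
live.commit()

backup_path = os.path.join(tempfile.mkdtemp(), "voters-backup.db")
recorded = take_backup(live, backup_path)
print(restore_test(backup_path, recorded))
```

In a real deployment the restore would target a separate test server and use the database’s own backup tooling, but the shape of the check (restore, recompute, compare) stays the same.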

Protecting the confidentiality of voter registration databases

The Help America Vote Act (HAVA) has a list of requirements for voter registration databases, and one of the requirements is as follows:

(v) Any election official in the State, including any local election official, may obtain immediate electronic access to the information contained in the computerized list.

This has significant implications for the confidentiality of data. As most data breaches happen through compromised credentials, access control should be a primary focus of improvement. Below is a set of recommendations to address risks regarding breaches of confidentiality:

  • 2FA – Each election official should have (at least) 2-factor authentication enabled in order to gain access to the data. That 2FA should ideally not rely on SMS, but rather on other forms of one-time passwords (OTP) via token or smartphone. OTP secret keys should be stored in encrypted form, otherwise a breach of the secrets database would make 2FA useless, as the attacker would be able to generate valid OTPs at will. Out-of-band 2FA should also be considered.
  • Granular access control – election officials should have access only to the data they immediately need (e.g. for their county / their polling station). If they are allowed to access other data by law, that should require authorization by another official. Read-only access should be the default option, and only specific roles should be allowed to modify data (with the required audit trail).
  • Limiting data export functionality – many application developers and product owners are tempted to add “export to Excel” options, as it makes it easier to work with the data. These export options should be reviewed and removed if not absolutely needed for a particular task, as once the data leaves the application and its protected, access-controlled environment, leaks are much more likely.
  • Field-level or column-level encryption – encryption at rest protects data from cases of physical storage theft, but once the data is decrypted and operational, it’s in plaintext for any attacker or malicious insider with access to the database. That’s why some databases offer column-level encryption. There are solutions that do field-level encryption. Alternatively, sensitive data can be tokenized, like credit card numbers often are. These measures mean that mass dumps are less likely to happen if the attacker doesn’t have access to the decryption keys, or can be stopped while they are happening, as each record has to be decrypted individually.
  • Encrypt backups – in addition to doing regular backups and signing them, they should also be encrypted – unencrypted backups are much easier to leak, as control over the backup storage is often not as strict as that of operational data. Keys that are used to encrypt backups (and any other encryption keys used) should be either stored on a hardware security module (HSM) or wrapped with an HSM master key, otherwise keys themselves can leak.
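
For readers who want to see what a non-SMS one-time password looks like in practice, the sketch below implements time-based OTP (TOTP, RFC 6238, built on HOTP from RFC 4226) with the Python standard library. The function names and the one-step tolerance window are our own illustrative choices. The secret is passed in as plain bytes here; in line with the recommendation above, a real system would keep it encrypted at rest and decrypt it only for verification.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based OTP: HOTP over the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now) // step, digits)


def verify_totp(secret: bytes, submitted: str, at=None, step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    now = time.time() if at is None else at
    counter = int(now) // step
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue
        expected = hotp(secret, counter + drift)
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False


# Shared secret from the RFC 6238 test vectors (ASCII "12345678901234567890").
secret = b"12345678901234567890"
print(totp(secret, at=59, digits=8))  # RFC 6238 test vector: 94287082
print(verify_totp(secret, totp(secret, at=59), at=59))
```

The tolerance window trades a little security for usability when the clocks of the token and the server drift apart; a stricter deployment could set it to zero.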

General recommendations

Some measures address more than one aspect of voter registration database security:

  • Phishing protection and training – phishing and its variations (spearphishing, whaling) are ways to compromise a system by impersonating a privileged user. There is no bullet-proof solution to that and election officials should be trained to recognize phishing attacks. They should not be using personal emails for work-related activities. Regular drills with fake phishing emails should be performed. Only email services/software with good phishing and spam protection should be used.
  • Endpoint protection – antivirus and endpoint protection solutions should be used to make sure employee computers are not infected and are not being used to infiltrate the network.
  • Monitor frontend scripts that could report successful registration – script injections that are often used to steal credit card numbers as the user types them on a website form can be used to either steal or manipulate voter registration data before it even reaches the database. So all static resources should be monitored for changes, either with subresource integrity (SRI) or via an external tool.
  • Look for anomalies in voter registration patterns – voter registration normally follows certain patterns of activity, and anomalies in those patterns may indicate an issue – either an attacker is blocking users from registering or many fake voters are being registered.
  • Regularly validate application signatures – Applications are what election officials use to work with the underlying database. Application artifacts can be replaced by a malicious actor if they have gained access to some component (e.g. the application server, the CI/CD system, the release repository). All production artifacts should be signed by the organization and then signatures should be regularly verified. Signatures should also be stored in an audit trail to avoid an attacker signing their own malicious application.
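
As a rough illustration of what looking for anomalies in registration patterns could mean, the sketch below flags days whose registration count is far from the recent norm. It uses a robust z-score (median and median absolute deviation) so that one spike does not mask a drop on the following day. The window size, the threshold, and the sample counts are assumptions chosen for the example; a real system would also account for weekly cycles and registration deadlines.

```python
import statistics


def flag_anomalies(daily_counts, baseline_days=14, threshold=3.5):
    """Return (day_index, count, score) for days far from the trailing baseline."""
    flagged = []
    for day in range(baseline_days, len(daily_counts)):
        baseline = daily_counts[day - baseline_days:day]
        median = statistics.median(baseline)
        # Median absolute deviation, scaled to be comparable to a standard deviation.
        mad = statistics.median(abs(x - median) for x in baseline)
        scale = 1.4826 * mad
        if scale == 0:
            scale = 1.0  # flat baseline: fall back to raw differences
        score = (daily_counts[day] - median) / scale
        if abs(score) >= threshold:
            flagged.append((day, daily_counts[day], round(score, 1)))
    return flagged


# Hypothetical daily registration counts: steady activity, then a spike
# (possible mass fake registrations) followed by a near-total drop
# (possible blocking of the registration site).
counts = [100, 104, 98, 101, 97, 103, 99, 102, 100, 96, 105, 99, 101, 100, 102, 640, 3]
for day, count, score in flag_anomalies(counts):
    print(f"day {day}: {count} registrations (score {score})")
```

With these sample numbers the last two days are flagged and the ordinary day before them is not. A flag is only a prompt for a human to investigate, not proof of an attack.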

How can LogSentinel help

LogSentinel helps large organizations in both the private and the public sector to protect their data. We offer several services for protecting data integrity and confidentiality:

  • SentinelTrails is a secure, cryptographically protected, tamper-evident audit trail solution. It can be used to store the full audit trail for each voter and allow reconstructing the whole history of modifications. It can also attach to database-native audit trails. Other integrity-relevant elements like backup signatures and application signatures can also be stored in the audit trail. Because we are using Merkle trees, we expose methods to publicly obtain proofs that the data is not tampered with.
  • SentinelDB is a family of products that offers searchable field-level encryption in order to protect the confidentiality of data – it’s no longer possible for a database administrator or someone who stole their credentials to dump the whole database and run away with it.
  • Consultancy – we can assess the state of a particular system and recommend ways to improve its security using the best tools available.

Conclusion

Election security is national security, as stated many times by many experts. We can’t afford not to apply state-of-the-art security methods and solutions for protecting elections and therefore democracy. Adversaries are more and more determined and capable, and protecting critical infrastructure like voter registration databases is a serious task. No amount of checklist-based audits and “install and pray” products is sufficient. Even though we’ve discussed the technology aspects at length, people play the most important role. People are the ones to push for upgrades, for better security solutions and for better procedures. We believe that technology should serve people and we are willing to support state election officials in their task of protecting the democratic process.