Sunday, October 13, 2024

Data Scrubbing or Vigilante Election Interference?

A local news story in Bryan County, Georgia provides a disturbing look into the strategies being pursued across multiple states to interfere, at wholesale volume, with the voting rights of legitimate registered voters. The story involves a request submitted by a single citizen to purge 859 voters from the county's voter rolls. The county attorney followed the law, notified those affected by the suggested purge and participated in a hearing in front of the county election board. Roughly 100 citizens attended that meeting. The county attorney presented his analysis of the request and its validity, and the election board voted to REJECT the ENTIRE request.

This incident is not only maddening in its particulars; it also reflects a more harmful trend in conservative legislation on controversial areas of policy: shifting the point of origination of criminal accusations or enforcement of ordinances from appointed public officials and law enforcement to the general public.

The Applicable State Law

Georgia State law includes a provision in Section 21-2-229 that allows ANY voter within a county to challenge the voting eligibility of any OTHER voter in the same county by filing a formal notice with the county. The applicable statute can be read here:

https://law.justia.com/codes/georgia/title-21/chapter-2/article-6/section-21-2-229/

but the gist of it is this:

  • Any Georgia voter ("elector") can submit a challenge to the local election board regarding the eligibility of any other voter in the same county to vote
  • There is no limit to the number of voters whose eligibility can be challenged
  • The county election board has ten business days from receipt of the request to conduct a hearing
  • The county election board must notify those challenged within that same ten business day interval
  • The burden of proof lies with the person challenging eligibility
  • Any voter ruled ineligible has ten business days to submit an appeal

Most notably, nothing in the statute establishes any "freeze" period prior to any election during which voter rolls are protected from potential mass changes. In theory, any voter can submit thousands of claims 23 days prior to an election, and given the intervals dictated for initial notification (10 days), formal notification via US mail of a decision (3 days maybe) and time required for appeal (10 days), a voter removed under false pretenses would lose their right to vote with no practical way to restore their right prior to an election.
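A minimal sketch of that arithmetic, for anyone who wants to check the dates. It treats each interval as calendar days, as the paragraph above does (counting business days would push every step later), and it assumes the November 5, 2024 general election as the target date. The function name challenge_timeline is purely illustrative.

```python
from datetime import date, timedelta

ELECTION_DAY = date(2024, 11, 5)  # assumption: the 2024 general election

# Intervals from the paragraph above, treated as calendar days (an assumption).
HEARING_WINDOW = timedelta(days=10)  # board must hold a hearing
MAIL_NOTICE = timedelta(days=3)      # the "3 days maybe" estimate for US mail
APPEAL_WINDOW = timedelta(days=10)   # time allowed to appeal


def challenge_timeline(filed: date) -> dict:
    """Return the latest date for each step if a challenge is filed on `filed`."""
    hearing_by = filed + HEARING_WINDOW
    notified_by = hearing_by + MAIL_NOTICE
    appeal_deadline = notified_by + APPEAL_WINDOW
    return {
        "filed": filed,
        "hearing_by": hearing_by,
        "notified_by": notified_by,
        "appeal_deadline": appeal_deadline,
    }


# A challenge filed 23 days before the election
timeline = challenge_timeline(ELECTION_DAY - timedelta(days=23))
for step, day in timeline.items():
    print(f"{step:16} {day.isoformat()}")

days_left = (ELECTION_DAY - timeline["appeal_deadline"]).days
print(f"days between appeal deadline and election: {days_left}")
```

Under these assumptions the appeal window closes on election day itself, leaving no time for a ruling on the appeal.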

The Complaint

In this Bryan County matter, a woman named Jenifer Hilburn submitted her list on September 19, 2024 as an Excel spreadsheet. Per the attorney, the spreadsheet was not manually compiled by Hilburn after methodical, manual due diligence. Instead, he stated it appears to have been downloaded from a third-party web site of a commercial data broker that blends data from other online public records. Here is what Hilburn stated in her submission to the county:

https://statesboroherald.cdn-anvilcms.net/media/documents/Voter_Challenge_Bryan_County_Redacted.pdf

First, she starts off with what in hindsight may easily prove to be a false statement:

I, Jenifer Dawn Hilburn, attest that I am a registered voter in Bryan County residing at (redacted), Richmond Hill, GA 31324. My voter registration number is (redacted). I further attest that I have personal knowledge of the manner in which these documents were prepared, and that they were not generated in a systemic manner.

She then provides bullets to describe the different "match buckets" that resulted from her "analysis":

  • Of the 859 voters included in the list, 2 are being challenged on the basis that the address associated with their voter registration is invalid either on the basis that it does not exist in the jurisdiction (i.e., “123 Address Way”) or because it appears to be located out-of-state.
  • Of the 859 voters included in the list, 115 are being challenged on the basis that they both: appear on the National Change of Address (NCOA) database and there has been an application for a homestead exemption at the new address indicated on their NCOA application.
  • Of the 859 voters included in the list, 381 are being challenged on the basis that they both: appear on the National Change of Address (NCOA) database and have registered to vote in another jurisdiction. Their new voter registration was matched by cross-referencing the address listed in NCOA and the voter’s first and last name on the new voter registration.
  • Of the 859 voters included in the list, 361 are being challenged on the basis that their registration is associated with an address that is not coded or zoned for a residential purpose or use (e.g., it is zoned as being a restricted industrial zone).

Pitfalls of Bulk Scrubbing of Name / Address Data

These descriptions all appear very logical on the surface but those with data scrubbing experience with large datasets can recognize the problems lurking. Here are some examples.

Matching on (firstname + lastname) combinations? Did "Jim Smith" enter his name as "Jim Smith" or "James Smith" or "Jimmy Smith"? Did "Nate Bargatze" enter his name as "Nate Bargatze" or "Nathan Bargatze" or "Nathaniel Bargatze"? How about Dan, Danny or Daniel? You can see the complications that arise. People may not use the same level of formality on voter registrations as they do on a change-of-address form with the post office. And your name is not as unique as you might think.
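To make the name problem concrete, here is a small sketch comparing an exact first-name match with a looser nickname-aware match. The NICKNAMES table and the helper names are hypothetical and tiny; a real nickname list would be far larger and still incomplete.

```python
# Hypothetical, deliberately tiny nickname table (an assumption for illustration).
NICKNAMES = {
    "jim": "james",
    "jimmy": "james",
    "nate": "nathan",
    "dan": "daniel",
    "danny": "daniel",
}


def canonical(first_name: str) -> str:
    """Map a first name to a canonical form using the nickname table."""
    key = first_name.strip().lower()
    return NICKNAMES.get(key, key)


def exact_match(a: str, b: str) -> bool:
    """Strict comparison: the same person can be missed."""
    return a.strip().lower() == b.strip().lower()


def loose_match(a: str, b: str) -> bool:
    """Nickname-aware comparison: catches more, but only what the table knows."""
    return canonical(a) == canonical(b)


print(exact_match("Nate", "Nathan"))     # strict match misses the same person
print(loose_match("Nate", "Nathan"))     # nickname table recovers it
print(loose_match("Nate", "Nathaniel"))  # table has no entry, so still a miss
print(loose_match("Jim", "Jimmy"))       # two different people would also "match"
```

Neither approach is safe on its own: the strict match misses real moves, while the loose match will happily equate two different people who share a common name.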

Using the National Change of Address (NCOA) database? Data in the NCOA database is only retained for 48 months so it cannot be used as a definitive source to "catch up" on a scrub of voter rolls that have not been properly periodically revalidated. Filing a change of address form is also not REQUIRED. If you've notified all the banks and businesses you care about of your move directly and don't care who gets the junk mail that remains, you may never file one, so the ABSENCE of a record in the NCOA database can't be used in a logical decision to invalidate a voter roll entry. And the PRESENCE of a record is still subject to the (firstname + lastname) problems above.

The logic used when matching against NCOA isn't crystal clear either. If the Bryan County voter rolls were stored in table oldvoterrolls, any NEW voter registrations they consulted were in a table newvoterrolls and the post office change of addresses were stored in a table changeofaddress, the data might look like this:

MariaDB [mock]> select * from oldvoterrolls;
+----------+-------+----------+-----------+---------------------+-------+-------+-------+
| voter_id | fname | lname    | ssn       | address             | city  | state | zip   |
+----------+-------+----------+-----------+---------------------+-------+-------+-------+
|        1 | Jim   | Smith    | 500119999 | 1313 Mockingbird Ln | Bryan | GA    | 31324 |
|        2 | Jim   | Smith    | 500229999 | 2112 Rush St        | Bryan | GA    | 31325 |
|        3 | Nate  | Bargatze | 500339999 | 1999 Prince Ave     | Bryan | GA    | 31323 |
|        4 | Dave  | Jones    | 500669999 | 1492 Columbus Ave   | Bryan | GA    | 31322 |
+----------+-------+----------+-----------+---------------------+-------+-------+-------+
4 rows in set (0.000 sec)

MariaDB [mock]> select * from newvoterrolls;
+----------+--------+----------+-----------+----------------+-------+-------+-------+
| voter_id | fname  | lname    | ssn       | address        | city  | state | zip   |
+----------+--------+----------+-----------+----------------+-------+-------+-------+
|      101 | Jim    | Smith    | 500119999 | 461 Ocean Blvd | Miami | FL    | 33101 |
|      102 | Jim    | Smith    | 500449999 | 555 Maple Ln   | Miami | FL    | 33102 |
|      103 | Nathan | Bargatze | 500339999 | 3434 Oak St    | Miami | FL    | 33103 |
+----------+--------+----------+-----------+----------------+-------+-------+-------+
3 rows in set (0.000 sec)

MariaDB [mock]> select * from changeofaddress;
+---------+-------+-------+-----------+---------------------+-------------+----------+--------+----------------+---------+----------+--------+
| ncoa_id | fname | lname | ssn       | oldaddress          | oldcity     | oldstate | oldzip | newaddress     | newcity | newstate | newzip |
+---------+-------+-------+-----------+---------------------+-------------+----------+--------+----------------+---------+----------+--------+
|     101 | Jim   | Smith | 500119999 | 1313 Mockingbird Ln | Bryan       | GA       | 31324  | 461 Ocean Blvd | Miami   | FL       | 33101  |
|     102 | Dave  | Jones | 500779999 | 1968 Truth Way      | Los Angeles | CA       | 90210  | 467 Ocean Blvd | Miami   | FL       | 33101  |
+---------+-------+-------+-----------+---------------------+-------------+----------+--------+----------------+---------+----------+--------+
2 rows in set (0.000 sec)

MariaDB [mock]>

Note that I've included an SSN column in the mock data so that each distinct person behind an overlapping name can be told apart, even though that value would not actually be tracked in these databases. More on that in a minute.

Given these three table structures, the SELECT query to find a list of candidates for removal might look like this:

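-- Start from the new registrations found in another jurisdiction
SELECT n.voter_id as newvoterid,n.fname as newfname,n.lname as newlname,
       o.voter_id, o.fname, o.lname, o.address, o.city, o.state, o.zip,
       o.ssn as oldssn,n.ssn as newssn,c.ssn as possn
FROM newvoterrolls as n
-- Tie each new registration to a change of address record by NEW address + name
LEFT JOIN changeofaddress as c ON (
   (n.address = c.newaddress) AND
   (n.city    = c.newcity) AND
   (n.fname   = c.fname) AND
   (n.lname   = c.lname)
   )
-- Tie that record back to the old voter rolls by OLD address + name
LEFT JOIN oldvoterrolls as o ON (
   (o.address = c.oldaddress) AND
   (o.city    = c.oldcity) AND
   (o.fname   = c.fname) AND
   (o.lname   = c.lname)
   )
-- Keep only rows where the whole chain matched
WHERE o.voter_id IS NOT NULL\G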

The results of this query look like this:

*************************** 1. row ***************************
newvoterid: 101
  newfname: Jim
  newlname: Smith
  voter_id: 1
     fname: Jim
     lname: Smith
   address: 1313 Mockingbird Ln
      city: Bryan
     state: GA
       zip: 31324
    oldssn: 500119999
    newssn: 500119999
     possn: 500119999
1 row in set (0.000 sec)

MariaDB [mock]>

In the mock data, there WAS a "Jim Smith" (with underlying SSN=500119999) who

  • appeared in the old voter rolls
  • appeared in the change of address database as moving from an old address on the old voter rolls to a new address seen in new voter rolls elsewhere
  • had those old/new addresses appear along with his fname / lname in the change of address database

So this query appears to be "correct." As far as it goes…

But read the description of the logic used by the woman who submitted the list of suspect registrations:

Their new voter registration was matched by cross-referencing the address listed in NCOA and the voter’s first and last name on the new voter registration.

That SOUNDS like the final join may have keyed only on fname / lname. What happens if the final match criteria omit the city and address?

SELECT n.voter_id as newvoterid,n.fname as newfname, n.lname as newlname,
       o.voter_id, o.fname, o.lname, o.address, o.city, o.state, o.zip,
       o.ssn as oldssn,n.ssn as newssn,c.ssn as possn
FROM newvoterrolls as n
LEFT JOIN changeofaddress as c ON (
   (n.address = c.newaddress) AND
   (n.city    = c.newcity) AND
   (n.fname   = c.fname) AND
   (n.lname   = c.lname)
   )
-- The old address and city conditions are omitted here; only the name is compared
LEFT JOIN oldvoterrolls as o ON (
   (o.fname   = c.fname) AND
   (o.lname   = c.lname)
   )
WHERE o.voter_id IS NOT NULL\G

The query returns a second, FALSE match on the "Jim Smith" with SSN=500229999 in oldvoterrolls. The change of address record for the 500119999 version of Jim Smith was joined back to the oldvoterrolls table only by (fname/lname), which matched BOTH "Jim Smith" entries in that table.

*************************** 1. row ***************************
newvoterid: 101
  newfname: Jim
  newlname: Smith
  voter_id: 1
     fname: Jim
     lname: Smith
   address: 1313 Mockingbird Ln
      city: Bryan
     state: GA
       zip: 31324
    oldssn: 500119999
    newssn: 500119999
     possn: 500119999
*************************** 2. row ***************************
newvoterid: 101
  newfname: Jim
  newlname: Smith
  voter_id: 2
     fname: Jim
     lname: Smith
   address: 2112 Rush St
      city: Bryan
     state: GA
       zip: 31325
    oldssn: 500229999
    newssn: 500119999
     possn: 500119999
2 rows in set (0.000 sec)

MariaDB [mock]>

You learn when doing this type of database work to print out all unique columns across your table sources when creating the logic for a query to spot these "false matches" based on overly lax criteria. Even when the output structure of a query LOOKS correct, the underlying join can contain logical flaws that are not evident in the output you chose to display.
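The check described above can be sketched as a small, self-contained script. It is a sketch under stated assumptions: Python's built-in sqlite3 module stands in for the MariaDB session, plain JOINs replace the LEFT JOIN plus IS NOT NULL filter (equivalent for this filter), and the helper names candidates and false_matches are hypothetical. The mock rows are the same ones shown earlier, and the SSN column exists only in this mock data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE oldvoterrolls (voter_id INTEGER, fname TEXT, lname TEXT, ssn TEXT,
    address TEXT, city TEXT, state TEXT, zip TEXT);
CREATE TABLE newvoterrolls (voter_id INTEGER, fname TEXT, lname TEXT, ssn TEXT,
    address TEXT, city TEXT, state TEXT, zip TEXT);
CREATE TABLE changeofaddress (ncoa_id INTEGER, fname TEXT, lname TEXT, ssn TEXT,
    oldaddress TEXT, oldcity TEXT, oldstate TEXT, oldzip TEXT,
    newaddress TEXT, newcity TEXT, newstate TEXT, newzip TEXT);
INSERT INTO oldvoterrolls VALUES
 (1,'Jim','Smith','500119999','1313 Mockingbird Ln','Bryan','GA','31324'),
 (2,'Jim','Smith','500229999','2112 Rush St','Bryan','GA','31325'),
 (3,'Nate','Bargatze','500339999','1999 Prince Ave','Bryan','GA','31323'),
 (4,'Dave','Jones','500669999','1492 Columbus Ave','Bryan','GA','31322');
INSERT INTO newvoterrolls VALUES
 (101,'Jim','Smith','500119999','461 Ocean Blvd','Miami','FL','33101'),
 (102,'Jim','Smith','500449999','555 Maple Ln','Miami','FL','33102'),
 (103,'Nathan','Bargatze','500339999','3434 Oak St','Miami','FL','33103');
INSERT INTO changeofaddress VALUES
 (101,'Jim','Smith','500119999','1313 Mockingbird Ln','Bryan','GA','31324',
  '461 Ocean Blvd','Miami','FL','33101'),
 (102,'Dave','Jones','500779999','1968 Truth Way','Los Angeles','CA','90210',
  '467 Ocean Blvd','Miami','FL','33101');
""")

# Same join chain as the queries above; {extra} holds the old-address conditions.
QUERY = """
SELECT o.voter_id, o.ssn AS oldssn, n.ssn AS newssn, c.ssn AS possn
FROM newvoterrolls AS n
JOIN changeofaddress AS c
  ON n.address = c.newaddress AND n.city = c.newcity
 AND n.fname = c.fname AND n.lname = c.lname
JOIN oldvoterrolls AS o
  ON o.fname = c.fname AND o.lname = c.lname {extra}
ORDER BY o.voter_id
"""
STRICT = QUERY.format(extra="AND o.address = c.oldaddress AND o.city = c.oldcity")
LOOSE = QUERY.format(extra="")  # name-only join back to the old rolls


def candidates(sql):
    """Run a candidate-removal query and return its rows."""
    return conn.execute(sql).fetchall()


def false_matches(rows):
    """Flag rows where the three SSN columns do not all agree."""
    return [row for row in rows if len({row[1], row[2], row[3]}) > 1]


strict_rows = candidates(STRICT)
loose_rows = candidates(LOOSE)
print("strict candidates:", [row[0] for row in strict_rows])
print("loose candidates: ", [row[0] for row in loose_rows])
print("flagged in loose: ", [row[0] for row in false_matches(loose_rows)])
```

Real voter rolls and NCOA extracts would not carry an SSN, so a real audit would need some other independent identifier (date of birth, for example) to play the same role.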

In this case, a logical flaw like this that went undetected and unchallenged could have resulted in denying the right to vote to the second (500229999) version of "Jim Smith". This list of challenges was submitted on September 19, 2024 and finally gained a hearing on October 10, 2024, less than 30 days before the election.

Use of zoning codes for addresses or ZIPs? This is the type of third party data that Jenifer Hilburn could NOT have verified on her own. If you have ever worked in e-commerce or in the telecommunications industry where the ability to install and deliver service is based on exact physical location and proximity to other infrastructure, you know that taking an ADDRESS as a data point and attempting to map it to exact (latitude, longitude) or map it to a county or city boundary is NOT an exact science when done from afar. Unless you physically walk to each mailbox, physically check your GPS location on a phone and audit every address, many third-party databases claiming to provide this "geo-coding" data are only approximately correct and they usually lag reality by 6-12 months. These databases rely in part upon plat plans being filed for subdivisions, construction permits being filed for specific lots, etc.

The Outcome in This Case

The county attorney presenting the challenge for review by the election board spent much of his time explaining the legal obligation of the election board members to process these requests and explaining that the burden of proof lies with the person filing the action, NOT those appearing on the list. He then noted that the person who submitted the request failed to attend the hearing, presenting no additional evidence or explanation to justify her findings. That was met with frustrated laughter by nearly everyone in attendance. At that point, the attorney ceded control of the meeting to the chair, who called for a motion to reject the request, which was approved unanimously by the entire board.

https://www.youtube.com/watch?v=flnGlMFO0fA

So what was the point of this particular effort? If Jenifer Hilburn had no intent to attend the hearing to provide supplemental proof of the validity of her recommendations, what was accomplished? From a rational citizen's perspective, this was just a hassle to the citizens involved and a waste of time for the county attorney and the election board. From the perspective of those trying to taint the perception of elections and physically throw sand in the gears, the goals were likely quite different.

The election law involved in Georgia stems from changes made in 2021 by forces in the state who supported Donald Trump's fake elector scheme. Since the law is so new, it appears no one was sure how clearly the law's provisions would be understood… OR FOLLOWED by any particular local election board. Bryan County sits on the Atlantic coast south of Savannah and is an island of core Republican support surrounded by areas that tend towards an even split or Democratic support. Was Jenifer Hilburn's request an attempt to test processes under this law in her county to see how lax they would be? Was it an attempt to set a marker for later allegations of voter fraud that could be used to delay election certifications in that county as part of a larger scheme across the state?

In this particular case, despite Bryan County appearing to be a "bright red" jurisdiction, there is zero sign anyone involved with this case at the county level supported this petition or thought it had ANY merit. The vocal roll call on the motion to dismiss the request indicated nearly unanimous frustration with this blatant attempt to arbitrarily toss people off the voter rolls.

Two conclusions appear obvious in this case. As a first conclusion, Hilburn was not supplying evidence worthy of being taken seriously if she found 859 suspect cases yet was unwilling to attend a hearing to explain her logic and underlying validation process. Even if one gives her the benefit of the doubt that SOME work was performed to join database A with B then to C, it seems clear she lacked the technical sophistication to include extra controls in the process to avoid "false joins" amid notoriously flawed data.

The second conclusion is that these types of "vigilante" laws being pushed by conservatives are SERIOUSLY flawed in both a legal and ethical sense. Turning average citizens into amateur data scientists and turning them loose on flawed, un-audited data to create proof for crackpot theories is legally flawed because it essentially hands over part of PUBLIC ("group") power to INDIVIDUALS who lack the expertise and ACCOUNTABILITY to exercise even partial control of such vast powers. This type of vigilante activism is ethically wrong because it allows corrupt politicians to outsource their dirty work to individual citizens while preserving deniability when individual citizens wind up infringing the rights of other citizens. But do not be misled. Such cases of citizenry going out of bounds against other citizenry aren't a FLAW of these laws, they're a FEATURE in the eyes of those proposing these laws. They create the fear, uncertainty and lack of trust desired while shielding those benefiting from it from legal consequences when abuses actually occur.


WTH