A group of graduate researchers from the University of California, Berkeley trained a machine learning model to predict voter preferences using only readily available personal information, a result with far-reaching implications for the use of AI to infer voter behavior and potentially influence elections.
The students – Ken Chang, Karel Baloun, and Matthew Holmes – discussed their findings today at the RSA Conference. The group used only a small number of characteristics, mined from public voter registration data and Census data, to train their decision tree algorithm.
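The approach can be sketched as follows. The features, synthetic data, and model settings below are illustrative assumptions for demonstration only, not the researchers' actual dataset or pipeline.

```python
# Illustrative sketch: training a decision tree to predict a binary party
# preference from a handful of public-record-style attributes (age, household
# income, an urban/rural flag). All data here is synthetic; the actual study
# mined voter registration and Census data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
age = rng.integers(18, 90, n)
income = rng.normal(60_000, 20_000, n)
urban = rng.integers(0, 2, n)

# Synthetic label with a weak demographic signal, so a simple model
# can predict preference somewhat above chance.
logit = 0.02 * (age - 50) - 0.00001 * (income - 60_000) - 0.8 * urban
label = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([age, income, urban])
X_train, X_test, y_train, y_test = train_test_split(X, label, random_state=0)

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

The point of the sketch is how little it takes: a shallow tree over three coarse attributes, with no proprietary data, is already enough to produce above-chance predictions on a correlated target.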
“We’re just graduate students and we created a model which guessed how people vote well above chance,” Baloun said. “So what could you do with more data than just voter registration data and Census data? You can do a lot.”
The group drew parallels to the disinformation campaign used in the wake of the 2016 election – including the Cambridge Analytica scandal – to suggest that complex voter manipulation could readily be achieved and automated.
“Unfortunately, the Cambridge Analytica scandal shows us how tremendously well voter profiles can be built,” Baloun said. “They were able to take 200 different things and match them into a voter profile, so they knew who they were targeting at a very high level of probability, for a lot of voters.”
They proposed a simple feedback loop in which disinformation – or even regular information – can be introduced to particular effect.
“The idea here is that you identify a voter, infer more attributes about that voter, generate content for that specific voter, then you get it to them, you influence their vote, and you see if it worked,” Baloun said.
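The loop Baloun describes can be sketched in Python. Every function and name here is a hypothetical placeholder invented for illustration; none corresponds to a real system or to the researchers' code.

```python
# Hypothetical sketch of the influence feedback loop described above.
# Each step is a stub; the function names are illustrative placeholders.

def infer_attributes(voter):
    # Step 2: enrich the profile with inferred attributes (stubbed).
    return {**voter, "inferred_issue": "economy"}

def generate_content(profile):
    # Step 3: generate content tailored to that specific voter (stubbed).
    return f"Message on the {profile['inferred_issue']} for voter {profile['id']}"

def deliver(message):
    # Step 4: get the message to the voter via a targeted channel (stubbed).
    return True

def measure_effect(profile):
    # Step 5: estimate whether the message shifted the vote (stubbed).
    return 0.0

def influence_loop(voters, rounds=1):
    # Step 1 is the input: a list of identified voters.
    scores = {}
    for _ in range(rounds):
        for voter in voters:
            profile = infer_attributes(voter)
            message = generate_content(profile)
            if deliver(message):
                scores[voter["id"]] = measure_effect(profile)
    return scores

results = influence_loop([{"id": 1}, {"id": 2}])
```

The structure is the point: once each stub is backed by a real model or channel, the loop closes on its own, which is exactly the automation risk the researchers warn about.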
Though the group used machine learning and observed the results at each step in the process, they suggested that completely autonomous conditioning is possible, and that the technology is “close.”
“It becomes AI when the machine is doing the whole thing – when it picks the messages, when it evaluates the scores, when it optimizes the strategy. What we’re trying to tell you here is that AI is really close,” Baloun said.
The researchers recommended expanding data privacy laws to ensure the anonymization of certain data points. Their model’s ability to predict preferences declined as particular voter attributes were obscured.
“The system depends on easily available, rich voter profiles, and the ability to target voters with messaging,” Baloun said. “You take out any of those, you degrade them to any level, and you’re really protecting democracy.”