Will AI automate data visualization?


As pretty much everyone and their robot dog is now aware, there are jaw-dropping breakthroughs happening in artificial intelligence (AI) on an almost daily basis. For those of us in the data visualization field, this raises the obvious question: Will AIs be able to create expert-level charts without any human intervention, and, if so, when might that happen?

That’s hard to say, of course, but what seems almost certain at this point is that the process of creating a chart is going to change dramatically in the very near future. Already, AI users can describe a chart using simple, plain-language prompts, and get an image of that chart in seconds without having to use the graphical user interfaces (GUIs) of data visualization products like Tableau Desktop or Microsoft Excel. How good are the resulting charts? Well, in my opinion, they’re currently pretty hit-or-miss and often require corrections or enhancements by a human with data visualization expertise before being shown to an audience. Given how quickly AI is advancing, though, how long might that remain the case?
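
To make that concrete, here's an illustrative sketch of the kind of thing a chart-making AI might hand back for a plain-language prompt. The prompt, the data, and the choice of Python with matplotlib are all made up for the sake of the example; no particular AI product works exactly this way.

```python
# Hypothetical prompt: "Make a horizontal bar chart of 2022 revenue by region,
# sorted from largest to smallest, with each bar labeled in millions of dollars."
# Illustrative response only -- all figures below are made up.
import matplotlib.pyplot as plt

regions = ["North America", "Europe", "Asia-Pacific", "Latin America"]
revenue = [42.1, 31.7, 28.4, 9.9]  # made-up figures, in millions of USD

# Sort from largest to smallest, as the prompt asked
revenue, regions = zip(*sorted(zip(revenue, regions), reverse=True))

fig, ax = plt.subplots(figsize=(7, 4))
bars = ax.barh(regions, revenue)
ax.invert_yaxis()                 # largest bar at the top
ax.bar_label(bars, fmt="$%.1fM")  # label each bar with its value
ax.set_xlabel("Revenue (millions of USD)")
ax.set_title("2022 revenue by region")
plt.tight_layout()
plt.show()
```

Whether the AI gets the sorting, the labels, and the units right on the first try is, of course, exactly the hit-or-miss part.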

I think writing computer code provides a potentially informative model here, since current AIs are far more advanced at generating code than at generating charts. The GPT-4 AI was released about a week ago as I write this, and it can produce astonishingly good code based on plain-language prompts. Does this mean that people no longer need to know how to code? Well, that doesn’t seem to be the case, at this point anyway. For now, people with coding expertise are still needed for at least two reasons:

  1. A human coder still needs to decide what code is needed in a given situation, and what that code should do. AIs can decide what code is needed for common applications with similar examples in their training data, such as simple games or content management systems, but they have trouble with more complex, novel, or unique applications, such as custom enterprise software. As far as I can tell, someone without any coding expertise would struggle to formulate prompts that would result in usable code for anything but simple, common applications.
  2. AIs often make mistakes that must be identified and corrected by expert coders before the code can run without throwing errors or introducing problems like security vulnerabilities or unintended application behaviors.

This means that, for now anyway, humans with coding expertise are still needed to guide and supervise coding AIs. Those coders will be a lot more productive (and so potentially fewer in number), but still necessary. A similar consensus seems to be emerging around car-driving AIs: During the last decade or so, many people assumed that humans would no longer need to know how to drive because car-driving AIs would exceed human driving abilities in all situations. In recent months, however, it’s started to look more like humans will still need to know how to drive, since car-driving AIs are unlikely to perform reliably in a wide variety of situations for the foreseeable future. Yes, drivers will be more productive since they can rely on AI for simpler tasks like highway driving in good conditions, but they’ll still need to know how to drive so that they can correct or take over for the AI in more unusual or complex situations.

Data visualization might follow a similar path. It seems almost certain at this point that human chart makers will become a lot more productive, because they’ll be able to simply describe a chart in plain language and get that chart within seconds. In many cases (but not all), this will be faster than using the GUI of a data visualization software product to create a chart, and learning how to use an AI to create charts will be a lot quicker and easier than learning how to use data visualization software.

Even if they’re using an AI, however, chart makers will still need data visualization expertise to decide what charts are needed in a given situation, and to supervise the AI by correcting any data visualization, reasoning, or perceptual mistakes that it might make. A human with data visualization expertise might also need to prompt the AI to make design choices that it would have trouble making on its own, such as visually highlighting part of a chart, adding callouts with key insights, or bringing in comparison or reference values from an external source.
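
As a concrete (and entirely hypothetical) illustration, here's a short matplotlib sketch of the kinds of refinements I'm describing: highlighting one bar, adding a callout with the key insight, and layering in a benchmark value from an external source. The data and the benchmark are invented for the example; the point is only that these are judgment calls a person makes, whether by editing code like this or by prompting the AI to do it.

```python
# Illustrative sketch of human refinements layered onto an AI-generated chart.
# All data and the benchmark value below are made up.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [3.1, 3.4, 3.2, 4.8, 3.6, 3.5]  # made-up monthly sales, in $M
industry_benchmark = 3.3                # hypothetical external reference value

x = range(len(months))
colors = ["#c44e52" if m == "Apr" else "#a6bddb" for m in months]  # highlight April

fig, ax = plt.subplots(figsize=(7, 4))
ax.bar(x, sales, color=colors)
ax.set_xticks(list(x))
ax.set_xticklabels(months)

# Callout with the key insight -- the kind of judgment call a human adds
ax.annotate("April spike driven by a one-time promotion",
            xy=(3, 4.8), xytext=(0.2, 4.55),
            arrowprops=dict(arrowstyle="->"))

# Comparison value brought in from an external source
ax.axhline(industry_benchmark, linestyle="--", color="gray")
ax.text(5.4, industry_benchmark + 0.08, "Industry average", color="gray", ha="right")

ax.set_ylabel("Sales ($M)")
ax.set_title("Monthly sales (April highlighted and annotated by a human)")
plt.tight_layout()
plt.show()
```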

If this is how things play out, it would mean that people will still need data visualization skills, but the way in which they’ll use those skills will change drastically. Instead of using those skills to make charts “by hand” using the GUI of a data visualization software application, they’ll use those skills to guide and supervise chart-making AIs, just as human coders use their coding expertise to guide and supervise coding AIs.

Now, a chart-making AI might not offer enough control or flexibility for some users, particularly those who create highly customized charts such as scientific charts, data art, specialized business dashboards, or novel chart types. Those users will likely still need to use data visualization GUIs or code libraries such as ggplot2 or D3.js, but they represent only a small minority of chart creators. I suspect that a good chart-making AI will meet the needs of most people who create charts.

I’m probably overestimating its importance, but my upcoming Practical Charts book might accelerate the transition from using GUIs to using AIs to make charts. The book contains chart design guidelines that are more concrete and specific than those in other books, which is exactly the kind of training data that would help a chart-making AI become more competent. On the one hand, it’s frustrating to think that I might have spent the last several years writing training data for AIs (and that, unless legislation or policies change, I won’t be able to block AIs from including my book in their training data). On the other hand, I recognize that AIs that include my book in their training data may allow millions of people to make better charts. This is already happening with AI-generated computer code, which often contains expert-level techniques and practices distilled from expert-written code in the AI’s training data, techniques that many coders who use the AI wouldn’t think of on their own. It’s also happening with car-driving AIs, which can allow human drivers to perform better by, for example, slamming on the brakes to avoid a frontal collision faster than a human ever could.

Now, this situation could change, of course. Between the late 90s and the early 2010s, for example, the best chess players in the world were “centaur” or “hybrid” teams consisting of a human grandmaster assisted by a chess-playing AI. Such teams could beat the best AI-only players. That changed, however, a few years ago with the arrival of engines like AlphaZero, which were so strong that pairing them with a human grandmaster made the combination weaker, not stronger. The question, then, is whether data visualization is more like chess or more like car-driving. Only time will tell, but it feels more like car-driving to me at the moment, i.e., like something that will require expert human supervision for the foreseeable future.

Take all of this with a boulder of salt, of course, since this is pure speculation based on the information that’s available at the moment. Some of the challenges that I’ve described could turn out to be much easier or much harder than expected for AIs to overcome, and things could be very different a few years from now. Or next Tuesday.

Agree? Disagree? Awesome! Let me know here on LinkedIn or on Twitter.