Just this week, I’ve spoken at three events where I’ve been asked about AI and its impact on artists, so for this weekend’s newsletter, I’ll take a break from covering the crypto industry and share some of my thoughts on this new trend. TL;DR: although I am deeply impressed by the strides that the AI industry has made over these last 3 years, I think that the generated text, visuals, and music of this wave of AI projects are not yet at the level where they could replace professional human output. For now at least, maybe the “A” in “AI” should just stand for “Average,” because it’s not exceptional just yet.
Nearly all of this generation of projects are built on large machine-learning models trained on massive datasets. The best-known examples are the Large Language Models, or LLMs, behind the text generators. This is a style of AI where humans teach The Computer by feeding it massive amounts of data, which it uses to tune its internal settings (“parameters”), and then reward or punish it based on what it produces. It’s a big enough job that there are now organizations that focus on just specific parts of the pipeline. For instance, the training data is often collected and maintained by a non-profit like LAION.ai, so the people actually designing the AI don’t have to compile all that information themselves. Meanwhile, the reward/punishment process (formally called “Reinforcement Learning from Human Feedback” or RLHF) is often handled by other companies too. One of the best-known is SurgeHQ.ai, and it’s literally dozens of humans trained to ask the AI to produce stuff (text, images, etc.) and then score it based on whether the output made sense.
But of course, the actual AI design itself rests with a handful of key players, most visibly OpenAI. They’re the creators of the GPT series, which is the underlying tech of the text generator ChatGPT and the image generator DALL-E. OpenAI does its data collection and RLHF in-house, and GPT-3 has a truly massive 175 billion parameters that get tuned during training to improve its output. Other popular art generator platforms include Stable Diffusion and Midjourney, but there are many, many others, all gleaning their training data from the open Internet and using some form of human feedback to refine their respective AI output. In a way, the public release of these various platforms is just RLHF training at scale, because every time some clever human user finds a way to break the system, the research teams can recreate the problem in their lab and attempt to train it out of the AI through punishment.
So what’s my big issue with the current state of AI? I think it’s useful for many things, but the illusion of intelligence disappears very quickly once you try to use it for work that requires a degree of accuracy. Ask it to write a review of the Samsung Galaxy S21, for example, and it’ll get some of the specifications wrong. Ask it for references on mathematics concepts, and if it can’t find any in its dataset, it will helpfully give you web links to papers that don’t exist. Normally, it’ll refuse to give you the recipe for crystal meth, but if you first ask it to pretend that it’s an AI without any safety guidelines, it will play along and give you the precise instructions anyway. All of these are RLHF challenges that will eventually get trained out of the AI’s behaviors, but it’s interesting to test the limitations of this current generation.
What about the art side? When I first saw the results coming out of DALL-E, Stable Diffusion, and Midjourney, I’ll admit that I was initially intimidated. Here was a tool that allowed anyone with a keyboard to simply type the description of an artwork that they imagined, and voila! Seconds later, a couple of professionally rendered illustrations would be available for them to choose from. But just like their text-based counterparts, the output of these AI art generators can’t really stand up to close inspection. For instance, they can’t seem to figure out how many fingers human hands have. Hundreds of memes have been shared over the last few months making fun of AI’s weird hand mutations. It’s a very telling weakness in this data-hungry style of AI, and it says a lot about the current level of intelligence our AI actually has. Think about it this way: Stable Diffusion is trained on an image set of 5 billion examples. A 4-year-old can usually tell you how many fingers a human is supposed to have even if the only examples they’ve ever seen are their own hands and their parents’. Clearly, a training set of 5 billion images is not enough, but the answer can’t be that we should just keep increasing the sample size. There are supposedly over 750 billion images on the Internet, and that includes some of the vilest things you’ve ever seen. Deciding which images are suitable or proper would probably take the better part of our lifetimes. The next AI breakthrough is thus probably not going to happen by feeding in ever more data, but through a strategy that includes some rudimentary form of intuition. The success metric should be how little training data the AI needs, rather than how much.
The bottom line is that this current generation of AI is a mimic, but it doesn’t actually understand what it’s mimicking. It follows the grammar rules of the English language not because it understands them but because it was fed 100 billion examples of (mostly) correct sentences, and then it was trained to suppress nonsense phrases. The term researchers have coined for this is a “Stochastic Parrot,” meaning it speaks without truly thinking. It’s the same problem with the images it produces. Without true understanding, AI is just haphazardly remixing whatever visuals it can draw from. It’s still up to the human users to take that output and rework it into something accurate and useful. Ultimately, it’s just a new tool, and therefore something to embrace instead of fear. Photoshop didn’t put artists out of business in the ’90s, and neither will Midjourney.
From February 17th to 19th, I’m launching an exhibit at Art Fair Philippines with Galeria Paloma and a group of fellow artists. The two NFTs I’ve created will be on display on the 6th floor at booth 27-A. It’s a pair of audiovisual pieces commenting on AI art, entitled “Handmade.” (You can see a still image of one of them in this newsletter’s cover photo!) Hope to see you all at the exhibit!