Just as artists and audiences were growing bored of deepfakes and hyperrealistic GAN aesthetics, OpenAI released CLIP. It has never been more welcome. Announced together with DALL·E in January 2021, CLIP matches images against text descriptions, allowing artists to generate high-quality images from written prompts. This new technology took the creative coding and digital art communities by storm, and my Twitter feed was flooded with images, their corresponding text prompts, and bemused reactions.
Over the past year, the rise of CLIP-based image-making has coincided with the rise of the NFT, popping up on every platform, claiming its own “clip” tag category and helping digital creators to make a living. Artists began experimenting with the new possibilities afforded by the text prompt — words used as the basis for generating an image. The style of the image can be altered by adding keywords to the text prompt, for example, “unreal” for the polished aesthetics of the Unreal game engine or “watercolor” for a painterly effect.
By the middle of 2021, researchers and developers had begun to combine CLIP’s image steering with a range of image generation models, each with their own aesthetics. Even though CLIP art layers new technology over prior systems like BigGAN or VQGAN, it felt like a substantial change from previous GAN creations owing to its busy, dream-like slickness and its privileging of guided exploration over realism or stylistic derivation. In spirit, CLIP art links much more closely to Deep Dream, the hallucinatory image generator developed by Alex Mordvintsev at Google in 2015, both in its mainstream appeal and colorfully unpredictable visuals.
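The steering loop behind these systems can be sketched in miniature. The snippet below is a toy illustration only: `clip_score` and `generator` are stand-ins for components that in real pipelines are neural networks (CLIP’s encoders and a generator such as VQGAN), and real systems optimize by gradient descent rather than the random hill-climbing used here.

```python
import random

def clip_score(image_vec, text_vec):
    """Stand-in for CLIP: cosine similarity between two embeddings.
    (The real CLIP encodes pixels and tokens with neural networks.)"""
    dot = sum(a * b for a, b in zip(image_vec, text_vec))
    norm = (sum(a * a for a in image_vec) ** 0.5) * \
           (sum(b * b for b in text_vec) ** 0.5)
    return dot / norm if norm else 0.0

def generator(latent):
    """Stand-in for a generator such as VQGAN or BigGAN: maps a latent
    vector to an 'image' embedding (here, simply the identity)."""
    return latent

def steer(text_vec, steps=200, seed=0):
    """Nudge the latent so the generated image scores ever higher against
    the text prompt -- the essence of CLIP guidance, with random
    hill-climbing standing in for gradient descent."""
    rng = random.Random(seed)
    latent = [rng.uniform(-1, 1) for _ in text_vec]
    best = clip_score(generator(latent), text_vec)
    for _ in range(steps):
        candidate = [x + rng.gauss(0, 0.1) for x in latent]
        score = clip_score(generator(candidate), text_vec)
        if score > best:               # keep only improvements
            latent, best = candidate, score
    return latent, best

# A pretend text embedding for a prompt; the loop pulls the image toward it.
latent, score = steer([0.2, -0.7, 0.5, 0.1])
print(round(score, 3))
```

Swapping in a different generator changes the aesthetic while the steering loop stays the same, which is why so many model-plus-CLIP combinations appeared in quick succession.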
Memo Akten has showcased these similarities, and through them, the leap in image generation capabilities between 2015 and 2021 in works such as Journey through the layers of the mind (2015) and All watched over by machines of loving grace (2021). Here, Akten stares into a camera, his face fluctuating via trippy variegation (2015) into vibrant natural and animal-filled sequences (2021).
As a curator and organizer in the creative AI community, I’ve witnessed established AI artists incorporate CLIP-based models into their practice. Many of these artists began minting their works as NFTs: Sofia Crespo, alongside Feileacan McCormick as Entangled Others, caught imaginary creatures basking in forest twilight; Gene Kogan took to Twitter with a vivid flower scroll and Mario Klingemann realized his NFT-DeFi-DAO fantasy Botto — an AI that generates images based on community feedback. Botto generated $1 million in sales within weeks of its arrival. Alongside these established figures, technologists Katherine Crowson and Ryan Murdock have been pushing the envelope of CLIP aesthetics by building creative Colab notebooks such as Big Sleep and VQGAN+CLIP.
Recently, CLIP art aesthetics have inspired Vadim Epstein and Xander Steenbrugge to address the theme of dreams. Moscow-based Epstein is known in the creative AI community for developing Aphantasia, a CLIP-based tool designed to produce distinct multicolored visuals, some of which he has already minted on Hic et Nunc. On the project’s GitHub page, Epstein defines aphantasia as “the inability to visualize mental images, the deprivation of visual dreams.” The audience is invited to consider his system as “an antidote” or “an outsourced visual imagination, triggered by words.”
A media artist, VJ, and theoretical physicist, Epstein has an oeuvre spanning net.art, science art, AI, and creative coding, and has exhibited worldwide from the Moscow Biennale to the Animaze Festival in Montreal. His Vimeo account contains a plethora of generated videos with musical accompaniment, which are frequently regarded as “music videos.” Despite CLIP’s text-to-image powers, which suit the illustration of song lyrics (as in Janelle Shane’s The Wellerman (2021)), Epstein typically chooses audio-only music, free from the interference and additional significance of song lyrics.
Epstein’s recent series “Saga” muses on the narrative power of CLIP-generation. It currently comprises three music videos: Saga [single], Saga [multi], and Saga [3D] (see below, 2021) and is followed by a related video work, TEOPEMA (2022), produced in collaboration with musician COH as 696 unique three-second works released as a BrainDrops collection. Made with a customized version of his Aphantasia toolkit, these works “explore off-plot narratives, which suggest a rich and multilayer blending of views and impressions instead of a linear development of the situations.” Yet the viewer’s experience remains linear. With specific images emerging in the foreground as other subplots dissipate, one cannot but wonder at the possible visual paths not taken.
To Epstein, such an approach represents the next evolution of VJ practice with its “active transformation of semantic and aesthetic flows independently of each other, momentarily creating unforeseen meanings and visions.” The first Saga video zooms out to trace a dream sequence that incorporates ancient temples, monumental scenery and indecipherable lettering, all centered on the same imaginary location. The artist calls this “running on the same spot.”
In the second video, Epstein explores Saga’s narrative by focusing on its stylistic components: Familiar music meets fantastical visuals on a spectrum from Deep Dream to Klimt. Finally, in Saga 3D, labyrinthine elements appear as the audience enters new cellular space. Set to a different soundtrack, its pace slowing, the composition grows sparser and we dive deep into human eyes and faces — a counterbalance to earlier scenes of dystopian civilization.
These technical possibilities and mythical narratives are further developed in TEOPEMA, connecting mathematics and physical reality for poetic purposes. For Epstein, it captures an essential quality “where the real and the ideal merge, occasionally swapping their places and turning into one another.”
Through its original aesthetics, the “Saga” series envisions a new universe — Epstein’s typical CLIP prompts abounding with the keyword “epic.” But he also relies on abstract prompts that address a multitude of categories and contexts: Life, death, love, science, nature, etc. To stylize the subject matter, he adds the names of numerous artists as keywords. Dalí, Doré, Klimt, Giger, and Saudek are all familiar to AI systems, and others are carefully chosen for their ability to generate unusual responses to Epstein’s standard “dead man, embraced by passion” text prompt.
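The prompt recipe described here — a base phrase plus style and mood keywords — can be illustrated with a few lines of string-building. The base phrase and keywords below are the ones quoted in the text; the combining function itself is hypothetical, not Epstein’s actual tooling.

```python
def build_prompt(subject, style_artist=None, keywords=()):
    """Combine a base subject with optional mood keywords and an artist
    name, mirroring how keyword choices steer CLIP-guided imagery."""
    parts = [subject, *keywords]
    if style_artist:
        parts.append(f"in the style of {style_artist}")
    return ", ".join(parts)

print(build_prompt("dead man, embraced by passion",
                   style_artist="Gustav Klimt",
                   keywords=("epic",)))
# prints "dead man, embraced by passion, epic, in the style of Gustav Klimt"
```

Because CLIP scores whole sentences, swapping a single keyword — “epic” for “watercolor,” Klimt for Giger — can redirect the entire aesthetic of the output.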
Dreams receive more granular representation in Xander Steenbrugge’s recent “DreamScapes” collection. With a background in machine learning, he runs the Arxiv Insights YouTube channel and founded the start-up WZRD, a video creation platform that emerged from his work on audio-reactive visuals. His video work, When AI generated paintings dance to music... (2019), has amassed over 43,000 views and was included in the NeurIPS AI Art Gallery.
Recently, Steenbrugge started playing with CLIP and also launched “DreamScapes” as a BrainDrops NFT collection on OpenSea. The collection includes 500 works — a mix of sharply detailed fantasies and short video deep dives into the unknown. The artist sees his works as “evok[ing] their own unique dreamworld that should be experienced rather than understood.” His privileging of ontology over epistemology explodes the natural scientific desire for logic and explanation, replacing it with a new poetics of human-machine collaboration.
For Steenbrugge, “The code is the art,” and he is keen to probe how “subtle implementation details” dictate dramatic changes in the ultimate creative output. He therefore takes care to modify existing code in order to forge his own visual language. His titles derive from the text prompts used to generate his works, with names such as The brilliant crystal (2021) and My ideas are twisting (2021). Steenbrugge excels in the still image, curating generations replete with complex natural forms and striking perspectival depth.
The beauty of CLIP is its ability to dovetail with the latest generative models, enabling artists to steer their aesthetics in endlessly new directions. CLIP engenders a new form of creativity that reaches beyond the familiar ritual of preparing a dataset, editing the visuals, and developing complex code. By experimenting with text prompts, artists can unleash a spectrum of creative possibilities and, in the process, a new surreal semiotics. It may be yet another medium for artists to master, but it is also a door to previously unseen visions. All that remains is to interpret CLIP’s dreams.
Luba Elliott is a curator, producer and researcher specializing in artificial intelligence in art. She has curated exhibitions and conferences at The Photographers’ Gallery, ZKM Karlsruhe, Impakt Festival, The Leverhulme Centre for the Future of Intelligence, CogX, NeurIPS and ICCV. Her recent projects include the Feral File exhibition, “Reflections in the Water,” (2021) and the ART-AI Festival in Leicester, UK. She is an Honorary Senior Research Fellow at the UCL Centre for Artificial Intelligence.