By: Luiza Vianna
In March 2023, a picture of Pope Francis wearing a white Balenciaga puffer jacket went viral, prompting many to question the Catholic Church's new aesthetic choices. The image, however, was soon revealed to have been generated by the AI tool Midjourney rather than being a real depiction of the then-Pontiff. Upon closer inspection, irregularities in the picture surfaced: misshapen glasses merging into the Pope’s eyelid, a crucifix hanging perfectly from a one-sided chain, and the Pope’s hand awkwardly grasping at the air above a coffee cup rather than taking proper hold of the object.
Accurately depicting hands has challenged artists for centuries. Historically, artists have broken the hand’s anatomy down and associated its different parts with geometric figures. This is not an uncommon strategy for learning illustration: “we endeavor to reduce an object to simple, understandable terms by considering its function and its component parts” (Cheek, 2012). While the likes of Leonardo da Vinci sought to master the hand by obsessing over sketches of it in varying positions, others, such as Edgar Degas and Roy Carruthers, hid or distorted the hands of their subjects, portraying them in pockets, with additional fingers, or at exaggerated sizes.
Illustrating hands is no easier for AI tools than for humans. Understanding the proportion, shading, movement, and functionality of a hand and its muscles, joints, and tendons requires intense observation and practice that can only be gained from live subjects, not the partial, static images that often feed AI tools. Text-to-image generators have struggled to illustrate the prompt “hand” because of the many ways hands can appear, leading to fiascos such as the Pope’s floating coffee cup. As ChatGPT itself explains, this is partly due to a lack of quality training data, the collection of images used to train AI. Hands are often not shown clearly or in natural positions, so models cannot form a generalized idea of a hand and then extend it to a variety of movements and situations. Moreover, haptic data, or data collected from manual exploration used to discover object properties (Jones, 2018), has not yet been incorporated into AI tools, leaving one fewer instrument for generating realistic hands.
As AI tools progress, will they generate more realistic images of hands? That depends on how new training data is obtained once most of the finite supply of publicly available data has been exhausted. AI tools may begin pulling stills from videos for observation or incorporating different types of information, such as haptic data, into their training. There are diverse, intricate ways in which AI art may handle progress, while humans, for the time being, continue to grapple with their own artistic abilities.
References
Cheek, C. (2012). Drawing Hands. Courier Corporation.
Jones, L. (2018). Haptics. The MIT Press.
Keyes, O. K., & Hyland, A. (2023). Hands Are Hard: Unlearning How We Talk About Machine Learning in the Arts. Tradition Innovations in Arts, Design, and Media Higher Education, 1(1), Article 4.
Perrigo, B., & Johnson, V. (2023, March 28). How to spot an AI-generated image like the ‘Balenciaga Pope’. Time.
Verma, P. (2023, March 26). AI can draw hands now. That's bad news for deepfakes. Washington Post.