Where Computer Vision Meets Art

4th Workshop on Computer Vision for Art Analysis
9th September 2018, Munich, Germany


Paper submission is now open - please use this site to submit your paper!

Following the success of the previous editions of the Workshop on Computer VISion for ART Analysis held in 2012, 2014 and 2016, we present the VISART IV workshop, in conjunction with the 2018 European Conference on Computer Vision (ECCV 2018). VISART will continue its role as a forum for the presentation, discussion and publication of computer vision techniques for the analysis of art. In contrast with prior editions, VISART IV will expand its remit, offering two tracks for submission:

  1. Computer Vision for Art - technical work (standard ECCV submission, 14 pages, excluding references)
  2. Uses and Reflection of Computer Vision for Art (extended abstract, 4 pages, excluding references)

The recent explosion in the digitisation of artworks highlights the concrete importance of applications at the overlap between computer vision and art, such as the automatic indexing of databases of paintings and drawings, or automatic tools for the analysis of cultural heritage. Such an encounter, however, also opens the door both to a wider computational understanding of the image beyond photo-geometry, and to a deeper critical engagement with how images are mediated, understood or produced by computer vision techniques in the 'Age of Image-Machines' (T. J. Clark). Whereas submissions to our first track should primarily consist of technical papers, our second track therefore encourages critical essays or extended abstracts from art historians, artists, cultural historians, media theorists and computer scientists.

This one-day workshop, held in conjunction with ECCV 2018, calls for high-quality, previously unpublished works related to Computer Vision and Cultural History. Submissions for both tracks should conform to the ECCV 2018 proceedings style and will be double-blind peer reviewed by at least three reviewers.


  • Art History and Computer Vision
  • 3D reconstruction from visual art or historical sites
  • Artistic style transfer from artworks to images and 3D scans
  • 2D and 3D human pose estimation in art
  • Image and visual representation in art
  • Computer Vision for cultural heritage applications
  • Authentication, forensics and dating
  • Big-data analysis of art
  • Media content analysis and search
  • Visual Question & Answering (VQA) or Captioning for Art
  • Visual human-machine interaction for Cultural Heritage
  • Multimedia databases and digital libraries for artistic and art-historical research
  • Interactive 3D media and immersive AR/VR environments for Cultural Heritage
  • Digital recognition, analysis or augmentation of historical maps
  • Security and legal issues in the digital presentation and distribution of cultural information
  • Surveillance and behaviour analysis in Galleries, Libraries, Archives and Museums


Paper Submission: July 9th 2018

Notification of Acceptance: August 3rd 2018

Workshop: September 9th 2018 (full day workshop)

Camera-Ready Paper Due: September 21st 2018


Stuart James, Istituto Italiano di Tecnologia (IIT) & UCL DH

Leonardo Impett, EPFL & Bibliotheca Hertziana, Max Planck Institute for Art History

Peter Hall, University of Bath, UK

Joao Paulo Costeira, Instituto Superior Técnico, Portugal

Peter Bell, Friedrich-Alexander-University Nürnberg

Alessio Del Bue, Istituto Italiano di Tecnologia (IIT), Italy


  • Ali Salah (Bogazici University)
  • Andrea Torsello (Università Ca' Foscari of Venice)
  • Arianna Traviglia (Università Ca' Foscari of Venice)
  • Björn Ommer (Heidelberg University)
  • David Stork
  • Elisavet Stathopoulou (Istituto Italiano di Tecnologia)
  • Erica Nocerino (Fondazione Bruno Kessler)
  • Filippo Stanco (University of Catania)
  • Gustavo Carneiro (University of Adelaide)
  • Jiri Matas (CMP CTU FEE)
  • John Collomosse (University of Surrey)
  • John Hindmarch (University of Bamberg)
  • Lukas Klic (Harvard)
  • Mário Figueiredo (Instituto Superior Técnico)
  • Martijn Kleppe (National Library of the Netherlands)
  • Matteo Dellepiane (CNR)
  • Matthew Lincoln (The Getty)
  • Mona Hess (University of Bamberg)
  • Naila Murray (Naver Labs)
  • Ohad Ben-Shahar (Ben-Gurion University of the Negev)
  • Paul Rosin (Cardiff University)
  • Rosário Salema de Carvalho (Universidade de Lisboa)
  • Rui Hu (Idiap Research Institute)
  • Sabine Süsstrunk (EPFL)
  • Sarah Kenderdine (EPFL)
  • Stuart Dunn (King’s College London)
  • Tat-Jun Chin (University of Adelaide)
  • Tom Haines (University of Bath)


The Art of Vision

The main challenge of image understanding and, in particular, the analysis of art is to decompose an image into its informative attributes. The style in which a scene is presented needs to be separated from its content. The content, for instance a person, then has to be decomposed into appearance, pose, viewpoint, etc. Recently, there has been a lot of interest in models that not only learn these characteristics in order to recognize a scene or detect objects therein, but can also synthesize novel scenes after altering individual attributes. However, the main challenge for these generative models is still the complex, highly non-linear interaction of all these characteristics. Moreover, although large datasets are easily available, labelled training data is costly and scarce.

We will discuss novel approaches for disentangling visual information into its informative constituents. To avoid the need for tedious annotations during training, the talk will cover self-supervised similarity learning and very recent improvements to it. In the context of art analysis, we will examine the benefit of these approaches for human posture recognition and synthesis, for the analysis of artistic style, and for visual retrieval in large databases of the arts.

Deep Interdisciplinary Learning. Computer Vision and Art History

Computer Vision and Art History are divided by a constructed gap between the humanities and the sciences, with their different research cultures, methods and infrastructures. Bridging this gap is more than just a modern strategy in university politics. In my opinion it is mandatory, because computer vision and art history are both based on image description and understanding. Furthermore, art is a particular field of imagery because of its complexity on the one hand and its distinctiveness on the other. Art historians have experience with the different ways of artistic perception, which pose several challenges for computer vision. And art historians need image processing to deal with the expanded body of data that is now accessible. Having different visual cultures and methods is an opportunity for interdisciplinary research. In my keynote I focus on the longstanding collaboration of these two fields and exemplify it with results. The second aim of the talk is a critical revision of the imagery of computer vision from an art historical perspective.

Deep representation of image aesthetics for visual search and content-aware in-painting.

We present a deep representation for visual aesthetics trained from a corpus of millions of artworks (BAM!), and show how this may be leveraged to disentangle image content (structure) from the style in which it is depicted (aesthetics). We discuss how the resulting feature embedding can be leveraged for both style-aware visual search and content-aware image in-painting. For the former, we propose a novel measure of visual similarity for sketch-based image retrieval that incorporates both structural and aesthetic (style) constraints. Our algorithm accepts a query as a sketched shape, and a set of one or more contextual images specifying the desired visual aesthetic. For the latter, we propose a non-parametric in-painting algorithm that enforces both structural and aesthetic (style) consistency within the resulting content-aware image completion. We explicitly disentangle image structure and style during patch search and selection to ensure a visually consistent look and feel within the target image. We leverage the same model to perform adaptive stylization of patches to conform the aesthetics of selected patches to the target image, so harmonizing the integration of selected patches into the final composition. The talk comprises work presented previously at ICCV 2017 and CVPR 2018, performed in collaboration with Adobe Research.

On the Limits and Potentialities of Deep Learning for Cultural Analysis

As Yann LeCun (godfather of convolutional neural networks) has recently remarked, today all Artificial Intelligence systems are basically a sophisticated version of “perception” or a form of pattern recognition that can also be extended to non-visual datasets. However, what machine learning in general calculates is not an exact pattern but a statistical distribution of it. The statistical models of machine learning bring about a new breed of errors and limits yet to be properly understood and discussed in their impact on the digital humanities and society (such as bias amplifications, category compression, taxonomy reduction, diversity loss, hypermimicry, style normalization, apophenia, overfitting and more). The paper tries to address the logical limits of machine learning, focusing on the case of metaphor detection and “code invention” (Umberto Eco).

-- Matteo's talk has unfortunately been cancelled --


Time           Event
9:15           Invited talk: “The Art of Vision” Björn Ommer
10:00          S1: Deep in Art
               “What was Monet seeing while painting? Translating artworks to photo-realistic images” Matteo Tomei, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara
               “Weakly Supervised Object Detection in Artworks” Nicolas Gonthier, Yann Gousseau, Saïd Ladjal, Olivier Bonfait
               “Deep Transfer Learning for Art Classification Problems” Matthia Sabatelli, Mike Kestemont, Walter Daelemans, Pierre Geurts
11:00          Coffee Break
11:30          Invited talk: “Deep Interdisciplinary Learning. Computer Vision and Art History” Peter Bell
12:15          S2: Reflections and Tools
               “A Reflection on How Artworks Are Processed and Analyzed by Computer Vision” Sabine Lang, Björn Ommer
               “A Digital Tool to Understand the Pictorial Procedures of 17th century Realism” Francesca Di Cicco, Lisa Wiersma, Maarten Wijntjes, Joris Dik, Jeroen Stumpel, Sylvia Pont
               “Images of Image Machines. Visual Interpretability in Computer Vision for Art” Fabian Offert
14:30          Invited talk: “Sketching with Style: Visual Search with Sketches and Aesthetic Context” John Collomosse
15:15          Coffee Break
15:30          S3: Interpreting and Understanding
               “Seeing the World Through Machinic Eyes: Reflections on Computer Vision in the Arts” Marijke Goeting
               “Saliency-driven Variational Retargeting for Historical Maps” Filippo Bergamasco, Arianna Traviglia, Andrea Torsello
               “How to Read Paintings: Semantic Art Understanding with Multi-Modal Retrieval” Noa Garcia, George Vogiatzis
               “Analyzing Eye Movements using Multi-Fixation Pattern Analysis with Deep Learning” Sanjana Kapisthalam, Christopher Kanan, Elena Fedorovskaya
17:05          Invited talk: “On the Limits and Potentialities of Deep Learning for Cultural Analysis” Matteo Pasquinelli
16:50          Closing Remarks
17:10 - 20:00  Social Event - Kunsthalle

Please contact stuart.james at with any questions.