NeAT: Neural Artistic Tracing for Beautiful Style Transfer

Style transfer results using NeAT, trained on BBST-4M. Zoomed in areas are shown in the middle columns.

Abstract

Style transfer is the task of reproducing the semantic contents of a source image in the artistic style of a second target image. In this paper, we present NeAT, a new state-of-the-art feed-forward style transfer method. We re-formulate feed-forward style transfer as image editing, rather than image generation, resulting in a model which improves over the state of the art in both preserving the source content and matching the target style. An important component of our model's success is identifying and fixing "style halos", a commonly occurring artefact across many style transfer techniques. In addition to training and testing on standard datasets, we introduce BBST-4M, a new large-scale, high-resolution dataset of 4M images. As a component of curating this data, we present a novel model able to classify whether an image is stylistic. We use BBST-4M to improve and measure the generalization of NeAT across a huge variety of styles. Not only does NeAT offer state-of-the-art quality and generalization, it is designed and trained for fast inference at high resolution.

Basic architecture diagram. The green modules represent the trainable parameters. For clarity, the discriminator modules and contrastive projection heads are not shown. The contrastive loss is computed 1) between the stylized output and the target style, and 2) among the stylized images themselves: a group of contrastive losses anchoring on the same style regardless of content, and a group of contrastive losses anchoring on the same content regardless of style. We also leverage several common identity losses between the stylized image and the respective style/content images from the datasets, computed as a Gatys loss.
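
The contrastive grouping described above can be illustrated with a standard InfoNCE-style loss. This is only a minimal NumPy sketch of the general technique, not the paper's implementation: the function name `info_nce`, the temperature value, and the toy embeddings are all assumptions for illustration. In practice the anchor would be a projected embedding of a stylized output, the positive an embedding sharing the same style (or content), and the negatives embeddings from other styles (or contents).

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.07):
    """Illustrative InfoNCE contrastive loss (not the paper's code).

    anchor    : embedding vector to anchor on
    positive  : embedding that should be pulled toward the anchor
    negatives : list of embeddings that should be pushed away
    tau       : temperature (hypothetical value)
    """
    unit = lambda v: v / np.linalg.norm(v)
    a, p = unit(anchor), unit(positive)
    negs = np.stack([unit(n) for n in negatives])
    # Cosine similarities, positive first, scaled by temperature.
    logits = np.concatenate([[a @ p], negs @ a]) / tau
    logits -= logits.max()  # numerical stability
    # Cross-entropy with the positive at index 0.
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

Anchoring the same loss once on style and once on content, as in the caption, amounts to calling such a function with two different choices of positives and negatives.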

A visualization of the patch co-occurrence loss, specifically showing the Sobel-guided selection process. Patches are randomly selected from both the stylized image and the style image. Sobel edge maps of the content image and style image are used to compute average intensity scores for all patches, which are then sorted by this intensity score. Two patch co-occurrence losses are computed separately: one for the simple patches and one for the complex patches.
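
The Sobel-guided selection step can be sketched as follows. This is a simplified single-channel NumPy illustration under stated assumptions, not the paper's implementation: the function names, patch size, patch count, and the even simple/complex split are all hypothetical choices for clarity.

```python
import numpy as np

def sobel_edges(img):
    """Magnitude of Sobel gradients for a single-channel image."""
    img = np.asarray(img, dtype=float)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img, 1, mode="edge")
    H, W = img.shape
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    for i in range(3):
        for j in range(3):
            gx += kx[i, j] * pad[i:i + H, j:j + W]
            gy += ky[i, j] * pad[i:i + H, j:j + W]
    return np.hypot(gx, gy)

def split_patches_by_complexity(img, patch=8, n=32, seed=None):
    """Sample n random patch locations, score each by mean Sobel
    intensity, and split into simple (low) and complex (high) halves."""
    rng = np.random.default_rng(seed)
    edges = sobel_edges(img)
    H, W = np.asarray(img).shape
    coords = [(int(rng.integers(0, H - patch)), int(rng.integers(0, W - patch)))
              for _ in range(n)]
    score = lambda yx: edges[yx[0]:yx[0] + patch, yx[1]:yx[1] + patch].mean()
    ranked = sorted(coords, key=score)
    return ranked[: n // 2], ranked[n // 2:]  # (simple, complex)
```

A separate patch co-occurrence loss would then be computed over each of the two groups, so that flat regions and texture-heavy regions are matched against comparably complex style patches.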

Ablation showing the detail gained from the use of prior deltas.

Visualization of style transfer using NeAT, and the baseline methods. Please zoom for more details.

Poster

BibTeX

@inproceedings{Ruta:neat:ECCVWS:2024,
        AUTHOR = "Ruta, Dan and Gilbert, Andrew and Collomosse, John and Shechtman, Eli and Kolkin, Nicholas",
        TITLE = "NeAT: Neural Artistic Tracing for Beautiful Style Transfer",
        BOOKTITLE = "European Conference on Computer Vision 2024, Vision for Art (VISART VII) Workshop, 2024",
        YEAR = "2024",
}