Colorful Image Colorization

Abstract: Given a grayscale photograph as input, this paper attacks the problem of hallucinating a plausible color version of the photograph. This problem is clearly underconstrained, so previous approaches have either relied on significant user interaction or resulted in desaturated colorizations. We propose a fully automatic approach that produces vibrant and realistic colorizations. We embrace the underlying uncertainty of the problem by posing it as a classification task and use class-rebalancing at training time to increase the diversity of colors in the result. The system is implemented as a feed-forward pass in a CNN at test time and is trained on over a million color images. We evaluate our algorithm using a "colorization Turing test," asking human participants to choose between a generated and ground truth color image. Our method successfully fools humans on 32% of the trials, significantly higher than previous methods. Moreover, we show that colorization can be a powerful pretext task for self-supervised feature learning, acting as a cross-channel encoder. This approach results in state-of-the-art performance on several feature learning benchmarks.

Synopsis

Overview

  • Keywords: Colorization, Deep Learning, CNN, Self-supervised Learning, Image Processing
  • Objective: Develop a fully automatic method for colorizing grayscale images that produces vibrant and realistic results.
  • Hypothesis: The proposed method can effectively predict plausible colorizations by modeling the statistical dependencies between grayscale images and their color versions.
  • Innovation: Introduction of a classification-based approach to colorization that utilizes class rebalancing to enhance color diversity, along with a novel evaluation framework termed the "colorization Turing test."

Background

  • Preliminary Theories:

    • Color Representation: In the CIE Lab color space, perceived lightness (L) is separated from the two chromatic channels (ab), so colorization reduces to predicting ab from L; a minimal sketch of this split appears after this list.
    • Multimodal Distribution: The concept that objects can have multiple plausible colors, necessitating a model that captures this ambiguity.
    • Self-supervised Learning: Utilizing unlabeled data to train models by creating tasks that allow the model to learn representations without explicit supervision.
    • Convolutional Neural Networks (CNNs): Leveraging deep learning architectures to process and analyze image data effectively.
  • Prior Research:

    • 2015: Deep Colorization (Cheng et al.) introduced CNN-based colorization, but its results often appeared desaturated.
    • 2016: Concurrent work by Larsson et al. and Iizuka et al. developed CNN colorization methods with different loss functions and architectures.
    • 2016: The emergence of self-supervised learning techniques that allowed models to learn from raw data without labeled examples.
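
A minimal sketch of the Lab decomposition described above, assuming the scikit-image library and a placeholder image path: the lightness channel is the network input and the two chromatic channels are the prediction targets.

```python
from skimage import color, io

# Load an RGB image and convert it to CIE Lab (placeholder path).
rgb = io.imread("photo.jpg")
lab = color.rgb2lab(rgb)        # shape (H, W, 3)

L = lab[:, :, 0]    # lightness channel: network input, roughly in [0, 100]
ab = lab[:, :, 1:]  # chromatic channels: prediction targets, roughly in [-110, 110]
```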

Methodology

  • Key Ideas:

    • Multinomial Classification: Treating color prediction as a classification problem over quantized ab bins, predicting a distribution of possible colors for each pixel (see the loss sketch after this list).
    • Class Rebalancing: Reweighting the training loss by the inverse smoothed frequency of each color bin, emphasizing rare, saturated colors and thus promoting a wider range of color outputs (illustrated in the same sketch).
    • Annealed Mean Calculation: Decoding the predicted per-pixel distribution with a temperature parameter that trades off between the distribution mean (spatially consistent but desaturated) and its mode (vibrant but sometimes spatially inconsistent); see the decoding sketch after this list.
  • Experiments:

    • Colorization Turing Test: Evaluating the perceptual realism of colorizations by asking human participants to distinguish between real and synthesized colors.
    • Performance Metrics: Evaluating colorization quality with raw accuracy, measured as the area under the curve (AuC) of the cumulative per-pixel ab error distribution, and with VGG classification accuracy on recolorized images (an AuC sketch follows this list).
  • Implications: The methodology enables the generation of more realistic colorizations that can also serve as effective pre-training tasks for other computer vision applications.
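
A minimal sketch of the classification loss with class rebalancing, written in PyTorch under simplifying assumptions: hard nearest-bin targets instead of the paper's soft encoding, and placeholder tensors standing in for the network output, the target bins, and the empirical color prior. The rebalancing weights follow the idea of mixing the smoothed color prior with a uniform distribution and inverting it.

```python
import torch
import torch.nn.functional as F

Q = 313                                    # number of in-gamut ab bins (paper value)
lam = 0.5                                  # prior/uniform mixing weight (paper value)

# Placeholder stand-ins for illustration only.
logits = torch.randn(4, Q, 64, 64)                  # network output: (batch, bins, H, W)
targets = torch.randint(0, Q, (4, 64, 64))          # nearest ab bin per pixel
p_tilde = torch.rand(Q); p_tilde /= p_tilde.sum()   # smoothed empirical color prior

# Class-rebalancing weights: inverse of the prior mixed with a uniform distribution,
# normalized so the expected weight under the prior is 1.
w = 1.0 / ((1 - lam) * p_tilde + lam / Q)
w = w / (p_tilde * w).sum()

# Multinomial cross-entropy, with each pixel reweighted by its target bin's weight.
loss = F.cross_entropy(logits, targets, weight=w)
```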
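
The annealed-mean decoding can be sketched as below (array shapes are hypothetical): raising the predicted distribution to the power 1/T and renormalizing interpolates between the distribution mean (T = 1, desaturated) and its mode (T → 0, vibrant); the paper reports T = 0.38 as a good trade-off.

```python
import numpy as np

def annealed_mean(probs, bin_centers, T=0.38):
    """Sharpen the per-pixel distribution over ab bins with temperature T,
    then take its expectation over the bin centers.

    probs:       (H, W, Q) predicted distribution over quantized ab bins
    bin_centers: (Q, 2) ab coordinates of the bin centers
    """
    log_p = np.log(probs + 1e-8) / T
    log_p -= log_p.max(axis=-1, keepdims=True)          # numerical stability
    sharpened = np.exp(log_p)
    sharpened /= sharpened.sum(axis=-1, keepdims=True)
    return sharpened @ bin_centers                       # (H, W, 2) predicted ab values
```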
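
As a rough sketch of the raw-accuracy AuC metric (the function name and threshold handling are assumptions, not the paper's exact protocol): compute the fraction of pixels whose ab error falls within each threshold, then integrate over thresholds and normalize.

```python
import numpy as np

def ab_auc(pred_ab, true_ab, max_thresh=150):
    """Area under the cumulative curve of per-pixel ab error, normalized to [0, 1]."""
    err = np.linalg.norm(pred_ab - true_ab, axis=-1).ravel()   # per-pixel L2 error in ab
    thresholds = np.arange(0, max_thresh + 1)
    frac_within = np.array([(err <= t).mean() for t in thresholds])
    return np.trapz(frac_within, thresholds) / max_thresh
```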

Findings

  • Outcomes:

    • The proposed method successfully fooled participants in the Turing test 32% of the time, indicating high perceptual realism.
    • Colorizations were found to improve performance on downstream tasks such as object classification, demonstrating the utility of the learned representations.
    • The model effectively generalized to legacy black and white photographs, producing plausible colorizations despite differences in low-level statistics.
  • Significance: This research surpasses previous colorization methods by producing more vibrant and realistic results, while also providing a framework for self-supervised representation learning.

  • Future Work: Exploration of additional applications of the colorization task in other domains of computer vision, refinement of the model architecture, and further investigation into the implications of colorization on visual perception.

  • Potential Impact: Advancements in automatic image colorization could enhance various fields, including digital archiving, film restoration, and improving the performance of computer vision systems through better feature representations.

Notes

Meta

Published: 2016-03-28

Updated: 2025-08-27

URL: https://arxiv.org/abs/1603.08511v5

Authors: Richard Zhang, Phillip Isola, Alexei A. Efros

Citations: 3309

H Index: 175

Categories: cs.CV

Model: gpt-4o-mini