thoughtt ...thinking aloud

The Science of Averaging!

A couple of months back, I created an average face of all members of Iran’s Islamic Consultative Assembly. It was a quick 5-minute project, and it has been done thousands of times. There was nothing special about this project, but something very interesting happened when I started to show the results to people outside the field of Computer Science, or I should say Engineering. At first, people couldn’t comprehend the results and asked silly questions. The problem was not the quality of the produced image or even my presentation skills (but it kind of was!!), but people’s perception of the averaging operator itself!

The result of averaging the faces of all members of Iran's Islamic Consultative Assembly (including women!)
The result of averaging the faces of some members of the Assembly of Experts (around one-third of all members)

I think the problem might be in how people think of pictures! To us computer science people, images are only numbers, and we can do absolutely anything we want to them.

In computer graphics, especially in shader development, mixing different textures to produce the desired effect is a very common task. Here is the simplest technique possible to merge two textures into one:

Sample of Texture blending
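
To make the idea concrete, here is a minimal sketch of a linear blend in Python with NumPy. It is only an illustration, not the shader behind the diagram: it assumes both textures are 8-bit arrays of the same shape, and the function name `blend_textures` and the weight `t` are made up for this example. Per pixel, it computes the same thing as GLSL's `mix()`.

```python
import numpy as np


def blend_textures(tex_a, tex_b, t=0.5):
    """Linear blend of two textures: (1 - t) * tex_a + t * tex_b.

    Assumes (for this sketch) that both inputs are uint8 arrays of the
    same shape, e.g. (height, width, 3), and that 0 <= t <= 1.
    """
    if tex_a.shape != tex_b.shape:
        raise ValueError("textures must have the same shape")
    a = tex_a.astype(np.float32)
    b = tex_b.astype(np.float32)
    mixed = (1.0 - t) * a + t * b
    # Round and clamp back to the valid 8-bit range.
    return np.clip(np.round(mixed), 0, 255).astype(np.uint8)


# Two flat grey "textures" stand in for real images here.
dark = np.full((4, 4, 3), 100, dtype=np.uint8)
light = np.full((4, 4, 3), 200, dtype=np.uint8)
halfway = blend_textures(dark, light, 0.5)
```

With `t = 0.5` this is a plain average of the two textures; any other value simply weights one texture more than the other.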

This is not limited to images: we can average anything that can be described with numbers. For example, an animation sequence is just an array of joint positions, and of course we can do some math on top of them. By taking a weighted average of two distinct animations and gradually shifting the weight, we can transition from one to the other.

Animation Blending - [1]
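
As a rough sketch of that weighted average, here is a small pure-Python example. It assumes a pose is just a list of `(x, y, z)` joint positions, as described above; the names `blend_pose` and `transition` are hypothetical, and a real animation system would likely do more than this (for instance, blending rotations instead of raw positions).

```python
def blend_pose(pose_a, pose_b, w):
    """Weighted average of two poses.

    Each pose is a list of (x, y, z) joint positions. w = 0 returns
    pose_a, w = 1 returns pose_b, and values in between mix the two.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of joints")
    return [
        tuple((1.0 - w) * a + w * b for a, b in zip(joint_a, joint_b))
        for joint_a, joint_b in zip(pose_a, pose_b)
    ]


def transition(pose_a, pose_b, steps):
    """Poses that move from pose_a to pose_b in `steps` frames (steps >= 2)."""
    return [blend_pose(pose_a, pose_b, i / (steps - 1)) for i in range(steps)]


# Two made-up poses with two joints each.
walk = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
run = [(2.0, 4.0, 6.0), (3.0, 3.0, 3.0)]
frames = transition(walk, run, 3)
```

The middle frame of `frames` is the plain average of the two poses, which is the same operation we applied to the face images.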

Walkthrough

Blending faces is not hard at all. Here are the steps needed to reproduce my results.

  1. Gathering: First of all, you need images. Gather lots of face images, as many as you can.

  2. Cleaning up: We only need faces, not other body parts. It is fairly easy to write some OpenCV code that crops an image so that only the face is visible.

  3. Facial Landmarks: We need to align the faces on top of each other before averaging. For this, we first find facial landmarks on each face. This can be achieved easily with libraries like dlib and AlgoFace.

Facial landmarks with AlgoFace

  4. Delaunay Triangulation: After the previous step, we have a set of points on each image. With Delaunay triangulation, we can assign each pixel to a triangular region. This is super helpful because we want to align the images with each other (put all eyes on top of each other, all noses on top of each other, …).

Delaunay Triangulation

  5. Do The Math: After aligning the images, we can now add the pixel values together and average them. Magic!

Averaging two faces

As you can see, the alignment is not perfect: the lips are not aligned on top of each other correctly. This is due to variation in the output of the landmark-finding algorithm.
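
The alignment and averaging steps can be sketched with plain NumPy. Treat this as a simplified illustration under a few assumptions, not necessarily the code behind the images above. It assumes the landmarks have already been detected, all function names are hypothetical, and it only computes the affine matrix for one triangle pair. The actual pixel warp and the triangulation itself are left out.

```python
import numpy as np


def mean_shape(landmark_sets):
    """Average landmark positions over all faces.

    landmark_sets: array-like of shape (n_faces, n_points, 2).
    The result is the target shape every face gets warped towards.
    """
    return np.mean(np.asarray(landmark_sets, dtype=np.float64), axis=0)


def affine_from_triangles(src_tri, dst_tri):
    """2x3 affine matrix that maps the three src points onto the dst points."""
    src = np.hstack([np.asarray(src_tri, dtype=np.float64), np.ones((3, 1))])
    dst = np.asarray(dst_tri, dtype=np.float64)
    # Solve src @ X = dst for X (3x2); the affine matrix is X transposed.
    return np.linalg.solve(src, dst).T


def average_images(aligned_images):
    """Pixel-wise mean of equally sized images that are already aligned."""
    stack = np.stack([img.astype(np.float64) for img in aligned_images])
    return np.round(stack.mean(axis=0)).astype(np.uint8)


# One triangle on a face and the matching triangle in the mean shape
# (made-up coordinates): a scale by 2 followed by a shift of (2, 3).
matrix = affine_from_triangles([(0, 0), (1, 0), (0, 1)],
                               [(2, 3), (4, 3), (2, 5)])
```

In a full pipeline you would compute one such matrix per Delaunay triangle and per face, warp the pixels of each triangle onto the mean shape, and only then call something like `average_images`. OpenCV's `cv2.getAffineTransform` and `cv2.warpAffine` cover the matrix and the warp if you prefer not to write them by hand.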


StyleGAN

After finding the average face, it is very amusing to do some additional things with it. I thought doing some manipulation using StyleGAN would be a great idea. Here are some of my results:

Age

Here is the age transformation for both the Consultative Assembly and Assembly of Experts.

Consultative Assembly:

Assembly of Experts:

Gender

All members of the Assembly of Experts are men. What if they were women?

References

  1. Animation blending in MonoGame / XNA
  2. dlib C++ Library
  3. AlgoFace
  4. Nvidia’s StyleGAN
  5. Arxiv Insights - Face editing with Generative Adversarial Networks

Image Credits

  1. Diagrams made with Draw.io
  2. Icons made by Freepik from www.flaticon.com