Visual content has always been essential both for ranking well on Google and for creating a good user experience. In other words, if you want to do business online, you need to integrate relevant images and videos into your content.
However, it has never been quite clear how important it is for Google that the images are actually relevant, because, as many people have said, Google is a text scanner that cannot see and understand images in the same way a user can.
I myself have said the following before: “If you upload a picture of a cat that you call coffee machine.jpg, then Google will think it’s a picture of a coffee machine”. I don’t recommend doing that — because who wants to destroy the user experience of a coffee enthusiast who is greeted by cat pictures? Hopefully, no one.
But as the technology matures, Google will be able to understand content at a deeper level, and this is particularly important with regard to visual content.
3 types of evidence that Google can “see pictures”
1) Google Lens
Google Lens is an app that combines your mobile camera with Google software. If I point my phone at my keyboard and tap the keyboard on the screen, it suggests that I look at the Logitech Illuminated Keyboard K800, which is exactly the keyboard I have. Quite impressive!
If I point it at my half-folded Bose QuietComfort headphones, it is certain they are a pair of sandals. So, it still has a little way to go. But it is actually correct most of the time, especially if the object is “free standing” and placed in good lighting.
2) Google Photos
You’ve probably already heard of Google’s picture app, Google Photos, because it has gained quite a lot of popularity (for many good reasons, which I will not go into here).
If you have used it, it is quite clear that Google has a good idea of what is happening in pictures. I have personally imported all my pictures into Google Photos, and if I search for “bicycle” in the app, all the pictures and videos containing a bicycle come up, even though I have never named or described them in any way.
3) Google Cloud Vision API
The Google Cloud Vision API is probably the most “conclusive proof” of how much Google understands, because here we get direct insight into what Google can read from different images.
If I upload a picture of two keys (with a house key ring) lying on a wooden table, Google will come back with these confidence scores:
- 92% that the image contains a keychain
- 85% that the image contains a key
- 74% that the image contains wood
- 66% that the image contains a fashion accessory
That is a fairly accurate assessment of what is in my test picture. If you want, try uploading a photo at cloud.google.com/vision and see how impressive it is in most cases.
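If you would rather run the same kind of test from code than through the web demo, the following is a minimal sketch of how it could look in Python. It assumes the google-cloud-vision package is installed and that Google Cloud credentials are already configured; the helper names `fetch_labels` and `format_labels`, the 0.5 cut-off, and the file name `keys.jpg` are my own illustrations, not anything Google prescribes.

```python
from typing import Iterable, List, Tuple


def fetch_labels(image_path: str) -> List[Tuple[str, float]]:
    """Send a local image to the Cloud Vision API and return (label, score) pairs.

    Assumption: google-cloud-vision is installed and credentials are configured.
    """
    # Imported lazily so format_labels() below also works without the package.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as handle:
        image = vision.Image(content=handle.read())

    # Label detection returns annotations with a description and a 0..1 score.
    response = client.label_detection(image=image)
    return [(label.description, label.score) for label in response.label_annotations]


def format_labels(labels: Iterable[Tuple[str, float]], min_score: float = 0.5) -> List[str]:
    """Turn (label, score) pairs into readable lines, highest score first.

    The min_score cut-off is an illustrative choice, not an API default.
    """
    kept = [(name, score) for name, score in labels if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [f"{round(score * 100)}% {name}" for name, score in kept]


# Hypothetical usage (needs network access and credentials):
#     for line in format_labels(fetch_labels("keys.jpg")):
#         print(line)
```

The scores are the API's own confidence values, so a list like the one above is simply those values printed as percentages.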
Will it be used in this way with regard to SEO?
The really exciting question is whether this relatively modern software is used when Google reads the content of all the world’s websites. I do not think so, for the following three reasons:
1) It is “too much”
Google must scan an enormous number of images in its attempt to crawl the entire Internet as quickly as possible. If it were to scan every image to “view the content”, it would slow down its indexing process.
2) It is too imprecise
Although it works really well in many cases, it still guesses incorrectly too often. And although I am quite impressed with the software, I don’t think it is good enough yet to be allowed to influence the rankings.
3) It is not (so) necessary
If many pages really were ranking on the strength of “fake images”, Google would probably spend the server power needed to scan the images. However, another part of Google’s algorithm makes that less necessary: the engagement part. If you scare all your visitors away with “fake pictures”, you will fall in the rankings.
Why is that important?
Now that I have said it is not used for SEO, you may well be thinking: “Well, why should I worry about that?” The answer lies in the headline of the article, because it is about the future.
When it comes to content, things are becoming progressively more visual. Video keeps gaining ground, and content pages are increasingly dominated by visual elements rather than large volumes of text.
Therefore, I also believe that we will see a dramatic development in Google’s way of indexing content, because they will have to be able to understand the content of images and videos if they are to continue to deliver equally strong results.
So even though I don’t think it’s being used now (or at least not on a full scale), I think it’s going to happen soon.
How to act accordingly
If you want to (continue to) be successful with your content, you will need to focus even more strongly over time on how images and videos fit into what you write about. When you build a content page that focuses on a small number of keywords, you need to support this with visual elements showing exactly the things you want to rank for.
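One way to check this for your own pages is to compare your target keywords with the labels an image recognition service returns for the images on the page. The sketch below assumes you already have such labels per image (for example from the Cloud Vision API); the function name `unsupported_keywords`, the simple substring matching, and the sample data are my own hypothetical illustration, not a description of how Google evaluates pages.

```python
from typing import Dict, Iterable, List, Set


def unsupported_keywords(keywords: Iterable[str], image_labels: Dict[str, List[str]]) -> Set[str]:
    """Return the target keywords that none of the page's image labels mention.

    Matching is a deliberately crude, case-insensitive substring check.
    """
    # Collect every non-empty label from every image on the page, lower-cased.
    all_labels = {
        label.lower()
        for labels in image_labels.values()
        for label in labels
        if label
    }

    missing = set()
    for keyword in keywords:
        needle = keyword.lower()
        # A keyword counts as supported if it overlaps with any label.
        if not any(needle in label or label in needle for label in all_labels):
            missing.add(keyword)
    return missing


# Hypothetical page: image file name -> labels returned for that image.
page = {
    "hero.jpg": ["Cat", "Whiskers"],
    "detail.jpg": ["Keychain", "Wood"],
}
print(unsupported_keywords(["keychain", "coffee machine"], page))
```

A keyword that ends up in the result has no image backing it up, which is a hint that the page could use a more relevant picture.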
So, in short, I believe that the future of top rankings on Google belongs to those who are learning to use more (and more relevant) visual elements in their content.