When you develop systems and websites for clients online, you often come across flaws – or at least quirks – in some of the internet’s largest platforms.
Today I want to have a quick poke around the Facebook Debugger and one particularly infuriating problem I have encountered on more than one occasion.
The Facebook Debugger Scrape Again Button.
Above you can see an image (blurred for privacy) of a scrape I did. You’ll notice it says scraped two seconds ago. Well, it was, but despite being “scraped”, the image does not change. Why should it?
Well, because I have uploaded a new image overwriting the blurred one shown, and hitting scrape again repeatedly WILL NOT REFLECT THE IMAGE CHANGE.
I have also deleted the image from my web server, and yet hitting the scrape again button still shows the old image!
I have also then re-instated the NEW image and yet again – no change on the scraper.
It isn’t the end of the world, but what does this mean?
Well, for the period where the scrape does not actually refresh the image (or update it on Facebook) – if someone pastes the website URL into a Facebook post to promote the site, the old image still shows in the URL preview.
A couple of things to check are:
a) That your image is actually being served correctly in the browser by direct URL request. Go to your browser and enter the image URL specified in the OG data for the webpage.
If the server serves it as you expect (i.e. the new image, in my case) – then you can be certain you have done all you can within the bounds of human tolerance!
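The browser check above can also be scripted. Here is a minimal sketch using Python’s standard library – the URL in the comment is a hypothetical stand-in for whatever your own og:image tag points at:

```python
import urllib.request

def fetch_image_headers(url):
    """Request an image URL the way a crawler would; return (status, Content-Type)."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.headers.get("Content-Type", "")

def looks_like_image(status, content_type):
    """A scraper can only use the file if it gets a 200 and an image/* Content-Type."""
    return status == 200 and content_type.startswith("image/")

# Hypothetical URL -- substitute the og:image URL from your own page:
# status, ctype = fetch_image_headers("https://example.com/images/hero.jpg")
# print(looks_like_image(status, ctype))
print(looks_like_image(200, "image/jpeg"))  # True
print(looks_like_image(404, "text/html"))   # False
```

If this prints False for your image, the problem is on your server, not Facebook’s side.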
b) If your server does not serve the image correctly, then you may want to revisit the image naming, upload data type (I use BINARY) or the directory structure.
Double-check your http and https references – websites often get upgraded to https, but the site owners forget to update the reference to the image in the OG data in the page headers.
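A quick way to catch a stale http reference is to pull the og:image value out of the page source and compare schemes. A rough sketch – note the regex assumes the property attribute comes before content, which real markup does not guarantee:

```python
import re

def og_image_url(html):
    """Extract the og:image URL from page markup (property-before-content assumed)."""
    m = re.search(
        r'<meta[^>]+property=["\']og:image["\'][^>]+content=["\']([^"\']+)["\']',
        html, re.IGNORECASE)
    return m.group(1) if m else None

def mixed_scheme(page_url, image_url):
    """True when, e.g., an https page still points at an http image."""
    return page_url.split("://", 1)[0] != image_url.split("://", 1)[0]

head = '<meta property="og:image" content="http://example.com/hero.jpg">'
print(og_image_url(head))                                        # http://example.com/hero.jpg
print(mixed_scheme("https://example.com/", og_image_url(head)))  # True
```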
What else can you do about Facebook image scraping?
The next thing I did was to rename my new image on the server. I simply added _v1 to the end of the file name – that is all. I checked that the newly named image served correctly through a browser request – yes, it did.
Then I changed the og:image tag in the source code OG data to reflect the new image name. I made sure this was correct by copying the image URL out of the tag and pasting it into the browser, checking that the tag URL matched the image name and location exactly. Yes – the image appeared in Chrome on request, as expected.
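Renaming the file is effectively a manual cache-bust: a new URL forces the scraper to treat the image as a brand-new resource. The rename step can be sketched as a tiny helper – the _v1 suffix mirrors what I did, and the function name is my own invention:

```python
import os

def versioned_name(path, suffix="_v1"):
    """Insert a version suffix before the extension so the image URL changes."""
    root, ext = os.path.splitext(path)
    return root + suffix + ext

print(versioned_name("images/hero.jpg"))         # images/hero_v1.jpg
print(versioned_name("hero.jpg", suffix="_v2"))  # hero_v2.jpg
```

Remember to update the og:image tag to the new name as well, or the scraper will keep fetching the old URL.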
Then I went back to the scraper on FB again and got this.
“Image Unavailable: Provided og:image could not be downloaded. This can happen due to several different reasons such as your server using unsupported content-encoding. The crawler accepts deflate and gzip content encodings.”
The file is a JPG uploaded in binary mode, created with PaintShop Pro X8 like thousands of other images I’ve created.
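Since the error text names the encodings the crawler accepts, the Content-Encoding response header is worth inspecting with any HTTP client. A minimal sketch – I am assuming an absent or identity encoding is also acceptable, since the error only rules out unsupported ones:

```python
# Encodings the crawler accepts per the error message (deflate, gzip);
# treating empty/identity ("no encoding") as acceptable is my assumption.
ACCEPTED_ENCODINGS = {"", "identity", "gzip", "deflate"}

def encoding_ok(content_encoding):
    """True if this Content-Encoding header value should be fine for the crawler."""
    return (content_encoding or "").lower() in ACCEPTED_ENCODINGS

print(encoding_ok("gzip"))  # True
print(encoding_ok(None))    # True
print(encoding_ok("br"))    # False
```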
Getting bored yet? You are almost there…
Solved – the Facebook Image Scraping Problem.
I waited around two minutes – changing NOTHING in my website code, my server configuration, or the image – and then hit scrape again.
This time it worked – the image was refreshed and now shows correctly in a Facebook newsfeed mention.
What an absolute and total unspeakable drag and drain on resources. But what can one do?