Spoofing URL previews

The rise of social media has enabled us to share content with one another at an unprecedented rate. From Facebook to Twitter to Reddit to iMessage, we’re sharing more and more of what we find on the internet, so much so that it has become difficult to decide what’s worth clicking on when vast amounts of content are presented to us as bare URLs. To solve this problem, URL previews were created.

So what is a URL preview?

A URL preview is something which we’ve all seen while we browse the internet. Somebody shares a URL to some piece of content, usually a web page, and the social platform which we’re browsing automatically generates a preview of the content found at the URL. In the example Facebook post below, I am sharing a URL to one of my favourite games, RuneScape:

These previews are generally very helpful and give us rich context about the content we can expect to find if we click on the URL. In the example above, we get several pieces of information:

A main picture which is supposed to depict what we might see when we click the link
The actual domain name which we can expect to find ourselves at when we click the link
A short textual description of what the web page is about

Upon seeing the above post, you might expect that clicking on it will take you to runescape.com and that the web page will look roughly the way the preview provided by Facebook suggests. However, if you go ahead and actually click on the post above, you’ll find yourself at my proof-of-concept website instead: poc.razb.me

How are URL previews created?

In order to understand how a URL preview might be spoofed, we must first understand how a URL preview is created. I will be using Facebook as my example platform, but the other platforms I tested generate their previews in a similar manner.

There are really only three main steps in generating a URL preview:

Download a copy of the web page found at the URL
Inspect it for various metadata tags
Generate the preview

The first step should be pretty straightforward: make a GET request to the URL and save the resulting HTML.

The second step is specific to the platform. Each platform documents the metadata tags that it reads from a web page’s HTML in order to populate its URL previews. Facebook, for example, reads Open Graph tags such as og:title, og:description and og:image. You can find out all about these HTML tags here: https://developers.facebook.com/docs/sharing/webmasters/

The third step is to put together the information provided via the metadata tags into a visual format which becomes the URL preview.
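As a rough illustration of steps two and three, here is a minimal sketch in Python (rather than the PHP used later in this post). It assumes the page carries Open Graph tags and works on a hard-coded HTML string instead of a downloaded page. The names OpenGraphParser, build_preview and SAMPLE_HTML are made up for this example, and this is not Facebook's actual implementation.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def build_preview(html):
    """Step 2: inspect the metadata tags. Step 3: assemble the preview fields."""
    parser = OpenGraphParser()
    parser.feed(html)
    return {
        "title": parser.tags.get("og:title"),
        "description": parser.tags.get("og:description"),
        "image": parser.tags.get("og:image"),
        "url": parser.tags.get("og:url"),
    }


# Stand-in for the HTML that step 1 (the GET request) would have downloaded.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example page" />
<meta property="og:description" content="A short description." />
<meta property="og:image" content="https://example.com/cover.png" />
</head><body>Hello</body></html>
"""

preview = build_preview(SAMPLE_HTML)
print(preview)
```

Any field whose tag is missing from the page simply comes back as None, so a real preview generator would need some fallback for those cases.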

How can we spoof the preview generation?

So far everything looks fine. I share a URL, Facebook fetches that web page and parses the HTML tags then generates the preview from them. How can we possibly spoof any of this? We don’t necessarily own the web page which we’re linking to, so what can we do?

Let’s try sharing a URL to a web page which we do own and check the GET request which Facebook makes.

Upon inspection, when Facebook makes a GET request to my proof-of-concept domain, the headers look something like this:

	{
		"Host": "poc.razb.me",
		"Connection": "Keep-Alive",
		"Accept-Encoding": "gzip",
		"Accept": "*/*",
		"User-Agent": "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
	}

The field worth noting is the “User-Agent” field. This is used as a courtesy by the client requesting our web page and it lets us know what sort of browser (or in this case bot) was used to access our web page.

In the case of the Facebook URL preview generation, Facebook’s bot configured the “User-Agent” field to something specific to their use case and they call it “facebookexternalhit”.

For a normal user like you or me requesting my web page, the headers will look something like this:

	{
		"Host": "poc.razb.me",
		"Connection": "Keep-Alive",
		"Accept-Encoding": "gzip",
		"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
	}

In the normal case, the “User-Agent” field looks very different and actually contains information about the browser which I used to access my web page.

It turns out that we can use this subtle difference in User-Agent to serve Facebook’s bot a different response than we serve normal users.

User-Agent switching

If for every request we read the User-Agent field before serving our web page, we can switch on the User-Agent value to serve different responses for bots and for regular users.
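To make the idea concrete, here is a small sketch of the check itself, written in Python rather than PHP. It only looks for the "facebookexternalhit" token seen in the headers above. The function names is_preview_bot and choose_response are invented for this example, and other platforms' bots would need their own tokens added.

```python
from typing import Optional, Sequence


def is_preview_bot(user_agent: Optional[str],
                   tokens: Sequence[str] = ("facebookexternalhit",)) -> bool:
    """Return True when the User-Agent contains a known preview-bot token."""
    if not user_agent:
        # A missing or empty header is treated as a regular visitor.
        return False
    lowered = user_agent.lower()
    return any(token.lower() in lowered for token in tokens)


def choose_response(user_agent: Optional[str]) -> str:
    """Pick which of the two responses a request should receive."""
    return "bot" if is_preview_bot(user_agent) else "browser"


print(choose_response(
    "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
))
print(choose_response(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
))
```

The check is a plain substring match, so it relies entirely on the client reporting its User-Agent honestly.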

So what might we do differently for a bot so that it generates a preview for something other than our current web page? A simple solution is to return a 302 redirect to the target web page of your choice.

Here is some sample code which can do just that in PHP:

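	<?php

	// The User-Agent header may be absent, so fall back to an empty string.
	$user_agent = $_SERVER["HTTP_USER_AGENT"] ?? "";

	if (strpos($user_agent, "facebookexternalhit") !== false) {
		// Facebook's preview bot: redirect it to the page we want previewed.
		// PHP sends a 302 status automatically when a Location header is set.
		header("Location: https://runescape.com");
		exit;
	} else {
		// Everyone else gets the real page.
		echo("poc.razb.me");
	}

	?>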

In this example we set the “Location” response header to https://runescape.com. This tells Facebook’s bot to ignore whatever content it found at my URL, poc.razb.me, and to go looking at runescape.com instead. We can confirm that the bot is doing what we expect by using Facebook’s sharing debugger: https://developers.facebook.com/tools/debug/sharing/?q=https%3A%2F%2Fpoc.razb.me
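For readers who don't use PHP, roughly the same behaviour can be sketched as a tiny WSGI application in Python. This is an illustrative equivalent of the PHP sample above, not code that runs on my site. The names app and call_app are made up for the example, and call_app exists only so the two branches can be exercised without starting a server.

```python
def app(environ, start_response):
    """Redirect Facebook's preview bot; serve the real page to everyone else."""
    user_agent = environ.get("HTTP_USER_AGENT", "")
    if "facebookexternalhit" in user_agent:
        # The bot follows the Location header and previews the target instead.
        start_response("302 Found", [("Location", "https://runescape.com")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"poc.razb.me"]


def call_app(user_agent):
    """Invoke the app directly with a fake request and capture its response."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"HTTP_USER_AGENT": user_agent}, start_response))
    return captured["status"], captured["headers"], body


bot_status, bot_headers, _ = call_app("facebookexternalhit/1.1")
user_status, _, user_body = call_app("Mozilla/5.0 Chrome/71.0.3578.98")
print(bot_status, bot_headers.get("Location"))
print(user_status, user_body.decode())
```

To try it against a real browser, the app could be served locally with the standard library's wsgiref.simple_server.make_server.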

It turns out this is all it takes to spoof the Facebook URL preview bot. It is also possible to perform the same simple switch on iMessage, Twitter, and Reddit previews. Here are some examples:

Twitter preview spoof
https://twitter.com/test_raz/status/1079640248743837696

It’s important to note that both iMessage and Reddit actually preserve the original URL when generating the preview. This is a very good practice, because this sort of attack is much less effective when the user can still see the original URL.

Facebook and Twitter, however, use the redirected URL in the preview and omit the original URL entirely. This makes it possible to trick somebody into thinking that they are about to go to one domain when in fact they will be taken to another.
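To show what preserving the original URL could look like, here is a hypothetical Python sketch of how a preview generator might label a link when the shared URL and the URL it ended up at after redirects are on different hosts. The functions hostname and preview_label are invented for this example and do not describe how any of these platforms actually work.

```python
from urllib.parse import urlsplit


def hostname(url):
    """Return the lower-cased host part of a URL, or an empty string."""
    return (urlsplit(url).hostname or "").lower()


def preview_label(shared_url, final_url):
    """Build the domain label for a preview, keeping the shared domain visible."""
    shared_host = hostname(shared_url)
    final_host = hostname(final_url)
    if shared_host == final_host:
        return shared_host
    # The bot was redirected elsewhere: show both domains, not just the target.
    return f"{shared_host} (redirects to {final_host})"


print(preview_label("https://poc.razb.me", "https://runescape.com"))
print(preview_label("https://runescape.com/a", "https://runescape.com/b"))
```

With a label like the first one, the reader would still see poc.razb.me in the preview even though the picture and description came from runescape.com.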

Responsible Disclosure

Prior to publishing this post, I reached out to Facebook and Twitter via their respective bug bounty programs and made sure that they were aware of the trick that I outlined in this post. Both got back to me and said that the issue is too low risk and that they have no plans to change the current behaviour.

So for the time being, pay extra close attention when clicking on content on Facebook and Twitter, and make sure the domain you end up at is the one you expected.

Raz