When inspecting Web Compatibility issues related to UA detection, I go through a small routine to isolate the source of the problem. It often starts in the shell with curl or httpie, where I test the bare domain name as a person would likely type it. Today, I was checking an issue with the Chinese Web site baidu, which "serves a static HTML site to Firefox for Android and a desktop site to Firefox OS" according to the bug:
→ http -v GET http://baidu.com/
It generates the following HTTP request (note that the User-Agent is HTTPie by default):
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate, compress
Host: baidu.com
User-Agent: HTTPie/0.7.2
The response from the server was surprising…
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=86400
Connection: Keep-Alive
Content-Length: 81
Content-Type: text/html
Date: Fri, 25 Oct 2013 18:18:45 GMT
ETag: "51-4b4c7d90"
Expires: Sat, 26 Oct 2013 18:18:45 GMT
Last-Modified: Tue, 12 Jan 2010 13:48:00 GMT
Server: Apache

<html>
<meta http-equiv="refresh" content="0;url=http://www.baidu.com/">
</html>
200 OK means "I have what you want. Here's the content," but the response payload was minimal, with only a meta element for refreshing the page after 0 seconds and loading the URI http://www.baidu.com/.
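To make the mechanism concrete, here is a minimal sketch of how a script could spot this kind of hidden redirect inside a 200 response. The function name findMetaRefresh is my own, and the regular expression assumes that http-equiv comes before content and that the content value is quoted, as in the response above. A real HTML parser would be more robust.

```javascript
// Sketch: extract the delay and target URL from a meta refresh element.
// Assumption: http-equiv precedes content and the content value is quoted.
function findMetaRefresh(html) {
  const pattern =
    /<meta\s+http-equiv=["']?refresh["']?\s+content=["']\s*(\d+)\s*;\s*url=([^"']+)["']/i;
  const match = pattern.exec(html);
  if (match === null) {
    return null; // no meta refresh found
  }
  return { delay: Number(match[1]), url: match[2].trim() };
}

// The payload served by baidu.com in the response above.
const baiduBody =
  '<html>\n<meta http-equiv="refresh" content="0;url=http://www.baidu.com/">\n</html>\n';
console.log(findMetaRefresh(baiduBody));
```

A client that only looks at the status line sees a plain 200 and has no reason to look further, which is why this kind of redirect is easy to miss.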
There's already an existing tool in HTTP for doing this: 301 Moved Permanently.
HTTP/1.1 301 Moved Permanently
Location: http://www.baidu.com/
And that's all, you do not need anything else. If you think that the redirection might be blocked by a mechanism and want a human to decide to follow the new location, you can add an HTML payload.
HTTP/1.1 301 Moved Permanently
Location: http://www.baidu.com/
Content-Length: 89
Content-Type: text/html

<!doctype html><html><title>baidu</title><a href="http://www.baidu.com">Baidu</a></html>
Note: I'm leaving out cache information because it's a permanent redirect. Though I wonder what bots usually do with and without the dates and ETag headers, and the redirect information.
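As a rough illustration of the 301 response with an HTML fallback, here is a sketch of how it could be produced on a Node.js server. The helper names buildRedirect, escapeHtml and redirectToWww are hypothetical, and nothing here reflects how baidu's servers are actually set up. Like the example response, it sends no caching headers.

```javascript
// Sketch: build a 301 response with a Location header and a small HTML
// fallback link. buildRedirect and escapeHtml are hypothetical helper names.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function buildRedirect(location, label) {
  const body =
    "<!doctype html><html><title>" + escapeHtml(label) + "</title>" +
    '<a href="' + escapeHtml(location) + '">' + escapeHtml(label) + "</a></html>\n";
  return {
    status: 301,
    headers: {
      Location: location,
      "Content-Type": "text/html",
      // Content-Length counts bytes, not characters.
      "Content-Length": Buffer.byteLength(body, "utf8"),
    },
    body: body,
  };
}

// Handler with the (req, res) shape expected by Node's http.createServer.
function redirectToWww(req, res) {
  const redirect = buildRedirect("http://www.baidu.com/", "Baidu");
  res.writeHead(redirect.status, redirect.headers);
  res.end(redirect.body);
}
```

To try it, the handler can be passed to require("http").createServer(redirectToWww).listen(8080), where the port is arbitrary. Requesting it with http -v should then show the 301 status line and the Location header instead of a 200 with a meta refresh.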
The final issue is a bit more complicated, as baidu is serving at least three types of content depending on the User-Agent: a low-tech mobile version, an enhanced mobile version, and a desktop version. Not really related to this post, but worth noting for people who think there are only a few browsers on earth. One of the baidu scripts has a pretty interesting piece of code:
var w = /se /gi.test(navigator.userAgent);
var o = /AppleWebKit/gi.test(navigator.userAgent) && /theworld/gi.test(navigator.userAgent);
var k = /theworld/gi.test(navigator.userAgent);
var p = /360se/gi.test(navigator.userAgent);
var a = /360chrome/gi.test(navigator.userAgent);
var f = /greenbrowser/gi.test(navigator.userAgent);
var t = /qqbrowser/gi.test(navigator.userAgent);
var m = /tencenttraveler/gi.test(navigator.userAgent);
var j = /maxthon/gi.test(navigator.userAgent);
var u = /krbrowser/gi.test(navigator.userAgent);
var l = /BIDUBrowser/gi.test(navigator.userAgent) && (typeof window.external.GetVersion != "undefined");
var b = false;
That's a good thing to remember: the market is always more diverse than you think. The diversity is local, the access is global. Anyone with any tool might access your Web site.
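To get a feel for how many browsers the baidu script quoted above is looking for, its User-Agent tests can be condensed into a table. This is only a sketch: the patterns are copied from the script, but the table layout, the labels and the detectBrowsers name are my own. It covers just the User-Agent substring tests, and leaves out the combined AppleWebKit check and the window.external.GetVersion check.

```javascript
// Sketch: table-driven version of the User-Agent substring tests.
// Patterns come from the baidu script; labels and layout are assumptions.
const UA_PATTERNS = [
  ["se", /se /i],
  ["theworld", /theworld/i],
  ["360se", /360se/i],
  ["360chrome", /360chrome/i],
  ["greenbrowser", /greenbrowser/i],
  ["qqbrowser", /qqbrowser/i],
  ["tencenttraveler", /tencenttraveler/i],
  ["maxthon", /maxthon/i],
  ["krbrowser", /krbrowser/i],
  ["bidubrowser", /BIDUBrowser/i],
];

// No "g" flag here: a global regex keeps lastIndex between test() calls,
// which would give inconsistent results when the same object is reused.
function detectBrowsers(userAgent) {
  return UA_PATTERNS
    .filter(([, pattern]) => pattern.test(userAgent))
    .map(([label]) => label);
}

// Hypothetical User-Agent string, for illustration only.
console.log(
  detectBrowsers("Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Maxthon/4.0)")
);
```

Ten patterns for browsers that many Web developers outside China have probably never tested against, and that is only one script on one site.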