otsukare Thoughts after a day of work

HEAD and its support

What is HTTP HEAD?

HTTP HEAD is defined in RFC7231:

The HEAD method is identical to GET except that the server MUST NOT send a message body in the response (i.e., the response terminates at the end of the header section). The server SHOULD send the same header fields in response to a HEAD request as it would have sent if the request had been a GET, except that the payload header fields (Section 3.3) MAY be omitted. This method can be used for obtaining metadata about the selected representation without transferring the representation data and is often used for testing hypertext links for validity, accessibility, and recent modification.

A payload within a HEAD request message has no defined semantics; sending a payload body on a HEAD request might cause some existing implementations to reject the request.

The response to a HEAD request is cacheable; a cache MAY use it to satisfy subsequent HEAD requests unless otherwise indicated by the Cache-Control header field (Section 5.2 of [RFC7234]). A HEAD response might also have an effect on previously cached responses to GET; see Section 4.3.5 of [RFC7234].

The first sentence says it all. OK, let's test it on a server for which we recently received a bug report (about an issue unrelated to this post):

http HEAD http://www.webmotors.com.br/comprar/busca-avancada-carros-motos 'User-Agent:Mozilla/5.0 (Android 4.4.4; Mobile; rv:48.0) Gecko/48.0 Firefox/48.0'

and we discover an interesting result.

HTTP/1.1 302 Found
Cache-Control: private
Cache-control: no-cache="set-cookie"
Connection: keep-alive
Content-Length: 219
Content-Type: text/html; charset=utf-8
Date: Thu, 17 Mar 2016 05:05:16 GMT
Location: http://www.webmotors.com.br/erro/paginaindisponivel?aspxerrorpath=/comprar/busca-avancada-carros-motos
Set-Cookie: AWSELB=FBEB4F8D0A4EA85DD70F4AB212E2DE8B1D243194C8211DB47745035BB893091C3757B63537A8283279E292F270164C17215D106B7608F46725A149C18EC4961E97F3828361;PATH=/;MAX-AGE=3600
Vary: User-Agent,Accept-Encoding
X-AspNetMvc-Version: 5.2
X-Powered-By: ASP.NET
X-UA-Compatible: IE=Edge

The server is probably Microsoft IIS with an ASP.NET layer. I'm not blaming IIS here; I have seen this pattern on more than one server.

The server responds with a 302, that is, a redirection to the URI given in the Location: field.

The 302 (Found) status code indicates that the target resource resides temporarily under a different URI.

But if we follow that redirection to that Location:, we get the message:

Ops! A página que você procura não foi encontrada. ("Oops! The page you are looking for was not found.")

which basically means they can't find the page. So if the page really doesn't exist, they should send back an error status code instead of a 302 redirect.

Let's verify this by requesting the same URI with a GET, printing only the HTTP response headers.

http --print h GET http://www.webmotors.com.br/comprar/busca-avancada-carros-motos 'User-Agent:Mozilla/5.0 (Android 4.4.4; Mobile; rv:48.0) Gecko/48.0 Firefox/48.0'

This time the response is 200 OK, which means the request worked and we received the right answer.

HTTP/1.1 200 OK
Cache-Control: private
Cache-control: no-cache="set-cookie"
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 30458
Content-Type: text/html; charset=utf-8
Date: Thu, 17 Mar 2016 05:38:03 GMT
Set-Cookie: gBT_comprar_vender=X; path=/
Set-Cookie: Segmentacao=controller=Comprar&action=buscaavancada&gBT_comprar_vender=X&gBT_financiamento=&gBT_revista=&gBT_seguros=&gBT_servicos=&marca=X&modelo=X&blindado=&ano_modelo=X&preco=X&uf=X&Cod_clientePJ=&posicao=&BT_comprar_vender=X&BT_financiamento=X&BT_revista=X&BT_seguros=X&BT_servicos=X&carroceria=&anuncioDfp=Comprar/Buscas/Busca_Avan&midiasDfp=101,81; expires=Sat, 17-Sep-2016 05:38:03 GMT; path=/
Set-Cookie: AWSELB=FBEB4F8D0A4EA85DD70F4AB212E2DE8B1D243194C8CA5862EA033196053DA70C8A276707F8A7ABFB1FA324CD7934D5EC56F696CA5DFAECA620F087B46CAE01F4173A09BBD5;PATH=/;MAX-AGE=3600
Vary: Accept-Encoding
X-AspNetMvc-Version: 5.2
X-Powered-By: ASP.NET
X-UA-Compatible: IE=Edge

So unfortunately they forbid HEAD, which is handy for a caching check instead of having to download the full resource again. But I guess in their case it doesn't matter very much, because the response carries no real caching information.
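For contrast, here is a minimal sketch in Python of what the RFC asks for, using only the standard library. It is a toy local server, not the stack behind the site above: the Handler class, the BODY content, and the fetch and demo helpers are all made up for illustration. The HEAD handler sends exactly the same headers as GET and then stops before the body.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<p>hello</p>"  # hypothetical page content, not taken from any real site


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        # Shared by GET and HEAD so both methods advertise the same metadata.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # Same headers as GET, but the response ends after the header section.
        self._send_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, method):
    """Send one request to the local server; return (status, headers, body)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    resp = conn.getresponse()
    body = resp.read()
    # Date is dropped because it may differ by a second between two requests.
    headers = {k.lower(): v for k, v in resp.getheaders() if k.lower() != "date"}
    conn.close()
    return resp.status, headers, body


def demo():
    """Start the toy server on a free port and compare HEAD with GET."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch(port, "HEAD"), fetch(port, "GET")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    head, get = demo()
    print("HEAD:", head)
    print("GET: ", get)
```

With a handler written this way, demo() should report the same status and the same headers for both methods, with an empty body for HEAD only. That is the behavior we were hoping to see from the server above.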

I wonder how Firefox and other browsers are handling HEAD, or if they use it at all.

That said, the real issue with this server so far is server-side user agent sniffing, which sends the desktop version to Firefox on Android and the mobile version to Chrome on Android.

Digression

One more thing.

Redbot is quite a nice tool for linting HTTP headers. We can try it on this Web site, specifically for the caching information. This is what Redbot tells us about this resource:

The Cache-Control: no-cache directive means that while caches can store this response, they cannot use it to satisfy a request unless it has been validated (either with an If-None-Match or If-Modified-Since conditional) for that request.

This response doesn't have a Last-Modified or ETag header, so it effectively can't be used by a cache.

No caching. Fresh page all the time. Sad panda for the Web. 🐼
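The part of that diagnosis about missing validators can be roughly approximated in a few lines of Python. This is only a sketch and not how Redbot works internally: the function names are invented, the header list is copied from the GET response above (cookies left out), and the Cache-Control parsing is deliberately naive.

```python
# Headers copied from the GET response above, without the Set-Cookie fields.
GET_HEADERS = [
    ("Cache-Control", "private"),
    ("Cache-control", 'no-cache="set-cookie"'),
    ("Connection", "keep-alive"),
    ("Content-Encoding", "gzip"),
    ("Content-Length", "30458"),
    ("Content-Type", "text/html; charset=utf-8"),
    ("Vary", "Accept-Encoding"),
]


def cache_validators(headers):
    """Return the validator fields (ETag, Last-Modified) present, lowercased."""
    names = {name.lower() for name, _ in headers}
    return sorted(names & {"etag", "last-modified"})


def cache_control(headers):
    """Merge every Cache-Control field into one {directive: argument} dict.

    Naive on purpose: it splits on every comma, so a quoted argument
    containing a comma would be parsed incorrectly.
    """
    directives = {}
    for name, value in headers:
        if name.lower() != "cache-control":
            continue
        for part in value.split(","):
            part = part.strip()
            if not part:
                continue
            key, _, arg = part.partition("=")
            directives[key.strip().lower()] = arg.strip().strip('"') or None
    return directives


print(cache_validators(GET_HEADERS))  # prints []
print(cache_control(GET_HEADERS))     # prints {'private': None, 'no-cache': 'set-cookie'}
```

The empty list is the point: without an ETag or a Last-Modified field, a cache has nothing to send in a conditional request.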

Ok let's make it two. One more thing.

I digress a bit, but let's run another fun test on that server. We send the same request to the server multiple times:

http --print h GET http://www.webmotors.com.br/comprar/busca-avancada-carros-motos 'User-Agent:Mozilla/5.0 (Android 4.4.4; Mobile; rv:48.0) Gecko/48.0 Firefox/48.0' | egrep -i "^Date: "

In other words: give me only the date information. Here is how the Date field of HTTP response headers is defined:

The "Date" header field represents the date and time at which the message was originated, having the same semantics as the Origination Date Field (orig-date) defined in Section 3.6.1 of [RFC5322]. (…) When a Date header field is generated, the sender SHOULD generate its field value as the best available approximation of the date and time of message generation.

Do you see what is happening? Focus on the seconds.

Date: Thu, 17 Mar 2016 05:54:21 GMT
Date: Thu, 17 Mar 2016 05:54:03 GMT
Date: Thu, 17 Mar 2016 05:54:06 GMT
Date: Thu, 17 Mar 2016 05:54:11 GMT
Date: Thu, 17 Mar 2016 05:54:35 GMT
Date: Thu, 17 Mar 2016 05:54:17 GMT
Date: Thu, 17 Mar 2016 05:54:20 GMT
Date: Thu, 17 Mar 2016 05:54:23 GMT
Date: Thu, 17 Mar 2016 05:54:49 GMT
Date: Thu, 17 Mar 2016 05:54:33 GMT

The time is going up and down. I suspect that we are hitting different servers whose clocks are not synchronized.
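To put a number on it, we can parse the ten values above with Python's standard library and count how often the clock appears to go backwards between two consecutive requests. The helper names are made up for this sketch, and the list is simply the output above, assumed to be in request order.

```python
from email.utils import parsedate_to_datetime

# The ten Date values above, assumed to be in the order the requests were sent.
DATES = [
    "Thu, 17 Mar 2016 05:54:21 GMT",
    "Thu, 17 Mar 2016 05:54:03 GMT",
    "Thu, 17 Mar 2016 05:54:06 GMT",
    "Thu, 17 Mar 2016 05:54:11 GMT",
    "Thu, 17 Mar 2016 05:54:35 GMT",
    "Thu, 17 Mar 2016 05:54:17 GMT",
    "Thu, 17 Mar 2016 05:54:20 GMT",
    "Thu, 17 Mar 2016 05:54:23 GMT",
    "Thu, 17 Mar 2016 05:54:49 GMT",
    "Thu, 17 Mar 2016 05:54:33 GMT",
]


def backward_steps(values):
    """Count consecutive pairs where the later response carries an earlier Date."""
    stamps = [parsedate_to_datetime(v) for v in values]
    return sum(1 for prev, cur in zip(stamps, stamps[1:]) if cur < prev)


def spread_seconds(values):
    """Seconds between the earliest and the latest Date in the sample."""
    stamps = [parsedate_to_datetime(v) for v in values]
    return int((max(stamps) - min(stamps)).total_seconds())


print(backward_steps(DATES))  # prints 3
print(spread_seconds(DATES))  # prints 46
```

Three steps backwards in ten requests is hard to explain with a single clock, which fits the idea of several machines answering in turn.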

Otsukare!