How the visiting browser is identified

 

When creating a new browser definition, you supply an identifying user agent string. The value you enter is the most critical factor in accurate browser detection. It should contain keywords, along with one or more wildcard characters, that uniquely identify the user agents you want to match to that particular browser object.

 

For example, since DEC’s AltaVista search agent identifies itself with "AltaVista" at the beginning of its user agent, you would want to use "AltaVista*" as the user agent string. In addition to the "*" wildcard character, you can also specify a valid range of characters using brackets, such as "Mozilla/4.0[2-9]*", which would match Netscape versions 4.02 through 4.09.

 

Note: All user agent string matching is case-insensitive. We still recommend using capital letters where appropriate, because it makes the strings easier for you to read.
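
The wildcard and case rules above can be tried out with a short sketch. This is not BrowserHawk's own matcher: it assumes Python's standard fnmatch module is a close enough stand-in for the "*" and bracket syntax, and the sample user agent values are illustrative. Note that fnmatch also gives special meaning to "?" and "[!...]", which this page does not describe.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, user_agent: str) -> bool:
    """Case-insensitive wildcard match of a user agent against a match string."""
    # Lowercasing both sides gives case-insensitive matching on every platform.
    return fnmatchcase(user_agent.lower(), pattern.lower())


# "*" matches any run of characters; case differences are ignored.
print(matches("AltaVista*", "altavista-intranet/v2.0"))         # True
# "[2-9]" accepts exactly one character in that range.
print(matches("Mozilla/4.0[2-9]*", "Mozilla/4.05 (WinNT; I)"))  # True
print(matches("Mozilla/4.0[2-9]*", "Mozilla/4.01 (WinNT; I)"))  # False
```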

 

How BrowserHawk identifies the best user agent match

Some user agents share very similar or even identical keywords, so you need to be strategic about how you define your user agent strings. The important thing to keep in mind is that when BrowserHawk identifies a particular user agent, it searches for the best match among all the user agent match strings you have defined. The best match is the longest match string out of all those that match the visitor’s user agent.
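
As a rough illustration of the longest-match rule, the sketch below picks the definition whose match string is longest among those that match. It is an approximation rather than BrowserHawk's implementation: fnmatch stands in for the real matcher, the definition names and patterns are hypothetical, and the tie-breaking behavior is an arbitrary choice because this page does not say how ties are resolved.

```python
from fnmatch import fnmatchcase
from typing import Optional


def matches(pattern: str, user_agent: str) -> bool:
    # Case-insensitive wildcard match (fnmatch stands in for the real matcher).
    return fnmatchcase(user_agent.lower(), pattern.lower())


def best_match(user_agent: str, definitions: dict[str, str]) -> Optional[str]:
    """Return the browser name whose match string is the longest one that matches."""
    candidates = [(len(pattern), name)
                  for name, pattern in definitions.items()
                  if matches(pattern, user_agent)]
    if not candidates:
        return None
    # max() on (length, name) picks the longest match string;
    # ties are broken by name, an arbitrary choice made for this sketch.
    return max(candidates)[1]


# Hypothetical definitions: a general entry and a more specific one.
definitions = {
    "AltaVista (any)": "AltaVista*",
    "AltaVista Intranet": "AltaVista-Intranet*",
}

print(best_match("AltaVista-Intranet/V2.0", definitions))  # AltaVista Intranet
print(best_match("AltaVista V2.0b", definitions))          # AltaVista (any)
print(best_match("Lynx/2.8", definitions))                 # None
```

Both definitions match the first user agent, and the 19-character match string wins over the 10-character one.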

 

This is an important concept to understand, so let’s study a real scenario where a browser’s user agent string may match more than one of your defined user agent match strings. Microsoft’s IE browser and the Netscape browser share much of the same user agent string. The main difference is that IE browsers all have "MSIE" somewhere in the user agent string, and Netscape’s user agents do not.

 

For this example, assume we have two browsers set up in our BDF: one for Netscape v4.0 and one for Internet Explorer v5.0. The matching user agent strings we use for this example are the following:

For the Netscape browser, we use: "Mozilla/4*"

For the IE browser, we use: "Mozilla/*MSIE 5*"

Now assume that someone hits your site using IE version 5.0. A typical user agent string reported by that browser would be: "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)".

In this scenario, both user agent strings you defined for Netscape and IE will match, because the reported string starts with "Mozilla/4" and also contains "MSIE 5". However, BrowserHawk determines that the IE entry is the more specific match, because its match string is 16 characters long, while the Netscape match string is only 10 characters long.

Conversely, let’s study what happens under this same scenario if a Netscape browser hits the site instead. In that case, the user agent string would typically be: "Mozilla/4.0 (WinNT; I)". Matching that against the user agent strings defined above, we see that only the Netscape definition matches. This is because the IE definition requires "MSIE 5" to be present for a match. So although the "Mozilla/" part of the IE definition’s string matches, it is not a valid match without "MSIE 5" present as well.

 

See Also:

Advanced user agent recognition

Understanding browsers

Working with browsers