BrowserHawk4J Reports Web Service - Installation and Usage Guide


License
-------
The BRWS4J servlet, supporting files, and documentation are part of the
"BrowserHawk software".  Its use is governed by the BrowserHawk4J End
User License Agreement (EULA), included with this software in the license.txt
file.  Installing or using this Software in any fashion constitutes your full
and unconditional acceptance of the EULA.


Introduction
------------

BrowserHawk Reports allows you to easily collect and log detailed, unique
statistics about the browser capabilities and settings of your site visitors,
as well as their page load times (PLT).  BrowserHawk Reports is implemented as
a servlet known as the BrowserHawk4J Reports Web Service (BRWS4J) servlet.

At its core, BRWS4J contains a Java servlet that acts as an intermediary
between the client browser where the properties are detected and the
relational database where the property values are stored.  This servlet
receives data from the client using a specially constructed <img> tag that
includes the property data to log.

For example, a BrowserHawk4J JSP page can include this line of code to log
statistics:

  <%= BrowserHawk.logData(info, einfo) %>

When executed this will output an <img> tag that might look like this in the
visitor's browser:

  <img src="http://host:port/reports/brws?brow=Mozilla&os=Microsoft&vf=1.7.5">

The browser rendering the HTML page will try to fetch the image by making a
request to the brws servlet, passing the parameters brow, os, and vf (in this
particular example) along with the request.  The servlet mapped to "brws"
records these parameters and immediately returns a hard-coded 1x1 pixel
transparent GIF.  Then, in a background thread, the servlet inserts the values
into the database, or queues the insert if the database is temporarily down.
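
To make the shape of that request concrete, here is a minimal sketch of how
such an <img> tag could be assembled from a set of property values.  The
BeaconTagSketch class and its methods are hypothetical and are not part of
BrowserHawk4J; in practice logData() produces the tag for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of BrowserHawk4J: it only illustrates the
// general shape of the beacon <img> tag described above.
class BeaconTagSketch {

    // Builds an <img> tag whose src is the servlet URL plus one query
    // parameter per property, in the iteration order of the map.
    static String buildImgTag(String brwsUrl, Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<img src=\"").append(brwsUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.append("\">").toString();
    }

    // URL-encodes a single name or value as UTF-8.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<String, String>();
        props.put("brow", "Mozilla");
        props.put("os", "Microsoft");
        props.put("vf", "1.7.5");
        System.out.println(buildImgTag("http://host:port/reports/brws", props));
    }
}
```

With the three sample values used here, the sketch prints the same tag as the
example above.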

To log page load time values, just run an extended test passing a PLTOptions
instance to control PLT behaviors.  Logging is automatic in this case.

BRWS4J is multi-threaded and highly scalable.  Because the database writes are
handled asynchronously, logging does not delay the rendering of your page
while the data is being stored.
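
The general idea of asynchronous writes can be sketched as a bounded queue
drained by a background thread.  This is only an illustration of the
technique under stated assumptions, not the actual BRWS4J implementation:
the AsyncWriteSketch class, its Sink interface, and the choice to drop a
record when a write fails are all hypothetical simplifications.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of "respond now, store later".  Not BRWS4J source code.
class AsyncWriteSketch {

    // Stand-in for the database insert.
    interface Sink {
        void write(String record) throws Exception;
    }

    private final BlockingQueue<String> queue;
    private final Sink sink;
    private final Thread worker;
    private volatile boolean running = true;

    AsyncWriteSketch(int maxQueueSize, Sink sink) {
        this.queue = new ArrayBlockingQueue<String>(maxQueueSize);
        this.sink = sink;
        this.worker = new Thread(this::drainLoop, "brws-writer-sketch");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    // Called on the request thread.  Never blocks; returns false when the
    // queue is full so the caller can decide what to do.
    boolean submit(String record) {
        return queue.offer(record);
    }

    // Runs on the background thread until shutdown() is called and the
    // queue has been emptied.
    private void drainLoop() {
        while (running || !queue.isEmpty()) {
            try {
                String record = queue.poll(50, TimeUnit.MILLISECONDS);
                if (record != null) {
                    sink.write(record);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (Exception e) {
                // Simplification: a real service would retry or requeue.
                System.err.println("write failed, record dropped: " + e);
            }
        }
    }

    // Stops accepting the "running" state and waits for queued records.
    void shutdown() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncWriteSketch writer =
            new AsyncWriteSketch(100, r -> System.out.println("stored " + r));
        writer.submit("brow=Mozilla&os=Microsoft&vf=1.7.5");
        writer.shutdown();
    }
}
```

The request thread only pays for an in-memory queue insert, which is why the
response (the 1x1 GIF) can be returned before the database has been touched.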

For more information and examples of the types of data that can easily be
tracked and reported on see: http://www.cyscape.com/products/reports

The "How to Log Statistics" section below explains more about the logData()
method.  But first, please continue reading to learn about BRWS4J.


Requirements to Run BRWS4J
--------------------------

BrowserHawk4J Reports Web Service requires these components:

* A Java servlet engine supporting Servlets 2.2 or later.  If you don't
  already have a servlet engine, Jetty is a free and robust servlet engine
  (http://jetty.mortbay.org).

* A relational database.  Support scripts have been provided for the MySQL,
  Oracle, DB2, Microsoft SQL Server, and Apache Derby databases.  If you 
  don't already have a database, MySQL is a free and robust database 
  (http://www.mysql.com).

* A JDBC (Java Database Connectivity) client JAR file for accessing your
  database of choice.  Most of these libraries are not redistributable and
  must be downloaded from the database vendor or third-party JDBC vendor.  The
  JDBC libraries downloadable from the database vendors are typically free.

* BrowserHawk 10.0 Enterprise Edition must be installed with a valid license.


Components of BRWS4J
--------------------

BrowserHawk4J Reports Web Service consists of the following components:

* A server-side JAR containing the "brws" servlet (class
  com.cyscape.brws.Stats) and supporting classes.
  These are found within the reports/WEB-INF/lib/brws4j.jar file.

* A supporting JAR, found in reports/WEB-INF/lib/jdom.jar.

* Several sample database setup scripts.  These are found in the
  "scripts" directory.

* Several sample web.xml web application configuration files, found in the 
  "reports/WEB-INF" directory.  The configuration file helps the servlet with
  deployment and configuration.

* Documentation files, found in the "docs" directory.


Setup and Installation
----------------------

Installation of the BrowserHawk4J Reports Web Service servlet is easy, but
does require several steps.  Please read and follow these carefully, as every
step is important.  For upgrade instructions, see the next section.

1. Unzip the brws4j.zip to a temporary directory.  Within the ZIP you'll find
   a "reports" directory containing all the components of BRWS4J, a "docs"
   directory containing documentation, and a "scripts" directory containing
   database setup scripts.

2. Locate the sample database setup script for your database.  The sample
   files can be found within the "scripts" directory.  The goal of 
   these scripts is to construct tables named "bhawkstats" and "bhawkplt"
   with the appropriately named and typed columns.  Every relational
   database has slightly different datatypes and features, so these samples
   have been provided for the most common databases.

3. Execute the provided script to create the tables in your database that
   will hold the stats.  Most databases have a command-line client or
   graphical client in which you can execute ad hoc statements such as these.
   You may tweak the create table commands -- to add indexes for example --
   depending on the queries you anticipate making against the database.
   Note that the BHStatID, BHPltID, and LastUpdate columns are
   database-managed and not inserted into by the servlet.  BHStatID contains
   a unique identifier for each row in the bhawkstats table, and BHPltID does
   the same for the bhawkplt table.  LastUpdate contains a timestamp of when
   the insert occurred.  It is recommended that you keep these columns,
   but in the event your database system does not handle these column types
   as expected, you can drop them without harming anything.

4. Install the web application.  The web application files have already been
   placed in the proper locations within the "reports/WEB-INF" directory, so
   you just need to make the "reports" directory a web application root.  For
   example, you could copy this directory under Jetty's "webapps" directory,
   then configure the server configuration files to treat that directory as
   a new web application.

5. Install the JDBC classes to be used for accessing your database.  You do
   this by copying the JDBC JARs for your database to the "reports/WEB-INF/lib"
   directory.  Database vendors don't permit third parties to distribute their
   JDBC JARs, so you'll need to locate these on your own.  To help, here are
   some sample JAR file names (some JDBC drivers require multiple JARs):

   - Oracle: ojdbc14.jar 
   - IBM DB2: db2jcc.jar and db2jcc_license_cu.jar
   - Microsoft SQL Server: msbase.jar, mssqlserver.jar, and msutil.jar
   - MySQL: mysql-connector-3.0.16.jar
   - Derby: derby.jar and derbytools.jar (when embedded)

6. Install and configure the web.xml deployment descriptor file.  Sample
   web.xml files are located in the "reports/WEB-INF" directory.  Choose the
   sample file that matches your database and give it the new name "web.xml".
   A servlet engine will use the WEB-INF/web.xml file to manage the web
   application's configuration.

7. Within the web.xml file are several init parameters that need to be
   configured for your installation.  The file is internally documented, but
   here is an overview:

* The initial part of the file configures the "brws" servlet name to execute
  the com.cyscape.brws.Stats class.  The end of the file says the "brws"
  servlet should handle the "/brws" context path.  These parts don't need to
  be modified.  The rest of the file contains the various initialization
  parameters for the servlet that need customization.

* driver: The name of the JDBC driver class to use when connecting to the
  database.  The exact name depends on the JDBC JAR installed.  Each sample
  web.xml file includes a possible class name to use.

* url: The JDBC connection URL to use.  This specifies how to connect to the
  database.  Again, each sample web.xml file includes a sample URL to use.
  You will need to edit the URL to have the correct host, port, and database
  identifier.  Some JDBC drivers require the username and password in the URL
  also.

* user: The username used to connect to the database.  It may already be
  included in the URL provided above, in which case leave the <param-value>
  empty.

* password: The password for the given user.  Again, it may be included in the
  URL provided above, in which case leave the <param-value> empty.

* license-file: The name of the file containing your BrowserHawk license key.
  This file typically is in the form of "some name.lic" for purchased license
  keys, and "bh_evalkey.lic" for evaluation keys.  This license is typically
  located in the "WEB-INF/classes" directory.  When installing BrowserHawk4J
  Reports into a separate web application from BrowserHawk4J, you will need
  to make sure both web applications have a copy of the key in their
  "WEB-INF/classes" directories.

* escape-non-ascii-chars and max-queue-size: These are settings that control
  execution.  The default values should be sufficient during installation,
  but you may wish to change them based on your needs.  The purpose of each
  setting is explained within the file.

8. Start (or restart) the server and test the installation by connecting to
   "http://host:port/reports/brws?test" to initiate a self-diagnostic test.
   If the URL cannot be found, the web application has not been installed
   correctly.  If the database connection fails, there is a problem in the
   web.xml parameters specifying the database connection details.  If either
   of the two test inserts fails, one or both tables were constructed
   erroneously or in a different database than the one accessed by the
   servlet.  If the license check fails, the license is either not found or
   invalid.  Any error will result in a descriptive string in the test
   output.
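
If the self-test reports a database connection problem, one quick thing to
rule out is a driver class that is missing from the classpath.  The sketch
below is a hypothetical stand-alone check, not something shipped with
BRWS4J; it assumes you pass it the same class name you entered for the
"driver" parameter, and that you run it with the JDBC JARs from
"reports/WEB-INF/lib" on the classpath.

```java
// Hypothetical preflight check, NOT part of BRWS4J: reports whether a class
// name (for example the value of the "driver" init parameter) can be loaded.
class DriverCheckSketch {

    // Returns true if the named class can be found and loaded.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // JAR missing or class name misspelled
        } catch (LinkageError e) {
            return false; // class found but could not be linked or initialized
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java DriverCheckSketch <driver-class-name>");
            return;
        }
        System.out.println(args[0] + (isLoadable(args[0])
            ? " can be loaded"
            : " can NOT be loaded; check WEB-INF/lib and the class name"));
    }
}
```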

In the section after next, you'll learn how to configure BrowserHawk4J to pass
the statistics it collects to the BrowserHawk4J Reports Web Service for
storage in your database.


Upgrading from BRWS4J 9.x
-------------------------

To upgrade from BRWS4J 9.x you should perform the following steps:

1. Unzip the brws4j.zip to a temporary directory.  Within the ZIP you'll find
   a "reports" directory containing all the components of BRWS4J, a "docs"
   directory containing documentation, and a "scripts" directory containing
   database setup scripts.

2. Copy the new reports/WEB-INF/lib/brws4j.jar on top of the old version.
   (If you're using Windows and the JAR file is locked, you may need to stop
   your web services for the file to be overwritten.)  Copy jdom.jar also.

3. Find the *-table.sql script for your database under the "scripts"
   directory.  Execute the portion of the script at the end that creates a
   new "bhawkplt" table.  This is used for Page Load Time tracking.  You 
   should add it even if you don't plan to record page load times so the
   /reports/brws?test can succeed.  You'll also want to run an "alter table"
   command to add the Plugin_Flip4Mac, UserID, and SessionID columns to the
   "bhawkstats" table.  If you need help with this, write to
   support@cyscape.com.

4. Keep a copy of these instructions and the original .zip file.

In the next section, you'll learn how to configure BrowserHawk4J to pass the
statistics it collects to the BrowserHawk4J Reports Web Service for storage
in your database.


How to Log Statistics
---------------------

To log statistics from BrowserHawk4J you use the BrowserHawk.logData() method.
This static method returns an <img> tag that makes the connection to the
BRWS4J servlet.  For example, with a JSP page you might write:

  <%= BrowserHawk.logData(info, einfo) %>

From a servlet you'd write this:

  out.println(BrowserHawk.logData(info, einfo));

The logData() method uses configuration parameters to know which server to
connect to and which properties to log.  These configuration parameters can be
found in the browserhawk.properties file.  The server location can also be
specified at runtime by passing a LogOptions instance, along with other
configuration options like extra properties to log with the request and user
and session id values to record:

  <%
    LogOptions logOptions = new LogOptions();
    logOptions.setBrwsURL("http://server.com/reports/brws");
    logOptions.setPropertiesToLog("Browser,Version,Platform,Height,Width");
    logOptions.setUserID(request.getRemoteUser());
    logOptions.setSessionIDCookieName("JSESSIONID");
    logOptions.setExtraString1("vhostname");
    logOptions.setExtraString2("actionvalue");
    out.println(BrowserHawk.logData(info, einfo, logOptions, request));
  %>

The brws.url property or setBrwsURL() method specifies the server to connect
to.  For example:

  brws.url=http://brws.yourdomain.com:8080/reports/brws

The brws.properties property or setPropertiesToLog() method specifies the
default BrowserHawk4J properties to log.  For example:

  brws.properties=Browser,Version,Platform,Height,Width
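
To illustrate the format of these two entries, the following sketch reads
them with the standard java.util.Properties class and splits the
comma-separated property list.  The BrwsPropertiesSketch class is
hypothetical, and this is not necessarily how BrowserHawk4J itself reads
browserhawk.properties; it only shows what the values look like once parsed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical illustration only; BrowserHawk4J reads its own configuration.
class BrwsPropertiesSketch {

    // Parses properties-file text into a Properties object.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory text
        }
        return p;
    }

    // Splits the comma-separated brws.properties value into trimmed names.
    static List<String> propertiesToLog(Properties p) {
        List<String> names = new ArrayList<String>();
        for (String name : p.getProperty("brws.properties", "").split(",")) {
            String trimmed = name.trim();
            if (trimmed.length() > 0) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Properties p = parse(
            "brws.url=http://brws.yourdomain.com:8080/reports/brws\n"
            + "brws.properties=Browser,Version,Platform,Height,Width\n");
        System.out.println("server:     " + p.getProperty("brws.url"));
        System.out.println("properties: " + propertiesToLog(p));
    }
}
```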

Special notes:

* To log extended properties you must of course first test extended properties
  using the getExtendedBrowserInfo() method prior to calling logData().
  Otherwise only the default values will be logged rather than the user's real
  values.

* You MUST place the string result returned by logData() somewhere between the
  opening and closing BODY tags of the HTML page.  Typically we recommend
  writing this value out just prior to sending the closing </BODY> tag.

* It's good practice to use a session variable or flag to record when
  statistics have been logged for a user, so you don't log multiple times.

* The LogOptions can accept a UserID and SessionID to associate and log with
  the request.  These are optional, but especially helpful for joining
  statistic information with the page load time table entries.  The ID values
  can be set directly or from a named cookie.  Note that in BrowserHawk 9
  these had to be set as userdata parameters (now named ExtraString).

* The ExtraString parameters in the logData() method provide a convenient way
  to associate extra information with each log entry, such as the client's
  shopping cart ID or some other unique identifier.  These parameters are
  also handy if you wish to log statistics from multiple virtual web sites
  into the same database - simply record a unique ID for each web site so
  you can later filter the stats for each site.

* See the brws4jtest.jsp (discussed below) for a full working sample of a JSP
  that tests properties and logs the results to the database.
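
The "log only once per user" note above can be sketched as a simple flag
check.  To keep the example self-contained, a plain Map stands in for the
servlet session's attributes; in a real JSP or servlet you would use
session.getAttribute() and session.setAttribute() instead.  The LogOnceSketch
class and the "brws.logged" attribute name are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decides whether statistics still need to be logged
// for the current session.  A Map stands in for HttpSession attributes.
class LogOnceSketch {

    // Hypothetical attribute name; any name unique to your application works.
    static final String FLAG = "brws.logged";

    // Returns true the first time it is called for a session and records
    // that fact, so that later calls for the same session return false.
    static boolean shouldLog(Map<String, Object> sessionAttributes) {
        if (Boolean.TRUE.equals(sessionAttributes.get(FLAG))) {
            return false;
        }
        sessionAttributes.put(FLAG, Boolean.TRUE);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<String, Object>();
        System.out.println("first page view:  log? " + shouldLog(session));
        System.out.println("second page view: log? " + shouldLog(session));
    }
}
```

In a page, the call to logData() would be written out only when the check
returns true.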


How to Log Page Load Times
--------------------------

Page load times are automatically logged when performing extended detection
with the pltHead() method (see the PLT documentation for details):

  <%= BrowserHawk.pltHead(request, new PLTOptions()) %>

Without any configuration, the values are logged to the server specified by
the brws.url property in the browserhawk.properties config file.  This
behavior, and much more, can be controlled with calls to the PLTOptions
instance:

  <%= BrowserHawk.pltHead(request, new PLTOptions()
        .setBrwsURL("http://yourhost:8080/reports/brws")
        .setUserID(request.getRemoteUser())
        .setSessionIDCookieName("JSESSIONID")
        .setExtraString1("wasnotcached")
        .setExtraInt1(500)
        .setExtraDouble1(120.5)
      )
  %>

Because every setter method returns the PLTOptions instance on which it
acted, you can chain the calls together like this.  The above example
overrides the brws.url property with a custom value, assigns a user ID,
assigns a session ID from a cookie, and provides custom values to associate
with the entry.
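
For readers unfamiliar with this chaining style, here is a minimal sketch of
the pattern itself.  FluentOptionsSketch is a hypothetical stand-in written
for this example; it is not the real PLTOptions class, and it mirrors only a
few setter names to show how returning "this" makes chaining possible.

```java
// Hypothetical stand-in class, NOT the real PLTOptions: demonstrates how
// setters that return "this" allow calls to be chained.
class FluentOptionsSketch {
    private String brwsUrl;
    private String userId;
    private int extraInt1;

    FluentOptionsSketch setBrwsURL(String url) {
        this.brwsUrl = url;
        return this; // returning the same instance enables chaining
    }

    FluentOptionsSketch setUserID(String id) {
        this.userId = id;
        return this;
    }

    FluentOptionsSketch setExtraInt1(int value) {
        this.extraInt1 = value;
        return this;
    }

    String describe() {
        return "url=" + brwsUrl + ", user=" + userId + ", extraInt1=" + extraInt1;
    }

    public static void main(String[] args) {
        FluentOptionsSketch options = new FluentOptionsSketch()
            .setBrwsURL("http://yourhost:8080/reports/brws")
            .setUserID("alice")
            .setExtraInt1(500);
        System.out.println(options.describe());
    }
}
```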

Special notes:

* To log page load times you must of course follow the rules for instrumenting
  pages for PLT as explained in separate documentation.

* You MUST place the string result returned by pltHead() somewhere between the
  opening and closing HEAD tags of the HTML page.  Typically we recommend
  writing this value out just prior to sending the closing </HEAD> tag.


Sample Code
-----------

The brws4jtest.jsp sample file demonstrates how to log statistics using a JSP
page.  The sample also shows how to use the session to log statistics only
once per user.  It tests several extended properties, then logs those
properties along with several standard properties.  It displays the result to
the client in an HTML table as well as logging it.  In production it's of
course possible to log without displaying the statistics to the visitor.
The brws4jtest.jsp sample file can be found with the other BrowserHawk4J
sample files.


Troubleshooting
---------------

The first thing to do when encountering problems is to access
http://host:port/reports/brws?test.  The ?test query string starts a
self-diagnostic that tests the database connection, tests inserts into each
table, and checks the license for validity.  Any errors should appear during
this test.

You should also look at the server's console or error log.  The BRWS4J servlet
logs all errors and warnings.  Any requests that cannot be processed are
logged along with the cause of the problem and the query string of the
request.

The next step in troubleshooting is to load the web page that logs the stats
(the page that calls the logData() method) and then do a View->Source.  Scan
that HTML source for the BRWS4J <img> tag responsible for writing the results
to the database.  This tag typically looks like <img
src="http://.../reports/brws?...">.  Make certain that the URL in the image
tag points to the real location where you have your BRWS4J servlet installed,
and that the image tag is written within a valid part of the HTML source (in
particular that it is written between the BODY tags, and not broken up by
other tags).  In general the statistics will not be logged unless the page
contains a valid image tag that points to the BRWS4J servlet.  If the URL does
not appear correctly in the source, check the value of the brws.url property
in the browserhawk.properties file or, if you're using them, the setBrwsURL()
run-time setting or the extra arguments to logData() that can control this.
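
If you would rather check a saved copy of the page source programmatically,
the manual scan described above can be approximated with a small sketch like
the following.  The BeaconFinderSketch class is hypothetical, and it assumes
the servlet is mapped to a path ending in "/brws" as in the examples in this
guide.  It uses a simple regular expression rather than a full HTML parser,
and it does not URL-decode the parameter values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical troubleshooting aid, NOT part of BRWS4J: finds the first
// <img> tag whose src points at a ".../brws?..." URL and lists its parameters.
class BeaconFinderSketch {

    private static final Pattern IMG = Pattern.compile(
        "<img\\s[^>]*src=\"([^\"]*/brws\\?[^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the beacon URL found in the HTML, or null if there is none.
    static String findBeaconUrl(String html) {
        Matcher m = IMG.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Splits the query string into name/value pairs (values left undecoded).
    static Map<String, String> queryParams(String url) {
        Map<String, String> out = new LinkedHashMap<String, String>();
        int q = url.indexOf('?');
        if (q < 0) {
            return out;
        }
        // Tolerate "&amp;" in case the page escapes ampersands in attributes.
        String query = url.substring(q + 1).replace("&amp;", "&");
        for (String pair : query.split("&")) {
            if (pair.length() == 0) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            out.put(name, value);
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello</p>"
            + "<img src=\"http://host:8080/reports/brws?brow=Mozilla&os=Microsoft&vf=1.7.5\">"
            + "</body></html>";
        String url = findBeaconUrl(html);
        System.out.println("beacon URL: " + url);
        if (url != null) {
            System.out.println("parameters: " + queryParams(url));
        }
    }
}
```

A null result suggests that the tag is missing or malformed, which matches
the cases described above where statistics are not logged.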


Copyright (c) 2005-2010 cyScape, Inc. All rights reserved.
