BouncerRealTimeMetricsProject

From MozillaWiki

Bouncer Realtime Metrics Project

For the Firefox 3.5 release, Metrics worked with SQLstream to create a realtime stats dashboard, now visible here: Real-Time Firefox Download Stats

Release Engineering and several other interested parties had a problem with the previous, database-based method for tracking Bouncer stats: the database logging had to be turned off during periods of heavy load, which caused the stats to fall irrevocably behind.

The Mozilla Metrics team has been collecting download statistics for Firefox in our data warehouse for over two years now, and we have fairly accurate statistics for any release issued after mid-2008. Some informal snapshot numbers derived from other sources can be added to the Metrics numbers to give approximate all-time cumulative numbers.

When bouncer traffic was split between two datacenters in April 2010, the database logging was permanently disabled. Metrics is working with SQLstream to provide a reasonable replacement for those stats.

Requirements

  1. Monitor and record statistics for bouncer traffic from multiple datacenters
  2. Provide a high-capacity back-end that can continue to collect stats even during high-traffic periods such as release days
  3. Provide filters to allow interested parties to request particular facets of statistics, such as downloads for a particular version or from a particular geographic location
  4. Provide a storage and retrieval mechanism that is reasonably close to real time
  5. Collect geographic data down to the city level
  6. Categorize downloads by the following hierarchies
    1. Date (UTC)
      1. Year
      2. Month
      3. Day
      4. Hour
      5. Minute
    2. Product
      1. Product Name
      2. Product Major Version
      3. Product Full Version
      4. Upgrade From Version ( determined by version string after -partial suffix )
      5. Rebuild tag ( suffix string after primary version string )
      6. Target OS ( from &os= parameter )
    3. Request Info
      1. Download Type
        1. manual | partial | complete ( determined by -partial vs -complete vs any other )
      2. Request Type
        1. download | check | other ( determined by GET vs HEAD vs any other )
      3. Request Result
        1. success | failure ( determined via 302 vs 404 or any other )
    4. Locale ( localization of product )
    5. Location
      1. Continent
      2. Country
      3. Region
      4. City
      5. Latitude
      6. Longitude
    6. User Agent Info ( may be deferred to a later version )
      1. UA OS
        1. Platform
        2. OS Name
        3. OS Version
      2. UA Browser
        1. Classification
          1. Desktop | Mobile | Spider | Bot | Other | Unknown
        2. Category
          1. Gecko | MSIE | WebKit | Opera | Other | Unknown
        3. Name
        4. Version
        5. Engine A
        6. Engine B
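
The derivation rules in the Product and Request Info hierarchies above can be sketched as a small classifier. This is a minimal illustration, not the SQLstream implementation: the function names are hypothetical, and the product string shape used in the checks (name-version, optionally followed by -complete or -partial-fromversion) is an assumption. Rebuild tags are not handled.

```python
def download_type(product):
    """manual | partial | complete, from the product string suffix."""
    if "-partial" in product:
        return "partial"
    if "-complete" in product:
        return "complete"
    return "manual"


def upgrade_from(product):
    """Version string after the -partial suffix, or "" if there is none."""
    _, sep, tail = product.partition("-partial")
    return tail.lstrip("-") if sep else ""


def request_type(method):
    """download | check | other, from the HTTP method (GET, HEAD, anything else)."""
    return {"GET": "download", "HEAD": "check"}.get(method.upper(), "other")


def request_result(status):
    """success for a 302 redirect; 404 or any other status is a failure."""
    return "success" if status == 302 else "failure"
```

Keeping each rule in its own function mirrors the hierarchy levels, so a rule can change without touching the others.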

Current implementation specs

  1. Collect data using SQLstream polling of bouncer access log files
  2. Store statistics in one of our HBase clusters
    1. Use Java remote HBase client
      1. Pro: Higher performance, more flexible API (batch increment)
      2. Con: May need several jars added to the classpath
    2. Could use Thrift interface as alternative
      1. Pro: lightweight client interface
      2. Con: Somewhat limited API (i.e. one call per counter increment instead of batching)
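
The batching trade-off between the two client options can be illustrated with an in-memory buffer that coalesces increments before they are sent. This is only a sketch under assumptions: the IncrementBuffer class is hypothetical, it does not talk to HBase, and how a flushed batch is submitted depends on the client chosen.

```python
from collections import Counter


class IncrementBuffer:
    """Coalesces counter increments in memory until flush() is called."""

    def __init__(self):
        self._pending = Counter()

    def add(self, rowkey, column, amount=1):
        # Repeated hits on the same counter collapse into one pending amount.
        self._pending[(rowkey, column)] += amount

    def flush(self):
        """Return one (rowkey, column, amount) tuple per distinct counter, then reset."""
        batch = [(row, col, n) for (row, col), n in sorted(self._pending.items())]
        self._pending.clear()
        return batch
```

With a batch-capable client the flushed list could be submitted together; with a one-call-per-increment interface each tuple would still cost one call, but only one per distinct counter rather than one per request.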

Single dimension counters

The rowkey is per minute, but SQLstream can increment these counters at any frequency (e.g. per second). Each column has a very long column specifier that contains the entire dimension hierarchy (e.g. "counter:product:manual:firefox:3.6:3.6.1:win::"). Note that having hundreds or thousands of columns in a row is not a limitation for HBase. Each of these columns is a long integer counter that can be updated via the high-performance incrementColumnValue method of the HBase API.

  • Table Name: dmo_metrics_realtime
  • Rowkey format: concatenation of
    • {utc_timestamp_to_minute} (i.e. yyyy-mm-ddTHH:MM )
  • Counter Columns
    • dimension product column name format
      • counter:product:
      • {download_type}:
      • {product_name}:
      • {product_major_version}:
      • {product_version}:
      • {product_os}:
      • {product_rebuild}:
      • {upgrade_from}
    • dimension server_info column name format
      • counter:server_info:
      • {datacenter_code}:
      • {server_name}
    • dimension locale column name format
      • counter:locale:
      • {locale_code}
    • dimension location column name format
      • counter:location:
      • {continent_code}:
      • {country_code}:
      • {region_code}:
      • {city_name}:
      • {latitude}:
      • {longitude}
    • dimension user_agent_info *TBD*
  • examples
    • rowkey: minutes_2010-05-24T03:33
      • columns:
        • counter:product:manual:firefox:3.6:3.6.1:win:: = 1356
        • counter:product:complete:firefox:3.6:3.6.7:win:: = 12456
        • counter:product:partial:firefox:3.6:3.6.7:win::3.6.6 = 16334
        • counter:product:manual:thunderbird:3.1:3.1b2:mac:: = 50
        • counter:locale:en-US = 50000
        • counter:locale:pt-PT = 400
        • counter:locale:ru = 430
        • counter:location:NA:US:NH::0.000:0.000 = 30
        • counter:location:NA:US:CA::30.555:-100.999 = 5000
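
The rowkey and column naming scheme above can be sketched as follows. This is a minimal illustration under assumptions: the helper names are hypothetical, the minutes_ prefix follows the example rowkey rather than the stated rowkey format, and a nested dictionary stands in for the HBase table.

```python
from collections import defaultdict
from datetime import datetime, timezone


def minute_rowkey(ts):
    """Rowkey truncated to the UTC minute; ts must be a timezone-aware datetime."""
    return "minutes_" + ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M")


def product_column(download_type, name, major, version, os, rebuild="", upgrade_from=""):
    """Column name holding the whole product hierarchy; empty levels stay as empty fields."""
    return ":".join(["counter", "product", download_type, name, major,
                     version, os, rebuild, upgrade_from])


def locale_column(locale_code):
    return "counter:locale:" + locale_code


# In-memory stand-in for the table: rowkey -> column name -> long counter.
table = defaultdict(lambda: defaultdict(int))


def record_hit(ts, columns):
    """Increment every given counter column in the row for the minute of ts."""
    row = table[minute_rowkey(ts)]
    for column in columns:
        row[column] += 1
```

For example, a request at 03:33:41 UTC on 2010-05-24 for a manual Firefox 3.6.1 Windows download in en-US would increment the product and locale columns of the same row.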

Multi-dimensional hourly records

SQLstream can increment these counters at any frequency, from once per second to once per hour. Each rowkey starts with the timestamp truncated to the hour; the rest of the rowkey is the concatenation of all dimension values, which makes the row unique. One row is therefore inserted into the table for each combination of dimension values seen in each hour. Again, this large number of rows is not a problem for HBase. There is one column for each discrete level of each dimension hierarchy. This information is broken out into dimension: columns so that queries can easily filter for the desired rows (e.g. "tally the number of requests for the previous 24 hours for all Firefox 3.6 Windows downloads"). A single counter column contains the total number of requests for this hour and combination of dimension values. This column is a long integer counter that can either be set once when the row is inserted (if insertion happens once per hour) or be updated via the high-performance incrementColumnValue method of the HBase API.

  • Table Name: dmo_metrics_hourly
  • Rowkey format: concatenation of
    • {utc_timestamp_to_hour}_
    • {datacenter_code}:{server_name}_
    • {download_type}:{product_name}:{product_major_version}:{product_version}:{product_os}:{product_rebuild}:{upgrade_from}_
    • {locale_code}_
    • {continent_code}:{country_code}:{region_code}:{city_name}:{latitude}:{longitude}
  • Dimension Columns
    • product
      • dimension:product_download_type = {download_type}
      • dimension:product_name = {product_name}
      • dimension:product_major_version = {product_major_version}
      • dimension:product_version = {product_version}
      • dimension:product_os = {product_os}
      • dimension:product_rebuild = {product_rebuild}
      • dimension:product_upgrade_from = {upgrade_from}
    • server_info
      • dimension:datacenter_code = {datacenter_code}
      • dimension:server_name = {server_name}
    • locale
      • dimension:locale_code = {locale_code}
    • location
      • dimension:location_continent_code = {continent_code}
      • dimension:location_country_code = {country_code}
      • dimension:location_region_code = {region_code}
      • dimension:location_city_name = {city_name}
      • dimension:location_latitude = {latitude}
      • dimension:location_longitude = {longitude}
    • dimension user_agent_info *TBD*
  • Counter Column -- long integer updated via the incrementColumnValue API
    • counter:requests
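
The hourly rowkey layout and the kind of filter query described above can be sketched as follows. This is an illustration under assumptions, not the production query path: the function names are hypothetical, rows are plain dictionaries rather than HBase results, and the hour format, datacenter code and server name used in any example values are invented.

```python
def hourly_rowkey(hour, server, product, locale_code, location):
    """Join the dimension groups with "_" and the levels inside each group with ":"."""
    return "_".join([
        hour,
        ":".join(server),     # (datacenter_code, server_name)
        ":".join(product),    # (download_type, ..., upgrade_from)
        locale_code,
        ":".join(location),   # (continent_code, ..., longitude)
    ])


def tally(rows, **wanted):
    """Sum counter:requests over rows whose dimension: columns match every wanted value."""
    return sum(
        row["counter:requests"]
        for row in rows
        if all(row.get("dimension:" + name) == value for name, value in wanted.items())
    )
```

A call such as tally(rows, product_name="firefox", product_major_version="3.6", product_os="win") would total the requests for all Firefox 3.6 Windows downloads in the rows supplied. Because the rowkey begins with the hour, the time window itself could be applied as a rowkey range when scanning.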