Ateam/Projects/Uber-crawl

From MozillaWiki

Overview

Uber-crawl is a proposal from the JS and Layout teams. Its purpose is to:

  • collect JavaScript patterns from websites in the wild
  • collect SVG/CSS/HTML patterns from websites in the wild
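
As a rough illustration of what "collecting patterns" could mean in practice, the sketch below pulls the inline script and style text out of a page and reports which patterns occur in it. Everything here is an assumption made for the example: the pattern names, the regexes, and the `find_patterns` helper are hypothetical and are not part of the proposal or of bughunter. A real crawler would also need to fetch external scripts and stylesheets, and would probably want a proper parser rather than regexes.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern names and regexes, chosen only for illustration.
PATTERNS = {
    "js:with-statement": re.compile(r"\bwith\s*\("),
    "css:direction-rtl": re.compile(r"direction\s*:\s*rtl", re.IGNORECASE),
}


class InlineSourceCollector(HTMLParser):
    """Collects the text of inline <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag we are currently inside, if any
        self.sources = {"script": [], "style": []}

    def handle_starttag(self, tag, attrs):
        if tag in self.sources:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        # Only keep text that sits inside <script> or <style>.
        if self._current is not None:
            self.sources[self._current].append(data)


def find_patterns(html):
    """Return the sorted names of all PATTERNS found in inline JS/CSS."""
    collector = InlineSourceCollector()
    collector.feed(html)
    collector.close()
    text = "\n".join(collector.sources["script"] + collector.sources["style"])
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))
```

Calling `find_patterns(page_html)` on a fetched page would yield a list of pattern names that could then be recorded per site.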

This data would be stored in a queryable datastore so that teams could answer questions such as:

  • Is pattern x used on the web?
  • How many of the top x sites use pattern y?
  • How many of the top x sites that also do RTL layout use pattern y?
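
The questions above map naturally onto simple queries. The following is a minimal sketch using an in-memory SQLite database; the schema (`sites`, `site_patterns`), the column names, the sample rows, and the `count_sites_using` helper are all hypothetical and only show the shape such a datastore might take. No datastore has actually been chosen.

```python
import sqlite3


def build_demo_store():
    """Create an in-memory store with a hypothetical schema and sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE sites (
            id INTEGER PRIMARY KEY,
            site_rank INTEGER,   -- position in the "top x" list
            url TEXT,
            is_rtl INTEGER       -- 1 if the site uses RTL layout
        );
        CREATE TABLE site_patterns (
            site_id INTEGER REFERENCES sites(id),
            pattern TEXT
        );
    """)
    db.executemany("INSERT INTO sites VALUES (?, ?, ?, ?)", [
        (1, 1, "https://a.example", 0),
        (2, 2, "https://b.example", 1),
        (3, 3, "https://c.example", 1),
    ])
    db.executemany("INSERT INTO site_patterns VALUES (?, ?)", [
        (1, "js:with-statement"),
        (2, "js:with-statement"),
        (3, "css:direction-rtl"),
    ])
    return db


def count_sites_using(db, pattern, top_n, rtl_only=False):
    """How many of the top `top_n` sites use `pattern`?

    With rtl_only=True, only sites that also do RTL layout are counted.
    """
    sql = """
        SELECT COUNT(DISTINCT s.id)
        FROM sites s JOIN site_patterns p ON p.site_id = s.id
        WHERE p.pattern = ? AND s.site_rank <= ?
    """
    if rtl_only:
        sql += " AND s.is_rtl = 1"
    return db.execute(sql, (pattern, top_n)).fetchone()[0]
```

"Is pattern x used on the web?" then reduces to checking whether `count_sites_using(db, x, top_n)` is greater than zero for a large enough `top_n`.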

Goals

  • Crawl the top x websites
  • Store the JS/CSS/HTML/SVG from those sites
  • Provide a web tool for querying the datastore
  • Provide backend API access to the data

Non-Goals

  • Write a search engine

Deadline/Deliverables

  • None yet.

ATeam

We're thinking of reusing a large portion of the bughunter machinery for this, so bc is a good choice.

Dependencies

  • Machines, lots of machines.

Major Tasks

  • TBD

Notes