Ateam/Projects/Uber-crawl
Overview
Uber-crawl is a proposal from the JS and Layout teams. Its purpose is to:
- collect JavaScript patterns from websites in the wild
- collect SVG/CSS/HTML patterns from websites in the wild
This data would be stored in a queryable datastore so that teams could answer the following questions:
- Is pattern x used on the web?
- How many of the top x sites use pattern y?
- How many of the top x sites that also do RTL layout use pattern y?
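As a rough illustration of how the questions above could map onto queries, here is a minimal sketch using an in-memory SQLite database. Nothing in it comes from the proposal: the table names (`sites`, `pattern_hits`), the `uses_rtl` flag, the pattern labels, and the sample rows are all hypothetical, and the real datastore may look nothing like this.

```python
import sqlite3


def build_demo_store():
    """Create a tiny in-memory datastore with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sites (
            site_id  INTEGER PRIMARY KEY,
            host     TEXT,
            rank     INTEGER,   -- 1 = most popular site
            uses_rtl INTEGER    -- 1 if the site does RTL layout
        );
        CREATE TABLE pattern_hits (
            site_id INTEGER REFERENCES sites(site_id),
            pattern TEXT        -- hypothetical label for a detected pattern
        );
        """
    )
    conn.executemany(
        "INSERT INTO sites VALUES (?, ?, ?, ?)",
        [
            (1, "a.example", 1, 0),
            (2, "b.example", 2, 1),
            (3, "c.example", 3, 1),
            (4, "d.example", 4, 0),
        ],
    )
    conn.executemany(
        "INSERT INTO pattern_hits VALUES (?, ?)",
        [
            (1, "with-statement"),
            (2, "with-statement"),
            (2, "svg-filter"),
            (4, "with-statement"),
        ],
    )
    return conn


def count_sites_using(conn, pattern, top_n, rtl_only=False):
    """Count how many of the top_n ranked sites use the given pattern."""
    sql = """
        SELECT COUNT(DISTINCT s.site_id)
        FROM sites s
        JOIN pattern_hits p ON p.site_id = s.site_id
        WHERE p.pattern = ? AND s.rank <= ?
    """
    if rtl_only:
        # Restrict to sites that also do RTL layout.
        sql += " AND s.uses_rtl = 1"
    return conn.execute(sql, (pattern, top_n)).fetchone()[0]


if __name__ == "__main__":
    store = build_demo_store()
    print(count_sites_using(store, "with-statement", top_n=3))
    print(count_sites_using(store, "with-statement", top_n=3, rtl_only=True))
```

"Is pattern x used on the web?" then reduces to checking whether the count is non-zero for a sufficiently large `top_n`.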
Goals
- Crawl the top x websites
- Store JS/CSS/HTML/SVG from sites
- Provide a web tool for querying datastore
- Provide backend API access to data
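To make the "store JS/CSS/HTML/SVG" goal a little more concrete, the sketch below pulls inline script and style sources out of an already-fetched HTML page using only Python's standard `html.parser` module. It is an assumption about one possible extraction step, not the proposed design: the class and function names are hypothetical, fetching is left out entirely, and external resources referenced via `src` or `href` are not followed.

```python
from html.parser import HTMLParser


class InlineSourceCollector(HTMLParser):
    """Collect the text content of <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.sources = {"script": [], "style": []}
        self._current = None  # tag currently being collected, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.sources:
            self._current = tag
            self.sources[tag].append("")

    def handle_data(self, data):
        if self._current is not None:
            # Content may arrive in several chunks, so append to the last entry.
            self.sources[self._current][-1] += data

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None


def collect_inline_sources(html):
    """Return a dict mapping 'script' and 'style' to lists of inline sources."""
    parser = InlineSourceCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


if __name__ == "__main__":
    page = (
        "<html><head><style>p { color: red; }</style></head>"
        "<body><script>var x = 1;</script></body></html>"
    )
    print(collect_inline_sources(page))
```

A `<script src="...">` element with no body yields an empty string here; a real crawler would presumably fetch and store the referenced file as well.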
Non-Goals
- Write a search engine
Deadline/Deliverables
- None yet.
ATeam
We're thinking of reusing a large portion of the bughunter machinery for this, so bc is a good choice.
Dependencies
- Machines, lots of machines.
Major Tasks
- TBD