CloudServices/SimplePushServer/Testing Notes
Tools are written by the team, used by QA (load test, etc.)
Tools
Does the project need/have:
Unit Tests
- run in travis before merges
Smoke Tests
- Python smoke test, not yet tied to any trigger; meant to run against the dev cluster before stage
Regression Tests
Integration Tests
e2e Tests
Load Tests
- small load test, not yet tied to any trigger; meant to run against the dev cluster before stage
Performance Tests
How are deployments done?
travis builds PRs
- before merge
jenkins builds ‘dev’ (master)
- deploy to dev cluster
- run smoke and load and then move to stage
staging doesn’t update that regularly
will follow the standard jenkins deploy by mid-late march
- who?
no git event that is triggering stage
no testing at stage yet
nothing that triggers from stage to production
Does the project automatically execute the required test types on checkin via a build server like jenkins or travis?
Can/should each of the test types fail the build on failed runs?
- smoke test should fail
- load test shouldn’t unless there is a regression in percent failure
percent failure - the load test logs to influxdb, so the db can be queried to tell whether or not the test regressed
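A minimal sketch of the pass/fail rule described above, assuming the failure counts have already been pulled out of influxdb by some other means. The function names, the baseline value and the 1-point tolerance are all hypothetical, not something the team has agreed on:

```python
def failure_percent(failed, total):
    """Percent of load-test requests that failed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


def load_test_regressed(current_pct, baseline_pct, tolerance_pct=1.0):
    """True when the failure rate is worse than the baseline by more
    than the tolerance (tolerance is an assumed, tunable value)."""
    return current_pct > baseline_pct + tolerance_pct


def build_should_fail(smoke_passed, current_pct, baseline_pct, tolerance_pct=1.0):
    """Smoke test failures always fail the build; load test results
    only fail it when the percent failure has regressed."""
    if not smoke_passed:
        return True
    return load_test_regressed(current_pct, baseline_pct, tolerance_pct)


if __name__ == "__main__":
    # Hypothetical numbers: 30 failures out of 1000 vs. a 0.4% baseline.
    current = failure_percent(30, 1000)
    print(build_should_fail(True, current, baseline_pct=0.4))
```

The baseline would presumably come from an earlier run stored in the same db; how it is chosen is left open here.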
Are the test results visible in treeherder?
- no, just travis right now
For a specific build, does it require acceptance testing or manual sign-off?
- minimal need during qa sign off
- push could be e2e tested via FMD
- might not be needed once prod slow rollout is added (ensure up time for servers, rollback if needed)
maybe a smoke test in production to validate client
infrequent releases, is it worth doing automated production smoke testing?
production monitoring tests could be valuable
There are 3 projects built for ‘push’:
- simplepush - not updated regularly, will be legacy but currently used
- loop push - same code base as simplepush but different configs, used only for loop/hello
- web push - not happening for a while (late 2015?), the long term push solution
simplepush 1.5 will carry data, release to stage next week-ish
As of FF36 we will unthrottle loop, meaning we should see 3x the connection count from simple push and will need more servers
Have tested: 100k people connected, 1 notification per minute. The load test pushes 2,000 notifications per second, so it’s well covered. We have typically been seeing no more than tens of notifications per second.
Capacity: 160k connections per server, 1.6M across a 10-node cluster
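The capacity arithmetic above can be sketched as follows. The 160k-per-server figure is from these notes; the current connection count used in the example is a made-up placeholder, since the notes do not give one:

```python
PER_SERVER_CAPACITY = 160_000  # connections per server, from the notes


def cluster_capacity(nodes, per_server=PER_SERVER_CAPACITY):
    """Total connections a cluster of `nodes` servers can hold."""
    return nodes * per_server


def nodes_needed(connections, per_server=PER_SERVER_CAPACITY):
    """Smallest number of servers that can hold `connections`
    (integer ceiling division, no headroom included)."""
    if connections < 0:
        raise ValueError("connections must be non-negative")
    return -(-connections // per_server)


if __name__ == "__main__":
    print(cluster_capacity(10))           # capacity of the 10-node cluster
    hypothetical_current = 500_000        # placeholder, not a measured value
    print(nodes_needed(3 * hypothetical_current))  # after a 3x increase
```

This leaves out any safety margin; a real sizing would likely keep some headroom per server.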
Next Steps
- Automate load test via FMD, ticket creation and server validation for deployments
- Determine production monitoring and smoke tests