ReleaseEngineering/Buildduty/SVMeetings/Sept21-Sept25
Upcoming vacation/PTO:
- vlad - oct 16 - oct 20
- coop - sep 27 - oct 2 in Cluj-Napoca (approved!)
Meetings every Tuesday and Thursday
- Main Meetings Page: https://wiki.mozilla.org/ReleaseEngineering/Buildduty/SVMeetings
- https://wiki.mozilla.org/ReleaseEngineering/Buildduty/SVMeetings/Aug31-Sept4
- https://wiki.mozilla.org/ReleaseEngineering/Buildduty/SVMeetings/Sept7-Sept11
- https://wiki.mozilla.org/ReleaseEngineering/Buildduty/SVMeetings/Sept14-Sept18
- Orlando - Dec 7-11 - "Mozlando"
- [otilia] Started an etherpad - the agenda for next week https://etherpad.mozilla.org/romania-buildduty-agenda
2015-09-18 - 2015-09-22
[vlad]
1. https://bugzilla.mozilla.org/show_bug.cgi?id=1204153
increased the instance type from m1.medium to m3.medium
the loaner closed the bug; terminated the instance, removed the fqdn from inventory and removed the access from ldap
2. https://bugzilla.mozilla.org/show_bug.cgi?id=1204756
updated the emulator.csv patch
added
3. https://bugzilla.mozilla.org/show_bug.cgi?id=1158729
trying to find a solution to check the log files
[alin]
1. daily tasks:
re-imaged b-2008-ix-014, monitored jobs, marked the bug as resolved
terminated tst-linux64-ec2-boris.chiou, revoked VPN access, deleted inventory records
2. slaves t-w732-ix-001, t-w732-ix-117, t-w864-ix-158 and t-xp32-ix-030 are disabled due to graphics issues
investigated, noticed that the NVIDIA drivers are installed and working, also the resolution is the correct one
several jobs ended with a warning status as several inbound tests have failed
we also noticed errors or warnings related to WebGL:
t-w732-ix-001 -> Error: WebGL: WebGL creation is disabled, and so disallowed here.
t-w864-ix-158, t-w732-ix-117 -> Error: WebGL: compressedTexImage2D: Invalid format COMPRESSED_RGB_S3TC_DXT1_EXT: Requires that WEBGL_compressed_texture_s3tc is enabled
t-xp32-ix-030 -> Error: WebGL: WebGL creation is disabled, and so disallowed
-> Error: WebGL: Error during ANGLE OpenGL init -> Error: WebGL: Error during native OpenGL init.
Q: should we open DCOps tickets to investigate? Or are there any steps that we can do?
- DCOps would be a good idea for escalation
3. noticed a couple of l10n dep jobs that recently failed on bld-lion-r5 slaves:
Thunderbird comm-aurora macosx64 l10n dep
--> wget to http://ftp.mozilla.org/pub/mozilla.org/calendar/lightning/nightly/latest-comm-aurora/ does not find lightning-4.5a2.en-US.mac.xpi as it does not exist -> ERROR 404: Not Found
Firefox ash macosx64 l10n dep
--> wget to http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-ash/ does not find firefox-43.0a1.en-US.mac.dmg as it does not exist -> ERROR 404: Not Found.
we also checked those locations and can confirm that the requested files are missing
seems like a bad job setup and not a slave issue.
we asked in the #releng channel for additional info
UPDATE: opened bug 1207154.
4. https://bugzilla.mozilla.org/show_bug.cgi?id=1203128
managed to connect the t-yosemite-r7-0002 slave to the test master
trying to figure out how to run the jobs from the changeset having the id: fcef8ded8221
UPDATE: had little luck on this one, if you have any suggestions --> please feel free :)
- will look for script to run sendchange and send to you
https://github.com/armenzg/playground/tree/master/mozilla/scripts/sendchanges.py
Update it to reflect the changeset, platform and branch, as well as current_version; comment out the platforms you don't need; and set the fqdn and port of your master.
- look at master and look at twistd.log and look for sendchange to see if there are any errors
- Can't see sendchange in twistd.log. Where did you invoke the sendchange command?
--> will look tomorrow morning on this, didn't get the chance today
5. since yesterday, we received some alerts from Nagios like: buildbot-master66.bb.releng.usw2.mozilla.com:load is WARNING: WARNING - load average: 11.40, 11.77, 10.33
soon after that, the state comes back to normal: buildbot-master66.bb.releng.usw2.mozilla.com:load is OK: OK - load average: 0.22, 5.33, 8.22
Amy mentioned that a lot of git-remote-http seem to be running when the load goes up
we are currently trying to figure out what could be the cause for this.
this is the b2g_bumper script that runs on this master: https://wiki.mozilla.org/ReleaseEngineering/Applications/Bumper
https://bugzilla.mozilla.org/show_bug.cgi?id=1138234
UPDATE: opened bug 1207229
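To see how often and for how long the load on the master stays high, the Nagios messages from item 5 can be parsed into numbers. This is only a minimal sketch: the regular expression assumes alerts look exactly like the two messages quoted above, and the function name parse_load_alert is made up for illustration.

```python
import re

# Assumed alert format, taken from the two Nagios messages quoted in these notes:
#   <host>:load is <STATE>: <STATE> - load average: <1min>, <5min>, <15min>
ALERT_RE = re.compile(
    r"^(?P<host>\S+):load is (?P<state>\w+): "
    r"\w+ - load average: (?P<l1>[\d.]+), (?P<l5>[\d.]+), (?P<l15>[\d.]+)$"
)


def parse_load_alert(line):
    """Return (host, state, (load1, load5, load15)), or None if the line does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    loads = tuple(float(match.group(key)) for key in ("l1", "l5", "l15"))
    return match.group("host"), match.group("state"), loads


alert = ("buildbot-master66.bb.releng.usw2.mozilla.com:load is WARNING: "
         "WARNING - load average: 11.40, 11.77, 10.33")
host, state, loads = parse_load_alert(alert)
print(host, state, loads)
```

Collecting these tuples with their timestamps would make it easier to line the spikes up with the b2g_bumper runs mentioned above.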
2015-09-23 - 2015-09-24
[alin]: changeset: https://hg.mozilla.org/mozilla-central/rev/fcef8ded8221 http://dev-master2.bb.releng.use1.mozilla.com:8095/builders
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1442835574/
https://secure.pub.build.mozilla.org/buildapi/self-serve/mozilla-central
5:35:03 INFO - Running command: ['/tools/buildbot/bin/buildbot', 'sendchange', '--master', 'buildbot-master81.build.mozilla.org:9301', '--username', 'sendchange-unittest', '--branch', 'mozilla-central-macosx64-debug-unittest', '-r', '98231abe637676311a0df1e0a5da85c94adce36d', '--username', 'cbook@mozilla.com', '--comments', 'merge fx-team to mozilla-central a=merge', '--property', 'buildid:20150921043934', '--property', 'pgo_build:False', '--property', 'builduid:e6ef164671184f27bee245e892bb24c2', u'https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg', u'https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json']
- replace buildbot-master81.build.mozilla.org:9301 with your master
- remove the personal --username and the --comments
- remove the [ and ' characters
for example:
/tools/buildbot/bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:8095 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d --property buildid:20150921043934 --property pgo_build:False --property builduid:e6ef164671184f27bee245e892bb24c2 https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json
05:35:03 INFO - Copy/paste: /tools/buildbot/bin/buildbot sendchange --master buildbot-master81.build.mozilla.org:9301 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d --username cbook@mozilla.com --comments "merge fx-team to mozilla-central a=merge" --property buildid:20150921043934 --property pgo_build:False --property builduid:e6ef164671184f27bee245e892bb24c2 https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json
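The manual clean-up described above (swap the master, drop the personal username and the comments, strip the list syntax) could be scripted. The sketch below is not part of any releng tool: logged_list_to_command is a hypothetical helper, and it assumes that the first --username is the generic account and any later one is the personal one, as in the log line above.

```python
import ast
import shlex


def logged_list_to_command(logged, new_master):
    """Turn the list printed after 'Running command:' into a shell command line.

    Assumption: the first --username is the generic one to keep; later
    --username options and every --comments option are dropped.
    """
    args = ast.literal_eval(logged)  # parses the list literal without executing code
    out = []
    seen_username = False
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--comments":
            i += 2  # skip the flag and its value
            continue
        if arg == "--username":
            if seen_username:
                i += 2  # skip the second (personal) username and its value
                continue
            seen_username = True
        if arg == "--master":
            out.extend([arg, new_master])  # point the change at your own master
            i += 2
            continue
        out.append(arg)
        i += 1
    # shlex.quote only adds quotes where the shell needs them
    return " ".join(shlex.quote(a) for a in out)
```

Call it with the text between "Running command: " and the end of the log line, plus the host:port of your own master, and paste the result into a shell on the master.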
opt log
[vlad]
[alin]
1. https://bugzilla.mozilla.org/show_bug.cgi?id=1203128 I have several questions here, because some things seem odd (at least to me):
created another master and loaned 5 more slaves
added them to DB and connected the slaves to the master
Q1: /builds/buildbot/alin.selagea/test_yosemite-r7-2/master has 1800+ folders which seem to be jobs. From what I noticed, they get downloaded when starting the master ("make start"). What's their role? Do they represent the pool from which the master chooses a job and then triggers it to run on a slave?
http://dev-master2.bb.releng.use1.mozilla.com:8096/builders
Q2: tried to invoke sendchange with a command like this:
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:8096 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest -r f1dffc8682fbba463cb4bb305f293ddcccbc20b4 --property buildid:20150923152817 --property pgo_build:False --property builduid:a88bf99db731404093dcef56fcddbe89
--> waited around 1 hour, nothing happened, interrupted the command
--> changed the port to 9096 -> received "change sent successfully", but no job was triggered on the slaves
Q3: we ran the command above from dev-master2.bb.releng.use1.mozilla.com -> first we ran "/builds/buildbot/alin.selagea/test_yosemite-r7-2/bin/activate" -> is this the right approach?
"Connected to dev-master2.bb.releng.use1.mozilla.com:9096; slave is ready"
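For Q2, the earlier suggestion was to look in the master's twistd.log for sendchange entries. A small sketch of that search follows; the helper names are hypothetical, and it makes no assumption about the log format beyond a case-insensitive substring match.

```python
def find_sendchange_lines(log_lines, needle="sendchange"):
    """Yield (line_number, text) for every log line that mentions the needle."""
    wanted = needle.lower()
    for number, line in enumerate(log_lines, start=1):
        if wanted in line.lower():
            yield number, line.rstrip("\n")


def scan_file(path, needle="sendchange"):
    """Search one log file; errors='replace' keeps odd bytes from aborting the scan."""
    with open(path, errors="replace") as handle:
        return list(find_sendchange_lines(handle, needle))


# Usage, with the path adjusted to your own master directory:
#   for number, text in scan_file("twistd.log"):
#       print(number, text)
```

If the search returns nothing after a sendchange was reported as sent, that suggests the change reached a different master or port than the one being inspected.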
From Coop:
start thinking about topics you will want me to go into in more depth next week while I'm onsite
will send email with trip details today
https://etherpad.mozilla.org/romania-buildduty-agenda
1. https://bugzilla.mozilla.org/show_bug.cgi?id=1208074
Created a loan request for a yosemite-r5 slave so that I can run some tests with my master
2. Attached the following slaves to my dev-master via slavealloc
t-yosemite-r7-0001
t-yosemite-r5-0107
I was able to push changes by running the following commands:
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443089138/firefox-44.0a1.en-US.mac.dmg
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1443089138/firefox-44.0a1.en-US.mac64.dmg
Ran both commands on dev-master2; we will wait for the jobs to finish and then check their status
3. https://github.com/armenzg/playground/blob/master/mozilla/scripts/sendchanges.py
We will update the script so that it works correctly
Glad you got the sendchanges working for talos.
I noticed that some of the talos tests are failing because these machines don't exist in the graphserver. Example: http://dev-master2.bb.releng.use1.mozilla.com:8050/builders/Rev5%20MacOSX%20Yosemite%2010.10%20mozilla-central%20talos%20other/builds/0/steps/run_script/logs/stdio
You can fix this by writing a patch to add them. Here is an example: https://bugzilla.mozilla.org/show_bug.cgi?id=1125919
Your platform etc. will be different since we will be adding a new platform for the r7 machines.
[Vlad] In order to add the yosemite-r7 slaves in DB we need the credentials from private/passwords/graphs.txt.gpg
Here is a sendchange I ran for debug tests
bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json
This is the one I ran for opt
bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d --property buildid:20150921043934 --property pgo_build:False --property builduid:e6ef164671184f27bee245e892bb24c2 https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/firefox-43.0a1.en-US.mac.dmg https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/test_packages.json
You should see a lot of jobs pending on the master now, hopefully many of them will run overnight
2015-09-25
[alin]
1. enabled talos tests on test_yosemite_r7 master (1 slave attached):
bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9095 --username sendchange --branch mozilla-central-macosx64-talos --revision f1dffc8682fbba463cb4bb305f293ddcccbc20b4 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/firefox-44.0a1.en-US.mac.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/test_packages.json
--> tests are all orange because the slave has not been added to the graph server
2. enabled opt tests on test_yosemite-r7-2 master (5 slaves attached)
bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9096 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest --revision f1dffc8682fbba463cb4bb305f293ddcccbc20b4 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/firefox-44.0a1.en-US.mac.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/test_packages.json
--> most tests are green
--> web-platform tests are orange, and some mochitest jobs are also orange
3. added patches for graph server, slavealloc and buildbot-configs to bug 1203128
4. Run the commands below to start jobs:
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443089138/firefox-44.0a1.en-US.mac.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443089138/test_packages.json
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1443089138/firefox-44.0a1.en-US.mac64.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1443089138/test_packages.json
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest --revision 001942e4617b https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json
buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest --revision 001942e4617b https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/firefox-43.0a1.en-US.mac.dmg https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/test_packages.json
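The four commands above differ only in username, branch and file URLs, so they can be generated instead of edited by hand. The following is a rough sketch, not the sendchanges.py script linked earlier: sendchange_command and as_shell are hypothetical helpers, and they use only the options that already appear in these notes (--master, --username, --branch, --revision).

```python
import shlex


def sendchange_command(master, username, branch, revision, files, buildbot="buildbot"):
    """Build the argument list for one `buildbot sendchange` call."""
    args = [
        buildbot, "sendchange",
        "--master", master,
        "--username", username,
        "--branch", branch,
        "--revision", revision,
    ]
    args.extend(files)  # installer URL first, then test_packages.json
    return args


def as_shell(args):
    """Render an argument list as a copy/paste-able shell line."""
    return " ".join(shlex.quote(a) for a in args)


MASTER = "dev-master2.bb.releng.use1.mozilla.com:9050"
BUILD_DIR = ("http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/"
             "mozilla-central-macosx64/1443089138/")

talos = sendchange_command(
    MASTER, "sendchange", "mozilla-central-macosx64-talos", "001942e4617b",
    [BUILD_DIR + "firefox-44.0a1.en-US.mac.dmg", BUILD_DIR + "test_packages.json"],
)
print(as_shell(talos))
```

Keeping the command as a list also means it could be passed to subprocess.run(talos, check=True) on the master host instead of being pasted into a shell.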