ReleaseEngineering/Buildduty/SVMeetings/Sept21-Sept25


Upcoming vacation/PTO:

  • vlad - oct16 - oct20
  • coop - sep 27 - oct 2 in Cluj-Napoca (approved!)

Meetings every Tuesday and Thursday

2015-09-18 - 2015-09-22

[vlad]

1. https://bugzilla.mozilla.org/show_bug.cgi?id=1204153

   increased the instance type from m1.medium to m3.medium
   the loaner closed the bug; terminated the instance, removed the fqdn from inventory and removed the access from ldap

2. https://bugzilla.mozilla.org/show_bug.cgi?id=1204756

   updated the emulator.csv patch
   added 

3. https://bugzilla.mozilla.org/show_bug.cgi?id=1158729

   trying to find a solution to check the log files 


[alin]

1. daily tasks:

   re-imaged b-2008-ix-014, monitored jobs, marked the bug as resolved
   terminated tst-linux64-ec2-boris.chiou, revoked VPN access, deleted inventory records


2. slaves t-w732-ix-001, t-w732-ix-117, t-w864-ix-158 and t-xp32-ix-030 are disabled due to graphics issues

   investigated, noticed that the NVIDIA drivers are installed and working, also the resolution is the correct one
   several jobs ended with a warning status as several inbound tests have failed
   we also noticed errors or warnings related to WebGL:
   t-w732-ix-001 -> Error: WebGL: WebGL creation is disabled, and so disallowed here.
   t-w864-ix-158, t-w732-ix-117 -> Error: WebGL: compressedTexImage2D: Invalid format COMPRESSED_RGB_S3TC_DXT1_EXT: Requires that WEBGL_compressed_texture_s3tc is enabled
   t-xp32-ix-030 -> Error: WebGL: WebGL creation is disabled, and so disallowed
                        -> Error: WebGL: Error during ANGLE OpenGL init
                        -> Error: WebGL: Error during native OpenGL init.

Q: should we open DCOps tickets to investigate? Or are there any steps that we can take ourselves?

  • DCOps would be a good idea for escalation

3. noticed a couple of l10n dep jobs that recently failed on bld-lion-r5 slaves:

   Thunderbird comm-aurora macosx64 l10n dep
   --> wget to http://ftp.mozilla.org/pub/mozilla.org/calendar/lightning/nightly/latest-comm-aurora/ does not find lightning-4.5a2.en-US.mac.xpi as it does not exist -> ERROR 404: Not Found
   Firefox ash macosx64 l10n dep
   --> wget to http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-ash/ does not find firefox-43.0a1.en-US.mac.dmg as it does not exist -> ERROR 404: Not Found. 
   we also checked those locations and can confirm that the requested files are missing
   seems like a bad job setup and not a slave issue.
   we asked the guys from #releng channel for additional info

UPDATE: opened bug 1207154.

4. https://bugzilla.mozilla.org/show_bug.cgi?id=1203128

   managed to connect the t-yosemite-r7-0002 slave to the test master
   trying to figure out how to run the jobs from the changeset having the id: fcef8ded8221 

UPDATE: had little luck on this one, if you have any suggestions --> please feel free :)

  • will look for script to run sendchange and send to you

https://github.com/armenzg/playground/tree/master/mozilla/scripts/sendchanges.py

  • update it to reflect the changeset, platform and branch, as well as current_version
  • comment out the platforms you don't need
  • update it to reflect the fqdn and port of your master

  • look at master and look at twistd.log and look for sendchange to see if there are any errors
  • Can't see sendchange in twistd.log. Where did you invoke the sendchange command?

--> will look tomorrow morning on this, didn't get the chance today
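
The twistd.log check suggested above can be sketched in a few lines of Python. This is only a sketch: it assumes nothing about the log format beyond "sendchange" appearing somewhere in the relevant lines, and the function name and the path in the usage note are hypothetical.

```python
def find_sendchange_lines(log_path, needle="sendchange"):
    """Return (line_number, text) pairs for log lines that mention needle.

    The match is a plain case-insensitive substring test, so it makes no
    assumption about how buildbot formats its twistd.log entries.
    """
    needle = needle.lower()
    hits = []
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if needle in line.lower():
                hits.append((number, line.rstrip("\n")))
    return hits
```

For example, find_sendchange_lines("master/twistd.log") run from the master's base directory; an empty list would match the "can't see sendchange in twistd.log" observation and suggests the change never reached this master.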

5. since yesterday, we received some alerts from Nagios like: buildbot-master66.bb.releng.usw2.mozilla.com:load is WARNING: WARNING - load average: 11.40, 11.77, 10.33

   soon after that, the state comes back to normal:  buildbot-master66.bb.releng.usw2.mozilla.com:load is OK: OK - load average: 0.22, 5.33, 8.22
   Amy mentioned that a lot of git-remote-http  seem to be running when the load goes up
   we are currently trying to figure out what could be the cause for this.
   this is the b2g_bumper script that runs on this master: https://wiki.mozilla.org/ReleaseEngineering/Applications/Bumper
   https://bugzilla.mozilla.org/show_bug.cgi?id=1138234

UPDATE: opened bug 1207229
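
To compare several of these alerts, the load figures can be pulled out of the Nagios text. A minimal sketch, assuming the alerts always follow the "host:load is STATE: ... load average: a, b, c" shape of the two messages quoted above; LoadAlert, parse_load_alert and is_falling are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class LoadAlert(NamedTuple):
    host: str
    state: str
    load1: float   # 1-minute load average
    load5: float   # 5-minute load average
    load15: float  # 15-minute load average


# Pattern modelled on the two alert lines in these notes (an assumption).
_ALERT_RE = re.compile(
    r"(?P<host>\S+):load is (?P<state>\w+):.*?"
    r"load average:\s*(?P<l1>[\d.]+),\s*(?P<l5>[\d.]+),\s*(?P<l15>[\d.]+)"
)


def parse_load_alert(text: str) -> Optional[LoadAlert]:
    """Extract host, state and the three load averages, or None if absent."""
    match = _ALERT_RE.search(text)
    if match is None:
        return None
    return LoadAlert(
        match["host"],
        match["state"],
        float(match["l1"]),
        float(match["l5"]),
        float(match["l15"]),
    )


def is_falling(alert: LoadAlert) -> bool:
    """True when the 1-minute figure is below the 15-minute one."""
    return alert.load1 < alert.load15
```

In the OK message the 1-minute value (0.22) is far below the 15-minute value (8.22), so the load had only just dropped; that fits a short burst of work rather than a sustained load.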


2015-09-23 - 2015-09-24

[alin]

changeset: https://hg.mozilla.org/mozilla-central/rev/fcef8ded8221

http://dev-master2.bb.releng.use1.mozilla.com:8095/builders

http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1442835574/

http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1442835574/?C=S;O=A

https://secure.pub.build.mozilla.org/buildapi/self-serve/mozilla-central

http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1442835574/mozilla-central-macosx64-debug-bm86-build1-build280.txt.gz


05:35:03 INFO - Running command: ['/tools/buildbot/bin/buildbot', 'sendchange', '--master', 'buildbot-master81.build.mozilla.org:9301', '--username', 'sendchange-unittest', '--branch', 'mozilla-central-macosx64-debug-unittest', '-r', '98231abe637676311a0df1e0a5da85c94adce36d', '--username', 'cbook@mozilla.com', '--comments', 'merge fx-team to mozilla-central a=merge', '--property', 'buildid:20150921043934', '--property', 'pgo_build:False', '--property', 'builduid:e6ef164671184f27bee245e892bb24c2', u'https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg', u'https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json']

To reuse the logged command:

  • replace buildbot-master81.build.mozilla.org:9301 with your master
  • remove the personal --username and the --comments
  • remove the list punctuation ([, ], ' and the commas)

for example:

/tools/buildbot/bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:8095 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d --property buildid:20150921043934 --property pgo_build:False --property builduid:e6ef164671184f27bee245e892bb24c2 https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json
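
The manual clean-up described above can also be scripted. The sketch below is an assumption-laden helper, not part of any existing tool: rewrite_sendchange is a hypothetical name, it expects the "Running command: [...]" log line format shown above, and it treats any --username value containing "@" as the personal one to drop.

```python
import ast
import shlex


def rewrite_sendchange(log_line, new_master):
    """Turn a logged "Running command: [...]" line into a reusable command.

    Swaps the --master value, drops --comments, and drops any --username
    whose value looks like an e-mail address (heuristic, an assumption).
    """
    # The argv list starts at the first "[" and is a valid Python literal,
    # including the u'...' prefixes left by the Python 2 logger.
    argv = ast.literal_eval(log_line[log_line.index("["):])

    cleaned = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        value = argv[i + 1] if i + 1 < len(argv) else None
        if arg == "--master" and value is not None:
            cleaned += [arg, new_master]
            i += 2
        elif arg == "--comments" and value is not None:
            i += 2  # skip the flag and its text
        elif arg == "--username" and value is not None and "@" in value:
            i += 2  # skip the committer's personal username
        else:
            cleaned.append(arg)
            i += 1
    # shlex.join quotes only the arguments that need it.
    return shlex.join(cleaned)
```

Called with the logged line and "dev-master2.bb.releng.use1.mozilla.com:8095", this should yield a command equivalent to the hand-edited example.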

05:35:03 INFO - Copy/paste: /tools/buildbot/bin/buildbot sendchange --master buildbot-master81.build.mozilla.org:9301 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d --username cbook@mozilla.com --comments "merge fx-team to mozilla-central a=merge" --property buildid:20150921043934 --property pgo_build:False --property builduid:e6ef164671184f27bee245e892bb24c2 https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json


opt log

http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1442835574/mozilla-central-macosx64-bm86-build1-build242.txt.gz

https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/test_packages.json

[vlad]


[alin]

1. https://bugzilla.mozilla.org/show_bug.cgi?id=1203128

   I have several questions here, because some things seem odd (at least to me):

   created another master and loaned 5 more slaves
   added them to DB and connected the slaves to the master


Q1: /builds/buildbot/alin.selagea/test_yosemite-r7-2/master has 1800+ folders which seem to be jobs. From what I noticed, they get downloaded when starting the master ("make start"). What's their role? Do they represent the pool from which the master chooses a job and then triggers it to run on a slave?

http://dev-master2.bb.releng.use1.mozilla.com:8096/builders

Q2: tried to invoke sendchange with a command like this:

   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:8096 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest -r f1dffc8682fbba463cb4bb305f293ddcccbc20b4 --property buildid:20150923152817 --property pgo_build:False --property builduid:a88bf99db731404093dcef56fcddbe89

   --> waited around 1 hour, nothing happened, interrupted the script
   --> changed the port to 9096 -> received "change sent successfully", but nothing happened on the slaves; no job was triggered on them

Q3: we ran the command above from dev-master2.bb.releng.use1.mozilla.com -> first we ran "/builds/buildbot/alin.selagea/test_yosemite-r7-2/bin/activate" -> is this the right approach?

"Connected to dev-master2.bb.releng.use1.mozilla.com:9096; slave is ready"

From Coop:

   start thinking about topics you will want me to go into in more depth next week while I'm onsite
   will send email with trip details today

https://etherpad.mozilla.org/romania-buildduty-agenda

1. https://bugzilla.mozilla.org/show_bug.cgi?id=1208074

   Created a loan request for a yosemite-r5 slave for myself, in order to run some tests with my master

2. Attached the following slaves to my dev-master via slavealloc

   t-yosemite-r7-0001
   t-yosemite-r5-0107

I was able to push changes by running the following commands:

   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b   http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443089138/firefox-44.0a1.en-US.mac.dmg


   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b   http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1443089138/firefox-44.0a1.en-US.mac64.dmg

Ran both commands on dev-master2; we will wait for the jobs to finish and then check their status.

3. https://github.com/armenzg/playground/blob/master/mozilla/scripts/sendchanges.py

   We will update the script so that it works correctly


Glad you got the sendchanges working for talos. I noticed that some of the talos tests are failing because these machines don't exist in the graphserver. Example:

http://dev-master2.bb.releng.use1.mozilla.com:8050/builders/Rev5%20MacOSX%20Yosemite%2010.10%20mozilla-central%20talos%20other/builds/0/steps/run_script/logs/stdio

You can fix this by writing a patch to add them. Here is an example:

https://bugzilla.mozilla.org/show_bug.cgi?id=1125919

Your platform etc. will be different, since we will be adding a new platform for the r7 machines.

[Vlad] In order to add the yosemite-r7 slaves to the DB we need the credentials from private/passwords/graphs.txt.gpg


Here is a sendchange I ran for debug tests:

   bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json

This is the one I ran for opt:

   bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest -r 98231abe637676311a0df1e0a5da85c94adce36d --property buildid:20150921043934 --property pgo_build:False --property builduid:e6ef164671184f27bee245e892bb24c2 https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/firefox-43.0a1.en-US.mac.dmg https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/test_packages.json

You should see a lot of jobs pending on the master now, hopefully many of them will run overnight

2015-09-25

[alin]

1. enabled talos tests on test_yosemite_r7 master (1 slave attached):

   bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9095 --username sendchange --branch mozilla-central-macosx64-talos --revision f1dffc8682fbba463cb4bb305f293ddcccbc20b4 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/firefox-44.0a1.en-US.mac.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/test_packages.json

   --> tests are all orange because the slave has not been added to the graph server

2. enabled opt tests on test_yosemite-r7-2 master (5 slaves attached):

   bin/buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9096 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest --revision f1dffc8682fbba463cb4bb305f293ddcccbc20b4 http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/firefox-44.0a1.en-US.mac.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443047297/test_packages.json

   --> most tests are green
   --> web-platform tests are orange, and some mochitest jobs are orange too

3. added patches for graph server, slavealloc and buildbot-configs to the bug: 1203128

4. Run the commands below to start jobs:

   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b   http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443089138/firefox-44.0a1.en-US.mac.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1443089138/test_packages.json


   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange --branch mozilla-central-macosx64-talos --revision 001942e4617b   http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1443089138/firefox-44.0a1.en-US.mac64.dmg http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64-debug/1443089138/test_packages.json


   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-debug-unittest --revision 001942e4617b   https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/firefox-43.0a1.en-US.mac64.dmg https://queue.taskcluster.net/v1/task/ZQgrK2KiSDGIeGUWJFnurg/artifacts/public/build/test_packages.json


   buildbot sendchange --master dev-master2.bb.releng.use1.mozilla.com:9050 --username sendchange-unittest --branch mozilla-central-macosx64-opt-unittest --revision 001942e4617b   https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/firefox-43.0a1.en-US.mac.dmg https://queue.taskcluster.net/v1/task/pDvjIdDURD67-4n1_ZNVPg/artifacts/public/build/test_packages.json
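
The four commands above differ only in username, branch and file URLs, so they could be generated rather than typed by hand. A minimal sketch, assuming only the flags already used in these notes; sendchange_command is a hypothetical helper and not part of buildbot or of sendchanges.py.

```python
import shlex


def sendchange_command(master, username, branch, revision, files,
                       buildbot="buildbot"):
    """Build one `buildbot sendchange` command line as a shell-safe string.

    files is the list of URLs (installer, test_packages.json, ...) that is
    appended after the options, as in the commands in these notes.
    """
    argv = [
        buildbot, "sendchange",
        "--master", master,
        "--username", username,
        "--branch", branch,
        "--revision", revision,
        *files,
    ]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(argv)
```

Looping over (username, branch, files) tuples with a fixed master and revision would then print one command per job type, which makes it harder to mix up a talos branch with a debug installer URL.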