User:Hsinyi/Gecko engineering
<Disclaimer: This document is a draft and a collection of my personal thoughts on Gecko engineering practices. Discussion and comments are welcome!>
What should I work on?
We want to deliver a high-quality product to our users and bring them new value. This won't happen without collaboration across various teams and our community. Therefore, when we think about what we should work on and with which priority, we also consider how to unblock and help others. Here is a list (in descending order of priority) to show others how our work is prioritized; it is not necessarily a rule for strictly prioritizing the work.
- Catastrophic bugs
- Bugs that block development/testing, may impact more than 25% of users, cause data loss, are potential chemspills, and have no workaround available. Examples include but are not limited to:
- sec-critical security bugs
- Sev-blocker or release-tracking+ bugs
- Permanent test failures that block branch merging
- New regressions: Not every regression is a catastrophic issue, but we should look into them quickly to assess the impact and possible solutions in a timely manner.
- Review and needinfo requests to unblock others
- A browser is a complex system. No one knows every single piece, nor can anyone figure out every single piece quickly. Only through collaborative teamwork are we able to deliver a satisfying product and bring new value to our users in a timely manner. (Yes, TIMELY again!) It undoubtedly takes time and effort to build a high-quality product, and I am not saying we should sacrifice quality. What I am trying to say is that we have to acknowledge the time-to-market impact and how the life cycle of a bug affects our users, keep that in mind, and let it guide our work. With that in mind, unblocking others is key to our success.
- A proper, thorough, high-quality review may take time, and we should take the time needed to ensure quality. Communication is nevertheless key here. If it will take longer to review or respond, communicate. If you are not sure about the priority, communicate. And if it's a simple enough review or question (even when the priority isn't high), just do it and help get things done.
- Serious bugs
- Bugs that leave major functionality or the product severely impaired. If there is a satisfactory workaround, for example flipping a preference and explaining it to our users so they can live with it, we should do that. sec-high security bugs are one example in this category, as are top crashes. Again, keep an eye on regressions.
- Special programs e.g. WFH bugs, q-flow bugs
- To react to real-world situations, priorities can change for a given period of time. COVID-19 is an example of how people's working, social, and daily lives were heavily impacted and changed for a certain period. Some old issues may become much more severe because of this.
- Important partner and webcompat bug fixes
- Bug fixes
- New features
- Again, though new features are listed at the bottom of the list, it doesn't mean that new features always strictly have the lowest priority compared with other engineering tasks. Nevertheless, there are often more important things than new feature implementations that we need to work on as well.
TODO: Add some notes on tests and documentation.
Bug triage
Why are we doing triage?
- Ensuring we don't miss important bugs and that timely actions are taken
What counts as being triaged?
- According to the latest triage practice, the severity field is set.
What are the actions to take?
- Paying particular attention to regressions
- Paying particular attention to bugs impacting top websites
- When setting needinfo on someone, be clear about what help is needed to move the bug forward
- Responding to needinfo requests and not collecting ASSIGNED bugs
- Also, the rotation duty for Core: General triage
References:
Backlog management presentation
Code review
What is the purpose of code review
“Code review is our basic mechanism for validating the design and implementation of patches. It also helps us maintain a level of consistency in design and implementation practices across the many hackers and among the various modules of Mozilla.” 1
Additionally, knowledge sharing is another major advantage.
What does the code review process look like
A critical thing to remember is that coding is, to some extent, a social process. The code review process may start before an actual patch and an actual review request are submitted. Communication is key. Before you start implementing, it is very important and helpful to discuss your implementation plan with the module peers, especially when you are working on something new to you or on a new feature. Because documentation can be lacking or out of date, talking to other team members may save you weeks of needless work. Walking reviewers through your thinking also saves their time and allows them to give early feedback.
With this thought in mind, the code review process actually includes the following steps.
Initial communication between requesters & reviewers
- Mutual understanding of the priority and the expected timeline
- As a requester, set a clear expectation. For a new feature or big patch, it’s especially important to plan in advance.
- As a reviewer, set a clear expectation about your availability, too.
- Check reviewer’s availability before/when submitting a patch to avoid needless waiting time
- Reviewers should ensure that their Phabricator calendars are up to date so that requesters receive warnings when a request is submitted to a reviewer who is not available.
- Reviewers should ensure their Bugzilla user name displays out-of-office information if they are away for more than 2 days.
- Reviewers should consider blocking incoming Bugzilla needinfo/review requests if they are away for more than 2 days.
- Architecture design documents help! Examples:
Creating a patch & testing
Commit messages
“Changesets should have commit messages. The commit message should describe not just the "what" of the change but also the "why". This is especially true in cases when the "what" is obvious from the diff anyway; for larger changes it makes sense to have a summary of the "what" in the commit message.
As a specific example, if your diff is a one-line change that changes a method call argument from "true" to "false", having a commit message that says "change argument to mymethod from true to false" is not very helpful at all. A good commit message in this situation will at least mention the meaning of the argument. If that does not make it clear why the change is being made, the commit message should explain the "why".” 2
- Describe WHY in addition to WHAT
- Describe the change, not just the symptom. Bad example: "Bug XYZ: Crash at Foo::Bar" (a fuller example follows this list)
- For WPTs which will be upstreamed, consider how the commit message will read outside the m-c repository, and at least mention which features the tests are intending to cover.
- For example, a commit message like "Bug 1234 - Part 2: Tests" is not totally terrible in the context of Mozilla Central (though it could be better), but "Part 2: Tests" is utterly meaningless upstream.
- Reviewers should feel comfortable rejecting a review request if the commit message isn't well written or doesn't provide useful information.
- [Open question] People have various preferences on the length of commit messages … We need more discussions and a much better style guide, explaining what makes a good commit message and what makes a good and descriptive bug, with a number of (good and bad) examples.
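To make the guidance above concrete, here is one hedged sketch contrasting the two styles, building on the true-to-false example quoted above; the bug number, method name, and behavior described are made up purely for illustration:

  Bad:
    Bug 123456 - Change argument to mymethod from true to false

  Better:
    Bug 123456 - Stop eagerly flushing layout in mymethod. r=reviewer

    Passing false defers the flush to the next refresh tick, which avoids
    janking the main thread when many nodes are appended in a row.

The second message explains the meaning of the argument and the reason for the change, so it stays useful once the diff alone is no longer self-explanatory.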
Patch scope & size
It's arguable whether the size of a patch matters much, and optimizing for "small changes vs. big diffs" is hard. But a good principle to keep is: seek to understand (understand the code, implement the change), and seek to be understood (present the change in an understandable form). Normally, smaller is better.
An example for arranging patches in sequence to help reviewers: https://bugzilla.mozilla.org/show_bug.cgi?id=1615581
Always stick to *the* real problem you want to fix. Be conscious about not adding unrelated fixes to the patch, even if they are super tiny.
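As a hedged sketch of what such a sequence can look like (the part titles are invented for illustration; see the bug linked above for a real case):

  Part 1: Rename Foo to Bar (mechanical, no behavior change)
  Part 2: Extract the retry logic into a standalone helper
  Part 3: Implement the new backoff behavior on top of the helper
  Part 4: Add automated tests for the new behavior

Each part should build and pass tests on its own, so a reviewer can reason about one concern at a time.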
Tests
- ALWAYS pursue this target - land code changes along with automated tests. There are exceptions or cases where testing isn’t possible using current testing frameworks. However, always ensure that you’ve thought about automated tests and you have a strong reason why no automated tests are needed or why landing without automated tests is acceptable.
- There are cases when testing isn't possible using current testing frameworks: some parts of drag-and-drop (DnD), clipboard, printing, etc.
- Performance improvements can often rely on existing performance tests. Or the patch might improve some very micro-level issue, and adding a performance test for it might not be worth it.
- Scheduling changes should get tested by overall "do tests still pass" and performance tests (but we do need better tests for responsiveness etc.)
- Scheduling changes for things like CC and GC are tested basically by looking at telemetry data.
- Web platform tests (WPT) have helped us a lot, including with cross-browser behavior. WPT should always be the default testing framework where possible.
- Write comments! Comments in tests are particularly useful. They force the patch author to think about which cases are covered, and reviewers can easily see it too (see the sketch after this list for an illustration).
- Always ensure that a patch is well-tested and self-reviewed before it's submitted for review.
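As a minimal sketch of what a commented automated test can look like, here is a GTest-style unit test. The helper and the regression scenario in the comments are hypothetical, and for web-observable behavior a WPT would be the better default:

  #include "gtest/gtest.h"

  // Hypothetical helper under test; name and behavior are illustrative only.
  static int ClampToZero(int aValue) { return aValue < 0 ? 0 : aValue; }

  // State which case is covered and why it matters, so reviewers (and future
  // readers) can tell at a glance what the test guards against.
  TEST(ClampToZero, NegativeInputIsClampedToZero)
  {
    // Hypothetical regression guard: negative sizes used to propagate further.
    EXPECT_EQ(ClampToZero(-5), 0);
  }

  TEST(ClampToZero, PositiveInputIsReturnedUnchanged)
  {
    EXPECT_EQ(ClampToZero(42), 42);
  }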
Getting reviews and during reviews
Remember, communication is key! A code review is a discussion, not a dictation. Patch authors are encouraged to voice opinions that differ from the reviewer's, and not every review comment has to be taken, as long as there is sensible discussion and reasoning about why a given choice is better for code quality and for our users.
Respect each other's time as you would your own. While we expect reviewers to get back to patch authors within 2 days, it is also important that patch authors allocate time to address review comments within the same time frame. This keeps the discussion momentum going while memory is still fresh. It is not good practice for an author to submit a patch for review right before going on vacation without proper communication in advance, because reviewers may need explanation or elaboration from the author while conducting a proper review, and that won't work if the author is away.
Checklist
When requesting a code review
First thing to remember: you are responsible for the code you write, for fixing the issue, and for the quality of the code base after your code lands.
When doing a code review
- Check mozilla-standard-positions for new WebAPIs; if in doubt, file an issue to request Mozilla’s official position on that.
- Refer to ExposureGuidelines for web observable changes, including sending intent-to-* emails or creating new preferences (a sketch of a feature-gating preference entry follows this checklist).
- Be cautious about follow-up suggestions. Whenever there is a "follow-up", think of it as work that is never going to happen, except perhaps in one case: when the follow-up blocks some meta bug for which fixing all the blockers is required. For example, the Fission meta bug has such blockers.
- Functionality
- Life cycle management
- Performance
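As one illustration of the "creating new preferences" point above: in Gecko, an unfinished or newly web-observable feature is commonly gated behind a preference, for example with an entry in modules/libpref/init/StaticPrefList.yaml. This is only a sketch with a hypothetical pref name, not a recipe for any particular feature:

  # Hypothetical pref gating an unfinished, web-observable feature.
  - name: dom.example_feature.enabled
    type: bool
    value: false
    mirror: always

Keeping the feature disabled by default lets the implementation land incrementally while the relevant intent-to-* communication happens.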
Release health and product quality
- Communication with release management (e.g. channel meeting)
- Release metrics tracking
- Regression Engineering Owner (REO)
- Security bulletin report
- Test coverage - are we doing well enough to ensure a test is added along with each fix?
- 72% of the bugs with updated RCA had no tests added along with the fix according to the RCA phase I results
- Paying attention to these Slack channels, as many Mozillians often report Firefox problems there
- firefox
- web-platform
- webcompat
- moco
- planning
- triage-and-tracking