Closed Bug 525063 Opened 15 years ago Closed 5 years ago

Analysis to produce an error on uninitialized class members

Categories

(Developer Infrastructure :: Source Code Analysis, enhancement)

x86
Linux
enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: taras.mozilla, Assigned: andi)

References

Details

(Keywords: csectype-uninitialized, sec-audit)

Attachments

(13 files, 83 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), application/x-tar
Details
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), application/x-perl
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
      No description provided.
In particular, class members which are not ininitized in the class initializer list and are also not set in the class constructor. This will require an ESP analysis, I think, since there are some cases where we do:

Foo::Foo()
{
  if (something)
    mA = 1;
  else
    mA = 2;
}

I'd like this to be an error not a warning if we can manage it. Perhaps with a NS_UNINITIALIZED annotation if necessary for the truly few cases where we do want to leave a member uninitialized.
...and also where the class doesn't have a custom ::operator new, I assume.
You mean a memset(0) operator new? That's a special case we should think about, yeah.
I was thinking of checking more bluntly to avoid false positives, but sure, something more specific, maybe some form of annotation, is all cool too.
Assignee: nobody → ehren.m
Assignee: ehren.m → nobody
Attached file Shows error for uninitialized class members (obsolete) (deleted) —
This is Varsha from SJCE,Mysore.This javascript detects uninitialized class members and shows error.
Attachment #444602 - Flags: feedback?
Varsha, when you run this on the tree what kind of results do you find?
Assignee: nobody → varsha.j.hegde
Attached file Bug Result (obsolete) (deleted) —
I have attached the result. Please give feedback
Is that result just for js/src?

This is a false-positive:

/home/hegde/src/js/src/jsemit.h:272: error: In class js::detail::HashTable<js::HashMap<JSAtom*, int, js::DefaultHasher<JSAtom*>, js::ContextAllocPolicy>::Entry, js::HashMap<JSAtom*, int, js::DefaultHasher<JSAtom*>, js::ContextAllocPolicy>::MapHashPolicy, js::ContextAllocPolicy> uninitialized field JSTreeContext::decls

JSTreeContext::decls is not a primitive or POD type, it is a JSAtomList which has a constructor. (ditto for lexdeps, and a lot of the stuff in jstracer.h)

/home/hegde/src/js/src/jsemit.h:279: error: In class js::detail::HashTable<js::HashMap<JSAtom*, int, js::DefaultHasher<JSAtom*>, js::ContextAllocPolicy>::Entry, js::HashMap<JSAtom*, int, js::DefaultHasher<JSAtom*>, js::ContextAllocPolicy>::MapHashPolicy, js::ContextAllocPolicy> uninitialized field D_24149

This is also incorrect: the constructor initializes scopeChain(NULL).
(In reply to Jeff Muizelaar [:jrmuizel] from comment #11)
> Created attachment 8674308 [details] [diff] [review]
> A sketch of what this could look like

Some problems with this are:
1. It may look at all fields (including base classes) when it should only look at a class's own fields
2. It doesn't handle fields that are default constructable
3. Maybe something else that I forget.
Assignee: varsha.j.hegde → ehsan
Attached patch clang.patch (obsolete) (deleted) — Splinter Review
Attachment #8677557 - Flags: review?(ehsan)
Assignee: ehsan → amarchesini
can I still this bug? I have a patch for it.
Attached patch part 1 - clang plugin (obsolete) (deleted) — Splinter Review
better approach.
Attachment #8677557 - Attachment is obsolete: true
Attachment #8677557 - Flags: review?(ehsan)
Attachment #8677602 - Flags: feedback?(ehsan)
Attached patch part 2 - MOZ_IGNORE_INITIALIZATION (obsolete) (deleted) — Splinter Review
This second patch is useful to mark variables that we know are correctly initialized (ipdl stuff for instance).
Attachment #8677975 - Flags: review?(ehsan)
Attachment #8677602 - Attachment description: clang.patch → part 1 - clang plugin
Attachment #8677602 - Flags: feedback?(ehsan) → review?(ehsan)
Comment on attachment 8677602 [details] [diff] [review]
part 1 - clang plugin

Review of attachment 8677602 [details] [diff] [review]:
-----------------------------------------------------------------

This is a great start!

::: build/clang-plugin/clang-plugin.cpp
@@ +1490,5 @@
> +  DiagnosticsEngine &Diag = Result.Context->getDiagnostics();
> +
> +  unsigned nonCTORID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +      DiagnosticIDs::Warning,
> +      "%0 should be initialized in an explicit CTOR for %1");

Nit: constructor.

@@ +1495,5 @@
> +
> +  const CXXRecordDecl *decl = Result.Nodes.getNodeAs<CXXRecordDecl>("decl");
> +
> +  // We care just about classes at the moment.
> +  if (decl->getTagKind() != TTK_Class) {

See below.

@@ +1511,5 @@
> +    // We have an implicit CTOR.
> +    if (decl->ctor_begin() == decl->ctor_end()) {
> +      Diag.Report(decl->getLocation(), nonCTORID) << *field << decl;
> +      continue;
> +    }

I'm not quite sure why you need to do this separately.

If the compiler doesn't end up generating an implicit constructor, there is nothing to check!  For a case like:

class Foo {
  int field;
};
void f() {
  Foo f; // Make the compiler generate the ctor
}

you'll find a CXXConstructorDecl that would return true from isImplicit().

Please unify the two checkers.

@@ +1520,5 @@
> +    const MatchFinder::MatchResult &Result) {
> +  DiagnosticsEngine &Diag = Result.Context->getDiagnostics();
> +  unsigned nonInitializedMemberID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +      DiagnosticIDs::Warning,
> +      "%0 is not correctly initialialized in the CTOR of %1");

Nit: constructor

@@ +1527,5 @@
> +      Result.Nodes.getNodeAs<CXXConstructorDecl>("ctor");
> +
> +  // No system headers.
> +  FullSourceLoc loc(ctor->getLocation(), *Result.SourceManager);
> +  if (loc.isInSystemHeader()) {

We usually try to do as much of the filtering on the AST as we can in the matcher.  You can replace this here with unless(isInSystemHeader()) in the matcher.

@@ +1543,5 @@
> +  }
> +
> +  const CXXRecordDecl *decl = ctor->getParent();
> +  if (!decl) {
> +    return;

I think it's better to create a custom matcher for these three checks, similar to isInterestingImplicitCtor(), for example.

@@ +1547,5 @@
> +    return;
> +  }
> +
> +  // We care just about classes at the moment.
> +  if (decl->getTagKind() != TTK_Class) {

Hmm, why?  classes and structs are not different in any way.

If you're trying to cut down the number of things to fix right now, I think it would be better to look for non-POD types first.

@@ +1551,5 @@
> +  if (decl->getTagKind() != TTK_Class) {
> +    return;
> +  }
> +
> +  // For each field, if they are buildinType and or if they are pointer...

Nit: builtin.

@@ +1553,5 @@
> +  }
> +
> +  // For each field, if they are buildinType and or if they are pointer...
> +  CXXRecordDecl::field_iterator field = decl->field_begin(), fieldEnd = decl->field_end();
> +  for (; field != fieldEnd; ++field) {

Nit: please use a range-based for loop, like:

for (auto& field : decl->fields())

@@ +1555,5 @@
> +  // For each field, if they are buildinType and or if they are pointer...
> +  CXXRecordDecl::field_iterator field = decl->field_begin(), fieldEnd = decl->field_end();
> +  for (; field != fieldEnd; ++field) {
> +    QualType type = field->getType();
> +    if (!type->isBuiltinType() && !type->isPointerType()) {

Please add a comment explaining that it's OK to ignore reference types here since they are mandated to be initialized by the language.

@@ +1561,5 @@
> +    }
> +
> +    Stmt* body = ctor->getBody();
> +    if (!body) {
> +      continue;

I think this should be hoisted outside of the loop.

@@ +1566,5 @@
> +    }
> +
> +    // Check if this member is initialized.
> +    CXXConstructorDecl::init_const_iterator init = ctor->init_begin(), initEnd = ctor->init_end();
> +    for (; init != initEnd; ++init) {

Nit: Please use a range-based for loop.

@@ +1579,5 @@
> +    }
> +
> +    // Maybe it's initialized in the body.
> +    Stmt::child_iterator child = body->child_begin(), childEnd = body->child_end();
> +    for (; child != childEnd; ++child) {

Ditto.

@@ +1585,5 @@
> +      if (!isa<BinaryOperator>(*child)) {
> +        continue;
> +      }
> +
> +      BinaryOperator* binOp = dyn_cast<BinaryOperator>(*child);

You don't need the isa check above separately, you can just dyn_cast_or_null() and check the result against null, which should be more efficient.

@@ +1586,5 @@
> +        continue;
> +      }
> +
> +      BinaryOperator* binOp = dyn_cast<BinaryOperator>(*child);
> +      if (binOp->getOpcode() != BO_Assign) {

What about cases like this?

class Foo {
  char str[100];
  Foo() {
    memset(str, 0, sizeof(str));
  }
};

I think you may want to look for MemberExpr's that are lvalue instead of looking for assignments.  What do you think?

@@ +1609,5 @@
> +        break;
> +      }
> +    }
> +
> +    if (child != childEnd) {

With this implementation strategy, you end up doing all of the work for each field.  I think we can be more efficient by collecting all fields in an unordered_set or some such, and then do this work only once, for each member that we find that is correctly initialzied, we'd remove it from the set.  At the end, we can iterate over the set and print out the diagnostic for any remaining fields.  What do you think?
Attachment #8677602 - Flags: review?(ehsan) → feedback+
Comment on attachment 8677975 [details] [diff] [review]
part 2 - MOZ_IGNORE_INITIALIZATION

Review of attachment 8677975 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/Attributes.h
@@ +515,5 @@
>   * MOZ_INHERIT_TYPE_ANNOTATIONS_FROM_TEMPLATE_ARGS: Applies to template class
>   *   declarations where an instance of the template should be considered, for
>   *   static analysis purposes, to inherit any type annotations (such as
>   *   MOZ_MUST_USE and MOZ_STACK_CLASS) from its template arguments.
> + * MOZ_IGNORE_INITIALIZATION: Skip the CTOR initialization check for this

Nit: constructor.
Attachment #8677975 - Flags: review?(ehsan) → review+
Two more things: please issue an error instead of a warning.  Also, please run your patches through clang-format.  Thanks!
Closing the bug since bug 1217756 is marked as security and one can easily find the information in that bug by looking at these patches...
Group: dom-core-security
> @@ +1527,5 @@
> > +      Result.Nodes.getNodeAs<CXXConstructorDecl>("ctor");
> > +
> > +  // No system headers.
> > +  FullSourceLoc loc(ctor->getLocation(), *Result.SourceManager);
> > +  if (loc.isInSystemHeader()) {
> 
> We usually try to do as much of the filtering on the AST as we can in the
> matcher.  You can replace this here with unless(isInSystemHeader()) in the
> matcher.

I tried, but I didn't figure out how to use this unless().

> I think it's better to create a custom matcher for these three checks,
> similar to isInterestingImplicitCtor(), for example.

any particular reason?

> Hmm, why?  classes and structs are not different in any way.

Yes. step by step. Just with this simple code we have 11 patches (and more coming) of fixing...


> > +  // For each field, if they are buildinType and or if they are pointer...
> > +  CXXRecordDecl::field_iterator field = decl->field_begin(), fieldEnd = decl->field_end();
> > +  for (; field != fieldEnd; ++field) {
> 
> Nit: please use a range-based for loop, like:

If I do, then I have to use an external variable to know if the loop is finished or not.


> What about cases like this?
> 
> class Foo {
>   char str[100];
>   Foo() {
>     memset(str, 0, sizeof(str));
>   }
> };

Not supported yet. Follow up?

> With this implementation strategy, you end up doing all of the work for each
> field.  I think we can be more efficient by collecting all fields in an
> unordered_set or some such, and then do this work only once, for each member
> that we find that is correctly initialzied, we'd remove it from the set.  At
> the end, we can iterate over the set and print out the diagnostic for any
> remaining fields.  What do you think?

This method is called when a new constructor is found. There is just 1 loop between all the fields.
But maybe I didn't get what you are suggesting.

The reason why it's warning only, is because I want to fix the main issues, then, when this code is 'stable' enough, we can enable it in error more. How does it sound?
Attached patch part 1 - clang plugin (obsolete) (deleted) — Splinter Review
Attachment #8674308 - Attachment is obsolete: true
Attachment #8677602 - Attachment is obsolete: true
Attachment #8678171 - Flags: review?(ehsan)
(In reply to Andrea Marchesini (:baku) from comment #21)
> > @@ +1527,5 @@
> > > +      Result.Nodes.getNodeAs<CXXConstructorDecl>("ctor");
> > > +
> > > +  // No system headers.
> > > +  FullSourceLoc loc(ctor->getLocation(), *Result.SourceManager);
> > > +  if (loc.isInSystemHeader()) {
> > 
> > We usually try to do as much of the filtering on the AST as we can in the
> > matcher.  You can replace this here with unless(isInSystemHeader()) in the
> > matcher.
> 
> I tried, but I didn't figure out how to use this unless().

As we talked on IRC, I'd like to know what broke.

> > I think it's better to create a custom matcher for these three checks,
> > similar to isInterestingImplicitCtor(), for example.
> 
> any particular reason?

To make it possible to look at the matcher and figure out what the checker acts on without having to read the code for the checker.

> > Hmm, why?  classes and structs are not different in any way.
> 
> Yes. step by step. Just with this simple code we have 11 patches (and more
> coming) of fixing...

My point is that singling out structs is completely arbitrary.  I don't want to land the analysis in such a shape.  It's OK to do it step by step and land the dependencies before we land the analysis, but the thing that we land in this bug needs to make sense to developers.  Making a struct not go through these checks is not a good solution.

In the other cases where I have landed analyses that break all sorts of things, I've had to land individual fixes for a few weeks before the analysis was ready to land.

> > > +  // For each field, if they are buildinType and or if they are pointer...
> > > +  CXXRecordDecl::field_iterator field = decl->field_begin(), fieldEnd = decl->field_end();
> > > +  for (; field != fieldEnd; ++field) {
> > 
> > Nit: please use a range-based for loop, like:
> 
> If I do, then I have to use an external variable to know if the loop is
> finished or not.

Yes.  :-)  Still better than writing iterators manually.

> > What about cases like this?
> > 
> > class Foo {
> >   char str[100];
> >   Foo() {
> >     memset(str, 0, sizeof(str));
> >   }
> > };
> 
> Not supported yet. Follow up?

Well, not doing this sort of things results in false positives.  Like I said above, I would like the analysis that lands in this bug to be as free from false positives as we can in the cases that it handles.

The other case you need to handle is:

class Foo {
  int field;
  Foo() {
    Init(&field);
  }
};

> > With this implementation strategy, you end up doing all of the work for each
> > field.  I think we can be more efficient by collecting all fields in an
> > unordered_set or some such, and then do this work only once, for each member
> > that we find that is correctly initialzied, we'd remove it from the set.  At
> > the end, we can iterate over the set and print out the diagnostic for any
> > remaining fields.  What do you think?
> 
> This method is called when a new constructor is found. There is just 1 loop
> between all the fields.
> But maybe I didn't get what you are suggesting.

My point was that we're doing things such as looking at the initializer list and the body of the ctor once per field.  Basically what you have now is O(n*(m+k)), n being the number of fields, m the number of initializers and k the number of statements in the ctor body.  I'm asking for a O(n+m+k) solution.

> The reason why it's warning only, is because I want to fix the main issues,
> then, when this code is 'stable' enough, we can enable it in error more. How
> does it sound?

The problem with doing that is that in some directories we treat warnings as errors, so this will break the build in some code but not others.  I really think it's better to not be in a hurry to land this, and fix the stuff it finds and land the analysis once we are happy with it!
This should be an interesting read: <http://www.viva64.com/en/b/0354/>
Attached patch clang-plugin-patch1.diff (obsolete) (deleted) — Splinter Review
Attached patch clang-plugin-patch2.diff (obsolete) (deleted) — Splinter Review
This is WIP!!
the order in witch the patches must be applied is self explanatory: patch 2 must be applied after patch 1

Should add support for:
1. vectors, pointers
2. initialization in function where field is passed by reference or address.
3. initialization of structs field by field would be nice.

Now we have all what Andrea did plus support for scenarios like:

class A
{
  int p;
  int x;

  void f() {
    p = 10;
    memset(&x, 0, sizeof(x));
  }

public:
  A() {
    f();
  }
};
Attachment #8727464 - Flags: feedback?(amarchesini)
Attached patch clang-plugin-patch2.diff (obsolete) (deleted) — Splinter Review
Attachment #8727464 - Attachment is obsolete: true
Attachment #8727464 - Flags: feedback?(amarchesini)
Attachment #8727752 - Flags: feedback?(amarchesini)
Attached patch clang-plugin-patch2.diff (obsolete) (deleted) — Splinter Review
Attachment #8727752 - Attachment is obsolete: true
Attachment #8727752 - Flags: feedback?(amarchesini)
Attachment #8727774 - Flags: feedback?(amarchesini)
Attachment #8678171 - Flags: review?(ehsan)
Attached patch clang-plugin.diff (obsolete) (deleted) — Splinter Review
Added support for
1. pass by address and reference
2. look recursively in each function that gets executed from constructor in order to search for attributions. If function is not member search in it only if it's has in it's signature variables that are passed through reference or address, if it's member search in it no matter what.

Till now we have support for the following scenario:

>>class Test {
>>  int a;
>>  int b;
>>  int c;
>>  int d = 2;
>>  int *e;
>>  int *f;
>>  int g;
>>  int h;
>>
>>  void InitPtrByPtr(int **var) {
>>    *var = nullptr;
>>  }
>>  void InitPtr(int *&var) {
>>    var = nullptr;
>>  }
>>
>>  void InitByMemset(int *var) {
>>    memset(var, 0, sizeof(int));
>>  }
>>
>>  void InitByMemset(int& var) {
>>    memset(&var, 0, sizeof(var));
>>  }
>>
>>  void InitByRef(int& var) {
>>    var = 2;
>>  }
>>  int Init(int *var1, int& var2, int& var3, int *&var4, int **var5, int& var6) {
>>    *var1 = 2;
>>    InitByRef(var2);
>>    InitByMemset(var3);
>>    InitPtr(var4);
>>    InitPtrByPtr(var5);
>>    InitByMemset(&var6);
>>  }
>>  Test() {
>>    Init(&a, b, c, e, &f, g);
>>  }
>>};
Attachment #8727454 - Attachment is obsolete: true
Attachment #8727774 - Attachment is obsolete: true
Attachment #8727774 - Flags: feedback?(amarchesini)
Attachment #8728896 - Flags: feedback?(amarchesini)
Attached patch clang-plugin-uninitialized-members-check.patch (obsolete) (deleted) — Splinter Review
Added support for binary operator = CallExpr from RHS like:

>>int f(int& o) {
>>  o = 2;
>>  return 1;   
>>}
>>class Test {
>>int a;
>>  Test() {
      int result;
>>    result = f(a);
>>  }
>>}

This sort of expression is used when function f allocated something and return the success factor.
Attachment #8728896 - Attachment is obsolete: true
Attachment #8728896 - Flags: feedback?(amarchesini)
Attachment #8729568 - Flags: feedback?(amarchesini)
Comment on attachment 8729568 [details] [diff] [review]
clang-plugin-uninitialized-members-check.patch

Review of attachment 8729568 [details] [diff] [review]:
-----------------------------------------------------------------

When ready, send a review request to ehsan directly. Thanks!

::: build/clang-plugin/clang-plugin.cpp
@@ +147,5 @@
> +  public:
> +    virtual void run(const MatchFinder::MatchResult &Result);
> +  private:
> +    void runThroughBody(Stmt* body,
> +        unordered_map<string, bool>& variablesMap,

strange indentation.

@@ +1089,5 @@
>            .bind("node"),
>        &refCountedCopyConstructorChecker);
> +
> +  astMatcher.addMatcher(recordDecl().bind("class"),
> +                          &nonInitializedMemberChecker);

indentation

@@ +1549,5 @@
> +{
> +  auto varFromMap = resolverMap.find(varName);
> +  if (varFromMap != resolverMap.end()) {
> +    // variable found in resolve map that means we must use it's corespondent
> +    // -> second in variablesMap

Right, but you are not using variablesMap at all in this method.

@@ +1645,5 @@
> +  // resolveMap
> +  bool isValid = false;
> +
> +  if (!method) {
> +    Log("Cannot get FunctionDecl from callExpr");

return; ?

@@ +1656,5 @@
> +        param->getType()->isReferenceType()) {
> +
> +      Expr *argument = callExpr->getArg(i);
> +      if (!argument) {
> +        Log("Cannot get Expr from ArgList");

return; ?

or MOZ_ASSERT?

@@ +1715,5 @@
> +    }
> +    varFromExpr = memberExpr->getMemberDecl()->getName();
> +    updateVarMap(variablesMap, varFromExpr);
> +    return;
> +  } else {

no }else{ after a return.

@@ +1775,5 @@
> +    unordered_map<string, string>& resolverMap)
> +{
> +  static struct externFuncDefinition {
> +    string funcName;
> +    uint8_t  indexInArgList;

2 spaces before indexInArgList;

@@ +1778,5 @@
> +    string funcName;
> +    uint8_t  indexInArgList;
> +  } funcSchemaArray[] = {
> +      // for memset
> +      {"memset", 0}

sometimes you indent with 4 spaces, sometimes 2.

@@ +1873,5 @@
> +        // in StdC like memset
> +        runThroughDefaultFunctions(funcCall, variablesMap, resolverMap);
> +      }
> +    } else {
> +    }

else what? :)

@@ +1904,5 @@
> +    return;
> +  }
> +
> +  // No system headers.
> +  FullSourceLoc loc(decl->getLocation(), *Result.SourceManager);

ehsan suggested to move this check in the definition of the matcher.

@@ +1917,5 @@
> +
> +  // For each field, if they are builtinType and or if they are pointer add
> +  // to table
> +  for (auto field = decl->field_begin(); field != decl->field_end(); ++field) {
> +    // Ignore fields if MOZ_IGNORE_INITIALIZATION is present.

add a test for this, if we don't have it yet.

::: build/clang-plugin/tests/moz.build
@@ +34,5 @@
>  ]
>  
>  DISABLE_STL_WRAPPING = True
>  NO_VISIBILITY_FLAGS = True
> +#FINAL_LIBRARY = 'xul-gtest'

Get rid of this
Attachment #8729568 - Flags: feedback?(amarchesini) → feedback+
Attached patch clang-plugin-uninitialized-members.diff (obsolete) (deleted) — Splinter Review
Attachment #8729568 - Attachment is obsolete: true
Attachment #8730613 - Flags: review?(ehsan)
Andi is taking the lead here.

Ehsan, do you have an eta for the review? (no pressure, just to organize our work)
Assignee: amarchesini → bogdan.postelnicu
Flags: needinfo?(ehsan)
Sorry, I'm currently focusing most of my attention on a different project, so I'm afraid my reviews will starve for a while...

Michael, do you have cycles for this review?
Flags: needinfo?(ehsan) → needinfo?(michael)
Comment on attachment 8730613 [details] [diff] [review]
clang-plugin-uninitialized-members.diff

Review of attachment 8730613 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/clang-plugin/clang-plugin.cpp
@@ +15,5 @@
>  #include "llvm/ADT/DenseMap.h"
>  #include "llvm/Support/FileSystem.h"
>  #include "llvm/Support/Path.h"
>  #include <memory>
> +#include <unordered_map>

Most code in the plugin uses DenseMap<> from llvm rather than std::unordered_map.

@@ +781,5 @@
>  }
> +
> +AST_MATCHER(CXXRecordDecl, isNotInSystemHeader) {
> +  const CXXRecordDecl *decl = Node.getCanonicalDecl();
> +   // No system headers.

Indentation seems weird here

@@ +1081,5 @@
>            .bind("node"),
>        &refCountedCopyConstructorChecker);
> +
> +  astMatcher.addMatcher(cxxRecordDecl(isNotInSystemHeader(), isClassType())
> +      .bind("class"), &nonInitializedMemberChecker);

Why are you treating class X {} differently from struct X {}. Aren't they basically the same except that one has private-by-default members and the other has public-by-default members? I would think this analysis should be based on whether the class/struct has a constructor rather than what keyword was used to define it.

@@ +1556,5 @@
> +      return false;
> +    }
> +    varName =  memberExpr->getMemberDecl()->getName();
> +    return true;
> +  } else {

no need for } else { after a return true.

@@ +1557,5 @@
> +    }
> +    varName =  memberExpr->getMemberDecl()->getName();
> +    return true;
> +  } else {
> +    // case 2 - normal variable declaration

I think normal variable reference would be a better wording

@@ +1577,5 @@
> +        if (!varExp) {
> +          return false;
> +        }
> +        return getVarNameFromExprWithThisCheck(varExp, varName);
> +      } else {

Probably use Expr::IgnoreImplicit() or Expr::IgnoreImpCasts() before the branches.

@@ +1611,5 @@
> +  FunctionDecl *method = callExpr->getDirectCallee();
> +  // optimization for extern function to limit their call only if their,
> +  // arguments are passed by address or reference and are already referenced in
> +  // resolveMap
> +  bool isValid = false;

isValid seems like a strange name for this variable. It seems more like it's a variable to determine if a parameter passed could initialize a value we're interested in. Maybe isInteresting or similar?

@@ +1625,5 @@
> +        param->getType()->isReferenceType()) {
> +
> +      Expr *argument = callExpr->getArg(i);
> +      if (!argument) {
> +        return false;

This seems like a strange return false based on an interaction specific value.

@@ +1672,5 @@
> +      return;
> +    }
> +    varFromExpr = memberExpr->getMemberDecl()->getName();
> +    updateVarMap(variablesMap, varFromExpr);
> +  } else {

This function feels like it could be flattened a lot by some early returns.

@@ +1696,5 @@
> +        Expr *varExp = unaryOp->getSubExpr();
> +        if (!varExp) {
> +          return;
> +        }
> +        return checkValueDecl(varExp, variablesMap, resolverMap);

Returning a `void` value feels weird.

@@ +1701,5 @@
> +      } else {
> +        ImplicitCastExpr *implicitCast =
> +          dyn_cast_or_null<ImplicitCastExpr>(expr);
> +        if (implicitCast) {
> +          Expr *varExpr = implicitCast->getSubExpr();

Probably use Expr::IgnoreImplicit() or Expr::IgnoreImpCasts() before the branches.

@@ +1748,5 @@
> +}
> +
> +void DiagnosticsMatcher::NonInitializedMemberChecker::evaluateExpression(
> +  Stmt *stmtExpr, unordered_map<string, bool>& variablesMap,
> +  unordered_map<string, string>& resolverMap) {

You probably want to strip any implicit expressions before looking at the type of the expression AST node.

@@ +1797,5 @@
> +          }
> +        }
> +      } else {
> +        // we are dealing with some default function that are implemented
> +        // in StdC like memset

We could also be looking at some function which is declared in a different cpp file.

@@ +1807,5 @@
> +void DiagnosticsMatcher::NonInitializedMemberChecker::runThroughBody(
> +  Stmt *body, unordered_map<string, bool>& variablesMap,
> +  unordered_map<string, string>& resolverMap) {
> +
> +  for (auto child = body->child_begin(); child != body->child_end();

for (auto child : body->children()) {

@@ +1830,5 @@
> +  }
> +
> +  // For each field, if they are builtinType and or if they are pointer add
> +  // to table
> +  for (auto field = decl->field_begin(); field != decl->field_end(); ++field) {

for (auto field : decl->fields()) {

@@ +1848,5 @@
> +  }
> +
> +  // loop through all constructors and search for initializations or
> +  // attributions
> +  for (auto firstCstr = decl->ctor_begin(); firstCstr != decl->ctor_end();

for (auto firstCstr : decl->ctors()) {

@@ +1862,5 @@
> +    if (!body) {
> +      continue;
> +    }
> +    // first look to see if it's a C++11 declaration Ex. int a = 2 or a(2)
> +    for (auto init = ctor->init_begin(); init != ctor->init_end(); init++ ){

I think you can use `for (auto init : ctor->inits()) {` which might read nicer

::: build/clang-plugin/tests/TestMemberInitialization.cpp
@@ +18,5 @@
> +  int *e;
> +  int *f;
> +  int g;
> +  int h;
> +  int x; // should trigger error for not being initialized

Add a expected-error {{XXX}} comment here so that the clang tests can detect this error.

@@ +50,5 @@
> +
> +    return 3;
> +  }
> +  Test() {
> +    h = Init(&a, b, c, e, &f, g);

Try making sure that it also works with expressions which construct temporaries with destructors. They create a expression node around the statement to run the destructor, and could cause things to break.
Attachment #8730613 - Flags: review?(ehsan) → review-
Flags: needinfo?(michael)
So i did the modification suggested by you, except i kept using the unordered_map as we've discussed on irc this is a better approach than using the DenseMap

(In reply to Michael Layzell [:mystor] from comment #35) 
> @@ +50,5 @@
> > +
> > +    return 3;
> > +  }
> > +  Test() {
> > +    h = Init(&a, b, c, e, &f, g);
> 
> Try making sure that it also works with expressions which construct
> temporaries with destructors. They create a expression node around the
> statement to run the destructor, and could cause things to break.

Can you give me an example of this case?

After we clarify this i will re-update the patch.
Flags: needinfo?(michael)
(In reply to Andi-Bogdan Postelnicu from comment #36)
> So i did the modification suggested by you, except i kept using the
> unordered_map as we've discussed on irc this is a better approach than using
> the DenseMap
> 
> (In reply to Michael Layzell [:mystor] from comment #35) 
> > @@ +50,5 @@
> > > +
> > > +    return 3;
> > > +  }
> > > +  Test() {
> > > +    h = Init(&a, b, c, e, &f, g);
> > 
> > Try making sure that it also works with expressions which construct
> > temporaries with destructors. They create a expression node around the
> > statement to run the destructor, and could cause things to break.
> 
> Can you give me an example of this case?
> 
> After we clarify this i will re-update the patch.

An example would be:

struct Obj {
  ~Obj();
};

int do_something_with_obj(const Obj& o);

struct Test {
  int h;
  Test() {
    h = do_something_with_obj(Obj());
  }
}

The statement containing the assignment is wrapped with another node in the AST to represent the execution of the destructor for the temporary Obj() object. I believe that that could potentially mess up your patch, though I'm not sure. At the least I think it should be in the test.
Flags: needinfo?(michael)
Attached patch clang-plugin-uninitialized-members.diff (obsolete) (deleted) — Splinter Review
I've checked your example and it's supported, i've also added it in the test header.
Regarding the analysis of functions that are in other cpps, when they get called we have AST nodes but empty bodies, they are resolved during linkage. As a first step i think that we can address this issue later on in a separate bug.
Attachment #8730613 - Attachment is obsolete: true
Attachment #8740394 - Flags: review?(michael)
Attached patch clang-plugin-uninitialized-members.diff (obsolete) (deleted) — Splinter Review
I've checked your example and it's supported, i've also added it in the test header.
Regarding the analysis of functions that are in other cpps, when they get called we have AST nodes but empty bodies, they are resolved during linkage. As a first step i think that we can address this issue later on in a separate bug.
Attachment #8740394 - Attachment is obsolete: true
Attachment #8740394 - Flags: review?(michael)
Attachment #8740407 - Flags: review?(michael)
Comment on attachment 8740407 [details] [diff] [review]
clang-plugin-uninitialized-members.diff

Review of attachment 8740407 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM other than the minor comments below.

::: build/clang-plugin/clang-plugin.cpp
@@ +1729,5 @@
> +  Stmt *stmtExpr, unordered_map<string, bool>& variablesMap,
> +  unordered_map<string, string>& resolverMap) {
> +  stmtExpr = stmtExpr->IgnoreImplicit();
> +  BinaryOperator *binOp = dyn_cast_or_null<BinaryOperator>(stmtExpr);
> +  if ( binOp && binOp->getOpcode() == BO_Assign) {

weird spacing

@@ +1749,5 @@
> +      FunctionDecl *method = funcCall->getDirectCallee();
> +      if (!method) {
> +        return;
> +      }
> +      if ( method->hasBody() ) {

no space around condition

@@ +1769,5 @@
> +          CXXMemberCallExpr *memberFuncCall =
> +              dyn_cast_or_null<CXXMemberCallExpr>(stmtExpr);
> +
> +          if (memberFuncCall && isa<CXXThisExpr>(
> +              memberFuncCall->getImplicitObjectArgument())) {

I suppose it's fairly uncommon for a member to be initialized by a call to a method on another object, so we don't necessarily need to support that.

@@ +1856,5 @@
> +
> +    // look through the map for unmarked variables and pop error message, also
> +    // reset the ones that are set for the following ctor to add it
> +    for(auto varIt =  variablesMap.begin(); varIt != variablesMap.end();
> +        varIt++) {

Can you not do `for (auto& varIt : variablesMap)` or something similar?

::: build/clang-plugin/tests/TestNoDuplicateRefCntMember.cpp
@@ +1,5 @@
>  class C1 {};
>  
>  class RC1 {
>  public:
> +  RC1() : mRefCnt(0) {}

I don't think these should be necessary, as we ignore classes & structs with no constructors now.

::: build/clang-plugin/tests/TestNonMemMovableStd.cpp
@@ +10,5 @@
>  typedef basic_string<char> string;
> +template<class T, class U>
> +class pair {
> +public:
> +  pair() {} // expected-warning-re 3 {{'{{(.*)}}' is not correctly initialialized in the constructor of 'pair'}}

Probably not necessary as we ignore classes without constructors now.

Also, I'm not sure how expected-warning-re is matching the error produced in clang-plugin.cpp. Can you make sure that this test is being run?

::: build/clang-plugin/tests/TestRefCountedCopyConstructor.cpp
@@ +1,4 @@
>  // Implicit copy construct which is unused
>  class RC1 {
> +public:
> +  RC1() : mRefCnt(0) {}

We ignore classes without constructors now.

::: mfbt/Attributes.h
@@ +526,5 @@
>   *   declarations where an instance of the template should be considered, for
>   *   static analysis purposes, to inherit any type annotations (such as
>   *   MOZ_MUST_USE and MOZ_STACK_CLASS) from its template arguments.
> + * MOZ_IGNORE_INITIALIZATION: Skip the constructor initialization check for this
> + *   member.

This could probably use a bit more explanation, such as some information about what the constructor initialization check is.
Attachment #8740407 - Flags: review?(michael) → review+
Comment on attachment 8756288 [details] [diff] [review]
clang-plugin-uninitialized-members-update1.diff

Review of attachment 8756288 [details] [diff] [review]:
-----------------------------------------------------------------

I'd like to see the patch after new tests are added for fields with more interesting types :).

::: build/clang-plugin/clang-plugin.cpp
@@ +1900,5 @@
> +
> +        if (ctr != ctor) {
> +          ctor = ctr;
> +          break;
> +        }

This seems wrong to me. Shouldn't it be more along the lines of:

// If we aren't the definition, look through the redecls for the definition
if (!ctor->isThisDeclarationADefinition()) {
  for (auto ctorFromList : ctor->redecls()) {
    if (CXXConstructorDecl *newCtor = dyn_cast<CXXConstructorDecl>(ctorFromList)) {
      if (newCtor->isThisDeclarationADefinition()) {
        ctor = newCtor;
        break;
      }
    }
  }
}

// If we didn't find a definition, skip this declaration.
if (!ctor->isThisDeclarationADefinition()) {
  continue;
}

::: build/clang-plugin/tests/TestMemberInitialization.cpp
@@ +1,1 @@
> +#include "TestMemberInitialization.h"

You shouldn't need to split this into two files. The behavior should be the same whether it is in one file or two

@@ +5,5 @@
> +}
> +
> +void Test::InitPtrByPtr(int **var) {
> +  *var = 0;
> +}

Consistent whitespace after this line

::: build/clang-plugin/tests/TestMemberInitialization.h
@@ +1,1 @@
> +

Unnecessary whitespace at start of file.

@@ +18,5 @@
> +int do_something_with_obj(const Obj& o);
> +
> +void InitByRef(int& var);
> +
> +class Test {

Can you have another test where you use the struct keyword, but produce an error?

e.g.
struct T {
  int x; // expected-error {{...}}
  
  T() {}
};

@@ +27,5 @@
> +  int *e;
> +  int *f;
> +  int g;
> +  int h;
> +  int x; // expected-error {{'x' is not correctly initialialized in the constructor of 'Test'}}

What happens if I have a field

SomeType t;

which I don't explicitly initialize in the constructor, but is not a primitive C-style type? This should have a test. What if it's a struct type like TestStruct?
Attachment #8756288 - Flags: review?(michael)
Comment on attachment 8756283 [details] [diff] [review]
clang-plugin-uninitialized-members.diff

Review of attachment 8756283 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/clang-plugin/clang-plugin.cpp
@@ +21,5 @@
>  #define CLANG_VERSION_FULL (CLANG_VERSION_MAJOR * 100 + CLANG_VERSION_MINOR)
>  
>  using namespace llvm;
>  using namespace clang;
> +using namespace std;

Uses of std:: in this file have mostly been prefixed. I would prefer it if we didn't using namespace std in clang-plugin.

@@ +1625,5 @@
> +    unordered_map<string, string>& resolverMap,
> +    unordered_map<string, string>& newResolverMap) {
> +
> +  // use callExp to see what it needs to be pushed by address or reference,
> +  // if we are in any of these 2 situation lookup variable in both resolveMap

situation => situations

@@ +1631,5 @@
> +  // of passed variables directly from variablesMap
> +  // It's costly but in this way we can avoid a push/ pop algorithm where
> +  // at the beggining of the function we would have pushed the needed variables
> +  // from resoveMap to newResolveMap and at the end of the function we would
> +  // have poped them back thus needing to have more map for tracking results

poped => popped

::: ipc/ipdl/ipdl/cxx/cgen.py
@@ +115,5 @@
>          else:
>              d.type.accept(self)
>  
>          if d.name:
> +            self.write(' MOZ_IGNORE_INITIALIZATION '+ d.name)

I'm not sure I can review changes to ipdl codegen, You should get someone else to review this change.

::: mfbt/Attributes.h
@@ +527,5 @@
>   *   declarations where an instance of the template should be considered, for
>   *   static analysis purposes, to inherit any type annotations (such as
>   *   MOZ_MUST_USE_TYPE and MOZ_STACK_CLASS) from its template arguments.
> + * MOZ_IGNORE_INITIALIZATION: Skip the constructor initialization check for this
> + *   member.

Can we have a bit more documentation here? This doesn't really explain very much about the analysis, and "constructor initialization check" sounds ambiguous.

@@ +562,5 @@
>  #  define MOZ_INHERIT_TYPE_ANNOTATIONS_FROM_TEMPLATE_ARGS \
>      __attribute__((annotate("moz_inherit_type_annotations_from_template_args")))
>  #  define MOZ_NON_AUTOABLE __attribute__((annotate("moz_non_autoable")))
> +#  define MOZ_IGNORE_INITIALIZATION \
> +    __attribute__((annotate("moz_ignore_initialization")))

Can the adding of the attribute to MFBT be split out into a different patch? It should be reviewed by someone who is a peer on MFBT rather than me.
Attachment #8756283 - Flags: review?(michael)
Attached patch clang-plugin-uninitialized-members.diff (obsolete) (deleted) — Splinter Review
is the same as Attachment 8740407 [details] [diff] with update in order to avoid rejects on m-c
Attachment #8740407 - Attachment is obsolete: true
Attachment #8756283 - Flags: review?(michael)
Attached patch clang-plugin-uninitialized-members-update1.diff (obsolete) (deleted) — Splinter Review
Update 1:

I've noticed that case where definition of the constructor was not done inline but outside of RecordDecl was lacking the CXXCtorInitializer list. This update addresses this issue. By looking through the ctor redecls and using the definitions instead of declaration.
Attachment #8756285 - Flags: review?(michael)
Attached patch clang-plugin-uninitialized-members-update1.diff (obsolete) (deleted) — Splinter Review
Update 1:

1. Fixed an issue where definition of the constructor was not done inline but outside of RecordDecl thus lacking the CXXCtorInitializer list. This update addresses this issue. By looking through the ctor redecls and using the definitions instead of declaration.

2. Added distinct header and cpp for TestMemberInitialization in this way tests that are done better reflect the real case scenarios regarding the difference between declaration and definition.
Attachment #8756285 - Attachment is obsolete: true
Attachment #8756285 - Flags: review?(michael)
Attachment #8756288 - Flags: review?(michael)
Attached patch clang-plugin-Skip-Attribute.diff (obsolete) (deleted) — Splinter Review
Hello,

Could you please review this change we plan to use this annotation to member variables that we want to be skipped by initialization checker plugin. 

For example:

>>class Test {
>>  int a;
>>  MOZ_IGNORE_INITIALIZATION int b;
>>  Test() 
>>    : a(0) {
>>  }
};

In this scenario we want |b| be to be skipped during analysis so no error message like this will be present:

'b' is not correctly initialialized in the constructor of 'Test'

We plan on having this feature because for some variables we don't want this checker to be performed.
Attachment #8756741 - Flags: review?(nfroyd)
Attached patch clang-plugin-Uninitialized-members.diff (obsolete) (deleted) — Splinter Review
Attachment #8756283 - Attachment is obsolete: true
Attachment #8756761 - Flags: review?(michael)
Attached patch clang-plugin-Uninitialized-members-update1.diff (obsolete) (deleted) — Splinter Review
I've updated the patch with the desired modifications. If you want to add more test scenarios just let me know.
Attachment #8756288 - Attachment is obsolete: true
Attachment #8756827 - Flags: review?(michael)
Comment on attachment 8756741 [details] [diff] [review]
clang-plugin-Skip-Attribute.diff

Review of attachment 8756741 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/Attributes.h
@@ +527,5 @@
>   *   declarations where an instance of the template should be considered, for
>   *   static analysis purposes, to inherit any type annotations (such as
>   *   MOZ_MUST_USE_TYPE and MOZ_STACK_CLASS) from its template arguments.
> + * MOZ_IGNORE_INITIALIZATION: skip the initialization check, for member variable, 
> + *   performed in the corresponding constructor

This comment needs to clearly explain what entity it applies to (for example, the "Applies to..." sentences in the nearby documentation for other attributes).  The "initialization check" needs some explaining: does it apply to list-initialization or initialization in the body of the constructor?  What is the "initialization check" actually checking?   And it's not especially clear what motivates this attribute, so I think a "This is intended to be used..." similar to the MOZ_NON_AUTOABLE attribute below would be useful.
Attachment #8756741 - Flags: review?(nfroyd) → feedback+
Attached patch clang-plugin-Skip-Attribute.diff (obsolete) (deleted) — Splinter Review
Hope this is better.
Attachment #8756741 - Attachment is obsolete: true
Attachment #8758716 - Flags: review?(nfroyd)
Attached patch clang-plugin-Skip-Attribute.diff (obsolete) (deleted) — Splinter Review
Attachment #8758716 - Attachment is obsolete: true
Attachment #8758716 - Flags: review?(nfroyd)
Attachment #8758717 - Flags: review?(nfroyd)
Attached patch clang-plugin-Uninitialized-members.diff (obsolete) (deleted) — Splinter Review
Attachment #8756761 - Attachment is obsolete: true
Attachment #8756827 - Attachment is obsolete: true
Attachment #8756761 - Flags: review?(michael)
Attachment #8756827 - Flags: review?(michael)
Attachment #8758756 - Flags: review?(michael)
Comment on attachment 8758756 [details] [diff] [review]
clang-plugin-Uninitialized-members.diff

Review of attachment 8758756 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/clang-plugin/clang-plugin.cpp
@@ +1875,5 @@
> +  const MatchFinder::MatchResult &Result) {
> +  DiagnosticsEngine &Diag = Result.Context->getDiagnostics();
> +  unsigned nonInitializedMemberID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +      DiagnosticIDs::Error,
> +      "%0 is not correctly initialialized in the constructor of %1");

As this test might produce false positives, it might be nice to refer people who hit the error to some documentation comment somewhere in the tree which explains how the analysis works, and what to do to fix it.

::: build/clang-plugin/tests/TestMemberInitialization.cpp
@@ +133,5 @@
> +  for (int i = 0; i < 10; i++) {
> +    if (!i)
> +      a = 0;
> +  }
> +  

Unnecessary Whitespace

@@ +157,5 @@
> +};
> +
> +TestDoWhile::TestDoWhile() {
> +  int i = 0;
> +  do { 

Trailing Whitespace
Attachment #8758756 - Flags: review?(michael) → review+
Comment on attachment 8677975 [details] [diff] [review]
part 2 - MOZ_IGNORE_INITIALIZATION

Review of attachment 8677975 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/ipdl/ipdl/cxx/cgen.py
@@ +108,5 @@
>          else:
>              d.type.accept(self)
>  
>          if d.name:
> +            self.write(' MOZ_IGNORE_INITIALIZATION '+ d.name)

This change adds MOZ_IGNORE_INITIALIZATION to a billion places in codegen, including places where it has no business being, such as function parameters; I highly doubt this is what we want to use.
Attachment #8677975 - Flags: review-
Comment on attachment 8758717 [details] [diff] [review]
clang-plugin-Skip-Attribute.diff

Review of attachment 8758717 [details] [diff] [review]:
-----------------------------------------------------------------

Apologies for the delay in reviewing this.

Where is this attribute intended to be used?  The lone patch in this bug that features non-test uses of this attribute is busted.

r=me if we have actual uses for the attribute.

::: mfbt/Attributes.h
@@ +530,5 @@
> + * MOZ_IGNORE_INITIALIZATION: Applies to class member declaration. It disables 
> + *   the list and body initialization check when a constructor is defined. The in-body
> + *   initialization checks assignments for the corresponding member variable. This 
> + *   is intended to be used when we specifically know that logic surrounding the 
> + *   variable is correct. 

Thanks for updating this comment.  I think it needs a little wordsmithing, though.  How about:

"Applies to class member declarations.  Occasionally there are class members that are not initialized in the constructor, but logic elsewhere in the class ensures they are initialized prior to use.  Using this attribute on a member disables the check that this member must be initialized in constructors via list-initialization, in the constructor body, or via functions called from the constructor body."

I think we might also want to rename the macro (and perhaps the attribute, too?): we're not ignoring the initialization of the member, we're indicating to the static analysis that the member is correctly initialized outside of constructor.  So perhaps MOZ_INITIALIZED_OUTSIDE_CONSTRUCTOR is more appropriate?

Please also ensure there's no trailing whitespace in this patch.
Attachment #8758717 - Flags: review?(nfroyd) → review+
Attached patch patch_2.diff (obsolete) (deleted) — Splinter Review
About where we could use it, just as you comment at this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1278796
It is a good example on how we can eliminate from the analysis these types of situations. One more example is here: https://bugzilla.mozilla.org/show_bug.cgi?id=1278242
Attachment #8677975 - Attachment is obsolete: true
Attachment #8758717 - Attachment is obsolete: true
Attachment #8761208 - Flags: review?(nfroyd)
Attachment #8761208 - Flags: review?(nfroyd) → review+
https://hg.mozilla.org/mozilla-central/rev/3d04a9a8e9f0
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
This must be left open because more work needs to be done for the clang analyzer.
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Attached patch clang-plugin.diff (obsolete) (deleted) — Splinter Review
This new update brings the possibility to eliminate from the matcher specific libraries like: skin, engle, harfbuzz etc:

>>AST_MATCHER(CXXRecordDecl, isInterestingForUninitializedInCtor) {
>>  const CXXRecordDecl *decl = Node.getCanonicalDecl();
>>   // No system headers.
>>  auto &SourceManager = Finder->getASTContext().getSourceManager();
>>  FullSourceLoc loc(decl->getLocation(), SourceManager);
>>
>>  if (loc.isInSystemHeader()) {
>>    // is in system header so skip it
>>    return false;
>>  } else {
>>    // check to see if it's in a package path
>>    // that should be ignored
>>    SmallString<1024> fileName = SourceManager.getFilename(loc);
>>    llvm::sys::fs::make_absolute(fileName);
>>    auto begin = llvm::sys::path::rbegin(fileName);
>>    auto end = llvm::sys::path::rend(fileName);
>>
>>    for (; begin != end; ++begin) {
>>      if (begin->compare_lower(StringRef("skia")) == 0 ||
>>          begin->compare_lower(StringRef("angle")) == 0 ||
>>          begin->compare_lower(StringRef("harfbuzz")) == 0 ||
>>          begin->compare_lower(StringRef("hunspell")) == 0 ||
>>          begin->compare_lower(StringRef("scoped_ptr.h")) == 0 ||
>>          begin->compare_lower(StringRef("graphite2")) == 0) {
>>        return false;
>>      }
>>    }
>>    return true;
>>  }
>>}

I think i should also include a case for switch statement.
Attachment #8678171 - Attachment is obsolete: true
Attachment #8758756 - Attachment is obsolete: true
Attachment #8761208 - Attachment is obsolete: true
Attachment #8762061 - Flags: review?(michael)
Comment on attachment 8762061 [details] [diff] [review]
clang-plugin.diff

Review of attachment 8762061 [details] [diff] [review]:
-----------------------------------------------------------------

Assuming the only change is the isInteresting check, lgtm.
Attachment #8762061 - Flags: review?(michael) → review+
Depends on: 1282408
MozReview-Commit-ID: GPQY8b2OM2V
Attachment #8765851 - Flags: review?(nfroyd)
Attachment #8765851 - Attachment description: correct name for attribute description, from MOZ_IGNORE_INITIALIZATION -> MOZ_INITIALIZED_OUTSIDE_CONSTRUCTOR → Bug 525063 - correct name for attribute description, from MOZ_IGNORE_INITIALIZATION -> MOZ_INITIALIZED_OUTSIDE_CONSTRUCTOR
Attached patch Landed (obsolete) (deleted) — Splinter Review
MozReview-Commit-ID: GPQY8b2OM2V
Attachment #8765889 - Flags: review?(nfroyd)
Attachment #8765851 - Attachment is obsolete: true
Attachment #8765851 - Flags: review?(nfroyd)
(In reply to Andi-Bogdan Postelnicu from comment #63)
> Created attachment 8765889 [details] [diff] [review]
> renamed MOZ_INITIALIZED_OUTSIDE_CONSTRUCTOR -> MOZ_INIT_OUTSIDE_CTOR
> 
> MozReview-Commit-ID: GPQY8b2OM2V

I've decided to rename it since many developers complained that it is to long.
Comment on attachment 8765889 [details] [diff] [review]
Landed

Review of attachment 8765889 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: mfbt/Attributes.h
@@ +527,4 @@
>   *   declarations where an instance of the template should be considered, for
>   *   static analysis purposes, to inherit any type annotations (such as
>   *   MOZ_MUST_USE_TYPE and MOZ_STACK_CLASS) from its template arguments.
> + * MOZ_INIT_OUTSIDE_CTOR: Applies to class member declarations. Occasionally 

Nit: trailing whitespace.
Attachment #8765889 - Flags: review?(nfroyd) → review+
Attached patch clang-plugin update 1.diff (obsolete) (deleted) — Splinter Review
- added support for then and else branches comparison
- modified call for buildResolverMap in order to get rid of useless variablesMap
- modified updateVarMap and add any variables that is given a value to the map, in this way we can avoid having two copies of variablesMap given to thenMap and elseMap in IfStmt
Attachment #8766732 - Flags: review?(michael)
Attachment #8765889 - Attachment description: renamed MOZ_INITIALIZED_OUTSIDE_CONSTRUCTOR -> MOZ_INIT_OUTSIDE_CTOR → clang-plugin update 1
Attachment #8765889 - Attachment description: clang-plugin update 1 → Landed
Attachment #8766732 - Attachment description: clang-plugin - unitialise member checker - added support for IfStmt comparing 'then' and 'else' branches → clang-plugin update 1.diff
Comment on attachment 8766732 [details] [diff] [review]
clang-plugin update 1.diff

Review of attachment 8766732 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/clang-plugin/clang-plugin.cpp
@@ +1714,5 @@
>    std::unordered_map<std::string, bool>& variablesMap,
>    StringRef varName) {
>  
> +  variablesMap[varName] = true;
> +

Unnecessary blank line

@@ +1832,5 @@
> +      // Loop through the thenMap and elseMap and look for the same variables
> +      // set to true  and add to variablesMap only the elements that are present
> +      // in both maps set to true
> +      for (auto& item: thenMap) {
> +        if (item.second && elseMap[item.first]) {

I believe operator[] always inserts into the map if the entry is not present. Ideally, extra resizing of elseMap wouldn't be occurring at this point, but that being said, it is not a big deal.

::: build/clang-plugin/tests/TestMemberInitialization.cpp
@@ +146,4 @@
>    while (i < 10) {
>      if (i % 5)
>        a = 0;
> +    else { 

Trailing whitespace
Attachment #8766732 - Flags: review?(michael) → review+
(In reply to Michael Layzell [:mystor] from comment #69)
> Comment on attachment 8766732 [details] [diff] [review]
> clang-plugin update 1.diff
> 
> Review of attachment 8766732 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: build/clang-plugin/clang-plugin.cpp
> @@ +1714,5 @@
> >    std::unordered_map<std::string, bool>& variablesMap,
> >    StringRef varName) {
> >  
> > +  variablesMap[varName] = true;
> > +
> 
> Unnecessary blank line
> 
> @@ +1832,5 @@
> > +      // Loop through the thenMap and elseMap and look for the same variables
> > +      // set to true  and add to variablesMap only the elements that are present
> > +      // in both maps set to true
> > +      for (auto& item: thenMap) {
> > +        if (item.second && elseMap[item.first]) {
> 
> I believe operator[] always inserts into the map if the entry is not
> present. Ideally, extra resizing of elseMap wouldn't be occurring at this
> point, but that being said, it is not a big deal.
> 
You are right, otherwise you couldn't have statements like:

elseMap["Something"] = true;

This being an unordered_map the average insert complexity is O(1), where as if we had used normal maps,
that are ordered, the insert complexity is O(logN) where N is the size of the map. 
Still i will modify it in the next patch, it better to have the best implementation possible. 

> ::: build/clang-plugin/tests/TestMemberInitialization.cpp
> @@ +146,4 @@
> >    while (i < 10) {
> >      if (i % 5)
> >        a = 0;
> > +    else { 
> 
> Trailing whitespace
Attached patch clang-plugin update 2.diff (obsolete) (deleted) — Splinter Review
- added depth to analysis, if it's reached return from the analysed Stmt
 - get item from elseMap without first inserting it
Attachment #8766783 - Flags: review?(michael)
Attachment #8766783 - Attachment description: clang-plugin - unitialise member checker - added max depth for analysis → clang-plugin update 2.diff
Comment on attachment 8766783 [details] [diff] [review]
clang-plugin update 2.diff

Review of attachment 8766783 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/clang-plugin/clang-plugin.cpp
@@ +40,4 @@
>  #define cxxRecordDecl recordDecl
>  #endif
>  
> +#define MAX_DEPTH 100

Maybe make this a constant on the particular checker? This file has many analyses, and MAX_DEPTH doesn't really describe which analysis it is referring to very well.

It would be nicer if it was, say, NonInitializedMemberChecker::MAX_DEPTH, which at least gives it more context. Please also add a comment.

@@ +1989,4 @@
>      }
>  
>      // Maybe it's initialized in the body.
> +    evaluateExpression(body, variablesMap, resolverMap, 0);

Maybe add a /* depth = */ comment before the 0.
Attachment #8766783 - Flags: review?(michael) → review+
Attached patch clang-plugin update 3.diff (obsolete) (deleted) — Splinter Review
As we've talked in London at All Hands Meeting, in the clase the class overloads operator new than we will skip the check for uninitialised variables.
Attachment #444602 - Attachment is obsolete: true
Attachment #446486 - Attachment is obsolete: true
Attachment #8765889 - Attachment is obsolete: true
Attachment #8769627 - Flags: review?(michael)
Comment on attachment 8769627 [details] [diff] [review]
clang-plugin update 3.diff

One a more closer thought for this we should fire another bugzilla issue.
Attachment #8769627 - Flags: review?(michael)
No longer depends on: 1282408
Depends on: 1282408
Depends on: 1287752
No longer depends on: 1287752
Another feature that we might add to this checker, as dbaron suggested, is to have the ability to mark different function members that are part of the initialisation process like:
>>class Test {
>>	int a;
>>	int b;
>>	int c;
>>
>>public:
>>	Test()
>>	: a(0) {
>>
>>	}
>>	
>>	CLASS_INIT 
>>	void Init() {
>>		b = 0;
>>		c = 0;
>>	}	
>>};

Where by using macro CLASS_INIT we tell the checker to also check the indicated function for variable initialisation/ assignment.
Attachment #8769627 - Attachment is obsolete: true
Attachment #8785261 - Attachment is obsolete: true
Attachment #8785261 - Flags: review?(nfroyd)
Comment on attachment 8785262 [details] [diff] [review]
add attribute to mark functions that initialize member variables for their parent class, in order to be scanned by clang-plugin static analysis

Review of attachment 8785262 [details] [diff] [review]:
-----------------------------------------------------------------

The comment needs adjusting, see below.  Is it not possible to traverse calls recursively from within the constructor, and avoid the annotation entirely?  I thought that's what the plugin was already doing; why is this annotation needed?

::: mfbt/Attributes.h
@@ +502,4 @@
>   *   Using this attribute on a member disables the check that this member must be
>   *   initialized in constructors via list-initialization, in the constructor body,
>   *   or via functions called from the constructor body.
> + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally

I think this should be "Applies to class method declarations."

@@ +503,5 @@
>   *   initialized in constructors via list-initialization, in the constructor body,
>   *   or via functions called from the constructor body.
> + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> + *   the constructor doesn't initialize all of the member variables and another
> + *   function is used to initialize the rest. This marker is used to let the static

"...to make the static analysis tool..."

@@ +504,5 @@
>   *   or via functions called from the constructor body.
> + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> + *   the constructor doesn't initialize all of the member variables and another
> + *   function is used to initialize the rest. This marker is used to let the static
> + *   analysis tool aware that the marked function is part of the initialize process

"..part of the initialization process..."

@@ +505,5 @@
> + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> + *   the constructor doesn't initialize all of the member variables and another
> + *   function is used to initialize the rest. This marker is used to let the static
> + *   analysis tool aware that the marked function is part of the initialize process
> + *   and to include it in the scan mechanism that determines witch variables still

"...to include the marked function...that determines which member variables..."

@@ +506,5 @@
> + *   the constructor doesn't initialize all of the member variables and another
> + *   function is used to initialize the rest. This marker is used to let the static
> + *   analysis tool aware that the marked function is part of the initialize process
> + *   and to include it in the scan mechanism that determines witch variables still
> + *   remain uninitialized or unassigned values.

We can drop the "or unassigned values" bit here.
(In reply to Nathan Froyd [:froydnj] from comment #78)
> Comment on attachment 8785262 [details] [diff] [review]
> add attribute to mark functions that initialize member variables for their
> parent class, in order to be scanned by clang-plugin static analysis
> 
> Review of attachment 8785262 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The comment needs adjusting, see below.  Is it not possible to traverse
> calls recursively from within the constructor, and avoid the annotation
> entirely?  I thought that's what the plugin was already doing; why is this
> annotation needed?
This is what clang-plugin does in it's actual state but there could be initialisation function that 
are not called from within the constructor and we want to be able to scan those functions as well, that's
why we could use an annotation. Check this basic example
>>class Test {
>>	int a;
>>	int b;
>>	int c;
>>
>>public:
>>	Test()
>>	: a(0) {
>>
>>	}
>>	
>>	MOZ_IS_CLASS_INIT 
>>	void Init() {
>>		b = 0;
>>		c = 0;
>>	}	
>>};

where somewhere in the code where we instantiate class Test we can have:
>> Test myTest;
>> myTest.Init();

In the current form of the plugin we scan recursively only the body of the CTOR, now after we add the annotation we will also scan Init.

> 
> ::: mfbt/Attributes.h
> @@ +502,4 @@
> >   *   Using this attribute on a member disables the check that this member must be
> >   *   initialized in constructors via list-initialization, in the constructor body,
> >   *   or via functions called from the constructor body.
> > + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> 
> I think this should be "Applies to class method declarations."
> 
> @@ +503,5 @@
> >   *   initialized in constructors via list-initialization, in the constructor body,
> >   *   or via functions called from the constructor body.
> > + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> > + *   the constructor doesn't initialize all of the member variables and another
> > + *   function is used to initialize the rest. This marker is used to let the static
> 
> "...to make the static analysis tool..."
> 
> @@ +504,5 @@
> >   *   or via functions called from the constructor body.
> > + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> > + *   the constructor doesn't initialize all of the member variables and another
> > + *   function is used to initialize the rest. This marker is used to let the static
> > + *   analysis tool aware that the marked function is part of the initialize process
> 
> "..part of the initialization process..."
> 
> @@ +505,5 @@
> > + * MOZ_IS_CLASS_INIT: Applies to function class member declaration. Occasionally
> > + *   the constructor doesn't initialize all of the member variables and another
> > + *   function is used to initialize the rest. This marker is used to let the static
> > + *   analysis tool aware that the marked function is part of the initialize process
> > + *   and to include it in the scan mechanism that determines witch variables still
> 
> "...to include the marked function...that determines which member
> variables..."
> 
> @@ +506,5 @@
> > + *   the constructor doesn't initialize all of the member variables and another
> > + *   function is used to initialize the rest. This marker is used to let the static
> > + *   analysis tool aware that the marked function is part of the initialize process
> > + *   and to include it in the scan mechanism that determines witch variables still
> > + *   remain uninitialized or unassigned values.
> 
> We can drop the "or unassigned values" bit here.

Thanks for the help with the comment i will update it and post it for review.
(In reply to Andi-Bogdan Postelnicu from comment #79)
> (In reply to Nathan Froyd [:froydnj] from comment #78)
> > Comment on attachment 8785262 [details] [diff] [review]
> > add attribute to mark functions that initialize member variables for their
> > parent class, in order to be scanned by clang-plugin static analysis
> > 
> > Review of attachment 8785262 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > The comment needs adjusting, see below.  Is it not possible to traverse
> > calls recursively from within the constructor, and avoid the annotation
> > entirely?  I thought that's what the plugin was already doing; why is this
> > annotation needed?
> This is what clang-plugin does in it's actual state but there could be
> initialisation function that 
> are not called from within the constructor and we want to be able to scan
> those functions as well, that's
> why we could use an annotation. Check this basic example
> >>class Test {
> >>	int a;
> >>	int b;
> >>	int c;
> >>
> >>public:
> >>	Test()
> >>	: a(0) {
> >>
> >>	}
> >>	
> >>	MOZ_IS_CLASS_INIT 
> >>	void Init() {
> >>		b = 0;
> >>		c = 0;
> >>	}	
> >>};
> 
> where somewhere in the code where we instantiate class Test we can have:
> >> Test myTest;
> >> myTest.Init();
> 
> In the current form of the plugin we scan recursively only the body of the
> CTOR, now after we add the annotation we will also scan Init.

Ah, very good, thank you for the example.  The plugin then checks that there are no uses between construction and MOZ_IS_CLASS_INIT methods being called as well?
(In reply to Nathan Froyd [:froydnj] from comment #80)
> Ah, very good, thank you for the example.  The plugin then checks that there
> are no uses between construction and MOZ_IS_CLASS_INIT methods being called
> as well?

Yes, the current stage of the plugin checks only the ctor now having this annotation we will be able to also check the marked functions.
Attachment #8785262 - Attachment is obsolete: true
Attachment #8785262 - Flags: review?(nfroyd)
Attached patch clang-plugin.diff (obsolete) (deleted) — Splinter Review
MozReview-Commit-ID: FhEAzS5CCDf
Attachment #8785924 - Flags: review?(michael)
Comment on attachment 8785924 [details] [diff] [review]
clang-plugin.diff

just a rebase.
Attachment #8785924 - Attachment description: added clang-plugin feature to check for uninitialized class members when a constructor is defined. the check can be skipped for each variable by using in declaration MOZ_IGNORE_INITIALIZATION → clang-plugin.diff
Attachment #8762061 - Attachment is obsolete: true
Attachment #8785825 - Flags: review?(nfroyd) → review+
Comment on attachment 8785924 [details] [diff] [review]
clang-plugin.diff

Review of attachment 8785924 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
I didn't look over it much because it is just a rebase. If part of it changed significantly and you want me to look at it again, just ping me and point me at it.
Attachment #8785924 - Flags: review?(michael) → review+
Comment on attachment 8785825 [details] [diff] [review]
add attribute to mark functions that initialize member variables for their parent class, in order to be scanned by clang-plugin static analysis

https://hg.mozilla.org/integration/mozilla-inbound/rev/8b245d867368
Attachment #8785825 - Attachment is obsolete: true
Attached patch clang-plugin update 3.diff (obsolete) (deleted) — Splinter Review
MozReview-Commit-ID: AYerMratJHe
Attachment #8786261 - Flags: review?(michael)
Comment on attachment 8786261 [details] [diff] [review]
clang-plugin update 3.diff

In order to have things simple we first determine if we have functions that are annotated as being part of the initialisation process and scan them. The variables that get marked as being initialised are stored in the map with a special flag - InitByFunc. This flag is used in order to differentiate variables initialised in init functions from the one initialised by the ctors since the map is refreshed for each ctor pass.
Attachment #8786261 - Attachment description: clang-plugin - uninitialise member checker - added support for init functions scan → clang-plugin update 3.diff
(In reply to Andi-Bogdan Postelnicu from comment #86)
> Comment on attachment 8785825 [details] [diff] [review]

https://hg.mozilla.org/mozilla-central/rev/8b245d867368
Target Milestone: mozilla50 → ---
Comment on attachment 8786261 [details] [diff] [review]
clang-plugin update 3.diff

Review of attachment 8786261 [details] [diff] [review]:
-----------------------------------------------------------------

I feel mildly uncomfortable about initializers working on partially initialized structs, but I understand that it probably happens a lot in our code base, and it would be too much work to get rid of them. 

nit: In bug 1287458, the clang-plugin is getting a consistent style. As that patch is probably going to land before this one, it would be awesome if you could update the style in this patch to match the LLVM style which we're using for the clang-plugin now :).

::: build/clang-plugin/clang-plugin.cpp
@@ +2085,5 @@
> +
> +  // First look for functions that are marked as being part of the initialization
> +  // process and scan though them before looking at constructors since in this way
> +  // we can avoid popping false-positive errors for variables that are not
> +  // initialized in the constructor

I understand why you run this logic first, but it seems a bit odd to me as it's the opposite order to how initialization actually works in the code.

It probably makes the most sense to leave it this way anyways though.

::: build/clang-plugin/tests/TestMemberInitialization.cpp
@@ +29,5 @@
>  
>  // this should trigger an unintialized error message
>  struct TestStruct2 {
> +  int b;
> +  TestStruct2() { // expected-error {{b is not correctly initialialized in the constructor of 'TestStruct2'}}

As you no longer report the error on the property which is not initialized, it seems like it would be good to say something like `member 'b' is not ....`, or perhaps even:

    Error: Member 'TestStruct2::b` is not initialized in this constructor
    Note: All members must be initialized in the constructor, or in a MOZ_IS_CLASS_INIT method

The above error message isn't as clear as it could be.

@@ +178,5 @@
> +public:
> +  TestInitFunc() : a(0) {
> +
> +  }
> +  MOZ_IS_CLASS_INIT void Init();

Can you file a follow-up for making sure that the init function is called immediately after construction of all objects which have one?
Attachment #8786261 - Flags: review?(michael) → review+
Depends on: 1301333
Depends on: 1302975
Depends on: 1308868
What's the status of this project?
I will update it shortly, we still need to fix all of the uninitialized variables that it detects. I'll also add support for templates.
Sounds great.  Thank you.  :-)
So i've returned on this bug since it needs to be compliant with the new clang tidy infrastructure and after talking with Julian Seward, he would really use this patch for https://bugzilla.mozilla.org/show_bug.cgi?id=1232696
Nice.  Sorry I bitrotted your patches.  I hope the new code structure makes it much easier to hack on new static analyses in the future.  I have been meaning to write some documentation but haven't gotten around to doing that yet.  But I just uploaded a rebased version of bug 1114683 and you can use that as a sample on how to rebase.  Let me know if you have any questions or hit any issues.  :-)
Attached patch Untitled.patch (obsolete) (deleted) — Splinter Review
This is WIP in order to work with the new build mechanism based on clang-tidy. The template support will be added.
Attachment #8766732 - Attachment is obsolete: true
Attachment #8766783 - Attachment is obsolete: true
Attachment #8785924 - Attachment is obsolete: true
Attachment #8786261 - Attachment is obsolete: true
Flags: needinfo?(bpostelnicu)
I was trying to run the analysis on layout code today, and I needed this patch in order to avoid a lot of false positives.

Hope it helps :)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #97)
> Created attachment 8864147 [details] [diff] [review]
> Skip delegating constructors in the non-initialized member checker.
> 
> I was trying to run the analysis on layout code today, and I needed this
> patch in order to avoid a lot of false positives.
> 
> Hope it helps :)

Thanks for this, if you need help running it or you encounter corner cases that are not yet covered let me know and I'll update it. Also the present implementation doesn't cover the use of template classes.
Flags: needinfo?(bpostelnicu)
(In reply to Andi-Bogdan Postelnicu [:andi] from comment #98)
> (In reply to Emilio Cobos Álvarez [:emilio] from comment #97)
> > Created attachment 8864147 [details] [diff] [review]
> > Skip delegating constructors in the non-initialized member checker.
> > 
> > I was trying to run the analysis on layout code today, and I needed this
> > patch in order to avoid a lot of false positives.
> > 
> > Hope it helps :)
> 
> Thanks for this, if you need help running it or you encounter corner cases
> that are not yet covered let me know and I'll update it. Also the present
> implementation doesn't cover the use of template classes.

It works quite well, to be honest.

There's one annoying thing when grepping for the error message in the build logs, which is that it contains a typo (initialialized instead of just initialized), but that's not something to be worried about I think :)
Attached patch clang-patch.patch (obsolete) (deleted) — Splinter Review
This new patch has the following new addons:
1. Courtesy of Emilio - Skip delegating constructors in the non-initialized member checker.
2. Fixed type of uninitialised.
4. Added the framework for detecting for the class templates.
Attachment #8861953 - Attachment is obsolete: true
Attachment #8864147 - Attachment is obsolete: true
(In reply to Emilio Cobos Álvarez [:emilio] from comment #99)
> (In reply to Andi-Bogdan Postelnicu [:andi] from comment #98)
> > (In reply to Emilio Cobos Álvarez [:emilio] from comment #97)
> > > Created attachment 8864147 [details] [diff] [review]
> > > Skip delegating constructors in the non-initialized member checker.
> > > 
> > > I was trying to run the analysis on layout code today, and I needed this
> > > patch in order to avoid a lot of false positives.
> > > 
> > > Hope it helps :)
> > 
> > Thanks for this, if you need help running it or you encounter corner cases
> > that are not yet covered let me know and I'll update it. Also the present
> > implementation doesn't cover the use of template classes.
> 
> It works quite well, to be honest.
> 
> There's one annoying thing when grepping for the error message in the build
> logs, which is that it contains a typo (initialialized instead of just
> initialized), but that's not something to be worried about I think :)

Did you encounter any case where this analysis failed?
I seem (In reply to Andi-Bogdan Postelnicu [:andi] from comment #101)
> Did you encounter any case where this analysis failed?

I'm pretty sure it reported wrong the WritingMode constructor[2], but that presumably just needs an init annotation? Ideally it wouldn't require it though.

From memory, I seem to recall it reported incorrectly gfxTextPerfMetrics::reflowCount[1], but I'm less sure about this one.

Note that I was filtering by path and excluded some files that reported a lot of errors, like nsDisplayList.h, which has constructors with uninitialized members that expects subclasses to initialize, and nsStyleStruct.h, which intentionally leaves uninitialized values too. But those errors seemed legit.

I don't recall any other off-hand.

[1]: http://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/gfx/thebes/gfxFont.h#421
[2]: http://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/layout/generic/WritingModes.h#483
Thanks for telling me this, I'll take a look and see what was going on there.
This is a good catch:

[1]
>>    gfxTextPerfMetrics() {
>>        memset(this, 0, sizeof(gfxTextPerfMetrics));
>>    }

I wonder if we would need to support this kind of construction. 

[2]

Why would WritingMode need an init annotation, isn't the initialisation done in InitFromStyleVisibility that's called from the constructor?

The annotation MOZ_IS_CLASS_INIT must be used on member functions that are not called from CTOR but are used as initialisers, like:

>>SomeClass A;
>>// Do some funky logic 
>>A->Init();
Attached patch 525063-2.patch (obsolete) (deleted) — Splinter Review
This adds full tree traversal of statements and expressions and also handles switches and ternary conditional operators to make sure all branches initialize a member for it to be considered as initialized.
Attached patch 525063-2.patch (obsolete) (deleted) — Splinter Review
Refactor duplicated traversal of children into a common method
Attachment #8876115 - Attachment is obsolete: true
Attached patch 525063-3.patch (obsolete) (deleted) — Splinter Review
Add support for detecting inits through intermediate references
Attached patch 525063-3.patch (obsolete) (deleted) — Splinter Review
Also added handling of pointers and references to pointers
Attachment #8876762 - Attachment is obsolete: true
Attached patch 525063-4.patch (obsolete) (deleted) — Splinter Review
Add detection for POD fields full initialization (recursively)
Attached patch 525063-5.patch (obsolete) (deleted) — Splinter Review
Add fixit notes to add MOZ_UNCHECKED_INIT in order to whitelist existing defects.
Also refactors the code into smaller functions and cleaner types.
Attached patch 525063-fold.patch (obsolete) (deleted) — Splinter Review
Folds all the previous patches
Attachment #8864444 - Attachment is obsolete: true
Attachment #8876523 - Attachment is obsolete: true
Attachment #8876772 - Attachment is obsolete: true
Attachment #8877582 - Attachment is obsolete: true
Attachment #8879491 - Attachment is obsolete: true
Attachment #8879505 - Flags: review?(michael)
Comment on attachment 8879505 [details] [diff] [review]
525063-fold.patch

Review of attachment 8879505 [details] [diff] [review]:
-----------------------------------------------------------------

Please don't land a commit with this commit message. Either keep it split into separate patches, or use a unified commit name.

It looks really good. My brain started hurting a bit when reading over the expression evaluator, so I might come back to that later, but I think my comment about its exact semantics being wrong being OK is probably correct. I mostly have nits, but this is a pretty big patch, so there are a bunch :)

::: build/clang-plugin/CustomMatchers.h
@@ +161,5 @@
> +      // Skip constructors in system headers
> +      !ASTIsInSystemHeader(Declaration->getASTContext(), *Declaration) &&
> +      // Skip ignored namespaces and paths
> +      !isInIgnoredNamespaceForImplicitCtor(Declaration) &&
> +      !isIgnoredPathForImplicitCtor(Declaration);

Consider using inThirdPartyPath (http://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/build/clang-plugin/Utils.h#387-426) to avoid the proliferation of more different "ignored path" and "ignored namespace" methods :).

::: build/clang-plugin/NonInitializedMemberChecker.cpp
@@ +3,5 @@
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#include "NonInitializedMemberChecker.h"
> +#include "CustomMatchers.h"
> +#include <iostream>

Do you need iostream in this file? Usually clang uses its own form of iostream, rather than the one from the standard library.

@@ +192,5 @@
> +        return;
> +      }
> +
> +      DeclaratorDecl *decl =
> +          dyn_cast_or_null<DeclaratorDecl>(declExpr->getDecl());

NIT: you don't need the or_null here

@@ +226,5 @@
> +    ResolverMap &resolverMap, InitFlag flag) {
> +
> +  static struct externFuncDefinition {
> +    std::string funcName;
> +    uint8_t indexInArgList;

Why is this only 8-bits wide? This isn't saving any space due to padding, so might as well make it 32-bits.

@@ +227,5 @@
> +
> +  static struct externFuncDefinition {
> +    std::string funcName;
> +    uint8_t indexInArgList;
> +  } funcSchemaArray[] = {// for memset

NIT: Please name this struct with UpperCamelCase, and declare it outside of the variable declaration.

@@ +242,5 @@
> +
> +  for (int i = 0; i < numberOfFuncs; i++) {
> +    externFuncDefinition &funcSchema = funcSchemaArray[i];
> +    if (funcSchema.funcName == method->getName().data()) {
> +      Expr *exprArg = funcCall->getArg(funcSchema.indexInArgList);

Check that funcSchema.indexInArgList is in range before calling getArg. getArg doesn't do bounds checking.

@@ +262,5 @@
> +    }
> +  }
> +}
> +
> +void NonInitializedMemberChecker::evaluateExpression(

I didn't look too closely at the exact behavior of the symbolic evaluator here, as I could easily poke holes of places where it has odd behavior, but I don't think we're trying to build a perfect symbolic evaluator for this bug.

Perhaps in a future bug we can look into creating a generic symbolic evaluator (or using one which someone else has written for us) which can make writing future static analyses easier, but if this one does a good enough job, then I'm probably mostly fine with it.

Please add a comment to this function explaining that it isn't an accurate evaluator, and often makes heuristic judgements to try to catch as many situations as possible situations where a member is correctly initialized. If you have a good idea of places where the evaluator makes incorrect judgements, it might be nice to document some of these as well.

@@ +270,5 @@
> +  if (!stmtExpr) {
> +    return;
> +  }
> +
> +  stmtExpr = stmtExpr->IgnoreImplicit();

We usually use our own IgnoreTrivials method instead of Stmt->IgnoreImplicit(), as it skips more things which we usually don't care about (like ParenExprs).

http://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/build/clang-plugin/Utils.h#307-330

@@ +273,5 @@
> +
> +  stmtExpr = stmtExpr->IgnoreImplicit();
> +
> +  // Check depth and if it's equal equal with MAX_DEPTH return
> +  if (depth == MAX_DEPTH) {

NIT: Can you be a little safer here, and use >= :)

@@ +277,5 @@
> +  if (depth == MAX_DEPTH) {
> +    return;
> +  }
> +
> +  if (BinaryOperator *binOp = dyn_cast_or_null<BinaryOperator>(stmtExpr)) {

NIT: Use auto* here, it adds less line noise, and the target type is already written on the right. 

In addition, null-check stmtExpr before entering these ifs, and don't bother with `dyn_cast_or_null`, just use `dyn_cast`

@@ +328,5 @@
> +          }
> +        }
> +      }
> +    }
> +  } else if (IfStmt *ifStmt = dyn_cast_or_null<IfStmt>(stmtExpr)) {

You might be able to get away without all of this explicit handling of control flow by instead using the CFG (see the MustReturnFromCallerChecker for an example of using the CFG in tree http://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/build/clang-plugin/MustReturnFromCallerChecker.cpp#34-40)

@@ +505,5 @@
> +      uncheckedFieldNames.insert(Attr->getAnnotation());
> +      isLastAttrMozUncheckedInit = false;
> +    } else if (Attr->getAnnotation() == "moz_unchecked_initialization") {
> +      isLastAttrMozUncheckedInit = true;
> +    }

I make a comment below about how I'm mildly uncomfortable with this design for the annotations, and would prefer it to be simplfiied to a `moz_unchecked_initialzation=mApples,mPears` style annotation.

@@ +510,5 @@
> +  }
> +
> +  UncheckedFieldsSet uncheckedFields;
> +  for (auto field : decl->fields()) {
> +    if (uncheckedFieldNames.count(field->getName())) {

Please use getNameChecked in places where you're calling getName(), as this can fail if the field has a non-identifier name (http://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/build/clang-plugin/Utils.h#46-49

@@ +519,5 @@
> +  return uncheckedFields;
> +}
> +
> +void NonInitializedMemberChecker::buildVariablesMap(
> +    VariablesMap &variablesMap, CXXRecordDecl const * decl,

NIT: llvm style would write this as `const CXXRecordDecl *Decl`

@@ +520,5 @@
> +}
> +
> +void NonInitializedMemberChecker::buildVariablesMap(
> +    VariablesMap &variablesMap, CXXRecordDecl const * decl,
> +    std::string prefix = "") {

NIT: Please declare defaults in the declaration in the header, rather than in the definition.

@@ +522,5 @@
> +void NonInitializedMemberChecker::buildVariablesMap(
> +    VariablesMap &variablesMap, CXXRecordDecl const * decl,
> +    std::string prefix = "") {
> +  // For each field, if they are builtinType, if they are pointer  or if they
> +  // are PDO, add them to the table, possibly recursively if PDO

Please document all of these helper methods, and place the comments immediately before the declaration in the header file.

@@ +524,5 @@
> +    std::string prefix = "") {
> +  // For each field, if they are builtinType, if they are pointer  or if they
> +  // are PDO, add them to the table, possibly recursively if PDO
> +
> +  auto uncheckedFields = getUncheckedFields(decl);

You seem to use this list of unchecked fields to then filter the list to only a list of checked field. Why not make `getUncheckedFields` instead be `getCheckedFields`, which returns a list of the fields which you do want to look at instead?

@@ +541,5 @@
> +    QualType type = field->getType();
> +
> +    const CXXRecordDecl *fieldDecl = type->getAsCXXRecordDecl();
> +    // This should be OK because the reference types are mandated to be
> +    // initialized by the language.

I'm not sure I understand what your comment is trying to say.

I think what this check is trying to do is ignore non-POD classes as they will initialize their own members implicitly? I think writing that in a comment will be more clear.

@@ +543,5 @@
> +    const CXXRecordDecl *fieldDecl = type->getAsCXXRecordDecl();
> +    // This should be OK because the reference types are mandated to be
> +    // initialized by the language.
> +    if (!type->isBuiltinType() && !type->isPointerType()
> +        && (!fieldDecl || !fieldDecl->isPOD())) {

This check seems overly complicated to me, Can't you just write:

if (!fieldDecl || !fieldDecl->isPOD())

because having the type being a CXXRecordDecl implies !type->isBuiltinType and !type->isPointerType()?

@@ +550,5 @@
> +
> +    bool isPointerType = type->isPointerType();
> +    if (isPointerType) {
> +      fieldDecl = type->getPointeeCXXRecordDecl();
> +    }

Why are we looking through raw pointers? I would think that we would want to just expect these raw pointers to be initialized to some value.

@@ +555,5 @@
> +
> +    // Add to variable map
> +    variablesMap[field] = FieldData {
> +      InitFlag::NotInit,
> +      prefix + field->getName().str()

getNameChecked

@@ +571,5 @@
> +void NonInitializedMemberChecker::checkInitMethods(
> +    const CXXRecordDecl *decl, VariablesMap &variablesMap,
> +    ResolverMap &resolverMap) {
> +  for (auto method : decl->methods()) {
> +    Stmt *funcBody = method->getBody();

NIT: Call this method after checking for the moz_is_class_init method.

@@ +606,5 @@
> +    return;
> +  }
> +
> +  SourceLocation fixitLoc = decl->getInnerLocStart().getLocWithOffset(offset);
> +  std::string fixitString = "MOZ_UNCHECKED_INIT(";

I don't want to recommend MOZ_UNCHECKED_INIT() as the fix for not initializing these members. I feel like recommending initializing them to 0 would be a better solution. I would rather not encourage using MOZ_UNCHECKED_INIT more than we need to.

If you want to use a fixit hint to do some automated rewriting of the tree for the initial landing then you should keep it around for that, but I would rather not land it.

@@ +630,5 @@
> +
> +const CXXRecordDecl* NonInitializedMemberChecker::getFieldDeclPODRecord(
> +    FieldDecl *fieldDecl) {
> +  const CXXRecordDecl *declPOD = fieldDecl->getType()->getAsCXXRecordDecl();
> +  if (fieldDecl->getType()->isPointerType()) {

I don't understand the motivation behind ensuring that the fields of objects behind pointers are initialized. We should just check that the pointer itself is initialized I would think.

@@ +692,5 @@
> +
> +    // This is when definition is done outside of declaration
> +    // loop through the redecls and get the one that is different
> +    // than the current ctor since it will be the definition.
> +    if (!ctor->isThisDeclarationADefinition()) {

There is a method on FunctionDecl called `getDefinition` and `isDefined` which you can use to do this work for you :)

@@ +771,5 @@
> +          continue;
> +        }
> +
> +        // We emit the error diagnostic for the (var, ctor) pair
> +        Diag.Report(ctor->getLocation(), nonInitializedMemberDiagID)

Use diag() here rather than Diag.Report, as the Checker has that functionality built in now.

@@ +795,5 @@
> +          "'%0' is not correctly initialized in the constructor of %1");
> +  unsigned fixitNoteDiagID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +      DiagnosticIDs::Note,
> +      "consider marking %0's fields as unchecked in MOZ_UNCHECKED_INIT here "
> +      "(after the class/struct keyword)");

With the new clang-tidy based setup, we don't need to do this work to pre-generate these IDs. Instead, you should emit diagnostics through the `diag` method defined on the Checker class.

See for example: 
http://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/build/clang-plugin/MustUseChecker.cpp#58-59

For example, wanting to emit a diagnostic would look like:

diag(ctor->getLocation(), "'%0' is not correctly initialized in the constructor of %1", DiagnosticIDs::Error) << "A" << "B";

If you want to use the same diagnostic message in multiple places, please just define a: `static const char* NON_INITIALIZED_MEMBER = "'%0' is not correctly initialized in the constructor of %1";` at the top of this file :).

@@ +808,5 @@
> +  }
> +
> +  // We clear the uncheckedFields cache just in case this class instance is
> +  // reused multiple times
> +  m_uncheckedFieldsCache.clear();

Why are you worried about keeping the same cache between multiple different struct checks? This class will be re-used, but that should be a good thing, potentially saving you unnecessary re-checks (for example if the same POD struct is referenced from 2 different other structs as a member).

::: build/clang-plugin/NonInitializedMemberChecker.h
@@ +24,5 @@
> +  };
> +
> +  struct FieldData {
> +    InitFlag flag;
> +    std::string fullName;

NIT: Please change these member names to match LLVM style (Flag and FullName).

@@ +35,5 @@
> +  typedef std::unordered_map<DeclaratorDecl*, DeclaratorDecl*> ResolverMap;
> +  typedef std::unordered_set<FieldDecl*> FixitFields;
> +
> +  const uint8_t MAX_DEPTH = 100;
> +  UncheckedFieldsCacheMap m_uncheckedFieldsCache;

NIT: We do not use the m_ prefix for member variables in the clang plugin. We use LLVM's coding style which styles member variables in UpperCamelCase, so this would be:

UncheckedFieldsCacheMap UncheckedFieldsCache;

@@ +37,5 @@
> +
> +  const uint8_t MAX_DEPTH = 100;
> +  UncheckedFieldsCacheMap m_uncheckedFieldsCache;
> +
> +  void runThroughDefaultFunctions(

NIT: Please use a consistent style for these method declarations. We use llvm-style in the clang plugin, so this function should be declared like:

    void runThroughDefaultFunctions(CallExpr *callExpr,
                                    VariablesMap &varMap,
                                    ResolverMap &resolverMap,
                                    InitFlag flag);

In addition, LLVM generally uses UpperCamelCase for variable and parameter names, so ideally these parameters would have their names changed to match that style.

::: build/clang-plugin/tests/TestInitializedMemberChecker.cpp
@@ +40,5 @@
> +  }
> +};
> +
> +
> +class MOZ_UNCHECKED_INIT(y) Test { // expected-note {{consider marking Test's fields as unchecked in MOZ_UNCHECKED_INIT here (after the class/struct keyword)}}

Please add a test for MOZ_UNCHECKED_INIT(y, z, q) or something like that (marking multiple fields as un-intialized with one annotation).

::: mfbt/Attributes.h
@@ +517,5 @@
> + *   the time of its introduction. This allows us to focus on the new defects
> + *   introduced later on. Using this attribute on class members disables the
> + *   check that these member must be initialized in constructors via
> + *   list-initialization, in the constructor body, or via functions called from
> + *   the constructor body.

I don't find this explanation of the annotation very clear. How about:

MOZ_UNCHECKED_INIT(members...): Applies to class declarations. Marks the listed members as potentially uninitialized, disabling errors or warnings generated due to them not being initialized in class constructors. All non-annotated members must be initialized in constructors via list initialization, in the constructor body, or via functions called directly from the constructor.

I also don't really understand why we decided to go with MOZ_UNCHECKED_INIT(members...) on the class declaration, rather than just MOZ_UNCHECKED_INIT on the member declaration itself, which seems more intuitive to me.

@@ +575,5 @@
>  #  define MOZ_NON_AUTOABLE __attribute__((annotate("moz_non_autoable")))
>  #  define MOZ_INIT_OUTSIDE_CTOR \
>      __attribute__((annotate("moz_ignore_ctor_initialization")))
> +#  define MOZ_UNCHECKED_INIT(...) \ // Attributes are read backwards
> +    __attribute__((annotate(#__VA_ARGS__))) \

Doing this seems a little sketchy to me. If we have 2 attributes which both want to handle lists, we would not be able to easily tell the annotations apart. I would prefer it if we instead had one annotation:

__attribute__((annotate("moz_unchecked_initialization=" #__VA_ARGS__)))

and in the code when looking for the annotation, we simply checked if an annotation starts with "moz_unchecked_initialization".

@@ +576,5 @@
>  #  define MOZ_INIT_OUTSIDE_CTOR \
>      __attribute__((annotate("moz_ignore_ctor_initialization")))
> +#  define MOZ_UNCHECKED_INIT(...) \ // Attributes are read backwards
> +    __attribute__((annotate(#__VA_ARGS__))) \
> +    __attribute__((annotate("moz_unchecked_initialization")))

Please add the equivalent define in the #else branch, (like #  define MOZ_UNCHECKED_INIT(...) /* nothing */)
Attachment #8879505 - Flags: review?(michael)
Thanks very much for the in-depth review. If I don't answer on one of your points, it means that it is fixed/taken care of.
The patch following this reply will come very soon.

(In reply to Michael Layzell [:mystor] from comment #112)
> Comment on attachment 8879505 [details] [diff] [review]
> 525063-fold.patch
> 
> Review of attachment 8879505 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +262,5 @@
> > +    }
> > +  }
> > +}
> > +
> > +void NonInitializedMemberChecker::evaluateExpression(
> 
> I didn't look too closely at the exact behavior of the symbolic evaluator
> here, as I could easily poke holes of places where it has odd behavior, but
> I don't think we're trying to build a perfect symbolic evaluator for this
> bug.
> 
> Perhaps in a future bug we can look into creating a generic symbolic
> evaluator (or using one which someone else has written for us) which can
> make writing future static analyses easier, but if this one does a good
> enough job, then I'm probably mostly fine with it.
> 
> Please add a comment to this function explaining that it isn't an accurate
> evaluator, and often makes heuristic judgements to try to catch as many
> situations as possible situations where a member is correctly initialized.
> If you have a good idea of places where the evaluator makes incorrect
> judgements, it might be nice to document some of these as well.

This method probably has very, very few false positives, and mostly has false negatives because it doesn't cover all cases. This is intentional, as I feel we have reached a point where most common constructs are covered, and it would take too much time to try to cover what's left. I will comment on it anyway.

> @@ +328,5 @@
> > +          }
> > +        }
> > +      }
> > +    }
> > +  } else if (IfStmt *ifStmt = dyn_cast_or_null<IfStmt>(stmtExpr)) {
> 
> You might be able to get away without all of this explicit handling of
> control flow by instead using the CFG (see the MustReturnFromCallerChecker
> for an example of using the CFG in tree
> http://searchfox.org/mozilla-central/rev/
> 2bcd258281da848311769281daf735601685de2d/build/clang-plugin/
> MustReturnFromCallerChecker.cpp#34-40)

This could probably have been better if we had used it since the beginning, but now I highly doubt it's worth the time to refactor the existing code, which covers most if not all cases in terms of control flow (especially since it has special cases for if and switch statements).

> @@ +524,5 @@
> > +    std::string prefix = "") {
> > +  // For each field, if they are builtinType, if they are pointer  or if they
> > +  // are PDO, add them to the table, possibly recursively if PDO
> > +
> > +  auto uncheckedFields = getUncheckedFields(decl);
> 
> You seem to use this list of unchecked fields to then filter the list to
> only a list of checked field. Why not make `getUncheckedFields` instead be
> `getCheckedFields`, which returns a list of the fields which you do want to
> look at instead?

Doing this wouldn't change anything in terms of performance or behaviour, but the pattern we have adopted throughout the code is to do a trivial loop and then filter things out progressively with early-returns/continues. It seems logical to me to keep this construct. I don't think it's worth spending time on this.

> 
> @@ +541,5 @@
> > +    QualType type = field->getType();
> > +
> > +    const CXXRecordDecl *fieldDecl = type->getAsCXXRecordDecl();
> > +    // This should be OK because the reference types are mandated to be
> > +    // initialized by the language.
> 
> I'm not sure I understand what your comment is trying to say.
> 
> I think what this check is trying to do is ignore non-POD classes as they
> will initialize their own members implicitly? I think writing that in a
> comment will be more clear.

That's the point of the code, but I have to agree that the comment is not very clear. What it meant is that we don't have to check member references as they have to be initialized. Hence the absence of a !type->isReferenceType() in this early continue.
I will rephrase it.

> @@ +543,5 @@
> > +    const CXXRecordDecl *fieldDecl = type->getAsCXXRecordDecl();
> > +    // This should be OK because the reference types are mandated to be
> > +    // initialized by the language.
> > +    if (!type->isBuiltinType() && !type->isPointerType()
> > +        && (!fieldDecl || !fieldDecl->isPOD())) {
> 
> This check seems overly complicated to me, Can't you just write:
> 
> if (!fieldDecl || !fieldDecl->isPOD())
> 
> because having the type being a CXXRecordDecl implies !type->isBuiltinType
> and !type->isPointerType()?

These two boolean expressions are not equivalent.
The current one goes into the early-continue if the type is NOT a builtin AND NOT a pointer AND not a POD.
These are exactly the cases that we want to cover, so it makes sense to discard any other case (this includes non-POD objects and references).
Your proposal would early-continue if the type is NOT a POD, but would early-continue for a builtin or a pointer as well, which is not what we want.

I think that the implication that you wanted to write was:
fieldDecl => !type->isBuiltinType && !type->isPointerType()
But in the expression you are suggesting, you are using !fieldDecl

We could filter based on (fieldDecl && !field->isPOD()), but that would not filter references (or other cases we haven't thought of). It's better to whitelist than to blacklist cases.

> @@ +550,5 @@
> > +
> > +    bool isPointerType = type->isPointerType();
> > +    if (isPointerType) {
> > +      fieldDecl = type->getPointeeCXXRecordDecl();
> > +    }
> 
> Why are we looking through raw pointers? I would think that we would want to
> just expect these raw pointers to be initialized to some value.
>
> @@ +630,5 @@
> > +
> > +const CXXRecordDecl* NonInitializedMemberChecker::getFieldDeclPODRecord(
> > +    FieldDecl *fieldDecl) {
> > +  const CXXRecordDecl *declPOD = fieldDecl->getType()->getAsCXXRecordDecl();
> > +  if (fieldDecl->getType()->isPointerType()) {
> 
> I don't understand the motivation behind ensuring that the fields of objects
> behind pointers are initialized. We should just check that the pointer
> itself is initialized I would think.

This is actually a good point. My initial thought was that pointers to PODs don't have their member field initialized either, but I just realized that we could be assigning an existing pointer (I was only thinking about the case where we initialize with |new|). Maybe I can try and detect when a pointer is initialized with |new| and when it is initialized from a pre-existing pointer. I will discuss that with Andi.

> @@ +606,5 @@
> > +    return;
> > +  }
> > +
> > +  SourceLocation fixitLoc = decl->getInnerLocStart().getLocWithOffset(offset);
> > +  std::string fixitString = "MOZ_UNCHECKED_INIT(";
> 
> I don't want to recommend MOZ_UNCHECKED_INIT() as the fix for not
> initializing these members. I feel like recommending initializing them to 0
> would be a better solution. I would rather not encourage using
> MOZ_UNCHECKED_INIT more than we need to.
> 
> If you want to use a fixit hint to do some automated rewriting of the tree
> for the initial landing then you should keep it around for that, but I would
> rather not land it.

My intent is to do what you describe in your second paragraph. I'll remove the fixit part from the patch.

> @@ +808,5 @@
> > +  }
> > +
> > +  // We clear the uncheckedFields cache just in case this class instance is
> > +  // reused multiple times
> > +  m_uncheckedFieldsCache.clear();
> 
> Why are you worried about keeping the same cache between multiple different
> struct checks? This class will be re-used, but that should be a good thing,
> potentially saving you unnecessary re-checks (for example if the same POD
> struct is referenced from 2 different other structs as a member).

Good catch. This is a leftover from when we were using strings to identify members.

> ::: mfbt/Attributes.h
> @@ +517,5 @@
> > + *   the time of its introduction. This allows us to focus on the new defects
> > + *   introduced later on. Using this attribute on class members disables the
> > + *   check that these member must be initialized in constructors via
> > + *   list-initialization, in the constructor body, or via functions called from
> > + *   the constructor body.
> 
> I don't find this explanation of the annotation very clear. How about:
> 
> MOZ_UNCHECKED_INIT(members...): Applies to class declarations. Marks the
> listed members as potentially uninitialized, disabling errors or warnings
> generated due to them not being initialized in class constructors. All
> non-annotated members must be initialized in constructors via list
> initialization, in the constructor body, or via functions called directly
> from the constructor.
> 
> I also don't really understand why we decided to go with
> MOZ_UNCHECKED_INIT(members...) on the class declaration, rather than just
> MOZ_UNCHECKED_INIT on the member declaration itself, which seems more
> intuitive to me.

We want to use MOZ_UNCHECKED_INIT to whitelist all existing bugs. Adding the annotation to every single uninitialized field in all classes would add a lot of noise, hence our choice to reduce it to the class declaration.
Semantically, MOZ_UNCHECKED_INIT means we're just disabling the warning because it is about existing code when adding the checker. We don't know if these fields should be initialized or not.
If someone wants to explicitly, and on purpose, disable warnings for a field because it is initialized outside or because our checker missed the initialization, MOZ_INIT_OUTSIDE_CTOR can be used in front of the field itself.

> @@ +575,5 @@
> >  #  define MOZ_NON_AUTOABLE __attribute__((annotate("moz_non_autoable")))
> >  #  define MOZ_INIT_OUTSIDE_CTOR \
> >      __attribute__((annotate("moz_ignore_ctor_initialization")))
> > +#  define MOZ_UNCHECKED_INIT(...) \ // Attributes are read backwards
> > +    __attribute__((annotate(#__VA_ARGS__))) \
> 
> Doing this seems a little sketchy to me. If we have 2 attributes which both
> want to handle lists, we would not be able to easily tell the annotations
> apart. I would prefer it if we instead had one annotation:
> 
> __attribute__((annotate("moz_unchecked_initialization=" #__VA_ARGS__)))
> 
> and in the code when looking for the annotation, we simply checked if an
> annotation starts with "moz_unchecked_initialization".

After re-reading the docs, I realized I had a complete misunderstanding of how var args work with defines. Your solution is indeed much better than the current one.
Flags: needinfo?(michael)
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
Attachment #8879505 - Attachment is obsolete: true
Attachment #8880382 - Flags: review?(michael)
Comment on attachment 8880382 [details] [diff] [review]
525063.patch

Review of attachment 8880382 [details] [diff] [review]:
-----------------------------------------------------------------

A few more comments :). Thanks for the changes it's looking nicer.

::: build/clang-plugin/CustomMatchers.h
@@ +161,5 @@
> +      // Skip constructors in system headers
> +      !ASTIsInSystemHeader(Declaration->getASTContext(), *Declaration) &&
> +      // Skip ignored namespaces and paths
> +      !isInIgnoredNamespaceForImplicitCtor(Declaration) &&
> +      !isIgnoredPathForImplicitCtor(Declaration);

Does this not work with http://searchfox.org/mozilla-central/rev/ae94cfb36d804727dfb38f2750bfcaab4621df6e/build/clang-plugin/Utils.h#387

I would prefer we use the generic isInThirdPartyPath

::: build/clang-plugin/NonInitializedMemberChecker.cpp
@@ +15,5 @@
> +                                           const DeclaratorDecl *Decl) {
> +  auto VarFromMap = ResolverMap.find(Decl);
> +  if (VarFromMap != ResolverMap.end()) {
> +    // variable found in resolve map that means we must use it's corespondent
> +    // -> second in variablesMap

nit: We like making our comments full sentences, which includes Capitalizing the first letter, and adding a period at the end (I'm pretty bad at this :-S) - it would be awesome if you could do that too!

@@ +226,5 @@
> +    const CallExpr *FuncCall, VariablesMap &VariablesMap,
> +    ResolverMap &ResolverMap, InitFlag Flag) {
> +
> +  struct ExternFuncDefinition {
> +    std::string FuncName;

Why not use StringRef here?

@@ +230,5 @@
> +    std::string FuncName;
> +    unsigned IndexInArgList;
> +  };
> +
> +  static ExternFuncDefinition FuncSchemaArray[] = {{"memset", 0}}; // for memset

nit: remove this comment.

@@ +498,5 @@
> +      Decl->specific_attrs<AnnotateAttr>();
> +
> +  for (AnnotateAttr *Attr : Attrs) {
> +    if (Attr->getAnnotation().startswith("moz_unchecked_initialization=")) {
> +      SmallVector<StringRef, 0> FieldStrings;

You might as well take advantage of the small vector optimization, I would probably pass in 2 for the inline allocation size.

@@ +597,5 @@
> +  if (!DeclPOD) {
> +    return;
> +  }
> +
> +  if (FieldData.Flag) {

Please explicitly compare this against InitFlag. I don't like depending on the first variant being falsey.

@@ +604,5 @@
> +      if (!VariablesMap.count(Field)) {
> +        continue;
> +      }
> +
> +      VariablesMap[Field].Flag = FieldData.Flag;

If you have nested POD types, are those handled here?

@@ +615,5 @@
> +      if (!VariablesMap.count(Field)) {
> +        continue;
> +      }
> +
> +      FlagPOD = VariablesMap[Field].Flag;

This seems wrong, consider this:

struct X {
  int a;
  int b;
};

struct Y {
  X x;

  Y(int) {
    x.a = 10;
  }
  Y(float) {}

  void MOZ_INIT_OUTSIDE_CTOR Init() {
    x.b = 20;
  }
};

We would first walk through Init, and determine that x.b is initialized outside the constructor, and thus set it to InitFlag::InitByFunc.

In our first constructor we walk Y(int), which sets x.a with InitFlag::InitByCtor, then we walk the POD types, and declare x as initialized however, as we use the most recent initialized type, we mark it as InitByFunc, which is wrong.

When we then read the second constructor, we infer that x.a is initialized by a function, even though it isn't, through the first branch in this function.

@@ +616,5 @@
> +        continue;
> +      }
> +
> +      FlagPOD = VariablesMap[Field].Flag;
> +      if (!FlagPOD) {

Again, please explicitly compare to the enum variant.

@@ +710,5 @@
> +            << Var.second.FullName << Ctor
> +            << Ctor->getNameInfo().getSourceRange();
> +      } else {
> +        // Reset it only if it was init by ctor
> +        if (Var.second.Flag == InitFlag::InitByCtor) {

nit: This can be an else if rather than an if in the else :)

::: build/clang-plugin/NonInitializedMemberChecker.h
@@ +16,5 @@
> +  void registerMatchers(MatchFinder *AstMatcher) override;
> +  void check(const MatchFinder::MatchResult &Result) override;
> +
> +private:
> +  enum InitFlag : uint8_t { NotInit, InitByCtor, InitByFunc };

Can you split these variants over multiple lines, and consider marking this type as an `enum class`?

@@ +26,5 @@
> +
> +  typedef std::unordered_set<const FieldDecl *> UncheckedFieldsSet;
> +  typedef std::unordered_map<const CXXRecordDecl *, UncheckedFieldsSet>
> +      UncheckedFieldsCacheMap;
> +  typedef std::unordered_map<const FieldDecl *, FieldData> VariablesMap;

Can we call this something else, because it doesn't really have much to do with variables (It's a map of fields to whether or not they are initialized). Perhaps FieldInitMap?

@@ +41,5 @@
> +  void runThroughDefaultFunctions(const CallExpr *CallExpr,
> +                                  VariablesMap &VariablesMap,
> +                                  ResolverMap &ResolverMap, InitFlag Flag);
> +
> +  /// This method checks if a field or associated variable is the assignee in

Is "associated variable" a variable in the resolver map?

@@ +49,5 @@
> +                      ResolverMap &ResolverMap, InitFlag Flag);
> +
> +  /// Resolves the given declaration through the resolver map, allowing us to
> +  /// know if a local variable of a parameter may be indirectly refering to a
> +  /// field.

Can we reword this as "looks up the given DeclaratorDecl in the resolver map, determining the field which it aliases, if any."? That seems more clear to me.

@@ +53,5 @@
> +  /// field.
> +  const DeclaratorDecl *resolveMapVar(ResolverMap &ResolverMap,
> +                                      const DeclaratorDecl *Decl);
> +
> +  /// This updates the variable map only if the varible is not initialized

"Marks the passed-in field as initialized. Does not override the initialization flag if the field has been previously initialized."

@@ +58,5 @@
> +  void updateVarMap(VariablesMap &VariablesMap, const FieldDecl *Var,
> +                    InitFlag Flag);
> +
> +  /// When going through a function call, builds a new resolver map from an
> +  /// existing one by matching arguments and parameters together.

I don't find this comment super clear. This method is creating a new resolver map which has all of the entries from the original resolver map, but with additional links if parameters are passed in which alias local variables, right?

@@ +95,5 @@
> +
> +  /// Builds the initial variables map from the field list of the declaration,
> +  /// potentially recursing into POD fields to make sure their fields are in
> +  /// the map as well.
> +  void buildVariablesMap(VariablesMap &VariablesMap, const CXXRecordDecl *Decl,

I don't like the name of this function, but I think it will be more clear if we change the name of VariablesMap.

@@ +107,5 @@
> +  /// MOZ_IS_CLASS_INIT.
> +  void checkInitMethods(const CXXRecordDecl *Decl, VariablesMap &VariablesMap,
> +                        ResolverMap &ResolverMap);
> +
> +  /// Returns the CXXRecordDecl of the field if it's POD, else nullptr.

"Returns the fields type as a CXXRecordDecl if it is POD, else nullptr." perhaps?

@@ +110,5 @@
> +
> +  /// Returns the CXXRecordDecl of the field if it's POD, else nullptr.
> +  const CXXRecordDecl *getFieldDeclPODRecord(const FieldDecl *FieldDecl);
> +
> +  /// If all fields of a POD are initialized, this marks the POD as initialized.

and vice-versa :)

@@ +115,5 @@
> +  void updateFlagsIfPOD(const FieldDecl *Variable, FieldData &FieldData,
> +                        VariablesMap &VariablesMap);
> +
> +  /// Goes through the constructors looking for initializations and emits errors
> +  /// for each (uninitialized_field, constructor) couple.

Consider mentioning that init methods should be checked before running this method.

::: build/clang-plugin/tests/TestInitializedMemberChecker.cpp
@@ +1,2 @@
> +#include <stdint.h>
> +#include <string.h>

Can you add a test for initializing nested POD types?

e.g.

struct A { int x; }
struct B { A a; int y; }
class C {
public:
  C() {
    // This shouldn't error:
    b.y = 10;
    b.a.x = 20;
    // Nor should this:
    b.y = 10;
    b.a = SomeA;
    // Or this:
    b = SomeB;
    // But forgetting to set an inner type should:
    b.y = 10;
  }
  B b;
}
Attachment #8880382 - Flags: review?(michael)
Flags: needinfo?(michael)
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
Attachment #8880382 - Attachment is obsolete: true
Attachment #8882224 - Flags: review?(michael)
In addition to your comments, I realized that identifying fields only by their FieldDecl is incorrect because we can have multiple POD fields of the same type:

struct POD {
  int a;
};

class {
  POD foo;
  POD bar;
};

In the old version, foo.a and bar.a would have the same entry in the FieldInitMap. Now, the full FieldDecl path is stored as the key and allows for a unique identification of fields.

(In reply to Michael Layzell [:mystor] from comment #115)
> Comment on attachment 8880382 [details] [diff] [review]
> 525063.patch
> 
> Review of attachment 8880382 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: build/clang-plugin/CustomMatchers.h
> @@ +161,5 @@
> > +      // Skip constructors in system headers
> > +      !ASTIsInSystemHeader(Declaration->getASTContext(), *Declaration) &&
> > +      // Skip ignored namespaces and paths
> > +      !isInIgnoredNamespaceForImplicitCtor(Declaration) &&
> > +      !isIgnoredPathForImplicitCtor(Declaration);
> 
> Does this not work with
> http://searchfox.org/mozilla-central/rev/
> ae94cfb36d804727dfb38f2750bfcaab4621df6e/build/clang-plugin/Utils.h#387
> 
> I would prefer we use the generic isInThirdPartyPath

Woops missed that one review part, it's fixed now :) 

> @@ +604,5 @@
> > +      if (!VariablesMap.count(Field)) {
> > +        continue;
> > +      }
> > +
> > +      VariablesMap[Field].Flag = FieldData.Flag;
> 
> If you have nested POD types, are those handled here?
>
> ::: build/clang-plugin/tests/TestInitializedMemberChecker.cpp
> @@ +1,2 @@
> > +#include <stdint.h>
> > +#include <string.h>
> 
> Can you add a test for initializing nested POD types?
> 
> e.g.
> 
> struct A { int x; }
> struct B { A a; int y; }
> class C {
> public:
>   C() {
>     // This shouldn't error:
>     b.y = 10;
>     b.a.x = 20;
>     // Nor should this:
>     b.y = 10;
>     b.a = SomeA;
>     // Or this:
>     b = SomeB;
>     // But forgetting to set an inner type should:
>     b.y = 10;
>   }
>   B b;
> }


Thanks for catching this bit, I added a new way to recursively initialize PODs depending on their parents and children, and it seems to work now!

> 
> @@ +615,5 @@
> > +      if (!VariablesMap.count(Field)) {
> > +        continue;
> > +      }
> > +
> > +      FlagPOD = VariablesMap[Field].Flag;
> 
> This seems wrong, consider this:
> 
> struct X {
>   int a;
>   int b;
> };
> 
> struct Y {
>   X x;
> 
>   Y(int) {
>     x.a = 10;
>   }
>   Y(float) {}
> 
>   void MOZ_INIT_OUTSIDE_CTOR Init() {
>     x.b = 20;
>   }
> };
> 
> We would first walk through Init, and determine that x.b is initialized
> outside the constructor, and thus set it to InitFlag::InitByFunc.
> 
> In our first constructor we walk Y(int), which sets x.a with
> InitFlag::InitByCtor, then we walk the POD types, and declare x as
> initialized however, as we use the most recent initialized type, we mark it
> as InitByFunc, which is wrong.
> 
> When we then read the second constructor, we infer that x.a is initialized
> by a function, even though it isn't, through the first branch in this
> function.

I added a new MixedInit flag to account for the cases where a POD is partially initialized in a constructor and in a method.
Attachment #8882224 - Flags: review?(michael)
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
Small fixes for the patch uploaded 15 minutes ago.
Attachment #8882224 - Attachment is obsolete: true
Attachment #8882233 - Flags: review?(michael)
Comment on attachment 8882233 [details] [diff] [review]
525063.patch

Review of attachment 8882233 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the long delay on this review! Was too busy on Friday, and just got back to work today after Canada Day :).

::: build/clang-plugin/NonInitializedMemberChecker.cpp
@@ +793,5 @@
> +    for (auto &Var : FieldInitMap) {
> +      // Reset only if initialized by tthis constructor (completely or
> +      // partially).
> +      if (Var.second.Flag == InitFlag::InitByCtor ||
> +          Var.second.Flag == InitFlag::InitMixed) {

Not sure this is necessary (I think you could just use InitByCtor everywhere InitMixed is used) but it makes it clearer, so I like it :).

@@ +812,5 @@
> +    return;
> +  }
> +
> +  // Build the fields map (recursively on PODs) to know which fields should
> +  // be initialized

nit: periods at the end of comments, here and everywhere.
Attachment #8882233 - Flags: review?(michael) → review+
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
Don't worry for the delay, you've been very quick to respond all the other times ;)

This fixes grammar, punctuation and basic explanation mistakes in the comments.
Attachment #8882233 - Attachment is obsolete: true
Attachment #8883563 - Flags: review?(michael)
Comment on attachment 8883563 [details] [diff] [review]
525063.patch

Review of attachment 8883563 [details] [diff] [review]:
-----------------------------------------------------------------

carrying over r=me assuming that all you changed was comments (didn't re-read the code)
Attachment #8883563 - Flags: review?(michael) → review+
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
You shouldn't have to re-read all the code as most changes are just minor tweaks to fix some corner cases I encountered while running the checker on m-c. I have done extensive testing and I'm fairly confident in the logical aspect of the algorithm.

The main thing to review here is the introduction of a database.json file that the checker reads from to know which fields to ignore. We discussed this with Andi and Sylvestre and we believe it's a much nicer way to handle issues whitelisting in comparison with the MOZ_UNCHECKED_INIT macro we had before, which was pretty invasive.
Attachment #8883563 - Attachment is obsolete: true
Attachment #8893826 - Flags: review?(michael)
Attached patch 525063-database.patch (obsolete) (deleted) — Splinter Review
This is the patch which adds whitelisting to the database. By default, thanks to macros that can be controlled from moz.build, it doesn't do anything different from the main patch. However, when whitelisting to database is activated, it writes to the database file all the fields which should be ignored in the future, by using a lock in order to prevent races (as clang-tidy is often run in parallel).

Let me know if you need any more info on this as I'm aware it might not be easy to get into with all the platform specific stuff. I'm available on IRC anytime I'm not asleep ;)
Attachment #8893828 - Flags: review?(michael)
Comment on attachment 8893826 [details] [diff] [review]
525063.patch

Review of attachment 8893826 [details] [diff] [review]:
-----------------------------------------------------------------

I don't like that we imported a new JSON parser into the tree, especially because clang already has one ^.^.

Clang has an internal yaml parser which it uses to parse JSON (JSON is a subset of YAML), so perhaps we could use that? (example usage: https://github.com/llvm-mirror/clang/blob/1b373f4999b64c4b56bf8ef4ce28e01da976ba19/lib/Tooling/JSONCompilationDatabase.cpp#L266-L372).

I'd also like ehsan's feedback as to whether or not he thinks that the JSON static analysis opt-out list is a good solution.
Attachment #8893826 - Flags: review?(michael) → feedback?(ehsan)
Comment on attachment 8893828 [details] [diff] [review]
525063-database.patch

Review of attachment 8893828 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing review for now
Attachment #8893828 - Flags: review?(michael)
I don't think landing this analysis without fixing the existing places where it flags out errors is a good idea.  We have had security bugs in the past due to these types of issues, and this will give black hat hackers a recipe for finding those places in our code to poke through.  It seems that fixing the issues this analysis finds should be fairly simple.  Tristan, can you please explain why you chose this approach?  Why not fix those issues first?
(In reply to Michael Layzell [:mystor] from comment #124)
> Comment on attachment 8893826 [details] [diff] [review]
> 525063.patch
> 
> Review of attachment 8893826 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I don't like that we imported a new JSON parser into the tree, especially
> because clang already has one ^.^.
> 
> Clang has an internal yaml parser which it uses to parse JSON (JSON is a
> subset of YAML), so perhaps we could use that? (example usage:
> https://github.com/llvm-mirror/clang/blob/
> 1b373f4999b64c4b56bf8ef4ce28e01da976ba19/lib/Tooling/JSONCompilationDatabase.
> cpp#L266-L372).
> 
> I'd also like ehsan's feedback as to whether or not he thinks that the JSON
> static analysis opt-out list is a good solution.

Oh I'm sorry, I forgot to explain the JSON bit. The reason for importing the JSON parser into the tree is that using anything else from m-c will make the process of exporting the checker to clang-tidy very complicated. Hence the need to have the parser in the clang-tidy folder.

However, I was not aware that Clang had a JSON parser, which seems to solve the whole issue here. Thanks for pointing that out!
I'm going to adapt the code to make use of clang's parser.



(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog, Away Aug 7-8) from comment #126)
> I don't think landing this analysis without fixing the existing places where
> it flags out errors is a good idea.  We have had security bugs in the past
> due to these types of issues, and this will give black hat hackers a recipe
> for finding those places in our code to poke through.  It seems that fixing
> the issues this analysis finds should be fairly simple.  Tristan, can you
> please explain why you chose this approach?  Why not fix those issues first?

We had a pretty long discussion about this with Andi and Sylvestre: knowing that the checker finds several thousand issues, it seems unreasonable to try to fix all of them by hand. We thought about using an autofix but it doesn't seem doable:
 - needlessly initializing variables has a performance cost: some variables should be marked with MOZ_INIT_OUTSIDE_CTOR instead of being initialized.
 - zero may not be a reasonable default for some variables, and some behaviour in the code right now might rely (wrongfully, of course) on that.

Now obviously the choice is not easy, and depending on your guesses and estimates it might be worth it to autofix everything with default construction, but the best approach for now seemed to be to whitelist all existing issues and only complain about the new ones. What's your opinion on this?
Flags: needinfo?(michael)
Flags: needinfo?(ehsan)
(In reply to Michael Layzell [:mystor] from comment #124)
> I don't like that we imported a new JSON parser into the tree, 

Especially when we already have a copy toolkit/components/jsoncpp/
(In reply to Tristan Bourvon from comment #127)
>  - needlessly initializing variables has a performance cost

Do we have any info on how large that cost is?  I can imagine that it
might not be large on contemporary hardware, given that

* the initialising stores do not lead to read-after-write stalls, if nothing
  depends on them; they just get chucked into the core's store queue and
  drain into the L1 Dcache in their own time.  So they are ILP-friendly.

* for initialisation of heap allocated objects, we already carry the cost of
  mozjemalloc pulling freed memory into L1D as a result of poisoning.  So if
  the memory is soon reallocated to the same core, the initialisation might
  not suffer too many write misses.

* initialisation of a field in the constructor rather than later on might
  merely move a write miss earlier than it would otherwise have happened.

I mention these microarchitectural considerations because it seems to me to
be a shame to rule out what strikes me as an eminently sensible long-term
security precaution (require all fields to be initialised in the constructor)
on performance grounds, when it is unclear [at least to me] how much of a
performance hit that would actually be in practice.
I'm inclined to think that we at least generate the autofix patches and then do some perf testing with speedometer or talos before we assume that it has a significant performance regression, and then we can ask reviewers of the created patches to confirm whether or not the new values make sense.

I'm guessing we'll just land most of the added initializations, and people will be able to point out places in the patch where we could be hitting performance bottlenecks because the member shouldn't be initialized.
Flags: needinfo?(michael)
(In reply to Tristan Bourvon from comment #127)
> We had a pretty long discussion about this with Andi and Sylvestre: knowing
> that the checker finds several thousand issues, it seems unreasonable to try
> to fix all of them by hand. We thought about using an autofix but it doesn't
> seem doable:
>  - needlessly initializing variables has a performance cost: some variables
> should be marked with MOZ_INIT_OUTSIDE_CTOR instead of being initialized.

Julian is right on in comment 129.  Reasoning about the cost of adding a few writes is extremely complicated in the abstract.  I very much doubt that we should avoid initializing members out of the fear of the performance cost.  The desired outcome of this bug is indeed for the *majority* of our code base to switch to initializing all class members in the constructor, because years of our past experience has shown that not doing so creates security vulnerabilities over and over again.  I have no doubt that there will be a few places in the entire code base where past measurements have shown that class member initializations are actually costly, and I think for those places we should rely on our experienced code reviewers to do a good job, as opposed to shying away from the entire task out of the fear of potential performance regressions.

>  - zero may not be a reasonable default for some variables, and some
> behaviour in the code right now might rely (wrongfully, of course) on that.

Absolutely.  Again, we have code review for that.  I'm not suggesting just checking out a patch that zero initializes all class members bypassing all of our checks and balances.  :-)

> Now obviously the choice is not easy, and depending on your guesses and
> estimates it might be worth it to autofix everything with default
> construction, but the best approach for now seemed to be to whitelist all
> existing issues and only complain about the new ones. What's your opinion on
> this?

I think we should start filing some dependencies in order to start fixing our existing code before we can land the analysis.  We have this question every time we come up with an analysis which finds many existing issues, whether to land it and whitelist a whole bunch of potential security holes for the whole world to see, or whether to be conservative and only land the analysis once we have fixed all of the issues that it finds.  Given that sec-critical bugs where the patch has *literally* been initializing a class member, it's unacceptable to land this analysis without having fixed all of the issues that it finds, as sad as it makes me to say that.  :-/

Honestly, I would not worry at all about the performance issue raised, and let our reviewers r- if you end up initializing a class member in some hot code where doing so is a real performance issue.  I've been doing a lot of performance work lately and I can't remember even a single case where we have made any real performance improvements to Firefox by removing a class initializer.  :-)

I'm also still interested to get a better sense of how many issues this patch finds.  Any chance you can generate a full summary please?  One easy way to do that would be to modify the analysis to generate a warning instead of an error, do a full clobber build, and then running the ./mach warning-list command which dumps out a full summary of all build warnings.

Thanks a lot for your work here, much appreciated!
Flags: needinfo?(ehsan)
(In reply to Michael Layzell [:mystor] from comment #130)
> I'm inclined to think that we at least generate the autofix patches and then
> do some perf testing [..]

I agree.  I don't see how we can make an informed decision here without some
real numbers.

If someone can provide autofix patches, I will measure them against
Speedometer, both real-CPU time and also the microarchitectural effects, in
the style of bug 1387001.  Who can make the relevant patches?
(In reply to Julian Seward [:jseward] from comment #132)
> If someone can provide autofix patches, I will measure them against
> Speedometer, both real-CPU time and also the microarchitectural effects, in
> the style of bug 1387001.  Who can make the relevant patches?

I'll take care of that. Expect a few days delay before I come up with it though.
That's wonderful!  Thank you Tristan and Julian.  :-)
Attachment #8893826 - Flags: feedback?(ehsan)
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
So this one doesn't change much aside from some minor adjustments concerning union (when one union field is initialized, all the union fields are considered as initialized).
Attachment #8893826 - Attachment is obsolete: true
Attachment #8893828 - Attachment is obsolete: true
Attachment #8898836 - Flags: review?(michael)
Attached patch 525063-autofix.patch (obsolete) (deleted) — Splinter Review
And this one brings the autofix. It goes through all the fields of the class and inserts the fields which are not initialized in-between the fields which are initialized in the initializer list so that they all remain in the correct order while keeping the existing ctor initializers.
Attachment #8898838 - Flags: review?(michael)
Ehsan, somehow ./mach warning-list doesn't work for me, most warnings from the SA are not present, only some. I suspect this is due to multi-threading where two messages are output at the same time, causing some non-sense and confusing the parser... But this is only a theory.
Is a run-clang-tidy.py full output of warnings good enough for you? If so I'll get you that.
Flags: needinfo?(ehsan)
Yes for sure, any method to get to this information works for me.  :-)
Flags: needinfo?(ehsan)
Comment on attachment 8898836 [details] [diff] [review]
525063.patch

Review of attachment 8898836 [details] [diff] [review]:
-----------------------------------------------------------------

I looked at the union bits I could find, and it all looks reasonable :-). Thanks for catching that.

::: build/clang-plugin/tests/TestInitializedMemberChecker.cpp
@@ +362,5 @@
> +};
> +
> +class TestUnion {
> +  TestUnion() {} // expected-error {{'mStorage.mAlign' is not correctly initialized in the constructor of 'TestUnion'}} \
> +                 // expected-error {{'mStorage.mBuf' is not correctly initialized in the constructor of 'TestUnion'}} \

Can you add a test where you correctly initialize one field of the union, and no errors are emitted? Thanks :-)
Attachment #8898836 - Flags: review?(michael) → review+
Comment on attachment 8898838 [details] [diff] [review]
525063-autofix.patch

Review of attachment 8898838 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty good, if there's some way to add fixit tests so we can see what the fixits will look like on TestInitializedMemberChecker.cpp, that would be fantastic.

I'd also be really interested in a patch of this fixit being applied across the m-c tree. It would be really handy to see how well it works :-).
Attachment #8898838 - Flags: review?(michael) → review+
Can someone actually please generate the fixit patch for mozilla-central,
so I can profile it?
(In reply to Julian Seward [:jseward] from comment #141)
> Can someone actually please generate the fixit patch for mozilla-central,
> so I can profile it?

I'm currently doing exactly that, except that I'm stumbling upon corner cases and fixing the checker/autofix as a result. I should have a patch ready pretty soon.
(In reply to Tristan Bourvon from comment #142)
> I'm currently doing exactly that [..] I should have a patch
> ready pretty soon.

That's excellent.  Thanks.
Hello.  What is the status of this now?  What is the possibility of having
the patch sometime in Q4?  I listed profiling of it as one of my Q4 goals.
Flags: needinfo?(tristanbourvon)
(In reply to Julian Seward [:jseward] from comment #144)
> Hello.  What is the status of this now?  What is the possibility of having
> the patch sometime in Q4?  I listed profiling of it as one of my Q4 goals.
We will have a patch, most likely something that's not final next week.
Attached patch partial-fix-clean.patch (obsolete) (deleted) — Splinter Review
Here is a partial patch (i.e. not fixing everything that the checker has detected). As of this afternoon, it applies on the latest revision smoothly and compiles without any error nor warning.

Keep in mind that this includes some fixes for 3rd party upstreams which will need to be filtered out at a later stage.
Flags: needinfo?(tristanbourvon)
(In reply to Tristan Bourvon from comment #146)
> Created attachment 8916240 [details] [diff] [review]
> partial-fix-clean.patch
> 
> Here is a partial patch (i.e. not fixing everything that the checker has
> detected). As of this afternoon, it applies on the latest revision smoothly
> and compiles without any error nor warning.
> 
> Keep in mind that this includes some fixes for 3rd party upstreams which
> will need to be filtered out at a later stage.

That is one heck of a patch.  

One thing I will request is that you keep this patch -Wreorder clean.  -Wreorder warns when class members are initialized in a different order than they are declared. I have the codebase almost -Wreorder clean over in Bug 1406380, but there's so many initializations here I worry it will break it :)

You can test this by using `./mach try -b d -p win32-mingw32 -u none -t none` and looking in the raw log for reorder] - we don't yet have warnings-as-errors turned on in that platform, although I hope to soon.
(In reply to Tom Ritter [:tjr] from comment #147)
> (In reply to Tristan Bourvon from comment #146)
> > Created attachment 8916240 [details] [diff] [review]
> > partial-fix-clean.patch
> > 
> > Here is a partial patch (i.e. not fixing everything that the checker has
> > detected). As of this afternoon, it applies on the latest revision smoothly
> > and compiles without any error nor warning.
> > 
> > Keep in mind that this includes some fixes for 3rd party upstreams which
> > will need to be filtered out at a later stage.
> 
> That is one heck of a patch.  
> 
> One thing I will request is that you keep this patch -Wreorder clean. 
> -Wreorder warns when class members are initialized in a different order than
> they are declared. I have the codebase almost -Wreorder clean over in Bug
> 1406380, but there's so many initializations here I worry it will break it :)
> 
> You can test this by using `./mach try -b d -p win32-mingw32 -u none -t
> none` and looking in the raw log for reorder] - we don't yet have
> warnings-as-errors turned on in that platform, although I hope to soon.

Thanks for the comment! I already made sure the patch is -Wreorder clean ;)
(In reply to Tristan Bourvon from comment #148)
> Thanks for the comment! I already made sure the patch is -Wreorder clean ;)

Awesome, thanks so much!
(In reply to Tristan Bourvon from comment #146)
> Here is a partial patch (i.e. not fixing everything that the checker has
> detected). As of this afternoon, it applies on the latest revision smoothly
> and compiles without any error nor warning.

Tristan, thanks for this!  It is indeed a huge patch.  I tried to build and
run with it on x86_64-Linux just now.  Here is some feedback:

(1) gfx/2d/JobScheduler_posix.cpp.  "mThread{nullptr}" didn't compile,
    because mThread is some opaque type pthread_t (I think).  I just
    removed it because the constructor initialises mThread anyway.

(2) xpcom/ds/nsArrayEnumerator.cpp.  "mValueArray()" causes Firefox to
    segfault at startup.  mValueArray is used -- I think -- as a variable
    length array at the end of the object, and in some cases has zero
    length.  "mValueArray()" causes one element of it to be zeroed, though,
    which overwrites the heap metadata after the block.  I had to remove it.

    Does your analysis take into account variable length arrays at the
    end of structs?

(3) xpcom/threads/MozPromise.h.  There are some member variables 
    mMagic1, mMagic2, mMagic3, which have initial values at the
    point of declaration:
       uint32_t mMagic1 = sMagic;
    (this is C++ syntax I have never seen before).
    Your patch sets them to zero in the constructor, which causes
    at least one assertion in the same file to fail, because it
    expects them to have value |sValue|.  I changed the constructor
    initialisations accordingly.

    As above .. is this something the analyser can take into account?

My patch of fixes is attached.  Unfortunately, the browser will still
not start up successfully, but it doesn't crash, either.  So I am not
sure where to go with that short of binary-searching the individual
fixes to find the one(s) causing the behaviour to change.

This will be a time consuming process.  Is it possible to make the
analyser emit the fixes each in its own patch file, so as to make it
possible to perform such a binary search?
Fixups for the comment 146 patch, as described in comment 150.
Attached file splitpatches.tar.gz (obsolete) (deleted) —
Here is an archive containing each patch chunk in a separate file, to be used for the binary search. It includes c146 fixups.

As for the bugs, I'll work on getting these fixed alongside the other ones I've been able to find while working on the patch :)
Depends on: 1411595
Attached file 525063-patches-26oct2017.tar-WORKS-1 (deleted) —
Here's a slightly modified version of the comment 152 splitpatches tarfile.
This one builds and works at least well enough to start up, load
techcrunch.com and quit again.  I had to modify approximately 20 of the
patches, mostly to get them to apply (they seem to be for a slightly earlier
rev of m-c than I tried), and also to get them to build.  I then fixed two
segfaults and finally did a binary search to find one which caused the
browser to not work, but not to crash either.  A few of the patches just
appear to be whitespace changes, I noticed.  Overall, getting it working was
easier than I had expected.
Attachment #8916970 - Attachment is obsolete: true
(In reply to Julian Seward [:jseward] from comment #153)
> [..] I had to modify approximately 20 of the patches [..]

To put this in context, there are 835 patches in total.
Some initial performance results.  This is as measured with
http://browserbench.org/Speedometer/, with e10s disabled -- that simplifies
profiling and I can't see that it would negatively impact the validity of
the numbers.  I ran each build 3 times.  Larger numbers are better.

base     71.4 +/- 1.3   71.6 +/- 1.2   71.3 +/- 1.2
patched  66.4 +/- 2.9   66.3 +/- 2.1   65.8 +/- 2.3

The patched version is clearly slower, although the noise levels (the +/-
bit) are also higher, which is strange.  So, this is a disappointing and (to
me, at least) unexpected result.  I will profile both to see if I can learn
anything more.
(In reply to Julian Seward [:jseward] from comment #155)
> Some initial performance results.  This is as measured with
> http://browserbench.org/Speedometer/, with e10s disabled -- that simplifies
> profiling and I can't see that it would negatively impact the validity of
> the numbers.  I ran each build 3 times.  Larger numbers are better.
> 
> base     71.4 +/- 1.3   71.6 +/- 1.2   71.3 +/- 1.2
> patched  66.4 +/- 2.9   66.3 +/- 2.1   65.8 +/- 2.3
> 
> The patched version is clearly slower, although the noise levels (the +/-
> bit) are also higher, which is strange.  So, this is a disappointing and (to
> me, at least) unexpected result.  I will profile both to see if I can learn
> anything more.

Thank you very much for this analysis. I'm also very surprised by the results.
My guess is that such a big performance hit probably comes from a few initializations being run in very hot loops. Maybe we could also do a binary/linear search to find the culprits... Do you think that would wise?
Flags: needinfo?(jseward)
I tried to use Callgrind to extract meaningful data.  What I would really
like is it to tell me the average number of instructions per call for each 
constructor in the system, which would make it obvious where the perf hit(s)
is/are.  But I can't figure out a way to get that information from it.

The binary search idea is also a possibility, although it does kind of
assume that the perf loss is with one or a few constructors, and not spread
out evenly over all 835 patch sites.
Flags: needinfo?(jseward)
FWIW, we target this version of speedometer v2:

https://mozilla.github.io/arewefastyet-speedometer/2.0/

Also, it would be interesting to take more samples so you can do a t-test to determine if the change is statistically significant.
If there is a pref regression on the link in comment 158, maybe we could get some help profiling from Olli or someone else who has been looking at speedometer.

Also, while I don't want to see perf regress, I do kind of feel safety/correctness should come first.
Julian, did you test pgo build vs. pgo build with patch?
Ni for the comment #160 :)
Flags: needinfo?(jseward)
(In reply to Olli Pettay [:smaug] from comment #160)
> Julian, did you test pgo build vs. pgo build with patch?

No, I didn't.  It was a pretty standard --disable-debug --enable-optimize=-O2
build, with gcc 6.whatever on x86_64 Fedora 25.  Do you expect the PGO aspect
to make a big difference?
Flags: needinfo?(jseward)
(In reply to Julian Seward [:jseward] from comment #162)
> (In reply to Olli Pettay [:smaug] from comment #160)
> > Julian, did you test pgo build vs. pgo build with patch?
> 
> No, I didn't.  It was a pretty standard --disable-debug --enable-optimize=-O2
> build, with gcc 6.whatever on x86_64 Fedora 25.  Do you expect the PGO aspect
> to make a big difference?

I know that a while back a performance comparison was done between firefox on windows built with clang-cl without pgo vs built with Microsoft compiler with pgo and the delta performance was significant in the favour of the pgo build.
(In reply to Julian Seward [:jseward] from comment #162)
> (In reply to Olli Pettay [:smaug] from comment #160)
> > Julian, did you test pgo build vs. pgo build with patch?
> 
> No, I didn't.  It was a pretty standard --disable-debug --enable-optimize=-O2
> build, with gcc 6.whatever on x86_64 Fedora 25.  Do you expect the PGO aspect
> to make a big difference?

Did you test pgo vs non-pgo or non-pgo vs non-pgo? I would expect the first comparison to be skewed.
(In reply to Andi-Bogdan Postelnicu [:andi] from comment #163)
(In reply to Marco Castelluccio [:marco] from comment #164)

The comparison was done with two identical builds from source,
identical compilers, identical compiler flags, identical everything.
I think the point is that we ultimately care about regressions that show up in PGO.  Something can be slow in non-PGO and not an issue with PGO enabled.

Before worrying about that, though, it would be good to try this again with the version of the benchmark in comment 158.  Its quite different from the one linked in comment 155.

I also just noticed you have e10s disabled in comment 155.  That does not reflect what we measure against for desktop either.  AWFY and perfherder measure speedometer v2 with e10s enabled.  If there is no measurable regression in e10s that also makes it easier to proceed here.
In the meantime, do you think we can start landing small patches by module while checking on try through Perfherder that they don't cause any perf regression? That would also make it easy to identify the culprits in the big patch.

The only problem here is that this is a sec bug, so we are supposed to keep a low visibility on try...
(In reply to Tristan Bourvon from comment #167)
> In the meantime, do you think we can start landing small patches by module
> while checking on try through Perfherder that they don't cause any perf
> regression? That would also make it easy to identify the culprits in the big
> patch.
> 
> The only problem here is that this is a sec bug, so we are supposed to keep
> a low visibility on try...

I'm not a security peer, but this seems like a reasonable plan to me.  If you file stuff as innocuous "clean up" bugs in various modules.

Daniel, what do you think?
Flags: needinfo?(dveditz)
(In reply to Ben Kelly [:bkelly] from comment #166)
> I think the point is that we ultimately care about regressions that show up
> in PGO.  Something can be slow in non-PGO and not an issue with PGO enabled.

I absolutely agree that for "final calls" on this, we need to test the
configuration(s) that we actually ship and care about.  That said however,
it concerns me that the slowdown measured in comment 155, of around 7%, is
an order of magnitude higher than what I expected, and I want to understand
what happened.  FWIW, the static code size increase for the comment 155
builds is 0.55%.

After some digging around, I wonder if at least part of the problem is due
to a bad interaction with lazily initialised, templated classes.
mozilla::Maybe is a good example.  The fixup patch looks like this:

  -  Maybe() : mIsSome(false) { }
  +  Maybe() : mStorage(), mIsSome(false) { }

  where Maybe<T> has   bool mIsSome
                 and   T mStorage

Imagine the case where we create a Maybe<BigExpensiveClassToInitialize>.
Previously, we would allocate memory and then merely set mIsSome to false.
Now we also force BigExpensiveClassToInitialize's constructor to run.  At
least, if I understand the C++ correctly.

mozilla::Array is similar, but possibly worse, because it is templated on
both element type and number of elements, so the entire array is
force-initialized.

I wonder if we'd avoid this problem and still get most of the mileage out of
the analysis by (initially) restricting the fixup assignments to scalar,
primitive types (bool, char, int, int:fieldwidth, etc).  I've chased many of
these bugs over the years and most of them have to do with missing scalar
initialisations.

Even a missing initialisation of a pointer type is relatively harmless,
because uses of it usually crash immediately.  The worst culprits are bools
or int:1's, since their non-initialisation often leads to non-crashy but
weird behaviour, and statistically speaking half the time their junk initial
value is "correct" anyway.

Tristan, what do you think?  Is it possible to restrict the analysis like that?
Julian, the issue of templated types is actually a great catch, and a flaw in the analysis.

Right now, we only detect and autofix scalar, array, pointer and POD types.

The problem happening here is that if a templated class is instantiated for at least one trivial type, then an issue is raised for this instantiated type and an initialization of the field is automatically added to the constructor. However, this constructor is common to every instantiation, and therefore instantiations using non-trivial types will also initialize the templated field even though they shouldn't (because as you mentioned, it could be very costly).

The solution I see here is to ignore fields with a templated type (or containing a templated type). This might also fix some false positives I've been having trouble fixing.

This is not a perfect fix, and I will fix these specific cases by hand (probably using template condition guards or making sure there is no security issue).

Does that sound good to you?
(In reply to Tristan Bourvon from comment #167)
> In the meantime, do you think we can start landing small patches by module
> while checking on try through Perfherder that they don't cause any perf
> regression? That would also make it easy to identify the culprits in the big
> patch.

This plan (unfortunately) strikes me as dangerous from a performance point
of view, for following reason.  Let's conservatively assume that the 7% perf
lossage discussed in comment 169 is the best we can achieve.  That fixup
patch set modifies 835 constructors.  If we again conservatively assume that
the cost is spread equally over each change, then the average cost of each
change is about 0.01%.

Given that perfherder seems to have a noise level of around 1%, this means
that we cannot meaningfully use it to measure the cost of any single change.
I fear that if we incrementally land a series of fixes, each of which
individually has performance loss which can't be measured by treeherder,
then we will wind up with an overall perf regression which we can't pin down
and which we didn't think we were landing at all.

I think it would be safer, if more hassle, to figure out what is going on
here, see if we can reduce the overall cost, and measure the aggregate cost
of the whole bundle of changes, before pushing any part of them.
(In reply to Tristan Bourvon from comment #170)
> The solution I see here is to ignore fields with a templated type (or
> containing a templated type). This might also fix some false positives I've
> been having trouble fixing.

That sounds like a good starting point to me, yes.  Can you get me a
new patch set to poke at, that applies to current-ish (a day or so old)
m-c?

> [..] I will fix these specific cases by hand

Is it possible you can keep the hand editing to a minimum?  The current
patch set contains a number of whitespace-only changes which only added
to my already high level of confusion :-)

I have another question.  Profiling these changes is really difficult
because it's nearly impossible to identify the specific bits of code in 
question inside the profiler I'm using (Callgrind).  What I started to do is
to change patches like this

-  Foo() : mBar(123) {}
+  Foo() : mXyzzy{0}, mBar(123) {}

into

-  Foo() : mBar(123) {}
+  Foo() : mXyzzy{0}, mBar(123) { extern void NOTIFY(const char*); NOTIFY(__func__); }

In other words -- I dropped in a call to a crude, temporary profiling function,
so I can figure out which constructors are hot.  I started doing this by hand but
gave up after about 35 patches.  Tristan -- is it feasible for your analysis tool
to automatically do something similar (drop in an arbitrary bit of text after the
constructor's opening brace?)  If easy, that would be great.  If not easy, don't
worry; don't let it delay you in dealing with the scalars-only fixup proposal.
Flags: needinfo?(tristanbourvon)
I re-built with flags "-Og -fno-inline".  This disables (almost?) all
inlining.  The resulting builds run very slowly but each constructor is now
in its own function, so we can measure their costs individually.

The Maybe<T> phenomenon mentioned in comment 169 definitely happens.  For
many different instantiations of T, the patched version is many times more
expensive than baseline.  Here's a particularly extreme example:

  mozilla::Maybe<js::AutoRooterGetterSetter::Inner>::Maybe()

  baseline 146220 calls,   731100 insns,   5.00 insns/call
  patched  145884 calls, 50329980 insns, 345.00 insns/call

Given that Maybe is very commonly used (millions of constructions in my
profile run), I suspect this alone makes a significant contribution to the
measured slowdown.

Interestingly I cannot see anything similar with mozilla::Array -- the
patched costs are identical, or nearly so, to baseline.

Can anyone suggest any other templated, lazily initialised classes that I
should look at?
Flags: needinfo?(nfroyd)
BloomFilter is one that comes to mind.

I didn't follow all the details about the restricting, but it does sound like being more conservative there could help a lot.
(In reply to Nicholas Nethercote [:njn] from comment #174)
> BloomFilter is one that comes to mind.

Ah yes, you're (very) right:

  mozilla::BloomFilter<12u, nsAtom>::BloomFilter()

  baseline:  123 calls,   65436 insns (inclusive)
  patched:   122 calls, 3063664 insns (inclusive)

What I really need to do is get Callgrind to print a list of function
names along with the average cost per call.  Then we can just diff the
two lists to find regressing constructors.  I'll see if I can figure out
how to do that.
(In reply to Julian Seward [:jseward] from comment #175)
> (In reply to Nicholas Nethercote [:njn] from comment #174)
> > BloomFilter is one that comes to mind.
> 
> Ah yes, you're (very) right:
> 
>   mozilla::BloomFilter<12u, nsAtom>::BloomFilter()
> 
>   baseline:  123 calls,   65436 insns (inclusive)
>   patched:   122 calls, 3063664 insns (inclusive)

Why is this *so* much more expensive?  BloomFilter's constructor initializes the array of counters in the constructor.  Even if you add a member-list initialization (is that even possible with arrays?) and even if the compiler can't figure out that the clear() is all dead stores and eliminates it (I guess since this is -O0, maybe that's plausible?), what is giving you a two orders of magnitude increase in the number of insns here?

And even then, sure, BloomFilter is templated on some type T, but T is never stored in the class...so it doesn't really sound like an instance of what you're asking for?

ProtoAndIfaceCache is in a similar spot, with members that aren't member-list initialized, but are subject to other initializations in the constructor.

mozilla::Array might fit the bill, though I'd expect its members to be default-initialized anyway?

mozilla::Variant (and uses like mozilla::Result) sound almost like they fit the bill, but the storage isn't typed storage, IIRC, just storage sized large enough to hold the largest variant.  Unless these patches are doing something weird with that?  (If the patches are doing weird things with code like Variant--when they shouldn't be!--another thing to look at would be the auto-generated unions from IPDL files, which tediously, manually implement Variant.)

Looking at uses of AlignedStorage2, they mostly appear in unions, which IIUC, wouldn't subject them to the problems previously described.  There's an AlignedStorage2 use in GlyphBufferAzure which might cause problems, though?
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #176)
> >   mozilla::BloomFilter<12u, nsAtom>::BloomFilter()
> > 
> >   baseline:  123 calls,   65436 insns (inclusive)
> >   patched:   122 calls, 3063664 insns (inclusive)
> 
> Why is this *so* much more expensive?  BloomFilter's constructor initializes
> the array of counters in the constructor.  Even if you add a member-list
> initialization (is that even possible with arrays?) and even if the compiler
> can't figure out that the clear() is all dead stores and eliminates it (I
> guess since this is -O0, maybe that's plausible?), what is giving you a two
> orders of magnitude increase in the number of insns here?

You ask a good question.  It appears that the "12u" in the type above is the
number of bits in the key, and BloomFilter uses an array of 2^bits-in-key
uint8_t's, so to speak, so in this case the compiler will have to zero out
4k of memory (that is, (1 << 12) * sizeof(uint8_t)).  Which would explain
the big performance hit.

Yes, a member-list initialisation seems possible with arrays.  At least,
Tristan's tool just adds ": mCounters()" to the constructor, and we have

template<unsigned KeySize, class T>
  ..
  static const size_t kArraySize = (1 << KeySize);
  ..
  uint8_t mCounters[kArraySize];

Compiler flags are "-Og -fno-inline".  So I would expect at least basic stuff
(register allocation, constant folding, dead code removal) to happen.  -Og is
a bit less aggressive than even just -O1 so I doubt that the initialisation will
detected as dead.

> And even then, sure, BloomFilter is templated on some type T, but T is never
> stored in the class...so it doesn't really sound like an instance of what
> you're asking for?

That's true, but what this really shows us is that it's not just type template
variables that are potential source of trouble -- integer template vars can
be too.  (Is that the correct terminology for the "12u" in the type?)
(In reply to Julian Seward [:jseward] from comment #175)
> What I really need to do is get Callgrind to print a list of function
> names along with the average cost per call.  Then we can just diff the
> two lists to find regressing constructors.

I did that.  Attached are the lists for baseline vs patched, and also the
sdiff of them.  The sdiff shows many whose cost stayed the same, many that
got more expensive, and a few that got cheaper (!).  I'm still trying to
make sense of it.
Attached file my_callgrind_annotate (deleted) —
Script that extracts cost-per-call data from Callgrind output files
(a hacked version of the callgrind_annotate script).
(In reply to Julian Seward [:jseward] from comment #177)
> (In reply to Nathan Froyd [:froydnj] from comment #176)
> > >   mozilla::BloomFilter<12u, nsAtom>::BloomFilter()
> > > 
> > >   baseline:  123 calls,   65436 insns (inclusive)
> > >   patched:   122 calls, 3063664 insns (inclusive)
> > 
> > Why is this *so* much more expensive?  BloomFilter's constructor initializes
> > the array of counters in the constructor.  Even if you add a member-list
> > initialization (is that even possible with arrays?) and even if the compiler
> > can't figure out that the clear() is all dead stores and eliminates it (I
> > guess since this is -O0, maybe that's plausible?), what is giving you a two
> > orders of magnitude increase in the number of insns here?
> 
> You ask a good question.  It appears that the "12u" in the type above is the
> number of bits in the key, and BloomFilter uses an array of 2^bits-in-key
> uint8_t's, so to speak, so in this case the compiler will have to zero out
> 4k of memory (that is, (1 << 12) * sizeof(uint8_t)).  Which would explain
> the big performance hit.

Hm.  The compiler has to zero out 4k of memory in the baseline version, too, no?  Unless the zeroing happens in an out-of-line memset call--in which case I guess it wouldn't be counted in the above analysis?  Or perhaps the memset is inlined to string instructions, which might be counted as single instructions, and not measured in the amount of work they do?

Even so, let's say we have:

  BloomFilter() : mCounters() { memset(&mCounters[0], 0, sizeof(mCounters)); }

that seems like we should be doing ~2x work...not 50x as the above suggests.  (Though maybe we're not counting the actual memset code correctly, see above.)

> > And even then, sure, BloomFilter is templated on some type T, but T is never
> > stored in the class...so it doesn't really sound like an instance of what
> > you're asking for?
> 
> That's true, but what this really shows us is that it's not just type
> template
> variables that are potential source of trouble -- integer template vars can
> be too.  (Is that the correct terminology for the "12u" in the type?)

I think the standardese is "non-type template argument", but your version works just as well.
(In reply to Nathan Froyd [:froydnj] from comment #183)
> The compiler has to zero out 4k of memory in the baseline version, too, no?

Yes, correct.  FTR, Nathan and I chased this a bit on irc yesterday.  It
turns out that the baseline version does already initialise mCounters using
memset, efficiently, using 1 insn per 8 bytes.  Adding " :mCounters()"
causes g++ to zero it out (again) but this time using a very inefficient
loop using 6 insns per byte.  Nathan filed a GCC bug report,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82888, about this.  Thanks are
due to Nathan for getting me to look more closely at this.

> That's true, but what this really shows us is that it's not just type template
> variables that are potential source of trouble -- integer template vars can
> be too. 

This is misleading, I now understand.  The problem here is the initialisation
of an array-typed member, and unrelated to the use of integer template vars.
This has become very long and drawn out.  I would firstly like to thank
Tristan and Andi for their patience in answering questions and for iterating
the analysis and fixup patches multiple times.

Here is a summary of where we are.

* Tristan made an analysis patch (various iterations, latest in comment 135)

* Tristan used that to make an autofix patch for mozilla-central, and split
  it into 835 separate changes so I could binary-search it (comment 152).

* I got it to work (comment 153) and saw a 7% slowdown on Speedometer
  (comment 155) -- an old version, but I don't think that's relevant at this
  stage.  This is an order of magnitude worse than I expected.  Code size
  increase is 0.55% (comment 169) which is the ballpark I'd hoped for for
  the slowdown too.

* After some digging (comment 169 to 184) a couple of causes were identified:

  - Initialisation of members with template types is dangerous, because the
    cost of initialising the templated variable can be arbitrary.  We saw
    that a lot with Maybe<T>.

  - Initialisation of members with array types is dangerous, because the
    initialisation is applied to all elements.  We see this with
    BloomFilter<..>.

* It was proposed to restrict the analysis to adding initialisations for
  fields only of scalar and pointer (primitive) types (comment 169).  This
  would avoid the above difficulties and still, IMO, cover the most
  difficult-to-otherwise-detect uninitialised field uses.

To move forward on this, I believe we currently need two things:

* A fixup patch restricted as described above, so we have a chance to get
  the perf hit down to a level which is more feasible.

* A description of the analysis algorithm.  This is in order that:

  - We can understand the limitations of the algorithm -- what it can and
    can't be expected to do.

  - We can relate it to existing techniques for control flow-, data flow-
    and alias-analysis.

  In particular I was a little concerned to hear that the existing analysis
  patch generates both false positives and false negatives.  I would have
  expected it to have one or the other, but not both, so I'd like to
  understand why this happens.

Tristan, Andi, does that sound reasonable?
(In reply to Julian Seward [:jseward] from comment #185)
> This has become very long and drawn out.  I would firstly like to thank
> Tristan and Andi for their patience in answering questions and for iterating
> the analysis and fixup patches multiple times.
> 
> Here is a summary of where we are.
> 
> * Tristan made an analysis patch (various iterations, latest in comment 135)
> 
> * Tristan used that to make an autofix patch for mozilla-central, and split
>   it into 835 separate changes so I could binary-search it (comment 152).
> 
> * I got it to work (comment 153) and saw a 7% slowdown on Speedometer
>   (comment 155) -- an old version, but I don't think that's relevant at this
>   stage.  This is an order of magnitude worse than I expected.  Code size
>   increase is 0.55% (comment 169) which is the ballpark I'd hoped for for
>   the slowdown too.
> 
> * After some digging (comment 169 to 184) a couple of causes were identified:
> 
>   - Initialisation of members with template types is dangerous, because the
>     cost of initialising the templated variable can be arbitrary.  We saw
>     that a lot with Maybe<T>.
> 
>   - Initialisation of members with array types is dangerous, because the
>     initialisation is applied to all elements.  We see this with
>     BloomFilter<..>.
> 
> * It was proposed to restrict the analysis to adding initialisations for
>   fields only of scalar and pointer (primitive) types (comment 169).  This
>   would avoid the above difficulties and still, IMO, cover the most
>   difficult-to-otherwise-detect uninitialised field uses.
> 
> To move forward on this, I believe we currently need two things:
> 
> * A fixup patch restricted as described above, so we have a chance to get
>   the perf hit down to a level which is more feasible.
> 
> * A description of the analysis algorithm.  This is in order that:
> 
>   - We can understand the limitations of the algorithm -- what it can and
>     can't be expected to do.
> 
>   - We can relate it to existing techniques for control flow-, data flow-
>     and alias-analysis.
> 
>   In particular I was a little concerned to hear that the existing analysis
>   patch generates both false positives and false negatives.  I would have
>   expected it to have one or the other, but not both, so I'd like to
>   understand why this happens.
> 
> Tristan, Andi, does that sound reasonable?

Thanks for doing this thorough analysis. If we will find ourselves in a situation where the current logic limits us, we could also try to use the call graph, but i would use this as a last resource since I think this would involve some serious code refactoring.
We can do a doc where we can describe each step of the algorithm and finally we can inspect it with a tiny example. This is a good idea maybe you can look other it an propose some improvements if you see something obvious.  

And as a last thought, I'm not sure that now we support this kind of context, where we take the address of a member variable and store it into a pointer to a member variable. I'm not sure we should also cover this case, It's very narrowed and I don't see this practice into gecko.
After some discussion with Julian and Andi regarding the state of the checker, we have decided that we should focus on making the patch generation entirely automatic through the autofixes.
This might have a cost in terms of correctness (when in doubt, we will prefer initializing redundantly rather than letting the programmer fix the mistake), but the benefits we get in terms of workflow seem to outweight this problem (plus redundant initializations can be a good thing in some cases).
Also, fields which can only be initialized by hand (for example enums) will generate a comment with a FIXME.
Of course, we will still be analyzing performance and getting rid of patches causing large perf regressions (probably through an annotation).

I'm working on making a new version of the checker which covers these changes as well as ignoring templated types and arrays.

As for a doc describing the exact criterias for the analysis, it should come soon afterwards.
Flags: needinfo?(tristanbourvon)
(In reply to Julian Seward [:jseward] from comment #184)
> (In reply to Nathan Froyd [:froydnj] from comment #183)
> > The compiler has to zero out 4k of memory in the baseline version, too, no?
> 
> Yes, correct.  FTR, Nathan and I chased this a bit on irc yesterday.  It
> turns out that the baseline version does already initialise mCounters using
> memset, efficiently, using 1 insn per 8 bytes.  Adding " :mCounters()"
> causes g++ to zero it out (again) but this time using a very inefficient
> loop using 6 insns per byte.  Nathan filed a GCC bug report,
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82888, about this.

While working on the GCC patch, I noticed that:

  : mCounters{0}

produces...if not excellent code, at least something approximating `rep stosq`, even at -O0.  So that might be a reasonable workaround for the BloomFilter case, and maybe a few others?
(In reply to Tristan Bourvon from comment #167)
> The only problem here is that this is a sec bug, so we are supposed to keep
> a low visibility on try...

Don't worry about it too much. Leaving out the bug number on try is low-enough profile for this one.
Flags: needinfo?(dveditz)
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
Attachment #8898836 - Attachment is obsolete: true
Attachment #8898838 - Attachment is obsolete: true
Attached patch 525063-autofix.patch (obsolete) (deleted) — Splinter Review
Attached patch Handfixes.patch (obsolete) (deleted) — Splinter Review
Attachment #8916240 - Attachment is obsolete: true
Here are the updated patches: the autofix produces almost compilable code (in non-debug mode), with a few exceptions fixed in the "Handfixes" patch (to be applied after running the autofix).
These exceptions are mostly OS-dependant stuff or minor bugs in the checker, but we'll have to apply a patch with manual edits anyway to cover the DEBUG builds, so this is not a very big issue nor a priority right now.

Julian, do you think we could get a performance analysis of this patch? It should be much lower than the previous one.

To get this up and running, you will probably need to build and use your own version of clang-tidy and add the mozilla checkers using the python scripts located in build/clang-plugin (mostly import_mozilla_checks.py). Don't forget to specify `-header-filter=".*"` and `-fix` when running run-clang-tidy.py. You might need to generate the compile_commands.json file using `./mach build-backend --backend=CompileDB` before being able to run clang-tidy. Make sure you're not in debug mode when doing this, otherwise I can't guarantee the current patch will build.

Don't hesitate to ping me on IRC if you need any assistance!
Flags: needinfo?(jseward)
Attached patch DebugCompat.patch (obsolete) (deleted) — Splinter Review
This patch is to be applied after the autofix but before the other handfixes.
It allows the debug builds to complete successfully by adding preprocessor directives that properly initialize variables depending on the build configuration.
Attached patch DebugCompat.patch (obsolete) (deleted) — Splinter Review
Attachment #8938371 - Attachment is obsolete: true
Attached patch EnumInit.patch (obsolete) (deleted) — Splinter Review
This file manually fixes all the FIXME comments added by the autofix for uninitialized enum fields.

The order in which the files should be applied is now:
1. (Run the autofix and create a patch from it)
2. Debug compatibility
3. Handfixes
4. Enum initialization

Perf results are encouraging; only a handful of specific tests suffer a perf hit, so it is likely that they can be fixed with few changes to the main patch. Some tests even show a perf improvement depending on the base version used for comparison.
Attached patch Autofix-rev397717-f78a83244fbe.patch (obsolete) (deleted) — Splinter Review
This is the autofix generated on m-c rev 397717 (f78a83244fbe). Can be useful if someone just wants to try applying it without having to go through all the autofix generation setup.
Attached patch 525063.patch (obsolete) (deleted) — Splinter Review
Update to the checker to ignore classes with a custom new operator, since the new operator could initialize fields before the constructor is called, making initializations dangerous.
Attachment #8936222 - Attachment is obsolete: true
Attached patch 525063-autofix.patch (obsolete) (deleted) — Splinter Review
Updated patch (see c198).
Attachment #8936223 - Attachment is obsolete: true
Attached patch DebugCompat.patch (obsolete) (deleted) — Splinter Review
Updated patch (see c198).
Attachment #8939225 - Attachment is obsolete: true
Attached patch Handfixes.patch (obsolete) (deleted) — Splinter Review
Updated patch (see c198).
Attachment #8936224 - Attachment is obsolete: true
Attached patch EnumInit.patch (obsolete) (deleted) — Splinter Review
Updated patch (see c198).
Attachment #8939226 - Attachment is obsolete: true
Attached patch Autofix-rev397717-f78a83244fbe.patch (obsolete) (deleted) — Splinter Review
Updated patch (see c198).
Attachment #8940202 - Attachment is obsolete: true
Here is a performance comparison based on rev 397717 (f78a83244fbe).
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=aab423e138ce9501ef6bfd91f797fdf30ca84989&newProject=try&newRevision=4adfbeeb3682e967704b3df9b49e6814cb0ea3d0&framework=1

All unit tests seem to be passing successfully, except for known intermittent issues.
With the new patch set, in comments 203, 200, 201, 202 (applied in that order), and
testing with https://mozilla.github.io/arewefastyet-speedometer/2.0/ on a Skylake
Xeon underclocked to 2.4 GHz, built with gcc-6.4.1 -O2, x86_64-linux, I have the
following results, for 4 runs.  Larger numbers are better.

  base    40.90 41.19 40.54 40.62   avg 40.81
  patched 40.86 40.58 40.70 40.69   avg 40.71

This shows a slowdown of much less than 1%, and certainly below the level of noise in
the measurements.  Considering that the previous patch set gave a 7% slowdown (see
comment 155) this is a great improvement.

Talos shows 3 cases where there's a 2% to 4% speedup (no, I'm not claiming to understand it)
and unfortunately 4 cases where there's a slowdown of 2% to 5%.  So that's the next thing
to chase.  Results are here:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c9a6c3630d3e&newProject=try&newRevision=ed22bf9677a1bdc84f3fc8118e1a81d838c540ec&framework=1&showOnlyImportant=1
Flags: needinfo?(jseward)
Julian, if you have any idea, what is the next step?
The next step is to see if the Talos regressions in comment 205 can be reproduced
locally, so we can profile and see what's going on there.
Attached patch Autofix-rev400482-0e62eb7804c0.patch (obsolete) (deleted) — Splinter Review
Here's the updated autofix for rev400482-0e62eb7804c0
Attachment #8940510 - Attachment is obsolete: true
Attached patch DebugCompat.patch (obsolete) (deleted) — Splinter Review
Updated for rev400482-0e62eb7804c0
Attachment #8940507 - Attachment is obsolete: true
Attached patch Handfixes.patch (obsolete) (deleted) — Splinter Review
Updated for rev400482-0e62eb7804c0
Attachment #8940508 - Attachment is obsolete: true
Attached patch EnumInit.patch (obsolete) (deleted) — Splinter Review
Updated for rev400482-0e62eb7804c0
Attachment #8940509 - Attachment is obsolete: true
(In reply to Julian Seward [:jseward] from comment #207)
> The next step is to see if the Talos regressions in comment 205 can be
> reproduced locally, so we can profile and see what's going on there.

I've been trying to reproduce the 4.91% regression on "tresize opt e10s /
linux64-qr" as referred to in Comment 205.  This has been slowed down by
problems running Talos locally.  Anyway, what I have is, running like this,
with gcc-5.4 -O2 build on Ubuntu 16.04

  MOZ_WEBRENDER=1 MOZ_ACCELERATED=1 ./mach talos-test -a tresize

is

(1) on an Intel Haswell, Intel integrated graphics: circa 1% slowdown

(2) on VirtualBox 5.2.6, with Settings->Display->Screen->Enable*Acceleration
    disabled: circa 10% slowdown

Concentrating on (2).  about:support has

  WebGL 1 Driver Renderer   VMware, Inc. -- llvmpipe (LLVM 5.0, 256 bits)

I have so far failed to profile (2) using Callgrind, because './mach
talos-test' times out and fails in hard-to-understand ways.  I continue to
try.
(In reply to Julian Seward [:jseward] from comment #212)

For (2) [VirtualBox 5.2.6], which has a 10% wallclock slowdown, I
cannot measure any comparable increase in instruction count, 
memory access count or simulated cache miss count using Callgrind
or Cachegrind.  The increases are only around 1%.

Andi and I hypothesised that it might be the use of LLVMpipe
(MESA's software renderer) that causes the slowdown.  So on
real hardware -- (2) in comment #212 -- I reran the tests with
LIBGL_ALWAYS_SOFTWARE=true, having first verified with /usr/bin/glxinfo
that LIBGL_ALWAYS_SOFTWARE=true does in fact cause LLVMpipe
to be used.

Like this I also could not show any significant change, at most
a 0.5% change in the median value, with the number of runs of
tresize set to 50.
FTR, I am pretty much lost at this point, and not sure how to proceed.
Is it worth chasing this regression any further?  I don't know.
Depends on: 1434323
Andi created bug 1434323 to follow up on performances (this bug with 214 comments is too long)
Attached patch Autofix-rev402508-f1a4b64f19b0.patch (obsolete) (deleted) — Splinter Review
Updated for rev402508 (f1a4b64f19b0)
Attachment #8945343 - Attachment is obsolete: true
Attached patch Handfixes.patch (obsolete) (deleted) — Splinter Review
Updated for rev402508 (f1a4b64f19b0)
Attachment #8945345 - Attachment is obsolete: true
Attached patch DebugCompat.patch (obsolete) (deleted) — Splinter Review
Updated for rev402508 (f1a4b64f19b0)
Attachment #8945344 - Attachment is obsolete: true
Attached patch EnumInit.patch (obsolete) (deleted) — Splinter Review
Updated for rev402508 (f1a4b64f19b0)
Attachment #8945346 - Attachment is obsolete: true
Now to be applied in the following order:
1. Autofix
2. Handfixes
3. Debug compat
4. Enum init
It's also very important to test this with low-end HW, and in 32-bit as well on low-ram machines.  Cache sizes and processor impls differ considerably.  How it reacts on a XEON may be very different than a core i3, or mobile dual-core, or worse.  Include older-but-still-usable laptops

(yes, this is all obvious, but I thought I'd make sure we're doing that.)
Product: Core → Firefox Build System
Attached patch Autofix-rev409643-d2bf7b9c80a7.patch (obsolete) (deleted) — Splinter Review
Autofix based on 409643 (d2bf7b9c80a7)
Attachment #8949876 - Attachment is obsolete: true
Attached patch Handfixes.patch (obsolete) (deleted) — Splinter Review
based on 409643 (d2bf7b9c80a7)
Attachment #8949877 - Attachment is obsolete: true
Attached patch DebugCompat.patch (obsolete) (deleted) — Splinter Review
Attachment #8949878 - Attachment is obsolete: true
Attached patch EnumInit.patch (obsolete) (deleted) — Splinter Review
based on 409643 (d2bf7b9c80a7)
Attachment #8949879 - Attachment is obsolete: true
Comment on attachment 8963768 [details] [diff] [review]
Autofix-rev409643-d2bf7b9c80a7.patch

Ehsan, do you think we could land this now? (please see the discussion in https://bugzilla.mozilla.org/show_bug.cgi?id=1434323)
Attachment #8963768 - Attachment description: Autofix-rev409643-d2bf7b9c80a7.patch → Autofix-rev409643-d2bf7b9c80a7.patch
Attachment #8963768 - Flags: review?(ehsan)
Attachment #8963769 - Flags: review?(ehsan)
Attachment #8963770 - Flags: review?(ehsan)
Attachment #8963771 - Flags: review?(ehsan)
Comment on attachment 8963768 [details] [diff] [review]
Autofix-rev409643-d2bf7b9c80a7.patch

Review of attachment 8963768 [details] [diff] [review]:
-----------------------------------------------------------------

Great work here, thanks so much!  r=me with some comments.

::: dom/filesystem/GetFilesHelper.h
@@ +159,5 @@
>    GetFilesHelperChild(nsIGlobalObject* aGlobal, bool aRecursiveFlag)
>      : GetFilesHelper(aGlobal, aRecursiveFlag)
>      , mPendingOperation(false)
> +  {
> +this->mUUID.m0 = {};

It would be nice if you would fix the formatting here and in other places where multiple statements are added.  Thanks!

::: intl/icu/source/common/unicode/bytestrie.h
@@ +105,4 @@
>           * Constructs an empty State.
>           * @stable ICU 4.8
>           */
> +        State()  : pos{nullptr}, remainingMatchLength{} { bytes=NULL; }

Files under intl/icu should not be edited by us since they are upstream.

::: toolkit/crashreporter/breakpad-client/mac/crash_generation/crash_generation_server.cc
@@ +57,4 @@
>        dump_dir_(dump_path.empty() ? "/tmp" : dump_path),
>        started_(false),
>        receive_port_(mach_port_name),
> +      mach_port_name_(mach_port_name) , server_thread_{nullptr} {

I believe breakpad-client is also upstream code.
Attachment #8963768 - Flags: review?(ehsan) → review+
Attachment #8963769 - Flags: review?(ehsan) → review+
Attachment #8963770 - Flags: review?(ehsan) → review+
Attachment #8963771 - Flags: review?(ehsan) → review+
Al, Dan, as these are pretty big patches, do you have any recommendations?
Flags: needinfo?(dveditz)
Flags: needinfo?(abillings)
(In reply to :Ehsan Akhgari from comment #227)
> I believe breakpad-client is also upstream code.

Per Ted, toolkit/crashreporter/google-breakpad is still upstream code, but toolkit/crashreporter/breakpad-client is forked.
Not sure what the question is. Big patches land on nightly all the time. Doesn't look like anyone's trying to uplift this one anywhere, and since it's not a sec-high/sec-critical kind of bug we should leave it at that.

If we know of a particular uninitialized variable that's resulting in crashes in Socorro or fuzz testing we could cherry pick patches just for those instances.
Flags: needinfo?(dveditz)
Flags: needinfo?(abillings)
I would really like to understand why we added initialization of fields that were purposefully not being initialized.  And left in place the comments that say they are purposefully not being initialized.

I would also like to understand why we used a _really_ confusing initializer syntax that makes it incredibly unclear where the initializers end and the constructor body begins...
...and why we gratuitously reformatted code that has nothing to do with initializers, e.g.

https://hg.mozilla.org/integration/mozilla-inbound/rev/d7d2f08e051c3c355d41dc5c1ce1bd3ca55a43b9#l22.19

Chatter in #jsapi also suggests that a bunch of code under js/ was misformatted as well.  Can we back this patch out?
Flags: needinfo?(tristanbourvon)
Flags: needinfo?(tristanbourvon)
I discussed this with Tristan and asked him to back out the js/src part of this patch so it can get a proper review from a JS peer. It looks like great work. I expect a reviewed patch to re-land speedily.
For posterity, that js/src backout did in fact happen.
https://hg.mozilla.org/integration/mozilla-inbound/rev/6ff8aaef2866

There was also a follow-up patch pushed after comment 231 to fix webrender assertions.
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf13e4103150
I'm not happy with the XPConnect patches, nor the ones in layout or DOM. IMO this whole thing should be backed out, not just the changes in js because jorendorff happened to ask. Boris, Nathan - do you agree?
Flags: needinfo?(nfroyd)
Flags: needinfo?(bzbarsky)
(In reply to Jason Orendorff [:jorendorff] from comment #234)
> I discussed this with Tristan and asked him to back out the js/src part of
> this patch so it can get a proper review from a JS peer. It looks like great
> work. I expect a reviewed patch to re-land speedily.

The js/public parts should also be backed out, too, right?  (I'm out of time today to just do it.)
(In reply to Bobby Holley (:bholley) from comment #238)
> I'm not happy with the XPConnect patches, nor the ones in layout or DOM. 
Could you explain the issues you see with that?

> The js/public parts should also be backed out, too, right? 
I has been done, see comment #237
This patch made coverity for fennec pretty happy: 
Defects eliminated: 385

I will trigger a run of Firefox desktop soon.
I agree with comment 232 (and would also note numerous failures to format the code well), and think this should be backed out.
(In reply to Sylvestre Ledru [:sylvestre] from comment #240)
> (In reply to Bobby Holley (:bholley) from comment #238)
> > I'm not happy with the XPConnect patches, nor the ones in layout or DOM. 
> Could you explain the issues you see with that?

Others have pointed a lot of this out already:

- reformatting of unrelated code (for example in dom/base/nsSyncLoadService.cpp)
- just bad formatting in general (for example the initialisation of mOnStopRequestCalled in FetchDriver.cpp, but there's a ton of examples)
- an initialiser syntax that's now applied inconsistently (using {} instead of ())
- inconsistency in the changes themselves (for example in WebGLContext, mBypassShaderValidation is changed but not mCapturedFrameInvalidated)

I also don't think this patch was ready to be landed.
Backed out all 4 changesets in this bug on request from Andi: initial commit, follow-up, backout of js/src/, backout of js/public/

https://hg.mozilla.org/mozilla-central/rev/8a94faa5cc60495da5d80d4b3c07bf5877d2e6d8
Tristan, I suggest breaking up the next iteration into many small patches and getting individual reviews on each bit.  Sorry for the hassle.
Also it seems that it doesn't handle default initialization? Or that's what Jonathan reported at least.
On the bright side, the mega patch managed to land without any performance alerts :)
> the mega patch managed to land without any performance alerts

What that tells me is that we probably don't have good enough test coverage for bindings.
Flags: needinfo?(bzbarsky)
It looks like the patch has been backed out.  Thank you Tristan and Andi for handling all of this!
Flags: needinfo?(nfroyd)
(In reply to :Ehsan Akhgari from comment #245)
> Tristan, I suggest breaking up the next iteration into many small patches
> and getting individual reviews on each bit.  Sorry for the hassle.

FWIW, I don't think this is strictly necessary. Asking people to review relevant sections of a megapatch is probably fine. And I think this was primarily controversial because of the issues mentioned in comment 243, so I suspect the reviews will be mostly rubber-stamp once those issues are fixed.
Depends on: 1453795
All of the concerns in this thread are very legitimate, and in the end I am now convinced that backing out the patch was the right thing to do (though not ideal in terms of security concerns, but uninitialized variables are pretty rarely exploitable, luckily).

Most of the issues should be fairly easy to address with manual review and editing. As Bobby said, I think getting people to review the relevant sections of a single patch should be better because it helps us catch perf regressions more easily, though it might mean more time before we land it.


I am currently on vacations, so I'll be taking some time off regarding this patch (I'm officially not a contractor anymore). However, I will of course shortly resume work as a regular community contributor to get rid of the remaining problems, so that we can finish landing it :)

Thank you for your help, understanding and patience regarding this patch, it's not an easy thing to land - at least for me.
Group: dom-core-security → core-security-release
(In reply to Ryan VanderMeulen [:RyanVM] from comment #247)
> On the bright side, the mega patch managed to land without any performance
> alerts :)

It can take a week to get alerts.
One concern that has been voiced is that the zero values for some types are not accurate. For example, int32_t does not get initialized with {0}. This will be fixed before trying to land the patch again, among other things.
Tristan, any news on this? Merci!
Flags: needinfo?(tristanbourvon)
Fixes points 1 and 3 from #1453795
Attachment #8963768 - Attachment is obsolete: true
Flags: needinfo?(tristanbourvon)
Attached patch Handfixes.patch (deleted) — Splinter Review
Attachment #8963769 - Attachment is obsolete: true
Attached patch DebugCompat.patch (deleted) — Splinter Review
Attachment #8963770 - Attachment is obsolete: true
Attached patch EnumInit.patch (deleted) — Splinter Review
Attachment #8963771 - Attachment is obsolete: true
Attached patch RemoveICUfixes.patch (deleted) — Splinter Review
This patch should become unnecessary in the future as Andi fixed a bug in the thirdparty detection logic
Attached patch 525063-autofix.patch (obsolete) (deleted) — Splinter Review
This fixes missing default values when zero-initializing non-character integers
Attachment #8940506 - Attachment is obsolete: true
Attached patch 525063.patch (deleted) — Splinter Review
Fixes some in-class initializers not being taken into account.
Attachment #8940505 - Attachment is obsolete: true
Attached patch 525063-autofix.patch (deleted) — Splinter Review
Attachment #8974052 - Attachment is obsolete: true
Two things:
* all the fun is now happening in bug 1453795 for the fixes
* A few days ago, Umann Kristóf landed in clang analyzer a new checker (https://reviews.llvm.org/rL334935) to look for this. I run scan-build on trunk and here are the results:
http://sylvestre.ledru.info/reports/fx-scan-build/
Looks for "Uninitialized fields"
Umann is working on a follow up patch: https://reviews.llvm.org/D48325
Severity: normal → enhancement

The leave-open keyword is there and there is no activity for 6 months.
:andi, maybe it's time to close this bug?

Flags: needinfo?(bpostelnicu)

We have now coverity at review phase.
Clang 8 has the new checker on alpha and 9 will have it as stable. We will integrate this checker at review phase.
Therefore closing this bug.

Status: REOPENED → RESOLVED
Closed: 8 years ago5 years ago
Flags: needinfo?(bpostelnicu)
Resolution: --- → FIXED
Product: Firefox Build System → Developer Infrastructure
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: