Closed Bug 893363 Opened 11 years ago Closed 11 years ago

Ionmonkey: improve support for the Linux perf performance tool

Categories

(Core :: JavaScript Engine, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: dougc, Assigned: dougc)

References

(Blocks 1 open bug)

Details

(Whiteboard: [games])

Attachments

(1 file, 10 obsolete files)

Experience using the Linux 'perf' tool has suggested a number of improvements:

1. There appears to be a stale pointer reference to the function name causing crashes on ARM Linux.  There is a workaround that copies the name to malloc'ed storage, and this eliminates these crashes.  A better solution is planned to address this.

2. The spew file is opened many times, and either not closed or closed too later, and this caused the limit on open files to be reached.  The patch opens the spew file when writing each entry and quickly closes it after each.

3. There are potentially numerous threads writing perf map data and they were walking on each other.  The patch uses a separate file for each thread, avoiding any synchronisation issues, and the files can be later concatenated into a single map file for report generation.  This is a burden when using 'perf top', and an alternative would be to use a lock to synchronise access to a single map file for the process.

4. The patch adds entries for all the dynamic code, including the AsmJS entries and exits, the Baseline code, and the Ion trampolines.  Hopefully it has all been accounted for.  Not all of the entries are particularly useful, but at least it helps validate the map, and gives some more confidence in the reports.

5. Some of the map entries were not recording the correct address. For the ARM it is necessary to use 'actualOffset' to calculate the real final address of labels.

The code needs more reworking and cleanup, but might be of interest to others as-is.
Attached patch perf profiling tool support improvements (obsolete) (deleted) — Splinter Review
This patch is relative to m-c.  It might assume that the patches for bug 893314 and bug 893315 have been applied.
The version has been rebased after bug 880538 which is an important bug to land and which makes some significant changes that might need to be taken into account.
Attached patch Rebase. (obsolete) (deleted) — Splinter Review
Bug 880538 has landed now.
Attachment #775140 - Attachment is obsolete: true
Attachment #775141 - Attachment is obsolete: true
Attached patch bug893363-4.patch (obsolete) (deleted) — Splinter Review
Attachment #778486 - Attachment is obsolete: true
Attached patch Rebase (obsolete) (deleted) — Splinter Review
Attachment #785444 - Attachment is obsolete: true
Attached patch bug893363-6.patch (obsolete) (deleted) — Splinter Review
* Rebased.  Tried to move in the direction of supporting the serialization of the asm.js module by separating the recording of the code areas from the writing of them to the map file.

* Fixed the GC issue with the names - they are now traced.  Have not seen any more crashes or name corruption on the ARM.

* Fixed numerous bugs in the maps, particularly for the ARM.

* Record the start of the OOL code segment and make use of it to give an accurate map at the end of functions.

* Move in the direction of thread safe writing of the map file.  The files are opened and closed briefly which might well support a lock to sync access to a single file, but for now each thread writes to a separate map file.
Attachment #787218 - Attachment is obsolete: true
Attachment #789542 - Flags: review?(bbouvier)
Attached patch Rework the function name GC marking. (obsolete) (deleted) — Splinter Review
Modified the function name GC marking to use the approach suggested in bug 902506 and bug 904809.  Tested on the ARM and still no name corruption or crashes.
Attachment #789542 - Attachment is obsolete: true
Attachment #789542 - Flags: review?(bbouvier)
Attachment #790001 - Flags: review?(bbouvier)
Comment on attachment 790001 [details] [diff] [review]
Rework the function name GC marking.

Review of attachment 790001 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice patch! I feel like everything gets recorded from now, it will avoid the gaps that I saw in some cases :)

My only concern is: can perf aggregate the mapping files in case of parallel compilation?
If it's not the case, we might want a mutex on the file descriptor instead, what do you think?

I think PerfSpewer.cpp needs some cleanup before r+, but otherwise everything looks good to me. Thanks for this patch!

::: js/src/jit/AsmJSModule.h
@@ +319,3 @@
>  
> +        ProfiledBlocksFunction(JSAtom *name, unsigned start, unsigned ool, unsigned end,
> +                               ion::BasicBlocksVector &blocksVector)

Style: as ool code is actually located after the end of the function and as it's not used in the ProfiledFunction ctor, could you put ool after end in the args list? I guess it might be a pain, but it also makes sense in the mean time, so it's up to you.

@@ +420,5 @@
> +        for (unsigned i = 0; i < perfProfiledFunctions_.length(); i++)
> +            perfProfiledFunctions_[i].trace(trc);
> +        for (unsigned i = 0; i < perfProfiledBlocksFunctions_.length(); i++)
> +            perfProfiledBlocksFunctions_[i].trace(trc);
> +#endif

As of today, this block has already been introduced before and is present on inbound.

@@ +510,5 @@
>      }
>      unsigned numProfiledFunctions() const {
>          return profiledFunctions_.length();
>      }
> +    ProfiledFunction &profiledFunction(unsigned i) {

Why did it lose the const qualifier? Correct me if I'm wrong, but all calls I've seen could store a const reference.

@@ +524,5 @@
>      }
>      unsigned numPerfFunctions() const {
>          return perfProfiledFunctions_.length();
>      }
> +    ProfiledFunction &perfProfiledFunction(unsigned i) {

ditto (const)

@@ +536,5 @@
>      }
>      unsigned numPerfBlocksFunctions() const {
>          return perfProfiledBlocksFunctions_.length();
>      }
> +    ProfiledBlocksFunction &perfProfiledBlocksFunction(unsigned i) {

and ditto (const)

::: js/src/jit/CodeGenerator.cpp
@@ +5387,5 @@
>      if (!generateEpilogue())
>          return false;
> +#if defined(JS_ION_PERF)
> +    // Note the end of the inline code and start of the OOL code.
> +    gen->perfSpewer().startOutOfLineCode(masm);

Could you put this call in generateOutOfLineCode instead?

::: js/src/jit/IonCaches.cpp
@@ +406,5 @@
>                  this, attachKind, CacheName(kind()), code->raw());
>      }
> +
> +#ifdef JS_ION_PERF
> +    writePerfSpewerIonCodeProfile(code, "IonCache");

I see in the IonSpew calls above that you could add some useful information for the user, like filename and lineno. It implies adding variable arg lists to writePerfSpewerIonCodeProfile or creating a const char* before this call. Would it be easily doable?

::: js/src/jit/PerfSpewer.cpp
@@ -35,5 @@
>          const char *env = getenv("IONPERF");
>          if (env == NULL) {
>              PerfMode = PERF_MODE_NONE;
> -            fprintf(stderr, "Warning: JIT perf reporting requires IONPERF set to \"block\" or \"func\". ");
> -            fprintf(stderr, "Perf mapping will be deactivated.\n");

Could you please keep this warning message?

@@ +120,5 @@
> +    masm.bind(&ool);
> +    return true;
> +}
> +
> +#include <pthread.h>

Could you put this include at the top of this file, wrapped in a IFDEF statement?

@@ +127,5 @@
> +#if defined(__ANDROID__)
> +ssize_t PerfSpewDirIndex = 0;
> +#endif
> +
> +static FILE *openPerfMap(void)

nit: static FILE*
openPerfMap(void)

@@ +142,5 @@
> +    char filenameBuffer[bufferSize];
> +
> +    pthread_t tid = pthread_self();
> +
> +    if (snprintf(filenameBuffer, bufferSize, "%sperf-%d.map.%lx", PerfSpewDir, getpid(), tid)

If I |perf record| a run of the shell with parallel compilation enable, will |perf report| be able to use the mapping files altogether?

@@ +157,5 @@
> +        PerfSpewDirIndex++;
> +        return openPerfMap();
> +    }
> +# endif
> +

nit: trailing whitespace

@@ +169,5 @@
>                           IonCode *code,
>                           MacroAssembler &masm)
>  {
> +    FILE *fp = openPerfMap();
> +

nit: trailing whitespace

@@ +249,5 @@
> +void
> +js::ion::writePerfSpewerBaselineProfile(JSScript *script, IonCode *code)
> +{
> +    FILE *fp = openPerfMap();
> +

nit: trailing whitespace

@@ +253,5 @@
> +
> +    if (!fp)
> +        return;
> +
> +    if (PerfFuncEnabled() || PerfBlockEnabled()) {

if (PerfEnabled()) {

Also, this check could be done at the beginning of the function, as in writePerfSpewerAsmJSFunctionMap.

@@ +266,5 @@
> +    fclose(fp);
> +}
> +
> +void
> +js::ion::writePerfSpewerIonCodeProfile(IonCode *code, const char *msg)

Could this function and the one above be factorized?

@@ +272,5 @@
> +    if (!code)
> +        return;
> +
> +    FILE *fp = openPerfMap();
> +

nit: trailing whitespace

@@ +276,5 @@
> +
> +    if (!fp)
> +        return;
> +
> +    if (PerfFuncEnabled() || PerfBlockEnabled()) {

PerfEnabled()

@@ +299,4 @@
>          return;
>  
> +    FILE *fp = openPerfMap();
> +

nit: trailing whitespace

@@ +341,5 @@
> +    if (!PerfBlockEnabled() || basicBlocks.length() == 0)
> +        return;
> +
> +    FILE *fp = openPerfMap();
> +

nit: trailing whitespace

@@ +402,5 @@
> +    if (size == 0)
> +        return;
> +
> +    FILE *fp = openPerfMap();
> +

nit: trailing whitespace

@@ +406,5 @@
> +
> +    if (!fp)
> +        return;
> +
> +    fprintf(fp, "%lx %lx AsmJS Entries and Exits (0x%lx 0x%lx)\n", base, size, base, size);

The name of the function suggests that there is only one entry, so one of the two needs to be modified.

@@ +407,5 @@
> +    if (!fp)
> +        return;
> +
> +    fprintf(fp, "%lx %lx AsmJS Entries and Exits (0x%lx 0x%lx)\n", base, size, base, size);
> +

nit: trailing whitespace

::: js/src/jit/PerfSpewer.h
@@ +44,5 @@
> +    Record(const char *filename,
> +           unsigned lineNumber,
> +           unsigned columnNumber,
> +           uint32_t id)
> +        : filename(filename), lineNumber(lineNumber),

nit: two spaces only before the ':'
Attachment #790001 - Flags: review?(bbouvier)
Attached patch Rebase (obsolete) (deleted) — Splinter Review
Attachment #790001 - Attachment is obsolete: true
(In reply to Benjamin Bouvier [:bbouvier] from comment #8)
> Comment on attachment 790001 [details] [diff] [review]
> Rework the function name GC marking.
> 
> Review of attachment 790001 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Very nice patch! I feel like everything gets recorded from now, it will
> avoid the gaps that I saw in some cases :)
> 
> My only concern is: can perf aggregate the mapping files in case of parallel
> compilation?
> If it's not the case, we might want a mutex on the file descriptor instead,
> what do you think?

Thank you for all the feedback.  The patch has been reworked quite a bit
based on the feedback.

It now writes to one file per process, rather than one per thread, and
uses a lock to synchronize access.  This allows the file to be kept
open, so it is now opened when initializing, flushed before releasing
the lock, and never closed.

> I think PerfSpewer.cpp needs some cleanup before r+, but otherwise
> everything looks good to me. Thanks for this patch!

An attempt has been made, but perhaps not all the choices are appropriate.
 
> ::: js/src/jit/AsmJSModule.h
> @@ +319,3 @@
> >  
> > +        ProfiledBlocksFunction(JSAtom *name, unsigned start, unsigned ool, unsigned end,
> > +                               ion::BasicBlocksVector &blocksVector)
> 
> Style: as ool code is actually located after the end of the function and as
> it's not used in the ProfiledFunction ctor, could you put ool after end in
> the args list? I guess it might be a pain, but it also makes sense in the
> mean time, so it's up to you.

The OOL code is located before this 'end' of the function, so the order
looks ok.  'ool' has been renamed which might help clarify the code.
 
> @@ +420,5 @@
> > +        for (unsigned i = 0; i < perfProfiledFunctions_.length(); i++)
> > +            perfProfiledFunctions_[i].trace(trc);
> > +        for (unsigned i = 0; i < perfProfiledBlocksFunctions_.length(); i++)
> > +            perfProfiledBlocksFunctions_[i].trace(trc);
> > +#endif
> 
> As of today, this block has already been introduced before and is present on
> inbound.

Yes, rebased.

> @@ +510,5 @@
> >      }
> >      unsigned numProfiledFunctions() const {
> >          return profiledFunctions_.length();
> >      }
> > +    ProfiledFunction &profiledFunction(unsigned i) {
> 
> Why did it lose the const qualifier? Correct me if I'm wrong, but all calls
> I've seen could store a const reference.

The offsets need to be modified, when compilation is finished, to point to
their actual offset.  This is necessary for the ARM due to the constant pools.
Removing the const qualifier was the easiest path, but if you have some
coding pattern in mind then please let me know.

> ::: js/src/jit/CodeGenerator.cpp
> @@ +5387,5 @@
> >      if (!generateEpilogue())
> >          return false;
> > +#if defined(JS_ION_PERF)
> > +    // Note the end of the inline code and start of the OOL code.
> > +    gen->perfSpewer().startOutOfLineCode(masm);
> 
> Could you put this call in generateOutOfLineCode instead?

The position is being used even if there is no OOL code - it allows
the epilogue to be well defined.  It has been renamed to endInlineCode,
and it seems appropriate to call this here.
 
> ::: js/src/jit/IonCaches.cpp
> @@ +406,5 @@
> >                  this, attachKind, CacheName(kind()), code->raw());
> >      }
> > +
> > +#ifdef JS_ION_PERF
> > +    writePerfSpewerIonCodeProfile(code, "IonCache");
> 
> I see in the IonSpew calls above that you could add some useful information
> for the user, like filename and lineno. It implies adding variable arg lists
> to writePerfSpewerIonCodeProfile or creating a const char* before this call.
> Would it be easily doable?

It would certainly be possible to define a function accepting a format string
and arguments, and to try and present more specific information, but this is
left for follow up work.
 
> ::: js/src/jit/PerfSpewer.cpp
> @@ -35,5 @@
> >          const char *env = getenv("IONPERF");
> >          if (env == NULL) {
> >              PerfMode = PERF_MODE_NONE;
> > -            fprintf(stderr, "Warning: JIT perf reporting requires IONPERF set to \"block\" or \"func\". ");
> > -            fprintf(stderr, "Perf mapping will be deactivated.\n");
> 
> Could you please keep this warning message?

IMHO a warning does not seem appropriate here.  Even if IONPERF is not
set it still writes this warning.  There does not appear to be other
precedences for writing a warning if an optional function is not used.

If it's a show stopper then sure I'll just add it back?
 
> @@ +120,5 @@
> > +    masm.bind(&ool);
> > +    return true;
> > +}
> > +
> > +#include <pthread.h>
> 
> Could you put this include at the top of this file, wrapped in a IFDEF
> statement?

Removed.
 
> @@ +127,5 @@
> > +#if defined(__ANDROID__)
> > +ssize_t PerfSpewDirIndex = 0;
> > +#endif
> > +
> > +static FILE *openPerfMap(void)
> 
> nit: static FILE*
> openPerfMap(void)

Done.
 
> @@ +142,5 @@
> > +    char filenameBuffer[bufferSize];
> > +
> > +    pthread_t tid = pthread_self();
> > +
> > +    if (snprintf(filenameBuffer, bufferSize, "%sperf-%d.map.%lx", PerfSpewDir, getpid(), tid)
> 
> If I |perf record| a run of the shell with parallel compilation enable, will
> |perf report| be able to use the mapping files altogether?

It was necessary to concatenate them all back together, and the order was
not important.  The advantage was that there was no lock contention to distort
the results, but I guess the usability might be too low for common use, so
it has been reworked to write to a single file.
 
> @@ +157,5 @@
> > +        PerfSpewDirIndex++;
> > +        return openPerfMap();
> > +    }
> > +# endif
> > +
> 
> nit: trailing whitespace

There were a few of these. All fixed.

> @@ +253,5 @@
> > +
> > +    if (!fp)
> > +        return;
> > +
> > +    if (PerfFuncEnabled() || PerfBlockEnabled()) {
> 
> if (PerfEnabled()) {
> 
> Also, this check could be done at the beginning of the function, as in
> writePerfSpewerAsmJSFunctionMap.

Done.  Reworked.

> @@ +266,5 @@
> > +    fclose(fp);
> > +}
> > +
> > +void
> > +js::ion::writePerfSpewerIonCodeProfile(IonCode *code, const char *msg)
> 
> Could this function and the one above be factorized?

Probably not in the scope of this bug.  There is IonCode generated
that did not have a clear JSScript to pass in, and exploring this is
left for future work.
 
> @@ +276,5 @@
> > +
> > +    if (!fp)
> > +        return;
> > +
> > +    if (PerfFuncEnabled() || PerfBlockEnabled()) {
> 
> PerfEnabled()

Done.
 
> @@ +406,5 @@
> > +
> > +    if (!fp)
> > +        return;
> > +
> > +    fprintf(fp, "%lx %lx AsmJS Entries and Exits (0x%lx 0x%lx)\n", base, size, base, size);
> 
> The name of the function suggests that there is only one entry, so one of
> the two needs to be modified.

Done.
 
> ::: js/src/jit/PerfSpewer.h
> @@ +44,5 @@
> > +    Record(const char *filename,
> > +           unsigned lineNumber,
> > +           unsigned columnNumber,
> > +           uint32_t id)
> > +        : filename(filename), lineNumber(lineNumber),
> 
> nit: two spaces only before the ':'

Done.
Attachment #797262 - Attachment is obsolete: true
Attachment #798509 - Flags: review?(bbouvier)
Attached patch Fix typo on Android. (obsolete) (deleted) — Splinter Review
Sorry, fixed a last minute typo affecting Android and B2G builds.
Attachment #798509 - Attachment is obsolete: true
Attachment #798509 - Flags: review?(bbouvier)
Attachment #798619 - Flags: review?(bbouvier)
Comment on attachment 798619 [details] [diff] [review]
Fix typo on Android.

Review of attachment 798619 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the update! r+ with the warning message back.

::: js/src/jit/BaselineIC.cpp
@@ +23,5 @@
>  
>  #include "vm/Interpreter-inl.h"
>  #include "vm/ScopeObject-inl.h"
>  
> +

nit: blank line added here

::: js/src/jit/PerfSpewer.cpp
@@ +50,5 @@
> +#ifdef JS_THREADSAFE
> +# include "jslock.h"
> +static PRLock *PerfMutex;
> +#endif
> +

nit: 2 blank lines, one is enough.

@@ -35,5 @@
>          const char *env = getenv("IONPERF");
>          if (env == NULL) {
>              PerfMode = PERF_MODE_NONE;
> -            fprintf(stderr, "Warning: JIT perf reporting requires IONPERF set to \"block\" or \"func\". ");
> -            fprintf(stderr, "Perf mapping will be deactivated.\n");

As discussed on IRC, could you please let this warning here? It's the only way to tell the user how to activate perf support.
Attachment #798619 - Flags: review?(bbouvier) → review+
Carrying over r+.
Attachment #798619 - Attachment is obsolete: true
Attachment #799218 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/7850adf40329
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Whiteboard: [games]
Blocks: gecko-games
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: