Backtrack 1.0
Categories
(Core :: mozglue, task, P5)
Tracking
()
Tracking | Status | |
---|---|---|
firefox65 | --- | affected |
People
(Reporter: mayhemer, Unassigned)
References
(Depends on 3 open bugs, Blocks 1 open bug, )
Details
(Whiteboard: [webcompat-sci-exclude])
Attachments
(1 file, 34 obsolete files)
(deleted),
text/x-phabricator-request
|
Details |
Reporter | ||
Updated•6 years ago
|
Comment 1•6 years ago
|
||
Reporter | ||
Comment 2•6 years ago
|
||
Reporter | ||
Comment 3•6 years ago
|
||
Reporter | ||
Comment 4•6 years ago
|
||
Reporter | ||
Comment 5•6 years ago
|
||
Reporter | ||
Comment 6•6 years ago
|
||
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 7•6 years ago
|
||
Reporter | ||
Comment 8•6 years ago
|
||
Reporter | ||
Comment 9•6 years ago
|
||
Reporter | ||
Comment 10•6 years ago
|
||
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 11•6 years ago
|
||
Reporter | ||
Comment 12•6 years ago
|
||
Reporter | ||
Comment 13•6 years ago
|
||
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Comment 14•6 years ago
|
||
Looks like I'm missing tracking of stuff that gets triggered from inside PresShell::DoFlushPendingNotifications
Reporter | ||
Comment 15•6 years ago
|
||
- added DOM node repaint cause tracking (more work to get to the actual source point needed!)
- added tracking of thread before-/after-event observers, visible as dispatch blockers now
Reporter | ||
Comment 16•6 years ago
|
||
- nits fixed, added some more runtime assertions checks, raised new ideas (written down in the work breakdown)
Reporter | ||
Comment 17•6 years ago
|
||
- XPCOM service exposing BT API (just objective for now)
- fixed potential deadlock in timer
- xpcshell bt startup
- udp/server socket tracking
Reporter | ||
Comment 18•6 years ago
|
||
- few build issues fixed on linux
Reporter | ||
Comment 19•6 years ago
|
||
- rebased, after that only basic testing
- builds and runs on try (not verified after rebase), modulo few details to tackle down
- no need to disable sandboxing on win to enable bt logging from content (exceptions added)
Reporter | ||
Comment 20•6 years ago
|
||
- build again on try
- issues left to have fully green builds:
- use only thread_local static storage for my tls (blocked on: deque allocates dynamically and crashes in free() on a thread shutdown on win, could be std impl bug)
- might be related to previous: fix the check_vanilla_allocations.py test
- fix sm-pkg build, we have to include mozglue/backtrack somewhere, probably
Reporter | ||
Comment 21•6 years ago
|
||
Updated•6 years ago
|
Updated•6 years ago
|
Reporter | ||
Comment 23•5 years ago
|
||
rebased, tested only locally on windows.
Reporter | ||
Comment 24•5 years ago
|
||
few small nits fixed
Reporter | ||
Comment 25•5 years ago
|
||
- ocsp blocking tracked
Reporter | ||
Comment 26•5 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=feec6866213eaf143237aece33e082fc6344777a
- added MOZ_LOG=backtrack option to write data to logs instead of to a separate file; BTa (the analyzer) doesn't understand it tho, but logan does when you checkout a special branch
- added tracking of streams and session queues in http/2 stack
Reporter | ||
Comment 27•5 years ago
|
||
Reporter | ||
Comment 28•5 years ago
|
||
rebase only
Reporter | ||
Comment 29•5 years ago
|
||
number of build fixes and runtime assertion fixes, backup
Reporter | ||
Comment 30•5 years ago
|
||
Nathan, please see Comment 27.
Comment 31•5 years ago
|
||
Reporter | ||
Comment 32•5 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #31)
Comment on attachment 9080634 [details] [diff] [review]
wip 28 (m-c@5805cd9ae294)Review of attachment 9080634 [details] [diff] [review]:
A lot of the documentation in the Backtrack project documentation needs to
move into Backtrack.h.
I wanted to have decent technical doc before asking you for a feedback the first time, but time pressure did not allow me to have one.
It's not clear to me what the description of "dependent queues" in the API
documentation is intended for. Is the idea that you need to track timelines
that may include tasks from different threads, not just tasks executing on a
single thread?
Dependent queues have nothing to do with multi threading. Dependent queues (not sure the name was chosen well) is a pattern you will find in nearly any at least a bit asynchronous software code - whenever you put something aside for later. To explain - an INdependent queue is a queue that when you dispatch to it, will do its best to run the task ASAP in perhaps a FIFO manner, no conditions. Tasks put in a dependent queue though must usually wait for some other (async) code or task chain to finish first and then 'manually' trigger the loop over this dependent queue and run the queued tasks in it. Hence "dependent". Technically, when a task is run from inside another tasks (nesting) it's considered dependent. Note that thread looping cores are surrounded by LOOP_BEGIN and LOOP_END markers that remove (hides) any nesting. Hence, tasks running from inside 'manual' call to NS_ProcessNextEvent() are not considered dependent.
If you ignore dependency triggers (you don't go deep in the nesting) you will get a single path of nodes, no branching (I call it the shallow critical path). This single path may (will) skip among threads/processes. If you start looking at dependency triggers as well, you will be getting a tree. Again, the added branches will skip from thread to thread. To visualize this, you can either draw some kind of connection arrows between threads or show the main (shallow) critical path as one timeline and each dependency triggering path (in the example from the doc it would go back to vsync tick OS notification) as a separate timeline.
Do people somehow need to be aware of dependent queues to
use the API, or is the idea of dependent queues more of an implementation
detail?
It's a detail, really. Maybe only concern may be to add proper 'loop' markers to root thread looping code (like in ImageDecoder). W/o it all the decoding tasks are seen as dependent and when you look at its examination, you just see what the loop was doing (immediately processing tasks) which is not a valuable information at all, and it's rather confusing.
Is it accurate to say that Backtrack is really just extending TaskTracer to
recognize there are more sources of asynchronous task execution than just
the IPC message queue and thread event queues?
Yes, TT was a big inspiration for me and the main idea remains the same.
This patch doesn't contain any bits for the "critical path branches" (aren't
those usually called "blockers"?)
No, no code for branches (or however we may call them eventually) in this patch yet.
"critical path branches" are multiple chains of async tasks that all have to complete to unblock some state evolution (like load all css files before you can paint)
"Blockers" are events/tasks that simply make your task wait in a FIFO queue
and the "annotate sequential tasks" bits
for existing code optimization, mentioned in the project documentation,
correct?
As well, this is another next step thing to do.
I don't understand why
mozilla::Runnable
isn't touched in any way. (Or
nsIEventTarget
and related interfaces.) I guess...rather than modifying
all the individual tasks, we're just trying to instrument the places that
actually deal with the intake and execution of tasks, with the expectation
that the latter involves modifying a smaller number of places?
Exactly. If you wonder why I do wrapping of tasks (generally, not just nsIRunnables) is that I also need to track destructor code same way as the main Run() method. Destructors may do stuff (block CPU) and also may dispatch other tasks and we can't loose the linking.
Updating every runnable everywhere is overkill and puts unnecessary work on implementers and is not errorproof.
I am having a lot of difficulty switching back and forth between the project
documentation, the API documentation, and the actual patch, trying to tie
details from one to the other. I have written some comments below, many of
which are admittedly nitpicky about the details of the code itself, rather
than the broader "this code looks fine as-is" vs. "this code is
architecturally incorrect", which requires integrating the aforementioned
three sources. Assuming my understanding of Backtrack as a more extensive
TaskTracer, above, is correct, I think the ideas are fine, but I wouldn't
agree that this patch could land in its current state.
Sure, I believe there is much to update.
::: mozglue/backtrack/Backtrack.cpp
@@ +53,5 @@+namespace { // ::{anon}
+
+uint32_t CurrentThreadId() {
- // Because of thread id recycling we must add something unique or random
- // This modified tid will then spread automatically
Wait, what? Why not just use an incrementing counter in this case?
No. There would still be a possibility for two threads on different process to share TID and also hit the same (per process) counter.
TIDs are quite open, though. I need to sync with the Gecko profiler on this. For the code as is now (which is not something to land) I want to have something per-session unique, that's mostly the only requirement here that makes things very simple then.
Anyway, this will change before landing.
Actually, why is using the clock for the upper sixteen bits not subject to
the same concerns about thread recycling?
@@ +125,5 @@
void* aClosure) {
- FILE* stream = (FILE*)aClosure;
- MozCodeAddressDetails details;
- static const size_t buflen = 1024;
- char buf[buflen + 1]; // 1 for trailing '\n'
Maybe just make
buflen
long enough and then you don't have to play the
+1
games here and below?::: mozglue/backtrack/Backtrack.h
@@ +177,5 @@+MFBT_API bool Enabled(uint32_t feature = kFeatureEnabled);
+
+MFBT_API gid_t const RunningMarker();
+
+inline MOZ_MUST_USE auto RootSpan() {We generally do not use
auto
as a return type. I do not think it's good
practice to use it as a return type without explicitly listing the returned
type (e.g.auto foo() -> ReturnType
).I understand that writing out the return type here is painful,
yep, auto is here as a shortcut.
but that
seems to be becauseAutoRootMark
requires template parameters, rather than
just having a two-arg constructor that would accomplish the same thing. Can
we fix that, and then theauto
return type stuff would go away on its own?
I'm not sure I follow the idea for a two-arg ctor here. But I can definitely change to comply more with our style.
OTOH, I was more expecting a different concern here. The API is not 100% safe to use as I wrote it (main design driver was to save typing when the API is used around the code base).
you can easily write:
auto root(backtrack::RootSpan().StaticName("some label"));
the code will compile, because RootSpan() returns a Marker
(via copy elision) and StaticName() returns Marker&
(a ref). But the object returned by RootSpan() will only be temporary and destroyed, not mentioning that access to root
will crash. I think one solution would be to make StaticName() and other Marker methods simply be void and thus disallow chaining. But the code may still compile as void still is a type?
@@ +203,5 @@
- detail::AutoRootMark<MarkerType::LABEL_BEGIN, MarkerType::LABEL_END> span;
- return span;
+}+inline auto Dispatch(gid_t& g) {
Same thing here for
detail::Marker
.@@ +251,5 @@
+inline MOZ_MUST_USE auto ResponseSpan(gid_t& g) {
- detail::AutoSpanMark<MarkerType::RESPONSE_BEGIN, MarkerType::RESPONSE_END>
resposnse(g);
- return resposnse;
Nit:
response
. Also the above comments apply toAutoSpanMark
.@@ +258,5 @@
+inline MOZ_MUST_USE auto ResponseSpanAndClear(gid_t& g) {
- detail::AutoSpanMarkAndClear<MarkerType::RESPONSE_BEGIN,
MarkerType::RESPONSE_END>
resposnse(g);
- return resposnse;
Nit:
response
.@@ +282,5 @@
- return marker;
+}+inline auto Milestone() {
- detail::Marker<gid_t> marker(detail::Mark<MarkerType::MILESTONE>());
It's not clear what a
MILESTONE
vs. anOBJECTIVE
(above) is.
Sure. OBJECTIVE is intended to be picked in the (current) offline analyzer as a starting point (or ending, actually) to go back from. MILESTONE is only an informative thing and it was introduced to filter out all other overwhelming data from paths. It was an attempt to make comparing of two paths simpler. I think we still may find MILESTONES useful for data analyzes.
@@ +437,5 @@
+/**
- Provides blocker tracking but only on a single handling thread.
- */
+class SingleThread : public NameQueue {- gid_t mLastExecute = {0, 0};
= gid_t()
?@@ +473,5 @@
- thread (e.g. a thread pool)
- */
+class ThreadPool : public NameQueue {- // TODO convert to Atomic<gid_t>
- gid_t mLastExecute = {0, 0};
= gid_t()
?@@ +501,5 @@
- */
+template <typename trails_t = SingleThread>
+class Safe {- class RefCountedTrails : public trails_t {
- Atomic<int32_t, Relaxed> mRefCnt;
Um. This cannot possibly be correct.
You are referring to the Relaxed ordering? AFAIK, for ref counters, this is quite sufficient. The arithmetic is atomic. I'm not sure about the visibility, but I believe only one thread will see the final 0. And because of Illusion of atomic reference counting access has to be synchronized when manipulating an object at the same referring smart pointer. As this is tight close to the queue manipulation (which is expected to be used on multiple threads), there is an expectation such synchronization will be present.
But please fix me if I'm wrong :)
Maybe it is correct by accident on
x86-ish processors, but it's going to break virtually everywhere else.
Maybe it is correct on a single thread, but then why would this beAtomic
?@@ +507,5 @@
- public:
- // Has to be implemented on its own before we have
- // MOZ_INLINE_DECL_THREADSAFE_REFCOUNTING
- int32_t AddRef() { return ++mRefCnt; }
- int32_t Release() {
Let's not add a third implementation of
AddRef
/Release
(one/several in
XPCOM, one in MFBT).
Don't like it too, but I had to. There is nothing in MFBT last time I have checked that could be used easily on templated classes. I was using external::AtomicRefCounted
before but class size checking was failing - because of the template.
@@ +788,5 @@
- public:
- explicit TrackedTask() = default;
- explicit TrackedTask(char const* queueName) : Base(queueName) {}
- explicit TrackedTask(TaskContainer&& task) {
- Base::TrackDispatch();
Why is this not done in the
Base
constructor?
If you have to ask, then the documentation was not clear enough about how Tracked<> and TrackedTask<> work :(
The Tracked<> class expects TrackDispatch() (and TrackExecute()) be always called fully manually. The TrackedTask<> is intended to hide calls to Track*() method completely (I could even try to derive from Tracked via protected
, perhaps) and the task being wrapped is treated as 'dispatched' the moment we create this wrapper around it. As in the doc, the intended use is to, instead of mQueue.Append(move(task))
, insert the task simply as mQueue.Append(TrackedTask(move(task)))
.
@@ +793,5 @@
- mTask.emplace(std::move(task));
- }
- explicit TrackedTask(TaskContainer&& task, Trails const& trails)
: Base(trails) {
- Base::TrackDispatch();
Why is this not done in the
Base
constructor?@@ +802,5 @@
- release();
- mTask = std::move(other.mTask);
- Base::mTrails = std::move(other.mTrails);
- Base::mMarker = other.mMarker;
Maybe we should just call
Base::operator=
here?@@ +812,5 @@
- release();
- mTask = other.mTask;
- Base::mTrails = other.mTrails;
- Base::mMarker = other.mMarker;
Here too.
@@ +916,5 @@
TrackedTask<TaskContainer> const& container) {
- return &task == &container.const_task();
+}+template <typename TaskContainer, size_t PageSize = 512,
The default needs to be adjusted because 512 * sizeof(void*) + extra
bookkeeping means that you're just exceeding a jemalloc size class,
causing internal fragmentation. The default needs to be adjusted downward
to account for the space taken up by internal members...and accordingly, the
default shouldn't really be user-visible.
Very good point! I found out (after I've already wrote this class :/) that the xpcom internal thread queue is using 256 or even less page size. I definitely wanted to make this smaller here too, yes. I made this a template arg because on number of places I replace nsTArrays with bt::Queue there is an optimization for the array allocation size, sometime to even as low as 16 elements. So I think it may be a good optimization?
@@ +1097,5 @@
--detail::sPageCounter;
+#endif
- }
- void Enqueue(Task&& task, UniquePtr<Page>& head) {
This is a public API?
No - isn't the Page sub-class private?
::: security/manager/ssl/nsNSSCallbacks.cpp
@@ +136,5 @@}
nsresult OCSPRequest::DispatchToMainThreadAndWait() {
- backtrack::Milestone().DynamicName(
"OCSPRequest::DispatchToMainThreadAndWait [%s]", mAIALocation.get());
Maybe we could just make
Marker
overloads, one of which takes a constant
character array (string constants), and one that takesprintf
-style args,
rather than constructing something and callingDynamicName
?
Originally I wanted to have only Name
, overloaded for Name(char const* static_name) and Name(char const* fmt, ...); But the compiler sees it as ambiguous.
I guess we'd
need to add overloads for all the sub-type constructors, which would be
tedious, but at least we'd statically ensure nobody forgets a name.
I invented this 'amend' approach to have a variable length markers. Not all marker types need names, some will have more than one. Some also have special data appended. But I think for e.g. Milestone and Objective (and some others) would be better to pass the args to the Milestone
function directly, yes. Note that most naming will be combination of static and dynamic. There will always be a method name (static) and a resource (URL, ...) in pair. So in the end, I'd see (schematically) something like: Milestone().StaticName(__func__).DynamicName("%s", url.spec());
compressed to Milestone(__func__, "%s", url.spec());
and similarly for other marker types, perhaps. This will save memory a lot when the static part (func names) will not be in the formating string.
Thanks for the feedback. Before I ask again, I'll try to fix as much as possible according your comment and also build a better technical doc on how the API works.
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 33•5 years ago
|
||
Reporter | ||
Comment 34•5 years ago
|
||
Depends on D40467
Reporter | ||
Comment 35•5 years ago
|
||
Depends on D40468
Reporter | ||
Comment 36•5 years ago
|
||
Depends on D42290
Reporter | ||
Comment 37•5 years ago
|
||
Depends on D42291
Reporter | ||
Comment 38•5 years ago
|
||
Depends on D42292
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Reporter | ||
Comment 39•5 years ago
|
||
Reporter | ||
Updated•5 years ago
|
Updated•5 years ago
|
Reporter | ||
Comment 40•5 years ago
|
||
No, sorry bot, I'm not gonna work on this.
Updated•5 years ago
|
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Updated•5 years ago
|
Updated•5 years ago
|
Reporter | ||
Updated•5 years ago
|
Updated•5 years ago
|
Reporter | ||
Updated•5 years ago
|
Comment 41•3 years ago
|
||
Unfortunately we're unlikely to pick this up - at least not any time soon - closing as inactive.
Description
•