Closed Bug 834449 Opened 12 years ago Closed 12 years ago

Sqlite.jsm should allow cleaning up of old cached statements

Categories

(Toolkit :: General, defect)

Product:

Component:

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla21

People

(Reporter: gps, Assigned: rnewman)

References

Details

(Whiteboard: [MemShrink])

Attachments

(3 files, 3 obsolete files)

Part 1: correctly decrement pending statement count. v1 12 years ago Richard Newman [:rnewman] (deleted), patch	gps : review-	Details \| Diff \| Splinter Review
Part 2: allow cleanup of cached statements. v1 12 years ago Richard Newman [:rnewman] (deleted), patch		Details \| Diff \| Splinter Review
Part 1: rework tracking of pending statements. v2 12 years ago Richard Newman [:rnewman] (deleted), patch	gps : review+	Details \| Diff \| Splinter Review
Part 2: allow cleanup of cached statements. v2 12 years ago Richard Newman [:rnewman] (deleted), patch	gps : review+	Details \| Diff \| Splinter Review
Part 3: finalize all the things. v1 12 years ago Richard Newman [:rnewman] (deleted), patch		Details \| Diff \| Splinter Review
Part 3: finalize all the things. v2 12 years ago Richard Newman [:rnewman] (deleted), patch	gps : review+	Details \| Diff \| Splinter Review

Gregory Szorc [:gps]

Reporter

Description

•

12 years ago

+++ This bug was initially created as a clone of Bug #833609 +++ Follow-up from initial landing to implement some form of automatic statement cleanup. This could entail any of the following: * Public APIs to flush out all cached statement objects that aren't in use. * Automatically destroying cached statements that haven't been used in N seconds. * LRU for cached statements / cap on # cached statements. * ...

Richard Newman [:rnewman]

Assignee

Comment 1

•

12 years ago

I recall discussion with sdwilsh and mak around trying to cache intelligently. I don't think it's an automatically solvable problem. I started hacking in a cache-dropping call, which allows individual applications to choose the best time. The only thing I could see being better is a scoped statement cache (connection wrapper), but this is really no more helpful without finalizers.

Marco Bonardo [:mak]

Comment 2

•

12 years ago

(In reply to Richard Newman [:rnewman] from comment #1) > I recall discussion with sdwilsh and mak around trying to cache > intelligently. I don't think it's an automatically solvable problem. we can try! > I started hacking in a cache-dropping call, which allows individual > applications to choose the best time. that may be more complicated than what we need, we can make this opt-in in case. > The only thing I could see being better is a scoped statement cache > (connection wrapper), but this is really no more helpful without finalizers. what do you mean by finalizers? Btw, in js we have Sqlite.jsm that wraps the connection, in cpp we have StatementCache, that may be "easily" modified to drop unused statements.

Gregory Szorc [:gps]

Reporter

Comment 3

•

12 years ago

I agree that "automatic intelligent caching" is often an oxymoron. I feel the best solution to this problem is to make a number of options and APIs available to consumers and let the app make the most suitable call. Surely any solution that allows any type of cache invalidation is better than one that does not (as long as it isn't automatic).

Richard Newman [:rnewman]

Assignee

Comment 4

•

12 years ago

(In reply to Marco Bonardo [:mak] from comment #2) > > I recall discussion with sdwilsh and mak around trying to cache > > intelligently. I don't think it's an automatically solvable problem. > > we can try! Concur, but I don't really want to block on doing so :D > > I started hacking in a cache-dropping call, which allows individual > > applications to choose the best time. > > that may be more complicated than what we need, we can make this opt-in in > case. I think the opposite — having a call which says "drop all of those statements I asked you to cache" seems a lot simpler than having some system which independently decides whether to drop them. I say this because it was about five lines of code for me to do it in my work on Bug 832067 :) The current alternative is that statements are automatically finalized when the connection is closed, which is a heavyweight version of the same. > > The only thing I could see being better is a scoped statement cache > > (connection wrapper), but this is really no more helpful without finalizers. > > what do you mean by finalizers? I mean some kind of scoping construct such that finalization of statements can occur on a lifecycle boundary. And it would be convenient if the lifecycle stages of that object could automatically do so, as with Java/C++ finalizers and destructors. I propose we emulate this in the short term by having a caller say "I'm done with the busy phase, you can uncache now". We could also do it by reifying that phase as a connection object — Greg and I planned for UnopenedConnection/OpenedConnection/etc. to be that boundary, but probably without actually closing the DB connection.

Gregory Szorc [:gps]

Reporter

Comment 5

•

12 years ago

I'm fine with the initial patch being a new public API to drop all cached statements.

Richard Newman [:rnewman]

Assignee

Updated

•

12 years ago

Assignee: nobody → rnewman

Status: NEW → ASSIGNED

Summary: Sqlite.jsm should automatically clean up old statements → Sqlite.jsm should allow cleaning up of old cached statements

Richard Newman [:rnewman]

Assignee

Comment 6

•

12 years ago

Attached patch Part 1: correctly decrement pending statement count. v1 (obsolete) (deleted) — Details — Splinter Review

Attachment #706241 - Flags: review?(gps)

Richard Newman [:rnewman]

Assignee

Comment 7

•

12 years ago

Attached patch Part 2: allow cleanup of cached statements. v1 (obsolete) (deleted) — Details — Splinter Review

Attachment #706242 - Flags: review?(gps)

Richard Newman [:rnewman]

Assignee

Updated

•

12 years ago

Blocks: 832067

Gregory Szorc [:gps]

Reporter

Comment 8

•

12 years ago

Comment on attachment 706241 [details] [diff] [review] Part 1: correctly decrement pending statement count. v1 Review of attachment 706241 [details] [diff] [review]: ----------------------------------------------------------------- This breaks logging, no? One of the purposes of the counter aspect of inProgressCounter was to keep a running tally and unique identifier for each statement. I concede the variable name is a little wonky and should probably be changed.

Attachment #706241 - Flags: review?(gps) → review-

Gregory Szorc [:gps]

Reporter

Comment 9

•

12 years ago

Comment on attachment 706242 [details] [diff] [review] Part 2: allow cleanup of cached statements. v1 Review of attachment 706242 [details] [diff] [review]: ----------------------------------------------------------------- Cancelling until you realize inProgressStatements lives up to its name. ::: toolkit/modules/Sqlite.jsm @@ +603,5 @@ > + */ > + discardCachedStatements: function () { > + // Do this for now. Eventually we might want to track which statements > + // are in use (which inProgressStatements does not actually) and only > + // clear the others. inProgressStatements does in fact track which statements are in use! If it's in the map, it's in progress. @@ +608,5 @@ > + if (this._inProgressCounter) { > + throw new Error("Queries in progress; not clearing cache."); > + } > + > + this._log.debug("Discarding cached statements."); Would you mind tossing the number discarded into the log message? ::: toolkit/modules/tests/xpcshell/test_sqlite.js @@ +155,5 @@ > > yield c.close(); > }); > > +add_task(function test_cleanup_while_active() { s/cleanup/discard/g

Attachment #706242 - Flags: review?(gps)

Richard Newman [:rnewman]

Assignee

Comment 10

•

12 years ago

Look more closely. The things tracked aren't statement objects, they're pending statement objects. You can't tell from the pending instances which storage statement instances are still in use: all you can do is blindly cancel them. Rather than rework everything to track which things we can and can't finalize, I opted to do the simplest thing.

Gregory Szorc [:gps]

Reporter

Comment 11

•

12 years ago

I prefer to do it right. Either create a mapping of statement to in progress count or implement a proper in progress counter and only clear when it is 0.

Richard Newman [:rnewman]

Assignee

Comment 12

•

12 years ago

Attached patch Part 1: rework tracking of pending statements. v2 (deleted) — Details — Splinter Review

Attachment #706241 - Attachment is obsolete: true

Attachment #706263 - Flags: review?(gps)

Richard Newman [:rnewman]

Assignee

Comment 13

•

12 years ago

Attached patch Part 2: allow cleanup of cached statements. v2 (deleted) — Details — Splinter Review

Attachment #706242 - Attachment is obsolete: true

Attachment #706264 - Flags: review?(gps)

Gregory Szorc [:gps]

Reporter

Comment 14

•

12 years ago

Comment on attachment 706263 [details] [diff] [review] Part 1: rework tracking of pending statements. v2 Review of attachment 706263 [details] [diff] [review]: ----------------------------------------------------------------- r+ with comments addressed. I love when refactorings don't have to touch existing tests! ::: toolkit/modules/Sqlite.jsm @@ +177,5 @@ > this._anonymousStatements = new Map(); > this._anonymousCounter = 0; > + > + // A map from statement index to mozIStoragePendingStatement, to allow for > + // canceling. I think you mean "finalizing." "Canceling" is for in progress statements. @@ +186,3 @@ > this._inProgressStatements = new Map(); > + > + // Increments for each executed statement for the life of the connection. Trailing WS. @@ +342,5 @@ > + return 0; > + } > + > + let out = 0; > + for (let v of this._inProgressStatements.values()) { .values() works?! It isn't documented on MDN. @@ +770,5 @@ > } > }, > }); > > + this._recordStatementBeginning(statement, pending, index); You only call this here. Is it worth inlining to avoid yet more statement overhead? ::: toolkit/modules/tests/xpcshell/test_sqlite.js @@ +184,5 @@ > + actual = c.inProgress(); > + }); > + > + do_check_eq(actual, 1); > + do_check_eq(c._statementCounter, c._initialStatementCount + 2); check c.inProgress() after execution as well. Please also queue up multiple statements with different SQL and check an inProgress() > 1 case.

Attachment #706263 - Flags: review?(gps) → review+

Gregory Szorc [:gps]

Reporter

Comment 15

•

12 years ago

Comment on attachment 706264 [details] [diff] [review] Part 2: allow cleanup of cached statements. v2 Review of attachment 706264 [details] [diff] [review]: ----------------------------------------------------------------- Would review again.

Attachment #706264 - Flags: review?(gps) → review+

Richard Newman [:rnewman]

Assignee

Comment 16

•

12 years ago

(In reply to Gregory Szorc [:gps] from comment #14) > > + // A map from statement index to mozIStoragePendingStatement, to allow for > > + // canceling. > > I think you mean "finalizing." "Canceling" is for in progress statements. Nope. The pending statements are what we cancel when we finish up; see: _finalize: function (deferred) { this._log.debug("Finalizing connection."); // Cancel any pending statements. for (let [k, statement] of this._pendingStatements) { statement.cancel(); } StorageStatements produce StoragePendingStatements. We cancel the latter so we can finalize the former. We have to track both because we can't go from SPS -> SS. > .values() works?! It isn't documented on MDN. This is as close to docs as we get :/ http://wiki.ecmascript.org/doku.php?id=harmony:simple_maps_and_sets > > + this._recordStatementBeginning(statement, pending, index); > > You only call this here. Is it worth inlining to avoid yet more statement > overhead? I broke it out for symmetry and isolation of access; I figure it's more important that we don't screw up future changes in bookkeeping than save a single method call when we're already doing expensive stuff. (The right solution, of course, is "JIT chrome code!") > check c.inProgress() after execution as well. > > Please also queue up multiple statements with different SQL and check an > inProgress() > 1 case. KK.

Gregory Szorc [:gps]

Reporter

Comment 17

•

12 years ago

(In reply to Richard Newman [:rnewman] from comment #16) > (In reply to Gregory Szorc [:gps] from comment #14) > > > + // A map from statement index to mozIStoragePendingStatement, to allow for > > > + // canceling. > > > > I think you mean "finalizing." "Canceling" is for in progress statements. > > Nope. The pending statements are what we cancel when we finish up; see: Yeah, I just read the patch wrong. Please commit.

Richard Newman [:rnewman]

Assignee

Comment 18

•

12 years ago

https://hg.mozilla.org/services/services-central/rev/0f28b10053de https://hg.mozilla.org/services/services-central/rev/dce840cda6b8

Whiteboard: [MemShrink] → [MemShrink][fixed in services]

Marco Bonardo [:mak]

Comment 19

•

12 years ago

I think this is more complicated than needed, there is nothing wrong with finalizing an in-progress statement, once you have invoked executeAsync on a statement you can already call finalize() on it. That said, I'm not sure how you are going to use this API, if you don't want to cache statements why not just using execute() instead of executeCached()? cachins statements, executing them and clearing them is useful only if you do many repeated calls or the same statement.

Marco Bonardo [:mak]

Comment 20

•

12 years ago

and the API is lying, since it's called discardCachedStatements, but doesn't discard the in-progress ones. So it's likely the caller expects to have dropped all of them when it's untrue.

Gregory Szorc [:gps]

Reporter

Comment 21

•

12 years ago

(In reply to Marco Bonardo [:mak] from comment #19) > I think this is more complicated than needed, there is nothing wrong with > finalizing an in-progress statement, once you have invoked executeAsync on a > statement you can already call finalize() on it. This I did not know! We can probably drop all the code checking for in-progress state then? Please confirm. > That said, I'm not sure how you are going to use this API, if you don't want > to cache statements why not just using execute() instead of executeCached()? > cachins statements, executing them and clearing them is useful only if you > do many repeated calls or the same statement. The current implementation almost perfectly satisfies FHR's requirements! FHR has periodic bursts of high SQL usage with the same prepared statement but with different bound parameters. It is advantageous for us to cache statements for this reason. However, these bursts are few and far between. So, it makes sense for FHR to issue a clean-up afterwards. This is exactly what we've done: we clean up after startup and after generating a payload.

Flags: needinfo?(mak77)

Richard Newman [:rnewman]

Assignee

Comment 22

•

12 years ago

(In reply to Marco Bonardo [:mak] from comment #20) > and the API is lying, since it's called discardCachedStatements, but doesn't > discard the in-progress ones. So it's likely the caller expects to have > dropped all of them when it's untrue. I dunno, It seems sane to me to imply "except for those that are currently executing", and the docstring says so. Still, now academic! (In reply to Marco Bonardo [:mak] from comment #19) > I think this is more complicated than needed, there is nothing wrong with > finalizing an in-progress statement, once you have invoked executeAsync on a > statement you can already call finalize() on it. ORLY! Very well then :D > That said, I'm not sure how you are going to use this API, if you don't want > to cache statements why not just using execute() instead of executeCached()? > cachins statements, executing them and clearing them is useful only if you > do many repeated calls or the same statement. As Greg said: it's to act as an efficiency boundary after periods of excessive work (during which time we want to cache statements), without actually closing the database connection.

Richard Newman [:rnewman]

Assignee

Comment 23

•

12 years ago

Attached patch Part 3: finalize all the things. v1 (obsolete) (deleted) — Details — Splinter Review

Attachment #706480 - Flags: review?(gps)

Marco Bonardo [:mak]

Comment 24

•

12 years ago

Thank you for the clarification, the use-case looks fine then. Yes I confirm that once you have called asyncExecute you can immediately call finalize. I'll tell you more, you can also immediately call asyncClose() (though not our case here).

Flags: needinfo?(mak77)

Marco Bonardo [:mak]

Comment 25

•

12 years ago

Comment on attachment 706480 [details] [diff] [review] Part 3: finalize all the things. v1 Review of attachment 706480 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/modules/Sqlite.jsm @@ +231,5 @@ > > /** > + * @return (integer) the number of executing queries. > + */ > + get inProgress() { this name is poorly choosen, inProgress doesn't mean a thing, should be inProgressStatementsCount or activeStatementsCount... btw, what's the usefulness of this? why should a consumer be interested in the number of statements in progress, since it changes continuously and you have poor control over what's in progress since everything is async and can happen at any time. looks like a so volatile data that I can barely think of use-cases. Since this is a pseudoAPI, we should manage it like that, and don't add things that complicate it with poor benefit.

Richard Newman [:rnewman]

Assignee

Comment 26

•

12 years ago

(In reply to Marco Bonardo [:mak] from comment #25) > btw, what's the usefulness of this? It's a supported way for test code to find out the number of concurrent queries (which can verify implementation invariants, such as "queries are not serialized") -- see its five uses in test_sqlite.js. There used to be a comment explaining that. I can move it into the test file instead. > why should a consumer be interested in the number of > statements in progress, since it changes continuously and you have poor > control over what's in progress since everything is async and can happen at > any time. Well, it's read-only, so "poor control" isn't really a good argument IMO; it's for introspection. That it changes continuously and we have poor control is actually an argument in favor of being able to monitor it, no? I could certainly imagine that we might want to snapshot how many concurrent statements we execute as part of a telemetry probe, or to hold off on doing some work until it drops to zero, or even just logging it as part of debugging.

Marco Bonardo [:mak]

Comment 27

•

12 years ago

(In reply to Richard Newman [:rnewman] from comment #26) > There used to be a comment explaining that. I can move it into the test file > instead. APIs added for tests is not a good use-case, imo. At that point I'd prefer an "internal" _inProgressStatementsCount getter, it still sucks but at least wouldn't look like a supported API. > That it changes continuously and we have poor > control is actually an argument in favor of being able to monitor it, no? no, it's an argument that is totally pointless thinking to monitor it, imo :) > I could certainly imagine that we might want to snapshot how many concurrent > statements we execute as part of a telemetry probe one. All statements are serialize, this can only be 1 or 0.

Marco Bonardo [:mak]

Comment 28

•

12 years ago

(In reply to Marco Bonardo [:mak] from comment #27) > one. All statements are serialize, this can only be 1 or 0. OR better, this is tracking statements you "planned" to execute, but they will be executed in a fifo queue, so this is even lying more... you may call it statementsThatWillMaybeExecuteCount :)

Richard Newman [:rnewman]

Assignee

Comment 29

•

12 years ago

(In reply to Marco Bonardo [:mak] from comment #27) > > I could certainly imagine that we might want to snapshot how many concurrent > > statements we execute as part of a telemetry probe > > one. All statements are serialize, this can only be 1 or 0. I have a test proving that _pendingStatements.size reaches 2. That doesn't necessarily mean that they're both running against the database, but they're both extant, which is something we do care about. Not enough to leave it in the public API, but we do care. > OR better, this is tracking statements you "planned" to execute, but they > will be executed in a fifo queue, so this is even lying more... you may call > it statementsThatWillMaybeExecuteCount :) They're "statements on which executeAsync have been called". They might not interleave returned results due to storage limitations, but they're concurrently active, and there's nothing in Sqlite.jsm that demands that they not be interleaved.

Marco Bonardo [:mak]

Comment 30

•

12 years ago

(In reply to Richard Newman [:rnewman] from comment #29) > They're "statements on which executeAsync have been called". They might not > interleave returned results due to storage limitations, but they're > concurrently active Not a limitation, it's a wanted feature. They are still not in-progress as the pseudoAPI tries to tell, they are pending or queued, whatever is more clear. I think we should not confuse the consumers that may fear to have 10 pending statements for perf reasons, when actually they will just queue up nicely.

Richard Newman [:rnewman]

Assignee

Comment 31

•

12 years ago

Attached patch Part 3: finalize all the things. v2 (deleted) — Details — Splinter Review

Attachment #706480 - Attachment is obsolete: true

Attachment #706480 - Flags: review?(gps)

Attachment #706609 - Flags: review?(gps)

Gregory Szorc [:gps]

Reporter

Comment 32

•

12 years ago

Comment on attachment 706609 [details] [diff] [review] Part 3: finalize all the things. v2 Review of attachment 706609 [details] [diff] [review]: ----------------------------------------------------------------- Looks good.

Attachment #706609 - Flags: review?(gps) → review+

Marco Bonardo [:mak]

Comment 33

•

12 years ago

Thumbs up!

Richard Newman [:rnewman]

Assignee

Comment 34

•

12 years ago

Thanks chaps!

Richard Newman [:rnewman]

Assignee

Comment 35

•

12 years ago

https://hg.mozilla.org/services/services-central/rev/2e916d9dae06

Gregory Szorc [:gps]

Reporter

Comment 36

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/0f28b10053de https://hg.mozilla.org/mozilla-central/rev/dce840cda6b8 https://hg.mozilla.org/mozilla-central/rev/2e916d9dae06

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Whiteboard: [MemShrink][fixed in services] → [MemShrink]

Target Milestone: --- → mozilla21

You need to log in before you can comment on or make changes to this bug.