Closed
Bug 688338
Opened 13 years ago
Closed 7 years ago
mozharness error context lines
Categories
(Release Engineering :: Applications: MozharnessCore, defect, P4)
Release Engineering
Applications: MozharnessCore
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: mozilla, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1952] [mozharness])
Attachments
(4 files, 2 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review |
In 0.4 we have error lists with substr/regexes defined, as well as a level, e.g.
HgErrorList=[ {'regex': r'''^abort:''', 'level': ERROR},
{'substr': r'''command not found''', 'level': ERROR},
{'substr': r'''unknown exception encountered''', 'level': ERROR},
]
These match against each line of output from run_command().
I used to have context lines, e.g.
{'substr': r'''Traceback (most recent call last''', 'level': ERROR,
'context_lines': '-3:+10'}
would give 3 lines of context before and 10 lines of context afterwards all marked as level ERROR.
To do this, we need to buffer the output or write it all to disk and parse at the end.
Comment 1•12 years ago
|
||
I have something implemented for this.
###
Current functionality:
1) does not require any mozharness developer to modify existing code in order to use this feature. All one has to do is add to an existing error_list that is passed to run_command or OutputParser directly and include at least one dict elem that contains a 'context_lines' key.
example: {'substr': r'''some regex''', 'level': WARNING, 'context_lines': {'pre': 5, 'post': 12}},
* note OutputParser is smart enough to not waste performance and will only use a 'buffer' if it sees an elem in error_list like the example above
2) currently we can output a max of 10 lines on either side of the requested context lines. This means the buffer has a max length of 21 but can be adjusted with a one line modification
3) context lines will be produced for either the full amount requested from 'context_lines' or to the limits of the buffer. This means that if we have a run_command that outputs less then the buffer max size, we produce context lines up to whatever the current buffer length is. Likewise, if a regex that has context_lines is found at the beginning of output or the end, the context_lines play within the limitations.
4) overlapping works as well. Say you want context lines for two regex's and their respective 'post' and 'pre' lengths cross into each other: the context lines can represent this overlap.
5) the log level taken for lines of context respects the 'worst_level' idea and will not modify any lines 'level' if that line is already worse.
6) the placeholders '$' and '>' can easily be modified. They were just there for testing.
###
Code for this implementation can be followed here: https://github.com/lundjordan/mozharness/compare/error-context-lines#files_bucket
see next comment's attached patch for feedback
an example output of this in action can be seen here: http://pastebin.mozilla.org/1770917
Updated•12 years ago
|
Assignee: nobody → jlund
Comment 2•12 years ago
|
||
This patch has been ran through unit.sh and has been tested inline with Desktop Unittesting in Mozharness. Will probably require more testing but here's the state so far.
Attachment #654333 -
Flags: feedback?(aki)
Comment 3•12 years ago
|
||
limitations:
1) for now this does not work with error_list's that also want summaries. This should be a quick easy fix though.
2) does not work if you subclass OutputParser like parts of marionette and DesktopUnittests do(the parts I tested in unittests were not using a subclassed OP). This also shouldn't be too bad to implement. However, as the code is now, developers who subclass OP will have to abide to some 'rules' if they want context_lines (since we are not logging lines as they come; we are adding it to a buffer that behaves like a queue).
3) many other limitations I am sure that I have not predicted
Reporter | ||
Comment 4•12 years ago
|
||
Comment on attachment 654333 [details] [diff] [review]
working implementation of context lines
> CONTEXT_WARN_MESSAGE = "Warning: missing context. Current "\
> "%(type)s buffer size: %(buffer)d, Requested %(type)s context lines:"\
> " %(context)d. Setting context length to: %(buffer)d"
Could we move this up to the top of the file?
I prefer header, imports, variable definitions, then classes.
Mixing a variable definition in between two classes bugs me.
>+ self.buffer_limit = 21 # gives 10 contexts lines on either side
It might be nice to set this via kwarg.
It would be even awesomer to set it via the highest context number in the error_list (which would remove the need for the CONTEXT_WARN_MESSAGE), but that's icing on the cake. If this lands with a hardcoded limit of 21, I'd still be happy.
> def add_lines(self, output):
>- if str(output) == output:
>+ self.use_buffer = False
>+ # allows us to enable context_buffer by simply adding
>+ # 'context_lines' to any error at any point in the future
>+ # This allows OutputParser to be optimized for the occasion
>+ # but no flags need to be passed to inform OutputParser to
>+ # use a buffer
>+ for error_check in self.error_list:
>+ if error_check.get('context_lines'):
>+ self.use_buffer = True
>+ break
Is there a reason this is in add_lines rather than in __init__()?
If we don't expect self.error_list to change after __init__(), this would be better there. This also prevents running this for loop every time we call add_lines(), which may be once per line for a long long log.
>+ if self.use_buffer:
>+ for line in output:
>+ line = self._validate_line(line)
>+ self.append_to_buffer_and_parse(line) if line else None
>+ # now empty the remaining lines left in the buffer
>+ self.flush_buffer_and_parse()
I'm a bit worried about the move from adding single lines to adding p.stdout in run_command(), but it seems to work.
As I mentioned on Vidyo, I'd love tests around this.
>+ for i, message_and_level in enumerate(context_lines):
>+ buffer_index = i + pre
>+ message, log_level = message_and_level
>+ if buffer_index == target_index:
>+ message = "$ %s" % message.lstrip()
>+ if warn_message:
>+ message += warn_message
>+ else: # context
>+ if message.startswith('$'):
>+ continue # ignore other regex's that want context
>+ message = "> %s" % message
>+ log_level = self.worst_level(target_level, log_level)
>+ self.context_buffer[buffer_index] = (message, log_level)
This is one way to do it.
I think it's clever, though I'd love to be able to either customize the characters, or remove them entirely.
To do that we'd need to be able to mark the message as a match outside of the actual string, which would probably make the buffer (message, log_level, match) instead of (message, log_level). I'd be fine with a list of dictionaries too.
If this is an easy ask, let's do it; if not we can stick with > and $ unless someone specifically asks for additional funcitonality there.
Overall, this looks great! I've wanted context lines since day 1 of mozharness, and haven't had the time to work on it, so this is very exciting.
Attachment #654333 -
Flags: feedback?(aki) → feedback+
Comment 5•12 years ago
|
||
The full patch of this work is coming in next comment and has been tested with merging into master/default.
example output for this one is similar to before: http://pastebin.mozilla.org/1810952
Unfortunetly I could only test locally on my machine since last night. I had some unittesting for this but there were bugs and it was a mess. I decided to pull those changes out.
Comment 6•12 years ago
|
||
feel free to hit me with any questions/concerns. I have no problem following up on this in my spare time.
Attachment #654333 -
Attachment is obsolete: true
Attachment #659525 -
Flags: review?(aki)
Reporter | ||
Comment 7•12 years ago
|
||
So my main concern here was, does this work?
And could we have testing?
I also started a new wip patch that applies on top of attachment 659525 [details] [diff] [review] that changes a bit of the behavior and commenting. I started a nosetests test object, then decided a standalone script might be best for the first stab at it. (Next comment)
Reporter | ||
Comment 8•12 years ago
|
||
I wrote this little test script, which I can port to test/test_base_log.py once things are written and smoothed out.
However, I got the following output, even though I thought I did pre: 3, post: 0:
INFO - SimpleFileLogger online at 20120911 18:03:52 in /src/jlund/mozharness
ERROR - $ one
ERROR - two
ERROR - three
ERROR - four
ERROR - five
ERROR - ERROR
ERROR - six
ERROR - seven
ERROR - eight
ERROR - nine
ERROR - tenWarning: missing context. Current pre buffer size: 0, Requested pre context lines: 3. Setting context length to: 0
ERROR - The match was found near the start and/or finish of output from command
Then I qpop'ed my patch, and still got this with your patch alone:
INFO - SimpleFileLogger online at 20120911 18:04:40 in /src/jlund/mozharness
ERROR - ----> one
ERROR - two
ERROR - three
ERROR - four
ERROR - five
ERROR - ERROR
ERROR - six
ERROR - seven
ERROR - eight
ERROR - nine
ERROR - tenWarning: missing context. Current pre buffer size: 0, Requested pre context lines: 3. Setting context length to: 0
ERROR - The match was found near the start and/or finish of output from command
Am I doing something wrong, or is something broken here?
Reporter | ||
Updated•12 years ago
|
Attachment #659525 -
Flags: review?(aki)
Comment 9•12 years ago
|
||
> Am I doing something wrong, or is something broken here?
hmmm this is interesting. looking at your script I can not see why nearly everything broken. I will apply this file to my repo and debug it. It sounds like I was overlooking something major or else rushing on my last day was not the best idea.
Will get back to you shortly.
Comment 10•12 years ago
|
||
OK, had some time today to debug.
So we have this condition with a reassignment in add_lines():
http://hg.mozilla.org/build/mozharness/file/160603bc87f8/mozharness/base/log.py#l169
and since in your test file you made (test_base_log.py) you are using a string for output:
output = "one\ntwo\nthree\nfour\nfive\nERROR\nseven\netc"
output will assigned to ["one\ntwo\nthree\nfour\nfive\nERROR\nseven\netc"] via the above link at line 169
then when we say 'for line in output:' in add_lines(), there is only one elem in the list. In other words we are not splitting up output and we are just parsing one string. This is why the substring 'ERROR' is found immediately and there is no pre or post context to buffer
by changing output to:
output = ['one', 'two', 'three', 'four', 'five', 'ERROR', 'six', 'seven', 'eight', 'nine', 'ten']
I get the following (correct) output:
→ python test/test_base_log.py
INFO - SimpleFileLogger online at 20120922 15:16:33 in /home/jlund/devel/mozilla/dirtyBuildRepos-git/mozharness
INFO - one
INFO - two
ERROR - > three
ERROR - > four
ERROR - > five
ERROR - ----> ERROR
INFO - six
INFO - seven
INFO - eight
INFO - nine
INFO - ten
so when we pass a string as output in add_lines() I think we have a bug. This didn't happen in my tests as I was using run_command which previously looped over the file p.stdout and passed each line to add_lines() or, with my last patch, passes the entire file (p.stdout). Both of these situations were working for me
Maybe we can:
if str(output) == output:
output = inputString.split('\n')
for line in output:
but I am not sure how optimized that will be for large strings being passed as output? What do you think? Am I making sense or have I completely overlooked something?
Reporter | ||
Comment 11•12 years ago
|
||
Sure, let's splitlines() if it's a string.
if isinstance(output, basestring):
output = output.splitlines()
?
Comment 12•12 years ago
|
||
I tested this with aki's config_tester.py
Attachment #659525 -
Attachment is obsolete: true
Attachment #664100 -
Flags: review?(aki)
Reporter | ||
Comment 13•12 years ago
|
||
[10:59] <aki> catlee: jlund changes it to | parser.add_lines(p.stdout) | ... will that buffer?
[11:00] <catlee> aki: yes
[11:00] <catlee> that does 'for line in output' internally
We're changing this line to
- for line in p.stdout:
+ while True:
+ line = p.stdout.readline()
+ if not line:
+ break
parser.add_lines(line)
to avoid buffering (when we're not using context lines).
I'm not sure how busy you are at school. If this bug needs to sit til I have time to look at this, I'm ok with that. If you want to tackle that, and potentially also read my mind as to what I was doing in https://bugzilla.mozilla.org/attachment.cgi?id=660279 (which would otherwise become review comments), that's cool too.
Reporter | ||
Comment 14•12 years ago
|
||
Comment on attachment 664100 [details] [diff] [review]
same patch but with one line modification as to Aki's last comment
Thanks for your work here Jordan!
Per comment 13, both of us are busy, and this patch needs a bit more work.
This will be a good starting point when one of us (or someone else) has time, and/or we find we suddenly need context lines.
Attachment #664100 -
Flags: review?(aki)
Assignee | ||
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
Comment 15•11 years ago
|
||
I am not actively working on this. May look at next quarter.
my focus/priority is primarily here:
- Bug 858797 - fx desktop builds in mozharness
Assignee: jlund → nobody
Updated•10 years ago
|
Component: Other → Tools
QA Contact: hwine
Updated•10 years ago
|
Component: Tools → Mozharness
Updated•10 years ago
|
Whiteboard: [mozharness] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1952] [mozharness]
Reporter | ||
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•