Closed
Bug 1457706
Opened 7 years ago
Closed 7 years ago
Symbol API gives Internal server error when requesting symbols for an Android libxul.so
Categories
(Socorro :: Symbols, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mstange, Assigned: peterbe)
Attachments
(1 file)
(deleted),
text/x-github-pull-request
Example request:
> curl 'https://symbols.mozilla.org/symbolicate/v5' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:61.0) Gecko/20100101 Firefox/61.0' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'content-type: text/plain;charset=UTF-8' -H 'origin: https://perf-html.io' -H 'Connection: keep-alive' --data '{"memoryMap":[["libxul.so","6C6D700C55368ABB2D1564B806504DE70"]],"stacks":[[[0,8755689]]]}'
returns
{"error": "Internal Server Error"}
However, the symbol file at https://symbols.mozilla.org/libxul.so/6C6D700C55368ABB2D1564B806504DE70/libxul.so.sym (warning: hangs Firefox if loaded) exists.
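For reference, the same request body can be built in Python. This is only a minimal sketch: `build_symbolicate_payload` is a hypothetical helper name (it is not part of Tecken), and reading each stack frame as a `[module index, offset]` pair is an assumption inferred from the request above and the `module_offset` value in the responses further down in this bug.

```python
import json


def build_symbolicate_payload(modules, frames):
    """Build a JSON body shaped like the --data argument in the curl example.

    modules: list of (debug_filename, debug_id) pairs
    frames:  list of (module_index, offset) pairs forming a single stack
             (this interpretation of the pair is an assumption)
    """
    body = {
        "memoryMap": [[name, debug_id] for name, debug_id in modules],
        "stacks": [[[index, offset] for index, offset in frames]],
    }
    # Compact separators reproduce the exact string used with curl.
    return json.dumps(body, separators=(",", ":"))


payload = build_symbolicate_payload(
    [("libxul.so", "6C6D700C55368ABB2D1564B806504DE70")],
    [(0, 8755689)],
)
print(payload)
# The decimal offset in the request is the hex module_offset in the response.
print(hex(8755689))
```

The second print shows why the decimal offset `8755689` in the request appears as `0x8599e9` in the symbolicated output.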
Assignee
Comment 1•7 years ago
It's probably this: https://sentry.prod.mozaws.net/operations/symbols-prod/issues/4293515/
ConnectionResetError: [Errno 104] Connection reset by peer
File "redis/connection.py", line 590, in send_packed_command
self._sock.sendall(item)
ConnectionError: Error 104 while writing to socket. Connection reset by peer.
File "redis/client.py", line 667, in execute_command
connection.send_command(*args)
File "redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "redis/connection.py", line 603, in send_packed_command
(errno, errmsg))
BrokenPipeError: [Errno 32] Broken pipe
File "redis/connection.py", line 590, in send_packed_command
self._sock.sendall(item)
ConnectionError: Error 32 while writing to socket. Broken pipe.
File "django/core/handlers/exception.py", line 35, in inner
response = get_response(request)
File "django/core/handlers/base.py", line 128, in _get_response
response = self.process_exception_by_middleware(e, request)
File "django/core/handlers/base.py", line 126, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "tecken/base/decorators.py", line 165, in inner
response = func(*args, **kwargs)
File "django/views/decorators/csrf.py", line 54, in wrapped_view
return view_func(*args, **kwargs)
File "tecken/base/decorators.py", line 100, in wrapper
return view_func(request, *args, **kwargs)
File "markus/main.py", line 357, in _timer_decorator
return fun(*args, **kwargs)
File "tecken/symbolicate/views.py", line 775, in inner
return view_function(request, json_body)
File "tecken/symbolicate/views.py", line 930, in symbolicate_v5_json
job['memoryMap'],
File "tecken/symbolicate/views.py", line 205, in symbolicate
for symbol_key, information, module_index in downloaded:
File "tecken/symbolicate/views.py", line 633, in load_symbols
information['symbol_map']
File "redis/client.py", line 2011, in hmset
return self.execute_command('HMSET', name, *items)
File "redis/client.py", line 673, in execute_command
connection.send_command(*args)
File "redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "redis/connection.py", line 603, in send_packed_command
(errno, errmsg))
Either Redis is currently down (unlikely due to healthcheck monitoring) or it's failing to make the HMSET command because the data is so big that it times out.
Assignee
Comment 2•7 years ago
By the way, this error has happened 11 times in Sentry.
Assignee: nobody → peterbe
Assignee
Comment 3•7 years ago
Reproducible locally https://gist.github.com/peterbe/8ebfa6d5698c6a5c628313dccad047c2
Assignee
Comment 4•7 years ago
The crashing lines are:
redis_store_connection.hmset(
store.make_key(cache_key),
information['symbol_map']
)
That Python dict `information['symbol_map']` is about 50MB. Perhaps it's simply too huge to be allowed. It has 525,236 keys, which isn't too many according to https://stackoverflow.com/a/39268254/205832
Here they're trying to store a set that is larger than 512MB (the limit) and get the same "Error 104" as I get locally.
https://github.com/andymccurdy/redis-py/issues/850
The "Errno 104" is easy to reproduce with this script: https://gist.github.com/peterbe/1d8c7ea998e27006227d987bc4e37df0
Basically, if the dict is ~10MB as a string, you get the ConnectionError from redis-py.
This is when I'm using Redis and Python inside my docker-for-mac.
The stunning thing is that 50MB is big but it's not unusual. How come we didn't see this before when we played with v5 of the Symbolication API? The file libxul.so.sym is 232MB uncompressed.
There must be a way to break up the HMSET with batching.
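One way the batching idea could look, as a rough sketch rather than the actual Tecken change: split the dict into fixed-size chunks and send one HMSET per chunk. The names `iter_batches`, `hmset_in_batches` and `RecordingConnection` are hypothetical, and the batch size is left to the caller. `RecordingConnection` is a stand-in so the example runs without a Redis server; a real redis-py client exposes `hmset(name, mapping)` as used in the crashing lines above (newer redis-py releases deprecate it in favour of `hset(name, mapping=...)`).

```python
from itertools import islice


def iter_batches(mapping, batch_size):
    """Yield successive dicts holding at most batch_size items of mapping."""
    items = iter(mapping.items())
    while True:
        batch = dict(islice(items, batch_size))
        if not batch:
            return
        yield batch


def hmset_in_batches(connection, key, mapping, batch_size):
    """Send one HMSET per batch instead of a single huge command."""
    for batch in iter_batches(mapping, batch_size):
        connection.hmset(key, batch)


class RecordingConnection:
    """Stand-in for a Redis client; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def hmset(self, key, mapping):
        self.calls.append((key, len(mapping)))


conn = RecordingConnection()
symbol_map = {offset: f"function_{offset}" for offset in range(25)}
hmset_in_batches(conn, "example-cache-key", symbol_map, batch_size=10)
print(conn.calls)
```

A trade-off to keep in mind: several separate HMSET calls are not atomic, so another client could briefly read a partially filled hash unless the writes are wrapped in a transaction.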
Comment 5•7 years ago
Assignee
Comment 6•7 years ago
^ This PR fixes it. I was able to repeatedly reproduce it locally. In my Docker Redis setup, I found the HMSET would raise a ConnectionError when the total size of the dict (formatted as a string) was around 9.5MB. So with some rough guesswork (based on /libxul.so/6C6D700C55368ABB2D1564B806504DE70/libxul.so.sym) I guessed that roughly 50,000 signatures (and their keys) sum to about 5MB (as a string).
This made it possible for me to locally run the symbolication:
```
▶ curl -s -XPOST --data '{"memoryMap":[["libxul.so","6C6D700C55368ABB2D1564B806504DE70"]],"stacks":[[[0,8755689]]]}' http://localhost:8000/symbolicate/v5 | jq
{
"results": [
{
"stacks": [
[
{
"module_offset": "0x8599e9",
"module": "libxul.so",
"frame": 0,
"function": "XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)",
"function_offset": "0x8d4"
}
]
],
"found_modules": {
"libxul.so/6C6D700C55368ABB2D1564B806504DE70": true
}
}
]
}
```
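The 5MB guess can be sanity-checked with back-of-the-envelope arithmetic using the figures from this bug (about 50MB for 525,236 keys). This is only an approximation: it assumes "MB" means MiB and that every entry is roughly the same size, neither of which is stated in the bug.

```python
import math

total_bytes = 50 * 1024 * 1024  # "about 50MB" dict from comment 4 (assumed MiB)
total_keys = 525_236            # number of keys in the symbol map
batch_keys = 50_000             # batch size guessed in this comment

# Average size of one key/value pair, assuming entries are roughly uniform.
bytes_per_entry = total_bytes / total_keys

# Approximate size of one batch, and how many HMSET calls are needed.
batch_mb = batch_keys * bytes_per_entry / (1024 * 1024)
batches = math.ceil(total_keys / batch_keys)

print(f"{bytes_per_entry:.0f} bytes per entry")
print(f"{batch_mb:.2f} MB per batch of {batch_keys} keys")
print(f"{batches} batches in total")
```

Under those assumptions each batch comes out a little under 5MB, roughly half of the ~9.5MB point where the ConnectionError appeared locally.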
Comment 7•7 years ago
Commit pushed to master at https://github.com/mozilla-services/tecken
https://github.com/mozilla-services/tecken/commit/4e6735681b51049dd428a91266c55221a08692a2
fixes bug 1457706 - batch send HMSET for large symbols (#859)
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee
Comment 8•7 years ago
This only just made it into master. Let's see if we can test this on (poor feeble) Dev. If it works there, let's aim to make a release and once that works we can close this bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 9•7 years ago
Sorry for the delay. The deployment pipeline is busted so we can't upgrade Stage (for testing). It's unrelated to the Tecken code. Miles is actively working on a resolution. I'm hoping to make a Stage + Prod release tomorrow, Tuesday May 1.
Reporter
Comment 10•7 years ago
No worries! And thanks for investigating this issue so quickly.
Assignee
Comment 11•7 years ago
I've verified that the fix unbroke Stage. Doing Prod deployment today.
Assignee
Comment 12•7 years ago
Made it into prod now.
Assignee
Comment 13•7 years ago
Oops. Forgot to resolve.
Anyway...
▶ curl -s -XPOST --data '{"memoryMap":[["libxul.so","6C6D700C55368ABB2D1564B806504DE70"]],"stacks":[[[0,8755689]]]}' https://symbols.mozilla.org/symbolicate/v5 | jq
{
"results": [
{
"stacks": [
[
{
"module_offset": "0x8599e9",
"module": "libxul.so",
"frame": 0,
"function": "XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)",
"function_offset": "0x8d4"
}
]
],
"found_modules": {
"libxul.so/6C6D700C55368ABB2D1564B806504DE70": true
}
}
]
}
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Reporter
Comment 14•7 years ago
I can confirm it's working now. Thanks!!