Closed Bug 669947 Opened 13 years ago Closed 13 years ago

Deploy minidump stackwalker to new vhost on build.mozilla.org

Categories

(Release Engineering :: General, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Unassigned)

References

Details

(Whiteboard: [buildbot-configs])

Can we create a new CNAME and vhost on build.mozilla.org called stackwalker.pvt.build.mozilla.org? This should be accessible only to machines in the build network. The source is located at http://hg.mozilla.org/users/tmielczarek_mozilla.com/minidump-stackwalk-cgi/file/3eb9c6e47b97 and build hosts need to be able to hit http://stackwalker.pvt.build.mozilla.org/stackwalker.cgi. ted, any special deployment instructions?
You need a minidump_stackwalk binary; you can grab the one from build/tools: http://hg.mozilla.org/build/tools/raw-file/5998154615cf/breakpad/linux/minidump_stackwalk You'll also need to copy config.py.in to config.py and adjust the paths. MINIDUMP_STACKWALK should point to the binary from above, and SYMBOL_CACHE_PATH needs to be a writable directory where the script can store symbol files. We'll probably also need to set up a cron job to remove old files from that directory.
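For illustration, here is a minimal sketch of what the adjusted config.py might contain, plus a small sanity check. Only the two setting names come from this bug; the path values are the ones used elsewhere in this bug, the real config.py.in may define other settings, and check_config is a hypothetical helper that is not part of the CGI.

```python
# Hypothetical config.py sketch; the real config.py.in may contain more settings.
import os

# Path to the minidump_stackwalk binary (example value, adjust for your host).
MINIDUMP_STACKWALK = "/var/www/html/stackwalker/minidump_stackwalk"
# Writable directory where the CGI stores downloaded symbol files (example value).
SYMBOL_CACHE_PATH = "/builds/stackwalker_symbols"


def check_config(binary, cache_dir):
    """Return a list of problems found; an empty list means both paths look usable."""
    problems = []
    if not (os.path.isfile(binary) and os.access(binary, os.X_OK)):
        problems.append("not an executable file: %s" % binary)
    if not (os.path.isdir(cache_dir) and os.access(cache_dir, os.W_OK)):
        problems.append("not a writable directory: %s" % cache_dir)
    return problems
```

Calling check_config(MINIDUMP_STACKWALK, SYMBOL_CACHE_PATH) as the web server user would show whether the CGI can execute the binary and write to the cache.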
Assignee: server-ops-releng → server-ops
Component: Server Operations: RelEng → Server Operations
QA Contact: zandr → mrz
Assignee: server-ops → server-ops-releng
Component: Server Operations → Server Operations: RelEng
QA Contact: mrz → zandr
I can take care of the webby parts of this - I'll need infra folks to take care of the DNS parts.
Assignee: server-ops-releng → dustin
I added a CNAME pointing stackwalker.pvt.build.mozilla.org at build.mozilla.org.
The vhost is up at http://stackwalker.pvt.build.mozilla.org/index.txt and is accessible only from the build network. I followed the instructions in comment 1. SYMBOL_CACHE_PATH is /builds/stackwalker_symbols, since there was lots of space on that partition. The CGI is now up at http://stackwalker.pvt.build.mozilla.org/stackwalk.cgi. However, it seems to expect hashlib, which is new in Python 2.5 and so is not available in Python 2.4.3, the version installed on this system. That's usually pretty easy to replace with the md5 module. Can you fix that up so I can put a new copy on there? What sort of old files should the cron task look for? Just by date, or only certain files?
I wrote up docs at https://mana.mozilla.org/wiki/display/SpecOps/Stackwalker+CGI
Still to do, once it's working:
* puppetize (I'll learn how this works on bug 604688)
* set up and document crontab
I pushed a fix to use md5 if hashlib isn't available, so it should work on Python 2.4 now. It looks like I was thinking ahead when I wrote the CGI, and it updates the mtime on the cache directory if it uses an already-downloaded set of symbols, so you should be able to find directories that are immediate children of SYMBOL_CACHE_PATH and rm any whose mtime is older than whatever time period we decide. (24 hours?)
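The two mechanisms described in this comment can be sketched roughly as follows. This is not the code from the CGI: the function names are hypothetical, and deriving the cache directory name from an MD5 digest of the symbols URL is an assumption made only for illustration.

```python
import os

try:
    from hashlib import md5 as _md5_new  # Python 2.5 and later
except ImportError:
    from md5 import new as _md5_new  # Python 2.4 fallback


def cache_dir_name(symbols_url):
    """Hex MD5 digest of the URL (using it as the directory name is an assumption)."""
    return _md5_new(symbols_url.encode("utf-8")).hexdigest()


def touch(path):
    """Set the atime and mtime of path to now so a cleanup job sees it as recent."""
    os.utime(path, None)
```

Touching the directory on every cache hit is what lets a cleanup job judge staleness by mtime alone, without tracking last use anywhere else.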
OK, I think the CGI is working. I get:
    Error: no minidump or no symbols
Also, I set up the following in cron.d:
    MAILTO=dustin@mozilla.com
    @daily root find /builds/stackwalker_symbols/ -mindepth 1 -maxdepth 1 -mtime +7 -type d -exec rm -rf \{} \;
Once I don't get any interesting emails, I'll change the MAILTO to release@. I think this is it for the ops side of things. Next steps:
- do a test stackwalk to ensure permissions are correct, etc. (I can do this if you tell me what to type, or I can get you Build VPN access with releng's permission)
- adjust the buildbot config to use this (a releng project)
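For readers who want the cleanup logic spelled out, here is a rough Python approximation of the find command in that cron entry. It is only a sketch: purge_stale_dirs is a hypothetical name, and the cutoff is a plain number of seconds, whereas find's -mtime +7 rounds ages down to whole days and so matches directories that are at least 8 days old.

```python
import os
import shutil
import time


def purge_stale_dirs(cache_root, max_age_days=7, now=None):
    """Remove immediate child directories whose mtime is older than the cutoff.

    Returns the sorted list of removed paths.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400
    removed = []
    for name in os.listdir(cache_root):
        path = os.path.join(cache_root, name)
        # Mirror `-type d`: skip plain files and symlinks.
        if os.path.islink(path) or not os.path.isdir(path):
            continue
        if os.stat(path).st_mtime < cutoff:
            shutil.rmtree(path)
            removed.append(path)
    return sorted(removed)
```

Only the top level is examined, matching -mindepth 1 -maxdepth 1, so the age of files inside a symbol directory does not matter.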
Okay, try this. Clone the minidump-stackwalk-cgi repo to your local machine (with VPN access to the machine hosting the CGI). Download this minidump file locally: http://people.mozilla.com/~tmielczarek/6404faf2-deac-09f3-6f26264d-562d535c.dmp
Then run:
    python testsubmit.py 6404faf2-deac-09f3-6f26264d-562d535c.dmp http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64/1311761319/firefox-8.0a1.en-US.linux-x86_64.crashreporter-symbols.zip http://stackwalker.pvt.build.mozilla.org/stackwalk.cgi
Then pastebin the output and link it to me on IRC.
So there was no output from this. Internally, it's running
    /var/www/html/stackwalker/minidump_stackwalk /tmp/tmp2f1kt6 /builds/stackwalker_symbols/90a461124f840d5d74fff3a542fd261d
which, when run on the console, gives
    /var/www/html/stackwalker/minidump_stackwalk: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by /var/www/html/stackwalker/minidump_stackwalk)
    /var/www/html/stackwalker/minidump_stackwalk: /usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.11' not found (required by /var/www/html/stackwalker/minidump_stackwalk)
So again we're being bitten by an ancient CentOS. If there's source for this somewhere (maybe committed to the repo?), I'll be happy to recompile locally. For the record: Red Hat Enterprise Linux Server release 5.5 (Tikanga)
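The silent failure above is the kind of problem that becomes visible once the child process's exit code and stderr are captured alongside stdout. A minimal sketch of doing that in Python follows; run_and_capture is a hypothetical helper and does not describe how the CGI actually invokes the binary.

```python
import subprocess


def run_and_capture(argv):
    """Run argv and return (exit_code, stdout_text, stderr_text)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return (
        proc.returncode,
        out.decode("utf-8", "replace"),
        err.decode("utf-8", "replace"),
    )
```

Called with the binary path, the minidump path and the symbol directory as the three list elements, a missing GLIBCXX version would show up as a non-zero exit code with the loader's message in the stderr text.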
Yeah, you can grab it from SVN: http://code.google.com/p/google-breakpad/source/checkout Run configure && make, and the binary will wind up in src/processor.
Success!
    /var/www/html/stackwalker/minidump_stackwalk /tmp/tmpGbVkxU /builds/stackwalker_symbols/90a461124f840d5d74fff3a542fd261d
    Operating system: Linux
                      0.0.0 Linux 2.6.32-32-generic #62-Ubuntu SMP Wed Apr 20 21:52:38 UTC 2011 x86_64
    CPU: amd64
         family 6 model 15 stepping 11
         4 CPUs
    Crash reason: SIGSEGV
    Crash address: 0x0
    Thread 0 (crashed)
     0  libcrashme.so + 0x2f1
        rbx = 0x0000000000000008 r12 = 0x00007f52011abbb8
        r13 = 0x0000000000000004 r14 = 0x0000000000000001
        r15 = 0x00007fff514e7258 rip = 0x00007f521fe042f1
        rsp = 0x00007fff514e7020 rbp = 0x00007fff514e7040
        Found by: given as instruction pointer in context
     1  libxul.so!js::mjit::ic::NativeCall [MonoIC.cpp:0a936ddb70e9 : 1031 + 0x4]
    ........ etc.
Over to release engineering to set this up in the buildbot configs, then. I also updated the docs to indicate that this needs to be compiled.
Assignee: dustin → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
(In reply to comment #11)
> Over to release engineering to set this up in the buildbot configs, then. I
> also updated the docs to indicate that this needs to be compiled.
What exactly does releng need to do here? Just set SYMBOL_CACHE_PATH in the env (since I think we already set MINIDUMP_STACKWALK)?
OS: Linux → All
Priority: -- → P3
Hardware: x86_64 → All
Whiteboard: [buildbot-configs]
I think our end of things is bug 561754.
Yeah, this bug was just to set up the CGI.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering