Closed Bug 454878 Opened 16 years ago Closed 16 years ago

PR_GetPhysicalMemorySize sometimes returns incorrect result on Mac

Categories

(NSPR :: NSPR, defect, P2)

x86
macOS

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: jaas)

References

Details

(Keywords: fixed1.9.1)

Attachments

(3 files, 1 obsolete file)

We tracked down bug 454587 to, as far as we can tell, PR_GetPhysicalMemorySize returning 0 every so often. When this happens, we end up creating a 0-size memory cache (as well as not creating the right size for some of the Places caches during startup, etc). The problem happens particularly often if PR_GetPhysicalMemorySize is called twice within a short period of time (which is what happens during startup), but it seems to be happening in some other cases too (which causes the test failure in question). The relevant (Mac-specific) code is: 302 struct host_basic_info hInfo; 303 mach_msg_type_number_t count; 304 305 int result = host_info(mach_host_self(), 306 HOST_BASIC_INFO, 307 (host_info_t) &hInfo, 308 &count); 309 if (result == KERN_SUCCESS) 310 bytes = hInfo.memory_size; and it looks like things fail when |result| is not KERN_SUCCESS. In fact, in my tests it's often 5 (which is the generic KERN_FAILURE value). Requesting blocking, since this has some pretty serious performance implications: we end up creating memory caches that are completely the wrong size in various places. Not sure who we have around who'd be able to do something with this... shaver, any ideas for a Mac guru?
Flags: blocking1.9.1?
Oh, this is a problem on both Tiger and Leopard.
Blocks: 454587
If this is an Apple bug with the "host_info" method of obtaining the information we want, we could try getting it a different way: SInt32 bytes; ::Gestalt(gestaltPhysicalRAMSize, &bytes);
Also, we might be able to mitigate the problem by caching the first non-zero response we get from "host_info".
I'd tried the gestalt thing but couldn't figure out the right headers to include to make it work, fwiw.
Attached patch Gestalt fix v1.0 (obsolete) (deleted) — Splinter Review
Doing this with Gestalt is a little trickier than I made it sound. We have to get the system memory size in MBs because the byte count maxes out at 2 gigs of memory. I haven't tested this patch beyond making sure it runs and the values seem reasonable. It might not do rounding perfectly and if for some reason we need actual byte accuracy it may not work.
That patch fixes the bug over here. I get two correct values on startup, instead of the second value being 0.
Attached patch fix v1.0 (deleted) — Splinter Review
We need to initialize the size of the buffer and use the 64-bit value for memory size.
Attachment #341964 - Attachment is obsolete: true
Attachment #342017 - Flags: review?(bzbarsky)
Assignee: wtc → joshmoz
Attached file test app v1.0 (deleted) —
fwiw, here is the little test app I was using to figure this out
Comment on attachment 342017 [details] [diff] [review] fix v1.0 Looks good to me; needs review from wtc, I think.
Attachment #342017 - Flags: review?(wtc)
Attachment #342017 - Flags: review?(bzbarsky)
Attachment #342017 - Flags: review+
Comment on attachment 342017 [details] [diff] [review] fix v1.0 r=wtc. Thank you, Josh.
Attachment #342017 - Flags: review?(wtc) → review+
Wan-Teh - I'm not sure how nspr patching works these days. Are you taking care of landing this or should I do something?
Comment on attachment 342017 [details] [diff] [review] fix v1.0 I need to check this patch in to the CVS repository, and then import a new NSPR snapshot into mozilla-central. I found sample code on the web that initializes 'count' to HOST_BASIC_INFO_COUNT, for example, http://my.safaribooksonline.com/0321278542/ch06lev1sec3 Should we also do that? I also found Darwin source code that suggests 'max_mem' is not in the old version of the host_basic_info structure: http://www.opensource.apple.com/darwinsource/projects/other/xnu-1228.3.13/osfmk/kern/host.c I wonder if we should check the value of 'count' on return from host_info, something like: if (count > offsetof(struct host_basic_info, max_mem)) bytes = hInfo.max_mem; else bytes = hInfo.memory_size;
(In reply to comment #12) > I found sample code on the web that initializes 'count' to > HOST_BASIC_INFO_COUNT, for example, > http://my.safaribooksonline.com/0321278542/ch06lev1sec3 > > Should we also do that? Fine with me. > I also found Darwin source code that suggests 'max_mem' is > not in the old version of the host_basic_info structure: > http://www.opensource.apple.com/darwinsource/projects/other/xnu-1228.3.13/osfmk/kern/host.c It does exist as of Mac OS X 10.4, so as long as that is the nspr minimum we're fine. That's our currently supported minimum in Gecko. > I wonder if we should check the value of 'count' on return > from host_info, something like: > > if (count > offsetof(struct host_basic_info, max_mem)) > bytes = hInfo.max_mem; > else > bytes = hInfo.memory_size; Fine with me, but maybe cast memory_size explicitly? It is a 32-bit value.
This pretty clearly needs to block 1.9.1.
Flags: blocking1.9.1? → blocking1.9.1+
Comment on attachment 342017 [details] [diff] [review] fix v1.0 I checked in Josh's fix v1.0 on the NSPR trunk (NSPR 4.7.2). Checking in prsystem.c; /cvsroot/mozilla/nsprpub/pr/src/misc/prsystem.c,v <-- prsystem.c new revision: 3.30; previous revision: 3.29 done
Priority: -- → P2
Attached patch Use HOST_BASIC_INFO_COUNT (deleted) — Splinter Review
I found that we have to use HOST_BASIC_INFO_COUNT, which is defined as typedef struct host_basic_info host_basic_info_data_t; ... #define HOST_BASIC_INFO_COUNT ((mach_msg_type_number_t) \ (sizeof(host_basic_info_data_t)/sizeof(integer_t))) So, 'count' should be set to the size in number of integers. sizeof(hInfo) is four times too big (48 vs. 12, which is the value of 'count' on return from host_info).
Attachment #342137 - Flags: review?(joshmoz)
Attachment #342137 - Flags: review?(joshmoz) → review+
Comment on attachment 342137 [details] [diff] [review] Use HOST_BASIC_INFO_COUNT I checked in this patch on the NSPR trunk (NSPR 4.7.2). Checking in prsystem.c; /cvsroot/mozilla/nsprpub/pr/src/misc/prsystem.c,v <-- prsystem.c new revision: 3.31; previous revision: 3.30 done
Blocks: 459235
OK, I pushed NSPR_4_7_2_BETA4 to mozilla-central in changeset 16151b223126 (http://hg.mozilla.org/mozilla-central/rev/16151b223126), which contains this bug fix.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → 4.7.2
This patch caused a Tp regression on PPC-only. See bug 465212.
Although the original code is faster, it is incorrect. (It's probably faster because the host_info calls fails.) We should not go back to incorrect code no matter how fast it is. The Tp regression seems to imply that the host_info call takes about 18ms. Or perhaps we do more work when we know how much memory the system has. Since this regression is PPC only, I don't think we should spend too much time on it. We can fix the Tp regression by not calling PR_GetPhysicalMemorySize on Mac OS X PPC? That seems like a wrong tradeoff.
Blocks: 470530
(In reply to comment #18) > OK, I pushed NSPR_4_7_2_BETA4 to mozilla-central in changeset 16151b223126 > (http://hg.mozilla.org/mozilla-central/rev/16151b223126), which contains > this bug fix. Since the 1.9.1 branch is also using NSPR_4_7_2_BETA4, marking fixed1.9.1. http://hg.mozilla.org/releases/mozilla-1.9.1/rev/16151b223126
Keywords: fixed1.9.1
The link to Darwin source code in comment 12 is broken now, so I copied the code from the current version of that file (in Mac OS X 10.5.8) for future reference. http://www.opensource.apple.com/source/xnu/xnu-1228.15.4/osfmk/kern/host.c kern_return_t host_info( host_t host, host_flavor_t flavor, host_info_t info, mach_msg_type_number_t *count) { if (host == HOST_NULL) return (KERN_INVALID_ARGUMENT); switch (flavor) { case HOST_BASIC_INFO: { register host_basic_info_t basic_info; register int master_num; /* * Basic information about this host. */ if (*count < HOST_BASIC_INFO_OLD_COUNT) return (KERN_FAILURE); basic_info = (host_basic_info_t) info; basic_info->memory_size = machine_info.memory_size; basic_info->max_cpus = machine_info.max_cpus; basic_info->avail_cpus = processor_avail_count; master_num = master_processor->cpu_num; basic_info->cpu_type = slot_type(master_num); basic_info->cpu_subtype = slot_subtype(master_num); if (*count >= HOST_BASIC_INFO_COUNT) { basic_info->cpu_threadtype = slot_threadtype(master_num); basic_info->physical_cpu = machine_info.physical_cpu; basic_info->physical_cpu_max = machine_info.physical_cpu_max; basic_info->logical_cpu = machine_info.logical_cpu; basic_info->logical_cpu_max = machine_info.logical_cpu_max; basic_info->max_mem = machine_info.max_mem; *count = HOST_BASIC_INFO_COUNT; } else { *count = HOST_BASIC_INFO_OLD_COUNT; } return (KERN_SUCCESS); } case HOST_SCHED_INFO: ... default: return (KERN_INVALID_ARGUMENT); } }
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: