Closed
Bug 454878
Opened 16 years ago
Closed 16 years ago
PR_GetPhysicalMemorySize sometimes returns incorrect result on Mac
Categories
(NSPR :: NSPR, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
4.7.2
People
(Reporter: bzbarsky, Assigned: jaas)
References
Details
(Keywords: fixed1.9.1)
Attachments
(3 files, 1 obsolete file)
(deleted),
patch
|
bzbarsky
:
review+
wtc
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
jaas
:
review+
|
Details | Diff | Splinter Review |
We tracked down bug 454587 to, as far as we can tell, PR_GetPhysicalMemorySize returning 0 every so often. When this happens, we end up creating a 0-size memory cache (as well as not creating the right size for some of the Places caches during startup, etc). The problem happens particularly often if PR_GetPhysicalMemorySize is called twice within a short period of time (which is what happens during startup), but it seems to be happening in some other cases too (which causes the test failure in question).
The relevant (Mac-specific) code is:
302 struct host_basic_info hInfo;
303 mach_msg_type_number_t count;
304
305 int result = host_info(mach_host_self(),
306 HOST_BASIC_INFO,
307 (host_info_t) &hInfo,
308 &count);
309 if (result == KERN_SUCCESS)
310 bytes = hInfo.memory_size;
and it looks like things fail when |result| is not KERN_SUCCESS. In fact, in my tests it's often 5 (which is the generic KERN_FAILURE value).
Requesting blocking, since this has some pretty serious performance implications: we end up creating memory caches that are completely the wrong size in various places.
Not sure who we have around who'd be able to do something with this... shaver, any ideas for a Mac guru?
Flags: blocking1.9.1?
If this is an Apple bug with the "host_info" method of obtaining the information we want, we could try getting it a different way:
SInt32 bytes;
::Gestalt(gestaltPhysicalRAMSize, &bytes);
Also, we might be able to mitigate the problem by caching the first non-zero response we get from "host_info".
Reporter | ||
Comment 4•16 years ago
|
||
I'd tried the gestalt thing but couldn't figure out the right headers to include to make it work, fwiw.
Doing this with Gestalt is a little trickier than I made it sound. We have to get the system memory size in MBs because the byte count maxes out at 2 gigs of memory.
I haven't tested this patch beyond making sure it runs and the values seem reasonable. It might not do rounding perfectly and if for some reason we need actual byte accuracy it may not work.
Reporter | ||
Comment 6•16 years ago
|
||
That patch fixes the bug over here. I get two correct values on startup, instead of the second value being 0.
We need to initialize the size of the buffer and use the 64-bit value for memory size.
Attachment #341964 -
Attachment is obsolete: true
Attachment #342017 -
Flags: review?(bzbarsky)
fwiw, here is the little test app I was using to figure this out
Reporter | ||
Comment 9•16 years ago
|
||
Comment on attachment 342017 [details] [diff] [review]
fix v1.0
Looks good to me; needs review from wtc, I think.
Attachment #342017 -
Flags: review?(wtc)
Attachment #342017 -
Flags: review?(bzbarsky)
Attachment #342017 -
Flags: review+
Comment 10•16 years ago
|
||
Comment on attachment 342017 [details] [diff] [review]
fix v1.0
r=wtc. Thank you, Josh.
Attachment #342017 -
Flags: review?(wtc) → review+
Assignee | ||
Comment 11•16 years ago
|
||
Wan-Teh - I'm not sure how nspr patching works these days. Are you taking care of landing this or should I do something?
Comment 12•16 years ago
|
||
Comment on attachment 342017 [details] [diff] [review]
fix v1.0
I need to check this patch in to the CVS repository, and then
import a new NSPR snapshot into mozilla-central.
I found sample code on the web that initializes 'count' to
HOST_BASIC_INFO_COUNT, for example,
http://my.safaribooksonline.com/0321278542/ch06lev1sec3
Should we also do that?
I also found Darwin source code that suggests 'max_mem' is
not in the old version of the host_basic_info structure:
http://www.opensource.apple.com/darwinsource/projects/other/xnu-1228.3.13/osfmk/kern/host.c
I wonder if we should check the value of 'count' on return
from host_info, something like:
if (count > offsetof(struct host_basic_info, max_mem))
bytes = hInfo.max_mem;
else
bytes = hInfo.memory_size;
Assignee | ||
Comment 13•16 years ago
|
||
(In reply to comment #12)
> I found sample code on the web that initializes 'count' to
> HOST_BASIC_INFO_COUNT, for example,
> http://my.safaribooksonline.com/0321278542/ch06lev1sec3
>
> Should we also do that?
Fine with me.
> I also found Darwin source code that suggests 'max_mem' is
> not in the old version of the host_basic_info structure:
> http://www.opensource.apple.com/darwinsource/projects/other/xnu-1228.3.13/osfmk/kern/host.c
It does exist as of Mac OS X 10.4, so as long as that is the nspr minimum we're fine. That's our currently supported minimum in Gecko.
> I wonder if we should check the value of 'count' on return
> from host_info, something like:
>
> if (count > offsetof(struct host_basic_info, max_mem))
> bytes = hInfo.max_mem;
> else
> bytes = hInfo.memory_size;
Fine with me, but maybe cast memory_size explicitly? It is a 32-bit value.
Assignee | ||
Comment 14•16 years ago
|
||
This pretty clearly needs to block 1.9.1.
Flags: blocking1.9.1? → blocking1.9.1+
Comment 15•16 years ago
|
||
Comment on attachment 342017 [details] [diff] [review]
fix v1.0
I checked in Josh's fix v1.0 on the NSPR trunk (NSPR 4.7.2).
Checking in prsystem.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prsystem.c,v <-- prsystem.c
new revision: 3.30; previous revision: 3.29
done
Comment 16•16 years ago
|
||
I found that we have to use HOST_BASIC_INFO_COUNT,
which is defined as
typedef struct host_basic_info host_basic_info_data_t;
...
#define HOST_BASIC_INFO_COUNT ((mach_msg_type_number_t) \
(sizeof(host_basic_info_data_t)/sizeof(integer_t)))
So, 'count' should be set to the size in number of integers.
sizeof(hInfo) is four times too big (48 vs. 12, which is the
value of 'count' on return from host_info).
Attachment #342137 -
Flags: review?(joshmoz)
Attachment #342137 -
Flags: review?(joshmoz) → review+
Comment 17•16 years ago
|
||
Comment on attachment 342137 [details] [diff] [review]
Use HOST_BASIC_INFO_COUNT
I checked in this patch on the NSPR trunk (NSPR 4.7.2).
Checking in prsystem.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prsystem.c,v <-- prsystem.c
new revision: 3.31; previous revision: 3.30
done
Comment 18•16 years ago
|
||
OK, I pushed NSPR_4_7_2_BETA4 to mozilla-central in changeset 16151b223126
(http://hg.mozilla.org/mozilla-central/rev/16151b223126), which contains
this bug fix.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → 4.7.2
Comment 19•16 years ago
|
||
This patch caused a Tp regression on PPC-only. See bug 465212.
Comment 20•16 years ago
|
||
Although the original code is faster, it is incorrect.
(It's probably faster because the host_info calls fails.)
We should not go back to incorrect code no matter how fast
it is.
The Tp regression seems to imply that the host_info call
takes about 18ms. Or perhaps we do more work when we know
how much memory the system has.
Since this regression is PPC only, I don't think we should
spend too much time on it.
We can fix the Tp regression by not calling
PR_GetPhysicalMemorySize on Mac OS X PPC? That seems like
a wrong tradeoff.
Comment 21•16 years ago
|
||
(In reply to comment #18)
> OK, I pushed NSPR_4_7_2_BETA4 to mozilla-central in changeset 16151b223126
> (http://hg.mozilla.org/mozilla-central/rev/16151b223126), which contains
> this bug fix.
Since the 1.9.1 branch is also using NSPR_4_7_2_BETA4, marking fixed1.9.1.
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/16151b223126
Keywords: fixed1.9.1
Comment 22•15 years ago
|
||
The link to Darwin source code in comment 12 is broken now, so I copied the
code from the current version of that file (in Mac OS X 10.5.8) for future reference.
http://www.opensource.apple.com/source/xnu/xnu-1228.15.4/osfmk/kern/host.c
kern_return_t
host_info(
host_t host,
host_flavor_t flavor,
host_info_t info,
mach_msg_type_number_t *count)
{
if (host == HOST_NULL)
return (KERN_INVALID_ARGUMENT);
switch (flavor) {
case HOST_BASIC_INFO:
{
register host_basic_info_t basic_info;
register int master_num;
/*
* Basic information about this host.
*/
if (*count < HOST_BASIC_INFO_OLD_COUNT)
return (KERN_FAILURE);
basic_info = (host_basic_info_t) info;
basic_info->memory_size = machine_info.memory_size;
basic_info->max_cpus = machine_info.max_cpus;
basic_info->avail_cpus = processor_avail_count;
master_num = master_processor->cpu_num;
basic_info->cpu_type = slot_type(master_num);
basic_info->cpu_subtype = slot_subtype(master_num);
if (*count >= HOST_BASIC_INFO_COUNT) {
basic_info->cpu_threadtype = slot_threadtype(master_num);
basic_info->physical_cpu = machine_info.physical_cpu;
basic_info->physical_cpu_max = machine_info.physical_cpu_max;
basic_info->logical_cpu = machine_info.logical_cpu;
basic_info->logical_cpu_max = machine_info.logical_cpu_max;
basic_info->max_mem = machine_info.max_mem;
*count = HOST_BASIC_INFO_COUNT;
} else {
*count = HOST_BASIC_INFO_OLD_COUNT;
}
return (KERN_SUCCESS);
}
case HOST_SCHED_INFO:
...
default:
return (KERN_INVALID_ARGUMENT);
}
}
You need to log in
before you can comment on or make changes to this bug.
Description
•