On Wed, 25 Feb 2004, Bruce Cran wrote: > After investigating whether FreeBSD will be able to show the correct speed of a > 4.3GHz CPU (http://www.theinquirer.net/?article=14319), I found that the sysctl Please use lines shorter than 85 columns. > infrastructure doesn't currently have support for 64-bit integers, on a 32-bit > platform at least. The only type pertaining to 64-bits is CTLTYPE_QUAD, but since > there's no corresponding sysctl_handle_quad, it's still handled as a 32-bit integer. The kernel can handle quads (not quite right) using CTLTYPE_OPAQUE. Support for reading CTLTYPE_QUAD variables from the kernel in sysctl(8) was broken long ago (1995?). I use the following quick fix: %%% Index: sysctl.c =================================================================== RCS file: /home/ncvs/src/sbin/sysctl/sysctl.c,v retrieving revision 1.59 diff -u -1 -r1.59 sysctl.c --- sysctl.c 7 Nov 2003 16:41:47 -0000 1.59 +++ sysctl.c 26 Feb 2004 03:03:21 -0000 _at__at_ -599,2 +581,19 _at__at_ + case 'Q': + if (!nflag) + printf("%s%s", name, sep); + fmt++; + val = ""; + while (len >= sizeof(quad_t)) { + fputs(val, stdout); + if (*fmt == 'U') + printf(hflag ? "%'qu" : "%qu", *(u_quad_t *)p); + else + printf(hflag ? "%'qd" : "%qd", *(quad_t *)p); + val = " "; + len -= sizeof(quad_t); + p += sizeof(quad_t); + } + return (0); + case 'T': %%% > I found PR kern/39681 which explains that it's apparently not as simple as adding a > sysctl_handle_quad function. With CPUs getting faster and a few people adding more Arches that support both 32-bit and 64-bit applications with a 64-bit kernel make this even more interesting. See peter's recent hacks in kern_exec.c. Longs are 64-bit in the kernel but 32-bit in applications, so CTLTYPE_LONG doesn't do what 32-bit applications expect. Longs in data structures just cannot work right, but it should be possible to return small scalar values using only as many bytes as required without either the application or the kernel caring about the exact number of bytes. > then 4GB RAM to their machines, there's a need for some sort of solution IMO. > In addition, the only place a CTLTYPE_QUAD is used is in sys/i386/i386/tsc.c, and > then it mixed CTLTYPE_QUAD and sizeof(u_int). I've also noticed that the This mixture is invalid. It should cause the following error according to sysctl.3: [EINVAL] A non-null newp is given and its specified length in newlen is too large or too small. but the error handling for this was broken at much the same time that support for CTLTYPE_QUAD was broken. Quick fix (with debugging cruft): %%% Index: kern_sysctl.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_sysctl.c,v retrieving revision 1.149 diff -u -1 -r1.149 kern_sysctl.c --- kern_sysctl.c 22 Feb 2004 12:31:43 -0000 1.149 +++ kern_sysctl.c 23 Feb 2004 03:41:17 -0000 _at__at_ -942,4 +934,11 _at__at_ return 0; - if (req->newlen - req->newidx < l) + if (req->newlen - req->newidx != l) { + if (req->newlen - req->newidx > l) { + printf( + "sysctl_new_kernel -- short write: %zu - %zu > %zu\n", + req->newlen, req->newidx, l); + Debugger("sysctl_new_kernel: short write"); + } return (EINVAL); + } bcopy((char *)req->newptr + req->newidx, p, l); _at__at_ -1058,4 +1055,11 _at__at_ return 0; - if (req->newlen - req->newidx < l) + if (req->newlen - req->newidx != l) { + if (req->newlen - req->newidx > l) { + printf( + "sysctl_new_user -- short write: %zu - %zu > %zu\n", + req->newlen, req->newidx, l); + Debugger("sysctl_new_user: short write"); + } return (EINVAL); + } error = copyin((char *)req->newptr + req->newidx, p, l); %%% I've used this code since long ago (1995?) except the Debugger() call was only added 6 months ago to determine if there are any other sysctls as broken as machdep.tsc_freq. There don't seem to be any others. Short reads of sysctl scalar variables just aren't useful, since the whole variable will need to be read if any of it is, and the read shouldn't be in pieces since that just gives races. Short writes are just bugs, since the races are much larger and the kernel can't do bounds checking on variables written in pieces. With the above error handling, the machdep.tsc_freq sysctl just fails. Quick non-fix: %%% Index: tsc.c =================================================================== RCS file: /home/ncvs/src/sys/i386/i386/tsc.c,v retrieving revision 1.204 diff -u -1 -r1.204 tsc.c --- tsc.c 21 Oct 2003 18:28:34 -0000 1.204 +++ tsc.c 26 Feb 2004 03:23:52 -0000 _at__at_ -132,4 +148,4 _at__at_ { + u_int freq; int error; - uint64_t freq; _at__at_ -138,2 +154,4 _at__at_ freq = tsc_freq; + if (freq != tsc_freq) + return (EOVERFLOW); error = sysctl_handle_int(oidp, &freq, sizeof(freq), req); _at__at_ -146,5 +164,41 _at__at_ -SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_QUAD | CTLFLAG_RW, +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_UINT | CTLFLAG_RW, 0, sizeof(u_int), sysctl_machdep_tsc_freq, "IU", ""); +uint64_t tsc_freq1 = 0x0000000100000002; + +static int +sysctl_machdep_tsc_freq1(SYSCTL_HANDLER_ARGS) +{ + uint64_t freq; + int error; + + freq = tsc_freq1; + error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req); + if (error == 0 && req->newptr != NULL) + tsc_freq1 = freq; + return (error); +} + +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq1, CTLTYPE_QUAD | CTLFLAG_RW, + 0, sizeof(uint64_t), sysctl_machdep_tsc_freq1, "QU", ""); + +uint64_t tsc_freq2 = 0x0000000100000002; + +static int +sysctl_machdep_tsc_freq2(SYSCTL_HANDLER_ARGS) +{ + uint64_t freq; + int error; + + freq = tsc_freq2; + error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req); + if (error == 0 && req->newptr != NULL) + tsc_freq2 = freq; + return (error); +} + +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq2, CTLTYPE_QUAD | CTLFLAG_RW, + 0, sizeof(uint64_t), sysctl_machdep_tsc_freq2, "IU", ""); + static unsigned %%% The changes to the machdep.tsc_freq sysctl just fix its bogus parameters so that it returns 32 bits in a non-bogus way, and adds a check that the variable is actually representable in 32 bies (so the sysctl should fail with errno EOVERFLOW if the frequency is 4.3GHz). sysctl_machdep_tsc_freq1() shows how to implement returning 64-bit values using CTLTYPE_OPAQUE (machdep.tsc_freq1 sysctl). It should work for machdep.tsc_freq after removing all "1"'s in it. sysctl_machdep_tsc_freq2() shows another way of returning 64-bit values. IIRC, its only advantage is that it sort of works with an unpatched syasctl(8). > machdep.[i8254_freq, tsc_freq, acpi_timer_freq] are all writable, whereas I think they > should probably be read-only, as should kern.bootfile. One last thing I've noticed No, these should all be writable. See other replies for what can be done by changing the frequencies in userland. Writing to kern.bootfile is normal for all kernel installs -- kern.post.mk changes kern.bootfile so that it keeps pointing to the old kernel when the old kernel is renamed. > during my look into the kernel code is that a lot of the sysctls are defined as signed > integers - wouldn't it be better if these were converted to be unsigned, so that it Many are already unsigned. Perhaps not enough, but unsigned variables should be used as little as possible since using them breaks overflow handling and gives a morass of (un)sign extension problems. > would be impossible for the writable ones to be set to negative values and thus avoid > possible foot-shooting? I already have some diffs, so I could put together a PR if > some of these changes (such as making some sysctls read-only) are valid. Negative values are often no more preposterous than large ones. BruceReceived on Wed Feb 25 2004 - 18:44:27 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:44 UTC