On Mon, 21 Apr 2003, David Schultz wrote:

> On Mon, Apr 21, 2003, Bruce Evans wrote:
> > On Mon, 21 Apr 2003, David Schultz wrote:
> > > Index: ufsread.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/boot/common/ufsread.c,v
> > > retrieving revision 1.11
> > > diff -u -r1.11 ufsread.c
> > > --- ufsread.c 25 Feb 2003 00:10:20 -0000 1.11
> > > +++ ufsread.c 21 Apr 2003 10:10:01 -0000
> > > ...
> > > @@ -47,11 +59,11 @@
> > > ...
> > > -#define FS_TO_VBA(fs, fsb, off) (fsbtodb(fs, fsb) + \
> > > -    ((off) / VBLKSIZE) * DBPERVBLK)
> > > +#define FS_TO_VBA(fs, fsb, off) ma((off) / VBLKSIZE, DBPERVBLK, \
> > > +    fsbtodb((fs), (fsb)))
> >
> > The division by VBLKSIZE should probably be a shift.  ufsread.c has
> > VBLKSHIFT and uses it for all multiplications and divisions by VBLKSIZE
> > except this one.  gcc can't optimize to just a shift since all the
> > types are signed and C99 specifies that division of negative integers
> > by positive ones has the usual hardware brokenness.
>
> As I recall, signed division gets optimized into a sign test, some
                              ^ by a power of 2
> bit fiddling for negative numbers, and a division.  The additional shift
> cost is nominal if you only care about speed, but I'm sure using a
> shift directly would save a few more bytes.

I tried this, but it had no effect since FS_TO_VBA() is never actually
used.  So there is a much better optimization for it :-).  I think this
makes ma() unused too.

I thought that the savings for unsigned division were more for long longs
than for longs, but they are actually relatively smaller.  On i386's,
signed division (when optimized to shifts) of %edx:%eax by 2^12 takes 19
bytes and right shift takes 7 bytes; the corresponding numbers for %eax
are 12 bytes and 3 bytes.  Optimization for space should not use shifts
for the signed case.

Bruce

Received on Tue Apr 22 2003 - 02:43:44 UTC
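
The gap between signed division by 2^12 and a plain right shift, which is what the byte counts above are paying for, can be sketched in C roughly as follows. This is only an illustration and not code from ufsread.c: the names div_pow2() and shift_only() are hypothetical, and the sketch assumes that >> on a negative int32_t is an arithmetic shift, which is implementation-defined in C but is what gcc does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Truncating signed division by 2^k, written roughly the way a compiler
 * might lower it: add a bias of (2^k - 1) only when x is negative, then
 * shift.  Assumes >> on a negative value is an arithmetic shift
 * (implementation-defined in C; true for gcc).
 */
int32_t
div_pow2(int32_t x, unsigned k)
{
	assert(k > 0 && k < 31);
	int32_t mask = (int32_t)((UINT32_C(1) << k) - 1);
	int32_t bias = (x >> 31) & mask;	/* mask if x < 0, else 0 */

	return (x + bias) >> k;
}

/*
 * A bare shift.  Under the same assumption it rounds toward negative
 * infinity, so it disagrees with C99 division for negative x that is
 * not a multiple of 2^k.
 */
int32_t
shift_only(int32_t x, unsigned k)
{
	return x >> k;
}
```

With k = 12, div_pow2(-1, 12) matches -1 / 4096 (both 0), while shift_only(-1, 12) gives -1 under the arithmetic-shift assumption. The sign test and bias are the extra instructions that a bare shift avoids, and they are only needed when the operand can be negative.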