On Fri, Jan 17, 2020 at 01:12:32PM -0500, Ed Maste wrote:
> On Fri, 17 Jan 2020 at 12:19, Steve Kargl
> <sgk_at_troutmask.apl.washington.edu> wrote:
> >
> > Why?  Because adding -pg to the gfortran command line is sufficient
> > to get profiling information for long-running, numerically
> > intensive codes.  'gfortran -pg', of course, looks for libc_p.a
> > and libm_p.a.
>
> Have you tried sampling-based profiling (i.e., hwpmc)? I'm curious if
> it provides equal utility for you, or if there's some shortcoming.

Never needed to try hwpmc.

  % gfortran9 -o z -pg fortran_file.f90

just works if libc_p.a and libm_p.a are present.  There is a link-time
failure if the libraries are missing.

Here's an example of using -pg that found a bottleneck in my code
(which I haven't profiled recently).

  Each sample counts as 0.000123062 seconds.
    %   cumulative   self                 self     total
   time   seconds   seconds       calls  s/call   s/call  name
   46.80    275.68   275.68  1178817696    0.00     0.00  __lum_MOD_cludet_dble
   11.55    343.73    68.05    19458348    0.00     0.00  __sjnm_MOD_csjn_dble
    7.09    385.47    41.73    19458348    0.00     0.00  __sphere_MOD_sphere_shell_formfcn
    5.97    420.63    35.16    97291740    0.00     0.00  __sjnm_MOD_sjn_dble
    3.84    443.24    22.61 23712564606    0.00     0.00  cabs (w_cabs.c:17 @ 4968f0)

The cludet_dble() routine, which makes heavy use of cabs(), is the
bottleneck.  It so happens that cludet_dble() doesn't need cabs() at
all and can instead look at the squared magnitude.  Replacing cabs(z)
with creal(z)**2 + cimag(z)**2 gives

  Each sample counts as 0.000123062 seconds.
    %   cumulative   self                 self     total
   time   seconds   seconds       calls  s/call   s/call  name
   53.93    232.70   232.70  1178817696    0.00     0.00  __lum_MOD_cludet_dble
   15.84    301.02    68.32    19458348    0.00     0.00  __sjnm_MOD_csjn_dble
   10.63    346.91    45.88    19458348    0.00     0.00  __sphere_MOD_sphere_shell_formfcn
    7.84    380.71    33.81    97291740    0.00     0.00  __sjnm_MOD_sjn_dble

Nominally, a decrease of 43 CPU seconds in cludet_dble() (275.68 to
232.70).  Those 43 seconds accumulate quickly when the code is executed
a few thousand times for Monte Carlo simulations.

Is there a trivially stupid way of using hwpmc that requires no changes
to fortran_file.f90?

PS: For those snickering about the word Fortran: go read the Fortran
2018 standard and educate yourselves.  You want document 007 from
https://j3-fortran.org/doc/standing.

-- 
Steve

Received on Fri Jan 17 2020 - 18:29:29 UTC