clocks_mod
a code timing module for f90
clocks_mod is a set of simple calls for timing f90 code and code sections.
Introduction to clocks_mod.
Using clocks_mod.
Acquiring clocks_mod source.
Linking with clocks_mod.
Portability issues.
The clocks_mod API:
In parallel environments, the key timing information is the
wallclock ("real") time, not the CPU time, of a run. F90 provides the
system_clock(3F) intrinsic to retrieve timing
information from the system realtime clock.
clocks_mod uses the F90
system_clock(3F) intrinsic to measure time. The main
call is to the function tick(). Clocks can be set up
to provide direct timing of a section of code, or for cumulative
timing of code sections within loops.
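To make the underlying mechanism concrete, here is a minimal sketch of interval timing done directly with the standard system_clock intrinsic. It is not the clocks_mod source, only an illustration of the kind of measurement the module wraps; the program name, the variable names and the dummy workload are all invented for the example.

```fortran
program wallclock_sketch
  ! Illustrative only: times one code section with the F90 intrinsic.
  implicit none
  integer :: start, finish, rate, cmax, elapsed, k
  real    :: x

  ! rate = clock ticks per second, cmax = largest value the counter reaches
  call system_clock( count_rate=rate, count_max=cmax )
  if( rate <= 0 )stop 'no system clock available'

  call system_clock(start)
  ! dummy workload standing in for a real code section
  x = 0.0
  do k = 1,1000000
     x = x + sqrt(real(k))
  end do
  call system_clock(finish)

  elapsed = finish - start
  ! the counter resets to zero after cmax, so correct for one wraparound
  if( elapsed < 0 )elapsed = (elapsed + cmax) + 1

  print *, 'workload result:', x
  print *, 'elapsed ticks  :', elapsed
  print *, 'elapsed seconds:', real(elapsed)/real(rate)
end program wallclock_sketch
```

The result is wallclock time, so it includes any time the process spent waiting, which is the quantity of interest in a parallel run.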
The overhead of calls to the system clock is typically measured in
microseconds. However, the clock's resolution (the smallest interval it
can distinguish) may be finer or coarser than this overhead. The
resolution is printed when the module is initialized. A test program is
supplied with the module which, among other things, measures the
calling overhead.
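The relation between resolution and calling overhead can be estimated with the intrinsic alone. The sketch below is not the test program shipped with the module; it is a rough, assumption-laden estimate in which the nominal resolution is taken to be one tick (1/count_rate) and the overhead is averaged over a hypothetical number of calls, ncalls.

```fortran
program clock_overhead_sketch
  ! Illustrative only: estimates clock resolution and per-call overhead.
  implicit none
  integer, parameter :: ncalls = 100000   ! arbitrary sample size
  integer :: rate, t0, t1, tmp, k
  real    :: resolution, overhead

  call system_clock( count_rate=rate )
  if( rate <= 0 )stop 'no system clock available'

  ! nominal resolution: the length of one clock tick, in seconds
  resolution = 1.0/real(rate)

  ! average cost of one call, ignoring counter wraparound for brevity
  call system_clock(t0)
  do k = 1,ncalls
     call system_clock(tmp)
  end do
  call system_clock(t1)
  overhead = real(t1 - t0)/real(rate)/real(ncalls)

  print *, 'nominal resolution (s):', resolution
  print *, 'overhead per call  (s):', overhead
end program clock_overhead_sketch
```

If the overhead comes out smaller than the resolution, intervals of only a few ticks should be treated with caution.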
In the simplest method of calling, just designate sections of a main
program with calls to tick(). tick()
by default measures time since the last call to
tick() (or to clocks_init()).
Used as in the first example program under "Using clocks_mod", this
will return timing information for the three regions of the main
program.
If, however, a subroutine in one of the code sections itself
contained calls to tick(), this would produce
erroneous information (since "the last call to tick()"
might refer to a call elsewhere). In this case, we set the reference
tick using the since argument to
tick():
The call to clocks_exit at the end of the cumulative-timing example
prints the timings for the code sections.
The public interfaces to clocks_mod are described here in
alphabetical order:
This prints the values of all cumulative
clocks. flag may be used, for example, in a parallel
run, to have only one of the PEs print times.
Called to initialize the clocks_mod package. Some
information is printed regarding the version of this module and the
resolution of the system clock. flag may be used, for
example, in a parallel run, to have only one of the PEs print this
information.
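As a sketch of how flag might be used, the fragment below passes a processor number to both clocks_init and clocks_exit so that only PE 0 prints. The variable pe is hypothetical: it is hard-wired to 0 here, whereas a real parallel program would obtain it from its message-passing library. The sketch assumes clocks.F90 has been compiled and linked.

```fortran
program flag_sketch
  ! Illustrative only: requires clocks_mod to be compiled and linked.
  use clocks_mod
  implicit none
  integer :: pe, i

  pe = 0   ! hypothetical PE number; set by the parallel library in practice

  call clocks_init( flag=pe )    ! version and resolution printed only if pe=0
  ! timed work goes here (placeholder)
  i = tick( 'work' )
  call clocks_exit( flag=pe )    ! cumulative clocks printed only if pe=0
end program flag_sketch
```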
This is used to return information stored on the clock whose ID is
id. The subroutine returns any or all of the
information held in the following: ticks, for the
total accumulated clock ticks for this ID; calls,
the number of intervals measured with this clock (i.e., the number of
times tick() was called with this ID);
total_time, the total time in seconds on this clock;
and time_per_call, the time per measured interval on
this clock (total_time/calls).
This routine is used if you wish to retrieve this information in a
variable. Otherwise, clocks_exit() may be used to
print this information at termination.
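A possible use of get_clock, following the call syntax given in this document, is sketched below. The optional arguments are passed by keyword so that only the wanted quantities are retrieved. The local variable names (ncalls, total, per_call) are invented for the example, and the sketch assumes clocks.F90 has been compiled and linked.

```fortran
program get_clock_sketch
  ! Illustrative only: requires clocks_mod to be compiled and linked.
  use clocks_mod
  implicit none
  integer :: i, j, id1, ncalls
  real    :: total, per_call

  call clocks_init()
  id1 = clock_id( 'Code section 1' )

  do j = 1,100
     i = tick()
     ! code section 1 goes here (placeholder)
     i = tick( id=id1, since=i )
  end do

  ! retrieve the accumulated values into variables instead of printing them
  call get_clock( id1, calls=ncalls, total_time=total, time_per_call=per_call )
  print *, 'intervals measured:', ncalls
  print *, 'total time (s)    :', total
  print *, 'time per call (s) :', per_call
end program get_clock_sketch
```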
This is used to return an ID for a clock that may be used for timing
a code section. Currently up to 256 (an arbitrarily chosen setting for
the internal parameter max_clocks) clocks can be set.
The cumulative times can be printed at the end of the run by a call to
clocks_exit(). The name can
be a new or existing clock.
Note that name is restricted to 24
characters. If you enter a longer name, it is silently and
gracefully truncated. This can be problematic if you inadvertently
give different clocks names that differ only beyond the 24th
character. The clocks module will treat these as the same clock.
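The pitfall can be reproduced without the module. The sketch below mimics the truncation by assigning two long names to 24-character variables, which is standard Fortran behaviour for character assignment. How clocks_mod stores names internally is an assumption here, and the two names are made up for the illustration.

```fortran
program name_truncation_sketch
  ! Illustrative only: shows why names must differ within 24 characters.
  implicit none
  integer, parameter :: max_len = 24   ! name length limit stated above
  character(len=max_len) :: stored_a, stored_b

  ! the two names first differ at character 27, beyond the limit
  stored_a = 'ocean_model_timestep_loop_advection'
  stored_b = 'ocean_model_timestep_loop_diffusion'

  print *, 'stored a: ', stored_a
  print *, 'stored b: ', stored_b
  if( stored_a == stored_b )then
     print *, 'names collide after truncation'
  else
     print *, 'names remain distinct'
  end if
end program name_truncation_sketch
```

Putting the distinguishing part of a name at the front keeps the clocks separate.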
GFDL users can copy the file
/net/vb/public/utils/clocks.F90. External users can
download the source here. The
current public version number is 2.2.
Any module or program unit using clocks_mod must
contain the line
The source file for clocks_mod is clocks.F90.
Compiling with the cpp flag test_clocks turned on:
will produce a program that will exercise certain portions of the
clocks_mod module.
Introduction
Using clocks_mod
program main
  use clocks_mod
  implicit none
  integer :: i
  call clocks_init()
!code section 1
! ...
  i = tick( 'code section 1' )
!code section 2
! ...
  i = tick( 'code section 2' )
!code section 3
! ...
  i = tick( 'code section 3' )
end program main
program main
  use clocks_mod
  implicit none
  integer :: i
  call clocks_init()
!set the reference tick
  i = tick()
!code section 1
! ...
  i = tick( 'code section 1', since=i )
!code section 2
! ...
  i = tick( 'code section 2', since=i )
!code section 3
! ...
  i = tick( 'code section 3', since=i )
end program main
A third way to use clocks_mod is to produce cumulative
times of code sections within loops. Here we first call
clock_id to set up a clock with an
id, and accumulate times to this ID.
program main
  use clocks_mod
  implicit none
  integer :: i, j, id1, id2, id3
  call clocks_init()
!set up one cumulative clock per code section
  id1 = clock_id( 'Code section 1' )
  id2 = clock_id( 'Code section 2' )
  id3 = clock_id( 'Code section 3' )
  do j = 1,10000
     i = tick()
!code section 1
!    ...
     i = tick( id=id1, since=i )
!code section 2
!    ...
     i = tick( id=id2, since=i )
!code section 3
!    ...
     i = tick( id=id3, since=i )
  end do
!print the accumulated times
  call clocks_exit()
end program main
clocks_mod call syntax
clocks_exit
subroutine clocks_exit(flag)
!print all cumulative clocks
!if flag is set, only print if flag=0
!for instance, flag could be set to pe number by the calling program
!to have only PE 0 print clocks
integer, intent(in), optional :: flag
clocks_init
subroutine clocks_init(flag)
!initialize clocks module
!if flag is set, only print if flag=0
!for instance, flag could be set to pe number by the calling program
!to have only PE 0 in a parallel run print info
integer, intent(in), optional :: flag
get_clock
subroutine get_clock( id, ticks, calls, total_time, time_per_call )
integer, intent(in) :: id
integer, intent(out), optional :: ticks, calls
real, intent(out), optional :: total_time, time_per_call
clock_id
function clock_id(name)
!return an ID for a new clock
integer :: clock_id
character(len=*), intent(in) :: name
Acquiring clocks_mod source
Compiling and linking to clocks_mod
use clocks_mod
f90 -Dtest_clocks clocks.F90
Portability issues
clocks_mod is fully f90 standard-compliant. There are
no portability issues.
Changes
The RCS log for
clocks.F90 contains a comprehensive list of changes. In the
unlikely event that you should wish to check out a retro version,
please get in touch with me, Balaji.