
Tooling for Timing

My father was a master gardener. I guess that's not a big surprise for a man who grew up on a farm, studied soil science in college, and went on to become the Chief Trainer for the US Soil Conservation Service. While the science of agriculture is still a mystery to me, I can see how he achieved the success he did growing things in the soil under greatly varying environmental conditions. It was all about measurement. He used to put empty tuna fish cans on our lawn to measure the amount of water the sprinkler system delivered. He had a mechanical weather station in the back yard that measured rainfall and temperature and sent the data to the National Weather Service. He used a little machine to drop seeds in the ground at a measurable rate, and then measured crop yields, all the while tracking the money he spent and counting the profits he brought in from his two acres of corn and raspberries.

Measurements allow us to set a baseline, make changes, and then determine the efficacy of those changes. We learn from each cycle and find out what works and what doesn't, but only if we measure it. Thankfully, I don't have to wait for the change of seasons to measure the changes we've been making to Moab.

While improving the performance of Moab, I had to know how many times specific functions were being called and how much time they were taking. I started by using tools such as Valgrind and KCachegrind to look for bottlenecks. It was a slow and tedious process that yielded some great information, but I had to wait hours for a test run to work through a complicated scheduling cycle. Now I can get the same information in real time with the help of a microsecond timer class. The class is instantiated at the beginning of a function; the constructor records the current time and emits a ZeroMQ message indicating that the function has started. When the function exits, the class goes out of scope and the destructor is called, which gets the time again, calculates the elapsed time, and emits another ZeroMQ message indicating the function exit and its duration.

There are two ways to measure time: wall clock time and CPU time. Wall clock time is the total elapsed time between the function start and exit. If the function performs a calculation and then sleeps for 5 seconds, the measured time will be 5 seconds plus however long the calculation took. CPU time only counts the time the process actually spent executing. The same calculation with a 5 second sleep may yield only the time of the calculation, since during the sleep the CPU can switch to another process and resume execution of your process after the specified duration. I prefer wall clock time in this application. If you're using a similar timing mechanism to measure the efficiency of a calculation, you might want to use CPU time instead. Since I am reporting to others how long scheduling activities take, I wanted a number that is repeatable in a customer environment and congruent with the times that Moab reports in the mdiag command. And since Moab can also be sending commands to a resource manager that may be on the same machine, I want timing that takes "environmental" factors into account.
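To make the difference concrete, here is a minimal sketch of my own (assuming Linux and the POSIX clock_gettime interface) that measures a 5 second sleep with both clocks:

#include <time.h>
#include <stdio.h>
#include <unistd.h>

int main()
{
  timespec WallStart, WallEnd, CpuStart, CpuEnd;

  clock_gettime(CLOCK_MONOTONIC, &WallStart);          // wall clock
  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &CpuStart);  // CPU time for this process

  sleep(5);  // the process is off the CPU while it sleeps

  clock_gettime(CLOCK_MONOTONIC, &WallEnd);
  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &CpuEnd);

  // Wall clock reports roughly 5 seconds; CPU time reports nearly zero
  printf("wall: %lld ns, cpu: %lld ns\n",
         (long long)(WallEnd.tv_sec - WallStart.tv_sec) * 1000000000LL
           + (WallEnd.tv_nsec - WallStart.tv_nsec),
         (long long)(CpuEnd.tv_sec - CpuStart.tv_sec) * 1000000000LL
           + (CpuEnd.tv_nsec - CpuStart.tv_nsec));
  return 0;
}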

Here’s some code that illustrates getting the start and end time and calculating the difference in nanoseconds:

// Structures to hold start and end times (requires <time.h>)
timespec StartTime;
timespec EndTime;

// Get the starting time
clock_gettime(CLOCK_MONOTONIC, &StartTime);

// Do something useful
...

// Get the end time
clock_gettime(CLOCK_MONOTONIC, &EndTime);

// Calculate the number of nanoseconds elapsed; using integer math
// (1000000000LL rather than the floating-point 1e9) avoids losing
// precision in the conversion
long long SecondsElapsed = EndTime.tv_sec - StartTime.tv_sec;
long long TotalElapsedTime = (SecondsElapsed * 1000000000LL)
                           + (EndTime.tv_nsec - StartTime.tv_nsec);

I use the CLOCK_MONOTONIC clock type so that I'm not affected by changes in system time or time of day; all I care about is a number that is always increasing (well, realistically it must wrap eventually, but only once every 136 years, and I think I can live with that). Now let's put this in a class that can be instantiated on the stack of a function to get the timing:

#include <time.h>   // clock_gettime, timespec
#include <stdio.h>  // snprintf

class MicrosecondTimer
{
public:
  MicrosecondTimer(const char *file, const char *function);
  ~MicrosecondTimer();

private:
  timespec m_StartTime;
  const char *m_File;
  const char *m_Function;
};

MicrosecondTimer::MicrosecondTimer(const char *file, const char *function):
  m_File( file ),
  m_Function( function )
{
  char Message[256];
  clock_gettime(CLOCK_MONOTONIC, &m_StartTime);
  snprintf(Message, sizeof(Message), "%s,%s,started", m_File, m_Function);
  SendZMQMessage(Message);
}

MicrosecondTimer::~MicrosecondTimer()
{
  char Message[256];
  timespec EndTime;
  clock_gettime(CLOCK_MONOTONIC, &EndTime);

  long long SecondsElapsed = EndTime.tv_sec - m_StartTime.tv_sec;
  long long TotalElapsedTime = (SecondsElapsed * 1000000000LL)
                             + (EndTime.tv_nsec - m_StartTime.tv_nsec);
  snprintf(Message, sizeof(Message), "%s,%s,time:%lld",
           m_File,
           m_Function,
           TotalElapsedTime);

  SendZMQMessage(Message);
}
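SendZMQMessage can be as simple as the following sketch using the ZeroMQ C API; the PUB socket type and the endpoint here are assumptions of mine, not the actual Moab implementation:

#include <string.h>
#include <zmq.h>

void SendZMQMessage(const char *message)
{
  // Create the context and socket once and reuse them; creating a new
  // socket per message would dominate the very timings we're collecting.
  // Note: ZeroMQ sockets are not thread-safe, so a multi-threaded build
  // would need per-thread sockets or a lock.
  static void *Context = zmq_ctx_new();
  static void *Socket = NULL;

  if (Socket == NULL)
  {
    Socket = zmq_socket(Context, ZMQ_PUB);
    zmq_connect(Socket, "tcp://localhost:5555");  // hypothetical collector endpoint
  }

  zmq_send(Socket, message, strlen(message), 0);
}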

We can now use our MicrosecondTimer class to time a function:

void MyFunction(char *who)
{
  // Instantiate the timer on the stack
  MicrosecondTimer timer(__FILE__, __func__);

  // Function body
  while( CalculateMeaningOfUniverse(who) < 47 )
    GetAncestor(who);

  // When the function returns, the timer goes out of scope; its
  // destructor calculates the elapsed time and sends the message
}
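A convenient refinement (my suggestion, not something from the original implementation) is to hide the instantiation behind a macro so the instrumentation can be compiled out of production builds:

// Hypothetical convenience macro; build with -DENABLE_FUNCTION_TIMING
// to activate the timers, or without it to compile them away
#ifdef ENABLE_FUNCTION_TIMING
  #define TIME_THIS_FUNCTION() MicrosecondTimer functionTimer(__FILE__, __func__)
#else
  #define TIME_THIS_FUNCTION() ((void)0)
#endif

With this in place, MyFunction would simply call TIME_THIS_FUNCTION() as its first statement.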

Now that we have timing information, we can write some scripts that gather real-time function timing, call counts, and call stack information, such as this:

[Figure: Profiling2, a screenshot of real-time function timing, call counts, and call stacks]
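On the receiving side, a collector could be as simple as this sketch, using the same hypothetical PUB/SUB arrangement as the SendZMQMessage sketch above:

#include <stdio.h>
#include <zmq.h>

int main()
{
  void *Context = zmq_ctx_new();
  void *Socket = zmq_socket(Context, ZMQ_SUB);

  // Bind where the timers connect, and subscribe to every message
  zmq_bind(Socket, "tcp://*:5555");
  zmq_setsockopt(Socket, ZMQ_SUBSCRIBE, "", 0);

  char Buffer[256];
  while (true)
  {
    int Length = zmq_recv(Socket, Buffer, sizeof(Buffer) - 1, 0);
    if (Length < 0)
      break;
    if (Length > (int)sizeof(Buffer) - 1)
      Length = sizeof(Buffer) - 1;  // zmq_recv reports the full size even if truncated
    Buffer[Length] = '\0';
    printf("%s\n", Buffer);  // e.g. "sched.c,MyFunction,time:1234"
  }

  zmq_close(Socket);
  zmq_ctx_destroy(Context);
  return 0;
}

A real collector would aggregate these messages into per-function counts and totals; this one just prints each message as it arrives.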

Being able to measure Moab's performance has led us to a number of improvements, including finding sections that could be multi-threaded, adding caching of frequently calculated data, and identifying bottlenecks that could be short-circuited. Because we've been measuring and watching the numbers carefully, we're able to provide our customers with a much better product. As time goes on we'll keep looking for ways to streamline the scheduling process and make Moab more responsive and able to handle much larger loads. Thanks to our timing instrumentation, we can make sure that changes to our code base are effecting real, positive change.
