CONCEPT: The Impact of Monitoring

The impact of monitoring a database environment is a contentious issue for all monitoring tools, including RockSolid's monitoring of SQL Server systems.  Organisations wanting to get the most out of their SQL Server environments need to weigh up the "impact" of monitoring vs the "risk" of not monitoring, i.e. the load generated by monitoring an environment vs the lack of information or clarity as to the state of that environment and/or the loss of early warning of issues.  In general, most organisations faced with this decision choose to monitor and have visibility into their systems.

RockSolid is designed to have monitoring always enabled for production systems.  It is designed to collect all relevant operational data necessary for the system to analyse and provide warning as to issues relating to:

  • Availability
  • Performance
  • Recovery
  • Security
  • General Maintenance

This requires the collection of numerous data sources from within the SQL Server and Windows environments, which carries a performance impact.  Quantifying the cost of this can be difficult because environments vary considerably.  However, as most monitoring occurs on internal "in-memory" structures, it is fair to assume that the majority of the impact from monitoring will be CPU related.

Monitoring as a % of Workload

Assessing the monitoring workload as a percentage of user workload is a dangerous measurement when determining the impact of monitoring.  This is because the relative impact of monitoring will vary based on how busy the user workload of a system is.  As monitoring is a constant load on a system, on quiet systems the monitoring workload will appear as a high % of the user workload, and may even exceed the user workload.  Conversely, on a very busy system the impact of monitoring compared with user workload may be extremely low or not measurable.
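
A small worked example makes the distortion concrete.  The figures below are hypothetical (a monitoring load fixed at 2% of CPU capacity and two invented user workloads); they are not RockSolid measurements, and the function name is illustrative only:

```python
def monitoring_share_of_user_workload(monitoring_cpu_pct, user_cpu_pct):
    """Monitoring load expressed as a percentage of the user workload."""
    return 100.0 * monitoring_cpu_pct / user_cpu_pct

# Hypothetical constant monitoring load, as a % of total CPU capacity.
MONITORING_CPU_PCT = 2.0

# Hypothetical user workloads: a quiet system (1% CPU) and a busy one (80% CPU).
quiet = monitoring_share_of_user_workload(MONITORING_CPU_PCT, 1.0)
busy = monitoring_share_of_user_workload(MONITORING_CPU_PCT, 80.0)

print(f"Quiet system: monitoring is {quiet:.1f}% of user workload")
print(f"Busy system:  monitoring is {busy:.1f}% of user workload")
```

In both cases monitoring consumes the same 2% of the machine; only the user workload it is being compared against has changed.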

Instead, a truer measure of the impact of monitoring is the monitoring workload as a percentage of overall system resources.  This lets administrators judge whether monitoring is having a significant impact on the system, irrespective of whether users are generating a high or low workload.  To assess this in RockSolid you can use the built-in measure which quantifies RockSolid monitoring impact as a % of overall CPU load.  To view this measure:

  • Go to a relevant instance
  • Click on the Analysis tab
  • From the Metric drop down choose "RockSolid Monitoring CPU Load (% Actual CPU)"

The resulting graph shows RockSolid monitoring as a % of system CPU capacity.

In general, the CPU impact of monitoring should be below 5% of CPU on average, with occasional spikes to ~20% considered acceptable.
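
If you have the values of this measure available as a list of numbers, the guideline can be checked with a few lines of code.  This is a minimal sketch only: the function name and sample values are hypothetical, how you obtain the values is not covered here, and the "~20%" figure is treated as a hard limit purely for simplicity:

```python
from statistics import mean

AVERAGE_LIMIT_PCT = 5.0   # guideline: average below 5% of CPU
SPIKE_LIMIT_PCT = 20.0    # guideline: occasional spikes to ~20% (hard limit here)

def monitoring_load_acceptable(samples):
    """Return True if the samples meet both the average and the spike guideline."""
    if not samples:
        raise ValueError("at least one sample is required")
    return mean(samples) < AVERAGE_LIMIT_PCT and max(samples) <= SPIKE_LIMIT_PCT

# Hypothetical readings of monitoring CPU load (% of CPU capacity).
readings = [2.0, 3.0, 1.5, 12.0, 2.5]
print(monitoring_load_acceptable(readings))
```

A result of False means either the average or the largest spike is outside the guideline, and the polling interval may be worth reviewing.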

Reducing Monitoring CPU Load

The easiest method of reducing monitoring CPU load is to increase the polling interval at which RockSolid collects data.  This of course reduces the responsiveness of the RockSolid application to changes or events in your environment.
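
The trade-off can be estimated with simple arithmetic.  The sketch below assumes, as a simplification, that each collection cycle costs a fixed amount of CPU time on a single core; the 0.5 second cost and the intervals are invented for illustration and do not describe RockSolid's actual behaviour:

```python
def average_monitoring_cpu_pct(cpu_seconds_per_poll, polling_interval_seconds):
    """Average CPU load (%) of one core, assuming a fixed CPU cost per poll."""
    if polling_interval_seconds <= 0:
        raise ValueError("polling interval must be positive")
    return cpu_seconds_per_poll * 100.0 / polling_interval_seconds

# Hypothetical cost: each collection cycle uses 0.5 CPU seconds.
every_10s = average_monitoring_cpu_pct(0.5, 10)
every_20s = average_monitoring_cpu_pct(0.5, 20)

print(f"Polling every 10 s: {every_10s:.1f}% average CPU")
print(f"Polling every 20 s: {every_20s:.1f}% average CPU")
```

Under this assumption, doubling the polling interval halves the average monitoring load, at the price of events being detected up to one interval later.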
