Ensuring systems and databases are up and available is an important function of the RockSolid application. RockSolid supports this requirement via its availability monitoring feature.
SQL Server Instance Level Availability
If you enable Availability Monitoring, RockSolid will attempt to connect to each instance every minute. If the connection is successful, RockSolid will also execute the rudimentary SQL Server command "SELECT @@SERVERNAME". If this also succeeds, the instance is judged to be up and available.
If the instance is not available, this is reported immediately to the RockSolid application, which raises "Instance Down" events. These events are processed according to your configuration; typically they are converted into Service Requests.
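The two-step probe described above can be sketched as follows. This is a minimal illustration and not RockSolid's own implementation: `check_instance` and its `connect` parameter are hypothetical names, and `connect` stands for any callable that returns a Python DB-API style connection to the instance.

```python
from typing import Any, Callable, Tuple


def check_instance(connect: Callable[[], Any]) -> Tuple[bool, str]:
    """Return (is_up, detail) for a single instance.

    `connect` is a hypothetical callable that returns a DB-API connection.
    """
    # Step 1: attempt to connect to the instance.
    try:
        conn = connect()
    except Exception as exc:
        return False, f"connection failed: {exc}"

    # Step 2: run a trivial query to confirm the instance responds.
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT @@SERVERNAME")
        row = cursor.fetchone()
    except Exception as exc:
        return False, f"query failed: {exc}"
    finally:
        # Always release the connection, whether or not the query worked.
        conn.close()

    if row is None:
        return False, "query returned no rows"
    return True, str(row[0])
```

A scheduler would call such a function once per instance per minute and treat any `False` result as a candidate "Instance Down" event.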
Instance Down Service Requests
Once an Instance Down service request is raised, RockSolid will attempt to automatically identify more details about the nature of the outage to aid in resolution. Remember that at this point the only symptom is that the RockSolid monitoring agent has been unable to communicate with the instance. This could be caused by:
- The OSE/Server being down or turned off
- SQL Server services stopped
- Incorrect login permissions being assigned to the RockSolidAgent service account
- An active directory authentication issue
- A network routing or name resolution issue
Any of these conditions would prevent the RockSolidAgent from connecting to the instance, so the agent would determine that the instance is "down" from its perspective.
In order to qualify the issue further, RockSolid will attempt to:
- Ping the host. If the host is not pingable, the issue may be a server that is down or a network issue.
- If the host is pingable, the issue may be a service, login permissions or authentication issue.
- Check the SQL Server service status. If the service status can be checked and the service is confirmed to be running, the issue may be a SQL Server login permissions issue.
The relevant details of all these checks are appended to the service request that is raised for manual resolution.
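The reasoning behind these checks can be expressed as a small decision function. This is only a sketch of the logic described above: `classify_outage` and its parameters are hypothetical names, and the "service stopped" branch is an inference from the list of possible causes rather than documented RockSolid behaviour.

```python
from typing import Optional


def classify_outage(host_pingable: bool, service_running: Optional[bool]) -> str:
    """Suggest a likely cause for an "Instance Down" event.

    `service_running` is None when the service status could not be checked.
    """
    if not host_pingable:
        # No ping response: the host itself or the network is suspect.
        return "server down or network issue"
    if service_running is True:
        # Host and service are up, so the connection itself is being refused.
        return "SQL Server login permissions issue"
    if service_running is False:
        # Assumption: a confirmed stopped service is reported as such.
        return "SQL Server service stopped"
    # Host is reachable but the service status is unknown.
    return "service, login permissions or authentication issue"
```

For example, `classify_outage(True, True)` points at login permissions, while `classify_outage(False, None)` points at the server or the network.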
SQL Server Database Level Availability
Database level health is also monitored by RockSolid. Various service requests will be raised if it is determined that databases are in an unexpected state, including:
- If a database is marked offline or suspect
- If a database corruption is detected
- If a database data file cannot come online
- If a database access status changes
- If a database cannot be connected to for the purposes of monitoring
Note that events are raised only when the status is "unexpected". Databases participating in DR processes such as log shipping, database mirroring and availability groups have states in which they are expected to be unavailable for access, depending on the role of the database in the replica. RockSolid is aware of these states and will not generate events for databases that are acting appropriately for their role in a replica. Database corruption in these databases will, however, still raise the relevant events in RockSolid.
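The idea of an "unexpected" state can be illustrated with a lookup of the states considered normal for each role. This is a hedged sketch only: the role names, the `EXPECTED_STATES` table and `should_raise_event` are hypothetical and do not reflect RockSolid's actual rules.

```python
# Hypothetical map of database role -> states treated as expected for that role.
EXPECTED_STATES = {
    "standalone": {"ONLINE"},
    "mirroring_mirror": {"RESTORING"},
    "log_shipping_secondary": {"RESTORING"},
}


def should_raise_event(role: str, state: str, corrupt: bool = False) -> bool:
    """Return True when a database warrants an availability event."""
    # Corruption is always reported, whatever the role of the database.
    if corrupt:
        return True
    # Unknown roles are assumed to require an ONLINE database.
    expected = EXPECTED_STATES.get(role, {"ONLINE"})
    return state not in expected
```

Under these assumptions a mirror in the RESTORING state raises nothing, while a standalone database in the same state does.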