Databases - Unavailable alarm | Spotlight Enterprise Documentation

The Databases - Unavailable alarm becomes active when Spotlight Enterprise detects that an Amazon RDS for SQL Server database is not available for reading. Users attempting to access an unavailable database will receive an error message.

This alarm detects unusual database statuses, including Suspect, Offline, Recovering, Loading, Restoring, Emergency Mode, and others.

When the alarm is raised

Determine which databases are unavailable. Check the Amazon RDS for SQL Server | Databases drilldown | Databases grid. The Status column shows which databases are unavailable.
Take the action specified below for each unavailable database.

Note: You can configure this alarm to ignore certain values. You can configure different alarm severities for specific databases.

Common database unavailable status

Some of the more common unavailable statuses are listed below:

Offline

Databases can only be set offline manually, using the sp_dboption procedure. If you have Offline databases, you should check who has set the database offline, and consider using sp_dboption to bring the database online again.

Loading or Restoring

Databases that are marked as Loading or Restoring are currently being restored by a RESTORE DATABASE or RESTORE LOG command. The database cannot be accessed by anyone while this is the case.

This status is also assigned to databases that have been restored using the NORECOVERY option. Specifying this parameter on a RESTORE statement tells SQL Server that there are more transaction logs to be restored, and that no access to the database is permitted until they are applied.

You should check the Amazon RDS for SQL Server | SQL Activity drilldown | Sessions tab for active sessions that are processing a RESTORE command (where the Last Command column contains Restore). If there are no sessions processing a RESTORE command, then the database is most likely unavailable because the last restore was done with the NORECOVERY keyword.

To remove Loading/Restoring status, you should complete the RESTORE process. This can involve either waiting for the active RESTORE command to complete, or restoring the remaining transaction logs. The last transaction log should be restored without the NORECOVERY keyword.

Recovering

Databases will be Recovering (or InRecovery) for a while when SQL Server is restarted, or the database is first set online. This is the status that SQL Server uses to indicate that it is re-applying committed transactions, or removing uncommitted transactions after a SQL Server failure.

Normally, SQL Server only takes a short time to re-apply these transactions, however, if there were long-running transactions open when SQL Server ended abnormally, then it can take an extended period. The general rule is that the automatic recovery will take between 1.5 and 2 times the age of the oldest transaction (although the exact time depends on your Recovery Interval setting.) For example, if a large UPDATE statement had been running for 2 hours when SQL Server stopped unexpectedly, you can expect the automatic recovery to take between 3 and 4 hours.

Usually, the best course of action is to let SQL Server proceed with its automatic recovery. If SQL Server is stopped during its automatic recovery, the next time it starts it not only has to re-process the transactions that were in progress when the original failure occurred, but it also has to reprocess the failed automatic recovery.

Suspect

Databases can be Suspect if they fail SQL Server’s automatic recovery. This is most common after a SQL Server restart when something goes wrong with the automatic recovery process. In some rare situations, a database can become Suspect without a SQL Server restart (if serious database corruption is detected by SQL Server). This can also happen when SQL Server runs out of disk space for the database files while performing recovery.

The first thing to do when you have a Suspect database is to check the SQL Server error log, and look for error messages indicating recovery failure or database corruption. This should give you an indication of what caused the problem.

Actions you can consider to correct a suspect database include:

Check the SQL Server error log to determine why the database was made suspect.
Make sure all database files are available. If any database file is unavailable when SQL Server attempts to open the database, the database is made suspect. This can happen if you have deleted or renamed a database file while SQL Server was down. It can also happen if another Windows process (such as Backup or Virus Scanning software) is using a database file when SQL Server tries to open it. If this is the case, once the database file is available again, use the sp_resetstatus stored procedure (as documented in Books Online) to reset the database status, and then restart SQL Server to initiate recovery.
If the Suspect status was caused by a full disk during recovery, you should free up disk space and use the sp_resetstatus stored procedure (as documented in Books Online) to reset the database status. SQL Server should then be restarted to initiate recovery.
If the Suspect status was caused by a full disk during recovery, and you cannot free up space on existing database disks, you should add a new data or log file on a different disk that has free space available.
Restore the database from your last full database backup, and then restore all transaction log backups taken since that point.

In most cases, the correct action to take for a suspect database is to restore from your last good full database backup, and transaction logs.

If restoring is not an option for you and you cannot fix the problem using the tips above, there are a few more options you can attempt. These will not work in all cases, but sometimes they may give you enough to extract data from the corrupt database that cannot be retrieved any other way.

Note: These procedures should only be considered as a last resort. They are not supported by Microsoft or Quest, and could irretrievably corrupt your databases and data.

[]:

Tags: