Extend AlwaysUp and Service Protector by plugging in your own failure detection program
AlwaysUp and Service Protector are both designed to keep your applications functioning 24/7. While they can monitor and take action when a wide range of events occur (for example, your program uses too much RAM or CPU), not all failure detection capabilities are built in. For example, if your program doesn't operate correctly when another program has stopped, there is no way for AlwaysUp or Service Protector to find out about the special relationship and stop/restart your application.
Fortunately, both AlwaysUp and Service Protector can be extended to "plug in" your own arbitrary failure detection logic. All you have to do is supply an executable (or batch file) that can detect the failure and communicate that situation to AlwaysUp or Service Protector via a return/exit code. We call these small, targeted failure detection programs Sanity Check Plugins.

A Sanity Check Plugin can be an executable written in any language (C++, C#, VB, Delphi, etc.) or can be a DOS batch file. The only requirement is that it exits with a special code. Feel free to construct your own plugins, specific to your situations, or simply use one that we have already written (below).

Available Sanity Check Plugins

We have developed a few Sanity Check Plugins that can be used freely. Each will signal AlwaysUp or Service Protector to:

Stop/restart your application if a specified program is not running

Command Line Usage
    CheckProgramIsRunning.exe <program-name>

where <program-name> is the name of an executable, as seen in Task Manager.

Example

To check if Microsoft Word is running, use:

    CheckProgramIsRunning.exe winword.exe

Downloads

CheckProgramIsRunning.exe (490 KB)

Stop/restart your application if a particular file has not changed for a while

Command Line Usage
    CheckFileChanged.exe <file-name> <num-minutes> [-e] [-v]

where

<file-name> is the full path to the file to be checked.
  - Please enclose in quotes if the path contains spaces.
  - Note that this file name can contain special macro-like strings that will be dynamically replaced when the program is run. These are:

      $DAY$    == The current day (1-31)
      $DAY2$   == The current 2-digit day (01-31)
      $MONTH$  == The current month (1-12)
      $MONTH2$ == The current 2-digit month (01-12)
      $YEAR2$  == The current 2-digit year
      $YEAR4$  == The current 4-digit year

    For example, if today is August 28 2022, then:
      C:\Files\Myfile_$MONTH$_$DAY$_$YEAR4$.log
    will expand to
      C:\Files\Myfile_8_28_2022.log
    when the utility is run.

<num-minutes> is the number of minutes the file may go unchanged before a failure is signaled.

-e signals to return 1 (a failure) if the file doesn't exist (optional)

-v signals to produce verbose output (optional)

Example

To check if a log file located at C:\myserver\log.txt has not been modified for the past 10 minutes:

    CheckFileChanged.exe "C:\myserver\log.txt" 10

Downloads

CheckFileChanged.exe (251 KB)

Stop/restart your application if it is using too much virtual memory

Command Line Usage
    CheckVMSize.exe <process-id> <vm-size> [-v]

where

<process-id> is the numeric identifier of the process to be checked.

<vm-size> is the memory threshold, in MB.

-v signals to produce verbose output (optional)

Example

To check if the process with ID 4521 is using more than 100 MB of virtual memory:

    CheckVMSize.exe 4521 100

Downloads

CheckVMSize.exe (249 KB)

Stop/restart your 64-bit application if it is using too much memory

Command Line Usage
    CheckMemorySize64.exe <process-id> <size> [-vm] [-v]

where

<process-id> is the numeric identifier of the process to be checked.

<size> is the memory threshold, in MB.

-vm indicates to check the virtual private memory instead (optional)

-v signals to produce verbose output (optional)

Example

To check if the process with ID 4521 is using more than 3 GB (3072 MB) of memory:

    CheckMemorySize64.exe 4521 3072

Downloads

CheckMemorySize64.exe (911 KB)

Note: This 64-bit application will not run on 32-bit versions of Windows.

Stop/restart your application if a specific web site/URL is not responding

Command Line Usage
    check-web-server.bat

Configuration

Downloads

http-ping.exe (504 KB)

Stop/restart your application if a specific web site/URL is not responding or is returning a 5XX HTTP error code

Command Line Usage
    check-web-server-for-5XX-errors.bat

Configuration

Downloads

http-ping.exe (504 KB)

Stop/restart your application if a specific web site/URL returns an unexpected response

Command Line Usage
    check-web-server-response.bat

Configuration

Downloads

http-ping.exe (504 KB)

Stop/restart your application if it's not listening on a specific network port

Command Line Usage
    check-port-listening.bat

Configuration

Downloads

PortQry

Stop/restart your application if a specific string is found in a log file

Command Line Usage
    CheckLogFileForFatalError.exe <file-name> -e <error-string> [-ok <ok-string>] [-v]

where

<file-name> is the full path to the file to be checked.
  - Please enclose in quotes if the path contains spaces.
  - Note that this file name can contain special macro-like strings that will be dynamically replaced when the program is run. These are:

      $DAY   == The current 2-digit day (01-31)
      $MONTH == The current 2-digit month (01-12)
      $YEAR2 == The current 2-digit year
      $YEAR4 == The current 4-digit year

    For example, if today is May 26 2011, then:
      C:\Files\Myfile_$MONTH_$DAY_$YEAR4.log
    will be expanded to
      C:\Files\Myfile_05_26_2011.log
    when the utility is run.

<error-string> is the error message to look for in the file. There can be multiple error-strings, and at least one must be specified.

<ok-string> is a message that signals that the software is ok. If it appears after all the error strings, a restart will not be signaled. An ok-string is not required, but multiple can be specified.

-v signals to produce verbose output (optional)

Note: All strings are case sensitive.

Example

To check if a log file located at C:\myserver\log.txt contains the string "Server failed to process request" but only if it appears after "Server started", use:

    CheckLogFileForFatalError.exe "C:\myserver\log.txt" -e "Server failed to process request" -ok "Server started"

Downloads

CheckLogFileForFatalError.exe (290 KB)

Stop/restart your application if it's not using any CPU

Command Line Usage
    CheckForCPUActivity.exe <process-id> <duration> [-t <percent>] [-v]

where

<process-id> is the identifier of the process to be checked.

<duration> is the number of seconds to monitor the CPUs. The maximum value is 300.

<percent> is the threshold percent (optional). If not supplied, the threshold is 0%. The maximum value is 99.

-v signals to produce verbose output (optional)

Example

To check if the process with ID 4521 uses any CPU over 60 seconds, use:

    CheckForCPUActivity.exe 4521 60

Downloads

CheckForCPUActivity.exe (271 KB)

Stop/restart your application if it has been running for too long

Command Line Usage
    CheckIfLongRunning.exe <process-id> <num-minutes>

where

<process-id> is the identifier of the process to be checked.

<num-minutes> is the "threshold" number of minutes.

Example

To check if the process with ID 4521 has been running for longer than 3 hours, use:

    CheckIfLongRunning.exe 4521 180

Downloads

CheckIfLongRunning.exe (265 KB)

How to use a Sanity Check Plugin with AlwaysUp

To set up a Sanity Check Plugin with AlwaysUp:

How to use a Sanity Check Plugin with Service Protector

To set up a Sanity Check Plugin with Service Protector:

Special Command Line Parameters

AlwaysUp and Service Protector are able to pass your Sanity Check Plugins one or more "special" values. Compose your command line with the appropriate string and AlwaysUp/Service Protector will make the substitution before invoking your program.
For example, to have AlwaysUp pass your Sanity Check program the application's process identifier (PID), your command line might resemble this:

    C:\myserver\my_check.exe $ALWAYSUP_PID

If your main application is running with PID 563, then your plugin program will be invoked like this:

    C:\myserver\my_check.exe 563

Writing your own Sanity Check Plugins

A Sanity Check Plugin can be an application written in any language (C++, C#, VB, Delphi, etc.) or can be a DOS batch file. The only requirement is that it exits with a return code of:
In the last case, the application is not restarted but a message is written to the event log (and an email is sent if so configured). Note that if the Sanity Check plugin fails to complete in 120 seconds, the application will be restarted.
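
As an illustration only, here is a minimal sketch of what a custom plugin could look like, written in Python. It signals a failure when the drive holding a given path is low on free space. The script name check_free_space.py, its two arguments and the choice of check are all hypothetical. The exit code 1 for "failure" follows the description of the -e option of CheckFileChanged.exe above; using 0 for "all is well" is an assumption based on common convention, so confirm both against the list of return codes before relying on them.

```python
import shutil
import sys

# Exit codes. 1 as "failure" matches the -e option described for
# CheckFileChanged.exe; 0 as "all is well" is an ASSUMPTION based on
# common convention.
EXIT_OK = 0
EXIT_FAILURE = 1

BYTES_PER_MB = 1024 * 1024


def check_free_space(path, min_free_mb):
    """Return EXIT_FAILURE if the drive holding `path` has less than
    `min_free_mb` megabytes free, otherwise EXIT_OK."""
    free_mb = shutil.disk_usage(path).free / BYTES_PER_MB
    if free_mb < min_free_mb:
        print(f"FAIL: {free_mb:.0f} MB free on {path}, need {min_free_mb} MB")
        return EXIT_FAILURE
    print(f"OK: {free_mb:.0f} MB free on {path}")
    return EXIT_OK


def main(argv):
    """Parse <path> <min-free-mb> and return the exit code to report."""
    if len(argv) != 3:
        print("Usage: check_free_space.py <path> <min-free-mb>")
        return EXIT_FAILURE
    try:
        return check_free_space(argv[1], int(argv[2]))
    except (OSError, ValueError) as err:
        # The check itself could not run (bad path or non-numeric
        # threshold). This sketch simply reports that as a failure.
        print(f"ERROR: {err}")
        return EXIT_FAILURE
```

To turn the sketch into a runnable script, end the file with sys.exit(main(sys.argv)) so that the value returned by main becomes the process exit code. Because it is a script rather than an executable, it would presumably be launched through the Python interpreter or wrapped in a batch file. The check finishes almost instantly, which keeps it well inside the 120 second limit mentioned above.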