In many monitoring systems, if a user journey goes into or out of an error state during a period that is marked as downtime, the change is never logged and corrected. thinkTribe’s alerting overcomes this issue with some clever new logic that keeps users happy and alerted.
During the PME window (“planned maintenance exclusion”: a user-defined period of downtime used for upgrades, testing, roll-outs, maintenance, etc.), alerts for errors are suppressed. This avoids throwing out lots of noise at a time when IT staff already know the system is down or changing, will not be performing correctly, and would otherwise set off false-positive alarms.
The only way to avoid these alerts in some monitoring systems is to shut the monitoring system down during the maintenance period. However, at thinkTribe we have many clients with complex systems of user journeys, not all of which may be affected by a particular piece of work or downtime. Those user journeys still need to be monitored. An exclusion system that lets users configure which individual journeys to suppress alerts for means that the monitoring system does not have to be shut off.
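As a rough illustration of per-journey exclusion, the sketch below checks whether alerts for a given journey should be suppressed at a given moment. It is not thinkTribe's implementation: the `PmeWindow` class, the `is_suppressed` function, and the journey names are all hypothetical, and the window is assumed to include its start time and exclude its end time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class PmeWindow:
    """Hypothetical model of one planned maintenance exclusion."""
    start: datetime
    end: datetime
    journeys: FrozenSet[str]  # only these journeys have alerts suppressed


def is_suppressed(journey: str, at: datetime, windows: Iterable[PmeWindow]) -> bool:
    """Return True if alerts for `journey` should be suppressed at time `at`.

    Assumption: a window covers start <= at < end.
    """
    return any(
        w.start <= at < w.end and journey in w.journeys
        for w in windows
    )


# Example: maintenance affects only the (made-up) "checkout" journey.
windows = [
    PmeWindow(
        start=datetime(2024, 1, 1, 1, 0),
        end=datetime(2024, 1, 1, 3, 0),
        journeys=frozenset({"checkout"}),
    )
]
```

With this shape, a journey that is not listed in the window, such as a hypothetical "search" journey, keeps alerting as normal while the maintenance is in progress.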
However, as sites and the journeys within them become more complex, it is vitally important that clients are alerted to errors that occur during a time period that crosses the exclusion but are not related to it.
There are several things that may happen to a user journey as it enters and leaves a PME window, and we have worked closely with clients to improve the post-PME error checking and alerting system so that all possible variations are covered:
If a user journey goes into error during a PME and is still in error at the end of the PME, the alert sequence will start from the beginning as soon as the PME window closes. This alert includes extra information telling the recipient that the journey went into error during the PME.
If a user journey is already in error when the PME window begins and is still in error when the PME window closes, all alerts will be resent at the end of the PME.
If a user journey is in error when a PME begins but recovers during the course of the PME, the alert recipients will be notified that the user journey has recovered during the PME.
If a user journey goes into a PME in error, recovers during the PME, but goes into error again before the end of the PME and stays in error after the PME has closed, the alert sequence will revert to the initial “down” sequence but will contain some additional information about the recovery during the PME.
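The cases above can be sketched as a small decision function. This is an illustrative model only, not thinkTribe's code: the `PostPmeAction` names and the `post_pme_action` function are hypothetical, and the input is assumed to be the journey's error state sampled from the start of the PME to its close (True meaning "in error"). The outcome for a journey that is healthy at both ends of the window is not described in the text, so returning `NONE` there is an assumption.

```python
from enum import Enum
from typing import Sequence


class PostPmeAction(Enum):
    """Hypothetical labels for what happens when the PME window closes."""
    NONE = "no post-PME alert"  # assumption: not covered by the text
    RESTART_SEQUENCE = "restart alerts, note error began during PME"
    RESEND_ALL = "resend all alerts"
    NOTIFY_RECOVERED = "notify recovery during PME"
    RESTART_WITH_RECOVERY_NOTE = "restart 'down' sequence, note recovery during PME"


def post_pme_action(in_error: Sequence[bool]) -> PostPmeAction:
    """Choose the post-PME action from states sampled across the window.

    in_error[0] is the state when the PME begins,
    in_error[-1] is the state when the PME closes.
    """
    if not in_error:
        raise ValueError("at least one sampled state is required")

    started_in_error = in_error[0]
    ended_in_error = in_error[-1]
    # A journey that started in error "recovered" if any sample is healthy.
    recovered_in_window = started_in_error and (False in in_error)

    if not started_in_error:
        # Went into error during the PME and is still down at the end.
        return PostPmeAction.RESTART_SEQUENCE if ended_in_error else PostPmeAction.NONE
    if not ended_in_error:
        # Started in error, healthy at close.
        return PostPmeAction.NOTIFY_RECOVERED
    if recovered_in_window:
        # Started in error, recovered, then failed again before close.
        return PostPmeAction.RESTART_WITH_RECOVERY_NOTE
    # In error throughout the window.
    return PostPmeAction.RESEND_ALL
```

For example, `post_pme_action([True, False, True])` models a journey that entered the PME in error, recovered, and failed again before the window closed, which selects the "down" sequence with the extra recovery information.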
This way users can always be certain that all the data is logged. This kind of intelligence is also of crucial importance when looking back over time for trends and patterns in performance data.
All products in the Monitoring Suite have been designed with different user needs in mind, but all are delivered through the intuitive Customer Portal and come with the one-on-one managed service support that our clients value so highly.
To help support all teams and provide a “single point of truth”, all products in the SV Monitor Suite are designed to ensure that everyone can understand and be proficient in using the wide-ranging metrics to deliver ongoing improvements.