There are three related changes:
* Removed legacy timezones from hc.lib.tz.all_timezones
* Added data migration to update existing Check.tz values
* For backwards compatibility, API calls that pass a legacy
timezone now have it automatically replaced with the
corresponding canonical timezone
I used the timezone mapping on
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
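
The backwards-compatibility replacement can be sketched roughly as
below. This is a minimal illustration, not the actual code: the
LEGACY_TO_CANONICAL dict and the canonical_tz helper are hypothetical
names, and the dict holds only a small excerpt of the mapping.

```python
# Hypothetical excerpt of a legacy-to-canonical mapping. The full table
# (see the Wikipedia list above) is much longer.
LEGACY_TO_CANONICAL = {
    "Asia/Calcutta": "Asia/Kolkata",
    "Asia/Saigon": "Asia/Ho_Chi_Minh",
    "US/Pacific": "America/Los_Angeles",
}


def canonical_tz(name: str) -> str:
    """Return the canonical name for a legacy timezone.

    Names that are not in the mapping are returned unchanged.
    """
    return LEGACY_TO_CANONICAL.get(name, name)
```

An API handler could pass the incoming "tz" value through such a
helper before validating and storing it.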
* Add Check.last_start_rid field
* Fill Check.last_start_rid on every start event
* Clear Check.last_start on every "fail" event
* Clear Check.last_start on a success event if either of the
following is true:
  - the event's rid matches Check.last_start_rid
  - the event does not specify a rid
In human terms, the alerting logic will be: we track the
execution time of the most recent "start" event only. It would
take a major redesign to track the execution time of all
concurrent "start" events and send alerts when *any* of them
overshoots the time budget. So, whenever we see a "start" event,
the timer resets.
Example:
* 00:00 client sends start signal with rid=A, timer starts
* 00:10 client sends start signal with rid=B, timer resets
* 00:20 client sends success signal with rid=A, timer
keeps running because rid A does not match the rid seen in
the most recent start signal (it was B)
* 00:30 the grace time runs out, the check's status shows
as started + failed
At this point the check can be reset to a healthy state in 3
different ways:
* send a success signal with rid=B
* send a failure signal with any rid value or without it
* send a success signal without a rid value
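
The rules and the example above can be replayed with a small
stand-alone sketch. CheckState is a hypothetical stand-in for the two
Check fields, not the real Django model, and times are plain minute
counts rather than datetimes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckState:
    # Hypothetical stand-in for Check.last_start and Check.last_start_rid.
    last_start: Optional[int] = None
    last_start_rid: Optional[str] = None

    def on_start(self, now: int, rid: Optional[str] = None) -> None:
        # Every start event restarts the timer and records its rid.
        self.last_start = now
        self.last_start_rid = rid

    def on_fail(self, rid: Optional[str] = None) -> None:
        # A fail event always clears the timer, whatever the rid.
        self.last_start = None

    def on_success(self, rid: Optional[str] = None) -> None:
        # A success event clears the timer only if it carries no rid
        # or its rid matches the most recent start event.
        if rid is None or rid == self.last_start_rid:
            self.last_start = None


check = CheckState()
check.on_start(0, rid="A")    # 00:00 timer starts
check.on_start(10, rid="B")   # 00:10 timer restarts
check.on_success(rid="A")     # 00:20 rid mismatch, timer keeps running
still_running = check.last_start is not None
check.on_success(rid="B")     # matching rid clears the timer
```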
* Added duration to ping details. This is useful on devices with small screens, where the main view does not show the duration; it is now visible in the ping's details.
* Changed terms across the board from "delta" to "duration"
* timedelta is now consistently imported as "td" across the entire project (even in Django-generated migration files)
I found a bug in the downtime statistics calculation. The
scenario:
* at T=0 a check goes down
* at T=5 the user pauses it
* at T=10 the check receives a ping and goes up
If we don't record a status change (a flip) at T=5, then
the calculated total downtime will come out wrong (10
instead of 5).
This change fixes the pause views (hc.api.views.pause,
hc.front.views.pause) to create Flip objects.
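
To illustrate why the missing flip matters, here is a simplified
sketch of summing downtime from a list of flips. It is not the
project's actual statistics code: total_downtime and the
(time, new_status) tuples are assumptions made for the example.

```python
def total_downtime(flips, end):
    """Sum the time spent in the "down" status.

    flips is a list of (time, new_status) tuples sorted by time,
    and end is the time at which the reporting period ends.
    """
    total = 0
    # Pair each flip with the next one (or with the end of the period)
    # to get the interval during which its status was in effect.
    boundaries = flips[1:] + [(end, None)]
    for (start, status), (stop, _) in zip(flips, boundaries):
        if status == "down":
            total += stop - start
    return total


# Without a flip at T=5, the paused interval is counted as downtime.
without_pause_flip = [(0, "down"), (10, "up")]
# With the flip recorded, only T=0..5 counts as downtime.
with_pause_flip = [(0, "down"), (5, "paused"), (10, "up")]

wrong = total_downtime(without_pause_flip, end=10)
right = total_downtime(with_pause_flip, end=10)
```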
Specifically, add read/write support for the new fields:
* success_kw
* failure_kw
* filter_subject
* filter_body
The API still supports reading/writing the "subject" and
"subject_fail" fields, but these are now marked as deprecated
in API documentation.
Fixes: #653