Commit graph

596 commits

Author SHA1 Message Date
Pēteris Caune
c91213179f
Fix API to gracefully handle too long slugs 2024-10-16 12:35:30 +03:00
Pēteris Caune
8c210e151f
Update the Signal integration to retry on network errors 2024-10-14 11:19:37 +03:00
Pēteris Caune
4f9b0b11b9
Update Signal transport to log unexpected signal-cli replies
When signal-cli returns an error that we are not handling yet,
log the precise JSON message that signal-cli returns. This
is for debug & development: We can look at the logged messages
and see what additional special error handling may be needed.
2024-10-10 10:21:08 +03:00
Pēteris Caune
fd96cc794b
Remove unused bits 2024-10-04 17:34:30 +03:00
Pēteris Caune
de4c4897e3
Remove prunenotifications management command
Notifications are now cleaned up automatically during pinging.
2024-10-02 09:24:01 +03:00
Pēteris Caune
13f92b90ef
Update settings.py to read SECURE_PROXY_SSL_HEADER from env vars
And add it to docs.

And add a system check to make sure it, if set, is a tuple
with 2 elements.

cc: #851
2024-10-01 19:13:26 +03:00
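The validation described in this commit can be sketched roughly as follows. This is a minimal standalone illustration, not the project's actual code: the comma-separated environment variable format and the helper names `read_proxy_ssl_header` and `check_proxy_ssl_header` are assumptions, and a real Django system check would be registered through Django's checks framework instead of returning plain strings.

```python
import os
from typing import Mapping, Optional


def read_proxy_ssl_header(environ: Mapping[str, str] = os.environ) -> Optional[tuple]:
    """Read SECURE_PROXY_SSL_HEADER from the environment.

    Assumption: the value is comma-separated, e.g. "HTTP_X_FORWARDED_PROTO,https".
    """
    raw = environ.get("SECURE_PROXY_SSL_HEADER")
    if not raw:
        return None  # setting not configured
    return tuple(part.strip() for part in raw.split(","))


def check_proxy_ssl_header(value) -> list:
    """Return a list of error messages; an empty list means the value is acceptable."""
    if value is None:
        return []  # unset is fine, the check only applies "if set"
    if not isinstance(value, tuple) or len(value) != 2:
        return ["SECURE_PROXY_SSL_HEADER must be a tuple with 2 elements"]
    return []
```

For example, an environment containing `SECURE_PROXY_SSL_HEADER=HTTP_X_FORWARDED_PROTO,https` passes the check, while a value with one or three parts produces an error.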
Pēteris Caune
f241d070e1
Update Flip.select_channels() to sort channels by last_notify_duration
If a check has multiple associated channels, some are slow and
some are quick, handle the quick ones first.
2024-09-12 10:44:56 +03:00
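A rough sketch of the ordering idea in this commit. The `Channel` dataclass and `sort_channels` function are hypothetical stand-ins (the real code lives in `Flip.select_channels()`), and treating a channel with no recorded duration as the quickest is an assumption.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass
class Channel:
    name: str
    # How long the previous notification took; None if never measured.
    last_notify_duration: Optional[timedelta] = None


def sort_channels(channels: List[Channel]) -> List[Channel]:
    """Return channels ordered from quickest to slowest previous delivery."""
    # Assumption: channels without a recorded duration are handled first.
    return sorted(channels, key=lambda ch: ch.last_notify_duration or timedelta(0))
```

With this ordering, one slow channel delays only the channels that are slower still, not the quick ones.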
Pēteris Caune
f60af9a156
Update ntfy integration to give up db connection before network IO 2024-09-12 10:30:58 +03:00
Pēteris Caune
28af3720f4
Increase outgoing webhook timeout from 10 to 30 seconds
Also simplify the retry logic: each retry attempt is now
allowed to use the full 30 seconds. This means a single
webhook delivery can take up to 3*30=90 seconds.
2024-09-11 12:37:40 +03:00
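The simplified retry logic and its worst-case duration can be illustrated like this. It is a sketch under stated assumptions: `deliver_with_retries` and `worst_case_seconds` are hypothetical names, the `send` callable stands in for the actual HTTP call, and only `TimeoutError` is retried here for brevity.

```python
ATTEMPTS = 3   # number of delivery attempts, per the 3*30 figure above
TIMEOUT = 30   # seconds allowed for each attempt


def deliver_with_retries(send, attempts=ATTEMPTS, timeout=TIMEOUT):
    """Call send(timeout) up to `attempts` times; each attempt gets the full timeout."""
    last_error = None
    for _ in range(attempts):
        try:
            return send(timeout)
        except TimeoutError as e:
            last_error = e  # remember the failure and try again
    raise last_error


def worst_case_seconds(attempts=ATTEMPTS, timeout=TIMEOUT):
    """Upper bound on the total time spent if every attempt times out."""
    return attempts * timeout
```

Because no time budget is shared between attempts, the upper bound is simply the product of the two constants: 3 * 30 = 90 seconds.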
Pēteris Caune
3275e0ffaa
Update notify() to return logs instead of printing them 2024-09-03 10:23:15 +03:00
Pēteris Caune
8c56ca6dde
Update sendalerts to mark flip as processed on thread
Previously this was done in process_one_flip (so on the main thread).
The advantage of doing it this way is that the flip gets marked as processed
only when the thread has started and has acquired a db connection.
There is now a smaller pause between a sendalerts process claiming a
flip, and actually starting work on it.
2024-09-01 15:28:48 +03:00
Pēteris Caune
a463daa775
Update Webhook transport to close db connection before network IO
Webhook requests can take 20+ seconds. During that time we hold
on to a database connection. With this commit, the Webhook transport
closes its DB connection before making a curl call.

With psycopg2 this does not have much effect. But with
psycopg 3 & connection pooling we will be able to use more
sendalerts workers than we have database connections. While one
worker is busy making a slow curl call, another worker can
grab its freed up connection and do some work.

Django's test runner is not happy with connections closed
mid-test, so I patched out close_old_connections() in affected tests.
2024-08-31 19:18:17 +03:00
Pēteris Caune
7641f2a9a1
Switch to using close_old_connections() instead of connection.close() 2024-08-31 19:02:11 +03:00
Pēteris Caune
d76dc53e49
Increase Signal send timeout to 60 seconds 2024-08-31 11:07:17 +03:00
Pēteris Caune
d3ae4e7fac
Add support for $SLUG placeholder in webhook payloads
Fixes: #1049
2024-08-16 13:24:12 +03:00
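As an illustration of how a placeholder such as `$SLUG` might be expanded in a webhook payload, here is a minimal sketch. The `render_payload` helper is hypothetical; the project's real implementation may escape values for JSON or URLs and supports other placeholders not shown here.

```python
def render_payload(template: str, values: dict) -> str:
    """Replace each $NAME placeholder in the template with its value."""
    result = template
    # Replace longer names first so that one name cannot clobber another
    # name that merely starts with it.
    for name in sorted(values, key=len, reverse=True):
        result = result.replace("$" + name, values[name])
    return result
```

For instance, `render_payload('{"check": "$SLUG"}', {"SLUG": "backup-job"})` yields `{"check": "backup-job"}`.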
Pēteris Caune
bdb6f18a3d
Add "uuid" field in API responses when read/write key is used
The API responses already contain ping_url, update_url, resume_url,
pause_url fields where the UUID can be extracted from, so we are
not exposing new information. The extraction can be finicky in,
say, shell-scripting scenarios. So for API user convenience we will
now also provide the check's code (UUID) as a separate field.

Fixes: #1007
2024-07-18 18:15:52 +03:00
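To show the convenience this adds, the sketch below compares extracting the UUID from `ping_url` with reading the new `uuid` field directly. The JSON document, host name, and UUID value are made-up placeholders, and `uuid_from_ping_url` is a hypothetical helper.

```python
import json
import uuid
from urllib.parse import urlparse

# Hypothetical API response, trimmed to the fields relevant here.
RESPONSE = json.loads("""
{
  "name": "backup",
  "ping_url": "https://ping.example.org/00000000-0000-4000-8000-000000000000",
  "uuid": "00000000-0000-4000-8000-000000000000"
}
""")


def uuid_from_ping_url(ping_url: str) -> str:
    """The older, more finicky approach: take the last path segment of ping_url."""
    last_segment = urlparse(ping_url).path.rstrip("/").split("/")[-1]
    return str(uuid.UUID(last_segment))  # raises ValueError if it is not a UUID


# With the separate field, no URL parsing is needed.
code = RESPONSE["uuid"]
```

Both approaches give the same value; the field only removes the parsing step.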
Pēteris Caune
8054191be3
Remove HipChat, Pagerteam, Zendesk channel kinds
HipChat and Pagerteam products have long been shut down,
the Zendesk integration was never fully implemented.
2024-07-18 16:21:45 +03:00
Pēteris Caune
e83f60cc0b
Implement MS Teams Workflows integration
We already have a MS Teams integration but MS Teams is discontinuing
the incoming webhook feature used by this integration:

https://devblogs.microsoft.com/microsoft365dev/retirement-of-office-365-connectors-within-microsoft-teams/

MS Teams now recommends using Workflows to post messages
via webhook. MS Teams does not provide backwards compatibility or
an upgrade path for existing integrations.

This commit adds a new "msteamsw" integration which uses MS Teams
Workflows to post notifications. It also updates the instructions
and illustrations in the "Add MS Teams Integration" page.

cc: #1024
2024-07-17 13:35:17 +03:00
Pēteris Caune
997154e3b0
Remove usages of Ping.body 2024-07-11 16:17:21 +03:00
Pēteris Caune
324fa10ce7
Fix Check.lock_and_delete() to gracefully handle already deleted check 2024-06-20 15:57:53 +03:00
Viktor Szépe
9a44ef1571 Fix typos 2024-06-20 15:41:42 +03:00
Pēteris Caune
b2c5e91c70
Implement legacy -> canonical timezone conversion
There are three related changes:

* Removed legacy timezones from hc.lib.tz.all_timezones
* Added data migration to update existing Check.tz values
* For backwards compatibility, added code to automatically
  replace a legacy timezone with a canonical timezone when a
  legacy timezone is passed to an API call

I used the timezone mapping on
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
2024-06-14 12:55:57 +03:00
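The backwards-compatibility step in the third bullet can be sketched as a simple lookup. The mapping below is a three-entry excerpt chosen for illustration (the commit draws on the full Wikipedia list), and `canonical_tz` is a hypothetical helper name.

```python
# Excerpt of a legacy -> canonical timezone mapping (illustrative, not complete).
LEGACY_TO_CANONICAL = {
    "Europe/Kiev": "Europe/Kyiv",
    "Asia/Calcutta": "Asia/Kolkata",
    "US/Pacific": "America/Los_Angeles",
}


def canonical_tz(name: str) -> str:
    """Return the canonical name for a legacy timezone; other names pass through."""
    return LEGACY_TO_CANONICAL.get(name, name)
```

An API handler could apply `canonical_tz()` to incoming values before validating them, so that clients sending legacy names keep working while only canonical names are stored.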
Pēteris Caune
52f2b534a6
Fix API to accept Europe/Kiev but save it as Europe/Kyiv 2024-06-13 15:23:27 +03:00
Pēteris Caune
4ec7a48082
Update the Discord integration to disable channel on HTTP 404 responses 2024-04-26 09:25:42 +03:00
Pēteris Caune
6fb46aee32
Fix integrations to include oncalendar schedules in notifications 2024-04-24 16:08:55 +03:00
Pēteris Caune
4181399659
Fix Spike integration to not disclose check's code in incident data 2024-04-22 13:01:38 +03:00
Pēteris Caune
ddae6a04bf
Fix VictorOps integration to not disclose check's code in incident data 2024-04-22 12:57:10 +03:00
Pēteris Caune
c08ba1d872
Fix PagerTree integration to not disclose check's code in incident data 2024-04-22 12:46:18 +03:00
Pēteris Caune
994bc10857
Update PagerDuty integration to use ping.formatted_kind_created 2024-04-22 12:31:03 +03:00
Pēteris Caune
18bd44a68b
Fix PagerDuty integration to not disclose check's code in incident data 2024-04-22 12:12:22 +03:00
Pēteris Caune
5c73556050
Include ping's kind in Opsgenie notification's "Last ping" field 2024-04-19 12:19:26 +03:00
Pēteris Caune
7f03a9e738
Improve Opsgenie notifications (include description, schedule, link...) 2024-04-19 11:58:35 +03:00
Pēteris Caune
577602ae21
Fix Opsgenie integration to not disclose check's code in incident data 2024-04-19 11:24:12 +03:00
Pēteris Caune
82ed392361
Update transport classes to use regular spaces instead of non-breaking 2024-04-17 16:35:43 +03:00
Pēteris Caune
83f161d657
Update transport classes to use Transport.last_ping() consistently
* Instead of check.n_pings (int) use last_ping().n
* Instead of check.last_ping (datetime) use last_ping().created

There is a time gap from creating a flip object to processing
it (sending out an alert). We want the notification to reflect
the check's state at the moment the flip was created. To do this,
we use the Transport.last_ping() helper method which retrieves
the last ping *that is not newer than the flip*.

This commit updates transport classes and templates to use
Transport.last_ping() consistently everywhere.
2024-04-15 15:09:17 +03:00
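The rule "the last ping that is not newer than the flip" can be expressed as a small filter. This is a sketch with a hypothetical `Ping` dataclass and a free function standing in for `Transport.last_ping()`; the real method queries the database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class Ping:
    n: int             # sequence number of the ping
    created: datetime  # when the ping was received


def last_ping(pings: Iterable[Ping], flip_created: datetime) -> Optional[Ping]:
    """Return the most recent ping created at or before the flip, or None."""
    candidates = [p for p in pings if p.created <= flip_created]
    return max(candidates, key=lambda p: p.created, default=None)
```

A notification built from `last_ping(...).n` and `last_ping(...).created` then reflects the check's state when the flip was created, even if newer pings arrived before the alert was sent.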
Pēteris Caune
d8a46349a8
Update Transport.last_ping() to ignore pings newer than the flip 2024-04-15 12:47:11 +03:00
Pēteris Caune
5bdb01baf9
Fix the Zulip integration to use Flip.new_status 2024-04-12 15:43:06 +03:00
Pēteris Caune
6631a5f76a
Fix the WhatsApp integration to use Flip.new_status 2024-04-12 15:40:19 +03:00
Pēteris Caune
3301cce251
Fix the Splunk On-Call integration to use Flip.new_status 2024-04-12 15:37:43 +03:00
Pēteris Caune
6e27d88ec9
Fix the Trello integration to use Flip.new_status 2024-04-12 15:34:54 +03:00
Pēteris Caune
bcecf058c2
Fix the Telegram integration to use Flip.new_status 2024-04-12 15:33:03 +03:00
Pēteris Caune
17e9f33bb9
Fix the Spike integration to use Flip.new_status 2024-04-12 15:30:26 +03:00
Pēteris Caune
2913b8faf5
Fix the Signal integration to use Flip.new_status 2024-04-12 15:27:13 +03:00
Pēteris Caune
f673f599c5
Fix the RocketChat integration to use Flip.new_status 2024-04-12 15:18:56 +03:00
Pēteris Caune
af36078f10
Fix the Pushover integration to use Flip.new_status 2024-04-12 15:14:19 +03:00
Pēteris Caune
a485bea2b2
Fix the Pushbullet integration to use Flip.new_status 2024-04-12 15:08:25 +03:00
Pēteris Caune
f24a1dbc25
Fix the PagerDuty integration to use Flip.new_status 2024-04-12 15:03:31 +03:00
Pēteris Caune
f060874be5
Fix the PagerTree integration to use Flip.new_status 2024-04-12 15:00:53 +03:00
Pēteris Caune
e5018b1195
Fix the Opsgenie integration to use Flip.new_status 2024-04-12 14:57:21 +03:00
Pēteris Caune
4da32b9214
Fix the ntfy integration to use Flip.new_status 2024-04-12 14:53:29 +03:00