Previously this was done in process_one_flip (so on the main thread).
The advantage of doing it this way is that the flip gets marked as
processed only when the thread has started and has acquired a db
connection. There is now a smaller pause between a sendalerts process
claiming a flip and actually starting work on it.
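A minimal sketch of the worker-side claiming, with the method name and
the exact query being assumptions based on this description rather than
the actual code:

    from django.utils.timezone import now
    from hc.api.models import Flip  # assumed import path

    def process_flip(self, flip: Flip) -> None:
        # Runs in a worker thread: by the time we claim the flip, the
        # thread is already up and has acquired a db connection.
        q = Flip.objects.filter(id=flip.id, processed=None)
        if q.update(processed=now()) != 1:
            return  # another worker has already claimed this flip
        # ... send the notifications for this flip ...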
Webhook requests can take 20+ seconds. During that time we hold
on to a database connection. With this commit, the Webhook transport
closes its DB connection before making a curl call.
With psycopg2 this does not have much effect. But with
psycopg 3 & connection pooling we will be able to use more
sendalerts workers than we have database connections. While one
worker is busy making a slow curl call, another worker can
grab its freed up connection and do some work.
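Roughly, the change amounts to this (a sketch; the class and helper
names are simplified, not copied from the transport code):

    from django.db import close_old_connections

    class Webhook(HttpTransport):
        def notify(self, check, notification) -> None:
            url, body = self.prepare_url_and_body(check)  # hypothetical helper
            # Release this worker's db connection before the potentially
            # slow outbound request, so another worker can reuse it:
            close_old_connections()
            self.post(url, data=body)  # the slow part, may take 20+ seconds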
Django's test runner is not happy with connections closed
mid-test, so I patched out close_old_connections() in affected tests.
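In the affected tests, the patching can look along these lines (standard
unittest.mock; the patch target is the transports module):

    from unittest.mock import patch

    # Keep the db connection open for the duration of the test:
    @patch("hc.api.transports.close_old_connections", lambda: None)
    def test_it_works(self):
        ...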
This is a tricky one: the default value for max_workers is
None. But it does not mean "unlimited"; in Python 3.8+ it
means "min(32, os.cpu_count() + 4)".
For example, on an 8-core CPU the effective value would be
min(32, 8 + 4) = 12, and passing anything above 12 to `--max-workers`
would have no effect.
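This is easy to verify with plain Python:

    import os
    from concurrent.futures import ThreadPoolExecutor

    pool = ThreadPoolExecutor(max_workers=None)
    # Python 3.8+: prints min(32, os.cpu_count() + 4), e.g. 12 on 8 cores
    # (_max_workers is a private attribute, shown for illustration only)
    print(pool._max_workers)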
The counter was slightly wrong (it counted lost races as sent
notifications). Rather than complicating the code to make it correct,
let's just remove it :-)
* Remove the --no-loop and --no-threads arguments
* Use a threadpool to do multiple sends concurrently
* Add a new `--num-workers` argument. It limits how many flips we grab
from the database and process concurrently.
* Do not prioritize flips with historically low send times any more
(not as important now with concurrent sending, and simpler this way)
* Workers close db connections when they finish
(to keep the number of idle connections low)
Note: concurrent.futures.ThreadPoolExecutor internally has an unbounded
queue; it will accept any number of jobs and keep them queued. We don't
want that. We only want to grab a flip, and commit to processing it,
if we know there is a free worker for it. Therefore we track the
number of jobs in flight using a semaphore (`self.seats`).
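The pattern, in rough outline (a sketch; `grab_flip` and `send` stand in
for the actual flip-claiming and sending code):

    import threading
    from concurrent.futures import ThreadPoolExecutor

    class SendAlerts:
        def __init__(self, num_workers: int) -> None:
            self.executor = ThreadPoolExecutor(max_workers=num_workers)
            # One "seat" per worker; a flip is claimed only when a seat is free
            self.seats = threading.Semaphore(num_workers)

        def process_one_flip(self) -> bool:
            self.seats.acquire()           # wait until a worker is free
            flip = self.grab_flip()        # claim a single flip in the db
            if flip is None:
                self.seats.release()
                return False               # nothing to do right now

            future = self.executor.submit(self.send, flip)
            future.add_done_callback(lambda f: self.seats.release())
            return True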
The API responses already contain ping_url, update_url, resume_url and
pause_url fields from which the UUID can be extracted, so we are not
exposing new information. The extraction can be finicky in, say,
shell-scripting scenarios. So, for API users' convenience, we will now
also provide the check's code (UUID) as a separate field.
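For example, an API client that previously had to slice the UUID out of
ping_url can now read it directly (the field name below is illustrative
only; the exact name is defined in the API documentation):

    check = response.json()
    # before: extract the UUID from one of the *_url fields
    code = check["ping_url"].rsplit("/", 1)[-1]
    # now: read it directly from its own field
    code = check["uuid"]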
Fixes: #1007
We already have an MS Teams integration, but MS Teams is discontinuing
the incoming webhook feature used by this integration:
https://devblogs.microsoft.com/microsoft365dev/retirement-of-office-365-connectors-within-microsoft-teams/
MS Teams now recommends using Workflows to post messages
via webhook. MS Teams does not provide backwards compatibility or
an upgrade path for existing integrations.
This commit adds a new "msteamsw" integration which uses MS Teams
Workflows to post notifications. It also updates the instructions
and illustrations on the "Add MS Teams Integration" page.
cc: #1024
We used "body" to store request body as text.
In 2022 we added "body_raw" and started to use it to store request
body as bytes.
In Python code we currently need to inspect both fields,
because the data could be in "body" (for old pings) or in
"body_raw" (for newer pings). My plan is to eventually get rid
of the "body" field, and have "body_raw" only. This data migration
is a step towards that: for any Ping object that has a non-empty
"body" field, it moves the data to the "body_raw" field. After
applying this migration, the "body" field should be empty (empty
string or null) for all Ping objects.
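The migration itself is small; roughly (a sketch of the usual Django
data-migration shape; the dependency name is a placeholder):

    from django.db import migrations

    def fill_body_raw(apps, schema_editor):
        Ping = apps.get_model("api", "Ping")
        # Move text bodies into the bytes field, then clear the old field
        for ping in Ping.objects.exclude(body=None).exclude(body="").iterator():
            ping.body_raw = ping.body.encode()  # assumes UTF-8 text
            ping.body = ""
            ping.save(update_fields=["body", "body_raw"])

    class Migration(migrations.Migration):
        dependencies = [("api", "XXXX_previous_migration")]  # placeholder
        operations = [
            migrations.RunPython(fill_body_raw, migrations.RunPython.noop),
        ]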
Instead, make it set a local `have_apprise` variable, and use
it in the hc.api.transports.Apprise class.
If hc.api.transports sets APPRISE_ENABLED to False,
then the apprise system check in hc.api.apps will not see the
original value and therefore will not run.
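A minimal sketch of the approach (class and exception names simplified):

    # hc/api/transports.py
    from django.conf import settings

    try:
        import apprise
        have_apprise = True
    except ImportError:
        have_apprise = False

    class Apprise(Transport):
        def notify(self, check, notification) -> None:
            if not settings.APPRISE_ENABLED or not have_apprise:
                raise TransportError("Apprise is disabled and/or not installed")
            # ... build and send the apprise notification ...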
* fix: show warning if apprise is enabled and not installed in environment
* renamed apprise check register
* reverted changes in transport for apprise
There are three related changes:
* Removed legacy timezones from hc.lib.tz.all_timezones
* Added a data migration to update existing Check.tz values
* For backwards compatibility, added code to automatically
replace a legacy timezone with a canonical timezone when a
legacy timezone is passed to an API call
I used the timezone mapping from
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
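The backwards-compatibility part boils down to one lookup before
validation, roughly (`LEGACY_TIMEZONES` stands in for the mapping taken
from that page):

    # Map a legacy alias to its canonical zone, pass anything else through
    LEGACY_TIMEZONES = {
        "US/Eastern": "America/New_York",
        "Asia/Calcutta": "Asia/Kolkata",
        # ... the rest of the mapping ...
    }

    def canonicalize(tz: str) -> str:
        return LEGACY_TIMEZONES.get(tz, tz)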
This is primarily to make notification lookups by code efficient.
We look up notifications by code in hc.api.views.bounces.
This field has a default value (uuid.uuid4), so any null values
will be filled with random UUIDs during migration.
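In effect, a lookup like the one below now hits an index instead of
scanning the table (an illustrative call site, not the actual view code):

    from django.shortcuts import get_object_or_404
    from hc.api.models import Notification

    def bounces(request, code):
        notification = get_object_or_404(Notification, code=code)
        # ... record the bounce on this notification ...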