If SITE_ROOT contains a subpath
(e.g., "http://example.org/healthchecks"), settings.py now adjusts
FORCE_SCRIPT_NAME and STATIC_URL.
Also, hc.lib.urls.absolute_reverse() now recognizes the subpath in
SITE_ROOT and makes sure it does not show up twice in the
generated URLs.
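A minimal sketch of the kind of adjustment, assuming the subpath is
derived by parsing SITE_ROOT (the parsing and fallback values here are
illustrative, not the actual settings.py code):

```python
from urllib.parse import urlparse

SITE_ROOT = "http://example.org/healthchecks"

# If SITE_ROOT contains a subpath, tell Django about it so reverse()
# and static URLs include the prefix exactly once.
_subpath = urlparse(SITE_ROOT).path.rstrip("/")
if _subpath:
    FORCE_SCRIPT_NAME = _subpath              # "/healthchecks"
    STATIC_URL = _subpath + "/static/"        # "/healthchecks/static/"
else:
    FORCE_SCRIPT_NAME = None
    STATIC_URL = "/static/"
```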
cc: #1091
The check's details page has a "Current Status" section.
It shows an icon, a status text, and a table with downtime
statistics. When the check has received a start signal, the icon
also has an animated progress indicator under it.
With this change, the status text will also indicate the running
state and the elapsed time. Example:
"Currently running, started 3 minutes ago".
When signal-cli returns an error that we are not handling yet,
log the precise JSON message that signal-cli returns. This
is for debugging and development: we can look at the logged messages
and see what additional special error handling may be needed.
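A sketch of the idea, assuming signal-cli is called over its JSON-RPC
interface; the handled-error set and the function below are
illustrative:

```python
import json
import logging

logger = logging.getLogger("hc.api.transports")

# Error codes we already handle explicitly (illustrative values):
HANDLED_CODES = {-1, -32602}


def check_signal_reply(reply: dict) -> None:
    error = reply.get("error")
    if error and error.get("code") not in HANDLED_CODES:
        # Log the precise JSON reply so we can later decide what
        # additional special handling this error needs.
        logger.debug("signal-cli call failed: %s", json.dumps(reply))
```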
LINE Notify is shutting down on Apr 1, 2025:
https://notify-bot.line.me/closing-announce
I'm removing the onboarding form so people don't set up new
integrations that will stop working in 5 months.
The code for sending LINE Notify notifications is still in place,
and existing integrations will continue to work until LINE Notify
shuts down.
The Project model has (well, had) a num_checks() method.
In the project admin we also annotate the project queryset
with a "num_checks" attribute. Using the same name for two different
things causes type confusion for mypy and can also lead to
coding accidents.
This commit removes the Project.num_checks() method. This was easier
to do than changing the admin, as the method is very simple and was used
in only two places.
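For context, the admin-side annotation looks roughly like this (a
sketch; the `Count("check")` relation name is an assumption):

```python
from django.contrib import admin
from django.db.models import Count


class ProjectAdmin(admin.ModelAdmin):
    def get_queryset(self, request):
        qs = super().get_queryset(request)
        # Each project in the admin changelist gets a "num_checks"
        # attribute -- the same name the removed method used to have.
        return qs.annotate(num_checks=Count("check"))
```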
Also simplify the retry logic: each retry attempt is now
allowed to use the full 30 seconds. This means a single
webhook delivery can take up to 3*30=90 seconds.
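A sketch of the per-attempt timeout idea, with requests standing in
for the actual curl-based delivery code; the attempt count matches the
3*30=90 seconds worst case above:

```python
import requests

ATTEMPTS = 3
TIMEOUT = 30  # seconds, applied to each attempt separately


def deliver_webhook(url: str, body: bytes) -> requests.Response:
    last_exc: Exception | None = None
    for _ in range(ATTEMPTS):
        try:
            # Worst case: 3 attempts x 30 seconds = 90 seconds total.
            return requests.post(url, data=body, timeout=TIMEOUT)
        except requests.RequestException as exc:
            last_exc = exc
    raise last_exc
```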
If sendalerts receives the `--pool` parameter, it reconfigures
settings.DATABASES to enable db connection pooling
(using psycopg_pool with default parameters).
This lets us use many concurrent worker threads without
running out of database connections. For example, with
`--num-workers 100 --pool`, up to 100 worker threads can run
concurrently, but only 3 threads at a time can get a database
connection from the pool; the rest have to wait. When a worker
thread gives up a connection (by calling `close_old_connections`),
another thread can continue.
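With Django 5.1+ and the psycopg 3 backend, the reconfiguration can be
a one-line change to the OPTIONS dict; a sketch (the helper function
is illustrative):

```python
from django.conf import settings


def enable_db_pooling() -> None:
    # Sketch: "pool": True makes Django's psycopg 3 backend use
    # psycopg_pool with its default parameters. Requires the
    # psycopg[pool] / psycopg_pool package and CONN_MAX_AGE = 0.
    db = settings.DATABASES["default"]
    db.setdefault("OPTIONS", {})["pool"] = True
    db["CONN_MAX_AGE"] = 0
```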
A worker thread can give up a db connection before it is fully
finished if it anticipates a long network IO operation ahead.
The Webhook transport does this before making a curl call.
psycopg_pool's default pool size is 4 connections. One
connection is used up by the main thread, so 3 connections
are available for the worker threads.
Fixes: #1057
"PRAGMA busy_timeout" configures the database to wait when a
database is locked instead of giving up immediately.
"transaction_mode IMMEDIATE" starts transactions in read/write
mode, required to make busy_timeout work.
Reference: https://gcollazo.com/optimal-sqlite-settings-for-django/
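A sketch of the matching DATABASES entry, assuming Django 5.1+ where
the SQLite backend accepts `transaction_mode` and `init_command` in
OPTIONS (the database name and the 5-second timeout are illustrative):

```python
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": BASE_DIR / "hc.sqlite",
        "OPTIONS": {
            # Start transactions in read/write mode so busy_timeout
            # applies to write contention as well.
            "transaction_mode": "IMMEDIATE",
            # Wait up to 5 seconds when the database is locked
            # instead of giving up immediately.
            "init_command": "PRAGMA busy_timeout = 5000;",
        },
    }
}
```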
Previously the flip was marked as processed in process_one_flip
(so on the main thread). The advantage of doing it this way is that
the flip gets marked as processed only when the worker thread has
started and has acquired a db connection. There is now a smaller
pause between a sendalerts process claiming a flip and actually
starting work on it.
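A sketch of what claiming a flip on the worker thread can look like;
the field names and the `send_alerts()` call are assumptions based on
the description above:

```python
from django.utils.timezone import now


def worker(flip) -> None:
    # Sketch: the flip gets marked as processed only once this thread
    # is running and has a database connection. Filtering on
    # processed=None makes the claim atomic, so two sendalerts
    # processes cannot both work on the same flip.
    claimed = type(flip).objects.filter(id=flip.id, processed=None)
    if claimed.update(processed=now()) == 0:
        return  # another process claimed this flip first

    flip.send_alerts()
```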
Webhook requests can take 20+ seconds. During that time we hold
on to a database connection. With this commit, the Webhook transport
closes its DB connection before making a curl call.
With psycopg2 this does not have much effect, but with
psycopg 3 and connection pooling we will be able to run more
sendalerts workers than we have database connections. While one
worker is busy making a slow curl call, another worker can
grab its freed-up connection and do some work.
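A sketch of the pattern, with requests standing in for the actual curl
call; `deliver_webhook` and its parameters are illustrative:

```python
import requests
from django.db import close_old_connections


def deliver_webhook(url: str, body: bytes, headers: dict[str, str]) -> None:
    # Everything that needs the database (loading the check, preparing
    # the request body) must happen before this point.

    # Give up our database connection before the potentially slow
    # (20+ seconds) outbound request, so another worker thread can
    # grab it from the pool.
    close_old_connections()

    requests.post(url, data=body, headers=headers, timeout=30)
```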
Django's test runner is not happy with connections closed
mid-test, so I patched out close_old_connections() in affected tests.
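The patching in tests can look roughly like this; the patch target and
the test class are illustrative:

```python
from unittest.mock import patch

from django.test import TestCase


# Sketch: point the patch target at the module where the webhook
# transport imports close_old_connections from.
@patch("hc.api.transports.close_old_connections", lambda: None)
class NotifyWebhookTestCase(TestCase):
    def test_it_works(self) -> None:
        ...
```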