When signal-cli returns an error that we are not handling yet,
log the precise JSON message that signal-cli returns. This
is for debugging and development: we can look at the logged messages
and see what additional special-case error handling may be needed.
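A minimal sketch of the idea (the logger name, the error code, and the
`handle_error_reply` helper are illustrative, not the project's actual
code):

```python
import json
import logging

logger = logging.getLogger("hc.api.transports")  # illustrative name

def handle_error_reply(reply: dict) -> str:
    """Map a signal-cli JSON-RPC error reply to a user-facing message."""
    code = reply.get("error", {}).get("code")
    if code == -32602:  # illustrative already-handled error code
        return "signal-cli call failed (invalid params)"
    # Not handled yet: log the precise JSON so we can later decide
    # whether it needs special handling of its own.
    logger.debug("Unexpected signal-cli reply: %s", json.dumps(reply))
    return "signal-cli call failed"
```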
LINE Notify is shutting down on Apr 1, 2025:
https://notify-bot.line.me/closing-announce
I'm removing the onboarding form so people don't set up new
integrations that will stop working in 5 months.
The code for sending LINE Notify notifications still exists,
and the existing integrations will continue to work (until LINE
Notify stops working).
The Project model has (well, had) a num_checks() method.
In the project admin we are also annotating the project queryset
with a "num_checks" attribute. Using the same name for two different
things causes type confusion for mypy and can also lead to
coding accidents.
This commit removes the Project.num_checks() method. This was easier
to do than changing the admin, as the method is very simple and was used
in only two places.
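For context, a sketch of the admin-side annotation that keeps the
"num_checks" name (the import path and the "check" related name are
assumptions here):

```python
from django.contrib import admin
from django.db.models import Count

from hc.accounts.models import Project  # path is an assumption

@admin.register(Project)
class ProjectAdmin(admin.ModelAdmin):
    def get_queryset(self, request):
        qs = super().get_queryset(request)
        # "num_checks" now exists only as a queryset annotation,
        # not as a model method:
        return qs.annotate(num_checks=Count("check"))
```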
Also simplify the retry logic: each retry attempt is now
allowed to use the full 30 seconds. This means a single
webhook delivery can take up to 3*30=90 seconds.
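Roughly like this, sketched with `requests` for brevity (the actual
transport code may look different):

```python
import requests

def deliver(url: str, payload: bytes, tries: int = 3) -> None:
    # Each attempt gets the full 30 seconds, so a single webhook
    # delivery can take up to 3*30=90 seconds in the worst case.
    for attempt in range(tries):
        try:
            requests.post(url, data=payload, timeout=30)
            return
        except requests.RequestException:
            if attempt == tries - 1:
                raise
```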
If sendalerts receives the `--pool` parameter, it reconfigures
settings.DATABASES to enable db connection pooling
(using psycopg_pool with default parameters).
This lets us use many concurrent worker threads but not
run out of database connections. For example, with
`--num-workers 100 --pool`, up to 100 worker threads can run
concurrently, but only 3 threads can get a database connection
from the pool; the rest have to wait. When a worker thread
gives up a connection (by calling `close_old_connections`),
another thread can continue.
A worker thread can give up a db connection before it is fully
finished if it anticipates a long network IO operation ahead.
The Webhook transport does this before making a curl call.
psycopg_pool's default pool size is 4 connections. One
connection is used up by the main thread, so 3 connections
are available for the worker threads.
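A minimal sketch of the reconfiguration, assuming Django 5.1+'s
psycopg 3 backend, where `"pool": True` enables psycopg_pool with
default parameters:

```python
from django.conf import settings

def enable_db_pooling() -> None:
    # Must run before the first database connection is opened.
    settings.DATABASES["default"].setdefault("OPTIONS", {})["pool"] = True
```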
Fixes: #1057
"PRAGMA busy_timeout" configures the database to wait when a
database is locked instead of giving up immediately.
"transaction_mode IMMEDIATE" starts transactions in read/write
mode, required to make busy_timeout work.
Reference: https://gcollazo.com/optimal-sqlite-settings-for-django/
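In Django settings this looks roughly like the following (the 5000 ms
value is an assumption; `transaction_mode` and `init_command` need
Django 5.1+):

```python
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "hc.sqlite",
        "OPTIONS": {
            # Start transactions in read/write mode so busy_timeout
            # also applies to writes:
            "transaction_mode": "IMMEDIATE",
            # Wait up to 5 seconds when the database is locked
            # instead of giving up immediately:
            "init_command": "PRAGMA busy_timeout = 5000;",
        },
    }
}
```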
Previously this was done in process_one_flip (so on the main thread).
The advantage of doing it this way is that the flip gets marked as
processed only when the worker thread has started and acquired a db
connection. There is now a smaller pause between a sendalerts process
claiming a flip and actually starting work on it.
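A sketch of the claim step as it could run on the worker thread (the
import path and field names are assumptions):

```python
from django.utils.timezone import now

from hc.api.models import Flip  # path is an assumption

def claim_flip(flip: Flip) -> bool:
    # Runs on the worker thread, after it has acquired a db connection.
    # The atomic UPDATE ... WHERE processed IS NULL ensures only one
    # sendalerts process wins the race for this flip.
    num_updated = Flip.objects.filter(id=flip.id, processed=None).update(
        processed=now()
    )
    return num_updated == 1
```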
Webhook requests can take 20+ seconds. During that time we hold
on to a database connection. With this commit, the Webhook transport
closes its DB connection before making a curl call.
With psycopg2 this does not have much effect. But with
psycopg 3 & connection pooling we will be able to use more
sendalerts workers than we have database connections. While one
worker is busy making a slow curl call, another worker can
grab its freed-up connection and do some work.
Django's test runner is not happy with connections closed
mid-test, so I patched out close_old_connections() in affected tests.
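The gist of the change, with `requests` standing in for the actual
curl call:

```python
from django.db import close_old_connections
import requests

def deliver_webhook(url: str, body: bytes) -> None:
    # All database reads needed for this notification happen before
    # this point. Give up the db connection before the slow network
    # call, so another worker thread can grab it:
    close_old_connections()
    requests.post(url, data=body, timeout=30)
```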
This is a tricky one: the default value for max_workers is
None. But it doesn't mean "unlimited"; in Python 3.8+ it
means `min(32, os.cpu_count() + 4)`.
For example, on an 8-core CPU the effective value would be 8 + 4 = 12,
and passing anything above 12 to `--max-workers` would have no effect.
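To check the effective default on a given machine:

```python
import os

# ThreadPoolExecutor's default when max_workers is None (Python 3.8+):
effective = min(32, (os.cpu_count() or 1) + 4)
print(effective)  # 12 on an 8-core machine
```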
The counter was slightly wrong (it counted lost races as sent
notifications). Rather than complicating the code to make it correct,
let's just remove it :-)
* Remove the --no-loop and --no-threads arguments
* Use a threadpool to do multiple sends concurrently
* Add a new `--num-workers` argument. It limits how many flips we grab
from the database and process concurrently.
* Do not prioritize flips with historically low send times any more
(not as important now with concurrent sending, and simpler this way)
* Workers close db connections when they finish
(to keep the number of idle connections low)
Note: concurrent.futures.ThreadPoolExecutor internally has an unbounded
queue; it will accept any number of jobs and keep them queued. We don't
want that: we only want to grab a flip, and commit to processing it,
if we know there's a free worker for it. Therefore we track the
number of jobs in flight using a semaphore (`self.seats`).
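A condensed sketch of the pattern; apart from `self.seats`, the names
here are illustrative stand-ins:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def claim_next_flip():
    """Illustrative stand-in: grab one unprocessed flip, or None."""
    return None

def send_alerts(flip) -> None:
    """Illustrative stand-in: deliver the notifications for a flip."""

class Sender:
    def __init__(self, num_workers: int) -> None:
        self.executor = ThreadPoolExecutor(max_workers=num_workers)
        # One "seat" per worker: grab a flip from the database only
        # when a seat is free, so the executor's queue never grows.
        self.seats = threading.Semaphore(num_workers)

    def process(self, flip) -> None:
        try:
            send_alerts(flip)
        finally:
            self.seats.release()

    def run_once(self) -> bool:
        self.seats.acquire()  # block until a worker is available
        flip = claim_next_flip()
        if flip is None:
            self.seats.release()
            return False
        self.executor.submit(self.process, flip)
        return True
```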
I like "sign in" better, but users from time
to time confuse "sign in" and "sign up" forms. To reduce
confusion potential, I'm renaming "sign in" to "log in".
The API responses already contain ping_url, update_url, resume_url,
and pause_url fields, from which the UUID can be extracted, so we are
not exposing new information. But the extraction can be finicky in,
say, shell-scripting scenarios. So, for API users' convenience, we
now also provide the check's code (UUID) as a separate field.
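For example, in a script (the exact endpoint version is an assumption;
the API key is a placeholder):

```python
import requests

r = requests.get(
    "https://healthchecks.io/api/v3/checks/",
    headers={"X-Api-Key": "your-api-key"},
)
for check in r.json()["checks"]:
    # No more parsing the UUID out of ping_url:
    print(check["code"])
```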
Fixes: #1007