A taskqueue implementation for Plone 5/6 based on the Huey package.
See https://huey.readthedocs.io/en/latest/
This package can be used as a nearly seamless replacement for
collective.taskqueue. It does not interfere with WSGI or ZServer and should be
compatible with most up-to-date Plone 5.2 and Plone 6.x installations. Its main
purpose is to let you schedule asynchronous operations directly from your
application code. Because collective.taskqueue2 is built on Huey, you can also
schedule periodic tasks in a cron-style manner.
The collective.taskqueue2 package supports multiple storage backends: Redis,
SQLite, in-memory, and filesystem. In most cases, Redis is the preferred choice
for production environments, while SQLite or in-memory storage is commonly used
for development.
Install collective.taskqueue2 by adding it to your buildout::

    [buildout]
    ...
    eggs =
        collective.taskqueue2

and then running bin/buildout.
The HUEY_CONSUMER environment variable determines whether the current
Plone/Zope instance functions as a consumer of the task queue. The values 1,
True, true, or on indicate that the instance is a consumer. Any other value
means that the instance is not a consumer.
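The rule above can be sketched in a few lines of Python. This is only an illustration, not the package's actual code: the helper name is_consumer and the exact-match comparison (no whitespace stripping) are assumptions.

```python
import os

# Values that mark the instance as a consumer, as listed above.
TRUTHY_VALUES = {"1", "True", "true", "on"}


def is_consumer(environ=None):
    """Return True if HUEY_CONSUMER marks this instance as a consumer.

    ``is_consumer`` is a hypothetical helper, not part of the package API.
    """
    if environ is None:
        environ = os.environ
    # Any value outside the listed set (including a missing variable)
    # is treated as "not a consumer".
    return environ.get("HUEY_CONSUMER", "") in TRUTHY_VALUES
```

Passing a plain dictionary instead of os.environ makes the rule easy to try out without touching the real environment.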
HUEY_LOG_LEVEL is an environment variable that configures the logging level of
the collective.taskqueue2 package, which is based on the Huey package.
Here are some key points about HUEY_LOG_LEVEL:
- It is an environment variable, so it can be set outside of the code.
- It controls the verbosity of the log messages generated by the
  collective.taskqueue2 package, and therefore the amount of log output
  produced by the task queue implementation.
- The available logging levels depend on the logging framework being used.
  Common levels include DEBUG, INFO, WARNING, ERROR, and CRITICAL, with DEBUG
  being the most verbose and CRITICAL the least verbose.
- The specific values that HUEY_LOG_LEVEL accepts, and their meanings, depend
  on the implementation details of the collective.taskqueue2 package.
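As a rough sketch of how a level name could be mapped onto Python's standard logging module: the function name resolve_log_level and the INFO fallback are assumptions made for this example, and the package may handle unknown values differently.

```python
import logging


def resolve_log_level(name, default=logging.INFO):
    """Translate a level name such as "DEBUG" into a logging constant.

    Hypothetical helper: unknown names fall back to ``default``.
    """
    level = getattr(logging, str(name).upper(), None)
    # Only accept real numeric levels; anything else uses the fallback.
    return level if isinstance(level, int) else default
```

The result can be passed to logging.getLogger(...).setLevel(...).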
The code relies on the HUEY_TASKQUEUE_URL environment variable to determine
the configuration of the task queue. If the environment variable is not set, it
falls back to the default value sqlite:///tmp/huey_queue.sqlite.
HUEY_TASKQUEUE_URL must be a string containing the URL of the task queue
configuration.
To use a different task queue configuration, set the HUEY_TASKQUEUE_URL
environment variable to a URL representing the desired configuration. Here are
some examples of URL formats for different configurations:
- SQLite: HUEY_TASKQUEUE_URL=sqlite:///path/to/database.sqlite
- Redis: HUEY_TASKQUEUE_URL=redis://localhost:6379/0
- Memory: HUEY_TASKQUEUE_URL=memory://
- File system: HUEY_TASKQUEUE_URL=file:///path/to/queue/folder
Make sure to adjust the URLs according to your specific environment.
Here is what the URL for each supported scheme configures:
- SQLite: HUEY_TASKQUEUE_URL=sqlite:///path/to/database.sqlite
  This URL configures the task queue to use SQLite with a specific database file.
- Redis: HUEY_TASKQUEUE_URL=redis://localhost:6379/0
  This URL configures the task queue to use Redis with a specific host
  (localhost), port (6379), and database (0).
- Memory: HUEY_TASKQUEUE_URL=memory://
  This URL configures the task queue to use in-memory storage. No additional
  parameters are needed.
- File: HUEY_TASKQUEUE_URL=file:///path/to/queue/folder
  This URL configures the task queue to use file-based storage with a specific
  folder path.
Ensure that you set the URL for the desired scheme before running the code.
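To make the URL formats concrete, here is a minimal sketch that splits such a URL into a backend name and its settings using only the standard library. The function name describe_taskqueue_url and the returned keys (filename, host, port, db, path) are assumptions for illustration; the package's own parsing may differ, in particular in how it interprets the path part of a sqlite URL.

```python
from urllib.parse import urlparse

# Default taken from the text above.
DEFAULT_URL = "sqlite:///tmp/huey_queue.sqlite"


def describe_taskqueue_url(url=DEFAULT_URL):
    """Split a task queue URL into (backend name, settings dict).

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    scheme = parsed.scheme
    if scheme == "sqlite":
        return "sqlite", {"filename": parsed.path}
    if scheme == "redis":
        return "redis", {
            "host": parsed.hostname,
            "port": parsed.port,
            # The path component carries the Redis database number.
            "db": int(parsed.path.lstrip("/") or 0),
        }
    if scheme == "memory":
        return "memory", {}
    if scheme == "file":
        return "file", {"path": parsed.path}
    raise ValueError(f"Unsupported task queue URL scheme: {scheme!r}")
```

Rejecting unknown schemes early, as in this sketch, surfaces a mistyped URL at startup rather than when the first task is queued.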
The huey_taskqueue object created from the URL configuration can then be used
in the application for task queuing and processing.
The configuration options for the Huey consumer are defined in the
consumer_options dictionary. These options can be overridden using environment
variables. Here are the available options:
- backoff: The factor by which the polling delay grows while the queue is empty.
- check_worker_health: Whether to periodically check the health of the workers.
- extra_locks: Additional lock names to flush when the consumer starts.
- flush_locks: Whether to flush all locks when the consumer starts.
- health_check_interval: The interval (in seconds) for checking worker health.
- initial_delay: The minimum delay (in seconds) between polls of the queue.
- max_delay: The maximum delay (in seconds) between polls of the queue.
- periodic: Whether to enable periodic tasks.
- scheduler_interval: The interval (in seconds) at which the scheduler runs.
- worker_type: The type of worker to use (e.g., "thread" or "process").
- workers: The number of worker threads or processes to use.
- verbose: Whether to enable verbose logging.
To override a configuration option, set an environment variable named after
the option with the prefix HUEY_. Here are some examples:
- HUEY_WORKERS: The number of worker threads or processes.
- HUEY_LOGFILE: The path to the log file.
- HUEY_VERBOSE: Whether to enable verbose logging.
- HUEY_WORKER_TYPE: The type of worker to use.
- HUEY_PERIODIC: Whether to enable periodic tasks.
- HUEY_SCHEDULER_INTERVAL: The interval (in seconds) at which the scheduler runs.
- HUEY_INITIAL_DELAY: The minimum delay (in seconds) between polls of the queue.
- HUEY_MAX_DELAY: The maximum delay (in seconds) between polls of the queue.
- HUEY_BACKOFF: The factor by which the polling delay grows while the queue is empty.
- HUEY_HEALTH_CHECK_INTERVAL: The interval (in seconds) for checking worker health.
- HUEY_CHECK_WORKER_HEALTH: Whether to periodically check the health of the workers.
- HUEY_EXTRA_LOCKS: Additional lock names to flush when the consumer starts.
- HUEY_FLUSH_LOCKS: Whether to flush all locks when the consumer starts.
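The override mechanism can be pictured roughly as follows. This is a sketch under stated assumptions, not the package's code: the default values are illustrative, and the helper names coerce and consumer_options_from_env as well as the type-coercion rules are hypothetical.

```python
import os

# Illustrative defaults only; the package's real consumer_options may differ.
EXAMPLE_DEFAULTS = {
    "workers": 1,
    "worker_type": "thread",
    "periodic": True,
    "backoff": 1.15,
}


def coerce(value, default):
    """Convert an environment string to the type of the default value."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(default, bool):
        return value.strip().lower() in {"1", "true", "on", "yes"}
    if isinstance(default, int):
        return int(value)
    if isinstance(default, float):
        return float(value)
    return value


def consumer_options_from_env(defaults=EXAMPLE_DEFAULTS, environ=None):
    """Overlay HUEY_<OPTION> environment variables on top of the defaults."""
    environ = os.environ if environ is None else environ
    options = dict(defaults)
    for name, default in defaults.items():
        raw = environ.get("HUEY_" + name.upper())
        if raw is not None:
            options[name] = coerce(raw, default)
    return options
```

For example, an environment with HUEY_WORKERS=4 would yield workers=4 while every other option keeps its default.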
It is strongly recommended to keep the default configuration values and to change them only if you know what you are doing. Please refer to the Huey documentation first to understand the configuration options and their impact.
After installing collective.taskqueue2 in Plone, you should see the following
output on the console (with HUEY_LOG_LEVEL=DEBUG and HUEY_CONSUMER=1 set):
2023-11-21 11:02:59,012 INFO [huey.consumer:386][Thread-1 (run)] Huey consumer started with 1 thread, PID 76861 at 2023-11-21 10:02:59.012894
2023-11-21 11:02:59,012 INFO [huey:77][MainThread] collective.taskqueue2: consumer thread started.
2023-11-21 11:02:59,013 INFO [huey.consumer:389][Thread-1 (run)] Scheduler runs every 1 second(s).
2023-11-21 11:02:59,013 INFO [huey.consumer:391][Thread-1 (run)] Periodic tasks are enabled.
Starting server in PID 76861.
2023-11-21 11:02:59,014 INFO [huey.consumer:398][Thread-1 (run)] The following commands are available:
+ collective.taskqueue2.huey_tasks.dump_queue_stats
+ collective.taskqueue2.huey_tasks.schedule_browser_view
The example demonstrates a common use case of starting a dedicated browser view asynchronously.
In this scenario, we schedule the browser view /magazine/@@debug-demo-view to
run in the context of the portal object located at context_path.
The function takes the following parameters:
- view_name: The name of the browser view to be executed asynchronously. It is important to specify the view name with a leading @@ symbol.
- context_path: The path of the context object for the view within the Plone portal. It can be obtained by joining the physical path of the context object using "/".join(context.getPhysicalPath()).
- site_path: The path to the root of the Plone portal.
- username: The name of the user under which the view will be executed. Exercise caution when using third-party code that may provide a username with higher privileges.
- params: A Python dictionary of parameters that will be passed to the browser request. These parameters will be available in self.context.request.form within the browser view.
# bin/instance run scripts/huey_client.py
from datetime import datetime

from collective.taskqueue2.huey_tasks import schedule_browser_view

now = datetime.now().isoformat()
schedule_browser_view(
    view_name="debug-demo-view",
    context_path="/magazine",
    site_path="/magazine",
    username="admin",
    params=dict(foo="bar", bar="foo", meaning_of_life=42, now=now),
)
You may wrap the code above in a custom function that derives context_path,
site_path and username from the current calling context, for example
(schedule_demo_view is only an illustrative name):

from datetime import datetime

import plone.api
from collective.taskqueue2.huey_tasks import schedule_browser_view


def schedule_demo_view(context):
    # Derive paths and user from the calling context instead of hard-coding them.
    now = datetime.now().isoformat()
    schedule_browser_view(
        view_name="debug-demo-view",
        context_path="/".join(context.getPhysicalPath()),
        site_path="/".join(plone.api.portal.get().getPhysicalPath()),
        username=plone.api.user.get_current().getId(),
        params=dict(foo="bar", bar="foo", meaning_of_life=42, now=now),
    )
In case you have specific requirements beyond scheduling a browser view, you have the option to create your own Huey tasks. You can refer to the documentation at https://huey.readthedocs.io/en/latest/ for more details on creating custom tasks with Huey.
To use your custom Huey tasks effectively, it is important to register them during the startup phase of Plone and Zope. This ensures that your tasks are properly initialized and available for execution.
Inside your package foo.bar you may provide your own tasks in a file
foo.bar/foo/bar/huey_tasks.py like

from collective.taskqueue2.huey_config import huey_taskqueue


@huey_taskqueue.task()
def my_task(*args, **kw):
    # do something
    pass

and import huey_tasks.py, e.g. inside foo.bar/foo/bar/__init__.py.
Your own application (e.g. as part of an event listener) may then call the task:

from foo.bar.huey_tasks import my_task


def listen_event(event):
    context = event.context
    result = my_task(context=context, foo="bar", bar="42")
Please read the Huey documentation on result handling in case you need to
access the result.
The current implementation of collective.taskqueue2 is intended for internal
environments where you have complete control over your code and dependencies. It
is designed to be used within trusted environments.
Note that a scheduled browser view call is executed as the specified username.
This can introduce a significant security risk if you use collective.taskqueue2
together with third-party code that is not under your control. In such cases, an
attacker could specify a common username such as "admin", which is typically
associated with manager-level rights. It is crucial to exercise caution in these
situations.
Given the importance of security, it is recommended to consider implementing a stronger security mechanism in future versions to address this potential vulnerability.
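Until such a mechanism exists, one possible mitigation in your own code is to check the username against an allow-list before scheduling anything. The sketch below is an assumption-laden illustration, not a feature of the package: the function name schedule_view_for_trusted_user is hypothetical, and the scheduling callable is passed in (standing in for schedule_browser_view) so that the example does not depend on a running Plone instance.

```python
def schedule_view_for_trusted_user(schedule, allowed_usernames, **kwargs):
    """Call ``schedule(**kwargs)`` only if kwargs["username"] is allow-listed.

    ``schedule`` stands in for schedule_browser_view; ``allowed_usernames``
    is a set of user names that your application trusts.
    """
    username = kwargs.get("username")
    if username not in allowed_usernames:
        # Refuse to schedule work for users that are not explicitly trusted.
        raise PermissionError(f"User {username!r} may not schedule views")
    return schedule(**kwargs)
```

An allow-list fails closed: a username that nobody thought about is rejected rather than silently executed with its privileges.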
The package provides a browser view @@taskqueue-stats (on the Plone root) that
returns the current queue status as JSON:
{
"pending": 0,
"scheduled": 0
}
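A monitoring script could evaluate this JSON as sketched below. The helper names parse_queue_stats and queue_is_idle are hypothetical, and the sketch assumes the response contains exactly the two keys shown above; fetching the view over HTTP is left out.

```python
import json


def parse_queue_stats(payload):
    """Parse the JSON body of @@taskqueue-stats into (pending, scheduled)."""
    data = json.loads(payload)
    return int(data["pending"]), int(data["scheduled"])


def queue_is_idle(payload):
    """Return True when no task is pending or scheduled."""
    pending, scheduled = parse_queue_stats(payload)
    return pending == 0 and scheduled == 0
```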
The package uses the standard Plone/Zope logger.
Typically, the log information is stored in the var/log/instance.log file or a
related file if using a ZEO setup.
In a ZEO setup, you need to determine which ZEO client(s) will serve as task
queue consumers. This is done by setting HUEY_CONSUMER=1 in the environment of
the relevant ZEO client(s).
Additionally, HUEY_TASKQUEUE_URL must be configured for all ZEO clients that
will add tasks to the task queue. It's important to ensure that all ZEO clients
point to the same storage backend.
It is possible to have multiple consumers, where each consumer is responsible
for executing a specific task. Having multiple consumers can be beneficial in
certain situations. However, conflict errors can occur with ZEO clients, just
like with any other ZEO setup, and collective.taskqueue2 does not provide any
special support for handling conflict errors.
To do:
- make consumer configuration configurable in huey_consumer.py
Andreas Jung (info@zopyx.com) for Università di Bologna/University of Bologna.
collective.taskqueue2 was developed as a component of a Plone 6 migration
project for Università di Bologna and has been made available as an open-source
solution.
- Issue Tracker: https://github.com/collective/collective.taskqueue2/issues
- Source Code: https://github.com/collective/collective.taskqueue2
- Documentation: https://docs.plone.org/foo/bar
The project is licensed under the GPLv2.