prototype
design
architecture
The big picture
Tasks are stored in MySQL. A worker (the scheduler) periodically checks the schedules and creates, updates, or removes tasks in a live queue backed by redis, rabbitmq, nats, etc. One or more clients consume the queue, and each task is dispatched to the proper client via a bidirectional gRPC stream. If the task finishes, the client updates the result/status via a unary gRPC call; if not, the task is re-queued.
+---------------+                   +-------------------------------------+
| task in MySQL | ----------------> |              scheduler              |
+---------------+                   |        query DB every minute        |
        |                           | create/update/remove tasks in queue |
        |                           +-------------------------------------+
        |                                              |
        |                                             / \
        |                              +------------------------------+
        |                              |          task queue          | <----+
        |                              | (redis, nats, rabbitmq, etc) |      |
        |                              +------------------------------+      |
        |                                              |                     |
        |                                             / \                    |
        |                              +------------------------------+      |
        |                              |        gRPC (stream)         |      |
        |                              |        dispatch tasks        |      |
        |                              +------------------------------+      |
        |                                             \ /                    |
        |                                              |                     |
        |                              +------------------------------+      |
        |                              |    gRPC (stream - oneof)     |      |
        |                              |         (do X tasks)         |      |
        |                              +------------------------------+      |
       / \                                             |                     |
+----------------------+               +------------------------------+      |
| gRPC (update result) | -- YES --<    |          tasks Done          | >-- NO
+----------------------+               +------------------------------+
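The flow in the diagram can be approximated in memory. This is a minimal sketch, not marabunta's implementation: a `deque` stands in for the live queue, plain callables stand in for the gRPC clients, and `run_pipeline` is a hypothetical name. Capping re-queues with the `retries`/`retried` fields is an assumption based on the task schema.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

# A client takes a task and returns True when the task finished.
Client = Callable[[dict], bool]


def run_pipeline(tasks: List[dict], clients: Dict[str, Client]) -> Dict[str, str]:
    """Queue every task, dispatch it to its target, re-queue it on failure."""
    queue: Deque[dict] = deque()     # stand-in for redis/nats/rabbitmq
    results: Dict[str, str] = {}     # stand-in for the unary "update result" call

    # Scheduler step: move stored tasks into the live queue.
    for task in tasks:
        task["state"] = "queued"
        task.setdefault("retried", 0)
        queue.append(task)

    # Dispatcher step: hand each task to the client registered for its target.
    while queue:
        task = queue.popleft()
        task["state"] = "running"
        if clients[task["target"]](task):
            task["state"] = "done"
        elif task["retried"] < task.get("retries", 0):
            # "NO" branch: the task goes back to the queue (assumed retry cap).
            task["retried"] += 1
            task["state"] = "queued"
            queue.append(task)
            continue
        else:
            task["state"] = "error"
        # Record the final state of the task.
        results[task["uuid"]] = task["state"]
    return results
```

A client that fails once and then succeeds ends in `done`; a client that always fails ends in `error` once its retries are used up.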
marabunta should scale out of the box:
+-------+.
| MySQL | |  \
+-------+ |   \    +-----------+.      +-----------+
 `--------`     -- | marabunta | | --< | N-clients |
+-------+.    /    +-----------+ |     +-----------+
| redis | |  /      `------------`
+-------+ |
 `--------`
task fields (schema)
---
cdate: creation date
mdate: modified date
name: task name
payload: payload (could be reused for multiple targets)
retries: int
retried: int
when: when to run the task (cron)
sdate: state date
state: created, queued, running, done, error
target: where to run the task
uuid: unique id for the task
type: ondemand or scheduled
description: task description
---
messages: result, output, etc. (key: tuuid)
---
payload: json file
state
1: todo
2: queued
3: running
4: done
5: error
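As a rough illustration, the fields and numeric states above could map onto a small data model. This is a sketch under assumptions: the `ALLOWED` transition table and the `move_to` helper are hypothetical, since the document lists the states but not which moves between them are legal.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum
import uuid as uuidlib


class State(IntEnum):
    TODO = 1
    QUEUED = 2
    RUNNING = 3
    DONE = 4
    ERROR = 5


# Assumed transitions (not specified in the design notes).
ALLOWED = {
    State.TODO: {State.QUEUED},
    State.QUEUED: {State.RUNNING},
    State.RUNNING: {State.DONE, State.ERROR, State.QUEUED},  # QUEUED = re-queue
    State.DONE: set(),
    State.ERROR: set(),
}


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Task:
    name: str
    target: str
    when: str = ""            # cron expression for scheduled tasks
    payload: str = ""         # payload uuid
    description: str = ""
    retries: int = 0
    retried: int = 0
    state: State = State.TODO
    uuid: str = field(default_factory=lambda: str(uuidlib.uuid4()))
    cdate: datetime = field(default_factory=_now)
    mdate: datetime = field(default_factory=_now)
    sdate: datetime = field(default_factory=_now)

    def move_to(self, new_state: State) -> None:
        """Change state and refresh sdate/mdate, rejecting unlisted moves."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.sdate = self.mdate = _now()
```

Using an `IntEnum` keeps the stored value an integer, matching the numbered list, while the code refers to states by name.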
type
Based on the type of the task, the scheduler needs to know how to handle it. An on-demand task could simply be added to the database to keep track of its status, but it should be dispatched as fast as possible.
1: ondemand
2: scheduled
Ondemand: handled on a first come, first served (FCFS) basis; normally not persistent, bypassing the database and going directly to the dispatcher.
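One way to picture the routing by type is the sketch below. It is only an illustration: the `Scheduler` class, its `submit` method, and the `track` flag are hypothetical names, a dict stands in for the MySQL table, and a `deque` stands in for the dispatcher queue.

```python
from collections import deque
from typing import Deque, Dict

ONDEMAND, SCHEDULED = 1, 2


class Scheduler:
    def __init__(self) -> None:
        self.db: Dict[str, dict] = {}          # stand-in for the MySQL table
        self.dispatch: Deque[dict] = deque()   # FCFS queue feeding the dispatcher

    def submit(self, task: dict, track: bool = False) -> None:
        """Route a task according to its type."""
        if task["type"] == ONDEMAND:
            # On-demand tasks skip the database unless tracking is requested.
            if track:
                self.db[task["uuid"]] = task
            self.dispatch.append(task)
        elif task["type"] == SCHEDULED:
            # Scheduled tasks are persisted and queued later from their schedule.
            self.db[task["uuid"]] = task
        else:
            raise ValueError(f"unknown task type: {task['type']!r}")
```

Appending to the right and consuming from the left of the `deque` gives the first come, first served order for on-demand tasks.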
Scheduled
Persistent tasks for periodic work. The format for scheduling could be:
- cron format
- every (from creation time repeat every X seconds/minutes/duration)
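For the "every" format, the next run time can be derived from the creation time alone. The sketch below assumes runs fall on the grid `created + k * every` for k >= 1; `next_run` is a hypothetical helper name. The cron format is not covered here because the Python standard library has no cron parser.

```python
from datetime import datetime, timedelta


def next_run(created: datetime, every: timedelta, now: datetime) -> datetime:
    """Return the first run time strictly after `now` on the grid
    created + k * every, with k >= 1."""
    if every <= timedelta(0):
        raise ValueError("interval must be positive")
    if now < created:
        return created + every
    # timedelta // timedelta yields the number of whole intervals elapsed.
    k = (now - created) // every + 1
    return created + k * every
```

Computing from `cdate` rather than from the last run avoids drift when a run starts late.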
payload
The payload could be unique to a task or shared among others.
uuid: unique id
data: longtext
messages
In some cases it may be useful to store the result of the task, either at the end or at every stage:
tuuid: task uuid
message: longtext
cdate: creation date
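The payload and messages tables could be prototyped as follows. This is a sketch with assumptions: SQLite (in memory) stands in for MySQL, so `longtext` becomes `TEXT`; the reduced `task` table, the foreign keys, and the helper names `open_store`, `add_message`, and `messages_for` are hypothetical.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE payload  (uuid TEXT PRIMARY KEY, data TEXT NOT NULL);
CREATE TABLE task     (uuid TEXT PRIMARY KEY,
                       name TEXT,
                       payload TEXT REFERENCES payload(uuid));
CREATE TABLE messages (tuuid TEXT NOT NULL REFERENCES task(uuid),
                       message TEXT NOT NULL,
                       cdate TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_message(conn: sqlite3.Connection, tuuid: str, message: str) -> None:
    """Append one message (a stage result or final output) for a task."""
    conn.execute(
        "INSERT INTO messages (tuuid, message) VALUES (?, ?)", (tuuid, message)
    )


def messages_for(conn: sqlite3.Connection, tuuid: str) -> List[str]:
    """Return the messages of a task in insertion order."""
    rows = conn.execute(
        "SELECT message FROM messages WHERE tuuid = ? ORDER BY rowid", (tuuid,)
    )
    return [row[0] for row in rows]
```

Because tasks reference a payload by uuid, several tasks can point at the same payload row, and a task can accumulate one message per stage.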