A while ago I had the pleasure of working on the migration of a big monolithic system to a more fail-safe architecture. One of the important parts of the system was the set of server-side jobs scheduled with Cron. In total there were around 100 commands running at different frequencies. The problem was that the whole system, including the Cron jobs, was now supposed to run on two load-balanced servers, so we had to find a way to avoid duplicating the server-side tasks in the new architecture. In this article I will present the approach we took. I am not saying it is the best one, but it is one of the options that exist, and for us it meets all the requirements. It is also something we haven’t found documented anywhere, so I believe it is worth sharing.
Once I encountered the problem described above I started Googling around and found a few out-of-the-box solutions:
- Just run the jobs on one server — yeah, we could but… no
- Dkron — that is actually a good option, but at the time we were working on the migration it was a bit of an overkill for us. We only had some shell scripts, nothing else; integrating another system into our platform was not something we really wanted to do.
- Apache Mesos / Airbnb Chronos — lots of configuration and changes to the architecture, definitely too much for what we needed.
And that is basically it. We didn’t find anything that met our needs, so we decided to build it from scratch. In our case the system is a PHP application based on the Yii framework. It is hosted on AWS and uses Amazon Aurora as its database. All the nodes talk to the same database, so we knew we could store the job schedule there. Most of the Cron jobs are Yii commands; some are shell scripts. The described solution uses an AWS Lambda function, but it could just as easily be implemented with Azure Functions or Google Cloud Functions.
The idea was simple: develop a Lambda function that fires every minute and calls an internal AWS load balancer. Each of the servers listens on a selected port for HTTP requests, and once a request is received it fires the scheduler command (a Yii command) that checks which tasks should be executed in the current minute and runs them in separate processes. Initially the idea was simply to hit a password-protected endpoint in our existing application, but that would lead to problems, to name just two:
- The process would be running under php-fpm, so the PHP configuration would differ from the console one
- The endpoint would be public (even if password protected, it sounds like a terrible idea)
In the end we decided to go for a separate microservice that triggers the scheduler on the command line. All the HTTP calls are made internally within the Amazon VPC and the ports used are not publicly open, so the solution is fairly safe.
For creating Lambda functions we use the Serverless framework, which I strongly recommend, though it is entirely optional. The full Lambda function looks like this:
It is literally a “hello world” application when it comes to HTTP requests in Node.js. The function is scheduled using the following Serverless configuration:
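A `serverless.yml` fragment along these lines does the job (the function name, handler path, and the subnet/security-group IDs below are placeholders):

```yaml
functions:
  cronTrigger:
    handler: handler.handler
    vpc:
      securityGroupIds:
        - sg-00000000
      subnetIds:
        - subnet-00000000
    events:
      - schedule: rate(1 minute)
```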
As this Lambda function should run inside our VPC, we had to configure the subnets and security groups. The rest of the configuration is quite straightforward. The most important part is
schedule: rate(1 minute), which tells AWS how often the function should be fired.
As mentioned before, every server hosting our platform will now have a Node.js service running and listening for HTTP POST requests from the Lambda function. The implementation is based on the Express framework and simply spawns a new process running our Yii command, which checks for scheduler tasks to execute. The following code snippet shows the crucial parts of the implementation.
Configuration saved in the JSON file (../config.json) contains a few important details:
- UID and GID to set the process user and group identity. That is helpful if you want to fire the process under different permissions than the microservice itself. Note that in order to switch users, the microservice must have the appropriate permissions, so it would, for example, have to run as root
- List of commands to run, each with the command and its parameters kept separate (a format dictated by the child_process.spawn API, which takes the command and its arguments as separate values)
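Putting those details together, the file might look like this (the values and the command name are illustrative, not our actual configuration):

```json
{
  "uid": 1000,
  "gid": 1000,
  "commands": [
    { "command": "php", "args": ["yii", "scheduler/run"] }
  ]
}
```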
Each of the servers (in our case only two) runs a copy of this Node application. We have set it up to run one command, but it can just as well be configured to run many at a time.
As mentioned, the system is written using the Yii framework, but this approach can be implemented with any existing PHP framework. In fact, the idea of the scheduler was borrowed from the Laravel Scheduler. The idea of the scheduler is very simple:
- Store the tasks in the database, each containing the command to run and the Cron expression defining when it should run.
- Write a command that loops through all the tasks and executes the ones scheduled for the current minute.
To implement the scheduler we used the dragonmantank/cron-expression package. Below you can see example code executing the tasks. Note that the code is only a concept, not a full implementation, so it is not tied to any framework.
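A conceptual sketch along those lines, with a hard-coded task list standing in for the database rows (the command names are illustrative):

```php
<?php
// Conceptual sketch of the scheduler command: loop over the stored tasks
// and start the ones whose Cron expression matches the current minute.
require __DIR__ . '/vendor/autoload.php';

use Cron\CronExpression;

// Illustrative tasks; in the real system each row comes from the database.
$tasks = [
    ['command' => 'php yii email/send-digest', 'expression' => '*/5 * * * *'],
    ['command' => 'php yii cache/flush-all',   'expression' => '0 3 * * *'],
];

foreach ($tasks as $task) {
    $cron = new CronExpression($task['expression']);
    if ($cron->isDue()) {
        // Background the process so one slow task does not delay the others.
        exec(sprintf('%s > /dev/null 2>&1 &', $task['command']));
    }
}
```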
Our full implementation is a bit more robust: it includes task caching, queuing (via RabbitMQ), saving task statistics such as execution time, and more, but that is not the subject of this article.
Even though the solution uses three separate parts (the Lambda function, the triggering service, and the PHP command executing the tasks), the implementation is very simple. The essence of the code is fewer than 100 lines and it works just fine. You could probably simplify it further and get rid of the PHP part, but as we already had an application dashboard built with the Yii framework, this was the obvious way for us.
I’m sure this solution has some drawbacks, but for small projects that you still want to run on multiple servers it will do the trick.