How to use Python sched in a daemon process

Let’s assume we want to download a file (or run some other task) every 5 seconds, with the condition that the same task never runs twice at the same moment, even if a run takes more than 5 seconds. For example, suppose a download takes 8 seconds. It has then already overrun the interval by 3 seconds, so the next download should start immediately rather than wait for the next 5-second tick.

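So, the traditional cronjob/lock file combination was not suitable for my case: a lock file prevents overlapping runs, but a skipped run then has to wait for the next cron tick, and standard cron cannot schedule anything more often than once a minute.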

I chose Python, with python-daemon and the standard library's sched module.

python-daemon can be installed with pip:

sudo pip install python-daemon

The code below should be self-explanatory, but here is a short walkthrough. We start the main() function in the daemon context. After 1 second, run_scheduled() is called for the first time. It calls our “download” function, get_file(), which takes a random time to finish (we set it to between 1 and 10 seconds). After each run, run_scheduled() schedules its own next run, with a delay that depends on how long the download took:

scheduler.enter(restart, 1, run_scheduled, ('start again...',))

where restart is between 0 and 4: if the download took 1 to 4 seconds, we wait the remaining 4 to 1 seconds; if it took 5 seconds or more, the next download is scheduled immediately.
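The rule above can be written as a tiny helper. This is only a minimal sketch: next_delay is a hypothetical name used for illustration, and the full script below computes the same value inline.

```python
def next_delay(delay, duration):
    """Return how many seconds to wait before the next run (never negative).

    delay    -- the desired interval between runs, in seconds
    duration -- how long the last run took, in seconds
    """
    # Truncate to whole seconds, as the script does, and clamp at zero
    return max(0, delay - int(duration))


# Show the wait for a few download durations with a 5-second interval
for took in (1, 4, 5, 8):
    print('took {0}s -> wait {1}s'.format(took, next_delay(5, took)))
```

With a 5-second interval, a 1-second download leaves 4 seconds to wait, while a download of 5 seconds or more leaves nothing to wait.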

import daemon
import sys
import sched
import time
import random
import logging
import logging.handlers
from datetime import datetime as dt
 
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
fh = logging.FileHandler('scheduler.log')
logger.addHandler(fh)
 
def now_str():
    return dt.now().time().strftime("%H:%M:%S")
 
def main():
    delay = 5
 
    def get_file():
        # this can take random time, let's say between 1 and 10 seconds
        download_time = random.randint(1, 10)
        time.sleep(download_time)
        return download_time
 
    def run_scheduled(message):
        logging.warning('RUNNING: {0} {1}'.format(now_str(), message))
        start = time.time()
        download_time = get_file()
        end = time.time()
        duration = int(end - start)
        restart = delay - duration
        if restart < 0:
            restart = 0
 
        logging.warning('get_file() took {0} seconds; reschedule in {1} seconds'.format(download_time, restart))
        scheduler.enter(restart, 1, run_scheduled, ('start again...',))
 
    # Build a scheduler driven by wall-clock time; enter() takes delays relative to now
    scheduler = sched.scheduler(time.time, time.sleep)
    logging.warning('START: {0}'.format(now_str()))
    # start in 1 second
    scheduler.enter(1, 1, run_scheduled, ('First run',))
    scheduler.run()
 
if __name__ == '__main__':
    if "-f" in sys.argv:
        main()
    else:
        # keep the log file open after daemonizing
        context = daemon.DaemonContext(files_preserve=[fh.stream])
        # the with statement opens the context on entry and closes it on exit
        with context:
            main()

Of course, there are other ways to achieve the same result, such as threads or a simple while True loop with a sleep between iterations 🙂
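For comparison, the plain loop variant could look roughly like this. It is a sketch under assumptions, not a drop-in replacement: run_forever and its sleep, clock and max_runs parameters are hypothetical names, added here so the loop can be bounded and tried without real waiting.

```python
import time


def run_forever(task, delay=5, sleep=time.sleep, clock=time.monotonic, max_runs=None):
    """Call task() repeatedly, with runs starting at least `delay` seconds apart.

    Runs never overlap because the loop is sequential. If a run takes longer
    than `delay`, the next one starts immediately.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        start = clock()
        task()
        elapsed = clock() - start
        # Sleep only for what is left of the interval (zero if we overran it)
        sleep(max(0, delay - elapsed))
        runs += 1
```

Calling run_forever(get_file) with the defaults loops until the process is stopped. Passing max_runs limits the number of iterations.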
