Let’s assume we want to download a file (or run some other task) every 5 seconds, with the condition that the same task never runs twice at the same moment, even if one run takes longer than 5 seconds. For example, suppose a download takes 8 seconds. Because it overran the 5-second interval, the next download should not wait for the next 5-second slot (the 10-second mark, 2 seconds later), but should start immediately.
So, the traditional cronjob/lock file combination was not suitable for my case.
I chose Python, with python-daemon and the standard library’s sched module.
To install python-daemon, run:
sudo pip install python-daemon
The code below should be self-explanatory, but here is a short walkthrough. We start the main() function in the daemon context. After 1 second, run_scheduled() is called for the first time. It calls our “download” function, get_file, which takes a random time to finish (we set it to a random value between 1 and 10 seconds). Inside run_scheduled, once the download has finished, we schedule the next run depending on how long the download took, calling the same function again:
scheduler.enter(restart, 1, run_scheduled, ('start again...',))
where restart can be between 0 and 4: either the download took 1 to 4 seconds, so we have to wait 4 to 1 seconds, or it took 5 seconds or more, in which case the next download is scheduled immediately.
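To make the delay rule concrete, here is a minimal sketch of the calculation on its own. The helper name next_delay is hypothetical (it does not appear in the script); it assumes, as the script does, that the duration is truncated to whole seconds.

```python
def next_delay(duration, interval=5):
    """Return how many seconds to wait before the next run.

    Hypothetical helper for illustration: the wait is whatever is left
    of the interval, and never negative when the task overran it.
    """
    return max(0, interval - int(duration))


# A few sample durations, in seconds, and the resulting wait.
for took in (1, 4, 5, 8):
    print('took {0}s -> wait {1}s'.format(took, next_delay(took)))
```

A download that takes 1 second leaves 4 seconds to wait, while one that takes 5 seconds or more leaves nothing, so the next run starts right away.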
    import daemon
    import sys
    import sched
    import time
    import random
    import logging
    import logging.handlers
    from datetime import datetime as dt

    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
    fh = logging.FileHandler('scheduler.log')
    logger.addHandler(fh)


    def now_str():
        return dt.now().time().strftime("%H:%M:%S")


    def main():
        delay = 5

        def get_file():
            # this can take a random time, let's say between 1 and 10 seconds
            download_time = random.randint(1, 10)
            time.sleep(download_time)
            return download_time

        def run_scheduled(message):
            logging.warning('RUNNING: {0} {1}'.format(now_str(), message))
            start = time.time()
            download_time = get_file()
            end = time.time()
            duration = int(end - start)
            # wait for the rest of the interval, or not at all if we overran it
            restart = delay - duration
            if restart < 0:
                restart = 0
            logging.warning('get_file() took {0} seconds; reschedule in {1} seconds'.format(download_time, restart))
            scheduler.enter(restart, 1, run_scheduled, ('start again...',))

        # Build a scheduler object that uses the wall clock and time.sleep
        scheduler = sched.scheduler(time.time, time.sleep)
        logging.warning('START: {0}'.format(now_str()))
        # start in 1 second
        scheduler.enter(1, 1, run_scheduled, ('First run',))
        scheduler.run()


    if __name__ == '__main__':
        if "-f" in sys.argv:
            # -f: run main() directly, without the daemon context
            main()
        else:
            # keep the log file open across daemonization
            context = daemon.DaemonContext(files_preserve=[fh.stream])
            with context:
                main()
Of course, there are other ways to achieve the same result, such as threads or a simple while True… loop with a sleep between iterations 🙂
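For comparison, here is a rough sketch of the plain loop variant. The function name run_every and its parameters (iterations, clock, sleep) are my own assumptions for illustration, not part of the script above; the clock and sleep functions are passed in only so the loop can be tried out without really waiting.

```python
import time


def run_every(task, interval=5, iterations=None,
              clock=time.monotonic, sleep=time.sleep):
    """Run task repeatedly, one run at a time, at most once per interval.

    Hypothetical helper: iterations=None loops forever, a number stops
    after that many runs.
    """
    count = 0
    while iterations is None or count < iterations:
        start = clock()
        task()
        elapsed = clock() - start
        # Sleep for the rest of the interval; if the task overran it,
        # the wait is zero and the next run starts immediately.
        sleep(max(0.0, interval - elapsed))
        count += 1
```

Because the task runs inside the loop itself, two runs can never overlap, which is the same guarantee the sched version gives. Calling run_every(get_file) with a download function of your own would loop forever at the default 5-second interval.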