[Unpopular opinion] Don’t use poetry for Python dependencies

Poetry has gained popularity for Python project management in recent years, and I have used it in a few projects myself. Still, like any tool, it may not be the right fit for every project or team. Here are a few arguments that could be made against using Poetry:

  • Learning Curve: For teams or individuals already familiar with other Python packaging and dependency management tools like pip and virtualenv, or even pipenv, adopting Poetry introduces a learning curve. Understanding Poetry’s way of managing dependencies, environments, and packages takes time and effort, and usually involves a number of mistakes along the way.
  • Overhead for Simple Projects: Poetry provides a comprehensive solution that might be overkill for very simple projects. For small scripts or applications with minimal dependencies, the overhead of managing a Poetry environment may not be justified. Do you have an API with 5-20 dependencies? Then don’t use Poetry. It doesn’t make any sense.
  • Performance Concerns: Poetry’s dependency resolution process can be slower than some alternatives, particularly for projects with a large number of dependencies. This can slow down continuous integration builds and make development workflows less responsive. Personally, I have had situations in which adding a new package and rebuilding the lock file took more than an hour.
  • Migration Effort for Existing Projects: Migrating an existing project to Poetry from another system can require a non-trivial effort. This includes not only technical changes to how dependencies are managed and packaged but also updates to any related documentation, developer guides, and CI/CD pipelines. I faced this challenge, and it took a long time to migrate all the projects to Poetry and get it right. In the meantime, we also had to maintain two systems for the same projects.

While poetry may offer some advantages (and the Internet is full of arguments pro-poetry), weighing these potential drawbacks can help determine if it’s the right choice for your situation.

Upload asynchronously to Amazon S3 using Tornado

TornadoWeb is a great non-blocking web server written in Python, and Boto3 is the Amazon Web Services (AWS) SDK for Python, which makes it very easy to write software that uses Amazon services like S3. Unfortunately, the boto3 S3 wrapper is blocking: if you use it out of the box in a Tornado application, it will block the main thread because it relies on a synchronous HTTP client.
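One common workaround for a blocking client (not necessarily the approach taken in the full post) is to push the blocking call onto a worker thread so the event loop stays free. The sketch below is only an illustration under assumptions: `blocking_upload` is a hypothetical stand-in that sleeps instead of calling boto3, and `upload_async`, `upload_many` and `run_demo` are names invented for this example. Since Tornado 5 runs on top of the asyncio event loop, the same `run_in_executor` pattern should be usable from a Tornado coroutine.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_upload(key: str) -> str:
    # Hypothetical stand-in for a blocking boto3 call such as
    # s3_client.upload_file(filename, bucket, key); here it only sleeps.
    time.sleep(0.2)
    return f"uploaded {key}"


async def upload_async(executor: ThreadPoolExecutor, key: str) -> str:
    loop = asyncio.get_running_loop()
    # The blocking function runs in a worker thread, so the event loop
    # can keep serving other requests while the upload is in progress.
    return await loop.run_in_executor(executor, blocking_upload, key)


async def upload_many(keys):
    with ThreadPoolExecutor(max_workers=4) as executor:
        # gather() returns results in the same order as the inputs.
        return await asyncio.gather(*(upload_async(executor, k) for k in keys))


def run_demo():
    return asyncio.run(upload_many(["a.txt", "b.txt", "c.txt"]))
```

The thread pool size caps how many uploads run at once; the value 4 is arbitrary and would need tuning for a real workload.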

Monitor an error log with python and RabbitMQ

Nowadays there are many professional solutions for monitoring your application for errors. Some web frameworks even have built-in tools or support plugins that catch exceptions and act accordingly.

Still, I just wanted to build a simple proof of concept that monitors the web server’s error log and sends out an email whenever an event occurs and the file changes. To monitor the log file I used the pyinotify Python module. It is implemented on top of inotify and offers an easy interface for reacting to filesystem changes.
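pyinotify is Linux-only and event-driven, so the sketch below deliberately swaps in plain polling from the standard library to illustrate the core step: detect lines appended to a log file and hand them to a callback. It is not the pyinotify-based implementation from the post. `LogWatcher`, `check_and_notify` and the `notify` callback are hypothetical names, and sending the actual email is left to the caller.

```python
import os


class LogWatcher:
    """Reports complete lines appended to a file since the previous poll."""

    def __init__(self, path: str):
        self.path = path
        # Start at the current end of the file so only new entries are reported.
        self.offset = os.path.getsize(path) if os.path.exists(path) else 0

    def poll(self):
        try:
            size = os.path.getsize(self.path)
        except OSError:
            return []  # file is missing right now
        if size < self.offset:
            # The file was truncated or rotated; read from the beginning.
            self.offset = 0
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            data = fh.read()
        # Keep only finished lines; a partial last line is picked up later.
        complete = data[: data.rfind(b"\n") + 1]
        self.offset += len(complete)
        return complete.decode("utf-8", errors="replace").splitlines()


def check_and_notify(watcher: LogWatcher, notify) -> int:
    """Calls notify(lines) when new lines exist; returns how many were found."""
    lines = watcher.poll()
    if lines:
        notify(lines)  # for example, send an email (left to the caller)
    return len(lines)
```

In a real script `check_and_notify` would be called in a loop with a short sleep; an inotify-based watcher avoids that loop by reacting to kernel events instead.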

How to use python sched in a daemon process

Let’s assume we want to download a file (or do some other task) every 5 seconds, with the condition that the same task never runs twice or more at the same moment, even if it takes longer than 5 seconds. For example, suppose we have to download a file and this takes 8 seconds. When the task overruns the 5-second interval like this (by 3 seconds here), the next run should not wait for the following iteration but should start immediately.

So, the traditional cronjob/lock file combination was not suitable for my case.
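The timing rule described above can be sketched with the standard library’s `sched` module. This is a minimal illustration, not the implementation from the full post: `next_delay` and `run_periodically` are hypothetical names, and the `runs` limit exists only so the example terminates (a daemon would reschedule indefinitely). Because one scheduler executes events one after another, two runs never overlap.

```python
import sched
import time


def next_delay(interval: float, elapsed: float) -> float:
    """Seconds to wait before the next run; zero if the task overran."""
    return max(0.0, interval - elapsed)


def run_periodically(task, interval, runs,
                     timefunc=time.monotonic, delayfunc=time.sleep):
    """Runs task `runs` times and returns the time at which each run started."""
    scheduler = sched.scheduler(timefunc, delayfunc)
    start_times = []

    def step(remaining):
        started = timefunc()
        start_times.append(started)
        task()
        if remaining > 1:
            elapsed = timefunc() - started
            # Reschedule relative to the end of this run: wait out the rest
            # of the interval, or start immediately after an overrun.
            scheduler.enter(next_delay(interval, elapsed), 1,
                            step, (remaining - 1,))

    scheduler.enter(0, 1, step, (runs,))
    scheduler.run()  # blocks until no events are left
    return start_times
```

Passing `timefunc` and `delayfunc` through to `sched.scheduler` makes it possible to drive the loop with a fake clock in tests instead of really sleeping.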
