Tag Archives: aws

Upload asynchronously to Amazon S3 using Tornado

Tornado is a great non-blocking web server written in Python, and Boto3 is the Amazon Web Services (AWS) SDK for Python, which makes it easy to write software that uses Amazon services like S3. Unfortunately, the boto3 S3 wrapper is blocking: it uses a synchronous HTTP client, so if you use it out of the box in a Tornado application it will block the main thread.

The solution is to use Tornado’s AsyncHTTPClient and manually do all the work that boto does for you under the hood.

I built a small replacement for the boto3 upload and delete methods (the only ones I need for the moment), which uses Tornado’s AsyncHTTPClient, and I published the code on GitHub.

The main idea behind this replacement is to use botocore to build the request (AWS requires requests to be signed with different algorithms depending on the AWS region and the request data) and to use AsyncHTTPClient only for the actual asynchronous call.

In order to use the S3AsyncManager you need to define an AWS profile in your AWS config file, together with the matching entry in the credentials file.

You can obtain your access_key and secret_access_key from your AWS account.
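For reference, the conventional AWS layout of these two files can be sketched as follows. The profile name `s3-uploader`, the region and the key values are placeholders, and the `load_profile` helper is hypothetical; it is only there to show which section and key names are expected.

```python
import configparser

# Conventional layout of ~/.aws/config (profile name and region are placeholders).
CONFIG_TEXT = """\
[profile s3-uploader]
region = eu-west-1
"""

# Conventional layout of ~/.aws/credentials (placeholder values, not real keys).
CREDENTIALS_TEXT = """\
[s3-uploader]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
"""


def load_profile(name, config_text, credentials_text):
    """Read the region and keys of one profile from the two INI-style texts."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    credentials = configparser.ConfigParser()
    credentials.read_string(credentials_text)
    return {
        # The config file prefixes named profiles with "profile ".
        "region": config[f"profile {name}"]["region"],
        # The credentials file uses the bare profile name.
        "access_key": credentials[name]["aws_access_key_id"],
        "secret_key": credentials[name]["aws_secret_access_key"],
    }
```

Note the asymmetry: the section is `[profile s3-uploader]` in the config file but just `[s3-uploader]` in the credentials file.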

To install the S3AsyncManager system-wide (or in your virtualenv), clone the repository and install it with pip.

Afterwards, you can use it as shown in the example folder.

Create a Cassandra cluster with OpsCenter on Amazon EC2

Today I played a little with Cassandra on Amazon EC2. Deploying a cluster with 2 nodes in one region using DataStax OpsCenter was a very user-friendly and pleasant experience.

First I started an m1.small instance in Amazon EC2, where I installed OpsCenter. For this I chose the official CentOS 6 AMI. Before installing OpsCenter, we need to configure the firewall so that we can access it. In the AWS console, under Security Groups, there is “CentOS 6 -x86_64- – with Updates-6 – 2014-09-29-AutogenByAWSMP-”. We need to right-click on it and choose Edit inbound rules. Here we add a new Custom TCP Rule with port 8888 and the Source set to My IP.

I also noticed that the instance runs its own iptables firewall, in which port 8888 is not open, so I had to open the port on the instance as well.

Now we can install OpsCenter. All you need to do is follow the installation guide for the RPM package from DataStax:

1. Edit the file:

2. Add the repository for OpsCenter

3. Install and start OpsCenter

After the installation has finished and the service has started, open http://<YOUR_INSTANCE_IP>:8888 in your browser and you will see this nice screen.

Welcome to DataStax OpsCenter
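If the page does not load, it can help to check whether port 8888 is reachable at all, since both the security group and iptables have to allow it. The following is a small sketch using only the Python standard library; the helper name `port_is_open` is hypothetical and the host is whatever IP your instance has.

```python
import socket


def port_is_open(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from your own machine, `port_is_open("<YOUR_INSTANCE_IP>")` returning False suggests that one of the two firewalls is still blocking the port or that the OpsCenter service is not running.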

From now on, it is pretty easy to set up a cluster with multiple nodes.

Just click “Create Brand New Cluster” and follow the steps.

You will need to add some information as in the image below:

create-cluster

  • The cluster Name
  • Your DataStax Credentials. If you do not know what these are, go to the DataStax Registration page, fill in your details and click “Download Now”. Don’t worry, nothing will be downloaded, but you will get an email with your username and password. These are the credentials you need to enter in the OpsCenter form.
  • The total number of nodes to be created (and installed with Cassandra). Be aware that the instance where OpsCenter is running is not counted. I created 2 nodes initially and added another one later on.
  • The Amazon EC2 Credentials – these are needed because OpsCenter will launch the instances for you. You need only to select the Availability Zone and the Size of the instances.

The job is almost done. Now you need to click Build Cluster and wait while all the necessary software is installed.

cassandra-installing

After a few minutes, you will have a Cassandra cluster with 2 nodes.

In the next tutorial I will describe how to add an extra node through OpsCenter to the current cluster.

Good luck!