You wouldn’t believe how often business owners ask themselves, “Would our infrastructure maintenance be cheaper if we moved to Azure?” This question will never have a simple answer, but we’ve decided to show you how it plays out in reality. We will showcase one infrastructure replicated on both Azure and AWS, with a cost comparison of the two.
NOTE: Since the two clouds offer different tools, the implementation schemes will differ, even though the underlying infrastructure is the same.
For our example we chose an application that lets you upload a monochrome photo and receive a colorized version in return. You send a file, the system processes the photo, and the result is saved to storage.
Let’s see how this one will be implemented on AWS and Azure.
There are several approaches to implementing the desired infrastructure in the cloud. Let’s compare a few of them to weigh their strengths and weaknesses.
Serverless is the most modern approach to running an application. Let’s break it down into pros and cons.
Pros:
Cons:
This is the most straightforward approach, yet also the most popular in the IT world. Why? It all boils down to a single advantage:
Pros:
Cons:
This container-based approach is somewhat dated but still supported by most cloud providers. These services were the first iteration of container orchestration; more modern options now exist.
Pros:
Cons:
It’s the most popular vendor-agnostic, high-load, and highly elastic container orchestration solution.
Pros:
Cons:
To make the application highly available and elastic, we decided to extend the original architecture with a message queue service. This can be a managed service available on each cloud platform, such as AWS SQS or Azure Service Bus, or a self-hosted implementation like ActiveMQ, RabbitMQ, or Kafka. The queue helps decouple the application and makes it scalable. We could implement this logic on the API side, but that would only make it more complex and less predictable. With a message queue between the API and the backend, we can scale the number of backends based on the number of messages waiting to be processed. All other components stay as they are in the original diagram. In our example, the application consists of microservices that run on Linux or Windows, and the Kubernetes cluster consists of two nodes of each type.
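The queue-driven scaling decision described above can be sketched in a few lines of Python. This is an illustrative model rather than production code: `msgs_per_worker` and the worker bounds are assumed tuning parameters, not values from any real deployment.

```python
import math

def desired_workers(queue_depth: int, msgs_per_worker: int = 10,
                    min_workers: int = 1, max_workers: int = 20) -> int:
    """Scale the backend by the number of messages waiting in the queue.

    msgs_per_worker is a hypothetical target: how many queued messages
    one backend instance is expected to drain per scaling interval.
    The result is clamped so the fleet never disappears entirely and
    never exceeds the budgeted maximum.
    """
    wanted = math.ceil(queue_depth / msgs_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

For example, an empty queue keeps the minimum of one worker, 95 queued messages ask for 10 workers, and a sudden burst of thousands of messages is capped at the configured maximum of 20.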
To reduce the cost of the architecture in the initial stage we omitted the following from the diagram:
To decide which database fits the application architecture better, we compared two different solutions: an RDBMS and a so-called “NoSQL database”. To avoid vendor lock-in, we decided to compare PostgreSQL and MongoDB. Both database engines are free and open source, with none of the lock-in that DynamoDB has with Amazon.
The main benefits of an RDBMS are transactions and data consistency (formally, because it follows the ACID consistency model). It’s the traditional way to store and process data, and many companies continue to use it today. The cons of RDBMS databases are usually the following:
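To make the ACID point concrete, here is a minimal sketch using Python’s built-in sqlite3 module as a stand-in for a managed RDBMS. The table, account names, and amounts are invented for the example; the point is that a failed transfer rolls back entirely, so the data never ends up half-updated.

```python
import sqlite3

# In-memory database standing in for a managed PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts ("
             "name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money atomically: either both updates apply, or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint fired, transaction was rolled back

transfer(conn, "alice", "bob", 30)    # succeeds
transfer(conn, "alice", "bob", 1000)  # fails: would drive alice negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
# Only the first transfer is visible: {'alice': 70, 'bob': 80}
```

The second transfer leaves no trace at all: neither account changes, which is exactly the guarantee that is hard to retrofit onto a BASE-model store.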
The main benefit of NoSQL databases is that they have no schema in the classical sense and can ingest arbitrary JSON-like data. You also get sharding out of the box, which does all the magic for you and splits the data across several servers. NoSQL databases follow the BASE consistency model. The cons of NoSQL databases are usually the following:
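The idea behind that out-of-the-box sharding can be illustrated with a toy hash-based router. This is a sketch of the concept only: a real system such as MongoDB maps ranges of hashed shard-key values to shards via config metadata rather than a bare modulo, and the key and shard count below are invented.

```python
import hashlib

def shard_for(key: str, num_shards: int = 4) -> int:
    """Route a document to a shard by hashing its shard key.

    A stable hash is used (not Python's builtin hash(), which is
    salted per process), so the same key always lands on the same
    shard across application restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards
```

Because the mapping is deterministic, every writer and reader agrees on where a given document lives without any coordination per request.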
Both database types support replication, which helps with horizontal scaling of read requests, but not writes.
I believe that selecting the proper database for this type of workload will be biased by the experience of the person making the decision, whether that lies with RDBMS or NoSQL databases. I’m biased toward RDBMS, as I have a lot of experience working with them: fine-tuning, schema migration, and query optimization. There are plenty of ways to argue either point of view. You could say that because we want to store financial operations in the database, we need the ACID consistency that RDBMS databases provide. In the same way, you could argue that NoSQL should be selected because of the huge number of rows potentially being stored, since without shards it will be hard to scale the solution.
— Dmitrii Sirant, CTO at OpsWorks Co.
Regardless of the decision, the following point should be made:
If the project reaches the point where it’s no longer possible to scale with an RDBMS, use a NoSQL database for specific data sets, or even for all of the data.
As depicted in the diagrams, the whole system consists of the following blocks:
Based on the internal Kubernetes metrics and external metrics (visits per minute, queue length) we can scale our Kubernetes cluster up and down to make sure that we are providing high-quality service to the clients.
As all of the components are built on Azure managed services, it will be easier to maintain the infrastructure and install security and software updates. It will be possible to make all infrastructure changes with zero downtime once we have an HA setup for each of the services.
AWS: Network diagram
As depicted in the diagram, the whole system consists of the following blocks:
Based on the internal Kubernetes metrics and external metrics (visits per minute, queue length) we can scale our Kubernetes cluster up and down to make sure that we are providing high-quality service to the clients.
As all of the components are built on Amazon managed services, it will be easier to maintain the infrastructure and install security and software updates. It will be possible to make all infrastructure changes with zero downtime once we have an HA setup for each of the services.
All components of the application can run inside Kubernetes to fully utilize the benefits it provides: service isolation, resource limits, self-healing, and scalability.
Service isolation. Kubernetes has several layers of isolation, such as Pods, Namespaces, Network Policies, and RBAC, and can be precisely configured to achieve a high level of security for the applications running on top of it.
Resource limits. Each container can (and should) be configured with requests and limits for CPU and memory. Based on the request values, the cluster decides which node is best suited to run the container, spreading containers effectively across the available nodes. Limits are hard caps on the resources available to a container; they ensure that high load on one container does not impact other containers on the same node.
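The request-based placement decision can be sketched as follows. This is a heavily simplified model of the scheduler’s filter-and-score phases, with invented node names, resource units, and a naive “most free CPU” scoring rule, not the actual Kubernetes scheduler logic.

```python
def fits(node_free: dict, pod_requests: dict) -> bool:
    """A pod is schedulable on a node only if the node's free capacity
    covers every resource the pod requests (the filtering step)."""
    return all(node_free.get(res, 0) >= amt
               for res, amt in pod_requests.items())

def pick_node(nodes: dict, pod_requests: dict):
    """Among nodes that fit, prefer the one with the most free CPU --
    a toy stand-in for the scheduler's scoring phase."""
    candidates = [name for name, free in nodes.items()
                  if fits(free, pod_requests)]
    if not candidates:
        return None  # in practice this is what triggers cluster autoscaling
    return max(candidates, key=lambda name: nodes[name]["cpu"])

# Hypothetical free capacity (CPU in millicores, memory in MiB).
nodes = {"node-a": {"cpu": 500, "memory": 1024},
         "node-b": {"cpu": 2000, "memory": 256}}

small_pod = pick_node(nodes, {"cpu": 400, "memory": 512})   # only node-a fits
cpu_heavy = pick_node(nodes, {"cpu": 600, "memory": 128})   # only node-b fits
too_big   = pick_node(nodes, {"cpu": 4000, "memory": 128})  # nothing fits
```

When `pick_node` returns `None`, no existing node can host the pod, which is exactly the situation the autoscaling described below resolves by adding nodes.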
Self-healing. Each container has defined liveness and readiness probes, which the Kubernetes cluster checks continuously to make sure the container is operating properly. If a container stops responding to its probes, it is killed and restarted.
Scalability. Kubernetes has an internal tool, the horizontal pod autoscaler, designed to make sure applications can always cope with incoming traffic. If more pods (containers) are required to process the traffic, it schedules additional pods on the existing nodes. If no node has available resources, cluster autoscaling is triggered to spin up more nodes and join them to the cluster. When traffic drops, the number of pods is reduced and nodes that are no longer in use are shut down.
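The horizontal pod autoscaler’s core calculation is a short formula: desired replicas equal the current replica count scaled by the ratio of the observed metric to its target, rounded up. The sketch below models just that formula; the replica bounds are illustrative configuration values, and the real controller adds tolerances and stabilization windows on top.

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 2,
                     max_replicas: int = 10) -> int:
    """HPA core formula:
    desired = ceil(currentReplicas * currentMetric / targetMetric),
    clamped to the configured minReplicas/maxReplicas bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))
```

With a target of 60% average CPU, 4 pods running at 90% scale up to 6, pods idling at 30% scale down to the floor of 2, and an extreme spike is capped at the configured maximum of 10.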
Based on the information above, we think the best solution would be to implement the PoC on AWS, as it is slightly cheaper than Azure. We did not analyze Google Cloud or other providers that offer similar managed services and might be cheaper, but the whole architecture was designed so that it can be easily implemented on any cloud (AWS, Azure, Google Cloud) or even span multiple clouds if needed.
Further Steps: Process of the PoC/MVP Implementation
To get a similar comparison document for your specific infrastructure, email us or fill out the contact form on the website.