Architectural Considerations for Building Cloud Applications
Though it is generally believed that the biggest challenges in architecting a cloud application are security and reliability, there is another major dimension that is generally overlooked: cost optimization. In response to the poll “What is the main risk with cloud computing?” by Tech Republic, 59% identified data security as the main concern and 20% thought it was reliability. The fact that applications need to be designed differently to take advantage of the cloud, and thus reduce cost, did not even enter into consideration.
Traditionally, the actual cost of deployment has never directly been considered as a parameter of architectural tradeoffs. Performance: yes, response time: yes, application partitioning: yes, load balancing: yes, choice of platform: yes, choice of software: yes, open vs. proprietary: yes, but actual cost of deployment: no. You are likely to do hardware sizing based on the projected load and arrive at the machine configuration. You may also tune parts of the application post deployment if the response time is not acceptable. But in how many instances will you design your application to bring down the hardware requirement by 10%? What about tuning the application post deployment to reduce the hardware requirement by 10% even though the response time is adequate?
Traditionally, your hardware and software are a capital expenditure. So, once the initial investment is made, you are unlikely to save any money by optimizing the application to use fewer resources. But when the application is deployed in the cloud, this is no longer true. What is driving CIOs to take a serious look at cloud computing is primarily the promise of cost reduction. Pay for what you use implies that you don’t pay for unutilized resources, and if you consume fewer resources you pay less.
In the cloud, you pay for:
- actual CPU utilization
- actual size of data storage
- actual data read-write
- actual input-output bandwidth used
You can always make an architectural tradeoff that increases or reduces the usage of these four parameters. How that affects the overall cost depends on the provider's cost structure. So your architectural decisions have a direct impact on cost, but the optimal decision can change as soon as the cloud provider adjusts its cost structure.
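The tradeoff can be made concrete with a small cost model. The following is a minimal sketch, not any provider's billing formula: the `RateCard` and `Usage` types, the two candidate designs, and all prices and quantities are hypothetical numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateCard:
    """Hypothetical unit prices; real provider prices differ."""
    cpu_hour: float       # price per CPU-hour
    storage_gb: float     # price per GB stored per month
    io_million: float     # price per million read/write operations
    bandwidth_gb: float   # price per GB transferred


@dataclass(frozen=True)
class Usage:
    """Monthly consumption of the four billed parameters."""
    cpu_hours: float
    storage_gb: float
    io_millions: float
    bandwidth_gb: float


def monthly_cost(u: Usage, r: RateCard) -> float:
    return (u.cpu_hours * r.cpu_hour
            + u.storage_gb * r.storage_gb
            + u.io_millions * r.io_million
            + u.bandwidth_gb * r.bandwidth_gb)


# Two hypothetical designs for the same workload:
# one recomputes results on demand, the other stores precomputed results.
recompute = Usage(cpu_hours=1000, storage_gb=50, io_millions=20, bandwidth_gb=100)
cached = Usage(cpu_hours=400, storage_gb=500, io_millions=60, bandwidth_gb=100)


def cheaper_design(r: RateCard) -> str:
    return "recompute" if monthly_cost(recompute, r) < monthly_cost(cached, r) else "cached"


rates_before = RateCard(cpu_hour=0.10, storage_gb=0.10, io_million=0.10, bandwidth_gb=0.10)
# Same rate card, except that the provider doubles the storage price.
rates_after = RateCard(cpu_hour=0.10, storage_gb=0.20, io_million=0.10, bandwidth_gb=0.10)

print(cheaper_design(rates_before), cheaper_design(rates_after))
```

With these made-up numbers the storage-heavy design is cheaper under the first rate card and the CPU-heavy design is cheaper under the second, even though neither design changed.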
Pay as you use implies many more options for cost reduction
- You need to minimize unutilized resources
- Design and code efficiency become critical
- Cost effective design will depend on the relative cost of processing charge, storage charge, data read-write charge & bandwidth charge
- Restructuring of cost by cloud provider may affect the optimality of design
- Whenever there is an IT budget cut, you may be asked to optimize the code
- Any outside consultant can come and claim that there is opportunity to save money
- The data needed to base these decisions on is not readily available; it has to be found through experimentation
- Providers interpret application and machine boundaries differently, so packaging has a direct impact on cost
The cloud marketplace has many players with different strategies, but I have not considered SaaS players like Salesforce.com. Though every organization from IBM to Oracle to HP wants to make its presence felt in the cloud, the following are the major players, and here is a summary of the pricing structures of EC2, Azure and GAE for your quick reference.
EC2
- Pay for the duration a machine has been instantiated
- Not dependent on what you run on the machine
- Load variability needs to be managed through instantiation and de-instantiation of one or more machines
Implication: Given a choice between one machine of larger capacity and multiple machines of smaller capacity, the latter is preferable.
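A rough sketch of why smaller machines can be cheaper when you pay for machine time: with a variable load, small machines can be started and stopped hour by hour, while a large machine sized for the peak is billed all day. The load profile, capacities, and prices below are hypothetical, and the price is assumed to scale linearly with capacity.

```python
import math


def instance_hours_cost(hourly_load, capacity, price_per_hour, min_instances=1):
    """Cost when, for each hour, just enough instances run to cover that hour's load."""
    total = 0.0
    for load in hourly_load:
        instances = max(min_instances, math.ceil(load / capacity))
        total += instances * price_per_hour
    return total


# Hypothetical 24-hour load in requests/second: quiet night, busy day, moderate evening.
load = [100] * 8 + [800] * 8 + [300] * 8

# One large machine sized for the peak versus a pool of small machines.
large = instance_hours_cost(load, capacity=800, price_per_hour=0.80)
small = instance_hours_cost(load, capacity=100, price_per_hour=0.10)

print(f"large: {large:.2f}  small: {small:.2f}")
```

The saving comes entirely from the hours in which the load is below the peak; with a flat load the two options cost the same under these assumptions.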
Azure
- Pay for the duration an application has been instantiated
- Not dependent on how much the application is used or how complex it is
Implication: Bundling multiple unrelated applications into one may turn out to be more cost effective.
GAE
- Pay for actual usage of the deployed application
- Not dependent on how long it is deployed
- CPU usage of individual transactions is aggregated for cost calculation
Implication: Optimization needs to be performed at the individual transaction level and not at the machine or application level.
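The three billing models described above can be put side by side in a simplified sketch. The function names, prices, and workload figures are hypothetical and only illustrate which lever affects which bill; actual provider pricing has more components.

```python
HOURS_PER_MONTH = 720


def machine_time_bill(machine_hours, price_per_machine_hour):
    """Machine-duration billing: pay while a machine is instantiated."""
    return machine_hours * price_per_machine_hour


def app_time_bill(deployed_apps, hours, price_per_app_hour):
    """Application-duration billing: pay while each application is deployed."""
    return deployed_apps * hours * price_per_app_hour


def usage_bill(transactions, cpu_seconds_per_txn, price_per_cpu_second):
    """Usage billing: pay for CPU time aggregated over all transactions."""
    return transactions * cpu_seconds_per_txn * price_per_cpu_second


# Application-duration billing: bundling three small apps into one deployment.
separate = app_time_bill(3, HOURS_PER_MONTH, 0.12)
bundled = app_time_bill(1, HOURS_PER_MONTH, 0.12)

# Usage billing: shaving 10% off the CPU time of each transaction.
baseline = usage_bill(1_000_000, 0.050, 0.00003)
optimized = usage_bill(1_000_000, 0.045, 0.00003)

# Machine-duration billing: the bill is the same whatever runs on the machine.
machine = machine_time_bill(HOURS_PER_MONTH, 0.10)

print(separate, bundled, baseline, optimized, machine)
```

In this sketch, per-transaction tuning lowers only the usage-based bill, and bundling lowers only the application-duration bill, which is why experience with one platform does not carry over directly to another.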
Experience with one platform cannot be directly translated to another platform.
Availability of storage options other than RDBMS, which are expected to be optimized for the cloud
- Though EC2 supports RDBMS, it also provides other options
- Azure only supports a version of MS-SQL
- GAE only supports persistence of objects
- EC2 has multiple non-RDBMS options
- In EC2, you can use their MySQL instance or use your own mounted storage
Non-relational databases can be highly efficient in specific application scenarios. For example, if you have a complex domain object, it may be advantageous to store it as a single object, significantly reducing the number of disk I/O operations and thereby the cost. The challenge, however, is to stop thinking in terms of relational tables and SQL. This requires a lot of unlearning.
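As a simplified illustration of the single-object idea, the sketch below counts storage operations for one order saved in two layouts. `CountingStore` is a hypothetical in-memory stand-in, not a real database client, and operation counts are only a proxy for billed read-write; a real RDBMS can fetch the rows with one joined query.

```python
import json


class CountingStore:
    """In-memory stand-in for a storage service that counts read/write operations."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def put(self, key, value):
        self.data[key] = value
        self.writes += 1

    def get(self, key):
        self.reads += 1
        return self.data[key]


# A hypothetical domain object: an order with five line items.
order = {
    "id": 42,
    "customer": "ACME",
    "lines": [{"sku": f"S{i}", "qty": i + 1} for i in range(5)],
}

# Relational-style layout: one header record plus one record per line item.
rel = CountingStore()
rel.put(("order", 42), {"customer": order["customer"], "line_count": len(order["lines"])})
for i, line in enumerate(order["lines"]):
    rel.put(("order_line", 42, i), line)
header = rel.get(("order", 42))
lines = [rel.get(("order_line", 42, i)) for i in range(header["line_count"])]

# Object-style layout: the whole aggregate serialized under a single key.
obj = CountingStore()
obj.put(("order", 42), json.dumps(order))
loaded = json.loads(obj.get(("order", 42)))

print("relational:", rel.writes, "writes,", rel.reads, "reads")
print("object:    ", obj.writes, "writes,", obj.reads, "reads")
```

The single-object layout trades fewer operations for coarser access: updating one line item means rewriting the whole object, which is part of the unlearning mentioned above.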
Many of the traditional design principles may have to be revisited, and new ones arrived at.