A well-executed product scalability model decides the fate of an application. Google is a prime example of top-notch scalability. How the website maintains a constant user response time, despite the astounding volume of user queries it picks up daily, is a towering achievement. The variety of users, the number of add-on features the Google search page must carry and the need for users to reach any service they want in an instant were only some of the many considerations the search engine giant had to weigh as it conquered the scalability factor.
As a product development company, Cuelogic faces and counters constant challenges with regard to making a product or application scalable. We have documented here a step-by-step process on how we helped make a web application/software scalable.
Building a scalable application
Cuelogic recently worked on the scalability of a web application. The software consisted of an application, a database and a mail server. The client required the application to scale to any number of users, growing gradually, using its product at any given moment.
For the purpose of this article, and to give complete information, we suggest two kinds of partitioning that can be implemented in Step I and Step II.
The first step to building a scalable application is reviewing its architecture. The target: create an architecture for the back-end deployment server and scale it for an effectively unlimited number of transactions. At the same time, ensure zero downtime. Ultimately, we are aiming at cost reduction, less maintenance expenditure and low downtime impact. After inspecting the current product architecture, developers identify the problem areas and then proceed to work on a single point of the architecture.
Scenario A) Vertical Scaling / Partitioning
Many would consider simply adding more CPU and RAM to the node, but this is an expensive process with limited benefit, since a single machine can only be upgraded so far.
That is why we recommend vertically partitioning the application server, database server and mail server into three separate divisions. Picture the three sections (app, data and mail) arranged along a single vertical line, each running on its own server/node/unit. This partitioning can be carried out at various layers. For instance, if your mail server goes down, the app and data servers will still be powering the application, keeping users engaged all the time. This segregation leads to increased scalability and per-application applicability.
Vertical partitioning also enables you to maintain and adjust each server separately. You are able to optimize each server for the tasks it is set to perform. No changes are required in the application. If there is a drawback, it is the under-utilization of the individual partitioned servers, and this occurs only in certain circumstances. At this stage each node performs a different function, and there is a limit to the number of users the setup can serve.
Scenario B) Horizontal Scaling / Partitioning (of Application Server)
Horizontal scaling is your best bet if the number of application users is increasing. The next logical step is to scale out the application server; logical, because it is the app server that takes in all user-related information/data. Using a load balancer to spread the load across different servers is a recommended step, with the option of going for either a hardware or a software load balancer. The hardware option is efficient, customization-friendly, serves millions of user requests and is worth its money, almost on par with the software version.
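To make the idea of spreading load across application servers concrete, here is a minimal sketch of a software load balancer's selection step. It assumes a round-robin policy, which this article does not prescribe, and the server names are hypothetical placeholders.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out application servers in a fixed, rotating order.

    Round robin is an assumed policy for illustration only; real load
    balancers offer several algorithms (least connections, weighted, etc.).
    """

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._rotation)


# Hypothetical server names, not taken from the project described here.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]
```

With three servers and six requests, each server receives exactly two requests, which is the even spread the load balancer is there to provide.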
There are various methods you can apply alongside the load balancer to handle user sessions: central session storage, sticky sessions or clustered session management. With central session storage, all web servers access the same session data from a single location. Sticky sessions let the load balancer direct a user to the same server on every request. Clustered session management is best used if you have a small number of application servers and few session writes.
One can also choose a dual arrangement of load balancers, set up in either active + passive or active + active format for scalability. In the former, if the sole active load balancer goes down, the passive one takes over the active role. In the latter, the two load balancers share the load; if one goes down, the other takes on the additional load.
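Two of these session approaches can be sketched in a few lines. The example below is only an illustration under stated assumptions: the server names are hypothetical, the sticky routing uses a hash of the session ID (one of several possible ways to pin a user to a server), and the central store is an in-memory stand-in for whatever shared storage a real deployment would use.

```python
import hashlib

# Hypothetical application servers behind the load balancer.
APP_SERVERS = ["app-1", "app-2", "app-3"]


def sticky_server(session_id, servers=APP_SERVERS):
    """Sticky sessions: map a session ID to the same server every time.

    A stable hash is used so the mapping does not change between runs.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


class CentralSessionStore:
    """Central session storage: one place every web server reads and writes.

    This in-memory dictionary is a stand-in for a shared store.
    """

    def __init__(self):
        self._sessions = {}

    def save(self, session_id, data):
        self._sessions[session_id] = dict(data)

    def load(self, session_id):
        # Unknown sessions yield an empty dictionary rather than an error.
        return dict(self._sessions.get(session_id, {}))
```

The trade-off is that sticky routing keeps session data local to one server, while a central store lets any server handle any request at the cost of an extra lookup.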
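The difference between the two arrangements can be shown with a small model of how traffic is divided before and after a failure. This is a sketch only: the balancer names `lb-1` and `lb-2` and the even split in active + active mode are assumptions made for illustration.

```python
class LoadBalancerPair:
    """Models a pair of load balancers in one of two failover modes."""

    MODES = ("active-passive", "active-active")

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"mode must be one of {self.MODES}")
        self.mode = mode
        # Hypothetical balancer names; lb-1 is the initial active unit.
        self.healthy = {"lb-1": True, "lb-2": True}

    def mark_down(self, name):
        """Record that a balancer has failed."""
        self.healthy[name] = False

    def traffic_share(self):
        """Return the fraction of traffic handled by each healthy balancer."""
        up = [name for name, ok in self.healthy.items() if ok]
        if not up:
            return {}
        if self.mode == "active-passive":
            # Only one balancer carries traffic; the standby takes over
            # when the active one is marked down.
            return {up[0]: 1.0}
        # Active + active: healthy balancers split the load evenly.
        return {name: 1.0 / len(up) for name in up}
```

In active + passive mode the surviving unit moves from carrying no traffic to carrying all of it, whereas in active + active mode it moves from half to all.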
Scenario A) Vertical Partitioning (of Database Server)
To prevent services from stalling under increased user load, divide the functions of the database server and use a set of SAN (Storage Area Network) devices, which enhances scalability. Many combinations of SAN and server devices are possible and can be introduced according to the user demands on the application.
Scenario B) Horizontal Scaling (of Database Server)
The application server can be scaled as per your needs; the same goes for the SAN. But for long-term application scalability, the database server needs attention. Go for a shared-nothing cluster. That is, divide the database server into homogeneous units called clusters (as many as one wants), with no shared architecture or hardware between them; these in turn serve requests from the application server. This format can be used for free, which is an added cost-saving advantage.
Having discussed the scenarios above, we now take a single-minded approach: horizontal partitioning of the database clusters is the recommended route to product scalability.
Horizontal Partitioning (of Database Server)
A group of database clusters is formed, each serving a limited number of users (for example, 2 million per cluster). There are various ways to make horizontal partitioning work, such as assigning all names starting with 'K to O' to cluster #1 and so on. Alternatively, we can assign the values according to the application's functional priorities and to which cluster has worked best.
An important addition can be a central control map that contains the complete list of data and the database cluster each item belongs to. The application server uses the central control map to 'know' which query has to be sent to which particular database cluster.
So we finally have a complete system ready to aid scalability. To recap: application server clusters, two database servers, storage area clusters (a set of SAN devices), load balancers, a global redirector and a central control map. Together they make up the database set. The next step is simple: if there are more users, the same database set is replicated and deployed.
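The name-range scheme can be sketched as a small lookup. Only the 'K to O' range for cluster #1 comes from the text above; the other ranges, the cluster labels and the fallback cluster are assumptions added to make the example complete.

```python
# Each entry is (first letter, last letter, cluster). Only K-O -> cluster-1
# is taken from the article; the remaining ranges are hypothetical.
NAME_RANGES = [
    ("A", "J", "cluster-0"),
    ("K", "O", "cluster-1"),
    ("P", "Z", "cluster-2"),
]

# Assumed fallback for names that do not start with a letter A-Z.
DEFAULT_CLUSTER = "cluster-0"


def cluster_for_name(name):
    """Pick the database cluster for a user based on the name's first letter."""
    first = name.strip()[:1].upper()
    for low, high, cluster in NAME_RANGES:
        if low <= first <= high:
            return cluster
    return DEFAULT_CLUSTER
```

A fixed range like this is easy to reason about, but it can leave clusters unevenly loaded if some letters are far more common than others, which is one reason to assign values by how each cluster actually performs.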
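As a rough illustration of how the application server might consult such a map, the sketch below keeps a dictionary from a data key to a cluster name and uses it to route a query. The class, function and key names are hypothetical, and a real control map would live in shared, durable storage rather than in memory.

```python
class CentralControlMap:
    """Records which database cluster holds the data for each key."""

    def __init__(self):
        self._location = {}

    def assign(self, key, cluster):
        """Register (or move) a key to a database cluster."""
        self._location[key] = cluster

    def cluster_for(self, key):
        """Return the cluster that holds this key; raises KeyError if unknown."""
        return self._location[key]


def route_query(control_map, key, query):
    """Return (cluster, query): where the application server should send it."""
    return control_map.cluster_for(key), query


control_map = CentralControlMap()
control_map.assign("user:1001", "cluster-1")
control_map.assign("user:2002", "cluster-2")
target, sql = route_query(control_map, "user:1001", "SELECT * FROM orders")
```

Because every lookup goes through the map, data can be moved to a different cluster by updating a single entry, without changing the application's query code.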
Once the architecture is in place, the next task is to optimize and maintain the database set. Caching of code and data ensures that the same data is not fetched repeatedly. Use of content delivery networks (CDNs) and hosted storage services such as Amazon S3 or Rackspace helps ensure high performance and availability. Keeping chunks of data together in one place means it takes less time to collect data and code from any location, which is a great aid to scalability. Adding an HTTP accelerator to the application server, parallelization and remote pooling are also worth using as part of the optimization process.
Finally, Cuelogic was able to provide a robust and sturdy scalability solution to the client, matching hardware and software requirements and adjusting to changing user preferences and functionalities. Low-cost, long-term maintenance with regard to scalability was a major achievement.
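The caching point can be illustrated with Python's standard `functools.lru_cache`. In this sketch `fetch_from_database` is a hypothetical stand-in for a slow query, and the `fetch_log` list exists only to show how often the underlying fetch really runs.

```python
from functools import lru_cache

# Records every real fetch so the effect of the cache can be observed.
fetch_log = []


def fetch_from_database(key):
    """Hypothetical slow lookup; here it only records the call."""
    fetch_log.append(key)
    return f"value-for-{key}"


@lru_cache(maxsize=1024)
def cached_fetch(key):
    """Return the value for key, hitting the database only on a cache miss."""
    return fetch_from_database(key)
```

Calling `cached_fetch` twice with the same key triggers only one underlying fetch. A production cache would also need an expiry or invalidation policy so that stale data is not served after the database changes.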