{"id":349,"date":"2023-04-02T23:39:51","date_gmt":"2023-04-02T16:39:51","guid":{"rendered":"https:\/\/thnkandgrow.com\/?p=349"},"modified":"2023-04-02T23:53:57","modified_gmt":"2023-04-02T16:53:57","slug":"calculate-puma-web_concurrency-and-max_threads","status":"publish","type":"post","link":"https:\/\/thnkandgrow.com\/blog\/2023\/04\/02\/calculate-puma-web_concurrency-and-max_threads\/","title":{"rendered":"How to Calculate Puma WEB_CONCURRENCY and MAX_THREADS for Optimal Rails Application Performance"},"content":{"rendered":"\n
<h2>1\/ What is Puma?<\/h2>\n\n\n\n<p>Puma is a high-performance, multi-threaded web server for Ruby applications and the default server for Ruby on Rails. It’s a lightweight and scalable Rack-compatible HTTP server capable of serving both static and dynamic content. Thanks to its concurrency model and modest memory usage, Puma is a popular choice for high-traffic web applications. Paired with Nginx as a reverse proxy, it provides a robust setup for fast and reliable web application delivery. Because Puma implements the Rack interface (not ASGI, which is a Python specification), it works with any Rack-compliant framework, such as Rails or Sinatra.<\/p>\n\n\n\n
<p>Both of these values are critical to the performance and scalability of a Ruby on Rails application running on the Puma web server.<\/p>\n\n\n\n<p>Note: The above values are recommendations and may need to be adjusted based on the specific needs of your application.<\/p>\n\n\n\n<p>It’s important to note that these values should be tested and adjusted based on the specific needs and workload of your application.<\/p>\n","protected":false},"excerpt":{"rendered":" 1\/ What is Puma? Puma is a high-performance, multi-threaded web server designed specifically for Ruby on Rails applications. It’s a lightweight and scalable Rack-compatible HTTP server capable of serving both static and dynamic content. Due to its exceptional performance, scalability, and low memory usage, Puma is a favored choice for high-traffic web applications. Paired with […]<\/p>\n","protected":false},"author":1,"featured_media":351,"comment_status":"closed","ping_status":"open","sticky":true,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"yoast_head":"\n
MAX_THREADS<\/code> are environment variables used to configure the number of worker processes and threads used by the Puma web server in a Ruby on Rails application.<\/p>\n\n\n\n
WEB_CONCURRENCY<\/code>: It specifies the number of worker processes to run in parallel to handle incoming HTTP requests. This value should be set based on the available CPU resources on the server. As a general rule of thumb, it can be calculated by multiplying the number of CPU cores with a factor that takes into account the memory available on the server. The formula for calculating
WEB_CONCURRENCY<\/code> is often recommended to be
(2 * number of CPU cores) + 1<\/code>.<\/li>
MAX_THREADS<\/code>: It specifies the number of threads to be used in each worker process to process requests concurrently. This value should be set based on the available memory on the server. As a general rule of thumb, it can be set to a value between 1 and 5, depending on the size of the requests and the memory requirements of the application.<\/li><\/ul>\n\n\n\n
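<p>As a minimal sketch of how these two variables might be consumed, the snippet below reads them with fallbacks. The helper name <code>puma_settings<\/code> and the fallback values 2 and 5 are assumptions of this example, not Puma defaults; <code>workers<\/code> and <code>threads<\/code> in the trailing comment are the Puma configuration DSL methods you would call from <code>config\/puma.rb<\/code>.<\/p>

```ruby
# Hypothetical helper: reads the two environment variables with fallbacks.
# The fallback values ('2' and '5') are assumptions chosen for this sketch.
def puma_settings(env = ENV)
  workers = Integer(env.fetch('WEB_CONCURRENCY', '2'))
  threads = Integer(env.fetch('MAX_THREADS', '5'))
  # Using the same value for min and max keeps the thread pool at a fixed size.
  { workers: workers, min_threads: threads, max_threads: threads }
end

settings = puma_settings
puts settings

# Inside config/puma.rb the values would be handed to Puma's DSL:
#   workers settings[:workers]
#   threads settings[:min_threads], settings[:max_threads]
```

<p>Passing a plain hash instead of <code>ENV<\/code> makes the helper easy to try with different values before changing a real deployment.<\/p>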
<h2>3\/ Formulas<\/h2>\n\n\n\n
<pre><code>WEB_CONCURRENCY = (CORES * (1 + RAM \/ GB_PER_CORE) \/ 2).floor\n\nOR\n\nWEB_CONCURRENCY = (2 * number of CPU cores) + 1<\/code><\/pre>\n\n\n\n
<pre><code>MAX_THREADS = (RAM \/ WEB_CONCURRENCY \/ 25).floor * 5<\/code><\/pre>\n\n\n\n
<p>Where:<\/p>\n\n\n\n<ul><li><code>CORES<\/code> is the number of CPU cores available on your server (for example, 1 on a t2.micro EC2 instance)<\/li>
<li><code>RAM<\/code> is the amount of RAM available on your server, in GB (for example, 1 on a t2.micro)<\/li>
<li><code>GB_PER_CORE<\/code> is the amount of RAM to budget per core (typically 2-4GB depending on your workload)<\/li><\/ul>\n\n\n\n
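<p>The formulas above can be evaluated directly in Ruby. This is only a sketch: the method names are made up for this example, RAM is taken in GB for the <code>MAX_THREADS<\/code> formula (the unit is not stated), and the clamping is an added assumption, because the raw formulas can floor to 0 on small instances. The thread count is clamped to the 1 to 5 rule of thumb mentioned earlier.<\/p>

```ruby
# Sketch of the formulas above. Method names and the clamping are
# assumptions of this example, not part of Puma.

# Memory-aware variant: (CORES * (1 + RAM / GB_PER_CORE) / 2).floor
def workers_by_memory(cores:, ram_gb:, gb_per_core: 2)
  raw = (cores * (1 + ram_gb.to_f / gb_per_core) / 2).floor
  [raw, 1].max # assumption: never run fewer than one worker
end

# CPU-only variant: (2 * number of CPU cores) + 1
def workers_by_cores(cores)
  (2 * cores) + 1
end

# (RAM / WEB_CONCURRENCY / 25).floor * 5, with RAM in GB (assumption).
# The result is clamped to the 1..5 rule of thumb.
def threads_per_worker(ram_gb:, workers:)
  safe_workers = [workers, 1].max # avoid dividing by zero
  raw = (ram_gb.to_f / safe_workers / 25).floor * 5
  raw.clamp(1, 5)
end

# t2.micro: 1 vCPU, 1 GB RAM
micro_workers = workers_by_memory(cores: 1, ram_gb: 1)
micro_threads = threads_per_worker(ram_gb: 1, workers: micro_workers)
puts 'WEB_CONCURRENCY=' + micro_workers.to_s
puts 'MAX_THREADS=' + micro_threads.to_s
```

<p>For the t2.micro both raw results floor to 0, so the clamps decide the outcome (1 worker, 1 thread). Treat the formulas as a rough guide and confirm the numbers under real load.<\/p>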
<h2>4\/ Ten common EC2 instances<\/h2>\n\n\n\n
<figure><table><thead><tr><th>EC2 Instance Type<\/th><th>vCPUs<\/th><th>Memory (GB)<\/th><th>WEB_CONCURRENCY<\/th><th>MAX_THREADS<\/th><\/tr><\/thead><tbody><tr><td>t2.micro<\/td><td>1<\/td><td>1<\/td><td>2<\/td><td>5<\/td><\/tr><tr><td>t2.small<\/td><td>1<\/td><td>2<\/td><td>4<\/td><td>10<\/td><\/tr><tr><td>t2.medium<\/td><td>2<\/td><td>4<\/td><td>8<\/td><td>20<\/td><\/tr><tr><td>t3.micro<\/td><td>2<\/td><td>1<\/td><td>2<\/td><td>5<\/td><\/tr><tr><td>t3.small<\/td><td>2<\/td><td>2<\/td><td>4<\/td><td>10<\/td><\/tr><tr><td>t3.medium<\/td><td>2<\/td><td>4<\/td><td>8<\/td><td>20<\/td><\/tr><tr><td>m5.large<\/td><td>2<\/td><td>8<\/td><td>16<\/td><td>40<\/td><\/tr><tr><td>m5.xlarge<\/td><td>4<\/td><td>16<\/td><td>32<\/td><td>80<\/td><\/tr><tr><td>m5.2xlarge<\/td><td>8<\/td><td>32<\/td><td>64<\/td><td>160<\/td><\/tr><tr><td>m5.4xlarge<\/td><td>16<\/td><td>64<\/td><td>128<\/td><td>320<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The <code>WEB_CONCURRENCY<\/code> value represents the number of Puma worker processes spawned to handle incoming requests. It should be set based on the number of CPU cores available on the EC2 instance. The table scales both values with memory (2 workers and 5 threads per GB of RAM); a more conservative starting point is to set
<code>WEB_CONCURRENCY<\/code> equal to the number of vCPUs on the instance.<\/p>\n\n\n\n
<p>The <code>MAX_THREADS<\/code> value represents the maximum number of concurrent requests that each worker process can handle. It should be set based on the amount of memory available on the EC2 instance. A good starting point is to set
<code>MAX_THREADS<\/code> to the number of cores multiplied by 5.<\/p>\n\n\n\n
<p>For example, for an <code>m5.xlarge<\/code> instance with 4 vCPUs and 16GB of memory, we would set
<code>WEB_CONCURRENCY<\/code> to 4 and
<code>MAX_THREADS<\/code> to 20 (4 cores x 5 threads per core), for a total of 80 threads across the 4 worker processes.<\/p>\n\n\n\n
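<p>The vCPU-based starting point can be sketched in a few lines. The method name <code>starting_point<\/code> and the shape of the returned hash are assumptions of this example. Reading <code>MAX_THREADS<\/code> as a per-worker setting, as defined earlier, the 4 x 5 x 4 arithmetic for an m5.xlarge breaks down into 20 threads per worker and 80 threads in total.<\/p>

```ruby
require 'etc'

# Starting point described above: one worker per vCPU and
# (cores x 5) threads per worker. Names are hypothetical.
def starting_point(vcpus)
  workers = vcpus
  threads_per_worker = vcpus * 5
  {
    web_concurrency: workers,
    max_threads: threads_per_worker, # per worker process
    total_threads: workers * threads_per_worker
  }
end

# m5.xlarge: 4 vCPUs
p starting_point(4)

# The machine this script runs on
p starting_point(Etc.nprocessors)
```

<p>Because the total grows with the square of the vCPU count under this rule, it is worth checking memory usage and response times before applying it to larger instances.<\/p>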