Hosting 100+ WordPress Sites on a Single Server Without Losing Your Mind
My largest client had been struggling as a web host for around 120 WordPress sites. The sites were scattered across multiple hosting providers, each with its own collection of performance problems. None of the servers were optimized properly and no two sites were hosted the same way.
There were custom php.ini files all over the place, logs that had been growing unrotated for years (10-20GB in some cases), and redundant backups scattered everywhere. With no real firewalls in place, sites were regularly compromised and brought down.
The servers often slowed to a crawl under load. Quite the mess. To make matters worse, this setup wasn't cheap: they were paying $20 per month per hosted site.
They wanted everything consolidated. They wanted it fast and they wanted considerable savings.
The Goal
- All 120+ sites on one managed server
- PageSpeed scores in the 90s
- Sub-second load times
- Easy to manage and maintain
- Cost less than the current mess
The solution I delivered follows. Certainly not cheap, but much, much improved: secure, manageable, and with room to grow.
The Stack
Here's what I built:
- Load Balancer: AWS Application Load Balancer
- Server: AWS EC2 m5a.2xlarge (8 vCPUs, 32GB RAM) - Note: This ran perfectly fine on an m5a.xlarge (4 vCPUs, 16GB RAM) at half the cost. I upgraded for headroom and peace of mind. With 50+ sites migrated, the highest load averages were still < 2.0
- Control Panel: WHM/cPanel on CloudLinux
- Web Server: LiteSpeed with LSPHP
- CDN: AWS CloudFront in front of all sites
- Database: AWS RDS MySQL (db.m5.large, single-AZ)
- Caching: Redis (local, 8GB allocated)
- PHP: PHP 8.1 with OPcache
- Firewall: ConfigServer Security & Firewall (CSF)
This wasn't cheap, but it was cheaper than paying for 120 separate hosting accounts.
Why This Stack?
CloudLinux + WHM/cPanel
CloudLinux isolates each site in its own lightweight virtual environment (LVE). One site getting hammered with traffic doesn't affect the others. Resource limits per account prevent any single site from bringing down the server.
WHM/cPanel because the agency staff already knew it. Could've used something else, but the learning curve wasn't worth it.
LiteSpeed Web Server
LiteSpeed is faster than Apache and Nginx for WordPress: native HTTP/3 support, a built-in page cache, and the LSCache plugin is legitimately good.
Worth the license cost ($50/month) for 100+ sites.
RDS for Databases
This was the key decision.
Instead of running MySQL on the same server as the web server, I moved all databases to AWS RDS. Benefits:
- Automatic backups and point-in-time recovery
- Easy to scale vertically (just change instance size)
- Offloads I/O from the web server
- Can scale databases independently of web server
- Automated maintenance windows
Each WordPress site gets its own database, all on the same RDS instance.
I didn't go with Multi-AZ. These are small business sites - brochure sites, local service companies, that kind of thing. They don't need 99.95% uptime. The occasional maintenance window or rare outage is acceptable. Multi-AZ would add $290/month for availability we don't need.
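For a sense of how a new site's database gets provisioned on the shared instance, here's a minimal sketch, assuming a hypothetical RDS endpoint, admin user, and naming scheme (the real values and password handling are yours to define):
#!/bin/bash
# Sketch: provision an isolated database and user for one site on the shared RDS instance.
# The endpoint, admin user, and naming scheme below are placeholders, not the exact values used.
RDS_HOST="wp-shared.xxxxxxxx.us-east-1.rds.amazonaws.com"
SITE="$1"                           # e.g. "plumberco"
DB_NAME="wp_${SITE}"
DB_USER="wp_${SITE}"
DB_PASS="$(openssl rand -base64 18)"

mysql -h "$RDS_HOST" -u admin -p <<SQL
CREATE DATABASE \`${DB_NAME}\` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER '${DB_USER}'@'%' IDENTIFIED BY '${DB_PASS}';
GRANT ALL PRIVILEGES ON \`${DB_NAME}\`.* TO '${DB_USER}'@'%';
FLUSH PRIVILEGES;
SQL

echo "Created ${DB_NAME} for ${SITE} - store the generated password in the site's wp-config.php"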
Local Redis for Object Cache
WordPress hits the database constantly. The object cache stores database query results in memory. Massive performance improvement.
I installed Redis directly on the WHM server with 8GB of RAM allocated to it. Could've used ElastiCache, but for this scale, running it locally is simpler and cheaper. One less network hop, one less service to manage.
Redis configuration:
# redis.conf
maxmemory 8gb
maxmemory-policy allkeys-lru
save ""
appendonly no
Persistence (save/appendonly) is disabled because this is pure cache. If Redis restarts, the cache rebuilds from database queries. No need for disk I/O.
CloudFront CDN
Every site sits behind CloudFront. Static assets (images, CSS, JS) are cached at edge locations worldwide. This drastically reduces load on the origin server and improves load times globally.
CloudFront configuration per site:
- Origin: Application Load Balancer
- Cache behavior: Cache everything except wp-admin and wp-login.php
- TTL: 1 year for images, 1 day for HTML/CSS/JS
- Compression: Gzip and Brotli enabled
- SSL: Free AWS Certificate Manager certificates
The combination of LiteSpeed cache + CloudFront means most requests never hit the origin server. Cached at the edge, served from RAM. This is how you get millisecond load times.
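One operational detail worth showing: when a site's theme or content changes, its edge cache needs to be purged. A minimal sketch with the AWS CLI, assuming a hypothetical mapping file from domain to CloudFront distribution ID:
#!/bin/bash
# Sketch: purge a site's CloudFront cache after a deploy or major content change.
# /root/config/cloudfront-map.csv ("domain,distribution-id") is a hypothetical convention.
DOMAIN="$1"
DIST_ID="$(grep "^${DOMAIN}," /root/config/cloudfront-map.csv | cut -d, -f2)"

aws cloudfront create-invalidation \
    --distribution-id "$DIST_ID" \
    --paths "/*"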
Security Architecture
Application Load Balancer + WAF
CloudFront hits the Application Load Balancer, which has AWS WAF (Web Application Firewall) attached. The WAF filters malicious traffic before it even reaches the EC2 instance.
WAF rules configured:
- AWS Managed Rules for WordPress (blocks common WP exploits)
- SQL injection protection
- XSS (cross-site scripting) protection
- Rate limiting: 2,000 requests per 5 minutes per IP
- Geo-blocking for known bad actor countries
- Bot protection (blocks bad bots, allows good bots like Googlebot)
The EC2 instance itself only accepts traffic from the ALB - nothing else can reach it directly. Security group rules ensure web ports (80/443) are only accessible from the load balancer.
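As a rough sketch of that lockdown (security group IDs are placeholders), the instance's group only accepts 80/443 from the ALB's security group rather than from 0.0.0.0/0:
# Sketch: web ports on the EC2 instance are only reachable from the ALB's security group.
# sg-0instanceXXXX and sg-0albXXXX are placeholder IDs.
aws ec2 authorize-security-group-ingress \
    --group-id sg-0instanceXXXX \
    --protocol tcp --port 80 \
    --source-group sg-0albXXXX

aws ec2 authorize-security-group-ingress \
    --group-id sg-0instanceXXXX \
    --protocol tcp --port 443 \
    --source-group sg-0albXXXX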
Why this architecture matters:
- WAF stops attacks before they consume server resources
- The EC2 instance has no public web ports exposed
- All traffic goes through the load balancer for health checks and routing
- Makes it easy to add a second server later (just add to target group)
- SSL termination happens at CloudFront, reducing CPU load on origin
ConfigServer Security & Firewall (CSF)
CSF handles host-level firewall rules. The security group only allows the ALB to hit web ports; CSF locks down everything else:
- SSH access restricted to specific IPs
- WHM/cPanel ports (2087, 2083) locked down to admin IPs
- Port scan detection and automatic blocking
- Brute force protection for SSH and cPanel logins
- SYN flood protection
CSF configuration snippet:
# csf.conf key settings
TCP_IN = "22,80,443,2087,2083" # Web ports (ALB-only via the security group) plus admin ports
TCP_OUT = "1:65535" # Outbound allowed
DENY_IP_LIMIT = "500" # Max permanent IP bans kept in csf.deny
LF_SSHD = "5" # 5 failed SSH attempts = ban
LF_CPANEL = "5" # 5 failed cPanel attempts = ban
PORTFLOOD = "80;tcp;100;5" # Port flood protection
The security group allows the ALB to hit ports 80/443. CSF blocks everything else except whitelisted admin IPs for SSH and cPanel access. This defense-in-depth approach means that even if someone bypasses CloudFront they hit the ALB, and even if they bypass that, CSF blocks them.
The Custom Redis Object Cache
WordPress has a standard object cache drop-in API. I wrote a custom object-cache.php that works with Redis and handles multiple sites properly:
<?php
// Custom Redis Object Cache for Multi-site Setup
// Drop-in for wp-content/object-cache.php
if (!defined('ABSPATH')) exit;

class WP_Object_Cache {
    private $redis;
    private $blog_prefix;

    public function __construct() {
        $this->redis = new Redis();
        $this->redis->connect('127.0.0.1', 6379);
        // Unique prefix per site
        $this->blog_prefix = 'wp_' . md5(DB_NAME) . '_';
    }

    public function get($key, $group = 'default') {
        $cache_key = $this->blog_prefix . $group . '_' . $key;
        $value = $this->redis->get($cache_key);
        if ($value === false) {
            return false;
        }
        return unserialize($value);
    }

    public function set($key, $data, $group = 'default', $expire = 0) {
        $cache_key = $this->blog_prefix . $group . '_' . $key;
        $value = serialize($data);
        if ($expire > 0) {
            return $this->redis->setex($cache_key, $expire, $value);
        }
        return $this->redis->set($cache_key, $value);
    }

    public function delete($key, $group = 'default') {
        $cache_key = $this->blog_prefix . $group . '_' . $key;
        return (bool) $this->redis->del($cache_key);
    }

    public function flush() {
        // Flush only this site's cache
        $keys = $this->redis->keys($this->blog_prefix . '*');
        if (!empty($keys)) {
            $this->redis->del($keys);
        }
        return true;
    }
}

// WordPress core calls the procedural wp_cache_*() API (wp_cache_get(), wp_cache_set(),
// wp_cache_init(), etc.); a working drop-in also defines those wrappers around this class.
// They're omitted here for brevity.
$wp_object_cache = new WP_Object_Cache();
?>
Key feature: Each site gets its own cache prefix based on its database name. Sites don't interfere with each other's cache.
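Rolling the drop-in out to every site is one more loop. A short sketch, assuming the drop-in lives in a hypothetical /root/templates directory:
#!/bin/bash
# Sketch: copy the shared object-cache.php drop-in into every WordPress install.
# /root/templates/object-cache.php is a hypothetical staging location.
for dir in /home/*/public_html; do
    if [ -f "$dir/wp-config.php" ]; then
        cp /root/templates/object-cache.php "$dir/wp-content/object-cache.php"
        # Match ownership so the cPanel account can still read/update the file
        chown --reference="$dir/wp-content" "$dir/wp-content/object-cache.php"
    fi
done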
PHP Configuration
PHP 8.1 with aggressive OPcache settings:
; php.ini optimizations
opcache.enable=1
opcache.memory_consumption=256
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=50000
opcache.revalidate_freq=60
; opcache.fast_shutdown was removed in PHP 7.2, so it's omitted on PHP 8.1
; WordPress-specific
max_execution_time=300
memory_limit=256M
upload_max_filesize=64M
post_max_size=64M
OPcache stores compiled PHP bytecode in memory, so WordPress doesn't need to recompile on every request.
WP-Cron Changes
WordPress by default triggers cron jobs on every page load, which creates unpredictable CPU spikes and slows down user requests. With 120+ sites, this becomes a performance nightmare.
I disabled WP-Cron completely and moved to system-level cron jobs running every 15 minutes. This gives us predictable load patterns and keeps cron execution separate from user requests.
In each site's wp-config.php:
define('DISABLE_WP_CRON', true);
Then created a system cron script to handle all sites:
#!/bin/bash
# /root/scripts/run-wp-crons.sh
# Runs WP-Cron for all WordPress installations
for dir in /home/*/public_html; do
    if [ -f "$dir/wp-config.php" ]; then
        # Run each site's cron in a subshell so the working directory doesn't leak
        (cd "$dir" && php wp-cron.php > /dev/null 2>&1) &
    fi
done
wait
System crontab entry:
# Run WP-Cron for all sites every 15 minutes
*/15 * * * * /root/scripts/run-wp-crons.sh
Benefits of this approach:
- Predictable server load - crons run at :00, :15, :30, :45 every hour
- Page loads are faster - no cron overhead on user requests
- Better resource management - can monitor and control cron execution
- No duplicate runs - a site's cron events no longer fire from several overlapping page loads
- Easier debugging - cron failures don't impact user experience
The tradeoff: scheduled posts and other time-sensitive operations run within 15 minutes instead of immediately. For these sites, that's acceptable.
These settings alone got most sites into the 80-90 PageSpeed range.
Database Optimization
120+ WordPress databases on one RDS instance sounds insane, but it worked because of proper configuration:
[mysqld]
max_connections=500
innodb_buffer_pool_size=6G   # ~75% of the db.m5.large's 8GB RAM
innodb_log_file_size=512M
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT
query_cache_type=0
query_cache_size=0
Key decision: Disabled the query cache. With object caching in Redis, MySQL's query cache just adds overhead.
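On RDS these settings live in a DB parameter group rather than a my.cnf you edit by hand. A hedged sketch of applying the dynamic ones with the AWS CLI (the parameter group name is a placeholder; static parameters like innodb_log_file_size need a pending-reboot apply method):
# Sketch: apply the dynamic MySQL settings through an RDS parameter group.
# "wp-shared-mysql" is a placeholder parameter group name.
aws rds modify-db-parameter-group \
    --db-parameter-group-name wp-shared-mysql \
    --parameters "ParameterName=max_connections,ParameterValue=500,ApplyMethod=immediate" \
                 "ParameterName=innodb_flush_log_at_trx_commit,ParameterValue=2,ApplyMethod=immediate"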
The Migration Process
Moving 120 sites without downtime required planning:
- Set up the new server completely
- Migrate sites in batches of 10
- Test each batch thoroughly
- Update DNS with low TTL (300 seconds)
- Monitor for 24 hours before next batch
Each batch took about 2 hours. Total migration: 3 weeks.
Automation for Management
Managing 120 sites manually would be insane. I wrote scripts for common tasks.
Plugin Updates
#!/bin/bash
# Update WordPress core and plugins across all sites
for dir in /home/*/public_html; do
    if [ -f "$dir/wp-config.php" ]; then
        (
            cd "$dir"
            wp core update --allow-root
            wp plugin update --all --allow-root
            wp cache flush --allow-root
        )
    fi
done
Updates all sites in 10 minutes instead of a full day.
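Backups follow the same pattern. A minimal sketch of a nightly database dump to S3 (bucket name and layout are placeholders; the real job also covers uploads and themes):
#!/bin/bash
# Sketch: nightly database export for every site, compressed and pushed to S3.
# The bucket name and path layout are placeholders.
BUCKET="s3://example-wp-backups"
DATE="$(date +%F)"

for dir in /home/*/public_html; do
    if [ -f "$dir/wp-config.php" ]; then
        site="$(basename "$(dirname "$dir")")"
        (cd "$dir" && wp db export "/tmp/${site}-${DATE}.sql" --allow-root)
        gzip -f "/tmp/${site}-${DATE}.sql"
        aws s3 cp "/tmp/${site}-${DATE}.sql.gz" "${BUCKET}/${site}/${DATE}.sql.gz"
        rm -f "/tmp/${site}-${DATE}.sql.gz"
    fi
done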
The Results
Performance Breakdown
Typical site metrics after optimization:
- TTFB: 150-250ms
- First Contentful Paint: 400-600ms
- Largest Contentful Paint: 800ms-1.2s
- Total Page Load: 1.5-2.5s (fully loaded)
These are real-world numbers under normal traffic.
Cost Breakdown
| Service | Monthly Cost |
|---|---|
| EC2 m5a.2xlarge | $280 |
| Application Load Balancer | $25 |
| AWS WAF | $15 |
| RDS db.m5.large (single-AZ) | $145 |
| CloudFront | $120 |
| CloudLinux License | $45 |
| LiteSpeed License | $50 |
| WHM/cPanel License | $55 |
| Backups (S3) | $80 |
| Bandwidth | $50 |
| Total | $865/month |
Previous cost across 8 hosting providers: ~$2,400/month. Savings: $1,535/month or $18,420/year.
What Could Go Wrong?
This isn't a perfect solution. It's pragmatic, not bulletproof.
Single points of failure everywhere:
- One EC2 instance. If it dies, all sites go down.
- Single-AZ RDS. If AWS has an availability zone issue, database goes down.
- Local Redis. If the instance restarts, cache is gone and 120 sites hit the database simultaneously.
But here's the thing: these are small business sites. A plumber's website. A dentist's site. A local restaurant. They don't need five nines of uptime. They need to be fast when they're up, and they need to not cost a fortune.
In 18 months of running this setup, we've had two unplanned outages. One was a bad plugin update that I rolled back in 10 minutes. The other was an AWS issue that lasted 47 minutes. Total downtime: about an hour over a year and a half.
That's better than 99.9% uptime. Good enough.
Other potential issues:
- Resource contention: One site getting DDoS'd could affect others. WAF and CloudLinux LVE limits help, but aren't perfect.
- Scaling ceiling: Eventually you hit the limit of one server. At 200+ sites, you'd need to add capacity.
- Complexity: More moving parts than shared hosting. Requires someone who knows what they're doing.
The tradeoff is clear: lower cost and great performance in exchange for accepting occasional downtime. For the client's use case, that's the right call.
Monitoring and Maintenance
Set up comprehensive monitoring:
- CloudWatch for server metrics (CPU, memory, disk)
- RDS performance insights for database queries
- Custom uptime scripts
- Custom Redis health check (a quick sketch follows this list)
- Weekly automated WordPress updates with rollback capability
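The Redis health check is nothing fancy. A sketch of the idea, with a placeholder alert address:
#!/bin/bash
# Sketch: verify Redis answers PING and warn when usage approaches the 8GB maxmemory cap.
# admin@example.com is a placeholder alert address.
if ! redis-cli -h 127.0.0.1 ping | grep -q PONG; then
    echo "Redis is not responding on $(hostname)" | mail -s "Redis DOWN" admin@example.com
    exit 1
fi

USED="$(redis-cli -h 127.0.0.1 info memory | awk -F: '/^used_memory:/{print $2}' | tr -d '\r')"
LIMIT=$((7 * 1024 * 1024 * 1024))   # warn at ~7GB of the 8GB maxmemory
if [ "$USED" -gt "$LIMIT" ]; then
    echo "Redis is using ${USED} bytes of memory" | mail -s "Redis memory warning" admin@example.com
fi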
Key Lessons
RDS was worth it.
Separating the database from the web server gave us flexibility and reliability. Worth the extra cost.
Redis object cache is mandatory.
Without Redis, the database would be crushed. With Redis, database queries drop by 80-90%. The single most impactful optimization.
LiteSpeed + LSCache is legit.
I was skeptical of proprietary web servers, but LiteSpeed performs noticeably better than Apache or Nginx for WordPress. The license pays for itself.
Automation is mandatory at scale.
You cannot manually manage 120 sites. Scripts for updates, backups, and maintenance are not optional.
CloudLinux LVE isolation is underrated.
Being able to resource-limit individual accounts prevents one bad site from killing the server. Essential for multi-tenant setups.
Would I Do It Again?
Yes, but with some changes:
- Use ElastiCache for Redis instead of running it on EC2
- Set up automated failover for the EC2 instance
- Use Cloudflare in front of everything for DDoS protection
- Implement better automated testing before mass updates
For agencies managing dozens or hundreds of WordPress sites, this approach works. It requires upfront setup and ongoing maintenance expertise, but the performance and cost savings are real.
Just don't try this if you don't know what you're doing. WordPress at scale is not beginner-friendly.