
Huge Spike in Load After Moving to IPB 4.1



Keep in mind Linode is a VPS too. I agree that you need to move to a dedicated server so you are no longer sharing resources. No matter how good a VPS setup you have, you are still sharing resources, and with a site this size (traffic) you need to isolate your resources to yourself (dedicated). Based on the info in this topic, you are weak on processor power and perhaps disk I/O. You may have 12 cores, but which processor those 12 cores belong to is what makes the difference, and the sharing aspect also comes into play. IPS4 is pretty resource hungry compared to a simple forum-only solution, though, so it will require more.

 

P.S. Changing web servers and MySQL servers isn't going to solve this, IMO, so I would skip that and save yourself the work.

 

33 minutes ago, Rhett said:

Keep in mind Linode is a VPS too. I agree that you need to move to a dedicated server so you are no longer sharing resources. No matter how good a VPS setup you have, you are still sharing resources, and with a site this size (traffic) you need to isolate your resources to yourself (dedicated). Based on the info in this topic, you are weak on processor power and perhaps disk I/O. You may have 12 cores, but which processor those 12 cores belong to is what makes the difference, and the sharing aspect also comes into play. IPS4 is pretty resource hungry compared to a simple forum-only solution, though, so it will require more.

 

P.S. Changing web servers and MySQL servers isn't going to solve this, IMO, so I would skip that and save yourself the work.

 

 

I somewhat agree, though Linode's platform is very powerful. I have verified that disk is definitely not the pain point - I never show any wait I/O on the box no matter what. All the CPU usage is bottled up in "us", not "wa" when I'm watching top. Linode has full SSDs, so that's probably why I have no disk issues (along with having enough RAM to cache the database).

My fear is that a dedicated machine to match the Linode (Xeon E5 2680v3) will be difficult to find in the same price range. I'm also concerned that a dedicated machine doesn't give enough opportunity to scale up and down quickly. I may look into a solution involving multiple smaller Linodes and see how that goes.
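For anyone who wants to check the "us" versus "wa" split outside of top, here is a minimal sketch. It assumes a Linux box where the first line of /proc/stat is the aggregate `cpu` line with the usual field order (user, nice, system, idle, iowait, irq, softirq, steal). The function name and the sample numbers are made up for illustration and are not measurements from this server.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def cpu_breakdown(before: str, after: str) -> dict:
    """Percentage of CPU time spent in each state between two aggregate
    'cpu' lines taken from /proc/stat (Linux)."""
    start = [int(v) for v in before.split()[1:1 + len(FIELDS)]]
    end = [int(v) for v in after.split()[1:1 + len(FIELDS)]]
    deltas = [e - s for s, e in zip(start, end)]
    total = sum(deltas) or 1  # avoid dividing by zero on identical samples
    return {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}

# Made-up sample lines for illustration, not measurements from this server.
before = "cpu  100 0 50 800 50 0 0 0 0 0"
after = "cpu  1000 0 100 850 50 0 0 0 0 0"
usage = cpu_breakdown(before, after)
print(f"us={usage['user']:.1f}%  wa={usage['iowait']:.1f}%")
```

On a live system the two inputs would be the first line of /proc/stat read a few seconds apart. A high `user` share with `iowait` near zero would match the "CPU bound, not disk bound" reading described above.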


11 minutes ago, Ghan said:

 

I somewhat agree, though Linode's platform is very powerful. I have verified that disk is definitely not the pain point - I never show any wait I/O on the box no matter what. All the CPU usage is bottled up in "us", not "wa" when I'm watching top. Linode has full SSDs, so that's probably why I have no disk issues (along with having enough RAM to cache the database).

My fear is that a dedicated machine to match the Linode (Xeon E5 2680v3) will be difficult to find in the same price range. I'm also concerned that a dedicated machine doesn't give enough opportunity to scale up and down quickly. I may look into a solution involving multiple smaller Linodes and see how that goes.

If you really had access to 100% of the resources in your specs, the thing would be flying, though; that's the crazy part... so it's a catch-22, really. Have you considered AWS?


For less than you pay now, you can get a 2 x Xeon 2670v2 with 40 threads. OK, each of those threads is not as powerful as one of Linode's, but it's 12 against 40 ;)
And it's 40 dedicated to you. I don't know if the Linode threads are dedicated only to you.

It also has 256 GB of RAM and 2 x 480 GB SSDs, plus some of the best dedicated DDoS protection. Linode doesn't have DDoS protection. They will null-route your IP if you get attacked.


20 minutes ago, Flitterkill said:

Are you using the Longview service at all with Linode?

I am! Just the free version, though. It seems to corroborate what I'm seeing generally speaking:

 

ghan-18-27-32.png

 

ghan-18-28-01.png

 

I don't have it configured to log in to MySQL right now, though I suppose I could try getting that working. I do have a bit of a custom setup, though.

 

18 minutes ago, Rhett said:

If you really had access to 100% of the resources in your specs, the thing would be flying, though; that's the crazy part... so it's a catch-22, really. Have you considered AWS?

 

Last time I tried AWS, it was quite confusing and difficult to set up. And I think they don't give you any burst potential on CPU at all - they just cap your usage regardless of any other machines running on the host. That can be good in some cases, but I think it might be restrictive in my case, especially at these price points. Linode, on the other hand, is probably giving me free rein here - the host is reporting overall usage as low:

ghan-18-30-28.png

 

17 minutes ago, RevengeFNF said:

For less than you pay now, you can get a 2 x Xeon 2670v2 with 40 threads. OK, each of those threads is not as powerful as one of Linode's, but it's 12 against 40 ;)
And it's 40 dedicated to you. I don't know if the Linode threads are dedicated only to you.

It also has 256 GB of RAM and 2 x 480 GB SSDs, plus some of the best dedicated DDoS protection. Linode doesn't have DDoS protection. They will null-route your IP if you get attacked.

 

The threads are not dedicated to me (unfortunately!) though I suspect I'm getting the majority use out of them.

DDoS protection is a valid point and I've run into this problem before for other sites in the past. I think Linode is better now than they used to be (they had some serious trouble last holiday season where their entire network was under attack) but it's still a potential problem. At the end of the day though, this site is a hobby site and runs off of the donations of the users. We're not going to lose business or livelihoods if something like that were to happen, so it's just a nice to have in the cost/benefit scheme.

 

 

Also, here's a snapshot of what I'm seeing on top:

 

ghan-18-36-04.png

 

Some interesting information from SHOW ENGINE INNODB STATUS:

 

Number of rows inserted 2871215, updated 10122526, deleted 638297, read 15076871109
72.64 inserts/s, 213.26 updates/s, 47.98 deletes/s, 292044.99 reads/s

 

Looks like reads are the main thing going on here.
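To put a number on "reads are the main thing", here is a small sketch that parses the per-second line quoted above and compares rows read to rows written. The function name is hypothetical, and the regex assumes the line keeps the `<number> <name>/s` layout shown in this post.

```python
import re

def row_op_rates(status_line: str) -> dict:
    """Extract per-second row operation rates from a line such as
    '72.64 inserts/s, 213.26 updates/s, 47.98 deletes/s, 292044.99 reads/s'."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+) (\w+)/s", status_line)}

# The line below is copied from the SHOW ENGINE INNODB STATUS output above.
line = "72.64 inserts/s, 213.26 updates/s, 47.98 deletes/s, 292044.99 reads/s"
rates = row_op_rates(line)
writes = rates["inserts"] + rates["updates"] + rates["deletes"]
print(f"rows read per row written: {rates['reads'] / writes:.0f}")
```

Note that these counters are rows touched by InnoDB, not queries, so a large ratio may point at queries scanning many rows rather than at a large number of queries. That is only a guess without the slow query log.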


If those threads aren't dedicated I guarantee there are others around who are probably not too happy with you right now ^_^

My gut tells me there is something just flat out wrong somewhere, maybe in your Linux build (you did mention compiling) or ???

It just doesn't seem like that many IPS users could just break a server spec'd like this - Linode VPS or otherwise. As a guest on your site I can refresh the activity feed and it pops up immediately. The site feels perky (with 531 online right now)

We (collectively) have plenty of data on Apache and Nginx; aside from you I don't think we have anything else from LiteSpeed users. There might just be something "wrong" with LiteSpeed + IPS right now.

EDIT:

292044.99 reads/s

Then again...


  • Management

This seemed totally bizarre because those specs should be able to handle twice your traffic. So, I took the liberty of taking a look for you. 

Firstly, your site is performing rebuild tasks... those are very intensive and are most certainly contributing to the MySQL footprint. Ensure you've set those tasks up on a cron and they'll complete faster. 

Secondly, you have a few third party applications installed that are pretty heavy on resources as well. 

I would advise getting through the rebuild process, seeing how things perform, and experimenting with toggling the third party apps on and off.


49 minutes ago, Lindy said:

This seemed totally bizarre because those specs should be able to handle twice your traffic. So, I took the liberty of taking a look for you. 

Firstly, your site is performing rebuild tasks... those are very intensive and are most certainly contributing to the MySQL footprint. Ensure you've set those tasks up on a cron and they'll complete faster. 

Secondly, you have a few third party applications installed that are pretty heavy on resources as well. 

I would advise getting through the rebuild process, seeing how things perform, and experimenting with toggling the third party apps on and off.

Thanks for taking a look. We are patiently waiting for those to end. After that we will see where to go from there.


2 hours ago, Lindy said:

This seemed totally bizarre because those specs should be able to handle twice your traffic. So, I took the liberty of taking a look for you. 

Firstly, your site is performing rebuild tasks... those are very intensive and are most certainly contributing to the MySQL footprint. Ensure you've set those tasks up on a cron and they'll complete faster. 

Secondly, you have a few third party applications installed that are pretty heavy on resources as well. 

I would advise getting through the rebuild process, seeing how things perform, and experimenting with toggling the third party apps on and off.

 

Yeah, I had the same thought, though I see the cron task run periodically and it seems to act just like another PHP thread. I tried turning off the cron for a bit shortly after we stood everything up and it seemed to not have much effect. Still, it would probably be wise to let the rebuild tasks complete before looking for another solution to see what the steady state will be.


You're not the only one seeing the spikes; we seem to get large connection counts on the forum, up to 400 plus. To ease it we had to upgrade our RDS on AWS to db.m3.2xlarge, which is obscene, and it still locks up to 100 connections, but the site doesn't time out, so our forum is stable again. This has been the case since the upgrade to 4.x.


All the rebuild tasks have finished as of this morning but I'm not seeing any improvement in the load.

I have moved the MySQL server out to a separate VPS and it is performing just fine on its own. The main issue seems to come from PHP. Processes executing PHP take a long time to complete and return the result to the webserver. During the night and early morning when the 12 cores are not completely maxed out, I see PHP slam individual cores - when a request comes in, a PHP thread will take a single core to 100% usage and sit for a few seconds until the request is finished.

Something in the code is causing a lot of CPU usage, but I'm not sure what. We do have over 6 million posts and around 2500 forums, so I wonder if something there is the issue.
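The "one PHP thread pins one core" pattern can be confirmed per core rather than by eye. This is a rough sketch that assumes Linux and the standard `cpuN` lines in /proc/stat; the function name and the two-core sample text are made up for illustration.

```python
def busy_percent_per_core(before: str, after: str) -> dict:
    """Busy percentage for each core between two /proc/stat snapshots."""
    def parse(text):
        cores = {}
        for line in text.splitlines():
            parts = line.split()
            # Per-core lines look like "cpu0 ..."; the aggregate line is just "cpu".
            if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
                values = [int(v) for v in parts[1:9]]
                not_busy = values[3] + values[4]  # idle + iowait
                cores[parts[0]] = (sum(values), not_busy)
        return cores

    first, second = parse(before), parse(after)
    result = {}
    for name, (total_after, idle_after) in second.items():
        total_before, idle_before = first[name]
        elapsed = (total_after - total_before) or 1
        result[name] = 100.0 * (1 - (idle_after - idle_before) / elapsed)
    return result

# Made-up two-core snapshots: cpu0 is pinned, cpu1 is nearly idle.
snap1 = "cpu0 0 0 0 100 0 0 0 0\ncpu1 0 0 0 100 0 0 0 0"
snap2 = "cpu0 100 0 0 100 0 0 0 0\ncpu1 5 0 0 195 0 0 0 0"
for core, pct in busy_percent_per_core(snap1, snap2).items():
    print(f"{core}: {pct:.0f}% busy")
```

Sampling this around a single slow request would show whether one core sits near 100% for the full duration of that request, which is what the description above suggests.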


1 minute ago, ZeroHour said:

I take it Group Roleplays is Kevin's Group Collaboration plugin?

It may be worth disabling it for a short test to see if that affects anything and if load drops after 5 minutes.

 

The moment it is disabled it tosses all the 2k boards made by users onto the front of the site.


1 hour ago, The Dark Wizard said:

 

The moment it is disabled it tosses all the 2k boards made by users onto the front of the site.

Yeah, I know it would suck, but I would try it when it's quiet and map the difference in load, just for a short time. Because the bulk of your forums use that, I suspect it may be causing the issue, depending on how the addon works. I personally love that app, but I just don't know how well it scales to that sort of size. The other apps are probably quite light, but that one will be big because you use it so heavily, and I don't know if Kevin has tested to that level before.

If you find out it's causing a lot of load for guests etc., then I would drop Kevin a line and see if he can figure out why.

 

EDIT: what sort of guest to member ratio do you have generally? 70% guests 30% members with a 1000 online at busy times or?


23 minutes ago, The Dark Wizard said:

Guests aren't really the issue since IPS has a cache for guests. In fact the site is blazing fast when logged out.

Maybe @Kevin Carwile could pitch in.

It's really, really not fast for me as a guest, tbh. The main index page isn't too bad, but the rest is slow.


Yes, so I at least know the piece of the config causing the issue: Memcached

With memcached on and configured in IPB, the load on the server is 1200%+ and over 30 on top. With memcached off, the load is 250% and generally stays in the 1-4 area.

The same thing happens with Redis configured. No clue why - nothing in the event logs for this. The caching mechanisms destroy the site. I have no idea how the caching is implemented so I can't say what could be going on. I suppose this also could be an addon causing the trouble, but I don't know if addon makers would have to interface directly with the cache or if there is an interface built into IPB that has the necessary functionality. Any insight on that would be most welcome! I'd like to get caching to a place where it can actually help us instead of making the site unusable.

Thanks!
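To put the load figures from the memcached comparison in context, here is a quick bit of arithmetic as a sketch. It assumes the 12 cores mentioned earlier in the thread and treats the load average as tasks running or waiting, spread evenly over the cores; the helper name is made up.

```python
def load_per_core(load_average: float, cores: int) -> float:
    """Average number of runnable (or uninterruptibly waiting) tasks per core."""
    return load_average / cores

CORES = 12  # core count mentioned earlier in the thread

with_cache = load_per_core(30, CORES)    # "over 30" with memcached enabled
without_cache = load_per_core(4, CORES)  # upper end of the "1-4" range with it off
print(f"memcached on:  {with_cache:.2f} per core")
print(f"memcached off: {without_cache:.2f} per core")
```

A value above 1.0 per core means work is queuing faster than the cores can clear it, and a reading of 1200% across 12 cores would put every core near 100%. On a live box, `os.getloadavg()` returns the same 1, 5 and 15 minute figures that top shows.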


Archived

This topic is now archived and is closed to further replies.
