Why does Apache spawn so many processes that it eventually brings the system down?

Good day, everyone.

There is a Zabbix 3.0 web server on a virtual machine running Debian (kernel 3.2.0-4-amd64) under VMware, 12 cores, 8 GB RAM. On it:

NGINX as the frontend, Apache as the backend, PHP5.
MySQL runs on a separate physical machine.

The load on the server is fairly heavy. Zabbix hosts a huge number of maps with thousands of elements (I suspect the main problem lies there). Around 100 users can be working simultaneously.

The situation is as follows. Recently the web server began to slow down.
The process "/usr/sbin/apache2 -k start" multiplies. Constantly, regardless of the time of day and, accordingly, of the load on the server.

The processes breed until memory runs out and the server stops responding.

We restart Apache and everything works perfectly and fast. The processes begin to multiply, but everything still responds quickly.
After 10 minutes there are 25 processes, the CPU is loaded 100% (all cores at 100%), but the web interface still responds quite quickly.
After 20 minutes there are already 40 processes, but we don't care, everything is fine.
After 30 minutes there are already 60 processes; we still hold on, memory usage climbs to 5-6 GB, but delays begin.
After 50 minutes there are about 80 processes; we give up and, feeling real slowdowns, send Apache off to restart.


Then it all repeats in a circle, no matter whether it is working hours or the dead of night. Within hours the server dies.

apache2's error.log only starts complaining when memory overflows and the server can no longer spawn processes. The other logs likewise look clean.
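
A minimal way to capture that growth curve while the logs stay clean (a sketch assuming the stock Debian apache2 process name and the procps tools):

# log the apache2 process count and memory use once a minute
while true; do
    echo "$(date '+%F %T') procs=$(pgrep -c apache2) $(free -m | awk '/^Mem:/ {print "used_mb=" $3}')"
    sleep 60
done >> /var/log/apache-growth.log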

I'm tired of this fight. Please don't give advice in the style of "reinstall Windows". That will be the very last resort; for now it is important to find the root of the evil rather than cut down the whole forest.

I also want to add that before this it was just nginx + php5-fpm, and the situation was exactly the same, only the php5-fpm processes were the ones multiplying. (It was actually worse: with all cores loaded at 100% it was impossible to work, whereas in the current setup the breeding processes have almost no effect on responsiveness until they devour all the memory.)

I'll say the key line right away: "previously everything worked perfectly for several months; I haven't changed any settings, it broke by itself".

Where should I look? Which log should I check?
June 8th 19 at 17:07
1 answer
June 8th 19 at 17:09
1. The maximum number of processes allowed is set in the Apache config (look at the MPM worker settings); the same applies to php-fpm, where you can open the pool config and specify the maximum number of handlers.
2. Zabbix updates come in from a great variety of devices; perhaps it's worth sending them less often. Look at the Zabbix configs and the collection-interval settings. It may simply be receiving more data than it can process, and then the solution is either to reduce the amount of data or to scale the server (vertically or horizontally).
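
As a hedged sketch of point 1 on the php-fpm side (the directive names are standard; the numbers are illustrative and should be sized from measured per-worker memory):

; pool config, e.g. /etc/php5/fpm/pool.d/www.conf
pm = dynamic
pm.max_children = 100     ; hard ceiling: (RAM left for PHP) / (avg worker RSS)
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 20
pm.max_requests = 500     ; recycle workers periodically to contain leaks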
the maximum number of processes allowed


The Apache MPM module configs (the three blocks below appear to be the prefork, worker, and event sections):

StartServers 5
MinSpareServers 8
MaxSpareServers 16
MaxClients 150
MaxRequestsPerChild 10000

StartServers 5
MinSpareThreads 8
MaxSpareThreads 24
ThreadLimit 20
ThreadsPerChild 10
MaxClients 150
MaxRequestsPerChild 10000

StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxClients 250
MaxRequestsPerChild 100
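
For reference, a memory-based sanity check on MaxClients (assuming mod_php, where each worker carries a full PHP interpreter) might look like this. Given that ~80 processes eat 5-6 GB, one worker here is roughly 70 MB, so a MaxClients of 150-250 can never fit into 8 GB:

# average resident size of one Apache worker
ps -o rss= -C apache2 | awk '{sum+=$1; n++} END {printf "%.0f MB avg\n", sum/n/1024}'
# rough ceiling, leaving ~2 GB for the OS, nginx and the Zabbix daemons:
#   MaxClients = 6144 MB / 70 MB = ~88; the box swaps long before 150 is reached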


However I play with these values, honestly, nothing in particular comes of it. Although most likely I'm just being obtuse, so in the end I left the last change in place and gave up.

Zabbix updates come in from a great variety of devices; perhaps it's worth sending them less often


The amount of data being sent really is unreal. On just one of the servers we have 11 thousand monitored hosts. There is a proxy server for one large segment. I'll try digging in the direction of reducing the polling frequency. - aida_Terry97 commented on June 8th 19 at 17:12
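
Item update intervals are set per item in the Zabbix frontend, while the server-side ingestion knobs live in zabbix_server.conf. A hedged sketch (these parameters exist in Zabbix 3.0; the values are illustrative, not recommendations):

### zabbix_server.conf: size these against ~80k items and 11k hosts
StartPollers=30          # the default of 5 is far too few for this item count
StartTrappers=10
CacheSize=256M           # configuration cache
HistoryCacheSize=128M
ValueCacheSize=512M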
Please tell me how to calculate the values in the configs above based on the following data:

We have about 100 simultaneous users. Of them, 80 are just viewing (mostly the heavy maps); 20 may be configuring something, but they rarely all configure at once.

Data items: 80 thousand
Number of hosts: 11 thousand

12 cores, 8 GB RAM. - aida_Terry97 commented on June 8th 19 at 17:15
There are no universal figures where you multiply the number of users by some per-user rate; every load profile is different.
If it can't hold up with just 80 processes, you need to understand where the bottleneck is (most likely the database and I/O operations).
Performance optimization is done individually every time.
You need to scale, and better to do it horizontally: bring up new servers for processing and balance the load across them, not just across processes. That can't be explained in one comment; start the journey here: https://nginx.ru/ru/docs/http/ngx_http_upstream_mo...
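
A minimal upstream sketch of that idea (the addresses and ports are hypothetical placeholders):

upstream zabbix_web {
    # round-robin is the default balancing method
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://zabbix_web;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}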
I would do the following:
1. Find out what exactly the problem is (collect load metrics for io, cpu and ram, and collect the same data from the database servers).
2. Possibly ask for another server (put the database on its own machine, one front server on nginx, and a group of upstream servers to handle requests).
3. Look at which servers are slowing down: the database, the backend servers, or both.
4. Start optimizing the backend servers, beginning with the trivial php opcache (re-interpreting the files on every request is expensive for the cpu; see the sketch after this list), then look for bottlenecks and remove them.
4.2. Read up on zabbix and on optimizing it under load.
5. For the database, look at the slow queries, use caching and so on to strongly reduce the load on io, plus sharding and the like.
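
The opcache sketch promised in point 4 (PHP 5.5+ bundles the extension; the values are common illustrative starting points, not tuned recommendations):

; php.ini: compile each script once and serve cached opcodes afterwards
opcache.enable=1
opcache.memory_consumption=128
opcache.max_accelerated_files=10000
opcache.validate_timestamps=1
opcache.revalidate_freq=60   ; recheck files for changes at most once a minute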

Doing all this properly takes skilled people. As a stopgap you can try limiting the number of requests to the backend in nginx, but then the excess requests get errors, which is not great. - hall commented on June 8th 19 at 17:18

Believe it or not, every bit of this has been done, and repeatedly. In the beginning it was nginx + php-fpm, and it was php-fpm that choked, loading the CPU to 100%. I dug into it back then (there is a separate question about it here on Toster) and eventually concluded that Zabbix needed updating. I like to understand a problem and rarely resort to the "scrap it and reinstall" method.

I decided to check how the web works without php-fpm, so I made Apache the backend. Same story, only now it's the Apache processes that multiply.

I monitor the database server regularly: the CPU load does not exceed 20% in total, and of the 64 GB of available memory it eats only 11.

As I already mentioned, I made a clone of the main web server and moved a group of users onto it. Everything runs well and smoothly there. Of the 100 users, at least 20-30 are on it right now, and they are all fine.

On the main one nothing has changed: the Apache processes breed and the server dies. For now I've thrown an Apache restart into cron; users don't notice, so in principle it isn't a problem. But we all know how that goes... this crutch will have to go. Which is why, out of despair, I'm writing here. I'm not that good at digging around in web servers )))

I haven't yet dug deeply into the DB server. I haven't enabled the slow log there because that would mean a restart, and the server hates restarts. But apparently the time has come :) - aida_Terry97 commented on June 8th 19 at 17:21
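
For what it's worth, on MySQL 5.1 and later the slow log can usually be switched on at runtime, without the restart the server hates (a sketch; the log path is a placeholder):

-- run as a privileged user; the settings last until the next restart
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
SET GLOBAL long_query_time = 1;   -- seconds; start high, lower it gradually
SET GLOBAL slow_query_log = 'ON';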
If the cpu load persists even though you moved some of the users over, then first of all: the devices on your network are mostly still knocking on the old server, and it is the old server that processes their data. So you have to count not only the users but, first of all, the hosts the data comes from.

Secondly, I would use php-fpm instead of apache+mod_php (this is purely IMHO, and apache+mod_php can surely be tuned as well; I just configure php-fpm more often).

Thirdly, php may have things like xdebug installed, or opcache not enabled, which can drive up cpu load; so it's worth taking the current phpinfo and comparing it with one from the period when it ran without problems.
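
A quick way to check those two suspects from the shell (the CLI can load a different config than the web SAPI, so a phpinfo() page from the browser is still worth comparing):

# list loaded extensions and grep for the usual suspects
php -m | grep -i -E 'xdebug|opcache'
# opcache status as the CLI sees it
php -i | grep -i 'opcache.enable'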

Fourth, you need to see how many rps arrive at the load balancer (nginx) and how many it hands on to the backend; you need to understand whether the processes are being multiplied by a broken script or by real load on the server.
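
A rough way to read the rps off the nginx access log (assumes the default combined log format, where field 4 is a timestamp like [08/Jun/2019:17:09:01):

# requests per minute for the most recent minutes; divide by 60 for rps
awk '{print $4}' /var/log/nginx/access.log | cut -d: -f1-3 | uniq -c | tail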

Fifth, I would simply set up an upstream in nginx and add 2 servers. If the load overall calms down, it just means a single server couldn't cope. If both servers start to slow down, then they can't handle it either (and then you either add servers or reduce the load, for example by collecting less data). If the same one server keeps slowing down while the other works fine, then the problem is inside that server. With round-robin balancing, after adding servers the load between the two should be roughly equal.

6. There may be a flaw in the code, say an infinite loop; you need to understand whether the workers are crashing or it is simply a large number of clients.

7. On the database servers and the other servers you need to monitor io (input/output), because slowdowns usually come from there, not only from cpu and ram.
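
The standard tools for that (from Debian's sysstat and procps packages):

# per-device utilization, queue length and wait times, every 5 seconds
iostat -x 5
# swapping and block io at a glance
vmstat 5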

I would start with point 3, then point 5, and then decide based on what turns up. - hall commented on June 8th 19 at 17:24

Tags: Zabbix, PHP, Apache