How to calculate the hardware parameters for a file server?

Good day.


Task: move heavy file content from the primary server to a new dedicated server.


Content parameters: primarily audio files of 5 to 200 MB each, currently 800 GB in total, and the volume is growing exponentially.


Monthly traffic: ~10,000,000 page views, ~5,000,000 visits, ~1,000,000 unique visitors.

Daily traffic: ~350,000 page views, ~160,000 visits, ~100,000 unique visitors.

Concurrent users on the site: from ~3,000 to ~10,000 (according to Chartbeat).


Average channel load: from 1 Gbps to 4 Gbps.


Based on this data, I would like to approach the hardware requirements of the new servers competently, i.e. memory, disk subsystem, and CPU.

The server software will be nginx. Any other services will not serve users directly.


Current estimate: CPU — 4 × 2.6 GHz; memory — 24 GB; disk subsystem — 2 TB to start; channel — 10 Gbps.

Budget: ideally within 100,000 RUB/month.


Since this is essentially static content: what are the pitfalls, and what technology should the disk subsystem use? What should I read up on, and what information should I collect now? Currently the portal runs on a general-purpose cluster. Everything holds up fine, but architecturally (and philosophically) it is all wrong and will not withstand constant vertical scaling. Current hardware: 24 CPU cores at 2.2 GHz, 48 GB memory, 1 TB RAID 5 + 2 × 100 GB SSD.
October 8th 19 at 00:27
7 answers
October 8th 19 at 00:29
You should start not from the general parameters, but from the disk load.
How many IOPS at peak? How much CPU and RAM do the current machines actually use?
Will a single new server on plain HDDs be enough?
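One way to answer the peak-IOPS question is to sample the completed-I/O counters in /proc/diskstats during peak hours. A minimal sketch (the device name is simply taken from the first line here; on a real box, substitute your data disk):

```shell
# Rough IOPS estimate from /proc/diskstats (Linux): read the completed
# read+write I/O counters twice, one second apart, and take the delta.
DEV=$(awk 'NR==1 {print $3}' /proc/diskstats)          # assumption: first listed device
ops() { awk -v d="$DEV" '$3 == d {print $4 + $8}' /proc/diskstats; }
a=$(ops)
sleep 1
b=$(ops)
iops=$((b - a))
echo "device=$DEV iops_over_1s=$iops"
```

Run this in a loop during the busiest hour and keep the maximum; `iostat -x 1` from the sysstat package gives the same numbers with less effort.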
October 8th 19 at 00:31
It is better to use a cluster of servers — both more reliable and cheaper than buying one super-powerful server.
Mainstream servers have two 1G network cards; you can bond them and get 2 Gbit to the server (realistically 1.5–1.8 Gbit without packet loss).
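A sketch of what that bonding could look like on a Debian/Ubuntu box of that era (assumes the ifenslave package; addresses and interface names are illustrative, and LACP mode requires switch support):

```
# /etc/network/interfaces — illustrative two-NIC bond
auto bond0
iface bond0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode 802.3ad     # LACP; the switch must be configured to match
    bond-miimon 100       # link monitoring interval, ms
```

With `balance-rr` or `802.3ad`, per-flow throughput still tops out at one NIC's speed; the aggregate across many clients is what approaches 2 Gbit.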
Also, an inexpensive modern 1U chassis holds 4 HDDs, which is fine for these tasks; you can even mirror them all in RAID 1 to increase read performance.
The CPU is almost irrelevant for serving static content; it is better to take fewer cores at a higher frequency, which helps the speed of network interrupt processing.
Cram in the maximum memory; for mainstream motherboards that is 16 GB.
Be sure to pay attention to the network card: it should be Intel or Broadcom, and in no case Realtek or other Marvell-grade cards.
Traffic should be balanced not randomly, but so that the same file is always served from the same server — that way RAM is used more efficiently as a file cache: roughly speaking, the cache is summed across all servers.
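That "same file → same server" balancing can be expressed in nginx with consistent hashing on the request URI. A sketch, assuming a frontend proxying to hypothetical backend addresses:

```nginx
# Illustrative nginx upstream: hash on the URI so a given file is always
# fetched from the same backend, keeping each backend's RAM cache hot.
upstream audio_backends {
    hash $request_uri consistent;   # consistent hashing by file path
    server 10.0.0.11;
    server 10.0.0.12;
    server 10.0.0.13;
}

server {
    listen 80;
    location / {
        proxy_pass http://audio_backends;
    }
}
```

The `consistent` flag (ketama-style hashing) means that adding or removing a backend remaps only a fraction of the files, instead of reshuffling every cache.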
SSDs are a separate story; in your case (a small amount of content) it may even be preferable to take one 480 GB SSD for ~$500 instead of 4 HDDs, and save on memory.

In general, since this is essentially a CDN, it might be easier and cheaper to use one of the CDN services already operating on the market.
You don't need a SuperServer for such loads. A regular desktop i7 2700K with an SSD and 4–6 HDDs copes fine and pushes a good 20 Gbit. The main thing is to think through how the files are distributed across the drives. - Rosemary14 commented on October 8th 19 at 00:34
"The CPU is almost irrelevant for serving static content"
Very misleading. At speeds of several gigabits and above, the CPU starts to play a very important role, including the number of cores. If you compare desktop hardware, Intel does better here at a comparable price.

In general, a lot of very bad advice is being given here without knowing the specifics of the content. In particular, the advice about memory, RAID, balancing, and SSDs is misguided. More precisely, it works only under certain conditions, and by no means all of them hold. - Rosemary14 commented on October 8th 19 at 00:37
October 8th 19 at 00:33
If you only distribute content without processing it, you hardly need the CPU at all; the main bottleneck is the disks.
It is safer, and usually cheaper, to build this not on one powerful server but on several fairly simple ones.

It may make sense to have a couple of primary storage servers and several frontends with caching SSDs, if some files are requested significantly more often than others.
If requests are spread evenly across all files, it makes sense to take several simple single-CPU 1U servers with 4 HDDs each and more RAM, and distribute requests between them.
Such a setup is both safer and usually cheaper than a single server handling the same load. It is also easier to scale later, gradually adding identical cheap servers with the same hardware configuration.
October 8th 19 at 00:35
A single server is fine, but what about fault tolerance? Better take care of that right away. As a budget failover option I would put a second server with the same roles next to it in standby, and sync the files with rsync. If the first one fails, bring up the roles on the second and remap the ports on the router, or change its IP to the one that belonged to the first.
October 8th 19 at 00:37
Current load snapshot: CPU — less than 50% of 2400% used; memory — 12 GB of 48 used.

iostat:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.91    0.00    2.28    0.26    0.00   95.55

          tps   Blk_read/s   Blk_wrtn/s      Blk_read      Blk_wrtn
       244.74     18145.77      1446.57   39002384910    3109236520

FS: ext3, ext4

As for scaling: there is no fixed scaling plan for the file server specifically; the question is essentially how to scale it. Given the specifics, both vertical and horizontal scaling are quite feasible.
The IOPS and throughput requirements depend strongly on the access technology; I can't give specific numbers.
File access is predominantly (95%) via direct links, plus per-user dynamic link rewriting in nginx.
If the new single server goes down — we grieve and deploy a new backup, remaining without files for that time. Unpleasant, but not critical. While it is being rebuilt after a potential collapse, the main cluster will take over the file-serving duties again.
October 8th 19 at 00:39
The volume is not very big, and reads are more or less sequential, although in a number of parallel streams.
It seems to me the simplest option is to take hardware with 8 × 450 GB 10k rpm SAS drives, 16+ GB of RAM, and 4 cores in each box.
October 8th 19 at 00:41
For your budget, it is more likely to be one server than two. If it goes down, you will have to bring it back up, not serving files in the meantime.

But as for the rest: for serving 10G, a single i7 2700K CPU or equivalent is enough (a 980 or 3770, not a Phenom!).

I assume you will use an X520-class network card, so problems with the network slowing things down should not arise.

The easiest way to squeeze out 10G is to use SSDs. For your money that will be something like 8 × 240 GB drives, e.g.: hotline.ua/computer-diski-ssd/ocz_agt3-25sat3-240g/ or the Vertex series; for this task there is little difference between them.

Choose a motherboard with 6 SATA ports (SATA3 doesn't matter), integrated video, and at least 2 PCIe x16 slots (in reality one will run at x8 and one at x4).

To connect eight drives you need to add a controller. I recommend this one: hotline.ua/computer-kontrollery-raid/adaptec_raid_1430sa/. Despite having the same chipset, do not use SiI (Silicon Image) cards — they hang the server. Take 4 ports, not 2, so the system drive also fits.

So in total: the network card goes into the x8 slot, and the controller into the x4.

As the filesystem, use XFS (with default settings, plus the noatime mount option), and accordingly a recent Linux as the OS (in the case of Ubuntu, 12.04). Don't install older versions — their network subsystem may not be as good. Don't install FreeBSD either — a recent Linux works with the network significantly faster.

Don't use RAID — spread the files across the drives yourself. If the data exceeds 4 SSDs, it is better to spread it randomly and separately keep a second copy of the ~20% most popular files. How many copies, and of what percentage of the files, will in practice have to be tuned to the load.
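One way to spread files across drives without RAID is to derive the drive index deterministically from the file name, so any frontend can compute a file's location without a lookup table. A sketch (the `/mnt/ssd0`…`/mnt/ssd3` mount points are hypothetical):

```shell
# Illustrative placement scheme: checksum of the file name modulo the
# number of drives picks the mount point. Deterministic, no database.
N=4
place() {
    h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)   # 32-bit CRC of the name
    echo "/mnt/ssd$((h % N))/$1"
}
place "track-0001.mp3"
place "track-0002.mp3"
```

The drawback of plain modulo is that changing N remaps most files; the duplicated "hot 20%" copies the answer suggests would live on a second, independently chosen drive.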

4 or 8 GB of memory is enough: the data set won't fit in RAM anyway, so as a disk cache it is practically useless for this task.

In nginx, set sendfile off and turn on aio.
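A minimal sketch of those directives (the location and root are illustrative; note that on Linux, per the nginx docs, aio for reads also requires directio to be enabled):

```nginx
# Serve large static audio files with asynchronous I/O instead of sendfile.
location /files/ {
    root      /srv/content;   # hypothetical content root
    sendfile  off;            # as advised above
    aio       on;             # asynchronous file I/O
    directio  512k;           # reads above this size use O_DIRECT + aio
}
```

The directio threshold means small files still go through the page cache, while large audio files bypass it, which matches the "RAM is useless as a cache here" reasoning above.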

These tips are for a site like yours — for example mp3s or, say, online video — with a total file volume of 0.8–1.5 TB.

If you describe how many files you have in total and what percentage of their volume generates 50% of the traffic (or better, 80%), the configuration may need adjusting.

Find more questions by tags: System administration, Highload, File server