Where and how should I host a large database?

Hi Toaster!
I have no experience administering large projects. Unfortunately, my limit at the moment is setting up a LAMP stack by following instructions from the Internet.
It so happened that the MySQL database of my project has grown to 1.1 TB. That terabyte is taken up by a single MyISAM table with 340 million records. Apache and MySQL live on a Kimsufi server: i5-3570S, 16 GB RAM, and the server is already barely coping. The average load is about 400 concurrent connections, mostly INSERTs. Under overload the database crashes, which for MyISAM means corrupted and locked tables. My sloppy code also plays a significant part. One of the latest crashes, which corrupted 93% of the indexes, finally pushed me to act.
I think the project has outgrown the capacity of a single server and I need to do something to allow for growth, but I don't know what exactly. I've looked at Google Cloud SQL and Amazon RDS, but they are too expensive. I would like to stay within a $250-300 monthly budget so that the project at least breaks even. I suppose it makes sense to set up sharding. What is the optimal size of a single shard, and which server characteristics does it depend on? Or would it make sense to move to a different DBMS?
September 19th 19 at 12:45
6 answers
September 19th 19 at 12:47
Why not try optimizing the database a bit first? Say, splitting off a couple of new tables to start with?
With your current approach, no amount of hardware will ever be enough.
I'd love to, but I don't know exactly what to optimize. For example, does it make sense to split the table when the queries are simple SELECT/UPDATE/INSERT ... WHERE `id` = ...? - Wilfrid.Dickinson68 commented on September 19th 19 at 12:50
Can you show the structure of your mega-table? At least approximately. - esteban_Kiehn15 commented on September 19th 19 at 12:53
puu.sh/gfag7/15fa62b260.png - Wilfrid.Dickinson68 commented on September 19th 19 at 12:56
So what exactly do you have in that json field? Is its structure homogeneous? If yes, then it too could be moved into a separate table. - esteban_Kiehn15 commented on September 19th 19 at 12:59
: you could push that into any indexed storage, even MongoDB, since the queries are simple and use only indexes. - dagmar_Conn commented on September 19th 19 at 13:02
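For illustration, a rough sketch of what moving the json off into a separate table could look like. All table and column names here are hypothetical, since the actual schema from the screenshot isn't reproduced in the thread:

-- Hypothetical schema: `records` is the hot 340M-row table and
-- `payload_json` is the wide json column that is rarely needed.
CREATE TABLE record_payloads (
    record_id    BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    payload_json MEDIUMTEXT      NOT NULL
) ENGINE=InnoDB;

-- Copy the json out (on a 1 TB table this should be done in batches).
INSERT INTO record_payloads (record_id, payload_json)
SELECT id, payload_json FROM records;

-- Afterwards the hot table can drop the column and stay narrow:
-- ALTER TABLE records DROP COLUMN payload_json;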
September 19th 19 at 12:49
There's still not enough information: what exactly is in that json, and how do you use the data?

Here are some tips:
1) Read about partitioning. With your ownerID column you might even split all the data into tables owner1, owner2, ... That saves on index and data size, makes it easier to spread the database across servers (sharding), and queries will run faster (see the sketch after this list).
2) Compress the json; that alone can shrink the data 2x to 10x.
3) Move old data (say, older than a month) into an archive; build the summary reports, caches, etc. that clients actually request, and ship the raw data off to the archive.
4) Try a different DBMS: PostgreSQL stores json compressed and lets you build indexes on it, so your varchars get optimized as well. NoSQL/MongoDB is also an option: for example, one record occupies one storage unit rather than several as in SQL databases, and write speed is higher.
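As a minimal sketch of tip 1, native MySQL partitioning by owner could look roughly like this. Table and column names are hypothetical, the partition count is only an example, and note that MySQL requires the partition key to be part of every unique key, hence the composite primary key:

CREATE TABLE records (
    id           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    owner_id     INT UNSIGNED    NOT NULL,
    payload_json MEDIUMBLOB,                -- e.g. the compressed json from tip 2
    created_at   DATETIME        NOT NULL,
    PRIMARY KEY (id, owner_id),
    KEY idx_owner (owner_id)
) ENGINE=InnoDB
PARTITION BY HASH(owner_id) PARTITIONS 16;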

On the same partitioning principle you can pack data into chunks. For example, if the data is selected by day and owner, then at the end of each day you can pack that day's rows into chunks of (date, ownerID, archived_json). The index size can thus drop by a factor of 100, the data 10x-20x, and read speed can grow up to 50x (I had a similar project). A sketch follows below.
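A rough sketch of that chunking idea, again with made-up names; note that group_concat_max_len has to be raised well above its default for this to work:

CREATE TABLE daily_chunks (
    day           DATE         NOT NULL,
    owner_id      INT UNSIGNED NOT NULL,
    archived_json MEDIUMBLOB,              -- the day's rows, packed and compressed
    PRIMARY KEY (day, owner_id)
) ENGINE=InnoDB;

-- Run once the day is closed; the date literals are just examples.
SET SESSION group_concat_max_len = 16 * 1024 * 1024;
INSERT INTO daily_chunks (day, owner_id, archived_json)
SELECT DATE(created_at), owner_id,
       COMPRESS(GROUP_CONCAT(payload_json SEPARATOR '\n'))
FROM records
WHERE created_at >= '2015-03-01' AND created_at < '2015-03-02'
GROUP BY DATE(created_at), owner_id;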

These tips can turn 1 TB into, say, 10 GB; it all depends on the data and how it is used.
September 19th 19 at 12:51
For a start, why not split it into multiple tables?
If old data just needs to be stored and is only read occasionally, while the main activity is inserting and working with the latest data, then it's worth thinking about how to divide the data.
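For example (table and column names here are hypothetical), the old rows could be shifted into an archive table along these lines, ideally in bounded batches rather than one huge statement:

CREATE TABLE records_archive LIKE records;

-- Move everything older than a month; on a table this size it is safer
-- to loop over limited id ranges instead of one big INSERT/DELETE.
INSERT INTO records_archive
SELECT * FROM records
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 MONTH);

DELETE FROM records
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 MONTH);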

You could also monitor the server and figure out what is loaded the most: disk, memory, network?
Maybe simply setting up replication and splitting the queries across two servers would be enough?
September 19th 19 at 12:53
Look into mysql-slow.log, find the heavy queries, and tune them; I'm 146% sure there's a nut in there you can tighten.
Log all the queries for a while. Analyze each of them for PROPER use of indexes. Clean out unnecessary indexes (that will probably free 20-30 percent of the space). In general, optimize. And only if there is truly nothing left to optimize should you think about shards.
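If the slow query log isn't enabled yet, it can be switched on at runtime; the threshold and path below are only examples, and `records` is a placeholder table name:

SET GLOBAL slow_query_log      = 'ON';
SET GLOBAL long_query_time     = 1;     -- seconds; log anything slower
SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';

-- Then check each heavy query's index usage, e.g.:
EXPLAIN SELECT * FROM records WHERE id = 12345;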
The indexes are set up and there are no extra ones. The queries all have the same structure; the only difference is the filter values. Thanks for the reminder about the slow log, I still haven't gotten around to it. - Wilfrid.Dickinson68 commented on September 19th 19 at 12:56
Well, if EXPLAIN says that all the indexes work properly under the different filters, then shard. I don't know what kind of project this is, but since you're talking about a $250-300 budget just to break even... start with the cheap option: grab some VPS for $30-50 and put a shard on it. - esteban_Kiehn15 commented on September 19th 19 at 12:59
...about 400 concurrent connections, mostly INSERT. (c)

A MyISAM table is locked in its entirety during inserts. Try switching the storage engine to InnoDB. - Wilfrid.Dickinson68 commented on September 19th 19 at 13:02
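A minimal example of that conversion (`records` is a placeholder table name); on a terabyte table it will run for a long time and needs roughly the table's size in free disk space:

-- InnoDB uses row-level locking instead of full-table locks on insert.
ALTER TABLE records ENGINE=InnoDB;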
Good point, by the way )) - esteban_Kiehn15 commented on September 19th 19 at 13:05
September 19th 19 at 12:55
Tough guys use NoSQL
September 19th 19 at 12:57
You could also look at MySQL forks like MariaDB and so on.
And at table partitioning, as was already mentioned above.
Caching with memcached or Redis. Or MongoDB as an option for individual tables or fields.
