Which database to choose? Is MySQL suitable for these tasks?
If you have solid experience with it, the right skills in database design, and a full understanding of why you do things "exactly this way" and "why not otherwise", I think it is fine. In general, a database is usually judged not by the number of rows in one table but by the total volume of data (gigabytes to petabytes) and some other parameters.
- search: is there a record in the database with the specified name, and if there is, update its information. That is, before adding an entry (and let me remind you there are 5-40 million of them at first, and the number will keep growing), check for it in the database and add or update the data.
Every database I know of has indexes for this. Presumably a standard B-tree index will do; it works about the same in all databases.
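As a rough illustration of the "check by name, then add or update" step, here is a minimal sketch. It uses Python's built-in `sqlite3` module only so that it runs without a server; the table `records`, its columns, and the helper `add_or_update` are made-up names, not anything from the question.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " info TEXT NOT NULL)"
    )
    # Unique B-tree index on name: the existence check becomes an index lookup.
    conn.execute("CREATE UNIQUE INDEX idx_records_name ON records(name)")
    return conn

def add_or_update(conn, name, info):
    """Check whether the name exists, then insert or update (hypothetical helper)."""
    row = conn.execute("SELECT id FROM records WHERE name = ?", (name,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO records (name, info) VALUES (?, ?)", (name, info))
        return "added"
    conn.execute("UPDATE records SET info = ? WHERE id = ?", (info, row[0]))
    return "updated"

conn = open_db()
results = [
    add_or_update(conn, "alpha", "v1"),
    add_or_update(conn, "alpha", "v2"),
    add_or_update(conn, "beta", "v1"),
]
rows = conn.execute("SELECT name, info FROM records ORDER BY name").fetchall()
print(results, rows)
```

In MySQL the same check-and-write can be folded into one statement with `INSERT ... ON DUPLICATE KEY UPDATE`, and in PostgreSQL with `INSERT ... ON CONFLICT ... DO UPDATE`; both rely on the same kind of unique index.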
The database will have approximately the following load:
The load will fall on the hardware, not on the database. If the hardware holds up, then from the database's point of view I do not see any logical problem with storing 40 million records.
Want to know how to organize the structure for storing the information?
"Big information" or large amounts of data? 40 million records is by no means necessarily a large amount. For example, an index on a numeric (INT) field for 40 million records holds only about 160 MB of raw 4-byte keys, so even with row pointers and page overhead it stays in the hundreds of megabytes and fits in memory on ordinary hardware. For storing "big information" you can take, for example, PostgreSQL, which has a ready-made mechanism, TOAST, designed specifically for this purpose. Or you can design the MySQL database so that the data you actually query lies separately from any "informational baggage" ("tails"). This reduces the size of the individual tables on disk and, as a consequence, increases speed.
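Keeping the "tails" apart from the frequently queried data could look roughly like this. It is only a sketch under assumptions: the table names `items` and `item_tails` and their columns are invented for the example, and `sqlite3` stands in for MySQL so the code runs as is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Narrow "hot" table: only the fields that are searched and filtered on.
CREATE TABLE items (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    score INTEGER NOT NULL
);
-- Bulky "tail" data lives in its own table, joined by the same id.
CREATE TABLE item_tails (
    item_id INTEGER PRIMARY KEY REFERENCES items(id),
    payload TEXT NOT NULL
);
""")

conn.execute("INSERT INTO items (id, name, score) VALUES (1, 'alpha', 10)")
conn.execute(
    "INSERT INTO item_tails (item_id, payload) VALUES (1, ?)",
    ("x" * 10000,),
)

# Typical searches touch only the narrow table...
hot = conn.execute("SELECT name, score FROM items WHERE score > 5").fetchall()
# ...and the tail is fetched by primary key only when it is really needed.
tail_len = conn.execute(
    "SELECT length(payload) FROM item_tails WHERE item_id = ?", (1,)
).fetchone()[0]
print(hot, tail_len)
```

The point of the split is that scans and index lookups on `items` read small rows, while the large payloads are read one at a time by key.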
Every day or a little less often, the data for 5-40 million records will need to be updated,
Does this have to be done in one run?
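If it does not have to be one run, one possible approach is to apply the daily update in batches, one transaction per batch. The sketch below assumes a made-up `records` table and hypothetical helpers `batches` and `update_in_batches`; `sqlite3` is used only to keep it runnable, and the batch size is an arbitrary illustration.

```python
import sqlite3
from itertools import islice

def batches(iterable, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def update_in_batches(conn, updates, batch_size=1000):
    """Apply (value, id) pairs batch by batch; returns the number of batches."""
    n_batches = 0
    for chunk in batches(updates, batch_size):
        with conn:  # commits after each batch, rolls back the batch on error
            conn.executemany("UPDATE records SET value = ? WHERE id = ?", chunk)
        n_batches += 1
    return n_batches

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value INTEGER NOT NULL)")
with conn:
    conn.executemany(
        "INSERT INTO records (id, value) VALUES (?, 0)",
        [(i,) for i in range(1, 2501)],
    )

n = update_in_batches(conn, ((i * 2, i) for i in range(1, 2501)), batch_size=1000)
total = conn.execute("SELECT SUM(value) FROM records").fetchone()[0]
print(n, total)
```

Short transactions keep locks brief and let the job be paused or resumed, at the cost of the update not being atomic as a whole.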
- search: is there a record in the database with the specified name, and if there is, update its information.
What kind of name is that? Text, a hash, an int? Is it long? In the general case, selection by index is VERY fast; it depends more on the hardware than on the database.
- search the database by specified parameters (for example, that some parameter is greater than a specified value, and similar conditions)
Indexes solve this. If the task is a simple selection from a flat table, it will be fast, except for searches à la `field` like '%some text%'.
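The difference between an indexable condition and a leading-wildcard search can be seen in a query plan. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in (MySQL and PostgreSQL have their own `EXPLAIN`); the `items` table, its indexes, and the `plan` helper are assumptions made for the example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
conn.execute("CREATE INDEX idx_items_value ON items(value)")
conn.execute("CREATE INDEX idx_items_name ON items(name)")

def plan(sql):
    """Return the human-readable steps of SQLite's query plan for `sql`."""
    # Each plan row is (id, parent, notused, detail); keep only the detail text.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Range condition on an indexed column: the planner can seek in the index.
range_plan = plan("SELECT * FROM items WHERE value > 100")
# Leading wildcard: the index on name cannot narrow the search, so rows are scanned.
like_plan = plan("SELECT * FROM items WHERE name LIKE '%some text%'")

print(range_plan)
print(like_plan)
```

The first plan reports an index search, the second a full scan, even though `name` is indexed. For substring search over 40 million rows a dedicated full-text index is the usual alternative.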
Want to know how to organize the structure for storing the information?
Read about database normal forms.
Which database to choose? Is MySQL suitable for these tasks?
MySQL or PostGIS: you have to look at the combination of hardware and software, because either your task is very unusual or you are designing something wrong. And since you have a "super secret goal", the answer is correspondingly very vague.
The data will be stored in a simple form: a string id,
I hope that is a typo and you mean an id of type integer?
Or maybe, say, keep the categories and their numbers in a separate table, and write only the category number in the main one.
Read about normalization. No, really, this is IMPORTANT.
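What the question describes is in fact the normalized layout: categories in their own table, and only an integer category number in the main one. A minimal sketch, with invented table and column names and `sqlite3` standing in for the real server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each category name is stored exactly once.
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- The main table keeps only the integer category number.
CREATE TABLE records (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
CREATE INDEX idx_records_category ON records(category_id);
""")

conn.executemany(
    "INSERT INTO categories (id, name) VALUES (?, ?)",
    [(1, "books"), (2, "music")],
)
conn.executemany(
    "INSERT INTO records (name, category_id) VALUES (?, ?)",
    [("a", 1), ("b", 1), ("c", 2)],
)

# Category names are resolved with a join only when they are needed.
per_category = conn.execute(
    "SELECT c.name, COUNT(r.id) FROM categories c "
    "LEFT JOIN records r ON r.category_id = c.id "
    "GROUP BY c.id ORDER BY c.name"
).fetchall()
print(per_category)
```

Renaming a category then touches one row in `categories` instead of millions of rows in the main table, and the main table stays narrower.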
Every day or a little less often, the data for 5-40 million records will need to be updated, and this number will keep growing: every month approximately 5-10% more records will be added to the table.
This is a scraping service (perhaps via a search engine, or some kind of social network analytics).
The MyISAM storage engine supports 2^32 rows per table, but you can build MySQL with the --with-big-tables option to make it support up to 2^64 rows per table.