Implementation of preview articles?

Hello. It was necessary to implement preview (partial display of the article). You can compare Hebron: home several stripped-down article; when you click on the link for full version! How to design the implementation for SQL?


At the moment there are three ideas:
  1. To store the article in one cell with the mark (habracut). If you need to break into two parts
  2. To store the article in two cells. If you need to connect one
  3. To store the article in one cell. Parse the first time you save. In a separate cell to store the position of the label. If you need to split into two at the saved position markers


I myself am inclined to option number 3. Intuitively, it seems that the option should be the most productive. How do you think? Is there any other options?
October 8th 19 at 03:34
6 answers
October 8th 19 at 03:36
If not critical, the DB size is 2, but without the connection (i.e. the first cell of the announcement, and secondly the whole article).
If not critical time that you want to every time to pull relevant articles from the DB and cut them in half, then 3.

I personally use the 2nd option and do not feel remorse.
The authors of the different CMS is also not feel remorse. I would have done the same. - Daisha_Klocko3 commented on October 8th 19 at 03:39
October 8th 19 at 03:38
What is this nonsense?! Of course enough two fields: article_аnounce and article_text.
October 8th 19 at 03:40
See I don't fully understand, so I will describe in more detail.

— Need a table with fields: id and anounce.
— On the main page displaying blocks of text with this information. Click to open the article with the right id.
— The whole article is taken from the file name .php

Advantages:
— Do not load with heavy queries DBMS. Requests less.
Opcache itself caches the article. Without any dances with a tambourine.
Damn... the parser Should be:
"The whole article is taken from a file named id_статьи.php" - Daisha_Klocko3 commented on October 8th 19 at 03:43
What is the severity of the queries? The same basic operation. - eloise_Hahn commented on October 8th 19 at 03:46
October 8th 19 at 03:42
The most productive option would be storing articles in file and in database store only the link to the file.
Wow, harsh. And how search, backup, caching? - Daisha_Klocko3 commented on October 8th 19 at 03:45
About caching:
— Rows with text fields in MySQL are not cached.
Files are automatically cached by the OS.
Files can be cached with the help of okashira.

About backup:
Files can also be bekapit.

Regarding the search:
— Sphinx fine looking at the files. - eloise_Hahn commented on October 8th 19 at 03:48
– I'm not about caching
– OS will cache a lot more
– it is possible, even on a tmpfs to save, but why?

not just possible, necessary, but again, no clear advantage

– and even the Sphinx set? - Joshua.Rohan commented on October 8th 19 at 03:51
— And what about the caching? Using memcached will be slower.
— This happens automatically and never advanced caching does not hurt.
— What? What would the entire project with current articles hung in memory and were taken from there.

And it's not an advantage.

— So what else? You search for the text data can be arranged without the Sphinx? - lexus.McGlynn commented on October 8th 19 at 03:54
– cached response to an sql query, it does not matter if there is TEXT or BLOB
– how can interfere, moreover, if it is not only a database server, and file storage, these txt-shki can not get into the cache at all
– if it is a very modest project, and a lot of RAM

– now I will get :) first of all, the loss of free space, small files occupy a whole cluster, the second a huge number of file descriptors, which is bad as the OS, for example, and to move/copy/expand from a backup, imagine that there are a million files; it will copy forever.

– of course, without of the Sphinx, here on habré the Sphinx figure that you will find. Rarely need serious full text search, but when you need is Solr. - eloise_Hahn commented on October 8th 19 at 03:57
— Yes, they can be cached, but the cache becomes full of "heavy" responses, and caching for other queries becomes inefficient and overall speed of the DBMS falls.
What the hell is this? You can disable the automatic caching of files in the OS?
The same can be said about caching tools MySQL. Or in some wonderful way the cached information changes its volume into the cache of MySQL?

— Millions? Well, well, You have a good imagination. For instance, the Habr — a total of 140,000 articles.
Then, what is stopping You to split files into folders.
And I suspect that You have never poured on to the server dump in polyga — that is an eternity.

— I would like to see your miracle search without Sphinx. - lexus.McGlynn commented on October 8th 19 at 04:00
– heavy, this is what 200k?
– no, can not and do not understand where is here
– no, it doesn't, but MySQL knows what works and knows when he needs to be told MySQL that'll take 3Gb for cache queries, did the file cache makes no difference that some data is more important and others are not, if it "overflows", it starts to use all Unallocated RAM, and if it becomes useful for something, releases it. With the consequences.

millions, Yes. It may not necessarily be articles can be the usual documentation you would know how much is it in serious financial offices for the day, would be surprised. Can be divided, only there was no modern FS have no restrictions on the number of files in the folder. About polyga and the server did not understand.

– The Internet is full of examples looking.

I propose to close this conversation, and the author of the question to wish you all the best and keep the text to the cat in one dB field, and the full article in the other. - eloise_Hahn commented on October 8th 19 at 04:03
October 8th 19 at 03:44
And on the first point, try this:
SUBSTRING(`text`, 1, (INSTR(`text`, "[habracut]")-1))
I think this is the most expensive option.
  • If necessary, the full version will have to remove the label from the text
  • If you want only part, will have to do this every time you open the database
- Daisha_Klocko3 commented on October 8th 19 at 03:47
Tag can be removed by means of the language, almost all of them replace function in the text work fast enough
It's difficult to say how it's slow, can conduct a synthetic test. And I think the result of this query should hit the cache. - eloise_Hahn commented on October 8th 19 at 03:50
October 8th 19 at 03:46
Two fields. In one — only that to the mark of the cat. Second — the whole article (and the fact that up to the mark of the cat, and after)

Of course, you get duplicate data. But if you CPU is more important than the place on hard, then why not?
And why the announcement and the full text save? It is enough when saving a text post. When displaying the full article, and in the form when editing glue. - Daisha_Klocko3 commented on October 8th 19 at 03:49

Find more questions by tags SQL