Is there a way to optimize a large table with sparse data?

In short: there is a table with 200 million rows and four columns (char(32), char(32), char(32), char(64)).
The first three fields are uids, the last one is a text id.
Queries filter on all fields; the key length is 127.

The catch is that this table is queried through a WHERE EXISTS for 13 thousand rows.
With a key like that, you can imagine the size of the index. MySQL has no analogue of PostgreSQL's hstore, and its JSON is not jsonb. The same goes for partial indexes and partitioning. I have racked my brain over how to wriggle out of this.
April 7th 20 at 15:23
1 answer
April 7th 20 at 15:25
1. Create indexes for each field.
2. Optimize the query so that substring searches are done last, and use LIMIT when fetching data.
And show an example query...
1. The indexes are already there.
2. No, there are no substring searches.

select * 
from tasks_table tt
left join users_table as ut
 on ut.user_uid=tt.user_uid
left join user_groups_table as gt
 on gt.group_uid=ut.group_uid
where tt.responsible_id=? or
 exists (select 1
 from my_dumb_table as dt
 where dt.record_uid=tt.record_uid and
 dt.group_uid=gt.group_uid and 
 dt.dumb_text = 'some_dumb_text')


Roughly like this. - Blaze_Jaskolski81 commented on April 7th 20 at 15:28
The real condition is slightly more complicated, but this is the gist.
EXPLAIN reports 13k rows for one table, 1 row for each join, and 600 rows in the materialized subquery with a key length of 127. - Blaze_Jaskolski81 commented on April 7th 20 at 15:31
@Sabryna_Durgan52, I think the parenthesized EXISTS subquery should be turned into an additional column on tasks_table. - Macy88 commented on April 7th 20 at 15:34
@Maynard_Borer92, I tried that; it gets even worse, because the link table contains duplicates. I think the problem is the very large key length and the fact that we force MySQL to examine 13000*600 rows.

Ideally all of this would be rewritten from scratch, of course, but since that is not an option right now, I am trying to work out whether the patient can be helped. In PostgreSQL I would convert all the strings to uuid and the 64-character field to an enum, then build partial indexes, and things would ease up considerably. Working with binary(16) in MySQL would require rewriting every query. - Blaze_Jaskolski81 commented on April 7th 20 at 15:37
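
The binary(16) idea from the comment above can be sketched in a few lines of Python. This is only an illustration, assuming the char(32) uids are 32 hex digits (the thread says they are uuid-like strings but does not show one). The helper names `pack_uid` and `unpack_uid` and the sample value are hypothetical. In MySQL itself the built-in UNHEX() and HEX() functions do the same conversion.

```python
def pack_uid(hex_uid: str) -> bytes:
    """Convert a 32-character hex uid into its 16-byte binary form."""
    raw = bytes.fromhex(hex_uid)
    if len(raw) != 16:
        raise ValueError("expected exactly 32 hex digits")
    return raw


def unpack_uid(raw: bytes) -> str:
    """Convert the 16-byte form back to the 32-character hex string."""
    return raw.hex()


# Made-up sample value, not taken from the thread.
sample = "0123456789abcdef0123456789abcdef"
packed = pack_uid(sample)

# Each uid shrinks from 32 characters to 16 bytes and round-trips losslessly.
print(len(sample), len(packed), unpack_uid(packed) == sample)
```

The round trip is lossless, which is why the conversion could in principle be hidden behind HEX()/UNHEX() calls, but every query that compares or inserts these columns would still have to change, as the comment notes.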
@Sabryna_Durgan52, is searching from the most selective condition first, with LIMIT 1, not an option? - Macy88 commented on April 7th 20 at 15:40
@Maynard_Borer92, LIMIT 1 via a subquery? That would be even worse. EXISTS is roughly the equivalent of LIMIT 1 anyway. I have already tried all of these options as best I could, and the gain is insignificant. This query only runs acceptably because its LIMIT 21 stops it early in most cases; when that does not happen, say hello to a run time of 8 to 60 seconds (at best). - Blaze_Jaskolski81 commented on April 7th 20 at 15:43
@Sabryna_Durgan52, no. Without subqueries. Just build a single table from which to fetch the desired row.
Could you also bind a specific user to the rows once, so that the LEFT JOIN can be dropped? - Macy88 commented on April 7th 20 at 15:46
@Sabryna_Durgan52, show the real plan of the original query. - Pablo.Dach93 commented on April 7th 20 at 15:49
@Sabryna_Durgan52, your trouble is that the subquery is executed for every combination of rows produced by the join, not only for the records for which the ON condition is true. The joined tables should be removed from the subquery. - Pablo.Dach93 commented on April 7th 20 at 15:52
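
One way to experiment with this point is to split the OR into two branches joined by UNION, so the second branch starts from my_dumb_table instead of running a correlated subquery per joined row. This is a different technique from anything proposed in the thread, shown only as a sketch. It uses Python's sqlite3 purely to check that both forms return the same set of record_uid values on made-up data; it says nothing about the plan MySQL would choose. The column layout is inferred from the query above, and all row values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks_table (record_uid TEXT, user_uid TEXT, responsible_id TEXT);
CREATE TABLE users_table (user_uid TEXT, group_uid TEXT);
CREATE TABLE user_groups_table (group_uid TEXT);
CREATE TABLE my_dumb_table (record_uid TEXT, group_uid TEXT, dumb_text TEXT);

INSERT INTO tasks_table VALUES
  ('r1', 'u1', 'boss'), ('r2', 'u2', 'other'), ('r3', 'u3', 'other');
INSERT INTO users_table VALUES ('u1', 'g1'), ('u2', 'g2'), ('u3', 'g3');
INSERT INTO user_groups_table VALUES ('g1'), ('g2'), ('g3');
INSERT INTO my_dumb_table VALUES
  ('r2', 'g2', 'some_dumb_text'),
  ('r2', 'g2', 'some_dumb_text'),  -- duplicate link row
  ('r3', 'g1', 'some_dumb_text'),  -- group does not match the task's user
  ('r3', 'g3', 'another_text');    -- text does not match
""")

# The query from the thread, reduced to one column for comparison.
ORIGINAL = """
SELECT tt.record_uid
FROM tasks_table tt
LEFT JOIN users_table ut ON ut.user_uid = tt.user_uid
LEFT JOIN user_groups_table gt ON gt.group_uid = ut.group_uid
WHERE tt.responsible_id = ?
   OR EXISTS (SELECT 1
              FROM my_dumb_table dt
              WHERE dt.record_uid = tt.record_uid
                AND dt.group_uid = gt.group_uid
                AND dt.dumb_text = 'some_dumb_text')
"""

# Each OR branch becomes its own SELECT; UNION removes duplicates.
REWRITTEN = """
SELECT tt.record_uid
FROM tasks_table tt
WHERE tt.responsible_id = ?
UNION
SELECT tt.record_uid
FROM my_dumb_table dt
JOIN tasks_table tt ON tt.record_uid = dt.record_uid
JOIN users_table ut ON ut.user_uid = tt.user_uid
                   AND ut.group_uid = dt.group_uid
JOIN user_groups_table gt ON gt.group_uid = ut.group_uid
WHERE dt.dumb_text = 'some_dumb_text'
"""

original_rows = {row[0] for row in conn.execute(ORIGINAL, ("boss",))}
rewritten_rows = {row[0] for row in conn.execute(REWRITTEN, ("boss",))}
print(sorted(original_rows), sorted(rewritten_rows))
```

Note that UNION collapses duplicate rows, so the rewrite matches the original only as a set of record_uid values; a `select *` with duplicate-producing joins would need extra care.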
@zachariah_Nolan, is that so?

I can show the query plan, but without the names of the tables and indexes.
Click. - Blaze_Jaskolski81 commented on April 7th 20 at 15:55
@zachariah_Nolan, all 3 materialized tables are just the EXISTS block; there actually is a join on groups in there, I forgot to include it. - Blaze_Jaskolski81 commented on April 7th 20 at 15:58
@Maynard_Borer92, how? - Blaze_Jaskolski81 commented on April 7th 20 at 16:01
@Sabryna_Durgan52, create an extra column alongside and start the select from the user, so that the result set shrinks right away for a "waterfall" selection:
select ... from .. t1, (select ... from ...) as t2 where t1.col=t2.col
- Macy88 commented on April 7th 20 at 16:04
@Maynard_Borer92, it is worth a try, although such constructs are usually very far from optimal. It will be interesting to see what the MySQL optimizer does in this case. - Blaze_Jaskolski81 commented on April 7th 20 at 16:07
@Sabryna_Durgan52, order the conditions from most unique to least unique!
Don't forget!

And post back afterwards with what happened... - Macy88 commented on April 7th 20 at 16:10
@Maynard_Borer92, in the index? Obviously. By the way, is there a reference where one can read about index performance? The structure of a B-tree with a composite index does not really suggest this, and it is not the first time I have been asked about this ordering. There used to be an article that explained all of this very clearly, but I can't find it on Google. - Blaze_Jaskolski81 commented on April 7th 20 at 16:13
@Sabryna_Durgan52, example: we are looking for a row matching two values, 1 and 7:
1 5
1 6
1 7
2 7
2 5
Start looking from the 2nd column.
First, we get:
1 7
2 7
Then, among these, we look for 1 in the first column.
The other way round will be slower. - Macy88 commented on April 7th 20 at 16:16
@Maynard_Borer92, except that a B-tree is not a flat list; there it looks slightly different. And I was asking not for myself but for a link I can point people to instead of explaining it every time. - Blaze_Jaskolski81 commented on April 7th 20 at 16:19
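
The disagreement above can be made concrete with a small Python sketch. It models a composite index as a sorted list of key tuples, which is a simplification (a real B-tree is a paged tree), but the key ordering behaves the same way. The function names are hypothetical and the rows are the ones from the example above. For an equality match on both columns, either column order locates the row with a binary search; column order only matters when just the leading column is constrained.

```python
from bisect import bisect_left, bisect_right

rows = [(1, 5), (1, 6), (1, 7), (2, 7), (2, 5)]

# A composite index keeps entries sorted by the whole key tuple.
index_ab = sorted(rows)                     # index on (col1, col2)
index_ba = sorted((b, a) for a, b in rows)  # index on (col2, col1)


def equality_lookup(index, key):
    """Find entries equal to the full key with two binary searches."""
    lo = bisect_left(index, key)
    hi = bisect_right(index, key)
    return index[lo:hi]


def prefix_lookup(index, first):
    """Range scan: all entries whose leading column equals `first`."""
    lo = bisect_left(index, (first,))
    hi = bisect_left(index, (first + 1,))
    return index[lo:hi]


# Both column orders find the single matching row directly.
print(equality_lookup(index_ab, (1, 7)))
print(equality_lookup(index_ba, (7, 1)))

# With only one column constrained, the index must lead with that column.
print(prefix_lookup(index_ab, 1))
print(prefix_lookup(index_ba, 7))
```

In this model no intermediate set such as "1 7, 2 7" is ever built when both values are known, which is the sense in which a B-tree differs from scanning a flat list column by column.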
@Sabryna_Durgan52,
If the key is contained in the root, it is found. Otherwise, determine the interval and descend to the corresponding child. Repeat.

The main disadvantage of B-trees is the lack of an efficient way to retrieve data (i.e., a tree traversal method) ordered by anything other than the chosen key.

So I just wrote an example. - Macy88 commented on April 7th 20 at 16:22
@Maynard_Borer92, by the way, in theory a B-tree is a very poor fit for uids. I should check whether MySQL has introduced anything more suitable. - Blaze_Jaskolski81 commented on April 7th 20 at 16:25
@Sabryna_Durgan52, it is easier to migrate the records to a new structure.
But it has to be done properly.
After that, you can even use a trigger to convert incoming rows on insert.
That way the new structure gets filled without changing the application code. - Macy88 commented on April 7th 20 at 16:28
@Maynard_Borer92, and what is the point if every select has to be rewritten anyway? With PG it would be possible to convert every char(32) to uuid and the char(64) to an enum, rebuild the indexes, and make them partial for the critical queries. And for a radical change, expose a view in place of the table and put INSTEAD OF triggers on it. Or even denormalize using PostgreSQL hstore or jsonb. - Blaze_Jaskolski81 commented on April 7th 20 at 16:31
@Sabryna_Durgan52, and doing it properly, by rewriting the code and creating a new, sane database structure, is not an option? - Macy88 commented on April 7th 20 at 16:34
@Maynard_Borer92, well, it is an open-source solution written for fun, and a poor one at that; I am not going to rewrite it. As for what stops the company that uses it... You know, sometimes it takes years to realize there is a problem and years more to decide to fix it.

And on topic: I also can't come up with a sane data schema for individual access to objects in a system based on uuid keys in MySQL. - Blaze_Jaskolski81 commented on April 7th 20 at 16:37
@Maynard_Borer92, well, ideally the database itself would be changed too. I worked with MySQL for many years and have wrestled with its optimizer more than I care to, but after Oracle and PostgreSQL I no longer consider MySQL a database. - Blaze_Jaskolski81 commented on April 7th 20 at 16:40
@Sabryna_Durgan52,
And on topic: I also can't come up with a sane data schema for individual access to objects in a system based on uuid keys in MySQL.
This discussion does not belong on Toster. It is more of a freelance job with TeamViewer and access to this database and to the application. - Macy88 commented on April 7th 20 at 16:43
@Sabryna_Durgan52, the problem is that you have less knowledge than bragging. And no result.
Rethink the optimization starting from the relationships between all the database fields, and build intermediate tables if required. - Macy88 commented on April 7th 20 at 16:46
@Maynard_Borer92, you definitely won't be able to help even if you badly want to; I see no knowledge here, it is all at the level of shamanism.

UPD Oh, and you are being rude on top of that. Kindly get the fuck off this platform, please. - Blaze_Jaskolski81 commented on April 7th 20 at 16:49
@Bavashi, the solution is almost always a dumb one; only in very rare cases does the optimizer still have to be forced to think straight.

PS Thank you, I know; I run into this fellow rarely but regularly. Plenty of comments, in fact, but I learned exactly zero. - Blaze_Jaskolski81 commented on April 7th 20 at 16:52
@Sabryna_Durgan52, can't you see that this man deliberately writes bad things about me every time?
Well, apparently he will help you better than I can.
On Habr, everyone's karma is negative except mine.
You can go to Habr and check.
This site has its own rating. Click on a nickname or see the TOP10.
All the best. - Macy88 commented on April 7th 20 at 16:55
