How to speed up a low-selectivity query against a table with hundreds of millions of records?

There is a table with several hundred million records, about 200 GB. It has many columns, including text columns. Some columns have low selectivity, for example the city column. The task is to select all rows where city equals a given value (for example, SPB, i.e. St. Petersburg, about 10 million rows) and dump them to a file, i.e. a query of the form COPY (SELECT <several fields> ...) TO 'file.txt'. Right now this takes about half an hour, and no index helps. However, if instead of selecting several fields I run SELECT id ... WHERE city = ..., it finishes in just a few seconds. And if I put the SPB rows into a separate materialized view, SELECT <several fields> ... takes half a minute instead of half an hour.
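A minimal sketch of the query shape in question (the table name, column names, city value and output path here are placeholders, not the real schema):

-- dump every row for one city straight to a file on the server
COPY (
    SELECT col1, col2, col3
    FROM big_table
    WHERE city = 'SPB'
) TO '/tmp/file.txt';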

  1. Do I need to create a separate table for each city?
  2. If I create a separate table per city, what happens when I need to filter on other columns, not on city?
  3. I've read a little about PgPool-II and its support for parallel queries. If I partition by id and run parallel queries against all partitions, is that an option? And can pgpool do this on a single machine?
  4. How else can this be optimized?
  5. Can a single machine even cope with this? I've read people saying that they keep several billion records in Postgres and even fairly complex queries fly on a single machine. How is that possible?
July 2nd 19 at 13:08
2 answers
July 2nd 19 at 13:10
Bring the database to third normal form (3NF).
For the current case: the cities should live in a separate table, i.e. a list of cities keyed by an ID.
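A minimal sketch of that normalization, assuming the main table is called item as it appears later in the thread (the dictionary table and constraint names are illustrative):

CREATE TABLE cities (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

-- the main table keeps only the integer city id and references the dictionary
ALTER TABLE item
    ADD CONSTRAINT fk_item_city FOREIGN KEY (city) REFERENCES cities (id);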
A list of cities keyed by an ID, that is, a one-to-one relationship with the main table, and then a JOIN? - elody_Kovacek11 commented on July 2nd 19 at 13:13
: i.e. store the city ID in the main table instead of the city name, and keep all the names in a separate table keyed by that ID.
The relationship is one-to-many (cities table to main table, respectively). - Kyleigh_Hills commented on July 2nd 19 at 13:16
The thing is, city in the main table is already an integer id. Does that option still apply in this case? - elody_Kovacek11 commented on July 2nd 19 at 13:19
: Show the table definition, otherwise we are discussing a spherical horse in a vacuum. - Kyleigh_Hills commented on July 2nd 19 at 13:22
Here is the schema:

id bigint PRIMARY KEY,
type character varying(20),
name character varying(400),
alias character varying(100),
count integer,
country integer,
city integer,
site character varying(255),
status character varying(30),
is_active smallint,
date character varying(20)

In general, filter conditions may be placed on any of the fields, not only on city. - elody_Kovacek11 commented on July 2nd 19 at 13:25
:
1. all ID fields should be keys.
2. create indexes on every field you plan to search by.
Then try your searches, and don't forget to add LIMIT at the end if you are looking for one (or a known number of) rows; a sketch follows below. - Kyleigh_Hills commented on July 2nd 19 at 13:28
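A minimal sketch of that advice against the schema shown above (index names are arbitrary, and which columns deserve indexes depends on the real queries):

CREATE INDEX ix_item_city    ON item (city);
CREATE INDEX ix_item_country ON item (country);
CREATE INDEX ix_item_status  ON item (status);

-- cap the scan when only a known number of rows is needed
SELECT id, name FROM item WHERE city = 8 LIMIT 100;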
July 2nd 19 at 13:12
What indexes do you have? What is the table structure?
explain (analyze, buffers)?

1. no need
2. see 1
3. only if you are CPU-bound rather than disk-bound; if it's the disk, this will only make things worse.
4. First figure out how the existing table behaves, then think. For example, a brin index on the city id: on such a field it stays very compact (see the sketch after this list).
5. 200 GB is a perfectly normal database size. It's not even astronomically expensive to fit it entirely into shared_buffers.
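A minimal sketch of the BRIN suggestion (the index name is arbitrary; BRIN only pays off when rows with the same city are physically clustered together on disk):

-- BRIN stores one summary entry per range of heap pages, so it stays tiny
CREATE INDEX ix_item_city_brin ON item USING brin (city);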
id bigint PRIMARY KEY,
type character varying(20),
name character varying(400),
alias character varying(100),
count integer,
country integer,
city integer,
site character varying(255),
status character varying(30),
is_active smallint,
date character varying(20)

The name field has a gin(to_tsvector()) index. The remaining fields have btree indexes. I also decided to try one composite btree index on the fields id and city; that is the one the query ends up using.
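Roughly what that indexing setup looks like (the exact definitions and the text search configuration are assumptions; only the index on city is confirmed by the plan below):

CREATE INDEX ix_item_name_gin ON item USING gin (to_tsvector('simple', name));
CREATE INDEX ix_item_city     ON item (city);
CREATE INDEX ix_item_id_city  ON item (id, city);  -- the composite index mentioned above
-- plus btree indexes on the other filter columns (not shown)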

EXPLAIN (ANALYZE,BUFFERS) SELECT id, name, alias, count FROM public.item WHERE city = 8

Bitmap Heap Scan on item  (cost=103708.16..7024255.99 rows=5540592 width=57) (actual time=4566.896..1333276.255 rows=5640073 loops=1)
  Recheck Cond: (city = 8)
  Rows Removed by Index Recheck: 99736844
  Heap Blocks: exact=13625 lossy=2174087
  Buffers: shared read=2203126
  ->  Bitmap Index Scan on ix_item_city  (cost=0.00..102323.01 rows=5540592 width=0) (actual time=4548.816..4548.816 rows=5640073 loops=1)
        Index Cond: (city = 8)
        Buffers: shared read=15414
Planning time: 113.841 ms
Execution time: 1334887.657 ms - elody_Kovacek11 commented on July 2nd 19 at 13:15
Regarding CPU and disk: iostat showed spikes of up to 20 MB/s, even though everything is on an SSD. The CPU doesn't even load half of one core out of 6. - Kyleigh_Hills commented on July 2nd 19 at 13:18
100% buffer misses, so of course a cold table read from disk is slow; even the index had to be pulled from disk. And work_mem was not enough for the bitmap, so it degraded to a page-level (lossy) bitmap with a lot of rechecking done per page.
Raise work_mem for this query. I can't say by how much; keep raising it until the lossy Heap Blocks disappear, i.e. quite substantially above the current level. A sketch follows below.
Whether to raise shared_buffers depends on the database workload. If it is a cold table and the query runs rarely, you can leave it alone. If it needs to be hot, increase shared_buffers; possibly add memory.
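A minimal sketch of raising work_mem for just this export (the value here is a guess; increase it until the lossy heap blocks disappear from the plan):

SET work_mem = '256MB';   -- session-level, does not touch the global config
EXPLAIN (ANALYZE, BUFFERS) SELECT id, name, alias, count FROM public.item WHERE city = 8;
RESET work_mem;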

> one composite btree index on the fields id and city
id and city? What was that supposed to help with, even in theory?

> The remaining fields have btree indexes
On all of them? Why? - elody_Kovacek11 commented on July 2nd 19 at 13:21
The composite index could hardly help even in theory, but I decided to try it in practice anyway. I started experimenting with indexes only recently, so I have little experience; please don't judge too harshly. There are indexes on the other fields because those fields also need to be filtered on, in various combinations.
About how often the query is needed: the conditions in the queries can be on anything and arrive at any time, and I'd like them all to run fast. shared_buffers is set to 6 GB out of 18 GB available; work_mem is 5 MB in the global config. As I understand it, for arbitrary queries to run fast the table needs to be warmed up completely, and to warm it up the whole thing has to fit in memory. In other words, take servers with around 200 GB of RAM in total and shard, am I correct? If so, that's not exactly cheap. Or is there another way? - Kyleigh_Hills commented on July 2nd 19 at 13:24
Why not exactly cheap?
Take a single box with 256 GB of RAM and the disk read problem practically disappears.
For example, OVH from $300 a month, even with a couple of SSDs: https://www.ovh.com/us/dedicated-servers/

Most likely (I don't know your workload!) 128 GB or even 64 GB would be enough for the hot data, so memory is not a problem even on a desktop machine.

You could also try hanging one multicolumn gin index on it. Unlike btree, it doesn't care about the order of fields in the index. If you constantly read everything, though, it still won't be fast: to work adequately, the hot data has to be in memory. - elody_Kovacek11 commented on July 2nd 19 at 13:27
The workload is 100% read-only; while the database is being updated, it is closed for reads. I don't quite understand the multicolumn gin part. As I understand it, gin can't be created on a plain integer field? And what about greater-than/less-than operators? Could you give an example based on the fields I described above? - Kyleigh_Hills commented on July 2nd 19 at 13:30
Right, out of the box gin doesn't support int; there is the btree_gin contrib module (and an analogue for gist). Though it seems that for a pair of ints the order of the fields already matters. Range queries on ints it can handle.
That said, a bitmap index scan can combine several separate indexes; I don't remember how they handle the problem of different filter combinations. - elody_Kovacek11 commented on July 2nd 19 at 13:33
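A minimal sketch of the multicolumn gin idea via btree_gin (the column choice is illustrative, and whether it actually beats separate btree indexes needs testing on real data):

CREATE EXTENSION IF NOT EXISTS btree_gin;

-- one gin index covering several low-selectivity columns;
-- unlike a composite btree, any subset of these columns can use it
CREATE INDEX ix_item_multi_gin ON item USING gin (city, country, is_active);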
