Select/where/group by 100m-200m tables?

The feeling that I'm missing something with noSql solutions, comments are welcome


There is a plate with 100-200 million records, sorted by time where you want to quickly execute queries of the form
select time field, sum(field2) 
from t inner join t2 on ...
where field3 = x and field = y 
and time between time1 and time2
group by time, field 
order by time



now all that lives on one server to sql server and the problem is that the indexes for this table takes the same place as the data.


What noSql solutions are well suited for such a task? I still managed to watch just for CoachDb — the species does not fly
October 8th 19 at 00:14
7 answers
October 8th 19 at 00:16
I mean that such tasks are much easier just for the extra table (field, sum_cache) and updates based on triggers or on your own

matview it's called.
in sql server is that CREATE VIEW [myView] WITH SCHEMABINDING...? - Gianni_Cummings commented on October 8th 19 at 00:19
October 8th 19 at 00:18
The answer is not quite in the topic.
But because the query You have as it is not noSQL-tion, and indices take up much space... might be a good idea or stupid to round the time to minutes (indexes immediately takes 60 times less space), or enter additional field with the time rounded to minutes (table pedrezuela, but the indexes will be smaller).
October 8th 19 at 00:20
Does sum(field2) for each field, and how often? Critical speed of his records?
To be more precise, does the field field2? or just add and delete new records?

I mean that such tasks are much easier just for the extra table (field, sum_cache) and updates based on triggers or on your own... by the way as far as I know there is DB support cache indexes based on expressions (in fact, they create the field and fill it with triggers)
No change in the end and just appended a new entry about once a week millions of pieces
(such tables, not one )
Can give an example of such a database? - Gianni_Cummings commented on October 8th 19 at 00:23
Yes, almost all postgresql, oracle,...

Just not sure about performance on query data from that index, as they typically involve the use of indexed expressions in where, group by and order by.

If the record only add, that as described above, a new cache table and update its index when a record is added (set sum_field2+=field2), when you delete accordingly (set sum_field2-=field2). - Gianni_Cummings commented on October 8th 19 at 00:26
October 8th 19 at 00:22
covering clustered index will save you.
And where to esteem as it is better to ask? - Gianni_Cummings commented on October 8th 19 at 00:25
October 8th 19 at 00:24
yeah I read this article and thought about it,
I wanted to know the opinion, does it make sense with such a task at all to get away from the sequel server into something else for the sake of performance, but that do not see - Gianni_Cummings commented on October 8th 19 at 00:27
on Oracle, there matview more impressive. on columnar databases like Sybase IQ - Gianni_Cummings commented on October 8th 19 at 00:30
October 8th 19 at 00:26
or if you want quite adult and Silverlight make a cube and work with it. In MS SQL, and Silverlight is free of charge to the server
Cuba does give more speed of queries? - Gianni_Cummings commented on October 8th 19 at 00:29
you will laugh but made up for it. - Gianni_Cummings commented on October 8th 19 at 00:32
Is the cube functionality is not implemented on top of (using) sql? I mean how these mechanisms can be faster... the more this task is not very falls on the cube. - jaunita commented on October 8th 19 at 00:35
October 8th 19 at 00:28
Cuba's need for multidimensional slices, they have their own cache, which is updated by the sql queries. If the field s in which you group a lot different then the cube will be useful.
If not, then I think you'd rather have a Partitioning table.
we would definitely come partitioning tables, the only problem is that it is available in Sql Server Enterprise, which is very expensive, you don't know who else has the same convenient partitioning? - Gianni_Cummings commented on October 8th 19 at 00:31
We have Postgres, it is slower than Sql Server, but free and enterprise features. - Gianni_Cummings commented on October 8th 19 at 00:34

Find more questions by tags SQL ServerNoSQLMySQL