How do I write a query that takes the first value from each group?

There is a table

timestamp | data
-----------------------------------------------------------------
2017-12-10 10:33:19.563241+00 | {"foo":900}
2017-12-11 10:34:19.563241+00 | {"foo":1800}
2017-12-11 10:35:19.563241+00 | {"foo":1800,"bar":400}
2017-12-11 10:36:19.563241+00 | {"foo":2400,"bar":400}


I need to take the last value of the data column for each day, i.e. to get

day | data
--------------------------------------------------------
2017-12-11 00:00:00+00 | {"foo":2400,"bar":400}
2017-12-10 00:00:00+00 | {"foo":900}


I understand that I need window functions, but the json type has no comparison operator, i.e. for a query of the form
SELECT DISTINCT "timestamp", date_trunc('day', "timestamp") AS "day", FIRST_VALUE (data) OVER (PARTITION BY "timestamp")
FROM "table"
ORDER BY "timestamp" DESC

I get the error: could not find equality operator for type json
How can this be solved?
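
For reference, a minimal sketch to reproduce this (the column types are an assumption based on the sample data):

CREATE TABLE "table" ("timestamp" timestamptz, data json);
INSERT INTO "table" ("timestamp", data) VALUES
 ('2017-12-10 10:33:19.563241+00', '{"foo":900}'),
 ('2017-12-11 10:34:19.563241+00', '{"foo":1800}'),
 ('2017-12-11 10:35:19.563241+00', '{"foo":1800,"bar":400}'),
 ('2017-12-11 10:36:19.563241+00', '{"foo":2400,"bar":400}');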
System: PostgreSQL 9.5, Windows 7
June 10th 19 at 16:03
4 answers
June 10th 19 at 16:05
Solution
Well, when you get tired of scanning the whole table, here you go :-)

PostgreSQL 9.4 and above; selecting over a known date range uses an index scan on the timestamp field:
select day, data
from generate_series('2017-12-10', '2017-12-12', interval '1 day') as day,
lateral (
 select data from tablename
 -- half-open range, so a row falling exactly on midnight is not counted in two days
 where "timestamp" >= day and "timestamp" < day + interval '1 day'
 order by "timestamp" desc limit 1
) ljd;
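
This assumes a btree index on the timestamp field; without it each lateral subquery degenerates into a sequential scan. A sketch (the index name is arbitrary):

create index tablename_timestamp_idx on tablename ("timestamp");

The same index also serves the recursive variant below.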


All the days present in the table, via the index:
with recursive t as (
 -- anchor: the most recent row in the whole table
 (select "timestamp"::date as day, data from tablename order by "timestamp" desc limit 1)
 union all
 -- step: the most recent row strictly before the last day found
 select bpt.* from t, lateral (
  select "timestamp"::date as day, data from tablename
  where "timestamp" < t.day order by "timestamp" desc limit 1
 ) as bpt
)
select * from t;

This uses a loose index scan.
I've never used recursive queries. Thanks for the tip, I'll rewrite it with recursion later ) - Gonzalo.Hane39 commented on June 10th 19 at 16:08
Although the query's execution plan doesn't look comforting:
[screenshot: 2017-12-11_2150.png]
compared to the "brute force" one:
[screenshot: 2017-12-11_2151.png]
Maybe it will get better in the future - Gonzalo.Hane39 commented on June 10th 19 at 16:11
Judging by your plan, it's obvious that you did not carry the query over to your real task as-is: you have a filter on country, which is absent from the question as formulated, and it completely breaks the loose index scan mechanism. - Ignacio_Romaguera commented on June 10th 19 at 16:14
As far as I can tell from the plan, for this query and task you need an index using btree(country, "timestamp"). - Ignacio_Romaguera commented on June 10th 19 at 16:17
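A sketch of such an index (the table name is assumed from the question, country from the plan in the screenshots):

create index tablename_country_ts_idx on tablename using btree (country, "timestamp");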
Well, maybe the extra data does change the specifics of the answer. But I ran into the problem and asked the question as it stood. I'm not claiming your solution is inefficient; maybe I just don't know how to cook it :)
And this query is only a small part of the whole production; there are also some JOINs and UNIONs on top of the whole thing. In the end it will all have to be run and tested with a lot more data to build the necessary indexes.
The main thing is that your advice about recursive queries may improve another query of mine, so thank you anyway - Gonzalo.Hane39 commented on June 10th 19 at 16:20
A manual loose index scan like this should only be done with a clear understanding of what you want to get, where, and how, and which indexes it requires. That is, first decide which indexes the query should work against, then write the query for that behavior, rather than the usual way where you first write a query that returns the data you need and then look at which indexes it actually wants.
Usually people climb into this jungle when trying to speed up something that already exists, and where possible it is better to rewrite the query drastically. Without the index the query was designed around, the result will likely be very poor. And recursive queries in general can easily shoot you in the leg.

By the way, here is a presentation on several query rewrites from my colleagues and my supervisor: https://pgday.ru/presentation/232/5964945ea4142.pdf (unfortunately, a recording of the conference talk has not been published in open access) - Ignacio_Romaguera commented on June 10th 19 at 16:23
Thank you, I'll keep that in mind - Gonzalo.Hane39 commented on June 10th 19 at 16:26
June 10th 19 at 16:07
Solution
Without additional conditions, the problem can be solved like this:
SELECT
 date_trunc('day', sub."timestamp") AS "day",
 sub.data
FROM (SELECT
  "timestamp",
  "data",
  row_number() OVER (PARTITION BY date_trunc('day', "timestamp")
                     ORDER BY "timestamp" DESC) AS rownum
 FROM "table") sub
WHERE sub.rownum = 1


Just remember that this option scans every record in the table, so it may not be suitable for large tables.
The table is planned precisely for billing, so it's unlikely to stay small, but thanks for the reminder about row_number - Gonzalo.Hane39 commented on June 10th 19 at 16:10
I'll have to test your solution against what I put together:
SELECT 
 DISTINCT date_trunc('day', "s"."timestamp") AS "day", 
 FIRST_VALUE("s"."data"::jsonb) OVER (PARTITION BY date_trunc('day', "s"."timestamp")) AS "dat"
FROM 
 (SELECT "data", "timestamp" FROM "table" ORDER BY "timestamp" DESC) AS "s"
ORDER BY "day" DESC
- Gonzalo.Hane39 commented on June 10th 19 at 16:13
Your version also has no additional filter conditions, so it will scan the entire table too. Maybe for the current task that's not so scary, since most likely you won't need to constantly iterate over all the data "from the beginning of time". If we're talking about billing, you'll probably be looking at data for some period, for example a month. In that case you can add an index on "timestamp" and put an extra filter where "timestamp" between :a and :b into the sub subquery; the selection will then be more efficient. - Ignacio_Romaguera commented on June 10th 19 at 16:16
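A sketch of that filtered variant (:a and :b stand for the period bounds passed as parameters):

SELECT
 date_trunc('day', sub."timestamp") AS "day",
 sub.data
FROM (SELECT
  "timestamp",
  "data",
  row_number() OVER (PARTITION BY date_trunc('day', "timestamp")
                     ORDER BY "timestamp" DESC) AS rownum
 FROM "table"
 WHERE "timestamp" BETWEEN :a AND :b) sub
WHERE sub.rownum = 1;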
Yes, there is an index and a time limit; I just didn't put them in the question since they don't play a big role in the logic - Ignacio_Romaguera commented on June 10th 19 at 16:19
June 10th 19 at 16:09
One of the solutions is to convert json to jsonb and work with that.
If there are other options, it would be nice to hear them.
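
If casting in every query gets tiresome, the column can be converted once; a sketch, keeping in mind that this rewrites the whole table and holds an exclusive lock while it runs:

-- one-time conversion of the column from json to jsonb
ALTER TABLE "table" ALTER COLUMN data TYPE jsonb USING data::jsonb;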
June 10th 19 at 16:11
a perverse but working variant:

select
 *
from "table"
inner join (
 select
  max("timestamp") as dt
 from "table"
 group by date_trunc('day', "timestamp") -- mssql habit: there I used cast(... as date)
) as t on t.dt = "table"."timestamp"
Thanks.
This would all be fine if the timestamp were unique. In this example it happens to be, but in theory it can repeat, and then the query may not return the correct result - Gonzalo.Hane39 commented on June 10th 19 at 16:14
Of course, with matching timestamps it will get ugly... one way or another you need a uid (which you can add if you want) - Gonzalo.Hane39 commented on June 10th 19 at 16:17
we wrote that at the same time -) - Ignacio_Romaguera commented on June 10th 19 at 16:20
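
For completeness, a sketch that sidesteps both problems at once: DISTINCT ON compares only the expressions listed in ON, so the json column never needs an equality operator, and a unique id column (hypothetical here) breaks ties between equal timestamps:

SELECT DISTINCT ON (date_trunc('day', "timestamp"))
 date_trunc('day', "timestamp") AS "day",
 data
FROM "table"
-- within each day, rows are ordered latest-first; id breaks ties
ORDER BY date_trunc('day', "timestamp") DESC, "timestamp" DESC, id DESC;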
