How to solve this problem in Redis?

Good day, everyone,

I have a Twitter dump of 1,500,000 files (from 1 KB up to 12 MB each); each file corresponds to one unique user and contains all of that user's posts and replies, for example:

onStatus @,@ 691006201815957505 @,@ Sun Jan 24 10:14:51 NZDT 2016 @,@ @TerryBrunk how did you like New Zealand when you came with WWA?
onStatus @,@ 693916127768895489 @,@ Mon Feb 01 10:57:51 NZDT 2016 @,@ Would be a damn tragedy if the Wellington 7s left. https://t.co/CLiEC0wd0b
onStatus @,@ 694245265356623872 @,@ Tue Feb 02 08:45:44 NZDT 2016 @,@ New Zealand plagued by 'vampire' attacks - Unexplained Mysteries https://t.co/2htQ3THvSG
onReply to ~|695570687893860352 from ~|SailishWilbur @,@ 695571616252633088 @,@ Sat Feb 06 00:36:11 NZDT 2016 @,@ @SailishWilbur Aus vs NZ one dayer tomorrow at Westpac
onStatus @,@ 697156769605410817 @,@ Thu Feb 10 09:35:01 NZDT 2016 @,@ I liked a @YouTube video https://t.co/4dCuEjVrFR NRL Auckland Nines 2016 Game 13: Warriors vs Sea Eagles Highlights
onStatus @,@ 705281163208867840 @,@ Thu Mar 03 19:38:27 NZDT 2016 @,@ Brian Jonestown Massacre LIVE in Wellington NZ, 2015.: https://t.co/twT1cVoIOM via @YouTube
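
The format appears to be fixed fields separated by " @,@ ", with the timestamp always in the third field. For reference, a minimal PHP parsing sketch under that assumption (the function name parseLine is mine):

<?php
// Sketch: split one dump line into its fields, assuming ' @,@ ' is the
// separator. Returns null for lines that don't fit the pattern.
function parseLine(string $line): ?array {
    $parts = explode(' @,@ ', rtrim($line));
    if (count($parts) < 4) {
        return null; // skip malformed lines
    }
    // e.g. "Sun Jan 24 10:14:51 NZDT 2016"
    $dt = DateTime::createFromFormat('D M d H:i:s T Y', $parts[2]);
    return [
        'type' => strncmp($parts[0], 'onReply', 7) === 0 ? 'reply' : 'status',
        'id'   => $parts[1],
        'time' => $dt ? $dt->getTimestamp() : null,
        'text' => $parts[3],
    ];
}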

The data for each user needs to be loaded into a Redis database, and then 5 problems solved (one possible layout is sketched after the list):

1. Identify the 5 most active users by number of posts

2. Determine the most popular day (the one with the largest number of posts) within a given period of time, e.g. 11 Feb 2016 to 23 Mar 2016

3. Find the 5 most popular hashtags in the posts of the 5 most popular users, i.e. those with the greatest number of comments.

4. Determine the 5 fastest users among those with the largest number of posts. That is, first identify the 5 users with the most posts, then compute the average time between their posts, in order to determine which of them is quickest to post a new tweet.

5. Determine the "lifetime" of the 5 most popular hashtags, i.e. from first use up to the moment each was last used, and how many times it was used.
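
For illustration, here is one possible way to organize the structure (only a sketch; all key names such as posts:by_user are invented here, and it assumes the phpredis extension): keep one sorted set or hash per aggregate and increment them while the files are being read.

<?php
// Ingestion sketch (phpredis): index one parsed post for one user.
// Key names are invented for illustration.
function indexPost(Redis $r, string $user, array $post): void {
    $day = date('Y-m-d', $post['time']);

    $r->zIncrBy('posts:by_user', 1, $user);                    // task 1: posts per user
    $r->zIncrBy('posts:by_day', 1, $day);                      // task 2: posts per day
    $r->zAdd("user:$user:times", $post['time'], $post['id']);  // task 4: per-user timeline

    preg_match_all('/#\w+/', $post['text'], $m);               // simplistic hashtag match
    foreach ($m[0] as $tag) {
        $r->zIncrBy('tags:all', 1, $tag);                      // tasks 3 and 5: global counts
        $r->zIncrBy("tags:user:$user", 1, $tag);               // task 3: tags per user
        // task 5: remember the first and last time each tag was seen
        $first = $r->hGet("tag:$tag", 'first');
        if ($first === false || $post['time'] < (int)$first) {
            $r->hSet("tag:$tag", 'first', $post['time']);
        }
        $last = $r->hGet("tag:$tag", 'last');
        if ($last === false || $post['time'] > (int)$last) {
            $r->hSet("tag:$tag", 'last', $post['time']);
        }
    }
}

Task 2 could then be answered by filtering the members of posts:by_day to the requested date range on the client, and task 3 by a ZUNIONSTORE (or a client-side merge) over the five tags:user:* sets of the top users.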

If someone can at least suggest how to organize the DB structure for these tasks, I would be very grateful, and if pseudo-code for any of the above tasks is provided as well, I'll be even happier :)

Thank you all for your attention.
July 9th 19 at 13:05
3 answers
July 9th 19 at 13:07
And why Redis for all this? Cram the whole thing into HDFS and hammer at it with Spark. It's built for exactly this kind of job.
July 9th 19 at 13:09
While processing the files, increment counters in Redis. The challenge here, I think, is not in using Redis but in processing a large number of files in parallel.
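
For instance, the read side could look like this (a sketch with phpredis, against the invented key names sketched in the question above):

<?php
// Read-side sketch, assuming the key layout from the ingestion sketch.
$r = new Redis();
$r->connect('127.0.0.1', 6379);

// Task 1: the five most active users, as user => post count.
$top5 = $r->zRevRange('posts:by_user', 0, 4, true);

// Task 4: average interval between posts for each of them.
foreach (array_keys($top5) as $user) {
    $key = "user:$user:times";
    $n = $r->zCard($key);
    if ($n < 2) {
        continue;
    }
    $first = array_values($r->zRange($key, 0, 0, true))[0];   // oldest timestamp
    $last  = array_values($r->zRange($key, -1, -1, true))[0]; // newest timestamp
    printf("%s: one post every %.0f seconds\n", $user, ($last - $first) / ($n - 1));
}
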
July 9th 19 at 13:11
and why the heck Redis for all this?
after all, it is not a relational database
and besides, if each document takes 12 megabytes, you will need around 18 terabytes of memory on the server ) - Eulalia22 commented on July 9th 19 at 13:14

Find more questions by tags: Big data, Redis, PHP, Linux