How do I match up two lists and output the items based on the number of values in one of them?

Hello.

I suspect the problem is easier to solve than I have made it, but my head is already spinning and I can't find another way. I hope you can suggest one.

There is a data set stored as plain text. It contains the values of several entities. The entity names are kept in a separate list. The values of each entity form a block, and blocks are separated by an empty line. An example is in the code below. The number of values per entity may vary.

dataset_names = ["moscow", "new-york"]

# Year and value are separated by a tab; the space inside the value is a thousands separator
text = """2008\t11 186
2009\t11 281
2011\t11 776
2012\t11 856

2011\t11 776
2012\t11 856"""

def chunks(l, n=2):
    """Splits the list (l) into parts of size n"""
    for i in range(0, len(l), n):
        yield l[i:i + n]

# Drop the spaces, turn newlines into tabs, then split on the double tab left by the empty line
prepare_text = text.replace(' ', '').replace('\n', '\t').split('\t\t')

data = [list(chunks(item.split('\t'))) for item in prepare_text]


What I do now: remove the spaces (the destination does not accept them), split the text into blocks, then split each block into pairs.
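
To make these three steps concrete, here is a small trace of the intermediate values. It is only a sketch: it assumes the two columns are separated by a tab and that the space inside the second column is just a thousands separator, and the sample text is shortened for illustration.

```python
def chunks(l, n=2):
    """Split the list l into parts of size n."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

# Assumption: year and value are tab-separated; blocks are separated by an empty line.
text = "2008\t11 186\n2009\t11 281\n\n2011\t11 776"

# Step 1: drop the spaces (thousands separators).
no_spaces = text.replace(' ', '')
# Step 2: turn newlines into tabs, so the empty line becomes a double tab, and split on it.
blocks = no_spaces.replace('\n', '\t').split('\t\t')
# Step 3: split each block into tokens and group them in pairs.
data = [list(chunks(block.split('\t'))) for block in blocks]

print(blocks)  # one tab-joined string per entity
print(data)    # one list of [year, value] pairs per entity
```

So `data` ends up with one list of pairs per block, in the same order as the blocks appear in the text.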

And then, as I already wrote, my brain goes into a loop and I keep arriving at the same solution, which does not give the desired result. It shouldn't be that hard to match the two lists, attach the dataset names to the chunks, and then print the data in the proper format, should it?

Ultimately I want to get something like this:

moscow
value_1 = 2008
value_2 = 11186
value_1 = 2009
value_2 = 11281
etc

new-york
value_1
value_2
value_1
value_2

etc

Simply put, I want to attach to each entity from one list all of its corresponding values from the other, and then split those values into pairs.

Please point me in the right direction.
June 10th 19 at 15:28
1 answer
June 10th 19 at 15:30
import re

dataset_names = ["moscow", "new-york"]

# Year and value are separated by a tab
text = """2008\t11 186
2009\t11 281
2011\t11 776
2012\t11 856
2011\t11 776

2012\t11 856"""



prepare_text = re.split('\t+|\n+', text)

out = {k: [{'value_{}'.format(i+1): prepare_text[t+i] for i in range(2)} for t in range(0,len(prepare_text), 2)] for k in dataset_names}
print(out)

Something like this?
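
If each name should get only its own block rather than the whole list, a minimal sketch could split on the empty line first and then zip the blocks with the names. This assumes tab-separated columns and that the blocks come in the same order as `dataset_names`; `parse_block` is a hypothetical helper name, not something from the question.

```python
dataset_names = ["moscow", "new-york"]

# Assumption: columns are tab-separated, blocks are separated by one empty line.
text = "2008\t11 186\n2009\t11 281\n\n2011\t11 776\n2012\t11 856"

def parse_block(block):
    """Turn one block of lines into a list of {'value_1': ..., 'value_2': ...} dicts."""
    rows = []
    for line in block.splitlines():
        year, value = line.split('\t')
        # Drop the space used as a thousands separator.
        rows.append({'value_1': year, 'value_2': value.replace(' ', '')})
    return rows

# Pair the i-th name with the i-th block.
out = {name: parse_block(block)
       for name, block in zip(dataset_names, text.split('\n\n'))}

for name, rows in out.items():
    print(name)
    for row in rows:
        print('value_1 = ' + row['value_1'])
        print('value_2 = ' + row['value_2'])
```

Splitting on the empty line before splitting on tabs keeps the block boundaries, which a single flat split throws away.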
Thanks for the reply. I solved the problem myself. It's clumsy, but I'm no specialist :).

The beginning is the same as in the question, and then I added:

# Pair each block of value pairs with its dataset name
merge_data = list(zip(data, dataset_names))

for dataset in merge_data:
    print('dataset_name=' + dataset[-1])
    for k, v in dataset[0]:
        print('value_1=' + k + '\n' + 'value_2=' + v)
    print('/dataset_name')


The real formatting is different, but that's not the point. Most importantly, I got what I wanted. - Vaughn.Schaden commented on June 10th 19 at 15:33
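
One caveat with the zip approach above: `zip` stops at the shorter input, so a missing name or a missing block goes unnoticed. A minimal sketch of a length check follows; the sample data is made up and `pair_names_with_blocks` is a hypothetical helper name.

```python
def pair_names_with_blocks(names, blocks):
    """Pair names with blocks, refusing input where the counts differ."""
    if len(names) != len(blocks):
        raise ValueError(
            'got {} names but {} blocks'.format(len(names), len(blocks)))
    return list(zip(blocks, names))

# Sample data in the same shape as `data` from the question.
data = [[['2008', '11186']], [['2011', '11776']]]
merge_data = pair_names_with_blocks(["moscow", "new-york"], data)
print(merge_data)
```

On Python 3.10 and later, `zip(blocks, names, strict=True)` performs a similar check without the helper.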
