How do I restrict an export by a dictionary?

The gist:
1) I have a large number of texts from VK
2) and a dictionary of words that I work with

I need this export to contain only those texts that mention the selected words (say, "flat" and "house").
I like how that part turned out...
BUT...
I also need the exported texts to contain no other words from my dictionary!
i.e. if "apartment", "chair", or "wardrobe" is found in a text, that text must not be exported.

i.e. in the end I want a set of texts that contain only the selected words from the dictionary and none of the other dictionary words.

Here is the code:
import csv
from collections import Counter

# Words the exported texts must mention
house_list = {"flat", "house"}
in_csv = open("C:\\Hun\\texts_for_topicminer\\Vk_csv_full_lem_CORRECTED.csv", "rt", newline="")
out_csv = open("C:\\Hun\\dasha\\house_counter.csv", "wt", newline="")
full_house = open("C:\\Hun\\dasha\\house_list-2.csv", "rt", newline="")
reader = csv.reader(in_csv, delimiter=";")
writer = csv.writer(out_csv)
full_house_reader = csv.reader(full_house, delimiter=";")

# Load the full dictionary, then drop the selected words from it
full_house_list = set()
for row in full_house_reader:
    full_house_list.add(row[0])
print(full_house_list)
for house in house_list:
    full_house_list.remove(house)

writer.writerow(["line_number", "auth_id", "date", "text", "city", "region", "text_length", "apartment", "house"])
for num, row in enumerate(reader):
    words_list = row[0].split()
    # Attempt to skip texts that contain other dictionary words
    if set(full_house_list).issubset(words_list):
        continue
    else:
        cnt = Counter(words_list)
        two_house = False
        for house in house_list:
            if cnt[house] != 0:
                two_house = True
        if two_house:
            house_counter = {}
            for house in house_list:
                house_counter[house] = cnt[house]
            writer.writerow([num + 1, row[1], row[4], row[0], row[7], row[8], len(words_list), house_counter["flat"], house_counter["house"]])

in_csv.close()
out_csv.close()
full_house.close()


How can I do that? How should it be written in code?
July 12th 19 at 16:59
1 answer
July 12th 19 at 17:01
The code is not idiomatic, and it is not clear what this part does. Is full_house_list a list of words or a list of texts?

for house in house_list:
 full_house_list.remove(house)


1. My guess is that you need to split each input text into words first, and then intersect the text's set of words with the set of allowed words and with the set of forbidden words, respectively. But then both sets have to list every word form you want to match.
2. In general, for a more robust solution, you need a stemming engine.
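
A minimal sketch of point 1, assuming the texts are already lemmatized so that exact token matching is enough. The function name keep_text and the sample texts are made up for illustration; the word sets mirror the ones in the question. Note that issubset in the question's code skips a text only when it contains all of the other dictionary words at once, whereas isdisjoint rejects it as soon as any one of them appears.

```python
def keep_text(text, allowed, forbidden):
    """Return True if the text has at least one allowed word and no forbidden word."""
    words = set(text.split())
    return bool(words & allowed) and words.isdisjoint(forbidden)


# Hypothetical sample data standing in for the real CSV files.
dictionary = {"flat", "house", "apartment", "chair", "wardrobe"}
allowed = {"flat", "house"}
forbidden = dictionary - allowed  # every dictionary word except the selected ones

texts = [
    "we buy a house near the river",
    "a flat with a chair and a wardrobe",
    "nothing relevant here",
]

kept = [t for t in texts if keep_text(t, allowed, forbidden)]
print(kept)
```

In the question's loop, the same check would presumably replace the issubset test: skip the row unless keep_text(row[0], house_list, full_house_list) is true. If the texts were not lemmatized, each word would have to be normalized (point 2) before building the sets.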
