How do I get the correct encoding for Cyrillic in Python?

All Cyrillic text comes out like this: \xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82
The same thing gets written to the file.
How can I fix it?

from chatterbot import ChatBot

# Create a new instance of a ChatBot
bot = ChatBot(
    "Alice",
    storage_adapter="chatterbot.adapters.storage.JsonDatabaseAdapter",
    logic_adapters=[
        "chatterbot.adapters.logic.MathematicalEvaluation",
        "chatterbot.adapters.logic.TimeLogicAdapter",
        "chatterbot.adapters.logic.ClosestMatchAdapter"
    ],
    input_adapter="chatterbot.adapters.input.TerminalAdapter",
    output_adapter="chatterbot.adapters.output.TerminalAdapter",
    database="database.db"
)

bot.train(
    "Hello",
    "Hi)",
    "How are you doing?",
    "Excellent)",
    "And you?",
    "Well"
)

print("Type something to begin...")

# The following loop will execute each time the user enters input
while True:
    try:
        # We pass None to this method because the parameter
        # is not used by the TerminalAdapter
        bot_input = bot.get_response(None)

    # Press ctrl-c or ctrl-d on the keyboard to exit
    except (KeyboardInterrupt, EOFError, SystemExit):
        break
July 9th 19 at 13:36
2 answers
July 9th 19 at 13:38
Solution
Specify the encoding in the file header, on the first line:
# -*- coding: utf-8 -*-
Which version of Python are you using?
If it is Python 2, add this import right after the encoding declaration:
from __future__ import unicode_literals
Or mark each string as Unicode explicitly:
u"Hello",
u"hi)",
u"How are you?",
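To connect this with the output in the question, here is a minimal sketch of what those escaped bytes are. It assumes the text in question is the word "Привет" (which is what the byte sequence decodes to as UTF-8) and that the source file itself is saved as UTF-8; the variable names are only for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch: a Unicode string versus its UTF-8 encoded bytes.
# Assumes this file is saved as UTF-8.

text = u"Привет"            # Unicode text: 6 characters
raw = text.encode("utf-8")  # encoded bytes: 2 bytes per Cyrillic letter

print(len(text))  # 6
print(len(raw))   # 12
print(repr(raw))  # the escaped \xd0\x9f... form from the question

# Decoding the bytes gives back the original text
assert raw.decode("utf-8") == text
```

So the \xd0\x9f... output is not broken data, it is just how Python displays the encoded bytes.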
The errors stopped, but the file database.db still contains the same gibberish - Geovany74 commented on July 9th 19 at 13:41
It writes to the file in JSON format, and Python escapes your characters. - Nathan_Gaylord commented on July 9th 19 at 13:44
What should I do in this case? - Geovany74 commented on July 9th 19 at 13:47
Use a different adapter, or try rewriting the jsondb adapter so that it writes UTF-8 directly, without escaping the characters. - Nathan_Gaylord commented on July 9th 19 at 13:50
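A rough sketch of what "writing UTF-8 without escaping" could look like with the standard json module, assuming Python 3. The helpers save_json and load_json are hypothetical names made up for this example; they are not part of ChatterBot or jsondb, and the real adapter would have to be changed in its own code.

```python
import json
import os
import tempfile


def save_json(path, data):
    # Hypothetical helper: write JSON as readable UTF-8 text
    # instead of ASCII-only \uXXXX escapes.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)


def load_json(path):
    # Hypothetical helper: read the file back with the same encoding.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Use a temporary directory so the sketch does not touch a real database
path = os.path.join(tempfile.mkdtemp(), "database.db")
data = {"Привет": ["Как дела?"]}

save_json(path, data)

with open(path, "r", encoding="utf-8") as f:
    raw_text = f.read()

print(raw_text)          # the Cyrillic is readable in the file
restored = load_json(path)
```

The file then contains the Cyrillic text itself, and reading it back returns the same data.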
July 9th 19 at 13:40
In [1]: s = b'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [2]: s
Out[2]: b'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [3]: print(s)
b'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [4]: print(s.decode('utf8'))
Привет

You are misunderstanding how encodings work.
What you are looking at is Cyrillic in UTF-8, shown in an ASCII-safe escaped form (aka latin1).

You need to look through the chatterbot settings and see how to set ensure_ascii=False there.
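For reference, this is what ensure_ascii does in the standard json module (Python 3 assumed). Whether chatterbot exposes this option anywhere is not confirmed here, so treat this only as an illustration of the flag itself.

```python
import json

text = "Привет"

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes
escaped = json.dumps(text)

# ensure_ascii=False: characters are written as they are
readable = json.dumps(text, ensure_ascii=False)

print(escaped)   # "\u041f\u0440\u0438\u0432\u0435\u0442"
print(readable)  # "Привет"

# Both forms load back to the same string, so the escaped form loses nothing
assert json.loads(escaped) == json.loads(readable) == text
```

In other words, the escaped file is still valid and round-trips correctly; ensure_ascii=False only makes it human-readable.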
