How to extract text from a tag and then replace it with Beautiful Soup in Python?

There is a piece of HTML code:

<div class="f-subheader subheader f-subheader-sm" data-editable="true" data-main-class="subheader" data-param="subheader">
<p>
 holding educativo internacional
</p>
<p>
 Academia STANDART LONDRES
</p>
</div>
<div class="f-header header f-header-72" data-editable="true" data-main-class="header" data-param="header">
<p>
<br/>
</p>
<p>
<br/>
</p>
<p>
<br/>
</p>
<h1>
 Fashion courses and seminars
<br/>
 hairdressers, stylists, makeup artists, beauticians And manicurists
</h1>
</div>
 <div class="f-desc description f-desc-xl" data-editable="true" data-main-class="description" data-param="description">
<p>
<strong>
 EUROPEAN STANDARD TRAINING IN Mexico and Colombia
<br/>
 FROM BEAUTY EXPERTS FROM LONDON
</strong>
<br/>
</p>
</div>
 <div class="buttons" data-main-class="buttons">
 <button class="btn f-btn btn-success" id="button3504888" style="color: #FFFFFF; background-color: #E31e24;" type="button">
 Ver todos los cursos
 </button>


The text that is not in Russian was successfully extracted, translated and inserted back.
The text that is still untranslated was not found and, accordingly, was not processed.

Python script:

from bs4 import BeautifulSoup as Soup

# html, translator and lang are defined earlier in the script
soup = Soup(html, features="html.parser")
tags = ['span', 'p', 'b', 'a', 'div', 'li', 'h1', 'h2', 'h3', 'button', 'small', 'strong', 'td', 'img', 'input']
errors = 0

for tag in tags:
    for htmltag in soup.find_all(tag):
        try:
            # print(f'Text: {htmltag.text}, string: {htmltag.string}')
            if htmltag.string and len(htmltag.string) > 0:
                # if tag == 'span' and 'Copyright' in htmltag.string: continue
                # print(f'Tag <{tag}> String: {htmltag.string}')
                translated = translator.translate(htmltag.string, dest=lang)
                print(f'<{tag}> {htmltag.string} > {translated.text}')
                htmltag.string.replace_with(translated.text)
            elif tag == 'img' and 'alt' in htmltag.attrs and len(htmltag["alt"]) > 0:
                # print(f'Tag <{tag}> Alt: {htmltag["alt"]}')
                translated = translator.translate(htmltag['alt'], dest=lang)
                print(f'<{tag}> {htmltag["alt"]} > {translated.text}')
                htmltag['alt'] = translated.text
            elif tag == 'input' and 'placeholder' in htmltag.attrs and len(htmltag["placeholder"]) > 0:
                # print(f'Tag <{tag}> Placeholder: {htmltag["placeholder"]}')
                translated = translator.translate(htmltag['placeholder'], dest=lang)
                print(f'<{tag}> {htmltag["placeholder"]} > {translated.text}')
                htmltag['placeholder'] = translated.text
        except Exception as e:
            # Count the failure and keep processing the remaining tags
            print(f'*** ERROR Tag: {tag} , htmltag: {htmltag} , Str: {htmltag.string} / Err: {e} ***')
            errors += 1


With htmltag.text the text is found, but it also picks up the code of a <script> tag if one is inside the <div> block, which htmltag.string does not do.
With .string, as I understand it, the text is not found when the tag also contains <br/> or something else.
How can I extract the text and then replace it in all of the tags that contain it?
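
To make the behaviour described above concrete, here is a minimal sketch using a shortened version of the <h1> from the sample. It shows that .string is None when a tag has more than one child (here, two text nodes separated by <br/>), and it tries one possible workaround: walking the individual text nodes with find_all(string=True) instead of whole tags. The names fake_translate and translate_text_nodes are hypothetical. fake_translate only upper-cases the text and stands in for translator.translate(...).text. Skipping <script> and <style> content and stripping surrounding whitespace are assumptions, not requirements.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment


def fake_translate(text):
    # Hypothetical stand-in for translator.translate(text, dest=lang).text
    return text.upper()


def translate_text_nodes(html, translate):
    """Replace every visible text node in html with translate(node)."""
    soup = BeautifulSoup(html, features="html.parser")
    # find_all(string=True) returns the text nodes themselves, so text
    # sitting next to a <br/> is reached even though tag.string is None.
    for node in soup.find_all(string=True):
        if isinstance(node, Comment):
            continue  # leave HTML comments alone
        if node.parent.name in ("script", "style"):
            continue  # assumption: code inside these tags is not translated
        stripped = node.strip()
        if not stripped:
            continue  # whitespace-only node between tags
        node.replace_with(translate(stripped))
    return soup


html = "<h1> Fashion courses<br/> hairdressers</h1><script>var x = 1;</script>"

# The <h1> has three children (text, <br/>, text), so .string is None
h1_string = BeautifulSoup(html, features="html.parser").h1.string

result = translate_text_nodes(html, fake_translate)
print(h1_string)
print(result)
```

Because each text node is replaced separately, the <br/> and any other nested tags stay where they were. The cost is that a sentence split by <br/> reaches the translator in pieces.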
March 23rd 20 at 19:11