How can I extract the contents of a div in bash by its class name or another attribute?

Good day. I first download an HTML file with wget, and I need to pull the contents of a div container out of it by a particular attribute:

<div class="b-text clearfix js-topic__text mvhh" itemprop="articleBody">
 <div class="some_another class"> content ... another ... </div>
 ... again content ...
 <span> again and again content </span>

Ideally the match should be on itemprop="articleBody", i.e. I want the contents of the container that carries the attribute itemprop="articleBody".
June 5th 19 at 22:00
1 answer
June 5th 19 at 22:02
Try this:

xmllint --html --xpath '//div[@itemprop="articleBody"]' file.html

xmllint needs to be installed first. On Debian/Ubuntu it ships in the libxml2-utils package (apt-get install libxml2-utils).
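
For a fuller picture, here is a minimal sketch of the same approach run against a small sample page. The sample HTML is modelled on the snippet in the question; the closing tags, the extra footer div and the variable names are assumptions added for illustration. It shows three variants: selecting by the itemprop attribute, getting only the text with the XPath string() function, and selecting by one class name.

```shell
#!/usr/bin/env bash
# Write a small sample page to a temp file (hypothetical content based on the question).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<html><body>
<div class="b-text clearfix js-topic__text mvhh" itemprop="articleBody">
 <div class="some_another class"> content ... another ... </div>
 ... again content ...
 <span> again and again content </span>
</div>
<div class="footer">not wanted</div>
</body></html>
EOF

# --html selects the lenient HTML parser; 2>/dev/null hides its warnings.

# 1. The whole matching element, tags included.
outer=$(xmllint --html --xpath '//div[@itemprop="articleBody"]' "$tmp" 2>/dev/null)

# 2. Only the text inside the element, markup stripped.
text=$(xmllint --html --xpath 'string(//div[@itemprop="articleBody"])' "$tmp" 2>/dev/null)

# 3. Match on a single class name; the padding with spaces avoids
#    matching class names that merely contain the same substring.
by_class=$(xmllint --html --xpath \
  '//div[contains(concat(" ", normalize-space(@class), " "), " js-topic__text ")]' \
  "$tmp" 2>/dev/null)

printf '%s\n' "$outer"
printf '%s\n' "$text"
printf '%s\n' "$by_class"

rm -f "$tmp"
```

If the page is not saved to disk, xmllint can also read from standard input when the file name is given as -, so something like wget -qO- URL | xmllint --html --xpath '...' - should work as well (URL here is a placeholder).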
