Recommend an XML parser for C++ to parse large files (1 GB+)

The parser must not load the whole file into memory; it should read it incrementally. The more standard the library, the better: if there is something in the STL or Boost, that is ideal; if not, then perhaps some more or less standard Linux library can do it. It also needs to be fast.
October 3rd 19 at 02:22
5 answers
October 3rd 19 at 02:24
Any SAX parser will do.
Not just any will do; I need a fast one. I tried libxml2, and it is not fast enough. - Thalia.T commented on October 3rd 19 at 02:27
Can you specify what "fast enough" means? For example, how long it takes to parse your 1 GB file with libxml2, and what time you are aiming for. - Adrain_Runolfsson52 commented on October 3rd 19 at 02:30
October 3rd 19 at 02:26
Search for "SAX parser". Xerces and expat come to mind, but there are many others.
I found this in my archives:

lars.ruoff.free.fr/xmlcpp/ - Thalia.T commented on October 3rd 19 at 02:29
Yes, I would not recommend Xerces; it is too heavy for such a simple task. So expat it is. - Adrain_Runolfsson52 commented on October 3rd 19 at 02:32
Judging by the benchmarks at pugixml.org/benchmark/, Xerces and expat are slower than libxml2's SAX parser. - emanuel91 commented on October 3rd 19 at 02:35
Those benchmarks use four test files: cathedral.xml (1 MB), employees-big.xml (10 MB), house.dae (6 MB) and terrover.xml (16 MB).

So none of the test files is particularly large. - Jaclyn_Ro commented on October 3rd 19 at 02:38
October 3rd 19 at 02:28
There is also libxml2, written in C; it supports SAX. It is pretty standard on Linux. I don't know how fast it is, though.
The fact that it is standard is a plus, but I want something faster. - Thalia.T commented on October 3rd 19 at 02:31
October 3rd 19 at 02:30
It depends heavily on the task.
If the goal is to parse everything and then dig through the result at length, then of course use a SAX-type library; the most common option is libxml2.
If the goal is to find something specific, then most likely, IMHO, line-by-line processing with regexps.
Parsing HTML or XML with regexps is not advised, and on a 1 GB file the regexps will turn blue in the face. - Thalia.T commented on October 3rd 19 at 02:33
Read the previous text carefully: line-by-line processing with regexps. - Adrain_Runolfsson52 commented on October 3rd 19 at 02:36
You are right, regexps turn out to be fast enough. It is even faster to do this without regexps, using standard string functions. Tested in practice. But it is not something I would want to write and maintain. - emanuel91 commented on October 3rd 19 at 02:39
October 3rd 19 at 02:32
Perhaps pugixml will suit you.
I looked at the documentation; as I understand it, pugixml loads the whole file into memory at once. That does not suit me. - Thalia.T commented on October 3rd 19 at 02:35
