I've been using the Html Agility Pack for a while and never had any encoding issues before, but recently I've been seeing loads of them, where curly quotes such as ’ and “ come out garbled in the loaded document.
There is a very simple solution. Instead of loading the document like this:
doc.Load(file.ToString());
The encoding should be specified, like this:
doc.Load(file.ToString(), System.Text.Encoding.UTF8, false);
Once the document was loaded with the encoding specified, all the encoding issues disappeared as if by magic :)
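For anyone wondering why this works, here is a minimal sketch of what I think is going on, using only System.Text and not the Html Agility Pack itself. It assumes the files are saved as UTF-8 and that the loader was guessing a single-byte encoding; ISO-8859-1 is only a stand-in for that wrong guess, and the names EncodingDemo and Decode are made up for illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: decodes the same bytes with whichever encoding the caller picks.
    public static string Decode(byte[] bytes, Encoding encoding) => encoding.GetString(bytes);

    public static void Main()
    {
        // A right single quotation mark (U+2019) as stored in a UTF-8 file: bytes E2 80 99.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("\u2019");

        // Wrong guess (assumed here to be ISO-8859-1): every byte becomes its own character.
        string wrong = Decode(utf8Bytes, Encoding.GetEncoding("iso-8859-1"));

        // Right guess: UTF-8 reassembles the three bytes into the original quote.
        string right = Decode(utf8Bytes, Encoding.UTF8);

        Console.WriteLine(wrong.Length);       // 3
        Console.WriteLine(right == "\u2019");  // True
    }
}
```

The bytes are the same in both cases; only the encoding used to read them changes, which is presumably why passing the encoding to doc.Load was enough to fix it.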