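To see the exact exception in Visual Studio, press Ctrl+Alt+E and enable Common Language Runtime Exceptions. It looks like LoadHtmlAsXml expects HTML, so the best option is probably to use WebClient.DownloadString(url) together with an HtmlDocument whose OptionOutputAsXml property is set to true, as shown below, and to catch the failure when parsing does not succeed: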
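using System.IO;
using System.Net;
using System.Xml;
using System.Xml.Linq;
using HtmlAgilityPack;

public XDocument Scrape(string url)
{
    string htmlorxml;
    using (var wc = new WebClient())
    {
        // Download the raw response body; it may be HTML or already XML.
        htmlorxml = wc.DownloadString(url);
    }

    // Have HtmlAgilityPack parse the markup and write it back out as XML.
    var doc = new HtmlDocument { OptionOutputAsXml = true };
    doc.LoadHtml(htmlorxml); // without this call the document stays empty
    var stringWriter = new StringWriter();
    doc.Save(stringWriter);

    try
    {
        // First attempt: the XML produced by HtmlAgilityPack.
        return XDocument.Parse(stringWriter.ToString());
    }
    catch (XmlException)
    {
        try
        {
            // Fallback: the response may already be well-formed XML.
            return XDocument.Parse(htmlorxml);
        }
        catch (XmlException)
        {
            // Neither form could be parsed.
            return null;
        }
    }
}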