Selecting a specific div from an external web page using cURL

curl html regex php

Paul · 15 years ago

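Hi, can anyone help me with how to select a specific div from the content of a web page?

Let's say I want to get the div with id="wrapper_content" from the web page http://www.test.com/page3.php.

My current code looks something like this (not working):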

//REG EXP.
$s_searchFor = '@^/.dont know what to put here..@ui';    

//CURL
$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, 'http://www.test.com/page3.php');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
if(!preg_match($s_searchFor, $ch))
{
  $file_contents = curl_exec($ch);
}
curl_close($ch);

// display file
echo $file_contents;

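So I would like to know how to use a regular expression to find the specific div, and how to unset the rest of the web page so that $file_contents contains only the div.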

3 answers | most recent 12 years ago

Community CDub · 8 years ago

HTML isn't regular, so you shouldn't use regex on it. Instead, I'd recommend an HTML parser such as Simple HTML DOM or DOM.

If you go with Simple HTML DOM, you could do something like this:

$html = str_get_html($file_contents);
$elem = $html->find('div[id=wrapper_content]', 0);
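For the other option mentioned above, PHP's bundled DOM extension, a minimal sketch might look like the following. The function name extractDivById is hypothetical (it is not part of any library), and the sketch assumes the page has already been fetched into a string.

```php
<?php
// Hypothetical helper (not from the original post): returns the outer HTML
// of the first <div> whose id attribute equals $id, or null if none exists.
function extractDivById(string $html, string $id): ?string
{
    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Walk every <div> and compare its id attribute.
    foreach ($doc->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('id') === $id) {
            // Serialize only this node, including any nested elements.
            return $doc->saveHTML($div);
        }
    }

    return null;
}
```

Because the parser builds a tree, nested divs inside wrapper_content come back intact, which is the case a regex tends to get wrong. Usage would be along the lines of `$file_contents = extractDivById($file_contents, 'wrapper_content');`.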

Even if you do use regex, your code still won't work as written. You need to fetch the page contents before you can run the regex on them.

//wrong
if(!preg_match($s_searchFor, $ch)){
    $file_contents = curl_exec($ch);
}

//right
$file_contents = curl_exec($ch); //get the page contents (false on failure)
if ($file_contents !== false && preg_match($s_searchFor, $file_contents, $matches)) {
    $file_contents = $matches[0]; //keep only the matched element
}
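The pattern for $s_searchFor is left open in the question. As a rough sketch only, assuming the target div contains no nested div and that its id is written with double quotes, the fetch and the match could be split into two helpers. Both names, fetchPage and matchSimpleDiv, are hypothetical and do not come from the original post.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the response body,
// or null if the transfer failed.
function fetchPage(string $url, int $timeout = 5): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);    // return body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $body = curl_exec($ch);                             // false on failure
    // The handle is released when $ch goes out of scope.
    return $body === false ? null : $body;
}

// Hypothetical helper: regex match for a div with the given id.
// Assumption: the div has no nested <div>, because the lazy match
// stops at the first closing </div> it finds.
function matchSimpleDiv(string $html, string $id): ?string
{
    $pattern = '@<div[^>]*\bid="' . preg_quote($id, '@') . '"[^>]*>.*?</div>@si';
    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}
```

A call might look like `$div = matchSimpleDiv(fetchPage('http://www.test.com/page3.php') ?? '', 'wrapper_content');`. If the div does contain nested divs, the match is cut off at the first inner `</div>`, which is one practical reason to prefer a parser.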

Amit Garg · 12 years ago

include('simple_html_dom.php');
$html = str_get_html($file_contents);
$elem = $html->find('div[id=wrapper_content]', 0);

Download simple_html_dom.php

imightbeinatree at Cloudspace · 15 years ago

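Check out Hpricot (a Ruby HTML parser); it lets you elegantly select sections of a document.

First fetch the document with curl, then use Hpricot to pull out the part you need.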
