
Trying to manipulate a web page's data entry with jsoup

  • Tony  · Tech Community  · 8 years ago

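    I am trying to use jsoup to enter values into the page at the URL used in the code below.

    The values are the street address number and the street name, represented by inpNumber and inpStreet.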

    HTML:

    <td width="48">
      <input type="text" id="inpNumber" name="inpNumber" class="Input" size="5" value="" onkeypress="clearAction(this)" />
    </td>
    
    <td width="40">
      <input type="text" id="inpUnit" name="inpUnit" class="Input" size="4" value="" onkeypress="clearAction(this)" />
    </td>
    
    <td width="160">
      <input type="text" id="inpStreet" name="inpStreet" class="Input" size="20" value="" onkeypress="clearAction(this)" />
    </td>

    Only inpNumber and inpStreet need to be filled in.

    So far I have tried:

    String url = "http://icare.fairfaxcounty.gov/ffxcare/search/commonsearch.aspx?mode=address";    
    try {
        Connection.Response response = Jsoup.connect(url)
                    .userAgent("Mozilla/5.0")
                    .timeout(10 * 10000)
                    .method(Connection.Method.POST)
                    .data("inpNumber", "4127")
                    .data("inpUnit", "")
                    .data("inpStreet", "Winter Harbor")
                    .data("btSearch", "")
                    .data("inpSuffix1", "")
                    .followRedirects(true)
                    .execute();
    
        //parse the document from response
        Document document = response.parse();
        System.out.println(" extracting information from site ");
        
        FileWriter fw = new FileWriter("doc.html");
        BufferedWriter bw = new BufferedWriter(fw);
        bw.write(document.html());
        bw.close();
    } catch (Exception ex){
        ex.printStackTrace();
    }
    

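    I have also tried several variations of the code above, including more or fewer key/value pairs (setting and sending back the "" values found with Firebug), inspecting all of the returned values, and changing the calls chained onto Jsoup.connect(url).

    The result I get in the doc.html file is the original, unchanged page. What am I doing wrong?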

    1 reply  |  until 5 years ago
  •   Vanna    8 years ago

    The information is sent as a request payload, and the best way I found to send it is with requestBody(String). The code below has been tested and works.

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    
    import org.jsoup.*;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;
    
    import static java.net.URLEncoder.encode;
    

    Code:

    public static void main(String[] args) {
        String url = "http://icare.fairfaxcounty.gov/ffxcare/search/commonsearch.aspx?mode=address";
        String userAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0";
    
        try {
    
            // GET required information for validation
            // Note that you might want to make a method out of this and call it whenever you need to instead of always
            Elements inputs = Jsoup.connect(url)
                    .userAgent(userAgent)
                    .get().select("input");
    
            String eventValidation = encode(inputs.select("#__EVENTVALIDATION").attr("value"), "UTF-8");
            String viewStateGen = encode(inputs.select("#__VIEWSTATEGENERATOR").attr("value"), "UTF-8");
            String viewState = encode(inputs.select("#__VIEWSTATE").attr("value"), "UTF-8");
    
    
            int number = 4127;
            String street = encode("Winter Harbor", "UTF-8");
    
            // not necessary
            String unit = "";
            String suffix = "";
    
            Document document = Jsoup.connect(url)
                    .userAgent(userAgent)
                    .requestBody(
                            String.format(
                                    "mode=ADDRESS"
                                    + "&__VIEWSTATE=%s"
                                    + "&__VIEWSTATEGENERATOR=%s"
                                    + "&__EVENTVALIDATION=%s"
                                    + "&inpNumber=%d"
                                    + "&inpUnit=%s"
                                    + "&inpStreet=%s"
                                    + "&inpSuffix1=%s", 
                                    viewState, viewStateGen, eventValidation,
                                    number, unit, street, suffix))
                    .post();
    
    
            System.out.println("Extracting information from the site...");
    
            // try-with-resources closes the writer even if write() throws
            try (BufferedWriter bw = new BufferedWriter(new FileWriter("doc.html"))) {
                bw.write(document.html());
            }
    
            System.out.println("Done.");
        } catch (Exception ex) {
            //TODO Handle exceptions
            ex.printStackTrace();
        }
    
    }
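
    The payload above is assembled by hand with String.format and one encode(...) call per value. As a minimal sketch of the same idea, the field names and values could be collected in a map and URL-encoded in a loop. FormBody and build are hypothetical names, not part of jsoup, and the __VIEWSTATE value below is a placeholder rather than a real token from the site.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBody {

    // Hypothetical helper (not a jsoup API): joins the fields into an
    // application/x-www-form-urlencoded string, keeping insertion order.
    static String build(Map<String, String> fields) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("mode", "ADDRESS");
        fields.put("__VIEWSTATE", "abc+/=");   // placeholder, not a real token
        fields.put("inpNumber", "4127");
        fields.put("inpStreet", "Winter Harbor");

        // prints mode=ADDRESS&__VIEWSTATE=abc%2B%2F%3D&inpNumber=4127&inpStreet=Winter+Harbor
        System.out.println(build(fields));
    }
}
```

    The resulting string could then be passed to requestBody(String) in place of the String.format call. The values must not be encoded a second time before they go into the map, since build already encodes them.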
    