
Linux Bash: cURL - how to pass variables into the URL

  •  3
  • Shushiro  · 7 years ago

    I want to make a cURL GET request. The following URL should be used:

    https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi' -H 'Host: iant.toulouse.inra.fr' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3' --compressed -H 'Referer: https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi?__wb_cookie=&__wb_cookie_name=auth.rhime&__wb_cookie_path=/bacteria/annotation/cgi&__wb_session=WB84Qfsf&__wb_main_menu=Genome&__wb_function=$parent' -H 'Content-Type: application/x-www-form-urlencoded' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' --data '__wb_function=PortalExtractSeq&mode=run&species=rhime&fastafile=%2Fwww%2Fbacteria%2Fannotation%2F%2Fsite%2Fprj%2Frhime%2F%2Fdb%2F$ab.genomic&begin=$start&end=$end&strand=$strand
    

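    At the end of the URL there are a few words that I would like to turn into variables, so that the URL changes with the input and I request a different resource each time.

    The $ab, $start, $end and $strand at the end of the URL are variables; they are all strings.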

    ...2Frhime%2F%2Fdb%2F$ab.genomic&begin=$start&end=$end&strand=$strand
    

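    I came across "urlencode". My idea was to store the URL as one big string in a variable and pass it to urlencode, but I don't know how to do that.

    This is what I tried / I am looking for something like this: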

    #!bin/bash
    [...]
    cURL="https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi' -H 'Host: iant.toulouse.inra.fr' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3' --compressed -H 'Referer: https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi?__wb_cookie=&__wb_cookie_name=auth.rhime&__wb_cookie_path=/bacteria/annotation/cgi&__wb_session=WB84Qfsf&__wb_main_menu=Genome&__wb_function=$parent' -H 'Content-Type: application/x-www-form-urlencoded' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' --data '__wb_function=PortalExtractSeq&mode=run&species=rhime&fastafile=%2Fwww%2Fbacteria%2Fannotation%2F%2Fsite%2Fprj%2Frhime%2F%2Fdb%2F$ab.genomic&begin=$start&end=$end&strand=$strand"
    
    # storing HTTP response code in variable response. Only if the
    # response code is OK (200), we move on
      response=$(curl -X HEAD -I --header 'Accept:txt/html' "https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi?__wb_cookie=&__wb_cookie_name=auth.rhime&__wb_cookie_path=/bacteria/annotation/cgi&__wb_session=WB8jqwTM&__wb_main_menu=Genome&__wb_function="$location""|head -n1|awk '{print $2}')
    
      echo "$response"
    
    # getting information via curl request
      if [ $response = 200 ] ; then
        info=$(curl -G "$ (urlencode "$cURL")")
      fi
    
      echo $info
    

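    Passing $location directly seems to work for my response-code check, but with more variables I get an error (response code 100, while the code check gets 200).

    Do I have a general misunderstanding of curl/urlencode? What am I missing?

    Thanks in advance for your time and effort :)

    UPDATE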

    #!/bin/bash
    # handling command-line input
    file=$1
    ecf=$2
    
    
    # iterating through file and pulling out
    # information for the GET- and POST-request
    
    while read -r line
      do
        parent=$(echo $line | awk '{print substr($1,2,3)}')
        start=$(echo $line | awk '{print substr($2,2,6)}')
        end=$(echo $line | awk '{print substr($3,2,6)}')
        strand=$(echo $line | awk '{print substr($4,2,1)}')
        locus=$(echo $line | awk '{print substr($6,2,8)}')
    
    # depending on $parent, the right insertion for the URL is generated
        if [ $parent = "SMc" ] ; then
          location="Genome"
          ab="SMc"
        elif [ $parent = "SMa" ] ; then
          location="PrintPsyma"
          ab="pSymA"
        else  # any other value is treated as SMb
          location="PrintPsymb"
          ab="pSymB"
        fi
    # building variables for curl content request
    
    
      options=( --compressed)
    
      headers=(
        -H 'Host: iant.toulouse.inra.fr'
        -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0'
        -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
        -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3'
        -H 'Referer: https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi?__wb_cookie=&__wb_cookie_name=auth.rhime&__wb_cookie_path=/bacteria/annotation/cgi&__wb_session=WB84Qfsf&__wb_main_menu=Genome&__wb_function=$parent'
        -H 'Content-Type: application/x-www-form-urlencoded'
        -H 'Connection: keep-alive'
        -H 'Upgrade-Insecure-Requests: 1'
        -H 'Pragma: no-cache'
        -H 'Cache-Control: no-cache'
      )
    
        url='https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi'
    
        ab=$(urlencode "${ab}")
        start=$(urlencode "${start}")
        end=$(urlencode "${end}")
        strand=$(urlencode "${strand}")
        data="__wb_function=PortalExtractSeq&mode=run&species=rhime&fastafile=%2Fwww%2Fbacteria%2Fannotation%2F%2Fsite%2Fprj%2Frhime%2F%2Fdb%2F$ab.genomic&begin=$start&end=$end&strand=$strand"
    
    
    
    
    # storing HTTP response code in variable response. Only if the
    # response code is OK (200), we move on
        response=$(curl -X HEAD -I --header 'Accept:txt/html' "https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi?__wb_cookie=&__wb_cookie_name=auth.rhime&__wb_cookie_path=/bacteria/annotation/cgi&__wb_session=WB8jqwTM&__wb_main_menu=Genome&__wb_function="$location""|head -n1|awk '{print $2}')
    
        echo "$response"
    
    # getting information via curl request
        if [ $response = 200 ] ; then
            info=$(curl -G "${options[@]}" "${headers[@]}" --data "${data}" "${url}")
        fi
    
        echo $info
    
    done < $file
    
    1 answer  |  as of 5 years ago
  •  3
  •   Yoory N. jmcg    7 years ago

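    You need to separate the concepts. The string you put into the cURL variable is not a URL: it is a URL plus a set of headers, the request parameters, and a compression option. These are all different things.

    Define them separately, like this: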

    url='https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi'
    headers=(
        -H 'Host: iant.toulouse.inra.fr'
        -H 'User-Agent: ...'
        -H 'Accept: ...'
        -H 'Accept-Language: ...'
        # ... other headers from your example ...
    )
    options=(
        --compressed
    )
    data="__wb_function=PortalExtractSeq&mode=run&species=rhime&fastafile=%2Fwww%2Fbacteria%2Fannotation%2F%2Fsite%2Fprj%2Frhime%2F%2Fdb%2F$ab.genomic&begin=$start&end=$end&strand=$strand"
    

    Then run curl like this:

    curl -G "${options[@]}" "${headers[@]}" --data "${data}" "${url}"
    

    This expands to the correct curl command, with each array element passed as its own argument.
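
To see why the array form matters, here is a small sketch that only counts arguments and never calls curl. The `show_args` helper and the `example.invalid` host are made up for illustration; they are not part of curl or of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reports how many arguments it received, one per line.
show_args() {
    printf '%d argument(s)\n' "$#"
    printf '  [%s]\n' "$@"
}

# Everything crammed into one string: the embedded single quotes are plain
# characters, not shell quoting.
as_string="'https://example.invalid/rhime.cgi' -H 'Host: example.invalid' --compressed"

# The same pieces kept apart in an array, one element per argument.
as_array=( 'https://example.invalid/rhime.cgi' -H 'Host: example.invalid' --compressed )

show_args "$as_string"       # quoted: one single argument (curl would take it all as the URL)
show_args $as_string         # unquoted: split on whitespace, the header is torn in two
show_args "${as_array[@]}"   # array: exactly the arguments that were intended
```

The quoted string arrives as one argument, the unquoted string as five (with literal quote characters still attached), and the array as the intended four.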

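    Regarding the urlencode part: you need to encode $ab, $start, $end and $strand individually. If you insert them into the string first and then encode the whole thing, every special character in that string, such as & and =, gets encoded as well, and sequences that are already encoded, like %2F in your example, get encoded twice (%2F becomes %252F).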
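
The double-encoding effect can be reproduced with a rough sketch like the one below. It assumes `urlencode` is not already installed and defines a minimal pure-bash stand-in that handles ASCII input only. The sample values and the shortened `fastafile=%2Fdb%2F` template are made up for illustration.

```bash
#!/usr/bin/env bash
# Minimal percent-encoder, a stand-in for whatever `urlencode` command you use.
# Assumption: ASCII input only; multi-byte characters are not handled here.
urlencode() {
    local LC_ALL=C s=$1 out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;                      # unreserved: keep as-is
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;        # everything else: %XX
        esac
    done
    printf '%s\n' "$out"
}

ab='pSymA' start='1000' end='2500' strand='+'   # sample values, not from the real data
template='fastafile=%2Fdb%2F'                   # shortened, already-encoded prefix

# Encoding the assembled string mangles =, & and the existing %2F:
urlencode "${template}${ab}.genomic&begin=${start}"
# prints: fastafile%3D%252Fdb%252FpSymA.genomic%26begin%3D1000

# Encoding only the values leaves the separators intact:
printf '%s\n' "${template}$(urlencode "$ab").genomic&begin=$(urlencode "$start")&end=$(urlencode "$end")&strand=$(urlencode "$strand")"
# prints: fastafile=%2Fdb%2FpSymA.genomic&begin=1000&end=2500&strand=%2B
```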

    To keep the code tidy, you can encode the values beforehand:

    ab=$(urlencode "${ab}")
    start=$(urlencode "${start}")
    end=$(urlencode "${end}")
    strand=$(urlencode "${strand}")
    data="__wb_function=PortalExtractSeq&mode=run&species=rhime&fastafile=%2Fwww%2Fbacteria%2Fannotation%2F%2Fsite%2Fprj%2Frhime%2F%2Fdb%2F$ab.genomic&begin=$start&end=$end&strand=$strand"
    

    ... or, more cumbersomely, inline:

    data="__wb_function=PortalExtractSeq&mode=run&species=rhime&fastafile=%2Fwww%2Fbacteria%2Fannotation%2F%2Fsite%2Fprj%2Frhime%2F%2Fdb%2F$(urlencode "${ab}").genomic&begin=$(urlencode "${start}")&end=$(urlencode "${end}")&strand=$(urlencode "${strand}")"
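
As a possible alternative to a separate urlencode step (not what the answer above does), curl can percent-encode the values itself: `--data-urlencode name=value` encodes the part after the first `=`, and with `-G` the result is appended to the URL as a query string. The sketch below only prints the command it would run. The sample values are made up, and the `fastafile` path is the decoded form of the `%2F...` string used above.

```bash
#!/usr/bin/env bash
# Sample values for illustration; in the real script they come from the input file.
ab='pSymA' start='1000' end='2500' strand='+'

url='https://iant.toulouse.inra.fr/bacteria/annotation/cgi/rhime.cgi'

# Raw, unencoded values: curl percent-encodes everything after the first '='.
params=(
    --data-urlencode '__wb_function=PortalExtractSeq'
    --data-urlencode 'mode=run'
    --data-urlencode 'species=rhime'
    --data-urlencode "fastafile=/www/bacteria/annotation//site/prj/rhime//db/${ab}.genomic"
    --data-urlencode "begin=${start}"
    --data-urlencode "end=${end}"
    --data-urlencode "strand=${strand}"
)

# Print the command instead of running it, so the sketch has no network side effects.
printf '%q ' curl -G --compressed "${params[@]}" "$url"
printf '\n'

# To send the request for real (headers array as defined earlier):
# curl -G --compressed "${headers[@]}" "${params[@]}" "$url"
```

With this approach nothing in the script is pre-encoded, so the double-encoding problem cannot arise.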
    

    I hope this helps.