
Ruby on Rails "invalid byte sequence in UTF-8" caused by a bot

  •  52
  • CupraR_On_Rails  · 11 years ago

    I am getting errors triggered by a Chinese bot ( http://www.easou.com/search/spider.html ) when it crawls my websites.

    All of my applications run Ruby 1.9.3 and Rails 3.2.x.

    Here is a stack trace:

    An ArgumentError occurred in listings#show:
    
      invalid byte sequence in UTF-8
      rack (1.4.5) lib/rack/utils.rb:104:in `normalize_params'
    
    
    -------------------------------
    Request:
    -------------------------------
    
      * URL       : http://www.my-website.com
      * IP address: X.X.X.X
      * Parameters: {"action"=>"show", "controller"=>"listings", "id"=>"location-t7-villeurbanne--58"}
      * Rails root: /.../releases/20140708150222
      * Timestamp : 2014-07-09 02:57:43 +0200
    
    -------------------------------
    Backtrace:
    -------------------------------
    
      rack (1.4.5) lib/rack/utils.rb:104:in `normalize_params'
      rack (1.4.5) lib/rack/utils.rb:96:in `block in parse_nested_query'
      rack (1.4.5) lib/rack/utils.rb:93:in `each'
      rack (1.4.5) lib/rack/utils.rb:93:in `parse_nested_query'
      rack (1.4.5) lib/rack/request.rb:332:in `parse_query'
      actionpack (3.2.18) lib/action_dispatch/http/request.rb:275:in `parse_query'
      rack (1.4.5) lib/rack/request.rb:209:in `POST'
      actionpack (3.2.18) lib/action_dispatch/http/request.rb:237:in `POST'
      actionpack (3.2.18) lib/action_dispatch/http/parameters.rb:10:in `parameters'
    
    -------------------------------
    Session:
    -------------------------------
    
      * session id: nil
      * data: {}
    
    -------------------------------
    Environment:
    -------------------------------
    
      * CONTENT_LENGTH                                 : 514
      * CONTENT_TYPE                                   : application/x-www-form-urlencoded
      * HTTP_ACCEPT                                    : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
      * HTTP_ACCEPT_ENCODING                           : gzip, deflate
      * HTTP_ACCEPT_LANGUAGE                           : zh;q=0.9,en;q=0.8
      * HTTP_CONNECTION                                : close
      * HTTP_HOST                                      : www.my-website.com
      * HTTP_REFER                                     : http://www.my-website.com/
      * HTTP_USER_AGENT                                : Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)
      * ORIGINAL_FULLPATH                              : /
      * PASSENGER_APP_SPAWNER_IDLE_TIME                : -1
      * PASSENGER_APP_TYPE                             : rack
      * PASSENGER_CONNECT_PASSWORD                     : [FILTERED]
      * PASSENGER_DEBUGGER                             : false
      * PASSENGER_ENVIRONMENT                          : production
      * PASSENGER_FRAMEWORK_SPAWNER_IDLE_TIME          : -1
      * PASSENGER_FRIENDLY_ERROR_PAGES                 : true
      * PASSENGER_GROUP                                :
      * PASSENGER_MAX_REQUESTS                         : 0
      * PASSENGER_MIN_INSTANCES                        : 1
      * PASSENGER_SHOW_VERSION_IN_HEADER               : true
      * PASSENGER_SPAWN_METHOD                         : smart-lv2
      * PASSENGER_USER                                 :
      * PASSENGER_USE_GLOBAL_QUEUE                     : true
      * PATH_INFO                                      : /
      * QUERY_STRING                                   :
      * REMOTE_ADDR                                    : 183.60.212.153
      * REMOTE_PORT                                    : 52997
      * REQUEST_METHOD                                 : GET
      * REQUEST_URI                                    : /
      * SCGI                                           : 1
      * SCRIPT_NAME                                    :
      * SERVER_PORT                                    : 80
      * SERVER_PROTOCOL                                : HTTP/1.1
      * SERVER_SOFTWARE                                : nginx/1.2.6
      * UNION_STATION_SUPPORT                          : false
      * _                                              : _
      * action_controller.instance                     : listings#show
      * action_dispatch.backtrace_cleaner              : #<Rails::BacktraceCleaner:0x000000056e8660>
      * action_dispatch.cookies                        : #<ActionDispatch::Cookies::CookieJar:0x00000006564e28>
      * action_dispatch.logger                         : #<ActiveSupport::TaggedLogging:0x0000000318aff8>
      * action_dispatch.parameter_filter               : [:password, /RAW_POST_DATA/, /RAW_POST_DATA/, /RAW_POST_DATA/]
      * action_dispatch.remote_ip                      : 183.60.212.153
      * action_dispatch.request.content_type           : application/x-www-form-urlencoded
      * action_dispatch.request.parameters             : {"action"=>"show", "controller"=>"listings", "id"=>"location-t7-villeurbanne--58"}
      * action_dispatch.request.path_parameters        : {:action=>"show", :controller=>"listings", :id=>"location-t7-villeurbanne--58"}
      * action_dispatch.request.query_parameters       : {}
      * action_dispatch.request.request_parameters     : {}
      * action_dispatch.request.unsigned_session_cookie: {}
      * action_dispatch.request_id                     : 9f8afbc8ff142f91ddbd9cabee3629f3
      * action_dispatch.routes                         : #<ActionDispatch::Routing::RouteSet:0x0000000339f370>
      * action_dispatch.show_detailed_exceptions       : false
      * action_dispatch.show_exceptions                : true
      * rack-cache.allow_reload                        : false
      * rack-cache.allow_revalidate                    : false
      * rack-cache.cache_key                           : Rack::Cache::Key
      * rack-cache.default_ttl                         : 0
      * rack-cache.entitystore                         : rails:/
      * rack-cache.ignore_headers                      : ["Set-Cookie"]
      * rack-cache.metastore                           : rails:/
      * rack-cache.private_headers                     : ["Authorization", "Cookie"]
      * rack-cache.storage                             : #<Rack::Cache::Storage:0x000000039c5768>
      * rack-cache.use_native_ttl                      : false
      * rack-cache.verbose                             : false
      * rack.errors                                    : #<IO:0x000000006592a8>
      * rack.input                                     : #<PhusionPassenger::Utils::RewindableInput:0x0000000655b3a0>
      * rack.multiprocess                              : true
      * rack.multithread                               : false
      * rack.request.cookie_hash                       : {}
      * rack.request.form_hash                         :
      * rack.request.form_input                        : #<PhusionPassenger::Utils::RewindableInput:0x0000000655b3a0>
      * rack.request.form_vars                         : [binary request body, not valid UTF-8; omitted]
      * rack.request.query_hash                        : {}
      * rack.request.query_string                      :
      * rack.run_once                                  : false
      * rack.session                                   : {}
      * rack.session.options                           : {:path=>"/", :domain=>nil, :expire_after=>nil, :secure=>false, :httponly=>true, :defer=>false, :renew=>false, :coder=>#<Rack::Session::Cookie::Base64::Marshal:0x000000034d4ad8>, :id=>nil}
      * rack.url_scheme                                : http
      * rack.version                                   : [1, 0]
    

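    As you can see, there is no invalid UTF-8 in the URL, only in rack.request.form_vars . I get about 100 of these errors per day, and they all look like this one.

    So I tried to force UTF-8 on rack.request.form_vars like this: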

    class RackFormVarsSanitizer
      def initialize(app)
        @app = app
      end
    
      def call(env)
        if env["rack.request.form_vars"] 
          env["rack.request.form_vars"] = env["rack.request.form_vars"].force_encoding('UTF-8')
        end
        @app.call(env)
      end
    end
    

    I register it in application.rb :

    config.middleware.use "RackFormVarsSanitizer"
    

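    It does not seem to work, because I still get the errors. The problem is that I cannot test it in development mode, because I do not know how to set rack.request.form_vars .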
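
    One likely reason the middleware has no effect: force_encoding only changes the encoding label on a string and leaves the bytes untouched, so invalid bytes stay invalid. In addition, rack.request.form_vars appears to be filled in by Rack itself while it parses the body, so it may still be nil when this middleware runs. The sketch below reproduces the same ArgumentError outside a request and shows one way to replace the invalid bytes. The sample bytes, the constant names and the helper replace_invalid_utf8 are hypothetical, and NAME_PATTERN only resembles the kind of regexp Rack applies to parameter names.

```ruby
# Hypothetical sample standing in for the bot's request body:
# two invalid bytes (\xFF\xFE) followed by a valid UTF-8 "é" (\xC3\xA9).
SAMPLE_BODY = "id=\xFF\xFE&name=caf\xC3\xA9".force_encoding('BINARY')

# Assumption: a pattern resembling the one Rack applies to parameter names.
NAME_PATTERN = /\A[\[\]]*([^\[\]]+)\]*/

# Replaces each invalid byte with "?" by round-tripping through UTF-16LE.
# This idiom works on Ruby 1.9.3; Ruby 2.1+ also offers String#scrub.
def replace_invalid_utf8(str)
  str.dup.force_encoding('UTF-8')
     .encode('UTF-16LE', invalid: :replace, undef: :replace, replace: '?')
     .encode('UTF-8')
end

# force_encoding relabels the string; the invalid bytes are still there.
relabeled = SAMPLE_BODY.dup.force_encoding('UTF-8')
puts relabeled.valid_encoding?   # false

# Matching a regexp against the relabeled string raises the familiar error.
begin
  relabeled =~ NAME_PATTERN
rescue ArgumentError => e
  puts "reproduced: #{e.message}"
end

# After replacing the invalid bytes, the string is valid UTF-8.
cleaned = replace_invalid_utf8(SAMPLE_BODY)
puts cleaned.valid_encoding?     # true
```

    To trigger the error against a running development server, sending a request whose Content-Type is application/x-www-form-urlencoded and whose body contains bytes like the ones above should be enough, since that matches the environment shown in the error report.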

    I installed the utf8-cleaner gem, but it does not fix anything.

    Does anyone have a way to solve this? Or a way to trigger it in development?

    3 replies  |  up to 11 years ago
        1
  •  33
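  •   Henrik N    11 years ago

    So that you do not have to piece together the comments on my other answer, here is what I am doing now. I have not seen any errors for 24 hours, so it looks promising:

    Add rack-utf8_sanitizer to your Gemfile: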

    gem 'rack-utf8_sanitizer'
    

    and run

    bundle
    

    Put this middleware in app/middleware/handle_invalid_percent_encoding.rb and rename the class to HandleInvalidPercentEncoding (because ExceptionApp is a bit too generic).

    In the config block of config/application.rb , do:

    require "#{Rails.root}/app/middleware/handle_invalid_percent_encoding.rb"
    
    
    # NOTE: These must be in this order relative to each other.
    # HandleInvalidPercentEncoding just raises for encoding errors it doesn't cover,
    # so it must run after (= be inserted before) Rack::UTF8Sanitizer.
    config.middleware.insert 0, HandleInvalidPercentEncoding
    config.middleware.insert 0, Rack::UTF8Sanitizer  # from a gem
    

    Deploy, and you are done.

    ( app happens to be where middleware lives in the project I am working on, but I would probably prefer lib . Either would work.)
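
    The linked middleware is not reproduced in this thread. As a rough idea of what such a class might look like, here is a minimal sketch. It is an assumption, not the code the answer links to: it rescues the encoding-related ArgumentError, answers with a 400, and re-raises anything else, which matches the ordering note above. The ENCODING_ERROR pattern and the response body are hypothetical choices.

```ruby
# Hypothetical sketch only; NOT the middleware linked in the answer.
class HandleInvalidPercentEncoding
  # Assumption: the error messages worth turning into a 400 response.
  ENCODING_ERROR = /invalid (byte sequence|%-encoding)/

  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue ArgumentError => e
    # Let unrelated ArgumentErrors propagate unchanged.
    raise unless e.message =~ ENCODING_ERROR

    [400, { 'Content-Type' => 'text/plain' }, ['Bad request']]
  end
end

# Usage with a stand-in app that fails the way the bot's requests do.
failing_app = lambda { |_env| raise ArgumentError, 'invalid byte sequence in UTF-8' }
status, _headers, _body = HandleInvalidPercentEncoding.new(failing_app).call({})
puts status   # 400
```

    Returning a 400 instead of letting the exception bubble up keeps these bot requests out of the exception notifier, while real bugs that raise a different ArgumentError are still reported.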

        2
  •  11
  •   Sunny Harsh Sanghani    6 years ago

    Add this line to your Gemfile , then run bundle in your terminal:

    gem "handle_invalid_percent_encoding_requests"
    

    This solution is based on Henrik's answer , turned into a Rails Engine gem .

        3
  •  0
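  •   Henrik N    11 years ago

    There is an issue in the gem's repo that links to someone's possible solution . They say it works for them, but they are not sure whether it is a good solution.

    I have not tried it yet, but I think I will.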