http://www.07net01.com/2015/04/822090.html
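I replaced Nginx outright with OpenResty and used Nginx's embedded Lua together with a Memcached instance to implement anti-crawler verification that does not depend on the backend (similar to CloudFlare's CAPTCHA check). Any client whose IP has a key of the form identify_<IP> in Memcached is redirected to identify.php for handling. identify.php can verify that the visitor is human with a CAPTCHA or a JavaScript check; once verification succeeds, it deletes the identify_<IP> key and that IP can continue browsing.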
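The article performs the unflagging step inside identify.php and does not show that code. As a rough illustration only, the sketch below does the same thing in Python by speaking the Memcached text protocol directly over a socket. The function names (`flag_key`, `build_delete_command`, `clear_flag`) are hypothetical, and the host and port are assumed to be the same 127.0.0.1:11211 instance that the Nginx configuration connects to.

```python
import socket

KEY_PREFIX = "identify_"  # same prefix the Lua handler looks up


def flag_key(ip):
    """Build the Memcached key that marks an IP as needing verification."""
    return KEY_PREFIX + ip


def build_delete_command(key):
    """Encode a Memcached text-protocol delete request for the given key."""
    return "delete {}\r\n".format(key).encode("ascii")


def clear_flag(ip, host="127.0.0.1", port=11211):
    """Remove the flag for ip after a successful human check.

    Returns True if a flag existed and was deleted, False if there was none.
    Host and port are assumptions; adjust them to the real deployment.
    """
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(build_delete_command(flag_key(ip)))
        reply = sock.makefile("rb").readline()
    if reply == b"DELETED\r\n":
        return True
    if reply == b"NOT_FOUND\r\n":
        return False
    raise RuntimeError("unexpected memcached reply: {!r}".format(reply))
```

Calling `clear_flag("203.0.113.7")` once the CAPTCHA has been solved would let that address through on its next request, because the Lua handler no longer finds a value of "1" under its key.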
    server {
        #...
        location / {
            index index.php;
        }

        location ~ \.php$ {
            content_by_lua '
                local uri = ngx.var.uri
                -- Never gate the verification page itself.
                if uri == "/identify.php" then
                    ngx.exec("@bypass")
                    return
                end

                local clientIP = ngx.var.remote_addr
                local memcached = require "resty.memcached"
                local memc, err = memcached:new()
                if not memc then
                    ngx.say("failed to instantiate memc: ", err)
                    return
                end

                local ok, err = memc:connect("127.0.0.1", 11211)
                if not ok then
                    ngx.say("failed to connect: ", err)
                    return
                end

                -- A value of "1" under identify_<IP> marks the client as suspect.
                local res, flags, err = memc:get("identify_" .. clientIP)
                if err then
                    -- Let the request through if the lookup itself fails.
                    ngx.exec("@bypass")
                    return
                end
                if res == "1" then
                    ngx.exec("@identify")
                    return
                end
                ngx.exec("@bypass")
            ';
        }

        location @bypass {
            # Pass the request to PHP-FPM unchanged.
            fastcgi_pass unix:/var/run/php5-fpm.sock;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }

        location @identify {
            # Redirect to identify.php, carrying the originally requested URI.
            rewrite ^/(.*)$ /identify.php?url=$request_uri redirect;
        }

        location ~ /\.ht {
            deny all;
        }
    }

The identify_<IP> key can be set automatically by analyzing the Nginx logs. AWK can filter out the last 10 minutes of the access log.
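    tac chd_access.log | awk 'BEGIN{ "date -d \"-10 minute\" +\"%H:%M:%S\"" | getline min5 } { if (substr($4, 14) > min5) print; else exit; }' | tac

Note that this compares only the time of day, so it returns nothing during the first 10 minutes after midnight. Then write a Python cron job to analyze the result, for example finding users who requested more than 100 pages within those 10 minutes, and insert their keys into Memcached.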
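The article leaves the cron script unwritten, so here is a minimal sketch of one way it could look, not the author's actual script. It assumes the client IP is the first whitespace-separated field of each log line (as in Nginx's default combined format), that the flag value is "1" as the Lua handler expects, and that flags should expire after an hour. The expiry time and the function names are illustrative assumptions. It writes to Memcached using the plain text protocol, so it needs only the standard library.

```python
import socket
from collections import Counter

THRESHOLD = 100           # from the article: more than 100 requests in the window
KEY_PREFIX = "identify_"  # same prefix the Lua handler looks up
FLAG_TTL = 3600           # assumption: let a flag expire after one hour


def count_requests(lines):
    """Count requests per client IP, taking the IP as the first field."""
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if fields:
            counts[fields[0]] += 1
    return counts


def offenders(counts, threshold=THRESHOLD):
    """Return the IPs whose request count exceeds the threshold, sorted."""
    return sorted(ip for ip, n in counts.items() if n > threshold)


def build_set_command(key, value, ttl=FLAG_TTL):
    """Encode a Memcached text-protocol set request (flags fixed at 0)."""
    data = value.encode("ascii")
    header = "set {} 0 {} {}\r\n".format(key, ttl, len(data)).encode("ascii")
    return header + data + b"\r\n"


def flag_ips(ips, host="127.0.0.1", port=11211):
    """Store "1" under identify_<IP> for every IP in ips."""
    with socket.create_connection((host, port), timeout=2) as sock:
        reader = sock.makefile("rb")
        for ip in ips:
            sock.sendall(build_set_command(KEY_PREFIX + ip, "1"))
            reply = reader.readline()
            if reply != b"STORED\r\n":
                raise RuntimeError("memcached refused {}: {!r}".format(ip, reply))
```

A cron entry could pipe the output of the AWK filter into a small wrapper that calls `flag_ips(offenders(count_requests(sys.stdin)))`. Setting an expiry means an IP that stops crawling is eventually unflagged even if it never completes verification.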
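Original article: Openresty+Lua+Memcached反爬虫策略 (an anti-crawler strategy with Openresty, Lua, and Memcached). Thanks to the original author for sharing.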