
C#: Handling Regex Matches That Exhaust Resources Without Throwing an Exception

2024-07-21 02:28:14
Source: reposted
Contributor: site user


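  When matching with regular expressions in C#, we sometimes hit a case where CPU usage sits at 100% yet the regex throws no exception: the match simply never finishes. System resources are exhausted and the application hangs. This happens when the input almost, but not completely, matches the pattern, so the regex engine backtracks through a huge number of alternatives and behaves like an infinite loop. Because the Regex class used here has no timeout property (match timeouts were only added in .NET Framework 4.5), nothing interrupts the match.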

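  The problem is especially likely when the input being matched is unreliable. In my personal website project, www.eahan.com, which automatically scrapes and matches pages from many sites, it occurred frequently. To keep resources from being exhausted, I wrote an AsynchronousRegex class: as the name suggests, an asynchronous Regex. The class takes a timeout and runs the Regex match on a separate thread; AsynchronousRegex monitors that thread and aborts it once the match exceeds the timeout.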

using System;
using System.Text.RegularExpressions;
using System.Threading;

namespace Lzt.Eahan.Common
{
    public class AsynchronousRegex
    {
        private MatchCollection mc;
        private int _timeout;        // maximum total wait (timeout), in milliseconds
        private int sleepCounter;
        private int sleepInterval;    // polling interval, in milliseconds
        private bool _isTimeout;

        public bool IsTimeout
        {
            get {return this._isTimeout;}
        }

        public AsynchronousRegex(int timeout)
        {
            this._timeout = timeout;
            this.sleepCounter = 0;
            this.sleepInterval = 100;
            this._isTimeout = false;

            this.mc = null;
        }

        public MatchCollection Matchs(Regex regex, string input)
        {
            // Reset per-call state so one instance can be reused for several matches.
            this.sleepCounter = 0;
            this._isTimeout = false;
            this.mc = null;

            Reg r = new Reg(regex, input);
            r.OnMatchComplete += new Reg.MatchCompleteHandler(this.MatchCompleteHandler);

            Thread t = new Thread(new ThreadStart(r.Matchs));
            t.Start();

            this.Sleep(t);

            t = null;
            return mc;    // null if the match timed out or found nothing
        }

        // Polls the worker thread and aborts it once the timeout has elapsed.
        private void Sleep(Thread t)
        {
            while (t != null && t.IsAlive)
            {
                Thread.Sleep(TimeSpan.FromMilliseconds(this.sleepInterval));
                this.sleepCounter++;
                if (t.IsAlive && this.sleepCounter * this.sleepInterval >= this._timeout)
                {
                    // Thread.Abort works on .NET Framework only; .NET Core and
                    // .NET 5+ throw PlatformNotSupportedException here.
                    t.Abort();
                    this._isTimeout = true;
                    break;
                }
            }
        }

        private void MatchCompleteHandler(MatchCollection mc)
        {
            this.mc = mc;
        }

        class Reg
        {
            internal delegate void MatchCompleteHandler(MatchCollection mc);
            internal event MatchCompleteHandler OnMatchComplete;

            public Reg(Regex regex, string input)
            {
                this._regex = regex;
                this._input = input;
            }

            private string _input;
            public string Input
            {
                get {return this._input;}
                set {this._input = value;}
            }

            private Regex _regex;
            public Regex Regex
            {
                get {return this._regex;}
                set {this._regex = value;}
            }

            // Runs on the worker thread.
            internal void Matchs()
            {
                MatchCollection mc = this._regex.Matches(this._input);
                // Matches is evaluated lazily; reading Count forces the full match,
                // so this is the line where the CPU can be exhausted.
                if (mc != null && mc.Count > 0)
                {
                    this.OnMatchComplete(mc);
                }
            }
        }
    }
}
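
On .NET Framework 4.5 and later, the Regex class accepts a match timeout directly, so the watchdog thread above may not be needed there. The following is only a minimal sketch under that assumption. The class name RegexTimeoutSketch, the helper CountMatches, and the pattern and input in Demo are hypothetical names chosen for illustration and are not part of the original class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTimeoutSketch
{
    // Hypothetical helper: returns the number of matches,
    // or -1 if the regex engine gave up because of the timeout.
    public static int CountMatches(string pattern, string input, TimeSpan timeout)
    {
        try
        {
            // The three-argument constructor sets a per-match timeout.
            Regex regex = new Regex(pattern, RegexOptions.None, timeout);
            // Matches is lazy; Count forces the evaluation, so the
            // timeout exception (if any) surfaces on this line.
            return regex.Matches(input).Count;
        }
        catch (RegexMatchTimeoutException)
        {
            return -1;
        }
    }

    public static void Demo()
    {
        // Nested quantifiers plus an input that almost matches are a
        // typical trigger for heavy backtracking (illustrative values).
        string input = new string('a', 40) + "!";
        int result = CountMatches(@"^(a+)+$", input, TimeSpan.FromMilliseconds(200));
        Console.WriteLine(result < 0 ? "timed out" : "matches: " + result);
    }
}
```

Unlike aborting a thread, the timeout is checked by the regex engine itself, so the caller gets an ordinary exception to handle and no thread is left in an undefined state.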
