优化你的DiscuzNT3.0，让它跑起来（5）缓存和高cpu

注：本文仅针对 DiscuzNT3.0, sqlserver 2000版本，其他版本请勿对号入座。

在本系列的第（4）篇中，我们提到过windbg，上次没能华丽登场的他，今天终于要一展才华了。解决问题是windbg与生俱来的功夫，他今天将怎么样来解决我们的问题，静听分解。

经过第（4）篇的优化之后，我们的论坛迎来了每日100万pv的突破性指标，相对于我们其他项目来说，这个量虽然不值一提，但毕竟这个项目是一切从0开始，能达到这个量也算是一个小突破，值得小庆祝一下。不过问题也随之而来了。我们的iis服务器的cpu占用率从之前的约30%,上升到了80%，按照我们的经验，这是不合理的，80%已经是一个危险的信号，随时都有可能引发server too busy。

从公司的利益角度来考虑，为了节约服务器资源和成本，当然要把cpu降下来；从技术人员的角度来考虑，如果你之前没解决过，这将是对自己的一个挑战，如果你解决了，就刚好是你价值的体现，正所谓乱世出英雄，这是一样的道理。好了，废话不多说，有请我们的今日之星windbg。

关于windbg，网上有很多介绍，我就不再累述了。我只发表自己对他的两点认识，比较通俗：

1）可以抓取内存包，即dump文件；

2）通过命令输出内存中正在执行的动作；

第3）步是技术人员自己要做的事情，通过命令输出的信息，分析找出问题的症结。这一步windbg是帮不了你的。

网上关于windbg的最常解决的问题一般是

1）程序崩溃；

2）程序挂起，比如死锁，超时；

3）高cpu；

4）内存泄露；

我们今天就来看看他是怎么找到 3）高cpu 的问题的。

windbg定位高cpu一般常用的方法和步骤是：

1）在高cpu的时候间隔约5秒钟抓一个包，抓3个左右,用adplus 命令；

2）通过 !runaway 命令找出每个包的前几个线程（约10个）；

3）然后找出在每次的前10个线程里都出现的线程id；

4）通过 !clrstack 命名输出该线程的执行动作；

5）通过第 4) 输出的信息找出问题的症结

（这是网上很多文章的说法，我觉得说的很模糊）；

关于第5）点，我提点自己的看法，我一般就看这些个线程id都在干什么，如果他们10个里面有8个都停留在一个方法，那这个方法就很值得怀疑，是需要重点检查的，那到底dnt3.0情况如何，我们一步一步来看看。

1）首先抓包，在cmd下通过adplus抓包，间隔10秒连抓3个（windbg命令的用法，不是本文的重点，google可以有一箩筐）；

优化你的DiscuzNT3.0，让它跑起来（5）缓存和高cpu

2）用windbg分别打开3个dump文件，运行!runaway命令，找到耗时的线程；

3个文件的截图如下：

第1张

优化你的DiscuzNT3.0，让它跑起来（5）缓存和高cpu

第2张

第3张

3）找到10个最耗时的线程：11，12，21，22，34，38，40，46，47，48 。

4）通过!clrstack命令输出他们都在干什么（因为太多，这里只选几个代表性的）

线程 11 :

----------------

0:000> ~11s

eax=02d25354 ebx=001410a8 ecx=02d253ec edx=02d2541c esi=00000360 edi=00000000

eip=7c95847c esp=020bf15c ebp=020bf1cc iopl=0 nv up ei pl zr na pe nc

cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246

ntdll!KiFastSystemCallRet:

7c95847c c3 ret

0:011> !clrstack

OS Thread Id: 0x143c (11)

ESP EIP

020bf2cc 7c95847c [GCFrame: 020bf2cc]

020bf2e8 7c95847c [HijackFrame: 020bf2e8]

020bf32c 69c7bc58 System.Xml.XmlElement.get_XPLocalName()

020bf330 69c6ea0d System.Xml.DocumentXPathNavigator.get_LocalName()

020bf334 69ca30c5 MS.Internal.Xml.XPath.XPathAxisIterator.get_Matches()

020bf344 69ca37a8 MS.Internal.Xml.XPath.XPathChildIterator.MoveNext()

020bf34c 69c99ce1 MS.Internal.Xml.XPath.ChildrenQuery.Advance()

020bf35c 69ca6fcc MS.Internal.Xml.XPath.XPathSelectionIterator.MoveNext()

020bf368 69c83b07 System.Xml.XPathNodeList.ReadUntil(Int32)

020bf378 69c83ac1 System.Xml.XPathNodeList.get_Count()

020bf380 69c72736 System.Xml.XmlNode.SelectSingleNode(System.String)

020bf390 1b9ebe3f Discuz.Cache.DNTCache.RetrieveOriginObject(System.String)

020bf398 1b9ebde3 Discuz.Cache.DNTCache.RetrieveObject(System.String)

020bf3c4 1b9ee906 Discuz.Forum.UserGroups.GetUserGroupList()

020bf3d0 1b9ee853 Discuz.Forum.UserGroups.GetUserGroupInfo(Int32)

020bf3e0 1b9ea7dc Discuz.Forum.PageBase..ctor()

020bf484 1b9e9e60 Discuz.Web.UI.Avatar..ctor()

020bf490 1b9e9deb ASP.tools_avatar_aspx..ctor()

020bf498 1b9e9dc5 __ASP.FastObjectFactory_app_web_idnpx4j6.Create_ASP_tools_avatar_aspx()

。。。。。。这里省略若干字。

---------------------------

线程 12 ：

----------

0:012> !clrstack

OS Thread Id: 0x10d0 (12)

ESP EIP

0213f3b4 7c95847c [HijackFrame: 0213f3b4]

0213f3f4 69c6f8be System.Xml.DocumentXPathNavigator.MoveToNext()

0213f404 69ca37c8 MS.Internal.Xml.XPath.XPathChildIterator.MoveNext()

0213f40c 69c99ce1 MS.Internal.Xml.XPath.ChildrenQuery.Advance()

0213f41c 69ca6fcc MS.Internal.Xml.XPath.XPathSelectionIterator.MoveNext()

0213f428 69c83b07 System.Xml.XPathNodeList.ReadUntil(Int32)

0213f438 69c83ac1 System.Xml.XPathNodeList.get_Count()

0213f440 69c72736 System.Xml.XmlNode.SelectSingleNode(System.String)

0213f450 1b9ebe3f Discuz.Cache.DNTCache.RetrieveOriginObject(System.String)

0213f458 1b9ebde3 Discuz.Cache.DNTCache.RetrieveObject(System.String)

0213f484 1c8d2242 Discuz.Forum.Templates.GetValidTemplateList()

0213f4b8 1c8d2020 Discuz.Forum.Templates.GetTemplateItem(Int32)

0213f4d4 1b9ea9c2 Discuz.Forum.PageBase..ctor()

0213f578 1b9efa4a Discuz.Web.UI.TopicPage..ctor()

0213f584 1b9ef8ca Discuz.Web.showtopic..ctor()

0213f590 1b9ef7e3 ASP.aspx_2_showtopic_aspx..ctor()

0213f598 1b9ef7bd __ASP.FastObjectFactory_app_web_ompjorep.Create_ASP_aspx_2_showtopic_aspx()

。。。。。。这里省略若干字。

---------------------------

线程 22：

--------

0:022> !clrstack

OS Thread Id: 0x11e8 (22)

ESP EIP

1c27f1f0 7c95847c [HijackFrame: 1c27f1f0]

1c27f230 69ca37c8 MS.Internal.Xml.XPath.XPathChildIterator.MoveNext()

1c27f238 69c99ce1 MS.Internal.Xml.XPath.ChildrenQuery.Advance()

1c27f248 69ca6fcc MS.Internal.Xml.XPath.XPathSelectionIterator.MoveNext()

1c27f254 69c83b07 System.Xml.XPathNodeList.ReadUntil(Int32)

1c27f264 69c83ac1 System.Xml.XPathNodeList.get_Count()

1c27f26c 69c72736 System.Xml.XmlNode.SelectSingleNode(System.String)

1c27f27c 1b9ebe3f Discuz.Cache.DNTCache.RetrieveOriginObject(System.String)

1c27f284 1b9ebde3 Discuz.Cache.DNTCache.RetrieveObject(System.String)

1c27f2b0 1c8d2c7b Discuz.Forum.OnlineUsers.GetOnlineAllUserCount()

1c27f2c4 1c8d1683 Discuz.Forum.OnlineUsers.GetCacheOnlineAllUserCount()

1c27f2d4 1b9ea930 Discuz.Forum.PageBase..ctor()

1c27f378 1b9efa4a Discuz.Web.UI.TopicPage..ctor()

1c27f384 1b9ef8ca Discuz.Web.showtopic..ctor()

1c27f390 1b9ef7e3 ASP.aspx_2_showtopic_aspx..ctor()

1c27f398 1b9ef7bd __ASP.FastObjectFactory_app_web_ompjorep.Create_ASP_aspx_2_showtopic_aspx()

。。。。。。这两省略若干字。

---------------------------

线程 48：

--------

0:048> !clrstack

OS Thread Id: 0x1620 (48)

ESP EIP

1f11eb54 7c95847c [HelperMethodFrame: 1f11eb54]

1f11ebac 79357d11 System.String.ConcatArray(System.String[], Int32)

1f11ebc0 79371853 System.String.Concat(System.String[])

1f11ebd8 7a4aec86 System.Text.RegularExpressions.Regex..ctor(System.String, System.Text.RegularExpressions.RegexOptions, Boolean)

1f11ec0c 7a4bf93a System.Text.RegularExpressions.Regex.Split(System.String, System.String, System.Text.RegularExpressions.RegexOptions)

1f11ec20 1b9ed247 Discuz.Common.Utils.SplitString(System.String, System.String)

1f11ec2c 1c8d1952 Discuz.Common.Utils.InIPArray(System.String, System.String[])

1f11ec48 1c8d1869 Discuz.Forum.PageBase.ValidateUserPermission()

1f11ec54 1b9ea950 Discuz.Forum.PageBase..ctor()

1f11ecf8 1b9efa4a Discuz.Web.UI.TopicPage..ctor()

1f11ed04 1b9ef8ca Discuz.Web.showtopic..ctor()

1f11ed10 1b9ef7e3 ASP.aspx_2_showtopic_aspx..ctor()

1f11ed18 1b9ef7bd __ASP.FastObjectFactory_app_web_ompjorep.Create_ASP_aspx_2_showtopic_aspx()

。。。。。。这里省略若干字。

---------------------------

看上面几个线程，虽然代码不完全相同，但是有几个地方是一样的，标红的地方就是，上面标红的几个线程都执行到了

Discuz.Cache.DNTCache.RetrieveObject 这里面来，类似这样的代码，这10个线程里面有6个，不再一一列出，有2个没有跟踪到执行动作，另外2个线程在做其他事情（上面的48就是其中一个），我们现在只重点来关注这6个线程，因为她们都同时执行到了

Discuz.Cache.DNTCache.RetrieveObject ，我认为这里是存在性能问题的，来看看他的源码是怎么样的。

1         /// <summary>
2         /// 返回指定XML路径下的数据项
3         /// </summary>
4         /// <param name="xpath">分级对象的路径</param>
5         /// <returns></returns>
6         public virtual object RetrieveOriginObject(string xpath)
7         {
8             if (applyMemCached)
9             {
10                 //向缓存加入新的对象
11                 return cs.RetrieveObject(xpath);
12             }
13             else
14             {
15                 XmlNode node = objectXmlMap.SelectSingleNode(PrepareXpath(xpath));
16                 if (node != null)
17                 {
18                     string objectId = node.Attributes["objectId"].Value;
19
20                     return cs.RetrieveObject(objectId);
21                 }
22                 return null;
23             }

24 }

再看看

objectXmlMap.SelectSingleNode 这里是怎么实现的，请出 reflector

public XmlNode SelectSingleNode(string xpath)
{
    XmlNodeList list = this.SelectNodes(xpath);
if ((list != null) && (list.Count > 0))
    {
        return list[0];
    }
    return null;
}

再看

this.SelectNodes(xpath);

public XmlNodeList SelectNodes(string xpath)
{
    XPathNavigator navigator = this.CreateNavigator();
if (navigator == null)
    {
        return null;
    }
    return new XPathNodeList(navigator.Select(xpath));
}

挺复杂的，再看

CreateNavigator

public virtual XPathNavigator CreateNavigator()
{
    XmlDocument document = this as XmlDocument;
    if (document != null)
    {
        return document.CreateNavigator(this);
    }
    return this.OwnerDocument.CreateNavigator(this);
}

哦，到这里看明白了， SelectSingleNode 经过了如下流程最终得到了结果：

创建 XmlDocument ——》创建 XPathNavigator ——》获取所有的节点Nodes ——》返回第一个节点

这样一个流程下来会耗多少性能呢，这个每个人的测试方式不一样，所以结果可能也会不一样，感兴趣的童鞋可以自己测试一下，因为我不熟悉这种缓存方案，无法预知他的性能，我只好改成自己熟悉的，最后用的系统自带的HttpRuntime.Cache来做缓存。

改了缓存策略的效果还不错，cpu从之前的 >80% 降到了 <30%，抓个图：

优化你的DiscuzNT3.0，让它跑起来（5）缓存和高cpu