Locating the Cause of Continuously Rising ES Memory Usage
 

The thread https://discuss.elastic.co/t/memory-usage-of-the-machine-with-es-is-continuously-increasing/23537/7 suggests that slowly rising memory on an ES machine may be caused by an excess of small files (ES itself creates large numbers of small files while indexing), which inflates the Linux dentry and inode caches. Setting vfs_cache_pressure tells the operating system to reclaim those caches at higher priority.
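A quick way to check whether the dentry cache is in fact what is growing is to read the procfs counters directly (a sketch for Linux; the paths are standard procfs):

```shell
# Current dentry cache counters: the first two fields are the number of
# allocated dentries and the number of unused (reclaimable) ones.
cat /proc/sys/fs/dentry-state

# The reclaim-priority knob the thread refers to (kernel default: 100).
cat /proc/sys/vm/vfs_cache_pressure
```

Sampling dentry-state periodically (e.g. from cron) shows whether the first field keeps climbing in step with the machine's memory usage.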

It could, of course, also be a memory-leak bug in ES's Lucene layer. As one reply put it: "Your bug description reminds me of https://issues.apache.org/jira/browse/LUCENE-7657 but this bug is expected to be fixed in 5.5.2."


https://discuss.elastic.co/t/es-vs-lucene-memory/20959

I've read the recommendations for ES_HEAP_SIZE which basically state to set -Xms and -Xmx to 50% physical RAM. It says the rest should be left for Lucene to use (OS filesystem caching). But I'm confused on how Lucene uses that. Doesn't Lucene run in the same JVM as ES? So they would share the same max heap setting of 50%.

nik9000 Nik Everett Elastic Team Member
Nov '14
Lucene runs in the same JVM as Elasticsearch but (by default) it mmaps files and then iterates over their content intelligently. That means most of its actual storage is "off heap" (it's a Java buzz-phrase). Anyway, Linux will serve reads from mmapped files from its page cache. That is why you want to leave Linux a whole bunch of unused memory.

Nik
In other words, the official advice is to leave some free memory for the OS so that ES (Lucene) can use it as filesystem cache underneath.
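The 50% rule above can be sketched as a small script. This is a hedged example: ES_HEAP_SIZE was the knob in the 1.x era these posts date from; newer versions set -Xms/-Xmx in jvm.options instead.

```shell
# Give the ES JVM half of physical RAM and leave the rest to the OS page
# cache, which serves Lucene's mmapped segment files.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
heap_mb=$(( total_kb / 1024 / 2 ))   # 50% of RAM, in MB
export ES_HEAP_SIZE="${heap_mb}m"    # expands to -Xms and -Xmx internally
echo "ES_HEAP_SIZE=$ES_HEAP_SIZE"
```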


pountz Adrien Grand Elastic Team Member
Nov '14
Indeed the behaviour is the same on Windows and Linux: memory that is not used by processes is used by the operating system in order to cache the hottest parts of the file system. The reason why the docs say that the rest should be left to Lucene is that most disk accesses that elasticsearch performs are done through Lucene.

I used procexp and VMMap to double check, ya, I think they are file system cache. Is there any way to control the size of the file system cache? Coz now it's easily driving up OS memory consumption. When it's reaching 100%, the node would fail to respond...
They, too, saw memory usage (Java heap + filesystem cache) on their ES machines reach 100%.


That thread, however, does not offer a solution.

A similar ES memory problem appears in https://discuss.elastic.co/t/jvm-memory-increase-1-time-more-xmx16g-while-elasticsearch-heap-is-stable/55413/4, where the workload was a query run on a daily schedule.

Elasticsearch uses not only heap but also out-of-heap memory buffers because of Lucene.


I just read the Lucene blog post and I already know that Lucene/ES start to use the file system cache (with MMapDirectory). That's why in my memory graph you can see: Free (in green) + Used memory (in red) + cached memory (the FS cache in blue).


https://discuss.elastic.co/t/memory-usage-of-the-machine-with-es-is-continuously-increasing/23537/2

It mentions ES memory climbing by roughly 200 MB per day, with everything returning to normal after an ES restart.
Note that when I restart the ES it gets cleared (most of it; maybe the OS clears up this cache once it sees that the parent process has been stopped).

When the underlying Lucene engine interacts with a segment, the OS will leverage free system RAM and keep that segment in memory. However, Elasticsearch/Lucene has no way to control OS-level caches.
This is the operating system's cache; ES itself cannot control it.

 

Reposted from: http://farll.com/2016/10/high-memory-usage-alarm/

A Linux CentOS server kept triggering high-memory-usage alarms. The eventual conclusion: a bug in nss-softokn caused the dentry cache to balloon after large numbers of curl requests.

It started with an SMS alert from the cloud platform reporting that average memory usage had exceeded the alarm threshold. Logging into the cloud monitoring console showed that, starting two days earlier, the memory usage curve had been climbing slowly in a very regular pattern.


Running top and pressing M to sort by memory usage showed no process consuming an unusual amount of memory, and H showed nothing abnormal among the threads either. The top consumer was mysql at 6.9% of memory. Telnetting into memcached and running stats likewise showed no excessive memory use.


free -m showed -/+ buffers/cache used was very high, with only ten-odd percent of memory left free. So where had the memory gone?

cat /proc/meminfo gives a more detailed breakdown of memory allocation: Slab and SReclaimable were several GB. Slab is the kernel's slab-allocator memory; SReclaimable, as the name suggests, is the reclaimable part of it.

MemTotal: 8058056 kB
MemFree: 3892464 kB
Buffers: 192016 kB
Cached: 873860 kB
SwapCached: 0 kB
Active: 1141088 kB
Inactive: 690580 kB
Active(anon): 765260 kB
Inactive(anon): 22220 kB
Active(file): 375828 kB
Inactive(file): 668360 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 24 kB
Writeback: 0 kB
AnonPages: 765784 kB
Mapped: 58648 kB
Shmem: 21696 kB
Slab: 2261236 kB
SReclaimable: 2236844 kB
SUnreclaim: 24392 kB
KernelStack: 1448 kB
PageTables: 8404 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4029028 kB
Committed_AS: 1500552 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 25952 kB
VmallocChunk: 34359710076 kB
HardwareCorrupted: 0 kB
AnonHugePages: 673792 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 6144 kB
DirectMap2M: 8382464 kB
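The interesting counters in a listing like the one above can be pulled out with a one-liner (a sketch; values in /proc/meminfo are in kB):

```shell
# Print the slab-related counters from /proc/meminfo, converted to MB.
awk '/^(Slab|SReclaimable|SUnreclaim):/ {printf "%s %.1f MB\n", $1, $2/1024}' /proc/meminfo
```

On the box above this prints roughly Slab: 2208.2 MB with SReclaimable: 2184.4 MB, i.e. almost all of the slab memory is nominally reclaimable.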

Running slabtop and pressing c to sort by slab size showed dentry (directory entry cache) far ahead of everything else, followed by inode_cache. The article 一个Laravel队列引发的报警 mentions that a Laravel queue creating and reading/writing huge numbers of small files can make the dentry cache spike; enormous directories (a single directory holding millions of files) are another possible cause.
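If slabtop is not available, a similar ranking can be derived directly from /proc/slabinfo (a sketch: reading it usually requires root, and the column layout assumes slabinfo version 2.x, where field 3 is num_objs and field 4 is objsize):

```shell
# Rank slab caches by approximate memory use (num_objs * objsize, in bytes).
sudo awk 'NR > 2 {print $1, $3 * $4}' /proc/slabinfo | sort -k2 -rn | head
```

On an affected machine, dentry and inode_cache dominate the top of this list.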


Since neither the server configuration nor the application code had changed recently, the first suspicion was that the AliyunUpdate auto-update agent had changed something. But stracing each of the Ali processes in turn showed no large volume of file operations. crond was reading and writing a small number of session and temp files, but nowhere near enough to explain this.

ps aux | grep "Ali"
strace -fp {pid} -e trace=open,stat,close,unlink

The official support-ticket reply suggested adding memory to the server and, if memory pressure was hurting the service, temporarily freeing the slab memory as follows:

Reclaim the memory held by the dentry cache and inode cache:
#echo 2 > /proc/sys/vm/drop_caches
Once reclaim has finished, restore the default:
#echo 0 > /proc/sys/vm/drop_caches

From the Linux kernel documentation:
To free pagecache:
	echo 1 > /proc/sys/vm/drop_caches
To free reclaimable slab objects (includes dentries and inodes):
	echo 2 > /proc/sys/vm/drop_caches
To free slab objects and pagecache:
	echo 3 > /proc/sys/vm/drop_caches
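To see how much a one-shot reclaim actually frees, the SReclaimable counter can be sampled around the drop (a sketch; must run as root, and `sync` first so clean objects are droppable):

```shell
before=$(awk '/^SReclaimable:/ {print $2}' /proc/meminfo)
sync                                  # flush dirty data first
echo 2 > /proc/sys/vm/drop_caches     # free reclaimable slab (dentries, inodes)
after=$(awk '/^SReclaimable:/ {print $2}' /proc/meminfo)
echo "SReclaimable: ${before} kB -> ${after} kB"
```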

That is only a stopgap; surely we don't want a scheduled job that periodically drops the dentry cache. Still, we were getting closer to the essence of the problem: too much slab memory that was not being reclaimed. We can raise the reclaim priority of slab memory. Linux provides the vfs_cache_pressure parameter for this: the default is 100, and values above 100 make the kernel reclaim slab at higher priority (run as root):

echo 10000 > /proc/sys/vm/vfs_cache_pressure

Note that, as some articles report, the effect of changing vfs_cache_pressure is not always visible immediately: the dentry and inode caches may climb to a peak, drop sharply, and then settle into normal fluctuation. So let it run for 24 hours or so before judging the result.
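The echo above does not survive a reboot; to persist the setting, the usual pattern is sysctl plus /etc/sysctl.conf (a sketch, run as root; 10000 is the article's deliberately aggressive value, not a recommended general default):

```shell
sysctl -w vm.vfs_cache_pressure=10000
# Append to sysctl.conf only if no entry exists yet (keeps this idempotent).
grep -q '^vm.vfs_cache_pressure' /etc/sysctl.conf \
  || echo 'vm.vfs_cache_pressure = 10000' >> /etc/sysctl.conf
sysctl -p
```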


Using the min_free_kbytes and extra_free_kbytes parameters to reclaim slab caches, as the Laravel queue article mentioned above suggests, is not recommended; the kernel documentation states fairly clearly that vfs_cache_pressure is the knob meant for controlling reclaim of the directory and inode caches:

vfs_cache_pressure
------------------

This percentage value controls the tendency of the kernel to reclaim
the memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative
performance impact. Reclaim code needs to take various locks to find freeable
directory and inode objects. With vfs_cache_pressure=1000, it will look for
ten times more freeable objects than there are.

Although the excessive slab memory was now effectively controlled and reclaimed in real time, we still had not found the root cause; in principle the default vfs_cache_pressure should strike a reasonable balance between free memory and slab caches.

Only when checking the web service logs for that time window did the cause finally emerge: starting at that point, a system crontab job had been continuously requesting an external API over https with curl. The culprit was a bug in NSS (Network Security Services), the Mozilla security library bundled with libcurl; NSS is only pulled in when curl fetches SSL, i.e. https, resources.

The internal mechanism: to determine whether the temporary directory it accesses is local or on a network filesystem, NSS probes hundreds of nonexistent files and times how long the accesses take, generating a large amount of dentry cache for those nonexistent files along the way. When the dentry cache produced by curl requests outpaces the system's ability to reclaim it, memory usage naturally keeps climbing. A foreign-language blog post covers this in some detail. Later NSS versions fixed the bug: "NSS now avoids calls to sdb_measureAccess in lib/softoken/sdb.c s_open if NSS_SDB_USE_CACHE is 'yes'." So the final fix is:
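The probing behaviour can be observed from the outside by counting failed file lookups during a single HTTPS request (a sketch: it assumes strace is installed and that the local curl build links NSS; the URL is just an example):

```shell
# Count ENOENT results (lookups of files that do not exist) while curl
# performs one HTTPS request; an NSS-affected build shows hundreds of them.
strace -f -e trace=access,stat,open curl -s -o /dev/null https://example.com 2>&1 \
  | grep -c 'ENOENT'
```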

Step 1: make sure nss-softokn is a version in which the bug is fixed

Check that the installed version is >= 3.16.0 (per the bug-fix link above, anything later than nss-softokn-3.14.3-12.el6 also works):
yum list nss-softokn
If the version is too old, upgrade:
sudo yum update -y nss-softokn

Resolving Dependencies
 --> Running transaction check
 ---> Package nss-softokn.x86_64 0:3.14.3-10.el6_5 will be updated
 ---> Package nss-softokn.x86_64 0:3.14.3-23.3.el6_8 will be an update
 --> Processing Dependency: nss-softokn-freebl(x86-64) >= 3.14.3-23.3.el6_8 for package: nss-softokn-3.14.3-23.3.el6_8.x86_64
 --> Processing Dependency: libnssutil3.so(NSSUTIL_3.17.1)(64bit) for package: nss-softokn-3.14.3-23.3.el6_8.x86_64
 --> Running transaction check
 ---> Package nss-softokn-freebl.x86_64 0:3.14.3-10.el6_5 will be updated
 ---> Package nss-softokn-freebl.x86_64 0:3.14.3-23.3.el6_8 will be an update
 ---> Package nss-util.x86_64 0:3.16.1-1.el6_5 will be updated
 ---> Package nss-util.x86_64 0:3.21.0-2.el6 will be an update
 --> Processing Dependency: nspr >= 4.11.0-1 for package: nss-util-3.21.0-2.el6.x86_64
 --> Running transaction check
 ---> Package nspr.x86_64 0:4.10.6-1.el6_5 will be updated
 ---> Package nspr.x86_64 0:4.11.0-1.el6 will be an update
 --> Finished Dependency Resolution
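Whether the installed build already contains the fix can be checked mechanically with a version compare (a sketch for RHEL/CentOS; rpm and GNU sort -V are assumed):

```shell
installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' nss-softokn)
fixed='3.14.3-12.el6'
# sort -V orders version strings numerically; if the fixed release sorts
# first, the installed one is the same or newer.
if [ "$(printf '%s\n%s\n' "$fixed" "$installed" | sort -V | head -n 1)" = "$fixed" ]; then
  echo "nss-softokn $installed contains the fix"
else
  echo "nss-softokn $installed is older than $fixed; upgrade it"
fi
```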

Step 2: set the environment variable NSS_SDB_USE_CACHE=yes

Apache:
echo "export NSS_SDB_USE_CACHE=yes" >> /etc/sysconfig/httpd
service httpd restart
PHP:
putenv('NSS_SDB_USE_CACHE=yes');
// Note: setting environment variables from PHP is affected by the
// safe_mode_allowed_env_vars and safe_mode_protected_env_vars directives
// when safe_mode is enabled.
Nginx:
fastcgi_param NSS_SDB_USE_CACHE yes;

Step 3: restart the web server
Restart Apache, or restart Nginx and php-fpm. Otherwise curl_error may report "Problem with the SSL CA cert (path? access rights?)".
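After the restart it is worth verifying that the variable actually reached the worker processes, e.g. by reading a worker's environment from procfs (a sketch; the php-fpm process name is an assumption about the setup):

```shell
pid=$(pgrep -f php-fpm | head -n 1)
# /proc/<pid>/environ is NUL-separated; convert to lines and filter.
tr '\0' '\n' < "/proc/$pid/environ" | grep NSS_SDB_USE_CACHE
```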

After these settings, the dentry cache finally, gradually, returned to normal.
