【Posted】: 2019-12-06 16:49:09
【Question】:
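I have AWS CPU-utilization data (the data NAB uses), and I am using it to build anomaly detection with AWS SageMaker Random Cut Forest. I can run it, but I need a deeper understanding of hyperparameter tuning. I have read the AWS documentation, but I still do not understand how the hyperparameters should be chosen. Are the parameters an educated guess, or do we need to compute the mean and standard deviation of co_disp to infer them?
Thanks in advance.
I tried 100 trees and a tree_size of 512/256 to detect anomalies, but how should these parameters be inferred?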
import rrcf

# Set tree parameters
num_trees = 50
shingle_size = 48
tree_size = 512

# Create a forest of empty trees
forest = [rrcf.RCTree() for _ in range(num_trees)]

# Use the "shingle" generator to create a rolling window.
# temp_data holds my aws_cpuutilization data.
points = rrcf.shingle(temp_data, size=shingle_size)

# Dict to store the average anomaly score (CoDisp) of each point
avg_codisp = {}

# For each shingle...
for index, point in enumerate(points):
    # For each tree in the forest...
    for tree in forest:
        # If the tree is above the permitted size, drop the oldest point (FIFO)
        if len(tree.leaves) > tree_size:
            tree.forget_point(index - tree_size)
        # Insert the new point into the tree
        tree.insert_point(point, index=index)
        # Compute CoDisp on the new point and average it over all trees
        if index not in avg_codisp:
            avg_codisp[index] = 0
        avg_codisp[index] += tree.codisp(index) / num_trees

# Collect the averaged scores in insertion order
values = list(avg_codisp.values())
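To make the mean/standard-deviation idea in the question concrete, here is a minimal sketch of what I mean. It assumes the per-index scores are held in a dict shaped like avg_codisp. The helper name flag_anomalies, the example scores, and the 3-sigma cutoff are my own assumptions for illustration (a common rule of thumb, not something taken from rrcf or SageMaker). Note that this only sets a threshold on the scores; it does not by itself say how to choose num_trees or tree_size.

```python
import statistics


def flag_anomalies(scores, n_sigma=3.0):
    """Flag scores that lie more than n_sigma standard deviations above the mean.

    scores: dict mapping index -> anomaly score (same shape as avg_codisp).
    Returns (mean, std, threshold, flagged), where flagged is a dict of the
    index -> score pairs that exceed the threshold.
    """
    values = list(scores.values())
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    threshold = mean + n_sigma * std
    flagged = {i: s for i, s in scores.items() if s > threshold}
    return mean, std, threshold, flagged


# Hypothetical scores standing in for avg_codisp: 20 ordinary points and one spike.
example_scores = {i: 1.0 for i in range(20)}
example_scores[20] = 50.0

mean, std, threshold, flagged = flag_anomalies(example_scores)
print("mean:", mean, "std:", std, "threshold:", threshold)
print("flagged indices:", sorted(flagged))
```

With the real data, avg_codisp would be passed in place of example_scores, and n_sigma could be varied to see how sensitive the flagged set is to the cutoff.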
【Comments】:
Tags: amazon-web-services machine-learning artificial-intelligence data-science amazon-sagemaker