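One approach is to create the buckets with d3.bin() and then map over them, aggregating each property with an aggregate function such as d3.median().
d3.bin() returns a bin generator that groups data according to a given bucket definition (which d3 calls thresholds). The output contains the same data you passed in, but wrapped in one array per bucket. For example, the array [0.2, 0.8, 1.2, 1.8] binned with thresholds [0, 1, 2] becomes [ [0.2, 0.8], [1.2, 1.8] ].
You can use the d3.bin().value() method to specify an accessor; in your case, a function that reads the start property: d => d.start.
Because you want the thresholds to fall on whole seconds, some preprocessing is needed to compute them; I explain each step in the comments of the code snippet below: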
// Compute the max value of the data and round up to get the whole second.
const maxValue = Math.ceil(d3.max(data, d => d.start))
// Define the bin thresholds (in your case, an array of whole seconds: 0, 1, ..., maxValue)
const thresholds = d3.range(maxValue + 1)
// Define the min and max values of the buckets
const domain = [0, maxValue]
// Create the binner function
const binner = d3.bin()
  .value(d => d.start)
  .thresholds(thresholds)
  .domain(domain)
// Bin the data
const binned = binner(data)
// Map each bin array to an object holding its aggregated values
const medians = binned.map(bin => {
  return {
    medianDuration: d3.median(bin, b => b.duration),
    medianConfidence: d3.median(bin, b => b.confidence),
    // ... The other properties repeat the same pattern
    /* You can also extract three additional values: the number of data
       points in the bin, the start of the bin, and the end of the bin. */
    dataPoints: bin.length,
    bucketMin: bin.x0,
    bucketMax: bin.x1
  }
})
Working example:
const data = getData()
const maxValue = Math.ceil(d3.max(data, d => d.start))
const thresholds = d3.range(maxValue + 1)
const domain = [0, maxValue]
const binner = d3.bin()
  .value(d => d.start)
  .thresholds(thresholds)
  .domain(domain)
const binned = binner(data)
const medians = binned.map(bin => {
  return {
    medianDuration: d3.median(bin, b => b.duration),
    dataPoints: bin.length,
    bucketMin: bin.x0,
    bucketMax: bin.x1
  }
})
console.log(medians)

function getData() {
  return [
    { "start": 0.2, "duration": 0.20934 },
    { "start": 0.8, "duration": 0.30934 },
    { "start": 1.2, "duration": 0.20934 },
    { "start": 1.8, "duration": 0.10934 },
    { "start": 2.2, "duration": 0.30934 },
    { "start": 2.8, "duration": 0.40934 }
  ]
}
<script src="https://d3js.org/d3.v6.min.js"></script>
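For comparison, the same bucket-then-aggregate idea can be sketched without d3. This is only an illustration under some assumptions, not a replacement for d3.bin(): the helper names median and medianDurationPerSecond are hypothetical, the sample values are made up, and the buckets are derived with Math.floor rather than from thresholds. Unlike d3.bin(), this sketch only returns buckets that contain at least one data point.

```javascript
// Median of a numeric array; returns undefined for an empty array.
function median(values) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper (not part of d3): groups records into one bucket
// per whole second of `start`, then takes the median `duration` per bucket.
function medianDurationPerSecond(records) {
  const buckets = new Map();
  for (const record of records) {
    const second = Math.floor(record.start); // bucket covers [second, second + 1)
    if (!buckets.has(second)) buckets.set(second, []);
    buckets.get(second).push(record.duration);
  }
  return [...buckets.entries()]
    .sort(([a], [b]) => a - b) // order buckets by their start second
    .map(([second, durations]) => ({
      medianDuration: median(durations),
      dataPoints: durations.length,
      bucketMin: second,
      bucketMax: second + 1,
    }));
}

// Made-up sample data, chosen so the medians are exact in floating point.
const sample = [
  { start: 0.2, duration: 0.25 },
  { start: 0.8, duration: 0.75 },
  { start: 1.2, duration: 0.125 },
  { start: 2.5, duration: 0.25 },
  { start: 2.9, duration: 1.5 },
  { start: 2.1, duration: 0.75 },
];

const result = medianDurationPerSecond(sample);
console.log(result);
```

If the data can contain seconds with no points and you still need an entry for each of them, the d3.bin() version with explicit thresholds is the more natural fit, since it emits a bin for every interval.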