【Posted】: 2014-05-28 05:02:59
【Question】:
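I am trying to write code that generates all the subsets of the items in a collection of sets, keeping only those that satisfy a frequency condition. For example, suppose the threshold is 2 and there are three sets:
1, 2, 3, 4, 5
1, 3, 5
1, 3, 4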
The program should then output:
Sets generated in the first iteration:
1: frequency = 3
2: frequency = 1
3: frequency = 3
4: frequency = 2
5: frequency = 2
Since the frequency of item 2 is below the threshold, it is discarded and none of its supersets are generated.
Sets generated in the second iteration:
1,3: frequency = 3
1,4: frequency = 2
1,5: frequency = 2
3,4: frequency = 2
3,5: frequency = 2
4,5: frequency = 1
Since the frequency of (4,5) is below the threshold, it is discarded and none of its supersets are generated.
Sets generated in the third iteration:
1,3,4: frequency = 2
1,3,5: frequency = 2
Sets generated in the fourth iteration:
There are no more supersets, because (4,5) is below the threshold.
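The pruning rule used in this example (a set is only generated if all of its smaller subsets passed the threshold) can be written as a candidate-generation step. What follows is a minimal sketch under stated assumptions, not the code from this question: `Itemset` and `generateCandidates` are hypothetical names, every itemset passed in is assumed to be non-empty, sorted in ascending order, and of the same size.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

using Itemset = std::vector<int>;  // assumption: items are kept in ascending order

// Hypothetical helper: builds the size-k candidates from the frequent
// itemsets of size k-1. Two itemsets are joined when they agree on all
// items except the last one; the result is kept only if every subset
// obtained by removing one item is itself frequent.
std::vector<Itemset> generateCandidates(const std::set<Itemset>& frequent)
{
    std::vector<Itemset> candidates;
    for (auto a = frequent.begin(); a != frequent.end(); ++a) {
        for (auto b = std::next(a); b != frequent.end(); ++b) {
            // Join step: everything but the last item must match.
            if (!std::equal(a->begin(), a->end() - 1, b->begin()))
                continue;
            Itemset cand(*a);
            cand.push_back(b->back());

            // Prune step: reject cand if any (k-1)-subset is not frequent.
            bool keep = true;
            for (std::size_t skip = 0; skip < cand.size() && keep; ++skip) {
                Itemset sub;
                for (std::size_t i = 0; i < cand.size(); ++i)
                    if (i != skip) sub.push_back(cand[i]);
                keep = frequent.count(sub) > 0;
            }
            if (keep) candidates.push_back(cand);
        }
    }
    return candidates;
}
```

With the frequent pairs from the second iteration, (1,3), (1,4), (1,5), (3,4) and (3,5), this should yield only (1,3,4) and (1,3,5): both (1,4,5) and (3,4,5) are rejected because (4,5) is missing. Each surviving candidate still has to be counted against the data before it is accepted.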
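I have written a program that generates all the subsets, but it fails on two points:
- I cannot work out how to search the map std::map <int,std::pair<list<int>, int>> CandList in order to count identical sets (their frequency).
- I do not know how to apply the condition.
Thanks for your help.
Here is my code: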
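#include <cstdlib>   // system
#include <fstream>
#include <iterator>
#include <list>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

int threshold = 2;                             // minimum frequency
std::vector<std::list<int>> data;              // one list of items per input line
std::map<int, int> FISupp;                     // item -> frequency
typedef std::pair<std::list<int>, int> combo;  // (itemset, frequency)
std::map<int, combo> CandList;                 // candidate index -> (itemset, frequency)
std::list<int> FrqList;                        // itemset currently being built by Passk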
/*
input: threshold = 2, and data =
1 2 3 4 5
1 3 4 5
1 2 3 5
1 3
at first scan after PassOne function:
FISupp(1,4)
FISupp(2,2)
FISupp(3,4)
FISupp(4,2)
FISupp(5,3)
at k scan after Passk function:
---
*/
int Lsize = 2; // Level size
void ScanData()
{
    std::ifstream in("mydata.txt");
    /* mydata.txt
    1 2 3 4 5
    1 3 4 5
    1 2 3 5
    1 3
    */
    std::string line;
    while (std::getline(in, line))
    {
        std::stringstream ss(line);
        std::list<int> inner;
        int info;
        while (ss >> info)
            inner.push_back(info);
        data.push_back(inner);
    }
}
/* First pass: count the frequency of every single item. */
void PassOne()
{
    for (unsigned i = 0; i < data.size(); ++i)
    {
        std::list<int>::iterator li;
        for (li = data[i].begin(); li != data[i].end(); ++li)
            FISupp[*li] += 1;
    }
    /* Update FISupp by erasing every item whose support is below the threshold. */
    std::map<int, int>::iterator current = FISupp.begin();
    std::list<int> ls; /* items whose support is below the threshold */
    while (current != FISupp.end())
    {
        if (current->second < threshold)
        {
            ls.push_back(current->first);
            current = FISupp.erase(current);
        }
        else
            ++current;
    }
    /* Update the original data by erasing the same infrequent items. */
    for (unsigned i = 0; i < data.size(); ++i)
    {
        std::list<int>::iterator li;
        std::list<int>::iterator item = ls.begin();
        while (item != ls.end())
        {
            for (li = data[i].begin(); li != data[i].end(); ++li)
            {
                if (*li == *item)
                {
                    li = data[i].erase(li);
                    break;
                }
            }
            ++item;
        }
    }
}
void FrequentItem(std::list<int> l, int indx)
{
    int a = 0;
    for (std::list<int>::iterator it = l.begin(); it != l.end(); ++it)
    {
        //std::list <int> &m2 = CandList[indx].first;
        //auto itr = m2.find(*it);
        //auto itr = std::find(CandList.begin(), CandList.end(), *it);
        auto itr = CandList.find(*it);
        if (itr != CandList.end())
        {
            a += CandList[indx].second;
            CandList[indx].first.push_back(*it);
            CandList[indx].second = a;
        }
    }
}
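int ind = 0; // index of the next candidate in CandList
/* Recursively extends FrqList with items of data[j], starting at Itm, until it holds Lsize items. */
void Passk(int j, std::list<int>::iterator Itm, int q = 0)
{
    if (Lsize == q)
    {
        FrequentItem(FrqList, ind);
        ++ind;
        return;
    }
    for (std::list<int>::iterator Itm2 = Itm; Itm2 != data[j].end(); ++Itm2)
    {
        FrqList.push_back(*Itm2);
        Passk(j, std::next(Itm2), q + 1); // continue with the items after Itm2
        FrqList.pop_back();
    }
}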
int main()
{
    ScanData();
    PassOne();
    while (Lsize <= data.size()) // How to stop the loop when there is no more candidate >= threshold???
    {
        for (unsigned i = 0; i < data.size(); ++i)
        {
            std::list<int>::iterator items = data[i].begin();
            Passk(i, items);
        }
        ++Lsize;
    }
    data.clear();
    system("PAUSE");
    return 0;
}
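On the two open points in the code above (looking up a set in the map, and knowing when to stop the loop), one possible approach is to use the itemset itself as the map key and to stop at the first level where nothing reaches the threshold. This is a rough sketch rather than a drop-in fix: `Itemset`, `countCombinations` and `frequentItemsets` are hypothetical names, and each input set is assumed to be sorted with no repeated items. It counts every combination by brute force, so it does no pruning of candidates before counting.

```cpp
#include <cstddef>
#include <map>
#include <vector>

using Itemset = std::vector<int>;  // assumption: sorted, no duplicates

// Adds every k-item combination of the set `t` to `counts`.
void countCombinations(const Itemset& t, std::size_t k, std::size_t start,
                       Itemset& current, std::map<Itemset, int>& counts)
{
    if (current.size() == k) {
        ++counts[current];  // the itemset itself is the key, so equal sets share one counter
        return;
    }
    for (std::size_t i = start; i < t.size(); ++i) {
        current.push_back(t[i]);
        countCombinations(t, k, i + 1, current, counts);
        current.pop_back();
    }
}

// Returns every itemset whose frequency is at least `threshold`,
// processing sizes 1, 2, 3, ... in turn.
std::map<Itemset, int> frequentItemsets(const std::vector<Itemset>& sets, int threshold)
{
    std::map<Itemset, int> result;
    for (std::size_t k = 1; ; ++k) {
        std::map<Itemset, int> counts;
        for (const Itemset& t : sets) {
            Itemset current;
            countCombinations(t, k, 0, current, counts);
        }
        bool anyFrequent = false;
        for (const auto& entry : counts) {
            if (entry.second >= threshold) {
                result.insert(entry);
                anyFrequent = true;
            }
        }
        // Stop condition: if no set of size k is frequent, no larger set can be.
        if (!anyFrequent)
            break;
    }
    return result;
}
```

The stop condition relies on the fact that a superset can never occur more often than any of its subsets, so an empty level means all later levels are empty too.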
【Discussion】:
- The goal and the logic of your program are both unclear to me.
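- Let me check that I understand your spec: you want to list every set S such that S is a subset of at least threshold of the sets in your list, and you want the output ordered by size?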
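If that reading of the spec is right, the frequency of a single set S could be computed as below. This is only a sketch of that interpretation: `support` is a hypothetical helper name, and the input sets are stored as `std::set<int>` here, which is an assumption and not what the question's code uses.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical helper: number of sets in `sets` that contain every
// element of `candidate`.
int support(const std::set<int>& candidate, const std::vector<std::set<int>>& sets)
{
    int count = 0;
    for (const std::set<int>& s : sets) {
        // std::includes needs both ranges sorted, which std::set guarantees.
        if (std::includes(s.begin(), s.end(), candidate.begin(), candidate.end()))
            ++count;
    }
    return count;
}
```

S then passes the condition when `support(S, sets) >= threshold`. For the three sets in the question, (1,3) has support 3 and (4,5) has support 1.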
- Consider using std::set instead of std::list.
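To illustrate that suggestion, here is a small sketch of how reading a line and removing infrequent items might look with `std::set`. The names `parseLine` and `eraseItems` are hypothetical and do not appear in the question's code.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Parses one whitespace-separated line into a set; duplicates collapse and
// the items end up in ascending order.
std::set<int> parseLine(const std::string& line)
{
    std::istringstream in(line);
    std::set<int> items;
    int value;
    while (in >> value)
        items.insert(value);
    return items;
}

// Removes every item listed in `infrequent` from every set in `data`.
void eraseItems(std::vector<std::set<int>>& data, const std::set<int>& infrequent)
{
    for (std::set<int>& transaction : data)
        for (int item : infrequent)
            transaction.erase(item);  // erase by value; does nothing if the item is absent
}
```

Compared with `std::list`, erasing by value replaces the nested search loop, and because the elements are always sorted a set can also be compared with other sets or used directly as a `std::map` key.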
Tags: c++