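【Posted】:2015-02-17 03:23:18
【Question】:
I'm trying to run some analysis on a text file of roughly ten million lines containing passwords. I do this by reading each line of the file, creating a class instance with that value as its argument, and adding the instance to a list. After about line 4,000,000 I get an out-of-memory exception. Is there anything I can do short of storing everything in a SQL database?
Edit: What I do is take the password, put it in a Credential object, and then add that object to a list.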
using System.Collections.Generic;
using System.Linq;

public class Credential
{
    public string Password { get; set; }

    public static readonly List<string> specialCharacters = new List<string> { "@", "!", "~", "*", "^", "&", "\\", "/", "#", "$", "%", "<", ">", ".", ",", "?", ")", "(", "'", "\"", "+", "=", "_", "-", ";", ":", "{", "}", "]", "[", };

    public Credential(string password)
    {
        this.Password = password;
        // One dictionary per password, with one entry per character.
        this.Mapping = new Dictionary<int, CredentialValueType>();
        for (var i = 0; i < this.Length; i++)
        {
            this.Mapping.Add(i, new CredentialValueType(this.Password[i]));
        }
    }

    public Dictionary<int, CredentialValueType> Mapping { get; private set; }

    public int Length
    {
        get
        {
            return this.Password.Length;
        }
    }

    public bool HasUppercase
    {
        get
        {
            return this.Password.Any(c => char.IsUpper(c));
        }
    }

    public bool HasLowercase
    {
        get
        {
            return this.Password.Any(c => char.IsLower(c));
        }
    }

    public bool HasNumber
    {
        get
        {
            return this.Password.Any(c => char.IsNumber(c));
        }
    }

    public bool HasSpecialCharacter
    {
        get //Verify that this works right...
        {
            return this.Password.Any(a => specialCharacters.Contains(a.ToString()));
        }
    }
}
public struct CredentialValueType
{
    public char Value { get; set; }
    public ValueType ValueType { get; set; }

    public CredentialValueType(char val)
    {
        this = new CredentialValueType();
        this.Value = val;
        if (char.IsUpper(val)) this.ValueType = PasswordStats.ValueType.UpperCase;
        else if (char.IsLower(val)) this.ValueType = PasswordStats.ValueType.LowerCase;
        else if (char.IsNumber(val)) this.ValueType = PasswordStats.ValueType.Number;
        else this.ValueType = PasswordStats.ValueType.SpecialCharacter;
    }
}
My function is as follows:
public class PasswordAnalyzer
{
    public IList<Credential> Credentials { get; private set; }

    // Note: delim is not used below; the delimiter is hard-coded to a tab.
    public PasswordAnalyzer(string file, int passwordField = 0, Delimiter delim = Delimiter.Comma)
    {
        this.Credentials = new List<Credential>();
        using (var fileReader = File.OpenText(file)) //Verify UTF-8
        {
            using (var csvReader = new CsvHelper.CsvReader(fileReader))
            {
                csvReader.Configuration.Delimiter = "\t";
                while (csvReader.Read())
                {
                    var record = csvReader.GetField<string>(passwordField);
                    // Every Credential is kept alive in the list for the whole run.
                    this.Credentials.Add(new Credential(record));
                    System.Diagnostics.Debug.WriteLine(this.Credentials.Count);
                }
            }
        }
    }
}
【Comments】:
-
What is your actual code? Are you using File.ReadLines()?
-
This seems to address a similar problem: stackoverflow.com/questions/27561324/…
-
Buy more memory. Or do the analysis incrementally (e.g. 1M lines at a time).
-
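1. Get more RAM. 2. Can you process it in batches? 3. If you are doing any kind of aggregation, aggregate as you process (store sums and counts in separate variables and increment them as you go, so you never load everything into memory). 4. More details on the kind of analysis you are trying to perform would help. We're just stabbing in the dark.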
-
@Pierre-LucPineault Getting more RAM won't magically give a 32-bit process more address space...
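As a quick check on the address-space point, here is a minimal sketch (assuming .NET 4 or later) that reports whether the current process is actually running as 64-bit. `ProcessInfo` and `Describe` are hypothetical names used only for illustration; they are not from the original post.

```csharp
using System;

// Hypothetical helper (not from the original post).
public static class ProcessInfo
{
    // Summarises the bitness of the running process and the OS.
    public static string Describe()
    {
        return string.Format(
            "64-bit process: {0}, 64-bit OS: {1}, pointer size: {2} bytes",
            Environment.Is64BitProcess,
            Environment.Is64BitOperatingSystem,
            IntPtr.Size);
    }
}
```

Calling `Console.WriteLine(ProcessInfo.Describe());` at startup is enough. If it reports a 32-bit process on a 64-bit OS, the project's platform target (for example the "Prefer 32-bit" build option) is worth checking before adding RAM.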
Tags: c# out-of-memory
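The "aggregate as you process" suggestion from the comments could look roughly like the sketch below. It is not the asker's code: `PasswordSummary` and `StreamingAnalyzer` are hypothetical names, fields are split with a plain `string.Split` on a tab (so quoted fields are not handled the way CsvHelper would handle them), and "special" is assumed to mean any character that is not upper case, lower case, or a number, mirroring the fallback branch of `CredentialValueType`. Only running counters are kept, so memory use should not grow with the number of lines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical type (not from the original post): running totals that are
// updated per password, so no per-line object has to stay in memory.
public sealed class PasswordSummary
{
    public long Total { get; private set; }
    public long WithUppercase { get; private set; }
    public long WithLowercase { get; private set; }
    public long WithNumber { get; private set; }
    public long WithSpecial { get; private set; }

    // Password length -> number of passwords with that length.
    public Dictionary<int, long> LengthCounts { get; private set; }

    public PasswordSummary()
    {
        LengthCounts = new Dictionary<int, long>();
    }

    public void Add(string password)
    {
        Total++;
        if (password.Any(c => char.IsUpper(c))) WithUppercase++;
        if (password.Any(c => char.IsLower(c))) WithLowercase++;
        if (password.Any(c => char.IsNumber(c))) WithNumber++;
        // Assumption: "special" = not upper, lower, or number.
        if (password.Any(c => !char.IsUpper(c) && !char.IsLower(c) && !char.IsNumber(c)))
            WithSpecial++;

        long count;
        LengthCounts.TryGetValue(password.Length, out count); // count is 0 if the key is missing
        LengthCounts[password.Length] = count + 1;
    }
}

public static class StreamingAnalyzer
{
    // Consumes lines one at a time; only the summary is retained.
    public static PasswordSummary Analyze(IEnumerable<string> lines, int passwordField = 0, char delimiter = '\t')
    {
        var summary = new PasswordSummary();
        foreach (var line in lines)
        {
            var fields = line.Split(delimiter);
            if (passwordField < fields.Length)
                summary.Add(fields[passwordField]);
        }
        return summary;
    }

    // File.ReadLines enumerates lazily, unlike File.ReadAllLines,
    // which loads the whole file into an array first.
    public static PasswordSummary AnalyzeFile(string path)
    {
        return Analyze(File.ReadLines(path));
    }
}
```

Usage would be something like `var summary = StreamingAnalyzer.AnalyzeFile(path);` followed by reading the counters. If per-position statistics are also needed (what the `Mapping` dictionary in the question stores), they can be accumulated the same way, as counts keyed by position and value type, rather than kept per password.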