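【Posted】: 2015-12-24 21:59:22
【Question】:
While coding earlier, I noticed something odd about SHA256: its hashes seem to contain more digits than letters. At first I thought I was imagining it, so I ran a quick test to check. Surprisingly, the test seems to show that SHA256 favors digits in the hashes it generates. I would like to know why. Shouldn't each character of the hash be equally likely to be a letter or a digit? Here is my test: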
using System;
using System.Linq;
using System.Text;
using DreamforceFramework.Framework.Cryptography;

namespace TestingApp
{
    static class Program
    {
        // Alphabet used to build the random input strings.
        private static string letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
        private static char[] characters = letters.ToCharArray();
        private static Random _rng = new Random();

        static void Main(string[] args)
        {
            int totalIntegers = 0;
            int totalLetters = 0;
            for (int testingIntervals = 0; testingIntervals < 3000; testingIntervals++)
            {
                string randomString = NextString(10);
                string checksum = DreamforceChecksum.GenerateSHA256(randomString);
                // Count the digits (0-9) and letters (a-f) in the hex-encoded hash.
                int integerCount = checksum.Count(Char.IsDigit);
                int letterCount = checksum.Count(Char.IsLetter);
                Console.WriteLine("String: " + randomString);
                Console.WriteLine("Checksum: " + checksum);
                Console.WriteLine("Integers: " + integerCount);
                Console.WriteLine("Letters: " + letterCount);
                totalIntegers += integerCount;
                totalLetters += letterCount;
            }
            Console.WriteLine("Total Integers: " + totalIntegers);
            Console.WriteLine("Total Letters: " + totalLetters);
            Console.Read();
        }

        // Builds a random string of the given length from the alphabet above.
        private static string NextString(int length)
        {
            StringBuilder builder = new StringBuilder();
            for (int i = 0; i < length; i++)
            {
                builder.Append(characters[_rng.Next(characters.Length)]);
            }
            return builder.ToString();
        }
    }
}
And here is my checksum/hash class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

namespace DreamforceFramework.Framework.Cryptography
{
    public static class DreamforceChecksum
    {
        // Shared instances: these make the class unsafe to call from multiple threads at once.
        private static readonly SHA256Managed _shaManagedInstance = new SHA256Managed();
        private static readonly StringBuilder _checksumBuilder = new StringBuilder();

        // Returns the SHA-256 digest of the text as a 64-character lowercase hex string.
        public static string GenerateSHA256(string text)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            byte[] hash = _shaManagedInstance.ComputeHash(bytes);
            _checksumBuilder.Clear();
            for (int index = 0; index < hash.Length; index++)
            {
                // "x2" formats each byte as two lowercase hex characters (0-9, a-f).
                _checksumBuilder.Append(hash[index].ToString("x2"));
            }
            return _checksumBuilder.ToString();
        }

        // Returns the UTF-8 bytes of the hex string, not the raw 32-byte digest.
        public static byte[] GenerateSHA256Bytes(string text)
        {
            return Encoding.UTF8.GetBytes(GenerateSHA256(text));
        }

        // Compares the hex digest of the data with the expected hex string (case-sensitive).
        public static bool ValidateDataIntegrity(string data, string targetHashcode)
        {
            return GenerateSHA256(data).Equals(targetHashcode);
        }
    }
}
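I have run the test many times, and each time the hashes seem to contain more digits than letters. Here are 3 test runs:
Does anyone know why SHA256 seems to favor digits instead of producing an even split between letters and digits?
【Comments】: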
-
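SHA-256 (like almost every hash) outputs raw bytes. These only become characters when you apply an encoding, in your case hexadecimal. The distribution of characters is a property of that encoding.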
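To make the comment above concrete: the hexadecimal alphabet has 16 symbols, of which 10 are digits (0-9) and only 6 are letters (a-f). If every byte value is equally likely, about 10/16 = 62.5% of the hex characters will be digits. Here is a minimal sketch that checks this. The class name `HexDigitRatio`, its method names, and the `"input-" + i` test strings are made up for illustration and are not part of the question's code; it assumes C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the question's code.
public static class HexDigitRatio
{
    // Counts decimal digits (0-9) and letters (a-f) in a hex string.
    public static (int Digits, int Letters) Count(string hex)
    {
        return (hex.Count(char.IsDigit), hex.Count(char.IsLetter));
    }

    // Hex-encodes the SHA-256 digest of the UTF-8 bytes of the text.
    public static string Sha256Hex(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(text));
            var builder = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                builder.Append(b.ToString("x2"));
            }
            return builder.ToString();
        }
    }

    public static void Main()
    {
        // Hex-encode every possible byte value once: a perfectly uniform input.
        string allBytes = string.Concat(Enumerable.Range(0, 256).Select(i => i.ToString("x2")));
        var exact = Count(allBytes);
        // 512 hex characters in total: 320 digits and 192 letters, i.e. 62.5% digits.
        Console.WriteLine($"All 256 byte values: {exact.Digits} digits, {exact.Letters} letters");

        // Hash 3000 arbitrary inputs; the digit share should land close to 0.625.
        int digits = 0, letters = 0;
        for (int i = 0; i < 3000; i++)
        {
            var c = Count(Sha256Hex("input-" + i));
            digits += c.Digits;
            letters += c.Letters;
        }
        double share = (double)digits / (digits + letters);
        Console.WriteLine($"SHA-256 sample: {digits} digits, {letters} letters, digit share {share:F3}");
    }
}
```

For the test in the question (3000 hashes of 64 hex characters each, 192,000 characters in total), the same reasoning predicts roughly 120,000 digits and 72,000 letters. So the imbalance comes from the hex encoding, not from a bias in SHA-256 itself.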