【Posted】:2010-10-14 11:08:46
【Question】:
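There is a bug in Firefox (even in the new betas and in Minefield releases) that prevents certain files from being cached, because of the algorithm used to create keys in its cache hash. Here is a link to the source code of the function.
I want to ensure that all of my site's files can be cached. However, I do not understand why their hash function fails to create unique keys for distinct URLs. I am hoping someone can describe this malfunction in pseudocode or Java.
It would be good to create a utility for developers to ensure unique URLs until this bug is fixed.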
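EDIT: There have been some very helpful answers; however, I need more step-by-step help to create a utility that checks for these cache mix-ups. It would be great to get some Java code that reproduces the keys Firefox is creating. Hence the bounty on this question.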
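EDIT 2: Here is a partially working Java port (written using Processing). Note the tests at the bottom; the first three work as expected, but the others do not. I suspect it has something to do with signed/unsigned ints. Any suggestions?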
//
// the bad collision function
// http://mxr.mozilla.org/mozilla/source/netwerk/cache/src/nsDiskCacheDevice.cpp#240
//
// PLDHashNumber
// nsDiskCache::Hash(const char * key)
// {
//     PLDHashNumber h = 0;
//     for (const PRUint8* s = (PRUint8*) key; *s != '\0'; ++s)
//         h = PR_ROTATE_LEFT32(h, 4) ^ *s;
//     return (h == 0 ? ULONG_MAX : h);
// }
//
// a java port...
//
String getHash( String url )
{
  //get the char array for the url string
  char[] cs = getCharArray( url );
  int h = 0;
  //for (const PRUint8* s = (PRUint8*) key; *s != '\0'; ++s)
  for ( int i=0; i < cs.length; i++ )
  {
    h = PR_ROTATE_LEFT32(h, 4) ^ cs[i];
  }
  //looks like the examples above return something in hex.
  //if we get matching ints, that is ok by me.
  //but for fun, let's try to hex the return vals?
  String hexVal = hex( h );
  return hexVal;
}
char[] getCharArray( String s )
{
  char[] cs = new char[s.length()];
  for (int i=0; i<s.length(); i++)
  {
    char c = s.charAt(i);
    cs[i] = c;
  }
  return cs;
}
//
// how to PR_ROTATE_LEFT32
//
// /*
// ** Macros for rotate left and right. The argument 'a' must be an unsigned
// ** 32-bit integer type such as PRUint32.
// **
// ** There is no rotate operation in the C Language, so the construct
// ** (a << 4) | (a >> 28) is frequently used instead. Most compilers convert
// ** this to a rotate instruction, but MSVC doesn't without a little help.
// ** To get MSVC to generate a rotate instruction, we have to use the _rotl
// ** or _rotr intrinsic and use a pragma to make it inline.
// **
// ** Note: MSVC in VS2005 will do an inline rotate instruction on the above
// ** construct.
// */
// ...
// #define PR_ROTATE_LEFT32(a, bits) _rotl(a, bits)
//return an int (32 bit). what do we do with the 'bits' parameter? ignore?
int PR_ROTATE_LEFT32( int a, int bits )
{
  return (a << 4) | (a >> (32-bits));
}
//
// examples of some colliding hashes
// https://bugzilla.mozilla.org/show_bug.cgi?id=290032#c5
//
//$ ./hashit "ABA/xxx.aba"
//8ffac222
//$ ./hashit "XyZ/xxx.xYz"
//8ffac222
//$ ./hashit "CSS/xxx.css"
//8ffac222
//$ ./hashit "JPG/xxx.jpg"
//8ffac222
//$ ./hashit modules_newsfeeds/MenuBar/MenuBar.css
//15c23729
//$ ./hashit modules_newsfeeds/ListBar/ListBar.css
//15c23729
//$ ./hashit modules_newsfeeds/MenuBar/MenuBar.js
//a15c23e5
//$ ./hashit modules_newsfeeds/ListBar/ListBar.js
//a15c23e5
//
// our attempt at porting this algorithm to java...
//
void setup( )
{
  String a = "ABA/xxx.aba";
  String b = "CSS/xxx.css";
  String c = "CSS/xxx.css";
  String d = "JPG/xxx.jpg";
  println( getHash(a) ); //yes 8ffac222
  println( getHash(b) ); //yes 8ffac222
  println( getHash(c) ); //yes 8ffac222
  println( getHash(d) ); //no [??] FFFFFF98, not 8ffac222
  println( "-----" );
  String e = "modules_newsfeeds/MenuBar/MenuBar.css";
  String f = "modules_newsfeeds/ListBar/ListBar.css";
  println( getHash(e) ); //no [??] FFFFFF8C, not 15c23729
  println( getHash(f) ); //no [??] FFFFFF8C, not 15c23729
  println( "-----" );
  String g = "modules_newsfeeds/MenuBar/MenuBar.js";
  String h = "modules_newsfeeds/ListBar/ListBar.js";
  println( getHash(g) ); //no [??] FFFFFF8C, not a15c23e5
  println( getHash(h) ); //no [??] FFFFFF8C, not a15c23e5
}
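One possible fix for the port above, offered as a sketch rather than a confirmed answer: the rotate in the question uses Java's sign-extending `>>`, so once the top bit of `h` is set the shifted-in bits are all ones, which would explain the `FFFFFF..` results. The version below uses `Integer.rotateLeft` (an unsigned rotate) and masks each byte with `0xFF`. The class name `FirefoxCacheHash`, the `hashHex` helper, and the choice of UTF-8 for turning the string into bytes are assumptions made for illustration; they are not taken from the Firefox source.

```java
import java.nio.charset.StandardCharsets;

public class FirefoxCacheHash {

    /** Sketch of the quoted nsDiskCache::Hash: rotate left by 4 bits, then XOR in the next byte. */
    static int hash(String key) {
        int h = 0;
        // Assumption: the key is hashed byte by byte; UTF-8 is a guess for non-ASCII input.
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            // Integer.rotateLeft is an unsigned rotate, so a set top bit is not sign-extended.
            // (b & 0xFF) treats the byte as unsigned, like PRUint8 in the C code.
            h = Integer.rotateLeft(h, 4) ^ (b & 0xFF);
        }
        // The C code returns ULONG_MAX when h == 0; as a 32-bit value that is 0xFFFFFFFF (-1 in Java).
        return h == 0 ? -1 : h;
    }

    /** Formats the hash as 8 lowercase hex digits, matching the style of the hashit output. */
    static String hashHex(String key) {
        return String.format("%08x", hash(key));
    }

    public static void main(String[] args) {
        String[] keys = {
            "ABA/xxx.aba",
            "XyZ/xxx.xYz",
            "CSS/xxx.css",
            "JPG/xxx.jpg",
            "modules_newsfeeds/MenuBar/MenuBar.css",
            "modules_newsfeeds/ListBar/ListBar.css",
            "modules_newsfeeds/MenuBar/MenuBar.js",
            "modules_newsfeeds/ListBar/ListBar.js"
        };
        for (String key : keys) {
            System.out.println(hashHex(key) + "  " + key);
        }
    }
}
```

With these changes the sketch reproduces the values from the Bugzilla comment for the eight test strings. The collisions follow from the rotate amount: eight rotations by 4 bits bring `h` back to its starting alignment, so characters eight positions apart are XORed into the same bits. In `modules_newsfeeds/MenuBar/MenuBar.css` the two `Menu` substrings sit exactly eight characters apart and cancel each other out, and the same happens with `List`.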
【Discussion】:
-
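Honestly, I think you are worrying about this far too much. Are you actually experiencing a problem of some sort, or is this all premature optimization?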
-
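Further explanation of the issue: I need to develop a strategy to ensure that thousands of files are cached properly. Right now, they are not. I want to pre-process all file names to ensure that they are cacheable.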
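A rough sketch of the kind of pre-processing check described here: hash every file name and report the groups that share a hash. It assumes the same rotate-and-XOR hash as the C function quoted in the question. The class name `CacheKeyCollisionCheck` and the method `findCollisions` are made up for this example, and the exact string Firefox uses as the cache key (full URL or relative path, with or without any prefix) is an assumption that should be checked against the browser before relying on the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CacheKeyCollisionCheck {

    /** Rotate-left-by-4 and XOR hash, following the C function quoted in the question. */
    static int hash(String key) {
        int h = 0;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) { // UTF-8 is an assumption
            h = Integer.rotateLeft(h, 4) ^ (b & 0xFF);
        }
        return h == 0 ? -1 : h;
    }

    /** Groups keys by hash and keeps only the groups that contain more than one key. */
    static Map<Integer, List<String>> findCollisions(List<String> keys) {
        Map<Integer, List<String>> byHash = new LinkedHashMap<>();
        for (String key : keys) {
            byHash.computeIfAbsent(hash(key), unused -> new ArrayList<>()).add(key);
        }
        // Drop hashes that only one key maps to; what remains are the collisions.
        byHash.values().removeIf(group -> group.size() < 2);
        return byHash;
    }

    public static void main(String[] args) {
        // Hypothetical input; a real run would read the site's file list instead.
        List<String> keys = Arrays.asList(
            "ABA/xxx.aba",
            "JPG/xxx.jpg",
            "index.html",
            "modules_newsfeeds/MenuBar/MenuBar.css",
            "modules_newsfeeds/ListBar/ListBar.css");
        for (Map.Entry<Integer, List<String>> entry : findCollisions(keys).entrySet()) {
            System.out.println(String.format("%08x", entry.getKey()) + " -> " + entry.getValue());
        }
    }
}
```

Any group this reports is a set of names that one would have to rename (or otherwise vary) to get distinct keys under the assumed hash; names that are absent from the output did not collide with anything else in the input list.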
Tags: java c++ algorithm firefox hash