【发布时间】:2011-11-14 19:58:09
【问题描述】:
首先我有一个非常具体的问题,但也许解决我的问题的另一种方法(第二部分)也可以帮助我。
有没有办法通过mysql中的索引来寻址字符串中的字符。 (即在 PHP 中 $var[2] 会给你第三个字符)?
显而易见的方法是SUBSTRING(var, 3,1 ),但由于我的字符串长度为 1024 个字符,我认为这不是最快的解决方案。如代码示例中所示,使用子字符串检索字符串的尾部也没有性能差异。有没有办法迭代一个字符串? (移动第一个元素?)
CREATE FUNCTION hashDiff( hash1 TEXT(1024), hash2 TEXT(1024), threshold INT)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE diff, x, b1, b2 INT;
SET diff =0;
SET x = 0;
WHILE (x<1024 AND diff<threshold) DO
SET b1 = ASCII(hash1); --uses first character only!!
SET b2 = ASCII(hash2);
SET hash1=SUBSTRING(hash1, 2 );
SET hash2=SUBSTRING(hash2, 2 );
SET diff=diff+ ((b1-b2)*(b1-b2));
SET x=x+1;
END WHILE;
RETURN diff;
END
如果您还没有从代码中读取它,我会尝试编写一个存储过程来计算哈希之间的差异或距离。差异是字符平方距离的总和(即hashDiff(AA,AC)=(65-65)²+(65-67)²=4)。如果哈希值已经不同,则可以通过引入一个阈值来取消计算,从而实现第一个主要的性能提升。但是由于 mysql 不是我的“日常”语言,所以我一直在寻找其他优化。为了完整起见,两个示例哈希:
YAAAAAAYAAAYAAVAAQAARAOAAOAQASAQAMAKAKAJIAJAJIAHAHIAKJAIIAHHAHIIAIHGAGFFAGGFEAFEEEEAEDDDDDAEEEEDEEEFAFFFFFFEFFFEFFFFFGFEEFFEEEFFFJEFFEEEEEEELFFFFEEFJEEEEDIEEEEEIEEEEHEEEJEEFKFEFKGGFNHGOIIJTJKYONYNMTGHNHHQISJJQIKWLXJJSMYRQWJOGKDDFCCBBAAAAAAAAAAAAAAAAAAAAAAAYAAAAAAYAAAYAAWAARAASASAAQARAUAYAYATAOALKAJAJIAIAHHAHGAGFAFFAEFFAEFFAFFFAEEFFAFEEEDADEDDDDADDDCDDDDDAEEEFEEEEDDDEEEDEDDEEEFEFFGGFMFHGFFFGFFFLGHGGHGGNHHGGGOHGHGHMGGFGMFFFMFGFLFFFMGFFMGGMGGGNGGMGGLGGLGGMGGLEIEEHDCGCGCDGDGDCGDFCECCECECECECFCECFCFCFCFCFCGCJGYCYAAAAAAYAAAYAAUAATAAUAUAAUARARAQAPAPASARRAPARQAPAQQAQQAQSAKMATKKAIIHAIHGAGGGGAGHHGGAGGFGFFAFFGEFFFFFAFFGFGGGFFFEEFGFFGGFGGHIJJLKLWLKJJIJJJKJRLJKLKKKUKLLKKUMMKJIQIIIISKJJWKLLXMLMYMLNYMMYMLLWJIQIINFGKFFKEEIDHEDHDDFCECCFDECCFCFDGCDGCGCGEGCDCECECFDFCGDGCIEKEOAYNFBREUXKPQMMQTKTMMNJLPPVYYYTOUOPOLLJKKJJJIJIMJJJLIJJLLJIIHHIHHHIGHIHIHJHHHJHHIHGHGHFGHGFFEFEEEFEFEFFGGHIHIHGHGHHIIIIHIIJMNLONKLKKKKKKKMLKKLONMKOOOMLOPONMNMKKLLKKLMNKLMMMNMOPPOORPORSSVRTSSRTRRTSSTTXSTQRPONOKKLKLJMKJJIJIIHHHIIIJHI JIJJIJIKJIMWMYYDAAAAAAAAAAAAA
AAAAAAAAAAABAABAACAACACAACADADAEADADADADDAEAEEAEAFEAEEAEFAFGAGGGAGGGAHHHAHIIIAIHIJHAIIHIHHAJIHIJIJKJAJJJIKJJJJKKJKJKKLKLKLLMMMNNMYOOOOOOPOONYOONONNPYNOOOPYOOPPPYNONNYMLLWLLKUJIISHIHOGGMFGFLFFMGGLFGLGFLFFKFKFFLEEKFLEFJFKFGNGNHLFHJFIEGDIEKGOIRFGBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABABACACDACADDACADDADDAEDAFEAEEFAFFFAFFGAFGGGAIGHHIAHIHHHHAHIHIIJJIIAIIIIJIJKJIIIIJJHIIHIIIIJIIIIRJJJJKJJJJLVKLLKLLKXLMMKMXMLLLMWMMMMYMNLYMNNYNNMYMMNYMLYLMLXKJRIHPHIMGGMFEJEJEEIEEHDGCDFCFDCFCECECCEBEBECFDGCFDNGLDBAAAAAAAAAAAAAAAAAAAAAABAAAAABAABABAACACACACACACACADDADAEEAFAFGAFGAHGAGGAGGHAGGIAIHJAJJJJAJKKKKAMLMNNNANOMMNNMMNAONMNOOOMOOPOMNOMMNPOOPPPPRQQYPPRPPPPPNOYLLMMMMLYLMLMLYLMLMMYLNNMYNLLWMLKXLLLUKIKQIIQGHHPFHNGFLFFLGFJEEJEIDDIDCHDFCDGCFCCFCECECCECFCGDGDHDHDIFIDEBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAABBBBBCBCCCCDCCCCCCCCDDDDEDEEEEFFDEGGHGHHHGHHHHHHIIJJJJJIJJJJJJIKJJKLKKMMNMMMMMMMNNNNNNLNNONPONNNOOOOPQQQRSSSSSSUTSTUUUVWVVXUYXWVXVXWYVYWYVYYUWVUTTSSPQPQOPOPONONOMONOOONNNMMNLJJKJIIJHHGGGFHFGFFFFEEEDD EEEEFGGIGJLRNEAAAAAAAAAAAAAA
任何帮助或提示将不胜感激。
【问题讨论】:
-
您可以使用 HEX 函数一次处理四个字节的字符串。
标签: mysql string optimization substring distance