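Maybe someone will find this answer useful for their own purposes...
I decided to go with the simplest approach for now.
Searching for similar objects: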
// Assumes org.hibernate.Session and org.hibernate.query.Query (Hibernate 5.2+).
public List<Person> searchSimilarPeople(Person person) {
    Session session = sessionFactory.getCurrentSession();
    // Both sides of every LIKE are lower-cased, so the match is case-insensitive
    // even when the input contains upper-case letters.
    Query<Person> query = session.createQuery(
            "from Person where lower(firstName) like lower(:firstName) or lower(lastName) like lower(:lastName) " +
            "or lower(email) like lower(:email) or lower(address1) like lower(:address1) or lower(address2) like lower(:address2) " +
            "or lower(city) like lower(:city) or lower(region_state) like lower(:state) or lower(zip) like lower(:zip) " +
            "or lower(country) like lower(:country)", Person.class);
    // Wrap each value in % wildcards for a "contains" match.
    query.setParameter("firstName", "%" + person.getFirstName() + "%");
    query.setParameter("lastName", "%" + person.getLastName() + "%");
    query.setParameter("email", "%" + person.getEmail() + "%");
    query.setParameter("address1", "%" + person.getAddress1() + "%");
    query.setParameter("address2", "%" + person.getAddress2() + "%");
    query.setParameter("city", "%" + person.getCity() + "%");
    query.setParameter("state", "%" + person.getRegion_state() + "%");
    query.setParameter("zip", "%" + person.getZip() + "%");
    query.setParameter("country", "%" + person.getCountry() + "%");
    return query.getResultList();
}
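One thing to watch in the query above: a null field is concatenated into the pattern "%null%", an empty field becomes "%%" (which matches every non-null value), and any % or _ typed by the user acts as a wildcard. Below is a minimal sketch of a pattern builder that guards against this. The class and method names (LikePatterns, contains) are hypothetical, and the escaping only takes effect if each condition in the query also declares the escape character, e.g. like :city escape '\'.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hibernate: builds the value bound to a LIKE parameter.
final class LikePatterns {

    private LikePatterns() {
    }

    // Returns a lower-cased "contains" pattern, or null when the value is
    // null or blank so the caller can leave that condition out of the query.
    static String contains(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        String escaped = value.trim().toLowerCase(Locale.ROOT)
                .replace("\\", "\\\\") // escape the escape character first
                .replace("%", "\\%")   // literal percent sign
                .replace("_", "\\_");  // literal underscore
        return "%" + escaped + "%";
    }
}
```

With something like this, the HQL string would be assembled only from the fields for which contains(...) returns a non-null pattern.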
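I tried digging into Hibernate Search with Lucene, but I don't think it is the best choice for a Hibernate beginner. Still, if someone could provide a brief explanation of the same process using Lucene, I would appreciate it :)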
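Comparing objects:
Almost all of the variables in the objects being compared are strings, so I decided to use the Jaro-Winkler distance to get a string match coefficient and then multiply it by 0.11 (I have 9 variables to compare, so the maximum possible match result is 0.99).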
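To make the arithmetic concrete, here is a self-contained sketch of the textbook Jaro-Winkler similarity (prefix scale 0.1, prefix capped at 4 characters) together with the 0.11 weighting. It is an illustration only: the class and method names are made up, and the library method used in my code may differ in details such as thresholds or rounding.

```java
// Illustrative sketch; JaroWinkler, jaro, similarity and weightedScore are hypothetical names.
final class JaroWinkler {

    private JaroWinkler() {
    }

    // Jaro similarity in [0, 1]; 1.0 means the strings are identical.
    static double jaro(String a, String b) {
        if (a.equals(b)) {
            return 1.0;
        }
        int la = a.length();
        int lb = b.length();
        if (la == 0 || lb == 0) {
            return 0.0;
        }
        // Characters count as matching only within this distance of each other.
        int window = Math.max(0, Math.max(la, lb) / 2 - 1);
        boolean[] matchedA = new boolean[la];
        boolean[] matchedB = new boolean[lb];
        int matches = 0;
        for (int i = 0; i < la; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(lb - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matchedB[j] && a.charAt(i) == b.charAt(j)) {
                    matchedA[i] = true;
                    matchedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Count matched characters that appear in a different order (half-transpositions).
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < la; i++) {
            if (!matchedA[i]) {
                continue;
            }
            while (!matchedB[k]) {
                k++;
            }
            if (a.charAt(i) != b.charAt(k)) {
                outOfOrder++;
            }
            k++;
        }
        double m = matches;
        return (m / la + m / lb + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
    static double similarity(String a, String b) {
        double j = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return j + prefix * 0.1 * (1.0 - j);
    }

    // Sums per-field similarities, each multiplied by the same weight.
    static double weightedScore(double[] similarities, double weight) {
        double total = 0.0;
        for (double s : similarities) {
            total += s * weight;
        }
        return total;
    }
}
```

With nine fields that all match exactly, every similarity is 1.0 and the weighted sum is 9 × 0.11 = 0.99, which is where the maximum mentioned above comes from.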
// Uses StringUtils (Apache Commons Lang) and Precision (Apache Commons Math).
// rateCoefficient is a field of the enclosing class holding the 0.11 weight described above.
@Override
public Double getMatch(Person inputPerson, Person dbPerson) {
    double intermResult = 0;
    double finalResult = 0;
    // First name
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getFirstName().toLowerCase(), dbPerson.getFirstName().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // Last name
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getLastName().toLowerCase(), dbPerson.getLastName().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // Email
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getEmail().toLowerCase(), dbPerson.getEmail().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // Address 1
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getAddress1().toLowerCase(), dbPerson.getAddress1().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // Address 2 (optional: contributes nothing when missing on either side)
    if (inputPerson.getAddress2() != null && dbPerson.getAddress2() != null) {
        intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getAddress2().toLowerCase(), dbPerson.getAddress2().toLowerCase());
        finalResult += (intermResult * rateCoefficient);
    }
    // City
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getCity().toLowerCase(), dbPerson.getCity().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // State (optional: contributes nothing when missing on either side)
    if (inputPerson.getRegion_state() != null && dbPerson.getRegion_state() != null) {
        intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getRegion_state().toLowerCase(), dbPerson.getRegion_state().toLowerCase());
        finalResult += (intermResult * rateCoefficient);
    }
    // Zip
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getZip().toLowerCase(), dbPerson.getZip().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // Country
    intermResult = StringUtils.getJaroWinklerDistance(inputPerson.getCountry().toLowerCase(), dbPerson.getCountry().toLowerCase());
    finalResult += (intermResult * rateCoefficient);
    // Round the weighted sum to three decimal places.
    finalResult = Precision.round(finalResult, 3);
    System.out.println("Final Result. ID : " + dbPerson.getId() + " " + finalResult);
    return finalResult;
}
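The method above repeats the same two lines for every field, so it could probably be shortened by looping over a list of getters. Below is a rough sketch of that idea, not what I actually run. FieldMatcher is a hypothetical class, the similarity function is passed in so the sketch does not depend on any particular library, and unlike the method above it skips any field that is null on either side, not just address2 and state.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.function.ToDoubleBiFunction;

// Hypothetical generic scorer: compares two objects field by field.
final class FieldMatcher<T> {

    private final List<Function<T, String>> fields;                 // getters to compare
    private final ToDoubleBiFunction<String, String> similarity;    // e.g. a Jaro-Winkler function
    private final double weight;                                    // e.g. 0.11 for nine fields

    FieldMatcher(List<Function<T, String>> fields,
                 ToDoubleBiFunction<String, String> similarity,
                 double weight) {
        this.fields = fields;
        this.similarity = similarity;
        this.weight = weight;
    }

    // Returns the weighted sum of per-field similarities, rounded to 3 decimals.
    double match(T input, T stored) {
        double total = 0.0;
        for (Function<T, String> field : fields) {
            String a = field.apply(input);
            String b = field.apply(stored);
            if (a == null || b == null) {
                continue; // a missing value contributes nothing to the score
            }
            total += weight * similarity.applyAsDouble(
                    a.toLowerCase(Locale.ROOT), b.toLowerCase(Locale.ROOT));
        }
        return Math.round(total * 1000.0) / 1000.0;
    }
}
```

For Person, the list would hold the nine getters (Person::getFirstName, Person::getLastName and so on), and the similarity argument would be whichever Jaro-Winkler function is in use.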