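See if this helps, since you tagged the question with PL/SQL, which reads as "Oracle".
If you use the utl_match package, you can compute a "similarity" that shows how alike two strings are (from 0 to 100). A cross join of the two tables shows this: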
with
  master (company_name) as
    (select 'ABC India Private Limited' from dual union all
     select 'XYZ RAK Private Limited'   from dual union all
     select 'PQR XRK Private Limited'   from dual
    ),
  new_table (emp_name, company_name, age, designation) as
    (select 'HarishP', 'ABC pvt. Ltd.', 30, 'Director' from dual union all
     select 'Rupeshj', 'XYZ RAK Ltd.' , 25, 'IT Head'  from dual union all
     select 'RajeshK', 'PQR, XRK Pvt.', 45, 'Engineer' from dual
    )
select n.emp_name,
       n.company_name new_cname,
       m.company_name,
       n.age,
       n.designation,
       utl_match.jaro_winkler_similarity(lower(m.company_name), lower(n.company_name)) sim
from master m cross join new_table n
order by n.company_name, m.company_name;

EMP_NAM NEW_CNAME     COMPANY_NAME                     AGE DESIGNAT        SIM
------- ------------- ------------------------- ---------- -------- ----------
HarishP ABC pvt. Ltd. ABC India Private Limited         30 Director         77
HarishP ABC pvt. Ltd. PQR XRK Private Limited           30 Director         47
HarishP ABC pvt. Ltd. XYZ RAK Private Limited           30 Director         52
RajeshK PQR, XRK Pvt. ABC India Private Limited         45 Engineer         45
RajeshK PQR, XRK Pvt. PQR XRK Private Limited           45 Engineer         84
RajeshK PQR, XRK Pvt. XYZ RAK Private Limited           45 Engineer         58
Rupeshj XYZ RAK Ltd.  ABC India Private Limited         25 IT Head          47
Rupeshj XYZ RAK Ltd.  PQR XRK Private Limited           25 IT Head          54
Rupeshj XYZ RAK Ltd.  XYZ RAK Private Limited           25 IT Head          83

9 rows selected.
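Now it is obvious, when we look at the result, which company_name pairs are OK and which are not. An acceptable limit in this case is, for example, 75, so if we put that into the join condition, we get: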
<snip>
select n.emp_name,
       n.company_name new_cname,
       m.company_name,
       n.age,
       n.designation,
       utl_match.jaro_winkler_similarity(lower(m.company_name), lower(n.company_name)) sim
from master m join new_table n
  on utl_match.jaro_winkler_similarity(lower(m.company_name), lower(n.company_name)) > 75
order by n.company_name, m.company_name;

EMP_NAM NEW_CNAME     COMPANY_NAME                     AGE DESIGNAT        SIM
------- ------------- ------------------------- ---------- -------- ----------
HarishP ABC pvt. Ltd. ABC India Private Limited         30 Director         77
RajeshK PQR, XRK Pvt. PQR XRK Private Limited           45 Engineer         84
Rupeshj XYZ RAK Ltd.  XYZ RAK Private Limited           25 IT Head          83
Or, prettier:
<snip>
select n.emp_name,
       m.company_name,
       n.age,
       n.designation
from master m join new_table n
  on utl_match.jaro_winkler_similarity(lower(m.company_name), lower(n.company_name)) > 75
order by n.company_name, m.company_name;

EMP_NAM COMPANY_NAME                     AGE DESIGNAT
------- ------------------------- ---------- --------
HarishP ABC India Private Limited         30 Director
RajeshK PQR XRK Private Limited           45 Engineer
Rupeshj XYZ RAK Private Limited           25 IT Head
But will it work for all your cases? Most probably not. It depends on the quality of the garbage you entered.
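If the data is messy enough that one new_table row clears the limit for several master rows, one option is to keep only the highest-scoring candidate per row. What follows is only a sketch under a few assumptions: the scored and ranked CTE names are made up for illustration, emp_name plus company_name is assumed to identify a new_table row, and 75 is still just an example limit you would have to tune.

```sql
-- Sketch: keep the single best-scoring master row for each new_table row.
-- Same sample data as in the queries above; "scored" and "ranked" are
-- hypothetical helper CTEs, and the limit of 75 is an assumption.
with
  master (company_name) as
    (select 'ABC India Private Limited' from dual union all
     select 'XYZ RAK Private Limited'   from dual union all
     select 'PQR XRK Private Limited'   from dual
    ),
  new_table (emp_name, company_name, age, designation) as
    (select 'HarishP', 'ABC pvt. Ltd.', 30, 'Director' from dual union all
     select 'Rupeshj', 'XYZ RAK Ltd.' , 25, 'IT Head'  from dual union all
     select 'RajeshK', 'PQR, XRK Pvt.', 45, 'Engineer' from dual
    ),
  scored as
    -- every new_table row against every master row, with its similarity
    (select n.emp_name,
            n.company_name new_cname,
            m.company_name,
            n.age,
            n.designation,
            utl_match.jaro_winkler_similarity
              (lower(m.company_name), lower(n.company_name)) sim
     from master m cross join new_table n
    ),
  ranked as
    -- rank candidates per new_table row, best score first;
    -- ties are broken alphabetically by master company_name
    (select s.*,
            row_number() over (partition by emp_name, new_cname
                               order by sim desc, company_name) rn
     from scored s
    )
select emp_name, new_cname, company_name, age, designation, sim
from ranked
where rn = 1      -- best candidate only
  and sim > 75    -- and only if it is good enough
order by new_cname;
```

On the sample data this returns the same three rows as the threshold-only query, because each row's best score (77, 84, 83) is above 75. The difference only shows up when a looser limit would otherwise let a row match more than one master company.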