【问题标题】:How to setup a Full Text Search query in PostgreSQL如何在 PostgreSQL 中设置全文搜索查询
【发布时间】:2014-05-09 01:49:54
【问题描述】:

我对 PostgreSQL 很陌生,在实现全文搜索时遇到了一些问题。我目前正在使用以下设置:

CREATE DATABASE test;

CREATE TABLE data_table (
   id BIGSERIAL PRIMARY KEY,
   name VARCHAR(160) NOT NULL,
   description VARCHAR NOT NULL
);

CREATE INDEX data_table_idx ON data_table 
USING gin(to_tsvector('English', name || ' ' || description)); 

INSERT INTO data_table (name, description) VALUES 
    ('Penguin', 'This is the Linux penguin.'), 
    ('Gnu', 'This is the GNU gnu.'), 
    ('Elephant', 'This is the PHP elephant.'), 
    ('Elephant', 'This is the postgres elephant.'), 
    ('Duck', 'This is the duckduckgo duck.'), 
    ('Cat', 'This is the GitHub cat.'), 
    ('Bird', 'This is the Twitter bird.'), 
    ('Lion', 'This is the Leo lion.');

现在我尝试在表格中搜索给定的用户输入,并返回整个数据行以及突出显示的匹配项,它应该类似于以下内容:

WITH 
    q AS ( SELECT plainto_tsquery('English', 'elephants php') AS query ),
    d AS ( SELECT (name || ' ' || description) AS document FROM data_table ),
    t AS ( SELECT to_tsvector('English', d.document) AS textsearch FROM d ),
    r AS ( SELECT ts_rank_cd(t.textsearch, q.query) AS rank FROM t, q )
SELECT data_table.*, ts_headline('german', d.document, q.query) AS matches
FROM data_table, q, d, t , r
WHERE q.query @@ t.textsearch 
ORDER BY r.rank DESC 
LIMIT 10;

这让我得到以下输出:

 id |   name   |          description           |              matches               
----+----------+--------------------------------+------------------------------------
  5 | duck     | This is the duckduckgo duck.   | Penguin This is the Linux penguin.
  2 | Gnu      | This is the GNU gnu.           | Gnu This is the GNU gnu.
  3 | Elephant | This is the PHP elephant.      | Penguin This is the Linux penguin.
  4 | elephant | This is the postgres elephant. | Penguin This is the Linux penguin.
  6 | Cat      | This is the GitHub cat.        | Penguin This is the Linux penguin.
  1 | Penguin  | This is the Linux penguin.     | Gnu This is the GNU gnu.
  1 | Penguin  | This is the Linux penguin.     | Penguin This is the Linux penguin.
  2 | Gnu      | This is the GNU gnu.           | Penguin This is the Linux penguin.
  4 | elephant | This is the postgres elephant. | Gnu This is the GNU gnu.
  3 | Elephant | This is the PHP elephant.      | Gnu This is the GNU gnu.
(10 rows)

所以查询确实返回了一些东西,但它没有按排名排序,每个文档都与名称/描述的每个组合相结合,唯一有效的是正确突出显示文档中的搜索结果。那么我做错了什么,我该如何解决呢?

【问题讨论】:

  • 样本数据/sqlfiddle?
  • @Craig 完成了我的示例,因此您只需复制粘贴我的查询即可重现它。

标签: postgresql full-text-search with-statement


【解决方案1】:

最后我能够得到这个工作。请在下面找到我的解决方案。我希望这会对某人有所帮助。如果有人知道更好的解决方案,具有更好/更快的索引,我会很高兴知道。

查询:

WITH 
    q AS ( SELECT to_tsquery('german', 'elephant | php') AS query ),
    d AS ( SELECT id, (name || ' ' || description) AS doc FROM data_table ),
    t AS ( SELECT id, doc, to_tsvector('german', doc) AS vector FROM d ),
    r AS ( 
        SELECT id, doc, ts_rank_cd(vector, query) AS rank 
        FROM t, q
        WHERE q.query @@ vector
        ORDER BY rank DESC 
    )
SELECT id, ts_headline('german', doc, q.query) AS matches, rank
FROM r, q
ORDER BY r;

结果:

 id |                         matches                         | rank 
----+---------------------------------------------------------+------
  3 | <b>Elephant</b> This is the <b>PHP</b> <b>elephant</b>. |  0.3
  4 | <b>elephant</b> This is the postgres <b>elephant</b>.   |  0.2

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 1970-01-01
    • 2011-03-29
    • 2023-03-28
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    相关资源
    最近更新 更多