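Cassandra follows a different paradigm from an RDBMS, and the difference is most apparent in how data modelling has to be done. Keep in mind that denormalization is preferred and that you will end up with duplicated data.
Table definitions should be based on the queries used to retrieve the data; these are partly stated in the problem definition, for example:
Find requests sent by a user
Taking the initial design of the requestsByFrom table, an alternative would be: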
CREATE TABLE IF NOT EXISTS requests_sent_by_user (
    requester_email TEXT,   -- partition key: one partition per sender
    recipient_email TEXT,   -- clustering column: one row per recipient
    recipient_name TEXT,
    message TEXT,
    created TIMESTAMP,
    PRIMARY KEY (requester_email, recipient_email)
) WITH default_time_to_live = 864000;
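Note that from is a reserved keyword, which is why the column is named requester_email. The expiry information can be set through the default_time_to_live clause (TTL), which deletes each record after the defined time; the value is the number of seconds after the record is inserted, and the example uses 10 days (864,000 seconds).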
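To illustrate how the table-level TTL interacts with individual writes, here is a minimal sketch. It assumes the requests_sent_by_user table defined above; the sample values and the one-day override are made up for illustration. USING TTL overrides the table default for a single write, and the TTL() function reports the seconds remaining for a column.

```cql
-- Override the table default for this one write:
-- the request expires after 1 day (86,400 seconds) instead of 10 days.
INSERT INTO requests_sent_by_user (
    requester_email, recipient_email, recipient_name, message, created
) VALUES (
    'test@email.com', 'john.smith@test.com', 'John Smith',
    'Please add me as a contact', toTimestamp(now())
) USING TTL 86400;

-- Inspect the remaining lifetime (in seconds) of the message column
-- for every request sent by this user.
SELECT recipient_email, TTL(message) AS seconds_left
FROM requests_sent_by_user
WHERE requester_email = 'test@email.com';
```

Writes that omit USING TTL simply inherit the 864,000 seconds set on the table.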
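The suggested primary key uses the email address; a UUID would also work. Using the name is not recommended, because several people can share a name (for example James Smith) and the same person can write their name in several ways (continuing the example, Jim Smith, J. Smith and j smith may all refer to the same person).
The recipient_name column is also included, since you will most likely want to display it; any other information that will be displayed or used together with the query should be added as well.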
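As a quick check that the table serves the query it was designed for, the lookups could look like the sketch below. The email addresses are sample values, not taken from the question. Both statements restrict the partition key, so each one reads a single partition.

```cql
-- All requests sent by one user (single-partition read).
SELECT recipient_email, recipient_name, message, created
FROM requests_sent_by_user
WHERE requester_email = 'test@email.com';

-- One specific request, addressed by the full primary key.
SELECT message, created
FROM requests_sent_by_user
WHERE requester_email = 'test@email.com'
  AND recipient_email = 'john.smith@test.com';
```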
Find requests received by a user
CREATE TABLE IF NOT EXISTS requests_received_by_user (
    recipient_email TEXT,   -- partition key: one partition per recipient
    requester_email TEXT,   -- clustering column: one row per sender
    requester_name TEXT,
    message TEXT,
    created TIMESTAMP,
    PRIMARY KEY (recipient_email, requester_email)
) WITH default_time_to_live = 864000;
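It is preferable to use a batch to add the records to requests_sent_by_user and requests_received_by_user at the same time. This keeps the information in the two tables consistent, and the TTL (data expiry) will be the same in both.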
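A minimal sketch of such a batch, assuming the two tables defined above; the names, addresses and message text are sample values. Because both rows are written together and both tables use the same default_time_to_live, they expire at roughly the same time.

```cql
BEGIN BATCH
    -- Row for the "requests I sent" query.
    INSERT INTO requests_sent_by_user (
        requester_email, recipient_email, recipient_name, message, created
    ) VALUES (
        'test@email.com', 'john.smith@test.com', 'John Smith',
        'Please add me as a contact', toTimestamp(now())
    );
    -- Mirror row for the "requests I received" query.
    INSERT INTO requests_received_by_user (
        recipient_email, requester_email, requester_name, message, created
    ) VALUES (
        'john.smith@test.com', 'test@email.com', 'Test User',
        'Please add me as a contact', toTimestamp(now())
    );
APPLY BATCH;

-- The recipient can then list incoming requests from a single partition.
SELECT requester_email, requester_name, message, created
FROM requests_received_by_user
WHERE recipient_email = 'john.smith@test.com';
```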
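Storing contacts
The question has 4 connection tables: connections, active_connections, favourite_connections and paired_connections. What is the difference between them? Will they have different rules or use cases? If so, it makes sense to keep them as separate tables: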
CREATE TABLE IF NOT EXISTS connections (
    requester_email TEXT,
    recipient_email TEXT,
    recipient_name TEXT,
    notes TEXT,
    created TIMESTAMP,
    last_update TIMESTAMP,
    is_favourite BOOLEAN,
    is_active BOOLEAN,
    is_paired BOOLEAN,
    PRIMARY KEY (requester_email, recipient_email)
);
CREATE TABLE IF NOT EXISTS active_connections (
    requester_email TEXT,
    recipient_email TEXT,
    recipient_name TEXT,
    last_update TIMESTAMP,
    PRIMARY KEY (requester_email, recipient_email)
);
CREATE TABLE IF NOT EXISTS favourite_connections (
    requester_email TEXT,
    recipient_email TEXT,
    recipient_name TEXT,
    last_update TIMESTAMP,
    PRIMARY KEY (requester_email, recipient_email)
);
CREATE TABLE IF NOT EXISTS paired_connections (
    requester_email TEXT,
    recipient_email TEXT,
    recipient_name TEXT,
    last_update TIMESTAMP,
    PRIMARY KEY (requester_email, recipient_email)
);
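Note that the boolean flags have been removed from active_connections, favourite_connections and paired_connections; the logic is that if a record exists in active_connections, it is assumed to be an active connection.
When a new connection is created, it may have several records in different tables; to bundle all of those inserts or updates, it is better to use a batch.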
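As a sketch of that idea, creating a connection that starts out active (but neither favourite nor paired) could look like this. The initial flag values are an assumption for illustration, as are the sample names and addresses; the tables are the ones defined above.

```cql
BEGIN BATCH
    -- Master record, carrying the full set of flags.
    INSERT INTO connections (
        requester_email, recipient_email, recipient_name, notes,
        created, last_update, is_favourite, is_active, is_paired
    ) VALUES (
        'test@email.com', 'john.smith@test.com', 'John Smith', '',
        toTimestamp(now()), toTimestamp(now()), false, true, false
    );
    -- Presence of this row is what marks the connection as active.
    INSERT INTO active_connections (
        requester_email, recipient_email, recipient_name, last_update
    ) VALUES (
        'test@email.com', 'john.smith@test.com', 'John Smith',
        toTimestamp(now())
    );
APPLY BATCH;
```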
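Find all active contacts for a given user
Based on the proposed tables, if the requester's email is test@email.com:
SELECT * FROM active_connections WHERE requester_email = 'test@email.com';
Update a user to be a favourite
This would be a batch that updates the record in connections and adds a new record to favourite_connections: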
BEGIN BATCH
    -- Flip the flag on the master record.
    -- toTimestamp(now()) replaces the deprecated dateOf(now()).
    UPDATE connections
    SET is_favourite = true, last_update = toTimestamp(now())
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
    -- Add the row that makes the connection show up as a favourite.
    INSERT INTO favourite_connections (
        requester_email, recipient_email, recipient_name, last_update
    ) VALUES (
        'test@email.com', 'john.smith@test.com', 'John Smith', toTimestamp(now())
    );
APPLY BATCH;
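The reverse operation follows the same pattern. The sketch below is an assumption about how un-favouriting might be handled, not something stated in the question: the flag in connections is cleared and the row in favourite_connections is removed in a single batch.

```cql
BEGIN BATCH
    -- Clear the flag on the master record.
    UPDATE connections
    SET is_favourite = false, last_update = toTimestamp(now())
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
    -- Remove the row so the connection no longer appears as a favourite.
    DELETE FROM favourite_connections
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
APPLY BATCH;
```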
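Mark a connection for soft deletion
The connection's information can be kept in connections with all of its flags disabled, while the corresponding records are deleted from active_connections, favourite_connections and paired_connections: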
BEGIN BATCH
    -- Keep the master record, but disable every flag.
    UPDATE connections
    SET is_active = false, is_favourite = false,
        is_paired = false, last_update = toTimestamp(now())
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
    -- Remove the connection from each of the lookup tables.
    DELETE FROM active_connections
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
    DELETE FROM favourite_connections
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
    DELETE FROM paired_connections
    WHERE requester_email = 'test@email.com'
      AND recipient_email = 'john.smith@test.com';
APPLY BATCH;