这实际上是一个非常有趣的问题。
哈哈,不管这个,我真烂。
编辑时:这个答案是可行的,但是在MySQL上,当父行数只有100行时,它会变得非常缓慢。
但是,请参阅下面的性能修复。
显然,您可以对每个日志运行一次此查询:
select * from comments where id = $id limit 3
这会产生大量的开销,当您在每个日志中执行一个数据库查询时
N+ 1查询
.
如果你想一次得到所有的帖子(或某个子集的位置),下面将
惊人地
工作。它假定注释具有单调递增的ID(因为日期时间不一定是唯一的),但允许在文章之间交错注释ID。
因为一个自动递增的id列是单调递增的,所以如果comment有一个id,您就都设置好了。
首先,创建此视图。在视图中,我呼叫Post
parent
评论
child
:
create view parent_top_3_children as
select a.*,
(select max(id) from child where parent_id = a.id) as maxid,
(select max(id) from child where id < maxid
and parent_id = a.id) as maxidm1,
(select max(id) from child where id < maxidm1
and parent_id = a.id) as maxidm2
from parent a;
maxidm1
只是“最大ID减1”;
maxidm2
“max id减去2”--即第二和第三大子id
在特定父ID中
.
然后将视图加入到评论中所需的任何内容中(我将称之为
text
):
select a.*,
b.text as latest_comment,
c.text as second_latest_comment,
d.text as third_latest_comment
from parent_top_3_children a
left outer join child b on (b.id = a.maxid)
left outer join child c on (c.id = a.maxidm1)
left outer join child d on (c.id = a.maxidm2);
当然,您可以添加任何希望添加的WHERE子句,以限制日志:
where a.category = 'foo'
或者什么。
我的桌子是这样的:
mysql> select * from parent;
+----+------+------+------+
| id | a | b | c |
+----+------+------+------+
| 1 | 1 | 1 | NULL |
| 2 | 2 | 2 | NULL |
| 3 | 3 | 3 | NULL |
+----+------+------+------+
3 rows in set (0.00 sec)
还有一部分孩子。父级1没有子级:
mysql> select * from child;
+----+-----------+------+------+------+------+
| id | parent_id | a | b | c | d |
+----+-----------+------+------+------+------+
. . . .
| 18 | 3 | NULL | NULL | NULL | NULL |
| 19 | 2 | NULL | NULL | NULL | NULL |
| 20 | 2 | NULL | NULL | NULL | NULL |
| 21 | 3 | NULL | NULL | NULL | NULL |
| 22 | 2 | NULL | NULL | NULL | NULL |
| 23 | 2 | NULL | NULL | NULL | NULL |
| 24 | 3 | NULL | NULL | NULL | NULL |
| 25 | 2 | NULL | NULL | NULL | NULL |
+----+-----------+------+------+------+------+
24 rows in set (0.00 sec)
这个观点给了我们:
mysql> select * from parent_top_3;
+----+------+------+------+-------+---------+---------+
| id | a | b | c | maxid | maxidm1 | maxidm2 |
+----+------+------+------+-------+---------+---------+
| 1 | 1 | 1 | NULL | NULL | NULL | NULL |
| 2 | 2 | 2 | NULL | 25 | 23 | 22 |
| 3 | 3 | 3 | NULL | 24 | 21 | 18 |
+----+------+------+------+-------+---------+---------+
3 rows in set (0.21 sec)
该视图的解释计划只是有点毛毛:
mysql> explain select * from parent_top_3;
+----+--------------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 3 | |
| 5 | DEPENDENT SUBQUERY | child | ALL | PRIMARY | NULL | NULL | NULL | 24 | Using where |
| 4 | DEPENDENT SUBQUERY | child | ALL | PRIMARY | NULL | NULL | NULL | 24 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ALL | NULL | NULL | NULL | NULL | 24 | Using where |
+----+--------------------+------------+------+---------------+------+---------+------+------+-------------+
但是,如果我们为父FK添加索引,它会变得更好:
mysql> create index pid on child(parent_id);
mysql> explain select * from parent_top_3;
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 3 | |
| 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 2 | Using where |
| 4 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 2 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 2 | Using where |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
5 rows in set (0.04 sec)
如前所述,当父行数不到100时,这一点就开始分崩离析。
即使我们使用它的主键索引到父级
:
mysql> select * from parent_top_3 where id < 10;
+----+------+------+------+-------+---------+---------+
| id | a | b | c | maxid | maxidm1 | maxidm2 |
+----+------+------+------+-------+---------+---------+
| 1 | 1 | 1 | NULL | NULL | NULL | NULL |
| 2 | 2 | 2 | NULL | 25 | 23 | 22 |
| 3 | 3 | 3 | NULL | 24 | 21 | 18 |
| 4 | NULL | 1 | NULL | 65 | 64 | 63 |
| 5 | NULL | 2 | NULL | 73 | 72 | 71 |
| 6 | NULL | 3 | NULL | 113 | 112 | 111 |
| 7 | NULL | 1 | NULL | 209 | 208 | 207 |
| 8 | NULL | 2 | NULL | 401 | 400 | 399 |
| 9 | NULL | 3 | NULL | 785 | 784 | 783 |
+----+------+------+------+-------+---------+---------+
9 rows in set (1 min 3.11 sec)
(请注意,我有意在一台速度较慢的机器上进行测试,数据保存在速度较慢的闪存磁盘上。)
这是解释,查找一个ID(第一个ID,在那个位置):
mysql> explain select * from parent_top_3 where id = 1;
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1000 | Using where |
| 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 1000 | |
| 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 179 | Using where |
| 4 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 179 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 179 | Using where |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
5 rows in set (56.01 sec)
一排超过56秒,即使在我的慢机器上,也不能接受两个数量级。
那么我们可以保存这个查询吗?它
作品
太慢了。
这是修改后的查询的解释计划。看起来很糟糕或更糟:
mysql> explain select * from parent_top_3a where id = 1;
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 100 | Using where |
| 2 | DERIVED | <derived4> | ALL | NULL | NULL | NULL | NULL | 100 | |
| 4 | DERIVED | <derived6> | ALL | NULL | NULL | NULL | NULL | 100 | |
| 6 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 100 | |
| 7 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 179 | Using where |
| 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | a.id | 179 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | a.id | 179 | Using where |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
7 rows in set (0.05 sec)
但它完成了
三
数量级更快,在1/20秒!
我们怎样才能到达速度更快的家长?我们创造
三
视图,每个视图依赖于前一个视图:
create view parent_top_1 as
select a.*,
(select max(id) from child where parent_id = a.id)
as maxid
from parent a;
create view parent_top_2 as
select a.*,
(select max(id) from child where parent_id = a.id and id < a.maxid)
as maxidm1
from parent_top_1 a;
create view parent_top_3a as
select a.*,
(select max(id) from child where parent_id = a.id and id < a.maxidm1)
as maxidm2
from parent_top_2 a;
这不仅工作得更快,而且对于RDBMSE(而不是MySQL)也是合法的。
让我们把父行数增加到12800,子行数增加到1536(大多数博客文章都没有评论,对吧?;))
mysql> select * from parent_top_3a where id >= 20 and id < 40;
+----+------+------+------+-------+---------+---------+
| id | a | b | c | maxid | maxidm1 | maxidm2 |
+----+------+------+------+-------+---------+---------+
| 39 | NULL | 2 | NULL | NULL | NULL | NULL |
| 38 | NULL | 1 | NULL | NULL | NULL | NULL |
| 37 | NULL | 3 | NULL | NULL | NULL | NULL |
| 36 | NULL | 2 | NULL | NULL | NULL | NULL |
| 35 | NULL | 1 | NULL | NULL | NULL | NULL |
| 34 | NULL | 3 | NULL | NULL | NULL | NULL |
| 33 | NULL | 2 | NULL | NULL | NULL | NULL |
| 32 | NULL | 1 | NULL | NULL | NULL | NULL |
| 31 | NULL | 3 | NULL | NULL | NULL | NULL |
| 30 | NULL | 2 | NULL | 1537 | 1536 | 1535 |
| 29 | NULL | 1 | NULL | 1529 | 1528 | 1527 |
| 28 | NULL | 3 | NULL | 1513 | 1512 | 1511 |
| 27 | NULL | 2 | NULL | 1505 | 1504 | 1503 |
| 26 | NULL | 1 | NULL | 1481 | 1480 | 1479 |
| 25 | NULL | 3 | NULL | 1457 | 1456 | 1455 |
| 24 | NULL | 2 | NULL | 1425 | 1424 | 1423 |
| 23 | NULL | 1 | NULL | 1377 | 1376 | 1375 |
| 22 | NULL | 3 | NULL | 1329 | 1328 | 1327 |
| 21 | NULL | 2 | NULL | 1281 | 1280 | 1279 |
| 20 | NULL | 1 | NULL | 1225 | 1224 | 1223 |
+----+------+------+------+-------+---------+---------+
20 rows in set (1.01 sec)
请注意,这些计时是针对myisam表的;我将把它留给其他人在innodb上进行计时。
但是使用PostgreSQL,在一个相似但不相同的数据集上,我们得到了相似的时间
where
涉及的谓词
起源
专栏:
postgres=# select (select count(*) from parent) as parent_count, (select count(*)
from child) as child_count;
parent_count | child_count
--------------+-------------
12289 | 1536
postgres=# select * from parent_top_3a where id >= 20 and id < 40;
id | a | b | c | maxid | maxidm1 | maxidm2
----+---+----+---+-------+---------+---------
20 | | 18 | | 1464 | 1462 | 1461
21 | | 88 | | 1463 | 1460 | 1457
22 | | 72 | | 1488 | 1486 | 1485
23 | | 13 | | 1512 | 1510 | 1509
24 | | 49 | | 1560 | 1558 | 1557
25 | | 92 | | 1559 | 1556 | 1553
26 | | 45 | | 1584 | 1582 | 1581
27 | | 37 | | 1608 | 1606 | 1605
28 | | 96 | | 1607 | 1604 | 1601
29 | | 90 | | 1632 | 1630 | 1629
30 | | 53 | | 1631 | 1628 | 1625
31 | | 57 | | | |
32 | | 64 | | | |
33 | | 79 | | | |
34 | | 37 | | | |
35 | | 60 | | | |
36 | | 75 | | | |
37 | | 34 | | | |
38 | | 87 | | | |
39 | | 43 | | | |
(20 rows)
Time: 91.139 ms