MySQL - 选择不在分组依据中的列

2022-01-30 group-by mysql

我正在尝试将功能添加到预先存在的应用程序中,我遇到了一个类似这样的 MySQL 视图:

I'm trying to add features to a preexisting application and I came across a MySQL view something like this:

SELECT
     AVG(table_name.col1),
     AVG(table_name.col2),
     AVG(table_name.col3),
     table_name.personID,
     table_name.col4
FROM table_name
GROUP BY table_name.personID;

好的,所以有一些聚合函数.您可以选择 personID,因为您正在按它进行分组.但它也会选择不在聚合函数中且不属于 GROUP BY 子句的列.这怎么可能???它只是选择一个随机值,因为每个组的值肯定不是唯一的吗?

OK so there's a few aggregate functions. You can select personID because you're grouping by it. But it also is selecting a column that is not in an aggregate function and is not a part of the GROUP BY clause. How is this possible??? Does it just pick a random value because the values definitely aren't unique per group?

我来自哪里(MSSQL Server),这是一个错误.有人可以向我解释这种行为以及为什么它在 MySQL 中是允许的吗?

Where I come from (MSSQL Server), that's an error. Can someone explain this behavior to me and why it's allowed in MySQL?

推荐答案

确实,这个特性允许一些不明确的查询,并且默默地返回一个从该列中选择的任意值的结果集.在实践中,它往往是首先物理存储的组内行中的值.

It's true that this feature permits some ambiguous queries, and silently returns a result set with an arbitrary value picked from that column. In practice, it tends to be the value from the row within the group that is physically stored first.

如果您只选择在功能上依赖于 GROUP BY 条件中的列的列,则这些查询不会有歧义.换句话说,如果定义组的每个值只能有一个模糊"列的不同值,则没有问题.此查询在 Microsoft SQL Server(和 ANSI SQL)中是非法的,即使它在逻辑上不会导致歧义:

These queries aren't ambiguous if you only choose columns that are functionally dependent on the column(s) in the GROUP BY criteria. In other words, if there can be only one distinct value of the "ambiguous" column per value that defines the group, there's no problem. This query would be illegal in Microsoft SQL Server (and ANSI SQL), even though it cannot logically result in ambiguity:

SELECT AVG(table1.col1), table1.personID, persons.col4
FROM table1 JOIN persons ON (table1.personID = persons.id)
GROUP BY table1.personID;

此外,MySQL 有一个 SQL 模式,使其符合标准:ONLY_FULL_GROUP_BY

Also, MySQL has an SQL mode to make it behave per the standard: ONLY_FULL_GROUP_BY

FWIW,SQLite 也允许这些模棱两可的 GROUP BY 子句,但它从组中的 最后 行中选择值.

FWIW, SQLite also permits these ambiguous GROUP BY clauses, but it chooses the value from the last row in the group.

至少在我测试的版本中.任意意味着 MySQL 或 SQLite 将来可能会改变它们的实现,并有一些不同的行为.因此,在这种模棱两可的情况下,您不应依赖当前的行为方式.最好将您的查询重写为确定性而不是模棱两可.这就是 MySQL 5.7 现在默认启用 ONLY_FULL_GROUP_BY 的原因.

At least in the version I tested. What it means to be arbitrary is that either MySQL or SQLite could change their implementation in the future, and have some different behavior. You should therefore not rely on the behavior staying they way it is currently in ambiguous cases like this. It's better to rewrite your queries to be deterministic and not ambiguous. That's why MySQL 5.7 now enables ONLY_FULL_GROUP_BY by default.

相关文章

MySQL实战

2023-07-01 mysql 实战