实体框架 (4.2) HasRequired 导致意外的 LEFT OUTER JOIN

人气:999 发布:2022-10-16 标签: include entity-framework linq dbcontext

问题描述

似乎实体框架(来自 NuGet 的最新版本)在为导航属性构造连接而不是定义的第一个连接时可能会忽略 HasRequired 配置.

It appears that the Entity Framework (latest version from NuGet) may be ignoring the HasRequired configuration when constructing joins for Navigation Properties other than the first one defined.

例如,给定一个具有以下配置的 POCO 对象(Person):

For example, given a POCO object (Person) with the following configuration:

var person = modelBuilder.Entity<Person>();
person.ToTable("The_Peoples");
person.HasKey(i => i.Id);
person.Property(i => i.Id).HasColumnName("the_people_id");
person.HasRequired(i => i.Address)
    .WithMany()
    .Map(map => map.MapKey("address_id"));
person.HasRequired(i => i.WorkPlace)
    .WithMany()
    .Map(map => map.MapKey("work_place_id"));

我正在尝试使用以下查询加载人员列表:

I'm attempting to load a list of people with the following query:

myContext.Set<People>()
    .Include(o => o.Address)
    .Include(o => o.WorkPlace);

实体框架生成以下查询:

Entity Framework generates the following query:

FROM  [dbo].[The_Peoples] AS [Extent1]
INNER JOIN [dbo].[The_Addresses] AS [Extent2] ON [Extent1].[address_id] = [Extent2].[address_id]
LEFT OUTER JOIN [dbo].[The_Work_Places] AS [Extent3] ON [Extent1].[work_place_id] = [Extent3].[work_place_id]

请注意,*The_Addresses* 表的连接是内连接(如预期的那样),但是,*The_Work_Places* 的后续连接是外连接.鉴于 Address 和 WorkPlace 属性都被标记为必需,我希望这两个连接都是内部连接.我还尝试使用 Required 属性标记 Address 和 WorkPlace 属性,但这没有效果.

Notice that the join to the *The_Addresses* table is an inner join (as expected), however, the subsequent join to the *The_Work_Places* is an outer join. Given that both the Address and WorkPlace properties are marked as required, I would expect both joins to be inner joins. I've also attempted marking the Address and WorkPlace properties with the Required attribute, but this had no effect.

这是一个错误还是我可能错误配置了什么?有什么建议吗?

Is this a bug or am I perhaps misconfiguring something? Suggestions?

推荐答案

您的模型配置是正确的,我认为这不是错误,而是设计行为,但我无法准确判断是什么设计.我还在此类查询中看到了该 SQL.只说几点:

Your model configuration is correct and I think it is not a bug but it is behaviour by design, but I cannot tell exactly what design. I've also seen that SQL in such queries. Only a few remarks:

您看到的查询并非特定于 EF 4.2.EF 4.1 和 EF 4.0 也会出现这种情况.但对于 EF 1 (.NET 3.5),不是.在 EF 1 中,每个 Include(也是第一个)都已映射到一个 LEFT OUTER JOIN,同样用于所需的关系.

The query you are seeing is not specific to EF 4.2. It would also occur for EF 4.1 and EF 4.0. But not for EF 1 (.NET 3.5). In EF 1 every Include, also the first, has been mapped to a LEFT OUTER JOIN, also for required relationships.

我认为不能说使用 INNER JOIN 对于所需的导航属性是正确的",而 LEFT OUTER JOIN 是错误的.从映射的角度来看,使用什么并不重要,因为数据库中的约束正确地表示了模型中的关系.对于必需的导航属性,数据库中的 FK 列不能为空,并且数据库中必须有一个约束,强制 FK 引用目标表中的现有行.如果是这种情况,每个 JOIN 都必须返回一行,无论您使用 INNER JOIN 还是 LEFT OUTER JOIN.

I think one cannot say that using an INNER JOIN is "correct" for required navigation properties and LEFT OUTER JOIN is wrong. From a mapping view point it doesn't matter what you use, given that the constraints in the database represent the relationships in the model correctly. For a required navigation property the FK column in the database must not be nullable and there must a constraint in the database which enforces that the FK refers to an existing row in the target table. If that is the case, every JOIN must return a row, no matter if you use INNER JOIN or LEFT OUTER JOIN.

如果模型和数据库在关系上不同步"会怎样?在这两种情况下基本上都会发生废话:如果您使用 LEFT OUTER JOIN 并且 FK 在数据库中是 NULL 或引用不存在的行,您将获得一个实体其中导航属性为 null,违反了需要该属性的模型定义.使用 INNER JOIN 并没有更好:你根本不会得到任何实体,查询结果至少与 LEFT OUTER JOIN 的结果一样错误,如果不会更糟.

What happens if model and database is "out of sync" regarding the relationships? Basically nonsense happens in both cases: If you use a LEFT OUTER JOIN and the FK is NULL in the DB or refers to a not existing row, you'd get an entity where the navigation property is null, violating the model definition that the property is required. Using an INNER JOIN is not better: You'd get no entity at all, a query result which is at least as wrong as the result with the LEFT OUTER JOIN, if not worse.

所以,我认为 .NET 4 中对某些 Include 使用 INNER JOIN 的更改不是因为 EF 1 中的 SQL错误,但要创建更好、性能更高的 SQL.此更改实际上引入了一项重大更改,因为某些查询现在返回的结果与 EF 1 中不同:http://thedatafarm.com/blog/data-access/ef4-break-change-ef4-inner-joins-affect-eager-loading-many对多/

So, I think the change in .NET 4 to use INNER JOINs for some Includes has been made not because the SQL in EF 1 was wrong but to create better and more performant SQL. This change actually introduced a breaking change in that some queries returned other results now than they did in EF 1: http://thedatafarm.com/blog/data-access/ef4-breaking-change-ef4-inner-joins-affect-eager-loading-many-to-many/

我的理解是这个问题已经修复了,原因是在太多的情况下,INNER JOINs 已经在 EF 4 中引入了急切加载.(也许在这个阶段(EF 4 的 beta/release 候选)您的查询将有两个 INNER JOIN.)EF 团队对该问题的答复:http://connect.microsoft.com/VisualStudio/feedback/details/534675/ef4-include-method-returns-different-results-than-ef1-include(我的粗体突出显示):

My understanding is that this has been fixed and that the reason was that INNER JOINs in too many situations have been introduced for eager loading in EF 4. (Perhaps in this phase (beta/release candidate for EF 4) your query would have had two INNER JOINs.) The reply to that problem from the EF team: http://connect.microsoft.com/VisualStudio/feedback/details/534675/ef4-include-method-returns-different-results-than-ef1-include (bold highlight from me):

我们正在解决 .net 4 RTM 的问题.这是一个意外突破性的变化.我们并没有在每个离开的地方做出预期的改变由 Include 产生的外连接在 .Net 4 中变成了内连接. 但是而是优化着眼于 EF 元数据中的约束并尝试转换那些可以安全的左外连接根据约束转换为内部连接.我们在我们根据导致的约束进行推理的代码在比约束所暗示的更积极的转换中.我们已经缩减了优化,以便我们转换左外连接仅在我们绝对确定可以的地方进行内部连接根据约束来做.我们认为我们可以改进这一点以后再优化一点.您将开始看到更多与 RC 和 Beta 相比,RTM 中某些查询的左外连接2 但在大多数情况下,这是返回正确结果所必需的.

We are fixing the issue for .net 4 RTM. This was an unintended breaking change. We did not make an intended change where every left outer join produced by an Include became an inner join in .Net 4. But rather the optimization looked at the constraints in the EF metadata and tried to convert those left outer joins which could be safely converted to inner joins based on the constraints. We had a bug in the code where we were reasoning based on the constraints which resulted in more aggressive conversion than what the constraints implied. We have scaled back the optimization so that we convert left outer joins to inner joins only in the places where we are absolutely sure we can do it based on the constraints. We think we can improve this optimization a little more in the future. You will start seeing more left outer joins for some queries in RTM when compared to RC and Beta 2 but in most of these cases this is needed to return correct results.

因此,EF 4 的最终版本显然重新引入了更多 LEFT OUTER JOIN(与 beta/候选发布版本相比)以避免出现类似的重大更改.

So, the final release for EF 4 apparently reintroduced some more LEFT OUTER JOINs (compared to beta/release candidate) to avoid a breaking change like that.

抱歉,这更像是一个历史故事,而不是一个真实的解释,为什么你会得到一个 INNER JOIN,然后是一个 LEFT OUTER JOIN.如前所述,以这种方式编写查询并没有错——因为使用两个 INNER JOIN 或两个 LEFT OUTER JOIN 也没有错.我想只有 EF 团队才能准确解释为什么您的查询会产生特定的 SQL.

Sorry, this is more a historical story than a real explanation why you get an INNER JOIN and then a LEFT OUTER JOIN. As said, it is not wrong to write the query this way - as it wouldn't be wrong to use two INNER JOINs or two LEFT OUTER JOINs. I guess that only the EF team can explain exactly why your query produces that specific SQL.

我建议 - 如果您没有遇到严重的性能问题 - 不要担心那个 SQL(因为您得到的结果毕竟是正确的)并继续.不喜欢 EF 创建的 SQL 最终会导致编写大量功能和更改请求,或者编写大量原始 SQL 查询或完全放弃 EF.

I'd recommend - if you don't experience serious performance problems - not to worry about that SQL (since the result you get is correct after all) and proceed. Not liking the SQL which EF creates ends up in writing either a lot of feature and change requests or in writing a lot of raw SQL queries or in abandoning EF at all.

803