Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Conditional insert #578

Open
tsanton opened this issue Apr 29, 2024 · 23 comments
Open

Feat: Conditional insert #578

tsanton opened this issue Apr 29, 2024 · 23 comments
Assignees

Comments

@tsanton
Copy link

tsanton commented Apr 29, 2024

One of the, in my option, larger limitations with EF at the moment is the lack for conditional singleton inserts.

My current case is as follows: I'm allowing users to manipulate a history table, but with certain limitations.
For instance I will allow them to create a new statuses, but that status can't be backdated with "valid_from" <= min(valid_from) where entity was created (status == 'created').

As of now I have to look up the entity (or run an .Any() with a predicate), and then insert if it passes the predicate, whereas I'd much rather just fire off IQueryable.Where(prediates).ConditionalInsertAsync(Entity) and return the count from the output to see if one went in or if 0 inserted (and then return conditional responses based on the feedback).

In terms of design (at least for Postgres) I'm thinking something along these lines:

CREATE TEMP TABLE proof_of_concept (
    id UUID PRIMARY KEY,
    name TEXT,
    age INTEGER,
    created TIMESTAMP WITHOUT TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

--Empty table
select * from proof_of_concept;

--Initial insert
INSERT INTO proof_of_concept
select gen_random_uuid() id, 'Al Coholic' name, 10 age, timezone('utc', now()) created;

--Al made his way into the model
select * from proof_of_concept;

--Do not insert because the new age (9) is lesser than his his current max age.
with data as(
    select 'Al Coholic' name, 9 age
)

INSERT INTO proof_of_concept
select gen_random_uuid() id, data.name, data.age, timezone('utc', now()) created from data
where data.age > (select max(poc.age) from proof_of_concept poc where name = data.name)
RETURNING *;

--As we can see, both from RETURNING and select: no insert
select * from proof_of_concept;

--Do insert because the new age (11) is greater than his his current max age (10)
with data as(
    select 'Al Coholic' name, 11 age
)

INSERT INTO proof_of_concept
select gen_random_uuid() id, data.name, data.age, timezone('utc', now()) created from data
where data.age > (select max(poc.age) from proof_of_concept poc where name = data.name)
RETURNING *;

--And here we have it: conditional insert
select * from proof_of_concept;

I'm posting the suggestion here firstly because I think a lot of the required pipework for this extension already exist within the existing code base. Further I think it's a killer extension that I'm somewhat perplex that I can't find an implementation for -> it surely would save a lot of time and boilerplate code.

I can also say that though it's on the EF core radar (here) I would not put money on it making the EF core 9 cut. Nor is it completely clear to me if the design supports the conditional bit.

Hoping to hear back from you and I'd be happy to help with other SQL-provider syntax research or whatever you feel you might need in order to get this into either extension or plus!

/T

@JonathanMagnan JonathanMagnan self-assigned this Apr 29, 2024
@JonathanMagnan
Copy link
Member

JonathanMagnan commented Apr 29, 2024

Hello @tsanton ,

Here are a few questions to better understand your requirement

  1. Are you using PostgreSQL?
  2. Are you looking to provide a list of entities to insert or looking to insert from existing entities (already in the database)? Such as what InsertFromQuery does?
  3. Is an option like InsertStagingTableFilterFormula that will allow you to write your own SQL (for the filtering part in the WHERE statement) and filter data before they get inserted enough for you?
    • NOTE: This option works for SQL Server but not yet for PostgreSQL

Let me know more about your scenario. Depending on your answer, we might already be very close to supporting it.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented Apr 29, 2024

Hi @JonathanMagnan!

I'm on Postgres, yes.
I would like to utilise this in place of EF.AddAsync (so not from query, but single parameterised insert from code). I am agnostic to how this is solved on the backend, with a CTE/temp-table with insert from query, or parameterized select :)

I'm already running this pattern for updates and deletes. Here is how I implement a conditional delete of "SomeEntity" based on the ValidFrom queryable predicate

var query = from ua in context.SomeRandomEntity
    let minValidFrom = (
        from x in context.SomeRandomEntity
        where x.TenantId == tenantId && x.SomeId == pred.SomeId
        select x.ValidFrom
    ).Min()
    where ua.TenantId == tenantId && ua.SomeId == pred.SomeId && ua.Id == pred.Id && ua.ValidFrom > minValidFrom
    select ua;

var strategy = context.Database.CreateExecutionStrategy();
return await strategy.ExecuteAsync(async () =>
{
    await using var transaction = await context.Database.BeginTransactionAsync(ct);
    var deleted = await query.ExecuteDeleteAsync(ct);
    if (deleted != 1)
    {
        await transaction.RollbackAsync(ct);
        return false;
    }
    await transaction.CommitAsync(ct);
    return true;
});

@JonathanMagnan
Copy link
Member

Thank you for the additional information.

We will work on it.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented May 27, 2024

Hi again @JonathanMagnan, hope your enjoyed your vacation!

Just wondering if there are any news on this subject and if/when one hopefully can expect to see it live? :)

/T

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

My vacation was great; it was the best one I've had so far!

My developer provided me with a fix; the code is currently under code review.

If the code is accepted, the fix will be deployed on June 11 or June 18.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented May 28, 2024

@JonathanMagnan that is fantastic news (both that your vacation was a blast and that the feature is rolling through)!

I'll order the champaign and schedule in a tenative refactoring session on my side :)

Keep up the good work and godspeed on your upcoming June deployment; can't wait!

/T

@JonathanMagnan
Copy link
Member

JonathanMagnan commented Jun 12, 2024

Hello @tsanton ,

I just want to confirm that the code has been merged.

The only problem with this option at this moment is we haven't succeeded in making it compatible with the option InsertIfNotExists (For PostgreSQL), but since you don't use it, that will not be a problem on your side.

We are still targeting June 18 for our next release 🍾 🥂

Best Regards,

Jon

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

The v8.103.0.0 has been finally released.

In this version, we added the support to InsertStagingTableFilterFormula.

Here is an example:

context.BulkInsert(list, option => { 
	option.InsertStagingTableFilterFormula = "\"ColumnInt\" > 10"; 
});

Let me know if that option works correctly for you or if you need help to implement it.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented Jun 23, 2024

Hi @JonathanMagnan and sorry for the late reply -> have been busy elsewhere in my backlog so have not had the time to look into it before now.

So I do fear we might have talked past each other as my need is for single entity inserts. Here is the pattern that I'm trying to do away with:

/// <summary>
/// You can only add a new status if the new ValidFrom is greater than the latest ValidFrom
/// </summary>
public override async Task<int> AddAsync(Guid tenantId, LeaseStatusEntity entity, CancellationToken ct = new())
  {
      var context = await _factory.GetDbContext(tenantId);
      
      var maxDate = await context.Status.AsNoTracking()
          .Where(x => x.TenantId == tenantId && x.LeaseId == entity.LeaseId)
          .MaxAsync(x => (DateTime?)x.ValidFrom, ct);

      if (maxDate.HasValue && entity.ValidFrom < maxDate) return -1;
      
      await context.Status.AddAsync(entity, ct);
      return await context.SaveChangesAsync(ct);
  }

Which for me translates into one of the following SQL queries:

insert into XXXStatus (tenant_id, lease_id, status_id, status, valid_from.......)
select 
    '<UUID1>', '<UUID2>', '<UUID3>, 'ACTIVE', '2000-01-01', ......
where not exists (
    select 1 from XXXStatus where tenant_id = '<UUID1>' LeaseId = '<UUID2>' and valid_from > '2000-01-01'
);

or it can take the form of

insert into XXXStatus (tenant_id, lease_id, status_id, status, valid_from.......)
select 
    '<UUID1>', '<UUID2>', '<UUID3>, 'ACTIVE', '2000-01-01', ......
where valid_from > (select min(valid_from) from XXXStatus where tenant_id = '<UUID1>' LeaseId = '<UUID2>' and valid_from > '2000-01-01');

Not complete sure how I can achieve this with the InsertStagingTableFilterFormula 🤔

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

Thank you for providing the code.

Here is an example using a similar entity as you:

context.BulkInsert(list, option =>
{
	option.InsertStagingTableFilterFormula = $"\"{nameof(EntitySimple.ValidFrom)}\" > (SELECT MAX(\"{nameof(EntitySimple.ValidFrom)}\") FROM \"EntitySimples\" AS X WHERE X.\"{nameof(EntitySimple.TenantID)}\" = StagingTable.\"{nameof(EntitySimple.TenantID)}\" AND X.\"{nameof(EntitySimple.LeaseID)}\" = StagingTable.\"{nameof(EntitySimple.LeaseID)}\")";
});

In this example, I was able to create an SQL statement similar to the one you provided. I just changed the min for max as I believe this is the behavior you want.

Looking at your requirement, I will recommend you instead to use a combo of some other options:

context.BulkInsert(list, option => {
	option.InsertIfNotExists = true;
	option.ColumnPrimaryKeyExpression = x => new { x.TenantID, x.LeaseID };
	option.InsertPrimaryKeyAndFormula = $"DestinationTable.\"{nameof(EntitySimple.ValidFrom)}\" > StagingTable.\"{nameof(EntitySimple.ValidFrom)}\"";
});

This code will probably be easier to read and maintain. The entity will only be inserted if a row with a more recent date (for the same TenantID and LeaseID) doesn't already exist.

Let me know if that makes sense.

Best Regards,

Jon

@JonathanMagnan
Copy link
Member

Hello @tsanton,

Since our last conversation, we haven't heard from you.

Let me know if you need further assistance.

Best regards,

Jon

@tsanton
Copy link
Author

tsanton commented Jul 2, 2024

Hi @JonathanMagnan and sorry for the late reply: took me a few days to get around to one of these issues.

So I currently have this code working like a charm. I must say it's not the prettiest I've ever written, but this insert patterns sure has one major advantage: no foreign key constraint issues when trying to exist a non-existing entity due to my model setup and the valid_from > null predicate (which is very nice)

var context = await _factory.GetDbContext(tenantId);
        
var resultInfo = new Z.BulkOperations.ResultInfo();
        
await context.BulkInsertAsync([entity], options =>
{
    options.UseRowsAffected = true;
    options.ResultInfo = resultInfo;
    options.InsertStagingTableFilterFormula = $"valid_from > (SELECT MAX(valid_from) FROM {XXXConfig.SchemaName}.{YYYEntityConfiguration.TableName} AS X WHERE X.tenant_id = StagingTable.tenant_id AND X.ZZZ_id = StagingTable.ZZZ_id)";
}, ct);

return resultInfo.RowsAffectedInserted;

Though I am happy with the performance of it, as you see it's not refactoring proof. As you may gather EF is configured to use snake_case naming. That makes the NameOf(entity.TenantId) option not viable -> do you know of any good way to reference the snake_cased variable name from the EFcontext (or some other equivalent way of achieving this)?

In terms of functionality: superb.
Ease of use: not so much.

May I ask if it's possible to request extension methods on an IQueryable such as IQueryable.InsertWhereExists(entity) and IQueryable.InsertWhereNotExists(entity) which creates an insert into and appends a where <not> exists( <select from IQueryable>). This would be a pure single instance insert which covers most (all?) of my transactional use cases and would enhance ease of use in comparison to the solution above.

Don't get me wrong: I'm a happy camper and keeping the code as is, but I would like my juniors to 1) not to make fun of me and 2) be able to repeat this pattern without intimate SQL understanding.

As always: many thanks and great delivery speed!

/T

@JonathanMagnan
Copy link
Member

JonathanMagnan commented Jul 3, 2024

Hello @tsanton ,

I'm happy to hear that you succeeded in making it work.

Regarding your point about being "easy to use," for a basic scenario, we have our InsertIfNotExists option, as I have shown in my second example. However, your case is not considered basic because you must also look at the maximum date.

Thank you for your suggestion about the IQueryable.InsertWhereNotExists(entity). It makes indeed scenario that is more complex like your very easy to use, I now understand your point about this part. That could be actually something possible. We will look at it probably at the beginning of next month to see if that is something we could add in the future.

do you know of any good way to reference the snake_cased variable name from the EFcontext

We don't have any public method that currently allows this, but we will look at it as well as we store this information within our library.

It's possible to do it by getting the entity type from the model and then finding the right property, and finally call after the GetColumnName method. However, it looks a little bit complicated for what you need to do.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented Aug 25, 2024

Hi @JonathanMagnan, did you guys ever look into IQueryable.InsertWhereNotExists(entity) & IQueryable.InsertWhereExists(entity)? Would love to have this with a T? or int return where T? is your entity (maybe enriched with server side generated ids if people do that) or a "inserted" int for "how many records inserted".

Looking forward to hearing back from you!

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

In the past two weeks, we made a lot of progress on this good idea you provided. It's now one of our few priorities. So far, we really like the current direction and the potential it will provide to our library if we succeed in supporting it correctly.

We already have a working version, but some cases are currently limited, so we are working on it.

I will be able to provide a better update at the end of September (or perhaps before). As said, it has now become one of our priorities, so a lot of time will be put into this feature to make it happen, but I cannot promise anything yet.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented Aug 25, 2024

That's great news! It will add something that's very much missing from EF and that (to my knowledge) nobody else are providing so I think it's a smart choice! Looking forward to hearing more about this -> I'll drop by this issue come end of September to ping for updates if I don't hear anything before that :)

Good luck!

/T

@tsanton
Copy link
Author

tsanton commented Sep 24, 2024

Howdy @JonathanMagnan! A bit early for end of September, but I though I'd drop by and hear how it's going with the feature? :) Any significant progress/blockers, and if not: maybe an ETA?

Speak soon!

/T

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

This is pretty much the same status as last time. I was traveling for the first three weeks of September, so it was too hard to focus on a more complex JIRA like this one.

I will soon start to focus on the code my developer made. Normally, this process goes really fast as I take only one priority at a time. We are currently completing one of our priorities this week, and after this, we will re-evaluate to choose if this request become our first priority.

Best Regards,

Jon

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

Just to give you an update we resumed our work for the InsertWhereNotExists feature. We have already added a few improvements today. Next week, I should be able to give you a better update but that's certainly now a JIRA we will push to complete very fast.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented Oct 1, 2024

Great news @JonathanMagnan!

May I just request that we have it both ways: InsertWhereNotExists and InsertWhereExists -> I guess we could also just invert the predicate query, but it might make it easier for people not as savvy in SQL :)

I'm also hoping you add a plain Insert(entity) API which does exactly that (AddAsync ripoff), because I think you can implement some really cool, efficient, and nice extensions onto the InsertXXX fluent APIs.

One I have in mind is .WithValueFromQuery:

public class MyEntityWithRevisionId
{
    public Guid Id { get; set; }
    public int Revision { get; set; }
}

var entity = new MyEntityWithRevisionId
{
    Id = Guid.NewGuid(),
    Revision = -1
};

IQueryable query = context.MyEntity.Where(x => x.Id == entity.Id));
Context.InsertAsync(entity).WithValueFromQuery(x => x.Revision, query.Max(x => x.Revision)); //Coalesce here too..

//or maybe you want to insert only if it's a sibling (also with a calculated field)
query.InsertWhereExists(entity).WithValueFromQuery(x => x.Revision, query.Max(x => x.Revision));

I'm guessing you have no problem seeing how that subquery can be substituted for a parameter during insert :)

I'll request this as a separate feature once this is done, but I just wanted to bounce it by you early on in case you liked it and it would help influence API design in a positive manner!

Speak soon!

/T

@JonathanMagnan
Copy link
Member

Hello @tsanton ,

A new version has been released today. In this version, we added two new methods:

  • WhereExistsBulkInsert
  • WhereNotExistsBulkInsert

With these two new methods come three new options dedicated to it:

  • QueryFilterPrimaryKeyExpression: Allows to specify a custom primary key to specify how to join the query with the provided list
  • QueryFilterPrimaryKeyNames: Same as above but using a List<string> of properties names.
  • QueryFilterPrimaryKeyAndFormula: Allow specific additional conditions about how to join to filter the provided list.

For example, one of the previous queries could be re-written like:

context.Customers.Where(x => x.IsActive).WhereNotExistsBulkInsert(list, options =>
{
   options.QueryFilterPrimaryKeyExpression = x => new { x.Id, x.TenantId };
   options.QueryFilterPrimaryKeyAndFormula = "StagingTable.valid_from > QueryFilter.valid_from";
});

We added a custom key but only for this query filter and a formula, as this is the only way to compare the valid_from part. I also added a Where clause just to show you can use it.

Let me know if that's close to what you where looking for


We are currently improving this features to allow filtering other methods (BulkUpdate, BulkMerge, BulkDelete). We believe it will come at the end of November

Once it will be completed, we will release an official documentation.

@JonathanMagnan
Copy link
Member

I will look again at your last message as I was about to answer, but it doesn't look like the discussion I had at all with my employee this morning.

However, I understand what you would like to do.

Best Regards,

Jon

@tsanton
Copy link
Author

tsanton commented Oct 24, 2024

As pr. mail -> this looks good and solves my request! Thanks 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants