Some languages, such as Java, remove null properties from an object when serializing it into JSON. For Java specifically, this is because the default behavior of Gson is to ignore null values, and the Java worker currently doesn't allow customizing the serialization settings (see the sketch after the error output below).

The result is that you can end up getting an error like this when trying to upsert rows from an output binding:
```
Exception executing query. Message=Invalid column name 'Name'.
Invalid column name 'Name'.
Query=WITH cte AS ( SELECT * FROM OPENJSON(REPLACE(@rowData, N'/', N'\/')) WITH ([Id] int) )
MERGE INTO [dbo].[Customer] WITH (HOLDLOCK)
AS ExistingData
USING cte
AS NewData
ON
ExistingData.[Id] = NewData.[Id]
WHEN MATCHED THEN
UPDATE SET ExistingData.[Name] = NewData.[Name]
WHEN NOT MATCHED THEN
INSERT ([Name]) VALUES ([Name]);
Executed 'Functions.UpdateCustomers' (Failed, Id=dfb6f55f-fbf3-4de7-b6a1-e9c2932105cb, Duration=49017ms)
System.Private.CoreLib: Exception while executing function: Functions.UpdateCustomers. Microsoft.Azure.WebJobs.Host: Error while handling parameter _binder after function returned:. Microsoft.Azure.WebJobs.Extensions.Sql: Unexpected error upserting rows. Core Microsoft SqlClient Data Provider: Invalid column name 'Name'.
Invalid column name 'Name'.
```
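To make the root cause concrete, here's a minimal, self-contained sketch of the default Gson behavior described above. It isn't taken from the sample project; the `Customer` class is just a hypothetical stand-in for whatever POJO the function returns.

```java
// Minimal sketch (not from the sample project): Gson's default behavior drops null
// properties, so two instances of the same class can serialize to JSON objects with
// different schemas.
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class GsonNullDemo {
    // Hypothetical POJO mirroring the Customer table used in the repro.
    static class Customer {
        Integer Id;
        String Name;

        Customer(Integer id, String name) {
            this.Id = id;
            this.Name = name;
        }
    }

    public static void main(String[] args) {
        Gson defaultGson = new Gson();
        // Name is non-null, so it shows up in the JSON.
        System.out.println(defaultGson.toJson(new Customer(1, "Alice"))); // {"Id":1,"Name":"Alice"}
        // Name is null, so Gson silently drops the property.
        System.out.println(defaultGson.toJson(new Customer(2, null)));    // {"Id":2}

        // Gson itself can be told to keep nulls, but the Java worker doesn't currently
        // expose a way to configure this for output binding serialization.
        Gson keepNulls = new GsonBuilder().serializeNulls().create();
        System.out.println(keepNulls.toJson(new Customer(2, null)));      // {"Id":2,"Name":null}
    }
}
```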
Here is a sample project demonstrating this behavior: java_customer.zip
Create the table using the script in the zip, update the connection string to point to your server, then run the project and hit the function endpoint.
While allowing the serialization settings to be overridden would solve this, I believe we should be able to handle it in our code as well. The issue happens because we combine two separate parts into the query that ends up throwing. The first part is the "merge or insert query", which is generated from the first row of the data sent in:

azure-functions-sql-extension/src/SqlAsyncCollector.cs
Line 264 in a47be4d

The second part is the data passed in. This is updated for each batch, and is again generated based on the first row of data in the batch:

azure-functions-sql-extension/src/SqlAsyncCollector.cs
Line 281 in a47be4d
The issue is the disconnect that happens when objects in the data have different schemas - in this case some objects have the Name field (because they have a name) and some don't (because the name is null). Depending on where this data shows up within the batch, this can result in the error above.
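As a rough illustration of that disconnect (a sketch only, reusing the hypothetical JSON shapes from above rather than the extension's actual C# logic):

```java
// Sketch of how mixed schemas in the serialized data line up with the error above.
// Not the extension's code (that is C#); this just shows the JSON shapes involved.
import com.google.gson.Gson;
import java.util.List;
import java.util.Map;

public class SchemaMismatchDemo {
    public static void main(String[] args) {
        Gson gson = new Gson();

        // Suppose the very first row of the data has a Name, so the generated MERGE
        // statement references both [Id] and [Name]...
        String firstRowOfAllData = "{\"Id\":1,\"Name\":\"Alice\"}";

        // ...but a later batch happens to start with a row whose Name was null and was
        // therefore dropped during serialization:
        String batchJson = "[{\"Id\":7},{\"Id\":8,\"Name\":\"Bob\"}]";

        // The OPENJSON WITH column list for that batch is derived from its first row,
        // which only has [Id]. The MERGE statement still refers to NewData.[Name], and
        // SQL Server reports: Invalid column name 'Name'.
        List<Map<String, Object>> batch = gson.fromJson(batchJson, List.class);
        System.out.println("Columns in first row of batch: " + batch.get(0).keySet());   // [Id]
        System.out.println("Columns the MERGE statement expects: "
                + gson.fromJson(firstRowOfAllData, Map.class).keySet());                  // [Id, Name]
    }
}
```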
So we could see if there's anything we can do here to handle the issue - possibly by detecting schema mismatches between batches and, for any fields that are missing, just using the existing value in the row. But I'm wary of doing much here: the extension is currently written to expect that all rows in a batch have the same schema, so changing that would require deeper investigation and discussion of the desired behavior before it's attempted.
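Purely as a sketch of what the "detect schema mismatches" idea might look like (hypothetical - the extension does nothing like this today, and its real implementation would be C#, not Java):

```java
// Hypothetical sketch of detecting a schema mismatch within a batch before upserting.
// This is NOT something the extension currently does.
import com.google.gson.Gson;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SchemaCheckSketch {
    public static void main(String[] args) {
        Gson gson = new Gson();
        String batchJson = "[{\"Id\":7},{\"Id\":8,\"Name\":\"Bob\"}]";
        List<Map<String, Object>> rows = gson.fromJson(batchJson, List.class);

        // Columns implied by the first row (what the extension currently keys off).
        Set<String> firstRowColumns = rows.get(0).keySet();

        // Union of columns across the whole batch.
        Set<String> allColumns = new LinkedHashSet<>();
        rows.forEach(r -> allColumns.addAll(r.keySet()));

        if (!allColumns.equals(firstRowColumns)) {
            // A real mitigation would have to decide what to do here, e.g. fall back to
            // the existing value in the table for any missing fields, as speculated
            // above. Detecting the mismatch is the easy part.
            System.out.println("Schema mismatch: first row has " + firstRowColumns
                    + " but the batch as a whole uses " + allColumns);
        }
    }
}
```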
Workarounds
The only known workaround currently is to use a different language that doesn't have this issue with serialization. I've tested C# and verified that it works, but other languages may work as well.