-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[batch] A procedure to rename job_groups_cancelled.id -> job_groups_cancelled.batch_id #14672
base: main
Are you sure you want to change the base?
[batch] A procedure to rename job_groups_cancelled.id -> job_groups_cancelled.batch_id #14672
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Ivan! Thanks so much for picking this up :) I haven't much experience on the batch system but I'll try my best to give accurate feedback. I have a few questions/observations up front:
- Your changes to stored procedures under
batch/sql
make me a little nervous.
Most of these are migrations applied in the order defined in the build step mentioned in [NOTE 1] except estimated-current.sql
[NOTE 2].
I don't think changing these will have the desired effect and may make it impossible for someone to reproduce the database.
The only changes to existing sql you'll need to make are in the sql strings in python code.
- This needs to be written as a migration and maybe could be simplified?
I think this needs to be done as a database migration. We'll have no need for a stored procedure once complete.
You can assume current columns and constraints exist, dispense with the error checking and simplify.
Can you convert this to a sql script and add it to the end of the list of migrations in build.yaml
? You'll probably want online: false
too.
I fear you'll have to take inspiration from rename-job-groups-tables.sql
by applying one ALTER TABLE
command then drop and recreate EVERYTHING that references that name (constraints, triggers, procedures etc).
This will likely involve copy+paste and rename.
Alternatively, create, execute then drop the procedure within rename-job-groups-cancelled
.
[NOTE 1] migration applied in build.yaml
The relevant build step in build.yaml
can be found by searching for the entry starting with the yaml below. This controls which migrations are applied and in what order.
kind: createDatabase2
name: batch_database
databaseName: batch
[NOTE 2] estimated-current.yaml
I don't agree with why we have this. It would be nice to generate this automatically. Anyway, please keep your changes to this file as it's meant for documentation purposes only. None of it is applied and who knows how much of it works.
…efactored sql for simplification.
Got it! I wasn't sure how Hail usually does schema update. Based on your above description the process becomes clearer ro me. Here's my second try:
Do you mean |
@ehigham Also another question is how does the schema update enforce certain order of operations. The |
Yes, sorry for the confusion
Migrations are applied successively. You cannot edit a previous migration or the order in which they're applied as they've already been applied to the production database.
I think you need to find any trigger or stored procedure that references that column, drop it and recreate it with the field renamed. It's a little scary. |
@ehigham Thanks for your comments above. I’ve added the triggers and stored procedures referencing the I was initially confused by I just have one question: Do we need to manually update |
Yes. |
I think you missed reference in the python function |
…cluding those referenced by alias.
Good catch! I’ve fixed it and also updated the |
Still seeing this error in the deploy_batch job: utils.py retry_long_running:923 in delete_prev_cancelled_job_group_cancellable_resources_records
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/hailtop/utils/utils.py", line 915, in retry_long_running
return await f(*args, **kwargs)\n File "/usr/local/lib/python3.9/dist-packages/hailtop/utils/utils.py", line 959, in loop
await f(*args, **kwargs)\n File "/usr/local/lib/python3.9/dist-packages/batch/driver/main.py", line 1485, in delete_prev_cancelled_job_group_cancellable_resources_records
async for target in targets:\n File "/usr/local/lib/python3.9/dist-packages/gear/database.py", line 334, in execute_and_fetchall
async for row in tx.execute_and_fetchall(sql, args, query_name):\n File "/usr/local/lib/python3.9/dist-packages/gear/database.py", line 257, in execute_and_fetchall
await cursor.execute(sql, args)\n File "/usr/local/lib/python3.9/dist-packages/aiomysql/cursors.py", line 239, in execute
await self._query(query)\n File "/usr/local/lib/python3.9/dist-packages/aiomysql/cursors.py", line 457, in _query
await conn.query(q)\n File "/usr/local/lib/python3.9/dist-packages/aiomysql/connection.py", line 469, in query
await self._read_query_result(unbuffered=unbuffered)\n File "/usr/local/lib/python3.9/dist-packages/aiomysql/connection.py", line 683, in _read_query_result
await result.read()\n File "/usr/local/lib/python3.9/dist-packages/aiomysql/connection.py", line 1164, in read
first_packet = await self.connection._read_packet()\n File "/usr/local/lib/python3.9/dist-packages/aiomysql/connection.py", line 652, in _read_packet
packet.raise_for_error()\n File "/usr/local/lib/python3.9/dist-packages/pymysql/protocol.py", line 219, in raise_for_error
err.raise_mysql_exception(self._data)\n File "/usr/local/lib/python3.9/dist-packages/pymysql/err.py", line 150, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.OperationalError: (1054, "Unknown column 'cancelled.id' in 'on clause'") |
This error strikes me as odd because Based on the error, it looks like the |
The
rename_job_groups_cancelled_column
sql file renames thejob_groups_cancelled.id
column tojob_groups_cancelled.batch_id
. The sql also updates all constraints that reference the original column to reflect the new column name.I have reviewed other tables and found no foreign keys referencing the
job_groups_cancelled
table.All queries that previously used
job_groups_cancelled.id
have been updated to referencejob_groups_cancelled.batch_id
accordingly.Resolve #14646