Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: issues relating to "queued" moves in state mover with bMoveOk #205

Merged
merged 8 commits into from
Nov 2, 2023

Conversation

ZLLentz
Copy link
Member

@ZLLentz ZLLentz commented Oct 6, 2023

Description

  • If there is a move error, cut the execute output
  • Require a bExecute rising edge and bAllowMoves at the same time, or else no state moves

Motivation and Context

See https://jira.slac.stanford.edu/browse/ECS-3932
There are cases where "unsafe" moves can become marked as "safe" during execution, and then immediately start as if they were waiting to become "safe" rather than cancelling the move request altogether.

How Has This Been Tested?

I created a unit test that follows the following sequence:

  • Ask for a move to a bMoveOk = FALSE state
  • Confirm no move
  • Set bMoveOk := TRUE
  • Confirm no move
  • Reset the error
  • Confirm no move
  • Ask for a new move
  • Confirm yes move

I also tested this interactively, and it works as one would expect: you need to reset the error, wait for the move to become safe, and then explicitly ask for the move again. Here's a screenshot of the new updated error message:
image

Where Has This Been Documented?

Release notes only

Pre-merge checklist

  • Code works interactively
  • Test suite passes locally
  • Code contains descriptive comments
  • Libraries are set to Always Newest version (Library, *)
  • Committed with pre-commit or ran pre-commit run --all-files

@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 7, 2023

I'm a bit stumped as to the failure state of my test- at the end, the motor moves as I expect it to, but the function block reports that it is not busy. I'm not sure what I'm looking at, if it's a bug related to this code or unrelated or not a bug at all and I'm calling the code wrong.

@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 12, 2023

It turns out that my tests were failing simply because I wasn't waiting long enough to check the results. I feel a little bit silly.
Remaining TODO: verify that the unit test fails with the pre-patch version of the code.

@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 12, 2023

The tests pass without the patch, meaning that I've completely failed at some part of the dev process here.
I need to go back to interactive pre-patch testing to make sure I understood the issue properly.

@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 12, 2023

I'm currently taking this as a learning experience to experiment with and get better at writing these sorts of unit tests. Hopefully this is a time investment that will pay off later. I have not entirely figured out what is wrong, but it has something to do with extra moves being requested on my motor that I don't intend to make.

@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 13, 2023

This newest version of the test consistently passes without hitting the weird errors I was seeing, largely by being less "cycle-precise" than the previous iterations. That is, we're hitting buttons more like "humans" with human delays.

I think this means that some of these function blocks have race conditions, and I think some of these race conditions arise from the interaction of the NC process with the PLC process.

In addition, the test now fails how I expected it to fail with the pre-patch code.

@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 13, 2023

"The usual suspects" for code review are all at a conference right now- feel free to wait until next week to take a look.

@ZLLentz ZLLentz marked this pull request as ready for review October 24, 2023 20:14
@ZLLentz
Copy link
Member Author

ZLLentz commented Oct 24, 2023

@klauer @slactjohnson @ghalym I'd appreciate at least one review on this so we can close https://jira.slac.stanford.edu/browse/ECS-3932

klauer
klauer previously approved these changes Oct 24, 2023
Copy link
Contributor

@klauer klauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latching logic makes sense to me, I think - though I do want to confirm that with the few comments I have here.

IF nLatchAllowErrorID <> 0 THEN
nErrorID := nLatchAllowErrorID;
ELSIF stPositionState.bValid THEN
nErrorID := 16#7901;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment on these magic numbers would be appreciated (or eventually moved to an enum as we may have previously chatted about, I think)

Copy link
Member Author

@ZLLentz ZLLentz Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I'd done the background work for this and forgot to apply it here, but the reality is that I only implemented enums so far for the FF types. The custom error codes are reflected only in the error message writer FB, this needs some better organization as you've said before.
https://github.com/pcdshub/lcls-twincat-motion/blob/master/lcls-twincat-motion/Library/DUTs/E_MotionFFType.TcDUT
https://github.com/pcdshub/lcls-twincat-motion/blob/master/lcls-twincat-motion/Library/POUs/Motion/Utils/F_MotionErrorCodeLookup.TcPOU

Copy link
Member Author

@ZLLentz ZLLentz Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intuition is that this calls for a new enum specifically for custom error codes- not to rehash all the built-in errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "custom error code enum" makes a lot of sense. This certainly could be in a follow-up PR (if at all)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled in 1170c26

// Inject custom error if we can't move because of bMoveOk or bValid
IF bExecute AND NOT bAllowMove THEN
IF nLatchAllowErrorID <> 0 OR (bExecute AND NOT bAllowMove) THEN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the scenario of nLatchAllowErrorID <> 0, it's necessary to redo this IF block not to re-look things up, but ensure that bError is set (no matter what fbMotionRequest does to it), is that right? Trying to follow the logic, and it seems largely a no-op beside for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly correct. I need to reapply my own FB's error status, including bError and nErrorID, irrespective of the internal error state of fbMotionRequest.

END_VAR
]]></Declaration>
<Implementation>
<ST><![CDATA[// Veto the move for uninitialized and unsafe states
bAllowMove := stPositionState.bMoveOk AND stPositionState.bValid AND stPositionState.bUpdated;

rtExec(CLK:=bExecute);
bInnerExec S= rtExec.Q AND bAllowMove;
bInnerExec R= NOT bExecute;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd naively have expected if there were a latched error we wouldn't execute fbMotionRequest at all. I'd imagine there are some details I'm missing there - mind giving a quick explanation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, despite your other comment, you're correct. It may be worth doing something additional here, making the inner function blocks also protect themselves for the sake of robustness.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled in cc683b2

@@ -84,6 +86,10 @@ CASE nState OF
END_IF
END_CASE

IF bMoveError THEN
bExecMove := FALSE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think this is the line that explains my above question...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed why it works out, but I think your previous point still stands

Copy link
Contributor

@klauer klauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum additions look good and I have no further suggestions / nits

I'd say this is good to go, though it'd be nice to have another set of eyes on it for good measure

@ZLLentz
Copy link
Member Author

ZLLentz commented Nov 2, 2023

I'm fairly confident in this and I'm already here, so I'm going to merge, tag, and deploy the tag. If something else is discovered between now and the PAMM there is time to address it.

@ZLLentz ZLLentz merged commit 4898708 into pcdshub:master Nov 2, 2023
9 checks passed
@ZLLentz ZLLentz deleted the fix_bmoveok_queue branch November 2, 2023 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants