trace_api: get_transaction_trace endpoint fails to return transaction trace if the initial block including the transaction forks out #942

Closed
linh2931 opened this issue Oct 16, 2024 · 3 comments · Fixed by #966 or #970
Labels: bug (The product is not working as was intended.), 👍 lgtm

@linh2931
Member

https://github.com/AntelopeIO/spring/actions/runs/11367823623/job/31622189569#step:4:859

Traceback (most recent call last):
  File "/__w/spring/spring/build/tests/separate_prod_fin_test.py", line 50, in <module>
    if cluster.launch(pnodes=pnodes, totalNodes=total_nodes, totalProducers=pnodes,
  File "/__w/spring/spring/build/tests/TestHarness/Cluster.py", line 536, in launch
    if not self.bootstrap(launcher, self.biosNode, self.startedNodesCount, prodCount + sharedProducers, totalProducers, pfSetupPolicy, onlyBios, onlySetProds, loadSystemContract, activateIF, biosFinalizer, signatureProviderForNonProducer):
  File "/__w/spring/spring/build/tests/TestHarness/Cluster.py", line 1294, in bootstrap
    trans=biosNode.publishContract(eosioAccount, contractDir, wasmFile, abiFile, waitForTransBlock=True)
  File "/__w/spring/spring/build/tests/TestHarness/transactions.py", line 171, in publishContract
    trans=Utils.runCmdReturnJson(cmd, trace=False, silentErrors=shouldFail)
  File "/__w/spring/spring/build/tests/TestHarness/testUtils.py", line 327, in runCmdReturnJson
    return Utils.runCmdArrReturnJson(cmdArr, trace=trace, silentErrors=silentErrors)
  File "/__w/spring/spring/build/tests/TestHarness/testUtils.py", line 311, in runCmdArrReturnJson
    return Utils.toJson(retStr, trace, silentErrors)
  File "/__w/spring/spring/build/tests/TestHarness/testUtils.py", line 297, in toJson
    raise TypeError(msg)
TypeError: Received empty JSON response
@enf-ci-bot enf-ci-bot moved this to Todo in Team Backlog Oct 16, 2024
@linh2931 linh2931 self-assigned this Oct 17, 2024
@linh2931
Member Author

It looks like the retry logic here is not working:

while retries < retryNum:

@linh2931 linh2931 added the 👍 lgtm and test-instability (tag issues for flaky tests, high priority to address) labels and removed the triage label Oct 17, 2024
@linh2931 linh2931 moved this from Todo to In Progress in Team Backlog Oct 17, 2024
@linh2931 linh2931 added this to the Spring v1.1.0-rc1 milestone Oct 17, 2024
@heifner
Member

heifner commented Oct 19, 2024

Certainly looks like a bug in trace_api_plugin get_trx_block_number.

Block 125, which originally had the eosio system contract in it, was forked out. The contract was later put into block 137, which quickly became final in block 139.

However, get_trx_block_number keeps reporting that the trx is in block 125.

This looks like it is because the trace_api_plugin only ever appends. It does not rewind like SHiP. So the logic is supposed to find the last block with the trx id that is lib. Instead, it finds any block number that had the trx at some point and whose block number later becomes lib. However, a new block with that number might not have the trx in it.

The original trace_api_plugin did not store trx ids. The logic of tracking block_num works for blocks and lib. Once a block number is lib you know the last block with that number in the file is the one you want. This doesn't work for trx ids as the last version of that block you saw might not have the trx id in it.
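
To illustrate the append-only reasoning, here is a toy model in Python rather than the plugin's C++. The `Entry` record, the `latest_block` helper, the block ids, and the `setcode` trx id are all made up for this sketch; it does not reflect the plugin's actual file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One appended record: a version of a block and the trx ids it carried."""
    block_num: int
    block_id: str      # hypothetical id, used only to tell versions apart
    trx_ids: tuple


def latest_block(log, block_num, lib):
    """Return the last appended entry for block_num, or None if it is not lib yet."""
    if block_num > lib:
        return None
    found = None
    for entry in log:              # append order: later versions overwrite earlier ones
        if entry.block_num == block_num:
            found = entry
    return found


# Block 125 is appended twice because the first version was forked out.
log = [
    Entry(125, "125a", ("setcode",)),   # forked-out version, had the trx
    Entry(125, "125b", ()),             # version that stayed, no trx
]

final = latest_block(log, 125, lib=139)
# "Last entry wins" picks the right block, but that block no longer has the trx.
```

For block lookups the last entry is the right answer. For a trx id, knowing only that block number 125 once carried the trx says nothing about the version that ended up final.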

I think we should create a test that reproduces this problem. One possible fix: when get_trx_block_number finds the block number stored in trx_block_num again, verify that the new block includes the trx id, and otherwise reset trx_block_num to 0.
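
A rough sketch of that idea, again as a simplified Python model and not the actual plugin code. The log layout (tuples of `"block"` or `"lib"` records), the function names, and the exact ordering of events are assumptions chosen to mirror the 125/137/139 scenario described in this comment.

```python
def buggy_lookup(log, trx_id):
    """Model of the reported behavior: remember a block number, never re-check it."""
    trx_block_num = 0
    for kind, block_num, trx_ids in log:
        if kind == "block":
            if trx_id in trx_ids:
                trx_block_num = block_num
        elif kind == "lib":
            if trx_block_num and block_num >= trx_block_num:
                return trx_block_num
    return None


def fixed_lookup(log, trx_id):
    """Same scan, but forget the block number if a newer version lacks the trx."""
    trx_block_num = 0
    for kind, block_num, trx_ids in log:
        if kind == "block":
            if trx_id in trx_ids:
                trx_block_num = block_num
            elif block_num == trx_block_num:
                trx_block_num = 0   # same block number seen again without the trx
        elif kind == "lib":
            if trx_block_num and block_num >= trx_block_num:
                return trx_block_num
    return None


# Hypothetical append-only log for the scenario above.
LOG = [
    ("block", 125, ("setcode",)),  # original block 125, later forked out
    ("block", 125, ()),            # replacement block 125 without the trx
    ("block", 126, ()),
    ("lib", 125, ()),              # block number 125 becomes lib
    ("block", 137, ("setcode",)),  # trx is included again
    ("block", 138, ()),
    ("block", 139, ()),
    ("lib", 137, ()),              # block 137 becomes final
]

buggy = buggy_lookup(LOG, "setcode")   # 125
fixed = fixed_lookup(LOG, "setcode")   # 137
```

In this model the buggy scan reports block 125 as soon as that block number becomes lib, while the scan with the reset waits and reports block 137. A real fix would also need to consider forks that replace a block below the remembered number, which this sketch does not handle.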

@linh2931 linh2931 added the bug (The product is not working as was intended.) label and removed the test-instability (tag issues for flaky tests, high priority to address) label Oct 21, 2024
@linh2931 linh2931 changed the title from "Test failure: separate_prod_fin_test" to "trace_api: get_transaction_trace endpoint fails to return transaction trace if the initial block including the transaction forks out" Oct 21, 2024
@linh2931
Member Author

Here is the background info from the user's point of view.

The separate_prod_fin_test publishes the system contract when the test cluster starts up and waits for the set code transaction to be included in a finalized block. It uses trace_api's get_transaction_trace to track the status of the transaction. But the first block that includes the transaction forks out, and because of the bug in trace_api's get_trx_block_number, get_transaction_trace never returns the transaction trace. This causes the test failure and exposes the bug.

In addition, more debug logging could be added to publishContract in TestHarness to help debug future test failures.
