ScheduleFlowTests is flaky but also causes double spends. Might be a severe scheduler bug.

Description

This is intermittent. In IntelliJ, use the "run until fail" feature to reproduce it. The double spend and the test failure are not necessarily related: I think one can happen without the other, and neither happens every time.

Looking at the test flows, I think a double spend should not be possible (but worth confirming), so I'm concluding it's a bug in the scheduler. If so, I suspect a race involving database transaction visibility, but I'm speculating. We need to confirm that we are indeed occasionally scheduling the same flow / state more than once.
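
One way to confirm the double scheduling might be to count, inside the test, how many times each scheduled state actually triggers a flow. The following is only a minimal sketch under stated assumptions: `ScheduleAudit`, `recordLaunch` and the string key format are hypothetical names invented for illustration, not part of Corda, and where the hook would be called from in the node or test code is left open.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

/**
 * Hypothetical test helper (not a Corda class): counts how often each
 * scheduled state is launched, so that duplicates can be reported.
 */
final class ScheduleAudit {
    // Thread-safe, because launches may be observed from several threads.
    private final Map<String, AtomicInteger> launches = new ConcurrentHashMap<>();

    /**
     * Record one launch for the given key and return the running count.
     * The key is assumed to be any stable identifier of the scheduled state.
     */
    int recordLaunch(String stateKey) {
        return launches.computeIfAbsent(stateKey, k -> new AtomicInteger()).incrementAndGet();
    }

    /** Keys that were launched more than once. */
    Set<String> duplicates() {
        return launches.entrySet().stream()
                .filter(e -> e.getValue().get() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }
}
```

If `duplicates()` is non-empty at the end of a failing run, that would support the double-scheduling theory; if it stays empty while the notary still rejects a transaction, the double spend presumably comes from somewhere else.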

Note that I'm slightly changing the scheduler in https://github.com/corda/corda/pull/3204, so once we've confirmed what the problem is, it's perhaps worth discussing how to tackle it. It happens in master as well as in that PR.

[WARN ] 19:08:25,578 [Mock node 0 thread] (FlowStateMachineImpl.kt:198) corda.flow.run - Flow threw exception {flow-id=066d9709-ef95-4d64-8502-92694e784bb0, invocation_id=42857589-79f6-4111-a195-25d0555a8535, invocation_timestamp=2018-05-22T18:08:25.355Z, session_id=42857589-79f6-4111-a195-25d0555a8535, session_timestamp=2018-05-22T18:08:25.355Z}
net.corda.core.flows.NotaryException: Unable to notarise transactionB7CA0611BFD9166A87BFE55359C9850D3BF1B40206883E7F29D40CEBA827D1AC: One or more input states have been used in another transaction
at net.corda.core.internal.notary.NotaryServiceFlow.call(NotaryServiceFlow.kt:47) ~[classes/:?]
at net.corda.core.internal.notary.NotaryServiceFlow.call(NotaryServiceFlow.kt:22) ~[classes/:?]
at net.corda.node.services.statemachine.FlowStateMachineImpl.run(FlowStateMachineImpl.kt:194) [classes/:?]
at net.corda.node.services.statemachine.FlowStateMachineImpl.run(FlowStateMachineImpl.kt:49) [classes/:?]
at co.paralleluniverse.fibers.Fiber.run1(Fiber.java:1092) [quasar-core-7629695563deae6cc95adcfbebcbc8322fd0241a-jdk8.jar:0.7.9]
at co.paralleluniverse.fibers.Fiber.exec(Fiber.java:788) [quasar-core-7629695563deae6cc95adcfbebcbc8322fd0241a-jdk8.jar:0.7.9]
at co.paralleluniverse.fibers.RunnableFiberTask.doExec(RunnableFiberTask.java:100) [quasar-core-7629695563deae6cc95adcfbebcbc8322fd0241a-jdk8.jar:0.7.9]
at co.paralleluniverse.fibers.RunnableFiberTask.run(RunnableFiberTask.java:91) [quasar-core-7629695563deae6cc95adcfbebcbc8322fd0241a-jdk8.jar:0.7.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_131]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_131]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_131]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]

Activity

Rick Parker
May 23, 2018, 8:53 AM

Note: the implied double scheduling is more important than the flaky test!

Katelyn Baker
July 13, 2018, 5:02 PM

Untargeting from V3, as this requires the new state machine.

Assignee: Christian Sailer
Reporter: Rick Parker
Labels:
Sprint: None
Epic Link: None
Priority: Highest
Severity: Medium
CVSS Score: None
CVSS Vector: None
Due Date: None
Engineering Teams: None
Fix versions: None
Affects versions:
Ported to...: None
Story Points / Dev Days: None