Skip to content

[CASSANDRA-21471][trunk] Expose immediately-executed tasks in the queries virtual table#4902

Open
netudima wants to merge 1 commit into
apache:trunkfrom
netudima:CASSANDRA-21471-trunk
Open

[CASSANDRA-21471][trunk] Expose immediately-executed tasks in the queries virtual table#4902
netudima wants to merge 1 commit into
apache:trunkfrom
netudima:CASSANDRA-21471-trunk

Conversation

@netudima

Copy link
Copy Markdown
Contributor

SEPExecutor.maybeExecuteImmediately() runs a task synchronously on the calling worker thread, nested within the task the worker is already running. Such immediate tasks were invisible in system_views.queries, which only exposed each worker's primary running task. This is common on the coordinator path, where a local read or mutation is executed immediately within the enclosing QUERY task.

Each SEPWorker now also tracks an immediate current task, set around maybeExecuteImmediately(), and exposes it as an additional DebuggableTaskRunner, so the queries table reports both the enclosing task and the immediate one as separate rows.

patch by Dmitry Konstantinov; reviewed by TBD for CASSANDRA-21471

SEPExecutor.maybeExecuteImmediately() runs a task synchronously on the
calling worker thread, nested within the task the worker is already
running. Such immediate tasks were invisible in system_views.queries, which only exposed each worker's primary running task. This is common on the coordinator path, where a local read or mutation is executed immediately within the enclosing QUERY task.

Each SEPWorker now also tracks an immediate current task, set around
maybeExecuteImmediately(), and exposes it as an additional
DebuggableTaskRunner, so the queries table reports both the enclosing
task and the immediate one as separate rows.

patch by Dmitry Konstantinov; reviewed by TBD for CASSANDRA-21471
// the work permit may go wasted if we don't immediately attempt to spawn another worker
maybeSchedule();
// nesting with depth > 1 is not supported
taskHolder.setImmediateTask(null);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move this to the start of the finally block to make sure that we can't leak stale state into the virtual table if some kind of exception is thrown from maybeSchedule()?

// in this case in particular we are not processing the rest of the queue anyway, and so
// the work permit may go wasted if we don't immediately attempt to spawn another worker
maybeSchedule();
// nesting with depth > 1 is not supported

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a cheap way to assert this is the case?


public ImmediateTaskHolder getImmediateTaskHolder()
{
return immediateTaskHolder == null ? ImmediateTaskHolder.NO_OP : immediateTaskHolder;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we just set NO_OP directly in the constructors above (and make passing null illegal) so we don't have to do this check?


boolean localReaderThread = threadId.contains("Read") || threadId.contains("SharedPool-Worker");
readVisible |= localReaderThread && task.contains("SELECT");
readVisible |= localReaderThread && task.contains("SELECT") && !task.contains("QUERY");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CC shallow review:

The tests disambiguate the new immediate-row from the enclosing-QUERY-row by String.contains("QUERY") on the free-form task description, not by the structural marker the patch actually introduces (the (immediate) suffix on thread_id). If the local-read description ever
changes to mention "QUERY", or a user creates a keyspace/table literally named QUERY, or someone re-uses the same test against a query reading from system_views.queries, the assertion silently flips. The unambiguous signal is thread_id ending with (immediate).

{
return allWorkers.stream();
return Stream.concat(allWorkers.stream(),
allWorkers.stream().map(SEPWorker::immediateRunner));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CC shallow review:

Finding 7: Stream.concat(allWorkers.stream(), allWorkers.stream().map(...)) iterates the concurrent set twice — snapshot inconsistency on worker churn

  • Location: src/java/org/apache/cassandra/concurrent/SharedExecutorPool.java:125-127
  • Confidence: Medium
  • Flagged by: Concurrency, Resources
  • What's wrong: allWorkers is Collections.newSetFromMap(new ConcurrentHashMap<>()). The two allWorkers.stream() calls produce two independent iterations; between them, workers can be added/removed by schedule() / workerEnded(). The two streams can yield different worker sets — main row
    without immediate row for a freshly-removed worker, immediate-only row for one added between the two traversals. Also doubles the iteration cost on each SELECT * FROM system_views.queries even at idle.
  • Suggested fix:
    return allWorkers.stream().flatMap(w -> Stream.of((DebuggableTaskRunner) w, w.immediateRunner()));

@maedhroz maedhroz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some inline comments around things that might be worth fixing. WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants