Skip to content

Feature/allow full compute asset#156

Open
alexcos20 wants to merge 5 commits into
mainfrom
feature/allow_full_computeAsset
Open

Feature/allow full compute asset#156
alexcos20 wants to merge 5 commits into
mainfrom
feature/allow_full_computeAsset

Conversation

@alexcos20

@alexcos20 alexcos20 commented Jun 19, 2026

Copy link
Copy Markdown
Member

Support raw datasets & algorithms in compute commands

What

startCompute and startFreeCompute previously accepted only DIDs for datasets and the algorithm. They now also accept full ComputeAsset / ComputeAlgorithm JSON objects with a fileObject, so you can run compute on raw, unpublished data and algorithms (URL, IPFS, Arweave, S3, FTP, node persistent storage) — no publishing, no datatoken required.

Both commands also support mixed inputs: a datasets argument can combine published DIDs and raw file objects in the same job.

Why

The underlying ProviderInstance.computeStart / freeComputeStart already supported these objects; the CLI just didn't expose them. This unlocks the common "run an algorithm on a file I haven't published" workflow.

How

  • New shared helper resolveComputeInputs turns the dataset/algorithm CLI strings into ready-to-use compute inputs, auto-detecting DID vs JSON per entry. This also removes ~330 lines of duplicated parsing logic across three functions.
  • DID entries are resolved and ordered as before; raw entries skip DDO resolution, orderability checks, and datatoken ordering.
  • Fully backward compatible — existing DID-based usage (including the legacy [did:a,did:b] form) is unchanged.

Usage

JSON must be single-quoted on the shell; in mixed arrays, DIDs must be quoted.

# Raw algorithm against a published dataset
npm run cli startFreeCompute did:op:dataset '{"fileObject":{"type":"url","url":"https://example.com/algo.py","method":"GET"},"meta":{...}}' env1

# Mixed datasets (published DID + raw file)
npm run cli startCompute -- '["did:op:dataset",{"fileObject":{"type":"ipfs","hash":"Qm..."}}]' did:op:algo env1 900 token resources --accept true

Notes

  • Fixed a latent bug where additional datasets were pushed as a nested array, which would have corrupted multi-dataset jobs.
  • Added a network-free unit test for resolveComputeInputs (DID, raw, mixed, legacy, and error cases).
  • README updated with examples.

Summary by CodeRabbit

  • Documentation

    • Updated README with examples for raw datasets/algorithms and mixed DID+raw compute inputs.
  • New Features

    • Compute commands now accept raw datasets and algorithms via fileObject JSON inputs, alongside traditional DIDs.
    • Support for mixed DID and raw file inputs in both startCompute and startFreeCompute.
  • Refactor

    • Improved compute command input handling and validation logic.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 858fdf1a-f374-41d4-987a-70641876d2e0

📥 Commits

Reviewing files that changed from the base of the PR and between 07d176d and 5366520.

📒 Files selected for processing (3)
  • src/commands.ts
  • src/helpers.ts
  • test/resolveComputeInputs.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/resolveComputeInputs.test.ts
  • src/helpers.ts
  • src/commands.ts

📝 Walkthrough

Walkthrough

Adds support for raw (unpublished) datasets and algorithms in compute CLI commands. A new resolveComputeInputs helper in helpers.ts tokenizes and resolves mixed DID/raw-fileObject JSON inputs via Aquarius, returning index-aligned assets/ddos arrays and a derived providerURI. The three compute command functions (initializeCompute, computeStart, freeComputeStart) are refactored to use this helper, removing ~300 lines of inline parsing. CLI help text and README are updated with examples.

Changes

Raw compute input resolution and wiring

Layer / File(s) Summary
ResolvedComputeInputs contract and resolver
src/helpers.ts
Adds ComputeAsset import, exports the ResolvedComputeInputs interface with index-aligned assets/ddos, algo/algoDdo, and providerURI. Adds internal parseComputeInput tokenizer (supporting DIDs, JSON objects/arrays, legacy bracket form) and the exported resolveComputeInputs function that validates fileObject on raw entries, calls aquarius.waitForIndexer for DID tokens, aligns null placeholders in ddos, and returns null with console.error on failures.
resolveComputeInputs unit tests
test/resolveComputeInputs.test.ts
Adds a Mocha/Chai suite with a fake Aquarius stub covering: single DID, legacy bracketed and unbracketed comma-separated DID strings, raw JSON datasets (ddos entries null), mixed DID+raw arrays with ddos index alignment, raw algorithm objects (algoDdo = null), empty dataset arrays, and negative cases for missing fileObject or unresolvable DIDs.
Compute command refactoring
src/commands.ts
Adds resolveComputeInputs and Asset imports. Replaces inline parsing/Aquarius-fetch blocks in initializeCompute, computeStart, and freeComputeStart with resolveComputeInputs(...) calls. Refactors dataset/algorithm ordering loops to use per-dataset fee entries from parsedProviderInitializeComputeJob?.datasets?.[i], gates algorithm ordering on algoDdo presence, adds a local describeAsset(...) logging helper, and removes unused mytime/computeMinutes block and previous "add additional datasets" mutation.
CLI help text and README
src/cli.ts, README.md
Expands startCompute and startFreeCompute positional argument and --datasets/--algo option help strings to document JSON fileObject input format and mixed DID+raw syntax. Adds README examples for raw algorithm on a published dataset, raw dataset(s) on a published algorithm, and mixed DID+raw inputs.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 A rabbit once dug through the DID-filled ground,
And found raw fileObjects scattered around.
No tokens to order, no datatoken to buy—
Just JSON passed in, and the compute runs high!
The resolver now sorts every asset in line,
With null-filled ddos and a providerURI fine. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Feature/allow full compute asset' is vague and does not clearly describe the main change—supporting raw datasets/algorithms in compute commands. Revise the title to be more specific and descriptive, such as 'Add support for raw datasets and algorithms in compute commands' or 'Enable compute jobs with raw fileObject inputs'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/allow_full_computeAsset

Comment @coderabbitai help to get the list of available commands and usage tips.

@alexcos20

Copy link
Copy Markdown
Member Author

/run-security-scan

@alexcos20 alexcos20 left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI automated code review (Gemini 3).

Overall risk: low

Summary:
The PR successfully introduces support for full ComputeAsset and ComputeAlgorithm JSON inputs, enabling the use of raw datasets and algorithms without DIDs. It significantly refactors and cleans up the CLI commands by extracting common resolution logic into a reusable and well-tested resolveComputeInputs helper. Excellent architectural improvement and test coverage.

Comments:
• [WARNING][style] These look like leftover debugging logs. Consider removing them to keep the CLI console output clean.

-      console.log("Bucket ID:", bucketId);
-      console.log("File Path:", filePath);
-      console.log("File Name:", fileName);

• [INFO][style] Passing algoDdo as Asset | DDO when algoDdo could be null (since raw algorithms return null for algoDdo) might cause unexpected behavior if isOrderable strictly checks for undefined but doesn't handle null. Though it's likely fine as null is falsy, it's safer to use the nullish coalescing operator to pass undefined.

-        algoDdo as Asset | DDO
+        (algoDdo ?? undefined) as Asset | DDO

• [INFO][other] The addition of this test suite is excellent! Great job covering all the edge cases like legacy arrays, raw/DID mixes, and fallback provider URLs.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
test/resolveComputeInputs.test.ts (1)

61-75: ⚡ Quick win

Add a regression test for unbracketed comma-separated DIDs.

Given backward-compat support is expected, add a case for did:op:a,did:op:b (without brackets) so this contract stays protected.

Suggested test case
+  it("supports legacy comma-separated dataset DIDs without brackets", async function () {
+    const res = await resolveComputeInputs(
+      "did:op:a,did:op:b",
+      "did:op:algo1",
+      makeAquarius(),
+      indexingParams,
+      FALLBACK
+    );
+    expect(res).to.not.equal(null);
+    expect(res.assets.map((a) => a.documentId)).to.deep.equal([
+      "did:op:a",
+      "did:op:b",
+    ]);
+  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/resolveComputeInputs.test.ts` around lines 61 - 75, Add a new regression
test case to ensure backward compatibility for unbracketed comma-separated DIDs.
Create a test similar to the existing "supports the legacy unquoted
[did:a,did:b] datasets form" test, but instead call resolveComputeInputs with
the string "did:op:a,did:op:b" (without brackets) and verify that it produces
the same results: 2 assets with documentIds "did:op:a" and "did:op:b", and all
ddos values are non-null.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/commands.ts`:
- Around line 299-309: The isOrderable function is async but is being called
without await in three locations, causing validation checks to fail since a
Promise is always truthy. Add the await keyword before each call to isOrderable
at lines 304, 473, and 768 in src/commands.ts to ensure the function result is
properly evaluated before the conditional checks (such as if (!canStartCompute))
are executed.

In `@src/helpers.ts`:
- Around line 310-323: The function currently only splits comma-separated DIDs
when they are wrapped in brackets, causing plain comma-separated values like
did:op:a,did:op:b to be treated as a single token. Before the final return
statement that returns [trimmed], add a check to see if trimmed contains a
comma. If it does, split the trimmed string by comma and apply the same
processing (trim each element and filter out empty strings) that is already
being done for the bracketed case, to restore support for the legacy
comma-separated DID format.

---

Nitpick comments:
In `@test/resolveComputeInputs.test.ts`:
- Around line 61-75: Add a new regression test case to ensure backward
compatibility for unbracketed comma-separated DIDs. Create a test similar to the
existing "supports the legacy unquoted [did:a,did:b] datasets form" test, but
instead call resolveComputeInputs with the string "did:op:a,did:op:b" (without
brackets) and verify that it produces the same results: 2 assets with
documentIds "did:op:a" and "did:op:b", and all ddos values are non-null.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c466a90b-98dc-4166-ab70-64fc4219e719

📥 Commits

Reviewing files that changed from the base of the PR and between 3048ee3 and 07d176d.

📒 Files selected for processing (5)
  • README.md
  • src/cli.ts
  • src/commands.ts
  • src/helpers.ts
  • test/resolveComputeInputs.test.ts

Comment thread src/commands.ts Outdated
Comment on lines 299 to 309
// Validate orderability only for DID-based datasets (raw fileObject assets
// have no DDO and are validated by the node).
for (let i = 0; i < assets.length; i++) {
const dataDdo = ddos[i];
if (!dataDdo) continue;
const canStartCompute = isOrderable(
ddos[dataDdo],
ddos[dataDdo].services[0].id,
dataDdo,
dataDdo.services[0].id,
algo,
algoDdo
algoDdo as Asset | DDO
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify isOrderable is async
rg -nP 'export\s+async\s+function\s+isOrderable' src/helpers.ts

# Find current non-awaited call sites
rg -nP 'const\s+canStartCompute\s*=\s*isOrderable\(' src/commands.ts

# Confirm awaited call sites (should match count after fix)
rg -nP 'const\s+canStartCompute\s*=\s*await\s+isOrderable\(' src/commands.ts

Repository: oceanprotocol/ocean-cli

Length of output: 247


Add await when calling isOrderable or validation is bypassed.

isOrderable is async but these three call sites evaluate the returned Promise directly without awaiting. This makes if (!canStartCompute) checks ineffective since a Promise is always truthy, allowing disallowed compute combinations to pass local validation.

Suggested fix
-      const canStartCompute = isOrderable(
+      const canStartCompute = await isOrderable(
         dataDdo,
         dataDdo.services[0].id,
         algo,
         algoDdo as Asset | DDO
       );

Apply this change at lines 304, 473, and 768 in src/commands.ts.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands.ts` around lines 299 - 309, The isOrderable function is async
but is being called without await in three locations, causing validation checks
to fail since a Promise is always truthy. Add the await keyword before each call
to isOrderable at lines 304, 473, and 768 in src/commands.ts to ensure the
function result is properly evaluated before the conditional checks (such as if
(!canStartCompute)) are executed.

Comment thread src/helpers.ts
@alexcos20 alexcos20 marked this pull request as ready for review June 19, 2026 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants