Add string format tools to library.#2373

Draft
toinehartman wants to merge 5 commits into main from feature/format-tools

@toinehartman

Member

This PR adds many generic tools that can be used to process/format strings.

TODO

  • Extensively document (once we have figured out which ones we will actually keep)

@codecov

codecov Bot commented Aug 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46%. Comparing base (e53517d) to head (3633d79).
⚠️ Report is 24361 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main   #2373     +/-   ##
=========================================
- Coverage       47%     46%     -2%     
- Complexity    6566    6724    +158     
=========================================
  Files          780     839     +59     
  Lines        64398   66767   +2369     
  Branches      9628    9983    +355     
=========================================
+ Hits         30513   30778    +265     
- Misses       31559   33602   +2043     
- Partials      2326    2387     +61     

@rodinaarssen rodinaarssen left a comment

Member

Cool! Just a few remarks and questions.

Comment thread src/org/rascalmpl/library/String.rsc

@DavyLandman DavyLandman left a comment

Member

Pinging @jurgenvinju, since he has done more work around formatting and indentation in the past.

I have the feeling we're missing something in this PR: a feature we already have in Rascal. (Aside from the fact that I don't really like all these regexps and char loops.)

}

@synopsis{Split a string to an indentation prefix and the remainder of the string.}
tuple[str indentation, str rest] splitIndentation(/^<indentation:\s*><rest:.*>/)

Member

What if the string contains multiple lines?

Member Author

I think that should be an explicit exception (invalid argument or similar).
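
For illustration, the behaviour proposed here could look like the following, sketched in Java rather than Rascal. The names `IndentationSplitter` and `Split` are hypothetical, and treating only `\n` and `\r` as line separators is an assumption, not something this PR specifies.

```java
final class IndentationSplitter {
    // Hypothetical result type: the leading whitespace and the remainder of the line.
    record Split(String indentation, String rest) {}

    // Splits a single line into its indentation prefix and the rest.
    // Assumption: multi-line input is rejected instead of being split silently.
    static Split splitIndentation(String line) {
        if (line.indexOf('\n') >= 0 || line.indexOf('\r') >= 0) {
            throw new IllegalArgumentException(
                "splitIndentation expects a single line without line separators");
        }
        int i = 0;
        // Advance past the leading whitespace characters.
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return new Split(line.substring(0, i), line.substring(i));
    }
}
```

With this sketch, `splitIndentation("  \tfoo bar")` gives the indentation `"  \t"` and the rest `"foo bar"`, while `splitIndentation("a\nb")` throws an `IllegalArgumentException`.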

int count = size(findAll(input, nl));
linesepCounts[nl] = count;
// subtract all occurrences of substrings of newline characters that we counted before
for (str snl <- substrings(nl), linesepCounts[snl]?) {

Member

This almost looks like pattern matching on strings? (Which we only have reasonable support for.)

For example:

rascal>visit("abcd") { case str m : println(m); }
abcd
bcd
cd
d

Come to think of it, this whole function smells like a parsing automaton, where we build a big state table of all the possible matches, then iterate through all the chars and count the matches based on their state.

In Java this would be 20 to 30 lines, but in Rascal we might be missing some primitives (as we don't have a character loop).
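
A minimal Java sketch of that idea, assuming a fixed list of separators. It is a plain longest-match scan rather than a precomputed state table, and the names `LineSeparatorCounter` and `countLineSeparators` are hypothetical, not part of this PR. Because the longest separator wins at each position, `\r\n` is never also counted as a separate `\r` and `\n`, so no subtraction step is needed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LineSeparatorCounter {
    // Counts non-overlapping occurrences of each separator in a single pass.
    // At every position the longest matching separator is chosen.
    static Map<String, Integer> countLineSeparators(String input, List<String> separators) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String sep : separators) {
            counts.put(sep, 0);
        }
        int i = 0;
        while (i < input.length()) {
            String best = null;
            for (String sep : separators) {
                boolean matches = !sep.isEmpty() && input.startsWith(sep, i);
                if (matches && (best == null || sep.length() > best.length())) {
                    best = sep;
                }
            }
            if (best == null) {
                i++; // no separator starts here, move to the next char
            } else {
                counts.merge(best, 1, Integer::sum);
                i += best.length(); // skip the whole separator
            }
        }
        return counts;
    }
}
```

For the input `"a\r\nb\nc\rd\r\n"` with the separators `\r\n`, `\n` and `\r`, this yields counts of 2, 1 and 1 respectively.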

@toinehartman toinehartman Sep 8, 2025

Member Author

If we hard-code the set of newline characters (e.g. to all Unicode newline chars), we could write it as a grammar and use the parser generator. The downside (as we discussed) is that all (transitive) imports of this module will trigger generation of a parser. We could also move some of those to a specific Format module.

@toinehartman toinehartman marked this pull request as draft October 28, 2025 08:23
@DavyLandman DavyLandman force-pushed the feature/format-tools branch 2 times, most recently from a473a0d to 3633d79 Compare June 26, 2026 12:02