Add validations for ERE quantifiers {m}, {m,} and {m,n} #328
LoukasPap wants to merge 1 commit into uutils:main
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #328      +/-   ##
==========================================
+ Coverage   82.07%   82.40%   +0.33%
==========================================
  Files          13       13
  Lines        5445     5587     +142
  Branches      293      307      +14
==========================================
+ Hits         4469     4604     +135
- Misses        974      981       +7
  Partials        2        2
Forgot to add unit tests. I will add them soon.
Merging this PR will improve performance by 2.42%
@e-kwsm Ready for review! 🙂
{,n} to be parsed as literal
nitpick: the "{,n}" pattern is not specified in ERE, but:

<<< xxx sed -E -n -e '/x{,-1}/s//@/p'
- GNU and BSD: Invalid content of \{\}, exit with one
- uutils: prints nothing and exits with zero

<<< xxx sed -E -n -e '/x{,0}/s//@/p'
- GNU and BSD: print @xxx
- uutils: prints nothing

<<< xxx sed -E -n -e '/x{,1}/s//@/p'
- GNU and BSD: print @xx
- uutils: prints nothing

<<< xxx sed -n -E -e '/.{,x}/p'
GNU and BSD support a repetition count of 255.
https://man.freebsd.org/cgi/man.cgi?re_format(7)
Nice catches, I'll check them.
Fixes issue 289

There are two types of errors in quantifiers:
- Unmatched \\{ for unterminated braces
- Invalid content of \\{\\} for all the others

Unmatched \\{
sed always reports the unmatched-brace error first. To deal with that, I scan ahead through the regex for an unclosed { }. If one is found, the error is thrown; otherwise we go back to the beginning and start processing.

Invalid content of \\{\\}
Processing happens only in extended mode and checks for:
- {,} to be converted to *
- {,n} to be parsed as literal

There is also an error when m and/or n are very large numbers, but I leave that for the regex crate to handle (the error message is different from sed's, but I believe that is trivial).
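
For readers following the description above, here is a minimal sketch of the two checks in Rust. It is not the code from this PR: the names `check_braces`, `classify`, `Interval` and `QuantError` are hypothetical, bracket expressions such as `[{]` are ignored for brevity, and rejecting `{m,n}` with m > n is an assumption that the description does not state. Values too large to parse are passed through unchanged, mirroring the choice to let the regex crate report them.

```rust
/// What to do with the text found between `{` and `}` (hypothetical type).
#[derive(Debug, PartialEq)]
enum Interval {
    /// Well-formed `{m}`, `{m,}` or `{m,n}`: pass through unchanged.
    Bounded,
    /// `{,}`: rewrite to `*`.
    Star,
    /// `{,n}`: keep as literal text, as described in this PR.
    Literal,
}

/// The two error kinds named in the description (hypothetical type).
#[derive(Debug, PartialEq)]
enum QuantError {
    Unmatched,      // "Unmatched \{"
    InvalidContent, // "Invalid content of \{\}"
}

/// Scan ahead over the whole pattern and fail if a `{` is never closed.
/// Simplification: bracket expressions are not treated specially.
fn check_braces(pattern: &str) -> Result<(), QuantError> {
    let mut chars = pattern.chars();
    let mut open = false;
    while let Some(c) = chars.next() {
        match c {
            // A backslash escapes the next character, so skip it.
            '\\' => {
                chars.next();
            }
            '{' if !open => open = true,
            '}' if open => open = false,
            _ => {}
        }
    }
    if open {
        Err(QuantError::Unmatched)
    } else {
        Ok(())
    }
}

/// True for a non-empty run of ASCII digits.
fn is_num(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Classify the content between the braces, without the braces themselves.
fn classify(content: &str) -> Result<Interval, QuantError> {
    match content.split_once(',') {
        None if is_num(content) => Ok(Interval::Bounded), // {m}
        Some(("", "")) => Ok(Interval::Star),             // {,}
        Some(("", n)) if is_num(n) => Ok(Interval::Literal), // {,n}
        Some((m, "")) if is_num(m) => Ok(Interval::Bounded), // {m,}
        Some((m, n)) if is_num(m) && is_num(n) => {
            // Assumption: m > n is invalid. Numbers that overflow u32 are
            // not judged here and are left for the regex engine to reject.
            match (m.parse::<u32>(), n.parse::<u32>()) {
                (Ok(a), Ok(b)) if a > b => Err(QuantError::InvalidContent),
                _ => Ok(Interval::Bounded),
            }
        }
        _ => Err(QuantError::InvalidContent),
    }
}
```

Running `check_braces` over the whole pattern before any call to `classify` reproduces the ordering described above, where the unmatched-brace error takes priority over content errors.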