#1 2020-05-24 08:27

diditi2020
Member
Registered: 2020-05-24
Posts: 3

Something wrong with regular expression

Comparison of ReNamer and PowerRename: (screenshot)

Test on RegExr.com: (screenshot)

Last edited by diditi2020 (2020-05-24 08:37)

Offline

#2 2020-05-24 09:50

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,440

Re: Something wrong with regular expression

You are trying to use the expression "(\W+\.)?([\w\.]+)(.*)" with the replacement "$2" to convert something like "汉语.test.txt" to "test.txt". This has no effect in ReNamer, because "\w" stands for a word character in any language and "\W" is the reverse of it, i.e. anything that is not a word character in any language. The Chinese characters are word characters, so "\W+" cannot match them and the whole name ends up in "$2".

ReNamer treats all Unicode word characters equally. You need to replace "\W" with "\w" in your example, so the expression becomes "(\w+\.)?([\w\.]+)(.*)".

Your other sources, PowerRename and RegExr.com, treat "\w" as Latin alphanumeric characters and the underscore only, equivalent to "[A-Za-z0-9_]".
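
To make the difference concrete, here is a small sketch using Python's "re" module, not ReNamer's own engine. It is only an analogy: Python's default "\w" is Unicode-aware, and its re.ASCII flag restricts "\w" to "[A-Za-z0-9_]", which stands in for the behaviour described above for PowerRename and RegExr.com. The variable names are made up for illustration.

```python
import re

name = "汉语.test.txt"
original = r"(\W+\.)?([\w\.]+)(.*)"   # expression from the first post
corrected = r"(\w+\.)?([\w\.]+)(.*)"  # "\W" replaced with "\w"

# Unicode-aware "\w": 汉语 are word characters, so the optional first group
# matches nothing and group 2 captures the whole name.
unicode_result = re.sub(original, r"\2", name, count=1)

# ASCII-only "\w": 汉语 count as non-word characters, so group 1 takes "汉语."
# and group 2 is left with the rest.
ascii_result = re.sub(original, r"\2", name, count=1, flags=re.ASCII)

# Corrected expression with Unicode-aware "\w": group 1 now takes "汉语.".
corrected_result = re.sub(corrected, r"\2", name, count=1)

print(unicode_result)    # 汉语.test.txt
print(ascii_result)      # test.txt
print(corrected_result)  # test.txt
```

The first result is the name unchanged, which corresponds to the rule appearing to do nothing.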

Offline

#3 2020-05-24 12:08

diditi2020
Member
Registered: 2020-05-24
Posts: 3

Re: Something wrong with regular expression

den4b wrote:

ReNamer treats all Unicode word characters equally.

Now there is a new problem with Unicode:

4cb0571271995ae7a3d79be0653fd05eed648234.jpg


But what is really strange is that I only used $2, and the output is this:

3d638d7076203fbb94a6cac2259d62b6f53dbfaa.jpg

Last edited by diditi2020 (2020-05-24 12:23)

Offline

#4 2020-05-24 13:44

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,440

Re: Something wrong with regular expression

Screenshots are great, but please also post your examples and rules configuration in plain text, to help us investigate.

You can use the "Export to Clipboard" option in the context menu of the rules table.

Offline

#5 2020-05-24 17:35

diditi2020
Member
Registered: 2020-05-24
Posts: 3

Re: Something wrong with regular expression

den4b wrote:

Screenshots are great, but please also post your examples and rules configuration in plain text, to help us investigate.

You can use the "Export to Clipboard" option in the context menu of the rules table.

ReNamer 7.2.0.1 beta

replace: $2

([\u4e00-\u9fa5]+)([A-Z0-9]+)

測試TEST → 測試T
測試1234 → 測試4
TEST1234 → 4

([^\x00-\xff]+)([A-Z0-9]+)

測試TEST → T
測試1234 → 4
TEST1234 → 4

------------------------
replace: $1_$2

([\u4e00-\u9fa5]+)([A-Z0-9]+)

測試TEST → 測試TES_T
測試1234 → 測試123_4
TEST1234 → TEST123_4

Last edited by diditi2020 (2020-05-24 17:45)

Offline

#6 2020-05-27 16:30

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,440

Re: Something wrong with regular expression

The syntax used by the Regular Expressions engine in ReNamer is slightly different to the one you assume.

Hex-based character codes can be entered as "\xnn" or "\x{nnnn}".

See the reference for more information:
https://www.den4b.com/wiki/ReNamer:Regular_Expressions

If we look at your original expression, to be replaced by "$2":

([\u4e00-\u9fa5]+)([A-Z0-9]+)

It should be modified as follows:

([\x{4e00}-\x{9fa5}]+)([A-Z0-9]+)

Input:

測試TEST
測試1234
TEST1234

Output:

TEST
1234
TEST1234
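
For comparison, a rough sketch of the same check in Python's "re" module. Note the syntax difference: Python writes a code point as "\uXXXX", where ReNamer expects "\x{XXXX}". The second half is a hypothesis only, not confirmed ReNamer behaviour: if "\u" were read as a plain "u", the class would contain the range "0-u", which covers digits and upper-case letters, and that reproduces the outputs posted in #5. The variable names are invented for the example.

```python
import re

inputs = ["測試TEST", "測試1234", "TEST1234"]

# Intended expression: CJK characters followed by upper-case letters or digits.
# Python spelling of the range that ReNamer writes as [\x{4e00}-\x{9fa5}].
cjk_then_ascii = r"([\u4e00-\u9fa5]+)([A-Z0-9]+)"
fixed = [re.sub(cjk_then_ascii, r"\2", s) for s in inputs]

# Assumption for illustration: "\u" taken as a literal "u", so the class
# holds u, 4, e, 0, the range 0-u, 9, f, a and 5.
literal_u = r"([u4e00-u9fa5]+)([A-Z0-9]+)"
guess = [re.sub(literal_u, r"\2", s) for s in inputs]

print(fixed)  # ['TEST', '1234', 'TEST1234']
print(guess)  # ['測試T', '測試4', '4']
```

Under that assumption the first group greedily takes the ASCII run and leaves only its last character for the second group, which would explain results such as "測試T".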

Offline
