#1 2011-12-01 23:50

bss_tech
Member
Registered: 2010-12-22
Posts: 5

Remove repeat strings in filename, maybe with RegEx?

Hello,

I'm hoping someone can help me with my file naming issue. I need to remove repeat instances of a string in my file name. For example:

     spanx_women_spanx_underwear.psd     (I work for a fashion retailer)


I'd like to remove the second instance of "spanx", or of any repeated alphabetic string, from my file name, so that it looks like this:

     spanx_women_underwear.psd


I believe the solution is a RegEx rule, but I am new to ReNamer and have little experience with regular expressions. Could someone please help me?

Also please note that the first instance of "spanx" is being added with an Insert: ":File_FolderName:" rule. "spanx" is the name of the parent directory.

Thanks so much!

Last edited by bss_tech (2011-12-01 23:52)


#2 2011-12-02 08:18

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161


How do I identify the string to remove?
Is there a rule?
Is it always the first word, up to the first underscore?
Or simpler: is it always the word between the second and third underscores?
Do you have a few more examples?



#3 2011-12-02 15:06

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446

Well, I have a general script for this...

It could probably be improved, but it works for me:

var
  Part: TStringsArray;
  Ext, LastSeparation: WideString;
  I1, I2: Integer;

begin  
  // SPLIT THE BASE NAME INTO WORDS (LETTER RUNS) AND SEPARATORS (EVERYTHING ELSE)
  Part := MatchesRegEx(WideStripExtension(FileName), '([^A-Z]+|[A-Z]+)', False);

  // EMPTY NAME TO REBUILD LATER
  Ext := WideExtractFileExt(FileName);
  FileName := '';

  // GO THROUGH EVERY PART
  LastSeparation := '';
  for I1:=0 to length(Part)-1 do begin
    if (Part[I1] = '') then Continue;
    
    // REMOVE EVERY REPEATED WORD
    if length(MatchesRegEx(Part[I1], '^[A-Z]+$', False))>0 then  begin
      for I2:=I1+1 to length(Part)-1 do if WideSameText(Part[I1], Part[I2]) then Part[I2]:='';
      LastSeparation := '';
    end else begin
      // CLEAN REPEATED SEPARATORS ALSO
      if (Part[I1] = LastSeparation) then Part[I1] := '' else LastSeparation := Part[I1];
    end;

    // REBUILD FILE NAME
    FileName := FileName + Part[I1];
  end;
  
  FileName := FileName + Ext;
end.

EDIT: Updated to clean the separators better
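
For anyone who wants to trace the logic outside ReNamer, here is a rough Python sketch of the same steps. It is an illustrative port, not ReNamer code: `dedupe_words` is a made-up name, `os.path.splitext` stands in for `WideStripExtension` and `WideExtractFileExt`, and the explicit `[A-Za-z]` class plus `lower()` stand in for the case-insensitive matching.

```python
import os
import re

def dedupe_words(filename):
    """Drop repeated letter-only words and any separator that then repeats."""
    base, ext = os.path.splitext(filename)
    # Split the base name into separator runs and letter runs.
    parts = re.findall(r"[^A-Za-z]+|[A-Za-z]+", base)
    last_sep = ""
    rebuilt = ""
    for i in range(len(parts)):
        if parts[i] == "":
            continue  # already blanked as a repeat of an earlier word
        if re.fullmatch(r"[A-Za-z]+", parts[i]):
            # Blank every later copy of this word (case-insensitive).
            for j in range(i + 1, len(parts)):
                if parts[j].lower() == parts[i].lower():
                    parts[j] = ""
            last_sep = ""
        elif parts[i] == last_sep:
            parts[i] = ""  # same separator twice in a row
        else:
            last_sep = parts[i]
        rebuilt += parts[i]
    return rebuilt + ext

print(dedupe_words("spanx_women_spanx_underwear.psd"))  # spanx_women_underwear.psd
```

Digits count as separators in this version, so only the letter part of a mixed word such as "2byrch" is treated as a word.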

Last edited by SafetyCar (2011-12-04 20:35)



#4 2011-12-02 21:53

bss_tech
Member
Registered: 2010-12-22
Posts: 5


SafetyCar,

That worked almost perfectly! Thanks!

One thing I wanted to ask: the PascalScript you wrote isn't removing the numbers in a repeated string. So:

2byrch_women_2byrch.psd
2xist_men_2xist.psd
525america_men_525america.psd

turn into:

2byrch_women_2.psd
2xist_men_2.psd
525america_men_525.psd

What I'm hoping is that the files will turn into:

2byrch_women_.psd
2xist_men_.psd
525america_men_.psd

What should I change in your PascalScript so that it also removes the numerical part?

Thanks!


#5 2011-12-03 00:17

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446

Hmm, about including digits...
I think that is a bit riskier for general use, because it could delete numbers you want to keep.

Anyway, with some care, you can do it by replacing these two lines:

  Part := MatchesRegEx(WideStripExtension(FileName), '([^A-Z]+|[A-Z]+)', False);

with

  Part := MatchesRegEx(WideStripExtension(FileName), '([^A-Z0-9]+|[A-Z0-9]+)', False);

and

if length(MatchesRegEx(Part[I1], '^[A-Z]+$', False))>0 then  begin

with

if length(MatchesRegEx(Part[I1], '^[A-Z0-9]+$', False))>0 then  begin
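
To see what the change does to the split, here is a small Python sketch comparing the two patterns. It is only an illustration outside ReNamer: the explicit `[A-Za-z]` classes stand in for the case-insensitive match, and "photo_01_copy_01" is a made-up name used to show the risk.

```python
import re

# Original split: only letter runs are words, digits fall into the separators.
LETTERS = r"[^A-Za-z]+|[A-Za-z]+"
# Modified split: digits are part of the words.
ALNUM = r"[^A-Za-z0-9]+|[A-Za-z0-9]+"

print(re.findall(LETTERS, "2byrch_women_2byrch"))
# ['2', 'byrch', '_', 'women', '_2', 'byrch']
print(re.findall(ALNUM, "2byrch_women_2byrch"))
# ['2byrch', '_', 'women', '_', '2byrch']
print(re.findall(ALNUM, "photo_01_copy_01"))
# ['photo', '_', '01', '_', 'copy', '_', '01']
```

With the second pattern, "2byrch" is one word and its repeat can be removed whole. The last line shows the downside: "01" also becomes a word, so its second occurrence would be removed as a repeat.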


#6 2011-12-03 00:32

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446

Wait... To avoid matching standalone numbers (digits with no letters), here is a safer way:

var
  Part: TStringsArray;
  Ext, LastSeparation: WideString;
  I1, I2: Integer;

begin
  // SPLIT THE BASE NAME INTO WORDS (LETTER/DIGIT RUNS) AND SEPARATORS (EVERYTHING ELSE)
  Part := MatchesRegEx(WideStripExtension(FileName), '([^A-Z0-9]+|[A-Z0-9]+)', False);

  // EMPTY NAME TO REBUILD LATER
  Ext := WideExtractFileExt(FileName);
  FileName := '';

  // GO THROUGH EVERY PART
  LastSeparation := '';
  for I1:=0 to length(Part)-1 do begin
    if (Part[I1] = '') then Continue;
    
    // REMOVE EVERY REPEATED WORD
    if length(MatchesRegEx(Part[I1], '^[0-9A-Z]+$', False))>0 then begin
      if length(MatchesRegEx(Part[I1], '^[0-9]+$', False))=0 then begin
        for I2:=I1+1 to length(Part)-1 do if WideSameText(Part[I1], Part[I2]) then Part[I2]:='';
      end;
      LastSeparation := '';
    end else begin
      // CLEAN REPEATED SEPARATORS ALSO
      if (Part[I1] = LastSeparation) then Part[I1] := '' else LastSeparation := Part[I1];
    end;

    // REBUILD FILE NAME
    FileName := FileName + Part[I1];
  end;
  
  FileName := FileName + Ext;
end.
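
The idea of the safer variant can be sketched in Python as follows. This is an illustration only, not ReNamer code, and it assumes the name is split into alphanumeric words (`[A-Z0-9]+`) and separators; `dedupe_alnum_words` and the "photo_01_copy_01.jpg" name are made up for the example.

```python
import os
import re

def dedupe_alnum_words(filename):
    """Drop repeated alphanumeric words, but never treat a pure number as a repeat."""
    base, ext = os.path.splitext(filename)
    # Assumed split: runs of letters/digits are words, everything else is a separator.
    parts = re.findall(r"[^A-Za-z0-9]+|[A-Za-z0-9]+", base)
    last_sep = ""
    rebuilt = ""
    for i in range(len(parts)):
        if parts[i] == "":
            continue  # already blanked as a repeat of an earlier word
        if re.fullmatch(r"[A-Za-z0-9]+", parts[i]):
            # Only words containing at least one letter are de-duplicated.
            if not re.fullmatch(r"[0-9]+", parts[i]):
                for j in range(i + 1, len(parts)):
                    if parts[j].lower() == parts[i].lower():
                        parts[j] = ""
            last_sep = ""
        elif parts[i] == last_sep:
            parts[i] = ""  # same separator twice in a row
        else:
            last_sep = parts[i]
        rebuilt += parts[i]
    return rebuilt + ext

print(dedupe_alnum_words("2byrch_women_2byrch.psd"))  # 2byrch_women_.psd
print(dedupe_alnum_words("photo_01_copy_01.jpg"))     # photo_01_copy_01.jpg
```

The mixed word "2byrch" is removed on its second occurrence, while the repeated "01" is left alone because it contains no letters.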

Last edited by SafetyCar (2011-12-04 20:35)

