Difference between revisions of "ReNamer:Pascal Script:Unicode String Handling Routines"

From den4b Wiki
(Replaced the use of TStringsArray with TWideStringArray.)
m (Text replacement - "<source>" to "<syntaxhighlight lang="pascal">")
Revision as of 15:00, 8 February 2017

Unicode String Handling Routines or How to operate on words

Swapping parts of the FileName

What if we have mp3 files named in a certain format, e.g. "author - title.mp3", and we want to rename them to "title - author.mp3"? We need to split the filename at a certain place (on " - ") and then use the resulting parts to build a new filename. We can achieve that with the WideSplitString function, which takes Input (the string to split) and Delimiter parameters and returns an array of strings (of TWideStringArray type). If the Input is "Queen - Bohemian Rhapsody" and the Delimiter is " - ", it will produce the array ["Queen", "Bohemian Rhapsody"].

Note that arrays of TWideStringArray type are zero-based, which means that the index of the first element is 0. So we will get array[0] = "Queen" and array[1] = "Bohemian Rhapsody".

The whole operation can be performed with the following piece of code.

To understand the code below you'll need basic knowledge about variables declaration, arrays and if-then-else statement.

<syntaxhighlight lang="pascal">
var
  SplittedFileName: TWideStringArray;
begin
  // Split the base name (without extension) on the ' - ' delimiter.
  SplittedFileName := WideSplitString(WideExtractBaseName(FileName), ' - ');
  // Swap the parts only if exactly two of them were found.
  if Length(SplittedFileName) = 2 then
    FileName := SplittedFileName[1] + ' - ' + SplittedFileName[0] + WideExtractFileExt(FileName);
end.
</syntaxhighlight>

The script will produce "Bohemian Rhapsody - Queen.mp3" from "Queen - Bohemian Rhapsody.mp3".

We check the length of the SplittedFileName array to ensure that we won't go out of the array bounds. This would happen if the files table contained a file named in a different format, e.g. "Bohemian Rhapsody (Queen)".
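
If a title may itself contain " - ", the split above yields more than two parts and the file is skipped. The following is only a sketch of one possible alternative, not part of the original script: it cuts the base name at the first delimiter, using the WidePos and WideCopy functions described later in this chapter. The variable names (BaseName, Author, Title, DelimIndex) are made up for this illustration.

<syntaxhighlight lang="pascal">
var
  BaseName, Author, Title: WideString;
  DelimIndex: Integer;
begin
  BaseName := WideExtractBaseName(FileName);
  // Position of the first ' - ' in the base name, or 0 if there is none.
  DelimIndex := WidePos(' - ', BaseName);
  if DelimIndex > 0 then
  begin
    // Everything before the delimiter.
    Author := WideCopy(BaseName, 1, DelimIndex - 1);
    // Everything after the delimiter, up to the end of the base name.
    Title := WideCopy(BaseName, DelimIndex + Length(' - '),
      Length(BaseName) - DelimIndex - Length(' - ') + 1);
    FileName := Title + ' - ' + Author + WideExtractFileExt(FileName);
  end;
end.
</syntaxhighlight>

With this approach "Queen - Bohemian Rhapsody.mp3" still becomes "Bohemian Rhapsody - Queen.mp3", and names without " - " are left untouched.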


Splitting the FileName into words

If we would like to split the FileName into words (a word in this case is anything that lies between two spaces), the proper line of code would look like this:

<syntaxhighlight lang="pascal">
SplittedFileName := WideSplitString(WideExtractBaseName(FileName), ' ');
</syntaxhighlight>
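
Once the words are in an array, we can process them one by one. As a rough sketch (the variable names Words, BaseName and I are chosen only for this example), the loop below capitalizes the first letter of every word and joins the words back together with single spaces. The check for empty words is a precaution, in case consecutive spaces produce empty array elements.

<syntaxhighlight lang="pascal">
var
  Words: TWideStringArray;
  BaseName: WideString;
  I: Integer;
begin
  Words := WideSplitString(WideExtractBaseName(FileName), ' ');
  BaseName := '';
  // The array is zero-based, so the last index is Length(Words) - 1.
  for I := 0 to Length(Words) - 1 do
  begin
    // Put a space back between words.
    if I > 0 then
      BaseName := BaseName + ' ';
    // Skip empty elements to avoid copying from an empty string.
    if Length(Words[I]) > 0 then
      BaseName := BaseName + WideUpperCase(WideCopy(Words[I], 1, 1)) +
        WideCopy(Words[I], 2, Length(Words[I]) - 1);
  end;
  FileName := BaseName + WideExtractFileExt(FileName);
end.
</syntaxhighlight>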


Replacing parts of the FileName

Another useful function is WideReplaceStr. With its help we can, for example, replace all occurrences of the phrase 'your car' with 'my car'.

<syntaxhighlight lang="pascal">
FileName := WideReplaceStr(FileName, 'your car', 'my car');
</syntaxhighlight>

It will also change 'not your car' into 'not my car', and if we are really possessive and egoistic we might not like that...
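
To make the problem concrete, here is a small illustration on fixed strings rather than on the FileName. The Sample variable is introduced only for this example.

<syntaxhighlight lang="pascal">
var
  Sample: WideString;
begin
  Sample := WideReplaceStr('this is your car', 'your car', 'my car');
  // Sample now holds 'this is my car', which is what we wanted.
  Sample := WideReplaceStr('this is not your car', 'your car', 'my car');
  // Sample now holds 'this is not my car', which is the unwanted result.
end.
</syntaxhighlight>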


WidePos, WideInsert and WideDelete functions

To solve this problem we will need a few other string handling functions and procedures: WidePos, WideInsert and WideDelete. If you are sure you won't process any Unicode characters, you may use the Pos, Insert and Delete functions/procedures instead.

Before we describe them, you need to know that strings in Pascal are represented as 1-based arrays of chars, which means that the first index of a string is 1 (so accessing FileName[0] causes an 'out of bounds' error).
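
As a minimal sketch of 1-based indexing (FirstChar and LastChar are names made up for this example), the valid indexes of a non-empty string run from 1 to Length:

<syntaxhighlight lang="pascal">
var
  FirstChar, LastChar: WideString;
begin
  // An empty string has no valid index at all, so check the length first.
  if Length(FileName) > 0 then
  begin
    FirstChar := FileName[1];                // first character
    LastChar := FileName[Length(FileName)];  // last character
  end;
end.
</syntaxhighlight>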

Now we can take a look at the description of functions/procedures that were mentioned above.

<syntaxhighlight lang="pascal">
function WidePos(const SubStr, S: WideString): Integer;
</syntaxhighlight>

WidePos finds the substring SubStr in the given string S and returns the position of its first char.

So WidePos('car', 'scar tissue') will return 2.

If the substring is not present in the string S, the function will return 0.
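
Because 0 means "not found", WidePos is often used simply to test whether a phrase is present. A small sketch, where the 'Unknown - ' prefix is just a placeholder invented for this example:

<syntaxhighlight lang="pascal">
begin
  // Add a prefix only to names that do not contain ' - ' yet.
  if WidePos(' - ', FileName) = 0 then
    FileName := 'Unknown - ' + FileName;
end.
</syntaxhighlight>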

<syntaxhighlight lang="pascal">
procedure WideInsert(const Substr: WideString; var Dest: WideString; Index: Integer);
</syntaxhighlight>

WideInsert inserts the given substring into the Dest string starting at Index. So if Dest holds 'it is my car', WideInsert('not ', Dest, 7) will change it into 'it is not my car'.

<syntaxhighlight lang="pascal">
procedure WideDelete(var S: WideString; Index, Count: Integer);
</syntaxhighlight>

WideDelete deletes Count chars from the string S starting at Index. So if S holds 'it is not my car', WideDelete(S, 7, 4) will change it back into 'it is my car'.

Armed with that knowledge, we can write a script that finds the 'your car' phrase and checks whether the word 'not' appears before it (anywhere between the beginning of the filename and the phrase). Only if there is no such word will it replace 'your' with 'my'.
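
Both procedures modify their string parameter in place (it is declared as var), so they have to be given a variable rather than a literal. The two examples above can be written out as the following short sketch:

<syntaxhighlight lang="pascal">
var
  S: WideString;
begin
  S := 'it is my car';
  WideInsert('not ', S, 7);  // S is now 'it is not my car'
  WideDelete(S, 7, 4);       // S is back to 'it is my car'
end.
</syntaxhighlight>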


Full control over Find & Replace operation

Unlike the WideReplaceStr function, this script will find only the first occurrence of the searched phrase. If we would like to check all occurrences, we would have to put this code into a loop.

<syntaxhighlight lang="pascal">
var
  Car_Index, Not_Index: Integer;
begin
  // Search in a lowercased copy, because WidePos is case sensitive.
  Car_Index := WidePos('your car', WideLowerCase(FileName));
  Not_Index := WidePos('not ', WideLowerCase(FileName));
  if Car_Index > 0 then
    // Replace only if there is no 'not ' before the phrase.
    if (Not_Index = 0) or (Not_Index > Car_Index) then
      begin
        WideDelete(FileName, Car_Index, Length('your'));
        WideInsert('my', FileName, Car_Index);
      end;
end.
</syntaxhighlight>

You may wonder why we searched for the 'your car' and 'not ' phrases in a lowercased FileName (WideLowerCase(FileName)). We did that because the WidePos function is case sensitive. Note that we did not change the actual case of the FileName; we only passed a lowercased copy of it to WidePos. This ensures that any case variant will be found, as all of them (e.g. 'Your Car', 'YoUR caR') are identical to 'your car' after lowercasing.
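
For completeness, here is one way such a loop might look. It is only a sketch built from the functions described in this chapter, and the variable names (LowerName, Tail, Offset, Rel) are invented for it. It searches the part of the name that starts at Offset, converts the result back to a position in the whole name, and then moves Offset past the phrase just handled. Like the script above, it assumes that lowercasing does not change the length of the string.

<syntaxhighlight lang="pascal">
var
  LowerName, Tail: WideString;
  Offset, Rel, Car_Index, Not_Index: Integer;
begin
  Offset := 1;
  repeat
    LowerName := WideLowerCase(FileName);
    // Look for the phrase only in the part not processed yet.
    Tail := WideCopy(LowerName, Offset, Length(LowerName) - Offset + 1);
    Rel := WidePos('your car', Tail);
    if Rel > 0 then
    begin
      // Convert the position inside Tail to a position inside FileName.
      Car_Index := Offset + Rel - 1;
      Not_Index := WidePos('not ', LowerName);
      if (Not_Index = 0) or (Not_Index > Car_Index) then
      begin
        WideDelete(FileName, Car_Index, Length('your'));
        WideInsert('my', FileName, Car_Index);
        // Continue right after the inserted 'my car'.
        Offset := Car_Index + Length('my car');
      end
      else
        // Leave this occurrence alone and continue after it.
        Offset := Car_Index + Length('your car');
    end;
  until Rel = 0;
end.
</syntaxhighlight>

Offset grows on every pass, so the loop ends as soon as no further 'your car' is found in the remaining part of the name.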


WideCopy function

Last, but not least, this chapter presents the WideCopy function. Let's take a look at its declaration:

<syntaxhighlight lang="pascal">
function WideCopy(const S: WideString; Index, Count: Integer): WideString;
</syntaxhighlight>

WideCopy returns the substring of the string S that starts at Index and has the number of chars given by the Count parameter.

This means that WideCopy('sit down', 5, 4) will return 'down' (4 letters starting from index 5).
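
A typical use of WideCopy is shortening names. The sketch below keeps only the first 20 characters of the base name; the limit of 20 and the BaseName variable are arbitrary choices made for this example.

<syntaxhighlight lang="pascal">
var
  BaseName: WideString;
begin
  BaseName := WideExtractBaseName(FileName);
  // Shorten only names that are longer than the limit.
  if Length(BaseName) > 20 then
    FileName := WideCopy(BaseName, 1, 20) + WideExtractFileExt(FileName);
end.
</syntaxhighlight>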


Making first letter capital

The WideCopy function lets us capitalize only the first letter of the filename.

<syntaxhighlight lang="pascal">
FileName := WideUpperCase(FileName[1]) + WideLowerCase(WideCopy(FileName, 2, Length(FileName)-1));
</syntaxhighlight>

We build the FileName from two parts: first comes the uppercased first letter of the FileName, and then the lowercased rest of the FileName. We use the WideCopy(FileName, 2, Length(FileName) - 1) expression to get everything from the second letter to the end of the FileName.
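
The line above also lowercases the extension and assumes that the FileName is not empty. If that is not desired, a slightly more defensive variant could look like the sketch below, which works on the base name only and leaves the extension as it is (BaseName is a variable introduced for this example):

<syntaxhighlight lang="pascal">
var
  BaseName: WideString;
begin
  BaseName := WideExtractBaseName(FileName);
  // Nothing to capitalize in an empty base name.
  if Length(BaseName) > 0 then
    FileName := WideUpperCase(WideCopy(BaseName, 1, 1)) +
      WideLowerCase(WideCopy(BaseName, 2, Length(BaseName) - 1)) +
      WideExtractFileExt(FileName);
end.
</syntaxhighlight>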