Difference between revisions of "ReNamer:Pascal Script:Unicode String Handling Routines"
Revision as of 17:10, 25 May 2009
Unicode String Handling Routines or How to operate on words
Suppose we have mp3 files named in a fixed format, e.g. 'author - title.mp3', and we want to rename them to 'title - author.mp3'. We need to split the filename at a known place (on ' - ') and then use the resulting parts to build a new filename. The WideSplitString function does this: it takes a string to split (Input) and a Delimiter, and returns an array of strings (TStringsArray). If the Input is 'Queen - Bohemian Rhapsody' and the Delimiter is ' - ', it produces the array ['Queen', 'Bohemian Rhapsody'].
Note that TStringsArray arrays are zero-based, which means the index of the first element is 0. So we get array[0] = 'Queen' and array[1] = 'Bohemian Rhapsody'.
The whole operation takes only a few lines of code.
To understand the code below you'll need basic knowledge of variable declarations, arrays and the if-then-else statement.
var
  SplittedFileName: TStringsArray;
begin
  // Split the base name (without extension) on ' - '
  SplittedFileName := WideSplitString(WideExtractBaseName(FileName), ' - ');
  // Swap the parts only when exactly two were found
  if Length(SplittedFileName) = 2 then
    FileName := SplittedFileName[1] + ' - ' + SplittedFileName[0] + WideExtractFileExt(FileName);
end.
The script will produce 'Bohemian Rhapsody - Queen.mp3' from 'Queen - Bohemian Rhapsody.mp3'.
We check the length of the SplittedFileName array to make sure we don't read outside the array bounds, which would raise an error. This could happen if the files table contained a file in a different format, e.g. 'Bohemian Rhapsody (Queen)'.
If we wanted to split the FileName into words (a word here being anything that lies between two spaces), the line of code would look like this:
SplittedFileName:=WideSplitString(WideExtractBaseName(FileName), ' ');
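Once the filename is split into words, the array can be walked with a loop. The following is only a minimal sketch, assuming the same functions used above; the variable names Words and NewName are made up for illustration. It rebuilds the base name with the words in reverse order and keeps the extension.

```pascal
var
  Words: TStringsArray;
  NewName: WideString;
  I: Integer;
begin
  // Split the base name (without extension) into words
  Words := WideSplitString(WideExtractBaseName(FileName), ' ');
  NewName := '';
  // The array is zero-based, so the last index is Length(Words) - 1
  for I := Length(Words) - 1 downto 0 do
  begin
    NewName := NewName + Words[I];
    // Put a space between words, but not after the last one
    if I > 0 then
      NewName := NewName + ' ';
  end;
  FileName := NewName + WideExtractFileExt(FileName);
end.
```

With this sketch, a file called 'one two three.txt' would be expected to become 'three two one.txt'.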
Another useful function is WideReplaceStr. With its help we can, for example, replace all occurrences of the phrase 'your car' with 'my car'.
FileName:=WideReplaceStr(FileName, 'your car', 'my car');
It will also change 'not your car' into 'not my car', and if we are really possessive and egoistic we might not like that...
To solve this problem we need a few other string handling functions and procedures: WidePos, WideInsert and WideDelete. If you're sure you won't process any Unicode characters, you may use the Pos, Insert and Delete functions/procedures instead.
Before describing them, note that strings in Pascal are indexed like 1-based arrays of chars, which means that the first index of a string is 1 (so FileName[0] gives an 'out of bounds' error).
Now we can take a look at the functions/procedures mentioned above.
function WidePos(const SubStr, S: WideString): Integer;
WidePos finds a substring in the given string S and returns the position of its first char.
So WidePos('car', 'scar tissue') will return 2.
If the substring is not present in string S the function will return 0.
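Because a missing substring yields 0, the result of WidePos can be used directly as a test. A minimal sketch, with a made-up search word and prefix chosen only for illustration:

```pascal
begin
  // WidePos returns 0 when 'live' does not occur in the filename.
  // The search is case sensitive, so 'Live' would not match here.
  if WidePos('live', FileName) > 0 then
    FileName := 'Concert - ' + FileName;
end.
```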
procedure WideInsert(const Substr: WideString; var Dest: WideString; Index: Integer);
WideInsert inserts the given substring into the Dest string starting at Index. Dest is a var parameter, so it must be a variable: if Dest holds 'it is my car', WideInsert('not ', Dest, 7) will change it into 'it is not my car'.
procedure WideDelete(var S: WideString; Index, Count: Integer);
WideDelete deletes Count chars from the string S, starting at Index. S is a var parameter as well: if S holds 'it is not my car', WideDelete(S, 7, 4) will change it back into 'it is my car'.
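Both procedures modify the string passed to them, so they need a variable and not a literal. A small sketch of the two examples above, using a hypothetical variable S:

```pascal
var
  S: WideString;
begin
  S := 'it is my car';
  // Insert 'not ' so that it starts at position 7 (strings are 1-based)
  WideInsert('not ', S, 7);   // S is now 'it is not my car'
  // Delete 4 chars starting at position 7, which removes 'not ' again
  WideDelete(S, 7, 4);        // S is back to 'it is my car'
end.
```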
Armed with that knowledge we can write a script that finds the phrase 'your car' and checks whether the word 'not' appears before it (anywhere between the beginning of the filename and the phrase). Only if there is no such word will it replace 'your' with 'my'.
Unlike the WideReplaceStr function, this script handles only the first occurrence of the searched phrase. To check all occurrences, we would have to put this code into a loop.
var
  Car, Not_Word: Integer;
begin
  Car := WidePos('your car', WideLowerCase(FileName));
  Not_Word := WidePos('not ', WideLowerCase(FileName));
  if Car > 0 then
    // Replace only when no 'not ' appears before the phrase
    if (Not_Word = 0) or (Not_Word > Car) then
    begin
      WideDelete(FileName, Car, Length('your'));
      WideInsert('my', FileName, Car);
    end;
end.
You may wonder why we searched for the 'your car' and 'not ' phrases in a lowercased filename (WideLowerCase(FileName)). We did that because the WidePos function is case sensitive. Note that we didn't change the actual case of the filename; we just passed a lowercased copy of the filename string into WidePos. This ensures that any variant of case will be found, as all of them (e.g. 'Your Car', 'YoUR caR') are identical to 'your car' after lowercasing.
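The loop over all occurrences mentioned earlier could be sketched as follows. This is one possible approach under stated assumptions, not the only one: since WidePos always searches from the start of the string, the sketch searches in the not yet scanned remainder obtained with WideCopy (described later in this chapter). The names Lower, Rest and Offset are made up for illustration.

```pascal
var
  Lower, Rest: WideString;
  Offset, Car, Not_Word: Integer;
begin
  Offset := 0; // number of chars at the start that were already scanned
  repeat
    Lower := WideLowerCase(FileName);
    // Search only in the part of the filename that comes after Offset
    Rest := WideCopy(Lower, Offset + 1, Length(Lower) - Offset);
    Car := WidePos('your car', Rest);
    if Car > 0 then
    begin
      Car := Car + Offset; // position within the whole filename
      Not_Word := WidePos('not ', Lower);
      if (Not_Word = 0) or (Not_Word > Car) then
      begin
        // No 'not ' before this occurrence: replace 'your' with 'my'
        WideDelete(FileName, Car, Length('your'));
        WideInsert('my', FileName, Car);
        Offset := Car + Length('my car') - 1;
      end
      else
        // Leave this occurrence alone and continue after it
        Offset := Car + Length('your car') - 1;
    end;
  until Car = 0;
end.
```

Offset grows on every pass that finds the phrase, so the loop ends once no further occurrence is left in the remainder.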
Last but not least, this chapter presents the WideCopy function. Let's take a look at its declaration:
function WideCopy(const S: WideString; Index, Count: Integer): WideString;
WideCopy returns the substring of string S that starts at Index and contains Count chars.
This means that WideCopy('sit down', 5, 4) will return 'down' (4 letters starting from index 5).
This function lets us capitalize only the first letter of the filename.
FileName:=WideUpperCase(FileName[1])+WideLowerCase(WideCopy(FileName, 2, Length(FileName)-1));
We build the FileName from two blocks: the first is the first letter of FileName changed to uppercase, and the second is the rest of the FileName made lowercase. We use the WideCopy(FileName, 2, Length(FileName)-1) statement to get everything from the second letter to the end of the filename.
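Note that the one-line version also lowercases the extension and assumes the filename is not empty. A slightly more defensive variant, sketched here with a hypothetical BaseName variable and only the functions already introduced in this chapter, works on the base name and leaves the extension untouched:

```pascal
var
  BaseName: WideString;
begin
  BaseName := WideExtractBaseName(FileName);
  // Guard against an empty base name before reading its first char
  if Length(BaseName) > 0 then
    FileName := WideUpperCase(WideCopy(BaseName, 1, 1)) +
      WideLowerCase(WideCopy(BaseName, 2, Length(BaseName) - 1)) +
      WideExtractFileExt(FileName);
end.
```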