Difference between revisions of "UTF8 Tools"
From Free Pascal wiki
== Download ==
[http://www.theo.ch/lazarus/utf8tools.zip Download utf8tools.zip]
Revision as of 11:17, 9 July 2009
About
Sharing some of my code
UTF-8 Tools
Purpose
Some tools for common problems with UTF-8 / Unicode.
- charencstreams.pas: Load and save text from almost any source, such as:
  - ANSI, UTF-8, UTF-16, UTF-32
  - big or little endian
  - with or without a BOM
Simple demo:
 fCES := TCharEncStream.Create;
 fCES.LoadFromFile(OpenDialog1.FileName);
 Memo1.Text := fCES.UTF8Text;
 fCES.Free;
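To give an idea of what "BOM or no BOM" and "big or little endian" mean at the byte level, here is a minimal sketch of byte order mark detection in plain Free Pascal. It is not taken from charencstreams.pas; the type TBomKind and the function DetectBom are hypothetical names used only for illustration, and a real loader also needs heuristics for files without a BOM.

```pascal
program BomSketch;
{$mode objfpc}{$H+}

uses
  SysUtils; // provides TBytes

type
  // Hypothetical enumeration, not part of charencstreams.pas.
  TBomKind = (bkNone, bkUTF8, bkUTF16LE, bkUTF16BE, bkUTF32LE, bkUTF32BE);

// Report which byte order mark, if any, the buffer starts with.
function DetectBom(const Data: TBytes): TBomKind;
var
  n: Integer;
begin
  Result := bkNone;
  n := Length(Data);
  // UTF-32 is tested before UTF-16 because FF FE 00 00 also starts with FF FE.
  if (n >= 4) and (Data[0] = $FF) and (Data[1] = $FE) and (Data[2] = 0) and (Data[3] = 0) then
    Result := bkUTF32LE
  else if (n >= 4) and (Data[0] = 0) and (Data[1] = 0) and (Data[2] = $FE) and (Data[3] = $FF) then
    Result := bkUTF32BE
  else if (n >= 3) and (Data[0] = $EF) and (Data[1] = $BB) and (Data[2] = $BF) then
    Result := bkUTF8
  else if (n >= 2) and (Data[0] = $FF) and (Data[1] = $FE) then
    Result := bkUTF16LE
  else if (n >= 2) and (Data[0] = $FE) and (Data[1] = $FF) then
    Result := bkUTF16BE;
end;

var
  Sample: TBytes;
begin
  // UTF-8 BOM followed by the letter 'A'.
  SetLength(Sample, 4);
  Sample[0] := $EF;
  Sample[1] := $BB;
  Sample[2] := $BF;
  Sample[3] := Ord('A');
  if DetectBom(Sample) = bkUTF8 then
    WriteLn('UTF-8 BOM found');
end.
```

Note that FF FE 00 00 is ambiguous in principle (it could also be a UTF-16LE BOM followed by U+0000); the sketch simply prefers UTF-32 in that case.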
- character.pas: Get information about code points using the TCharacter class.
Demo
if TCharacter.IsLetter(s[i]) then s[i] := TCharacter.toLower(s[i]);
- utf8scanner.pas: Access UTF-8 strings by code point index, use case statements on UTF-8 strings, and more.
Index demo
 s := TUTF8Scanner.Create(Memo1.Text);
 for i := 1 to s.Length do
   if TCharacter.IsLetter(s[i]) then
     s[i] := TCharacter.ToLower(s[i]);
 Memo1.Text := s.UTF8String;
 s.Free;
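Indexing by code point matters because a UTF-8 string stores one to four bytes per code point, so the i-th character is generally not the i-th byte. The following is a minimal sketch in plain Free Pascal of how a code point index can be mapped to a byte offset. It is not the code from utf8scanner.pas; Utf8SeqLen and Utf8ByteOffset are hypothetical helper names, and the sketch assumes well-formed UTF-8 input.

```pascal
program CodePointIndexSketch;
{$mode objfpc}{$H+}

// Number of bytes in the UTF-8 sequence that starts with lead byte b.
// Returns 1 for continuation or invalid bytes so that a scan always advances.
function Utf8SeqLen(b: Byte): Integer;
begin
  if b < $80 then
    Result := 1
  else if (b and $E0) = $C0 then
    Result := 2
  else if (b and $F0) = $E0 then
    Result := 3
  else if (b and $F8) = $F0 then
    Result := 4
  else
    Result := 1;
end;

// 1-based byte offset of the 1-based code point index, or 0 if out of range.
function Utf8ByteOffset(const s: RawByteString; CodePointIndex: Integer): Integer;
var
  p, cp: Integer;
begin
  Result := 0;
  p := 1;
  cp := 1;
  while p <= Length(s) do
  begin
    if cp = CodePointIndex then
      Exit(p);
    Inc(p, Utf8SeqLen(Ord(s[p])));
    Inc(cp);
  end;
end;

var
  s: RawByteString;
begin
  // 'a', 'ö', 'b' written as raw UTF-8 bytes, so the result does not
  // depend on the encoding of this source file.
  s := 'a' + #$C3#$B6 + 'b';
  // The third code point ('b') starts at byte 4, because 'ö' takes two bytes.
  WriteLn(Utf8ByteOffset(s, 3));
end.
```

Each lookup in this sketch walks the string from the start, so a loop over all indices costs quadratic time; a scanner class can avoid that by remembering its current position.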
Case demo
 s := TUTF8Scanner.Create(Memo1.Text);
 s.FindChars := 'öäü';
 repeat
   case s.FindIndex(s.Next) of
     {ö} 0: s.Replace('oe');
     {ä} 1: s.Replace('ae');
     {ü} 2: s.Replace('ue');
   end;
 until s.Done;
 Memo1.Text := s.UTF8String;
 s.Free;
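For comparison, the same transliteration can be sketched without the scanner, using only StringReplace from SysUtils. This is an alternative technique, not how utf8scanner.pas works: it makes one pass over the string per umlaut instead of a single scan. The function name TransliterateUmlauts and the constant names are hypothetical. The umlauts are spelled as raw UTF-8 bytes so that the example does not depend on the encoding of the source file.

```pascal
program UmlautSketch;
{$mode objfpc}{$H+}

uses
  SysUtils; // provides StringReplace

const
  // UTF-8 byte sequences of the three lowercase umlauts.
  UTF8_OE = #$C3#$B6; // ö (U+00F6)
  UTF8_AE = #$C3#$A4; // ä (U+00E4)
  UTF8_UE = #$C3#$BC; // ü (U+00FC)

// Replace ö, ä and ü in a UTF-8 string with oe, ae and ue.
function TransliterateUmlauts(const s: AnsiString): AnsiString;
begin
  Result := StringReplace(s, UTF8_OE, 'oe', [rfReplaceAll]);
  Result := StringReplace(Result, UTF8_AE, 'ae', [rfReplaceAll]);
  Result := StringReplace(Result, UTF8_UE, 'ue', [rfReplaceAll]);
end;

begin
  if TransliterateUmlauts('sch' + UTF8_OE + 'n') = 'schoen' then
    WriteLn('ok');
end.
```

A byte-wise search is safe here because UTF-8 is self-synchronizing: the complete byte sequence of one code point never occurs inside the encoding of another, so there are no false matches in well-formed input.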