Difference between revisions of "AnsiString"
(Created AnsiString English page) |
m (Alextpp moved page Ansistring to AnsiString) |
||
(6 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
{{LanguageBar}} | {{LanguageBar}} | ||
− | + | '''<syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight>''' is a variable-length string [[Data type|data type]]. | |
+ | It can store characters that have a size of one Byte. | ||
− | + | == implementation == | |
+ | In [[FPC]] an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> is implemented as a [[Pointer|pointer]]. | ||
+ | It is a managed data type. | ||
+ | As such it is initialized with [[Nil|<syntaxhighlight lang="pascal" inline>nil</syntaxhighlight>]] as soon as it enters the scope. | ||
+ | Memory for the character sequence is dynamically allocated and freed. | ||
− | <syntaxhighlight lang="pascal"> | + | An <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> points to the ''first'' character. |
− | + | This facilitates interfacing to libraries or foreign functions expecting [[PChar|<syntaxhighlight lang="pascal" inline>pChar</syntaxhighlight> strings]]. | |
− | + | For that, an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> always concludes with a null [[Byte]]. | |
− | </syntaxhighlight> | + | ''In [[Pascal]]'', this terminating null Byte has no significance as to the string’s value (including its length). |
+ | An <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> always entails some management data ''before'' the first character. | ||
+ | These are | ||
+ | * a code page | ||
+ | * the size of a character | ||
+ | * a reference count | ||
+ | * the length of the string. | ||
+ | {| class="wikitable" style="text-align: center; margin: auto;" | ||
+ | | <syntaxhighlight lang="pascal" inline>253</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>233</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>1</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>1</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>0</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>3</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>'F'</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>'o'</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>'o'</syntaxhighlight> | ||
+ | | <syntaxhighlight lang="pascal" inline>#0</syntaxhighlight> | ||
+ | |- | ||
+ | ! colspan="2" | code page | ||
+ | ! colspan="2" | maximum character size | ||
+ | ! colspan="4" | reference count | ||
+ | ! colspan="4" | length | ||
+ | ! colspan="3" | payload | ||
+ | ! complimentary Null | ||
+ | |- | ||
+ | | colspan="13" style="text-align: right;" | pointer points here ⤴ | ||
+ | | colspan="3" | | ||
+ | |+ <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> memory layout sample ([[32 bit|32-bit]] platform) | ||
+ | |} | ||
+ | Only the length field has significance in Pascal. | ||
+ | In Pascal, an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> may contain <syntaxhighlight lang="pascal" inline>#0</syntaxhighlight> characters. | ||
− | + | An <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> can furthermore be associated with a code page (since [[FPC New Features 3.0.0#Support for codepage-aware strings|3.0.0]]). | |
− | <syntaxhighlight lang="pascal"> | + | == application == |
− | + | The data type <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> can be used like any other string data type. | |
− | + | You may [[Becomes|assign]] string literals to an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> variable as normal. | |
− | + | String values can be compared ([[Equal|<syntaxhighlight lang="pascal" inline>=</syntaxhighlight>]]) just as usual. | |
− | </syntaxhighlight> | + | The entire pointer-characteristic is transparent. |
− | + | Characters in <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> have a 1-based index. | |
+ | <syntaxhighlight lang="pascal" inline>myAnsiString[1]</syntaxhighlight> refers to the first character. | ||
+ | {{Note|The linear character index is only guaranteed to work for strings that have a maximum character size of <syntaxhighlight lang="pascal" inline>1</syntaxhighlight>. That means, using an integer index for example on a UTF-8 encoded string (not exclusively containing ASCII characters) will produce erroneous results.}} | ||
− | <syntaxhighlight lang="pascal"> | + | The {{Doc|package=RTL|unit=system|identifier=length|text=<syntaxhighlight lang="pascal" inline>length</syntaxhighlight> function}}, and for that matter also <syntaxhighlight lang="pascal" inline>high</syntaxhighlight>, will return a string’s length by examining the length data field. |
− | |||
− | |||
− | </syntaxhighlight> | ||
− | + | Because an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> is essentially a pointer, ''copying'' strings of this type is fast, since only the reference is copied and the reference count increased. | |
+ | Modifications may trigger a <abbr title="copy-on-write">COW</abbr>. | ||
− | [[Category: | + | == caveats == |
+ | * The compiler directive [[$H|<syntaxhighlight lang="pascal" inline>{$longStrings on}</syntaxhighlight>]] (or <syntaxhighlight lang="pascal" inline>{$H+}</syntaxhighlight>) [[Defensive programming techniques#Do you know your String Type? Really?|aliases <syntaxhighlight lang="pascal" inline>string</syntaxhighlight>]] (without a specified length) to <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight>. | ||
+ | * <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> as a ''managed'' data type introduces a certain overhead. See [[Avoiding implicit try finally section]] for more explanations. | ||
+ | * The [[SizeOf|<syntaxhighlight lang="pascal" inline>sizeOf</syntaxhighlight>]] value of an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> variable is merely the size of a pointer. | ||
+ | * Assigning an empty string <syntaxhighlight lang="pascal" inline>''</syntaxhighlight> to an <syntaxhighlight lang="pascal" inline>AnsiString</syntaxhighlight> variable will in fact assign <syntaxhighlight lang="pascal" inline>nil</syntaxhighlight> to the variable and, if the reference count hits zero, release underlying memory (if any was previously allocated at all). Empty strings are ''not'' [[#implementation|stored as described above]]. | ||
+ | |||
+ | == see also == | ||
+ | * [[Character and string types]] | ||
+ | * [https://www.freepascal.org/docs-html/current/ref/refsu9.html#x32-370003.2.4 Ansistrings] in the reference guide | ||
+ | * [https://www.freepascal.org/docs-html/current/prog/progsu161.html#x205-2160008.2.7 Ansistrings] in the programmer’s guide | ||
+ | |||
+ | [[Category: Data types]] |
Latest revision as of 13:27, 22 December 2023
`AnsiString` is a variable-length string data type.
It can store characters that have a size of one Byte.
implementation
In FPC an `AnsiString` is implemented as a pointer.
It is a managed data type.
As such it is initialized with `nil` as soon as it enters the scope.
Memory for the character sequence is dynamically allocated and freed.
An `AnsiString` points to the first character.
This facilitates interfacing to libraries or foreign functions expecting `pChar` strings.
For that, an `AnsiString` always concludes with a null Byte.
In Pascal, this terminating null Byte has no significance as to the string’s value (including its length).
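
The pointer and the terminating null Byte can be observed with a typecast. The following is a minimal sketch, assuming a reasonably recent FPC; the program and variable names are made up for illustration, and `StrLen` is taken from the `strings` unit.

```pascal
program AnsiStringPChar;
{$mode objfpc}{$longStrings on}
uses
  strings; // StrLen for null-terminated strings

var
  s: AnsiString;
  p: PChar;
begin
  s := 'Foo';
  // The typecast yields the address of the first character.
  // No copy is made, so p is only valid as long as s is left unchanged.
  p := PChar(s);
  WriteLn(Length(s));  // length as Pascal sees it (length field)
  WriteLn(StrLen(p));  // length as C-style code sees it (scan up to the null Byte)
  WriteLn(p[3] = #0);  // the null Byte directly follows the three payload characters
end.
```

Both length values agree here only because the sample string contains no `#0` character of its own.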
An `AnsiString` always entails some management data before the first character.
These are
- a code page
- the size of a character
- a reference count
- the length of the string.
`AnsiString` memory layout sample (32-bit platform)

| code page | maximum character size | reference count | length | payload | complimentary Null |
|---|---|---|---|---|---|
| 253 233 | 0 1 | 0 0 0 1 | 0 0 0 3 | 'F' 'o' 'o' | #0 |
| | | | | ⤴ pointer points here (at 'F') | |
Only the length field has significance in Pascal.
In Pascal, an `AnsiString` may contain `#0` characters.
An `AnsiString` can furthermore be associated with a code page (since 3.0.0).
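
The management data can be inspected without touching memory directly. This sketch assumes FPC 3.0.0 or later, where the `system` unit provides `StringCodePage`, `StringElementSize` and `StringRefCount`; the printed code page depends on the platform and its settings, so no particular value is claimed here.

```pascal
program AnsiStringHeader;
{$mode objfpc}{$longStrings on}
var
  s: AnsiString;
begin
  // Build the value at run time so that it is allocated on the heap;
  // string constants may be stored with a special reference count.
  s := 'Fo';
  s := s + 'o';
  WriteLn('code page:       ', StringCodePage(s));
  WriteLn('character size:  ', StringElementSize(s));
  WriteLn('reference count: ', StringRefCount(s));
  WriteLn('length:          ', Length(s));

  // A #0 character is ordinary payload in Pascal and counts toward the length.
  s := 'Foo' + #0 + 'bar';
  WriteLn(Length(s)); // 7
end.
```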
application
The data type `AnsiString` can be used like any other string data type.
You may assign string literals to an `AnsiString` variable as normal.
String values can be compared (`=`) just as usual.
The entire pointer-characteristic is transparent.
Characters in `AnsiString` have a 1-based index.
`myAnsiString[1]` refers to the first character.
Note: The linear character index is only guaranteed to work for strings that have a maximum character size of `1`. That means, using an integer index for example on a UTF-8 encoded string (not exclusively containing ASCII characters) will produce erroneous results.
The `length` function, and for that matter also `high`, will return a string’s length by examining the length data field.
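
Indexing and `length` can be tried out as follows. This is only a sketch; the UTF-8 sample is assembled from explicit byte values (an assumption made so that the result does not depend on the source file's encoding).

```pascal
program AnsiStringIndex;
{$mode objfpc}{$longStrings on}
var
  s: AnsiString;
  i: Integer;
begin
  s := 'Foo';
  WriteLn(s[1]);          // first character
  WriteLn(s[Length(s)]);  // last character

  // Walk over all characters using the 1-based index.
  for i := 1 to Length(s) do
    Write(Ord(s[i]), ' ');
  WriteLn;

  // 'café' in UTF-8: the 'é' occupies the two bytes $C3 $A9.
  s := 'caf' + Chr($C3) + Chr($A9);
  WriteLn(Length(s));     // 5: bytes are counted, not the four visible characters
  WriteLn(Ord(s[4]));     // 195: s[4] is only the first half of 'é'
end.
```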
Because an `AnsiString` is essentially a pointer, copying strings of this type is fast, since only the reference is copied and the reference count increased.
Modifications may trigger a COW.
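
The sharing and the copy-on-write can be made visible by comparing the pointers. A minimal sketch, assuming FPC 3.0.0 or later for `StringRefCount`; the exact reference count shown may vary with compiler version and settings.

```pascal
program AnsiStringCow;
{$mode objfpc}{$longStrings on}
var
  a, b: AnsiString;
begin
  a := 'Fo';
  a := a + 'o';                      // run-time value on the heap
  b := a;                            // copies the reference only
  WriteLn(Pointer(a) = Pointer(b));  // TRUE: both share the same character data
  WriteLn(StringRefCount(a));        // the shared data is referenced more than once

  b[1] := 'B';                       // write access: b gets its own copy first
  WriteLn(Pointer(a) = Pointer(b));  // FALSE: the data is no longer shared
  WriteLn(a, ' ', b);                // Foo Boo
end.
```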
caveats
- The compiler directive `{$longStrings on}` (or `{$H+}`) aliases `string` (without a specified length) to `AnsiString`.
- `AnsiString` as a managed data type introduces a certain overhead. See Avoiding implicit try finally section for more explanations.
- The `sizeOf` value of an `AnsiString` variable is merely the size of a pointer.
- Assigning an empty string `''` to an `AnsiString` variable will in fact assign `nil` to the variable and, if the reference count hits zero, release underlying memory (if any was previously allocated at all). Empty strings are not stored as described above.
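
The last two caveats can be checked with a few lines. Again this is just an illustrative sketch with made-up names, not a statement about how the RTL is implemented.

```pascal
program AnsiStringCaveats;
{$mode objfpc}{$longStrings on}
var
  s: AnsiString;
begin
  s := 'a fairly long string value';
  // SizeOf reports the size of the variable (a pointer), not of the text.
  WriteLn(SizeOf(s) = SizeOf(Pointer)); // TRUE
  WriteLn(Length(s));                   // use Length to get the number of characters

  s := '';
  WriteLn(Pointer(s) = nil);            // TRUE: an empty string is a nil pointer
end.
```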
see also
- Character and string types
- Ansistrings in the reference guide
- Ansistrings in the programmer’s guide