CharacterSet
CharacterSet (and its reference type counterpart, NSCharacterSet) is a Foundation type used to trim, filter, and search for characters in text.
The article introduces CharacterSet, a Foundation type in Swift for working with sets of Unicode scalar values. Despite its name, it is not a Set<Character>: it conforms to the SetAlgebra protocol, but it stores Unicode.Scalar values rather than Character values.
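The difference between scalars and characters can be sketched as follows. This is a minimal illustration rather than code from the article; the sample values and constant names are chosen here for demonstration.

```swift
import Foundation

// CharacterSet membership is tested one Unicode scalar at a time.
let letterA: Unicode.Scalar = "a"
let containsA = CharacterSet.letters.contains(letterA)

// A single Character (grapheme cluster) can be made of several scalars:
// "e" followed by U+0301 COMBINING ACUTE ACCENT is one Character.
let accented: Character = "e\u{301}"
let scalarCount = accented.unicodeScalars.count

// SetAlgebra operations such as union return a new CharacterSet.
let lettersAndDigits = CharacterSet.letters.union(.decimalDigits)
let containsSeven = lettersAndDigits.contains("7")

// Set<Character>, by contrast, stores whole grapheme clusters.
let characters: Set<Character> = ["a", accented]

print(containsA, scalarCount, containsSeven, characters.count)
// prints "true 2 true 2"
```

Because membership is checked per scalar, testing a whole string against a CharacterSet usually means iterating its unicodeScalars view rather than its characters.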
It details predefined character sets like alphanumerics, letters, and URL-specific sets (e.g., urlQueryAllowed), many of which correspond to Unicode General Categories. It also warns about common pitfalls, such as confusing capitalizedLetters (titlecase letters, General Category Lt) with uppercaseLetters.
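The capitalizedLetters pitfall is easy to check directly. The sketch below uses two sample scalars chosen here for illustration: an ordinary uppercase "A" and the titlecase digraph U+01C5.

```swift
import Foundation

let upperA: Unicode.Scalar = "A"              // U+0041, General Category Lu
let titlecaseDz: Unicode.Scalar = "\u{01C5}"  // U+01C5, General Category Lt

let aIsUppercase = CharacterSet.uppercaseLetters.contains(upperA)
let aIsCapitalized = CharacterSet.capitalizedLetters.contains(upperA)
let dzIsCapitalized = CharacterSet.capitalizedLetters.contains(titlecaseDz)

print(aIsUppercase, aIsCapitalized, dzIsCapitalized)
// prints "true false true"
```

An ordinary capital letter is therefore not a member of capitalizedLetters; checks for uppercase text should use uppercaseLetters.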
Practical uses include trimming whitespace with whitespacesAndNewlines, percent-encoding URL components, and validating user input, either by building custom sets with formUnion or by using inverted to exclude characters. More advanced usage includes building a CharacterSet for emoji with Swift 5’s Unicode.Scalar.properties.isEmoji; its bitmapRepresentation allows the set to be stored compactly as a Data object of about 16KB.
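These uses can be sketched in a few lines. The helper names (makeUsernameCharacters, isValidUsername, makeEmojiSet), the username rule, and the scanned code point range are assumptions made for this illustration, not the article's own code.

```swift
import Foundation

// Trimming: remove leading and trailing whitespace and newlines.
let trimmed = "  hello world \n".trimmingCharacters(in: .whitespacesAndNewlines)

// Percent-encoding: escape everything outside the allowed set.
let encoded = "hello world".addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)

// Validation: build a custom set with formUnion, then reject any input that
// contains a character from the inverted (disallowed) set.
// Note: alphanumerics covers letters and digits from all scripts, not only ASCII.
func makeUsernameCharacters() -> CharacterSet {
    var allowed = CharacterSet.alphanumerics
    allowed.formUnion(CharacterSet(charactersIn: "_-"))
    return allowed
}

func isValidUsername(_ candidate: String) -> Bool {
    let disallowed = makeUsernameCharacters().inverted
    return !candidate.isEmpty && candidate.rangeOfCharacter(from: disallowed) == nil
}

// Emoji: collect every scalar whose Unicode Emoji property is set.
// Assumption: scanning planes 0 and 1 (U+0000...U+1FFFF) is sufficient.
// ASCII digits, "#", and "*" also carry the Emoji property, so they end up in the set.
func makeEmojiSet() -> CharacterSet {
    var emoji = CharacterSet()
    let codePoints: ClosedRange<UInt32> = 0x0000...0x1FFFF
    for codePoint in codePoints {
        // Surrogate code points are not valid scalars and are skipped.
        guard let scalar = Unicode.Scalar(codePoint) else { continue }
        if scalar.properties.isEmoji { emoji.insert(scalar) }
    }
    return emoji
}

let emoji = makeEmojiSet()
let bitmap: Data = emoji.bitmapRepresentation  // compact form, suitable for caching

print(trimmed, encoded ?? "nil", isValidUsername("swift_dev-42"), bitmap.count)
```

Caching the bitmap and rebuilding the set with CharacterSet(bitmapRepresentation:) avoids rescanning the code point range on every launch.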
The article contrasts CharacterSet with NSCharacterSet, noting that the type evolved from a 16-bit UCS-2 context and now sits alongside Swift’s Unicode-compliant String. It nevertheless remains a performant tool for text processing tasks like normalization and filtering.
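As a hedged sketch of such filtering and normalization tasks, splitting on a set and rejoining the pieces is one common Foundation idiom. The sample strings below are invented for illustration.

```swift
import Foundation

// Filtering: keep only decimal digits by splitting on everything else.
let rawPhone = "+1 (555) 010-9999"
let digitsOnly = rawPhone
    .components(separatedBy: CharacterSet.decimalDigits.inverted)
    .joined()

// Normalization: collapse runs of whitespace into single spaces.
let messy = "  too   many\tspaces \n"
let collapsed = messy
    .components(separatedBy: .whitespacesAndNewlines)
    .filter { !$0.isEmpty }
    .joined(separator: " ")

print(digitsOnly)  // prints "15550109999"
print(collapsed)   // prints "too many spaces"
```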