CharacterSet
CharacterSet (and its reference type counterpart, NSCharacterSet) is a Foundation type used to trim, filter, and search for characters in text.
The article introduces CharacterSet, a Foundation type in Swift for working with sets of Unicode scalar values. Despite its name, it is not a Set<Character>: it conforms to the SetAlgebra protocol and stores Unicode.Scalar values rather than Character values.
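The distinction between scalars and characters can be illustrated with a minimal sketch. The variable names are illustrative only and do not come from the article; the calls used (CharacterSet.letters, contains, union, init(charactersIn:)) are standard Foundation API.

```swift
import Foundation

// "é" written as two Unicode scalars: "e" followed by a combining acute accent.
// Swift treats the pair as one Character (a single grapheme cluster).
let eAcute: Character = "e\u{301}"
let scalarCount = eAcute.unicodeScalars.count

// CharacterSet membership is tested per Unicode.Scalar, never per Character,
// so a multi-scalar Character has to be checked scalar by scalar.
let firstIsLetter = CharacterSet.letters.contains(eAcute.unicodeScalars.first!)

// SetAlgebra operations are available, as on any other set-like type.
let vowels = CharacterSet(charactersIn: "aeiou")
let vowelsAndDigits = vowels.union(.decimalDigits)
let containsSeven = vowelsAndDigits.contains("7")
let containsZ = vowelsAndDigits.contains("z")
```

Because contains(_:) takes a Unicode.Scalar, the literals "7" and "z" above are inferred as scalars rather than as String or Character values.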
It details predefined character sets such as alphanumerics and letters, which are defined in terms of Unicode General Categories, and URL-specific sets (e.g., urlQueryAllowed), which list the characters permitted in a particular URL component. It also warns about common pitfalls, such as confusing capitalizedLetters (titlecase letters, General Category Lt) with uppercaseLetters.
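The capitalizedLetters pitfall is easiest to see with two concrete scalars. This is a small sketch with illustrative names; it relies on the documented behavior that capitalizedLetters covers only General Category Lt.

```swift
import Foundation

// U+0041 "A" is an ordinary uppercase letter (General Category Lu).
let upperA: Unicode.Scalar = "A"
// U+01C5 "ǅ" is one of the few titlecase letters (General Category Lt).
let titlecaseDz: Unicode.Scalar = "\u{01C5}"

let aIsUppercase = CharacterSet.uppercaseLetters.contains(upperA)
// capitalizedLetters holds only titlecase letters, so "A" is not a member.
let aIsCapitalized = CharacterSet.capitalizedLetters.contains(upperA)
let dzIsCapitalized = CharacterSet.capitalizedLetters.contains(titlecaseDz)
```

For checks like "does this string start with a capital letter", uppercaseLetters is therefore almost always the intended set.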
Practical uses include trimming whitespace with whitespacesAndNewlines, percent-encoding URL components, and validating user input, either by building custom sets with formUnion or by using inverted to match everything outside a set. A more advanced example builds a CharacterSet of emoji using Swift 5’s Unicode.Scalar.properties.isEmoji; its bitmapRepresentation allows the set to be stored compactly as a Data object of roughly 16 KB.
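The uses listed above might look roughly like the following sketch. The sample strings, the isValidUsername helper, and the scanned scalar range are assumptions made for illustration, not code from the article.

```swift
import Foundation

// Trimming: strip leading and trailing whitespace and newlines.
let trimmed = "  hello world \n".trimmingCharacters(in: .whitespacesAndNewlines)

// Percent-encoding: escape everything not allowed in a URL query.
let encoded = "q=swift characterset"
    .addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)

// Validation: start from a predefined set and extend it with formUnion.
// Note that alphanumerics also includes non-ASCII letters and digits.
var allowed = CharacterSet.alphanumerics
allowed.formUnion(CharacterSet(charactersIn: "_-"))

// Hypothetical helper: a name is valid if no scalar falls outside `allowed`.
func isValidUsername(_ name: String) -> Bool {
    return !name.isEmpty && name.rangeOfCharacter(from: allowed.inverted) == nil
}

// Emoji: collect every scalar in planes 0 and 1 that has the Emoji property.
// (Assumed range; the property is also true for ASCII digits, "#" and "*".)
var emoji = CharacterSet()
for value in UInt32(0)...UInt32(0x1FFFF) {
    if let scalar = Unicode.Scalar(value), scalar.properties.isEmoji {
        emoji.insert(scalar)
    }
}

// The bitmap form is a plain Data value that can be cached or shipped with an app.
let bitmap: Data = emoji.bitmapRepresentation
let restored = CharacterSet(bitmapRepresentation: bitmap)
```

Scanning the scalar range is slow enough that it is worth doing once and persisting the bitmap, then rebuilding the set with init(bitmapRepresentation:) at launch.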
The article contrasts CharacterSet with NSCharacterSet, which dates from a 16-bit UCS-2 context, and notes that although Swift’s String is now fully Unicode-compliant, CharacterSet remains a performant tool for text processing tasks like normalization and filtering.