Unicode Converter

What is Unicode?

"Unicode provides a unique number for every character, no matter what the platform, program, or language is."
Source: Unicode Consortium
  • Unicode. Unicode is the universal character encoding, maintained by the Unicode Consortium. This encoding standard provides the basis for processing, storage, and interchange of text data in any language in all modern software and information technology protocols.

    In text processing, Unicode takes the role of providing a unique code point—a number, not a glyph—for each character. In other words, Unicode represents a character in an abstract way and leaves the visual rendering (size, shape, font, or style) to other software, such as a web browser or word processor.

    Code point: U+1F600 😀
    Short Name: grinning face

    Source: Stack Overflow, Wikipedia
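
    The mapping between a character and its code point can be sketched in JavaScript as follows. This is a minimal illustration, not this application's actual implementation; the helper names toCodePointLabel and fromCodePointLabel are hypothetical. Only the built-ins String.prototype.codePointAt and String.fromCodePoint are standard.

```javascript
// Hypothetical helpers (not part of any library) that convert between
// a character and its "U+XXXX" code point label.

// Return the code point of the first character as a "U+XXXX" label.
function toCodePointLabel(ch) {
  const codePoint = ch.codePointAt(0); // reads a full code point, not just one UTF-16 unit
  return "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0");
}

// Return the character for a label such as "U+1F600".
function fromCodePointLabel(label) {
  const hex = label.replace(/^U\+/i, "");
  return String.fromCodePoint(parseInt(hex, 16));
}

console.log(toCodePointLabel("😀"));        // U+1F600
console.log(toCodePointLabel("A"));         // U+0041
console.log(fromCodePointLabel("U+1F600")); // 😀
```

    Note that codePointAt is used rather than charCodeAt, because charCodeAt would return only the first 16-bit unit of a character outside the Basic Multilingual Plane.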

  • Unicode Plane

    Unicode planes and the code point ranges used

    Plane        Allocated code points   Assigned characters
    0  BMP                      65,520                55,632
    1  SMP                      25,696                22,982
    2  SIP                      60,912                60,872
    3  TIP                       4,944                 4,939
    14 SSP                         368                   337
    15 SPUA-A                   65,536                  none
    16 SPUA-B                   65,536                  none
    Totals                     288,512               144,762

    Source: Wikipedia
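
    Each plane spans 65,536 (0x10000) code points, so the plane of a code point is simply its value divided by 0x10000. A small sketch of that arithmetic, assuming a hypothetical helper named planeOf and using the plane abbreviations from the table above:

```javascript
// Abbreviations for the planes listed in the table; other planes are unassigned.
const PLANE_NAMES = {
  0: "BMP",
  1: "SMP",
  2: "SIP",
  3: "TIP",
  14: "SSP",
  15: "SPUA-A",
  16: "SPUA-B",
};

// Hypothetical helper: return the plane number and abbreviation for a code point.
function planeOf(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10ffff) {
    throw new RangeError("Not a Unicode code point: " + codePoint);
  }
  const plane = codePoint >> 16; // same as Math.floor(codePoint / 0x10000)
  return { plane: plane, name: PLANE_NAMES[plane] || "unassigned" };
}

console.log(planeOf(0x41));     // { plane: 0, name: 'BMP' }
console.log(planeOf(0x1f600));  // { plane: 1, name: 'SMP' }
console.log(planeOf(0x10ffff)); // { plane: 16, name: 'SPUA-B' }
```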

  • UTF-16. (16-bit Unicode Transformation Format) A character encoding capable of encoding all 1,112,064 valid character code points of Unicode. It is used by systems such as the Microsoft Windows API, the Java programming language, and JavaScript/ECMAScript. That is also why this application was built: to facilitate i18n development in JavaScript.

    Example:
    4-hex-digit escapes (surrogate pair): \ud83d\ude00 😀
    JS code point escape: \u{1f600} 😀

    Source: Wikipedia
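
    The surrogate pair in the example above can be derived by hand. UTF-16 subtracts 0x10000 from the code point, then splits the remaining 20 bits into two 10-bit halves, added to 0xD800 and 0xDC00 respectively. The sketch below shows that arithmetic; the function names toSurrogatePair and toJsEscape are hypothetical and are not taken from this application.

```javascript
// Hypothetical helper: split a supplementary code point (U+10000..U+10FFFF)
// into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("Surrogate pairs only encode U+10000 to U+10FFFF");
  }
  const offset = codePoint - 0x10000;    // 20-bit value
  const high = 0xd800 + (offset >> 10);  // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff); // bottom 10 bits
  return [high, low];
}

// Hypothetical helper: format the pair as a \uXXXX\uXXXX escape sequence.
function toJsEscape(codePoint) {
  return toSurrogatePair(codePoint)
    .map((unit) => "\\u" + unit.toString(16).padStart(4, "0"))
    .join("");
}

console.log(toJsEscape(0x1f600)); // \ud83d\ude00
console.log(String.fromCharCode(...toSurrogatePair(0x1f600)) === "\u{1f600}"); // true
```

    Code points below U+10000 need no surrogate pair, since they fit in a single 16-bit code unit; the sketch rejects them rather than guessing a format.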

  • ISO-10646. Universal Coded Character Set (UCS, Unicode) is a standard set of characters defined by the International Standard ISO/IEC 10646.

    Although the character codes and encoding forms are synchronized between Unicode and ISO/IEC 10646, the Unicode Standard imposes additional constraints on implementations to ensure that they treat characters uniformly across platforms and applications.

    Source: Wikipedia, unicode.org

  • ASCII. (American Standard Code for Information Interchange) A character encoding standard.

    • Number of characters: 128
    • The first 32 codes (0x00 to 0x1F) are reserved as control characters
    • Number of printable characters: 95
    • Printable range: 0x20 (space) to 0x7E (~)
    • Last character: 0x7F (DEL), also a control character

    Source: Wikipedia
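
    The ranges listed above translate directly into a small classifier. This is only a sketch; the function name classifyAscii and its three return labels are assumptions made for illustration.

```javascript
// Hypothetical helper: classify a numeric code according to the ASCII ranges.
function classifyAscii(code) {
  if (!Number.isInteger(code) || code < 0 || code > 0x7f) {
    return "non-ASCII";
  }
  if (code < 0x20 || code === 0x7f) {
    return "control"; // 0x00 to 0x1F, plus DEL (0x7F)
  }
  return "printable"; // 0x20 (space) to 0x7E (~)
}

// Count the printable characters across the whole 128-code range.
let printableCount = 0;
for (let code = 0; code <= 0x7f; code++) {
  if (classifyAscii(code) === "printable") {
    printableCount++;
  }
}

console.log(printableCount);                      // 95
console.log(classifyAscii("~".charCodeAt(0)));    // printable
console.log(classifyAscii(0x7f));                 // control
console.log(classifyAscii("©".charCodeAt(0)));    // non-ASCII
```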

  • HTML Entity. Entities are used to implement reserved characters or to express characters that cannot easily be entered with the keyboard.

    • Character: ©
    • Entity Name: &copy;
    • Entity Number: &#169;
    • Description: Copyright

    Source: entitycode.com
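
    A numeric entity is just the character's code point written in decimal (or hexadecimal with an x prefix) between &# and ;. The sketch below generates both forms; the helper names toDecimalEntity and toHexEntity are hypothetical. Named entities such as the one for the copyright sign require a lookup table and are not covered here.

```javascript
// Hypothetical helper: decimal numeric character reference for the first character.
function toDecimalEntity(ch) {
  return "&#" + ch.codePointAt(0) + ";";
}

// Hypothetical helper: hexadecimal numeric character reference for the first character.
function toHexEntity(ch) {
  return "&#x" + ch.codePointAt(0).toString(16).toUpperCase() + ";";
}

console.log(toDecimalEntity("©")); // &#169;
console.log(toHexEntity("©"));     // &#xA9;
console.log(toDecimalEntity("😀")); // &#128512;
```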

© 2022 Milkteaholo. All rights reserved