dcdunkan/pmodes

Parse Modes

Synced with tdlib/td:7c3822d932.

A work-in-progress TypeScript implementation of TDLib's functions and utilities related to parsing text with several parse modes and matching text entities.

A few more methods are left to be implemented, but the tests are ported directly from the TDLib source without changes, and they pass, so I'll take that as "it works"! I cannot vouch for the quality of the implementation, as I'm not good at C++ (TDLib is written in C++), so I have probably done a few stupid things because I missed how C++ actually works.

Anyway, thank you.

Of course, the functions listed below might still have a few bugs. If you ever encounter one, please consider opening an issue.

Here is what is currently implemented:

match.ts (td/telegram/MessageEntity.cpp)
  • match_mentions
  • match_bot_commands
  • match_hashtags
  • match_cashtags
  • match_media_timestamps
  • match_bank_card_numbers
  • is_url_unicode_symbol
  • is_url_path_symbol
  • match_tg_urls
  • is_protocol_symbol
  • is_user_data_symbol
  • is_domain_symbol
  • match_urls
  • is_valid_bank_card
  • is_email_address
  • is_common_tld
  • fix_url
  • get_valid_short_usernames
  • find_mentions
  • find_bot_commands
  • find_hashtags
  • find_cashtags
  • find_bank_card_numbers
  • find_tg_urls
  • find_urls
  • find_media_timestamps
  • text_length
  • get_type_priority
  • remove_empty_entities
  • sort_entities
  • check_is_sorted
  • check_non_intersecting
  • get_entity_type_mask
  • get_splittable_entities_mask
  • get_blockquote_entities_mask
  • get_continuous_entities_mask
  • get_pre_entities_mask
  • get_user_entities_mask
  • is_splittable_entity
  • is_blockquote_entity
  • is_continuous_entity
  • is_pre_entity
  • is_user_entity
  • is_hidden_data_entity
  • get_splittable_entity_type_index
  • are_entities_valid
  • remove_intersecting_entities
  • remove_entities_intersecting_blockquote
  • fix_entity_offsets
  • find_entities
  • find_media_timestamp_entities
  • merge_entities
  • is_plain_domain
  • get_first_url
  • parse_markdown
  • parse_markdown_v2
  • decode_html_entity
  • parse_html
  • get_formatted_text_object
  • find_text_url_entities_v3
  • clean_input_string_with_entities
  • remove_invalid_entities
  • split_entities
  • resplit_entities
  • merge_new_entities
  • fix_entities
  • fix_formatted_text
  • get_type_priority
  • MessageEntity
  • TextEntityObject
  • get_text_entities_object
  • message_entity_type_string
  • MessageEntityType
random.ts (td/utils/Random.{h,cpp})
  • fast_uint32
  • fast_bool
  • fast(int, int)
utilities.ts (from a lot of source files)
  • is_word_character
  • to_lower_begins_with
  • to_lower
  • split
  • full_split
  • begins_with
  • ends_with
  • is_space
  • is_alpha
  • is_alpha (from misc.h)
  • is_alnum
  • is_digit
  • is_alpha_digit
  • is_alpha_digit_or_underscore
  • is_alpha_digit_underscore_or_minus
  • is_hex_digit
  • hex_to_int
  • is_hashtag_letter
  • CHECK
  • LOG_CHECK
  • to_integer
  • get_to_integer_safe_error
  • to_integer_safe
  • replace_offending_characters
  • clean_input_string
  • trim
  • strip_empty_characters
  • is_empty_string
unicode.ts (tdutils/td/utils/unicode.cpp)
  • UnicodeSimpleCategory
  • get_unicode_simple_category
  • binary_search_ranges
  • unicode_to_lower
utf8.ts (tdutils/td/utils/utf8.cpp)
  • is_utf8_character_first_code_unit
  • utf8_length
  • utf8_utf16_length
  • prev_utf8_unsafe
  • next_utf8_unsafe
  • append_utf8_character
  • append_utf8_character_unsafe
  • utf8_to_lower
  • utf8_truncate
  • utf8_utf16_truncate
  • utf8_substr
  • utf8_utf16_substr
  • check_utf8
Other stuff
  • CustomEmojiId
  • HttpUrl
  • HttpUrlProtocol
  • parse_url
  • IpAddress
  • parse_ipv6 (a compatible port from core-js)
  • LinkManager
    • getLinkUserId
    • getLinkCustomEmojiId
    • getCheckedLink
    • checkLinkImpl
  • UserId

* Most likely too buggy.
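
To give an idea of what the utf8.ts helpers are for: Telegram entity offsets and lengths are counted in UTF-16 code units, while TDLib works on UTF-8 bytes, so the port needs to convert between the two ways of counting. Below is a minimal sketch of that idea, not the actual code from this repository. The camelCase names and the `Uint8Array` signatures are assumptions made for illustration; only the underlying bit tricks follow the usual UTF-8 rules.

```typescript
// Illustrative sketch only. Names and signatures are hypothetical and are
// not claimed to match the exports of this repository.

/** True if the byte starts a character, i.e. it is not a 10xxxxxx continuation byte. */
function isUtf8CharacterFirstCodeUnit(byte: number): boolean {
  return (byte & 0xc0) !== 0x80;
}

/** Number of code points in a (valid) UTF-8 byte sequence. */
function utf8Length(bytes: Uint8Array): number {
  let result = 0;
  for (let i = 0; i < bytes.length; i++) {
    if (isUtf8CharacterFirstCodeUnit(bytes[i])) result++;
  }
  return result;
}

/** Number of UTF-16 code units needed to represent a (valid) UTF-8 byte sequence. */
function utf8Utf16Length(bytes: Uint8Array): number {
  let result = 0;
  for (let i = 0; i < bytes.length; i++) {
    // Every character contributes at least one UTF-16 code unit.
    if (isUtf8CharacterFirstCodeUnit(bytes[i])) result++;
    // A 4-byte sequence (lead byte 11110xxx) becomes a surrogate pair,
    // so it contributes one extra code unit.
    if ((bytes[i] & 0xf8) === 0xf0) result++;
  }
  return result;
}

// "a", U+1F600 (4 bytes in UTF-8) and U+00E9 (2 bytes in UTF-8).
const bytes = new Uint8Array([0x61, 0xf0, 0x9f, 0x98, 0x80, 0xc3, 0xa9]);
console.log(utf8Length(bytes)); // 3
console.log(utf8Utf16Length(bytes)); // 4
```

The second number matches what JavaScript reports as `.length` for the same text, since JavaScript strings are also indexed by UTF-16 code units.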
