tok - MaRDI portal

Software:5983037



CRAN: tok, MaRDI QID: Q5983037

Fast Text Tokenization

Daniel Falbel

Last update: 17 August 2023

Copyright license: MIT license, File License

Software version identifier: 0.1.0, 0.1.1

Interfaces with the 'Hugging Face' tokenizers library to provide implementations of today's most widely used tokenizers, such as the 'Byte-Pair Encoding' algorithm <https://huggingface.co/docs/tokenizers/index>. It is designed to be fast both when training new vocabularies and when tokenizing text.
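
To illustrate what a 'Byte-Pair Encoding' tokenizer does, here is a minimal sketch in plain Python. It is not the 'tok' API and not the 'Hugging Face' implementation; the function names (count_pairs, merge_pair, train_bpe, encode) and the toy corpus are hypothetical, chosen only for this example. It assumes whitespace-separated words and character-level starting symbols, whereas production tokenizers typically work on bytes and apply normalization and pre-tokenization steps.

```python
from collections import Counter


def count_pairs(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    # Each word starts as a tuple of single characters.
    words = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(words)
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        words = merge_pair(words, best)
        merges.append(best)
    return merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in training order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = next(iter(merge_pair({symbols: 1}, pair)))
    return list(symbols)


# Hypothetical toy corpus, for illustration only.
merges = train_bpe("low low low lower lowest", num_merges=2)
tokens = encode("lowly", merges)
```

With this toy corpus, two merges are enough to learn the subword "low", so "lowly" is split into "low", "l" and "y". Training repeatedly merges the most frequent adjacent pair, and encoding replays those merges in the same order.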




