
ngram

From MaRDI portal
Software:91511



CRAN: ngram; MaRDI QID: Q91511

Fast n-Gram 'Tokenization'

Drew Schmidt, Christian Heckendorf

Last update: 10 December 2023

Copyright license: 2-clause BSD License, File License

Software version identifier: 3.2.2, 1.0, 1.1, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.0.4, 3.2.0, 3.2.1, 3.2.3

An n-gram is a sequence of n "words" taken, in order, from a body of text. This is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The 'tokenization' and "babbling" are handled by very efficient C code, which can even be built as its own standalone library. The babbler is a simple Markov chain. The package also offers a vignette with complete example 'workflows' and information about the utilities offered in the package.
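The two ideas in the description, splitting text into n-grams and "babbling" with a Markov chain, can be illustrated with a minimal Python sketch. This is not the package's R interface or its C implementation: the function names `ngrams`, `build_chain`, and `babble` are hypothetical, words are assumed to be separated by whitespace, and the restart-on-dead-end behaviour is an assumption made for the example.

```python
import random
from collections import defaultdict


def ngrams(text, n):
    """Return every run of n consecutive words, in order, as a tuple."""
    words = text.split()  # assumption: "words" are whitespace-separated
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_chain(text, n):
    """Map each n-gram to the words observed immediately after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - n):
        chain[tuple(words[i:i + n])].append(words[i + n])
    return chain


def babble(text, n, length, seed=None):
    """Generate `length` words by walking the n-gram Markov chain."""
    rng = random.Random(seed)
    grams = ngrams(text, n)
    if not grams:
        return ""
    chain = build_chain(text, n)
    state = rng.choice(grams)  # start from a randomly chosen n-gram
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (n-gram seen only at the end of the text):
            # restart from a random n-gram. This policy is an assumption.
            state = rng.choice(grams)
            out.extend(state)
            continue
        nxt = rng.choice(followers)  # repeated followers are more likely
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window by one word
    return " ".join(out[:length])


if __name__ == "__main__":
    sample = "a b a b c"
    print(ngrams(sample, 2))
    print(babble(sample, 2, 8, seed=0))
```

Because each follower list keeps duplicates, sampling uniformly from it picks the next word in proportion to how often it followed the current n-gram in the input, which is what makes the generator a simple Markov chain.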





