Categories: None
tokenizer
A simple multilingual tokenizer for NLP tasks. This tool provides a CLI and a library for linguistic tokenization, an unavoidable preprocessing step for many HLT (human language technology) tasks ahead of syntactic, semantic, and other higher-level processing. Use it to tokenize German, English, and French texts.
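To make the idea of linguistic tokenization concrete, here is a minimal, self-contained Ruby sketch. It is not the gem's API: `simple_tokenize` and its regular expression are hypothetical, written only to illustrate splitting text into word and punctuation tokens using the Ruby standard library.

```ruby
# Hypothetical helper for illustration only; NOT part of the tokenizer gem.
# Splits text into word tokens and single punctuation tokens.
def simple_tokenize(text)
  # First alternative: a run of letters/digits, optionally joined by inner
  # apostrophes or hyphens (e.g. "don't", "well-known").
  # Second alternative: any single character that is neither alphanumeric
  # nor whitespace (i.e. punctuation and symbols).
  text.scan(/[[:alnum:]]+(?:['\-][[:alnum:]]+)*|[^[:alnum:][:space:]]/)
end

p simple_tokenize('Ich gehe in die Schule!')
# prints ["Ich", "gehe", "in", "die", "Schule", "!"]
```

A real tokenizer for German, English, and French has to handle many more cases (abbreviations, clitics, numbers with separators), so this sketch should be read as a picture of the input and output, not as a replacement for the library.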
Total
Ranking: 10,427 of 187,591
Downloads: 263,736
Daily
Ranking: 18,316 of 187,571
Downloads: 27
Depended on by
| Rank | Downloads | Name |
|---|---|---|
| 8,289 | 434,615 | metanorma-iso |
| 70,044 | 13,800 | social_tokenizer |
| 80,823 | 11,140 | jekyll-related-posts |
| 91,592 | 9,056 | valkey-objects |
| 94,694 | 8,612 | smalltext |
| 138,961 | 4,247 | shalmaneser-frappe |
| 140,886 | 4,138 | meiou |
| 150,788 | 3,670 | TacTalk |
| 173,459 | 2,257 | sentimentanalyzer |
Depends on
| Rank | Downloads | Name |
|---|---|---|
Owners
| # | Handle |
|---|---|
| 1 | arbox |