Two variants of the dataset are available as separate files:
Name | Last modified | Size | Description
---|---|---|---
pl-embeddings-cbow.txt | 2016-01-04 23:22 | 855M | generated using continuous-bag-of-words method
pl-embeddings-skip.txt | 2016-01-04 23:24 | 854M | generated using skip-gram method
The first line of each file contains two numbers: the number of words in the dictionary and the number of dimensions of the word embeddings.
Each of the following lines contains a word followed by a space-separated list of numbers that form its embedding.
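
The layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names `parse_embeddings` and `load_embeddings` are made up for illustration, and the UTF-8 encoding is an assumption, since the encoding of the files is not stated here.

```python
from typing import Dict, Iterable, List, Tuple


def parse_embeddings(lines: Iterable[str]) -> Tuple[int, int, Dict[str, List[float]]]:
    """Parse the text layout described above from an iterable of lines.

    Returns (declared word count, number of dimensions, word -> vector).
    """
    it = iter(lines)
    # Header line: "<number of words> <number of dimensions>"
    header = next(it).split()
    vocab_size, dims = int(header[0]), int(header[1])

    vectors: Dict[str, List[float]] = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) < dims + 1:
            continue  # skip blank or truncated lines
        # The last `dims` fields are the vector; everything before them is the word.
        word = " ".join(parts[:-dims])
        vectors[word] = [float(x) for x in parts[-dims:]]
    return vocab_size, dims, vectors


def load_embeddings(path: str, encoding: str = "utf-8"):
    """Read a file from disk; the encoding default is an assumption."""
    with open(path, encoding=encoding) as f:
        return parse_embeddings(f)
```

Calling `load_embeddings("pl-embeddings-skip.txt")` would return the declared dictionary size, the dimensionality, and a dictionary of vectors. Each file is roughly 850 MB, so holding every vector as a Python list may need several gigabytes of memory. If the files follow the standard word2vec text format, which this description resembles, gensim's `KeyedVectors.load_word2vec_format(path, binary=False)` may be a more memory-efficient alternative.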