About replicating monolingual experiments #1

Closed
rekcu opened this issue May 11, 2020 · 4 comments

Comments


rekcu commented May 11, 2020

Hi! Thanks for sharing this codebase. I was trying to replicate the results in Table 5 of the paper.

I have tried 3 languages so far (DE, IT, TR) and unfortunately could not replicate the results.
I am probably missing something. Could you help me with this? Let me try to explain what I did for German (DE).

I downloaded the fastText embeddings from here (specifically, the text version; the file is named wiki.de.vec).

Then I ran the following commands after cloning the repository:

similarity_type="cosine"
language="de"
for test_number in 1 2; do
    python weat.py \
           --test_number $test_number \
           --permutation_number 1000000 \
           --output_file ./results/w2v_wiki_${language}_${similarity_type}_${test_number}_cased.res \
           --lower False \
           --use_glove False \
           --is_vec_format True \
           --lang $language \
           --embeddings \
           data/fastTextEmbeddings/wiki.${language}.vec \
           --similarity_type $similarity_type |& tee ./results/w2v_wiki_${language}_${similarity_type}_${test_number}_cased.out
done
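
(As a quick sanity check that the downloaded file parses and to inspect its vocabulary, it can be loaded once outside of weat.py; a minimal sketch using gensim, which is an assumption on my side and not part of the repository:)

from gensim.models import KeyedVectors

# Load the text-format fastText vectors downloaded above (this can take a while).
vectors = KeyedVectors.load_word2vec_format("data/fastTextEmbeddings/wiki.de.vec", binary=False)
print(len(vectors.key_to_index))            # vocabulary size (gensim >= 4; older gensim exposes .vocab)
print(vectors.most_similar("Tod", topn=3))  # spot-check one of the test terms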

Then I checked the automatically created files. For example, in w2v_wiki_de_cosine_1_cased.res, I found this:

Config: 1 and False and 1000000
Result: (0.0, nan, 0.0)
0.10255803883075715

and in w2v_wiki_de_cosine_2_cased.res:

Config: 2 and False and 1000000
Result: (0.0, nan, 0.0)
0.08906068411138322

I was also getting a bunch of warnings, some of which are as follows (written to w2v_wiki_de_cosine_1_cased.out):

WARNING:root:Not in vocab Veilchen
WARNING:root:Not in vocab Trauer
WARNING:root:Not in vocab Tod
WARNING:root:Not in vocab Kricket
WARNING:root:Not in vocab Tausendfüßler
WARNING:root:Not in vocab Hornisse
WARNING:root:Not in vocab Ehre
WARNING:root:Not in vocab Rüsselkäfer
WARNING:root:Not in vocab Narzisse
WARNING:root:Not in vocab Butterblume
WARNING:root:Not in vocab Spinne
WARNING:root:Not in vocab Urlaub
WARNING:root:Not in vocab Käfer
WARNING:root:Not in vocab Qual
WARNING:root:Not in vocab Absturz
WARNING:root:Not in vocab Rose
WARNING:root:Not in vocab Himmel
WARNING:root:Not in vocab Termite
WARNING:root:Not in vocab Orchidee
WARNING:root:Not in vocab Zinnie
WARNING:root:Not in vocab Tarantel
WARNING:root:Not in vocab Tragödie
WARNING:root:Not in vocab Heuschrecke
WARNING:root:Not in vocab Familie
WARNING:root:Not in vocab Regenbogen
WARNING:root:Not in vocab Nelke
WARNING:root:Not in vocab Paradies
WARNING:root:Not in vocab Ameise
WARNING:root:Not in vocab Lachen
WARNING:root:Not in vocab Lilie
WARNING:root:Not in vocab Klee
WARNING:root:Not in vocab Gefängnis
WARNING:root:Not in vocab Bettwanze
WARNING:root:Not in vocab Mord
WARNING:root:Not in vocab Diplom
WARNING:root:Not in vocab Made
WARNING:root:Not in vocab Diamant
WARNING:root:Not in vocab Glockenblume
WARNING:root:Not in vocab Vergnügen
WARNING:root:Not in vocab Krokus
WARNING:root:Not in vocab Missbrauch
WARNING:root:Not in vocab Frieden
INFO:root:Popped T2 0
INFO:root:Popped A2 8
fromnumeric.py:2957: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
_methods.py:80: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
_methods.py:135: RuntimeWarning: Degrees of freedom <= 0 for slice
  keepdims=keepdims)
_methods.py:105: RuntimeWarning: invalid value encountered in true_divide
  arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
_methods.py:127: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
INFO:root:Calculating p value ...
INFO:root:Number of possible permutations: 1
INFO:root:(0.0, nan, 0.0)
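
(For reference: the nan in the result, together with "Popped T2 0", "Popped A2 8", and "Number of possible permutations: 1", looks consistent with one of the word lists ending up empty after all the "Not in vocab" terms were dropped; NumPy then emits exactly these RuntimeWarnings when averaging an empty array. A minimal illustration of that behaviour, not code from weat.py:)

import numpy as np

empty = np.array([])           # e.g. a target/attribute list whose terms were all missing from the vocab
print(np.mean(empty))          # RuntimeWarning: Mean of empty slice -> nan
print(np.std(empty, ddof=1))   # RuntimeWarning: Degrees of freedom <= 0 for slice -> nan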

Can you tell me what I need to do to run the code successfully? Thanks in advance!


rekcu commented May 12, 2020

Here is an update:

I found the fasttext_multiling.sh script in the repository.

Since I do not have access to /work/gglavas/data/word_embs/yacle/fasttext/200K/npformat/ft.wiki.${language}.300.vocab, which is given as the parameter to --embedding_vocab, I did not set --embedding_vocab to anything. Similarly, I set --embedding_vectors (which the script sets to /work/gglavas/data/word_embs/yacle/fasttext/200K/npformat/ft.wiki.${language}.300.vectors) to the German fastText embeddings linked above. The script is then:

similarity_type="cosine"
language="de"
for test_number in 1 2; do
    python weat.py \
           --test_number $test_number \
           --permutation_number 1000000 \
           --output_file ./results/w2v_wiki_${language}_${similarity_type}_${test_number}_cased.res \
           --lower True \
           --use_glove False \
           --is_vec_format True \
           --lang $language \
           --embeddings \
           data/fastTextEmbeddings/wiki.${language}.vec \
           --similarity_type $similarity_type |& tee ./results/w2v_wiki_${language}_${similarity_type}_${test_number}_cased.out
done
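
(To see whether the earlier "Not in vocab" misses come from the file itself or from how the terms are looked up with --lower False, one can scan the raw .vec vocabulary directly; a small diagnostic sketch, not part of the repository:)

# Read only the first column (the token) of each line in the text-format embedding file.
vocab = set()
with open("data/fastTextEmbeddings/wiki.de.vec", encoding="utf-8") as f:
    next(f)  # skip the "<num_words> <dim>" header line
    for line in f:
        vocab.add(line.split(" ", 1)[0])

# Check cased and lowercased variants of a few of the warned-about terms.
for term in ["Tod", "tod", "Familie", "familie", "Spinne", "spinne"]:
    print(term, term in vocab)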

With that script I got the following outputs:

results]$ cat w2v_wiki_de_cosine_2_cased.res
Config: 2 and True and 1000000
Result: (1.1948289903673552, 1.2983299550815979, 0.0)

results]$ cat w2v_wiki_de_cosine_6_cased.res
Config: 6 and True and 1000000
Result: (0.48375295403351787, 1.5778229695669055, 7.77000777000777e-05)

results]$ cat w2v_wiki_de_cosine_7_cased.res
Config: 7 and True and 1000000
Result: (0.0071803716755713815, 0.038459829350379, 0.4732711732711733)

So, these results are closer to the ones reported in Table 5. My question is then: "Is it normal to see such variation from the results in Table 5, or are these different results an indicator of a mistake I made?"

anlausch (Owner) commented

Thanks for your request. I'll look into the issue and let you know asap.

anlausch (Owner) commented

Your usage and results seem to be fine. I reran the exact configuration and got the same results. The small variation most probably comes from the fact that in our experiments we cut all vectors to the top 200K, thereby increasing efficiency. For test 2, for instance, the term "feuerwaffe" cannot be found in our version. Also note that, in order to keep the lists the same length, we randomly drop terms from the longer lists. To get the exact scores, you might therefore need to rerun the experiments multiple times for some languages. If you would like to reproduce the exact scores, I can also assist you by forwarding the exact lists that were used for each individual experiment (but I assume this is not necessary?).
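
(For anyone trying to get closer to the paper's setup from this description, here is a rough sketch of the two points above: keeping only the top 200K vectors of a .vec file, assuming it is frequency-ordered as fastText's wiki vectors are, and randomly dropping terms from the longer list. This is an illustration of the description, not the authors' exact preprocessing code.)

import random

# 1) Keep only the top 200K vectors (mimics the 200K cutoff mentioned above).
with open("data/fastTextEmbeddings/wiki.de.vec", encoding="utf-8") as src, \
     open("data/fastTextEmbeddings/wiki.de.200k.vec", "w", encoding="utf-8") as dst:
    _, dim = next(src).split()   # original header: "<num_words> <dim>"
    dst.write(f"200000 {dim}\n")
    for i, line in enumerate(src):
        if i >= 200000:
            break
        dst.write(line)

# 2) Randomly drop terms from the longer list so both lists have the same length.
def equalize(list_a, list_b, seed=None):
    rng = random.Random(seed)
    n = min(len(list_a), len(list_b))
    return rng.sample(list_a, n), rng.sample(list_b, n)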


rekcu commented May 13, 2020

No need to share the exact lists; it is more than enough to know that my usage is correct. Thanks for helping me with this!
