Order dependence in ElmoEmbedder? #1169
Apologies for opening an issue with what is likely a conceptual misunderstanding on my part!

I'm playing around with the pre-trained ELMo embeddings (which are cool, thanks!) and noticing that the embedder seems to be stateful. That is, if I embed the same sentence twice, I don't get the same result: the cosine distance between the two runs is about 0.02 -- not huge, but problematic for the same sentence!

Where does the statefulness come from? Am I misusing the embedder?
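A minimal sketch of the kind of test being described (a hedged reconstruction -- the example sentence and variable names are illustrative, assuming the ElmoEmbedder API from allennlp.commands.elmo):

```python
from allennlp.commands.elmo import ElmoEmbedder
import scipy.spatial.distance

# Embed the same sentence twice with one embedder instance and compare.
embedder = ElmoEmbedder()  # default pre-trained options and weights

tokens = ["I", "ate", "an", "apple", "for", "breakfast"]  # illustrative sentence

first = embedder.embed_sentence(tokens)   # shape: (3 layers, n_tokens, 1024)
second = embedder.embed_sentence(tokens)

# Cosine distance between the top-layer embeddings of the first token.
print(scipy.spatial.distance.cosine(first[2][0], second[2][0]))  # ~0.02, not 0.0
```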
Hi @ngoodman, that's expected behavior. ELMo has internal state and adapts to your domain over time. We've been thinking about how to make the output more consistent -- as this is unexpected behavior for our users.
Oh, I see! So this is giving me the embedding in the context of the "corpus" of sentences I've asked it to embed so far? Testing my understanding, I tried the above test using two separate instances of ElmoEmbedder, and indeed got the same embedding. This seems infeasible in practice for lots of sentences, because constructing an ElmoEmbedder() takes a long time... Any workaround? It's certainly true that this was unexpected for me, but a few small changes would have clued me in -- e.g. if the call had been named in a way that hints at the statefulness. At any rate, thanks for the super fast response, and the nice open software!
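A sketch of that two-instance check (hypothetical, not the poster's actual code; same assumed API as above):

```python
from allennlp.commands.elmo import ElmoEmbedder
import scipy.spatial.distance

tokens = ["I", "ate", "an", "apple", "for", "breakfast"]

# Two fresh embedders, each used exactly once: both start from the same
# freshly initialized internal LSTM states, so their outputs should agree.
a = ElmoEmbedder().embed_sentence(tokens)
b = ElmoEmbedder().embed_sentence(tokens)

print(scipy.spatial.distance.cosine(a[2][0], b[2][0]))  # expected ~0.0
```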
@ngoodman - I added a longer description of the statefulness to this PR: #1167. The TL;DR is that the stateful aspect is a consequence of how the biLM was originally trained. For example, if you modify your code to run the same sentence multiple times, the embeddings are constant after the first batch:
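(A hypothetical reconstruction of that repeated-run snippet; the original code and its printed output are not reproduced here.)

```python
from allennlp.commands.elmo import ElmoEmbedder
import scipy.spatial.distance

embedder = ElmoEmbedder()
tokens = ["I", "ate", "an", "apple", "for", "breakfast"]

previous = embedder.embed_sentence(tokens)
for i in range(5):
    current = embedder.embed_sentence(tokens)
    # Per the comment above, this distance should be ~0.0 once the first
    # batch has warmed up the biLM's internal states.
    print(i, scipy.spatial.distance.cosine(previous[2][0], current[2][0]))
    previous = current
```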
@matt-peters These states simply represent the memory and output for each timestep in the batch, correct? What does it mean that they "adapt to the domain"? Is there a human-understandable version of the information they are storing, or is it simply some weighted product of internal states?
@schmmd Is there a human-understandable description of what this "context" describes? I think I understand what the LSTMs are doing: they essentially try to predict the next word in a given sentence given the previous (or, in the backwards case, following) words. But what's not clear to me is how this adapts to the given domain (beyond just predicting 'x' after 'y' if it tends to appear that way in past inputs).