Reading the papers to resolve my questions about Seq2Seq

Question 1: the input to the encoder

When a sentence is fed into a Seq2Seq encoder, each word goes through a word embedding, so I think the input becomes a sequence of word vectors like [word_vec1, word_vec2, ..., word_vecn], but I am not sure.
The paper that proposed the RNN Encoder-Decoder, https://arxiv.org/abs/1406.1078 | Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, says "We used rank-100 matrices, equivalent to learning an embedding of dimension 100 for each word.", so my understanding seems to be right.
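
As a rough illustration of that reading, the sketch below turns a tokenized sentence into a sequence of word vectors with an embedding lookup. The toy vocabulary, the random matrix E, and the helper embed_sentence are all made up for this note; only the dimension 100 comes from the sentence quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary (not from the paper).
vocab = {"<eos>": 0, "the": 1, "cat": 2, "sat": 3}
embed_dim = 100  # "an embedding of dimension 100 for each word"

# Embedding matrix with one row per word.
# Randomly initialised here; in a real model it is learned.
E = rng.normal(scale=0.01, size=(len(vocab), embed_dim))

def embed_sentence(tokens):
    """Map a list of tokens to an array of shape (len(tokens), embed_dim)."""
    ids = [vocab[t] for t in tokens]
    return E[ids]  # integer-array indexing picks one row per token

# The encoder input is [word_vec1, word_vec2, ..., word_vecn].
encoder_input = embed_sentence(["the", "cat", "sat", "<eos>"])
print(encoder_input.shape)
```

Each row of encoder_input is one word vector, and the encoder RNN reads the rows one at a time.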

Addendum
The paper also says "Each word of the source phrase is embedded in a 500-dimensional vector space: e(x_i) ∈ R^500.", so this is correct.

Question 2: the c passed between the encoder and the decoder

Some descriptions say that c is a fixed-length vector, while others read as if the encoder's hidden state itself is handed to the decoder. Which one is it? The original paper mentioned above says "The encoder is an RNN that reads each symbol of an input sequence x sequentially. As it reads each symbol, the hidden state of the RNN changes according to Eq. (1). After reading the end of the sequence (marked by an end-of-sequence symbol), the hidden state of the RNN is a summary c of the whole input sequence.
The decoder of the proposed model is another RNN which is trained to generate the output sequence by predicting the next symbol y_t given the hidden state h⟨t⟩.", so it seems that the hidden state h and c refer to the same thing.

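Addendum
It turned out to be c = tanh(V h⟨N⟩). So c is not the hidden state itself but a fixed-length vector computed from the encoder's final hidden state h⟨N⟩.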
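
To make the difference between h⟨N⟩ and c concrete, here is a minimal NumPy sketch. It assumes a plain tanh RNN instead of the gated hidden unit the paper actually uses, along with toy dimensions and random weights W, U, V chosen for this note; only the line c = tanh(V h⟨N⟩) is taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes chosen for illustration (assumptions, not the paper's values).
embed_dim, hidden_dim, context_dim = 8, 16, 16

# Randomly initialised weights; in a real model these are learned.
W = rng.normal(scale=0.1, size=(hidden_dim, embed_dim))    # input -> hidden
U = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))   # hidden -> hidden
V = rng.normal(scale=0.1, size=(context_dim, hidden_dim))  # hidden -> c

def encode(embedded):
    """Read the word vectors one by one and return (h_N, c)."""
    h = np.zeros(hidden_dim)
    for e_x in embedded:
        # Simplified update: plain tanh RNN, not the paper's gated unit.
        h = np.tanh(W @ e_x + U @ h)
    c = np.tanh(V @ h)  # c = tanh(V h<N>)
    return h, c

# Stand-in for an embedded source sentence of 5 words.
embedded = rng.normal(size=(5, embed_dim))
h_N, c = encode(embedded)
print(h_N.shape, c.shape)
```

Whatever the sentence length, c has the same size, which is why it can be called a fixed-length vector even though it is derived from the hidden state.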