Fixed 'Character level one-hot encoding'. Indexes and characters were the other way around. Need to cut off sample at max_length.
Hiroya Chiba committed Sep 17, 2017
1 parent 586d5ab commit c6a3dfc
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions 6.1-one-hot-encoding-of-words-or-characters.ipynb
@@ -109,12 +109,12 @@
 "\n",
 "samples = ['The cat sat on the mat.', 'The dog ate my homework.']\n",
 "characters = string.printable  # All printable ASCII characters.\n",
-"token_index = dict(zip(range(1, len(characters) + 1), characters))\n",
+"token_index = dict(zip(characters, range(1, len(characters) + 1)))\n",
 "\n",
 "max_length = 50\n",
-"results = np.zeros((len(samples), max_length, max(token_index.keys()) + 1))\n",
+"results = np.zeros((len(samples), max_length, max(token_index.values()) + 1))\n",
 "for i, sample in enumerate(samples):\n",
-"    for j, character in enumerate(sample):\n",
+"    for j, character in enumerate(sample[:max_length]):\n",
 "        index = token_index.get(character)\n",
 "        results[i, j, index] = 1."
 ]
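
For reference, here is a minimal standalone sketch of the cell as it reads after this change, assuming only `numpy` and the standard `string` module. The names `old_token_index`, `old_lookup`, and `demo` are not part of the commit; they are added here to illustrate why the original mapping direction was a problem. With the index-to-character mapping, `token_index.get(character)` returns `None`, and NumPy treats `None` in an index as a new axis, so the assignment fills the whole row instead of a single position.

```python
import string

import numpy as np

samples = ['The cat sat on the mat.', 'The dog ate my homework.']
characters = string.printable  # All printable ASCII characters.

# Post-fix direction: character -> index, starting at 1 so that index 0 stays unused.
token_index = dict(zip(characters, range(1, len(characters) + 1)))

max_length = 50
results = np.zeros((len(samples), max_length, max(token_index.values()) + 1))
for i, sample in enumerate(samples):
    # Truncating keeps j inside the second dimension of `results`.
    for j, character in enumerate(sample[:max_length]):
        index = token_index.get(character)
        results[i, j, index] = 1.

# Illustration only (hypothetical names): the pre-fix mapping went index -> character,
# so looking up a character finds no key and returns None.
old_token_index = dict(zip(range(1, len(characters) + 1), characters))
old_lookup = old_token_index.get('T')

# Indexing with None inserts a new axis, so this assignment sets the entire row.
demo = np.zeros((1, 3, 4))
demo[0, 0, None] = 1.
```

Without the `sample[:max_length]` slice, any sample longer than `max_length` characters would raise an `IndexError` on the assignment, since `j` would run past the second dimension of `results`.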
