Tensorflow Notebooks: Research-Related Tensorflow stuff. (http://mckinziebrandon.me/TensorflowNotebooks/)
<h1 id="playing-around-in-openai-gym-in-jupyter"><a href="http://mckinziebrandon.me/TensorflowNotebooks/2016/12/21/openai">Playing around in OpenAI Gym in Jupyter</a> (2016-12-21)</h1>
<h2 id="first-figure-out-jupyter-notebook-stuff">First, Figure out Jupyter Notebook Stuff</h2>
<p><a href="http://nbviewer.jupyter.org/github/patrickmineault/xcorr-notebooks/blob/master/Render%20OpenAI%20gym%20as%20GIF.ipynb">This tutorial</a> helped a lot.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># The typical imports</span>
<span class="kn">import</span> <span class="nn">gym</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span>
<span class="c"># Imports specifically so we can render outputs in Jupyter.</span>
<span class="kn">from</span> <span class="nn">JSAnimation.IPython_display</span> <span class="kn">import</span> <span class="n">display_animation</span>
<span class="kn">from</span> <span class="nn">matplotlib</span> <span class="kn">import</span> <span class="n">animation</span>
<span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">display</span>
<span class="k">def</span> <span class="nf">display_frames_as_gif</span><span class="p">(</span><span class="n">frames</span><span class="p">):</span>
<span class="s">"""
Displays a list of frames as a gif, with controls
"""</span>
<span class="c">#plt.figure(figsize=(frames[0].shape[1] / 72.0, frames[0].shape[0] / 72.0), dpi = 72)</span>
<span class="n">patch</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">frames</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
<span class="n">plt</span><span class="o">.</span><span class="n">axis</span><span class="p">(</span><span class="s">'off'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">animate</span><span class="p">(</span><span class="n">i</span><span class="p">):</span>
<span class="n">patch</span><span class="o">.</span><span class="n">set_data</span><span class="p">(</span><span class="n">frames</span><span class="p">[</span><span class="n">i</span><span class="p">])</span>
<span class="n">anim</span> <span class="o">=</span> <span class="n">animation</span><span class="o">.</span><span class="n">FuncAnimation</span><span class="p">(</span><span class="n">plt</span><span class="o">.</span><span class="n">gcf</span><span class="p">(),</span> <span class="n">animate</span><span class="p">,</span> <span class="n">frames</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">frames</span><span class="p">),</span> <span class="n">interval</span><span class="o">=</span><span class="mi">50</span><span class="p">)</span>
<span class="n">display</span><span class="p">(</span><span class="n">display_animation</span><span class="p">(</span><span class="n">anim</span><span class="p">,</span> <span class="n">default_mode</span><span class="o">=</span><span class="s">'loop'</span><span class="p">))</span>
</code></pre>
</div>
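<p>As a side note, if JSAnimation is unavailable, the same frame list can be written straight to a GIF with matplotlib's own writers. This is a sketch under the assumption that matplotlib and Pillow are installed; the random frames and the filename <code class="highlighter-rouge">cartpole.gif</code> are made up for illustration (real frames would come from <code class="highlighter-rouge">env.render(mode='rgb_array')</code>):</p>

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import animation

# Fake frames: 10 small random RGB images standing in for env.render() output.
frames = [np.random.randint(0, 255, (32, 32, 3), dtype=np.uint8) for _ in range(10)]

fig = plt.figure()
patch = plt.imshow(frames[0])
plt.axis('off')

def animate(i):
    patch.set_data(frames[i])

anim = animation.FuncAnimation(fig, animate, frames=len(frames), interval=50)
# PillowWriter avoids an external imagemagick dependency.
anim.save("cartpole.gif", writer=animation.PillowWriter(fps=20))
```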
<h3 id="simple-cartpole-example-in-jupyter">Simple Cartpole Example in Jupyter</h3>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">env</span> <span class="o">=</span> <span class="n">gym</span><span class="o">.</span><span class="n">make</span><span class="p">(</span><span class="s">'CartPole-v0'</span><span class="p">)</span>
<span class="c"># Run a demo of the environment</span>
<span class="n">observation</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">reset</span><span class="p">()</span>
<span class="n">cum_reward</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">frames</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">5000</span><span class="p">):</span>
<span class="c"># Render into buffer. </span>
<span class="n">frames</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">env</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="n">mode</span> <span class="o">=</span> <span class="s">'rgb_array'</span><span class="p">))</span>
<span class="n">action</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">action_space</span><span class="o">.</span><span class="n">sample</span><span class="p">()</span>
<span class="n">observation</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">done</span><span class="p">,</span> <span class="n">info</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">step</span><span class="p">(</span><span class="n">action</span><span class="p">)</span>
<span class="k">if</span> <span class="n">done</span><span class="p">:</span>
<span class="k">break</span>
<span class="n">env</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="n">close</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">display_frames_as_gif</span><span class="p">(</span><span class="n">frames</span><span class="p">)</span>
</code></pre>
</div>
<h2 id="openai-gym---documentation">OpenAI Gym - Documentation</h2>
<p>Working through <a href="https://gym.openai.com/docs">this entire page</a> on getting started with the gym. First, here is their cartpole snippet again, with the Jupyter rendering support added in by me.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">env</span> <span class="o">=</span> <span class="n">gym</span><span class="o">.</span><span class="n">make</span><span class="p">(</span><span class="s">'CartPole-v0'</span><span class="p">)</span>
<span class="n">cum_reward</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">frames</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">num_episodes</span><span class="o">=</span><span class="mi">40</span>
<span class="k">for</span> <span class="n">i_episode</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_episodes</span><span class="p">):</span>
<span class="n">observation</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">reset</span><span class="p">()</span>
<span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">500</span><span class="p">):</span>
<span class="c"># Render into buffer. </span>
<span class="n">frames</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">env</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="n">mode</span> <span class="o">=</span> <span class="s">'rgb_array'</span><span class="p">))</span>
<span class="n">action</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">action_space</span><span class="o">.</span><span class="n">sample</span><span class="p">()</span> <span class="c"># random action</span>
<span class="n">observation</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">done</span><span class="p">,</span> <span class="n">info</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="n">step</span><span class="p">(</span><span class="n">action</span><span class="p">)</span>
<span class="k">if</span> <span class="n">done</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\r</span><span class="s">Episode {}/{} finished after {} timesteps"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">i_episode</span><span class="p">,</span> <span class="n">num_episodes</span><span class="p">,</span> <span class="n">t</span><span class="o">+</span><span class="mi">1</span><span class="p">),</span> <span class="n">end</span><span class="o">=</span><span class="s">""</span><span class="p">)</span>
<span class="k">break</span>
<span class="n">env</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="n">close</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">display_frames_as_gif</span><span class="p">(</span><span class="n">frames</span><span class="p">)</span>
</code></pre>
</div>
<h3 id="environments">Environments</h3>
<p>Environments all descend from the <a href="https://github.com/openai/gym/blob/master/gym/core.py#L14">Env</a> base class. You can view a list of all environments via:</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">gym</span> <span class="kn">import</span> <span class="n">envs</span>
<span class="k">print</span><span class="p">(</span><span class="n">envs</span><span class="o">.</span><span class="n">registry</span><span class="o">.</span><span class="nb">all</span><span class="p">())</span>
</code></pre>
</div>
<p>Important environment functions/properties:</p>
<ul>
<li><strong>step</strong>: Applies an action and returns information about how it affected the environment at that timestep. The return values:
<ul>
<li>observation (object)</li>
<li>reward (float)</li>
<li>done (boolean)</li>
<li>info (dict)</li>
</ul>
</li>
<li><strong>reset</strong>: returns an initial observation.</li>
<li><strong>Space objects</strong>: two objects (below) that describe the valid actions and observations.
<ul>
<li>action_space [returns <em>Discrete(2)</em> for cartpole]. Example usage of Discrete:
<div class="language-python highlighter-rouge"><pre class="highlight"><code>from gym import spaces
space = spaces.Discrete(8)  # Set with 8 elements {0, 1, 2, ..., 7}
x = space.sample()
assert space.contains(x)
assert space.n == 8
</code></pre>
</div>
</li>
<li>observation_space [returns <em>Box(4)</em> for cartpole]</li>
</ul>
</li>
</ul>
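<p>The <code class="highlighter-rouge">step</code>/<code class="highlighter-rouge">reset</code> contract above can be illustrated with a small pure-Python stub. <code class="highlighter-rouge">ToyEnv</code> and its dynamics are made up for illustration and are not part of the real gym API; only the four-tuple return shape mirrors gym's:</p>

```python
import random

class ToyEnv:
    """A made-up stand-in that mimics the Env step/reset contract."""
    def reset(self):
        self.t = 0
        return 0.0                   # initial observation

    def step(self, action):
        self.t += 1
        observation = float(self.t)  # object: what the agent sees next
        reward = 1.0                 # float: reward for this step
        done = self.t >= 10          # boolean: is the episode over?
        info = {"t": self.t}         # dict: diagnostic extras
        return observation, reward, done, info

env = ToyEnv()
obs = env.reset()
total = 0.0
while True:
    obs, reward, done, info = env.step(random.choice([0, 1]))
    total += reward
    if done:
        break
print(total)  # 10.0
```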
<div class="language-python highlighter-rouge"><pre class="highlight"><code>
</code></pre>
</div>First, Figure out Jupyter Notebook StuffSolution for Simple Early Stopping with TFLearn2016-11-28T00:00:00+00:002016-11-28T00:00:00+00:00http://mckinziebrandon.me/TensorflowNotebooks/2016/11/28/early-stop-solution<h2 id="it-works-heres-how">It Works! Here’s How.</h2>
<p>The following is a code snippet directly from <a href="https://github.com/tflearn/tflearn/blob/master/tflearn/helpers/trainer.py#L281">trainer.py</a> in the tflearn github repository, where I’m only showing the relevant parts/logic.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="k">try</span><span class="p">:</span>
<span class="k">for</span> <span class="n">epoch</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_epoch</span><span class="p">):</span>
<span class="c"># . . . Setup stuff for epoch here . . . </span>
<span class="k">for</span> <span class="n">batch_step</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">max_batches_len</span><span class="p">):</span>
<span class="c"># . . . Setup stuff for next batch here . . . </span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">train_op</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">train_ops</span><span class="p">):</span>
<span class="n">caller</span><span class="o">.</span><span class="n">on_sub_batch_begin</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">training_state</span><span class="p">)</span>
<span class="c"># Train our model and store desired information in the train_op that</span>
<span class="c"># we (the user) pass to the trainer as an initialization argument.</span>
<span class="n">snapshot</span> <span class="o">=</span> <span class="n">train_op</span><span class="o">.</span><span class="n">_train</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">training_state</span><span class="o">.</span><span class="n">step</span><span class="p">,</span>
<span class="p">(</span><span class="nb">bool</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">best_checkpoint_path</span><span class="p">)</span> <span class="o">|</span> <span class="n">snapshot_epoch</span><span class="p">),</span>
<span class="n">snapshot_step</span><span class="p">,</span>
<span class="n">show_metric</span><span class="p">)</span>
<span class="c"># Update training state. The training state object tells us </span>
<span class="c"># how our model is doing at various stages of training.</span>
<span class="bp">self</span><span class="o">.</span><span class="n">training_state</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">train_op</span><span class="p">,</span> <span class="n">train_ops_count</span><span class="p">)</span>
<span class="c"># All optimizers batch end</span>
<span class="bp">self</span><span class="o">.</span><span class="n">session</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">incr_global_step</span><span class="p">)</span>
<span class="n">caller</span><span class="o">.</span><span class="n">on_batch_end</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">training_state</span><span class="p">,</span> <span class="n">snapshot</span><span class="p">)</span>
<span class="c"># ---------- [What we care about] -------------</span>
<span class="c"># Epoch end. We define what on_epoch_end does. In this</span>
<span class="c"># case, I'll have it raise an exception if our validation accuracy</span>
<span class="c"># reaches some desired threshold. </span>
<span class="n">caller</span><span class="o">.</span><span class="n">on_epoch_end</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">training_state</span><span class="p">)</span>
<span class="c"># ---------------------------------------------</span>
<span class="k">finally</span><span class="p">:</span>
<span class="c"># Once we raise the exception, this code block will execute. </span>
<span class="c"># Note only afterward will our catch block execute. </span>
<span class="n">caller</span><span class="o">.</span><span class="n">on_train_end</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">training_state</span><span class="p">)</span>
<span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">train_ops</span><span class="p">:</span>
<span class="n">t</span><span class="o">.</span><span class="n">train_dflow</span><span class="o">.</span><span class="n">interrupt</span><span class="p">()</span>
<span class="c"># Set back train_ops</span>
<span class="bp">self</span><span class="o">.</span><span class="n">train_ops</span> <span class="o">=</span> <span class="n">original_train_ops</span>
</code></pre>
</div>
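<p>The ordering noted in the comments above (the <code class="highlighter-rouge">finally</code> block runs before the caller's <code class="highlighter-rouge">except</code>) can be verified with a stripped-down pure-Python mock of the same control flow; no tflearn involved, and <code class="highlighter-rouge">fit</code> here is just a stand-in for <code class="highlighter-rouge">trainer.fit</code>:</p>

```python
order = []

def fit():
    """Mimics trainer.fit: the epoch loop wrapped in try/finally."""
    try:
        for epoch in range(5):
            # on_epoch_end raises to request early stopping.
            raise StopIteration
    finally:
        # Cleanup runs first, even while the exception is still in flight.
        order.append("finally")

try:
    fit()
except StopIteration:
    # Only after cleanup does control reach the user's except block.
    order.append("except")

print(order)  # ['finally', 'except']
```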
<h2 id="setup-the-basic-network-architecture">Setup the Basic Network Architecture</h2>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="kn">import</span> <span class="nn">tflearn.datasets.mnist</span> <span class="kn">as</span> <span class="nn">mnist</span>
<span class="n">trainX</span><span class="p">,</span> <span class="n">trainY</span><span class="p">,</span> <span class="n">testX</span><span class="p">,</span> <span class="n">testY</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">load_data</span><span class="p">(</span><span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">n_features</span> <span class="o">=</span> <span class="mi">784</span>
<span class="n">n_hidden</span> <span class="o">=</span> <span class="mi">256</span>
<span class="n">n_classes</span> <span class="o">=</span> <span class="mi">10</span>
<span class="c"># Define the inputs/outputs/weights as usual.</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">n_features</span><span class="p">])</span>
<span class="n">Y</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">])</span>
<span class="c"># Define the connections/weights and biases between layers.</span>
<span class="n">W1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_features</span><span class="p">,</span> <span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'W1'</span><span class="p">)</span>
<span class="n">W2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">,</span> <span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'W2'</span><span class="p">)</span>
<span class="n">W3</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'W3'</span><span class="p">)</span>
<span class="n">b1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'b1'</span><span class="p">)</span>
<span class="n">b2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'b2'</span><span class="p">)</span>
<span class="n">b3</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_classes</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'b3'</span><span class="p">)</span>
<span class="c"># Define the operations throughout the network.</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">W1</span><span class="p">),</span> <span class="n">b1</span><span class="p">))</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">W2</span><span class="p">),</span> <span class="n">b2</span><span class="p">))</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">W3</span><span class="p">),</span> <span class="n">b3</span><span class="p">)</span>
<span class="c"># Define the optimization problem.</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">softmax_cross_entropy_with_logits</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">Y</span><span class="p">))</span>
<span class="n">optimizer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">GradientDescentOptimizer</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">accuracy</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">cast</span><span class="p">(</span>
<span class="n">tf</span><span class="o">.</span><span class="n">equal</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">Y</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">),</span> <span class="n">name</span><span class="o">=</span><span class="s">'acc'</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>hdf5 not supported (please install/reinstall h5py)
Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz
</code></pre>
</div>
<h2 id="define-the-trainop-and-trainer-objects">Define the TrainOp and Trainer Objects</h2>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">trainop</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">TrainOp</span><span class="p">(</span><span class="n">loss</span><span class="o">=</span><span class="n">loss</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="n">optimizer</span><span class="p">,</span> <span class="n">metric</span><span class="o">=</span><span class="n">accuracy</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">)</span>
<span class="n">trainer</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">Trainer</span><span class="p">(</span><span class="n">train_ops</span><span class="o">=</span><span class="n">trainop</span><span class="p">,</span> <span class="n">tensorboard_verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</code></pre>
</div>
<h1 id="the-earlystoppingcallback-class">The EarlyStoppingCallback Class</h1>
<p>I show a proof-of-concept version of early stopping below. This is the simplest possible case: just stop training after the first epoch no matter what. It is up to the user to decide the conditions they want to trigger the stopping on.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="k">class</span> <span class="nc">EarlyStoppingCallback</span><span class="p">(</span><span class="n">tflearn</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">Callback</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">val_acc_thresh</span><span class="p">):</span>
<span class="s">""" Note: We are free to define our init function however we please. """</span>
<span class="c"># Store a validation accuracy threshold, which we can compare against</span>
<span class="c"># the current validation accuracy at, say, each epoch, each batch step, etc.</span>
<span class="bp">self</span><span class="o">.</span><span class="n">val_acc_thresh</span> <span class="o">=</span> <span class="n">val_acc_thresh</span>
<span class="k">def</span> <span class="nf">on_epoch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="s">"""
This is the final method called in trainer.py in the epoch loop.
We can stop training and leave without losing any information with a simple exception.
"""</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Terminating training at the end of epoch"</span><span class="p">,</span> <span class="n">training_state</span><span class="o">.</span><span class="n">epoch</span><span class="p">)</span>
<span class="k">raise</span> <span class="nb">StopIteration</span>
<span class="k">def</span> <span class="nf">on_train_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="s">"""
Furthermore, tflearn will then immediately call this method after we terminate training,
(or when training ends regardless). This would be a good time to store any additional
information that tflearn doesn't store already.
"""</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Successfully left training! Final model accuracy:"</span><span class="p">,</span> <span class="n">training_state</span><span class="o">.</span><span class="n">acc_value</span><span class="p">)</span>
<span class="c"># Initialize our callback with desired accuracy threshold. </span>
<span class="n">early_stopping_cb</span> <span class="o">=</span> <span class="n">EarlyStoppingCallback</span><span class="p">(</span><span class="n">val_acc_thresh</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
</code></pre>
</div>
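<p>The class above stops unconditionally; a version that actually consults <code class="highlighter-rouge">val_acc_thresh</code> might look like the sketch below. It is written as pure Python so it can stand alone: <code class="highlighter-rouge">Callback</code> and <code class="highlighter-rouge">TrainingState</code> here are minimal stand-ins for the tflearn classes, not the real implementations, and the <code class="highlighter-rouge">val_acc</code> attribute is assumed to hold the latest validation accuracy:</p>

```python
class Callback:
    """Stand-in for tflearn.callbacks.Callback."""
    def on_epoch_end(self, training_state):
        pass

class TrainingState:
    """Stand-in exposing only the attributes the callback reads."""
    def __init__(self, epoch, val_acc):
        self.epoch = epoch
        self.val_acc = val_acc

class ThresholdStoppingCallback(Callback):
    def __init__(self, val_acc_thresh):
        self.val_acc_thresh = val_acc_thresh

    def on_epoch_end(self, training_state):
        # Stop only once validation accuracy clears the threshold.
        if (training_state.val_acc is not None
                and training_state.val_acc >= self.val_acc_thresh):
            print("Stopping at epoch", training_state.epoch)
            raise StopIteration

cb = ThresholdStoppingCallback(val_acc_thresh=0.8)
cb.on_epoch_end(TrainingState(epoch=1, val_acc=0.5))  # below threshold: no-op
try:
    cb.on_epoch_end(TrainingState(epoch=2, val_acc=0.82))
except StopIteration:
    print("stopped")
```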
<h1 id="result-train-the-model-and-stop-early">Result: Train the Model and Stop Early</h1>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="k">try</span><span class="p">:</span>
<span class="c"># Give it to our trainer and let it fit the data. </span>
<span class="n">trainer</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">feed_dicts</span><span class="o">=</span><span class="p">{</span><span class="n">X</span><span class="p">:</span> <span class="n">trainX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">trainY</span><span class="p">},</span>
<span class="n">val_feed_dicts</span><span class="o">=</span><span class="p">{</span><span class="n">X</span><span class="p">:</span> <span class="n">testX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">testY</span><span class="p">},</span>
<span class="n">n_epoch</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="c"># Calculate accuracy and display at every step.</span>
<span class="n">callbacks</span><span class="o">=</span><span class="n">early_stopping_cb</span><span class="p">)</span>
<span class="k">except</span> <span class="nb">StopIteration</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Caught callback exception. Returning control to user program."</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 860 | total loss: [1m[32m1.73372[0m[0m
| Optimizer | epoch: 002 | loss: 1.73372 - acc: 0.8196 | val_loss: 1.87058 - val_acc: 0.8011 -- iter: 55000/55000
Training Step: 860 | total loss: [1m[32m1.73372[0m[0m
| Optimizer | epoch: 002 | loss: 1.73372 - acc: 0.8196 | val_loss: 1.87058 - val_acc: 0.8011 -- iter: 55000/55000
--
Terminating training at the end of epoch 2
Successfully left training! Final model accuracy: 0.8196054697036743
Caught callback exception. Returning control to user program.
</code></pre>
</div>
<h1 id="appendix">Appendix</h1>
<p>For my own reference, this is the code I started with before tinkering with the early stopping solution above.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">division</span><span class="p">,</span> <span class="n">print_function</span><span class="p">,</span> <span class="n">absolute_import</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">tempfile</span>
<span class="kn">import</span> <span class="nn">urllib</span>
<span class="kn">import</span> <span class="nn">collections</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">from</span> <span class="nn">scipy.io</span> <span class="kn">import</span> <span class="n">arff</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="kn">from</span> <span class="nn">sklearn.utils</span> <span class="kn">import</span> <span class="n">shuffle</span>
<span class="kn">from</span> <span class="nn">sklearn.metrics</span> <span class="kn">import</span> <span class="n">roc_auc_score</span>
<span class="kn">from</span> <span class="nn">tflearn.data_utils</span> <span class="kn">import</span> <span class="n">shuffle</span><span class="p">,</span> <span class="n">to_categorical</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.core</span> <span class="kn">import</span> <span class="n">input_data</span><span class="p">,</span> <span class="n">dropout</span><span class="p">,</span> <span class="n">fully_connected</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.conv</span> <span class="kn">import</span> <span class="n">conv_2d</span><span class="p">,</span> <span class="n">max_pool_2d</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.normalization</span> <span class="kn">import</span> <span class="n">local_response_normalization</span><span class="p">,</span> <span class="n">batch_normalization</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.estimator</span> <span class="kn">import</span> <span class="n">regression</span>
<span class="kn">import</span> <span class="nn">tflearn.datasets.mnist</span> <span class="kn">as</span> <span class="nn">mnist</span>
<span class="c"># Load the data and handle any preprocessing here.</span>
<span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span> <span class="n">testX</span><span class="p">,</span> <span class="n">testY</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">load_data</span><span class="p">(</span><span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">X</span><span class="p">,</span> <span class="n">Y</span> <span class="o">=</span> <span class="n">shuffle</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">X</span><span class="o">.</span><span class="n">reshape</span><span class="p">([</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
<span class="n">testX</span> <span class="o">=</span> <span class="n">testX</span><span class="o">.</span><span class="n">reshape</span><span class="p">([</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
<span class="c"># Define our network architecture: a simple 2-layer network of the form</span>
<span class="c"># InputImages -> Fully Connected -> Softmax</span>
<span class="n">out_readin1</span> <span class="o">=</span> <span class="n">input_data</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">[</span><span class="bp">None</span><span class="p">,</span><span class="mi">28</span><span class="p">,</span><span class="mi">28</span><span class="p">,</span><span class="mi">1</span><span class="p">])</span>
<span class="n">out_fully_connected2</span> <span class="o">=</span> <span class="n">fully_connected</span><span class="p">(</span><span class="n">out_readin1</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
<span class="n">out_softmax3</span> <span class="o">=</span> <span class="n">fully_connected</span><span class="p">(</span><span class="n">out_fully_connected2</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'softmax'</span><span class="p">)</span>
<span class="nb">hash</span><span class="o">=</span><span class="s">'f0c188c3777519fb93f1a825ca758a0c'</span>
<span class="n">scriptid</span><span class="o">=</span><span class="s">'MNIST-f0c188c3777519fb93f1a825ca758a0c'</span>
<span class="c"># Define our training metrics. </span>
<span class="n">network</span> <span class="o">=</span> <span class="n">regression</span><span class="p">(</span><span class="n">out_softmax3</span><span class="p">,</span>
<span class="n">optimizer</span><span class="o">=</span><span class="s">'adam'</span><span class="p">,</span>
<span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.01</span><span class="p">,</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'categorical_crossentropy'</span><span class="p">,</span>
<span class="n">name</span><span class="o">=</span><span class="s">'target'</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">DNN</span><span class="p">(</span><span class="n">network</span><span class="p">,</span> <span class="n">tensorboard_verbose</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span> <span class="n">n_epoch</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">validation_set</span><span class="o">=</span><span class="p">(</span><span class="n">testX</span><span class="p">,</span> <span class="n">testY</span><span class="p">),</span>
<span class="n">snapshot_epoch</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">run_id</span><span class="o">=</span><span class="n">scriptid</span><span class="p">,</span><span class="n">callbacks</span><span class="o">=</span><span class="n">early_stopping_cb</span><span class="p">)</span>
<span class="k">except</span> <span class="nb">StopIteration</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Caught callback exception. Returning control to user program."</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">testX</span><span class="p">)</span>
<span class="n">auc</span><span class="o">=</span><span class="n">roc_auc_score</span><span class="p">(</span><span class="n">testY</span><span class="p">,</span> <span class="n">prediction</span><span class="p">,</span> <span class="n">average</span><span class="o">=</span><span class="s">'macro'</span><span class="p">,</span> <span class="n">sample_weight</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
<span class="n">accuracy</span><span class="o">=</span><span class="n">model</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">testX</span><span class="p">,</span><span class="n">testY</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Accuracy:"</span><span class="p">,</span> <span class="n">accuracy</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"ROC AUC Score:"</span><span class="p">,</span> <span class="n">auc</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 860 | total loss: 0.30941
| Adam | epoch: 001 | loss: 0.30941 - acc: 0.9125 -- iter: 55000/55000
Terminating training at the end of epoch 1
Successfully left training! Final model accuracy: 0.9125033020973206
Caught callback exception. Returning control to user program.
Accuracy: [0.90410000000000001]
ROC AUC Score: 0.992379719297
</code></pre>
</div>It Works! Here’s How. | Stopping Code | Generated MNIST with TFLearn | 2016-11-24 | http://mckinziebrandon.me/TensorflowNotebooks/2016/11/24/stop-mnist<h2 id="current-issues">Current Issues</h2>
<p>The main problem is that the training-state object, which is supposed to provide validation-accuracy info to the callback, never stores the validation accuracy in its instance variables. I’ve debugged this in circles, and it is the main thing preventing early stopping from working properly. I’ve dug through the TFLearn source code, and it looks like this value should get stored; the issue is most likely related to whatever default TrainOp gets passed to the DNN class.</p>
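<p>A practical workaround while <code>val_acc</code> stays <code>None</code>: trigger early stopping on the training loss instead, since the pdb session later in this post shows <code>loss_value</code> is the one field that reliably gets populated. The sketch below uses a hand-rolled <code>TrainingState</code> stand-in (the real object comes from TFLearn; only the attribute names <code>loss_value</code>/<code>val_acc</code> are taken from the pdb session):</p>

```python
class TrainingState:
    """Minimal stand-in for TFLearn's training-state object. Assumption
    (from the pdb session): only loss_value is reliably populated."""
    def __init__(self, loss_value=None, val_acc=None):
        self.loss_value = loss_value
        self.val_acc = val_acc


class LossThresholdCallback:
    """Fallback early stopping: trigger on training loss, which is
    populated even when val_acc is not."""
    def __init__(self, loss_thresh):
        self.loss_thresh = loss_thresh

    def on_epoch_end(self, training_state):
        if training_state.loss_value is None:
            return  # nothing populated yet; keep training
        if training_state.loss_value < self.loss_thresh:
            raise StopIteration


cb = LossThresholdCallback(loss_thresh=0.35)
cb.on_epoch_end(TrainingState(loss_value=0.9))  # no-op: loss still high
cb.on_epoch_end(TrainingState())                # no-op: nothing populated
try:
    cb.on_epoch_end(TrainingState(loss_value=0.30))  # below threshold
except StopIteration:
    print("early stop triggered")
```

Training loss is a weaker stopping signal than validation accuracy (it cannot detect overfitting), so this is a stopgap, not a fix.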
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">division</span><span class="p">,</span> <span class="n">print_function</span><span class="p">,</span> <span class="n">absolute_import</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">tempfile</span>
<span class="kn">import</span> <span class="nn">urllib</span>
<span class="kn">import</span> <span class="nn">collections</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">from</span> <span class="nn">scipy.io</span> <span class="kn">import</span> <span class="n">arff</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="kn">from</span> <span class="nn">sklearn.utils</span> <span class="kn">import</span> <span class="n">shuffle</span>
<span class="kn">from</span> <span class="nn">sklearn.metrics</span> <span class="kn">import</span> <span class="n">roc_auc_score</span>
<span class="kn">from</span> <span class="nn">tflearn.data_utils</span> <span class="kn">import</span> <span class="n">shuffle</span><span class="p">,</span> <span class="n">to_categorical</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.core</span> <span class="kn">import</span> <span class="n">input_data</span><span class="p">,</span> <span class="n">dropout</span><span class="p">,</span> <span class="n">fully_connected</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.conv</span> <span class="kn">import</span> <span class="n">conv_2d</span><span class="p">,</span> <span class="n">max_pool_2d</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.normalization</span> <span class="kn">import</span> <span class="n">local_response_normalization</span><span class="p">,</span> <span class="n">batch_normalization</span>
<span class="kn">from</span> <span class="nn">tflearn.layers.estimator</span> <span class="kn">import</span> <span class="n">regression</span>
<span class="kn">import</span> <span class="nn">tflearn.datasets.mnist</span> <span class="kn">as</span> <span class="nn">mnist</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>hdf5 not supported (please install/reinstall h5py)
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span> <span class="n">testX</span><span class="p">,</span> <span class="n">testY</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">load_data</span><span class="p">(</span><span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">X</span><span class="p">,</span> <span class="n">Y</span> <span class="o">=</span> <span class="n">shuffle</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">X</span><span class="o">.</span><span class="n">reshape</span><span class="p">([</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
<span class="n">testX</span> <span class="o">=</span> <span class="n">testX</span><span class="o">.</span><span class="n">reshape</span><span class="p">([</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">out_readin1</span> <span class="o">=</span> <span class="n">input_data</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">[</span><span class="bp">None</span><span class="p">,</span><span class="mi">28</span><span class="p">,</span><span class="mi">28</span><span class="p">,</span><span class="mi">1</span><span class="p">])</span>
<span class="n">out_fully_connected2</span> <span class="o">=</span> <span class="n">fully_connected</span><span class="p">(</span><span class="n">out_readin1</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
<span class="n">out_softmax3</span> <span class="o">=</span> <span class="n">fully_connected</span><span class="p">(</span><span class="n">out_fully_connected2</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'softmax'</span><span class="p">)</span>
<span class="nb">hash</span><span class="o">=</span><span class="s">'f0c188c3777519fb93f1a825ca758a0c'</span>
<span class="n">scriptid</span><span class="o">=</span><span class="s">'MNIST-f0c188c3777519fb93f1a825ca758a0c'</span>
<span class="n">network</span> <span class="o">=</span> <span class="n">regression</span><span class="p">(</span><span class="n">out_softmax3</span><span class="p">,</span>
<span class="n">optimizer</span><span class="o">=</span><span class="s">'adam'</span><span class="p">,</span>
<span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.01</span><span class="p">,</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'categorical_crossentropy'</span><span class="p">,</span>
<span class="n">name</span><span class="o">=</span><span class="s">'target'</span><span class="p">)</span>
<span class="c">#model = tflearn.DNN(network, tensorboard_verbose=3)</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span>
<span class="n">n_epoch</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">validation_set</span><span class="o">=</span><span class="p">(</span><span class="n">testX</span><span class="p">,</span> <span class="n">testY</span><span class="p">),</span>
<span class="n">snapshot_step</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span>
<span class="n">snapshot_epoch</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">run_id</span><span class="o">=</span><span class="n">scriptid</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">testX</span><span class="p">)</span>
<span class="n">auc</span><span class="o">=</span><span class="n">roc_auc_score</span><span class="p">(</span><span class="n">testY</span><span class="p">,</span> <span class="n">prediction</span><span class="p">,</span> <span class="n">average</span><span class="o">=</span><span class="s">'macro'</span><span class="p">,</span> <span class="n">sample_weight</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
<span class="n">accuracy</span><span class="o">=</span><span class="n">model</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">testX</span><span class="p">,</span><span class="n">testY</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Accuracy:"</span><span class="p">,</span> <span class="n">accuracy</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"ROC AUC Score:"</span><span class="p">,</span> <span class="n">auc</span><span class="p">)</span>
</code></pre>
</div>
<h1 id="now-trying-with-tflearn">Now Trying with TFLearn</h1>
<p><strong>Issue</strong>: For some reason, <code>training_state</code> never has a non-<code>None</code> value for <code>val_acc</code>, or for most other evaluation metrics. I assume this means I need to tell TFLearn explicitly to store them every n iterations, but the documentation suggests that storing these basic values is the default behavior.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">pdb</span>
<span class="k">class</span> <span class="nc">EarlyStoppingCallback</span><span class="p">(</span><span class="n">tflearn</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">Callback</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">val_acc_thresh</span><span class="p">):</span>
<span class="s">""" Note: We are free to define our init function however we please. """</span>
<span class="bp">self</span><span class="o">.</span><span class="n">val_acc_thresh</span> <span class="o">=</span> <span class="n">val_acc_thresh</span>
<span class="k">def</span> <span class="nf">on_epoch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="s">""" """</span>
<span class="c"># Apparently this can happen.</span>
<span class="n">pdb</span><span class="o">.</span><span class="n">set_trace</span><span class="p">()</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span> <span class="k">return</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="o">></span> <span class="bp">self</span><span class="o">.</span><span class="n">val_acc_thresh</span><span class="p">:</span>
<span class="k">raise</span> <span class="nb">StopIteration</span>
<span class="k">def</span> <span class="nf">on_batch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">,</span> <span class="n">snapshot</span><span class="o">=</span><span class="bp">False</span><span class="p">):</span>
<span class="s">""" """</span>
<span class="c"># Apparently this can happen.</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span> <span class="k">return</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="o">></span> <span class="bp">self</span><span class="o">.</span><span class="n">val_acc_thresh</span><span class="p">:</span>
<span class="k">raise</span> <span class="nb">StopIteration</span>
<span class="c"># Initialize our callback.</span>
<span class="n">early_stopping_cb</span> <span class="o">=</span> <span class="n">EarlyStoppingCallback</span><span class="p">(</span><span class="n">val_acc_thresh</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">DNN</span><span class="p">(</span><span class="n">network</span><span class="p">,</span> <span class="n">tensorboard_verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="c"># Give it to our trainer and let it fit the data. </span>
<span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">[:</span><span class="mi">20000</span><span class="p">],</span> <span class="n">Y</span><span class="p">[:</span><span class="mi">20000</span><span class="p">],</span>
<span class="n">n_epoch</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
<span class="n">validation_set</span><span class="o">=</span><span class="p">(</span><span class="n">testX</span><span class="p">,</span> <span class="n">testY</span><span class="p">),</span>
<span class="n">snapshot_epoch</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="c">#show_metric=True, </span>
<span class="n">callbacks</span><span class="o">=</span><span class="n">early_stopping_cb</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 313 | total loss: 0.34588
| Adam | epoch: 001 | loss: 0.34588 | val_loss: 0.34831 -- iter: 20000/20000
--
> <ipython-input-4-62e7ee0640e6>(11)on_epoch_end()
-> if training_state.val_acc is None: return
(Pdb) training_state.val_acc
(Pdb) training_state.global_acc
(Pdb) training_state.acc_value
(Pdb) training_state.loss_value
0.34587910771369934
</code></pre>
</div>
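<p>Rather than probing attributes one at a time in pdb, the populated fields can be dumped in one shot from inside the callback. A sketch; the attribute names below mirror the pdb session above and are not a guaranteed TFLearn API, so <code>MockState</code> is a hypothetical stand-in:</p>

```python
def populated_fields(training_state):
    """Return only the attributes of a training-state object that
    actually hold values (i.e. are not None)."""
    return {name: value
            for name, value in vars(training_state).items()
            if value is not None}


class MockState:
    """Mirrors what the pdb session observed: only loss_value is set."""
    def __init__(self):
        self.val_acc = None
        self.global_acc = None
        self.acc_value = None
        self.loss_value = 0.34587910771369934


print(populated_fields(MockState()))  # only loss_value survives
```

Dropping a `print(populated_fields(training_state))` into `on_epoch_end` makes it obvious at a glance which metrics the trainer is (or is not) filling in.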
Current Issues | Early Stopping with TensorFlow and TFLearn | 2016-11-20 | http://mckinziebrandon.me/TensorflowNotebooks/2016/11/20/early-stopping<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="kn">import</span> <span class="nn">tflearn.datasets.mnist</span> <span class="kn">as</span> <span class="nn">mnist</span>
<span class="n">trainX</span><span class="p">,</span> <span class="n">trainY</span><span class="p">,</span> <span class="n">testX</span><span class="p">,</span> <span class="n">testY</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">load_data</span><span class="p">(</span><span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>hdf5 not supported (please install/reinstall h5py)
Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">n_features</span> <span class="o">=</span> <span class="mi">784</span>
<span class="n">n_hidden</span> <span class="o">=</span> <span class="mi">256</span>
<span class="n">n_classes</span> <span class="o">=</span> <span class="mi">10</span>
<span class="c"># Define the inputs/outputs/weights as usual.</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">n_features</span><span class="p">])</span>
<span class="n">Y</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">])</span>
<span class="c"># Define the connections/weights and biases between layers.</span>
<span class="n">W1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_features</span><span class="p">,</span> <span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'W1'</span><span class="p">)</span>
<span class="n">W2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">,</span> <span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'W2'</span><span class="p">)</span>
<span class="n">W3</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'W3'</span><span class="p">)</span>
<span class="n">b1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'b1'</span><span class="p">)</span>
<span class="n">b2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_hidden</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'b2'</span><span class="p">)</span>
<span class="n">b3</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_classes</span><span class="p">]),</span> <span class="n">name</span><span class="o">=</span><span class="s">'b3'</span><span class="p">)</span>
<span class="c"># Define the operations throughout the network.</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">W1</span><span class="p">),</span> <span class="n">b1</span><span class="p">))</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">W2</span><span class="p">),</span> <span class="n">b2</span><span class="p">))</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">W3</span><span class="p">),</span> <span class="n">b3</span><span class="p">)</span>
<span class="c"># Define the optimization problem.</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">softmax_cross_entropy_with_logits</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">Y</span><span class="p">))</span>
<span class="n">optimizer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">GradientDescentOptimizer</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">accuracy</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">cast</span><span class="p">(</span>
<span class="n">tf</span><span class="o">.</span><span class="n">equal</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">Y</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">),</span> <span class="n">name</span><span class="o">=</span><span class="s">'acc'</span><span class="p">)</span>
</code></pre>
</div>
<h1 id="early-stopping">Early Stopping</h1>
<h2 id="training-setup">Training Setup</h2>
<p>In tflearn, we can train our model with a <a href="http://tflearn.org/helpers/trainer/" title="Documentation">tflearn.Trainer</a> object: “Generic class to handle any TensorFlow graph training. It requires the use of TrainOp to specify all optimization parameters.”</p>
<ul>
<li>
<p><a href="http://tflearn.org/helpers/trainer/#trainop">TrainOp</a> represents a set of operations used to optimize a network.</p>
</li>
<li>
<p><strong>Example</strong>: Time to initialize our trainer to work with our MNIST network. Below we create a TrainOp object that tells our trainer:</p>
<ol>
<li>Our loss function. (softmax cross entropy with logits)</li>
<li>Our optimizer. (GradientDescentOptimizer)</li>
<li>Our evaluation [tensor] metric. (classification accuracy)</li>
</ol>
</li>
</ul>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">trainop</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">TrainOp</span><span class="p">(</span><span class="n">loss</span><span class="o">=</span><span class="n">loss</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="n">optimizer</span><span class="p">,</span> <span class="n">metric</span><span class="o">=</span><span class="n">accuracy</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">)</span>
<span class="n">trainer</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">Trainer</span><span class="p">(</span><span class="n">train_ops</span><span class="o">=</span><span class="n">trainop</span><span class="p">,</span> <span class="n">tensorboard_verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</code></pre>
</div>
<h2 id="callbacks">Callbacks</h2>
<p>The <a href="http://tflearn.org/getting_started/#training-callbacks">Callbacks</a> interface describes a set of methods that we can implement ourselves and that will be called at runtime. Below are our options; here we are primarily concerned with the on_epoch_end() method:</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code> <span class="k">def</span> <span class="nf">on_train_begin</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_epoch_begin</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_batch_begin</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_sub_batch_begin</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_sub_batch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">,</span> <span class="n">train_index</span><span class="o">=</span><span class="mi">0</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_batch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">,</span> <span class="n">snapshot</span><span class="o">=</span><span class="bp">False</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_epoch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">on_train_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
</code></pre>
</div>
<ul>
<li><strong>TrainingState</strong>: Notice that each method takes a <a href="https://github.com/tflearn/tflearn/blob/master/tflearn/helpers/trainer.py#L971">training_state</a> object as an argument. This helper provides the information we need to decide when to stop training. The instance variables we can access on a training_state object are:
<ul>
<li>self.epoch</li>
<li>self.step</li>
<li>self.current_iter</li>
<li>self.acc_value</li>
<li>self.loss_value</li>
<li>self.val_acc</li>
<li>self.val_loss</li>
<li>self.best_accuracy</li>
<li>self.global_acc</li>
<li>self.global_loss</li>
</ul>
</li>
<li><strong>Implementing our Callback</strong>: Let’s say we want to stop training when the validation accuracy reaches a certain threshold. Below, we implement the code required to define such a callback and fit the MNIST data.</li>
</ul>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="k">class</span> <span class="nc">EarlyStoppingCallback</span><span class="p">(</span><span class="n">tflearn</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">Callback</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">val_acc_thresh</span><span class="p">):</span>
<span class="s">""" Note: We are free to define our init function however we please. """</span>
<span class="bp">self</span><span class="o">.</span><span class="n">val_acc_thresh</span> <span class="o">=</span> <span class="n">val_acc_thresh</span>
<span class="k">def</span> <span class="nf">on_epoch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="s">""" Stop training once validation accuracy exceeds the threshold. """</span>
<span class="c"># Apparently this can happen.</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span> <span class="k">return</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="o">></span> <span class="bp">self</span><span class="o">.</span><span class="n">val_acc_thresh</span><span class="p">:</span>
<span class="k">raise</span> <span class="nb">StopIteration</span>
</code></pre>
</div>
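<p>This works because tflearn’s training loop catches the raised StopIteration. The mechanism can be sketched in plain Python (an illustrative sketch, not tflearn’s actual loop; all names here are made up):</p>

```python
def early_stop(val_acc, thresh=0.5):
    """Mimics our on_epoch_end: raise to signal the loop to stop."""
    if val_acc is not None and val_acc > thresh:
        raise StopIteration

def train(n_epoch, callback, val_accs):
    """Run epochs until done, or until the callback raises StopIteration."""
    completed = 0
    try:
        for epoch in range(n_epoch):
            # ... one epoch of training would happen here ...
            completed += 1
            callback(val_accs[epoch])  # on_epoch_end
    except StopIteration:
        pass  # the trainer swallows the exception and finishes cleanly
    return completed

print(train(4, early_stop, [0.2, 0.4, 0.6, 0.8]))  # -> 3: stops after epoch 3
```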
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># Initializae our callback.</span>
<span class="n">early_stopping_cb</span> <span class="o">=</span> <span class="n">EarlyStoppingCallback</span><span class="p">(</span><span class="n">val_acc_thresh</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
<span class="c"># Give it to our trainer and let it fit the data. </span>
<span class="n">trainer</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">feed_dicts</span><span class="o">=</span><span class="p">{</span><span class="n">X</span><span class="p">:</span> <span class="n">trainX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">trainY</span><span class="p">},</span>
<span class="n">val_feed_dicts</span><span class="o">=</span><span class="p">{</span><span class="n">X</span><span class="p">:</span> <span class="n">testX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">testY</span><span class="p">},</span>
<span class="n">n_epoch</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="c"># Calculate accuracy and display at every step.</span>
<span class="n">snapshot_epoch</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">callbacks</span><span class="o">=</span><span class="n">early_stopping_cb</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 1720 | total loss: 0.81290
| Optimizer | epoch: 004 | loss: 0.81290 - acc_2: 0.8854 -- iter: 55000/55000
</code></pre>
</div>
<h1 id="using-tfcontriblearn-instead">Using tf.contrib.learn instead</h1>
<h2 id="iris-data-loadingtutorial-prep">Iris data loading/tutorial prep</h2>
<p>Note: we can also load the data via scikit-learn:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>import numpy as np
from sklearn import datasets
from sklearn.cross_validation import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=42)
print(iris.data.shape)
print("Xt", X_train.shape, "Yt", y_train.shape)
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">absolute_import</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">division</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">print_function</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="c"># Suppress the massive amount of warnings.</span>
<span class="n">tf</span><span class="o">.</span><span class="n">logging</span><span class="o">.</span><span class="n">set_verbosity</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">logging</span><span class="o">.</span><span class="n">ERROR</span><span class="p">)</span>
<span class="c"># Data sets</span>
<span class="n">IRIS_TRAINING</span> <span class="o">=</span> <span class="s">"iris_training.csv"</span>
<span class="n">IRIS_TEST</span> <span class="o">=</span> <span class="s">"iris_test.csv"</span>
<span class="c"># Load datasets.</span>
<span class="n">training_set</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">base</span><span class="o">.</span><span class="n">load_csv_with_header</span><span class="p">(</span><span class="n">filename</span><span class="o">=</span><span class="n">IRIS_TRAINING</span><span class="p">,</span>
<span class="n">target_dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="nb">int</span><span class="p">,</span>
<span class="n">features_dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">test_set</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">base</span><span class="o">.</span><span class="n">load_csv_with_header</span><span class="p">(</span><span class="n">filename</span><span class="o">=</span><span class="n">IRIS_TEST</span><span class="p">,</span>
<span class="n">target_dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="nb">int</span><span class="p">,</span>
<span class="n">features_dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="c"># Specify that all features have real-value data</span>
<span class="n">feature_columns</span> <span class="o">=</span> <span class="p">[</span><span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">real_valued_column</span><span class="p">(</span><span class="s">""</span><span class="p">,</span> <span class="n">dimension</span><span class="o">=</span><span class="mi">4</span><span class="p">)]</span>
<span class="c"># Build 3 layer DNN with 10, 20, 10 units respectively.</span>
<span class="n">classifier</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">DNNClassifier</span><span class="p">(</span><span class="n">feature_columns</span><span class="o">=</span><span class="n">feature_columns</span><span class="p">,</span>
<span class="n">hidden_units</span><span class="o">=</span><span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">10</span><span class="p">],</span>
<span class="n">n_classes</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
<span class="n">model_dir</span><span class="o">=</span><span class="s">"/tmp/iris_model"</span><span class="p">)</span>
<span class="c"># Fit model.</span>
<span class="n">classifier</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">training_set</span><span class="o">.</span><span class="n">data</span><span class="p">,</span>
<span class="n">y</span><span class="o">=</span><span class="n">training_set</span><span class="o">.</span><span class="n">target</span><span class="p">,</span>
<span class="n">steps</span><span class="o">=</span><span class="mi">2000</span><span class="p">)</span>
<span class="c"># Evaluate accuracy.</span>
<span class="n">accuracy_score</span> <span class="o">=</span> <span class="n">classifier</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">test_set</span><span class="o">.</span><span class="n">data</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">test_set</span><span class="o">.</span><span class="n">target</span><span class="p">)[</span><span class="s">"accuracy"</span><span class="p">]</span>
<span class="k">print</span><span class="p">(</span><span class="s">'Accuracy: {0:f}'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">accuracy_score</span><span class="p">))</span>
<span class="c"># Classify two new flower samples.</span>
<span class="n">new_samples</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mf">6.4</span><span class="p">,</span> <span class="mf">3.2</span><span class="p">,</span> <span class="mf">4.5</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">],</span> <span class="p">[</span><span class="mf">5.8</span><span class="p">,</span> <span class="mf">3.1</span><span class="p">,</span> <span class="mf">5.0</span><span class="p">,</span> <span class="mf">1.7</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">classifier</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">new_samples</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">'Predictions: {}'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">y</span><span class="p">)))</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Accuracy: 0.980000
Predictions: [1 1]
</code></pre>
</div>
<h2 id="validation-monitors">Validation Monitors</h2>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># Vanilla version</span>
<span class="n">validation_monitor</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">monitors</span><span class="o">.</span><span class="n">ValidationMonitor</span><span class="p">(</span><span class="n">test_set</span><span class="o">.</span><span class="n">data</span><span class="p">,</span>
<span class="n">test_set</span><span class="o">.</span><span class="n">target</span><span class="p">,</span>
<span class="n">every_n_steps</span><span class="o">=</span><span class="mi">50</span><span class="p">)</span>
<span class="n">classifier</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">DNNClassifier</span><span class="p">(</span><span class="n">feature_columns</span><span class="o">=</span><span class="n">feature_columns</span><span class="p">,</span>
<span class="n">hidden_units</span><span class="o">=</span><span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">10</span><span class="p">],</span>
<span class="n">n_classes</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
<span class="n">model_dir</span><span class="o">=</span><span class="s">"/tmp/iris_model"</span><span class="p">,</span>
<span class="n">config</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">RunConfig</span><span class="p">(</span>
<span class="n">save_checkpoints_secs</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
<span class="n">classifier</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">training_set</span><span class="o">.</span><span class="n">data</span><span class="p">,</span>
<span class="n">y</span><span class="o">=</span><span class="n">training_set</span><span class="o">.</span><span class="n">target</span><span class="p">,</span>
<span class="n">steps</span><span class="o">=</span><span class="mi">2000</span><span class="p">,</span>
<span class="n">monitors</span><span class="o">=</span><span class="p">[</span><span class="n">validation_monitor</span><span class="p">])</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Estimator(params={'dropout': None, 'hidden_units': [10, 20, 10], 'weight_column_name': None, 'feature_columns': [_RealValuedColumn(column_name='', dimension=4, default_value=None, dtype=tf.float32, normalizer=None)], 'optimizer': 'Adagrad', 'n_classes': 3, 'activation_fn': <function relu at 0x7f8568caa598>, 'num_ps_replicas': 0, 'gradient_clip_norm': None, 'enable_centered_bias': True})
</code></pre>
</div>
<h2 id="customizing-the-evaluation-metrics-and-stopping-early">Customizing the Evaluation Metrics and Stopping Early</h2>
<p>If we run the code below, it stops early! Warning: you are going to see a lot of WARNING output from TensorFlow (this tutorial is a bit out of date), but that is not what we care about here; we just want the early stopping. The important output to notice is:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>INFO:tensorflow:Validation (step 22556): accuracy = 0.966667, global_step = 22535, loss = 0.2767
INFO:tensorflow:Stopping. Best step: 22356 with loss = 0.2758353650569916.
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">validation_metrics</span> <span class="o">=</span> <span class="p">{</span><span class="s">"accuracy"</span><span class="p">:</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">streaming_accuracy</span><span class="p">,</span>
<span class="s">"precision"</span><span class="p">:</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">streaming_precision</span><span class="p">,</span>
<span class="s">"recall"</span><span class="p">:</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">streaming_recall</span><span class="p">}</span>
<span class="n">validation_monitor</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">contrib</span><span class="o">.</span><span class="n">learn</span><span class="o">.</span><span class="n">monitors</span><span class="o">.</span><span class="n">ValidationMonitor</span><span class="p">(</span>
<span class="n">test_set</span><span class="o">.</span><span class="n">data</span><span class="p">,</span>
<span class="n">test_set</span><span class="o">.</span><span class="n">target</span><span class="p">,</span>
<span class="n">every_n_steps</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span>
<span class="c">#metrics=validation_metrics,</span>
<span class="n">early_stopping_metric</span><span class="o">=</span><span class="s">'loss'</span><span class="p">,</span>
<span class="n">early_stopping_metric_minimize</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">early_stopping_rounds</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">tf</span><span class="o">.</span><span class="n">logging</span><span class="o">.</span><span class="n">set_verbosity</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">logging</span><span class="o">.</span><span class="n">ERROR</span><span class="p">)</span>
<span class="n">classifier</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">training_set</span><span class="o">.</span><span class="n">data</span><span class="p">,</span>
<span class="n">y</span><span class="o">=</span><span class="n">training_set</span><span class="o">.</span><span class="n">target</span><span class="p">,</span>
<span class="n">steps</span><span class="o">=</span><span class="mi">2000</span><span class="p">,</span>
<span class="n">monitors</span><span class="o">=</span><span class="p">[</span><span class="n">validation_monitor</span><span class="p">])</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Estimator(params={'dropout': None, 'hidden_units': [10, 20, 10], 'weight_column_name': None, 'feature_columns': [_RealValuedColumn(column_name='', dimension=4, default_value=None, dtype=tf.float32, normalizer=None)], 'optimizer': 'Adagrad', 'n_classes': 3, 'activation_fn': <function relu at 0x7f8568caa598>, 'num_ps_replicas': 0, 'gradient_clip_norm': None, 'enable_centered_bias': True})
</code></pre>
</div>
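<p>The semantics of early_stopping_rounds above (stop once the monitored loss has gone that many validation steps without improving on its best value) can be sketched in plain Python. This is an illustrative sketch, not the tf.contrib.learn implementation, and find_stop_step is a made-up name:</p>

```python
def find_stop_step(losses, early_stopping_rounds):
    """Return the validation-step index at which training would stop,
    or None if it runs to completion.

    Stops once `early_stopping_rounds` validations pass without the
    loss improving on the best value seen so far.
    """
    best = float("inf")
    best_step = -1
    for step, loss in enumerate(losses):
        if loss < best:
            best, best_step = loss, step
        elif step - best_step >= early_stopping_rounds:
            return step
    return None

# Loss improves, then plateaus; with patience 3 it stops at step 4.
print(find_stop_step([0.9, 0.5, 0.6, 0.6, 0.6, 0.6], early_stopping_rounds=3))
```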
TFLearn2016-11-19T00:00:00+00:002016-11-19T00:00:00+00:00http://mckinziebrandon.me/TensorflowNotebooks/2016/11/19/tflearn-only<h1 id="examplesextending-tensorflowtrainer">Examples::Extending Tensorflow::Trainer</h1>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="kn">import</span> <span class="nn">tflearn.datasets.mnist</span> <span class="kn">as</span> <span class="nn">mnist</span>
<span class="n">trainX</span><span class="p">,</span> <span class="n">trainY</span><span class="p">,</span> <span class="n">testX</span><span class="p">,</span> <span class="n">testY</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">load_data</span><span class="p">(</span><span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>hdf5 not supported (please install/reinstall h5py)
Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz
</code></pre>
</div>
<h2 id="define-the-architecture-basic-tensorflow">Define the Architecture (Basic Tensorflow)</h2>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># Because I don't feel like retyping stuff.</span>
<span class="k">def</span> <span class="nf">tfp</span><span class="p">(</span><span class="n">shape</span><span class="p">):</span>
<span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="n">shape</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">tfrn</span><span class="p">(</span><span class="n">shape</span><span class="p">,</span> <span class="n">name</span><span class="p">):</span>
<span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">(</span><span class="n">shape</span><span class="p">),</span> <span class="n">name</span><span class="o">=</span><span class="n">name</span><span class="p">)</span>
<span class="c"># Define the inputs/outputs/weights as usual.</span>
<span class="n">X</span><span class="p">,</span> <span class="n">Y</span> <span class="o">=</span> <span class="n">tfp</span><span class="p">([</span><span class="bp">None</span><span class="p">,</span> <span class="mi">784</span><span class="p">]),</span> <span class="n">tfp</span><span class="p">([</span><span class="bp">None</span><span class="p">,</span> <span class="mi">10</span><span class="p">])</span>
<span class="n">W1</span><span class="p">,</span> <span class="n">W2</span><span class="p">,</span> <span class="n">W3</span> <span class="o">=</span> <span class="n">tfrn</span><span class="p">([</span><span class="mi">784</span><span class="p">,</span> <span class="mi">256</span><span class="p">],</span> <span class="s">'W1'</span><span class="p">),</span> <span class="n">tfrn</span><span class="p">([</span><span class="mi">256</span><span class="p">,</span> <span class="mi">256</span><span class="p">],</span> <span class="s">'W2'</span><span class="p">),</span> <span class="n">tfrn</span><span class="p">([</span><span class="mi">256</span><span class="p">,</span> <span class="mi">10</span><span class="p">],</span> <span class="s">'W3'</span><span class="p">)</span>
<span class="n">b1</span><span class="p">,</span> <span class="n">b2</span><span class="p">,</span> <span class="n">b3</span> <span class="o">=</span> <span class="n">tfrn</span><span class="p">([</span><span class="mi">256</span><span class="p">],</span> <span class="s">'b1'</span><span class="p">),</span> <span class="n">tfrn</span><span class="p">([</span><span class="mi">256</span><span class="p">],</span> <span class="s">'b2'</span><span class="p">),</span> <span class="n">tfrn</span><span class="p">([</span><span class="mi">10</span><span class="p">],</span> <span class="s">'b3'</span><span class="p">)</span>
<span class="c"># Multilayer perceptron.</span>
<span class="k">def</span> <span class="nf">dnn</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">W1</span><span class="p">),</span> <span class="n">b1</span><span class="p">))</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">tanh</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">W2</span><span class="p">),</span> <span class="n">b2</span><span class="p">))</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">W3</span><span class="p">),</span> <span class="n">b3</span><span class="p">)</span>
<span class="k">return</span> <span class="n">x</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">dnn</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">softmax_cross_entropy_with_logits</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="n">Y</span><span class="p">))</span>
<span class="n">optimizer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">GradientDescentOptimizer</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">accuracy</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">cast</span><span class="p">(</span>
<span class="n">tf</span><span class="o">.</span><span class="n">equal</span><span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">Y</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">),</span>
<span class="n">name</span><span class="o">=</span><span class="s">'acc'</span><span class="p">)</span>
</code></pre>
</div>
<h2 id="using--a-tflearn-trainer">Using a TFLearn Trainer</h2>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">trainop</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">TrainOp</span><span class="p">(</span><span class="n">loss</span><span class="o">=</span><span class="n">loss</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="n">optimizer</span><span class="p">,</span> <span class="n">metric</span><span class="o">=</span><span class="n">accuracy</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">)</span>
<span class="n">trainer</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">Trainer</span><span class="p">(</span><span class="n">train_ops</span><span class="o">=</span><span class="n">trainop</span><span class="p">,</span> <span class="n">tensorboard_verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">trainer</span><span class="o">.</span><span class="n">fit</span><span class="p">({</span><span class="n">X</span><span class="p">:</span> <span class="n">trainX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">trainY</span><span class="p">},</span> <span class="n">val_feed_dicts</span><span class="o">=</span><span class="p">{</span><span class="n">X</span><span class="p">:</span> <span class="n">testX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">testY</span><span class="p">},</span>
<span class="n">n_epoch</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 860 | total loss: 1.73376
| Optimizer | epoch: 002 | loss: 1.73376 - acc: 0.8053 | val_loss: 1.78279 - val_acc: 0.8015 -- iter: 55000/55000
</code></pre>
</div>
<h1 id="training-callbacks">Training Callbacks</h1>
<p>One suggestion for early stopping with tflearn (made by the owner of the tflearn repository) is to define a custom callback that raises an exception when we want training to stop; the exception breaks out of the fit loop. I’ve written a small snippet below as an example.</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="k">class</span> <span class="nc">EarlyStoppingCallback</span><span class="p">(</span><span class="n">tflearn</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">Callback</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">acc_thresh</span><span class="p">):</span>
<span class="s">"""
Args:
acc_thresh - if our accuracy > acc_thresh, terminate training.
"""</span>
<span class="bp">self</span><span class="o">.</span><span class="n">acc_thresh</span> <span class="o">=</span> <span class="n">acc_thresh</span>
<span class="bp">self</span><span class="o">.</span><span class="n">accs</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">def</span> <span class="nf">on_epoch_end</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">training_state</span><span class="p">):</span>
<span class="s">"""Stop training once validation accuracy exceeds the threshold."""</span>
<span class="bp">self</span><span class="o">.</span><span class="n">accs</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">training_state</span><span class="o">.</span><span class="n">global_acc</span><span class="p">)</span>
<span class="k">if</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="n">training_state</span><span class="o">.</span><span class="n">val_acc</span> <span class="o">></span> <span class="bp">self</span><span class="o">.</span><span class="n">acc_thresh</span><span class="p">:</span>
<span class="k">raise</span> <span class="nb">StopIteration</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">cb</span> <span class="o">=</span> <span class="n">EarlyStoppingCallback</span><span class="p">(</span><span class="n">acc_thresh</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
<span class="n">trainer</span><span class="o">.</span><span class="n">fit</span><span class="p">({</span><span class="n">X</span><span class="p">:</span> <span class="n">trainX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">trainY</span><span class="p">},</span> <span class="n">val_feed_dicts</span><span class="o">=</span><span class="p">{</span><span class="n">X</span><span class="p">:</span> <span class="n">testX</span><span class="p">,</span> <span class="n">Y</span><span class="p">:</span> <span class="n">testY</span><span class="p">},</span>
<span class="n">n_epoch</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">snapshot_epoch</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">callbacks</span><span class="o">=</span><span class="n">cb</span><span class="p">)</span>
</code></pre>
</div>
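<p>To see the mechanism without pulling in tflearn, here is a minimal sketch of a Trainer-style callback loop. The names (<code>MiniTrainer</code>, <code>TrainingState</code>) and the accuracy sequence are illustrative, not tflearn internals; this toy trainer catches the <code>StopIteration</code> itself, whereas tflearn’s Trainer may let it propagate to the caller, as the traceback below shows.</p>

```python
# Minimal sketch (assumed names, not tflearn's actual internals) of how a
# callback that raises StopIteration from on_epoch_end can end training early.

class TrainingState:
    """Holds whatever the callbacks inspect; here just validation accuracy."""
    def __init__(self):
        self.val_acc = None

class EarlyStoppingCallback:
    def __init__(self, acc_thresh):
        self.acc_thresh = acc_thresh
        self.accs = []

    def on_epoch_end(self, training_state):
        self.accs.append(training_state.val_acc)
        if training_state.val_acc is not None and training_state.val_acc >= self.acc_thresh:
            raise StopIteration  # signal the trainer to stop

class MiniTrainer:
    def __init__(self, callbacks):
        self.callbacks = callbacks
        self.state = TrainingState()

    def fit(self, val_accs_per_epoch):
        epochs_run = 0
        try:
            for acc in val_accs_per_epoch:
                self.state.val_acc = acc      # pretend we just ran validation
                epochs_run += 1
                for cb in self.callbacks:     # epoch-end hook, tflearn-style
                    cb.on_epoch_end(self.state)
        except StopIteration:
            pass  # this sketch stops cleanly instead of surfacing the exception
        return epochs_run

trainer = MiniTrainer([EarlyStoppingCallback(acc_thresh=0.9)])
epochs = trainer.fit([0.5, 0.7, 0.92, 0.95])  # stops after the 3rd epoch
```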
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 3965 | total loss: 0.33810
| Optimizer | epoch: 010 | loss: 0.33810 - acc: 0.9455 -- iter: 55000/55000
GOODBYE
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-24-9c383c6f5a8b> in <module>()
2 trainer.fit({X: trainX, Y: trainY}, val_feed_dicts={X: testX, Y: testY},
3 n_epoch=3, show_metric=True, snapshot_epoch=False,
----> 4 callbacks=cb)
/usr/local/lib/python3.5/dist-packages/tflearn/helpers/trainer.py in fit(self, feed_dicts, n_epoch, val_feed_dicts, show_metric, snapshot_step, snapshot_epoch, shuffle_all, dprep_dict, daug_dict, excl_trainops, run_id, callbacks)
315
316 # Epoch end
--> 317 caller.on_epoch_end(self.training_state)
318
319 finally:
/usr/local/lib/python3.5/dist-packages/tflearn/callbacks.py in on_epoch_end(self, training_state)
67 def on_epoch_end(self, training_state):
68 for callback in self.callbacks:
---> 69 callback.on_epoch_end(training_state)
70
71 def on_train_end(self, training_state):
<ipython-input-23-d44cbdbc0814> in on_epoch_end(self, training_state)
13 if True:
14 print("GOODBYE")
---> 15 raise StopIteration
StopIteration:
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">cb</span><span class="o">.</span><span class="n">accs</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>[None]
</code></pre>
</div>
Examples::Extending Tensorflow::Trainer

TensorFlow Textbook Tutorials (2016-11-19): http://mckinziebrandon.me/TensorflowNotebooks/2016/11/19/tf-textbook

<h1 id="using-tensorboard">Using Tensorboard</h1>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">"a"</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="mi">90</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">"b"</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">b</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">"y"</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">initialize_all_variables</span><span class="p">()</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">session</span><span class="p">:</span>
<span class="n">merged</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">merge_all_summaries</span><span class="p">()</span>
<span class="n">writer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">SummaryWriter</span>\
<span class="p">(</span><span class="s">"/tmp/tensorflowlogs"</span><span class="p">,</span> <span class="n">session</span><span class="o">.</span><span class="n">graph</span><span class="p">)</span>
<span class="n">session</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">session</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">y</span><span class="p">))</span>
<span class="c"># Open terminal and run command:</span>
<span class="c"># tensorboard --logdir=/tmp/tensorflowlogs</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>190
</code></pre>
</div>
<h1 id="mnist-convolutional-nn">MNIST Convolutional NN</h1>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">input_data</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="n">mnist</span> <span class="o">=</span> <span class="n">input_data</span><span class="o">.</span><span class="n">read_data_sets</span><span class="p">(</span><span class="s">"/tmp/data/"</span><span class="p">,</span> <span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">n_input</span><span class="p">,</span> <span class="n">n_classes</span> <span class="o">=</span> <span class="mi">784</span><span class="p">,</span> <span class="mi">10</span>
<span class="c"># Hyperparameters</span>
<span class="n">learning_rate</span> <span class="o">=</span> <span class="mf">1e-3</span>
<span class="n">training_iters</span> <span class="o">=</span> <span class="mf">1e5</span>
<span class="n">batch_size</span> <span class="o">=</span> <span class="mi">128</span>
<span class="n">display_step</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">dropout</span> <span class="o">=</span> <span class="mf">0.75</span> <span class="c"># probability of *keeping* a unit (fed to keep_prob)</span>
<span class="n">keep_prob</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span> <span class="c"># (for dropout)</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">n_input</span><span class="p">])</span>
<span class="n">_X</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">shape</span><span class="o">=</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span> <span class="c"># -1 lets reshape infer the batch dimension</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">])</span> <span class="c"># output probabilities</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="k">def</span> <span class="nf">conv2d</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
<span class="s">"""
Args:
img -- input tensor of shape [batchsize, in_height, in_width, in_channels]
where channels may be, e.g. 3 for RGB color
w -- filter with shape [f_height, f_width, in_channels, n_feat_maps]
b -- bias for each feature map (number of biases = depth of the conv layer)
"""</span>
<span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">relu</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">bias_add</span><span class="p">(</span>\
<span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">conv2d</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">strides</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span> <span class="n">padding</span><span class="o">=</span><span class="s">'SAME'</span><span class="p">),</span> <span class="n">b</span><span class="p">))</span>
<span class="k">def</span> <span class="nf">max_pool</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span><span class="p">):</span>
<span class="s">"""
Args:
img -- output of a conv layer
k -- window size and stride (small)
"""</span>
<span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">max_pool</span><span class="p">(</span><span class="n">img</span><span class="p">,</span>
<span class="n">ksize</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="p">,</span> <span class="n">k</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span>
<span class="n">strides</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="n">k</span><span class="p">,</span> <span class="n">k</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span>
<span class="n">padding</span><span class="o">=</span><span class="s">'SAME'</span><span class="p">)</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># __________ Weights and biases for all layers __________</span>
<span class="c"># 5x5 conv, 1 input, 32 outputs</span>
<span class="n">wc1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">32</span><span class="p">]))</span>
<span class="n">bc1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">32</span><span class="p">]))</span>
<span class="c"># 5x5 conv, 32 inputs, 64 outputs</span>
<span class="n">wc2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">64</span><span class="p">]))</span>
<span class="n">bc2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">64</span><span class="p">]))</span>
<span class="c"># FC, 7*7*64 inputs, 1024 outputs</span>
<span class="n">wd1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">7</span><span class="o">*</span><span class="mi">7</span><span class="o">*</span><span class="mi">64</span><span class="p">,</span> <span class="mi">1024</span><span class="p">]))</span>
<span class="n">bd1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">1024</span><span class="p">]))</span>
<span class="c"># Output layer. 1024 inputs, 10 outputs.</span>
<span class="n">wout</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="mi">1024</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">]))</span>
<span class="n">bout</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Variable</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">random_normal</span><span class="p">([</span><span class="n">n_classes</span><span class="p">]))</span>
</code></pre>
</div>
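<p>Quick sanity check on why <code>wd1</code> takes 7*7*64 inputs: each 2x2 max-pool with stride 2 and SAME padding halves the spatial size (rounding up), so 28 goes to 14 and then 7, and conv2 emits 64 feature maps. The helper below is just arithmetic, not part of the model.</p>

```python
import math

def same_pool_out(size, k):
    # SAME padding: output size = ceil(size / stride); here stride == k.
    return math.ceil(size / k)

h = w = 28
for _ in range(2):                    # two conv -> pool stages
    h, w = same_pool_out(h, 2), same_pool_out(w, 2)

fc_inputs = h * w * 64                # 64 maps coming out of conv2
assert fc_inputs == 7 * 7 * 64 == 3136
```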
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># __________ The layers __________</span>
<span class="c"># [In] --> Conv --> Pool --> Dropout</span>
<span class="n">conv1</span> <span class="o">=</span> <span class="n">conv2d</span><span class="p">(</span><span class="n">_X</span><span class="p">,</span> <span class="n">wc1</span><span class="p">,</span> <span class="n">bc1</span><span class="p">)</span>
<span class="n">conv1</span> <span class="o">=</span> <span class="n">max_pool</span><span class="p">(</span><span class="n">conv1</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">conv1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">dropout</span><span class="p">(</span><span class="n">conv1</span><span class="p">,</span> <span class="n">keep_prob</span><span class="p">)</span>
<span class="c"># --> Conv --> Pool --> Dropout</span>
<span class="n">conv2</span> <span class="o">=</span> <span class="n">conv2d</span><span class="p">(</span><span class="n">conv1</span><span class="p">,</span> <span class="n">wc2</span><span class="p">,</span> <span class="n">bc2</span><span class="p">)</span>
<span class="n">conv2</span> <span class="o">=</span> <span class="n">max_pool</span><span class="p">(</span><span class="n">conv2</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">conv2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">dropout</span><span class="p">(</span><span class="n">conv2</span><span class="p">,</span> <span class="n">keep_prob</span><span class="p">)</span>
<span class="c"># --> Fully-Connected[ReLu] --> Dropout</span>
<span class="c"># (reshape conv2 out essentially by flattening all maps into single list)</span>
<span class="n">dense1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">conv2</span><span class="p">,</span> <span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">wd1</span><span class="o">.</span><span class="n">get_shape</span><span class="p">()</span><span class="o">.</span><span class="n">as_list</span><span class="p">()[</span><span class="mi">0</span><span class="p">]])</span>
<span class="n">dense1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">relu</span><span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span> <span class="n">dense1</span><span class="p">,</span> <span class="n">wd1</span> <span class="p">),</span> <span class="n">bd1</span> <span class="p">)</span> <span class="p">)</span>
<span class="n">dense1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">dropout</span><span class="p">(</span><span class="n">dense1</span><span class="p">,</span> <span class="n">keep_prob</span><span class="p">)</span>
<span class="c"># Output prediction.</span>
<span class="n">pred</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">dense1</span><span class="p">,</span> <span class="n">wout</span><span class="p">),</span> <span class="n">bout</span><span class="p">)</span>
</code></pre>
</div>
<h2 id="cost-and-optimizing">Cost and Optimizing</h2>
<script type="math/tex; mode=display">\text{cost} = -\frac{1}{n} \sum_{j=1}^{n} \sum_{i = 1}^{n_{out}} y_{ji} \log\bigg( \frac{e^{z_{ji}}}{\sum_k e^{z_{jk}}}\bigg)</script>
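<p>A small numeric sketch of that cost: the mean negative log-likelihood of the softmax outputs against one-hot labels, written in plain Python. The logits and labels are made-up values for illustration, not anything from the MNIST run.</p>

```python
import math

def softmax(z):
    m = max(z)                             # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits_batch, labels_batch):
    # cost = -(1/n) * sum_j sum_i y_ji * log(softmax(z_j)_i)
    n = len(logits_batch)
    total = 0.0
    for z, y in zip(logits_batch, labels_batch):
        p = softmax(z)
        total += -sum(yi * math.log(pi) for yi, pi in zip(y, p))
    return total / n

logits = [[2.0, 1.0, 0.1]]                 # one example, three classes
labels = [[1.0, 0.0, 0.0]]                 # one-hot: true class is 0
cost = cross_entropy(logits, labels)       # roughly 0.417
```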
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># ______________ Training _______</span>
<span class="n">cost</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">softmax_cross_entropy_with_logits</span><span class="p">(</span><span class="n">pred</span><span class="p">,</span> <span class="n">y</span><span class="p">))</span>
<span class="n">optimizer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">AdamOptimizer</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="n">learning_rate</span><span class="p">)</span><span class="o">.</span><span class="n">minimize</span><span class="p">(</span><span class="n">cost</span><span class="p">)</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># _______ Evaluation _____</span>
<span class="n">correct_pred</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">equal</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">pred</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
<span class="n">accuracy</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">cast</span><span class="p">(</span><span class="n">correct_pred</span><span class="p">,</span> <span class="n">tf</span><span class="o">.</span><span class="n">float32</span><span class="p">))</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># _________ BLAST OFF _____________</span>
<span class="n">init</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">initialize_all_variables</span><span class="p">()</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
<span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">init</span><span class="p">)</span>
<span class="n">step</span> <span class="o">=</span> <span class="mi">1</span>
<span class="k">while</span> <span class="n">step</span> <span class="o">*</span> <span class="n">batch_size</span> <span class="o"><</span> <span class="n">training_iters</span><span class="p">:</span>
<span class="n">batch_xs</span><span class="p">,</span> <span class="n">batch_ys</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">next_batch</span><span class="p">(</span><span class="n">batch_size</span><span class="p">)</span>
<span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">optimizer</span><span class="p">,</span> <span class="n">feed_dict</span><span class="o">=</span><span class="p">{</span><span class="n">x</span><span class="p">:</span> <span class="n">batch_xs</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">batch_ys</span><span class="p">,</span> <span class="n">keep_prob</span><span class="p">:</span> <span class="n">dropout</span><span class="p">})</span>
<span class="k">if</span> <span class="n">step</span> <span class="o">%</span> <span class="n">display_step</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="n">acc</span> <span class="o">=</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">accuracy</span><span class="p">,</span> <span class="n">feed_dict</span><span class="o">=</span><span class="p">{</span><span class="n">x</span><span class="p">:</span> <span class="n">batch_xs</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">batch_ys</span><span class="p">,</span> <span class="n">keep_prob</span><span class="p">:</span> <span class="mf">1.</span><span class="p">})</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">cost</span><span class="p">,</span> <span class="n">feed_dict</span><span class="o">=</span><span class="p">{</span><span class="n">x</span><span class="p">:</span> <span class="n">batch_xs</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">batch_ys</span><span class="p">,</span> <span class="n">keep_prob</span><span class="p">:</span> <span class="mf">1.</span><span class="p">})</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Iter"</span><span class="p">,</span> <span class="n">step</span> <span class="o">*</span> <span class="n">batch_size</span><span class="p">,</span>
<span class="s">", Minibatch Loss={:.6f}"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">loss</span><span class="p">),</span>
<span class="s">", Training Accuracy={:.5f}"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">acc</span><span class="p">))</span>
<span class="n">step</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Optimization finished. Am robot."</span><span class="p">)</span>
</code></pre>
</div>
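<p>Worth noting what the stopping condition above actually buys us: the loop runs while <code>step * batch_size &lt; training_iters</code>, so with a batch size of 128 and 1e5 iterations it takes 781 optimizer steps and sees 99,968 examples, a bit under two passes over the 55,000-image training set. A throwaway check:</p>

```python
# Mirror the loop's stopping condition to count how many steps it runs.
batch_size = 128
training_iters = 1e5

steps = 0
step = 1
while step * batch_size < training_iters:
    steps += 1
    step += 1

examples_seen = steps * batch_size   # 781 steps, 99968 examples
```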
<div class="highlighter-rouge"><pre class="highlight"><code>('Iter', 1280, ', Minibatch Loss=25021.994141', ', Training Accuracy=0.26562')
('Iter', 2560, ', Minibatch Loss=20956.230469', ', Training Accuracy=0.41406')
('Iter', 3840, ', Minibatch Loss=10467.468750', ', Training Accuracy=0.54688')
('Iter', 5120, ', Minibatch Loss=6931.669434', ', Training Accuracy=0.64844')
('Iter', 6400, ', Minibatch Loss=11381.146484', ', Training Accuracy=0.58594')
('Iter', 7680, ', Minibatch Loss=6931.756836', ', Training Accuracy=0.67188')
('Iter', 8960, ', Minibatch Loss=6043.289062', ', Training Accuracy=0.70312')
('Iter', 10240, ', Minibatch Loss=2950.967041', ', Training Accuracy=0.78906')
('Iter', 11520, ', Minibatch Loss=4387.661133', ', Training Accuracy=0.79688')
('Iter', 12800, ', Minibatch Loss=4279.759277', ', Training Accuracy=0.78125')
('Iter', 14080, ', Minibatch Loss=2511.234863', ', Training Accuracy=0.84375')
('Iter', 15360, ', Minibatch Loss=3200.528809', ', Training Accuracy=0.79688')
('Iter', 16640, ', Minibatch Loss=2861.273438', ', Training Accuracy=0.82031')
('Iter', 17920, ', Minibatch Loss=2214.196289', ', Training Accuracy=0.88281')
('Iter', 19200, ', Minibatch Loss=989.559265', ', Training Accuracy=0.90625')
('Iter', 20480, ', Minibatch Loss=4211.814941', ', Training Accuracy=0.78906')
('Iter', 21760, ', Minibatch Loss=1644.427979', ', Training Accuracy=0.91406')
('Iter', 23040, ', Minibatch Loss=2109.490967', ', Training Accuracy=0.87500')
('Iter', 24320, ', Minibatch Loss=2386.041504', ', Training Accuracy=0.83594')
('Iter', 25600, ', Minibatch Loss=1501.948364', ', Training Accuracy=0.88281')
('Iter', 26880, ', Minibatch Loss=2240.972656', ', Training Accuracy=0.82812')
('Iter', 28160, ', Minibatch Loss=2119.425537', ', Training Accuracy=0.87500')
('Iter', 29440, ', Minibatch Loss=2242.839844', ', Training Accuracy=0.82812')
('Iter', 30720, ', Minibatch Loss=1093.348633', ', Training Accuracy=0.88281')
('Iter', 32000, ', Minibatch Loss=1532.251099', ', Training Accuracy=0.88281')
('Iter', 33280, ', Minibatch Loss=985.126221', ', Training Accuracy=0.88281')
('Iter', 34560, ', Minibatch Loss=1191.394165', ', Training Accuracy=0.90625')
('Iter', 35840, ', Minibatch Loss=2769.808105', ', Training Accuracy=0.82812')
('Iter', 37120, ', Minibatch Loss=451.285889', ', Training Accuracy=0.94531')
('Iter', 38400, ', Minibatch Loss=857.569580', ', Training Accuracy=0.89844')
('Iter', 39680, ', Minibatch Loss=2352.155762', ', Training Accuracy=0.88281')
('Iter', 40960, ', Minibatch Loss=1384.690674', ', Training Accuracy=0.90625')
('Iter', 42240, ', Minibatch Loss=828.415405', ', Training Accuracy=0.92188')
('Iter', 43520, ', Minibatch Loss=437.712341', ', Training Accuracy=0.95312')
('Iter', 44800, ', Minibatch Loss=584.637817', ', Training Accuracy=0.89844')
('Iter', 46080, ', Minibatch Loss=1383.199707', ', Training Accuracy=0.89062')
('Iter', 47360, ', Minibatch Loss=1923.911255', ', Training Accuracy=0.88281')
('Iter', 48640, ', Minibatch Loss=1327.275146', ', Training Accuracy=0.88281')
('Iter', 49920, ', Minibatch Loss=450.466156', ', Training Accuracy=0.90625')
('Iter', 51200, ', Minibatch Loss=461.589783', ', Training Accuracy=0.93750')
('Iter', 52480, ', Minibatch Loss=512.834595', ', Training Accuracy=0.95312')
('Iter', 53760, ', Minibatch Loss=1481.610840', ', Training Accuracy=0.85156')
('Iter', 55040, ', Minibatch Loss=1503.613281', ', Training Accuracy=0.90625')
('Iter', 56320, ', Minibatch Loss=663.131042', ', Training Accuracy=0.91406')
('Iter', 57600, ', Minibatch Loss=836.979126', ', Training Accuracy=0.94531')
('Iter', 58880, ', Minibatch Loss=1394.500244', ', Training Accuracy=0.92188')
('Iter', 60160, ', Minibatch Loss=1150.654297', ', Training Accuracy=0.89062')
('Iter', 61440, ', Minibatch Loss=884.085022', ', Training Accuracy=0.89844')
('Iter', 62720, ', Minibatch Loss=641.650208', ', Training Accuracy=0.93750')
('Iter', 64000, ', Minibatch Loss=612.565613', ', Training Accuracy=0.92188')
('Iter', 65280, ', Minibatch Loss=1026.186890', ', Training Accuracy=0.88281')
('Iter', 66560, ', Minibatch Loss=1012.022217', ', Training Accuracy=0.89844')
('Iter', 67840, ', Minibatch Loss=538.746582', ', Training Accuracy=0.92969')
('Iter', 69120, ', Minibatch Loss=2331.966064', ', Training Accuracy=0.85156')
('Iter', 70400, ', Minibatch Loss=611.249207', ', Training Accuracy=0.92969')
('Iter', 71680, ', Minibatch Loss=611.909607', ', Training Accuracy=0.94531')
('Iter', 72960, ', Minibatch Loss=1363.580566', ', Training Accuracy=0.88281')
('Iter', 74240, ', Minibatch Loss=996.121582', ', Training Accuracy=0.91406')
('Iter', 75520, ', Minibatch Loss=730.850952', ', Training Accuracy=0.92969')
('Iter', 76800, ', Minibatch Loss=781.747681', ', Training Accuracy=0.92969')
('Iter', 78080, ', Minibatch Loss=854.089539', ', Training Accuracy=0.93750')
('Iter', 79360, ', Minibatch Loss=1397.916870', ', Training Accuracy=0.88281')
('Iter', 80640, ', Minibatch Loss=1405.003418', ', Training Accuracy=0.88281')
('Iter', 81920, ', Minibatch Loss=806.627136', ', Training Accuracy=0.92188')
('Iter', 83200, ', Minibatch Loss=647.945007', ', Training Accuracy=0.93750')
('Iter', 84480, ', Minibatch Loss=1018.518982', ', Training Accuracy=0.93750')
('Iter', 85760, ', Minibatch Loss=1204.980469', ', Training Accuracy=0.89062')
('Iter', 87040, ', Minibatch Loss=743.574951', ', Training Accuracy=0.92188')
('Iter', 88320, ', Minibatch Loss=638.823486', ', Training Accuracy=0.95312')
('Iter', 89600, ', Minibatch Loss=549.751770', ', Training Accuracy=0.96094')
('Iter', 90880, ', Minibatch Loss=727.560242', ', Training Accuracy=0.91406')
('Iter', 92160, ', Minibatch Loss=624.963196', ', Training Accuracy=0.91406')
('Iter', 93440, ', Minibatch Loss=1152.272461', ', Training Accuracy=0.85938')
('Iter', 94720, ', Minibatch Loss=409.238037', ', Training Accuracy=0.95312')
('Iter', 96000, ', Minibatch Loss=444.576447', ', Training Accuracy=0.92969')
('Iter', 97280, ', Minibatch Loss=1209.410645', ', Training Accuracy=0.86719')
('Iter', 98560, ', Minibatch Loss=217.887985', ', Training Accuracy=0.93750')
('Iter', 99840, ', Minibatch Loss=469.807068', ', Training Accuracy=0.92969')
Optimization finished. Am robot.
</code></pre>
</div>
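<p>The tuple-style lines above (e.g. <code>('Iter', 1280, ...)</code>) are what you get when Python 3-style <code>print(a, b, c)</code> runs under Python 2: the parentheses build a tuple, which is then printed. Building one formatted string prints cleanly under either version; a minimal sketch using the values from the first log line:</p>

```python
# Hypothetical values taken from the first log line above.
step, batch_size = 1, 1280
loss, acc = 25021.994141, 0.26562

# One format call -> identical, tuple-free output on Python 2 and 3.
line = "Iter {}, Minibatch Loss={:.6f}, Training Accuracy={:.5f}".format(
    step * batch_size, loss, acc)
print(line)  # -> Iter 1280, Minibatch Loss=25021.994141, Training Accuracy=0.26562
```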
<p>Using Tensorboard · Misc. TensorFlow Tutorials (2016-11-19) · http://mckinziebrandon.me/TensorflowNotebooks/2016/11/19/tf-examples</p>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
</code></pre>
</div>
<h1 id="hello-world">Hello World</h1>
<ul>
<li>Based on <a href="https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/1_Introduction/helloworld.ipynb" title="I'm watching you">this tutorial</a></li>
</ul>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">hello</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="s">'Hello, Tensorflow.'</span><span class="p">)</span> <span class="c"># Create a constant op, added as node to default graph.</span>
<span class="n">sess</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="c"># Start tensorflow session.</span>
<span class="k">print</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">hello</span><span class="p">)</span> <span class="c"># Run graph.</span>
</code></pre>
</div>
<h1 id="basic-operations">Basic Operations</h1>
<ul>
<li>Based on <a href="https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/1_Introduction/basic_operations.ipynb" title="Still watching">this tutorial</a></li>
<li><strong>Constants</strong>: Directly perform arithmetic with tf.constants within sess.run().</li>
<li><strong>Placeholders</strong>: (tf.placeholder) require a feed_dict of values at run time.</li>
<li><strong>Matrix Multiplication</strong>: Here, define matrices as constants, and pass to tf.matmul.
<ul>
<li>No feed_dict necessary.</li>
</ul>
</li>
</ul>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="c"># Actual numerical values used in the examples below.</span>
<span class="n">_a</span><span class="p">,</span> <span class="n">_b</span> <span class="o">=</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span>
<span class="n">_matrix1</span><span class="p">,</span> <span class="n">_matrix2</span> <span class="o">=</span> <span class="p">[[</span><span class="mf">3.</span><span class="p">,</span> <span class="mf">3.</span><span class="p">]],</span> <span class="p">[[</span><span class="mf">2.</span><span class="p">],</span> <span class="p">[</span><span class="mf">2.</span><span class="p">]]</span>
<span class="c"># __________ Example: tf.constant ____________</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="n">_a</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="n">_b</span><span class="p">)</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
<span class="k">print</span> <span class="s">"a, b = ({0}, {1})"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">_a</span><span class="p">,</span> <span class="n">_b</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"Addition with constants: a + b = </span><span class="si">%</span><span class="s">i "</span> <span class="o">%</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"Multiplication with constants: a * b = </span><span class="si">%</span><span class="s">i "</span> <span class="o">%</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">a</span> <span class="o">*</span> <span class="n">b</span><span class="p">)</span>
<span class="c"># __________ Example: tf.placeholder ____________</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">int16</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">int16</span><span class="p">)</span>
<span class="n">add</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
<span class="n">mult</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
<span class="n">feed_dict</span> <span class="o">=</span> <span class="p">{</span><span class="n">a</span><span class="p">:</span> <span class="n">_a</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="n">_b</span><span class="p">}</span>
<span class="k">print</span> <span class="s">"Addition with placeholders: a + b = </span><span class="si">%</span><span class="s">i "</span> <span class="o">%</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">add</span><span class="p">,</span> <span class="n">feed_dict</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"Multiplication with placeholders: a * b = </span><span class="si">%</span><span class="s">i "</span> <span class="o">%</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">mult</span><span class="p">,</span> <span class="n">feed_dict</span><span class="p">)</span>
<span class="c"># __________ Example: tf.matmul ____________</span>
<span class="n">matrix1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="n">_matrix1</span><span class="p">)</span>
<span class="n">matrix2</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">constant</span><span class="p">(</span><span class="n">_matrix2</span><span class="p">)</span>
<span class="n">matrix_product</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">matrix1</span><span class="p">,</span> <span class="n">matrix2</span><span class="p">)</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
<span class="k">print</span> <span class="s">"Matrix multiply 1x2 * 2x1 matrices: {0}"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">matrix_product</span><span class="p">))</span>
</code></pre>
</div>
<h1 id="nearest-neighbors-on-mnist">Nearest-Neighbors on MNIST</h1>
<ul>
<li>Based on <a href="https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/2_BasicModels/nearest_neighbor.ipynb" title="Hi">this tutorial</a></li>
</ul>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">tensorflow.examples.tutorials.mnist</span> <span class="kn">import</span> <span class="n">input_data</span>
<span class="n">mnist</span> <span class="o">=</span> <span class="n">input_data</span><span class="o">.</span><span class="n">read_data_sets</span><span class="p">(</span><span class="s">"/tmp/data/"</span><span class="p">,</span> <span class="n">one_hot</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre>
</div>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">_Xtrain</span><span class="p">,</span> <span class="n">_Ytrain</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">next_batch</span><span class="p">(</span><span class="mi">500</span><span class="p">)</span>
<span class="n">_Xtest</span><span class="p">,</span> <span class="n">_Ytest</span> <span class="o">=</span> <span class="n">mnist</span><span class="o">.</span><span class="n">test</span><span class="o">.</span><span class="n">next_batch</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">_Xtrain</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="c"># Graph input.</span>
<span class="n">Xtrain</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="n">d</span><span class="p">])</span> <span class="c"># 'None' lets the first (batch) dimension take any size.</span>
<span class="n">Xtest</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">placeholder</span><span class="p">(</span><span class="s">"float"</span><span class="p">,</span> <span class="p">[</span><span class="n">d</span><span class="p">])</span>
<span class="c"># L1 distance from Xtest to every row of Xtrain; the smallest identifies the nearest neighbor.</span>
<span class="n">difference</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">Xtrain</span><span class="p">,</span> <span class="n">tf</span><span class="o">.</span><span class="n">neg</span><span class="p">(</span><span class="n">Xtest</span><span class="p">))</span>
<span class="n">L1_dist</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">reduce_sum</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="nb">abs</span><span class="p">(</span><span class="n">difference</span><span class="p">),</span> <span class="n">reduction_indices</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="c"># Prediction : get nearest neighbor. </span>
<span class="n">pred</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">arg_min</span><span class="p">(</span><span class="n">L1_dist</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">accuracy</span> <span class="o">=</span> <span class="mf">0.</span>
<span class="n">init</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">initialize_all_variables</span><span class="p">()</span> <span class="c"># Op that initializes all variables in the graph.</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
<span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">init</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">_Xtest</span><span class="p">)):</span>
<span class="n">feed_dict</span> <span class="o">=</span> <span class="p">{</span><span class="n">Xtrain</span><span class="p">:</span> <span class="n">_Xtrain</span><span class="p">,</span> <span class="n">Xtest</span><span class="p">:</span> <span class="n">_Xtest</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="p">:]}</span>
<span class="n">nearest_neighbor_index</span> <span class="o">=</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">pred</span><span class="p">,</span> <span class="n">feed_dict</span><span class="p">)</span>
<span class="n">label_train</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">_Ytrain</span><span class="p">[</span><span class="n">nearest_neighbor_index</span><span class="p">])</span>
<span class="n">label_test</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">_Ytest</span><span class="p">[</span><span class="n">i</span><span class="p">])</span>
<span class="k">if</span> <span class="n">label_train</span> <span class="o">==</span> <span class="n">label_test</span><span class="p">:</span>
<span class="n">accuracy</span> <span class="o">+=</span> <span class="mf">1.</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">_Xtest</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"Accuracy:"</span><span class="p">,</span> <span class="n">accuracy</span>
</code></pre>
</div>
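<p>The graph above is just 1-nearest-neighbor classification under an L1 metric. A plain-NumPy sketch of the same computation (toy 2-D points rather than MNIST; the data here is illustrative only):</p>

```python
import numpy as np

def nearest_neighbor_l1(Xtrain, Ytrain, x):
    """Label of the training row closest to x in L1 (Manhattan) distance."""
    # Mirrors tf.reduce_sum(tf.abs(Xtrain - x), 1) followed by arg_min.
    dists = np.sum(np.abs(Xtrain - x), axis=1)
    return Ytrain[np.argmin(dists)]

Xtrain = np.array([[0., 0.], [10., 10.]])
Ytrain = np.array([0, 1])
print(nearest_neighbor_l1(Xtrain, Ytrain, np.array([1., 2.])))  # -> 0
print(nearest_neighbor_l1(Xtrain, Ytrain, np.array([9., 8.])))  # -> 1
```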
<h1 id="tflearn---quick-start">TFLearn - Quick Start</h1>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">from</span> <span class="nn">tflearn.datasets</span> <span class="kn">import</span> <span class="n">titanic</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">titanic</span><span class="o">.</span><span class="n">download_dataset</span><span class="p">(</span><span class="s">'titanic_dataset.csv'</span><span class="p">)</span>
<span class="kn">from</span> <span class="nn">tflearn.data_utils</span> <span class="kn">import</span> <span class="n">load_csv</span>
<span class="n">data</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">load_csv</span><span class="p">(</span><span class="s">'titanic_dataset.csv'</span><span class="p">,</span>
<span class="n">target_column</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">categorical_labels</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">n_classes</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="c"># ___________ Data Preprocessing ___________</span>
<span class="k">def</span> <span class="nf">preprocess</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">columns_to_ignore</span><span class="p">):</span>
<span class="c"># Sort by descending id and delete columns. </span>
<span class="k">for</span> <span class="nb">id</span> <span class="ow">in</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">columns_to_ignore</span><span class="p">,</span> <span class="n">reverse</span><span class="o">=</span><span class="bp">True</span><span class="p">):</span>
<span class="p">[</span><span class="n">r</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="nb">id</span><span class="p">)</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">data</span><span class="p">]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)):</span>
<span class="c"># Encode male=0, female=1. </span>
<span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mf">1.</span> <span class="k">if</span> <span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="s">'female'</span> <span class="k">else</span> <span class="mf">0.</span>
<span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">to_ignore</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">preprocess</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">to_ignore</span><span class="p">)</span>
<span class="c"># ___________ Build the DNN ___________</span>
<span class="c"># Input --> FC --> FC --> SOFTMAX</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">input_data</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="mi">6</span><span class="p">])</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">net</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'softmax'</span><span class="p">)</span>
<span class="n">net</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">regression</span><span class="p">(</span><span class="n">net</span><span class="p">)</span>
<span class="c"># __________ Training ____________</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">DNN</span><span class="p">(</span><span class="n">net</span><span class="p">)</span>
<span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">n_epoch</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span> <span class="n">show_metric</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre>
</div>
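<p>The preprocess step in the quick start is plain list surgery: pop the ignored columns from the highest index down (so the earlier indices stay valid), then encode the sex field, which lands in column 1 after the pops. A self-contained sketch with made-up rows (the row layout below is assumed, mirroring the Titanic CSV):</p>

```python
import numpy as np

def preprocess(data, columns_to_ignore):
    # Pop ignored columns from highest index down so remaining indices stay valid.
    for idx in sorted(columns_to_ignore, reverse=True):
        for row in data:
            row.pop(idx)
    for row in data:
        row[1] = 1. if row[1] == 'female' else 0.  # encode male=0, female=1
    return np.array(data, dtype=np.float32)

# Hypothetical rows: [pclass, name, sex, age, sibsp, parch, ticket, fare]
rows = [[1, 'Alice', 'female', 29, 0, 0, 'T1', 211.3],
        [3, 'Bob',   'male',   35, 0, 0, 'T2',   8.1]]
out = preprocess(rows, columns_to_ignore=[1, 6])  # drop name and ticket
print(out.shape)  # -> (2, 6)
```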
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="s">"""
Simple Example to train logical operators
"""</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">absolute_import</span><span class="p">,</span> <span class="n">division</span><span class="p">,</span> <span class="n">print_function</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">import</span> <span class="nn">tflearn</span>
<span class="s">'''
Going further: Graph combination with multiple optimizers
Create a XOR operator using product of NAND and OR operators
'''</span>
<span class="c"># Data</span>
<span class="n">X</span> <span class="o">=</span> <span class="p">[[</span><span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.</span><span class="p">,</span> <span class="mf">1.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">,</span> <span class="mf">1.</span><span class="p">]]</span>
<span class="n">Y_nand</span> <span class="o">=</span> <span class="p">[[</span><span class="mf">1.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.</span><span class="p">]]</span>
<span class="n">Y_or</span> <span class="o">=</span> <span class="p">[[</span><span class="mf">0.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">],</span> <span class="p">[</span><span class="mf">1.</span><span class="p">]]</span>
<span class="c"># Graph definition</span>
<span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Graph</span><span class="p">()</span><span class="o">.</span><span class="n">as_default</span><span class="p">():</span>
<span class="c"># Building a network with 2 optimizers</span>
<span class="n">g</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">input_data</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">[</span><span class="bp">None</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span>
<span class="c"># Nand operator definition</span>
<span class="n">g_nand</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'linear'</span><span class="p">)</span>
<span class="n">g_nand</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">g_nand</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'linear'</span><span class="p">)</span>
<span class="n">g_nand</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">g_nand</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'sigmoid'</span><span class="p">)</span>
<span class="n">g_nand</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">regression</span><span class="p">(</span><span class="n">g_nand</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="s">'sgd'</span><span class="p">,</span>
<span class="n">learning_rate</span><span class="o">=</span><span class="mf">2.</span><span class="p">,</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'binary_crossentropy'</span><span class="p">)</span>
<span class="c"># Or operator definition</span>
<span class="n">g_or</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'linear'</span><span class="p">)</span>
<span class="n">g_or</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">g_or</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'linear'</span><span class="p">)</span>
<span class="n">g_or</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">fully_connected</span><span class="p">(</span><span class="n">g_or</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'sigmoid'</span><span class="p">)</span>
<span class="n">g_or</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">regression</span><span class="p">(</span><span class="n">g_or</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="s">'sgd'</span><span class="p">,</span>
<span class="n">learning_rate</span><span class="o">=</span><span class="mf">2.</span><span class="p">,</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'binary_crossentropy'</span><span class="p">)</span>
<span class="c"># XOR merging Nand and Or operators</span>
<span class="n">g_xor</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">merge</span><span class="p">([</span><span class="n">g_nand</span><span class="p">,</span> <span class="n">g_or</span><span class="p">],</span> <span class="n">mode</span><span class="o">=</span><span class="s">'elemwise_mul'</span><span class="p">)</span>
<span class="c"># Training</span>
<span class="n">m</span> <span class="o">=</span> <span class="n">tflearn</span><span class="o">.</span><span class="n">DNN</span><span class="p">(</span><span class="n">g_xor</span><span class="p">)</span>
<span class="n">m</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="p">[</span><span class="n">Y_nand</span><span class="p">,</span> <span class="n">Y_or</span><span class="p">],</span> <span class="n">n_epoch</span><span class="o">=</span><span class="mi">400</span><span class="p">,</span> <span class="n">snapshot_epoch</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="c"># Testing</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Testing XOR operator"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"0 xor 0:"</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">predict</span><span class="p">([[</span><span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">]]))</span>
<span class="k">print</span><span class="p">(</span><span class="s">"0 xor 1:"</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">predict</span><span class="p">([[</span><span class="mf">0.</span><span class="p">,</span> <span class="mf">1.</span><span class="p">]]))</span>
<span class="k">print</span><span class="p">(</span><span class="s">"1 xor 0:"</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">predict</span><span class="p">([[</span><span class="mf">1.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">]]))</span>
<span class="k">print</span><span class="p">(</span><span class="s">"1 xor 1:"</span><span class="p">,</span> <span class="n">m</span><span class="o">.</span><span class="n">predict</span><span class="p">([[</span><span class="mf">1.</span><span class="p">,</span> <span class="mf">1.</span><span class="p">]]))</span>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>Training Step: 400 | total loss: [1m[32m0.81728[0m[0m
| SGD_0 | epoch: 400 | loss: 0.40857 -- iter: 4/4
| SGD_1 | epoch: 400 | loss: 0.40871 -- iter: 4/4
Testing XOR operator
0 xor 0: [[0.0005703496863134205]]
0 xor 1: [[0.9982306957244873]]
1 xor 0: [[0.9982070922851562]]
1 xor 1: [[0.00094714475562796]]
</code></pre>
</div>
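<p>The merge step works because XOR is exactly the conjunction of NAND and OR: the elementwise product of the two (approximately 0/1) network outputs recovers the XOR truth table. A minimal NumPy sanity check of that identity, independent of TFLearn:</p>

```python
import numpy as np

# Truth-table inputs for two binary variables.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

nand = 1 - (X[:, 0] & X[:, 1])  # NAND column: 1 1 1 0
orr = X[:, 0] | X[:, 1]         # OR column:   0 1 1 1
xor = nand * orr                # elementwise product, like the 'elemwise_mul' merge
print(xor.tolist())             # [0, 1, 1, 0]
```

<p>With sigmoid outputs near 0 and 1, the trained networks approximate these columns, which is why the merged model's predictions above land close to 0 or 1.</p>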
<h1 id="early-stopping-investigatoin">Early Stopping Investigation</h1>
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">csv</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
<span class="kn">import</span> <span class="nn">random</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span>
<span class="n">ipd</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s">'iris.csv'</span><span class="p">)</span>
<span class="n">ipd</span><span class="o">.</span><span class="n">head</span><span class="p">()</span>
</code></pre>
</div>
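<p>The core of early stopping is framework-agnostic: monitor a validation metric and halt once it stops improving for some number of epochs. Before wiring this into a TFLearn callback, here is a plain-Python sketch of a patience-based stopper (the class name and thresholds are illustrative, not part of any library API):</p>

```python
class EarlyStopper:
    """Stop training once val_loss hasn't improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: record new best and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Usage against a fake loss curve that plateaus after epoch 2:
stopper = EarlyStopper(patience=2)
for epoch, loss in enumerate([1.0, 0.8, 0.6, 0.61, 0.62, 0.60]):
    if stopper.should_stop(loss):
        print("stopping at epoch", epoch)  # stopping at epoch 4
        break
```

<p>In TFLearn this logic would live in a callback's epoch-end hook, raising an exception to break out of <code class="highlighter-rouge">m.fit</code>; the experiments below investigate that.</p>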
<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>sepal_length</th>
<th>sepal_width</th>
<th>petal_length</th>
<th>petal_width</th>
<th>species</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>5.1</td>
<td>3.5</td>
<td>1.4</td>
<td>0.2</td>
<td>setosa</td>
</tr>
<tr>
<th>1</th>
<td>4.9</td>
<td>3.0</td>
<td>1.4</td>
<td>0.2</td>
<td>setosa</td>
</tr>
<tr>
<th>2</th>
<td>4.7</td>
<td>3.2</td>
<td>1.3</td>
<td>0.2</td>
<td>setosa</td>
</tr>
<tr>
<th>3</th>
<td>4.6</td>
<td>3.1</td>
<td>1.5</td>
<td>0.2</td>
<td>setosa</td>
</tr>
<tr>
<th>4</th>
<td>5.0</td>
<td>3.6</td>
<td>1.4</td>
<td>0.2</td>
<td>setosa</td>
</tr>
</tbody>
</table>
</div>
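<p>Before the iris DataFrame can be fed to a network, the string-valued <code class="highlighter-rouge">species</code> column has to become numeric targets, e.g. one-hot vectors. A sketch with pandas, using a tiny stand-in DataFrame so it runs on its own (column values mirror the real dataset):</p>

```python
import pandas as pd

# Small stand-in for the iris DataFrame loaded above as `ipd`.
ipd = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "species": ["setosa", "versicolor", "virginica"],
})

# One-hot encode the label column; one indicator column per species.
labels = pd.get_dummies(ipd["species"])
print(labels.columns.tolist())  # ['setosa', 'versicolor', 'virginica']
```

<p>The resulting three-column matrix is the shape a softmax output layer expects for the three iris classes.</p>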
<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="kn">as</span> <span class="nn">tf</span>
</code></pre>
</div>Link to PDF Notes2016-11-19T00:00:00+00:002016-11-19T00:00:00+00:00http://mckinziebrandon.me/TensorflowNotebooks/2016/11/19/link<p>Click <a href="http://mckinziebrandon.me/assets/pdf/GeneralNotes.pdf">this link</a></p>Click this link